A whole-genome analysis of the lung cancer example.
Supplementary Figure 1 presents whole-genome results. The cluster separability measure in
Supplementary Figure 1A suggests a 3-cluster solution under the non-sparse model.
Supplementary Figure 1C shows heatmaps with samples arranged under the 3-cluster
assignments. In the whole-genome view, the most visible pattern separating cluster 1 from
the rest is gain of chromosome 1q, while clusters 2 and 3 appear to differ by a “global” pattern
of differential expression. When annotated with the mutational status of the lung
cancer genes, no clear pattern is observed (top panels of Supplementary Figure 1C). Notably, the
patterns observed on chromosomes 8 and 12 in Figure 3 disappear in the whole-genome context.
This is not entirely unexpected given the considerably more complex patterns of alterations in
the whole genome. A differential weighting scheme could be useful in aggregating individual
chromosome clustering results. This is a future research topic beyond the scope of this paper.
Finally, although the sparse solutions yield a smaller POD statistic, they select few or none of
the DNA probes, so the 2-cluster solution under the sparse models was driven primarily by gene
expression. Lasso penalization is known to work well when the underlying signal is sparse (that
is, when few coefficients are non-zero), which does not seem to be the case for the lung gene
expression data. Other types of regularization methods may be explored in the future.
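The remark about lasso penalization can be illustrated with a small numerical sketch. It is not the model fitted in this analysis. It only applies the standard lasso soft-thresholding operator to two hypothetical coefficient vectors, and the feature count, coefficient sizes, noise range and penalty value are all assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(w, lam):
    """Lasso shrinkage: pull each coefficient toward zero by lam, clipping at zero."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)


rng = np.random.default_rng(0)
n_features = 1000  # hypothetical number of probes

# Hypothetical sparse signal: 20 large coefficients, all others exactly zero.
sparse_w = np.zeros(n_features)
sparse_w[:20] = 3.0

# Hypothetical dense signal: every coefficient is small but non-zero.
dense_w = np.full(n_features, 0.3)

# Bounded noise, so that the counts below do not depend on the random draw.
noise = rng.uniform(-0.1, 0.1, size=n_features)
lam = 0.5  # illustrative penalty, not a value taken from the analysis

sparse_kept = np.count_nonzero(soft_threshold(sparse_w + noise, lam))
dense_kept = np.count_nonzero(soft_threshold(dense_w + noise, lam))

print(sparse_kept)  # 20: the truly non-zero features survive
print(dense_kept)   # 0: every small effect is shrunk to zero
```

In the dense case the penalty removes every feature even though all of them carry some signal, which is the kind of behaviour that would leave few or no probes selected.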
Supplementary figure legend.
Supplementary Figure 1: Whole-genome view of lung cancer subtypes. A. Model selection based
on the POD statistic. B. Cluster separability plot showing a 3-cluster solution. C. Heatmap of
copy number (left) and gene expression (right). Columns are tumor samples sorted by the
iCluster assignments. Rows are genes ordered by their genomic position with chromosomes
numbered to the left of the vertical axis of the heatmap. The sex chromosomes are not included.