A whole-genome analysis of the lung cancer example.

Supplementary Figure 1 presents whole-genome results. The cluster separability measure in Supplementary Figure 1A suggests a 3-cluster solution under the non-sparse model. Supplementary Figure 1C shows heatmaps with samples arranged according to the 3-cluster assignments. In the whole-genome view, the most visible pattern separating cluster 1 from the rest is gain of chromosome 1q, while there appears to be a “global” pattern of differential expression between clusters 2 and 3. When the samples are annotated with the mutational status of the lung cancer genes, no clear pattern is observed (top panels of Supplementary Figure 1C).

Notably, the patterns observed on chromosomes 8 and 12 in Figure 3 disappear in the whole-genome context. This is not entirely unexpected given the considerably more complex patterns of alterations across the whole genome. A differential weighting scheme could be useful for aggregating individual chromosome clustering results; this is a topic for future research beyond the scope of this paper.

Finally, although sparse solutions lead to a smaller POD statistic, they select few or none of the DNA probes. The 2-cluster solution under the sparse models was therefore driven primarily by gene expression. Lasso penalization is known to work well for sparse data (where there are few non-zero coefficients), which does not seem to be the case for the lung gene expression data. Other types of regularization methods may be explored in the future.

Supplementary figure legend.

Supplementary Figure 1: Lung cancer subtypes, whole-genome view. A. Model selection based on the POD statistic. B. Cluster separability plot showing a 3-cluster solution. C. Heatmap of copy number (left) and gene expression (right). Columns are tumor samples sorted by the iCluster assignments. Rows are genes ordered by genomic position, with chromosomes numbered to the left of the vertical axis of the heatmap. The sex chromosomes are not included.
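
The remark above that lasso penalization suits sparse signals, and may select few or no features otherwise, can be illustrated with a minimal sketch. It is not the iCluster fitting procedure. It applies only the soft-thresholding operator, which is the lasso solution for a single coefficient under an orthonormal design. The coefficient values, the penalty level, and the function names `soft_threshold` and `count_selected` are assumptions made for illustration and are not taken from the lung cancer data.

```python
def soft_threshold(value, penalty):
    """Lasso shrinkage of one coefficient, assuming an orthonormal design."""
    magnitude = abs(value) - penalty
    if magnitude <= 0.0:
        return 0.0  # coefficient is set exactly to zero (feature dropped)
    return magnitude if value > 0 else -magnitude


def count_selected(coefficients, penalty):
    """Count the coefficients that remain non-zero after shrinkage."""
    return sum(1 for c in coefficients if soft_threshold(c, penalty) != 0.0)


# Hypothetical coefficient profiles, not values from the study.
sparse_signal = [3.0, -2.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # few strong effects
dense_signal = [0.4, -0.3, 0.5, -0.4, 0.3, 0.5, -0.5, 0.4]  # many weak effects

penalty = 0.6  # assumed penalty level, chosen only for illustration

print(count_selected(sparse_signal, penalty))  # prints 2
print(count_selected(dense_signal, penalty))   # prints 0
```

With the same penalty, the two strong effects in the sparse profile survive, whereas every coefficient in the dense, weak profile is shrunk to zero. This is the kind of behavior in which a sparse model ends up selecting none of a set of features.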