Supplementary information

CNV validation by quantitative PCR

Quantitative PCR (qPCR) was performed on the LightCycler 480 System (Roche Applied Science). A TaqMan Copy Number Assay (Applied Biosystems) was selected via the Life Technologies website (http://www.appliedbiosystems.jp/). As a reference gene, we used the commercially available TaqMan Copy Number Reference Assay RNase P. Each sample was examined in a 10 µl reaction mixture (TaqMan® Genotyping Master Mix: 5 µl; TaqMan® Copy Number Assay 20× working stock: 0.5 µl; TaqMan Copy Number Reference Assay 20×: 0.5 µl; nuclease-free water: 2 µl; and 5 ng/µl genomic DNA: 2 µl). The qPCR thermal cycling conditions were as follows: an initial hot-start step at 95 °C for 10 minutes, followed by 40 cycles of 95 °C for 15 seconds and 60 °C for 1 minute. The PCR efficiency of each assay was estimated from calibration curves of mixed DNA from three different samples with normal copy numbers, serially diluted from 40 ng to 2.5 ng of genomic DNA. Data were then analyzed using the ∆∆CT method. The CNV ratio was the ∆∆CT of each sample normalized to that of the calibration samples with normal copy numbers (= ∆∆CTsample/∆∆CTcalibration). Experiments in triplicate were repeated 4 times. Differences in CNV ratio between each case and 8 controls were tested using a t-test (Supplementary Figure 2).

Pathway analysis

In the pathway analysis, we searched for gene-sets that are more frequently affected by rare and large CNVs in patients than in controls. The pathway analysis consisted of three parts: deriving gene-sets, testing, and creating functional clusters of gene-sets. Gene-sets were derived from the Gene Ontology (http://www.geneontology.org/) and from KEGG (http://www.genome.jp/kegg/).
R version 2.15.1 (http://www.r-project.org/) and Bioconductor 2.11 (http://www.bioconductor.org/) were used for deriving gene-sets from the databases (“GO.db” version 2.5.0 and “KEGG.db” version 2.5.0), for gene symbol to Entrez ID conversion (“org.Hs.eg.db” version 2.5.0), and for calculating the tests. Gene-sets with more than 700 genes or fewer than 5 genes were excluded, because very large and very small gene-sets are less likely to yield biologically meaningful results. To test the differences between cases and controls, Fisher’s exact test was applied to the number of samples whose rare and large CNVs disrupted at least one gene in a gene-set. The calculation procedure was as follows.
(1) Let i index samples and j index gene-sets.
(2) Define the indicator I(i,j) = 1 if sample i carries a CNV overlapping at least one gene in gene-set j, and zero otherwise. In other words, a sample can contribute to multiple gene-sets but cannot contribute more than once to the same gene-set.
(3) The total score for gene-set j is obtained by summing I(i,j) over i. For testing, this sum is partitioned between cases and controls.
(4) Based on the total scores of cases and controls, Fisher’s exact test was conducted for gene-set j.
Steps (2), (3) and (4) were repeated to obtain a P value for every gene-set.

Functional clusters were created to help interpret the large number of significant gene-sets after testing. Cytoscape 2.8.1 (http://www.cytoscape.org/) was used for creating functional clusters of significant gene-sets. The clustering rules were as follows. Define “support-genes” as those genes more frequently overlapped by CNVs in cases than in controls. A node represents a gene-set, and node size is proportional to the number of genes in that gene-set. Two gene-sets are connected by an edge if they share one or more support-genes. Edge width is the proportion of shared support-genes relative to the total genes in the two gene-sets.
Supplementary Figure Legend

Supplementary Figure 1: Workflow of CNV detection

Supplementary Figure 2: Quantitative PCR of duplications in the PARK2 region in four narcolepsy patients. Red bars indicate the patients (CN=3_1 to CN=3_4) with duplications (copy number = 3). Blue bars indicate healthy controls (CN=2_1 to CN=2_8) with normal copy number (copy number = 2). The CNV ratio is the ratio of copy number; if the copy number is 2, the ratio is expected to be 1. Error bars on top of the bars show 1 standard deviation. Each t-test between a patient and the controls reached significance (P = 3.26×10⁻¹⁶).

Supplementary Figure 3: Quantitative PCR of duplications in the PARK2 region in two EHS patients. Red bars indicate the patients (CN=3_5, CN=3_6) with duplications (copy number = 3). Blue bars indicate healthy controls (CN=2_9 to CN=2_22) with normal copy number (copy number = 2). The CNV ratio is the ratio of copy number; if the copy number is 2, the ratio is expected to be 1. Error bars on top of the bars show 1 standard deviation. Each t-test between a patient and the 14 controls reached significance (P = 2.07×10⁻⁸).

Supplementary Table Legend

Supplementary Table 1: Gene-sets significantly associated with narcolepsy
The table lists the 32 significant gene-sets from KEGG and the Gene Ontology at an FDR of 5%. P values were calculated using the I(i,j) score. The indicator I(i,j) = 1 if sample i carries a CNV overlapping at least one gene in gene-set j, and zero otherwise. The total I(i,j) score for gene-set j is obtained by summing over i. For testing, this sum is partitioned between cases and controls, as shown in the number-of-samples column. The number in the first column is the gene-set index used in Figure 2.
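As a worked illustration of the CNV ratio plotted in Supplementary Figures 2 and 3, the sketch below uses the standard comparative CT calculation. This is an assumption: the normalization is described only briefly in the qPCR section, so the authors' exact computation may differ in detail. The function name, its parameters, and the CT values are hypothetical, and an amplification factor of 2 per cycle (100% PCR efficiency) is assumed by default.

```python
import math

def cnv_ratio(ct_target, ct_reference, cal_ct_target, cal_ct_reference,
              amplification=2.0):
    """Relative copy number of a sample versus a normal-copy calibrator.

    Assumed standard comparative-CT form:
      delta CT       = CT(target assay) - CT(reference assay, e.g. RNase P)
      delta-delta CT = delta CT(sample) - delta CT(calibrator)
      ratio          = amplification ** (-delta-delta CT)
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = cal_ct_target - cal_ct_reference
    ddct = delta_sample - delta_calibrator
    return amplification ** (-ddct)

# Hypothetical CT values: a sample that matches the calibrator gives a
# ratio of 1, consistent with copy number = 2.
normal = cnv_ratio(25.0, 25.0, 25.0, 25.0)

# A target amplifying log2(1.5) cycles earlier corresponds to 1.5 times
# the template, which is what a duplication (copy number = 3) would give.
duplicated = cnv_ratio(25.0 - math.log2(1.5), 25.0, 25.0, 25.0)

print(normal, round(duplicated, 3))
```

Under these assumptions, a duplication carrier is expected to show a ratio near 1.5 and a control a ratio near 1, which matches the expectation stated in the legends that a copy number of 2 gives a ratio of 1.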