Supplementary information
CNV validation by quantitative PCR
Quantitative PCR (qPCR) was performed on a LightCycler 480
System (Roche Applied Science). The TaqMan Copy Number Assay (Applied
Biosystems) was selected via the Life Technologies website
(http://www.appliedbiosystems.jp/). As a reference gene, we used the
commercially available TaqMan Copy Number Reference Assay RNase P.
Each sample was examined in a 10 µl reaction mixture (TaqMan® Genotyping
Master Mix: 5 µl, TaqMan® Copy Number Assay 20× working stock: 0.5 µl,
TaqMan Copy Number Reference Assay 20×: 0.5 µl, nuclease-free water: 2 µl
and 5 ng/µl genomic DNA: 2 µl). The qPCR thermal cycling conditions were as
follows: an initial hot-start step at 95 °C for 10 minutes, followed by 40
cycles of 95 °C for 15 seconds and 60 °C for 1 minute. The PCR efficiency of
each assay was estimated from calibration curves of mixed DNA from three
different samples with normal copy numbers, serially diluted from 40 ng to
2.5 ng of genomic DNA. Data were then analyzed using the ∆∆CT
method. The CNV ratio was the ∆∆CT ratio normalized to the calibration
samples with normal copy numbers (=∆∆CTsample/∆∆CTcalibration). Experiments
in triplicate were repeated 4 times. Differences in CNV ratio between each
case and 8 controls were tested using a t-test (Supplementary Figure 2).
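The two calculations described above can be illustrated with a minimal sketch. It assumes the conventional formulation, in which efficiency is derived from the slope of Ct against log10 input DNA and the relative copy number is (1 + E)^(-∆∆CT). The normalization actually used in this study may have been computed differently. The function names and the single shared efficiency for the target and reference assays are assumptions made for illustration.

```python
import math


def pcr_efficiency(input_ng, ct_values):
    """Estimate amplification efficiency from a dilution series.

    Fits Ct = slope * log10(input) + intercept by least squares and
    returns E = 10 ** (-1 / slope) - 1 (E = 1.0 means perfect doubling).
    """
    xs = [math.log10(v) for v in input_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return 10 ** (-1.0 / slope) - 1.0


def relative_copy_number(ct_target, ct_reference,
                         ct_target_cal, ct_reference_cal,
                         efficiency=1.0):
    """Relative copy number of a sample versus a calibrator.

    delta Ct = Ct(target) - Ct(reference); delta-delta Ct is the sample
    delta Ct minus the calibrator delta Ct. Assumes (hypothetically) the
    same efficiency for the target and reference assays.
    """
    ddct = (ct_target - ct_reference) - (ct_target_cal - ct_reference_cal)
    return (1.0 + efficiency) ** (-ddct)
```

Under this formulation, a sample with the same ∆CT as the calibrator gives a ratio of 1, and a duplication (three copies against two) gives a ratio of 1.5.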
Pathway analysis
In the pathway analysis, we searched for gene-sets that are more
frequently affected by rare and large CNVs in the patients compared with
controls. The pathway analysis consisted of three parts: deriving gene-sets,
testing, and creating functional clusters of gene-sets. Gene-sets were derived
from the Gene Ontology (http://www.geneontology.org/) and from KEGG
(http://www.genome.jp/kegg/). R version 2.15.1 (http://www.r-project.org/)
and Bioconductor 2.11 (http://www.bioconductor.org/) were used for deriving
gene-sets from the databases (“GO.db” version 2.5.0 and “KEGG.db” version
2.5.0), gene symbol to Entrez ID conversion (“org.Hs.eg.db” version 2.5.0)
and calculation of tests. Gene-sets with sizes > 700 genes or with sizes < 5
genes were excluded, because very large and very small gene-sets are less
likely to yield biologically meaningful results.
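The size filter can be sketched as follows. The analysis itself was run in R; this Python fragment only illustrates the rule, and the function name and the dictionary layout (gene-set name mapped to a set of gene IDs) are assumptions.

```python
def filter_gene_sets(gene_sets, min_size=5, max_size=700):
    """Keep gene-sets whose size lies in [min_size, max_size].

    gene_sets: dict mapping a gene-set name to a set of gene IDs.
    Sets with fewer than 5 or more than 700 genes are dropped.
    """
    return {
        name: genes
        for name, genes in gene_sets.items()
        if min_size <= len(genes) <= max_size
    }
```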
To test the differences between cases and controls, Fisher’s exact
test was used based on the number of samples whose rare and large CNVs
disrupted at least one gene in a gene-set. The calculation procedure is as
follows. (1) Let i index samples and j index gene-sets. (2) Define indicator
I(i,j) = 1 if sample i carries a CNV overlapping at least one gene in gene-set
j, and zero otherwise. In other words, a sample can contribute to multiple
gene-sets but cannot contribute more than once to the same gene-set. (3) The
total score for gene-set j is obtained by summing over i. For testing, this sum
is partitioned between cases and controls. (4) Based on the total scores of
cases and controls, Fisher’s exact test for gene-set j was conducted. Steps
(2), (3) and (4) were repeated to obtain a P value for every gene-set.
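Steps (2) to (4) can be sketched for a single gene-set as below. This is an illustration, not the code used in the study. The function names and input layout are hypothetical, and because the text does not state whether the test was one-sided or two-sided, the sketch assumes a two-sided test.

```python
from math import comb


def carrier_count(sample_cnv_genes, gene_set):
    """Sum of I(i, j) over samples i for one gene-set j.

    sample_cnv_genes: dict mapping sample ID to the set of genes
    overlapped by that sample's rare, large CNVs. Each sample counts
    at most once, however many genes in the set it hits.
    """
    return sum(1 for genes in sample_cnv_genes.values() if genes & gene_set)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def gene_set_p_value(case_cnv_genes, control_cnv_genes, gene_set):
    """Partition the carrier count between cases and controls and test it."""
    a = carrier_count(case_cnv_genes, gene_set)
    c = carrier_count(control_cnv_genes, gene_set)
    b = len(case_cnv_genes) - a
    d = len(control_cnv_genes) - c
    return a, c, fisher_exact_two_sided(a, b, c, d)
```

Calling `gene_set_p_value` once per gene-set reproduces the loop over j; the resulting P values would then still need multiple-testing correction.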
Functional clusters were created to interpret the large number of
significant gene-sets after testing. Cytoscape 2.8.1
(http://www.cytoscape.org/) was used for creating functional clusters of
significant gene-sets. The clustering rules are as follows. Define “support-genes”
as those genes more frequently overlapped by CNVs in cases than in controls.
If two gene-sets share some support-genes, then these two gene-sets are
connected. A node represents a gene-set, and the node size is proportional to
the number of genes included in the gene-set. An edge indicates that two
gene-sets share support-genes, and the edge width is the proportion of shared
support-genes relative to the total genes within the two gene-sets.
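The clustering rules can be sketched as a small graph-building routine. The study used Cytoscape for this step, so the fragment below only illustrates the rules. It assumes that "total genes within two gene-sets" means the union of the two sets, and that the support-genes have already been identified; the function name is hypothetical.

```python
from itertools import combinations


def build_gene_set_network(gene_sets, support_genes):
    """Return (nodes, edges) for a gene-set network.

    nodes: dict mapping gene-set name to its size (drives node size).
    edges: dict mapping a pair of names to the edge width, i.e. the
    number of shared support-genes divided by the size of the union
    of the two gene-sets (the union is an assumption of this sketch).
    """
    nodes = {name: len(genes) for name, genes in gene_sets.items()}
    edges = {}
    for name_a, name_b in combinations(sorted(gene_sets), 2):
        genes_a, genes_b = gene_sets[name_a], gene_sets[name_b]
        shared = genes_a & genes_b & support_genes
        if shared:  # connect only sets sharing at least one support-gene
            edges[(name_a, name_b)] = len(shared) / len(genes_a | genes_b)
    return nodes, edges
```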
Supplementary Figure Legend
Supplementary Figure 1: Workflow of CNV detection
Supplementary Figure 2: Quantitative PCR of duplications in the PARK2 region
in four narcolepsy patients. Red bars indicate the patients (CN=3_1 to
CN=3_4) with duplications (copy number = 3). Blue bars indicate
healthy controls (CN=2_1 to CN=2_8) with normal copy number (copy
number = 2). The CNV ratio is the ratio of copy numbers; if the copy number
is 2, the ratio is expected to be 1. Error bars show 1 standard deviation.
Each t-test between patients and controls reached significance (P =
3.26 × 10⁻¹⁶).
Supplementary Figure 3: Quantitative PCR of duplications in the PARK2 region
in two EHS patients. Red bars indicate the patients (CN=3_5, CN=3_6)
with duplications (copy number = 3). Blue bars indicate healthy controls
(CN=2_9 to CN=2_22) with normal copy number (copy number = 2). The CNV
ratio is the ratio of copy numbers; if the copy number is 2, the ratio is
expected to be 1. Error bars show 1 standard deviation. Each t-test between
a patient and 14 controls reached significance (P = 2.07 × 10⁻⁸).
Supplementary Table Legend
Supplementary Table 1: Gene-sets significantly associated with narcolepsy
The table lists the 32 significant gene-sets from both KEGG and the
Gene Ontology at an FDR of 5%. The P value was calculated using the I(i,j)
score. Define indicator I(i,j) = 1 if sample i carries a CNV overlapping at
least one gene in gene-set j, and zero otherwise. The total I(i,j) score for
gene-set j is obtained by summing over i. For testing, this sum is partitioned
between cases and controls, as shown in the column giving the number of
samples. The number in the first column is the gene-set index used in Figure 2.