Methods for ARIC Carotid MRI Genotyping Project

Gene Selection

Candidate genes related to atherosclerosis were identified by the ARIC investigators and provided to the ARIC DNA laboratory for compilation and verification (n=281).

SNP Selection

TagSNPs within these genes were derived using the Haploview program (http://www.broad.mit.edu/mpg/haploview/) based on two sources of SNPs: the Caucasian (CEU) and Yoruban (YRI) populations from the International HapMap Project (http://www.hapmap.org/). The data were analyzed in a race-specific manner on a gene-by-gene basis using the gene definitions provided by the HapMap database.

Nonsynonymous SNPs were selected by an automated search of public SNP databases. dbSNP (http://www.ncbi.nlm.nih.gov/SNP/index.html) was used as the primary source for the data on each gene. The UC Santa Cruz Genome Assembly (http://genome.ucsc.edu/) was used to supplement dbSNP where no data were available and to help resolve ambiguous data.

The algorithm used for tagSNP selection was Haploview’s implementation of the Broad Institute’s Tagger software. The r-squared cutoff for Tagger was set to 0.8 and the LOD threshold to 2. In addition, Tagger was used in aggressive multi-marker mode. SNPs with a minor allele frequency (MAF) of less than 0.05 were excluded from consideration before the tagSNPs were calculated.

All tagSNPs selected by Tagger for the CEU population were included in the SNP panel. In the YRI population, tagSNPs that were not in blocks or that tagged only themselves were not included. Nonsynonymous SNPs with a MAF >0.05 were included, as were a limited number of additional candidate SNPs if provided by an ARIC investigator. The final SNP set for each gene was determined by taking the union of the four SNP sets (nonsynonymous SNPs, tagSNPs from each population, and PI-requested SNPs) for that gene.

The overall SNP set is time-dependent and is likely to change as the data at the various SNP databases are refined or expanded.
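The per-gene assembly rule described above (all CEU tagSNPs, filtered YRI tagSNPs, nonsynonymous SNPs above the MAF cutoff, and PI-requested SNPs, combined by set union) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ARIC DNA laboratory's actual pipeline: the function name, the input structures, and the placeholder SNP IDs are hypothetical, a single MAF value per SNP is assumed, and the "not in blocks" condition for YRI tagSNPs is simplified to "tags at least one SNP other than itself".

```python
MAF_CUTOFF = 0.05  # minor allele frequency threshold stated in the text


def assemble_gene_panel(ceu_tags, yri_tag_map, nonsynonymous, pi_requested, maf):
    """Union of the four per-gene SNP sets (illustrative sketch only).

    ceu_tags:      set of CEU tagSNP IDs (all are kept)
    yri_tag_map:   dict mapping each YRI tagSNP to the set of SNPs it captures
    nonsynonymous: set of nonsynonymous SNP IDs
    pi_requested:  set of investigator-requested SNP IDs
    maf:           dict mapping SNP ID to minor allele frequency (assumed input)
    """
    # Keep a YRI tagSNP only if it captures at least one SNP besides itself.
    yri_kept = {tag for tag, captured in yri_tag_map.items() if captured - {tag}}
    # Keep nonsynonymous SNPs whose MAF exceeds the cutoff.
    nonsyn_kept = {snp for snp in nonsynonymous if maf.get(snp, 0.0) > MAF_CUTOFF}
    return set(ceu_tags) | yri_kept | nonsyn_kept | set(pi_requested)


# Placeholder IDs, not real SNPs.
example = assemble_gene_panel(
    ceu_tags={"rsA", "rsB"},
    yri_tag_map={"rsC": {"rsC", "rsD"}, "rsE": {"rsE"}},
    nonsynonymous={"rsF", "rsG"},
    pi_requested={"rsH"},
    maf={"rsF": 0.12, "rsG": 0.01},
)
print(sorted(example))
```

In this toy input, "rsE" is dropped because it tags only itself, and "rsG" is dropped because its MAF is below 0.05.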
At the time this SNP panel (n=6,890) was created, we used Haploview v. 3.32pr, HapMap Data Rel 22/phase Apr 07, and NCBI build 36/dbSNP build 126.

Sample Selection

The ARIC coordinating center provided the ARIC DNA lab with a pull list of 2,110 Carotid MRI individuals. Eight individuals had withdrawn consent to use their DNA and were not included in the sample set used for genotyping.

Genotyping

After SNP selection, assay design, and oligonucleotide manufacturing, a total of 6,104 SNPs in 281 genes were selected for genotyping. Illumina’s FastTrack Genotyping Services (San Diego, CA) was used to complete the ARIC Carotid MRI genotyping, and DNA samples (n=2,101) were shipped to Illumina in June 2007. A custom-designed iSelect Infinium BeadChip with 7,600 bead types was used (http://www.illumina.com/pages.ilmn?ID=158) to generate the genotype data.

Genotyping Quality Control

After genotyping was complete, the following exclusion criteria were applied, which resulted in a final data set of 5,266 SNPs that was provided to the ARIC coordinating center on October 3, 2007:

    Failed production genotyping (i.e., assay did not work) (n=346)
    Missing data > 20% (n=20)
    Monomorphic in all samples (n=472)

Hardy-Weinberg Equilibrium (HWE) was not used as an exclusion criterion for data submission; race-specific HWE should therefore be calculated, and a cutoff value determined, on an individual basis.

Known replicate samples were included in the plate set for genotyping, and concordance of these duplicated samples across all SNPs was 99.998% (126,674 / 126,676 pairs matched). The following table describes the two non-matching pairs.

    SNP_NAME    ID       Genotype1   Genotype2
    rs4649182   QC_E67   A/A         A/T
    rs81663     QC_E67   T/T         T/A

Blind duplicate samples were also included, and these quality control results will be calculated and distributed by the ARIC coordinating center.

Of the 2,101 samples, 31 failed to produce genotype results for all SNPs. These 31 samples remained in the final data set for completeness, but all of their genotypes were set to missing (“XX”).
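Because HWE was not applied as an exclusion criterion, users of the data set need to run their own race-specific HWE check. The document does not specify a test, so the sketch below uses a 1-degree-of-freedom chi-square goodness-of-fit test as one common, simple option; an exact test is often preferred when genotype counts are small. The function name and the example counts are hypothetical. Genotype counts should be tallied within one race group and should exclude missing ("XX") genotypes.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi2, p_value) for a 1-df chi-square test of HWE.

    n_aa, n_ab, n_bb are observed genotype counts for one SNP within a
    single race group, with missing genotypes already excluded.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic SNP: the test is not informative
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: complete heterozygote deficit at allele frequency 0.5.
chi2, p_value = hwe_chi_square(50, 0, 50)
print(chi2, p_value)
```

The p-value cutoff for flagging or excluding a SNP is left to the analyst, as the text indicates.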