SNP Analysis (GAW15 data)
High Density SNP Analysis of 642 Caucasian Families with Rheumatoid Arthritis Identifies Two Novel Regions of Linkage on Chromosomes 11p12 and 2q33
Christopher I. Amos, Wei V. Chen, and NARAC

Previous Genome Scans using Microsatellites
- We did two independent genome-wide scans using microsatellites at 10 cM intervals.
- Outside of the MHC, no genetic region has been identified that meets accepted criteria for “definite” linkage (Table 4).
- A meta-analysis of data published by groups in the U.K., Europe, and the U.S. provided evidence (p<0.01) for loci influencing RA risk on chromosomes 6p, 6q, 16 (centromeric) and 12p.

Current Status for Worldwide RA Gene Searching
- Consistent associations clearly implicate the major histocompatibility complex (MHC) on chromosome 6p21 in risk for rheumatoid arthritis (RA).
  - The MHC region makes the largest single contribution (relative recurrence risk ~1.8) to disease susceptibility [7].
  - A set of alleles at the DRB1 locus, many of which share a common polymorphic sequence, the “shared epitope”, explains a large portion, but not all, of the genetic risk within the MHC.
- Recently, the R620W variant of the PTPN22 locus on chromosome 1p13 has been shown to confer increased risk for rheumatoid arthritis, with odds ratios ranging between 1.5 and 2.0 for heterozygotes, and over 3.0 for homozygous carriers of the variant.
  - This finding has been extensively replicated, and is now accepted as the most robust genetic association with RA outside of the MHC.
  - Interestingly, the 620W PTPN22 allele is also associated with several other autoimmune disorders, including type 1 diabetes, autoimmune thyroid disease, systemic lupus erythematosus and some forms of juvenile arthritis.
  - Additional variability in the PTPN22 locus may account for a smaller proportion of risk for RA.
- Associations with CTLA4 (chromosome 2q33.1) and PADI4 (chromosome 1p36) have also been supported in some additional studies, suggesting that these genes may also contribute to RA susceptibility in Caucasian populations, but with rather modest relative risks.

Goals of this study
- Obtain refined evidence for linkage using a larger sample size and a denser map of markers, thus improving informativity
- Evaluate evidence for linkage by clinical strata to identify more homogeneous subgroups
- Compare microsatellite and SNP scans

Largest Single Genome Wide Linkage Scan in RA
- >5,700 informative SNP markers (Illumina IV SNP linkage panel)
  - Data from 5744 markers passed quality control requirements in the lab (98.1%)
  - Hardy-Weinberg disequilibrium detected in 5/4995 markers at p<=0.001 (excluding chromosomes 6, X and XY)
  - 18 markers on the Y chromosome were dropped; 866 markers were dropped for strong LD (D’>0.7) from many analyses
  - 293 markers are on chromosome X, 19 on XYp and 7 on XYq
- 642 Caucasian families containing affected sibling pairs with rheumatoid arthritis, recruited by the North American Rheumatoid Arthritis Consortium (NARAC)
  - Table 1. Structure and sampling of 642 Caucasian sibling pair families studied

Informativity of SNP scan
Improved information content (entropy):
- Microsatellites: 0.526
- Including all SNP markers: 0.756
- After excluding SNP markers in LD: 0.749

Information content is a measure of how informative a marker or map of markers is in a collection of pedigrees in order to extract the maximum amount of inheritance information for a linkage analysis. Information content is a function of marker heterozygosity and the number of meioses in the genetic study. For multipoint linkage analysis, information content is also a function of marker density and spacing.
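The entropy-based measure can be illustrated with a small sketch. The slides do not spell out the formula, so the code assumes the commonly used definition IC = 1 - E/E0, where E is the Shannon entropy of the probability distribution over possible inheritance patterns at a locus given the marker data, and E0 is the entropy of the uniform prior. The function name and the example probabilities are hypothetical.

```python
import math

def information_content(posterior):
    """Entropy-based information content, 1 - E/E0 (assumed definition).

    posterior: probabilities over the possible inheritance patterns at one
    locus, given the marker data. E0 is the entropy of a uniform prior
    over the same patterns.
    """
    n = len(posterior)
    if n < 2:
        raise ValueError("need at least two inheritance patterns")
    if abs(sum(posterior) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Shannon entropy in bits; zero-probability patterns contribute nothing.
    entropy = -sum(p * math.log2(p) for p in posterior if p > 0)
    prior_entropy = math.log2(n)
    return 1.0 - entropy / prior_entropy

# Hypothetical example with four inheritance patterns.
uninformative = information_content([0.25, 0.25, 0.25, 0.25])  # markers say nothing
fully_informative = information_content([1.0, 0.0, 0.0, 0.0])  # pattern is known
partial = information_content([0.5, 0.5, 0.0, 0.0])            # half the uncertainty removed
```

Under this definition a value of 0 means the markers carry no inheritance information and 1 means inheritance is fully resolved, which is the scale on which the 0.526 (microsatellite) and 0.756 (SNP) figures are compared.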
It is important to have high information content throughout the genome in genome-wide searches for disease susceptibility loci or other traits, so that regions of no linkage can be excluded, regions of significant linkage can be detected, and the linkage interval can be accurately defined.

Statistical Analysis
- Applied SNPLINK, a set of Perl scripts that manage SNP data and can call Merlin
- Summarized evidence for linkage using Kong and Cox LOD scores from Merlin
- Analyzed the complete data, or the data after eliminating markers showing D’ > 0.7
- Reanalyzed pseudoautosomal regions after balancing same-sex and opposite-sex pairs

Removal of Markers in LD
- Linkage analysis of tightly linked loci can lead to an excess of false positive results if the markers are in strong linkage disequilibrium and parents are not available for genotyping (Huang et al., 2004).
- We chose to remove markers that showed D’ values greater than 0.7, because earlier analyses on simulated data have shown little inflation in LOD score if this criterion is used.
  - 866 markers were dropped for strong LD (D’>0.7) from analyses (15.12%); when only autosomal chromosomes are considered, 784 out of 5407 markers tested were dropped (14.50%).
  - Over all markers, there was an average decrease in LOD score of 0.129 when markers in linkage disequilibrium were dropped. For only the autosomal chromosomes, dropping markers in LD led to a decrease of 0.120 in LOD score.
  - Dramatic decreases in LOD scores were noted on a few chromosomes when markers in LD were dropped; for example, on chromosome 21 the LOD score decreased from 11.59 to 1.11. The most prominent of these instances involved markers located at the telomeres, and can be explained by the lack of any flanking markers that are not in linkage disequilibrium with the set of markers.
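For readers unfamiliar with the LD measures used as cutoffs, the following is a minimal sketch of the standard pairwise definitions of D’ and r-squared for two biallelic markers. It is not the procedure used in the study (the slides do not describe how LD was estimated); the function name and the haplotype frequencies in the example are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B, each strictly
              between 0 and 1
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2

# Hypothetical pair: the rarer allele B (10%) is always found with allele A (50%).
d_prime, r2 = ld_stats(p_ab=0.1, p_a=0.5, p_b=0.1)
exceeds_dprime_cutoff = d_prime > 0.7   # the D' criterion used on the slide
exceeds_r2_cutoff = r2 > 0.16           # one of the alternative r-squared cutoffs
```

In this made-up example D’ is at its maximum while r-squared stays near 0.11, so the pair would be flagged by the D’ > 0.7 rule but not by an r-squared > 0.16 rule. This is one reason the two kinds of cutoff can remove different numbers of markers.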
Effects of Different Cutoffs for LD Removal
- R-square > 0.16: fewer markers dropped
- R-square > 0.05: also didn’t qualitatively change the results
  - The LOD scores on chromosomes 2, 4, 7, 10 and 11 increased slightly, with the maximum increase being from 2.35 to 2.55 on chromosome 4, while on chromosomes 5, 6, 12, 16 and 18 there were modest decreases in LOD score, with the largest decreases being on chromosome 18 (from 1.47 to 1.08) and chromosome 5 (from 2.55 to 2.32).
- The overall moderate changes in LOD scores argue strongly against false positive results due to LD after adjustment for LD (D’< 0.7) between markers.

Effects of Untyped Parents
- We checked the LOD scores among families with no parents genotyped (59%) versus families with at least one genotyped parent (41%) for the major regions of linkage on chromosomes 2, 4, 7, 10 and 11.
  - LOD scores remained positive in all family groups. On chromosomes 2, 7 and 11 the LOD scores from the families with one or more parents typed were higher, while on chromosomes 4 and 10 the LOD scores were higher for the set of families without typed parents.
- The only potential region of concern occurs on chromosome 4, for which the maximum LOD score in families without typed parents was 2.50, while among the families with at least one genotyped parent the LOD score was only 0.08.
  - However, the finding that LOD scores on chromosome 4 do not decrease when restricting to R-squared values for LD among adjacent markers of 0.05 or less indicates that the evidence for linkage on this chromosome does not reflect a false positive due to LD among tightly linked markers.

Effects of Large Centromere
Since genetic maps are not available for the majority of SNPs in this panel, we assumed that 1 megabase is 1 centiMorgan. Near the centromeres this mapping approach is inaccurate, because the centromeric regions have suppressed recombination.
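The 1 Mb = 1 cM assumption can be made concrete with a short sketch that converts physical positions to approximate genetic positions and then to a recombination fraction. Haldane's map function is used here purely as an illustrative choice; the slides do not say which map function was applied. The helper names are hypothetical, and the two positions are the chromosome 2 peak SNP positions reported in the results.

```python
import math

def bp_to_cm(position_bp, cm_per_mb=1.0):
    """Approximate genetic position, assuming a constant cM-per-Mb rate."""
    return position_bp / 1e6 * cm_per_mb

def haldane_theta(distance_cm):
    """Recombination fraction for a map distance under Haldane's map function."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))

# Chromosome 2 peak SNP positions from the results (base pairs).
pos_a = bp_to_cm(189_649_014)
pos_b = bp_to_cm(192_602_000)
distance_cm = pos_b - pos_a          # about 2.95 cM under the 1 Mb = 1 cM assumption
theta = haldane_theta(distance_cm)   # about 0.029
```

A constant rate like this overstates genetic distance wherever recombination is suppressed, which is the concern raised for the centromeric regions.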
We repeated the analysis setting the recombination to zero in the centromeric regions of the 6 chromosomes with large centromeres (1, 3, 9, 11, 16, 19). Excluding recombination in these regions led to slightly lower LOD scores.

Effects of Markers in Hardy-Weinberg Disequilibrium
- Hardy-Weinberg disequilibrium was detected in 5/4995 markers at p<=0.001 (excluding chromosomes 6, X and XY).
  - Chromosome 6: known strong genetic factors
  - The observed count conforms well to the expected frequency of Hardy-Weinberg disequilibrium with this large set of markers.
- Of note, marker rs238510 at 103.49 megabases, in the linkage peak on chromosome 4 and just proximal to the potential candidate gene B-cell scaffold protein with ankyrin repeats 1 (BANK1), showed a significant (p<=0.001) departure from Hardy-Weinberg equilibrium.
- Departures from Hardy-Weinberg equilibrium can occur for numerous reasons, including genotype call failures and association between marker alleles and disease susceptibility. Since the latter would be expected for certain regions showing strong evidence for linkage, we here only report results including all markers.
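As an illustration of the Hardy-Weinberg check, here is a minimal sketch using the textbook one-degree-of-freedom chi-square test on genotype counts. The slides do not state which test was used, and this simple version treats genotypes as coming from unrelated individuals, which is a simplification for family data. The function name and genotype counts are hypothetical; the marker count and threshold are taken from the slide.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for departure from Hardy-Weinberg proportions.

    Returns (chi2, p_value) for observed genotype counts AA, AB, BB.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical marker with a strong excess of heterozygotes.
chi2, p_value = hwe_chi_square(10, 80, 10)
flagged = p_value <= 0.001

# Number of markers expected to pass the threshold by chance alone.
expected_by_chance = 4995 * 0.001   # about 5, matching the 5 markers observed
```

The last line shows why 5 flagged markers out of 4995 is unremarkable: at a threshold of 0.001, about 5 markers would be expected to cross it even if every marker were in equilibrium.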
[Figures: LOD score versus physical position (Mb). Panels comparing microsatellites, SNPs without markers in LD (wo LD) and SNPs with markers in LD (w LD): Chromosome 1; SNPs Identify a 2q Linkage; SNPs Identify an 11p Linkage. Panels comparing w-LD and wo-LD SNP analyses: Chromosomes 4, 5, 6, 7, 10 and 18.]

Reanalysis of pseudoautosomal regions after balancing same-sex and opposite-sex pairs
- 19 markers are on XYp and 7 are on XYq.
- Selection of sib pairs that include an excess of same-sex pairs leads to an excess of sharing in the pseudoautosomal region, while opposite-sex pairs show a decrease in sharing.
- There are more affected female-female sib pairs in the dataset than affected female-male or male-male sib pairs. So after dropping male-male affected pairs, there are still more female-female pairs than female-male pairs (not an equal number of same-sex and opposite-sex pairs).
- Two ways were used to bring female-male pairs into balance with sex-concordant pairs:
  1. After dropping male-male pairs and 180 (37%) of female-female pairs, the LOD scores for all SNPs dropped to less than 0 after LD removal.
  2. When including all male-male pairs and dropping 236 (49%) of female-female pairs, the LOD scores for all SNPs also dropped to less than 0 after LD removal.
Results

| Chromosome | LOD (w-LD) | LOD (wo-LD) | Peak SNP (w-LD) | Position (w-LD) | Peak SNP (wo-LD) | Position (wo-LD) |
|---|---|---|---|---|---|---|
| 1 | 1.65 | 1.41 | rs335523 | 211882862 | rs1547502 | 216620855 |
| 2 | 4.02 | 3.52 | rs1354905 | 189649014 | rs1949429 | 192602000 |
| 4 | 3.78 | 2.35 | rs223383 | 104209760 | rs1384401 | 105023898 |
| 5 | 3.06 | 2.55 | rs1857844 | 24569227 | rs903391 | 43206838 |
| 6p | 18.53 | 16.14 | rs169679 | 28964566 | rs11908 | 32991663 |
| 6q | 5.32 | 0.84 | | | | 14880100 |

Candidate Genes Near Linkage Peaks

| Chr | SNP | Genes |
|---|---|---|
| 1 | rs1547502 | LYPLAL1, TGFB2, ZNT8, EPRS |
| 2 | rs1949429 | MYO1B, STAT1, STAT4, GLS, serum deprivation response (phosphatidylserine binding protein), transmembrane protein with EGF-like and two follistatin-like domains 2 |
| 4 | rs1384401 | Tachykinin receptor 3, CENPE, MANBA, NFKB, BANK1 (B-cell scaffold protein with ankyrin repeats 1) |

Results

| Chromosome | LOD (w-LD) | LOD (wo-LD) | Peak SNP (w-LD) | Position (w-LD) | Peak SNP (wo-LD) | Position (wo-LD) |
|---|---|---|---|---|---|---|
| 7 | 2.49 | 1.94 | rs903898 | 146244418 | rs2040587 | 114230026 |
| 8 | 1.61 | 0.91 | rs1735173 | 146076058 | rs1375956 | 106521094 |
| 10 | 3.76 | 2.54 | rs579142 | 60472666 | rs1227938 | 70502871 |
| 11 | 3.92 | 3.09 | rs2035693 | 39201341 | rs1462224 | 41024049 |
| 12 | 5.09 | 1.46 | rs7960480 | 132017084 | rs2009625 | 22216512 |
| 16 | 1.73 | 1.69 | rs1946155 | 53412897 | rs1946155 | 53412897 |

Candidate Genes Near Linkage Peaks

| Chr | SNP | Genes |
|---|---|---|
| 5 | rs903391 | Chemokine L28, Selenoprotein SEPP1, HMG-CoA synthase H1C |
| 7 | rs2040587 | FoxP2, protein phosphatase 1, regulatory (inhibitor) subunit 3A, desert |
| 10 | rs1227938 | Tachykinin receptor 2, HK1, transmembrane 4 superfamily member tetraspan NET-7, neurogenin 3, PROTEOGLYCAN 1 |
| 11 | rs1462224 | Desert, NGL1, TRAF6 (4Mb), CD44, API5 |

Results

| Chromosome | LOD (w-LD) | LOD (wo-LD) | Peak SNP (w-LD) | Position (w-LD) | Peak SNP (wo-LD) | Position (wo-LD) |
|---|---|---|---|---|---|---|
| 17 | 1.74 | 0.97 | rs411602 | 69136700 | rs764426 | 67565205 |
| 18 | 1.9 | 1.47 | rs1792723 | 52015361 | rs663220 | 52349253 |
| 19 | 4.52 | 0.21 | rs1465789 | 63637868 | rs306450 | 61192862 |
| 20 | 3.11 | 1.02 | rs1434789 | 132900 | rs914433 | 55196924 |
| 21 | 11.59 | 1.11 | rs2835629 | 37442056 | rs2837710 | 40861199 |
| X | 3.22 | 1.17 | rs2015312 | 56385990 | rs2057652 | 10529107 |
| XY | 8.69 | 3.91 | rs2535444 | 2282688 | rs700447 | 153441492 |

Candidate Genes Near Linkage Peaks

| Chr | SNP | Genes |
|---|---|---|
| 12 | rs2009625 | Cytidine 5-prime-monophosphate N-acetylneuraminic acid synthetase, ABCC9, STAT8A |
| 16 | rs1946155 | RBL2, FTS, fto (Fatso), MMP2, SLC6A2 |
| XY | rs1462224 | SYBL1, SPRY3, IL9R |

Stratified Analysis
- Previous studies identified major differences among some strata, notably when stratifying by sex and shared epitope, and when adjusting for anti-CCP levels.
- Individuals with anti-CCP titers less than 20 were categorized as negative, and individuals with anti-CCP titers of 20 or higher were classified as positive.

Conclusions
- Strong evidence for linkage to chromosomes 2, 6 and 11
  - Prominent and previously unreported linkage peaks are observed on chromosomes 2q33 and 11p12 (within a ‘gene desert’), with LOD scores of 3.52 and 3.09 respectively, after adjustment for LD (D’< 0.7) between markers.
  - Broad linkage signal on chromosome 6
- Stratification/covariate effects were most pronounced for anti-CCP strata, for chromosomes 4, 5, 6 and 7.
- HLA shared epitope status and sex had a significant impact on linkage evidence within the MHC region of chromosome 6 only.

Future Study
- Follow-up association studies on the 2q33 and 11p12 regions, and on the anti-CCP+ disease subset
- Fine mapping and gene identification, supplemented by the selection of candidate genes based on evolving knowledge of the disease

The End. Thank you!