Loss-of-co-Homozygosity mapping and exome sequencing of a Syrian pedigree identified the candidate causal mutation associated with rheumatoid arthritis.

Yukinori Okada1,2, Namrata Gupta2, Daniel Mirel2, Stacey Gabriel2, Thurayya Arayssi3, Faten Mouassess4, Walid AL. Achkar4, Layla A. Kazkaz5,6, Robert M. Plenge1,2.

1. Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA.
2. Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA.
3. Weill Cornell Medical College-Qatar, Education City, Doha, Qatar.
4. Molecular Biology and Biotechnology Dept, Human Genetics Division, Damascus, Syria.
5. Tishreen Hospital, Damascus, Syria.
6. Syrian Association for Rheumatology, Damascus, Syria.

Background/Purpose: Although there are >50 rheumatoid arthritis (RA) risk loci that contain common variants, there are no loci that harbor rare mutations that influence RA risk in a Mendelian fashion. Here, we perform whole exome sequencing to search for rare, causal mutations in a 4-generation, 49-person consanguineous Syrian pedigree in which 8 individuals were affected with RA.

Method: We performed GWAS genotyping on 16 family members (affected and unaffected) and whole exome sequencing in the 4 anti-CCP positive RA cases. We developed a novel non-parametric linkage analysis, which we term “Loss-of-co-Homozygosity” (LOcH) mapping, that extends homozygosity mapping to any mode of inheritance. LOcH uses genome-wide SNP data to search for regional stretches that lose one or both homozygous genotypes (i.e., lose “co-homozygosity”) in affected cases, in order to identify ancestry-shared haplotypes. Candidate mutations selected by exome sequencing and LOcH mapping were further validated by iPlex assay in 24 family members.

Result: Using GWAS data and LOcH mapping, we identified 12% of the genome in which the same ancestral haplotype was shared among all RA cases.
Exome sequencing identified 15 nonsense or missense candidate mutations shared among all cases. The iPlex validation assay found that 1 mutation preferentially segregated in cases compared to controls (P = 0.023). The mutated gene is phospholipase B1 (PLB1) at 2p23, which has been implicated in human epidermal barrier function.

Conclusion: While additional investigation of the PLB1 mutation is required, our approach highlights a novel method of statistical analysis of genome-wide sequence data.

~ Study design ~

Syrian family with RA: We enrolled a consanguineous Syrian family with rheumatoid arthritis (RA). The 49 family members include 8 RA cases (II-12,13,14, III-3,17,18, IV-5,9) and 1 anti-CCP antibody-positive control (III-2).

Analysis flow:
▪ GWAS genotyping of 4 RA cases and 12 controls.
▪ Exome sequence of 4 RA cases.
▪ Filtering of nonsense/missense variants shared among RA cases using 1000 Genomes/ESP/dbSNP databases.
▪ “LOcH mapping” of genetic locus shared among 4 RA cases.
▪ Selection of 15 candidate causal variants.
▪ Validation of the candidate variants by iPlex assay for all available 24 family members.
▪ 4 variants shared among 5 RA cases and 1 anti-CCP antibody positive control.
▪ 1 variant at PLB1 gene preferentially segregated in RA cases compared to controls.

Assays per subject group:
▪ 4 RA cases: exome sequence, GWAS genotyping, iPlex validation assay.
▪ 12 controls: GWAS genotyping, iPlex validation assay.
▪ 1 RA case, 1 anti-CCP positive control, and 6 controls: iPlex validation assay.

~ Exome sequence of RA cases ~

Subjects : 4 RA cases.
Exon capture : Agilent SureSelect Human All Exon Kit v2 (~44Mb).
Sequencer : Illumina HiSeq.

~ LOcH (Loss-of-co-Homozygosity) mapping ~

LOcH mapping identifies ancestry-shared haplotypes among affected cases and imputes candidate mutations in non-exomed subjects.

▪ In a family with Mendelian disease, the causal mutation resides on the same ancestry-shared haplotype.
▪ Regardless of recessive/dominant mode of inheritance, all the cases have at least one ancestry-shared haplotype, which should appear as “Loss-of-co-Homozygosity (LOcH)” in GWAS data.
▪ LOcH mapping can screen the loci with the causal mutation, as an extension of homozygosity mapping to a disease with unknown inheritance mode.

Genotype counts among cases (+ : genotype observed, - : genotype not observed):

Pattern                    AA   AB   BB
Co-Homozygosity            +    +    +
Co-Homozygosity            +    -    +
Loss-of-co-Homozygosity    +    -    -
Loss-of-co-Homozygosity    +    +    -
Loss-of-co-Homozygosity    -    +    -
Loss-of-co-Homozygosity    -    +    +
Loss-of-co-Homozygosity    -    -    +

~ LOcH stretches and exome-derived mutations ~

(Figure panels: LOcH stretches and SNVs; LOcH stretches and Indels.)

▪ LOcH mapping can impute the presence of the exome-derived mutation in each additional control using GWAS data in the LOcH stretch.
▪ When a control has the ancestry-shared haplotype, the LOcH stretch remains after inclusion of the control.
▪ When a control does not have the ancestry-shared haplotype, the LOcH stretch diminishes after inclusion of the control.

Exome sequence analysis : GATK pipeline at Broad (GRCh37.64).
Mean/Median depth of the variants : 290.1/204.
Genotype concordance with GWAS data : 99.56%.

Variant filtering of exome sequence data:
▪ 65,524 variants identified by exome sequence (Ts/Tv = 2.74).
▪ 14,804 missense / nonsense variants.
▪ 900 variants not in dbSNP 132/1000G Phase I/ESP 5400 with non-reference allele frequency ≥0.05.
▪ 476/156 SNVs/Indels with available genotypes in all 4 cases.
▪ 13/8 SNVs/Indels with ≥1 non-reference alleles in all 4 cases.

~ Candidate causal SNV in PLB1 gene at 2p23 ~

(Figure: Distribution of PLB1 mutation.)

▪ We conducted the iPlex validation assay of candidate causal mutations for all 24 available family subjects.
▪ 13 exome-derived SNVs and 2 Indels included in LOcH stretches were selected.
▪ 3 SNVs and 1 Indel were observed in all 5 RA cases and 1 anti-CCP positive control.
▪ Of these, a non-synonymous SNV in the phospholipase B1 (PLB1) gene at 2p23 preferentially segregated in cases compared to controls (P = 0.023).
(Figure legend: subjects with PLB1 mutation; subjects without PLB1 mutation.)

▪ LOcH mapping for 4 RA cases identified 36 LOcH stretches covering 12% of the genome.
▪ All 13 exome-derived candidate causal SNVs were included in LOcH stretches (P = 1.1×10⁻¹²).
▪ Only 2 of 8 exome-derived candidate causal Indels were included in LOcH stretches (P = 0.44).
▪ The distinct overlap rates between SNVs and Indels suggest lower quality of the exome-derived indels.
▪ Java™ software for LOcH mapping and genotype imputation is available on request from the authors.
▪ Contact : Yukinori Okada, MD, PhD, [email protected]
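
The co-homozygosity rule and the SNV overlap statistic described above can be illustrated with a minimal Python sketch. This is not the authors' Java software. The function names (is_loch, loch_stretches, binom_upper_tail), the genotype coding ("AA"/"AB"/"BB", None for a missing call) and the min_snps parameter are hypothetical choices made for illustration, and a real implementation would probably also need to tolerate genotyping errors and apply a minimum stretch length. The poster does not state which test produced its P-values; the sketch only assumes that each candidate variant falls in a LOcH stretch independently with probability 0.12 (the genome fraction covered), under which 13 of 13 SNVs gives 0.12^13, close to the reported 1.1×10⁻¹². It does not attempt to reproduce the indel P-value.

```python
from math import comb


def is_loch(case_genotypes):
    """True if the cases do NOT carry both homozygous genotypes (AA and BB).

    Assumed coding: "AA", "AB", "BB", or None for a missing call.
    """
    called = {g for g in case_genotypes if g is not None}
    return not ({"AA", "BB"} <= called)


def loch_stretches(snps, min_snps=1):
    """Find maximal runs of consecutive LOcH SNPs on one chromosome.

    snps: list of (position, [genotype of each case]) sorted by position.
    Returns a list of (start_position, end_position, number_of_snps).
    """
    stretches, run = [], []
    for position, genotypes in snps:
        if is_loch(genotypes):
            run.append(position)
            continue
        # A co-homozygous SNP ends the current run.
        if len(run) >= min_snps:
            stretches.append((run[0], run[-1], len(run)))
        run = []
    if len(run) >= min_snps:
        stretches.append((run[0], run[-1], len(run)))
    return stretches


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Toy data for 4 cases (made up for illustration, not from the study).
snps = [
    (100, ["AA", "AB", "AA", "AB"]),  # only AA and AB: LOcH
    (200, ["AB", "AB", "BB", "AB"]),  # only AB and BB: LOcH
    (300, ["AA", "BB", "AB", "AA"]),  # AA and BB both present: co-homozygous
    (400, ["BB", "BB", "BB", "AB"]),  # only AB and BB: LOcH
]
stretches = loch_stretches(snps)
print(stretches)  # [(100, 200, 2), (400, 400, 1)]

# 13 of 13 candidate SNVs inside stretches covering 12% of the genome.
p_snv = binom_upper_tail(13, 13, 0.12)
print(f"{p_snv:.2e}")
```

Adding a control's genotypes to each SNP's list and re-running loch_stretches mirrors the imputation idea in the bullets: a stretch that survives suggests the control carries the shared haplotype, while a stretch that breaks up suggests it does not.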