Compiling polymorphic miRNA-target interactions: the Patrocles database.

Samuel Hiard (1), Xavier Tordoir (2), Wouter Coppieters (2), Carole Charlier (2) and Michel Georges (2)

(1) Bioinformatics and Modeling, GIGA & Department of Electrical Engineering and Computer Science, University of Liège, Sart-Tilman B28, Liège, Belgium
(2) Unit of Animal Genomics, Department of Animal Production, Faculty of Veterinary Medicine & CBIG, University of Liège (B43), 20 Boulevard de Colonster, 4000-Liège, Belgium

Abstract

Using positional cloning, we have recently identified the mutation responsible for the muscular phenotype of the Texel sheep. It is located in the 3’UTR of the GDF8 gene, a known developmental repressor of muscle growth, and creates an illegitimate target site for miRNAs expressed in the same tissue. This causes miRNA-mediated translation inhibition of mutant GDF8 transcripts, which leads to muscle hypertrophy. We followed up on this finding by searching for common polymorphisms and mutations that affect either (i) RNAi silencing machinery components, (ii) miRNA precursors or (iii) target sites. These might likewise alter miRNA-target interaction and could be responsible for substantial differences in gene expression level. They have been compiled in a public database (“Patrocles”: www.patrocles.org), where they are classified into (i) DNA sequence polymorphisms (DSP) affecting the silencing machinery, (ii) DSP affecting miRNA structure or expression and (iii) DSP affecting miRNA target sites. DSP from the last category were organized in four classes: destroying a target site conserved between mammals (DC), destroying a non-conserved target site (DNC), creating a non-conserved target site (CNC), or shifting a target site (S). To aid in the identification of the most relevant DSP (such as those where a target site is created in an antitarget gene), we have quantified the level of coexpression for all miRNA-gene pairs.
Analysis of the numbers of Patrocles-DSP as well as their allelic frequency distribution indicates that a substantial proportion of them undergo purifying selection. The signature of selection was most pronounced for the DC class but was significant for the DNC and CNC classes as well, suggesting that a significant proportion of non-conserved targets is truly functional. The Patrocles database allowed for the selection of DSP that are likely to affect gene function and possibly disease susceptibility. The effect of these DSP is being studied both in vitro and in vivo. In conclusion, Patrocles-DSP could be widespread and underlie an appreciable amount of phenotypic variation, including common disease susceptibility.

Table 1: Categories of DNA sequence polymorphisms (DSP) affecting miRNA-mediated gene regulation

miRNA
  DSP altering the sequence of the miRNA
    - Stabilizing or destabilizing the interaction with the target
  DSP altering the concentration of the miRNA
    - Copy Number Variants encompassing the pri-miRNA
    - DSP altering the transcription rate of the pri-miRNA (cis- or trans-acting)
    - DSP affecting the processing efficiency of the pri- or pre-miRNA
Target
  DSP altering miRNA recognition sites in the target (pSNP)
    - Altering existing target sites (stabilizing or destabilizing the interaction with the miRNA)
    - Creating illegitimate target sites
  DSP altering the target’s 3’UTR
    - e.g. polymorphic polyadenylation
Silencing machinery
  DSP altering the amino-acid sequence of silencing components
  DSP altering the concentration of silencing components
  Copy Number Variants encompassing silencing components

miR-mediated translational inhibition of the Texel MSTN allele (Nature Genetics, 2006)

[Figure: Schematic representation of the MSTN gene and sequence context of the polymorphic miRNA-MSTN interaction (left). Muscle hypertrophy in Texel compared to wild-type Romanov sheep (right).]

The g+6723G-A natural polymorphism causes translational inhibition of the Texel MSTN allele by creating an illegitimate target site for two miRNAs expressed in the same tissue; this leads to muscle hypertrophy.

- Allelic imbalance of MSTN at the mRNA level: Texel allele (A) < WT allele (G) in heterozygous animals.
- Reduced circulating MSTN protein in Texel (T1) vs WT (W1).
- Reductions labelled on the panels: ~1.5X and >3X.

[Figure: allelic imbalance assay (cDNA vs genomic) and Western blot of circulating MSTN (lanes MWM, T1, W1; markers from 10 to 50 Kd).]

Compiling candidate pSNPs

Target sites were predicted with two approaches:
- X = Xie et al. 2005: putative miRNA target sites predicted by identifying octamer motifs in 3’UTRs characterized by unusually high motif conservation scores (i.e. the proportion of conserved occurrences amongst all occurrences).
- L = Lewis et al. 2005: reverse complement of (A + nucleotides 2 to 8) of the mature miRNA (miRBase).
- B = Both.

[Figure: diagrams giving the numbers of pSNPs predicted by X, L and B, in human and in mouse, for conserved and non-conserved target sites, in the classes Created, Destroyed, Polymorphic and Shifted. The individual counts are given in the original figure.]

Mutations in miRNAs

Mutations in the mature miRNA (Table 2): 6 SNP in the miR seed (yellow in the original table) and 11 SNP in the rest of the mature miR (white).

Table 2: DSP in mature miRNA

miRNA_id         nt   allele    SNP_ID
hsa-mir-627       2   A/C       rs2620381
hsa-mir-124a-3    5   G/T       rs34059726
hsa-mir-513-1     6   -/C       rs35027589
hsa-mir-662       7   G/A       rs9745376
hsa-mir-518e      7   -/A       rs34416818
hsa-mir-125a      8   G/T       rs12975333
hsa-mir-606      10   -/A       rs34610391
hsa-mir-449b     12   A/G       rs10061133
hsa-mir-520c     13   G/C       rs7255628
hsa-mir-34a      14   C/A/T     rs35301225
hsa-mir-646      14   T/G       rs6513497
hsa-mir-560      15   -/GCGG    rs10660600
hsa-mir-568      15   T/G       rs28632138
hsa-mir-581      15   G/A       rs810917
hsa-mir-92b      17   G/C       rs12759620
hsa-mir-581      21   T/G       rs1694089
hsa-mir-608      22   C/G       rs4919510

Mutations in the pre-miRNA may affect stability or processing efficacy: 71 SNP in the pre-miR, e.g.:

[Figure: predicted hairpin structures of the two alleles of a pre-miRNA, with the polymorphic position marked by an asterisk. Initial dG = -32.2 and Initial dG = -40.3.]

Mutations acting in cis or trans on the pri-miRNA promoter (or host gene) may influence transcription rate. For the 474 human miRNAs in Rfam (Oct 2006):
- 186 host genes for 229 miR (48.3%)
- 245 miR without host gene
We identified miRNA host genes characterized by inherited variation in expression levels, reasoning that this might affect the cellular concentration of passenger miRNAs. We compiled host genes influenced by both trans- and cis-acting “expression QTL” (eQTL) identified either by linkage analysis or by association studies, and host genes having shown allelic imbalance in heterozygous individuals (review by Pastinen et al., 2006; Spielman et al., 2006). At least eight host genes were found amongst the differentially regulated genes reported in these studies. An additional one shows allelic imbalance.

Copy Number Variants (CNV) may affect the number of copies of the miRNA or the integrity of the pri-miRNA host. A first CNV map of the human genome has recently been constructed (Redon et al., 2006). We found 43 miRNAs residing in regions involved in CNV: 19 without known host gene and 24 in a host gene that was completely (18) or partially (6) included in a CNV.

DSP in components of the RNA silencing machinery may affect its overall efficacy. We followed 19 genes involved in miR biology for coding SNP, CNV, eQTL and allelic imbalance: CNV encompass the Drosha and DGCR8 genes, and 6 genes present non-synonymous mutations (Table 3).

Quantifying miRNA putative target co-expression

Why?
For a pSNP to affect function, the miRNA and its putative target need to have overlapping expression domains. To assist in the identification of relevant pSNPs, we have therefore devised a way to quantify the degree of co-expression for miRNA-gene pairs.

How?
Gene expression: SymAtlas (http://symatlas.gnf.org/SymAtlas/)
miRNA expression:
- 80% of miRNAs hosted by genes
- Deduce expression from corresponding gene expression (derived expression)
- Experimental data

[Formulas: "First try" and "CoExpression" score definitions; the expressions are not legible in the extracted text.]

Globally:
- Farh et al. 2005
- Compute observed frequency
- Compute expected frequency: $1 - (1 - P(\mathrm{8nt\_match}))^{\mathrm{UTR\_Length} - 7}$
- Kolmogorov-Smirnov (KS) test
- P-value: determined by 1000 random permutations of genes + KS

For specific miRNAs:
[Figure: co-expression results for specific miRNAs.]

Introduction

miRNA-mediated gene silencing emerges as a key regulator of cellular differentiation and homeostasis to which metazoans devote a considerable amount of sequence space.
This sequence space is bound to suffer its toll of mutations, some of which will be selectively neutral while others will be advantageous or, more often, at least slightly deleterious. DNA sequence polymorphisms (DSP) occurring within this sequence space certainly contribute to phenotypic variation, including disease susceptibility and agronomically important traits. A key question is how large their contribution actually is. DSP may affect miRNA-mediated gene regulation by perturbing core components of the silencing machinery, by affecting the structure or expression level of miRNAs, or by altering target sites (Table 1).

DSP in core components of the silencing machinery may affect its overall efficacy. Mutations that drastically perturb RNA silencing will obviously be rare given their predictably highly deleterious consequences. Yet DSP with subtle effects on gene function may occur. As distinct targets may be more or less sensitive to variations in miRNA concentration or silencing efficiency, such DSP may affect some pathways more than others.

Specific miRNA-target interactions may be influenced by mutations affecting either the miRNA or its target. On the miRNA side of the equation: (i) the sequence of the mature miRNA may be altered, thereby either stabilizing or destabilizing its interaction with targets, (ii) mutations in the pri- or pre-miRNA may affect stability or processing efficiency, (iii) mutations acting in cis or trans on the pri-miRNA promoter may influence transcription rate, and (iv) Copy Number Variants (CNV) may affect the number of copies of the miRNA or the integrity of the pri-miRNA host.
On the target side of the equation: (i) mutations may affect functional target sites, thereby destabilizing or stabilizing the interaction with the miRNA, (ii) mutations may create illegitimate miRNA target sites (either in the 3’UTR or maybe even in other segments of the transcript), which will be particularly relevant if occurring in antitargets, and (iii) mutations causing polymorphic alternative polyadenylation may affect a gene’s content in target sites.

Table 3: non-synonymous SNP in components of the miR pathway

gene          allele   external_id
Drosha        G/T      rs35342496
Drosha        G/C      rs12517177
Drosha        C/T      rs1559205
DGCR8         G/T      rs9606253
DGCR8         A/G      rs11546015
DGCR8         T/C      rs35569747
DGCR8         T/C      rs35987994
DGCR8         C/T      rs5748529
Exportin-5    C/T      rs34324334
Exportin-5    G/T      rs11544379
Exportin-5    C/G      rs12173786
Exportin-5    G/A      rs35794454
Exportin-5    C/T      rs1111785
Exportin-5    A/G      rs11544382
Exportin-5    C/T      rs7759854
Dicer1        C/G      rs4566088
Argonaute 1   C/T      rs12564106
Argonaute 1   A/T      rs12735796
Argonaute 1   T/C      rs12739932
Argonaute 1   G/A      rs17855789
Argonaute 1   G/C      rs12746607
Argonaute 2   C/G      rs35369360

CoExpression distribution of known antitargets

- CoExpression of known antitarget genes and miRNAs is quite low.
- Why? The scoring function does not differentiate moderate coexpression across all tissues from extremely high coexpression in one tissue.

[Figure: screen shot of the distribution; axes: CoExpression Score, Nb of known antitargets.]

Evidence for purifying selection against pSNPs of conserved and non-conserved target sites

Why?
What is the evidence that any of the candidate pSNPs listed above truly affect gene function and hence phenotype? Indirect evidence that a significant proportion of them are functional can be obtained from population genetics. Indeed, pSNPs without appreciable effect on gene function will evolve neutrally, subject only to the vagaries of random genetic drift, while pSNPs affecting gene function may undergo positive, negative or balancing selection via their effect on phenotype. Selection may leave distinct signatures on the level of inter-species divergence, intra-species variability, allelic distribution and linkage disequilibrium.

How?
- Generation of 100 random sets of SNPs
- Processed through the pipeline

Results:
- Fewer pSNPs in the real data than in the random sets
- Differences between X and L
- pSNPs that destroy a conserved target site are highly underrepresented (expected)
- pSNPs that either destroy or create a non-conserved target site are also underrepresented (suggesting that these sites are functional even if not conserved across mammals)

Acknowledgements

PAI P5/25 from the Belgian SSTC (n° R.SSTC.0135), EU “Callimir” STREP project. C.C. is chercheur qualifié from the FNRS.

Patrocles finder

Why?
Patrocles is built using the public information provided by Ensembl. However, laboratories that work on SNPs often discover new ones, so they need a tool that gives them the same information about stabilized, destabilized or illegitimate target sites for their own sequences.

How?
End users provide one or two sequences for, respectively, (i) the analysis of the presence of octamers or (ii) the comparison of the two sequences with respect to their octamer content. They also have the possibility to provide an alignment for each sequence if they care about conservation.

[Figure: screen shots of the Patrocles finder.]
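
The two-sequence comparison described for the Patrocles finder can be illustrated with a minimal Python sketch. It is not the Patrocles implementation: all function names and the toy sequences are hypothetical, target octamers are derived with the "L" rule quoted on this poster (reverse complement of nucleotides 2 to 8 of the mature miRNA, followed by an A), and a site is called created or destroyed purely from the presence or absence of the octamer in each allele. Conservation, shifted sites and the X predictions are ignored. The last helper evaluates the expected-frequency expression from the co-expression panel, assuming P(8nt_match) is a per-position match probability.

```python
# Illustrative sketch only: names and toy sequences are hypothetical,
# not taken from the Patrocles code base.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq):
    """Upper-case a sequence and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def target_octamer(mature_mirna):
    """Octamer under the 'L' rule: reverse complement of nt 2-8, then an A."""
    seed = to_rna(mature_mirna)[1:8]  # nucleotides 2..8 (1-based)
    return seed.translate(COMPLEMENT)[::-1] + "A"


def mirnas_with_site(utr, octamers):
    """Names of the miRNAs whose octamer occurs at least once in the UTR."""
    utr = to_rna(utr)
    return {name for name, octamer in octamers.items() if octamer in utr}


def compare_alleles(ref_utr, alt_utr, mature_mirnas):
    """Report sites lost (destroyed) or gained (created) by the alt allele."""
    octamers = {name: target_octamer(seq) for name, seq in mature_mirnas.items()}
    ref_hits = mirnas_with_site(ref_utr, octamers)
    alt_hits = mirnas_with_site(alt_utr, octamers)
    return {
        "destroyed": sorted(ref_hits - alt_hits),
        "created": sorted(alt_hits - ref_hits),
    }


def expected_hit_probability(p_match, utr_length):
    """1 - (1 - P(8nt_match))^(UTR_Length - 7): at least one octamer match."""
    windows = max(utr_length - 7, 0)  # number of 8-nt windows minus one, as on the poster
    return 1.0 - (1.0 - p_match) ** windows


toy_mirnas = {"toy-miR": "UGGAAUGUAAAGAAGUAUGUAU"}  # toy input
ref = "GGCUACAUUCCGUUAG"  # toy 3'UTR fragment, reference allele
alt = "GGCUACAUUCCAUUAG"  # same fragment with a G-to-A change
print(compare_alleles(ref, alt, toy_mirnas))
```

With these toy inputs the G-to-A change completes the octamer ACAUUCCA, so the comparison reports one created site for toy-miR and no destroyed site. Swapping the two alleles reports the same site as destroyed.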