Comparative Genomics Methods for
Alternative Splicing of Eukaryotic Genes
Liliana Florea
Department of Computer Science
Department of Biochemistry
GWU
[email protected]
202-994-1057
Jan 24th, 2007
Alternative Splicing
• Alternative splicing = ability of a gene to produce different mRNAs and
proteins under different developmental, tissue and disease-vs-normal
conditions, by using distinct combinations of the gene’s exons.
• Alternative splicing is important and interesting to characterize:
 Possible mechanism to increase protein diversity during species evolution
 Aberrant splicing is often associated with disease, in particular cancers (BRCA1, FGFR-2)
In DT3 rat prostate cancer cells, the constitutive exon IIIc is repressed and the alternative exon IIIb is expressed in the mRNA transcript of FGF-R2(*).
 Potentially combinatorial numbers of splice variants per gene (DSCAM – 38,000)
 Most human genes are alternatively spliced (~70%)
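The DSCAM figure above can be reproduced with a short calculation. This is a minimal sketch that assumes the commonly cited Drosophila Dscam layout, which the slide itself does not spell out: four clusters of mutually exclusive exons with 12, 48, 33 and 2 alternatives, and exactly one exon chosen from each cluster.

```python
from math import prod

# Assumed cluster sizes for Drosophila Dscam (not given on the slide):
# the exon 4, exon 6, exon 9 and exon 17 clusters.
DSCAM_CLUSTERS = {"exon4": 12, "exon6": 48, "exon9": 33, "exon17": 2}

def count_isoforms(cluster_sizes):
    """Number of mRNAs when exactly one exon is picked from each cluster."""
    return prod(cluster_sizes)

n_isoforms = count_isoforms(DSCAM_CLUSTERS.values())
print(n_isoforms)  # 38016
```

Under these assumptions the count is a plain product of the cluster sizes, which is why even a handful of alternative regions yields tens of thousands of potential variants.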
Liliana Florea, CS/SEAS
Alternative Splicing Annotation with AIR
• The AIR pipeline annotates genes and splice variants in eukaryotic genomes based
on mRNA, EST and protein evidence; used to annotate the Celera rat genome
Pipeline stages: map evidence to the genome → build the splice graph (= ‘gene’) → enumerate variants → score, rank and select variants
[Figure: cDNA1 to cDNA4 aligned along the genomic axis; splice graph with labels 1 to 5; candidate splice variants XSV 1 to XSV 4]

Variant   Mapping Potential   Coverage    Fragmentation   Longest Intron   Ori   cumScore
XSV 1     1.0                 1.0         1.0             1.0              1.0   1.0
XSV 2     1.0                 1.0         1/2 = 0.5       4/5 = 0.8        1.0   0.81
XSV 3     1.0                 2/4 = 0.5   1/1 = 1.0       2/4 = 0.5        1.0   0.75
XSV 4     1.0                 2/4 = 0.5   1/1 = 1.0       2/4 = 0.5        1.0   0.75
(Florea et al., Genome Res. 2005; Florea et al., CSHL 2004; DiFrancesco et al., CSHL 2004)
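The splice graph and variant enumeration stages can be illustrated with a small sketch. It is not the AIR implementation: it assumes each mapped cDNA has already been reduced to an ordered chain of exon intervals, treats exons as graph nodes and observed introns as edges, and enumerates every path from a first exon to a last exon. The function names and the example coordinates are hypothetical.

```python
from collections import defaultdict

def build_splice_graph(transcripts):
    """Merge exon chains into a graph: nodes are exons, edges are observed introns."""
    edges = defaultdict(set)
    starts, ends = set(), set()
    for exons in transcripts:
        starts.add(exons[0])
        ends.add(exons[-1])
        for a, b in zip(exons, exons[1:]):
            edges[a].add(b)
    return edges, starts, ends

def enumerate_variants(edges, starts, ends):
    """List every path from a first exon to a last exon as a candidate variant."""
    variants = []

    def walk(path):
        node = path[-1]
        if node in ends:
            variants.append(tuple(path))
        for nxt in sorted(edges.get(node, ())):
            walk(path + [nxt])

    for s in sorted(starts):
        walk([s])
    return variants

# Hypothetical evidence: exons as (start, end) genomic intervals.
E1, E2, E3, E4, E5 = (100, 200), (300, 400), (500, 600), (700, 800), (900, 1000)
cdnas = [
    [E1, E2, E3, E5],  # includes E2, skips E4
    [E1, E3, E4, E5],  # skips E2, includes E4
]
edges, starts, ends = build_splice_graph(cdnas)
variants = enumerate_variants(edges, starts, ends)
print(len(variants))  # 4
```

Two cDNAs yield four candidate variants here, because the graph also admits the two exon combinations that no single cDNA supports. That is the reason a scoring and selection stage follows enumeration.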
Alternative Splicing and Evolution
• Classes of exons: constitutive (nonAlt), alternative major-form (AltD), alternative minor-form (AltI)
• Conservation: establish the presence (P) / absence (A) of human exons in each of the other species (>50% presence in ‘multiz’ alignments; http://genome.ucsc.edu)
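The presence/absence call can be sketched as a simple threshold test. The slide gives only the ">50% presence" criterion, so this example assumes that presence is measured as the fraction of the human exon's alignment columns that carry a base (rather than a gap) in the other species. The function name and the alignment rows are hypothetical.

```python
def exon_presence(aligned_row, threshold=0.5):
    """Return 'P' if more than `threshold` of the exon columns hold a base, else 'A'."""
    if not aligned_row:
        return "A"
    aligned = sum(1 for c in aligned_row if c.upper() in "ACGT")
    return "P" if aligned / len(aligned_row) > threshold else "A"

# Hypothetical alignment rows over one 10-column human exon ('-' marks a gap).
rows = {
    "CHP": "ACGTACGTAC",  # 10/10 columns aligned
    "MUS": "ACG--CGTAC",  # 8/10 columns aligned
    "CHK": "------GTAC",  # 4/10 columns aligned
}
calls = {species: exon_presence(row) for species, row in rows.items()}
print(calls)  # {'CHP': 'P', 'MUS': 'P', 'CHK': 'A'}
```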
1. Evolutionary analysis of exon creation
• AltI exons are more frequently associated with exon creation (insertion) than the other categories
• AltI exons have arisen mostly through recent insertions (~15% occurred before the chicken split, compared to ~80% for AltD and ~75% for nonAlt)
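One way to date an exon from its presence/absence pattern is sketched below. It assumes a simple parsimony rule that the slides do not state explicitly: the exon is taken to be as old as the most distant lineage in which it is present, and losses are ignored. The class codes (H, C, R, D, CF) follow the figure legend; the function name is hypothetical and the "Error" category is not modeled.

```python
# Lineages ordered from closest to most distant from human, each with the
# class assigned when it is the most distant lineage carrying the exon.
LINEAGES = [
    (("CHP",), "C"),        # primate inserts
    (("MUS", "RAT"), "R"),  # rodent/primate inserts
    (("DOG",), "D"),        # dog/rodent/primate inserts
    (("CHK",), "CF"),       # early inserts or ancestral
]

def insertion_class(calls):
    """Classify a human exon from P/A calls such as {'CHP': 'P', 'MUS': 'A'}."""
    label = "H"  # human insert unless some other lineage carries the exon
    for species, code in LINEAGES:
        if any(calls.get(s) == "P" for s in species):
            label = code  # later (more distant) matches overwrite earlier ones
    return label

example = {"CHP": "P", "MUS": "P", "RAT": "A", "DOG": "A", "CHK": "A"}
print(insertion_class(example))  # R
```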
2. Sequence conservation in introns
[Figure: y-axis 0 to 1.1; species tree ((((HUM,CHP),(MUS,RAT))DOG)CHK); legend: nonAlt, Alt, AltD, AltI, Error; insertion classes: early inserts or ancestral (CF), dog/rodent/primate inserts (D), rodent/primate inserts (R), primate inserts (C), human inserts (H)]
• AltI intron sequences are more frequently conserved than AltD and nonAlt intron sequences
• AltD intron sequences are less frequently conserved than nonAlt and AltI intron sequences
• These tendencies become stronger as the evolutionary distance increases
3. Sequence variation in exons
• AltI exons show increased I and V rates compared to nonAlt and AltD exons at all 3 codon positions, which
may indicate positive selection (MUS, DOG, CHP comparisons)
• AltD exons show decreased I and V rates compared to nonAlt and AltI exons at all 3 codon positions, which
may indicate effects of purifying selection
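Per-codon-position rates of this kind can be computed from a pairwise alignment of coding sequences. The slides do not define "I" and "V"; the sketch below assumes they stand for transitions and transversions, and it further assumes that both sequences are aligned in frame, starting at codon position 1. The function name and the example sequences are hypothetical.

```python
PURINES = {"A", "G"}

def rates_by_codon_position(seq_a, seq_b):
    """Per codon position (1-3), return (transition rate, transversion rate)."""
    counts = {pos: {"sites": 0, "ts": 0, "tv": 0} for pos in (1, 2, 3)}
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        c = counts[i % 3 + 1]
        c["sites"] += 1
        if a != b:
            # Purine<->purine or pyrimidine<->pyrimidine is a transition.
            same_class = (a in PURINES) == (b in PURINES)
            c["ts" if same_class else "tv"] += 1
    return {
        pos: (c["ts"] / c["sites"], c["tv"] / c["sites"]) if c["sites"] else (0.0, 0.0)
        for pos, c in counts.items()
    }

# Hypothetical in-frame exon fragments from human and another species.
rates = rates_by_codon_position("ATGGCC", "ATAGTA")
print(rates)  # {1: (0.0, 0.0), 2: (0.5, 0.0), 3: (0.5, 0.5)}
```

Comparing such rates between AltI, AltD and nonAlt exon sets, position by position, is the kind of contrast the bullets above describe.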
(Florea and Zhao, CFG 2005)
Prediction of Splicing Regulatory Elements
• Exon selection during splicing is controlled by splicing regulatory elements
within the exon (exonic) or in its vicinity (intronic)
• Regulatory elements may act to promote the inclusion of the exon (enhancers)
or to inhibit it (silencers)
• Current work:
 Identify exonic motifs over-represented in alternative (Alt) versus constitutive (nonAlt)
exons
 Identify intronic motifs over-represented in the vicinity of alternative (Alt) versus
constitutive (nonAlt) exons
 Validate the motifs by comparing the sequence conservation within the motif regions
versus within the entire gene in multiple species (human, mouse, rat, dog, chimp, cow,
chicken)
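The over-representation step can be sketched as a k-mer comparison between the two exon sets. This is an illustration only, not the method used in the work described: it scores each k-mer by the log2 ratio of its frequency in Alt versus nonAlt sequences, with a pseudocount to avoid division by zero. The function names, the choice of k, and the toy sequences are assumptions.

```python
from collections import Counter
from math import log2

def kmer_counts(seqs, k):
    """Count all overlapping k-mers across a list of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        counts.update(s[i:i + k] for i in range(len(s) - k + 1))
    return counts

def log_odds(alt_seqs, nonalt_seqs, k=6, pseudocount=1.0):
    """log2 ratio of k-mer frequencies in Alt versus nonAlt sequences."""
    alt, non = kmer_counts(alt_seqs, k), kmer_counts(nonalt_seqs, k)
    kmers = set(alt) | set(non)
    alt_total = sum(alt.values()) + pseudocount * len(kmers)
    non_total = sum(non.values()) + pseudocount * len(kmers)
    return {
        m: log2(((alt[m] + pseudocount) / alt_total) /
                ((non[m] + pseudocount) / non_total))
        for m in kmers
    }

# Toy sequences standing in for Alt and nonAlt exon sets.
scores = log_odds(["GAAGAAGAA"], ["ACGTACGTA"], k=3)
top = max(scores, key=scores.get)
print(top)  # GAA
```

Positive scores mark k-mers enriched in the Alt set and negative scores mark those enriched in the nonAlt set. A real analysis would add a significance test and a correction for multiple comparisons before treating any k-mer as a candidate regulatory motif.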
w/ Erhan Guven