Milestones due today.
Anything to report?
http://cs273a.stanford.edu
[Bejerano Fall09/10]
1
Lecture 17
Ultraconservation evolutionary data
Finish early – come hear the talk with us?
2
Sequence Conservation implies Function
Comparative Genomics of Distantly related species:
human: ...CTTTGCGA-TGAGTAGCATCTACTATTT...
mouse: ...ACGTGGGACTGACTA-CATCGACTACGA...
[Figure labels: mammalian ancestor; functional region!]
(but which function/s?...)
3
Human Genome full of Conserved Non-Coding Elements
Human Genome: 3×10^9 letters
1.5% known function
>50% junk
compare to other species
>5% human genome functional
3x more functional DNA than known!
~10^6 substrings do not code for protein
What do they do then?
4
Conserved elements in the Human Genome
all human-mouse alignments
human-mouse ancestral repeats alignment
Selection
Difference: 5% of Human Genome
85%id on average
[Mouse consortium, Nature 2002]
5
Conserved elements in the Human Genome
all human-mouse alignments
human-mouse ancestral repeats alignment
Simple but Unexpected (the lure of Bioinformatics)
Selection
Difference: 5% of Human Genome
Ultraconservation
85%id on average
[Mouse consortium, Nature 2002]
6
Typical DNA Conservation levels
(dot = base identical to human)
Conserved elements between human and mouse
are on average 85% identical. [mouse consortium, 2002]
7
Ultraconserved Elements
fish
481 elements perfectly conserved (100%id) over
200bp or more between human, mouse and rat.
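The definition above (100%id over 200bp or more across several species) can be sketched as a scan over a multiple alignment. This is only an illustration, assuming the alignment is given as equal-length gapped strings; the function name `perfectly_conserved_runs` is hypothetical and this is not the pipeline used in the cited paper.

```python
from typing import List, Tuple


def perfectly_conserved_runs(aligned: List[str], min_len: int = 200) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges where every aligned
    sequence carries the same base, with no gaps, for >= min_len columns."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal aligned length")
    runs = []
    run_start = None  # column where the current identical run began
    for col, bases in enumerate(zip(*aligned)):
        identical = bases[0] != "-" and all(b == bases[0] for b in bases)
        if identical and run_start is None:
            run_start = col
        elif not identical and run_start is not None:
            if col - run_start >= min_len:
                runs.append((run_start, col))
            run_start = None
    # Close a run that extends to the end of the alignment.
    if run_start is not None and len(aligned[0]) - run_start >= min_len:
        runs.append((run_start, len(aligned[0])))
    return runs


# Toy alignment (made-up sequences), with a small threshold for readability.
human = "ACGTACGTAA"
mouse = "ACGTACGTCA"
rat = "ACGTACGTAA"
print(perfectly_conserved_runs([human, mouse, rat], min_len=5))  # [(0, 8)]
```

With the default `min_len=200` the toy alignment yields no runs, which is the point of the threshold: short identical stretches are common, long ones are not.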
[Bejerano et al., Science 2004]
8
Ultraconserved Elements: Why?
Hundreds of long substrings are identical between amniotes,
so they must have rejected many different changes.
But... all functions we understand in our genome are encoded using redundant codes.
[Figure labels: CDS; ncRNA; TFBS; seq.]
9
Ultras are Functional
Back in 2004 we hypothesized:
481 ultraconserved elements
exonic subset – post-transcriptional regulation
[Ni et al., Genes Dev.; Lareau et al., Nature, 2007]
"nonexonic" subset – transcriptional regulators
[Pennacchio et al., Nature, 2006]
10
Genomic Distribution of Ultraconserved Elements
• exonic
• nonexonic
• possibly exonic
11
UC.338 comes from an ancient repeat
[Figure labels: ultraconserved; exon; novel; coelacanth; repeat; enhancer; LF-SINE]
[Bejerano et al., Nature, 2006]
12
Ultras are Under Strong Human Selection
Mutational cold spots? NO. Rare (new) mutations are
introduced to the population.
Fierce purifying selection? YES. Very few of these get
anywhere near fixation.
[Figure labels: humans AA, GA; chimp A; Ultra DAF; NonSyn DAF]
[Katzman et al., Science, 2007]
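The comparison on this slide can be illustrated with a small calculation. The sketch below reads DAF as derived allele frequency and treats the chimp base as the ancestral state; both readings are assumptions about the figure, and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def derived_allele_frequency(human_alleles: Iterable[str], chimp_allele: str) -> Optional[float]:
    """Fraction of sampled human chromosomes carrying a non-ancestral allele,
    taking the chimp base as the ancestral state (an assumption)."""
    counts = Counter(human_alleles)
    total = sum(counts.values())
    if total == 0:
        return None  # no data at this site
    derived = total - counts[chimp_allele]
    return derived / total


# Two diploid humans with genotypes AA and GA, chimp base A (as in the figure labels).
alleles = [allele for genotype in ["AA", "GA"] for allele in genotype]
print(derived_allele_frequency(alleles, "A"))  # 0.25
```

A site under fierce purifying selection would be expected to show mostly low values like this one, since new mutations appear but rarely drift toward fixation.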
13
Touch an Ultra And You - DIY
[Ahituv et al., PLoS Biology, 2007]
14
What can’t we measure in the lab?
$$\Pr(\text{fixation} \mid N_e, s) = \frac{1 - e^{-s}}{1 - e^{-2 N_e s}}$$
N_e is the effective population size; s is the selective advantage (or disadvantage).
Both of which are VERY wrong in the lab.
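To see why N_e and s matter so much, the formula can be evaluated numerically. This is a minimal sketch, assuming the slide's formula reads Pr(fixation | N_e, s) = (1 - e^(-s)) / (1 - e^(-2 N_e s)); the function name and the example values of N_e and s are illustrative, not taken from the lecture.

```python
import math


def fixation_probability(n_e: float, s: float) -> float:
    """Evaluate (1 - exp(-s)) / (1 - exp(-2 * n_e * s)).

    n_e is the effective population size, s the selection coefficient
    (negative for a deleterious mutation).
    """
    if n_e <= 0:
        raise ValueError("n_e must be positive")
    if abs(s) < 1e-12:
        return 1.0 / (2.0 * n_e)  # neutral limit of the formula as s -> 0
    try:
        # expm1(x) = exp(x) - 1; the two minus signs cancel in the ratio.
        return math.expm1(-s) / math.expm1(-2.0 * n_e * s)
    except OverflowError:
        # Denominator too large to represent: strongly deleterious, effectively never fixes.
        return 0.0


for s in (0.01, 0.0, -0.001, -0.01):
    print(s, fixation_probability(10_000, s))
```

Even a mildly deleterious change (s = -0.001) fixes far less often than a neutral one at N_e = 10,000, whereas in a small lab colony the same change could behave almost neutrally.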
15
So it can happen – but does it FIX?
[Figure labels: t; DNA element]
16
Count Fraction Lost, Binned by %id
[Figure: species tree over time t (human, macaque, dog, mouse, rat); 100bp sliding window, binned by %id; tallies: count_all, count_hole]
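The counting scheme named on this slide can be sketched as follows. Several details are assumptions, since only the figure labels survive: %id is measured among the reference species (human, macaque, dog), a "hole" is a window that is entirely gapped in every target species (mouse, rat), and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def fraction_lost_by_identity(
    reference: List[str],
    targets: List[str],
    window: int = 100,
    bin_width: int = 10,
) -> Dict[int, Tuple[int, int]]:
    """Slide a window over the alignment, bin each window by the %id among
    the reference sequences, and return {bin: (count_all, count_hole)}."""
    length = len(reference[0])
    counts = defaultdict(lambda: [0, 0])
    for start in range(0, length - window + 1):
        cols = range(start, start + window)
        # Columns where all reference species share the same ungapped base.
        same = sum(
            1
            for c in cols
            if reference[0][c] != "-" and all(seq[c] == reference[0][c] for seq in reference)
        )
        pct_id = 100.0 * same / window
        bin_floor = int(pct_id // bin_width) * bin_width
        counts[bin_floor][0] += 1  # count_all
        if all(seq[c] == "-" for seq in targets for c in cols):
            counts[bin_floor][1] += 1  # count_hole: window missing in all targets
    return {b: (n_all, n_hole) for b, (n_all, n_hole) in counts.items()}


# Toy alignment (made-up sequences) with a tiny window so the result is easy to check.
human, macaque, dog = "ACGTACGT", "ACGTACGA", "ACGTACGT"
mouse, rat = "----ACGT", "----ACGT"
print(fraction_lost_by_identity([human, macaque, dog], [mouse, rat], window=4, bin_width=25))
# {100: (4, 1), 75: (1, 0)}
```

The fraction lost in each bin is then count_hole / count_all. Overlapping windows count the same deletion more than once, so a real analysis would need to decide how to handle that.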
17
Quite Some Time Later
18
Pragmatic Genomics
define goal
run sensible approach
while (results full of artefacts*)
{
    characterize artefact
    write handler into code
    rerun
}
* e.g., sequencing errors, assembly errors, contaminating sequence, ambiguous situations, etc.
19
Ultras are Fiercely Retained through Evolution
100%id primates-dog: 1,691,090bp
rodents deleted: 1,447bp (0.086%)
Ultras are >300 fold more persistent than neutral DNA (25% deleted)
the genomic deletion is
$$\Pr(\text{fixation} \mid N_e, s) = \frac{1 - e^{-s}}{1 - e^{-2 N_e s}}$$
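The deletion figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide; the function name is hypothetical.

```python
def persistence_fold(conserved_bp: int, deleted_bp: int, neutral_deleted_fraction: float) -> float:
    """How many times less often conserved bases are deleted than neutral DNA."""
    conserved_deleted_fraction = deleted_bp / conserved_bp
    return neutral_deleted_fraction / conserved_deleted_fraction


conserved_bp = 1_691_090  # 100%id primates-dog (from the slide)
deleted_bp = 1_447  # of those, deleted in rodents (from the slide)
neutral_deleted = 0.25  # fraction of neutral DNA deleted (from the slide)

print(f"{deleted_bp / conserved_bp:.3%}")  # 0.086%
print(f"{persistence_fold(conserved_bp, deleted_bp, neutral_deleted):.0f}")  # 292
```

With the rounded 25% figure the ratio comes out a little under 300, so the slide's ">300 fold" may rest on a more precise neutral estimate than the one printed here.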
20
How special are the Ultras?
Selection
Ultraconservation
21
Adding More Species
Aha!!
22
Adding More Species
Few species
More and more species
Hmmm….
23
Most Non-Coding Elements likely work in cis…
“IRX1 is a member of the Iroquois homeobox gene family.
Members of this family appear to play multiple roles
during pattern formation of vertebrate embryos.”
gene deserts
regulatory jungles
9Mb
24
… and Ultras are the tip of a functional iceberg
gene deserts
regulatory jungles
9Mb
This dense regulatory jungle contains a single ultra
25