Download RichardDurbin_CSI2011

Mark DePristo
But 1-2% of 3 billion is still a lot!
The fraction of variants that is novel varies by type
• 3-4,000,000 variants per individual
– 97.8% of variants in NA12891 are in pilot data
• 10-11,000 nonsynonymous changes
– 95% of this class in NA12891 are in pilot data
• 80-100 premature stop codons
– 88% of this class in NA12891 are in pilot data
• 50-100 HGMD “recessive disease causing” mutations
– 85% of this class in NA12891 are in pilot data
1000 Genomes Project pilot paper
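The percentages above can be turned into rough per-individual counts of variants that are not in the pilot data. The sketch below only multiplies the quoted ranges by one minus the quoted overlap; the function and dictionary names are illustrative and not from the talk.

```python
# Per-individual counts and pilot-data overlap as quoted above for NA12891.
# Each entry: ((low count, high count), fraction already in pilot data).
classes = {
    "all variants": ((3_000_000, 4_000_000), 0.978),
    "nonsynonymous changes": ((10_000, 11_000), 0.95),
    "premature stop codons": ((80, 100), 0.88),
    "HGMD recessive disease-causing": ((50, 100), 0.85),
}


def novel_range(count_range, fraction_known):
    """Return the (low, high) number of variants NOT in the pilot data."""
    low, high = count_range
    novel_fraction = 1.0 - fraction_known
    return round(low * novel_fraction), round(high * novel_fraction)


for name, (count_range, known) in classes.items():
    low, high = novel_range(count_range, known)
    print(f"{name}: about {low:,} to {high:,} not in pilot data")
```

Even at 97.8% overlap, the first row leaves tens of thousands of variants per individual outside the pilot data.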
Functional variants are more likely to be rare
Individuals in outbred populations will still carry many variants not in the 1000GP and other similar data sets
• Exponential population growth in the last 10,000 years gives long tips to the tree
• In “big” populations, tips are hundreds of generations long, so there are tens of thousands of private variants per sample, hundreds of them functional
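A back-of-envelope check of the "tens of thousands" figure: mutations accumulate along a tip at roughly (mutation rate) x (genome length) per generation. The mutation rate of 1.2e-8 per base pair per generation and the two haplotypes per sample are assumptions made for this sketch, not numbers from the talk; the 3 billion base genome size is the figure quoted earlier.

```python
def expected_private_variants(
    tip_generations,
    mutation_rate=1.2e-8,  # ASSUMED per-bp, per-generation rate (not from the talk)
    genome_bp=3e9,         # haploid genome length, "3 billion" as quoted above
    haplotypes=2,          # ASSUMED: a diploid sample contributes two tips
):
    """Expected number of mutations accumulated on the tips leading to one sample."""
    return tip_generations * mutation_rate * genome_bp * haplotypes


for generations in (200, 500):
    count = expected_private_variants(generations)
    print(f"tips of {generations} generations: ~{round(count):,} private variants")
```

Under these assumptions, tips a few hundred generations long give private-variant counts in the tens of thousands, which is the order of magnitude stated above.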
This behaviour is very dependent on population structure.
In genetic isolates the tree relating haplotypes is smaller, and the tips are shorter
Isolates share recently diverged chromosomes with long shared haplotypes
Case study: Kuusamo
– Settled by 34 families in 1680s
– Small indigenous Lapp population disappeared rapidly
– Very little immigration after initial settlement
– Current population ~20,000
– Enriched phenotypes, e.g. schizophrenia
Fit population simulation model to genotype data from a fixed sample
“Nx plot”: x% of new sample DNA is shared in segments of length >y
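One way to read the Nx definition is as a curve: for each length threshold y, the percentage x of the new sample's genome that lies in shared segments longer than y. The sketch below computes one point on that curve. The function name, the units, and the segment lengths are hypothetical, and how shared segments are detected is not covered here.

```python
def percent_shared_above(segment_lengths, genome_length, min_length):
    """Percentage of the genome covered by shared segments longer than min_length.

    segment_lengths: lengths of segments the new sample shares with the fixed sample
    genome_length:   total genome length, in the same units as the segments
    """
    shared = sum(length for length in segment_lengths if length > min_length)
    return 100.0 * shared / genome_length


# Hypothetical shared segments (arbitrary units) on a genome of length 100.
segments = [5, 10, 20, 40]
for y in (0, 10, 30):
    x = percent_shared_above(segments, 100, y)
    print(f"{x:.0f}% of the genome is shared in segments longer than {y}")
```

Sweeping y over a range of thresholds and plotting x against y gives a curve that can be compared between real and simulated data.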
Best fit model
100 founders, no migration
4 generations with 2x growth,
8 generations with 1.25x growth
With ~2% migration per generation
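The growth schedule in the best-fit legend can be expanded into a population size per generation. This sketch reads the founder and growth lines as a single schedule and ignores migration; both are assumptions about how the legend is meant to be read. Whether the sizes are census or effective sizes is not stated.

```python
def size_trajectory(founders, phases):
    """Population size after each generation.

    phases: list of (number_of_generations, growth_factor_per_generation)
    Returns a list starting with the founder size (generation 0).
    """
    sizes = [float(founders)]
    for generations, factor in phases:
        for _ in range(generations):
            sizes.append(sizes[-1] * factor)
    return sizes


# 100 founders, 4 generations of 2x growth, then 8 generations of 1.25x growth.
trajectory = size_trajectory(100, [(4, 2.0), (8, 1.25)])
for generation, size in enumerate(trajectory):
    print(f"generation {generation:2d}: {size:8.0f}")
```

Under this reading the model population is 1,600 after the four doubling generations and roughly 9,500 after all twelve.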
Kimmo Palin
Orcades population simulation
20 subpopulations (parishes), constant size 1/3 of 1841 census size, endogamy within parishes >~50% from records, 40 generations, immigration in generations 20-29 (1400-1670)
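The simulation settings listed above can be collected into a small parameter record. This is only a sketch of the configuration, not the simulator itself. The class and field names are hypothetical, the census figure in the usage lines is invented for illustration, and applying the 1/3 factor per parish is an assumption. The generation length is derived from the quoted dates, assuming generations 20-29 span 1400-1670 exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrcadesModel:
    """Parameters quoted above; names are illustrative only."""
    subpopulations: int = 20          # parishes
    census_divisor: int = 3           # constant size = 1/3 of 1841 census size
    min_endogamy: float = 0.5         # within-parish endogamy, >~50% from records
    generations: int = 40
    immigration_first: int = 20       # first generation with immigration
    immigration_last: int = 29        # last generation with immigration
    immigration_start_year: int = 1400
    immigration_end_year: int = 1670

    def has_immigration(self, generation):
        """True if this generation falls in the immigration window."""
        return self.immigration_first <= generation <= self.immigration_last

    def years_per_generation(self):
        """Generation length implied by the immigration window and its dates."""
        window = self.immigration_last - self.immigration_first + 1
        return (self.immigration_end_year - self.immigration_start_year) / window

    def parish_size(self, census_1841):
        """Simulated parish size, ASSUMING the 1/3 factor applies per parish."""
        return census_1841 / self.census_divisor


model = OrcadesModel()
print("implied years per generation:", model.years_per_generation())
print("simulated size for a hypothetical parish of 900:", model.parish_size(900))
```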
Kimmo Palin
How much variation do we cover with how much sequence?
In the end, each individual carries private mutations
Kees Albers, Kimmo Palin, Karola Rehnstrom, Leopold Parts, Aylwyn Scally, Jared Simpson, Weldon Whitener