Bioinformatics and
Mathematical Genetics
Simon Myers
[email protected]
An example of a modern genetic dataset
• We are gathering data in Oxford (in collaboration with
Chicago colleagues)
• 10 chimpanzees
• Having their DNA sequence read across the ~3 billion base
chimp genome
• First time this has been done in chimps
• Modern technology means this is an affordable project
Q: Why are we doing this?
A: Because of the insights it can yield into evolutionary
processes, not just in chimps, but in humans
Data for a small region
~300,000,000,000 base pairs of sequence in total
A project to sequence 1000 human genomes:
~20,000,000,000,000 base pairs
Opportunities (I)
• Genome-wide data carries an immense amount of information
• We can compare genomes between humans and chimps:
– Chimpanzees and humans are 98.6% similar at the DNA level... but
– In what places do we differ?
– Can we explain what makes us human?
– Many places that differ are the result of random chance
• We can look within the set of chimps
– How are chimps evolving?
– How do individual chimpanzees differ at the DNA level?
– Lots of similar variation data has been or is being gathered in humans
– Is the chimpanzee data similar, or different, in terms of overall patterns?
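To give a feel for the scale behind the 98.6% figure, here is a minimal Python sketch. It assumes two sequences that are already aligned (equal length, "-" marking gaps), and it treats 98.6% as a per-base identity over a genome of ~3 billion bases, which is a simplification. The function names are illustrative only.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, ungapped positions at which two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


def expected_differences(genome_length, percent_identical):
    """Rough count of differing positions implied by a percent identity."""
    return round(genome_length * (100 - percent_identical) / 100)


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(expected_differences(3_000_000_000, 98.6))
```

Under these assumptions, a 1.4% difference over ~3 billion bases corresponds to tens of millions of differing positions, which is why deciding which differences matter is a statistical problem.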
Opportunities (II)
• Variation patterns that we see reflect evolutionary forces
– Mutation, selection, recombination, migration,...
– Do these forces work in similar ways in both species?
– Are, e.g., similar types of gene under selection in humans and chimpanzees?
– We can learn how dynamic, or conserved, these forces are
• A particular interest is recombination:
[Figure labels: Father, Mother, Child]
– This does differ, strongly, between the species, at least at scales of thousands of bases
– Is there any sharing? At what scales?
– How does DNA sequence relate to recombination in chimps and humans?
– Knowing the answer will lead directly to information about constraints on recombination
– In turn, perhaps insights in humans: disease, infertility, and differences among populations
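As a rough illustration of what recombination does between parent and child, the following Python sketch builds each transmitted chromosome as a mosaic of a parent's two copies. It is a toy model, not the method used in the course: the per-site switching probability `rate` and the function names are assumptions made for illustration.

```python
import random


def make_gamete(hap_a, hap_b, rate, rng):
    """Copy along one parent's two haplotypes, switching between them
    with probability `rate` between each pair of adjacent sites."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have the same length")
    haplotypes = (hap_a, hap_b)
    current = rng.choice((0, 1))  # which copy we start reading from
    gamete = []
    for i in range(len(hap_a)):
        if i > 0 and rng.random() < rate:
            current = 1 - current  # crossover: switch to the other copy
        gamete.append(haplotypes[current][i])
    return "".join(gamete)


def make_child(father, mother, rate, rng):
    """Return the child's two haplotypes, one from each parent."""
    return (make_gamete(father[0], father[1], rate, rng),
            make_gamete(mother[0], mother[1], rate, rng))


if __name__ == "__main__":
    rng = random.Random(1)
    father = ("AAAAAAAAAAAA", "CCCCCCCCCCCC")
    mother = ("GGGGGGGGGGGG", "TTTTTTTTTTTT")
    print(make_child(father, mother, rate=0.2, rng=rng))
```

With `rate=0` the child receives an intact copy of one haplotype from each parent. A rate that varies along the sequence would be needed to mimic the strong local differences mentioned above.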
Challenges
...as night follows day
A lot of data
– Computationally intensive
– Need for careful algorithm
construction
Go from “raw” sequence to something useful:
– Must align to compare species, dealing with errors in the data
Understand how the forces we care about influence the data
– Evolutionary modelling
– Think about relationships among individuals in the sample
– Development of inference techniques
– Must be applicable to these large datasets
The aim of this module is to give an introduction to:
– Approaches to address these types of challenges
– What we have already learnt using bioinformatics and mathematical genetics
Brief overview
• Mixture of lectures, practicals, exercises and some reading
• Week 1: Genomes, genetic variation and evolutionary forces
– Today: how and why do genomes evolve?
– Later in the week:
– Alignment of genomes, to e.g. discover genes
– Phylogenetics: to build trees
– Modelling evolution, to relate biological parameters and data
– Exploring variation patterns in practice
– Inference on biological parameters
• Week 2: Phenotype and function
– 3-day project, led by Jotun Hein, to identify, and analyse, the evolution of a
unique functional element in our genome
– This is also the basis of the assessment, by presentation
– Relating variation among individuals to human phenotypes
– What mutations cause disease, differences in metabolite levels....
Introductory practical
The first practical explores the role of
randomness in evolution
Some mutations carry a benefit to those
who carry them, or are deleterious
– Individuals with beneficial mutations
have, on average, more children
However, inheritance is highly stochastic
– Luck is involved in successfully finding a mate, and raising children
– Parents pass on a random subset of the mutations they carry to their children
Randomness and selection act in opposition
– How does randomness affect selection?
– How often do “neutral” mutations, that are neither beneficial nor deleterious,
succeed?
– These questions can be explored with the Wright-Fisher model
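The questions above can be explored with a small simulation. The sketch below is one minimal version of a Wright-Fisher model, not necessarily the one used in the practical: it assumes a haploid population of fixed size `pop_size`, a single new mutation, and carriers with relative fitness `1 + s`. All function and parameter names are illustrative.

```python
import random


def wright_fisher_fixation(pop_size, s=0.0, start_copies=1, rng=None):
    """Follow one mutation until it is lost or fixed; return True if fixed."""
    rng = rng or random.Random()
    copies = start_copies
    while 0 < copies < pop_size:
        # Selection: carriers are weighted by relative fitness 1 + s.
        weighted = copies * (1 + s)
        p = weighted / (weighted + (pop_size - copies))
        # Randomness: each member of the next generation picks a parent
        # independently, so the new count is a binomial draw.
        copies = sum(1 for _ in range(pop_size) if rng.random() < p)
    return copies == pop_size


def fixation_fraction(pop_size, s, replicates, seed=0):
    """Fraction of independent replicates in which the mutation fixes."""
    rng = random.Random(seed)
    fixed = sum(wright_fisher_fixation(pop_size, s, rng=rng)
                for _ in range(replicates))
    return fixed / replicates


if __name__ == "__main__":
    for s in (0.0, 0.05, 0.2):
        print(s, fixation_fraction(pop_size=50, s=s, replicates=2000))
```

Under this model a neutral mutation (`s = 0`) starting as a single copy fixes with probability `1 / pop_size`, and most beneficial mutations are still lost by chance in the first few generations. Varying `pop_size` and `s` shows how randomness and selection trade off.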