Download Lecture #7

Summary of last lesson
• Excellent review of techniques for population genetics
• Methods of analysis
• Previous lesson: density dependence / Janzen-Connell hypothesis / Red Queen hypothesis / types of markers
• Humungous fungus
• Testing the marker / testing sample size
Frequency- or density-dependent selection, or balancing selection
• New alleles, if beneficial because they are linked to
a trait that affects fitness, will be positively
selected for.
– Example: two races of pathogen are present, but
only one resistant host variety, suggests second
pathogen race has arrived recently
• Rapid generation time of pathogens. Reticulated evolution
very likely. Pathogens will be selected for INCREASED
virulence
• In the short/medium term with long lived trees a pathogen
is likely to increase its virulence
• In long term, selection pressure should result in
widespread resistance among the host
Overview
Armillaria bulbosa (gallica)
• Known as the Humungous Fungus,
or honey mushroom
• Form rhizomorphs, which make up
much of the “humungous” part
• Basidiocarp: cap 6 cm in diameter,
stem is 5-10 cm tall
• Facultative tree root pathogen
Life cycle: Reproduction
• Sexual
– Basidiocarps release spores (n) after
karyogamy and meiosis
– 2 mating-type loci, each with multiple
alleles in the population
– Isolates (n) must have different alleles
at two mating type loci to be sexually
compatible
• Asexual
– vegetative spreading of rhizomorph
• The large mass of rhizomorph that is
genetically isolated is called a clone
Building up the question…
• “By extending the areas sampled in
subsequent years, we were finally able to
delimit the large area occupied by this
genotype and then go on to show that this
genotype likely represents an ‘individual’”
- Myron Smith
Researcher’s Question
• The clonal “individual” is especially
difficult to define because the network of
hyphae is underground
• How do you unambiguously identify an
individual fungus within a local population?
Approach
1. Collect samples
2. Check mating type
- Somatic
compatibility test
- Distribution of
mating-type alleles
3. Molecular testing
- RFLP
- RAPD
4. Statistics
5. More testing
Methods and Materials 1
1. Collecting samples
• The researchers collected samples over a 30-hectare
area by baiting Armillaria with poplar stakes
and taking tissues and spores
• They then grew the successfully colonized
stakes in soil taken from the study site
• Each fungal colony cultured was called an
isolate.
Methods and Materials 2
Example (not Armillaria)
2. Checking mating type
- Somatic incompatibility
For two fungal isolates to fuse, all
somatic compatibility loci must be
the same.
Fusion means they’re clones
Methods and Materials 2
• 2. Checking mating type
- Distribution of mating alleles
- Mating occurs only when coupled isolates have
different alleles at two unlinked, multiallelic loci: A
and B. (They have an incompatibility system)
- If fruit bodies had the same alleles at A and B, and
were collected from the same area, they were assumed
to be from the same clone
Result 1
• Somatic compatibility:
– isolates from vegetative mycelium from a large
sampling area fused
• Mating alleles
– They had the same mating type
Result 1
• “Clone 1” was found to
exceed 500 m in
diameter
– Used previously
collected mtDNA
restriction fragment
patterns
Sensitivity of Approach
• Problem: These tests alone are not enough
to distinguish a clone from closely related
individuals
Why?
• Q: The first two tests were not sensitive enough to
tell a clone from a close relative…Why?
• A: Spores from same point source have the same
mating-type alleles, but the offspring they produce
after inbreeding are genetically distinct.
Methods and Materials 3
3. Molecular Testing
- RFLP analysis at 5 polymorphic, heterozygous
loci of mtDNA from “Clone 1”
- RAPD analysis at 11 loci
RAPDs vs. RFLPs
RAPDs
• Use 1 short PCR primer
• A RAPD amplicon is produced when the primer finds matches on the template at a distance that can be amplified (primer binds twice within 50 to 2000 bp)
• Dominant, anonymous
• Total genomic, vs. single locus
RFLPs
• Use an endonuclease to digest DNA at specific restriction sites
• Run the digest and see how the amplicon was cut
• Single locus, codominant
Result 2
• RFLP
– All 5 loci from Clone 1 were heterozygous and identical
(both alleles present at loci: 1,1)
• RAPD
– All 11 RAPD products were present in all vegetative
isolates
Statistical Analysis
• The probability of retaining heterozygosity at
each parental locus in an individual produced
by mating of sibling monospore isolates…
= 0.0013
• So they were pretty confident that cloning was
responsible for their results, not inbreeding
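One calculation that gives a figure of this size is sketched below. It is an illustration under stated assumptions, not necessarily the authors' own computation: the 5 RFLP loci are treated as codominant (a sib mating stays heterozygous with probability 1/2 per locus), the 11 RAPD fragments as dominant (the band is retained with probability 3/4 per locus), and all loci as unlinked.

```python
# Assumed per-locus probabilities for a mating between two sibling monospore
# (haploid) isolates from a parent that is heterozygous at every locus.
P_CODOMINANT = 0.5   # the two sibs carry different alleles -> offspring heterozygous
P_DOMINANT = 0.75    # at least one sib carries the band -> band present in offspring


def prob_retain_all(n_codominant: int, n_dominant: int) -> float:
    """Probability that every parental marker is retained, assuming unlinked loci."""
    return (P_CODOMINANT ** n_codominant) * (P_DOMINANT ** n_dominant)


p = prob_retain_all(n_codominant=5, n_dominant=11)
print(f"{p:.4f}")  # 0.0013
```

Under these assumptions the product is small enough that inbreeding becomes an unlikely explanation for identical multilocus genotypes.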
More testing, just in case
• To be completely confident, they tested:
– 1) that nearby Clone 2 was different and lacked 5 of
the Clone 1 heterozygous RAPD fragments,
– 2) more loci, totaling
• 20 RAPD fragments
• 27 nuclear DNA RFLP fragments
** all were identical in Clone 1
Sensitivity of RAPDs
• Tested on subset of spores from same
basidiocarp
• RAPDs differentiated among full sibs
Conclusions
• Somatic compatibility, mating allele loci,
mtDNA, RFLP, and RAPD tests all indicate
that a single organism could indeed occupy
a 15 hectare area
Conclusions
• The larger individual, Clone 1, was
estimated to weigh 9700 kg and be over
1500 years old
Implications
• ?????
• Fungi are among the oldest and largest organisms
on the planet
• Recycle nutrients…very important!
• Armillaria bulbosa also a pathogen; its effects on
forest above may be huge as well.
HOST-SPECIFICITY
• Biological species
• Reproductively isolated
• Measurable differential: size of structures
• Gene-for-gene defense model
• Sympatric speciation: Heterobasidion,
Armillaria, Sphaeropsis, Phellinus,
Fusarium formae speciales
[Figure: neighbor-joining (NJ) tree of phylogenetic relationships within the Heterobasidion complex. Tip labels: Het INSULARE; Fir-Spruce; True Fir EUROPE; Spruce EUROPE; True Fir NAMERICA; Pine EUROPE; Pine NAMERICA. Scale bar: 0.05 substitutions/site.]
The biology of the organism
drives an epidemic
• Autoinfection vs. alloinfection
• Primary spread=by spores
• Secondary spread = vegetative, clonal spread, same
genotype. Completely different scales (from small to
gigantic)
Coriolus
Heterobasidion
Armillaria
Phellinus
OUR ABILITY TO:
• Differentiate among different individuals
(genotypes)
• Determine gene flow among different areas
• Determine allelic distribution in an area
WILL ALLOW US TO
DETERMINE:
• How often primary infection occurs or is disease
mostly chronic
• How far can the pathogen move on its own
• Is the organism reproducing sexually? Is the
source of infection local, or does it need input from
the outside?
IN ORDER TO UNDERSTAND
PATTERNS OF INFECTION
• If John gave Mary an infection directly, and Mary gave it
to Tom, they should all have the same strain, or
GENOTYPE (comparison = secondary spread among forest
trees)
• If the pathogen is airborne and sexually reproducing, Mary,
John and Tom will be infected by different genotypes. But
if the source is the same, the genotypes will be sibs, and thus
related
Recognition of self vs. non self
• Intersterility genes: maintain species gene
pool. Homogenic system
• Mating genes: recognition of “other” to
allow for recombination. Heterogenic
system
• Somatic compatibility: protection of the
individual.
Recognition of self vs. non self
• What are the chances two different
individuals will have the same set of VC
alleles?
• Probability calculation (multiply frequency
of each allele)
• More powerful the larger the number of loci
• …and the larger the number of alleles per
locus
Recognition of self vs. non self
• It is possible to have different genotypes
with the same vc alleles
• VC grouping and genotyping is not the
same
• It allows for genotyping without genetic
tests
• Reasons behind the VC system: protection of
resources/avoidance of viral contagion
Somatic incompatibility
More on somatic compatibility
• Perform calculation on power of approach
• Temporary compatibility allows for
cytoplasmic contact that then is interrupted:
this temporary contact may be enough for
viral contagion
SOMATIC COMPATIBILITY
• Fungi are territorial for two reasons
– Selfish
– Do not want to become infected
• For haploids it is a benefit to mate with another individual, but
then the n+n wants to keep all other genotypes out
• Hyphae fuse only if all alleles are the same
• If most alleles are the same, but not all, fusion
is only temporary
SOMATIC COMPATIBILITY
• SC can be used to identify genotypes
• SC is regulated by multiple loci
• Individuals that are compatible (recognize one
another as self) are within the same SC group
• SC group is used as a proxy for genotype, but in
reality, you may have some different genotypes
that by chance fall in the same SC group
• Happens often among sibs, but can happen by
chance too among unrelated individuals
Recognition of self vs. non self:
probability of identity (PID)
• 4 loci
• 3 biallelic
• 1 penta-allelic
• P= 0.5x0.5x0.5x0.2=0.025
• In humans 99.9%, 1000, 1 in one million
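The multiplication on this slide can be written as a short sketch. It assumes independent loci and takes one allele frequency per locus; the equal frequencies (1/2 for a biallelic locus, 1/5 for a penta-allelic one) are the values used on the slide, and the function name is illustrative.

```python
from math import prod


def probability_of_identity(allele_frequencies):
    """Multiply one allele frequency per locus (assumes independent loci)."""
    return prod(allele_frequencies)


# Three biallelic loci (frequency 0.5 each) and one penta-allelic locus (0.2).
pid = probability_of_identity([0.5, 0.5, 0.5, 0.2])
print(round(pid, 3))  # 0.025
```

Adding loci, or using loci with more alleles, makes the product smaller, which is why the approach becomes more powerful.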
INTERSTERILITY
• If a species has arisen, it must have some
adaptive advantages that should not be
watered down by mixing with other species
• Will allow mating to happen only if
individuals recognized as belonging to the
same species
• Plus alleles at one of 5 loci (S P V1 V2 V3)
INTERSTERILITY
• Basis for speciation
• These alleles are selected for more strongly
in sympatry
• You can have different species in allopatry
that have not been selected for different IS
alleles
MATING
• Two haploids need to fuse to form n+n
• Sex needs to increase diversity: need
different alleles for mating to occur
• Selection for equal representation of many
different mating alleles
MATING
• If one individual is the source of inoculum,
then the same 2 mating alleles will be found
in the local population
• If inoculum is of broad provenance then
multiple mating alleles should be found
MATING
• How do you test for mating?
• Place two homokaryons in same plate and
check for formation of dikaryon
(microscopic clamp connections at septa)
Clamp connections
MATING ALLELES
• All heterokaryons will have two mating alleles,
for instance a, b
• There is an advantage in having more mating
alleles (easier mating, higher chances of finding a
mate)
• A mating allele that is rare may belong to a migrant that has just
arrived
• If a parent is an important source, genotypes should
all be of one or two mating types
Two scenarios:
• A, A, B, C, D, D, E, H, I, L: multiple sources of infection (at least 4 genotypes)
• A, A, A, B, B, A, A: siblings as source of infection (1 genotype)
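The genotype counts in the two scenarios follow from the fact that each heterokaryon carries two mating alleles, so the number of distinct alleles observed sets a lower bound on the number of source genotypes. A minimal sketch (the function name is illustrative, not from the lecture):

```python
from math import ceil


def min_genotypes(mating_alleles):
    """Lower bound on heterokaryotic genotypes: each one carries two mating alleles."""
    return ceil(len(set(mating_alleles)) / 2)


scenario_1 = ["A", "A", "B", "C", "D", "D", "E", "H", "I", "L"]
scenario_2 = ["A", "A", "A", "B", "B", "A", "A"]
print(min_genotypes(scenario_1), min_genotypes(scenario_2))  # 4 1
```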
SEX
• Ability to recombine and adapt
• Definition of population and
metapopulation
• Different evolutionary model
• Why sex? Clonal reproductive approach can
be very effective among pathogens
Long branches between groups suggest that no sex is occurring between groups
[Figure: NJ tree of the Heterobasidion complex. Tip labels: Het INSULARE; Fir-Spruce; True Fir EUROPE; Spruce EUROPE; True Fir NAMERICA; Pine EUROPE; Pine NAMERICA. Scale bar: 0.05 substitutions/site.]
Small branches within a clade indicate sexual reproduction is ongoing within that group of individuals
[Figure: NJ tree of Heterobasidion isolates from CA, WA, MEX and EU, labelled by intersterility group (SISG, PISG, FISG, SPISG). Clades: NA S, EU S, EU F, NA P. Annotations: 890 bp; CI>0.9. Scale bar: 0.0005 substitutions/site.]
Index of association
Ia: if the same alleles are associated with one another
more often than expected at random, it means
sex is not occurring
The association among alleles is calculated
and compared to a simulated random
distribution
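A minimal sketch of this comparison follows. It assumes the commonly used definition Ia = V_observed / V_expected - 1, where V_observed is the variance of the number of loci at which pairs of individuals differ and V_expected is the sum of the per-locus variances. The randomization (shuffling alleles within each locus) and all function names are illustrative choices, not taken from the lecture.

```python
import random
from itertools import combinations


def _variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def index_of_association(genotypes):
    """Ia = V_observed / V_expected - 1 for rows of alleles (one column per locus)."""
    pairs = list(combinations(genotypes, 2))
    n_loci = len(genotypes[0])
    # Per-locus mismatch (0/1) for every pair of individuals.
    per_locus = [[int(a[j] != b[j]) for a, b in pairs] for j in range(n_loci)]
    totals = [sum(col[k] for col in per_locus) for k in range(len(pairs))]
    expected = sum(_variance(col) for col in per_locus)
    return _variance(totals) / expected - 1


def permutation_p_value(genotypes, n_perm=999, seed=0):
    """Share of datasets, shuffled within each locus, with Ia >= the observed Ia."""
    rng = random.Random(seed)
    observed = index_of_association(genotypes)
    columns = [list(col) for col in zip(*genotypes)]
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(col, len(col)) for col in columns]
        if index_of_association(list(zip(*shuffled))) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical clonal data: three loci that are always inherited together.
clonal = [(0, 0, 0), (1, 1, 1), (0, 0, 0), (1, 1, 1)]
print(index_of_association(clonal))
```

With perfectly linked loci, as in this toy data set, Ia is well above zero; under random mating it is expected to be close to zero.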
Evolution and Population
genetics
• Positively selected genes:……
• Negatively selected genes……
• Neutral genes: normally population genetics
demands loci used are neutral
• Loci under balancing selection…..
Evolutionary history
• Darwinian vertical evolutionary models
• Horizontal, reticulated models..
[Figure: NJ tree of phylogenetic relationships within the Heterobasidion complex (Het INSULARE; Fir-Spruce; True Fir EUROPE; Spruce EUROPE; True Fir NAMERICA; Pine EUROPE; Pine NAMERICA). Scale bar: 0.05 substitutions/site.]
Genealogy of “S” DNA insertion into P ISG confirms horizontal transfer.
Time of “cross-over” uncertain
[Figure: NJ tree of Heterobasidion isolates from CA, WA, MEX and EU, labelled by intersterility group (SISG, PISG, FISG, SPISG). Clades: NA S, EU S, EU F, NA P. Annotations: 890 bp; CI>0.9. Scale bar: 0.0005 substitutions/site.]
Because of complications such
as:
• Reticulation
• Gene homogenization… (Gene duplication)
• Need to make inferences based on multiple genes
• Multilocus analysis also makes it possible to differentiate
between sex and lack of sex (Ia=index of association), and
to identify genotypes, and to study gene flow
Basic definitions again
• Locus
• Allele
• Dominant vs. codominant marker
– RAPDS
– AFLPs
How to get multiple loci?
• Random genomic markers:
– RAPDS
– Total genome RFLPS (mostly dominant)
– AFLPS
• Microsatellites
• SNPs
• Multiple specific loci
– SSCP
– RFLP
– Sequence information
Watch out for linked alleles (basically you are looking at the same thing!)
RAPDS use short primers but not
too short
• Need to scan the genome
• Need to be “readable”
• 10mers do the job (unfortunately annealing
temperature is pretty low and a lot of
priming errors cause variability in data)
RAPDs can also be obtained
with Arbitrarily Primed PCR
• Use longer primers
• Use less stringent annealing conditions
• Less variability in results
Result: series of bands that are
present or absent (1/0)
Root disease center in true fir caused by H. annosum
Ponderosa pine
Incense cedar
Yosemite Lodge 1975 Root disease centers outlined
Yosemite Lodge 1997 Root disease centers outlined
WORK ON PINES HAD
DEMONSTRATED INFECTIONS ARE
MOSTLY ON STUMPS
• Use meticulous field work and genetics
information to reconstruct disease from
infection to explosion
• On firs/sequoia if the stump theory were
also correct we would find a stump within
the outline of each genotype
Are my haplotypes sensitive
enough?
• To validate the power of the tool used, one needs to
be able to differentiate among closely
related individuals
• Generate progeny
• Make sure each meiospore has different
haplotype
• Calculate P
RAPD combination
1             2
1010101010    1011101010
1010101010    1010111010
1010101010    1010001010
1010101010    1010000000
1011001010    1011110101
Conclusions
• Only one RAPD combo is sensitive enough
to differentiate 4 half-sibs (in white)
• Mendelian inheritance?
• By analysis of all haplotypes it is apparent
that two markers are always cosegregating,
one of the two should be removed
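The two checks behind these conclusions can be sketched as follows. The sketch assumes the haplotypes on the previous slide alternate between combination 1 and combination 2 row by row, and that the first four rows are the four sibs; both readings are assumptions, since the slide's colour coding is not recoverable. Function names are illustrative.

```python
# Haplotypes as read from the slide (assumed layout: combination 1, then
# combination 2, row by row; the first four rows are taken to be the sibs).
combo_1 = ["1010101010", "1010101010", "1010101010", "1010101010"]
combo_2 = ["1011101010", "1010111010", "1010001010", "1010000000"]


def distinguishes_all(haplotypes):
    """True if every individual has a unique haplotype."""
    return len(set(haplotypes)) == len(haplotypes)


def cosegregating_markers(haplotypes):
    """Pairs of variable marker positions (0-based) with identical band patterns."""
    n_markers = len(haplotypes[0])
    columns = ["".join(h[i] for h in haplotypes) for i in range(n_markers)]
    return [(i, j)
            for i in range(n_markers) for j in range(i + 1, n_markers)
            if columns[i] == columns[j] and len(set(columns[i])) > 1]


print(distinguishes_all(combo_1))      # False
print(distinguishes_all(combo_2))      # True
print(cosegregating_markers(combo_2))  # [(6, 8)]
```

Markers flagged as cosegregating carry the same information, so keeping only one of each pair avoids counting the same difference twice.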
If we have codominant markers,
how many do I need?
• IDENTITY tests = probability calculation
based on allele frequency… Multiplication
of frequencies of alleles
• 10 alleles at locus 1: P1 = 0.1
• 5 alleles at locus 2: P2 = 0.2
• Total P = P1*P2 = 0.02
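To turn the slide's question into a number, the sketch below counts how many loci are needed before the combined probability falls to a chosen target. It assumes independent loci that are all equally informative; the function name, the target value and the equal-frequency simplification are illustrative assumptions.

```python
def loci_needed(per_locus_probability: float, target: float) -> int:
    """Smallest number of equally informative, independent loci with P <= target."""
    if not 0 < per_locus_probability < 1 or target <= 0:
        raise ValueError("probabilities must lie strictly between 0 and 1")
    n, p = 0, 1.0
    while p > target:
        p *= per_locus_probability
        n += 1
    return n


# Slide example: P1 = 0.1 (10 alleles) and P2 = 0.2 (5 alleles).
total_p = 0.1 * 0.2
print(round(total_p, 2))       # 0.02

# Hypothetical target: loci with 5 equally frequent alleles, P <= 0.01.
print(loci_needed(0.2, 0.01))  # 3
```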
Have we sampled enough?
• Resampling approaches
• Saturation curves
– A total of 30 polymorphic alleles
– Our sample is either 10 or 20
– Calculate whether each new sample is
characterized by new alleles
Saturation (rarefaction) curves
[Figure: number of new alleles (y-axis) plotted against number of samples, 1 to 20 (x-axis).]
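A resampling version of the saturation curve can be sketched as below: samples are added in random order many times, and the mean number of distinct alleles seen so far is recorded at each step. The data are simulated (20 samples drawn from 30 alleles, as in the slide's example sizes); the five alleles per sample and all names are illustrative assumptions.

```python
import random


def accumulation_curve(samples, n_resamples=200, seed=0):
    """Mean number of distinct alleles seen after 1..N samples, over random orders."""
    rng = random.Random(seed)
    totals = [0.0] * len(samples)
    for _ in range(n_resamples):
        order = rng.sample(samples, len(samples))
        seen = set()
        for k, alleles in enumerate(order):
            seen |= set(alleles)
            totals[k] += len(seen)
    return [t / n_resamples for t in totals]


# Hypothetical data: 20 samples, each carrying 5 of 30 possible alleles.
demo_rng = random.Random(1)
samples = [demo_rng.sample(range(30), 5) for _ in range(20)]
curve = accumulation_curve(samples)
for n, mean_alleles in enumerate(curve, start=1):
    print(n, round(mean_alleles, 1))
```

If the curve is still rising steeply at the largest sample size, more sampling is needed; a plateau suggests most alleles have been found.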
Dealing with dominant
anonymous multilocus markers
• Need to use large numbers (linkage)
• Repeatability
• Graph distribution of distances
• Calculate distance using Jaccard’s similarity
index
Jaccard’s
• Only 1-1 and 1-0 count, 0-0 do not count
A: 1010011
B: 1001011
C: 1001000
Similarity: AB = 0.6, BC = 0.5, AC = 0.2
Distance (1 - similarity): AB = 0.4, BC = 0.5, AC = 0.8
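The three values on this slide can be reproduced with a short sketch. Similarity is the number of bands shared by both profiles divided by the number of bands present in at least one of them, so shared absences (0-0) are ignored. The function name is illustrative.

```python
from itertools import combinations


def jaccard_similarity(a: str, b: str) -> float:
    """Shared bands (1-1) divided by bands present in at least one profile."""
    shared = sum(x == "1" and y == "1" for x, y in zip(a, b))
    either = sum(x == "1" or y == "1" for x, y in zip(a, b))
    return shared / either if either else 0.0


profiles = {"A": "1010011", "B": "1001011", "C": "1001000"}
for (name_1, p1), (name_2, p2) in combinations(profiles.items(), 2):
    similarity = jaccard_similarity(p1, p2)
    print(name_1 + name_2, round(similarity, 2), round(1 - similarity, 2))
```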
Now that we have distances….
• Plot their distribution (clonal vs. sexual)
• Analysis:
– Similarity (cluster analysis); a variety of
algorithms. Most common are NJ and UPGMA
– AMOVA; requires a priori grouping
AMOVA groupings
• Individual
• Population
• Region
AMOVA: partitions molecular variance
amongst a priori defined groupings
Example
• SPECIES X: 50%blue, 50% yellow
AMOVA: example
[Figure: Scenario 1 vs. Scenario 2, each showing POP 1 and POP 2.]
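The partitioning can be illustrated with a one-level sketch for the blue/yellow example. Because the slide's figure is not recoverable, the two scenarios below are hypothetical: in one, both populations are half blue and half yellow; in the other, each population is fixed for one colour. The code assumes the standard sums-of-squared-distances formulation of AMOVA, with distance 0 between identical types and 1 otherwise; it is a simplified illustration, not a full AMOVA implementation.

```python
from itertools import combinations


def phi_st(populations):
    """One-level AMOVA-style Phi_ST for populations of categorical types.

    The squared distance between two individuals is 0 if identical, 1 otherwise.
    Undefined (division by zero) if all individuals are identical.
    """
    def ssd(individuals):
        # Sum of squared pairwise distances divided by group size.
        n = len(individuals)
        return sum(a != b for a, b in combinations(individuals, 2)) / n

    everyone = [ind for pop in populations for ind in pop]
    n_total, n_pops = len(everyone), len(populations)
    ssd_within = sum(ssd(pop) for pop in populations)
    ssd_among = ssd(everyone) - ssd_within
    ms_within = ssd_within / (n_total - n_pops)
    ms_among = ssd_among / (n_pops - 1)
    n0 = (n_total - sum(len(p) ** 2 for p in populations) / n_total) / (n_pops - 1)
    var_among = (ms_among - ms_within) / n0
    return var_among / (var_among + ms_within)


# Hypothetical scenarios for SPECIES X (50% blue, 50% yellow overall).
mixed = [["blue", "blue", "yellow", "yellow"], ["blue", "blue", "yellow", "yellow"]]
fixed = [["blue"] * 4, ["yellow"] * 4]
print(round(phi_st(mixed), 3))  # -0.333, usually read as no differentiation
print(phi_st(fixed))            # 1.0, all variance is among populations
```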
Expectations for fungi
• Sexually reproducing fungi characterized by high
percentage of variance explained by individual
populations
• Amount of variance between populations and
regions will depend on ability of organism to
move, availability of host, and
• NOTE: if genotypes are not sensitive enough, so that
you are calling “the same” things that are different,
you may get unreliable results such as 100% of the variance
within populations and none among populations
Results: Jaccard similarity coefficients
[Figure: frequency histogram of pairwise Jaccard similarity coefficients for P. nemorosa. x-axis: coefficient (0.90 to 1.00); y-axis: frequency (0 to 0.7).]
[Figure: frequency histogram of pairwise Jaccard similarity coefficients for P. pseudosyringae, U.S. and E.U. x-axis: coefficient (0.90 to 1.00); y-axis: frequency (0 to 0.7).]
P. pseudosyringae genetic similarity
patterns are different in U.S. and E.U.
[Figure: frequency (0.0 to 0.7) against Jaccard coefficient of similarity (0.9 to 0.99), plotted separately for Pp U.S. and Pp E.U.]
Results: P. nemorosa
[Figure: dendrogram of P. nemorosa isolates, also showing P. ilicis and P. pseudosyringae. Scale bar: 0.1.]
Results: P. pseudosyringae
[Figure: dendrogram of P. pseudosyringae isolates, also showing P. ilicis and P. nemorosa. E.U. isolates are marked in the legend. Scale bar: 0.1.]
The “scale” of disease
• Dispersal gradients depend on propagule size,
resilience, ability to desiccate. NOTE: not linear
• Important interaction with environment, habitat, and niche
availability. Examples: Heterobasidion in Western Alps,
Matsutake mushrooms that offer example of habitat
tracking
• Scale of dispersal (implicitly correlated to
metapopulation structure)
RAPDs: not used often now
RAPD DATA W/O COSEGREGATING MARKERS
PCA
AFLP
• Amplified Fragment Length Polymorphisms
• Dominant marker
• Scans the entire genome like RAPDs
• More reliable because it uses longer PCR
primers, which are less likely to mismatch
• Priming sites are a construct of the sequence
in the organism and a piece of synthesized
DNA
How are AFLPs generated?
• AGGTCGCTAAAATTTT (restriction site in red)
• AGGTCG
CTAAATTT
• Synthetic DNA piece ligated
– NNNNNNNNNNNNNNCTAAATTTTT
• Created a new PCR priming site
– NNNNNNNNNNNNNNCTAAATTTTT
• Every time two PCR priming sites are within 400-1600 bp you obtain amplification
Distances between study sites
             Coco Solo   Mananti   Ponsok   David
Coco Solo        0         237       273     307
Mananti                      0        60      89
Ponsok                                 0     113
David                                          0
White mangroves:
Coriolopsis caperata
Forest fragmentation can lead to loss of gene flow among
previously contiguous populations. The negative
repercussions of such genetic isolation should most severely
affect highly specialized organisms such as some plant-parasitic fungi.
AFLP study on single spores
Coriolopsis caperata on Laguncularia racemosa
Site         # of isolates   # of loci   % fixed alleles
Coco Solo         11            113           2.6
David             14            104           3.7
Bocas             18             92          15.04
             Coco Solo   Bocas    David
Coco Solo      0.000     0.000    0.000
Bocas          0.2083    0.000    0.000
David          0.1109    0.2533   0.000
Distances = PhiST between pairs of
populations. Above the diagonal is the probability that a
random distance > observed distance (1000
iterations).
Using DNA sequences
• Obtain sequence
• Align sequences, number of parsimony informative sites
• Gap handling
• Picking sequences (order)
• Analyze sequences
(similarity/parsimony/exhaustive/Bayesian)
• Analyze output: CI, HI, bootstrap/decay indices
Using DNA sequences
• Testing alternative trees: Kishino-Hasegawa
• Molecular clock
• Outgroup
• Spatial correlation (Mantel)
• Networks and coalescence approaches
From Garbelotto and Chapela,
Evolution and biogeography of matsutakes
Biodiversity within species
as significant as between
species
Microsatellites or SSRs
• AGTTTCATGCGTAGGT CG CG CG CG CG
AAAATTTTAGGTAAATTT
• Number of CG is variable
• Design primers on FLANKING region, amplify DNA
• Electrophoresis on gel, or capillary
• Size the allele (different by one or more repeats; if number
does not match there may be polymorphisms in flanking
region)
• Stepwise mutational process (2 to 3 to 4 to 3 to 2 repeats)
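The sizing step can be sketched in a few lines using the example sequence from this slide. The sketch counts the longest uninterrupted run of the repeat unit and converts an amplicon size difference into a number of repeats; a difference that is not a whole number of repeats is returned as None, matching the slide's warning about flanking-region polymorphisms. Function and variable names are illustrative.

```python
import re


def longest_repeat_run(sequence: str, unit: str = "CG") -> int:
    """Number of copies of `unit` in the longest uninterrupted run."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def repeat_difference(size_a: int, size_b: int, unit_len: int = 2):
    """Repeat-unit difference between two amplicon sizes, or None if not whole."""
    diff = size_b - size_a
    return diff // unit_len if diff % unit_len == 0 else None


FLANK_5 = "AGTTTCATGCGTAGGT"
FLANK_3 = "AAAATTTTAGGTAAATTT"
allele = FLANK_5 + "CG" * 5 + FLANK_3

print(longest_repeat_run(allele))                       # 5
print(repeat_difference(len(allele), len(allele) + 4))  # 2
print(repeat_difference(44, 45))                        # None
```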