Download Interference Do crossovers interefere with one another? Or, if a

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Essential gene wikipedia , lookup

Epigenetics of neurodegenerative diseases wikipedia , lookup

Population genetics wikipedia , lookup

No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup

Gene desert wikipedia , lookup

Hardy–Weinberg principle wikipedia , lookup

Therapeutic gene modulation wikipedia , lookup

Public health genomics wikipedia , lookup

Nutriepigenomics wikipedia , lookup

RNA-Seq wikipedia , lookup

Genomic imprinting wikipedia , lookup

History of genetic engineering wikipedia , lookup

Gene expression programming wikipedia , lookup

Minimal genome wikipedia , lookup

Gene wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Epigenetics of human development wikipedia , lookup

Genome evolution wikipedia , lookup

Quantitative trait locus wikipedia , lookup

Ridge (biology) wikipedia , lookup

Genome (book) wikipedia , lookup

Designer baby wikipedia , lookup

Microevolution wikipedia , lookup

Biology and consumer behaviour wikipedia , lookup

Gene expression profiling wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

Neocentromere wikipedia , lookup

Cre-Lox recombination wikipedia , lookup

Transcript
Interference
Do crossovers interefere with one another?
Or, if a crossover occurs in one region, does this somehow inhibit or interfere with the
probability that a crossover will occur in a nearby region?
We can address this question using three-point crosses, and comparing the number of
double crossovers observed to that expected in the absence of interference. That is, if we
assume that the occurrence of crossovers is independent of one another.
In our first example three-point testcross above, we constructed the genetic map below,
and we observed a total of 10 double recombinants out of a total of 510 progeny
examined.
C
11.8cM
A
21.5cM
B
We can predict the proportion of double recombinants expected in the absence of
intereference. That is, we assume whether a recombination event occurs between C and A
is independent of whether one occurs in the region between A and B.
this is simply obtained as the product of the two probabilites.
So probability of double recombinant in the absence of interference is:
Prob(double) = 0.118 x 0.215 = 0.025
Since we examined 510 progeny, the expected number is 510 x 0.025 = 12.75 .
Recall that we observed 10 double recombinants. In this case there isn't much difference
between the observed and expected.
Interference is equals 1 - (observed double rec / expected double rec)
I = 1 - 10 / 12.75 = 0.22, or about 22 percent. (is this statistically significant we'd need to
carry out an appropriate statistical test and we'd need a reasonably large sample size).
In our second example of a three-point cross we observed 20 double recombinants out of
1000 progeny and deduced the map below. R
T
S
28 cM
20 cM
Expected proportion of double recombinants = 0.2 x 0.28 = 0.056
Expected number = 1000 x 0.056 = 56
I = 1- 20 / 56 = 0.64; Which happens to be about, 64%. Here, there may well be
interference, and interference usually occurs at some level in most organisms and is
particularly detectable for gene regions that are nearby .
Goodness of fit test for linkage
One issue we haven't yet addressed is a statistical test for linkage. If ratios for a testcross
depart radically from a 1:1:1:1 ratio, we can perhaps be confident that linkage might
occur, however, what if the ratios are off, but not by very much.
We can test this using a chi-square test similar to our previous ones. In this case,
however, we will do what often called a chi-square test for independence (that is we ask
whether the genes segregate independently). The test is used in other branches of biology
as well and may be called a test of independence, or a contigency test.
So, imagine we had the following data
Crossed AABB X aabb
F1 all
AaBb
Female
Testcross: AaBb
Male
aabb
x
AaBb
Aabb
aaBb
55
40
45
But he got:
aabb
60
There certainly seem to be more parentals than recombinants, but is this just due to
random variation or is there truly evidence for linkage here?
To do the chi-squared test for independence, we need first to generate the expected
numbers of progeny of each type assume there is no linkage and then compare this to the
observed data.
First organize the data in a square table as follows:
Observed table
Aa
aa
totals
Bb
55
45
100
bb
40
60
100
totals
95
105
200
Using this observed table, we can estimate the proportion of each genotype we expect to
see assuming independent assortment.
so for AaBb we expect 95/200 x 100/200 = 0.2375
or to convert this to the expected Number of AaBb progeny 200 x 0.2375 = 47.5
for Aabb we expect 95/200 x 100/200 = 0.2375 or 47.5 progeny
for aaBb we expect 105/200 x 100/200 = 0.2625 (x 200) give 52.5 progeny
for aabb we expect 105/200 x 100/200 = 0.2625 (x 200) gives 52.5 progeny
Observed table
Bb
55
45
100
bb
40
60
100
totals
95
105
200
Expected table
Bb
Aa
47.5
aa
52.5
totals
100
bb
47.5
52.5
100
totals
95
105
200
Aa
aa
totals
so now we have observed and expect so calculated the chi-square statistic as usual.
Chisq = sum{(obs-exp)2 /exp}
chisq = (55-47.5) 2 /47.5 + (40-47.5) 2 /47.5 + (45-52.5) 2 /52.5 +(60-52.5) 2 /52.5
= 4.51
How many degrees of freedom?
Well here we have departure from before.
We have 4 observed classes, so we lose 1 degree of freedom leaving 3.
However, we used the data to obtain the expecteds. In fact, we estimated two independent
probabilities from the data, the probability of being Aa and the probability of being Bb,
so we lose 1 degree of freedom for each.
So df = 4-1-2 = 1
The easy way to get degrees of freedom is just (nrows-1) x (ncolumns -1) = (2-1) x (2-1)
=1
since χ2 1, 0.05 = 3.841 and our value is greater than this we reject that null hypothesis that
the genes are unlinked (ie assort independtly) therefore we have evidence for linkage.
Now estimate the recombination frequency as: r = 85/200 = 0.425 or 42.5 %
Centromere mapping with tetrads
Note that we've considered the mapping of genes, and DNA-based markers which could
be a segment of a gene, and non-coding region of DNA, or even a single position on a
DNA sequence (SNP).
Using Neuropora tetrads, it is also possible to map the position of genes relative to the
centromere.
So let's return to our spore colour schematic in the asci of Neurospora.
Here we've generated a Dd heterozygote, and are following meiosis.
D gives dark spores, while d gives light coloured spores.
Now consider the location of the spore colour gene, D/d relative to the centromere.
Centromere
D
If there is no crossing over between the centromere and the locus D, then there are two
possible patterns of spore formation This is because of the way in which spores are
formed in a linear meitoic pattern and because the particular alleles remain attached to the
partental centromeres in the absence of recombination
1st meiotic
division
Diploid
meiocyte
2nd meiotic
division
mitosis
Mature ascus with 8
ascospores
And this reversed
So here are the two asci type without recombination between D and centromere.
d
d
d
d
D
D
D
D
If there is recombination between locus and centromere, this yields four possibilities
So now let's imagine you do this cross and score a large number of asci.
D
D
D
D
d
d
d
d
280
d
d
d
d
D
D
D
D
280
d
d
D
D
d
d
D
D
10
D
D
d
d
D
D
d
d
10
d
d
D
D
D
D
d
d
10
D
D
d
d
d
d
D
D
10
So, now we've counted the number of asci where there has been no recombination
between the D locus and the centromere, as well as those where there has been
recombination.
Note that here we are counting entire products of meiosis, not simply counting each
individual genotype. So let's consider one of the asci that had recombination. Recall that
recombination will involve a pair of chromatids, leaving the other pair as parentals or non
recombinants. Thus for each ascus that indicated recombination, only half of the progeny
(or spores) are the result of recombination.
So the recombination proportion r = (number recombinant asci/ 2)/total number of asci
r = (40/2)/ 600 = 0.033 or 3.3cM
Thus the map would appear as
Centromere
3.3cm
D
Lod Scores and Human pegrees and linkage.
One issue in human genetics is we can't force mating. People tend to be resistant to this.
So, we must use the matings/pedigrees available to us.
Furthermore, human families are typically small.
As a result, a single family alone will typically not provide sufficient power to determine
whether genes might, or might not be linked.
This issue can be of considerable significance in human medical genetics, where we
might wish to find a disease causing genes and the approach to use is typically to find
linked genetic markers, to allow us to close in on the gene.
To resolve this, one can combine information across a number of pedigrees, and an
approach to doing this involves the use of Lod (log of the odds ratio) scores. Lod scores
are also used in other branches of genetics, and they bear considerable resemblance to
maximum likelihood ratios (a statistical method for estimation).
The approach of using Lod scores, involves determining the probability of observing a
particular set of progeny for a single family first assuming the two genes are not linked
(i.e. assume independent assortment). It is then possible to construct the probability of
observing the progeny assuming a range of different linkage values (or the value of, r,
estimated from the family). The ratio of the probability with linkage, to that assuming no
linkage is calculated (the so-called odds ratios) and then simply take the log10 of the
number. The r value corresponding to the greatest Lod score, provides the best estimate
of the r. The Lod scores can simply be added up across a number of pedigrees. If the Lod
score exceeds 3 (this means probability of observing the progeny assuming linkage is
1000 times greater than that if the genes were unlinked) it is typically assumed that the
genes are linked.
An example perhaps illustrates this best.
Imagine you are interested in a disease causing gene, d, as opposed to the normal allele
D. And you also have a molecular marker with two alleles G1 and G2 where you can
detect both alleles using RFLPs or some other molecular method. You also know the
parental versus recombinant types because of information in the pedigree.
You explore a number of family pedigrees for one where the there is a double
heterozygote cross DdG1G2 x ddG2G2 (this is essentially a test cross)
So you then observe that six progeny are produced.
2 are DG1 parental
2 are dG2 parental
1 is dG1 recomb
1 is DG2 recomb
(we'd estimate r = 2/6 = 0.33)
Now we can write out the probability of observing these progeny assuming various r
values. First beginning with r = 0.5, or indept assort.
Under independent assortment we expect to see 0.25 of each progeny type.
So prob of observing these six progeny is .25x.25x.25x.25x.25x.25 = 0.000244
if r = .2
Then you expect to see 0.8/2 = 0.4 of each parental and 0.2/2 of each recomb.
So probability is 0.4*.4*.4*.4*.1*.1 = 0.000256
if r=.333
0.3334 x 0.1672 = 0.000343
etc. for other probs.
You can then do this for other r-values.
Then construct the ratio, and then the Log10 of the ratio
Tabulate below
Probabilty
Ratio
Lod score
.5
.4
.33
.3
.2
0.000244
1
0.000324
1.327104
0.12290496
0.000343
1.404453
0.14750706
0.000338
1.382976
0.14081464
0.000256
1.048576
0.02059991
0
So here we can see, not surprizingly that the highest LOD score is where r = 0.33 which
is the estimate of we'd expect , r = 2recomb/6 progeny = 0.333.
The LOD value for r = 0.33 falls well below 3.0 and so to ask whether these genes are
truly linked we would need to examine more pedigrees segregating for these genes, and
then simply add the Lod values together. If they exceed three, then we can be reasonably
convinced the genes are linked, and we can estimate the recombination frequency from
the data. This then might provide us with information we need to begin to find the gene
(at the DNA level) for the disease allele D.
LOD score
LOD score
0.2
0.15
0.1
0.05
0
0.1
0.2
0.3
recom bination (r)
0.4
0.5