Molecular population genetics
Magnus Nordborg* and Hideki Innan
Molecular population genetics is entering a new era dominated
by studies of genomic polymorphism. Some of the theory that
will be needed to analyze data generated by such studies is
already available, but much more work is needed. Furthermore,
population genetics is becoming increasingly relevant to other
fields of biology, for example to genetic epidemiology, because
of disease gene mapping in general populations.
Addresses
Department of Biological Sciences, University of Southern California,
835 W 37th St, SHS 172, Los Angeles, California 90089-1340, USA
*e-mail: [email protected]
Current Opinion in Plant Biology 2002, 5:69–73
1369-5266/02/$ — see front matter
© 2002 Elsevier Science Ltd. All rights reserved.
Abbreviations
Ka   rate of nucleotide substitution at nonsynonymous sites
Ks   rate of nucleotide substitution at synonymous sites
LD   linkage disequilibrium
Rpm1 Resistance to Pseudomonas syringae ssp. maculicola1
tb1  teosinte branched1
Introduction
We are currently witnessing a technology-driven explosion
in the availability of genetic polymorphism data. These
developments are revolutionizing evolutionary genetics,
which for most of its history has been a field noticeably
lacking in data [1]. They are also bringing the theory of
population genetics back into the limelight as researchers
are faced with analyzing their sequences [2]. Our purpose
here is to review these developments, in particular as they
apply to plant biology (although we will not hesitate to use
examples from other organisms when more appropriate).
Because the relevant population genetics theory is not
common knowledge, we will explain the basic ideas before
proceeding to the data.
Modeling polymorphism
The ‘Neutral Theory’ of molecular evolution [3] plays a
central role in the analysis of population genetic data.
The Neutral Theory holds that the majority of the polymorphisms seen within and among species are selectively
neutral, or at least nearly so. Neutrality makes mathematical
modeling relatively easy; however, much more important
is the fact that assuming neutrality gives rise to a natural
null model. Selection and other phenomena of interest, such
as particular scenarios of population structure (i.e. subdivision)
and migration, can then be viewed as perturbations of a
standard neutral model.
Consider a sample of copies of a particular gene or short
chromosomal segment. Focus on a particular site. All of the
sampled copies of this site must be related to each other
through some kind of genealogical tree. This is true
regardless of whether the sites were sampled from the
same or different populations, or even from different
species (see Figure 1). It is also true in the presence of
recombination, although different sites will then typically
have different trees. Selectively neutral mutations at a
site can be thought of as having occurred according to a
random process along the branches of the tree for that
site. Precisely because they are neutral, they will not have
affected the tree itself. The pattern of polymorphism will
thus reflect the tree in a statistical sense. To understand
the genomic pattern of polymorphism expected in a
population, we need to understand the underlying pattern
of trees.
It is extremely important to understand the difference
between gene trees and species trees. A species tree is an
abstract notion that refers to the genealogical relationship
among a number of species (i.e. which species begat
which). A gene tree, on the other hand, is concrete and
refers to the genealogical relationship among a number of
homologous copies of a site in the genome. When a single
gene copy is sampled from each of a number of species
that have been separated (without hybridization) for a
sufficiently long time, the gene tree must reflect the
species tree very closely. For example, regardless of which
part of the genome is studied, a copy of a specific gene
from a human and a copy from a chimpanzee will always be
more closely related to each other than either is to a copy
from a cow. If we consider copies from less well separated
species such as human, chimp, and gorilla, however, the
picture is less clear: in this case, it is at least possible that
for some genes, humans are closer to gorillas than to chimps
[4•]. Finally, if we consider copies sampled from members
of the same species, it is clear that the picture is completely
different: for example, in humans, three copies sampled
from a Swede, a Japanese, and an African could be related
in any way. Of course, some trees may be more common
than others, but the point is that any particular tree must
be treated as random. To the extent that genomic patterns
exist, they can only be discovered using statistics.
The fact that the genealogy of genes sampled from a
population is random adds an extra level of randomness to
the pattern of variation we expect to see across the genome.
The existence of an underlying genealogy means that most
classical statistical methods (which rely on independent
samples) do not apply to population genetic data [5•,6•]. In
order to handle such data, a simple stochastic model known
as ‘the coalescent’ has been developed [1,5•,6•,7–10]. The
coalescent is a powerful simulation tool that makes it easy
to gain insight into the pattern of variation expected under
various evolutionary scenarios [5•]. It also forms the basis of
the modern, computationally intensive, inference methods
that are currently being developed [6•].
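As an illustration of why the coalescent is convenient for simulation, the following minimal sketch draws the waiting times of the standard neutral coalescent for a sample of n gene copies and compares the simulated tree height (time to the MRCA) and total branch length with their well-known expectations. The time scaling (units of 2N generations, so that k lineages coalesce at rate k(k-1)/2), the function names, and the sample size are assumptions made for this example; they are not taken from the cited papers.

```python
import random


def coalescent_waiting_times(n, rng):
    """Waiting times T_k during which the sample has k ancestral lineages.

    Assumed scaling: time in units of 2N generations, so that T_k is
    exponentially distributed with rate k(k-1)/2.
    """
    return {k: rng.expovariate(k * (k - 1) / 2.0) for k in range(n, 1, -1)}


def tree_height_and_length(times):
    """Return (time to the MRCA, total branch length) for one gene tree."""
    height = sum(times.values())
    length = sum(k * t for k, t in times.items())
    return height, length


def expected_height(n):
    """Expected time to the MRCA under the standard neutral model."""
    return 2.0 * (1.0 - 1.0 / n)


def expected_length(n):
    """Expected total branch length under the standard neutral model."""
    return 2.0 * sum(1.0 / i for i in range(1, n))


def simulate_means(n, replicates, seed=1):
    """Average height and length over independent simulated trees."""
    rng = random.Random(seed)
    sum_height = 0.0
    sum_length = 0.0
    for _ in range(replicates):
        h, l = tree_height_and_length(coalescent_waiting_times(n, rng))
        sum_height += h
        sum_length += l
    return sum_height / replicates, sum_length / replicates


if __name__ == "__main__":
    mean_h, mean_l = simulate_means(n=10, replicates=20000)
    print(f"height: simulated {mean_h:.3f}, expected {expected_height(10):.3f}")
    print(f"length: simulated {mean_l:.3f}, expected {expected_length(10):.3f}")
```

Only the waiting times are simulated here; the branching order of the tree is not needed for these two summaries. Neutral mutations could then be placed on the tree in proportion to branch length.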
Figure 1
Examples of gene trees in: (a) a standard
neutral model with constant population size;
(b) an exponentially growing population; and
(c) a structured population with two
subpopulations. The open circles represent
sampled copies and the closed circles
represent their most recent common ancestors
(MRCA). Selection can mimic demography,
but affects only the selected site (and the
region surrounding it). (d) A gene tree after the
fixation of an advantageous mutation.
The shape of the tree is expected to be similar to
that in a growing population, as seen in (b).
(e) A gene tree that occurs when two alleles
are maintained for a long time by balancing
selection. The shape of the tree is similar to that in
a structured population (c), with mutation
taking the place of migration [5•].
We now turn to specific questions that have been addressed
empirically, while keeping these general remarks in mind.
Inferring demography
Demographic history affects the shape of gene trees and
thus the pattern of polymorphism. Figure 1 illustrates this
by comparing a few simple examples to a standard model
without structure and with constant population size. In a
population whose size has been growing exponentially
(Figure 1b), the most recent common ancestor (MRCA) of
the sample is relatively young and the external branches of
the gene tree are expected to be long in comparison with
the internal branches. Such a tree is said to be ‘star-like’. A
large proportion of mutations on the tree will appear only
once in the sample (singleton polymorphisms). Similar
trees are expected in a population that has recovered from
a severe reduction of population size (i.e. a ‘bottleneck
event’). In contrast, a completely different shape of tree is
expected in a structured population. Consider a population
in which there are two subpopulations connected by
infrequent migration (Figure 1c). Although genes sampled
from the same subpopulation may have a relatively recent
common ancestor, the MRCA of the whole sample may be
ancient. As a result, most mutations on the tree will result
in fixed differences between the subpopulations.
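The excess of singletons in a growing population can be illustrated with a small simulation. The sketch below builds random gene trees and reports the share of total branch length that lies on external branches; if neutral mutations fall on the tree in proportion to branch length, this share is the expected proportion of mutations carried by a single sampled copy. Exponential growth is handled by rescaling the standard coalescent times, a standard device; the growth rate of 100 (per 2N0 generations, with N0 the present size), the sample size, and all names are assumptions chosen for illustration.

```python
import math
import random


def external_and_total_length(n, growth_rate, rng):
    """Simulate one gene tree; return (external length, total length).

    growth_rate = 0 gives the constant-size model. For growth_rate > 0
    the population is assumed to shrink exponentially backwards in time,
    and standard coalescent time tau is mapped to real time
    t = log(1 + growth_rate * tau) / growth_rate.
    """
    # Each lineage is (number of sampled descendants, time it started).
    lineages = [(1, 0.0)] * n
    tau = 0.0
    external = 0.0
    total = 0.0
    while len(lineages) > 1:
        k = len(lineages)
        tau += rng.expovariate(k * (k - 1) / 2.0)
        if growth_rate > 0:
            t = math.log1p(growth_rate * tau) / growth_rate
        else:
            t = tau
        i, j = rng.sample(range(k), 2)  # the pair that coalesces
        merged_size = 0
        for idx in (i, j):
            size, start = lineages[idx]
            total += t - start
            if size == 1:  # branch leading to a single sampled copy
                external += t - start
            merged_size += size
        lineages = [lin for idx, lin in enumerate(lineages) if idx not in (i, j)]
        lineages.append((merged_size, t))
    return external, total


def singleton_fraction(n, growth_rate, replicates, seed=1):
    """Expected external length divided by expected total length."""
    rng = random.Random(seed)
    ext_sum = 0.0
    tot_sum = 0.0
    for _ in range(replicates):
        ext, tot = external_and_total_length(n, growth_rate, rng)
        ext_sum += ext
        tot_sum += tot
    return ext_sum / tot_sum


if __name__ == "__main__":
    print("constant size:", round(singleton_fraction(10, 0.0, 5000), 3))
    print("strong growth:", round(singleton_fraction(10, 100.0, 5000), 3))
```

For the constant-size model the fraction should be close to 1/(1 + 1/2 + ... + 1/(n-1)); under strong growth it is markedly larger, which is the star-like signature described above.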
As would be expected given the trees in Figure 1, it is
possible to infer the demographic history of a population
from the pattern of DNA polymorphism. This is much
more difficult than one might expect, however, because of
the huge stochastic variance of the coalescent process.
Even in a constant-size population, we sometimes observe
trees that look like they came from a growing or structured
population. The only way to reduce this variance is to
collect data from a number of independent (unlinked) loci,
and to rely on the fact that the demographic history affects
the entire genome in the same way.
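The size of this stochastic variance, and the benefit of unlinked loci, can be made concrete with a short simulation. The sketch below compares the spread of the time to the MRCA at a single locus with the spread of its average over 100 independent loci, all under the constant-size standard neutral model. The number of loci, the sample size, and the time scaling (units of 2N generations) are assumptions made for the example.

```python
import random
import statistics


def tmrca(n, rng):
    """Time to the MRCA of n copies at one locus (units of 2N generations)."""
    return sum(rng.expovariate(k * (k - 1) / 2.0) for k in range(n, 1, -1))


def mean_tmrca(n, n_loci, rng):
    """Average time to the MRCA over n_loci independent (unlinked) loci."""
    return statistics.fmean([tmrca(n, rng) for _ in range(n_loci)])


def sd_of_estimate(n, n_loci, replicates, seed=1):
    """Standard deviation of the multi-locus average across replicate data sets."""
    rng = random.Random(seed)
    estimates = [mean_tmrca(n, n_loci, rng) for _ in range(replicates)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    one = sd_of_estimate(n=10, n_loci=1, replicates=1000)
    hundred = sd_of_estimate(n=10, n_loci=100, replicates=1000)
    print(f"sd with 1 locus:    {one:.3f}")
    print(f"sd with 100 loci:   {hundred:.3f}")
```

With a single locus the standard deviation is of the same order as the mean itself, so one tree says little about demography; averaging over L independent loci reduces the standard deviation by roughly the square root of L.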
The conclusions that can be drawn from a single locus (or
non-recombining genome, such as the mitochondrial DNA
[mtDNA] or chloroplast DNA) are in general extremely
limited. The best-known example is from humans, in which
early studies of mtDNA (which has a star-like tree similar to
that illustrated in Figure 1b) suggested a rapid population
expansion from a very small population [11], a conclusion
that has since been contradicted by studies of nuclear
regions [12]. Similarly, in Arabidopsis thaliana, early studies
indicated extreme population subdivision, perhaps as a
result of ancient admixture [13], but later studies revealed
that this pattern is not seen in the entire genome [14]. The
situation is further complicated by surveys of genome-wide
amplified fragment-length polymorphism (AFLP) markers
that suggest weak isolation by distance, and a relatively
recent population expansion [15–17]. The picture that
emerges is much more complex than the simple models
illustrated in Figure 1, and seems to involve both ancient
subdivision and recent expansion. This is perhaps not
unexpected given the history of glaciation in the areas where
A. thaliana is currently found [16]. Large differences in the
pattern of polymorphism between genomic regions are also
seen in barley [18]. It is clear from these examples that
studies designed to investigate demography require a large
number of genome-wide markers. Few such studies are yet
available, but they will surely become commonplace. In
many cases, estimating the population structure will not be
a goal in itself but a prerequisite for other analyses. This is
exemplified by a recent study of maize, in which population
structure was inferred using 141 SSR (simple sequence
repeat) loci in order to carry out so-called linkage disequilibrium (LD) mapping [19••]. We return to this topic below.
Domesticated species such as maize are of particular interest
because the process of domestication may well have
involved a dramatic bottleneck. As a result, the level of
variation might be expected to be very low in domesticated
species. However, for grasses, this does not appear to be
the case [20]. For example, maize has about 80% of the
diversity found in its wild relative [21]. Nevertheless,
variability does seem to have been reduced in a short
region of the teosinte branched1 (tb1) locus, which might
have been subject to strong artificial selection during the
domestication of this crop by ancient agriculturists [22••].
As we will see below, such a reduction in variability is
precisely what population genetics theory would predict.
Detecting selection
Selection differs from demography in that it affects specific
sites (i.e. those that are functionally important and, indirectly,
those that are linked to functionally important sites) rather
than the entire genome. Most ‘tests of selection’ take
advantage of this by comparing different genomic regions
or different kinds of sites [23•].
Directional selection between species
Adaptive evolution can be detected by comparing the
substitution rate at nonsynonymous sites (Ka) with that at
synonymous sites (Ks). Because most amino-acid changes
are likely to be deleterious, Ka is usually much smaller
than Ks. However, Ka can be increased by strong selection
for a novel protein function. Very rapid amino-acid
evolution (Ka>Ks for some comparisons) can be seen in
chitinases in the genus Arabis [24]. Chitinases are involved
in plant defense and so the strong selection for novel
chitinase function may be the result of plant–pathogen
co-evolution. Another example is provided by the
Hawaiian silversword alliance [25]. These species are
distributed on six of the eight main islands of Hawaii, and
many of them are single-island endemics. There is great
variation in habitat, growth form and morphology among
the various species, which seem to have diverged from
each other quite recently. Intriguingly, the rate of nonsynonymous substitution in certain regulatory genes seems
to be faster in the Hawaiian silversword alliance than in
their North American relatives. The same increase in
nonsynonymous substitution is not seen for structural
genes, indicating that the regulatory genes have undergone
rapid evolution. Some caution is needed, however, as an
increase in Ka can have several causes. When Ka>Ks, there
can be little doubt that selection has been involved, but
this is an extremely conservative criterion.
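The comparison of Ka and Ks can be sketched numerically. The example below assumes that the numbers of nonsynonymous and synonymous sites and differences between two sequences have already been counted, and applies the Jukes-Cantor correction for multiple hits to each class. This is one common way to estimate the two rates and is not necessarily the method used in the studies cited above; the counts in the usage example are hypothetical.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (Ka, Ks, Ka/Ks) from counts of differences and sites."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    ratio = ka / ks if ks > 0 else math.inf
    return ka, ks, ratio


if __name__ == "__main__":
    # Hypothetical counts: 10 differences at 700 nonsynonymous sites,
    # 30 differences at 300 synonymous sites.
    ka, ks, ratio = ka_ks(10, 700, 30, 300)
    print(f"Ka = {ka:.4f}, Ks = {ks:.4f}, Ka/Ks = {ratio:.3f}")
```

A ratio well below one, as in this toy case, is the typical outcome when most amino-acid changes are deleterious; a ratio above one corresponds to the conservative Ka>Ks criterion discussed above.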
Directional selection within species (selective sweeps)
Whenever an advantageous mutation is driven by selection
through a population, it will leave a trace in the surrounding
chromosomal region. Intuitively, selection causes a form
of ‘bottleneck’ that is limited to the selected site and the
surrounding chromosomal region (Figure 1d). The result is
a local loss of variation [26]. Such a footprint of selection
appears to exist in the vicinity of the maize gene tb1. The
amount of variation in the 5′ regulatory region of tb1 is
significantly smaller than that in other regions [22••].
Furthermore, the shape of the gene tree for this region
seems to be star-like. These observations are consistent
with the hypothesis that tb1 played an important role
during the domestication of maize. It is well known that
tb1 accounts for significant morphological differences in
crosses between maize and its progenitor, teosinte.
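A local loss of variation of this kind is usually looked for by comparing levels of polymorphism along the chromosome. The sketch below computes nucleotide diversity (the average number of pairwise differences per site) in successive windows of an alignment. The window size and the toy sequences are hypothetical, and the function names are assumptions for the example; a real analysis would also need a statistical test against the neutral expectation.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site among aligned sequences."""
    n_sites = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * n_sites)


def windowed_diversity(seqs, window, step):
    """Return a list of (window start, nucleotide diversity) along the alignment."""
    length = len(seqs[0])
    result = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        result.append((start, nucleotide_diversity(chunk)))
    return result


if __name__ == "__main__":
    # Hypothetical alignment: the first window carries no variation,
    # as might be seen next to a recently swept site.
    toy = ["AAAA", "AAAT", "AATT"]
    for start, pi in windowed_diversity(toy, window=2, step=2):
        print(start, round(pi, 3))
```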
Balancing selection
Balancing selection refers to any kind of selection that
preserves polymorphism, that is, keeps alleles from
drifting to low frequencies and being lost by chance. This
means that the selectively different alleles will be older
than what is expected for different alleles at loci that are
not subject to balancing selection (i.e. most of the genome)
(Figure 1e). As the oldest alleles will have had most
time to diverge (i.e. to accumulate selectively neutral
differences in their flanking regions), a peak of increased
polymorphism is formed surrounding the locus or site
under selection [27]. The canonical example of this
phenomenon is the increased variability associated with
the major histocompatibility complex (MHC) in mammals
[28,29]. An equally impressive example from plants is
provided by self-incompatibility loci, where selection
maintains polymorphism because rare alleles always have a
selective advantage [30]. A more recently discovered
example comes from the disease-resistance locus Rpm1
(Resistance to Pseudomonas syringae ssp. maculicola1) in
A. thaliana [31••]. Two major alleles exist at this locus:
resistant and non-resistant. The non-resistant allele turns
out to be a deletion of the entire Rpm1 gene. Sequence
analysis of the flanking region in a number of A. thaliana
accessions revealed a gene tree that has a very long branch
between the two alleles (Figure 1e), and a pattern of
polymorphism that is significantly different from that
expected under the standard neutral model. It should be
noted that determining whether an observed peak is
indeed significant is difficult, in particular because population
structure increases the risk of false positives [32].
Genomic patterns of selection
About ten years ago, it was discovered that the level of
sequence variation in Drosophila melanogaster is not constant
across the chromosome, but is positively correlated with
the recombination rate [33]. The pattern of variation cannot
be explained by a correlation between mutation and
recombination, because the rate of divergence between
species is not correlated with the recombination rate.
Instead, it seems that the pattern is caused by some form
of continual selection (e.g. purifying selection against
deleterious mutations [34] or recurrent selective sweeps
[26]), the rationale being that each selective event would
have a greater effect on variation in regions of low
recombination. Since its initial discovery [33], correlation
between level of sequence variation and recombination rate
has been found in many other organisms. In plants, this
pattern has now been observed in wheat [35], tomato
[36,37], sea beet [38], and maize [39], although the correlation
is nowhere near as clear in plants as in D. melanogaster,
possibly because the pattern is obscured by the effects of
population structure.
Recombination, linkage disequilibrium,
and mapping
Although recombination has often been ignored in studies
of DNA sequence polymorphism, it is easily incorporated
into the standard coalescent model [5•,9,40]. The main
effect of recombination is that it allows linked sites to have
different trees. These trees will be correlated; the strength
of the correlation depends on the genetic distance
between the sites. The correlation in the underlying
genealogies may result in correlation among alleles in
haplotypes. Such non-random association among alleles is
known as LD.
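Non-random association between alleles at two sites is commonly quantified by the coefficient D (the difference between the observed haplotype frequency and the product of the allele frequencies) and by its standardized form r². The text does not name a particular measure, so the choice of D and r² here is an assumption, as are the 0/1 allele coding and the toy haplotypes.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for a list of two-site haplotypes.

    Each haplotype is a pair (a, b) with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


if __name__ == "__main__":
    complete = [(1, 1), (1, 1), (0, 0), (0, 0)]    # alleles always co-occur
    none = [(1, 1), (1, 0), (0, 1), (0, 0)]        # alleles independent
    print(ld_stats(complete))
    print(ld_stats(none))
```

In the first toy sample the two sites share the same partition of the sample, as they would if they had the same tree and one mutation each on the same branch, and r² is one; in the second the alleles are independent and both measures are zero.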
LD has received much attention recently because it may
be used for fine-scale mapping [41] of genes that are
responsible for naturally occurring phenotypic variation
(e.g. human disease loci). The idea behind LD mapping is
simply to look for marker alleles, or multi-locus haplotypes,
that are associated with the phenotype in the general
population. Neither crosses nor pedigrees are needed.
LD mapping depends crucially on the chromosomal extent
of LD. If LD decays too slowly with distance, it cannot be
used for fine-scale mapping; if it decays too rapidly, an
impracticably dense map is needed [42,43]. Genomic data
on the extent of LD are only available for a few organisms.
Several studies have been made in humans, but the picture
is far from clear [44]. Depending on the region and sample,
estimates range from ten to several hundred kilobase pairs.
In Drosophila, LD typically decays within 1 kb [45].
Genomic surveys of LD are available for two plant species.
In maize, LD decays rapidly, on a scale similar to or
even faster than that observed in Drosophila [39,46]. In
Arabidopsis, LD is much more extensive, decaying within
perhaps 250 kb (M Nordborg et al., unpublished data).
This is consistent with the difference in breeding system
between these two species: whereas maize undergoes
outcrossing, Arabidopsis is highly selfing, and selfing is
expected to increase LD greatly [47].
The difference in LD between these two organisms has
implications for LD mapping. A recent study in maize
found association between particular polymorphic sites
and phenotypic variation for flowering time [19••]. Given
the much more extensive LD, mapping at such a fine scale
will almost certainly not be possible in A. thaliana. On the
other hand, the extensive LD in this species may make it
feasible to carry out genomic screens using markers every
50 kb or so, an approach that is unlikely to work in maize.
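The practical consequence of the extent of LD can be put into rough numbers. The sketch below assumes, crudely, that a genomic screen needs about one marker per stretch over which LD persists. The region length of 100 Mb is a hypothetical round figure and is not taken from the text; the spacings of 50 kb and 1 kb echo the scales mentioned above.

```python
import math


def markers_needed(region_length_bp, marker_spacing_bp):
    """Number of evenly spaced markers needed to cover a region."""
    if region_length_bp <= 0 or marker_spacing_bp <= 0:
        raise ValueError("lengths must be positive")
    return math.ceil(region_length_bp / marker_spacing_bp)


if __name__ == "__main__":
    hypothetical_region_bp = 100_000_000  # assumed, for illustration only
    print("spacing 50 kb:", markers_needed(hypothetical_region_bp, 50_000))
    print("spacing 1 kb: ", markers_needed(hypothetical_region_bp, 1_000))
```

Under these assumptions a fifty-fold difference in the extent of LD translates directly into a fifty-fold difference in the number of markers, which is why a screen that is feasible with extensive LD becomes impractical when LD decays within a kilobase.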
Conclusions
Even though the first study of DNA sequence polymorphism
was published almost 20 years ago [48], such studies have
started to become common only recently [1]. Ironically, we
are just about to witness another technological leap: from
studies of single loci to genomic polymorphism studies. It
is certain that several completely sequenced genomes of
model organisms will soon be available. To make sense of
these data, much new theory will be needed. The amount
of data will make it possible to take population structure
into account [49] and to identify many polymorphisms that
are selectively important. Comparative studies between
closely related species or variants may reveal much about
the molecular basis of adaptive evolution. In plants, the
evolution of development is likely to be of particular
interest. Because polymorphism data are now important in
fields other than evolutionary biology (e.g. in genetic
epidemiology), it seems certain that population genetics
will receive much more attention in the next 20 years than
in the past 20 years.
Acknowledgements
We would like to thank D. Weigel for comments on the manuscript.
References and recommended reading
Papers of particular interest, published within the annual period of review,
have been highlighted as:
• of special interest
•• of outstanding interest
1. Felsenstein J: From population genetics to evolutionary genetics.
In A View Through the Trees of Evolutionary Genetics: From
Molecules to Morphology. Edited by Singh RS, Krimbas CB. New
York: Cambridge University Press; 2000:609-627.
2. Chakravarti A: Population genetics — making sense out of
sequence. Nat Genet 1999, 21:56-60.
3. Kimura M: The Neutral Theory of Molecular Evolution. Cambridge,
UK: Cambridge University Press; 1983.
4. • Chen F-C, Li W-H: Genomic divergences between humans and
other hominoids and the effective population size of the common
ancestor of humans and chimpanzees. Am J Hum Genet 2001,
68:444-456.
This paper investigates the human–chimp–gorilla relationship using many
genes: an excellent introduction to the difference between species trees and
gene trees.
5. • Nordborg M: Coalescent theory. In Handbook of Statistical Genetics.
Edited by Balding DJ, Bishop M, Cannings C. Chichester, UK: John
Wiley & Sons Inc; 2001:179-212.
See annotation [6•].
6. • Stephens M: Inference under the coalescent. In Handbook of
Statistical Genetics. Edited by Balding DJ, Bishop M, Cannings C.
Chichester, UK: John Wiley & Sons Inc; 2001:213-238.
These two papers [5•,6•] introduce coalescent theory, and discuss the statistical issues involved in the analysis of polymorphism data. The original
description of the coalescent can be found in [7–10].
7. Kingman JFC: On the genealogy of large populations. J Appl Prob
1982, 19A:27-43.
8. Kingman JFC: The coalescent. Stochastic Proc Appl 1982,
13:235-248.
9. Hudson RR: Properties of a neutral allele model with intragenic
recombination. Theor Popul Biol 1983, 23:183-201.
10. Tajima F: Evolutionary relationship of DNA sequences in finite
populations. Genetics 1983, 105:437-460.
11. Cann RL, Stoneking M, Wilson AC: Mitochondrial DNA and human
evolution. Nature 1987, 325:31-36.
12. Przeworski M, Hudson RR, Di Rienzo A: Adjusting the focus on
human variation. Trends Genet 2000, 16:296-302.
13. Innan H, Tajima F, Terauchi R, Miyashita NT: Intragenic
recombination in the Adh locus of the wild plant Arabidopsis
thaliana. Genetics 1996, 143:1761-1770.
14. Aguadé M: Nucleotide sequence variation at two genes of the
phenylpropanoid pathway, the FAH1 and F3H genes, in
Arabidopsis thaliana. Mol Biol Evol 2001, 18:1-9.
15. Miyashita NT, Kawabe A, Innan H: DNA variation in the wild plant
Arabidopsis thaliana revealed by amplified fragment length
polymorphism analysis. Genetics 1999, 152:1723-1731.
16. Sharbel TF, Haubold B, Mitchell-Olds T: Genetic isolation by
distance in Arabidopsis thaliana: biogeography and postglacial
colonization of Europe. Mol Ecol 2000, 9:2109-2118.
17. Innan H, Stephan W: The coalescent in an exponentially growing
metapopulation and its application to Arabidopsis thaliana.
Genetics 2000, 155:2015-2019.
18. Lin J-Z, Brown AHD, Clegg MT: Heterogeneous geographic patterns
of nucleotide sequence diversity between two alcohol
dehydrogenase genes in wild barley (Hordeum vulgare subspecies
spontaneum). Proc Natl Acad Sci USA 2001, 98:531-536.
19. •• Thornsberry JM, Goodman MM, Doebley J, Kresovich S, Nielsen D,
Buckler ES IV: Dwarf8 polymorphisms associate with variation in
flowering time. Nat Genet 2001, 28:286-289.
The first example of LD mapping in plants, and the first example in any organism
of the use of unlinked markers to correct for the effects of population structure.
20. Buckler ES IV, Thornsberry JM, Kresovich S: Molecular diversity,
structure and domestication of grasses. Genet Res 2001, 77:213-218.
21. White SE, Doebley JF: The molecular evolution of terminal ear1, a
regulatory gene in the genus Zea. Genetics 1999, 153:1455-1462.
22. •• Wang R-L, Stec A, Hey J, Lukens L, Doebley J: The limits of
selection during maize domestication. Nature 1999, 398:236-239.
This study suggests that the tb1 locus may have been subject to a ‘selective
sweep’ during the domestication of maize. Similar studies of other loci, and
in other species, are likely to follow.
23. • Kreitman M: Methods to detect selection in populations with
applications to the human. Annu Rev Genomics Hum Genet 2000,
1:539-559.
A good summary of the many methods for detecting selection that have been
developed during the past 20 years.
24. Bishop JG, Dean AM, Mitchell-Olds T: Rapid evolution in plant
chitinases: molecular targets of selection in plant–pathogen
coevolution. Proc Natl Acad Sci USA 2000, 97:5322-5327.
25. Barrier M, Robichaux RH, Purugganan MD: Accelerated regulatory
gene evolution in an adaptive radiation. Proc Natl Acad Sci USA
2001, 98:10208-10213.
26. Kaplan NL, Hudson RR, Langley CH: The ‘hitch-hiking’ effect
revisited. Genetics 1989, 123:887-899.
27. Hudson RR, Kaplan NL: The coalescent process in models with
selection and recombination. Genetics 1988, 120:831-840.
28. Hughes AL, Nei M: Pattern of nucleotide substitution at major
histocompatibility complex loci reveals overdominant selection.
Nature 1988, 335:167-170.
29. Parham P, Ohta T: Population biology of antigen presentation by
MHC class I molecules. Science 1996, 272:67-74.
30. Charlesworth D, Awadalla P: Flowering plant self-incompatibility:
the molecular population genetics of Brassica S-loci. Heredity
1998, 81:1-9.
31. •• Stahl EA, Dwyer G, Mauricio R, Kreitman M, Bergelson J:
Dynamics of disease resistance polymorphism at the Rpm1 locus
of Arabidopsis. Nature 1999, 400:667-671.
The authors describe the striking pattern of DNA polymorphisms in the flanking region of an insertion/deletion polymorphism of the Rpm1 gene, which is
presumably maintained by strong balancing selection.
32. Filatov DA, Charlesworth D: DNA polymorphism, haplotype
structure and balancing selection in the Leavenworthia PgiC
locus. Genetics 1999, 153:1423-1434.
33. Begun DJ, Aquadro CF: Levels of naturally occurring DNA
polymorphism correlate with recombination rates in
D. melanogaster. Nature 1992, 356:519-520.
34. Charlesworth B, Morgan MT, Charlesworth D: The effect of
deleterious mutations on neutral molecular variation. Genetics
1993, 134:1289-1303.
35. Dvorák J, Luo M-C, Yang Z-L: Restriction fragment length
polymorphism and divergence in the genomic regions of high and
low recombination in self-fertilizing and cross-fertilizing Aegilops
species. Genetics 1998, 148:423-434.
36. Stephan W, Langley CH: DNA polymorphism in Lycopersicon and
crossing-over per physical length. Genetics 1998, 150:1585-1593.
37. Baudry E, Kerdelhué C, Innan H, Stephan W: Species and
recombination effects on DNA variability in the tomato genus.
Genetics 2001, 158:1725-1735.
38. Kraft T, Säll T, Magnusson-Rading I, Nilsson N-O, Halldén C: Positive
correlation between recombination rates and levels of genetic
variation in natural populations of sea beet (Beta vulgaris subsp.
maritima). Genetics 1998, 150:1239-1244.
39. Tenaillon MI, Sawkins MC, Long AD, Gaut RL, Doebley JF, Gaut BS:
Patterns of DNA sequence polymorphism along chromosome 1 of
maize (Zea mays ssp. mays L.). Proc Natl Acad Sci USA 2001,
98:9161-9166.
40. Griffiths RC, Marjoram P: An ancestral recombination graph. In
Progress in Population Genetics and Human Evolution. Edited by
Donnelly P, Tavaré S. New York: Springer-Verlag; 1997:257-270.
41. Cardon LR, Bell JI: Association study designs for complex
diseases. Nature Rev Genet 2001, 2:91-99.
42. Kruglyak L: Prospects for whole-genome linkage disequilibrium
mapping of common disease genes. Nat Genet 1999,
22:139-144.
43. Altshuler D, Daly M, Kruglyak L: Guilt by association. Nat Genet
2000, 26:135-137.
44. Pritchard JK, Przeworski M: Linkage disequilibrium in humans:
models and data. Am J Hum Genet 2001, 69:1-14.
45. Langley CH, Lazzaro BP, Phillips W, Heikkinen E, Braverman JM:
Linkage disequilibria and the site frequency spectra in the su(s)
and su(wa) regions of the Drosophila melanogaster
X chromosome. Genetics 2000, 156:1837-1852.
46. Remington DL, Thornsberry JM, Matsuoka Y, Wilson LM, Whitt SR,
Doebley J, Kresovich S, Goodman MM, Buckler ES IV: Structure of
linkage disequilibrium and phenotypic associations in the maize
genome. Proc Natl Acad Sci USA 2001, 98:11479-11484.
47. Nordborg M: Linkage disequilibrium, gene trees and selfing: an
ancestral recombination graph with partial self-fertilization.
Genetics 2000, 154:923-929.
48. Kreitman M: Nucleotide polymorphism at the alcohol
dehydrogenase locus of Drosophila melanogaster. Nature 1983,
304:412-417.
49. Pritchard J, Stephens M, Donnelly P: Inference of population
structure using multilocus genotype data. Genetics 2000,
155:945-959.