UNIVERSITY OF CALGARY
Population divergence and candidate signatures of natural selection in alpine and lowland
ecotypes of the allotetraploid plant, Anemone multifida (Ranunculaceae)
by
Jamie R McEwen
A THESIS
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES
IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE
DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF BIOLOGICAL SCIENCES
CALGARY, ALBERTA
AUGUST, 2012
© Jamie R McEwen 2012
Abstract
Adaptation plays a central role in population divergence and speciation. Studying the
evolutionary history of populations due to neutral evolutionary processes and the effects
of natural selection enables the identification of genes under natural selection in the wild.
In this thesis, I conducted a genome scan to elucidate candidate signatures of natural
selection in alpine and lowland ecotypes of the allopolyploid plant, Anemone multifida. I
found numerous signatures of divergent natural selection between alpine and lowland
populations and between alpine populations, but natural selection appeared strongest in
alpine environments. These results are consistent with findings in diploid species, but the
neutral evolutionary structure of the polyploid A. multifida showed complex patterns of
differentiation. Overall, these results indicate divergent natural selection has generated
adaptation to alpine and lowland environments despite complex evolutionary history.
Acknowledgements
1. My supervisors Dr. Jana Vamosi and Dr. Sean Rogers, and committee members
Dr. Lawrence Harder, and Dr. Gordon Chua for their ideas and support
2. Grant support from NSERC, Prairie Adaptation Research Collaborative, Alberta
Conservation Association, and the University of Calgary
3. Sean Rogers and Jana Vamosi lab members
4. Audra McEwen for support through my degrees
Table of Contents
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Symbols, Abbreviations, Nomenclatures
Chapter 1: Introduction to Natural Selection and Population Genetics
1.1 Identifying Signatures of Natural Selection in the Genome
1.2 Challenges of Polyploidy
1.3 Amplified Fragment Length Polymorphism (AFLP)
1.3 Alpine and Lowland Environments
1.4 Research Objectives, Hypotheses and Predictions
Chapter 2: Materials and Methods
2.1 Study Species, Field Sampling and Population Characteristics
2.2 DNA Extraction, AFLP and Allele Scoring
2.3 Detection of Outlier Loci
2.4 Genetic and Population Structure Analyses
2.5 Phenotype Analyses
Chapter 3: Results
3.1 AFLP and the Detection of Outlier Loci
3.2 Population Structure of Neutral Loci
3.3 Population Structure of Outlier Loci
3.5 Phenotypic Differences in Height and Floral Colour
Chapter 4: Discussion
4.1 Genetic Population Structure at Neutral and Outlier Loci
4.2 Limitations and Alternate Explanations
4.3 Future Directions
Appendix A: Supplementary Data and Methods
Appendix B: AFLP Protocol Taken From the AFLP Plant Mapping Protocol for Regular Plant Genomes (Applied Biosystems)
Appendix C: Example of Electropherogram and Raw Data Produced from AFLP
List of Tables
Tables 1 to 5
List of Figures
Figures 1 to 12
List of Symbols, Abbreviations, Nomenclatures
AFLP: Amplified Fragment Length Polymorphism
AMOVA: Analysis of Molecular Variance
HWP: Highwood Pass alpine population
HSB: Hailstone Butte alpine population
WC: Willow Creek lowland population
BL: Beauvais Lake lowland population
BHS: Big Hill Springs lowland population
CHAPTER 1: INTRODUCTION TO NATURAL SELECTION AND POPULATION
GENOMICS
Natural selection plays an important role in the adaptation of species to their
environments, divergence between populations and species diversity (references).
Phenotypically, natural selection increases the prevalence of traits in a population or
species that confer some form of adaptation to a particular environment. Individuals with
adaptive phenotypes have higher fitness, surviving to produce viable offspring with traits
that are advantageous in a particular environment (references). Genetically, the alleles
underlying adaptive phenotypes increase in frequency between generations in response to
natural selection, resulting in the evolution of populations towards adaptive traits, although
the dynamics of adaptive shifts can vary (Kauffman 1987; Hadany 2003). Natural selection
requires genetic variation within a population, and populations with little variation often
have limited adaptive potential (Willi et al. 2006). Natural selection generally reduces
genetic variation at loci under selection and causes differentiation between populations if
different alleles or genes are the targets of natural selection (Lenormand 2002). Through
genetic linkage (the non-random segregation of alleles at two or more loci between generations),
the regions surrounding genes under selection can also approach fixation, leading to further
differentiation between populations, even at loci that are not directly affected by natural
selection, a phenomenon also known as genetic hitchhiking (Kim & Nielsen 2004).
Population divergence has been driven in large part by natural selection (Schluter
2001). Variation in environmental conditions between populations, or ecological
opportunity during colonization or modification of a habitat can drive differentiation
between populations or ecotypes (an ecotype is a genetically or phenotypically distinct
group within a species; Schluter 2001). A suite of traits that are selected for in one
environment can be non-adaptive or potentially deleterious in alternate environments,
leading to lower fitness of individuals following migration or gene flow between
populations (Lenormand 2002). For example, migrant individuals arriving with a different
suite of adaptations in a new population are likely to produce fewer offspring with adaptive
phenotypes than established individuals, causing eventual loss of non-adaptive phenotypes
despite gene flow, although extensive gene flow can limit the adaptive potential of a
population (Schluter 2001; Nosil et al. 2005; Bridle & Vines 2007). In extreme cases of
divergence, hybrids between individuals with adaptations to different environments can be
selected against, favouring assortative mating and reduced gene flow between ecotypes
(Schluter 2001; Nosil et al. 2005). Reductions in gene flow due to divergent selection
(which would select against individuals migrating between environments) can also enhance
the impact of non-selective mechanisms that can cause genetic differentiation (e.g. genetic
drift, or the random loss of alleles from a population which is enhanced in small
populations), further accelerating differentiation (Dobzhansky 1957). These isolating
effects of adaptation can eventually lead to large scale or whole genome differentiation
between populations or ecotypes, eventually leading to speciation (Peichel et al. 2001;
Rogers & Bernatchez 2005; Via & West 2008; Feder & Nosil 2010). Isolation by
adaptation has been observed in a number of cases by the genetic breakdown of individuals
hybridized between sufficiently diverged ecotypes (Burke & Voss 1998; Presgraves et al.
2003; Svedin et al. 2008; Renaut et al. 2012), although processes such as polyploidy may
still bridge the gap between species at the late stages of speciation (Chapman & Abbott
2010). By studying how natural selection and neutral evolutionary processes have affected
the adaptation and differentiation between populations and individuals within species we
can start to understand the mechanisms by which major evolutionary events such as
speciation initiate and progress.
The neutral theory of population genetics states that many genes, and the
phenotypes they affect, evolve through adaptively neutral processes (Lewontin 1974;
Kimura 1983). For example, genetic drift can cause fixation of alleles in a population
(Lewontin 1974; Kimura 1983). Mutation, while typically deleterious in effect, is an
important source of new alleles in populations (Hamilton 2009). Demographic events, such
as population bottlenecks or founder effects, can enhance allele fixation in a population by
limiting genetic diversity (Maruyama & Fuerst 1985; Gavrilets & Hastings 2012).
Conversely, gene flow from individuals migrating between populations can introduce or
remove alleles from a population (Felsenstein 1976).
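The effect of population size on drift can be illustrated with a minimal Wright-Fisher simulation. This is only a sketch under stated assumptions: one biallelic locus, no mutation, selection or migration, and random resampling of a fixed number of gene copies each generation. The function names and parameter values are hypothetical and are not taken from this study.

```python
import random


def generations_until_fixed(n_copies, p0=0.5, seed=None, max_generations=100_000):
    """Simulate drift at one biallelic locus.

    Returns (generations elapsed, final allele frequency). The final
    frequency is 0.0 (allele lost) or 1.0 (allele fixed) unless the
    generation cap is reached first.
    """
    rng = random.Random(seed)
    count = round(p0 * n_copies)  # copies of the focal allele
    for generation in range(max_generations):
        if count in (0, n_copies):
            return generation, count / n_copies
        p = count / n_copies
        # Each gene copy in the next generation is drawn at random
        # from the current allele frequency (binomial sampling).
        count = sum(rng.random() < p for _ in range(n_copies))
    return max_generations, count / n_copies


def mean_generations(n_copies, replicates=100, seed=1):
    """Average time to loss or fixation over independent replicates."""
    rng = random.Random(seed)
    runs = [
        generations_until_fixed(n_copies, seed=rng.randrange(10**9))[0]
        for _ in range(replicates)
    ]
    return sum(runs) / len(runs)


if __name__ == "__main__":
    for n in (10, 100):
        print(n, "gene copies:", mean_generations(n), "generations on average")
```

In this toy model, populations with fewer gene copies lose or fix the allele in fewer generations on average, which is the sense in which bottlenecks and founder effects enhance fixation.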
Determination of the causes of evolution of populations due to neutral and adaptive
processes is a challenging process. By demonstrating that variation in genetically based
phenotypes is associated with fitness differences between individuals, the direction,
magnitude and phenotypic loci of natural selection can be determined (Kingsolver et al.
2001). Demonstration that phenotypes of interest have a genetic basis is ultimately required
to separate the effects of phenotypic plasticity (environmentally induced changes in
phenotypes within generations) from selection for a phenotypic variant (Thompson 1991).
Additionally, as many traits may have evolved primarily due to selectively neutral
processes, it is necessary to differentiate between neutral and selective processes in
phenotypic evolution (Luikart et al. 2003; Stinchcombe & Hoekstra 2008). However,
determining patterns of neutral genetic population structure is generally not possible with
phenotypic data alone. By incorporating genetic information from populations, it is
possible to separate the effects of neutral and adaptive evolutionary processes for study,
and potentially find the genetic basis for adaptive phenotypes. By estimating the degree to
which populations are genetically differentiated, signatures of natural selection can be
distinguished at specific phenotypic or genetic loci with very high or low levels of
population differentiation (Luikart et al. 2003). In doing so, the causes of the evolution of
populations can be determined.
The development of genetic markers allowed examination of the genetic structure of
populations to lay the groundwork for detecting natural selection (Charlesworth 2010). By
estimating population differentiation due to non-selective evolutionary processes, such as
reductions in genetic diversity from population fluctuations, age or sex structure, or the
random loss of alleles from a population through genetic drift, early genetic markers
permitted the separation of the effects of natural selection from non-selective evolutionary
processes (Charlesworth 2010). Additionally, the genetic basis of phenotypes could in
some cases be established through the discovery of associations between markers and
phenotypes, allowing direct tests of the genetic basis of adaptive phenotypic variation
(Charlesworth 2010). Genetic markers have also been used to estimate neutral population
genetic structure for providing a baseline expectation for phenotypic variation between
populations (Whitlock & Guillaume 2009). Genetic markers used in these early studies,
such as allozymes, are generally unreliable for assessing population structure, because they
may themselves be targets of natural selection (Luikart et al. 2003; Charlesworth 2010).
The development of genetic markers that directly amplify DNA, such as microsatellites,
helped circumvent these limitations, as only markers that show neutral variation between
populations can be used to estimate population structure (Luikart et al. 2003; Charlesworth
2010). Microsatellite markers are reliable and accurate for estimating neutral population
genetic structure (e.g. Forstmeier et al. 2012), but their relatively low genomic coverage
has likely limited the discovery of genetic loci that are the targets of natural selection
(Meudt & Clarke 2007). Ultimately, the development of population genomics, which
utilizes genetic techniques that can amplify many markers distributed throughout the
genome, has made discovery of the molecular bases of adaptation and divergence practical
in any species (Luikart et al. 2003).
By studying the population structure of neutral and outlier loci, the relative
contribution of neutral and adaptive evolutionary processes to population divergence can be
determined (Luikart et al. 2003; Stinchcombe & Hoekstra 2008; Storz & Wheat 2010).
This can be accomplished by generating a distribution of population differentiation
estimates (e.g. Wright’s Fixation Index, or FST) at multiple loci throughout the genome
(Fig. 1; Stinchcombe & Hoekstra 2008). Neutral loci will have intermediate differentiation,
whereas loci that are outliers relative to variation in population differentiation at the
majority of loci may represent signatures of natural selection (Fig. 1; Stinchcombe &
Hoekstra 2008; for alternate hypotheses see Siol et al. 2010; Bierne et al. 2011). Recent
technological advances in amplifying multiple markers simultaneously throughout the
genome have provided the means to study large portions of the genome in many
individuals, greatly enhancing simultaneous estimation of the extent of population
divergence due to neutral evolutionary forces and identification of loci that may be
involved in adaptation and population divergence (Luikart et al. 2003; Stinchcombe &
Hoekstra 2008). Population genomics has high power to detect the molecular mechanisms
underlying adaptive divergence, in addition to providing reliable estimates of population
structure due to neutral evolutionary processes (Luikart et al. 2003; Stinchcombe &
Hoekstra 2008). Many different types of studies have taken advantage of these techniques,
including the mapping of quantitative trait loci associated with adaptive phenotypes
involved in population divergence (Rogers & Bernatchez 2005; Mackay & Stone 2009;
Hager et al. 2009; Schielzeth et al. 2012), and the detection of loci involved in natural
selection and speciation (Stinchcombe & Hoekstra 2008; Stapley et al. 2010; Strasburg et
al. 2012).
Figure 1. A theoretical distribution of FST estimates (a measure of population
differentiation) from many genetic loci distributed throughout the genome. Neutral loci
show intermediate differentiation (black points), whereas outlier loci (red points) may have
been affected by natural selection. The dotted line represents the FST threshold for
distinguishing between neutral and outlier levels of population differentiation at each locus.
From Stinchcombe & Hoekstra (2008).
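The outlier logic described above can be sketched in a few lines. This is a simplified illustration, not the method used in this thesis: it assumes codominant biallelic loci with known allele frequencies in two equally weighted populations, and it flags outliers with a plain empirical quantile cutoff rather than a model-based test. The function names and the 0.95 quantile are hypothetical choices.

```python
def fst_biallelic(p1, p2):
    """Wright's FST at one biallelic locus for two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # locus is monomorphic overall
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def flag_outliers(fst_values, quantile=0.95):
    """Return (cutoff, indices of loci whose FST exceeds the cutoff)."""
    ranked = sorted(fst_values)
    cutoff = ranked[int(quantile * (len(ranked) - 1))]
    return cutoff, [i for i, f in enumerate(fst_values) if f > cutoff]


if __name__ == "__main__":
    # Hypothetical allele frequencies: 19 weakly differentiated loci, 1 strongly.
    freqs = [(0.45, 0.55)] * 19 + [(0.05, 0.95)]
    fst = [fst_biallelic(a, b) for a, b in freqs]
    print(flag_outliers(fst))
```

A quantile cutoff always flags some loci, even when no selection has occurred, so in practice the threshold is usually derived from a neutral model or simulation.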
In addition to neutral evolutionary processes that can cause population divergence,
past hybridization or polyploidy may also complicate the discovery of loci involved in
adaptation, due to brief, rapid genomic divergence (Soltis & Soltis 1999). Hybridization
and polyploidy can cause novel gene function and genomic restructuring, which can be
facilitated by or cause rapid ecological divergence, adaptation to novel environments
and speciation (Soltis & Soltis 1999; Osborn et al. 2003; Lexer et al. 2003; Adams &
Wendel 2005; Baack et al. 2005; Lai et al. 2005; Whitney et al. 2006; Rieseberg et al.
2007). Extensive genomic changes are common, particularly amongst polyploids, and can
leave large sections of chromosomes in a highly differentiated state between populations,
meaning genetic variation in a large portion of the genome is caused primarily by genomic
changes brought about by hybridization or polyploidy (Soltis & Soltis 1999; Otto &
Whitton 2000; Leitch & Bennett 2004; Leitch & Leitch 2008). Additionally, introgression
following hybridization or polyploidization can cause gene transfer amongst species,
introducing novel genetic variation into populations (Chapman & Abbott 2010; Whitney et
al. 2010). If populations are sufficiently isolated or incompatible following these major
genomic changes, populations may ultimately diverge via genetic processes unrelated to
adaptation (Soltis & Soltis 1999). The need to distinguish population divergence caused by
genomic rearrangements or hybridization from divergence caused by natural selection further
underscores the importance of sampling replicate populations within each environment,
because these are fundamentally different causes of population differentiation.
The determination of the structure of genetic variation in populations that inhabit
different environments provides insight into the effects of selective and neutral
evolutionary processes. The expansion of genomics research to include a diverse range of
organisms and environments will lead to novel insights into the function and causes of
genomic structure and further understanding of evolution in a diverse range of conditions.
In this study, I investigate the population genomics of an allopolyploid plant that occupies a
wide range of environments. The study of the population genomics of an organism with a
complex polyploid genome will develop a foundation for investigating the mechanisms of
evolution in a group of organisms that has received relatively little attention. The goals of
this study are to 1) illustrate the utility of genome scans with a non-model organism with a
complex genome, 2) investigate whether polyploidy inhibits the ability to detect signatures
of selection and/or population differentiation, and 3) determine whether divergent
selection pressures occur between different environments, such as alpine and lowland
habitats, in a wide-ranging plant.
1.1 IDENTIFYING SIGNATURES OF NATURAL SELECTION IN THE GENOME
The population genomics approach provides a powerful tool to identify signatures
of natural selection in wild populations for initial insight into the patterns and mechanisms
of adaptation, and is usually the first step in identifying candidate genes for adaptation
(Luikart et al. 2003; Stinchcombe & Hoekstra 2008). Although many studies have sought
the genetic basis of traits known to be adaptive, conducting genome scans on populations
or ecotypes that show ecological differentiation provides the means for finding any loci
associated with divergence or adaptation (Storz 2005; Stinchcombe & Hoekstra 2008;
Stapley et al. 2010; Strasburg et al. 2012), and the genomic processes associated with
evolution (Kim & Nielsen 2004; Nosil et al. 2009a; Skrede et al. 2009; Tice & Carlon
2011). Provided that functional information about loci of interest is available, selection can
be detected on traits not previously hypothesized to be involved in ecological differentiation
and on traits not easily quantified by direct observation, such as physiological or biochemical
traits or patterns of gene expression (Nichols et al. 2008; Derome et al. 2008; Whiteley et al. 2008; Paris et
al. 2010; Pavey et al. 2010; Storz & Wheat 2010). By studying many loci distributed
throughout the genome, loci that evolve primarily due to neutral and selective processes
can be characterized. Quantifying the degree to which these markers are differentiated
amongst populations or ecotypes provides a method for distinguishing neutral or
demographic effects affecting the whole genome, as reflected in population differentiation
amongst the majority of markers, from non-neutral effects, such as natural selection, at
each locus (Luikart et al. 2003; Buerkle et al. 2011). Under Hardy-Weinberg conditions,
neutral loci typically exhibit intermediate differentiation, whereas loci affected by
directional or balancing selection may show very high or low differentiation between
populations or ecotypes (Luikart et al. 2003). The distribution and frequencies of these
outlier loci amongst populations or environments can suggest how selection caused the
excessively low or high differentiation (Luikart et al. 2003).
Most investigations in non-model organisms that have found signatures of natural
selection have utilized anonymous markers (markers that do not contain sequence
information and cannot be directly used to infer function), such as AFLP, that easily and
predictably amplify across a wide variety of taxa (Stinchcombe & Hoekstra 2008). An
unfortunate limitation of using anonymous markers is the difficulty in obtaining functional
information from loci that show signatures of natural selection (Stinchcombe & Hoekstra
2008), although a few studies show promising results from isolating anonymous markers
for further investigation (Paris et al. 2010; Paris & Despres 2012). With the development of
molecular techniques for sequencing in non-model organisms (e.g. Baird et al. 2008), there
is a potential to utilize high-throughput sequencing to simultaneously identify outlier loci
and obtain sequencing information that may be used to determine the function of the region
under selection. AFLP, however, remains a cost-effective and powerful method for
discovering outlier loci that may be the target of natural selection.
Although many studies have identified outlier loci that might be targets of natural
selection, not all have included information about the ecological context in which the
outlier loci were found (Stinchcombe & Hoekstra 2008). When ecological information is
considered, many identified outlier loci are associated with a particular environment or
ecotype, suggesting that these loci may be targets of natural selection (Rogers &
Bernatchez 2007; Poncet, Herrmann, Gugerli, et al. 2010). These signatures of natural
selection have been discovered in a variety of organisms along gradients of temperature,
precipitation and altitude (Bonin et al. 2006a; Poncet, Herrmann, & Gugerli 2010;
Freedman et al. 2010; Bradbury & Hubert 2010; Nunes, Beaumont, & Butlin 2011; Cox &
Broeck 2011), in relation to postglacial colonization and ecological divergence (Bernatchez
et al. 2010; Schluter et al. 2010; Freedman et al. 2010; Renaut & Maillet 2012),
hybridization and introgression (Minder & Widmer 2008; Gagnaire et al. 2009; Whitney et
al. 2010), and host-use differentiation (Egan et al. 2008; Apple et al. 2010; Funk et al.
2011). Loci associated with anthropogenic impacts or artificial selection in wild
populations have also been detected using genome scans (Paris et al. 2010; Orsini et al.
2012), demonstrating the widespread utility of using genome scans for discovering loci
associated with selection. Nevertheless, only a few studies have investigated the function of
outlier loci or attempted to confirm they are actually under selection by demonstrating
variation in fitness conferred by different alleles (Stinchcombe & Hoekstra 2008; Lowry et
al. 2009; Bernatchez et al. 2010; Schluter et al. 2010). Additionally, existing studies have
covered a variety of organisms in different ecological settings, but genome scans for
signatures of natural selection in organisms with large, complex genomes (such as
polyploids) with potentially different responses to natural selection have not yet been
conducted.
1.2 CHALLENGES OF POLYPLOIDY
Polyploidy plays a major role in the evolution of plants, fungi and several animal
lineages (Otto & Whitton 2000; Wendel 2000; De Bodt et al. 2005; Soltis et al. 2009), and
diploid organisms are poor models for understanding polyploid genomics. Polyploid
species have different modes of inheritance, undergo major genomic changes during
formation, and often express genes differently than closely related diploids (Soltis & Soltis
1999; Adams & Wendel 2005). In addition to the challenges of the unconventional
genomes of polyploids, common population-genetic models, such as the Hardy-Weinberg
equilibrium, and metrics, such as F statistics (e.g. FST), assume heterozygous individuals
have a maximum of two alleles, posing a challenge for the analysis of polyploid
populations.
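To make the two-allele assumption concrete, the standard diploid definitions can be written out as follows. This is a textbook sketch rather than a formula taken from this thesis; the symbols p_k (frequency of one allele in population k), K (number of equally weighted populations) and the bar notation for means are introduced here for illustration.

```latex
% Expected heterozygosity at a biallelic locus under Hardy-Weinberg (diploid case)
H_k = 2 p_k (1 - p_k), \qquad
\bar{H}_S = \frac{1}{K} \sum_{k=1}^{K} H_k, \qquad
H_T = 2 \bar{p} (1 - \bar{p}), \qquad
F_{ST} = \frac{H_T - \bar{H}_S}{H_T}

% Worked example with K = 2, p_1 = 0.2, p_2 = 0.8 (so \bar{p} = 0.5):
% \bar{H}_S = 0.32, \quad H_T = 0.5, \quad F_{ST} = (0.5 - 0.32) / 0.5 = 0.36
```

Every term above depends on allele frequencies estimated from genotypes with at most two alleles per individual. In a tetraploid, a heterozygote may carry up to four alleles in unknown dosage, so p_k cannot be read directly from genotypes in this way.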
Polyploids originate by two main mechanisms: autopolyploidy, or polyploidization
within a single lineage; and allopolyploidy, or polyploidy associated with interspecific
hybridization (Soltis & Soltis 1999). The hybridization of divergent genomes in
allopolyploids can amplify allelic diversity at each genetic locus (Soltis & Soltis 2000).
However, in all polyploids, genome reduction and genetic bottlenecks following
polyploidization may reduce genetic diversity (Leitch & Bennett 2004). In addition to
originating with potentially extensive genetic diversity, the mechanism of inheritance in
polyploids (polysomic inheritance) lowers the probability of allelic loss through genetic
drift or inbreeding, so that allele fixation requires many more generations than in diploid
populations (Ronfort et al. 1998; Soltis & Soltis 2000). Detection of the genomic effects of
natural selection may be constrained by the “fixed heterozygosity” commonly observed in
polyploids due to the effects of polysomic inheritance. Fixed heterozygosity may prolong
adaptive potential in polyploids (via retaining copies of alternative alleles), but it may also
reduce polyploid fitness through the retention of deleterious alleles despite selection against
them (Otto & Whitton 2000). Fixed heterozygosity may also confound attempts to discover
outlier loci as signatures of selection, as alleles would be consistently present if duplicate
copies were identical by descent. If so, profiling genome-wide expression in divergent
ecotypes may be the best way to discover loci of ecological or evolutionary interest.
The potential for up to four alleles per individual in tetraploid populations poses a
number of unique challenges for commonly used methods in population genetics based on
F-statistics. F-statistics such as FST or Fis use estimates of heterozygosity in diploid
systems to quantify the differentiation between populations (FST) or inbreeding within a
population (Fis; Hamilton 2009). Mathematical methods have been developed to estimate
FST from heterozygosity data in polyploids (e.g. Clark & Jasieniuk 2011), but assessing the
allele dosage is difficult or impossible with commonly used genetic markers (but see
Esselink et al. 2004), making the application of these methods impractical. However,
several methods have been developed to estimate F-statistics from haploid data generated
from amplified fragment length polymorphism (AFLP) analysis (Foll & Gaggiotti 2008;
Foll et al. 2010), which in practice gives the same type of data for both polyploids and
diploids (i.e. alleles are either present or absent, and there is no direct information on
heterozygosity). AFLP data are also amenable to use with methods for assessing population
structure that do not rely on F-statistics, such as distance- or model-based methods
(Pritchard et al. 2000). AFLPs can be used to estimate F-statistics in polyploids, but care
must be taken when inferring population structure or evolutionary processes. For example,
most polyploid species studied to date have multiple origins, and recurring hybridization
and introgression are common amongst polyploid species and their progenitors (Soltis &
Soltis 1999; Soltis et al. 2004; Grubbs et al. 2009; Wu et al. 2010). Additionally,
population structure in polyploids is often affected by differences in breeding systems. For
instance, apomictic polyploid species often segregate based on self-compatibility, as
opposed to geographic parameters, although several of these studies are based on triploid
populations which often have a higher rate of selfing and infertility than tetraploids
(Chapman et al. 2000; Meirmans et al. 2003; Van Der Hulst et al. 2003; Lo et al. 2009;
Symonds et al. 2010).
1.3 AMPLIFIED FRAGMENT LENGTH POLYMORPHISM (AFLP)
Molecular methods for amplifying a large number of loci in a single reaction
continue to be refined and improved, particularly for high-throughput sequencing (e.g.
Baird et al. 2008). Amplified fragment length polymorphism (AFLP) is a method for
amplifying hundreds of markers throughout the genome and has a proven record in
population genomics research in a diverse range of organisms (Meudt & Clarke 2007). The
method comprises four basic steps: 1) digestion of DNA with restriction
enzymes, 2) ligation of adaptors onto the sticky ends of the DNA at the restriction site, 3)
preselective amplification of successfully ligated DNA fragments using PCR with primers
complementary to the adaptors, and 4) selective amplification, using PCR, of DNA fragments
whose sequences are complementary to fluorescently labeled primers that carry three
arbitrarily chosen nucleotides on the end of the primer. The resulting fragments are
separated using capillary electrophoresis and each AFLP locus is scored based on fragment
size in base pair length. AFLP therefore amplifies specific, repeatable sections of the
genome that can be used for genomic investigation in any organism.
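The four steps above can be mimicked in silico as a rough sketch. Several simplifications are assumed here and are not drawn from the text: the enzyme pair is taken to be EcoRI and MseI (the text does not name the enzymes), fragments are measured from the start of one recognition site to the end of the next, adaptor lengths are ignored, and selective nucleotides are checked on one side of the fragment only. The function names are hypothetical.

```python
import re

# Assumed enzyme pair; recognition sequences for EcoRI and MseI.
ECORI, MSEI = "GAATTC", "TTAA"


def cut_sites(seq, site):
    """Start positions of every (possibly overlapping) occurrence of a site."""
    return [m.start() for m in re.finditer(f"(?={site})", seq)]


def aflp_fragments(seq, selective=""):
    """Return sorted lengths of fragments that would amplify.

    A fragment amplifies here only if its two ends come from different
    enzymes and the bases just inside the left site match the selective
    nucleotides (a one-sided simplification of selective PCR).
    """
    sites = sorted(
        [(pos, ECORI) for pos in cut_sites(seq, ECORI)]
        + [(pos, MSEI) for pos in cut_sites(seq, MSEI)]
    )
    lengths = []
    for (start, left), (end, right) in zip(sites, sites[1:]):
        if left == right:
            continue  # both ends cut by the same enzyme
        inner_start = start + len(left)
        if inner_start > end:
            continue  # overlapping recognition sites
        if not seq.startswith(selective, inner_start):
            continue  # selective nucleotides do not match
        lengths.append(end + len(right) - start)
    return sorted(lengths)


if __name__ == "__main__":
    wild_type = "CCGAATTCACGTACGTTTAAGG"
    mutant = wild_type.replace("GAATTC", "GAATTA")  # restriction site lost
    print(aflp_fragments(wild_type, "ACG"))  # fragment present
    print(aflp_fragments(mutant, "ACG"))     # fragment absent
```

The mutant sequence illustrates why a single substitution in a restriction site converts a scorable fragment into an absence, which is the source of the presence/absence polymorphism that AFLP detects.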
The basis for detecting genetic variation in AFLP relies primarily on the presence
or absence of mutations in the restriction site (Meudt & Clarke 2007). If there is a mutation
in a restriction site the restriction enzyme will not cleave the DNA, thus preventing the
ligation of adaptors and amplification in the final product. The basis of scoring AFLP
markers is either a presence or absence (binary) state at each locus (each locus being a
particular fragment length in base pairs; see Appendix C for example data). The term
“AFLP allele” refers to the two alternate states at these presence/absence loci and not to the
actual alleles at a genetic locus, which can potentially number many more than two alleles.
Dominant AFLP data refer to the state in which an allele is scored as either present or
absent. The height of the AFLP amplification peak can also be used as the basis for
genotyping loci (Fischer et al. 2011), but the reliability and accuracy of peak height data for
genomic inference are largely unknown, whereas dominant AFLP data continue to be a reliable
basis for genome scans (e.g. Tice & Carlon 2011).
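Binary (dominant) scoring can be illustrated with a short sketch. It assumes that peaks have already been binned to integer fragment sizes and that a single peak-height threshold separates presence from absence. The threshold value, the sample labels and the function name are hypothetical, not the scoring settings used in this study.

```python
def score_aflp(peaks_by_sample, min_height=100):
    """Convert peak tables into a presence/absence matrix.

    peaks_by_sample maps each sample to {fragment_size_bp: peak_height}.
    Returns (loci, matrix): loci is the sorted list of fragment sizes
    scored as present in at least one sample, and matrix maps each
    sample to a list of 1/0 scores in the same order as loci.
    """
    loci = sorted(
        {
            size
            for peaks in peaks_by_sample.values()
            for size, height in peaks.items()
            if height >= min_height
        }
    )
    matrix = {
        sample: [1 if peaks.get(size, 0) >= min_height else 0 for size in loci]
        for sample, peaks in peaks_by_sample.items()
    }
    return loci, matrix


if __name__ == "__main__":
    # Hypothetical peak heights for two individuals.
    peaks = {
        "alpine_01": {102: 850, 145: 40, 210: 300},
        "lowland_01": {102: 20, 210: 500},
    }
    loci, matrix = score_aflp(peaks)
    print(loci)
    print(matrix)
```

Peak height is used here only to decide presence or absence. It is then discarded, so the resulting matrix carries no information on allele dosage or heterozygosity.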
1.3 ALPINE AND LOWLAND ENVIRONMENTS
Alpine and lowland habitats differ extensively in abiotic and biotic conditions
(Billings 1974), and several studies have found divergent adaptation between alpine and
lowland populations (Emery & Chinnappa 1994; Bonin et al. 2006a; Poncet, Herrmann, &
Gugerli 2010; Fischer et al. 2011). The extreme nature of alpine environments has favoured
the evolution of alpine specialist species, and speciation itself may be accelerated in alpine
habitats (Billings 1974; Hughes & Eastwood 2006), making alpine and lowland systems
ideal habitats to study population divergence amongst terrestrial organisms (Schonswetter
et al. 2003; Pinceel et al. 2005; Mráz et al. 2007).
Alpine habitats are generally characterized by extreme abiotic conditions: they are
colder and more exposed than lowland environments, with a shorter growing season, lower
predation, and more intense radiation with a greater short-wavelength (blue/UV) component
(Billings 1974; Emery & Chinnappa 1994). Lowland habitats are generally less extreme, but
have more competition between species, warmer and longer growing seasons, less exposure,
and less intense radiation with a greater far-red and infrared component (Billings 1974;
Emery & Chinnappa 1994). Alpine and
lowland habitats can be sources of divergent selection in species that occupy both habitats
(Byars et al. 2007; Gonzalo-Turpin & Hazard 2009; Ikeda & Setoguchi 2010). If so, these
differences in selection should be evident at the molecular level. Determining the
genetic mechanisms contributing to these differences would be important both for
understanding the evolution of adaptive trait variation in these environments and for
characterizing putative candidate genes associated with economically important traits, such
as cold tolerance, response to drought and limited nutrients, and response to environmental
stress in general.
In addition to functional regions of the genome that may be affected by natural
selection, ecological differences between lowland and alpine environments can affect
patterns of dispersal, rates of population divergence, and speciation (Hughes & Eastwood
2006; Alvarez et al. 2009; Huang et al. 2011; Buehler et al. 2012). Alpine environments in
particular can be functionally similar to islands, with rapid divergence occurring between
populations or species as alpine habitats are colonized (Hughes & Eastwood 2006). The
isolation of mountain tops amongst intervening temperate habitats can reinforce this
differentiation through restricted gene flow (Aegisdóttir et al. 2009; Huang et al. 2011;
Buehler et al. 2012). Additionally, environmental variation between alpine sites and the
generally extreme nature of alpine habitats may further limit the success of individuals
dispersed between populations, thereby reducing gene flow and accelerating differentiation
between alpine populations (Alvarez et al. 2009; Meirmans et al. 2011). If conditions vary
between alpine populations or different molecular mechanisms of adaptation have evolved
between populations in response to alpine environmental conditions, then gene flow may
be reduced at loci under selection (Lenormand 2002). Lowland environments can have a
more continuous landscape than alpine environments, particularly in the semi-grassland
habitats that were investigated in this study. The relatively homogeneous abiotic
environment across lowland habitats and the lack of major barriers to dispersal allows more
extensive gene flow between lowland populations, which may reduce genetic
differentiation (e.g. Carter & Robinson 1993).
1.4 RESEARCH OBJECTIVES, HYPOTHESES AND PREDICTIONS
In this study, I conducted a genome scan for signatures of natural selection between
alpine and lowland ecotypes of the allopolyploid plant Anemone multifida Poir.
(Ranunculaceae). Anemone multifida is a widespread species (Argentina to Alaska) that
occupies habitats from sea level to high alpine, making it a good candidate for a genome
scan for signatures of natural selection in a species with a large, complex genome. A.
multifida is hypothesized to be an allotetraploid based on observations of two distinct
chromosome sets, one of which is similar to chromosomes from a clade of alpine specialist
species, whereas the other set is more similar to chromosomes from a lowland clade
(Meyer et al. 2010; Hoot et al. 2012). Therefore, A. multifida may possess alternate copies
of alleles that are advantageous in alpine environments (from the “alpine” chromosome set)
and lowland environments (from the “lowland” chromosome set), which may explain its
wide habitat range (Meyer et al. 2010; Hoot et al. 2012). A. multifida is distributed from
sea level to 4200 m (approximately 2300 m in North America) and has a discontinuous
range throughout North America and temperate regions of South America (Hoot et al.
2012). Both sympatric and allopatric populations of A. multifida exhibit extensive
morphological variation (Meyer et al. 2012; Hoot et al. 2012). Throughout its North
American distribution there are white, red and pink flowers, whereas only white flowers
occur in South America. The goals of this study were to determine 1) whether the genome
of A. multifida includes outlier loci that may have been the target of natural selection, 2)
whether populations in the same environment possess similar alleles that differ between
environments, which would indicate that natural selection (vs. genetic drift) has led to
alpine and lowland adaptation, and 3) whether neutral population genetic structure suggests
that non-selective processes have driven population divergence.
For the first goal, I conducted a genome scan on individuals collected from lowland
and alpine populations in and along the Rocky Mountains of western Canada, and
quantified differentiation as FST at each AFLP locus across all of the populations. Loci that
showed very high or very low FST were deemed to be outliers, which could indicate
signatures of natural selection in the genome amongst all populations, although genetic
drift could also account for genetic differentiation at outlier loci. If natural selection has
affected the genome of A. multifida, outlier loci should show very limited differentiation
between populations in the case of balancing selection, or extensive differentiation if
divergent selection has had an effect. Alternatively, if natural selection has not affected
specific sites throughout the genome of A. multifida, all loci should exhibit similar
intermediate differentiation, with no outlying loci.
The population structure and distribution of outlier loci provide information about
the context in which natural selection might be acting on outlier regions of the genome, but
this analysis alone does not determine whether population structure caused by neutral
evolutionary processes could be maintaining differentiation at outlier loci. Therefore,
population structure was also analyzed for non-outlier (neutral) loci. Analysis across multiple
populations from each environment allows separation of demographic and environmental
factors in determining genetic population structure, and helps identify whether populations
diverged primarily due to selectively neutral processes. If so, gene flow or genetic drift has
affected population evolution, which should be evident at the whole genome scale (i.e.
neutral loci). Given low gene flow and/or high genetic drift, populations should be highly
differentiated at neutral loci. In contrast, high gene flow and/or negligible genetic drift
should generate limited genomic differentiation between populations.
Although outliers within genetic data would reveal signatures of natural selection
in the genome of Anemone multifida, the distribution of outlier alleles within and between
each population is necessary to assess if the differing conditions in alpine and lowland
habitats may be driving patterns of genetic differentiation. Therefore, for the second part of
this study, I conducted multiple analyses of population structure for outlier loci using a
combination of distance- and model-based methods, as well as estimates of FST and allele
frequencies within and between populations to determine whether outlier loci were
segregated according to environment. To link the genetic data to potentially adaptive
phenotypes I also tested for associations between genetic markers, plant height and floral
colour (as it is a variable trait in this species) amongst all populations. Shorter plant height
may be selected in alpine environments to prevent damage from wind, falling debris and
freezing, and taller plant height may be selected in lowland environments to avoid
competition for light (Billings 1974; Emery & Chinnappa 1994). In the event of a
genotype-phenotype correlation, I could determine whether phenotypes were subject to
balancing or directional selection.
CHAPTER 2: MATERIALS AND METHODS
2.1 STUDY SPECIES, FIELD SAMPLING AND POPULATION CHARACTERISTICS
Leaf tissue was sampled from A. multifida individuals during flowering from two
alpine and three lowland sites in Alberta, Canada, during June and July, 2011 (Table 1,
Figure 2). Within populations, plants were sampled along a transect with a minimum
distance of 7 m between individuals. Leaf material was placed in plastic bags with silica gel
for storage and future DNA extraction. Floral colour and plant height were also measured
in the field. To measure floral colour, petals were collected from individuals displaying all
floral colour morphs (white, red and pink) in a sample from Big Hill Springs, Alberta,
placed in a cooler to prevent pigment degradation during transport, and scanned the same
day with an Ocean Optics USB 2000 spectrophotometer to assess floral colour (following
McEwen and Vamosi 2010). Floral colours generally fell into white (uniform transmittance
across the visual spectrum), red (transmittance in visual-red wavelengths), and pink
(slightly higher uniform transmittance and lower transmittance in visual-red spectrum) with
no UV reflectance, so remaining floral colour phenotypes were scored according to their
visual colour without a spectrophotometer. Above-ground plant height was measured on
live individuals from the base of the plant at the soil to the tallest flowering shoot using a
tape measure.
Table 1. Location and elevation of sites from which A. multifida was sampled. The lowland
populations were from Big Hill Springs (BHS), Beauvais Lake (BL), and Willow Creek
(WC), while the alpine populations were from Hailstone Butte (HSB) and Highwood Pass
(HWP).
Population         Final Sample Size   Latitude (°N)   Longitude (°W)   Elevation (m)
Big Hill Springs   24                  51.251          114.386          1229
Beauvais Lake      25                  49.415          114.092          1472
Hailstone Butte    29                  50.205          114.445          2080
Highwood Pass      24                  50.604          114.984          2377
Willow Creek       21                  50.117          113.777          1055
Figure 2. Location of populations sampled from Alberta, Canada, in and along the Rocky
Mountains and foothills during June and July, 2011. Populations BHS, WC and BL are
lowland (1055 – 1472 metres) and HWP and HSB are alpine (2080 – 2377 metres)
populations.
The environmental differences between populations were not quantified in this
study, but are readily available from other sources for populations adjacent to the sites used
in this study (Emery & Chinnappa 1994). Alpine and lowland environments differ
considerably in their abiotic characteristics (Table 2; Emery & Chinnappa 1994). Alpine
environments generally have more photosynthetically active radiation, stronger winds,
lower temperatures and briefer growing seasons than lowland habitats (Table 2; Emery &
Chinnappa 1994). In lowland environments, the potential effects of more intense
competition are evident in the lower soil moisture and nutrient content, as well as the
greater biomass and height at herbaceous plant layers (Table 2; Emery & Chinnappa 1994).
Table 2. The environmental differences between an alpine and a lowland environment near
the sites sampled in this study, based on Emery & Chinnappa (1994). Only soil NH3 is provided, as
other soil nutrients (NO3 and PO4) follow a similar pattern (higher nutrient and organic
content in alpine than lowland). PAR - photosynthetically active radiation.
                                   Alpine    Lowland
Elevation (m)                      2453      1310
PAR (µgE/sm2)                      2242      1627
Wind (m/s)                         6.6       2.8
Growing season temperature (°C)    7.9       14.7
Herb layer biomass (g/m2)          142.3     572.2
Herb layer height (cm)             15.3      72.9
Soil moisture (% wt)               60.9      35.5
Soil NH3 (µg/g dry mass)           62.1      13.9
2.2 DNA EXTRACTION, AFLP AND ALLELE SCORING
DNA was extracted from silica-dried leaf tissues using a standard
CTAB/chloroform DNA extraction protocol (Khanuja et al. 1999). Leaves were crushed in
a microfuge tube and incubated overnight in a CTAB/β-mercaptoethanol buffer to disrupt
tissues and lyse cells. DNA was separated with chloroform and precipitated in ethanol
overnight and re-suspended in ddH2O. DNA quality was determined with agarose gel
electrophoresis to assess any DNA degradation, and a Beckman Coulter DTX 880
Multimode Detector spectrophotometer (Beckman Coulter, Brea, CA, USA) was used to
assess contamination from protein and RNA and quantify DNA. DNA concentration was
standardized to 150 ng/µL and a total of 750 ng was used for amplified fragment length
polymorphism (AFLP) analysis following the Amplification Kit for Regular Plant
Genomes (Applied Biosystems, Carlsbad, CA, USA) using the restriction enzymes EcoR I
and Mse I (New England BioLabs, Ipswich, MA, USA). DNA was digested by incubating
overnight with the restriction enzymes, T4 DNA ligase, NaCl, BSA and the complementary
adaptors, checked for complete digestion on an agarose gel and diluted to a 10X
concentration in water for preselective amplification. Preselective amplification was
conducted with the supplied reagents according to the manufacturer’s instructions
(Appendix A), and checked to verify that amplification occurred in the 100-1500 bp
range on an agarose gel. Preselective product was diluted to a 5X concentration for
selective amplification. Selective amplification was performed on the preselective product
with MseI - EcoRI adaptors CAA-ACG, CAC-ACG, and CTC-AGG with the AFLP
Amplification Core Mix PCR master mix (Applied Biosystems, Carlsbad, CA, USA).
AFLP fragments were separated on an Applied Biosystems 3500xL Genetic Analyzer
(Applied Biosystems, Carlsbad, CA, USA) at the University of Calgary, Department of
Biological Sciences.
Allele sizes (in base pairs) were determined by reference to the internal sizing
standard (GS-500 LIZ) in the software GENEMAPPER v4.0 (Applied Biosystems,
Carlsbad, CA, USA). Fragments between 100-500 bp were scored using automatic allele
binning in Genemapper, with a cut-off intensity of 100 fluorescent units to minimize
false-allele calling from low level artifacts in the electropherogram. The polymorphic peaks
identified in Genemapper were then manually checked for quality and consistent scoring.
AFLP alleles with multiple peaks were discarded due to the unreliable sizing of the
fragments. AFLP alleles with amplification at or near the 100 fluorescent unit cut off were
manually checked for consistent scoring, as peaks with amplification just below the
threshold can be a major source of allele drop out (Luikart et al. 2003). The identified
alleles were first checked against five DNA sample replicates on different gels. The error
rate after correcting for peak quality was calculated as the proportion of loci scored
inconsistently between replicates. Loci that were inconsistently scored between DNA
replicates were removed from the final data set to reduce the error rate as much as possible
for use in determining genotype and population structure. Samples that had weak
amplification or high noise across the electropherogram were also discarded to avoid allele
dropout and false-allele calling stemming from failed or non-optimal PCR conditions,
leaving 479 loci in the final dataset.
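The scoring and replicate-checking steps described above can be sketched as follows. This is a minimal illustration only, not the Genemapper workflow itself: the function names and the toy peak heights are hypothetical, and treating a peak exactly at the 100-unit cut-off as present is an assumption.

```python
# Minimal sketch: dominant (binary) AFLP scoring and replicate-based filtering.
# Function names and toy peak heights are hypothetical illustrations.

THRESHOLD = 100  # fluorescent units; the cut-off intensity stated in the text


def score_dominant(peak_heights, threshold=THRESHOLD):
    """Convert peak heights to presence (1) / absence (0) calls.

    Assumption: a peak exactly at the threshold is scored as present.
    """
    return [1 if height >= threshold else 0 for height in peak_heights]


def inconsistent_loci(replicate_pairs):
    """Return indices of loci scored differently within any replicate pair."""
    bad = set()
    for first, second in replicate_pairs:
        for locus, (a, b) in enumerate(zip(first, second)):
            if a != b:
                bad.add(locus)
    return sorted(bad)


def error_rate(replicate_pairs, n_loci):
    """Proportion of loci scored inconsistently between replicates."""
    return len(inconsistent_loci(replicate_pairs)) / n_loci


# Toy example: one sample run twice at four loci (peak heights in fluorescent units).
run_1 = score_dominant([850, 40, 120, 0])
run_2 = score_dominant([910, 35, 95, 0])  # third peak falls just below the cut-off
pairs = [(run_1, run_2)]
bad_loci = inconsistent_loci(pairs)       # loci to drop from the final data set
rate = error_rate(pairs, n_loci=4)
```

The third locus in the toy data shows how a peak hovering around the cut-off produces an inconsistent call between replicates, which is why peaks near the threshold were checked manually.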
2.3 DETECTION OF OUTLIER LOCI
Genemapper provides the option of exporting both the dominant (binary, present or
absent allele information) and peak height data from each allele (if the allele is present).
Throughout the history of AFLP, most analyses have chosen to use the dominant data as
the basis of genotyping individuals (Foll et al. 2010). I used BayeScan (Foll & Gaggiotti
2008; Foll et al. 2010; Fischer et al. 2011) to identify outlier AFLP loci based on a
decomposition of the logistic transformation of FST for locus i in population j onto
locus-specific (αi) and population-specific (βj) components (Foll & Gaggiotti 2008). To identify
outlier loci, the posterior probability that each locus is an outlier (αi ≠ 0) was estimated with
a Markov Chain Monte Carlo method, as the proportion of iterations for which αi was
included in the model during sampling. In this study, I considered posterior odds > 10 (i.e. log10 posterior odds > 1)
as indicating that a particular locus is an outlier, as in previous investigations (Foll &
Gaggiotti 2008; White, Stamford, et al. 2010; Alberto et al. 2010; Foll et al. 2010; Fischer
et al. 2011). Amongst the identified outlier loci, the mechanism of selection can be inferred
from αi, with negative αi indicating candidates for balancing selection and positive values
indicating candidates for directional selection (Foll & Gaggiotti 2008). In this study, I used
a burn-in of 50,000 iterations, and a sample size of 10,000 with a thinning interval of 50
(following Foll & Gaggiotti 2008; Fischer et al. 2011). The number and identity of loci
determined to be outliers with peak height and dominant data were compared to determine
if any major discrepancies occurred when using either form of AFLP data, but only binary
data was used for subsequent analyses as most population-genetics programs currently
available accept only dominant data inputs.
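The decomposition and the outlier rule described in this section can be illustrated with a short sketch. It is not BayeScan itself and performs no MCMC sampling; it only shows how FST follows from the locus and population effects and how a posterior probability converts to posterior odds. The function names, the toy values of αi and βj, and the use of a base-10 logarithm with a default threshold of 1 (the vertical line described in the captions of Figures 3 and 4) are assumptions made for illustration.

```python
import math


def fst_from_effects(alpha_i, beta_j):
    """Invert the logit model: log(FST / (1 - FST)) = alpha_i + beta_j."""
    return 1.0 / (1.0 + math.exp(-(alpha_i + beta_j)))


def log10_posterior_odds(posterior_prob):
    """Posterior odds PO = P / (1 - P), returned on the log10 scale.

    posterior_prob must lie strictly between 0 and 1.
    """
    return math.log10(posterior_prob / (1.0 - posterior_prob))


def classify_locus(alpha_i, posterior_prob, log10_po_threshold=1.0):
    """Label a locus from its locus effect and posterior probability.

    The threshold is a parameter; 1.0 on the log10 scale is an assumption.
    """
    if log10_posterior_odds(posterior_prob) <= log10_po_threshold:
        return "not an outlier"
    # Positive alpha: candidate directional selection; negative: balancing.
    return "directional" if alpha_i > 0 else "balancing"


# Toy values (hypothetical): one population effect, two locus effects.
beta_j = -3.0
neutral_fst = fst_from_effects(0.0, beta_j)   # locus with no locus-specific effect
outlier_fst = fst_from_effects(2.0, beta_j)   # locus with a positive alpha
```

With a shared population effect, a positive αi raises FST above the neutral expectation and a negative αi lowers it, which is the basis for separating directional from balancing candidates.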
2.4 GENETIC AND POPULATION STRUCTURE ANALYSES
The number of distinct genetic clusters within each dataset was first identified with
a principal components analysis of the AFLP genotype data (example in Appendix C) using
R statistical software (R Development Core Team 2008), and any apparent clustering in the
neutral and outlier loci data along the first and second principal component axes was
assessed for both within and between population clustering (e.g. Bryc et al. 2010).
Clustering methods can also be useful for visualizations and initial investigations of
clustering, but these distance-based methods alone do not constitute a rigorous test of
genetic clustering and are prone to variation in interpretation of figures and the distance
measurement used (Pritchard et al. 2000). I also used the individual assignment-based
approach implemented in STRUCTURE version 2.3.3 (Pritchard et al. 2000; Falush et al.
2003, 2007; Hubisz et al. 2009). STRUCTURE takes a Bayesian approach by sampling the
posterior probability of the number of distinct populations (or genetic clusters), given the
observed number of genotypes using Markov Chain Monte Carlo (MCMC) methods
(Pritchard et al. 2000), using parameters outlined below. Two ancestry models can be used
with MCMC sampling in STRUCTURE: one assuming no admixture (i.e. all individuals
come from one population of origin but populations have not interbred since) and one allowing
admixture (i.e. gene flow may have occurred between two or more populations). I used the
admixture model for this study, as it seemed most reasonable considering the ecological
and evolutionary history of A. multifida. For the neutral and outlier loci data, simulations
using a burn-in of 10000 iterations and 10000 MCMC replicates after burn-in were used to
determine the probability of the model assuming 1 to 7 populations. These simulations
were replicated 10 times at each level of K (the number of putative populations) to
determine the variation in probability estimates, allowing for a correction of the
STRUCTURE results such that the most likely number of unique genetic clusters was
found (Evanno et al. 2005).
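The correction of Evanno et al. (2005) can be sketched as below. This is a minimal illustration under stated assumptions: the second-order difference is taken on the replicate means of the log likelihood at each K (one common reading of the ΔK statistic), the function name is hypothetical, and the log likelihood values are invented toy numbers rather than output from this study.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Compute delta K from replicate log likelihoods.

    log_likelihoods: dict mapping K -> list of replicate ln P(X|K) values.
    Returns a dict K -> delta K for every K with both neighbours K-1 and K+1.
    A standard deviation of zero at some K would raise ZeroDivisionError.
    """
    means = {k: mean(values) for k, values in log_likelihoods.items()}
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(log_likelihoods[k])
    return result


# Toy replicate log likelihoods (hypothetical values, three replicates per K).
toy = {
    1: [-5000, -5002, -4998],
    2: [-4800, -4803, -4797],
    3: [-4500, -4501, -4499],
    4: [-4490, -4495, -4485],
    5: [-4485, -4495, -4475],
}
dk = delta_k(toy)
best_k = max(dk, key=dk.get)
```

Because ΔK needs a neighbour on each side, it is undefined for the smallest and largest K tested, so the range of K should extend beyond the values of interest.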
I analyzed the distribution of genetic variation among and within populations with
an analysis of molecular variance (AMOVA) for both the outlier and neutral loci
using GenAlEx (Peakall & Smouse 2006; also see Gaudeul et al. 2004; Honnay et al.
2009). AMOVA can be used to test for genetic variance among populations (i.e. significant
population structure), and differentiation amongst individuals within populations (i.e. the
population reproduces sexually). The significance of the proportion of variance attributed
to among-population effects (ϕ) is tested by comparing the observed ϕ to a distribution of ϕ
based on simulated populations of randomly assigned individuals (Peakall & Smouse
2006).
To estimate and test genetic population structure between sampled sites (FST), I
used AFLPsurv v1.0 (Vekemans et al. 2002), which assesses FST from the frequency of the
null allele using a number of options. AFLPsurv uses a Bayesian method to estimate the
frequency of the null allele from the sample size (number of individuals) and the number of
individuals that have a null allele (Vekemans et al. 2002). I chose a Bayesian method with
non-uniform prior distribution of allele frequencies (following the model by Zhivotovsky
1999) assuming Hardy-Weinberg conditions were met, which has been regularly used for
estimating null allele frequencies in AFLP studies (Vekemans et al. 2002; Bonin et al.
2007). AFLPsurv also assumes that individuals are diploid, possibly leading to higher
estimates of population differentiation in polyploid species (i.e. there may be higher
heterozygosity within populations due to the possibility of more than two alleles at each
locus). To assess effects of isolation by distance for the neutral and outlier loci, I assessed
the relation of FST (estimated with AFLPsurv) to inter-population distance with linear
regression.
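The isolation-by-distance test described above can be sketched as a regression of pairwise FST on pairwise distance. The sketch makes several assumptions that are not stated in the text: inter-population distance is taken as the great-circle (haversine) distance between the coordinates in Table 1, the pairwise neutral FST values are read from Table 4 on the assumption that they are listed row by row, and all function names are hypothetical.

```python
import math
from itertools import combinations

# Site coordinates (degrees N, degrees W) as listed in Table 1.
SITES = {
    "BHS": (51.251, 114.386),
    "BL": (49.415, 114.092),
    "HSB": (50.205, 114.445),
    "HWP": (50.604, 114.984),
    "WC": (50.117, 113.777),
}

# Pairwise neutral FST from Table 4, ASSUMING values are listed row by row.
NEUTRAL_FST = {
    ("BHS", "BL"): 0.018, ("BHS", "HSB"): 0.019, ("BHS", "HWP"): 0.072,
    ("BHS", "WC"): 0.004, ("BL", "HSB"): 0.041, ("BL", "HWP"): 0.095,
    ("BL", "WC"): 0.013, ("HSB", "HWP"): 0.067, ("HSB", "WC"): 0.009,
    ("HWP", "WC"): 0.057,
}


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def linear_regression(x, y):
    """Ordinary least squares; returns slope, intercept and r squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, (sxy ** 2) / (sxx * syy)


pairs = list(combinations(sorted(SITES), 2))          # 10 population pairs
distances = [haversine_km(SITES[a], SITES[b]) for a, b in pairs]
fst = [NEUTRAL_FST[pair] for pair in pairs]
slope, intercept, r2 = linear_regression(distances, fst)
```

Pairwise values are not independent observations, so a plain regression of this kind is only a first look; the F test and p-value reported in the Results are not reproduced here.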
Determination of patterns of population structure at the neutral and outlier loci
assumes that loci are transmitted independently of each other. Instead, loci may be in
gametic-phase disequilibrium, tending to vary together within and between populations,
because of physical linkage or other non-random associations between alleles. In such
cases, functional regions of the genome under selection typically impact areas surrounding
the allele under selection (Nosil et al. 2009b; Feder & Nosil 2010). Testing for
gametic-phase disequilibrium amongst outliers is necessary to isolate the effects of genetic
hitchhiking from drift or selection at each locus. To test for gametic-phase disequilibrium
amongst outlier loci, I used MultiLocus 1.3 (Agapow & Burt 2001), which calculates the
index of association (Brown et al. 1980; Smith et al. 1993; Haubold et al. 1998) by
comparing the number of loci that differ in pairwise comparisons of all
individuals. The variation in the number of different loci between individuals is then tested
against the number of loci that differ between individuals expected when loci are in
equilibrium (Agapow & Burt 2001). Due to computational limitations, disequilibrium was
not assessed between neutral loci.
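The index of association can be sketched for binary AFLP data as below. This is a minimal illustration, not the MultiLocus implementation: it computes only the statistic, as the observed variance in the number of loci that differ between pairs of individuals divided by the sum of the per-locus variances, minus one. The randomization test is omitted, and the function name and toy genotypes are hypothetical.

```python
from itertools import combinations


def index_of_association(genotypes):
    """Index of association for binary multilocus genotypes.

    genotypes: list of individuals, each a list of 0/1 calls at the same loci.
    Returns V_observed / V_expected - 1, where V_expected is the sum of the
    per-locus variances of pairwise mismatches. If every locus is monomorphic,
    V_expected is zero and the statistic is undefined.
    """
    n_loci = len(genotypes[0])
    pairs = list(combinations(genotypes, 2))
    # For each pair of individuals, record mismatch (1) or match (0) per locus.
    mismatches = [[int(a[j] != b[j]) for j in range(n_loci)] for a, b in pairs]

    def variance(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / len(values)

    observed = variance([sum(row) for row in mismatches])
    expected = sum(variance([row[j] for row in mismatches]) for j in range(n_loci))
    return observed / expected - 1.0


# Toy data (hypothetical): two loci that always vary together across individuals.
linked = [[0, 0], [1, 1], [0, 0], [1, 1]]
ia_linked = index_of_association(linked)
```

Values near zero are expected when loci vary independently, whereas loci that vary together inflate the observed variance and give a positive index.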
2.5 PHENOTYPE ANALYSES
Population differences between floral colour and plant height data were assessed to
determine whether environment or demography may have affected the evolution of these
ecologically important phenotypes. All statistical tests were done in R statistical software
(R Development Core Team 2008). Differences in the frequencies of floral colour morphs
among populations were assessed with a chi-square test of independence. ANOVA was
used to test for differences in plant height among populations. Many traits display
remarkable phenotypic plasticity, particularly between alpine and lowland environments
(e.g. Chinnappa et al. 2005). To determine whether population differences in floral colour
distribution or mean plant height represented plasticity, or may have been affected by
natural selection, I searched the genetic data for associations with the measured
phenotypes. Multiple Spearman correlations were used to detect associations between
phenotypes and each allele. As an initial correction for multiple comparisons, Bonferroni
correction was used on the resulting p-values. Given the low power of this approach,
particularly with many comparisons (Ryman & Jorde 2001), I also corrected for multiple
comparisons by controlling the false discovery rate, using the “fdr” option in R following
Benjamini & Hochberg (1995).
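The two corrections can be compared with a short sketch. The analyses in this study were run in R; the Python functions below are hypothetical stand-ins that follow the standard Bonferroni and Benjamini-Hochberg adjustments, and the p-values are invented toy numbers.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    n = len(p_values)
    # Visit p-values from largest to smallest so the running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, i in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values (hypothetical), e.g. from four marker-phenotype correlations.
raw = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(raw)
fdr = benjamini_hochberg(raw)
```

In the toy example every Bonferroni-adjusted value is at least as large as its false-discovery-rate counterpart, which illustrates why the Bonferroni correction is the more conservative of the two.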
CHAPTER 3: RESULTS
3.1 AFLP AND THE DETECTION OF OUTLIER LOCI
A total of 759 markers amplified with the three primer combinations. There were
511 AFLP markers that remained after discarding monomorphic markers and those with
poor peak quality. There were 32 of these AFLP loci that were incorrectly scored between
DNA replicates (approximately 6.26%). After removing these 32 incorrectly scored loci,
479 markers remained in the final dataset (see Table 3). Amongst all alpine and lowland
populations 13 loci (2.7%) were significant outliers amongst the dominant AFLP data, and
nine were outliers (1.9%) based on peak height (all of which were included in the dominant
data set). Overall, loci assessed with peak height had lower posterior odds at high FST values
than the dominant data, but slightly higher posterior odds at moderate FST (Fig. 3, Fig. 4).
The false-discovery rate for the dominant data was 0.022, almost half that for peak height
(0.041). The power (1 – false negative rate) for peak height was 0.893, slightly lower
than, but similar to, the power for the dominant data (0.898). All outlier loci have
positive α, suggesting divergent selection (Foll & Gaggiotti 2008).
Table 3. Basic AFLP primer pair characteristics, including NBANDS, the number of bands
scored, NSAMPLES, the number of samples successfully scored, HE, expected heterozygosity,
HEprimer, the expected heterozygosity averaged over primer combinations, HEpop, the expected
heterozygosity averaged over populations, and P, the proportion of polymorphic markers.
Primer combination      Dye   NBANDS   NSAMPLES
EcoRI-CAA / MseI-ACG    JOE   133      122
EcoRI-CAC / MseI-ACG    JOE   163      122
EcoRI-CTC / MseI-AGG    JOE   183      122

             EcoRI-CAA / MseI-ACG   EcoRI-CAC / MseI-ACG   EcoRI-CTC / MseI-AGG
Population   HE       P             HE       P             HE       P             HEprimer
BHS          0.094    0.233         0.134    0.307         0.123    0.295         0.117
BL           0.084    0.248         0.104    0.294         0.113    0.295         0.100
HSB          0.104    0.263         0.127    0.307         0.131    0.328         0.121
HWP          0.106    0.301         0.128    0.344         0.135    0.388         0.123
WC           0.097    0.308         0.130    0.368         0.129    0.355         0.118
HEpop        0.097                  0.124                  0.126
Figure 3. Relation of FST to the log posterior odds (log(PO)) that a particular locus in the
dominant dataset is an outlier, as identified in BayeScan (Foll et al. 2010; Fischer et al.
2011). A log posterior odds of 1 (vertical line) was used as the outlier threshold. Outliers
may represent signatures of natural selection.
Figure 4. Relation of FST to the log posterior odds (log(PO)) that a particular locus in the
peak height dataset is an outlier, as identified in BayeScan (Foll et al. 2010; Fischer et al.
2011). A log posterior odds of 1 (vertical line) was used as the outlier threshold. Outliers
may represent signatures of natural selection.
3.2 POPULATION STRUCTURE OF NEUTRAL LOCI
The first and second principal components of the neutral locus dataset explained
approximately 13.0% and 10.3% of the overall variance, respectively, for a cumulative
proportion of 23.3%. Additional principal components explained less than 6% of the
variance individually. At the neutral loci, most individuals clustered together, regardless of
population of origin, although there was variation amongst individuals along the PC1 axis
(Fig. 5). There were a number of individuals that deviated from the major cluster (Fig. 5).
Of particular note, 5 individuals from the HWP alpine population clustered together and
deviated from other individuals and populations (Fig. 5). There was also a cluster of 6
individuals from three populations, including individuals from both alpine and lowland
populations (Fig. 5). These individuals had a lower distance from the main cluster than the
further HWP cluster, but still show a relatively moderate degree of differentiation from
most individuals at the neutral loci.
Figure 5. Scatterplot of the first two principal components of variation in AFLP genotype
for neutral loci (loci linked to demographic processes, not including outlier loci). Alpine
populations, HWP and HSB; lowland populations, BHS, BL, and WC.
STRUCTURE analysis detected K = 4 or 5 distinct populations. Other models
including K from 1 to 3 or 6 to 7 had substantially lower likelihoods. After correction for
the variance in probability estimates, according to Evanno et al.(2005), the model of K = 4
populations had the highest support, whereas K = 5 had substantially lower support than
other models (Fig. 6). The number of distinct genetic clusters within the data was therefore
deemed to be 4 for future plotting and analyses. The inference of 4 distinct candidate
populations (beyond the 3 evident in the PCA) suggests genetic structure amongst
populations. In agreement with the PCA, STRUCTURE identified unique groups in the
HWP population (blue and yellow clusters in Fig. 7, lower plot), corresponding to the HWP
and the HSB/BL/BHS groups as was found in the PCA. Two additional major clusters
(green and red, Fig. 7) were also identified, with a few individuals assigned roughly equally
to both the red and green clusters (Fig. 7).
Figure 6. The most likely K number of distinct genetic clusters at the neutral loci (denoted
with the highest ΔK) following the correction method for STRUCTURE results by Evanno
et al. (2005). ΔK in this case is the mean of the second order rate of change in K divided by
the standard deviation of K as determined from 10 replicate simulations at each level of K.
The higher ΔK, the more likely K is the correct number of genetic clusters.
Figure 7. Barplots showing the probabilities of individual assignment to each genetic
cluster (represented by different colours) as assigned using neutral loci and assuming 4
genetic clusters in STRUCTURE 2.3.3. The top plot sorts individuals by cluster, and the
bottom plot sorts individuals by site. BHS, BL and WC are lowland and HSB and HWP are
alpine sites.
AFLPsurv identified significant population structure amongst sampled sites at the
neutral loci, with a global FST of 0.041 and a 99% upper limit FST of 0.021 (i.e. p < 0.01).
The HWP alpine population was significantly subdivided from all other populations and
significantly differentiated from the lowland populations (Table 4). Additionally, each
alpine population represented a distinct genetic group, with significant population structure
between both alpine populations (Table 4). The lowland sites did not exhibit significant
genetic structure, with FST values for all comparisons not significantly different from zero
(Table 4). Approximately 9% of the molecular variation at neutral loci occurred between
sites (AMOVA, ϕ4,121 = 0.089, p < 0.001), again indicating significant population structure
amongst sites. Neutral genetic population structure did not vary significantly with distance
between sites (linear regression, r2 = 0.003, F1,8 = 0.024, p = 0.882; Fig. 8).
Table 4. FST estimates based on dominant data for all neutral (top panel) and outlier (bottom
panel) AFLP loci for all pairs of five A. multifida populations. Estimates that differ
significantly from zero at p < 0.01 are bolded, except in the outlier table in which all FST
estimates are significantly greater than zero. The lowland populations are Big Hill Springs
(BHS), Beauvais Lake (BL), and Willow Creek (WC). The alpine populations are Hailstone
Butte (HSB) and Highwood Pass (HWP).
Neutral loci
        BL       HSB      HWP      WC
BHS     0.018    0.019    0.072    0.004
BL               0.041    0.095    0.013
HSB                       0.067    0.009
HWP                                0.057

Outlier loci
        BL       HSB      HWP      WC
BHS     0.074    0.206    0.338    0.092
BL               0.167    0.429    0.032
HSB                       0.445    0.118
HWP                                0.365
Figure 8. Isolation by distance (pairwise FST against distance in km) between all pairs of populations in this study at the neutral
loci (top panel) and the outlier loci (bottom panel). There was no significant effect of
distance on neutral or outlier population structure.
3.3 POPULATION STRUCTURE OF OUTLIER LOCI
The first and second principal components of the outlier locus dataset explained
approximately 29.5% and 16.8% of the overall variance, respectively, for a cumulative
proportion of 46.3%. Further principal components explained less than 9% of the variance
individually. Most individuals from the lowland sites grouped together along the PC1 and
PC2 axes (Fig. 9). Alpine sites tended to cluster separately from each other and the lowland
cluster (Fig. 9). HWP plants overlapped in PC values much less with lowland individuals
than HSB plants (Fig. 9).
Figure 9. Scatterplot of the first two principal components (PC1 and PC2) from a PCA of outlier loci
(candidate loci showing signatures of natural selection) for two alpine populations (HWP
and HSB) and three lowland populations (BHS, BL, and WC).
STRUCTURE identified 3 or 4 distinct genetic clusters, with 3 clusters receiving
highest support after variance correction (Fig. 10). Specifically, the three genetic clusters
(represented by the different colours in Fig. 11) distinguished the HSB and HWP alpine
sites from each other and from the lowland sites as a group (Fig. 11). Although no
individuals from the lowland or HSB populations had a high probability of assignment to
the HWP cluster, many lowland individuals had high probabilities of assignment to the
HSB cluster (Fig. 11). This contrast suggests some gene flow between the HSB and
lowland sites, but not between the HWP and lowland sites (Fig. 11). Overall, the strong
structuring of outlier loci in the alpine sites suggests contrasting selection between alpine
and lowland environments.
Figure 10. The most likely number of distinct genetic clusters, K, at the outlier loci
(denoted with the highest ΔK) detected by STRUCTURE following the correction method
of Evanno et al. (2005), plotted as ΔK against K. ΔK in this case is the mean absolute
second-order rate of change in the log probability of the data, L(K), with respect to K,
divided by the standard deviation of L(K) as determined from 10 replicate simulations at each
level of K. The higher ΔK, the more likely K is the correct number of genetic clusters.
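A minimal Python sketch of the ΔK calculation is given here for illustration. It is not the code used in this study: the log-probability values are hypothetical, the function name is an assumption, and computing the numerator from the replicate means of L(K) is a simplifying choice of this sketch.

```python
from statistics import mean, stdev


def evanno_delta_k(ln_prob_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    ln_prob_by_k maps K to the list of L(K) = ln P(data | K) values from
    replicate runs. Delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)),
    with the numerator taken from the replicate means (sketch assumption).
    """
    mean_l = {k: mean(values) for k, values in ln_prob_by_k.items()}
    delta_k = {}
    for k in sorted(ln_prob_by_k):
        if k - 1 in mean_l and k + 1 in mean_l:
            second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
            # stdev is the sample standard deviation across replicates;
            # it must be non-zero for delta K to be defined.
            delta_k[k] = second_diff / stdev(ln_prob_by_k[k])
    return delta_k


# Hypothetical log probabilities, NOT values from this study.
example_runs = {
    2: [-1000, -1002, -998],
    3: [-900, -901, -899],
    4: [-890, -892, -888],
    5: [-885, -880, -890],
}
delta_k = evanno_delta_k(example_runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

In this toy input the largest ΔK occurs at K = 3, because L(K) rises steeply up to K = 3 and then plateaus. ΔK cannot be evaluated for the smallest and largest K tested, since both neighbours are required.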
Figure 11. Barplot showing the probability of individual assignment to each genetic cluster
(represented by different colours) for outlier loci using the Bayesian approach implemented
in STRUCTURE 2.3.3. Sites represented are lowland (BHS, BL, and WC) as well as alpine
(HSB and HWP).
All analyses of the outlier loci consistently identified three groups of individuals,
corresponding to each alpine site (HSB and HWP) and a lowland cluster. AFLPsurv
identified significant population structure amongst sampled sites at the outlier loci, with a
global FST of 0.255 and a 99% upper limit FST of 0.022 (i.e. p < 0.01). FST differed
significantly from zero between all population pairs, indicating significant genetic
population structure in all populations at the outlier loci (Table 4). Differentiation was most
pronounced (i.e. highest FST) between lowland and alpine environments and between the
two alpine sites (Table 4). Approximately 15% of the genetic variance at the outlier loci
occurred between environments. Between-environment variance accounted for a significant
proportion of the genetic variance (AMOVA, ϕ1,121 = 0.152, p < 0.01), indicating significant genetic
population subdivision between environments. Genetic differentiation at the outlier loci did
not vary significantly with distance between sites (Linear Regression, r2 = 0.057, F1,8 =
0.483, p = 0.507; Fig. 8).
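The pattern of differentiation can be summarised by averaging the outlier-locus FST estimates of Table 4 within each type of comparison. The Python sketch below is illustrative only. It assumes that the pairwise values in Table 4 are assigned to population pairs row by row, and the function name is hypothetical.

```python
from statistics import mean

# Outlier-locus pairwise FST from Table 4.
# Assumption: values are assigned to pairs by reading the table row by row.
outlier_fst = {
    ("BHS", "BL"): 0.074, ("BHS", "HSB"): 0.206,
    ("BHS", "HWP"): 0.338, ("BHS", "WC"): 0.092,
    ("BL", "HSB"): 0.167, ("BL", "HWP"): 0.429, ("BL", "WC"): 0.032,
    ("HSB", "HWP"): 0.445, ("HSB", "WC"): 0.118,
    ("HWP", "WC"): 0.365,
}
habitat = {"BHS": "lowland", "BL": "lowland", "WC": "lowland",
           "HSB": "alpine", "HWP": "alpine"}


def mean_fst_by_comparison(fst, habitat_of):
    """Average pairwise FST within each habitat-by-habitat comparison type."""
    groups = {}
    for (pop_a, pop_b), value in fst.items():
        label = "-".join(sorted((habitat_of[pop_a], habitat_of[pop_b])))
        groups.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in groups.items()}


class_means = mean_fst_by_comparison(outlier_fst, habitat)
for label, value in sorted(class_means.items()):
    print(label, round(value, 3))
```

Under this reading of the table, mean FST is lowest among lowland sites, intermediate for alpine-lowland pairs, and highest for the single alpine-alpine pair.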
The frequencies of alleles at outlier loci tended to vary most between alpine and
lowland populations, with little variation between lowland populations (Table 5). Only one
locus showed similar allele frequencies between the two alpine populations (locus 58);
otherwise allele frequencies differed most between the HSB and HWP sites. The mean difference in
allele frequencies was 0.179 between the HSB alpine population and the lowland populations,
0.393 between the lowland populations and the HWP alpine population, and 0.461 between the
HSB alpine and the HWP alpine sites
(Table 5). Thus, outlier loci tended to be associated with alpine environments. No genetic
associations were detected amongst pairs of outlier loci (Appendix A).
Table 5. Allele frequencies of outlier loci in each lowland (BHS, BL and WC) and alpine
(HSB and HWP) site.
Outlier     BHS         BL          WC          HSB        HWP
locus       (lowland)   (lowland)   (lowland)   (alpine)   (alpine)
locus1      0.156       0.217       0.158       0.691      0.221
locus58     0.479       0.211       0.333       0.030      0.013
locus78     0.015       0.009       0.043       0.010      0.715
locus169    0.858       0.924       0.807       0.773      0.274
locus176    0.293       0.033       0.264       0.371      0.018
locus185    0.516       0.625       0.746       0.196      0.668
locus194    0.628       0.563       0.378       0.106      0.326
locus203    0.512       0.565       0.448       0.511      0.043
locus209    0.464       0.742       0.753       0.662      0.200
locus220    0.576       0.541       0.515       0.469      0.041
locus254    0.149       0.016       0.195       0.020      0.782
locus333    0.013       0.009       0.018       0.009      0.527
locus427    0.693       0.361       0.382       0.297      0.918
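The mean allele-frequency differences reported in the text can be recomputed from the Table 5 values. The Python sketch below assumes that each reported mean is the average absolute difference across the 13 outlier loci, and that the lowland frequency at each locus is the unweighted mean of the three lowland sites. These assumptions are not stated in the text, and the variable and function names are illustrative.

```python
from statistics import mean

# Allele frequencies from Table 5, in locus order
# 1, 58, 78, 169, 176, 185, 194, 203, 209, 220, 254, 333, 427.
freq = {
    "BHS": [0.156, 0.479, 0.015, 0.858, 0.293, 0.516, 0.628,
            0.512, 0.464, 0.576, 0.149, 0.013, 0.693],
    "BL":  [0.217, 0.211, 0.009, 0.924, 0.033, 0.625, 0.563,
            0.565, 0.742, 0.541, 0.016, 0.009, 0.361],
    "WC":  [0.158, 0.333, 0.043, 0.807, 0.264, 0.746, 0.378,
            0.448, 0.753, 0.515, 0.195, 0.018, 0.382],
    "HSB": [0.691, 0.030, 0.010, 0.773, 0.371, 0.196, 0.106,
            0.511, 0.662, 0.469, 0.020, 0.009, 0.297],
    "HWP": [0.221, 0.013, 0.715, 0.274, 0.018, 0.668, 0.326,
            0.043, 0.200, 0.041, 0.782, 0.527, 0.918],
}


def mean_abs_difference(p, q):
    """Mean absolute per-locus difference between two frequency lists."""
    return mean(abs(a - b) for a, b in zip(p, q))


# Assumption: lowland frequency = unweighted mean of the three lowland sites.
lowland = [mean(site_values)
           for site_values in zip(freq["BHS"], freq["BL"], freq["WC"])]

hsb_vs_lowland = mean_abs_difference(freq["HSB"], lowland)
hwp_vs_lowland = mean_abs_difference(freq["HWP"], lowland)
hsb_vs_hwp = mean_abs_difference(freq["HSB"], freq["HWP"])
print(round(hsb_vs_lowland, 3), round(hwp_vs_lowland, 3), round(hsb_vs_hwp, 3))
```

Under these assumptions the script prints 0.179 0.393 0.461, which matches the values reported in the text.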
3.4 PHENOTYPIC DIFFERENCES IN HEIGHT AND FLORAL COLOUR
Phenotypic variation was present between populations. Floral colour frequencies
differed significantly between sites (χ2 = 30.78, df = 8, p < 0.001; Fig. 12). In particular,
almost all individuals from the HSB and HWP alpine sites had white flowers. I observed only 2
red-flowered plants at alpine sites, both at HSB. Plant height also differed significantly
between sites (ANOVA, F4,118 = 8.83, p < 0.0001; Fig. 12), as plants at HSB were shorter
than plants at other sites (Tukey’s test, p < 0.05; Fig. 12). No AFLP loci showed significant
association with floral colour or above-ground height phenotypes (Spearman correlation, df
= 121, p > 0.05).
Figure 12. Variation in plant height and flower colour within and among lowland sites,
BHS, BL and WC, and alpine sites, HSB and HWP.
CHAPTER 4: DISCUSSION
4.1 GENETIC POPULATION STRUCTURE AT NEUTRAL AND OUTLIER LOCI
In this study, I examined patterns of genetic variation between multiple populations
of Anemone multifida, an allopolyploid plant with a large range spanning both alpine and
lowland environments. Neutral population divergence was evident between alpine
populations and between alpine and lowland environments, though the degree of divergence
varied between alpine sites. In addition to neutral population structure, there was evidence
for differentiation at the outlier loci according to alpine and lowland environments. Though
the presence of neutral population structure may be related to population structure at the
outlier loci, the presence of differing allele frequencies between environments at the outlier
loci suggests adaptation to alpine and lowland environments may have occurred in A.
multifida.
Amongst all alpine and lowland sites an estimated 2.7% of the genome (1.9% with
peak height data) represents possible signatures of natural selection, within the 1-4% range
reported from other studies of contrasting environments (e.g. Apple et al. 2010; Fischer et
al. 2011; Paris & Despres 2012). Frequencies of outlier loci varied independently,
suggesting that they have evolved independently. These loci were highly differentiated
between populations, as expected from divergent selection, with no evidence for balancing
selection at any locus. Allele frequencies at the outlier loci in each site are unknown from
these data alone, so whether alpine and lowland environments are associated with
population divergence at the outlier loci is also unknown. Additionally, as genetic drift is
also a prominent form of evolutionary divergence between populations (Nosil, Funk, &
Ortiz-Barrientos 2009), drift cannot be excluded as being responsible for the observed
outlier genetic variation without comparing patterns of evolution at the neutral and outlier
loci in multiple populations. However, separate analysis of outliers and neutrally evolving
loci enabled the investigation of genetic structure at both sets of loci to determine the
probable roles of genetic drift and divergent natural selection in the structure of genetic
variation between alpine and lowland environments.
The presence of four distinct genetic clusters within the neutral data suggests
neutral evolutionary divergence in A. multifida. Specifically, the HWP alpine population
has apparently diverged neutrally from all other sampled sites, and the HSB alpine
population has diverged from one lowland site. The presence of a unique cluster of
individuals in the HWP alpine population likely contributed to the consistently high FST
estimates for this population. Alpine sites can exert extreme abiotic selection (Billings
1974; Korner 2003), and are often isolated by major geographical barriers to gene flow
(e.g. mountain ranges). The combined effects of extreme environment and restricted gene
flow may have enhanced population divergence in A. multifida, which is consistent with
findings of population divergence and speciation in alpine environments (Bonin et al.
2006b; Hughes & Eastwood 2006; Poncet, Herrmann, Gugerli, et al. 2010; Fischer et al.
2011). The neutral genetic structure amongst the sites in this study suggests neutral
evolutionary processes, such as genetic drift or restricted gene flow, have contributed to
population divergence. Furthermore, patterns of genetic differentiation at the outlier loci
have likely been affected by neutral population divergence in conjunction with natural
selection.
The significant genetic differentiation between all sites at the outlier loci suggests
strong divergence at candidate loci for adaptation amongst environments. The particularly
high FST between alpine and lowland sites suggests accelerated genetic differentiation at the
outlier loci in alpine environments. Similarly, the high FST between the alpine populations
at the outlier loci suggests different alleles are under selection at these sites, although
neutral processes may be primarily responsible for the differentiation at the outlier loci
between these populations. The outlier loci represent three genetic clusters: HSB-alpine,
HWP-alpine and lowland populations as a group. Some individuals assigning to the HSB-alpine
outlier group were present at lowland sites, and HSB-alpine individuals represented
the entire HSB site. The STRUCTURE results suggest that the overlap in clustering
between HSB and lowland sites evident in the principal component and dendrogram
analysis primarily reflects the presence of HSB-alpine alleles in lowland sites and not
lowland alleles in the HSB site. The variation in the proportion of HSB alpine alleles in the
lowland sites could be the primary cause of significant FST estimates at the outlier loci
amongst lowland sites, which otherwise tend to have the lowest FST estimates. The outlier
loci from the HWP-alpine cluster were present only in the HWP site, and appeared to be
highly divergent from the lowland and HSB-alpine groups. The consistently high FST
estimates for all comparisons of outlier loci involving the HWP alpine population are
consistent with the cluster analyses in suggesting extensive evolutionary divergence at
these loci in this population. Overall, these results support divergent natural selection
between environments, which is strongest in alpine habitats.
Common explanations of neutral genetic divergence could explain the observed
genetic structure, but the allopolyploid history of Anemone multifida may have also
affected patterns of neutral genetic differentiation. Many, if not most, polyploid species
have multiple origins (Soltis & Soltis 1999; Symonds et al. 2010). Each allopolyploid
origin could produce lineages with highly divergent genomes almost immediately.
Similarly, differences in the effects of genomic downsizing and restructuring following
polyploidization can cause newly synthesized polyploids to have highly differentiated
genomes (Soltis & Soltis 1999; Otto & Whitton 2000), creating a polyploid complex.
Polyploids tend to have fewer barriers to introgression with closely related species,
including examples in Anemone (Heimburger 1959; Boraiah & Heimburger 1964;
Heimburger & Boraiah 1964). Interbreeding with different species in some sites could
introduce highly divergent genetic material into polyploid populations that occur
sympatrically with other species. The sites sampled in this study are situated along the
range limits of a number of western and central North American Anemone species that are
closely related to A. multifida (Meyer et al. 2010; Hoot et al. 2012), raising the possibility
of introgression. Although alpine environments were associated with neutral population
divergence in A. multifida, neutral genetic divergence may also have been driven by multiple
polyploidization events and genomic changes associated with hybridization and polyploidy,
perhaps accounting for the two minor divergent clusters of individuals in the neutral data.
In this case, the genetic clusters may associate only weakly with sampled sites because each
genetic cluster represents a parental lineage (i.e. the putative parental species of the
allopolyploid A. multifida).
The presence of individuals assigning to the HSB-alpine cluster at the outlier loci at
lowland sites suggests that selection is potentially not as strong on the HSB-alpine alleles in
lowland environments as in the alpine. The fitness of these individuals is unknown, but
they were flowering during sampling, indicating individuals with HSB-alpine alleles
survived to reproductive stages in the lowland environment. In contrast, the absence of
individuals with a high probability of assignment to the lowland group at the outlier loci in
the HSB site suggests that selection acts strongly against lowland alleles in the HSB alpine
site. Alpine environments generally exert extreme abiotic selection for survival and
reproduction, and successful organisms must function at lower temperatures, with shorter
growing seasons, and under exposure to wind, intense radiation, and falling debris (Billings 1974;
Korner 2003). The more temperate abiotic conditions in lowland environments relax
selection for extreme abiotic tolerance compared to alpine environments, but biotic stresses
such as competition and herbivory may also exert selection (Billings 1974; Emery &
Chinnappa 1994). The differences between these environments can eventually cause
population divergence through adaptation to extremely different ecological conditions
(Bonin et al. 2006a; Poncet, Herrmann, & Gugerli 2010; Fischer et al. 2011), which may
have contributed to the neutral population divergence between alpine and lowland
environments in this study. The apparently weaker selection against HSB-alpine alleles in
the lowland sites may permit more gene flow from HSB to lowland sites, perhaps
accounting for the lower neutral population divergence than between HWP and the lowland
sites. However, the lower frequency of lowland alleles at HSB indicates that selection for
alpine adaptation maintains divergence at the outlier loci in alpine environments.
Despite the evidence for limited divergence at outlier loci between HSB and
lowland environments, the lower frequency of HWP-alpine alleles in lowland sites,
consistently significant neutral genetic differentiation and nearly uniform distribution of
HWP-alpine alleles in the HWP site suggest extensive differentiation at both the outlier
and neutral loci in the HWP site. Additionally, the divergence between HSB and HWP
alpine sites at the outlier loci suggests natural selection for alpine adaptation may have had
different effects at these locations, perhaps because of neutral divergence of the HWP
population. HSB is approximately 300 m lower than HWP, suggesting differences in
environment along an elevation gradient may be driving the divergence between all three
outlier clusters. Additionally, HSB is located at the eastern front range of the Rocky
Mountains closer to the lowland populations, whereas the HWP site is on the west side of the
front range. The high alpine barrier between HWP and the lowland sites may account for
the increased neutral and outlier population divergence, and the apparently greater migration
between the HSB and lowland sites. The HSB site was much more exposed to wind, had thinner
soil, and lower shrub cover than the HWP site. The ecological differences between the
alpine sites could account for the population divergence at the outlier loci, but this has yet
to be tested. Alternatively, individuals at these locations may have evolved different
molecular mechanisms for convergent adaptations to similar ecological conditions, as has
been observed in other species (Arendt & Reznick 2008). Due to the extensive neutral
population divergence in the HWP population, and without data from additional alpine
populations, whether natural selection or neutral evolutionary processes are the primary
cause for the differentiation at the outlier and neutral loci is uncertain.
The high frequency of white-flowered individuals in alpine environments and the
shorter plants at the HSB site suggest these phenotypes reflect differences between alpine and
lowland environmental conditions. Low shoots can reduce damage from exposure to wind
and falling debris, and closer proximity to the ground can also limit freezing and frost
formation (Billings 1974; Korner 2003). For example, the particularly windy conditions at
the HSB site may explain its short plants, unlike at the HWP site, which is not as exposed to
wind. The high frequency of white flowers in alpine environments could indicate that
differences in the pollination community at alpine sites have favoured white floral colour.
Alternatively, the lack of pigmentation could be the by-product of lower phytochemical
production from generally lower herbivory in alpine environments (Billings 1974). The
lack of correlation between genotype and phenotype for both floral colour and shoot height
traits could indicate that the genome scan included too few loci to detect such
associations. Further sampling with different restriction enzymes may yield markers
associated with floral colour or plant height. Alternatively, floral colour or shoot height
may be phenotypically plastic, as is often observed with both plant height and floral colour
(Nicotra et al. 2010).
The higher false discovery rate and fewer detected loci based on AFLP peak height
data than dominant AFLP data suggest that peak height data provide lower power. This
conclusion contrasts with a previous investigation that found band intensity had a higher
power to discover loci that showed signatures of natural selection (Fischer et al. 2011).
Polyploidy may cause ambiguity in genotyping based on peak height. For tetraploids, the
four allele copies at each locus introduce more variation in peak height than in diploids,
possibly leading to a greater variation in estimates of genetic differentiation and/or
decreased ability to detect outlier loci due to homoplasy. If so, peak-height estimates of the
number of outliers in polyploid species would be more conservative than in diploid species.
Additionally, Fischer et al.'s (2011) study may have involved different selection intensity, so that
estimates of outliers may depend on biological factors other than the simple
presence/absence of natural selection.
4.2 LIMITATIONS AND ALTERNATE EXPLANATIONS
Being amongst the first non-theoretical studies of the population genomics of a
polyploid species, this study contributes information about the evolutionary divergence and
adaptation in a polyploid species. The results in this study suggest several key differences
and similarities between the genomics of genetic divergence and adaptation for polyploids
and diploids. Genome scans of diploids for signatures of natural selection consistently
discover loci under the effects of divergent natural selection between environments with
different ecological conditions (Luikart et al. 2003; Stinchcombe & Hoekstra 2008).
Divergent selection has been associated with ecological differences between alpine and
lowland environments (Byars et al. 2007; Fischer et al. 2011), along elevation,
precipitation and temperature gradients (Bonin et al. 2006a; Gonzalo-Turpin & Hazard
2009; Poncet, Herrmann, Gugerli, et al. 2010; Freedman et al. 2010; Bradbury et al. 2010;
Nunes, Beaumont, Butlin, et al. 2011; Cox & Broeck 2011), host-use differences (Egan et
al. 2008; Apple et al. 2010; Funk et al. 2011), and ecological opportunity following major
geological events (Hughes & Eastwood 2006; Bernatchez et al. 2010; Schluter et al. 2010).
Similar associations of outlier alleles with environment were evident for polyploid A.
multifida. The mechanisms underlying genetic divergence at loci associated with divergent
natural selection may therefore be similar in both polyploid and diploid species.
Specifically, alleles under selection become increasingly common in sites between
generations, leading to the characteristically low genetic variation at loci under the effects
of natural selection (Stinchcombe & Hoekstra 2008). In Anemone multifida, the effects of
polysomic inheritance (i.e. fixed heterozygosity) appear to have not constrained the effects
of natural selection or neutral evolutionary divergence on the genome.
Neutral population divergence in this polyploid species is also similar to that in
diploid species. At neutral loci, if population structure exists, diploid populations typically
show some site-specific component to neutral genetic variation. For example, diploid
populations in different regions typically experience lower gene flow, leading to the
eventual whole-genome divergence between populations. Typical examples include neutral
population divergence due to allopatry (Hoskin et al. 2005; Roberts 2006; Kuehne et al.
2007; Surget-Groba et al. 2012), or simply isolation by distance between populations
(Sharbel et al. 2000; Epperson 2007; Pusadee et al. 2009). The finding of four distinct
genetic clusters in the neutral data and significant genetic population structure between
some sites suggests that neutral evolutionary divergence between polyploid populations
operates similarly, though there was no apparent isolation by distance relationship between
populations. Although the outcome of population divergence appears similar between
polyploids and diploids, polyploidy may affect the time required for neutral population
differentiation. Further investigation into the phylogeographic history of A. multifida, and
sequencing of neutral loci to determine variation in allele copy number would elucidate the
relative contribution of short- and long-term evolutionary processes to evolutionary
divergence in polyploids.
The field of the genomics of population divergence and speciation has grown
substantially during the past decade with the progression of molecular markers for studying
genome-wide evolutionary processes (Charlesworth 2010). While this study identified
genetic divergence at a number of loci between alpine and lowland environments and
suggests that natural selection has had similar effects on polyploid and diploid genomes,
the function and polyploid nature of loci that show signatures of natural selection remain
undetermined. Without sequence data or other methods for determining gene copy number,
whether fixed heterozygosity has affected the discovery of some outlier loci remains
undetermined. The potential effects of polyploidy on outlier detection would reduce the
number of estimated outlier loci, so the outliers found in this study may conservatively
represent the extent of natural selection on the genome of A. multifida. Additionally, AFLP
markers may be associated with certain regions of the genome, so the markers used in this
study may not be randomly or evenly distributed throughout the genome (Rogers et al.
2007). Additional outliers may be found if different restriction enzymes are used. The
co-migration of AFLP fragments with similar sizes (size homoplasy) can lead to
overestimates of allele frequencies and potentially decreased estimation of differentiation at
specific loci (Gort et al. 2006; Caballero et al. 2008), reducing the probability of detecting
outlier loci. Overall, the number of outliers found in this study may underestimate the
extent of natural selection on the genome between alpine and lowland ecotypes, and further
genomic analyses may find more outliers associated with alpine and lowland adaptation.
In addition to limitations on outlier discovery, explanations other than
environmentally based natural selection can account for genetic differentiation at the outlier
loci (Bierne et al. 2011). Loci associated with genetic incompatibilities between
populations, which can be heightened by natural selection between divergent environments,
may be the primary cause of increased differentiation at the outlier loci (Rogers &
Bernatchez 2006; Bierne et al. 2011). Although natural selection still acts on genetic
incompatibilities, the outliers identified in this study may not have direct ecological
function related to alpine or lowland adaptation and instead may correspond to
incompatible genomic regions between lowland and alpine environments. Similarly,
selection against newly arisen deleterious mutations in a population can cause local
differentiation at the mutated locus that would appear similar to other loci associated with
adaptation (Charlesworth et al. 1997), further emphasizing the importance of determining
gene function following outlier identification. Neutral evolutionary processes in some cases
can lead to the heightened differentiation characteristic of outlier loci, particularly if certain
populations have shared ancestry or barriers to gene flow are more prevalent between a
subset of populations (Excoffier et al. 2009; Bonhomme et al. 2010). Neutral mutations
that arise in growing populations can also appear to be highly differentiated, as they
increase in frequency as the population expands (Klopfstein et al. 2006; Hofer et al. 2009),
but are unrelated to adaptation. Although environmental associations at the outlier loci
suggest that divergent selection has played a role in the evolution of the outlier loci,
determination of the exact cause of genetic differentiation will require further genomic
analyses to test these alternate explanations. In particular, the ascertainment of gene
function will enable the differentiation of what role environmentally based selection vs.
alternative non-environmentally based explanations have played in patterns of genetic
variation at the outlier loci.
4.3 FUTURE DIRECTIONS
Direct identification of phenotypes that may be the target of natural selection and
tests for associations with the alleles found in this study may eventually uncover the traits
associated with lowland and alpine adaptation, but this would be time-consuming and
potentially ignores many phenotypes that are difficult to assess visually. Sequence data
from large portions of the genome, or the AFLP fragments in this study, provides the
means for simultaneously discovering and characterizing outlier loci, and is more amenable
to future experimentation than anonymous AFLP fragments (Stinchcombe & Hoekstra
2008; Storz & Wheat 2010). Determination of the identity of alleles that show signatures of
natural selection and comparison of the sequence data to known genes is a straightforward
means for finding phenotypes under natural selection, and lays the foundation for
determining the function of ecologically important genes in the case that gene function has
not been previously characterized. Determination of whether the alleles that show
signatures of natural selection confer a fitness advantage to individuals in the wild provides
a more rigorous test for confirming that natural selection actually acts on the alleles
underlying adaptive phenotypes. This is important for removing false positives from the
dataset, separating the effects of genetic drift from natural selection, and establishing a
mechanistic link between genotype, phenotype and natural selection.
In addition to further investigation of the functions of outlier loci, many questions
remain about the cause of neutral evolutionary divergence amongst different genetic
clusters of A. multifida. Neutral genetic population structure was associated with
environmental differentiation between alpine and lowland habitats, but the causes of
divergence of two minor clusters of individuals are unknown. Determination of the
phylogeographic history of Anemone multifida across the range of the species would be a
first step in assessing the larger scale and longer term evolutionary processes that caused
the currently observed neutral evolutionary divergence. Increasing the sample of
populations from across the range of A. multifida will allow for a reconstruction of the
evolutionary history within this species. Similarly, the resolution of the phylogeny of A.
multifida and closely related Anemone species will be critical to the determination of
whether neutral genetic divergence can be attributed to multiple origins of A. multifida, and
whether hybridization between species has contributed to neutral genetic differentiation.
Reports of hybrid breakdown between some A. multifida individuals indicate that neutral
divergence may have progressed to reproductive isolation (Heimburger & Boraiah 1964),
and the examination of interfertility between different genetic groups, such as those
identified in this study, could confirm if the early stages of speciation have indeed
occurred. The utilization of sequence data in some form for the determination of these
questions about neutral divergence will enable the determination of allele copy number, and
thus the potential contribution of polyploidy to the patterns observed in this study. The
comparison of these results in A. multifida with basic population genetic analysis in closely
related diploid species will also clarify whether the patterns of neutral genetic variation
observed in this study are due to polyploidy or are characteristic of the genus.
Perhaps the greatest limitation to this study is the need for more replication of
alpine populations. The HWP population in particular was highly divergent from all other
populations, while there were lower levels of genetic differentiation between the HSB and
lowland populations. In the absence of additional alpine populations for comparison, it is
not possible to determine if the general trend of alpine adaptation and neutral population
divergence is more like the HWP or HSB population (or neither) in A. multifida. Future
studies of the population genomics of alpine and lowland adaptation in this or other species
should seek to 1) replicate alpine populations to a sufficient level, 2) obtain sequence
information to enable the identification of potential function of outlier loci or to establish a
baseline for targeted investigations into gene function, and 3) establish a link between
variation at outlier loci and fitness differences in nature through reciprocal transplant
experiments. Only through sufficient replication of populations and determination of
exactly what function outlier alleles have in adaptation can a clear picture emerge of how
evolution by neutral and selective processes has occurred.
Natural selection plays a major role in the adaptation of species to different
habitats, population divergence and species diversification. This study is amongst the
first to examine the population genomics of a polyploid species on a molecular level.
Polyploidy has played a prominent role in the evolution of plants, fungi and many animal
lineages (Otto & Whitton 2000; Wendel 2000; De Bodt et al. 2005; Soltis et al. 2009), and
the investigation of the mechanisms of adaptation in polyploids will answer many
long-standing questions about the evolutionary significance of genome duplication. Studies of
the genomics of adaptation and population divergence, even in diploids, are still in their initial
stages, but applications of recently developed sequencing technology hold great promise
for answering many questions about effects of different evolutionary processes. Through
the study of the effects of natural selection on the genome, the molecular mechanisms of
adaptation can be discovered and characterized, leading to a clear picture of how organisms
adapt to different environmental conditions. The study of the whole-genome effects of
population divergence provides the context for understanding the effects of evolution at
single loci and the means for understanding the relative contribution of neutral evolutionary
processes and natural selection to population divergence. Determination of the contribution
of selective and non-selective evolutionary processes to the adaptation and will expand
understanding of how evolution has shaped contemporary biological diversity and how
adaptation and speciation will progress in the future.
Bibliography
Adams KL, Wendel JF (2005) Polyploidy and genome evolution in plants. Current
Opinion in Plant Biology, 8, 135–41.
Aegisdóttir HH, Kuss P, Stöcklin J (2009) Isolated populations of a rare alpine
plant show high genetic diversity and considerable population differentiation.
Annals of Botany, 104, 1313–22.
Agapow PM, Burt A (2001) Indices of multilocus linkage disequilibrium.
Molecular Ecology Notes, 1, 101–102.
Alberto F, Niort J, Derory J, et al. (2010) Population differentiation of sessile oak at
the altitudinal front of migration in the French Pyrenees. Molecular Ecology,
19, 2626–39.
Alvarez N, Thiel-Egenter C, Tribsch A, et al. (2009) History or ecology? Substrate
type as a major driver of spatial genetic structure in Alpine plants. Ecology
Letters, 12, 632–40.
Apple JL, Grace T, Joern A, St Amand P, Wisely SM (2010) Comparative genome
scan detects host-related divergent selection in the grasshopper Hesperotettix
viridis. Molecular Ecology, 19, 4012–28.
Arendt J, Reznick D (2008) Convergence and parallelism reconsidered: what have
we learned about the genetics of adaptation? Trends in Ecology & Evolution,
23, 26–32.
Baack EJ, Whitney KD, Rieseberg LH (2005) Hybridization and genome size
evolution: timing and magnitude of nuclear DNA content increases in
Helianthus homoploid hybrid species. The New Phytologist, 167, 623–30.
Baird NA, Etter PD, Atwood TS, et al. (2008) Rapid SNP discovery and genetic
mapping using sequenced RAD markers. PloS One, 3, e3376.
Benjamini Y, Hochberg Y (1995) Controlling the False Discovery Rate: A
Practical and Powerful Approach to Multiple Testing. Journal of the Royal
Statistical Society Series B, 57, 289–300.
Bernatchez L, Renaut S, Whiteley AR, et al. (2010) On the origin of species:
insights from the ecological genomics of lake whitefish. Philosophical
Transactions of the Royal Society of London B, 365, 1783–800.
Bierne N, Welch J, Loire E, Bonhomme F, David P (2011) The coupling
hypothesis: why genome scans may fail to map local adaptation genes.
Molecular Ecology, 20, 2044–72.
Billings W (1974) Adaptations and origins of alpine plants. Arctic & Alpine
Research, 6, 129–142.
De Bodt S, Maere S, Van de Peer Y (2005) Genome duplication and the origin of
angiosperms. Trends in Ecology & Evolution, 20, 591–7.
Bonhomme M, Chevalet C, Servin B, et al. (2010) Detecting Selection in
Population Trees: The Lewontin and Krakauer Test Extended. Genetics, 186,
241–262.
Bonin A, Ehrich D, Manel S (2007) Statistical analysis of amplified fragment
length polymorphism data: a toolbox for molecular ecologists and
evolutionists. Molecular Ecology, 16, 3737–58.
Bonin A, Taberlet P, Miaud C, Pompanon F (2006) Explorative genome scan to
detect candidate loci for adaptation along a gradient of altitude in the common
frog (Rana temporaria). Molecular Biology & Evolution, 23, 773–83.
Boraiah G, Heimburger M (1964) Cytotaxonomic studies on new world Anemone
(section Eriocephalus) with woody rootstocks. Canadian Journal of Botany,
42, 891–922.
Bradbury IR, Hubert S, Higgins B, et al. (2010) Parallel adaptive evolution of
Atlantic cod on both sides of the Atlantic Ocean in response to temperature.
Proceedings of the Royal Society B, 277, 3725–34.
Bridle JR, Vines TH (2007) Limits to evolution at range margins: when and why
does adaptation fail? Trends in Ecology & Evolution, 22, 140–7.
Brown AHD, Feldman MW, Nevo E (1980) Multilocus structure of natural
populations of Hordeum spontaneum. Genetics, 96, 523–536.
Bryc K, Auton A, Nelson MR, et al. (2010) Genome-wide patterns of population
structure and admixture in West Africans and African Americans. Proceedings
of the National Academy of Sciences, 107, 786–91.
Buehler D, Graf R, Holderegger R, Gugerli F (2012) Contemporary gene flow and
mating system of Arabis alpina in a Central European alpine landscape.
Annals of Botany, 109, 1359–1367.
Buerkle CA, Gompert Z, Parchman TL (2011) The n = 1 constraint in population
genomics. Molecular Ecology, 20, 1575–81.
Burke J, Voss T (1998) Genetic interactions and natural selection in Louisiana iris
hybrids. Evolution, 52, 1304–1310.
Byars SG, Papst W, Hoffmann AA (2007) Local adaptation and cogradient
selection in the alpine plant, Poa hiemata, along a narrow altitudinal gradient.
Evolution, 61, 2925–41.
Caballero A, Quesada H, Rolán-Alvarez E (2008) Impact of amplified fragment
length polymorphism size homoplasy on the estimation of population genetic
diversity and the detection of selective loci. Genetics, 179, 539–54.
Carter AJ, Robinson ER (1993) Genetic structure of a population of the clonal grass
Setaria incrassata. Biological Journal of the Linnean Society, 48, 55–62.
Casper B, Jackson RB (1997) Plant competition underground. Annual Review of
Ecology & Systematics, 1997, 545–570.
Chapman MA, Abbott RJ (2010) Introgression of fitness genes across a ploidy
barrier. The New Phytologist, 186, 63–71.
Chapman HM, Parh D, Oraguzie N (2000) Genetic structure and colonizing success
of a clonal, weedy species, Pilosella officinarum (Asteraceae). Heredity, 84,
401–409.
Charlesworth B (2010) Molecular population genomics: a short history. Genetics
Research, 92, 397–411.
Charlesworth B, Nordborg M, Charlesworth D (1997) The effects of local selection,
balanced polymorphism and background selection on equilibrium patterns of
genetic diversity in subdivided populations. Genetics Research, 70, 155–74.
Chinnappa C, Donald G, Sasidharan R, Emery RN (2005) The biology of Stellaria
longipes (Caryophyllaceae). Botany, 83, 1367–1383.
Clark LV, Jasieniuk M (2011) POLYSAT: an R package for polyploid
microsatellite analysis. Molecular Ecology Resources, 11, 562–6.
Cox K, Broeck AV (2011) Temperature related natural selection in a wind
pollinated tree across regional and continental scales. Molecular Ecology, 20,
2724–38.
Van Der Hulst RGM, Mes THM, Falque M, et al. (2003) Genetic structure of a
population sample of apomictic dandelions. Heredity, 90, 326–35.
Derome N, Bougas B, Rogers SM, et al. (2008) Pervasive sex-linked effects on
transcription regulation as revealed by expression quantitative trait loci
mapping in lake whitefish species pairs (Coregonus sp., Salmonidae).
Genetics, 179, 1903–17.
Dobzhansky T (1957) An experimental study of interaction between genetic drift
and natural selection. Evolution, 11, 311–319.
Egan SP, Nosil P, Funk DJ (2008) Selection and genomic differentiation during
ecological speciation: isolating the contributions of host association via a
comparative genome scan of Neochlamisus bebbianae leaf beetles. Evolution,
62, 1162–81.
Emery R, Chinnappa C (1994) Specialization, plant strategies, and phenotypic
plasticity in populations of Stellaria longipes along an elevational gradient.
International Journal of Plant Sciences, 155, 203–219.
Epperson BK (2007) Plant dispersal, neighbourhood size and isolation by distance.
Molecular Ecology, 16, 3854–65.
Esselink GD, Nybom H, Vosman B (2004) Assignment of allelic configuration in
polyploids using the MAC-PR (microsatellite DNA allele counting-peak
ratios) method. TAG. Theoretical & Applied Genetics, 109, 402–8.
Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of
individuals using the software STRUCTURE: a simulation study. Molecular
Ecology, 14, 2611–20.
Excoffier L, Hofer T, Foll M (2009) Detecting loci under selection in a
hierarchically structured population. Heredity, 103, 285–98.
Falush D, Stephens M, Pritchard JK (2003) Inference of population structure using
multilocus genotype data: linked loci and correlated allele frequencies.
Genetics, 164, 1567–87.
Falush D, Stephens M, Pritchard JK (2007) Inference of population structure using
multilocus genotype data: dominant markers and null alleles. Molecular
Ecology Notes, 7, 574–578.
Feder JL, Nosil P (2010) The efficacy of divergence hitchhiking in generating
genomic islands during ecological speciation. Evolution, 64, 1729–1747.
Felsenstein J (1976) The theoretical population genetics of variable selection and
migration. Annual Review of Genetics, 10, 253–280.
Fischer MC, Foll M, Excoffier L, Heckel G (2011) Enhanced AFLP genome scans
detect local adaptation in high-altitude populations of a small rodent (Microtus
arvalis). Molecular Ecology, 20, 1450–62.
Foll M, Fischer MC, Heckel G, Excoffier L (2010) Estimating population structure
from AFLP amplification intensity. Molecular Ecology, 19, 4638–47.
Foll M, Gaggiotti O (2008) A genome-scan method to identify selected loci
appropriate for both dominant and codominant markers: a Bayesian
perspective. Genetics, 180, 977–93.
Forstmeier W, Schielzeth H, Mueller JC, Ellegren H, Kempenaers B (2012)
Heterozygosity-fitness correlations in zebra finches: microsatellite markers can
be better than their reputation. Molecular Ecology, 21, 3237–49.
Freedman AH, Thomassen HA, Buermann W, Smith TB (2010) Genomic signals of
diversification along ecological gradients in a tropical lizard. Molecular
Ecology, 19, 3773–88.
Funk DJ, Egan SP, Nosil P (2011) Isolation by adaptation in Neochlamisus leaf
beetles: host-related selection promotes neutral genomic divergence.
Molecular Ecology, 20, 4671–82.
Gagnaire PA, Albert V, Jónsson B, Bernatchez L (2009) Natural selection
influences AFLP intraspecific genetic variability and introgression patterns in
Atlantic eels. Molecular Ecology, 18, 1678–91.
Gaudeul M, Till-Bottraud I, Barjon F, Manel S (2004) Genetic diversity and
differentiation in Eryngium alpinum L. (Apiaceae): comparison of AFLP and
microsatellite markers. Heredity, 92, 508–18.
Gavrilets S, Hastings A (2012) Founder Effect Speciation: A Theoretical
Reassessment. The American Naturalist, 147, 466–491.
Gonzalo-Turpin H, Hazard L (2009) Local adaptation occurs along altitudinal
gradient despite the existence of gene flow in the alpine plant species Festuca
eskia. Journal of Ecology, 97, 742–751.
Gort G, Koopman WJM, Stein A (2006) Fragment length distributions and collision
probabilities for AFLP markers. Biometrics, 62, 1107–15.
Grubbs KC, Small RL, Schilling EE (2009) Evidence for multiple, autoploid origins
of agamospermous populations in Eupatorium sessilifolium (Asteraceae).
Plant Systematics and Evolution, 279, 151–161.
Hadany L (2003) Adaptive peak shifts in a heterogenous environment. Theoretical
Population Biology, 63, 41–51.
Hager R, Cheverud JM, Wolf JB (2009) Relative contribution of additive,
dominance, and imprinting effects to phenotypic variation in body size and
growth between divergent selection lines of mice. Evolution, 63, 1118–28.
Hamilton MB (2009) Population Genetics. John Wiley and Sons, Chichester, UK.
Haubold B, Travisano M, Rainey PB, Hudson RR (1998) Detecting linkage
disequilibrium in bacterial populations. Genetics, 150, 1341–8.
Heimburger M (1959) Cytotaxonomic studies in the genus Anemone. Canadian
Journal of Botany, 37, 587–612.
Heimburger M, Boraiah G (1964) Genome relationships of Anemone multifida.
Canadian Journal of Genetics & Cytology, 6, 529–539.
Hofer T, Ray N, Wegmann D, Excoffier L (2009) Large allele frequency
differences between human continental groups are more likely to have
occurred by drift during range expansions than by selection. Annals of Human
Genetics, 73, 95–108.
Honnay O, Jacquemyn H, Van Looy K, Vandepitte K, Breyne P (2009) Temporal
and spatial genetic variation in a metapopulation of the annual Erysimum
cheiranthoides on stony river banks. Journal of Ecology, 97, 131–141.
Hoot SB, Reznicek AA, Palmer JD (2012) Phylogenetic relationships in Anemone
(Ranunculaceae) based on morphology and chloroplast DNA. Systematic
Botany, 19, 169–200.
Hoskin CJ, Higgie M, McDonald KR, Moritz C (2005) Reinforcement drives rapid
allopatric speciation. Nature, 437, 1353–6.
Huang CC, Hung KH, Hwang CC, et al. (2011) Genetic population structure of the
alpine species Rhododendron pseudochrysanthum sensu lato (Ericaceae)
inferred from chloroplast and nuclear DNA. BMC Evolutionary Biology, 11,
108.
Hubisz MJ, Falush D, Stephens M, Pritchard JK (2009) Inferring weak population
structure with the assistance of sample group information. Molecular Ecology
Resources, 9, 1322–32.
Hughes C, Eastwood R (2006) Island radiation on a continental scale: exceptional
rates of plant diversification after uplift of the Andes. Proceedings of the
National Academy of Sciences, 103, 10334–9.
Ikeda H, Setoguchi H (2010) Natural selection on PHYE by latitude in the Japanese
archipelago: insight from locus specific phylogeographic structure in Arcterica
nana (Ericaceae). Molecular Ecology, 19, 2779–91.
Kauffman S (1987) Towards a general theory of adaptive walks on rugged
landscapes. Journal of Theoretical Biology, 128, 11–45.
Khanuja SPS, Shasany AK, Darokar MP, Kumar S (1999) Rapid isolation of DNA
from dry and fresh samples of plants producing large amounts of secondary
metabolites and essential oils. Plant Molecular Biology Reporter, 17, 1–7.
Kim Y, Nielsen R (2004) Linkage disequilibrium as a signature of selective sweeps.
Genetics, 167, 1513–24.
Kimura M (1983) The Neutral Theory of Molecular Evolution. Cambridge University Press,
Cambridge.
Kingsolver JG, Hoekstra HE, Hoekstra JM, et al. (2001) The strength of phenotypic
selection in natural populations. The American Naturalist, 157, 245–61.
Klopfstein S, Currat M, Excoffier L (2006) The fate of mutations surfing on the
wave of a range expansion. Molecular Biology and Evolution, 23, 482–90.
Korner C (2003) Alpine Plant Life. New York.
Kuehne HA, Murphy HA, Francis CA, Sniegowski PD (2007) Allopatric
divergence, secondary contact, and genetic isolation in wild yeast populations.
Current Biology, 17, 407–11.
Lai Z, Nakazato T, Salmaso M, et al. (2005) Extensive chromosomal repatterning
and the evolution of sterility barriers in hybrid sunflower species. Genetics,
171, 291–303.
Leitch I, Bennett M (2004) Genome downsizing in polyploid plants. Biological
Journal of the Linnean Society, 82, 651–663.
Leitch AR, Leitch IJ (2008) Genomic plasticity and the diversity of polyploid
plants. Science, 320, 481–483.
Lenormand T (2002) Gene flow and the limits to natural selection. Trends in
Ecology & Evolution, 17, 183–189.
Lewontin R (1974) The Genetic Basis of Evolutionary Change. Columbia
University Press, New York.
Lewontin RC, Krakauer J (1973) Distribution of gene frequency as a test of the
theory of the selective neutrality of polymorphisms. Genetics, 74, 175–195.
Lexer C, Welch ME, Raymond O, Rieseberg LH (2003) The origin of ecological
divergence in Helianthus paradoxus (Asteraceae): selection on transgressive
characters in a novel hybrid habitat. Evolution, 57, 1989–2000.
Lo EYY, Stefanovic S, Dickinson T (2009) Population genetic structure of diploid
sexual and polyploid apomictic hawthorns (Crataegus; Rosaceae) in the
Pacific Northwest. Molecular Ecology, 18, 1145–1160.
Lowry DB, Hall MC, Salt DE, Willis JH (2009) Genetic and physiological basis of
adaptive salt tolerance divergence between coastal and inland Mimulus
guttatus. The New Phytologist, 183, 776–88.
Luikart G, England PR, Tallmon D, Jordan S, Taberlet P (2003) The power and
promise of population genomics: from genotyping to genome typing. Nature
Reviews Genetics, 4, 981–94.
Mackay T, Stone E (2009) The genetics of quantitative traits: challenges and
prospects. Nature Reviews Genetics, 10, 565–577.
Maruyama T, Fuerst PA (1985) Population bottlenecks and nonequilibrium models
in population genetics. II. Number of alleles in a small population that was
formed by a recent bottleneck. Genetics, 111, 675–689.
Meirmans PG, Goudet J, Gaggiotti OE (2011) Ecology and life history affect
different aspects of the population structure of 27 high-alpine plants.
Molecular Ecology, 20, 3144–55.
Meirmans PG, Vlot EC, Den Nijs JCM, Menken SBJ (2003) Spatial ecological and
genetic structure of a mixed population of sexual diploid and apomictic
triploid dandelions. Journal of Evolutionary Biology, 16, 343–52.
Meudt HM, Clarke AC (2007) Almost forgotten or latest practice? AFLP
applications, analyses and advances. Trends in Plant Science, 12, 106–17.
Meyer KM, Hoot SB, Arroyo MTK (2010) Phylogenetic Affinities of South
American Anemone (Ranunculaceae), including the Endemic Segregate
Genera, Barneoudia and Oreithales. International Journal of Plant Sciences,
171, 323–331.
Michel A, Sim S, Powell T (2010) Widespread genomic divergence during
sympatric speciation. Proceedings of the National Academy of Sciences, 107,
9724–9729.
Minder AM, Widmer A (2008) A population genomic analysis of species
boundaries: neutral processes, adaptive divergence and introgression between
two hybridizing plant species. Molecular Ecology, 17, 1552–63.
Mráz P, Gaudeul M, Rioux D, et al. (2007) Genetic structure of Hypochaeris
uniflora (Asteraceae) suggests vicariance in the Carpathians and rapid
post-glacial colonization of the Alps from an eastern Alpine refugium. Journal of
Biogeography, 34, 2100–2114.
Nichols KM, Edo AF, Wheeler PA, Thorgaard GH (2008) The genetic basis of
smoltification-related traits in Oncorhynchus mykiss. Genetics, 179, 1559–75.
Nicotra AB, Atkin OK, Bonser SP, et al. (2010) Plant phenotypic plasticity in a
changing climate. Trends in Plant Science, 15, 684–92.
Nosil P, Funk DJ, Ortiz-Barrientos D (2009) Divergent selection and heterogeneous
genomic divergence. Molecular Ecology, 18, 375–402.
Nosil P, Vines T, Funk DJ (2005) Reproductive isolation caused by natural selection
against immigrants from divergent habitats. Evolution, 59, 705–719.
Nunes VL, Beaumont MA, Butlin RK, Paulo OS (2011) Multiple approaches to
detect outliers in a genome scan for selection in ocellated lizards (Lacerta
lepida) along an environmental gradient. Molecular Ecology, 20, 193–205.
Orsini L, Spanier KI, De Meester L (2012) Genomic signature of natural and
anthropogenic stress in wild populations of the waterflea Daphnia magna:
validation in space, time and experimental evolution. Molecular Ecology, 21,
2160–2175.
Osborn TC, Pires JC, Birchler JA, et al. (2003) Understanding mechanisms of
novel gene expression in polyploids. Trends in Genetics, 19, 141–147.
Otto S, Whitton J (2000) Polyploid incidence and evolution. Annual Review of
Genetics, 34, 401–437.
Paris M, Boyer S, Bonin A, et al. (2010) Genome scan in the mosquito Aedes
rusticus: population structure and detection of positive selection after
insecticide treatment. Molecular Ecology, 19, 325–37.
Paris M, Despres L (2012) Identifying insecticide resistance genes in mosquito by
combining AFLP genome scans and 454 pyrosequencing. Molecular Ecology,
1672–1686.
Pavey SA, Collin H, Nosil P, Rogers SM (2010) The role of gene expression in
ecological speciation. Annals of the New York Academy of Sciences, 1206,
110–29.
Peakall R, Smouse PE (2006) GenAlEx 6: genetic analysis in excel. Population
genetic software for teaching and research. Molecular Ecology Notes, 6, 288–
295.
Peichel CL, Nereng KS, Ohgi KA, et al. (2001) The genetic architecture of
divergence between threespine stickleback species. Nature, 414, 901–5.
Pinceel J, Jordaens K, Pfenninger M, Backeljau T (2005) Rangewide
phylogeography of a terrestrial slug in Europe: evidence for Alpine refugia and
rapid colonization after the Pleistocene glaciations. Molecular Ecology, 14,
1133–50.
Poncet BN, Herrmann D, Gugerli F, et al. (2010) Tracking genes of ecological
relevance using a genome scan in two independent regional population
samples of Arabis alpina. Molecular Ecology, 19, 2896–907.
Presgraves DC, Balagopalan L, Abmayr SM, Orr HA (2003) Adaptive evolution
drives divergence of a hybrid inviability gene between two species of
Drosophila. Nature, 423, 715–9.
Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure
using multilocus genotype data. Genetics, 155, 945–59.
Pusadee T, Jamjod S, Chiang Y-C, Rerkasem B, Schaal BA (2009) Genetic
structure and isolation by distance in a landrace of Thai rice. Proceedings of
the National Academy of Sciences, 106, 13880–5.
Renaut S, Maillet N, Normandeau E, et al. (2012) Genome-wide patterns of
divergence during speciation: the lake whitefish case study. Philosophical
Transactions of the Royal Society B, 367, 354–63.
Rieseberg LH, Kim S-C, Randell RA, et al. (2007) Hybridization and the
colonization of novel habitats by annual sunflowers. Genetica, 129, 149–65.
Roberts T (2006) Multiple levels of allopatric divergence in the endemic Philippine
fruit bat Haplonycteris fischeri (Pteropodidae). Biological Journal of the
Linnean Society, 88, 329–349.
Rogers SM, Bernatchez L (2005) Integrating QTL mapping and genome scans
towards the characterization of candidate loci under parallel selection in the
lake whitefish (Coregonus clupeaformis). Molecular Ecology, 14, 351–61.
Rogers SM, Bernatchez L (2006) The genetic basis of intrinsic and extrinsic
postzygotic reproductive isolation jointly promoting speciation in the lake
whitefish species complex (Coregonus clupeaformis). Journal of Evolutionary
Biology, 19, 1979–94.
Rogers SM, Bernatchez L (2007) The genetic architecture of ecological speciation
and the association with signatures of selection in natural lake whitefish
(Coregonus sp. Salmonidae) species pairs. Molecular Biology and Evolution,
24, 1423–38.
Rogers SM, Isabel N, Bernatchez L (2007) Linkage maps of the dwarf and normal
lake whitefish (Coregonus clupeaformis) species complex and their hybrids
reveal the genetic architecture of population divergence. Genetics, 175, 375–
98.
Ronfort J, Jenczewski E, Bataillon T, Rousset F (1998) Analysis of population
structure in autotetraploid species. Genetics, 150, 921–30.
Ryman N, Jorde PE (2001) Statistical power when testing for genetic
differentiation. Molecular Ecology, 10, 2361–73.
Schielzeth H, Kempenaers B, Ellegren H (2012) QTL linkage mapping of Zebra
finch beak color shows an oligogenic control of a sexually selected trait.
Evolution, 66, 18–30.
Schluter D (2001) Ecology and the origin of species. Trends in Ecology &
Evolution, 16, 372–380.
Schluter D, Marchinko KB, Barrett RDH, Rogers SM (2010) Natural selection and
the genetics of adaptation in threespine stickleback. Philosophical Transactions
of the Royal Society of London B, 365, 2479–86.
Schönswetter P, Paun O, Tribsch A, Niklfeld H (2003) Out of the Alps:
colonization of Northern Europe by East Alpine populations of the Glacier
Buttercup Ranunculus glacialis L. (Ranunculaceae). Molecular Ecology, 12,
3373–3381.
Sharbel TF, Haubold B, Mitchell-Olds T (2000) Genetic isolation by distance in
Arabidopsis thaliana: biogeography and postglacial colonization of Europe.
Molecular Ecology, 9, 2109–18.
Skrede I, Borgen L, Brochmann C (2009) Genetic structuring in three closely
related circumpolar plant species: AFLP versus microsatellite markers and
high-arctic versus arctic-alpine distributions. Heredity, 102, 293–302.
Smith JM, Smith NH, O’Rourke M, Spratt BG (1993) How clonal are bacteria?
Proceedings of the National Academy of Sciences, 90, 4384–8.
Soltis DE, Albert VA, Leebens-Mack J, et al. (2009) Polyploidy and angiosperm
diversification. American Journal of Botany, 96, 336–48.
Soltis D, Soltis P (1999) Polyploidy: recurrent formation and genome evolution.
Trends in Ecology & Evolution, 14, 348–352.
Soltis PS, Soltis DE (2000) The role of genetic and genomic attributes in the
success of polyploids. Proceedings of the National Academy of Sciences, 97,
7051–7.
Soltis D, Soltis P, Pires J (2004) Recent and recurrent polyploidy in Tragopogon
(Asteraceae): cytogenetic, genomic and genetic comparisons. Biological
Journal of the Linnean Society, 82, 485–501.
Stapley J, Reger J, Feulner PGD, et al. (2010) Adaptation genomics: the next
generation. Trends in Ecology & Evolution, 25, 705–12.
Stinchcombe JR, Hoekstra HE (2008) Combining population genomics and
quantitative genetics: finding the genes underlying ecologically important
traits. Heredity, 100, 158–70.
Storz J (2005) Using genome scans of DNA polymorphism to infer adaptive
population divergence. Molecular Ecology, 14, 671–688.
Storz JF, Wheat CW (2010) Integrating evolutionary and functional approaches to
infer adaptation at specific loci. Evolution, 64, 2489–509.
Strasburg JL, Sherman NA, Wright KM, et al. (2012) What can patterns of
differentiation across plant genomes tell us about adaptation and speciation?
Philosophical Transactions of the Royal Society of London B, 367, 364–73.
Surget-Groba Y, Johansson H, Thorpe RS (2012) Synergy between allopatry and
ecology in population differentiation and speciation. International Journal of
Ecology, 2012, 1–10.
Svedin N, Wiley C, Veen T, Gustafsson L, Qvarnström A (2008) Natural and
sexual selection against hybrid flycatchers. Proceedings of the Royal Society
B, 275, 735–44.
Symonds VV, Soltis PS, Soltis DE (2010) Dynamics of polyploid formation in
Tragopogon (Asteraceae): recurrent formation, gene flow, and population
structure. Evolution, 64, 1984–2003.
R Development Core Team (2008) R: A language and environment for statistical
computing. R Foundation for Statistical Computing.
Thompson JD (1991) Phenotypic plasticity as a component of evolutionary change.
Trends in Ecology & Evolution, 6, 246–9.
Tice KA, Carlon DB (2011) Can AFLP genome scans detect small islands of
differentiation? The case of shell sculpture variation in the periwinkle
Echinolittorina hawaiiensis. Journal of Evolutionary Biology, 24, 1814–25.
Vekemans X, Beauwens T, Lemaire M, Roldán-Ruiz I (2002) Data from amplified
fragment length polymorphism (AFLP) markers show indication of size
homoplasy and of a relationship between degree of homoplasy and fragment
size. Molecular Ecology, 11, 139–51.
Via S, West J (2008) The genetic mosaic suggests a new role for hitchhiking in
ecological speciation. Molecular Ecology, 17, 4334–45.
Wendel JF (2000) Genome evolution in polyploids. Plant Molecular Biology, 42,
225–49.
White TA, Stamford J, Rus Hoelzel A (2010) Local selection and population
structure in a deep-sea fish, the roundnose grenadier (Coryphaenoides
rupestris). Molecular Ecology, 19, 216–26.
Whiteley AR, Derome N, Rogers SM, et al. (2008) The phenomics and expression
quantitative trait locus mapping of brain transcriptomes regulating adaptive
divergence in lake whitefish species pairs (Coregonus sp.). Genetics, 180,
147–64.
Whitlock MC, Guillaume F (2009) Testing for spatially divergent selection:
comparing QST to FST. Genetics, 183, 1055–63.
Whitney KD, Randell RA, Rieseberg LH (2006) Adaptive introgression of herbivore
resistance traits in the weedy sunflower Helianthus annuus. The American
Naturalist, 167.
Whitney KD, Randell RA, Rieseberg LH (2010) Adaptive introgression of abiotic
tolerance traits in the sunflower Helianthus annuus. The New Phytologist, 187,
230–9.
Willi Y, Van Buskirk J, Hoffmann AA (2006) Limits to the Adaptive Potential of
Small Populations. Annual Review of Ecology, Evolution, & Systematics, 37,
433–458.
Wu LL, Cui XK, Milne RI, Sun Y-S, Liu JQ (2010) Multiple autopolyploidizations
and range expansion of Allium przewalskianum Regel. (Alliaceae) in the
Qinghai-Tibetan Plateau. Molecular Ecology, 19, 1691–704.
Zhivotovsky LA (1999) Estimating population structure in diploids with multilocus
dominant DNA markers. Molecular Ecology, 8, 907–13.
Appendix A: Supplementary Data and Methods
Cluster dendrograms were generated using the outlier and neutral AFLP data to
assist in visualizing genetic population structure. This analysis is presented in the
appendix because the dendrograms were largely consistent with the PCA and mostly redundant.
Figure A1. Cluster dendrogram of individuals using Euclidean distance based on the AFLP
genotype data at the outlier loci. Individuals from lowland populations are colour coded
yellow, orange and red, while individuals from alpine populations are coded light blue and
dark blue. Cluster dendrograms were then generated using Euclidean distance to further
estimate the degree and nature of any clustering within the outlier and neutral genetic data
and to further characterize any apparent outlier individuals.
Figure A2. Cluster dendrogram of individuals using Euclidean distance based on the AFLP
genotype data at the neutral loci. Individuals from lowland populations are colour coded
yellow, orange and red, while individuals from alpine populations are coded light blue and
dark blue.
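The clustering behind Figures A1 and A2 can be illustrated with a minimal sketch. It computes Euclidean distances between band presence/absence profiles and merges the closest clusters step by step. The average-linkage rule, the function names and the toy profiles are assumptions made for illustration only; the captions state that Euclidean distance was used but do not specify the linkage method or software.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length band profiles (0 = absent, 1 = present)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Agglomerative clustering with average linkage (an assumed linkage rule).

    profiles: dict mapping an individual's label to its 0/1 band profile.
    Returns the merges as (members_a, members_b, height), in the order performed.
    """
    clusters = [(label,) for label in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average linkage: mean distance over all between-cluster pairs.
            d = sum(euclidean(profiles[i], profiles[j]) for i in a for j in b)
            d /= len(a) * len(b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


# Hypothetical toy profiles, not data from the thesis.
toy = {
    "alpine_1": [1, 1, 0, 0, 1, 0],
    "alpine_2": [1, 1, 0, 0, 1, 1],
    "lowland_1": [0, 0, 1, 1, 0, 1],
    "lowland_2": [0, 0, 1, 1, 0, 0],
}
merges = average_linkage(toy)
for members_a, members_b, height in merges:
    print(members_a, members_b, round(height, 3))
```

In this toy case the two alpine and the two lowland profiles join first, and the two groups join last at a greater height, which is the kind of pattern a dendrogram makes visible.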
Table A1. Linkage disequilibrium analysis in Multilocus following Agapow & Burt
(2001). The index of association was calculated for pairwise comparisons
between all outlier loci, with IA ≠ 0 indicating a statistically significant
association (linkage) between two loci (Agapow & Burt 2001). There were no
significant associations amongst any outlier loci.
Comparison    Observed Index of Association    p-value (p > x)
1&2           -0.013                           0.87
1&3           -0.018                           0.87
1&4           -0.007                           0.87
1&5           0.065                            0.87
1&6           0.163                            0.87
1&7           0.074                            0.87
1&8           -0.005                           0.87
1&9           -0.012                           0.87
1&10          -0.007                           0.87
1&11          -0.017                           0.87
1&12          -0.029                           0.87
1&13          0.063                            0.87
2&3           -0.092                           0.49
2&4           -0.072                           0.49
2&5           0.031                            0.49
2&6           -0.020                           0.49
2&7           0.036                            0.49
2&8           0.001                            0.49
2&9           -0.030                           0.49
2&10          0.022                            0.49
2&11          -0.038                           0.49
2&12          -0.097                           0.49
2&13          -0.017                           0.49
3&4           0.522                            0.67
3&5           -0.092                           0.67
3&6           -0.032                           0.67
3&7           0.000                            0.67
3&8           0.154                            0.67
3&9           0.335                            0.67
3&10          0.168                            0.67
3&11          0.598                            0.67
3&12          0.589                            0.67
3&13          -0.006                           0.67
4&5           -0.081                           0.21
4&6           0.011                            0.21
4&7           0.015                            0.21
4&8           0.179                            0.21
4&9           0.254                            0.21
4&10          0.113                            0.21
4&11          0.388                            0.21
4&12          0.356                            0.21
4&13          -0.024                           0.21
5&6           0.028                            0.17
5&7           -0.003                           0.17
5&8           0.017                            0.17
5&9           -0.029                           0.17
5&10          0.035                            0.17
5&11          -0.051                           0.17
5&12          -0.097                           0.17
5&13          0.026                            0.17
6&7           0.016                            0.44
6&8           0.003                            0.44
6&9           0.009                            0.44
6&10          0.003                            0.44
6&11          -0.022                           0.44
6&12          -0.049                           0.44
6&13          0.031                            0.44
7&8           -0.003                           0.55
7&9           -0.007                           0.55
7&10          0.000                            0.55
7&11          -0.007                           0.55
7&12          0.004                            0.55
7&13          0.022                            0.55
8&9           0.128                            0.62
8&10          0.104                            0.62
8&11          0.131                            0.62
8&12          0.125                            0.62
8&13          -0.003                           0.62
9&10          0.074                            0.79
9&11          0.277                            0.79
9&12          0.164                            0.79
9&13          -0.004                           0.79
10&11         0.169                            0.61
10&12         0.136                            0.61
10&13         0.007                            0.61
11&12         0.410                            0.9
11&13         0.033                            0.9
12&13         -0.030                           0.26
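The index of association reported in Table A1 can be illustrated with a short sketch. It follows the usual definition I_A = V_O / V_E - 1, where V_O is the observed variance of pairwise distances between individuals and V_E is the sum of the per-locus variances (the variance expected without association). The function names, the toy genotypes and the one-sided permutation test are assumptions for illustration; they are not taken from the Multilocus software and may not match the settings used for Table A1.

```python
from itertools import combinations
import random


def variance(values):
    """Population variance (divides by n); the divisor cancels in the I_A ratio."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def index_of_association(genotypes):
    """I_A = V_O / V_E - 1 for a list of equal-length 0/1 band profiles.

    Raises ZeroDivisionError if every locus is monomorphic (V_E = 0).
    """
    pairs = list(combinations(genotypes, 2))
    n_loci = len(genotypes[0])
    # Per-locus distance for each pair of individuals: 1 if they differ, else 0.
    per_locus = [[int(a[j] != b[j]) for a, b in pairs] for j in range(n_loci)]
    # Total distance for each pair, summed over loci.
    totals = [sum(col[k] for col in per_locus) for k in range(len(pairs))]
    v_obs = variance(totals)
    v_exp = sum(variance(col) for col in per_locus)
    return v_obs / v_exp - 1


def permutation_p_value(genotypes, n_perm=999, seed=0):
    """Share of randomized data sets with I_A at least as large as observed.

    Each locus is shuffled among individuals independently, which removes
    associations between loci while keeping band frequencies unchanged.
    """
    rng = random.Random(seed)
    observed = index_of_association(genotypes)
    n_loci = len(genotypes[0])
    hits = 0
    for _ in range(n_perm):
        columns = []
        for j in range(n_loci):
            col = [g[j] for g in genotypes]
            rng.shuffle(col)
            columns.append(col)
        shuffled = [list(row) for row in zip(*columns)]
        if index_of_association(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical two-locus examples, not data from the thesis.
linked = [[1, 1], [1, 1], [0, 0], [0, 0]]      # bands always co-occur
unlinked = [[1, 1], [1, 0], [0, 1], [0, 0]]    # every combination occurs once
print(round(index_of_association(linked), 3))
print(round(index_of_association(unlinked), 3))
print(permutation_p_value(linked, n_perm=99, seed=1))
```

With very small samples the index can be negative even when loci are unassociated, as in the second toy example, so the negative values in Table A1 are not unusual.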
Table A2. Phenotype data for each individual, with the population from which each
individual was sampled. Measured phenotypes were plant height (in cm) and floral
colour.
Individual    Population    Height (cm)    Floral Colour
BHS11         BHS           40.3           pink
BHS13         BHS           31.6           red
BHS14         BHS           39.8           red
BHS16         BHS           21.3           white
BHS17         BHS           23.2           pink
BHS18         BHS           32.5           white
BHS2          BHS           32.4           red
BHS20         BHS           23.6           white
BHS21         BHS           22.3           white
BHS22         BHS           17.3           white
BHS23         BHS           30.4           pink
BHS27         BHS           33.3           white
BHS28         BHS           24.3           red
BHS32         BHS           29.3           white
BHS34         BHS           29.1           pink
BHS35         BHS           24.2           pink
BHS36         BHS           27.5           white
BHS4          BHS           26.7           red
BHS40         BHS           23.7           red
BHS44         BHS           25.8           red
BHS46         BHS           34.4           red
BHS49         BHS           28.1           red
BHS5          BHS           29.6           pink
BHS6          BHS           15.3           red
BL1           BL            26.5           white
BL10          BL            27.9           white
BL11          BL            33.5           red
BL12          BL            19.3           white
BL14          BL            29.3           white
BL15          BL            19.1           white
BL17          BL            41.7           white
BL18          BL            26.3           white
BL19          BL            20.8           pink
BL2           BL            20.2           white
BL20          BL            38.6           white
BL21          BL            29.1           red
BL28          BL            27.2           white
BL30          BL            35.4           white
BL32          BL            27.1           white
BL36          BL            16.3           white
BL37          BL            25             white
BL39          BL            17.5           red
BL4           BL            33.7           white
BL42          BL            22.8           red
BL45          BL            31.4           pink
BL49          BL            30.1           red
BL50          BL            35.4           white
BL52          BL            30.2           white
BL8           BL            24.3           pink
HSB1          HSB           22.3           pink
HSB10         HSB           15.1           white
HSB11         HSB           20             white
HSB12         HSB           25.8           white
HSB13         HSB           23.7           white
HSB14         HSB           15.4           white
HSB15         HSB           15.5           white
HSB16         HSB           25.5           white
HSB17         HSB           14.3           white
HSB18         HSB           25.3           white
HSB2          HSB           22.2           white
HSB20         HSB           20.8           white
HSB21         HSB           26.7           white
HSB22         HSB           13.3           white
HSB23         HSB           18.4           white
HSB24         HSB           17.3           white
HSB25         HSB           18.5           red
HSB26         HSB           24.3           white
HSB27         HSB           25.5           white
HSB28         HSB           19.8           pink
HSB29         HSB           14.3           white
HSB3          HSB           14.5           white
HSB30         HSB           20             white
HSB4          HSB           24.5           white
HSB5          HSB           23.7           white
HSB6          HSB           23             red
HSB7          HSB           27.8           white
HSB8          HSB           15.4           white
HSB9          HSB           29             white
HWP1          HWP           21.8           white
HWP10         HWP           26.9           white
HWP11         HWP           30.3           white
HWP12         HWP           31.7           white
HWP13         HWP           25.1           white
HWP14         HWP           30.9           white
HWP15         HWP           26.4           white
HWP16         HWP           27.2           white
HWP18         HWP           20.1           white
HWP19         HWP           23.4           white
HWP20         HWP           20.6           white
HWP21         HWP           24.5           white
HWP22         HWP           22.8           white
HWP23         HWP           21.4           white
HWP24         HWP           20             white
HWP25         HWP           27.9           white
HWP26         HWP           27.6           white
HWP3          HWP           23             white
HWP4          HWP           20.5           white
HWP5          HWP           21.3           white
HWP6          HWP           31.8           white
HWP6          HWP           31.1           white
HWP8          HWP           26.9           white
HWP9          HWP           23.9           white
WC10          WC            31.5           red
WC11          WC            17.1           white
WC12          WC            32.1           white
WC17          WC            25.9           red
WC18          WC            36.2           white
WC2           WC            30.4           pink
WC22          WC            25.4           white
WC23          WC            29.7           white
WC25          WC            26.3           white
WC28          WC            22.4           red
WC3           WC            30.8           white
WC30          WC            27.9           white
WC31          WC            32.2           white
WC36          WC            31.4           white
WC4           WC            42.4           white
WC42          WC            24.1           white
WC45          WC            23.4           white
WC49          WC            19.4           red
WC7           WC            36.5           white
WC8           WC            28.3           red
WC9           WC            27.3           red
Appendix B: AFLP Protocol Taken from the AFLP Plant Mapping Protocol for
Regular Plant Genomes (Applied Biosystems)
Restriction-Ligation:
1. From the AFLP Ligation and Preselective Amplification Module, remove
the tubes labeled MseI Adaptor Pair and EcoRI Adaptor Pair.
2. Heat tubes in a water bath at 95 °C for 5 minutes.
3. Allow tubes to cool to room temperature over a 10-minute period.
4. Spin in a microcentrifuge for 10 seconds at 1400 × g (maximum).
5. Combine the following in a sterile 0.5 mL microcentrifuge tube:
a. 10 µL 10X T4 DNA ligase buffer with ATP
b. 10 µL 0.5 M NaCl
c. 5 µL 1 mg/mL BSA (diluted from 10 mg/mL stock)
d. 100 Units MseI
e. 500 Units EcoRI
f. 100 Weiss Units T4 DNA Ligase
6. Add sterile distilled water to bring the total volume to 100 µL.
7. Mix gently.
8. Spin down in a microcentrifuge for 10 seconds.
9. Store on ice until ready to aliquot into individual reaction tubes.
10. Combine the following in a sterile 0.5-mL microcentrifuge tube:
a. 1.0 µL 10X T4 DNA ligase buffer that includes ATP
b. 1.0 µL 0.5M NaCl
c. 0.5 µL 1.0 mg/mL BSA (dilute from 10 mg/mL if necessary)
d. 1.0 µL MseI adaptor
e. 1.0 µL EcoRI adaptor
f. 1.0 µL Enzyme Master Mix
11. Add 0.5 µg genomic DNA in 5.5 µL sterile distilled water.
12. Mix thoroughly, then spin down in a microcentrifuge for 10 seconds.
13. Incubate at room temperature overnight.
14. Add 189 µL of TE0.1 buffer to each restriction-ligation reaction.
15. Mix thoroughly.
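As a check on the volumes in steps 10 to 14, the following Python sketch sums the per-reaction components and computes the dilution that results from adding TE0.1 buffer. The component volumes are copied from the steps above. The function names and the 10% pipetting overage are illustrative assumptions and are not part of the Applied Biosystems protocol.

```python
# Per-reaction volumes in µL, copied from steps 10 and 11 above.
DNA_KEY = "genomic DNA in sterile distilled water"
REACTION_UL = {
    "10X T4 DNA ligase buffer with ATP": 1.0,
    "0.5 M NaCl": 1.0,
    "1.0 mg/mL BSA": 0.5,
    "MseI adaptor": 1.0,
    "EcoRI adaptor": 1.0,
    "Enzyme Master Mix": 1.0,
    DNA_KEY: 5.5,
}
TE_BUFFER_UL = 189.0  # step 14


def reaction_volume_ul():
    """Total volume of one restriction-ligation reaction before dilution."""
    return sum(REACTION_UL.values())


def diluted_volume_ul():
    """Volume of one reaction after TE0.1 buffer is added."""
    return reaction_volume_ul() + TE_BUFFER_UL


def dilution_factor():
    """Fold dilution of the reaction produced by the TE0.1 addition."""
    return diluted_volume_ul() / reaction_volume_ul()


def batch_volumes_ul(n_reactions, overage=0.10):
    """Scale the shared reagents for several reactions.

    Genomic DNA is excluded because it differs for every sample.
    The overage margin is an assumption, not a protocol value.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    scale = n_reactions * (1 + overage)
    return {
        name: round(volume * scale, 2)
        for name, volume in REACTION_UL.items()
        if name != DNA_KEY
    }


if __name__ == "__main__":
    print(f"reaction volume: {reaction_volume_ul():.1f} µL")
    print(f"after dilution:  {diluted_volume_ul():.1f} µL")
    print(f"dilution factor: {dilution_factor():.2f}")
    print(batch_volumes_ul(10))
```

Each reaction is therefore 11 µL before dilution and 200 µL afterwards, a dilution of roughly 18-fold.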
Preselective Amplification
1. Combine the following in a PCR reaction tube:
a. 4.0 µL diluted DNA prepared by restriction-ligation
b. 1.0 µL AFLP preselective primer pairs
c. 15.0 µL AFLP Core Mix
2. Place the samples in a thermal cycler at ambient temperature.
3. Run the following PCR method:
a. 72°C for 2 minutes
b. 20 cycles of:
i. 94°C for 20 seconds
ii. 56°C for 30 seconds
iii. 72°C for 2 minutes
c. 60°C for 30 minutes
d. 4°C Hold
4. Combine the following in a sterile 0.5-mL microcentrifuge tube:
a. 10.0 µL preselective amplification reaction product
b. 190.0 µL TE0.1
5. Mix thoroughly, then spin down in a microcentrifuge for 10 seconds
6. Store the diluted preselective amplification product at 2–6 °C if not used
immediately.
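The preselective program and the subsequent dilution can be written down as data, which makes the arithmetic easy to verify. The sketch below uses the temperatures, hold times, and volumes from the steps above. The function names are hypothetical, and the computed time counts only the programmed holds; it ignores the time the thermal cycler spends ramping between temperatures and the final 4°C hold.

```python
# Each step is (temperature in °C, hold time in seconds), from step 3 above.
INITIAL_STEPS = [(72, 2 * 60)]
CYCLE_STEPS = [(94, 20), (56, 30), (72, 2 * 60)]
N_CYCLES = 20
FINAL_STEPS = [(60, 30 * 60)]


def programmed_seconds(initial, cycle, n_cycles, final):
    """Sum of all programmed hold times, excluding ramping and the 4°C hold."""
    total = sum(seconds for _, seconds in initial)
    total += n_cycles * sum(seconds for _, seconds in cycle)
    total += sum(seconds for _, seconds in final)
    return total


def dilution_factor(sample_ul, diluent_ul):
    """Fold dilution when sample_ul of product is mixed with diluent_ul."""
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return (sample_ul + diluent_ul) / sample_ul


if __name__ == "__main__":
    seconds = programmed_seconds(INITIAL_STEPS, CYCLE_STEPS, N_CYCLES, FINAL_STEPS)
    print(f"programmed hold time: {seconds} s ({seconds / 60:.1f} min)")
    # Step 4: 10.0 µL product plus 190.0 µL TE0.1.
    print(f"dilution: {dilution_factor(10.0, 190.0):.0f}-fold")
```

Under these assumptions the holds add up to 5320 seconds, or roughly 89 minutes, and step 4 dilutes the preselective product 20-fold.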
Selective Amplification
1. Combine the following in a PCR reaction tube:
a. 3.0 µL diluted preselective amplification reaction product
b. 1.0 µL MseI[Primer–Cxx] at 5 µM
c. 1.0 µL EcoRI[Dye–primer–Axx] at 1 µM
d. 15.0 µL AFLP Core Mix
2. Run PCR using the thermal cycler parameters:
a. 94°C for 2 minutes
b. Cycle:
i. 94°C for 20 seconds
ii. 66°C for 30 seconds (annealing temperature lowered by 1°C each
cycle until it reaches 56°C)
iii. 72°C for 2 minutes
c. 60°C for 30 minutes
d. 4°C hold
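The annealing step in the selective amplification is a touchdown: it starts at 66°C and drops by 1°C per cycle until it reaches 56°C. A minimal Python sketch of that schedule follows, assuming the first cycle anneals at 66°C. The protocol as written here does not state how many further cycles are run at 56°C, so the `extra_cycles` parameter is a hypothetical placeholder that defaults to zero.

```python
def touchdown_annealing(start_c=66, end_c=56, step_c=1, extra_cycles=0):
    """Return the annealing temperature (°C) for each cycle.

    start_c, end_c and step_c default to the values in the protocol above.
    extra_cycles is an assumed parameter for any additional cycles held at
    end_c; the source text does not give this number.
    """
    if step_c <= 0:
        raise ValueError("step_c must be positive")
    if start_c < end_c:
        raise ValueError("start_c must not be below end_c")
    temps = list(range(start_c, end_c - 1, -step_c))
    if temps[-1] != end_c:
        # The step does not land exactly on end_c, so finish there.
        temps.append(end_c)
    return temps + [end_c] * extra_cycles


if __name__ == "__main__":
    for cycle, temp in enumerate(touchdown_annealing(), start=1):
        print(f"cycle {cycle:2d}: anneal at {temp}°C for 30 s")
```

With the default values the ramp covers 11 cycles, from 66°C in the first cycle to 56°C in the eleventh.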
Appendix C: Example of Electropherogram and Raw Data Produced from
AFLP
Figure C1. An example electropherogram produced by separating AFLP fragments
via capillary electrophoresis. Dominant loci are scored either by the presence
of a peak at a particular size (e.g. at 100 bp) or by the height of that peak in
fluorescence units (e.g. 100 FU at 100 bp).
Table C1. Example of binary (D) and peak height (PH) AFLP data exported from
Genemapper v4.0 for two of the outlier loci. Dominant AFLP alleles are scored
either as present (1) or absent (0). Peak height AFLP alleles are scored as the
height of the amplification peak in the electropherogram (Fig. C1) if present, or zero
if there was no amplification.
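The relationship between the two scorings can be illustrated with a short Python sketch that converts peak heights into presence/absence calls. The sample values are the locus 1 entries for the first seven individuals in Table C1. The function name and the `threshold` parameter are assumptions made for illustration; this is not a description of how Genemapper itself assigns scores.

```python
def peak_heights_to_binary(heights, threshold=0):
    """Score a peak as present (1) when its height exceeds threshold, else absent (0).

    threshold is a hypothetical cut-off in fluorescence units; the default of
    zero treats any amplification as presence.
    """
    return [1 if height > threshold else 0 for height in heights]


# Locus 1 values for BHS11, BHS13, BHS14, BHS16, BHS17, BHS18 and BHS2.
LOCUS_1_PH = [0, 236, 0, 0, 113, 0, 0]
LOCUS_1_D = [0, 1, 0, 0, 1, 0, 0]

if __name__ == "__main__":
    derived = peak_heights_to_binary(LOCUS_1_PH)
    print(derived)
    print("matches tabulated binary scores:", derived == LOCUS_1_D)
```

For these seven individuals the binary calls derived from the peak heights agree with the tabulated dominant scores.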
Individual  locus 1 (D)  locus 58 (D)  locus 1 (PH)  locus 78 (PH)
BHS11       0            1             0             0
BHS13       1            1             236           0
BHS14       0            1             0             0
BHS16       0            0             0             0
BHS17       1            1             113           0
BHS18       0            0             0             0
BHS2        0            1             0             0
BHS20       0            1             0             0
BHS21       0            1             0             0
BHS22       0            1             0             0
BHS23       0            0             0             0
BHS27       0            1             0             0
BHS28       0            0             0             0
BHS32       0            1             0             0
BHS34       1            0             119           0
BHS35       0            1             0             0
BHS36       1            1             230           0
BHS4        0            0             0             0
BHS40       0            1             0             0
BHS44       0            0             0             0
BHS46       0            0             0             0
BHS49       0            1             0             0
BHS5        0            1             0             0
BHS6        0            1             0             0
BL1         1            0             190           0
BL10        0            0             0             0
BL11        0            0             0             0
BL12        0            1             0             0
BL14        0            0             0             0
BL15        0            1             0             0
BL17        1            1             117           0
BL18        0            0             0             0
BL19        0            0             0             0
BL2         0            0             0             0
BL20        0            0             0             0
BL21        0            0             0             0
BL28        0            1             0             0
BL30        0            0             0             0
BL32        1            0             154           0
BL36        1            0             198           0
BL37        1            1             141           0
BL39        0            0             0             0
BL4         1            1             235           0
BL42        0            0             0             0
BL45        0            0             0             0
BL49        0            0             0             0
BL50        1            0             186           0
BL52        0            1             0             0
BL8         0            0             0             0
HSB1        1            0             202           0
HSB10       1            0             1771          0
HSB11       1            0             192           0
HSB12       1            0             202           0
HSB13       1            0             215           0
HSB14       0            0             0             0
HSB15       1            0             185           0
HSB16       0            0             0             0
HSB17       1            0             153           0
HSB18       1            0             137           0
HSB2        1            0             218           0
HSB20       1            0             187           0
HSB21       1            0             226           0
HSB22       0            0             0             0
HSB23       1            0             171           0
HSB24       1            0             207           0
HSB25       1            0             154           0
HSB26       1            0             197           0
HSB27       1            0             370           0
HSB28       1            0             225           0
HSB29       1            0             199           0
HSB3        1            0             193           0
HSB30       1            0             332           0
HSB4        1            0             205           0
HSB5        1            0             175           0
HSB6        1            0             208           0
HSB7        1            0             317           0
HSB8        1            0             194           0
HSB9        1            0             162           0
HWP1        0            0             0             231
HWP10       1            0             225           298
HWP11       0            0             0             157
HWP12       1            0             148           602
HWP13       1            0             133           597
HWP14       0            0             0             444
HWP15       0            0             0             635
HWP16       1            0             212           258
HWP18       1            0             242           199
HWP19       0            0             0             226
HWP20       1            0             159           137
HWP21       0            0             0             153
HWP22       0            0             0             193
HWP23       0            0             0             144
HWP24       0            0             0             220
HWP25       0            0             0             257
HWP26       0            0             0             140
HWP3        0            0             0             173
HWP4        0            0             0             189
HWP5        0            0             0             153
HWP6        0            0             0             0
HWP8        1            0             228           260
HWP9        0            0             0             0
WC10        0            0             0             0
WC11        0            1             0             0
WC12        1            0             220           0
WC17        0            1             0             0
WC18        0            1             0             0
WC2         0            0             0             0
WC22        0            0             0             0
WC23        0            0             0             0
WC25        0            0             0             0
WC28        0            0             0             0
WC3         0            1             0             0
WC30        0            1             0             0
WC31        0            1             0             0
WC36        0            1             0             0
WC4         0            0             0             0
WC42        0            0             0             0
WC45        0            1             0             0
WC49        1            0             278           0
WC7         1            1             155           0
WC8         0            1             0             0
WC9         0            0             0             138