Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Population genetics wikipedia , lookup
Genome (book) wikipedia , lookup
Public health genomics wikipedia , lookup
Pharmacogenomics wikipedia , lookup
Genetic engineering wikipedia , lookup
Polymorphism (biology) wikipedia , lookup
Designer baby wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
Quantitative trait locus wikipedia , lookup
History of genetic engineering wikipedia , lookup
Human genetic variation wikipedia , lookup
Population-Based Resequencing Reveals That the Flowering Time Adaptation of Cultivated Barley Originated East of the Fertile Crescent Huw Jones,* Fiona J. Leigh,* Ian Mackay,* Mim A. Bower,à Lydia M.J. Smith,* Michael P. Charles,§ Glynis Jones,§ Martin K. Jones,à Terence A. Brown, and Wayne Powell*1 *National Institute for Agricultural Botany, Cambridge, United Kingdom; Manchester Interdisciplinary Biocentre, University of Manchester, Manchester, United Kingdom; àMcDonald Institute for Archaeological Research, University of Cambridge, Cambridge, United Kingdom; and §Department of Archaeology, University of Sheffield, Sheffield, United Kingdom Gene resequencing and association analysis present new opportunities to study the evolution of adaptive traits in crop plants. Here we apply these tools to an extensive set of barley accessions to identify a component of the molecular basis of the flowering time adaptation, a trait critical to plant survival. Using an association-based study to relate variation in flowering time to sequence-based polymorphisms in the Ppd-H1 gene, we identify a causative polymorphism (SNP48) that accounts for the observed variation in barley flowering time. This polymorphism also shows latitude-dependent geographical distribution, consistent with the expected clinal variation in phenotype with the nonresponsive form predominating in the north. Networks, genealogies, and phylogenetic trees drawn for the Ppd-H1 haplotypes reveal population structure both in wild barley and in domesticated barley landraces. The spatial distribution of these population groups indicates that phylogeographical analysis of European landraces can provide information relevant to the Neolithic spread of barley cultivation and also has implications for the origins of domesticated barley, including those with the nonresponsive ppd-H1 phenotype. Haplotypes containing the nonresponsive version of SNP48 are present in wild barley accessions, indicating that the nonresponsive phenotype of European landraces originated in wild barley. The wild accessions whose nonresponsive haplotypes are most closely similar to those of landraces are found in Iran, within a region suggested as an area for domestication of barley east of the Fertile Crescent but which has previously been thought to have contributed relatively little to the diversity of European cultivars. Introduction The domestication of ‘‘wild’’ cereals is closely associated with the transition from a hunter-gatherer to an agrarian lifestyle in western Asia, and barley is one of the earliest domesticated crops found at Neolithic farming sites (Harlan and Zohary 1966; Zohary and Hopf 2000). Domestication of crop plants was accompanied by a series of genetic changes, which has been described as the domestication syndrome, affecting phenotypes such as seed retention, seed size, and seed dormancy, that differentiate domesticated plants from their wild progenitors (Hillman and Davis 1990; Doebley et al. 2006; Fuller 2007). The spread of agriculture from their area of domestication involved the dispersal of crop plants well beyond their progenitors’ native range and may have required adaptation to new environments. The timing of flowering is a key adaptive trait enabling plants to flower at the optimum time for pollination, seed development and, in the case of wild plants, dispersal (Cockram et al. 2007). A long annual growth season promotes late flowering, thus allowing longer periods of vegetative growth and increased resource storage. Conversely, a short growth season favors early flowering (Roux et al. 2006). Models of flowering time genes developed in the dicot Arabidopsis have been extended by comparative mapping in cereals (Decousset et al. 2000; Dunford et al. 2002; Laurie et al. 2004). In barley, 2 major loci determining photoperiod response have been identified: Photoperiod-H2 (Ppd-H2) (chromosome 1H) that controls flowering time under short days and Photoperiod-H1 1 Present address: Institute of Biological, Environmental & Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom. Key words: Hordeum vulgare, domestication, flowering, Neolithic, photoperiod response, landrace. E-mail: [email protected]. Mol. Biol. Evol. 25(10):2211–2219. 2008 doi:10.1093/molbev/msn167 Advance Access publication July 31, 2008 Ó The Author 2008. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected] (Ppd-H1) (Eam1) (chromosome 2H) that controls flowering under long-day conditions (Laurie et al. 1995). The organization and function of the barley (Ppd-H1) locus has been described and a causative polymorphism proposed (Turner et al. 2005). The recent isolation of the Ppd-H1 locus provides an opportunity to investigate how the full spectrum of sequence variation in Ppd-H1, a gene involved in circadian clock function (Gardner et al. 2006) originated and evolved during domestication and dispersal of this major food crop. Resequencing of candidate genes within diverse populations has considerable potential in understanding complex human traits (Topol and Frazer 2007). Resequencing-based association studies have been used to study vernalization response and pathogen resistance in Arabidopsis (Aranzana et al. 2005), but the full potential of these methods has yet to be fully explored in crop plants, particularly in relation to understanding the domestication syndrome and the selective pressures acting on crop plants during their subsequent geographic expansion. Crop landraces and their progenitor wild species have considerable potential as subjects for resequencing-based association studies: the varied evolutionary histories of wild barleys and widely dispersed landraces have generated diverse ecotypes, due to natural or human selection, with a wide range of phenotypic responses. Characteristically, landraces have historical origins, high local genetic adaptation, recognizable identities, genetic diversity, a lack of formal genetic improvement, and an association with traditional farming systems (Camacho Villa et al. 2005). Landraces were established in all regions where crops were grown up to the 19th century. In Europe, where improved agriculture developed in the 19th and 20th centuries, landraces were replaced by varieties developed by selections from landraces and later by the products of systematic plant breeding (Jones et al. 2008). The variation available within these landrace populations could provide a valuable resource to help unravel the 2212 Jones et al. genetic mechanisms that regulate how plants adapt to different habitats. Adaptive variation in photoperiod response may play a role in allowing crops to be grown in new environments or under differing agricultural practices. Variation in the photoperiod response of barley landraces from different agroecological zones has been described (Knüpffer et al. 2003), and this variation is organized in a latitudinal cline. Similar geographic variation has been observed for dormancy phenotypes in the European aspen (Populus tremula) associated with genotypic variation in the candidate gene (PHYB2) for day length–induced dormancy (Ingvarsson et al. 2006). Phenotypic variation in hypocotyl growth in response to light has been correlated to latitude in natural populations of Arabidopsis thaliana from Scandinavia (Stenøien et al. 2002). Latitudinal clines in flowering times have also been observed in Arabidopsis modulated by the FRIGIDA gene (Stinchcombe et al. 2004). A monophyletic origin for barley, with a single domestication in the Israel–Jordan region, was initially suggested by amplified fragment length polymorphism analysis and haplotype data from the gene Bkn-3 (Badr et al. 2000), but polyphyletic origins have also been proposed (Kolodinska et al. 2004; Molina-Cano et al. 2005; Azhaguvel and Komatsuda 2007). Haplotype studies at multiple loci have revealed geographic structure in wild barley populations (Morrell et al. 2003; Kilian et al. 2006), and the genotypes seen in landrace and elite-cultivated barleys have been related to geographically localized subpopulations. These studies give further support to the Israel–Jordan region as a center of domestication, but wild barley subpopulations from areas described by Morrell and Clegg (2007) as lying east of the Fertile Crescent have also contributed to the diversity seen in cultivated barley, probably via a second domestication. There is also archaeobotanical evidence to suggest that different cereal species were domesticated independently in different areas of Southwest Asia (Willcox 2005) suggesting that domestications occurred at a number of different locations within this broad region. The subsequent spread of cultivated barley into Europe has not been subject to equivalent investigation, but barley genetics holds the same potential for understanding the selective pressures acting on early crops and the mechanisms of their dispersal. The patterns and chronology of the spread of the Neolithic into Europe have been described by archaeologists studying the material cultures linked with early agriculture (Dolukhanov et al. 2005). Cereal agriculture spread rapidly into Greece and, from there, by 2 principal routes through the rest of Europe (Price 2000). The southerly spread, through the Mediterranean basin to Italy and Iberia, was apparently rapid but the northerly route more punctuated: after a rapid spread into the Balkans, agricultural expansion seems to have slowed down when it reached Hungary before again spreading rapidly through the Danube and Rhine valleys, followed by further delays in the north European plain and the Alpine foreland (Sümegi and Kertesz 2001; Gkiasta et al. 2003). This paper quantifies variation at the Ppd-H1 photoperiod response gene in a sample of European barley landraces and wild barley accessions and addresses questions regarding the molecular basis of adaptation of flowering time, the origin of the adaptive trait, and the relationship between geographically structured variation in the barley landraces and the dispersal of agriculture into Europe. Materials and Methods Accessions The Ppd-H1 gene was sequenced in 194 barley landraces from Europe (170), the Fertile Crescent (14), and Bolivia (10) and 72 wild barleys from the Fertile Crescent. The Ppd-H1 phenotype was determined in a subset of 87 barley landraces; this subset was also assayed to determine their microsatellite and miniature inverted repeat transposable element (MITE) genotypes. The accessions and their collection sites are given in Supplementary Material online. Sequencing A 3,508-bp region spanning the Ppd-H1 gene was sequenced in both directions. DNA was extracted from one randomly selected coleoptile from each accession using a Qiagen DNeasy96 kit. Seven overlapping polymerase chain reactions (PCRs) were performed in 20 ll reactions containing 1 PCR buffer (Roche Diagnostics, Burgess Hill, UK), 1.5 mM MgCl2, 0.2 mM each deoxynucleoside triphosphate (dNTP), 0.2 M of both forward and reverse primers, 0.65 units Fast-Start Taq (Roche Diagnostics, Burgess Hill, UK), and 50 ng genomic DNA. PCR was performed using an Applied Biosystems 9700 PCR machine using the following cycling conditions; 96 °C for 10 min; 40 cycles of 96 °C for 40 s, primer-specific annealing temperature for 40 s, and 72 °C for 1.5 min; and a final extension of 72 °C for 7 min. Primer sequences and the annealing temperatures are given in Supplementary Material online. PCR products were treated with Exo-SAP (New England Biolabs, Hitchin, UK, and Promega, Southampton, UK) and sequenced (forward and reverse) using the Applied Biosystems Big Dye Terminator 3.1 system. Sequences were aligned using Pre-Gap and Gap (Bonfield et al. 1995; Bonfield and Staden 1996) and polymorphic sites identified. Microsatellite and MITE Genotyping The genotypes at 24 genomic microsatellite loci were determined in a series of 10 ll multiplex PCRs each containing 1 PCR buffer þ MgCl2 (Roche Diagnostics, Burgess Hill, UK), 0.2 mM each dNTP, 0.2 M of both forward and reverse primers, 0.65 units Taq (Roche Diagnostics, Burgess Hill, UK), and 1.0 ll extracted DNA. Primer sequences are given in Supplementary Material online. PCR was performed using an Applied Biosystems 9700 PCR machine using the following cycling conditions: 94 °C for 5 min; 7 cycles of 94 °C for 30 s, 65 °C for 30 s decreasing by 1 °C per cycle, and 72 °C for 30 s; followed by 28 cycles of 94 °C for 30 s, 58 °C for 30 s, and 72 °C for 30 s; and followed by final extension at 72 °C for 7 min. PCR products were separated and visualized using an ABI 3100 genetic analyzer and data analyzed using ABI Genemapper software. The genotypes at 10 MITE loci were determined in a series of PCRs. PCR products were separated and visualized on 1.5% agarose gels stained with ethidium bromide. Primer sequences and Origin of Adaptation in Barley Flowering Time 2213 FIG. 1.—F-ratio scores for Ppd-H1 polymorphisms plotted against their sequence position. The positions of the nonsynonymous SNPs (black bars), synonymous SNPs (gray bars), and a cartoon of the Ppd-H1 gene are shown for reference. F-ratio scores greater than 11.6 are highly significant (P , 0.001). reaction conditions were supplied by Prof. A. Schulman, University of Helsinki, Finland, under a confidentiality agreement. Statistical Methods PowerMarker 3.25 (Liu and Muse 2005) was used to calculate haplotype diversity. DnaSP (Rozas et al. 2003) was used to calculate the nucleotide diversity (p) and Wright’s FST and to detect recombination within the sequenced region. Phylogenetic trees used to visualize evolutionary histories were drawn using the DNADIST and DNAML programs from the PHYLIP package (Felsenstein 2004) and the trees drawn using MrEnt (Zuccon A and Zuccon D 2006). Networks were drawn using SplitsTree 4.6 (Huson and Bryant 2006), NETWORK 4.2.0.1 (Bandelt et al. 1999; Anon 2007), and TCS 2.1 (Clemment et al. 2000). Coalescent ages (q) were calculated using NETWORK 4.2.0.1 (Forster et al. 1996) Phenotyping and Association Study The Ppd-H1 phenotype was determined in 87 barley landraces from Europe by growing them in adjacent glasshouses under long-day and short-day conditions and comparing the flowering dates. The phenotype was expressed as the number of days difference in flowering for long days versus short days. Forty single nucleotide polymorphisms (SNPs) were identified in Ppd-H1 within these 87 barley landraces. The SNP genotypes were recoded as binary data and tabulated together with the Ppd-H1 phenotype data. An analysis of variance (ANOVA) was carried out on a linear regression of Ppd-H1 phenotype against each SNP in turn (R Development Core Team 2007). A common variance for Ppd-H1 phenotype and SNP genotype is expected under the null hypothesis of no association; association can be detected by an increase in the ratio of the variances (F-ratio) (Aranzana et al. 2005; Balding 2006). The possibility of false positives was investigated by applying structured association (Mackay and Powell 2007). The microsatellite data were interrogated for genetic structure within the accession set using STRUCTURE (Pritchard FIG. 2.—Posterior probability (P) of each accession belonging to the ppd-HI nonresponsive distribution with group membership shown for comparison (crosses, SNP48 5 T; circles, SNP48 5 C). Accessions with SNP48 5 T are shown to be associated with the ppd-H1 nonresponsive phenotype. et al. 2000) and the admixture ancestry model with a burnin of 200,000 steps and runs of 1,000,000 steps, following the recommendations and methods of the manual. Genetic structure was investigated for population numbers (K) between 2 and 10. The optimum stable solution was given for K 5 8. The Q-matrix output from STRUCTURE was tabulated with the Ppd-H1 phenotype data and binary SNP data. ANOVA was carried out on a multiple linear regression of Ppd-H1 phenotype against the coefficients of population ancestry (Q-matrix) and each SNP in turn. Associations were detected by an increase in F-ratios, as above, where F-ratios greater than 11.6 were highly significant (P , 0.001). The F-ratio for each polymorphism, estimated by structured association, was plotted against its sequence position (see fig. 1). The structured association method was validated by comparison with a similar regression carried out on 10 MITE markers. The Q-matrix output from STRUCTURE was tabulated with the Ppd-H1 phenotype data and binary MITE data, and ANOVA was carried out on a multiple linear regression of Ppd-H1 phenotype against the coefficients of population ancestry (Q-matrix) and each MITE. No significant association (P . 0.05) was found between the MITEs and the Ppd-H1 phenotype. A mixture of 2 normal distributions for the responsive Ppd-H1 and nonresponsive ppd-H1 phenotypes, with independent means, variances, and a mixing coefficient, were fitted to the observed phenotypes by least squares using the Excel Solver function. The derived means and variances were then used to calculate the posterior probability of membership of each landrace to each of the 2 groups. These were used to plot a cumulative probability curve where the accessions could be identified according to their genotypes (see fig. 2). Results and Discussion Identification of the Causative Polymorphism for the Nonresponsive Phenotype Resequencing the Ppd-H1gene in 266 barley accessions comprising 194 Hordeum vulgare landraces and 2214 Jones et al. 72 Hordeum spontaneum wild barleys identified 83 polymorphisms (78 SNPs, 4 indels, and 1 microsatellite). These are organized into 121 haplotypes; following divergence since domestication only 5 of these haplotypes are shared between the landraces and wild barley. The nucleotide diversity for Ppd-H1 shows a reduction characteristic of a ‘‘domestication bottleneck’’ between wild and cultivated barley (table 1). The observed reduction to 78% of wild nucleotide diversity is comparable to that seen in domesticated maize compared with its wild progenitor (Tenaillon et al. 2004). To examine the phenotypic effects of sequence variation at Ppd-H1, a set of 87 European landraces were grown and evaluated for differences in flowering time under longday and short-day conditions. Barley plants with the responsive phenotype (Ppd-H1) are expected to flower later if grown under short days relative to barley plants grown under long days, and those with the nonresponsive phenotype (ppd-H1) are expected to flower at the same time whether they are grown under long days or short days. The expectation is therefore a bimodal distribution for the Ppd-H1 phenotype, and the observed phenotypes conform to this, with the modal values differing by 18 days (supplementary fig. 1, Supplementary Material online). These results did not appear to be complicated by any contribution from the second photoperiod gene Ppd-H2. Evidence for a phenotypic effect of variation at Ppd-H2 was sought by examining the flowering dates for plants grown under short-day conditions, but these dates showed no deviation from a normal distribution (Anderson-Darling’s test for goodness of fit to the normal distribution) suggesting that no phenotypic effect of Ppd-H2 had been detected by our test system. Of the 40 polymorphisms that were recorded within Ppd-H1 for this set of landraces, 20 showed a significant association with the responsive Ppd-H1 phenotype and 18 polymorphisms showed a highly significant (P , 0.001) association with the Ppd-H1 phenotype after adjustment for the effects of population structure (fig. 1 and supplementary fig. 2, Supplementary Material online). Five polymorphisms, which are in perfect linkage disequilibrium (LD) (r2 5 1) with each other, among the sample of 87 landraces showed the strongest association (F-ratio 5 80.89) with the trait. Among the 5 strongly associated polymorphisms, only SNP48 produces a change in the translated protein. A ‘‘C’’ at SNP48 is associated with early flowering in long days (Ppd-H1), whereas ‘‘T’’ at SNP48 is not associated with early flowering in long days (ppd-H1). SNP79 (the causative polymorphism proposed by Turner et al. [2005]) is not among the 5 polymorphisms that have the strongest association with the Ppd-H1 phenotype (F-ratio for SNP79 5 64.02) and are in perfect LD (r 5 0.86 for LD between SNP48 and SNP79). Although this analysis associates phenotype with genotype at SNP48, a more certain link could be obtained by observing the Ppd-H1 phenotype in near isogenic lines transformed to harbor the alternate allelic forms. Each accession was assigned a probability of belonging either to the Ppd-H1 responsive group or to the ppd-H1 nonresponsive group by fitting a mixture model to the quantitative phenotype. When these posterior probabilities were Table 1 Diversity of Ppd-H1 Haplotypes in Wild and Cultivated Barley Group All accessions Wild barley Landraces Number of Number of Haplotype Accessions Haplotypes Diversity 266 72 194 121 60 66 0.97 0.99 0.95 Nucleotide Diversity (p) 0.0038 ± 0.00001 0.0040 ± 0.00003 0.0031 ± 0.00001 used to plot a cumulative probability curve, the landraces are identified by their genotypes at SNP48 and the association between SNP48 and Ppd-H1 phenotype can be seen (fig. 2), supporting the conclusion that SNP48 is indeed the causative polymorphism. Latitudinal Cline for Ppd-H1 Phenotypes in European Landraces Barley with the responsive phenotype (Ppd-H1) is expected to predominate in regions where the growing season is short and with a dry summer; conversely, the nonresponsive phenotype (ppd-H1) should predominate in regions with a long growing season and a moist summer. In Europe, this should result in a latitudinal cline for flowering time. This pattern is evident in the geographic distribution of the landraces, assigned to their predicted phenotype on the basis of SNP48 (fig. 3A). We divided the landraces by latitude into numerically equal sized classes and plotted the frequency of each group for each class (fig. 3B). The results show that the proportion of landraces with a C at SNP48 (predicted to be responsive Ppd-H1) is higher in southern latitudes and the proportion with T at SNP48 (predicted to be nonresponsive ppd-H1) increases in northern latitudes. This latitudinal cline has a correlation coefficient of r 5 0.46 that is highly significant (P , 0.01) when tested by permutation (n 5 999). A latitudinal cline drawn in the same way for SNP79 (proposed to be causative by Turner et al. [2005]) has a correlation coefficient r 5 0.13 that is not significant (P . 0.05) when tested by permutation (n 5 999). The pattern for SNP48 is consistent with the expected latitude-dependent distribution with the nonresponsive form predominating in the north, offering further support to the conclusion that SNP48 is the causative polymorphism. Evolutionary Relationships between Ppd-H1 Haplotypes The relationships between all 121 Ppd-H1 haplotypes were examined by constructing networks with NETWORK 4.2.0.1 and TCS 2.1. The resulting topologies (fig. 4) are virtually identical and are in broad agreement with the relationships shown by a network drawn using SplitsTree 4.6, with Neighbor-Joining trees produced by using DNADIST and NEIGHBOR, and with a maximum likelihood tree produced by DNAML. Five distinct groups of haplotypes are seen within the networks. All haplotypes with a T at SNP48, and hence predicted to give rise to the nonresponsive ppd-H1 phenotype, cluster together in a single group (Group A, highlighted Origin of Adaptation in Barley Flowering Time 2215 haplotypes. Most of the wild barleys in this group have origins strongly resembling the geographical range ‘‘Zagros and further east’’ (Morrell and Clegg 2007), considered by those authors as harboring a subpopulation genetically distinct from the wild population in the western Fertile Crescent. The 2 landraces in Group D both come from Syria. The final group, E (purple in fig. 4), comprises haplotypes found only in wild barleys. In the Neighbor-Joining and maximum likelihood trees, Group E has the deepest root, suggesting that it is ancestral to the other 4 groups. The networks show evidence of character conflict arising from recombination, but most reticulations are confined within groups, rather than occurring between haplotypes in different groups, even when the parsimony settings in TCS are reduced. This topology therefore indicates that the 5 groups listed above have evolved as independent populations, with some recombination occurring between accessions belonging to each individual population (with the apparent exception of the ancestral Group E) but with only limited gene flow between populations. Implications for the Origins of Domesticated Barley FIG. 3.—Latitudinal cline of Ppd-H1 haplotypes in Europe. (A) Distribution of landraces with Ppd-H1 haplotypes in which SNP48 is either C (open circles; predicted to be responsive Ppd-H1) or T (closed circles; predicted to be nonresponsive ppd-H1). (B) The same results expressed as a histogram with landraces divided into 6 numerically equal sized classes according to latitude. blue in fig. 4). This group comprises 20 haplotypes found only in landraces, 6 found only in wild barley, and 1 shared haplotype. Of the 18 polymorphisms identified as significantly associated with the Ppd-H1 phenotype, only SNP48 partitions congruently among the members of Group A and all other accessions in this way. In contrast, accessions harboring a ‘‘G’’ at SNP79 (predicted to be ppdH1 by Turner et al. [2005]) partition between Groups A, B, and C (73, 5, and 7 accessions, respectively). The 100% coincidence of T at SNP48 and membership of Group A therefore provides further support that SNP48 is the causative polymorphism. The remaining 4 groups contain haplotypes with a C at SNP48 (predicted Ppd-H1). The majority of the responsive haplotypes that are found only in landraces fall into 2 groups, B and C (highlighted yellow and green, respectively, in fig. 4). Group B comprises 18 landrace haplotypes, 3 wild, and 1 shared, and Group C comprises 20 landrace haplotypes, 9 wild, and 3 shared. All the landraces possessing Group B or C haplotypes come from Europe, and all the wild barleys come from the western arm of the Fertile Crescent. The fourth group, D (red in fig. 4), is made up of 24 wild haplotypes and 2 landrace Haplotypes found only in wild barley accessions are broadly distributed across the network and, in particular, are colocated with European landrace haplotypes in each of Groups A, B and C. Hence, both the day length responsive form of cultivated barley (Ppd-H1, Groups B and C) and the nonresponsive form (ppd-H1, Group A) originate from wild barley, rather than one type evolving from the other as a result of mutation during dispersal of landraces in Europe. First, we consider the responsive types. Sixteen of the 38 haplotypes in Groups B and C are found in wild barley, all these accessions located in Turkey, the Israel–Jordan region, or another part of the western Fertile Crescent. All the landraces with Group B or C haplotypes come from Europe. These results are therefore consistent with an origin for the responsive landraces in the western Fertile Crescent, possibly the Israel–Jordan region as previously suggested (Badr et al. 2000; Kilian et al. 2006). Division of these haplotypes into 2 groups could suggest that there were 2 domestications in this region, but the reticulations in the basal regions of these 2 groups leave possible a single origin with subsequent splitting of the domesticated population. Group D also comprises responsive haplotypes present in both wild barley and landraces. Most of these wild accessions originate to the east of the Fertile Crescent and have previously been shown to be genetically distinct from the western wild populations (Morrell and Clegg 2007). The absence from Group D of haplotypes found in European landraces suggests that the eastern wild population has not contributed to functional diversity of the Ppd-H1 phenotype in European germplasm. Group D does, however, include 2 domesticated haplotypes, present in 2 landraces from Syria. The positioning of these haplotypes within Group D indicates that they arose from the wild Group D accessions, in agreement with the suggestion that there was a second domestication of barley from the eastern wild population (Morrell and Clegg 2007). Our results 2216 Jones et al. FIG. 4.—Relationships between Ppd-H1 haplotypes shown as networks constructed with (A) NETWORK and (B) TCS. The nonresponsive wild Iranian and Israeli accessions in Group A are marked with red and blue asterisks, respectively. The 2 Iranian accessions from the Zagros region, suggested by recombination breakpoint analysis to be most closely related to the Group A landraces, are indicated by double asterisks. predict that the landraces from South and East Asia that derive from this second domestication (Saisho and Purugganan 2007) have Ppd-H1 haplotypes falling within Group D. Now we consider the nonresponsive haplotypes, all of which fall within Group A. The 7 of these haplotypes that we located in wild barley occur in accessions from Israel (3 haplotypes from 22 Israeli accessions) and Iran (4 haplotypes from 20). The topologies of the networks place the Israeli haplotypes at the base of Group A, with the Iranian haplotypes in more derived positions reticulated with the nonresponsive landraces. Although it is possible that other populations of nonresponsive barley exist in the wild, the implication of our results is that the ppd-H1 phenotype of European landraces originated in wild barley in Iran, lending support to the earlier finding of an eastern center for barley domestication (Morrell and Clegg 2007). An Iranian origin for the nonresponsive landraces is also supported by detailed analysis of recombination breakpoints in the Group A haplotypes. Eighteen recombination breakpoints were detected within the Ppd-H1 locus by the ‘‘4 gamete’’ test (Hudson and Kaplan 1985), 15 of these present in wild barley haplotypes and 10 in landraces (see Supplementary Material online). The presence of 8 ‘‘private’’ recombinants in wild barley compared with just 3 in landraces is consistent with the reduction in diversity expected during the domestication of a wild species. Examination of the subhaplotypes in the intervals between recombination breakpoints detected several cases where a subhaplotype rare (,10%) in wild barley is common (.40%) in the landraces. The wild barleys harboring these rare subhaplotypes can be interpreted as founders of those landrace lineages that harbor the same subhaplotypes. At breakpoint intervals between positions 498–734 and 634–788 of the sequenced region, subhaplotypes that are rare in wild barley were present in 2 of the Iranian Group A accessions but not in any of the Israeli ones. These subhaplotypes are present in all 97 of the landraces in Group A, strong evidence that these landraces are related to the Iranian wild accessions. The network topology is consistent with linear descent from wild to landrace and hence with a separate domestication event giving rise to the nonresponsive landraces, but the alternative—transfer by admixture between wild and existing landraces—cannot be discounted. The foothills of the Zagros Mountains are among the 3 general regions suggested as centers for a secondary domestication of barley (Morrell and Clegg 2007). One of the 2 wild barleys that our analysis suggests as likely progenitors of the nonresponsive cultivated barleys was sampled in this region (HOR 2882, 32°33#N, 48°33#E). The second of these accessions originates in the north of the Zagros Range (HOR 2680 36°30#N, 48°42#E). Implications for Future Studies of the Spread of Barley Cultivation in Europe Our results are also relevant to attempts to use genetic and phylogeographical analysis of cereal landraces as a means of studying the spread and subsequent development of agriculture in Europe. The European landraces possessing haplotypes falling into Groups A, B, and C are highly differentiated at the Ppd-H1 locus (FST (A vs. B) 5 0.84, FST (A vs. C) 5 0.83, and FST (B vs. C) 5 0.76) suggesting that they are distinct lineages with differing geographic distributions and different dispersal histories. A possible differentiation between the Groups B and C haplotypes is visible when the landrace distribution across Europe is plotted (fig. 5), Group B landraces predominating in Southern Europe in regions with a Mediterranean coast, and Group C being distributed across Europe from the North European Plain, the northern Balkans through to western Europe. Origin of Adaptation in Barley Flowering Time 2217 FIG. 5.—Distribution in Europe of landraces with Ppd-H1 haplotypes belonging to Group B (open circles) and C (triangles). Growth studies (Jones H, unpublished data) have provided no indication of any phenotypic difference between Groups B and C accessions that might lead to differential selective pressure on a north–south cline. The geographic distributions of the Groups B and C landraces show a striking similarity to the original distributions expected if Group B had spread into Europe mainly by the southern route along the Mediterranean and Group C by the northern route through Central Europe. This in itself does not advance our archaeological knowledge as the existence of these routes is well established, but it does suggest that modern landraces preserve at least some of the phylogeographical pattern set up during the original Neolithic spread of agriculture, despite the movement of germplasm anticipated during the 7000– 8000 years since Europe became domesticated. If this interpretation is correct, then more detailed analysis of the genotypes of European landraces (of neutral as well as adaptive markers) might be used to probe aspects of agricultural spread that are not fully resolved, such as the degree of contact between the 2 trajectories as measured by the amount of admixture at locations such as the Alpine foreland where human populations along the 2 principal routes may have come into contact. The predominant distribution of the nonresponsive Group A haplotypes in northern Europe has already been described (fig. 3). Although this distribution is likely to reflect the outcome of continuing selective pressure, the emergence of this group raises questions about the spread of ppd-H1 haplotypes into Europe. Our data show that the nonresponsive haplotypes characterizing the northern part of the cline evolved directly from wild barley populations and are therefore not the result of de novo mutation within cultivars but rather a dispersal from east of the Fertile Crescent. If the ppd-H1 haplotypes entered the cultivated gene pool before agriculture spread out of Southwest Asia, then it is conceivable that these haplotypes spread into Europe with the initial Neolithic expansion of farming, selection during or subsequent to this spread establishing the greater frequency of ppd-H1 in northern Europe. Alternatively, ppd-H1 haplotypes could have spread independently to northern Europe at some period after the initial expansion of agriculture. Our data do not provide any clear indication of the period, if any, that elapsed between the original spread of agriculture and the arrival of nonresponsive haplotypes in Europe, though the possibility that there was a delay between the 2 events is suggested by calculating the coalescent age (q) (Forster et al. 1996) of each group of landraces. Likely founder sequences, that occupy a central position within the phylogeny, are suggested by the topology of the network for each group of European landraces. Their status as likely founders is supported by their frequency within the European landraces, by their similarity to haplotypes seen in wild barley, and by their inclusion of accessions originating near to likely dispersal routes into Europe. The coalescent age of each group of landraces was calculated, based on the network, as the average mutational distance from the likely founding sequence. The value for the Group A landraces (1.61) is lower than the values for the Group B or C landraces (2.13 and 2.36, respectively), possibly suggesting that barley with the nonresponsive form (ppd-H1) dispersed into Europe more recently than barley with the responsive form. If the Group A haplotypes do represent a later spread of domesticated barley into Europe from Iran, as suggested for crops such as the oil plant Lallemantia, introduced during the Bronze Age (Jones and Valamoti 2005), this raises the possibility that these nonresponsive landraces spread to northern Europe not via Turkey and Greece but through Transcaucasia and around the north coast of the Black Sea. Conclusions By population-based resequencing of the photoperiod response gene Ppd-H1 in an extensive sample of barley landraces, we are able to discriminate between candidate causative polymorphisms by the strength of statistical association with phenotype and to identify a new causative polymorphism, SNP48, that has a stronger association with phenotype than that proposed by Turner et al. (2005). Identification of SNP48 as the causative polymorphism is supported by the geographic distribution of SNP48 alleles across Europe, which displays a latitudinal cline consistent with the expected distribution of the Ppd-H1 phenotype, and by phylogenetic analysis, which places all Ppd-H1 haplotypes containing the nonresponsive version of SNP48 in a single group. Haplotypes containing the nonresponsive version of SNP48 are also present in wild barley accessions, indicating that the nonresponsive phenotype of European landraces originated in wild barley. The wild accessions whose nonresponsive haplotypes are most closely similar to those of landraces are found in Iran, within a region suggested as an area for domestication of barley east of the Fertile Crescent (Morrell and Clegg 2007). This eastern origin was the progenitor of cultivated barley now found in Central Asia and the Far East but has previously been thought to have contributed relatively little to the diversity of European cultivars (Morrell and Clegg 2007; Saisho and Purugganan 2007). By revealing that this eastern region was the origin of the nonresponsive version of Ppd-H1 found in Europe, we show that although domestications in this region had only limited impact on the overall genetic 2218 Jones et al. diversity of European barley, the impact on the phenotype of European landraces and modern cultivars derived from them has been significant. Supplementary Material Supplementary figures 1–3 tables 1–4 are available at Molecular Biology and Evolution online (http://www.mbe. oxfordjournals.org/). Acknowledgments This work was funded by the Natural Environment Research Council as part of the consortium project ‘‘The Domestication of Europe’’ and by National Institute for Agricultural Botany. We thank the consortium project partners for helpful advice, in particular Robin Allaby (University of Warwick), Keri Brown (University of Manchester), and Amy Bogaard (University of Oxford) as well as David Laurie and James Beales (John Innes Centre). We also thank Phillipa Smallman (NIAB) for carrying out the MITE genotyping. Literature Cited Anon. 2007. Network 4.2.0.1 user guide [Internet]. Fluxus Technology Ltd., Clare, Suffolk, UK. [cited 17 Aug 2008]. Available from: http://www.fluxus-technology.com. Aranzana MJ, Kim S, Zhao K, et al. (17 co-authors). 2005. Genome-wide association mapping in Arabidopsis thaliana identifies previously known genes responsible for variation in flowering time and pathogen resistance. PLoS Genet. 1:e60. Azhaguvel P, Komatsuda T. 2007. A phylogenetic analysis based on nucleotide sequence of a marker linked to the brittle rachis locus indicates a diphyletic origin of barley. Ann Bot. 100:1009–1015. Badr A, Muller K, Schafer–Pregl R, El Rabey H, Effgen S, Ibrahim HH, Pozzi C, Rohde W, Salamini F. 2000. On the origin and domestication history of barley (Hordeum vulgare). Mol Biol Evol. 17:499–510. Bandelt H-J, Forster P, Röhl A. 1999. Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol. 16:37–48. Balding DJ. 2006. A tutorial on statistical methods for population association studies. Nat Rev Genet. 7:781–791. Bonfield JK, Smith KF, Staden R. 1995. A new DNA sequence assembly program. Nucleic Acids Res. 24:4992–4999. Bonfield JK, Staden R. 1996. Experiment files and their application during large-scale sequencing projects. DNA Seq. 6:109–117. Camacho Villa TC, Maxted N, Scholten M, Ford-Lloyd B. 2005. Defining and identifying crop landraces. Plant Genet Res. 3:373–384. Clemment M, Posada D, Crandall KA. 2000. TCS: a computer programme to estimate gene genealogies. Mol Ecol. 9:1657–1660. Cockram J, Jones H, Leigh FJ, O’Sullivan D, Powell W, Laurie DA, Greenland AJ. 2007. Control of flowering time in temperate cereals: genes, domestication, and sustainable productivity. J Exp Bot. 58:1231–1244. Decousset L, Griffiths S, Dunford RP, Pratchett N, Laurie DA. 2000. Development of STS markers closely linked to the Ppd- H1 photoperiod response gene of barley. Theor Appl Genet. 101:1202–1206. Doebley JF, Gaut BS, Smith BD. 2006. The molecular genetics of crop domestication. Cell. 127:1309–1321. Dolukhanov S, Shukarov A, Gronenborn D, Sokoloff D, Timofeev V, Zaitseva G. 2005. The chronology of Neolithic dispersal in Central and Eastern Europe. J Archaeol Sci. 32:1441–1458. Dunford RP, Yano M, Kurata N, Sasaki T, Huestis G, Rocheford T, Laurie DA. 2002. Comparative mapping of the barley Ppd-H1 photoperiod response gene region, which lies close to a junction between two rice linkage segments. Genetics. 161:825–834. Felsenstein J. 2004. PHYLIP, phylogeny inference package, version 3.6. Seattle (WA): University of Washington. Forster P, Harding R, Torroni A, Bandelt H. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet. 59:935–945. Fuller DQ. 2007. Contrasting patterns in crop domestication and domestication rates: recent archaeobotanical insights from the Old World. Ann Bot. 100:903–924. Gardner MJ, Hubbard KE Hotta CT, Dodd AN, Webb AAR. 2006. How plants tell the time. Biochem J. 397:15–24. Gkiasta M, Russell T, Sherman S, Steele J. 2003. Neolithic transition in Europe: the radiocarbon record revisited. Antiquity. 77:45–62. Harlan JR, Zohary D. 1966. Distribution of wild wheats and barley. Science. 153:1074–1080. Hillman GC, Davis MS. 1990. Domestication rates in wild-type wheats and barley under primitive cultivation. Biol J Linn Soc. 38:39–78. Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics. 111:147–164. Huson DH, Bryant D. 2006. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol. 23:254–267. Ingvarsson PK, Garcia MV, Hall D, Luquez V, Jansson S. 2006. Clinal variation in phyB2, a candidate gene for day-lengthinduced growth cessation and bud set, across a latitudinal gradient in European aspen (Populus tremula). Genetics. 172:1845–1853. Jones G, Valamoti SM. 2005. Lallemantia, an imported or introduced oil plant in Bronze Age northern Greece. Veg Hist Archaeobot. 14:571–577. Jones H, Lister DL, Bower MA, Leigh FJ, Smith LM, Jones MK. 2008. Approaches and constraints of using existing landrace and extant plant material to understand agricultural spread in prehistory. Plant Genet Res. 6:98–112. Kilian B, Özkan H, Kohl J, von Haeseler A, Barale F, Deusch O, Brandolini A, Yucel C, Martin W, Salamini F. 2006. Haplotype structure at seven barley genes: relevance to gene pool bottlenecks, phylogeny of ear type and site of barley domestication. Mol Genet Genomics. 276:230–241. Knüpffer H, Terentyeva I, Hammer K, Kovaleva O, Sato K. 2003. Ecogeographical diversity—a Vavilovian approach. In: von Bothmer R, van Hintum T, Knüpffer H, Sato K, editors. Diversity in barley. Amsterdam (The Netherlands): Elsevier. p. 53–76. Kolodinska BA, Bothmer RV, Dayteg C, Rashal I, Tuvesson S, Weibull J. 2004. Inter simple sequence repeat analysis of genetic diversity and relationships in cultivated barley of Nordic and Baltic origin. Hereditas. 141:186–192. Laurie DA, Griffiths S, Dunford RP, Christodoulou V, Taylor SA, Cockram J, Beales J, Turner A. 2004. Comparative genetic approaches to the identification of flowering time genes in temperate cereals. Field Crops Res. 90:87–99. Origin of Adaptation in Barley Flowering Time 2219 Laurie DA, Pratchett N, Bezant JH, Snape JW. 1995. RFLP mapping of five major genes and eight quantitative trait loci in a winter spring barley (Hordeum vulgare L.) cross. Genome. 38:575–585. Liu K, Muse SV. 2005. PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics. 21:2128–2129. Mackay I, Powell W. 2007. Methods for linkage disequilibrium mapping in crops. Trends Plant Sci. 12:57–63. Molina-Cano JL, Russell JR, Moralejo MA, Escacena JL, Arias G, Powell W. 2005. Chloroplast DNA microsatellite analysis supports a polyphyletic origin fro barley. Theor Appl Genet. 110:613–619. Morrell PL, Clegg MT. 2007. Genetic evidence for a second domestication of barley (Hordeum vulgare) east of the Fertile Crescent. Proc Natl Acad Sci USA. 104:3289–3294. Morrell PL, Lundy KE, Clegg MT. 2003. Distinct geographic patterns of genetic diversity are maintained in wild barley (Hordeum vulgare ssp. spontaneum) despite migration. Proc Natl Acad Sci USA. 100:10812–10817. Price TD, editor. 2000. Europe’s first farmers. Cambridge: Cambridge University Press. Pritchard JK, Stephens M, Donnelly P. 2000. Inference of population structure using multilocus genotype data. Genetics. 155:945–959. R Development Core Team. 2007. R version 2.5.1 [Internet]. Vienna (Austria): R Foundation for Statistical Computing. Available from: http://www.R-project.org. Roux F, Touzet P, Cuguen J, Le Corre V. 2006. How to be earlyflowering: an evolutionary perspective. Trends Plant Sci. 11:375–381. Rozas J, Sánchez-DelBarrio JCN, Messeguer X, Rozas R. 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics. 19:2496–2497. Saisho D, Purugganan MD. 2007. Molecular phylogeography of domesticated barley traces expansion of agriculture in the Old World. Genetics. 177:1765–1776. Stenøien HK, Fenster CB, Kuittinen H, Savolainen O. 2002. Quantifying latitudinal clines to light responses in natural populations of Arabidopsis thaliana (Brassicaceae). Am J Bot. 89:1604–1608. Stinchcombe JR, Weinig C, Ungerer M, Olsen KM, Mays C, Halldorsdottir SS, Purugganan MD, Schmitt J. 2004. A latitudinal cline in flowering time in Arabidopsis thaliana modulated by the flowering time gene FRIGIDA. Proc Natl Acad Sci USA. 101:4712–4717. Sümegi P, Kertesz R. 2001. Palaeogeographic characteristics of the Carpathian Basin—an ecological trap during the early Neolithic? In: Kertész R, Makkay J, editors. From the Mesolithic to the Neolithic. Budapest (Hungary): Archaeolingua. p. 405–415. Tenaillon MI, U’Ren J, Tenaillon O, Gaut BS. 2004. Selection versus demography: a multilocus investigation of the domestication process in maize. Mol Biol Evol. 21:1214–1225. Topol EJ, Frazer KL. 2007. The resequencing imperative. Nat Genet. 39:439–440. Turner A, Beales J, Faure S, Dunford RP, Laurie DA. 2005. The pseudo-response regulator Ppd-H1 provides adaptation to photoperiod in barley. Science. 310:1031–1034. Willcox G. 2005. The distribution, natural habitats and availability of wild cereals in relation to their domestication in the Near East: multiple events, multiple centres. Veg Hist Archaeobot. 14:534–541. Zohary D, Hopf M. 2000. Domestication of plants in the Old World. 3rd ed. Oxford: Oxford University Press. Zuccon A, Zuccon D. 2006. MrEnt v12 program. Stockholm (Sweden): Swedish Museum of Natural History. Beth Shapiro, Associate Editor Accepted July 25, 2008