Review

Identifying the molecular basis of QTLs: eQTLs add a new dimension

Bjarne G. Hansen1, Barbara A. Halkier1 and Daniel J. Kliebenstein2

1 Plant Biochemistry Laboratory, Department of Plant Biology and Center for Molecular Plant Physiology, Faculty of Life Sciences, University of Copenhagen, Thorvaldsensvej 40, 1871 Frederiksberg C, Copenhagen, Denmark
2 Department of Plant Sciences, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA

Natural genetic variation within plant species is at the core of plant science ranging from agriculture to evolution. Whereas much progress has been made in mapping quantitative trait loci (QTLs) controlling this natural variation, the elucidation of the underlying molecular mechanisms has remained a bottleneck. Recent systems biology tools have significantly shortened the time required to proceed from a mapped locus to testing of candidate genes. These tools enable research on natural variation to move from simple reductionistic studies focused on individual genes to integrative studies connecting molecular variation at multiple loci with physiological consequences. This review focuses on recent examples that demonstrate how expression QTL data can be used for gene discovery and exploited to untangle complex regulatory networks.

eQTL analysis: expression as a trait

Most plant species contain significant levels of natural genetic and phenotypic variation between individuals within the species for traits ranging from development to metabolism to pathogen resistance. This intraspecific variation is a foundation of research for evolutionary and ecological biologists interested in understanding plant fitness, as well as for plant biologists focused on increasing the fitness or yield of agricultural plants.
The latter has been a foundation of plant breeding research for decades and is a rich source of both quantitative genetics theory and numerous detailed phenotypic quantitative trait locus (QTL) studies that have been extensively described in other reviews [1–6]. More recently, natural variation has begun to be intensively studied within the model plant Arabidopsis thaliana as a means to improve our understanding of, for example, gene function, biosynthetic capacity and evolution [7–10].

Dissecting natural variation can be done by a QTL analysis, which is a statistical search for regions of the genome where genetic variation associates with phenotypic variation, for instance, in plant height. The ultimate goal of QTL mapping is to determine which genes are responsible for variation in the trait [11]. In Arabidopsis, most work has focused on structured, segregating populations as exemplified by recombinant inbred lines (RILs) [7]. In crop species, a wider range of structured mapping populations has been utilized for QTL analysis, including doubled haploids, F2 and backcross populations, among others.

Corresponding author: Kliebenstein, D.J. ([email protected]).

This review will focus on immortal populations, such as RILs. RILs are developed by several generations of single-seed descent from individual plants of an F2 population derived from a cross of two different accessions, that is, two parental genotypes. RILs therefore have fixed genotypes at all markers, with each individual RIL containing a random mixture of genotypes from the original parents. This greatly simplifies replicated experiments and ensures that the RILs can be stored, disseminated and used by different laboratories to analyze any desired trait in any environment. QTL mapping begins with the collection of phenotypic data from the RILs, followed by statistical analysis to reveal genomic regions where allelic variation correlates with the phenotype.
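The statistical search described above can be illustrated with a single-marker scan. The following is a minimal sketch, not the method of any study cited here: it assumes a RIL genotype matrix coded 0/1 for the two parental alleles, a normally distributed trait, and the common marker-regression form LOD = (n/2) log10(RSS0/RSS1). The function name `lod_scan` and the simulated data are hypothetical, and the simulated markers are unlinked for simplicity.

```python
import numpy as np


def lod_scan(genotypes, phenotype):
    """Single-marker LOD scores for a RIL population (illustrative sketch).

    genotypes: (n_lines, n_markers) array, coded 0/1 for the two parental alleles.
    phenotype: (n_lines,) array of trait values, one mean per line.
    """
    genotypes = np.asarray(genotypes)
    phenotype = np.asarray(phenotype, dtype=float)
    n = phenotype.size
    # Residual sum of squares under the null model (one common mean).
    rss0 = np.sum((phenotype - phenotype.mean()) ** 2)
    lod = np.zeros(genotypes.shape[1])
    for j in range(genotypes.shape[1]):
        a = phenotype[genotypes[:, j] == 0]
        b = phenotype[genotypes[:, j] == 1]
        if a.size == 0 or b.size == 0:
            continue  # monomorphic marker: no test possible, LOD stays 0
        # Residual sum of squares with a separate mean per allelic class.
        rss1 = np.sum((a - a.mean()) ** 2) + np.sum((b - b.mean()) ** 2)
        lod[j] = (n / 2.0) * np.log10(rss0 / rss1)
    return lod


# Hypothetical data: 200 RILs, 50 unlinked markers, one causal marker (index 20).
rng = np.random.default_rng(0)
geno = rng.integers(0, 2, size=(200, 50))
trait = 2.0 * geno[:, 20] + rng.normal(0.0, 1.0, size=200)
scores = lod_scan(geno, trait)
print("peak marker:", int(np.argmax(scores)))
```

In an eQTL analysis the same scan would be run once per transcript, with the transcript level of each gene taking the place of `trait`. Real analyses typically use interval mapping and permutation-based thresholds rather than this bare marker regression.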
Depending on the number of RILs and, thereby, recombination events, these genomic regions can span a broad genetic interval, which can include several hundred genes, any of which might contain a polymorphism affecting the phenotype [12]. Thus, although it is fairly simple to detect loci controlling the observed variation, it has required significant effort to identify the molecular basis of the QTL. Recent work has begun to show how the application of systems biology approaches to natural variation, such as transcriptomics or genomic re-sequencing, can greatly benefit QTL cloning by reducing the number of candidate genes in a QTL interval.

A major step forward in QTL cloning has occurred via the application of microarray technology to obtain genome-wide expression profiling from the individuals in a RIL population. This enables the mapping of QTLs controlling the transcript level for each gene (expression QTLs, eQTLs) and, thereby, the study of the relationship between genome and transcriptome. These eQTLs can then be utilized to search for associations between gene expression polymorphisms and a phenotypic QTL to identify candidate genes controlling phenotypic variation for a given trait, for example, plant height or metabolite content. Another benefit of eQTL analysis is that the same arrays can be simultaneously used for genetic mapping and phenotyping [13–15]. This review article aims to introduce recent advances in eQTL analysis, including new approaches to the use of gene expression analysis to improve our ability to understand the molecular basis of QTLs.

Review. Trends in Plant Science Vol.13 No.2. 1360-1385/$ – see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.tplants.2007.11.008 Available online 11 February 2008

Global analysis of cis and trans eQTLs

eQTLs for the transcript level of a given gene define regions in the genome that potentially control the expression of the given gene. eQTLs are categorized as cis or trans, where cis eQTLs represent a polymorphism physically located near the gene itself, for example, a promoter polymorphism that gives rise to differential expression of the gene (Figure 1). Many QTLs cloned before the existence of genome-wide eQTL analyses are in fact cis eQTLs, that is, based on variation in transcript level. This includes genes in glucosinolate biosynthesis and activation [16–19], genes in phosphate sensing [20] and genes controlling flowering time and development [21–25].

Figure 1. Example of cis and trans eQTLs. In this example, transcription factor A (TF A) has gene B as regulatory target. The Y axis represents the LOD score, which is the logarithm of odds. The horizontal dashed lines indicate the significance threshold for the LOD score for TF A and gene B. Roman numerals represent the chromosome number (Chr.I–V). (a) Expression of TF A and gene B in parent accessions z and q for the RIL population used for the analyses in (b) and (c). The protein level or functionality of TF A is indicated by the number of blue ovals. The expression levels of all genes are measured for all RILs in the population using microarrays, and the expression level of the gene of interest is now the trait that is analyzed. (b) eQTL for TF A. An expression polymorphism is observed for TF A. The genomic location of the polymorphic locus is identical to the genomic position of TF A. Therefore, this is a cis eQTL. Explanation: the polymorphism is located within the promoter of TF A [marked in green in (a)], causing a difference between the parent lines z and q. This expression polymorphism results in an eQTL for TF A at the same position as the genomic position of TF A. (c) eQTL for gene B. An expression polymorphism is observed for gene B. However, the genomic location of the responsible locus is different from the genomic position of gene B. Therefore, this is a trans eQTL for gene B. The location of the eQTL for gene B coincides with the location of the cis eQTL for TF A. Explanation: the polymorphism in the promoter of TF A [marked in green in (a)] results in an expression polymorphism of TF A. This results in an expression polymorphism of gene B, which generates an eQTL for gene B at the genomic position of TF A on Chr.V and not at the genomic position of gene B on Chr.I.

By contrast, trans eQTLs are the result of polymorphisms at a location in the genome other than the actual physical position of the gene whose transcript level is being measured (Figure 1). This region could, for example, contain a polymorphism in the expression of a transcription factor that correspondingly modulates the transcript level for the target genes. The target genes thereby have trans eQTLs at the physical position of the transcription factor because this transcription factor has a cis eQTL. This sets up a potential network where cis variation in regulatory factors controls changes in transcript level for numerous genes in trans, potentially giving rise to phenotypic variation.

eQTL analysis has been performed in mice, yeast and humans, identifying numerous cis- and trans-acting regulatory regions [26–30]. This review article describes the ability to utilize eQTLs for phenotypic association; we do not differentiate between cis eQTLs caused by promoter polymorphisms and those generated via indels, splicing variants or differential RNA degradation because all of the above polymorphisms will generate differential transcript presence.

Two large-scale microarray studies have recently been published on 160 and 211 lines, respectively, in the two Arabidopsis RIL populations, Ler × Cvi [31] and Bay-0 × Sha [32]. Additional research has identified large numbers of eQTLs in a doubled haploid barley population and structured populations of eucalyptus [33,34].
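The cis/trans distinction defined in this section amounts to comparing the physical position of a gene with the position of its eQTL peak. The sketch below is only illustrative: the `Locus` type, the `classify_eqtl` function and the 2 Mb window are assumptions introduced here, since the text does not state how close a peak must be to count as cis, and the positions are hypothetical values chosen to mirror Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    chromosome: str     # e.g. "I" to "V" in Arabidopsis
    position_mb: float  # physical position in megabases


def classify_eqtl(gene: Locus, eqtl_peak: Locus, cis_window_mb: float = 2.0) -> str:
    """Label an eQTL as 'cis' if its peak lies near the gene itself, else 'trans'.

    cis_window_mb is an assumed tolerance; in practice it depends on the
    mapping resolution of the population.
    """
    same_chromosome = gene.chromosome == eqtl_peak.chromosome
    if same_chromosome and abs(gene.position_mb - eqtl_peak.position_mb) <= cis_window_mb:
        return "cis"
    return "trans"


# Hypothetical positions mirroring Figure 1: TF A on Chr.V, gene B on Chr.I.
tf_a = Locus("V", 12.0)
gene_b = Locus("I", 5.0)
shared_peak = Locus("V", 12.3)

print(classify_eqtl(tf_a, shared_peak))    # cis: the peak sits at TF A itself
print(classify_eqtl(gene_b, shared_peak))  # trans: the peak is on another chromosome
```

A gene can of course have several eQTLs, in which case each peak would be classified separately.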
These datasets revealed that gene expression traits are very variable in plants, as was also seen in yeast, humans and rats [26–30]. Furthermore, the gene expression traits are highly genetically controlled and can have a complex underlying genetic architecture [31,32,34]. The global transcript profiling studies showed that cis eQTLs were mainly larger-effect QTLs, whereas trans eQTLs were mainly smaller-effect QTLs. A possible explanation for this difference in effect size is that cis sequence polymorphisms (e.g. in a promoter) have a direct influence on expression of the gene, giving rise to cis eQTLs. By contrast, trans eQTLs are caused by a polymorphism in, for example, a regulatory factor located elsewhere in the genome. Because transcript abundance of most genes is regulated by multiple factors, a polymorphism in one regulatory factor might only result in a small change in the transcript accumulation of genes controlled in trans by that polymorphism. Furthermore, the polymorphism underlying a trans eQTL typically affects numerous other genes and is therefore pleiotropic. Large-effect mutations in pleiotropic genes are likely to be deleterious and, as such, there might be a constraint on the effect size of trans eQTL loci [32,35–38].

Interestingly, the two global eQTL investigations in Arabidopsis had significant differences in the number of eQTLs identified and in the observed ratio between cis and trans eQTLs. A replicated study on the Bay × Sha RIL population found >36 000 eQTLs (the sum of cis and trans eQTLs) impacting 75% of the transcripts measured [32]; a different study on the Ler × Cvi RIL population found a total of around 4000 eQTLs [31]. Furthermore, the Bay × Sha study found that 86% were trans eQTLs and 14% cis eQTLs, whereas the Ler × Cvi study found 50% trans and 50% cis [31,32]. A study in barley estimated the level of cis eQTLs to be between 28 and 39% [34].
These discrepancies are most likely to be the result of differences in the numbers of lines used and replications performed (Box 1) but could also be partly due to different genetic architectures between the populations. The Ler × Cvi study measured 160 RILs with one replicate per RIL, and the barley study used 139 lines with a single replicate, whereas the Bay × Sha study analyzed 211 RILs with two independent replicates per RIL [31,32,34]. An increase in both parameters results in more statistical power for the identification of small-effect QTLs [11]. Because most small-effect eQTLs appear to be in trans, the increase in experimental power would simultaneously lead to more eQTLs being identified and to a shift in the ratio between cis and trans eQTLs (Box 1). These observations suggest that small-effect trans eQTLs are the predominant eQTLs and that numerous small-effect eQTLs remain undetected in both populations. By contrast, it is likely that most large-effect cis eQTLs present in the tissues measured within each population have been identified. The cis eQTLs provide a library of candidate genes ready for moderate-to-large-effect QTL analyses in both populations. Additionally, the association of trans eQTLs with QTLs for a given phenotype provides a list of genes that are candidates for association with the phenotype but not necessarily candidates for causing the phenotypic QTL, which is more probably caused by a gene containing a cis eQTL. Such a list is therefore very valuable for gene discovery and for all researchers working on traits for which QTL analyses have been performed using these RIL populations.

Box 1. eQTL analyses: optimizing sample size

Two separate factors need to be taken into account when determining the level of replication for eQTL and QTL mapping: the number of independent measurements per line and the number of lines to utilize. Increasing both factors is critical to increase an experiment’s statistical power but they do not have the same effect. Because eQTL mapping will be most useful in immortal populations, such as RILs or fertile doubled haploid populations, we will focus on factors impacting these populations. Numerous additional factors determine QTL detection power, for instance, heritability, epistasis and environmental interactions. However, these factors are dependent upon the specific gene and, as such, we cannot make generalizations and use these as specific determinants in designing a global eQTL analysis. Other factors such as QTL mapping models come into play after data generation and, therefore, are not dealt with in this discussion.

Replication
Increasing the number of replicated measurements per line leads to a statistically more accurate estimate of the mean of that line for the given trait and hence an increase in statistical power, leading to an improved ability to detect a QTL at a given location. This also provides the benefit of sampling across micro-environmental variation and dampening any effect that this might have on the resulting data.

Line number
Increasing the number of lines per experiment has two potential benefits for eQTL- or QTL-mapping experiments. The first and most direct is similar to the benefit from increasing the number of replicates. An increase in RIL population size results in more measurements per allelic class at any given genomic position, which increases the ability to separate the means between the two parental alleles at a genomic position. A second benefit of increasing the number of lines is that there will be more recombination events within the population. This provides greater genetic resolution in refining the position of a given QTL, leading to shorter QTL intervals; furthermore, this allows the separation of QTLs in close proximity.

Combined decisions
Given that most populations have a large number of ‘small-effect’ QTLs that can combine to cause dramatic phenotypic differences, it is critical to have replicated measures on a large collection of lines. Time and money are in most cases the limiting factors when determining the number of samples that it will be feasible to analyze. In this situation, it will be preferable in most cases to increase the number of RILs at the expense of the number of replicates; however, having independent replication per line is the preferred scenario for structured immortal populations.

Using eQTL analysis to identify candidate genes for phenotypic QTLs

At present, QTL mapping within Arabidopsis and numerous crop species is readily applicable because several structured populations with high-density marker maps are publicly available. Hitherto, the identification of the gene or genes responsible for the QTL was hampered by the large regions encompassed by the QTL owing to the low density of markers and the small number of recombinants utilized. This generated difficulties in identifying candidate genes among the hundreds of genes in the QTL region. The availability of a full-genome sequence is a helpful tool in filtering through genes in the QTL interval, because the examination of the annotation can often suggest which of the genes in the QTL interval might be likely candidates. However, even after this filtering process, the number of candidate genes will in most cases be overwhelming.

Recent work has shown that, when a phenotypic QTL has been identified in a RIL population where genome-wide eQTL analysis has been conducted, it becomes possible to investigate whether differential variation in gene expression can be the cause of the phenotypic variation [16,39,40].
Promising results have been obtained by combining QTL analysis of physiological traits and gene expression traits based on colocalization of the respective phenotypic QTLs and eQTLs in multiple species [33,41–47]. In wheat, differential gene expression in coordination with rough eQTL mapping was used to identify numerous candidate genes for involvement in seed development [41]. In corn, phenotype QTL to eQTL linkage was used to identify genes of potential interest in cell wall digestion [48]. However, these genes have yet to be shown to play a role in the respective trait.

Expression variation has previously been used to identify the genes actually controlling phenotypic QTLs. One example is the identification of EPITHIOSPECIFIER MODIFIER 1 (ESM1) as the gene responsible for a QTL controlling glucosinolate breakdown in Arabidopsis [16]. It was found that the expression of ESM1 was greatly diminished in Col-0 × Ler RILs that were Col-0 at a specific marker, and high when Col-0 × Ler RILs were Ler at this marker. This pointed to ESM1 as the most likely candidate gene, which was validated in ensuing experiments.

One pitfall in the approach of using eQTLs to identify the polymorphism responsible for a physiological QTL occurs when the polymorphism does not change the expression level of a gene. For example, the molecular polymorphism causing the physiological QTL could be in the coding region, leading to variations in protein stability, enzymatic activity or post-translational modification, or it could even be a polymorphism in the methylation level of the DNA. This should be kept in mind when using eQTLs to search for candidate genes; in processes where, for example, post-translational modifications are the predominant regulatory mechanisms, this approach is unlikely to be useful.
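The colocalization strategy described in this section can be sketched as a simple filter: keep the genes that lie inside the phenotypic QTL interval and that also have a cis eQTL in the same population. The sketch assumes gene positions and a precomputed set of cis-eQTL genes are available; the function name `candidate_genes` and all gene names and coordinates are hypothetical.

```python
def candidate_genes(qtl_interval, gene_positions, cis_eqtl_genes):
    """Return genes inside a phenotypic QTL interval that also have a cis eQTL.

    qtl_interval:   (chromosome, start_mb, end_mb) of the phenotypic QTL.
    gene_positions: dict mapping gene name -> (chromosome, position_mb).
    cis_eqtl_genes: set of gene names with a cis eQTL in the same population.
    """
    chrom, start, end = qtl_interval
    return [
        name
        for name, (gene_chrom, pos) in gene_positions.items()
        if gene_chrom == chrom and start <= pos <= end and name in cis_eqtl_genes
    ]


# Hypothetical example: four genes, two of which fall inside the QTL interval.
genes = {
    "geneA": ("III", 4.2),
    "geneB": ("III", 4.9),
    "geneC": ("III", 9.5),
    "geneD": ("I", 4.5),
}
cis_genes = {"geneB", "geneC"}
print(candidate_genes(("III", 4.0, 5.5), genes, cis_genes))  # ['geneB']
```

As noted above, a filter of this kind misses causal polymorphisms that do not alter transcript level, so the resulting list is a set of hypotheses for testing rather than a verdict.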
Network eQTLs: QTLs controlling gene expression networks

For many biological processes, the genes contributing to a certain process, for example the synthesis of specific compounds or the control of flowering time, are often well known. However, in most cases, little is known about the regulation of, and interaction between, these genes. To gain knowledge on this, one could ask if the genes with trans eQTLs at the same specific location on the genome are involved in the same genetic network, biological process or metabolic pathway. Addressing such questions can identify genetic variation influencing entire processes and thereby reveal polymorphisms upstream in the network, process or pathway. One way to develop hypotheses about regulatory regions in the genome controlling the expression level of a network of genes is by ‘network eQTL’ analyses. There are two major approaches to conducting network eQTL analysis. These can roughly be classified as either ‘a priori’ or ‘a posteriori’.

‘A priori’ network analysis

In an a priori analysis, the network being tested (e.g. a biosynthetic or a signal transduction pathway) must be known or at least predicted. Different approaches, such as z scaling or mean shifting, can be used to convert the expression level of single genes into a common measure that can be used as a measure for the expression level of the entire gene network. This common value is then used as the trait for the QTL analysis. This approach has been used to search for network eQTLs for several known biosynthetic pathways and processes. A first attempt at network eQTLs analyzed 18 networks mainly involved in plant defense, for example glucosinolate and flavonol biosynthesis [49].
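The z-scaling idea mentioned above can be sketched as follows. This is one plausible reading, not necessarily the exact procedure used in [49]: each gene in the predefined network is standardized across the RILs, and the z-scores are then averaged per line to give a single network-level value that can serve as the trait in a QTL scan. The function name `network_expression` is hypothetical, and the sketch assumes every gene shows some variation across lines (non-zero standard deviation).

```python
import numpy as np


def network_expression(expr):
    """Collapse per-gene expression into one network-level value per line.

    expr: (n_genes, n_lines) array of transcript levels for the genes of one
          a priori network. Returns an (n_lines,) array of mean z-scores.
    """
    expr = np.asarray(expr, dtype=float)
    gene_mean = expr.mean(axis=1, keepdims=True)
    gene_sd = expr.std(axis=1, ddof=1, keepdims=True)  # sample SD across lines
    z = (expr - gene_mean) / gene_sd  # each gene now has mean 0 and SD 1
    return z.mean(axis=0)             # average over genes, per line


# Two hypothetical genes on very different scales across three lines.
expr = np.array([[1.0, 2.0, 3.0],
                 [10.0, 20.0, 30.0]])
print(network_expression(expr))
```

Standardizing first keeps highly expressed genes from dominating the network value, which is the reason for converting to a common measure before averaging.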
The gene encoding the known transcription factor PAP1, which regulates, for example, flavonol biosynthesis [50,51], was found to be located at the same position as a network eQTL for the pathway, and PAP1 was shown to have a cis eQTL at this same position [49]. This suggests that the variation in the expression of this transcription factor is responsible for the flavonol network eQTL. Further support was provided by the observation that PAP1 was the likely basis of a QTL for anthocyanin accumulation within a different Arabidopsis RIL population [51]. However, not all flavonol transcription factors with a cis eQTL resulted in a network eQTL [49]. This might be because the natural variation was not large enough to have a statistically detectable effect upon the network. Another explanation as to why a network eQTL was not detected for all flavonol transcription factors with expression variation could be limiting factors in the analyses, such as the size of the RIL population and replication number analyzed (Box 1). That a change in expression of one regulator does not always give rise to a change in transcript level in the target network shows that the approach is not a bullet-proof method to identify regulators. The use of a priori-defined networks enables the researcher to measure simultaneously the biological phenotype used to define the network. For instance, if the network is defined as a metabolic pathway the researcher can measure the metabolite and directly compare this to the network eQTL, in this case consisting of biosynthetic genes in the pathway. This enables us to begin addressing an important question in the genetics of complex traits: to what extent is variation in gene expression associated with variation in complex traits at the phenotypic level [52]? 
In Arabidopsis, a recent comparison of network eQTLs for two secondary metabolite biosynthetic pathways (indolic and aliphatic glucosinolates) with QTLs controlling the accumulation of these secondary metabolites showed that all network eQTLs altered accumulation of the metabolites [52]. However, metabolite-specific QTLs did exist. An interesting observation was that a network eQTL for the aliphatic glucosinolate pathway on chromosome IV colocalized with a cis eQTL in the biosynthetic gene AOP2 [49,52]. Transgenic analysis confirmed that differential expression of the biosynthetic gene, AOP2, regulates transcript accumulation for the entire aliphatic glucosinolate gene network, which suggests a potential feedback regulatory mechanism between the metabolome and transcriptome [52]. This shows that loci controlling network eQTLs encompass a wide range of genetic functions.

‘A posteriori’ network analysis

The a priori approach to network analysis requires that the network in question be already known or at least hypothesized. However, recent work is showing that the reliance on a few RIL populations and pre-selected pathways does not fully sample the available gene networks present within a plant species [53,54]. The identification of novel networks and their underlying regulators requires an a posteriori approach, where the eQTL data are utilized to generate networks. This approach typically uses either correlation of expression patterns or colocalization of eQTL positions to identify clusters or networks of genes [55]. A similar approach generates a hierarchical relationship akin to a regulatory network [56]. After the eQTL data are utilized to identify novel networks, trans eQTLs can be identified for these novel networks, and subsequently these genetic loci can be searched for regulatory genes containing cis eQTLs that can be tested for the ability to regulate the novel gene networks [57].
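Of the two a posteriori strategies mentioned above, colocalization of eQTL positions is the simpler to sketch: genes whose eQTL peaks fall into the same genomic bin are grouped as a putative co-regulated cluster. The code below is a rough illustration under stated assumptions, not the procedure of [55–57]: the function name `eqtl_hotspots`, the fixed 1 Mb bin size and the minimum cluster size are all hypothetical choices, as are the gene names and positions.

```python
from collections import defaultdict


def eqtl_hotspots(eqtls, bin_size_mb=1.0, min_genes=3):
    """Group genes whose eQTL peaks colocalize in the same genomic bin.

    eqtls: iterable of (gene_name, chromosome, peak_position_mb) tuples.
    Returns a dict mapping (chromosome, bin_index) -> sorted gene names,
    keeping only bins that contain at least min_genes distinct genes.
    """
    bins = defaultdict(set)
    for gene, chrom, pos in eqtls:
        bins[(chrom, int(pos // bin_size_mb))].add(gene)
    return {
        key: sorted(genes)
        for key, genes in bins.items()
        if len(genes) >= min_genes
    }


# Hypothetical eQTL peaks: three genes share a peak region on Chr.V.
peaks = [
    ("g1", "V", 10.2),
    ("g2", "V", 10.5),
    ("g3", "V", 10.9),
    ("g4", "I", 3.0),
]
print(eqtl_hotspots(peaks))  # {('V', 10): ['g1', 'g2', 'g3']}
```

Each resulting bin could then be searched for regulatory genes carrying a cis eQTL, as the text describes. Fixed bins can split a cluster that straddles a bin boundary, so a real analysis would more likely use sliding windows or the confidence intervals of the eQTLs.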
A hybrid a posteriori approach has been tested within Arabidopsis using a predefined genomic sample of genes involved in flowering time. A predefined set of 175 genes, involved in the well-known process of transition to flowering but with unknown connections, was tested for network relationships using eQTL information [31]. This study confirmed many of the flowering-time regulatory interactions identified previously, and also predicted numerous unknown interactions. Thus, this approach is useful both for finding unidentified regulators and for gaining knowledge about previously known regulators. It will be interesting to see this approach extended to the full genome to better understand the complex interactions between gene networks. Additionally, it will be important to understand whether natural genetic variation merely tweaks genetic networks present in all members of the species or whether there are in fact new networks and different connections in play within other individuals of a species, for example Arabidopsis accessions other than the standard model accessions Col-0 or Ler.

Available and desired genomic tools

A whole-genome eQTL analysis provides a genomic database of genetic variation as well as the local (cis) and distal (trans) effects of this genetic variation. Such datasets provide a ready library of candidate genes for phenotypic QTLs. This is especially so when the same RILs used for the eQTL analyses are tested for QTLs controlling a researcher’s trait of choice, because the researcher can then use the available eQTL data to search for candidate genes, particularly if the plants have been grown under similar conditions. Unfortunately, at present there are no databases that allow for the querying of eQTLs for the gene of interest in the Bay × Sha and Ler × Cvi RILs, where intensive eQTL analyses have been performed.
The development of such an easy-to-use database through which community-generated eQTL data could be rapidly queried would greatly aid the use of the published eQTL data.

Concluding remarks and future perspectives

The combination of phenotypic QTLs and eQTL data is a powerful tool for gene discovery. This systems biology approach to natural variation has enhanced our ability to identify the underlying molecular basis of QTLs within model and crop plant systems. eQTL analysis is currently underway in numerous organisms, and when genomic tools such as dense marker maps, mapping populations, massively parallel genomic re-sequencing and microarray platforms become available for more organisms, this promising approach is likely to aid QTL cloning in these organisms.

Recently, QTL mapping of physiological and metabolic traits has also moved from mapping one or a few traits [58–60] to mapping all mass peaks detected in mass-spectrometric analyses [61,62]. Combining metabolomics data with eQTL data from the same lines has great potential for investigating hundreds of metabolites and linking them to eQTLs. Furthermore, this makes it possible to link metabolites in a biosynthetic network with network eQTLs for the biosynthetic genes. This provides tremendous potential for the expansion of our understanding of regulatory interactions between the transcriptome and metabolome. Finally, the generation of eQTL datasets in crop species containing long-term mapping populations with extensive phenotypic information will allow for rapid growth in the understanding of the molecular basis of phenotypic QTLs, which will greatly benefit plant breeding.

Conflicts of interest

The authors have no conflicts of interest to report with regard to this review.

Acknowledgement

The Danish National Research Foundation is acknowledged for its support to PlaCe (Center for Molecular Plant Physiology).
BGH acknowledges FOBI (Graduate School in Biotechnology, University of Copenhagen) for providing a PhD stipend. Funding for this manuscript was provided in part by NSF grant DBI#0642481 to DJK. We also thank three anonymous reviewers for helping to improve the manuscript. References 1 Collard, B.C.Y. et al. (2005) An introduction to markers, quantitative trait loci (QTL) mapping and marker-assisted selection for crop improvement: The basic concepts. Euphytica 142, 169–196 2 Asins, M.J. (2002) Present and future of quantitative trait locus analysis in plant breeding. Plant Breed. 121, 281–291 3 Kearsey, M.J. and Farquhar, A.G.L. (1998) QTL analysis in plants; where are we now? Heredity 80, 137–142 4 Yano, M. and Sasaki, T. (1997) Genetic and molecular dissection of quantitative traits in rice. Plant Mol. Biol. 35, 145–153 5 Dudley, J.W. (1993) Molecular markers in plant improvement – manipulation of genes affecting quantitative traits. Crop Sci. 33, 660–668 6 Holland, J.B. (2007) Genetic architecture of complex traits in plants. Curr. Opin. Plant Biol. 10, 156–161 7 Koornneef, M. et al. (2004) Naturally occurring genetic variation in Arabidopsis thaliana. Annu. Rev. Plant Biol. 55, 141–172 Review 8 Borevitz, J.O. and Chory, J. (2004) Genomics tools for QTL analysis and gene discovery. Curr. Opin. Plant Biol. 7, 132–136 9 Mitchell-Olds, T. and Schmitt, J. (2006) Genetic mechanisms and evolutionary significance of natural variation in Arabidopsis. Nature 441, 947–952 10 Maloof, J.N. (2003) Genomic approaches to analyzing natural variation in Arabidopsis thaliana. Curr. Opin. Genet. Dev. 13, 576–582 11 Mackay, T.F.C. (2001) The genetic architecture of quantitative traits. Annu. Rev. Genet. 35, 303–339 12 Salvi, S. and Tuberosa, R. (2005) To clone or not to clone plant QTLs: present and future challenges. Trends Plant Sci. 10, 297–304 13 Luo, Z.W. et al. (2007) SFP genotyping from Affymetrix arrays is robust but largely detects cis-acting expression regulators. 
Genetics 176, 789– 800 14 West, M.A.L. et al. (2006) High-density haplotyping with microarraybased expression and single feature polymorphism markers in Arabidopsis. Genome Res. 16, 787–795 15 Rostoks, N. et al. (2005) Single-feature polymorphism discovery in the barley transcriptome. Genome Biol. 6, R54 16 Zhang, Z. et al. (2006) The gene controlling the Quantitative Trait Locus EPITHIOSPECIFIER MODIFIER1 alters glucosinolate hydrolysis and insect resistance in Arabidopsis. Plant Cell 18, 1524– 1536 17 Kroymann, J. et al. (2003) Evolutionary dynamics of an Arabidopsis insect resistance quantitative trait locus. Proc. Natl. Acad. Sci. U. S. A. 100, 14587–14592 18 Lambrix, V. et al. (2001) The Arabidopsis epithiospecifier protein promotes the hydrolysis of glucosinolates to nitriles and influences Trichoplusia ni herbivory. Plant Cell 13, 2793–2807 19 Kliebenstein, D.J. et al. (2001) Gene duplication and the diversification of secondary metabolism: side chain modification of glucosinolates in Arabidopsis thaliana. Plant Cell 13, 681–693 20 Svistoonoff, S. et al. (2007) Root tip contact with low-phosphate media reprograms plant root architecture. Nat Genet. 39, 792–796 21 Johanson, U. et al. (2000) Molecular analysis of FRIGIDA, a major determinant of natural variation in Arabidopsis flowering time. Science 290, 344–347 22 Werner, J.D. et al. (2005) Quantitative trait locus mapping and DNA array hybridization identify an FLM deletion as a cause for natural flowering-time variation. Proc. Natl. Acad. Sci. U. S. A. 102, 2460–2465 23 Caicedo, A.L. et al. (2004) Epistatic interaction between Arabidopsis FRI and FLC flowering time genes generates a latitudinal cline in a life history trait. Proc. Natl. Acad. Sci. U. S. A. 101, 15670–15675 24 Salvi, S. et al. (2007) Conserved noncoding genomic sequences associated with a flowering-time quantitative trait locus in maize. Proc. Natl. Acad. Sci. U. S. A. 104, 11376–11381 25 Clark, R.M. et al. 
(2006) A distant upstream enhancer at the maize domestication gene tb1 has pleiotropic effects on plant and inflorescent architecture. Nat. Genet. 38, 594–597
26 Morley, M. et al. (2004) Genetic analysis of genome-wide variation in human gene expression. Nature 430, 743–747
27 Brem, R.B. and Kruglyak, L. (2005) The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc. Natl. Acad. Sci. U. S. A. 102, 1572–1577
28 Brem, R.B. et al. (2002) Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752–755
29 Schadt, E.E. et al. (2003) Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302
30 Yvert, G. et al. (2003) trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat. Genet. 35, 57–64
31 Keurentjes, J.J.B. et al. (2007) Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. Proc. Natl. Acad. Sci. U. S. A. 104, 1708–1713
32 West, M.A.L. et al. (2007) Global eQTL mapping reveals the complex genetic architecture of transcript level variation in Arabidopsis. Genetics 175, 1441–1450
33 Kirst, M. et al. (2005) Genetic architecture of transcript-level variation in differentiating xylem of a eucalyptus hybrid. Genetics 169, 2295–2303
34 Potokina, E. et al. (2008) Gene expression quantitative trait locus analysis of 16 000 barley genes reveals a complex pattern of genome-wide transcriptional regulation. Plant J. 53, 90–101
Trends in Plant Science Vol.13 No.2
35 Wagner, A. (2000) The role of population size, pleiotropy and fitness effects of mutations in the evolution of overlapping gene functions. Genetics 154, 1389–1401
36 Jeong, H. et al. (2001) Lethality and centrality in protein networks. Nature 411, 41–42
37 Fraser, H.B. et al. (2002) Evolutionary rate in the protein interaction network. Science 296, 750–752
38 Yu, H. et al. (2004) Genomic analysis of essentiality within protein networks.
Trends Genet. 20, 227–231
39 Meng, H. et al. (2007) Identification of Abcc6 as the major causal gene for dystrophic cardiac calcification in mice through integrative genomics. Proc. Natl. Acad. Sci. U. S. A. 104, 4530–4535
40 Brem, R.B. et al. (2005) Genetic interactions between polymorphisms that affect gene expression in yeast. Nature 436, 701–703
41 Jordan, M.C. et al. (2007) Identifying regions of the wheat genome controlling seed development by mapping expression quantitative trait loci. Plant Biotechnol. J. 5, 442–453
42 DeCook, R. et al. (2006) Genetic regulation of gene expression during shoot development in Arabidopsis. Genetics 172, 1155–1164
43 Juenger, T.E. et al. (2006) Natural genetic variation in whole-genome expression in Arabidopsis thaliana: the impact of physiological QTL introgression. Mol. Ecol. 15, 1351–1365
44 Street, N.R. et al. (2006) The genetics and genomics of the drought response in Populus. Plant J. 48, 321–341
45 An, C. et al. (2007) Transcriptome profiling, sequence characterization, and SNP-based chromosomal assignment of the EXPANSIN genes in cotton. Mol. Genet. Genomics 278, 539–553
46 Venu, R.C. et al. (2007) RL-SAGE and microarray analysis of the rice transcriptome after Rhizoctonia solani infection. Mol. Genet. Genomics 278, 421–431
47 Poormohammad Kiani, S. et al. (2007) Genetic variability for physiological traits under drought conditions and differential expression of water stress-associated genes in sunflower (Helianthus annuus L.). Theor. Appl. Genet. 114, 193–207
48 Shi, C. et al. (2007) Identification of candidate genes associated with cell wall digestibility and eQTL (expression quantitative trait loci) analysis in a Flint x Flint maize recombinant inbred line population. BMC Genomics 8, 22
49 Kliebenstein, D.J. et al. (2006) Identification of QTLs controlling gene expression networks defined a priori. BMC Bioinformatics 7, 308
50 Borevitz, J.O. et al.
(2000) Activation tagging identifies a conserved MYB regulator of phenylpropanoid biosynthesis. Plant Cell 12, 2383–2393
51 Teng, S. et al. (2005) Sucrose-specific induction of anthocyanin biosynthesis in Arabidopsis requires the MYB75/PAP1 gene. Plant Physiol. 139, 1840–1852
52 Wentzell, A.M. et al. (2007) Linking metabolic QTL with network and cis-eQTL controlling biosynthetic pathways. PLoS Genet. 3, e162
53 Van Leeuwen, H. et al. (2007) Natural variation among Arabidopsis thaliana accessions for transcriptome response to exogenous salicylic acid. Plant Cell 19, 2099–2110
54 Kliebenstein, D.J. et al. (2006) Genomic survey of gene expression diversity in Arabidopsis thaliana. Genetics 172, 1179–1189
55 Lan, H. et al. (2006) Combined expression trait correlations and expression quantitative trait locus mapping. PLoS Genet. 2, 51–61
56 Lee, S.I. et al. (2006) Identifying regulatory mechanisms using individual variation reveals key role for chromatin modification. Proc. Natl. Acad. Sci. U. S. A. 103, 14062–14067
57 Sun, W. et al. (2007) Detection of eQTL modules mediated by activity levels of transcription factors. Bioinformatics 23, 2290–2297
58 Kliebenstein, D.J. et al. (2001) Comparative quantitative trait loci mapping of aliphatic, indolic and benzylic glucosinolate production in Arabidopsis thaliana leaves and seeds. Genetics 159, 359–370
59 McMullen, M.D. et al. (1998) Quantitative trait loci and metabolic pathways. Proc. Natl. Acad. Sci. U. S. A. 95, 1996–2000
60 Thormann, C.E. et al. (1996) Mapping loci controlling the concentrations of erucic and linolenic acids in seed oil of Brassica napus L. Theor. Appl. Genet. 93, 282–286
61 Keurentjes, J.J.B. et al. (2006) The genetics of plant metabolism. Nat. Genet. 38, 842–849
62 Schauer, N. et al. (2006) Comprehensive metabolic profiling and phenotyping of interspecific introgression lines for tomato improvement. Nat. Biotechnol. 24, 447–454