WANTED: GENES FOR PULP YIELD

V. Carocha(1), J.A. Araújo(1), M.R. Oliveira(2), A.M. Pires(2), C.M. Marques(1)*

(1) RAIZ - Laboratório Análise Genómica, ITQBII, Apartado 127, 2781-901 Oeiras
(2) DM-CEMAT, IST, Av. Rovisco Pais, 1049-001 Lisboa

RAIZ is a Portuguese private non-profit research institute funded by the Pulp & Paper Portucel Soporcel Group (www.raiz-iifp.pt). RAIZ integrates two research areas: Technological Research, aiming at improving pulp quality, reducing production cost and minimizing environmental impact; and Forestry Research, aiming at increasing the productivity of the Eucalyptus forest in Portugal, implementing sustainable forest management practices and reducing wood costs.

The E. globulus breeding program is managed to generate trees with increased economic value, through gains in forest productivity and wood properties. This program is one of the most advanced in the world, and is based on about 450 selected trees in Portuguese plantations and more than 370 selected trees from Australian provenances. Currently, it includes more than 40,000 genotypes, tested in over 100 field trials. BLUP analysis helps to infer each individual's economic value based on volume, wood density and pulp yield.

The genomics group has been using molecular markers as a selection tool in the E. globulus Genetic Improvement Program since 1990. Besides supporting short-term population management activities such as clonal identification and genetic diversity management, the group has been actively engaged in a longer-term gene discovery project. This work involves several national and international collaborations in the genetic mapping of putative candidate genes, quantitative trait loci (QTL) detection for wood traits, transcriptomics of wood-related candidate genes, assessment of wood properties using Near Infrared Reflectance analysis, and association studies.
The goal is to be able to select elite trees for wood properties at the seedling stage, through the analysis of their key genes. Besides saving the cost and time now required for the assessment of wood property traits in 3-4-year-old trees in multiple field trials, DNA-based selection provides a more precise index, allowing faster genetic gain.

1 WHY GENES?

The advent of high-throughput genomic technologies opened new perspectives in the speed, scale and detail with which genes, genomes and complex traits can be investigated. Genomics can have an impact on forest tree improvement if strongly interconnected with accurate phenotyping and advanced breeding. Literature reports on the genetic architecture of quantitatively inherited wood property traits in Eucalyptus support the existence of a few QTLs controlling 3-16% of the total phenotypic variation, suggesting that loci linked to traits of interest could be incorporated into breeding programs through marker-assisted selection (Myburg et al. 2006).

The value of identifying candidate genes (CG) lies in the possibility of associating molecular information (sequence) with phenotype expression (function), allowing the implementation of marker-assisted selection at the population level (Pflieger et al. 2001). Putative CG may be obtained from sequence databases, using function homologies ("functional CG"), or from differential gene expression screening in phenotypically contrasting tissues ("expressional CG"). The strength of the recently developed DNA-array technology lies in its ability to analyze simultaneously the expression of hundreds of gene fragments (ESTs, expressed sequence tags) collected from different tissues. In both cases, a CG is a sequence of hypothesized biological function that needs to be validated (Rothschild and Soller 1997). The first step in the validation process is to map the CG in genetic linkage maps.
Genetic linkage maps establish a framework of molecular markers along the chromosomes, which supports the estimation of the number, position and effect of the QTL that explain quantitative variation in segregating populations. QTL detection allows the dissection of continuous phenotypic variation into discrete factors. Genes are the ultimate markers for genetic linkage mapping, as potentially useful loci may be identified empirically from co-localization with QTL (Myburg et al. 2006).

As a result of a large-scale gene discovery program in two poplar species, Sterky et al. (1998) report 3,719 unique EST transcripts associated with wood-forming tissues, expected to represent approximately 2,200 genes. More than 50% of these EST sequences shared significant similarity with proteins of known function from other organisms (i.e. enzymes involved in the formation of lignin monomers and cellulose synthesis, proteins involved in cell wall expansion, cross-linking and modification of the fibre structure/composition, proteins related to hormone synthesis and perception, cell cycle control and putative transcriptional activators).

The objective of the RAIZ gene discovery project is the identification of useful candidate genes for improved selection efficiency of technological traits in the Eucalyptus globulus genetic improvement program. To achieve this goal, a working strategy was put together to accumulate complementary evidence of significant association between wood trait quantitative trait loci (QTL) and putative CG.

2 FROM THE FIELD TO DNA…

A model system was put together to associate phenotypic information with DNA sequence. Several controlled crosses between elite parent trees with interesting technological characteristics were performed. The genetic diversity of the parent trees was assessed with 38 SSR markers (Ribeiro et al. 2004a,b). Two families with around 500 individuals were selected for mapping and QTL detection.
One of these families was cloned and established at 2 field locations. Paternity and ramet identity were checked with 13 SSR markers.

Pulp yield, the target trait of this project, is the result of a number of biochemical, cellular, developmental and adaptive variables determined by the genotype and regulated by the environment. Since lignin (22%), glucose (54%) and xylose (14%) are the 3 major components of Eucalyptus globulus wood that contribute to the production of chemical pulp (Neto et al. 2005), they were also estimated. Technological pulp yield estimates and lignin, glucose and xylose contents were obtained by Near Infrared Spectroscopy (FT-NIR Vector 22-N, Bruker). Sample preparation procedures and model calibration are described in Mendes de Sousa et al. 2002. Each sample was analyzed twice. Spectra were processed from the 2nd derivative and analyzed using the multivariate statistics software UNSCRAMBLER v7.5 (CAMO, Norway).

In order to identify functional candidate genes, we performed extensive searches using appropriate queries on representative databases with restrictive sequence selection criteria (Carocha and Marques 2002a). The majority of the existing DNA gene sequences for eucalypts were made available to us through Genolyptus (Brazil) and CNRS/UPS (France). An additional number of EST sequences were isolated by the Forest Biotechnology Group (NCSU, USA). Access to private databases with eucalypt sequences allowed us to identify orthologous functional candidate genes in Eucalyptus (Carocha and Marques 2003a) and attempt additional putative function characterization (Carocha and Marques 2003b). We have selected 163 functional CG covering 11 different functional classes (lignin, non-cellulosic polysaccharides, cellulose, transcription factors, hormones, cell wall proteins, cellular matrix communication proteins, amino acid metabolism, glucans, wood extractives and miscellaneous).

Together with a Southern European research network joining academic and industrial partners (UMR CNRS UPS 5546, CIRAD and ENCE), we have used three normalized SSH cDNA libraries (E. gunnii xylem vs leaves, xylem vs phloem, and E. globulus juvenile vs mature) to put together a eucalypt wood unigene array with 586 gene fragments putatively associated with wood formation in Eucalyptus. Methods are described in detail in Paux 2003, Paux et al. 2004 & 2005, and Foucart et al. 2006. Through the Genolyptus consortium we have access to 110,000 EST sequences from fifteen (non-normalized) cDNA libraries from different eucalypt tissues (e.g. xylem, phloem, leaves) of several species (e.g. E. grandis, E. urophylla, E. globulus and E. pellita). This resource was instrumental in identifying 503 non-redundant putative candidate genes in the eucalypt wood unigene array. This number roughly represents 1/5 of the genes expected to participate in wood formation.

In the meantime, tension/opposite, juvenile/mature, optimal/limiting growing conditions, spring/winter and fertilized/non-fertilized wood samples were collected in the field. These wood samples were characterized in technological, chemical, biometrical and anatomical terms (Marques 2002; Marques et al. 2004) in order to select the most appropriate tissues for differential gene expression experiments. With the support of Doctor Ana Pires' team from the Math Department of Instituto Superior Técnico (Lisbon), macroarray hybridisation data were described and interpreted in order to identify expressional CG with stable differential expression in high density and high pulp yield wood tissues. A restricted group of 60 expressional CG was selected and given priority in terms of mapping and functional characterization (Soares et al. 2006).

The Single Strand Conformation Polymorphism (SSCP) technology was used to reveal sequence polymorphism in the selected candidate genes, in order to map them (Carocha and Marques 2002b). The available sequences of putative CG were extended and adequate primers were designed for mapping (Carocha and Marques 2003c; Carocha et al. 2005). In addition to around 90 candidate genes, the genetic linkage maps for the parent trees of the two selected pedigrees include over 150 microsatellite markers available in the literature. Maps were constructed with Mapmaker/Exp 3.0 (Lincoln et al. 1992). Part of these markers allowed us to establish homeologies between linkage groups of published genetic maps from different species (reviewed in Myburg et al. 2006).

Spatial analysis was pursued on an individual tree basis to account for environmental heterogeneity within the field trial of the non-cloned family. A separable autoregressive process of order one was applied as a variance structure for modelling the residual variation, according to the methodology described by Costa e Silva et al. 2001 and Dutkowski et al. 2002 for forest genetic trials.

Adjusted and non-adjusted pulp yield, lignin, glucose and xylose estimates were used for QTL detection. QTL detection was carried out using single marker analysis followed by a multiple regression approach. Corrections for multiple testing were applied in order to control the false discovery rate. An independent validation check was run using permutation methods. All the computations were performed in the statistical software R (R Development Core Team 2006).

3 …QTL ON MAPS WILL LEAD US TO THE CANDIDATE GENES…

Pulp yield represents the amount of fiber produced from a basis of 100% wood. Chemical pulp is essentially made of glucose (45%) and xylose (11%), since lignin (1.3%) is removed during the process of kraft cooking and bleaching (Neto et al. 2005). In practically all cases, whenever a marker significantly increases xylose, it also affects glucose.
Moreover, QTL for pulp yield are almost always associated with QTL for lignin, xylose and glucose. These data reflect the biological meaning of the statistical associations detected.

We are currently genotyping more markers in the chromosome areas that influence pulp yield, in order to reduce the QTL intervals, narrow down the number of candidate genes for further studies and estimate the QTL effect more precisely. To add critical mass to this line of work, we have established a collaboration with Doctor Jorge Paiva (ITQB/IBET), who submitted a post-doctoral project to FCT aiming to identify and characterise the genomic region that underlies the most interesting wood property QTLs in E. globulus, using a map-based cloning approach. A complementary functional genomics approach was put together in collaboration with Doctor Rita Teixeira (Coimbra University) in the framework of an "icentro" project. These projects will take advantage of the availability of the E. grandis complete genome sequence, which we expect to be publicly released by the Joint Genome Institute (USA) in 2008 (http://www.jgi.doe.gov/News/news_6_8_07.html), and also of the genomic resources made available within the framework of the International Eucalyptus Genome Consortium (IEuGC, www.ieugc.up.ac.za).

Once we have the smallest possible QTL-delimited area, we can study the gene content of that portion of sequence. Within the framework of an ERANET European project (Eucalyptus genomics research network for improved wood properties and adaptation to drought), we will return to differential gene expression experiments focusing on the genes present in the QTL-delimited sequence, using RNA from individuals with extreme phenotypes at the population level, in order to reduce the number of candidate genes.
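
As a rough illustration of the QTL detection steps named in the methods (single marker analysis, control of the false discovery rate, and a permutation check), the following Python sketch may help. It is not the analysis used in this project, which was run in R and included a multiple regression step that is not shown here. The sketch assumes markers scored as two genotype classes (0/1), uses the absolute difference between class means as the test statistic, and applies the Benjamini-Hochberg procedure as one common way to control the false discovery rate. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean


def marker_effect(genotypes, phenotypes):
    """Absolute difference in mean phenotype between marker classes 0 and 1."""
    class0 = [p for g, p in zip(genotypes, phenotypes) if g == 0]
    class1 = [p for g, p in zip(genotypes, phenotypes) if g == 1]
    if not class0 or not class1:
        return 0.0  # marker does not segregate in this sample
    return abs(mean(class0) - mean(class1))


def permutation_p_value(genotypes, phenotypes, n_perm=1000, rng=None):
    """Empirical p-value: share of phenotype shuffles with an effect at
    least as large as the observed one (with the usual +1 correction)."""
    rng = rng or random.Random(0)
    observed = marker_effect(genotypes, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if marker_effect(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def single_marker_scan(marker_table, phenotypes, n_perm=1000, seed=0):
    """Test every marker separately; return {marker: (p_value, adjusted_p)}."""
    rng = random.Random(seed)
    names = list(marker_table)
    p_values = [
        permutation_p_value(marker_table[name], phenotypes, n_perm, rng)
        for name in names
    ]
    adjusted = benjamini_hochberg(p_values)
    return {name: (p, q) for name, p, q in zip(names, p_values, adjusted)}


if __name__ == "__main__":
    # Toy data (hypothetical): 20 trees, one marker tracking the trait, one not.
    pulp_yield = [50.0 + 0.1 * i for i in range(10)] + [55.0 + 0.1 * i for i in range(10)]
    markers = {"linked": [0] * 10 + [1] * 10, "unlinked": [0, 1] * 10}
    for name, (p, q) in single_marker_scan(markers, pulp_yield).items():
        print(name, round(p, 4), round(q, 4))
```

A permutation-based p-value is used here so that the sketch needs no distributional assumptions and no external libraries; a t-test or regression per marker would be an equally reasonable starting point.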

4 …FROM THE CANDIDATE GENES BACK TO THE FIELD

The next step will be testing for association of gene alleles with phenotyped individuals at the population level and/or functional analysis (for example in model plants) in order to confirm the phenotypic effect of the target genes.

We have sequenced genomic segments from three genes (cinnamyl-alcohol dehydrogenase, CAD; ferulate-5-hydroxylase, F5H; and S-adenosylmethionine synthase, SAMS) in E. globulus (17 genotypes of Portuguese origin and 24 genotypes from 12 provenances in continental Australia) (Kirst et al. 2005; Marques et al. 2005). The choice of wood trait CG for this work resulted from previous work performed at NCSU in an E. grandis × E. globulus pedigree. The nucleotide diversity was very high for all genes, and LD extended in general over 500-1000 bp, indicating that by sampling multiple gene regions we might be able to identify polymorphisms in LD with quantitative trait nucleotides. This study supported the feasibility of association studies in Eucalyptus.

With the support of a post-doctoral grant from FCT, João Costa e Silva is currently investigating the genetic structure of a population of trees originating from the RAIZ breeding program, in order to identify a subset of unrelated trees with ample variation for wood property traits. He will identify and assess the diversity of single-nucleotide polymorphisms, examine linkage disequilibrium, and evaluate haplotype diversity and structure of selected CG in this subset of trees. Associations between allelic polymorphisms and observed phenotypic variation will then be explored in the overall Portuguese population, and verified using an Australian population.

ACKNOWLEDGMENTS TO OUR COLLABORATORS AND COLLEAGUES

Doctor Nuno Borralho
Engs. J.L. Amaral, Mendes de Sousa and Dra Fernanda Paula
Dras Carla Ribeiro, Marta Melo, Engs. Teresa Baptista da Silva, Adelina Jerónimo, Eduarda Coutinho
Lab technicians Fátima Cunha, Catarina Grilo
Students Eduardo Costa, Ana Rita Pereira de Sá, Ricardo Rocha
Dra Catarina Soares (IST, Lisbon)
Field technicians José Cardoso, Luís Ferreira, Paulo Silva
Doctor J. Grima-Pettenati team (CNRS/UPS, Toulouse)
Doctor P. Vigneron team (CIRAD, Montpellier)
Dr. Roberto Astorga (ENCE, Navia)
The Genolyptus network (Brazil)
Doctors Phil Jackson and Jorge Paiva (ITQB/IBET, Oeiras)
Doctor João Costa e Silva (IST, Lisbon)
Doctor Rita Teixeira (Coimbra University)
Doctor Santiago González Martínez (INIA, Madrid)
Doctor Matias Kirst (Florida University, USA)
Doctor René Vaillancourt team (University of Tasmania, Australia)

5 REFERENCES

Carocha, V.J. and Marques, C.M. (2002a) RAIZ Report (D2, PT036).
Carocha, V.J. and Marques, C.M. (2002b) RAIZ Report (E2-2 (1), PT036).
Carocha, V.J. and Marques, C.M. (2003a) RAIZ Report (D3, PT036).
Carocha, V.J. and Marques, C.M. (2003b) RAIZ Report (E2-4, PT036).
Carocha, V.J. and Marques, C.M. (2003c) RAIZ Report (E2-2 (2), PT036).
Carocha, V.J., Melo, M., Cunha, F. and Marques, C.M. (2005) RAIZ Report (D9, PT036).
Costa e Silva, J., Dutkowski, G.W. and Gilmour, A.R. (2001) Can J For Res 31: 1887-1893.
Dutkowski, G.W., Costa e Silva, J., Gilmour, A.R. and Lopez, G.A. (2002) Can J For Res 32: 2201-2214.
Foucart, C., Paux, E., Ladouce, N., San-Clemente, H. and Grima-Pettenati, J. (2006) New Phytologist 170 (4): 739-752.
Kirst, M., Marques, C.M. and Sederoff, R. (2005) IUFRO-Tree Biotechnology, South Africa, Pretoria. (Poster S5.28p).
Lincoln, S., Daly, M. and Lander, E. (1992) Whitehead Institute Technical Report.
Marques, C.M. (2002) RAIZ Report (D1, PT036).
Marques, C.M., Mendes de Sousa, A.P., Carocha, V.J., Araújo, J.A. and Borralho, N. (2004) RAIZ Report (E1-6, PT036).
Marques, C.M., Carocha, V.J. and Kirst, M. (2005) RAIZ Report (D19, PT036).
Mendes de Sousa, A.P., Baptista, M.D'A. and Amaral, J.L. (2002) Proceedings of the XIII Tecnicelpa National Meeting: 307-321.
Myburg, A.A., Potts, B.M., Marques, C.M., Kirst, M., Gion, J.M., Grattapaglia, D. and Grima-Pettenati, J. (2006) In "Genome Mapping & Molecular Breeding in plants". Vol. 7: Forest Trees. Ed C.R. Kole. Springer, Heidelberg, Berlin, New York, Tokyo.
Neto, C.P., Evtuguin, D., Pinto, P., Silvestre, A. and Freire, C. (2005) Proceedings of the XIX Tecnicelpa National Meeting, Tomar, Portugal: 59-70.
Paux, E. (2003) Activity report UMR CNRS/UPS 5546. Meeting Castanet Tolosan, France.
Paux, E., M'Barek, T., Ladouce, N., Sivadon, P. and Grima-Pettenati, J. (2004) Plant Mol Biol 55 (2): 263-280.
Paux, E., Carocha, V., Marques, C.M., Mendes de Sousa, A., Borralho, N., Sivadon, P. and Grima-Pettenati, J. (2005) New Phytologist 167 (1): 89-100.
Pflieger, S., Lefebvre, V. and Causse, M. (2001) Mol Breeding 7: 275-291.
R Development Core Team (2006) Vienna: R Foundation for Statistical Computing. URL http://www.R-project.org.
Ribeiro, C., Carocha, V.J. and Marques, C.M. (2004a) RAIZ Report (PT036).
Ribeiro, C., Cunha, F. and Marques, C.M. (2004b) RAIZ Report (PT036).
Rothschild, M.F. and Soller, M. (1997) Probe 8: 13-20.
Soares, C., Pires, A., Carocha, V.J., Cunha, F. and Marques, C.M. (2006) RAIZ Report (D8, PT036).
Sterky, F. et al. (1998) PNAS 95: 13330-13335.