WANTED: GENES FOR PULP YIELD
V. Carocha(1), J.A. Araújo(1), M.R. Oliveira(2), A.M. Pires(2), C.M. Marques(1)*
(1) RAIZ - Laboratório Análise Genómica, ITQBII, Apartado 127, 2781-901 Oeiras
(2) DM-CEMAT, IST, Av. Rovisco Pais, 1049-001 Lisboa
RAIZ is a Portuguese private non-profit research institute funded by the Pulp & Paper Portucel Soporcel Group
(www.raiz-iifp.pt). RAIZ integrates two research areas: Technological Research, aiming at improving pulp quality,
reducing production cost and minimizing environmental impact; and Forestry Research, aiming at increasing the
productivity of the Eucalyptus forest in Portugal, implementing sustainable forest management practices and reducing wood
costs.
The E. globulus breeding program is managed in order to generate trees with increased economic value, through
gains in forest productivity and wood properties. This program is one of the most advanced in the world, and is
based on about 450 selected trees in Portuguese plantations and more than 370 selected trees from Australian
provenances. Currently, it includes more than 40,000 genotypes, tested in over 100 field trials. BLUP analysis helps to
infer each individual’s economic value based on volume, wood density and pulp yield.
The genomics group has been using Molecular Markers as a selection tool in the E. globulus Genetic
Improvement Program since 1990. Besides supporting short term population management activities such as clonal
identification and genetic diversity management, the group has been actively engaged in a longer term gene discovery
project. This work involves several national and international collaborations in the arena of genetic mapping of putative
candidate genes, quantitative trait loci (QTL) detection for wood traits, transcriptomics of wood-related candidate
genes, assessment of wood properties using Near Infrared Reflectance analysis and association studies. The goal is to be
able to select elite trees for wood properties at the seedling stage, through the analysis of their key genes. Besides saving
the cost and time now required for the assessment of wood property traits in 3-4 year old trees in multiple field trials,
DNA-based selection is a more precise index allowing faster genetic gain.
WHY GENES?
The advent of high-throughput genomic technologies opened new perspectives in the speed, scale and detail with
which genes, genomes and complex traits can be investigated. Genomics can have an impact on forest tree
improvement, if strongly interconnected with accurate phenotyping and advanced breeding. Literature reports on the
genetic architecture of quantitatively inherited wood property traits in Eucalyptus support the existence of a few QTLs
controlling 3-16% of the total phenotypic variation, suggesting that loci linked to traits of interest could be incorporated
into breeding programs through marker-assisted selection (Myburg et al. 2006).
The value of identifying candidate genes (CG) lies in the possibility of associating molecular information
(sequence) with phenotype expression (function), allowing the implementation of marker-assisted selection at the
population level (Pflieger et al. 2001). Putative CG may be obtained from sequence databases, using function
homologies (“functional CG”) or from differential gene expression screening in phenotypically contrasting tissues
(“expressional CG”). The strength of the recently developed DNA-array technology relates to the ability to analyze the
expression of hundreds of gene fragments (EST - expressed sequence tags) collected from different tissues,
simultaneously. In both cases, a CG is a sequence of hypothesized biological function that needs to be validated
(Rothschild and Soller 1997). The first step in the validation process is to map the CG in genetic linkage maps.
Genetic linkage maps establish a framework of molecular markers along the chromosomes,
which supports the estimation of the number, position and effect of the QTL that explain quantitative variation in
segregating populations. QTL detection allows the dissection of the continuous phenotypic variation into discrete
factors. Genes are the ultimate markers for genetic linkage mapping, as potentially useful loci may be identified
empirically from co-localization with QTL (Myburg et al. 2006).
As a result of a large-scale gene discovery program in two poplar species, Sterky et al. (1998) reported 3,719 unique EST
transcripts associated with wood-forming tissues, expected to represent approximately 2,200 genes. More than 50%
of these EST sequences shared significant similarity with proteins of known function from other organisms (i.e.
enzymes involved in the formation of lignin monomers and cellulose synthesis, proteins involved in cell wall
expansion-cross linking and modification of the fibre structure/composition, proteins related to hormone synthesis and
perception, cell cycle control and putative transcriptional activators).
The objective of the RAIZ gene discovery project is the identification of useful candidate genes for improved
selection efficiency of technological traits in the Eucalyptus globulus genetic improvement program. To
achieve this goal, a working strategy was designed to accumulate complementary evidence of significant
association between wood trait quantitative trait loci (QTL) and putative CG.
FROM THE FIELD TO DNA…
A model system was put together in order to associate phenotypic information with DNA sequence. Several
controlled crosses between elite parent trees with interesting technological characteristics were performed. The genetic
diversity of the parent trees was assessed with 38 SSR markers (Ribeiro et al. 2004a,b). Two families with around 500
individuals were selected for mapping and QTL detection. One of these families was cloned and installed in 2 field
locations. Paternity and ramet identity were verified with 13 SSR markers.
Pulp yield, the target trait of this project, is the result of a number of biochemical, cellular, developmental and adaptive
variables determined by the genotype and regulated by the environment. Since lignin (22%), glucose (54%) and xylose
(14%) are the 3 major components of Eucalyptus globulus wood that contribute to the production of chemical pulp
(Neto et al. 2005), they were also estimated. Technological pulp yield estimates, as well as lignin, glucose and xylose
contents, were obtained by Near Infrared Spectroscopy (FT-NIR Vector 22-N, Bruker). Sample preparation procedures and
model calibration are described in Mendes de Sousa et al. (2002). Each sample was analyzed twice. Spectra were
converted to their 2nd derivative and analyzed using the multivariate statistics software UNSCRAMBLER v7.5
(CAMO, Norway).
To identify functional candidate genes, we performed extensive searches using appropriate queries on
representative databases with restrictive sequence selection criteria (Carocha and Marques 2002a). Most of the
existing eucalypt gene sequences were accessible to us through Genolyptus (Brazil) and CNRS/UPS
(France). Additional EST sequences were isolated by the Forest Biotechnology Group (NCSU, USA).
Access to private databases with eucalypt sequences allowed us to identify orthologous functional candidate genes in
Eucalyptus (Carocha and Marques 2003a) and attempt additional putative function characterization (Carocha and
Marques 2003b). We have selected 163 functional CG covering 11 different functional classes (lignin, non-cellulosic
polysaccharides, cellulose, transcription factors, hormones, cell wall proteins, cellular matrix communication
proteins, amino acid metabolism, glucans, wood extractives and miscellaneous).
Together with a Southern European Research network joining academic and industrial partners (UMR CNRS UPS
5546, CIRAD and ENCE) we have used three normalized SSH cDNA libraries (E. gunnii xylem vs leaves, xylem vs
phloem and E. globulus juvenile vs mature) to put together a eucalypt wood unigene array with 586 gene fragments
putatively associated with wood formation in Eucalyptus. Methods are described in detail in Paux 2003, Paux et al.
2004 & 2005, and Foucart et al. 2006. Through the Genolyptus consortium we have access to 110,000 EST sequences
from fifteen (non-normalized) cDNA libraries from different eucalypt tissues (e.g. xylem, phloem, leaves) of several
species (e.g. E. grandis, E. urophylla, E. globulus and E. pellita). This resource was instrumental in the process of
identifying 503 non-redundant putative candidate genes in the eucalypt wood unigene array. This number roughly
represents 1/5 of the genes expected to participate in wood formation.
In the meantime, tension/opposite, juvenile/mature, optimal/limiting growing conditions, spring/winter and
fertilized/non-fertilized wood samples were collected in the field. These wood samples were characterized in
technological, chemical, biometrical and anatomical terms (Marques 2002; Marques et al. 2004) in order to select the
most appropriate tissues for differential gene expression experiments. With the support of Doctor Ana Pires' team from
the Math Department of Instituto Superior Técnico (Lisbon), macroarray hybridisation data were described and
interpreted in order to identify expressional CG with stable differential expression in high density and high pulp yield
wood tissues. A restricted group of 60 expressional CG was selected and given priority in terms of mapping and
functional characterization (Soares et al. 2006).
The Single Strand Conformation Polymorphism (SSCP) technology was used to reveal sequence polymorphism in the
selected candidate genes, in order to map them (Carocha and Marques 2002b). Available sequences of putative CG were
extended and suitable primers were designed for mapping (Carocha and Marques 2003c; Carocha et al. 2005). In addition to
around 90 candidate genes, the genetic linkage maps for the parent trees of the two selected pedigrees include over
150 microsatellite markers available in the literature. Maps have been constructed with Mapmaker/Exp 3.0 (Lincoln et
al. 1992). Some of these markers allowed us to establish homeologies between linkage groups of published genetic maps
from different species (reviewed in Myburg et al. 2006).
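As a rough illustration of what placing a marker on a linkage map involves, the sketch below estimates the recombination fraction between two markers from progeny counts and converts it to a map distance with the Haldane function. The source does not say which mapping function was selected in Mapmaker/Exp, so Haldane is an assumption here, and the function names and the example counts are hypothetical.

```python
import math


def recombination_fraction(recombinants, total):
    """Proportion of progeny that are recombinant between two markers."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need total > 0 and 0 <= recombinants <= total")
    return recombinants / total


def haldane_cm(r):
    """Haldane map distance in centimorgans for a recombination fraction r."""
    if not 0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    # d = -0.5 * ln(1 - 2r) in Morgans, multiplied by 100 for centimorgans
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical counts: 12 recombinant trees out of 100 genotyped progeny.
r = recombination_fraction(recombinants=12, total=100)
distance = haldane_cm(r)
print(f"r = {r:.2f}, Haldane distance = {distance:.1f} cM")
```

The Haldane function assumes no crossover interference; a different mapping function would give somewhat different distances for the same counts.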
Spatial analysis was pursued on an individual tree basis to account for environmental heterogeneity within the field trial
of the non-cloned family. A separable autoregressive process of order one was applied as a variance structure for
modelling the residual variation, according to the methodology described by Costa e Silva et al. (2001) and Dutkowski et
al. (2002) for forest genetic trials. Adjusted and non-adjusted pulp yield, lignin, glucose and xylose estimates were
used for QTL detection. QTL detection was carried out using single marker analysis followed by a multiple regression
approach. Corrections for multiple testing were applied in order to control the false discovery rate. An independent
validation check was run using permutation methods. All the computations were performed in the statistical software R.
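The detection steps described above can be sketched in a few lines. This is a simplified illustration, not the authors' R code: it scores each marker by the difference in trait means between two genotype classes, obtains a p-value by permutation, and applies the Benjamini-Hochberg adjustment to control the false discovery rate. The 0/1 marker coding, the marker names, the toy pulp yield values and the use of permutation p-values in the single-marker step are all assumptions made for the example, and the multiple regression step is omitted.

```python
import random


def mean_difference(genotypes, trait):
    """Difference in trait means between marker classes 1 and 0."""
    class0 = [y for g, y in zip(genotypes, trait) if g == 0]
    class1 = [y for g, y in zip(genotypes, trait) if g == 1]
    return sum(class1) / len(class1) - sum(class0) / len(class0)


def permutation_pvalue(genotypes, trait, n_perm=999, seed=0):
    """Two-sided permutation p-value for the single-marker mean difference."""
    rng = random.Random(seed)
    observed = abs(mean_difference(genotypes, trait))
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any marker-trait association
        if abs(mean_difference(genotypes, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy data: 12 trees, pulp yield in %, two markers coded 0/1.
pulp_yield = [50, 51, 49, 50, 52, 51, 55, 56, 54, 55, 57, 56]
markers = {
    "marker_linked": [0] * 6 + [1] * 6,
    "marker_unlinked": [0, 1] * 6,
}
names = list(markers)
pvals = [permutation_pvalue(markers[name], pulp_yield) for name in names]
qvals = dict(zip(names, benjamini_hochberg(pvals)))
for name in names:
    print(name, round(qvals[name], 3))
```

In the toy data the first marker separates low and high pulp yield trees, so its adjusted p-value is small, while the second marker does not.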
…QTL ON MAPS WILL LEAD US TO THE CANDIDATE GENES…
Pulp yield represents the amount of fiber produced from a basis of 100% wood. Chemical pulp is essentially made of
glucose (45%) and xylose (11%), since lignin (1.3%) is removed during the process of kraft cooking and bleaching
(Neto et al. 2005). In practically all cases, whenever a marker significantly increases xylose, it also affects glucose.
Moreover, QTL for pulp yield are almost always associated with QTL for lignin, xylose and glucose. These data reflect
the biological meaning of the statistical associations detected.
We are currently genotyping more markers in the chromosome areas that influence pulp yield, in order to reduce the
QTL intervals, narrow down the number of candidate genes for further studies and estimate the QTL effect in a more
precise way. In order to add critical mass along this line of work, we have established a collaboration with Doctor Jorge
Paiva (ITQB/IBET), who submitted a post-doctoral project to FCT aiming to identify and characterise the genomic
region that underlies the most interesting wood property QTLs in E. globulus, using a map-based cloning approach. A
complementary functional genomics approach was put together in collaboration with Doctor Rita Teixeira (Coimbra
University) in the framework of an “icentro” project. These projects will take advantage of the availability of the E.
grandis complete genome sequence, that we expect to be publicly released by the Joint Genome Institute (USA) in 2008
(http://www.jgi.doe.gov/News/news_6_8_07.html), and also of the genomic resources made available within the
framework of the International Eucalyptus Genome Consortium (IEuGC, www.ieugc.up.ac.za).
Once we have the smallest possible QTL-delimited area, we can study the gene content of that portion of sequence.
Within the framework of an ERANET European project (Eucalyptus genomics research network for improved wood
properties and adaptation to drought), we will return to differential gene expression experiments focusing on the genes
present in the QTL-delimited sequence, using RNA from individuals with extreme phenotypes at the population level,
in order to reduce the number of candidate genes.
…FROM THE CANDIDATE GENES BACK TO THE FIELD
The next step will be testing for association of gene alleles with phenotyped individuals at the population level and/or
functional analysis (for example in model plants) in order to confirm the phenotypic effect of the target genes.
We have sequenced genomic segments from three genes (cinnamyl-alcohol dehydrogenase, CAD; ferulate-5-hydroxylase,
F5H; and S-adenosylmethionine synthase, SAMS) in E. globulus (17 genotypes of Portuguese origin and
24 genotypes from 12 provenances in continental Australia) (Kirst et al. 2005; Marques et al. 2005). The choice of
wood trait CG for this work resulted from previous work performed at NCSU in an E. grandis × E. globulus pedigree.
The nucleotide diversity was very high for all genes, and LD extended in general over 500-1000 bp, indicating that by
sampling multiple gene regions we might be able to identify polymorphisms in LD with quantitative trait nucleotides.
This study supported the feasibility of association studies in Eucalyptus. With the support of a post-doctoral grant from
FCT, João Costa e Silva is currently investigating the genetic structure of a population of trees originating from the RAIZ
breeding program, in order to identify a subset of unrelated trees with ample variation for wood property traits. He will
identify and assess the diversity of single-nucleotide polymorphisms, examine linkage disequilibrium, and evaluate
haplotype diversity and structure of selected CG in this subset of trees. Associations between allelic polymorphisms and
observed phenotypic variation will then be explored in the overall Portuguese population, and verified using an
Australian population.
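To make the linkage disequilibrium (LD) measure mentioned above concrete, the sketch below computes the common r² statistic between two biallelic sites. It assumes phased haplotypes given as strings of alleles, which is a simplification: the source sequenced diploid genotypes and does not describe how LD was estimated. The function name and the example haplotypes are hypothetical.

```python
def ld_r_squared(haplotypes, site_a, site_b):
    """r^2 between two biallelic sites, from phased haplotype strings."""
    n = len(haplotypes)
    # Use the alleles of the first haplotype as the reference alleles.
    ref_a = haplotypes[0][site_a]
    ref_b = haplotypes[0][site_b]
    p_a = sum(h[site_a] == ref_a for h in haplotypes) / n
    p_b = sum(h[site_b] == ref_b for h in haplotypes) / n
    p_ab = sum(h[site_a] == ref_a and h[site_b] == ref_b for h in haplotypes) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / denom


# Hypothetical haplotypes over two sites.
complete_ld = ld_r_squared(["AC", "AC", "GT", "GT"], 0, 1)
no_ld = ld_r_squared(["AC", "AT", "GC", "GT"], 0, 1)
print(complete_ld, no_ld)
```

Plotting r² against the distance between pairs of sites is the usual way to see over how many base pairs LD persists within a gene.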
ACKNOWLEDGMENTS TO OUR COLLABORATORS AND COLLEAGUES
Doctor Nuno Borralho
Engs. J.L. Amaral, Mendes de Sousa and Dra Fernanda Paula
Dras Carla Ribeiro, Marta Melo, Engs. Teresa Baptista da Silva, Adelina Jerónimo, Eduarda Coutinho
Lab technicians Fátima Cunha, Catarina Grilo
Students Eduardo Costa, Ana Rita Pereira de Sá, Ricardo Rocha, Dra Catarina Soares (IST, Lisbon)
Field technicians José Cardoso, Luís Ferreira, Paulo Silva
Doctor J. Grima-Pettenati team (CNRS/UPS, Toulouse)
Doctor P. Vigneron team (CIRAD, Montpellier)
Dr. Roberto Astorga (ENCE, Navia)
The Genolyptus network (Brazil)
Doctors Phil Jackson and Jorge Paiva (ITQB/IBET, Oeiras)
Doctor João Costa e Silva (IST, Lisbon)
Doctor Rita Teixeira (Coimbra University)
Doctor Santiago González Martínez (INIA, Madrid)
Doctor Matias Kirst (University of Florida, USA)
Doctor René Vaillancourt team (University of Tasmania, Australia)
REFERENCES
Carocha, V.J. and Marques, C.M. (2002a) RAIZ Report (D2, PT036).
Carocha, V.J. and Marques, C.M. (2002b) RAIZ Report (E2-2 (1), PT036).
Carocha, V.J. and Marques, C.M. (2003a) RAIZ Report (D3, PT036).
Carocha, V.J. and Marques, C.M. (2003b) RAIZ Report (E2-4, PT036).
Carocha, V.J. and Marques, C.M. (2003c) RAIZ Report (E2-2 (2), PT036).
Carocha, V.J., Melo, M., Cunha, F. and Marques, C.M. (2005) RAIZ Report (D9, PT036).
Costa e Silva, J., Dutkowski, G.W. and Gilmour, A.R. (2001) Can J For Res 31: 1887-1893.
Dutkowski, G.W., Costa e Silva, J., Gilmour, A.R. and Lopez, G.A. (2002) Can J For Res 32: 2201-2214.
Foucart, C., Paux, E., Ladouce, N., San-Clemente, H. and Grima-Pettenati, J. (2006) New Phytologist 170(4): 739-752.
Kirst, M., Marques, C.M. and Sederoff, R. (2005) IUFRO-Tree Biotechnology, South Africa, Pretoria. (Poster S5.28p).
Lincoln, S., Daly, M. and Lander E. (1992) Whitehead Institute Technical Report.
Marques, C.M. (2002) RAIZ Report (D1, PT036).
Marques, C.M., Mendes de Sousa, A.P., Carocha, V.J., Araújo, J.A. and Borralho, N. (2004) RAIZ Report (E1-6,
PT036).
Marques, C.M., Carocha, V.J. and Kirst, M. (2005) RAIZ Report (D19, PT036).
Mendes de Sousa, A.P., Baptista, M.D’A. and Amaral, J.L. (2002) Proceedings of the XIII Tecnicelpa National
Meeting: 307-321.
Myburg, A.A., Potts, B.M., Marques, C.M., Kirst, M., Gion, J.M., Grattapaglia, D. and Grima-Pettenati, J. (2006) In
“Genome Mapping & Molecular Breeding in plants”. Vol. 7: Forest Trees. Ed C.R. Kole. Springer,
Heidelberg, Berlin, New York, Tokyo.
Neto, C.P., Evtuguin, D., Pinto, P., Silvestre, A. and Freire, C. (2005) Proceedings of the XIX Tecnicelpa National
Meeting, Tomar, Portugal: 59-70.
Paux, E. (2003) Activity report UMR CNRS/UPS 5546. Meeting Castanet Tolosan, France.
Paux, E., M’Barek, T., Ladouce, N., Sivadon, P. and Grima-Pettenati, J. (2004) Plant Mol Biol 55(2): 263-280.
Paux, E., Carocha, V., Marques, C.M., Mendes de Sousa, A., Borralho, N., Sivadon, P. and Grima-Pettenati, J. (2005)
New Phytologist 167(1): 89-100.
Pflieger, S., Lefebvre, V. and Causse, M. (2001) Mol Breeding 7: 275-291.
R Development Core Team (2006) Vienna: R Foundation for Statistical Computing. URL http://www.R-project.org.
Ribeiro, C., Carocha, V.J. and Marques, C.M. (2004a) RAIZ Report (PT036).
Ribeiro, C., Cunha, F. and Marques, C.M. (2004b) RAIZ Report (PT036).
Rothschild, M.F. and Soller, M. (1997) Probe 8: 13-20.
Soares, C., Pires, A., Carocha, V.J., Cunha, F. and Marques, C.M. (2006) RAIZ Report (D8, PT036).
Sterky, F. et al. (1998) PNAS 95: 13330-13335.