Physiologia Plantarum 139: 1–12. 2010
Copyright © Physiologia Plantarum 2009, ISSN 0031-9317

TECHNICAL FOCUS

Analysis of expressed sequence tags from the Huperzia serrata leaf for gene discovery in the areas of secondary metabolite biosynthesis and development regulation

Hongmei Luo^a, Chao Sun^a, Ying Li^a, Qiong Wu^a, Jingyuan Song^a, Deli Wang^b, Xiaocheng Jia^a, Rongtao Li^b and Shilin Chen^a,c,*

a Institute of Medicinal Plant Development (IMPLAD), Chinese Academy of Medical Sciences & Peking Union Medical College, No. 151, Malianwa North Road, HaiDian District, Beijing 100193, China
b Hainan Branch Institute of Medicinal Plant Development (HBIMPLAD), Chinese Academy of Medical Sciences & Peking Union Medical College, XingLong Town, Wanning County, Hainan 572522, China
c Hubei College of Traditional Chinese Medicine, Hongshan District of Wuhan City, Huangjia Lake West Road on the 1st, Hubei Province, 430065, China

Correspondence
*Corresponding author, e-mail: [email protected]

Received 19 October 2009; revised 27 November 2009
doi:10.1111/j.1399-3054.2009.01339.x

Huperzia serrata produces various types of lycopodium alkaloids, notably huperzine A (HupA), a promising drug candidate for Alzheimer's disease. Despite the medicinal importance of H. serrata, little genomic or transcriptomic data are available in the public databases. A cDNA library was therefore generated from RNA isolated from the leaves of H. serrata. A total of 4012 clones were randomly selected from the library, and 3451 high-quality expressed sequence tags (ESTs) were assembled to yield 1510 unique sequences with an average length of 712 bp. The majority (79.4%) of the unique sequences were assigned putative functions based on BLAST searches against the public databases. According to GO and KEGG assignments, these unique sequences covered a broad set of molecular functions, biological processes and biochemical pathways.
The transcripts involved in the secondary metabolite biosynthesis of alkaloids, terpenoids and flavone/flavonoids, such as cytochrome P450, lysine decarboxylase (LDC), flavanone 3-hydroxylase, squalene synthetase and 2-oxoglutarate 3-dioxygenase, were well represented by 34 unique sequences in this EST dataset. The corresponding peptide sequence of the LDC contained the Pfam 03641 domain and was annotated as a putative LDC. Unique sequences encoding transcription factors, phytohormone biosynthetic enzymes and signaling components were also found in this EST collection. In addition, a total of 501 potential SSR-motif microsatellite loci were identified in 393 H. serrata leaf unique sequences. This set of nonredundant ESTs and the molecular markers obtained in this study will establish valuable resources for a wide range of applications, including gene discovery and identification, genetic mapping and analysis of genetic diversity, cultivar identification and marker-assisted selection in this important medicinal plant.

Abbreviations – BLAST, Basic Local Alignment Search Tool; CYP450, cytochrome P450-dependent monooxygenase; ESTs, expressed sequence tags; GA, gibberellin; GO, Gene Ontology; HupA, huperzine A; KEGG, Kyoto Encyclopedia of Genes and Genomes; LDC, lysine decarboxylase; NCBI, National Center for Biotechnology Information; SSR, simple sequence repeat.

Physiol. Plant. 139, 2010

Introduction

Huperzia serrata (Thunb.) Trev. is a member of the Huperziaceae family. The Huperziaceae are among the oldest medicinally important vascular plants (Ching 1978). The whole plant of H. serrata is named Qian Ceng Ta and has been used in Chinese folk medicine for the treatment of contusions, strains, swellings, schizophrenia, myasthenia gravis and organophosphate poisoning (Ma et al. 2006). H. serrata produces various types of lycopodium alkaloids, including lycopodines, lycodines and fawcettimines.
Some of these alkaloids are valuable for pharmaceutical applications (Ma and Gang 2004). In particular, the lycodine HupA has been used in China as an anti-Alzheimer's disease drug candidate because it selectively inhibits acetylcholinesterase, and in the USA as a dietary supplement (Liu et al. 1986, Ma et al. 2006, Tang 1996). The amount of HupA in the H. serrata plant differs across tissues: the highest content was found in the leaves, followed by the stems, and the lowest levels were in the roots and sporangia (Ma et al. 2005). H. serrata plants grow very slowly in specific habitats and normally need approximately 15–20 years to reach maturity after spore germination (Ma et al. 2006). The plants are widely distributed along the Yangtze River and throughout the southern parts of China, usually in tropical or subtropical habitats (Ma et al. 2006). The whole plant body of H. serrata is harvested for HupA collection when the sporophytes reach a height of 5–15 cm (Ma et al. 2006). The plants are currently in danger of extinction in China because of extensive collection for the production of HupA (Ma et al. 2007).

Many studies have investigated the proposed biosynthetic pathway in which pelletierine is coupled with 4PAA to form HupA and the related lycopodium alkaloids (Comins and Al-awar 1995, Ma and Gang 2004, Nyembo et al. 1978). However, no enzymes that might be involved in the biosynthesis of lycopodium alkaloids have been identified in plants of the Huperziaceae family. Thus, the biosynthetic processes leading to the production of HupA have not been elucidated in the Huperziaceae. Investigations of the natural products in H. serrata show that secondary metabolites including triterpenoids and flavone/flavonoids also accumulate in this medicinal plant (Yang et al. 2008, Zhou et al. 2003, 2004, Zhu et al. 1994).
Several recent studies identified polyketide synthase 1 (PKS1) and its corresponding gene from H. serrata. PKS1 is a novel type III polyketide synthase and shows unusually broad catalytic promiscuity, producing various aromatic tetraketides (Morita et al. 2007, Wanibuchi et al. 2007). However, as of January 2009, there were only 10 nucleotide sequences from H. serrata available in the NCBI database. The limited information on the genetic content of this plant prompted our efforts to construct a cDNA library from the H. serrata leaf. The objective of this study was to identify the functional genes in H. serrata, especially those involved in the biosynthesis of secondary metabolites, by expressed sequence tag (EST) analysis.

EST analysis is a cost-effective and rapid tool for the isolation of genes. It has provided a means of identifying novel genes and characterizing the transcriptome in various tissues (Adams et al. 1991, Paraoan et al. 2000, Shu et al. 2009). EST sequencing has also been used to establish phylogenetic relationships and to identify simple sequence repeats (SSRs), which are useful markers for creating genetic maps in plants (Morgante et al. 2002). This technique has also led to the discovery of genes involved in the biosynthesis of secondary metabolites (Ohlrogge and Benning 2000). Recently, genes encoding enzymes involved in the biosynthesis of ginsenoside (Jung et al. 2003), monoterpenoid indole alkaloids (Murata et al. 2006), triterpene saponin (Suzuki et al. 2002) and diterpenes (Brandle et al. 2002) were identified using the EST gene discovery approach.

This report describes the first EST analysis of the H. serrata leaf and identifies several candidate transcripts with significant sequence similarities to cytochrome P450s and lysine decarboxylase (LDC), which may be involved in HupA biosynthesis. Other functional transcripts that might be associated with H. serrata developmental regulation and involved in serratane (triterpenoid) and flavone/flavonoid biosynthesis are also discussed. The identification of SSRs in the sequence data will be useful in marker-assisted breeding programs. This study presents for the first time a profile of expressed genes based on the EST analysis of H. serrata. The EST analysis and information will contribute greatly to a better understanding, at the molecular level, of secondary metabolite biosynthesis, developmental regulation and marker-assisted selection in H. serrata and the related species in the Huperziaceae family.

Materials and methods

RNA extraction and cDNA library construction

H. serrata plants that had reached a height of 10–12 cm (grown in the wild for about 10 years) were collected at Bawangling, at an altitude of 1320 m (109°10′ E, 19°7′ N), in Hainan Province on November 3, 2008. The plants were authenticated by Professor Yu-Lin Lin of the Institute of Medicinal Plant Development (IMPLAD), Chinese Academy of Medical Sciences. The whole plants were collected and rinsed with water 5–8 times. The plants, including the leaves, were then dried gently and quickly with absorbent paper. The cleaned leaves (1–3 mm petiole) were isolated from the plants, frozen immediately in liquid nitrogen and stored at −70 °C until RNA isolation. Total RNA was extracted from 0.5 g of leaves using the RNeasy plant kit (BioTeke, Beijing, China). RNA concentration was measured using a GeneQuant100 spectrophotometer (GE Healthcare, Chalfont St Giles, UK). RNA quality was tested on ethidium bromide-stained agarose gels. Library construction was performed using the Creator™ SMART™ cDNA library construction kit (Clontech, Mountain View, CA) in accordance with the manufacturer's recommendations.
Size-selected double-stranded cDNA (fragments >500 bp) was directionally ligated into the Sfi I restriction site of the pDNR-lib vector (Clontech, Mountain View, CA) and electroporated into a DH5α E. coli strain (TaKaRa, Shiga, Japan) using an ECM630 electroporator (NatureGene Corp., Medford, NJ).

EST sequencing, assembly and annotation

Randomly selected clones were cultured in liquid LB medium containing 34 mg l⁻¹ chloramphenicol and incubated overnight at 220 rpm and 37 °C. Plasmid DNA was prepared using an Axyprep-96 Plasmid Kit (Axygen, Union City, CA). Bacterial clones were prepared for sequencing using the BigDye 3.1 sequencing chemistry and then sequenced from the 5′ end using the M13 forward primer and an ABI3730 DNA sequencer. The ABI-formatted chromatogram sequences were processed automatically using a local EST analysis pipeline, in which base calling was performed with the Phred algorithm. High-quality EST sequences were obtained after vector, low-quality and short sequences (<100 bp) were removed and the polyA/T tails were trimmed using Cross_match. These ESTs were assembled into contigs (clusters of assembled ESTs) and singletons (sequences found only once) by the Phrap program.

The unique sequences were searched against the SwissProt database (released in December 2008) using the BLASTX algorithm with an E-value cutoff of 10⁻⁵. Unique sequences that did not match any sequence in the SwissProt database were used to search the NCBI non-redundant protein (nr) database using BLASTX (E-value <10⁻⁵). The unique sequences that remained unmatched after both BLASTX analyses were then used to search the NCBI non-redundant nucleotide (nt) database using BLASTN (E-value <10⁻⁵). The integrated protein domain recognition program InterProScan was run locally to search the translated unique consensus sequences against all of the InterPro protein domains.
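The three-tier search order described above (SwissProt, then nr, then nt, each with an E-value threshold of 10⁻⁵) can be sketched as a simple precedence rule. This is only an illustration under stated assumptions: the function name, the dictionary layout of best E-values and the sequence identifiers are hypothetical, and no BLAST program is actually run here.

```python
from typing import Dict, Iterable, Optional

E_VALUE_CUTOFF = 1e-5                      # threshold stated in the text
DATABASE_ORDER = ["SwissProt", "nr", "nt"]  # search order stated in the text


def assign_annotation_source(
    seq_ids: Iterable[str],
    best_evalues: Dict[str, Dict[str, float]],
    cutoff: float = E_VALUE_CUTOFF,
) -> Dict[str, Optional[str]]:
    """Return, for each sequence, the first database with a hit below the cutoff.

    best_evalues maps database name -> {sequence id -> best E-value}.
    This layout is an assumption made for the sketch, not a BLAST output format.
    """
    assignments: Dict[str, Optional[str]] = {}
    for seq_id in seq_ids:
        assignments[seq_id] = None  # stays None if no database gives a hit
        for database in DATABASE_ORDER:
            evalue = best_evalues.get(database, {}).get(seq_id)
            if evalue is not None and evalue < cutoff:
                assignments[seq_id] = database
                break  # later databases are only consulted for unmatched sequences
    return assignments


# Hypothetical identifiers and E-values, for illustration only.
toy_hits = {
    "SwissProt": {"contig1": 2e-79},
    "nr": {"contig1": 1e-90, "contig2": 3e-12},
    "nt": {"singleton7": 1e-8, "singleton9": 0.02},
}
toy_assignments = assign_annotation_source(
    ["contig1", "contig2", "singleton7", "singleton9"], toy_hits
)
print(toy_assignments)
```

In the procedure described in the text, only unmatched sequences are submitted to the next database; the sketch reproduces the same outcome by giving earlier databases precedence.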
The functional categories of these unique sequences were further identified using the Gene Ontology (GO) Database (Ashburner et al. 2000), based on the existing mappings of InterPro domains to the GO hierarchy. The biochemical pathway assignments were performed according to the Kyoto Encyclopedia of Genes and Genomes (KEGG) mapping (http://www.genome.ad.jp/kegg/kegg2.html). Enzyme commission (EC) numbers were assigned to the unique sequences based on BLASTX searches of protein databases with a cutoff value of E <10⁻⁵.

SSR detection

SSRs were detected in the full set of H. serrata leaf unique sequences using the Simple Sequence Repeat Identification Tool (SSRIT) (http://www.gramene.org/db/markers/ssrtool). SSRIT accepts FASTA-formatted sequence files and reports the sequence ID, SSR motif, number of repeats (di-, tri-, tetra-, penta- or hexa-nucleotide repeat units), repeat length, position of the SSR and the total length of the sequence in which the SSR was found (Temnykh et al. 2001). The frequencies of repeat classes (e.g. di-, tri-, tetra-, penta- or hexa-nucleotide) were combined by type; for example, GA repeats also encompassed repeats identified as AG and their complementary sequences TC or CT. The maximum motif length was set to hexamer and the minimum number of repeats was set to five.

[Accession numbers: The EST data reported in this paper are available in the GenBank databases under Accession Nos. GO248777-GO248876 and GO911766-GO915116.]

Results and discussion

Construction and general characteristics of the H. serrata leaf cDNA library

To identify the genes and expression profiles involved in the cellular development of the H. serrata leaf and in the biosynthesis of secondary metabolites in the leaf, a cDNA library was constructed from leaves of H. serrata. This library had a titer of 4.5 × 10⁵ colony-forming units per milliliter.
A total of 4012 cDNA clones were randomly chosen from the library for sequencing, generating 3451 high-quality ESTs with an average sequence length of 685 bp after base calling and removal of vector, short sequences and low-quality sequences (Table 1). The size distribution of EST length without vector sequences in the H. serrata leaf cDNA library is given in Fig. 1. These high-quality ESTs were assembled into 394 contigs and 1116 singletons by a cluster analysis, yielding a total of 1510 unique sequences with an average length of 712 bp (Table 1). The contigs were composed of multiple ESTs, ranging from 2 to 75, with sequence lengths between 140 and 1513 bp. About 13.5% of the unique sequences were contigs consisting of two ESTs, followed by 7.2% with 3–5 ESTs and 5.4% with 6–70 ESTs. Redundancy ranged from one contig with 163 sequences to 204 contigs with two sequences, and 190 contigs with three or more sequences. The average GC content of these high-quality ESTs was 46% (Table 1), which is higher than that found in Arabidopsis ESTs (43.4%) (Asamizu et al. 2000). This EST dataset provides the first available information about the H. serrata leaf transcriptome.

Table 1. Overview of the results from H. serrata leaf cDNA library.
a The unique sequences were annotated by BLAST analysis against the public databases.

Description                                    Number
Total number of clones sequenced               4012
Total number of high quality ESTs              3451
Average length per ESTs (bp)                   685
G+C content (%)                                46.0
Total number of unique sequences               1510
Average length per unique sequence (bp)        712
Number of contigs                              394
Number of singletons                           1116
Number of annotated unique sequences (a)       1225
Number of non-annotated sequences              285

Fig. 1. The size distribution of EST length without vector sequences in H. serrata leaf cDNA library. [Histogram; x-axis: EST length (bp); y-axis: Percentage (%).]

Annotation of unique sequences by BLAST analysis

The unique sequences were used in similarity analysis against the public databases. A total of 768 (50.86%) unique sequences had significant hits (E-value <10⁻⁵) to sequences in the SwissProt database using BLASTX. Of the remaining 742 unannotated unique sequences, 431 were homologous to sequences in the NCBI non-redundant protein (nr) database by BLASTX analysis. Finally, of the 311 still unannotated sequences, 26 showed similarities to sequences in the NCBI non-redundant nucleotide (nt) database using the BLASTN algorithm. Together, 1225 (81.1%) unique sequences were assigned putative identities based on significant sequence similarities to at least one sequence in the NCBI Protein or DNA databases (Table 1). The other 285 (18.9%) unique sequences showed no similarities to any sequences in the public databases and likely represent novel transcripts (Table 1).

Functional categories of the unique sequences by GO analysis

Putative functions were assigned to 700 (46.36%) unique sequences in the cellular component, molecular function and biological process categories by GO analysis (Fig. 2). When mapped against the cellular component GO terms, 283 unique sequences (40.4%) were associated with each of the 'Cell' and 'Cell part' terms. A total of 155 unique sequences (22.1%) were assigned to the 'Macromolecular complex' (Fig. 2A). In the molecular function category, the majority of unique sequences were assigned to 'Catalytic activity' (279 unique sequences, 39.9%) and 'Binding activity' (257 unique sequences, 36.7%) (Fig. 2B). When mapped against the biological process category, 381 unique sequences (54.4%) were involved in 'Metabolic processes', 354 unique sequences (50.6%) were involved in 'Cellular processes' and 31 unique sequences (4.4%) were involved in 'Responses to stimuli' (Fig. 2C). These results provide the very first global gene expression profile of the H. serrata leaf.
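The tiered BLAST annotation counts reported in this section can be cross-checked with simple bookkeeping. The sketch below uses only numbers stated in the text (1510 unique sequences; 768, 431 and 26 hits in successive searches); the function and variable names are illustrative assumptions.

```python
TOTAL_UNIQUE = 1510  # total unique sequences (Table 1)

# (database, number of newly annotated unique sequences), in search order
TIER_HITS = [
    ("SwissProt (BLASTX)", 768),
    ("nr (BLASTX)", 431),
    ("nt (BLASTN)", 26),
]


def tally_tiers(total, tiers):
    """Walk through the tiers, tracking how many sequences enter each search."""
    remaining = total
    rows = []
    for database, hits in tiers:
        rows.append((database, remaining, hits))  # (name, searched, annotated)
        remaining -= hits
    annotated = total - remaining
    return rows, annotated, remaining


rows, annotated, unannotated = tally_tiers(TOTAL_UNIQUE, TIER_HITS)
for database, searched, hits in rows:
    print(f"{database}: searched {searched}, annotated {hits}")
print(f"annotated: {annotated} ({100 * annotated / TOTAL_UNIQUE:.1f}%)")
print(f"unannotated: {unannotated} ({100 * unannotated / TOTAL_UNIQUE:.1f}%)")
```

The intermediate totals (742 and 311 sequences entering the nr and nt searches) and the final split of 1225 annotated versus 285 unannotated sequences agree with the figures given in the text and in Table 1.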
Functional classification based on KEGG analysis

The unique sequences were assigned to biochemical pathways described in KEGG based on their EC numbers. A total of 1058 unique sequences (70.1%) showed sequence similarities to genes in the KEGG database. Of these, only 332 unique sequences (31.4%) were assigned EC numbers and mapped to 89 unique KEGG biochemical pathways.

Fig. 2. Distribution of GO-based cellular component (A), molecular function (B) and biological process (C) for unique sequences from H. serrata leaf cDNA library. [Bar charts; x-axis: Percentage of total unique sequences (%).]

Table 2. Mappings of H. serrata unique sequences to KEGG biochemical pathways.
a Percentage based on unique sequences (1058) with significant similarities to sequences in KEGG database.
b Unassigned unique sequences are those that have significant similarities to known sequences in KEGG database whose functions are unclear.

KEGG categories represented                      Unique sequences (no. of enzymes)    Percentage (a) (%)
Metabolism                                       196 (131)                            18.5
  Amino acid metabolism                          15 (12)                              1.4
  Biosynthesis of secondary metabolites          11 (9)                               1.0
  Carbohydrate metabolism                        36 (23)                              3.4
  Energy metabolism                              68 (41)                              6.4
  Glycan biosynthesis and metabolism             6 (6)                                0.6
  Lipid metabolism                               31 (20)                              2.9
  Metabolism of cofactors and vitamins           3 (2)                                0.3
  Metabolism of other amino acids                7 (5)                                0.7
  Nucleotide metabolism                          4 (4)                                0.4
  Xenobiotics biodegradation and metabolism      15 (9)                               1.4
Genetic information processing                   105 (13)                             9.9
  Folding, sorting and degradation               30 (8)                               2.8
  Replication and repair                         2 (2)                                0.2
  Transcription                                  6 (1)                                0.6
  Translation                                    67 (2)                               6.3
Environmental information processing             12 (3)                               1.1
  Membrane transport                             7 (1)                                0.7
  Signal transduction                            5 (2)                                0.5
Cellular processes                               9 (1)                                0.8
  Cell growth and death                          1 (0)                                0.1
  Cell motility                                  2 (0)                                0.2
  Endocrine system                               6 (1)                                0.6
Human diseases                                   10 (6)                               1.0
  Cancers                                        2 (1)                                0.2
  Infectious diseases                            2 (1)                                0.2
  Neurodegenerative disorders                    6 (4)                                0.6
Unassigned (b)                                   726                                  68.6

The KEGG metabolic pathways that were well represented by the 196 unique sequences (18.5%) of H. serrata included energy metabolism (41 enzymes), carbohydrate metabolism (23 enzymes), lipid metabolism (20 enzymes), amino acid metabolism (12 enzymes), the biosynthesis of secondary metabolites (9 enzymes) and xenobiotics biodegradation and metabolism (9 enzymes) (Table 2). For the biosynthesis pathways of secondary metabolites, the related EC numbers are listed in Appendix S1, Supporting information. These enzymes are involved in alkaloid biosynthesis, terpenoid and diterpenoid biosynthesis, and phenylpropanoid and flavone/flavonoid biosynthesis. A total of 105 unique sequences (9.9%) belonged to the genetic information processing category, which includes folding, sorting and degradation (8 enzymes), replication and repair (2 enzymes), transcription (1 enzyme) and translation (2 enzymes) (Table 2). The KEGG pathways of environmental information processing included membrane transport (1 enzyme) and signal transduction (2 enzymes) (Table 2).
The smallest number of unique sequences (9 unique sequences, 1 enzyme) mapped to the cellular processes category (Table 2). Additionally, most of the unique sequences (726 unique sequences, 68.6%) remained unassigned to any known biochemical pathway (Table 2).

Highly expressed transcripts in the H. serrata leaf library

An abundant representation of a specific sequence in a cDNA library generally correlates with a high level of expression in the original biological sample (Audic and Claverie 1997). The transcripts with the highest levels of expression were represented by more than 15 ESTs in the H. serrata leaf cDNA library (Table 3). These transcripts were mostly involved in redox reactions, metabolism, phytohormone responses and photosystem reactions, and some of them represent 'house-keeping' genes. Among the ESTs with assigned functions, thioredoxin, photosystem polypeptide, glutathione S-transferase and pathogenesis-related protein were abundant in this EST collection. A unique sequence consisting of 56 ESTs showed sequence similarity to H-type thioredoxin, which is an important regulatory element in plant metabolism (Schürmann and Jacquot 2000). A number of metabolism-related transcripts encoding the cytochrome b559 subunit alpha (13 ESTs), mannitol dehydrogenase (14 ESTs) and fructose-bisphosphate aldolase (13 ESTs) were also highly expressed (data not shown).

Table 3. Assembled clusters that contain more than 15 ESTs in H. serrata leaf.

No. of ESTs    BLASTX annotation
75             Unnamed protein product
56             Thioredoxin H-type
47             Photosystem II 10 kDa polypeptide
36             Pathogenesis-related protein 5 precursor
29             Auxin-repressed 12.5 kDa protein
26             Chlorophyll a-b binding protein 6A
21             Glutathione S-transferase ERD13
20             Glutathione S-transferase ERD13
17             BRASSINOSTEROID INSENSITIVE 1-associated receptor kinase 1 precursor
17             Germin-like protein subfamily 1 member 7
17             Endo-1,3;1,4-beta-D-glucanase precursor
16             Photosystem I reaction center subunit N
Many unique sequences matched glutathione S-transferase ERD13, which likely plays a role in the detoxification of toxic compounds formed during stress (Oono et al. 2003). Other transcripts encoded pathogenesis-related protein 5 (36 ESTs) and germin-like protein (17 ESTs), which may function in defense responses (Doll et al. 2003). The unique sequences associated with the phytohormone response and signal transduction, such as the auxin-repressed 12.5 kDa protein (29 ESTs) and the brassinosteroid insensitive 1-associated receptor kinase 1 (17 ESTs), were expressed at high levels. Additionally, several unique sequences encoding well-documented photosynthesis-related proteins, which may be involved in photosystem reactions, were also abundant in this EST dataset (Table 3).

SSR detection

SSRs, also known as microsatellites, consist of short (1–6 bp) tandemly repeated sequences and have been shown to be one of the most powerful genetic marker systems in biology. A total of 501 potential SSR-motif microsatellite loci were identified in 393 H. serrata leaf unique sequences (see Appendix S2, Supporting information). Approximately 26% (393/1510) of the H. serrata leaf unique sequences contained one or more di-, tri-, tetra-, penta- or hexanucleotide SSRs. Di- and trinucleotide motifs accounted for 76.8 and 20.9% of the total, respectively (385 vs 105) (Table 4).

Table 4. Summary of di- and tri-nucleotide repeats in the unique sequences of H. serrata leaf.
a Number of the unique sequences containing SSRs.
b The relative percentage of the repeat compositions in di- and tri-nucleotide repeats, respectively.

Repeat composition             Number (a)    Percentage (b)
Dinucleotide
AC/CA/GT/TG                    48            12.5
AG/GA/CT/TC                    318           82.6
AT/TA                          15            3.9
CG/GC                          4             1.0
Total no. of dinucleotides     385           100
Trinucleotide
AAC/CAA/ACA/GTT/TTG/TGT        4             3.8
AAG/GAA/AGA/CTT/TTC/TCT        22            21.0
AAT/TAA/ATA/ATT/TTA/TAT        5             4.8
ACC/CAC/CCA/GGT/GTG/TGG        1             1.0
ACG/CGA/GAC/CGT/GTC/TCG        4             3.8
ACT/CTA/TAC/AGT/TAG/GTA        1             1.0
AGC/CAG/GCA/TGC/CTG/GCT        46            43.8
AGG/GGA/GAG/TCC/CTC/CCT        9             8.6
ATC/CAT/TCA/GAT/ATG/TGA        13            12.4
CCG/CGC/GCC/GGC/GCG/CGG        0             0
Total no. of trinucleotides    105           100

The relative frequency of repeats with different dinucleotide compositions was biased among the four possible repeat classes (Table 4). AG repeats were by far the most common dinucleotide repeat, constituting 82.6% of dinucleotide repeats. The bias toward AG repeats in the H. serrata leaf unique sequences was similar to the relative frequency in Arabidopsis (83%) (Zhang et al. 2004). The next most common dinucleotide repeats in H. serrata were AC repeats, with a relative frequency of 12.5% compared with 4% in Arabidopsis, followed by AT repeats at 3.9% in H. serrata compared with 8% in Arabidopsis. Although AT repeats are thought to be very abundant in the genomic sequences of plants (Lagercrantz et al. 1993), this was not the case for the H. serrata leaf unique sequences. CG repeats were rare, at 1.0% of dinucleotide repeats in H. serrata (Table 4); they are also poorly represented in Arabidopsis (0.14%).

Among the trinucleotide repeats, AGC/CAG/GCA/TGC/CTG/GCT was the largest repeat class (43.8%), followed by AAG/GAA/AGA/CTT/TTC/TCT (20.9%) and ATC/CAT/TCA/GAT/ATG/TGA (12.4%) (Table 4). The tetra- and hexanucleotides made up significantly lower proportions of the total (1.8 and 1.2%, respectively). The pentanucleotide repeats constituted only 0.8% of the total. The majority (77.1%) of the SSR-containing unique sequences had a single SSR per sequence, while 90 (22.9%) contained two or more putative SSRs per sequence. Just 44.7% of the repeats were between 9 and 14 bases in length and 30.1% of the repeats were longer than 20 nucleotides (see Appendix S2, Supporting information). This result may depend on the length of the unique sequences detected in this study.
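The SSR search settings described in this paper (motifs of two to six nucleotides, at least five tandem repeats, and repeat classes pooled with their rotations and complementary-strand forms) can be illustrated with a short regular-expression sketch. This is a stand-in written for illustration, not the SSRIT program itself; the function names and the toy sequence are assumptions, and SSRIT may differ in details such as how overlapping or compound repeats are reported.

```python
import re
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_periodic(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. AGAG or AAA)."""
    k = len(motif)
    return any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def canonical_class(motif):
    """Pool a motif with its rotations and reverse-complement rotations.

    For example GA, AG, TC and CT all map to the same class label, AG.
    """
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    rotations = [
        unit[i:] + unit[:i]
        for unit in (motif, reverse_complement)
        for i in range(len(unit))
    ]
    return min(rotations)  # alphabetically first form is used as the label


def find_ssrs(sequence, min_motif=2, max_motif=6, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    sequence = sequence.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # one motif of length k followed by at least (min_repeats - 1) copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_periodic(motif):
                continue  # already covered by a shorter motif length
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Toy sequence (hypothetical): six GA repeats and five CAG repeats.
toy_est = "CC" + "GA" * 6 + "TT" + "CAG" * 5 + "A"
ssrs = find_ssrs(toy_est)
class_counts = Counter(canonical_class(motif) for _, motif, _ in ssrs)
print(ssrs)
print(class_counts)
```

Pooling by `canonical_class` mirrors the grouping used in Table 4, where, for instance, AG/GA/CT/TC form one dinucleotide class and AGC/CAG/GCA/TGC/CTG/GCT form one trinucleotide class.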
Identification of candidate genes

This EST collection contains signatures of many genes involved in important traits in H. serrata. The unique sequences were grouped into functional categories, which made it easier to visualize the leaf transcriptome and accelerated the identification of candidate transcripts associated with secondary metabolite biosynthesis and developmental regulation. Because the lycopodium alkaloids, especially HupA, have been investigated extensively, transcripts related to the biosynthesis of lycopodium alkaloids were identified. We were also able to identify transcripts specific to the biosynthesis of triterpenoids and flavone/flavonoids in this EST dataset. In addition, transcripts associated with environmental responses, including the phytohormone biosynthesis and signal transduction pathways that may play key roles in regulating the development of H. serrata, were also discovered.

The biosynthesis of secondary metabolites

Lycopodium alkaloid biosynthesis

H. serrata, the original source of HupA, also produces various other lycopodium alkaloids. Several types of these alkaloids have valuable medical applications. However, the biosynthetic mechanisms of these secondary metabolites in H. serrata remain unclear. Lycopodium alkaloids originate from the coupling of pelletierine and 4PAA/4PAACoA (Castillor et al. 1970, Ma and Gang 2004). Initially, L-lysine is decarboxylated by LDC to form cadaverine. Thus, LDC is the first enzyme that participates in lycopodium alkaloid biosynthesis. A unique sequence (GO914645) encoding a full-length LDC-like gene with unknown function was found in the EST collection and was named HsLDC (Table 5). The corresponding peptide sequence of HsLDC contained the Pfam 03641 domain, which defines a family including proteins annotated as putative LDCs (http://pfam.sanger.ac.uk/).
The members of this family share a highly conserved motif, PGGXGTXXE, that is probably functionally important (Fig. 3). Because LDC activity is rather low in higher plants, there are few reports of the purification and characterization of any plant LDC (Herminghaus et al. 1991). LDC activity could be regarded as one limiting factor in the synthesis of cadaverine-derived secondary metabolites (Herminghaus et al. 1991). Berlin et al. (1998) reported that the biosynthesis of phenylpropanoid-polyamine conjugates can be stimulated by overexpression of a heterologous bacterial LDC. Detection of the activity of HsLDC will facilitate the elucidation of lycopodium alkaloid biosynthesis in H. serrata.

In general, a total of 11 (0.7%) unique sequences in this library showed sequence similarities to uncharacterized enzymes that may be associated with alkaloid biosynthesis (Table 5). These transcripts may be involved in a series of catalytic processes, namely decarboxylation, oxidation, hydroxylation and methylation, mediated by decarboxylases, dioxygenases, cytochrome P450-dependent monooxygenases, methyltransferases and other enzymes, leading to the production of a series of precursors to HupA and related alkaloids. These transcripts will be further characterized for their biological functions in the biosynthesis of lycopodium alkaloids.

Table 5. H. serrata leaf unique sequences with significant sequence similarities to genes possibly involved in alkaloid (including HupA) biosynthesis.

GenBank accession no. | Sequence with highest similarity | E-value
Cytochrome P450
GO914428 | Cytochrome P450 71A1 (Persea americana) | 2.00E–33
GO912402 | Cytochrome P450 72A1 (Catharanthus roseus) | 2.00E–21
GO913165, GO913064 | Cytochrome P450 72A1 (Catharanthus roseus) | 4.00E–24
GO913720 | Cytochrome P450 74A (Arabidopsis thaliana) | 5.00E–58
GO914381, GO911886 | Cytochrome P450 77A1 (Solanum melongena) | 4.00E–80
GO911852 | Cytochrome P450 90A1 (Arabidopsis thaliana) | 1.00E–53
Decarboxylase
GO912010 | Diaminopimelate decarboxylase (Archaeoglobus fulgidus) | 5.00E–18
GO914645 | Lysine decarboxylase (Arabidopsis thaliana) (HsLDC) | 2.00E–79
Dioxygenase
GO912724 | Hyoscyamine 6-dioxygenase (Hyoscyamus niger) | 5.00E–36
Methyltransferase
GO913407 | Methyltransferase type 11 (Cyanothece sp. PCC 7425) | 2.00E–38
GO914756 | (RS)-norcoclaurine 6-O-methyltransferase (Coptis japonica) | 7.00E–29

Fig. 3. Protein sequence alignment of HsLDC with putative lysine decarboxylases. The multiple sequence alignment shows the conserved PGGXGTXXE motif, marked by an asterisk, in the amino acid sequences of HsLDC and its closest Arabidopsis (At1g50575) and rice (Os03g0587100) homologues. These proteins contain the Pfam 03641 domain and belong to the putative LDCs (http://pfam.sanger.ac.uk/).

Terpenoid biosynthesis

Serratanes are a unique family of pentacyclic triterpenoids possessing seven tertiary methyl moieties or functional groups and a central seven-membered C-ring, originally isolated from plants of the Pinaceae family and Lycopodium genus (Conner et al. 1981). Previous studies have revealed that H. serrata produces serratane-type triterpenoids (Zhou et al. 2003, 2004, Zhu et al. 1994). In plants, both the mevalonic acid (MVA) pathway in the cytosol and the methylerythritol phosphate (MEP) pathway in the chloroplast are used to synthesize the isoprenoids, the original precursors of triterpenoids. Drawing on progress in the study of triterpenoid biosynthesis in plants, we found several unique sequences encoding 1-deoxy-D-xylulose 5-phosphate reductoisomerase, squalene synthetase, farnesyl pyrophosphate synthetase and isopentenyl-diphosphate delta-isomerase I in our EST dataset (Table 6), most of which may participate in the biosynthesis of serratane-type triterpenoids in H. serrata.

Table 6. H. serrata leaf unique sequences with significant sequence similarities to genes involved in terpenoid (including serratane) biosynthesis.

GenBank accession no. | Sequence with highest similarity | E-value
GO913429 | 1-deoxy-D-xylulose 5-phosphate reductoisomerase (Oryza sativa subsp. japonica) | 1.00E–53
GO914776 | 10-deacetylbaccatin III 10-O-acetyltransferase (Taxus cuspidata) | 1.00E–25
GO911926 | Acetyl-CoA acetyltransferase (Arabidopsis thaliana) | 4.00E–89
GO248858 | Farnesyl pyrophosphate synthetase 2 (Lupinus albus) | 8.00E–69
GO912322 | Isopentenyl-diphosphate Delta-isomerase I (Arabidopsis thaliana) | 8.00E–70
GO913346 | Isopentenyl-diphosphate Delta-isomerase I (Camptotheca acuminata) | 1.00E–102
GO912569 | Squalene synthetase (Nicotiana benthamiana) | 2.00E–92

Flavone/flavonoid and anthocyanin biosynthesis

The identification of a gene specific to the flavonoid biosynthetic pathway and the isolation of a flavone glycoside from H. serrata suggest that flavonoids, including anthocyanins and flavones, are common natural products in this medicinal plant (Wanibuchi et al. 2007, Yang et al. 2008). The unique sequences related to the flavone/flavonoid and anthocyanin pathways in H. serrata are shown in Table 7, including transcripts encoding flavanone 3-hydroxylase, isoflavone reductase homolog, leucoanthocyanidin dioxygenase, NAD(P)H-dependent 6'-deoxychalcone synthase, anthocyanidin 3-O-glucosyltransferase and trans-cinnamate 4-monooxygenase. The use of ESTs has greatly facilitated the identification of candidate genes and lays the foundation for elucidating the molecular basis of flavonoid metabolism in H. serrata.

The phytohormone metabolism and signal transduction

As expected, a subset of transcripts had similarities to genes previously implicated in secondary metabolite biosynthesis. Another subset encodes protein kinases, enzymes and transcription factors that may have a role in phytohormone biosynthesis and signal transduction, as well as in gene regulation events specific to developmental processes.
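The E-values in the tables of this section are printed with a typographic en dash in the exponent (for example 2.00E–79), which a numeric parser will reject. The following is a minimal Python sketch, assuming the values are copied from the tables as plain text, that normalizes the dash before conversion and rechecks the reported share of alkaloid-related unique sequences (11 of the 1510 unique sequences). The function name `parse_evalue` and the 1e-30 cutoff are illustrative choices and are not part of the study; the accession-to-value pairing follows the row order of Table 5.

```python
def parse_evalue(text: str) -> float:
    """Convert a table E-value such as '2.00E–79' to a float."""
    # Replace the en dash (U+2013) or minus sign (U+2212) with an ASCII hyphen.
    cleaned = text.strip().replace("\u2013", "-").replace("\u2212", "-")
    return float(cleaned)


# Decarboxylase rows as printed in Table 5 (pairing assumed from row order).
table5_decarboxylases = {
    "GO912010": "5.00E\u201318",
    "GO914645": "2.00E\u201379",  # named HsLDC in the text
}

CUTOFF = 1e-30  # hypothetical threshold, chosen only for illustration

# Keep accessions whose E-value is at or below the cutoff.
strong_hits = [
    acc for acc, raw in table5_decarboxylases.items()
    if parse_evalue(raw) <= CUTOFF
]

# Share of unique sequences with similarity to alkaloid-related enzymes.
share = 11 / 1510 * 100

print(strong_hits)
print(f"{share:.1f}%")
```

With this cutoff only the HsLDC accession is retained, and the recomputed share rounds to the 0.7% given in the text.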
Table 7. H. serrata leaf unique sequences with significant sequence similarities to genes involved in flavonoid/anthocyanin biosynthesis.

GenBank accession no. | Sequence with highest similarity | E-value
GO913969 | Anthocyanidin 3-O-glucosyltransferase (Manihot esculenta) | 8.00E–19
GO914425 | Caffeic acid 3-O-methyltransferase (Populus tremuloides) | 1.00E–44
GO912462, GO912981 | Flavanone 3-hydroxylase (Arabidopsis thaliana) | 8.00E–33
GO914123, GO913199 | Flavanone 3-hydroxylase (Arabidopsis thaliana) | 1.00E–37
GO911813, GO911948 | Flavanone 3-hydroxylase (Eustoma grandiflorum) | 3.00E–12
GO913598 | Isoflavone reductase homolog (Lupinus albus) | 1.00E–53
GO912262, GO91202, GO912774, GO91405, GO914108 | Isoflavone reductase homolog P3 (Arabidopsis thaliana) | 2.00E–91
GO912570 | Leucoanthocyanidin dioxygenase (Malus domestica) | 2.00E–32
GO912551 | Naringenin,2-oxoglutarate 3-dioxygenase (Callistephus chinensis) | 6.00E–27
GO914476, GO914480 | Naringenin,2-oxoglutarate 3-dioxygenase (Callistephus chinensis) | 6.00E–23
GO912837 | Naringenin,2-oxoglutarate 3-dioxygenase (Dianthus caryophyllus) | 2.00E–38
GO912280, GO911884, GO913762, GO913220 | Naringenin,2-oxoglutarate 3-dioxygenase (Dianthus caryophyllus) | 3.00E–20
GO914964, GO912007 | NAD(P)H-dependent 6'-deoxychalcone synthase (Glycine max) | 1.00E–38
GO248797 | Tocopherol O-methyltransferase (Arabidopsis thaliana) | 4.00E–39
GO914186 | Tocopherol O-methyltransferase (Arabidopsis thaliana) | 1.00E–34
GO913685 | Trans-cinnamate 4-monooxygenase (Helianthus tuberosus) | 5.00E–81

Table 8. H. serrata leaf unique sequences with significant sequence similarities to genes involved in phytohormone metabolism and signal transduction.

Phytohormone metabolism
GenBank accession nos. (in table order): GO248843; GO914176; GO912668; GO914650; GO912348, GO912347; GO914646, GO914444, GO912348, GO912347; GO912059; GO912753, GO912652; GO911852
Sequence with highest similarity | E-value
1-aminocyclopropane-1-carboxylate oxidase homolog 1 (Arabidopsis thaliana) | 2.00E–29
1-aminocyclopropane-1-carboxylate oxidase homolog 1 (Arabidopsis thaliana) | 3.00E–34
Cell elongation protein diminuto (Pisum sativum) | 2.00E–56
Cytokinin-O-glucosyltransferase 3 (Arabidopsis thaliana) | 9.00E–12
Gibberellin 2-beta-dioxygenase 1 (Pisum sativum) | 4.00E–12
Gibberellin 20 oxidase 1 (Arabidopsis thaliana) | 1.00E–35
Gibberellin 20 oxidase 1 (Arabidopsis thaliana) | 3.00E–14
Gibberellin 20-oxidase-like protein (Selaginella moellendorffii) | 3.00E–09
Jasmonate O-methyltransferase (Brassica rapa subsp. pekinensis) | 3.00E–23
Steroid 23-alpha-hydroxylase (Arabidopsis thaliana) | 1.00E–53

Phytohormone signal transduction
GenBank accession no. | Sequence with highest similarity | E-value
GO911793 | Gibberellin receptor GID1L2 (Arabidopsis thaliana) | 6.00E–23
GO913328 | Gibberellin receptor GID1 (Oryza sativa subsp. japonica) | 2.00E–26
GO912769 | Brassinosteroid insensitive 1-associated receptor kinase 1 precursor (Arabidopsis thaliana) | 4.00E–42
GO913169, GO912555, GO912751, GO913255, GO913656, GO913492, GO914630, GO914889, GO912625, GO913562, GO913449, GO912193, GO912954, GO911951, GO914472, GO912160, GO913618 | Brassinosteroid insensitive 1-associated receptor kinase 1 precursor (Arabidopsis thaliana) | 1.00E–51

Phytohormones, which are synthesized and transported throughout the plant and act at low concentrations, play important roles in the regulation of plant development and environmental responses (Achard et al. 2006, Bishopp et al. 2006, Gray 2004). Key enzyme-encoding transcripts involved in phytohormone biosynthesis were found in this EST study of the H. serrata leaf (Table 8). These transcripts encode 1-aminocyclopropane-1-carboxylate oxidase, which is involved in the ethylene biosynthetic pathway, and gibberellin 20 oxidase 1 and gibberellin 2-beta-dioxygenase 1, which are involved in gibberellin biosynthesis and deactivation (Hamilton et al. 1990, Olszewski et al. 2002, Thomas et al. 1999). In addition, unique sequences with similarities to cytokinin-O-glucosyltransferase 3 and jasmonate O-methyltransferase, which may be associated with the metabolic processes of cytokinin and jasmonate, respectively (Sakakibara 2006, Seo et al. 2001), were also present in our EST dataset. Transcripts involved in gibberellin and brassinosteroid signal transduction were also identified, including those encoding the gibberellin receptors GID1L2 and GID1 and brassinosteroid insensitive 1-associated receptor kinase 1 (Belkhadir and Chory 2006, Ikeda et al. 2001, Li and Jin 2007, Ueguchi-Tanaka et al. 2005) (Table 8). These transcripts may have essential functions in regulating the development and environmental responses of H. serrata.

Our EST dataset also contained a few unique sequences that participate in transcriptional regulation. These unique sequences show high sequence similarity to known transcription factors, including WRKY (GO913556), zinc finger A20 and AN1 domain-containing proteins (GO911767, GO911913), TRAF-type zinc finger protein (GO914042) and RING-H2 finger protein (GO914999, GO914647, GO914685, GO913341). WRKY is a class of transcription factors found only in plants, characterized by a highly conserved WRKYGQK amino acid motif at the N-terminus and a metal-chelating zinc finger signature at the C-terminus (Mangelsen et al. 2008). WRKY proteins play roles in pathogen defense, the regulation of plant development and sugar signal transduction (Lagace and Matton 2004, Lai et al. 2008, Sun et al. 2003, Zhang et al. 2008).
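The two sequence signatures named in this section, PGGXGTXXE for the putative LDC family and WRKYGQK for WRKY transcription factors, can be searched for in translated sequences with a regular expression in which X matches any residue. The sketch below is illustrative only and is not the procedure used in the study: the peptide strings are invented placeholders, not HsLDC or any other H. serrata sequence, and `motif_to_regex` and `find_motif` are hypothetical helper names.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif in which 'X' stands for any amino acid letter."""
    parts = ("[A-Z]" if ch == "X" else re.escape(ch) for ch in motif)
    return re.compile("".join(parts))


def find_motif(motif: str, peptide: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for each non-overlapping hit."""
    pattern = motif_to_regex(motif)
    return [(m.start() + 1, m.group()) for m in pattern.finditer(peptide.upper())]


# Invented placeholder peptides, used only to exercise the functions.
ldc_like = "MAAKPGGYGTLEELLE"
wrky_like = "MSDDGYNWRKYGQKQVK"

print(find_motif("PGGXGTXXE", ldc_like))
print(find_motif("WRKYGQK", wrky_like))
```

A regular expression is sufficient here because both signatures are short and fixed in length; a domain-level assignment such as the Pfam 03641 match reported for HsLDC would require a profile search instead.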
The A20/AN1 zinc finger proteins represent common elements of the stress response in plants and have been identified in rice and Arabidopsis (Vij and Tyagi 2008). These disease- or stress-associated transcripts may be involved in responses to the environment during the development of H. serrata.

Conclusion

This is the first report of the use of EST analysis to study gene expression profiles in H. serrata, a representative member of the Huperziaceae family. The EST dataset will provide a significant resource for gene identification and molecular breeding in H. serrata. A number of the unique sequences reported in this study will be completely sequenced and characterized, which will improve our understanding of H. serrata in the areas of secondary metabolite biosynthesis, developmental regulation and SSR-associated genetic selection. In addition, the 501 SSR markers developed from the library will facilitate mapping in populations of the Huperziaceae family. Although we investigated the transcripts associated with the biosynthesis of secondary metabolites and with developmental regulation in H. serrata, the scale of gene detection is limited. To identify more transcripts involved in the biosynthesis of bioactive compounds in H. serrata, we will in the future construct tissue-specific normalized cDNA libraries and generate ESTs at high throughput using 'next-generation' sequencing technology. These studies will facilitate the elucidation of the entire biosynthetic pathways of the lycopodium alkaloids, the main pharmaceutical resource, and improve our understanding of the developmental mechanisms of H. serrata at the molecular level.

Acknowledgements – This study was supported by the National Natural Science Foundation of China (30900113). We thank Professor Yu-Lin Lin (Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, China) for his kind help in authenticating the H. serrata plant material.
We thank Professor Chang Liu (Molecular Chinese Medicine Laboratory, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China) for his kind advice on the revisions of this manuscript.

References

Achard P, Cheng H, De Grauwe L, Decat J, Schoutteten H, Moritz T, Van Der Straeten D, Peng J, Harberd NP (2006) Integration of plant responses to environmentally activated phytohormonal signals. Science 311: 91–94
Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, McCombie WR, Venter JC (1991) Complementary DNA sequencing: expressed sequence tags and human genome project. Science 252: 1651–1656
Asamizu E, Nakamura Y, Sato S, Tabata S (2000) A large scale analysis of cDNA in Arabidopsis thaliana: generation of 12,028 non-redundant expressed sequence tags from normalized and size-selected cDNA libraries. DNA Res 7: 175–180
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig J, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G, The Gene Ontology Consortium (2000) Gene Ontology: tool for the unification of biology. Nat Genet 25: 25–29
Audic S, Claverie JM (1997) The significance of digital gene expression profiles. Genome Res 7: 986–995
Belkhadir Y, Chory J (2006) Brassinosteroid signaling: a paradigm for steroid hormone signaling from the cell surface. Science 314: 1410–1411
Berlin J, Mollenschott C, Herminghaus S, Fecker LF (1998) Lysine decarboxylase transgenic tobacco root cultures biosynthesize novel hydroxycinnamoylcadaverines. Phytochemistry 48: 79–84
Bishopp A, Mahonen AP, Helariutta Y (2006) Signs of change: hormone receptors that regulate plant development. Development 133: 1857–1869
Brandle JE, Richman A, Swanson AK, Chapman BP (2002) Leaf ESTs from Stevia rebaudiana: a resource for gene discovery in diterpene synthesis. Plant Mol Biol 50: 613–622
Castillo M, Gupta RN, Ho YK, MacLean DB, Spenser ID (1970) Biosynthesis of lycopodine. Incorporation of Δ1-piperideine and of pelletierine. Can J Chem 48: 2911–2918
Ching RC (1978) The Chinese fern families and genera: systematic arrangement and historical origin. Acta Phytotax Sin 16: 1–9
Comins DL, Al-awar RS (1995) Model studies toward the synthesis of the lycopodium alkaloid, phlegmarine. J Org Chem 60: 711–716
Conner AH, Haromy TP, Sundaralingam M (1981) 30-Nor-3β-methoxyserrat-14-en-21-one: first reported natural occurrence of a norserratene triterpene. J Org Chem 46: 2987–2988
Doll J, Hause B, Demchenko K, Pawlowski K, Krajinski F (2003) A member of the germin-like protein family is a highly conserved mycorrhiza-specific induced gene. Plant Cell Physiol 44: 1208–1214
Gray WM (2004) Hormonal regulation of plant growth and development. PLoS Biol 2: 1270–1273
Hamilton AJ, Lycett GW, Grierson D (1990) Antisense gene that inhibits synthesis of the hormone ethylene in transgenic plants. Nature 346: 284–287
Herminghaus S, Schreier PH, McCarthy JEG, Landsmann J, Botterman J, Berlin J (1991) Expression of a bacterial lysine decarboxylase gene and transport of the protein into chloroplasts of transgenic tobacco. Plant Mol Biol 17: 475–486
Ikeda A, Ueguchi-Tanaka M, Sonoda Y, Kitano H, Koshioka M, Futsuhara Y, Matsuoka M, Yamaguchi J (2001) Slender rice, a constitutive gibberellin response mutant, is caused by a null mutation of the SLR1 gene, an ortholog of the height-regulating gene GAI/RGA/RHT/D8. Plant Cell 13: 999–1010
Jung JD, Park HW, Hahn Y, Hur CG, In DS, Chung HJ, Liu JR, Choi DW (2003) Discovery of genes for ginsenoside biosynthesis by analysis of ginseng expressed sequence tags. Plant Cell Rep 22: 224–230
Lagace M, Matton DP (2004) Characterization of a WRKY transcription factor expressed in late torpedo-stage embryos of Solanum chacoense. Planta 219: 185–189
Lagercrantz U, Ellegren H, Andersson L (1993) The abundance of various polymorphic microsatellite motifs differs between plants and vertebrates. Nucleic Acids Res 21: 1111–1115
Lai ZB, Vinod K, Zheng ZY, Fan BF, Chen ZX (2008) Roles of Arabidopsis WRKY3 and WRKY4 transcription factors in plant responses to pathogens. BMC Plant Biol 8: 68
Li J, Jin H (2007) Regulation of brassinosteroid signaling. Trends Plant Sci 12: 37–41
Liu JS, Zhu YL, Yu CM, Zhou YZ, Han YY, Wu FW, Qi BF (1986) The structures of huperzine A and B, two new alkaloids exhibiting marked anticholinesterase activity. Can J Chem 64: 837–839
Ma XQ, Gang DR (2004) The lycopodium alkaloids. Nat Prod Rep 21: 752–772
Ma XQ, Tan CH, Zhu DY, Gang DR (2005) Is there a better source of huperzine A than Huperzia serrata? Huperzine A content of Huperziaceae species in China. J Agric Food Chem 53: 1393–1398
Ma XQ, Tan CH, Zhu DY, Gang DR (2006) A survey of potential huperzine A natural resources in China: The Huperziaceae. J Ethnopharmacol 104: 54–67
Ma XQ, Tan CH, Zhu DY, Gang DR, Xiao PG (2007) Huperzine A from Huperzia species – An ethnopharmacological review. J Ethnopharmacol 113: 15–34
Mangelsen E, Kilian J, Berendzen KW, Kolukisaoglu ÜH, Harter K, Jansson C, Wanke D (2008) Phylogenetic and comparative gene expression analysis of barley (Hordeum vulgare) WRKY transcription factor family reveals putatively retained functions between monocots and dicots. BMC Genomics 9: 194
Morgante M, Hanafey M, Powell W (2002) Microsatellites are preferentially associated with nonrepetitive DNA in plant genomes. Nat Genet 30: 194–200
Morita H, Kondo S, Kato R, Wanibuchi K, Noguchi H, Sugio S, Abe I, Kohno T (2007) Crystallization and preliminary crystallographic analysis of an acridone-producing novel multifunctional type III polyketide synthase from Huperzia serrata. Acta Cryst 63: 576–578
Murata J, Bienzle D, Brandle JE, Sensen CW, Luca VD (2006) Expressed sequence tags from Madagascar periwinkle (Catharanthus roseus). FEBS Lett 580: 4501–4507
Nyembo L, Goffin A, Hootele C, Braekman JC (1978) Phlegmarine, a likely key intermediate in the biosynthesis of the lycopodium alkaloids. Can J Chem 56: 851–856
Ohlrogge J, Benning C (2000) Unraveling plant metabolism by EST analysis. Curr Opin Plant Biol 3: 224–228
Olszewski N, Sun TP, Gubler F (2002) Gibberellin signaling: biosynthesis, catabolism, and response pathways. Plant Cell 14: S61–S80
Oono Y, Seki M, Nanjo T, Narusaka M, Fujita M, Satoh R, Satou M, Sakurai T, Ishida J, Akiyama K, Iida K, Maruyama K, Satoh S, Yamaguchi-Shinozaki K, Shinozaki K (2003) Monitoring expression profiles of Arabidopsis gene expression during rehydration process after dehydration using ca. 7000 full-length cDNA microarray. Plant J 34: 868–887
Paraoan L, Grierson I, Maden BE (2000) Analysis of expressed sequence tags of retinal pigment epithelium: cystatin C is an abundant transcript. Int J Biochem Cell Biol 32: 417–426
Sakakibara H (2006) Cytokinins: activity, biosynthesis, and translocation. Annu Rev Plant Biol 57: 431–449
Schürmann P, Jacquot JP (2000) Plant thioredoxin systems revisited. Annu Rev Plant Physiol Mol Biol 51: 371–400
Seo HS, Song JT, Cheong JJ, Lee YH, Lee YW, Hwang I, Lee JS, Choi YD (2001) Jasmonic acid carboxyl methyltransferase: a key enzyme for jasmonate-regulated plant responses. Proc Natl Acad Sci USA 98: 4788–4793
Shu QY, Wischnitzki E, Liu ZA, Ren HX, Han XY, Hao Q, Gao FF, Xu SX, Wang LS (2009) Functional annotation of expressed sequence tags as a tool to understand the molecular mechanism controlling flower bud development in tree peony. Physiol Plant 135: 436–449
Sun C, Palmqvist S, Olsson H, Borén M, Ahlandsberg S, Jansson C (2003) A novel WRKY transcription factor, SUSIBA2, participates in sugar signaling in barley by binding to the sugar-responsive elements of the iso1 promoter. Plant Cell 15: 2076–2092
Suzuki H, Achnine L, Xu R, Matsuda SPT, Dixon RA (2002) A genomics approach to the early stages of triterpene saponin biosynthesis in Medicago truncatula. Plant J 32: 1033–1048
Tang XC (1996) Huperzine A (Shuangyiping): a promising drug for Alzheimer’s disease. Acta Pharmacol Sin 17: 481–484
Temnykh S, DeClerck G, Lukashova A, Lipovich L, Cartinhour S, McCouch S (2001) Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential. Genome Res 11: 1441–1452
Thomas SG, Phillips AL, Hedden P (1999) Molecular cloning and functional expression of gibberellin 2-oxidases, multifunctional enzymes involved in gibberellin deactivation. Proc Natl Acad Sci USA 96: 4698–4703
Ueguchi-Tanaka M, Ashikari M, Nakajima M, Itoh H, Katoh E, Kobayashi M, Chow TY, Hsing YC, Kitano H, Yamaguchi I, Matsuoka M (2005) GIBBERELLIN INSENSITIVE DWARF1 encodes a soluble receptor for gibberellin. Nature 437: 693–698
Vij S, Tyagi AK (2008) A20/AN1 zinc-finger domain-containing proteins in plants and animals represent common elements in stress response. Funct Integr Genomics 8: 301–307
Wanibuchi K, Zhang P, Abe T, Morita H, Kohno T, Chen GS, Noguchi H, Abe I (2007) An acridone-producing novel multifunctional type III polyketide synthase from Huperzia serrata. FEBS J 274: 1073–1082
Yang YB, Yang XQ, Xu YQ, Tai ZG, Ding ZT (2008) A new flavone glycoside from Huperzia serrata. Chin J Nat Med 6: 408–410
Zhang L, Yuan D, Yu S, Li Z, Cao Y, Miao Z, Qian H, Tang K (2004) Preference of simple sequence repeats in coding and non-coding regions of Arabidopsis thaliana. Bioinformatics 20: 1081–1086
Zhang J, Peng YL, Guo ZJ (2008) Constitutive expression of pathogen-inducible OsWRKY31 enhances disease resistance and affects root growth and auxin response in transgenic rice plants. Cell Res 18: 508–521
Zhou H, Jiang SH, Tan CH, Wang BD, Zhu DY (2003) New epoxyserratanes from Huperzia serrata. Planta Med 69: 91–94
Zhou H, Li YS, Tong XT, Liu HQ, Jiang SH, Zhu DY (2004) Serratane-type triterpenoids from Huperzia serrata. Nat Prod Res 18: 453–459
Zhu DY, Jiang SH, Huang MF, Lin LZ, Cordell GA (1994) Huperserratinine from Huperzia serrata. Phytochemistry 36: 1069–1072

Supporting Information

Additional Supporting Information may be found in the online version of this article:

Appendix S1. Key enzyme discovery in the unique sequences of H. serrata leaf. The unique sequences encoding key enzymes involved in the biosynthesis of alkaloids, brassinosteroids, flavone/flavonoids, phenylpropanoids, terpenoids and steroids.

Appendix S2. SSR discovery in the unique sequences of H. serrata leaf. The unique sequences containing putative SSRs are listed in this table with unique sequence ID, motif, number of motif repeats, SSR start coordinate, SSR end coordinate, unique sequence length, repeat length and the GenBank accession no. of the ESTs that make up each unique sequence.

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

Edited by D. Campbell