Physiologia Plantarum 139: 1–12. 2010
Copyright © Physiologia Plantarum 2009, ISSN 0031-9317
TECHNICAL FOCUS
Analysis of expressed sequence tags from the
Huperzia serrata leaf for gene discovery in the areas
of secondary metabolite biosynthesis and development
regulation
Hongmei Luo(a), Chao Sun(a), Ying Li(a), Qiong Wu(a), Jingyuan Song(a), Deli Wang(b), Xiaocheng Jia(a),
Rongtao Li(b) and Shilin Chen(a,c,*)
a Institute of Medicinal Plant Development (IMPLAD), Chinese Academy of Medical Sciences & Peking Union Medical College, No. 151, Malianwa
North Road, HaiDian District, Beijing 100193, China
b Hainan Branch Institute of Medicinal Plant Development (HBIMPLAD), Chinese Academy of Medical Sciences & Peking Union Medical College,
XingLong Town, Wanning County, Hainan 572522, China
c Hubei College of Traditional Chinese Medicine, Hongshan District of Wuhan City, Huangjia Lake West Road on the 1st, Hubei Province,
430065, China
Correspondence
*Correspondence author,
e-mail: [email protected]
Received 19 October 2009;
revised 27 November 2009
doi:10.1111/j.1399-3054.2009.01339.x
Huperzia serrata produces various types of lycopodium alkaloids, notably
huperzine A (HupA), a promising drug candidate for Alzheimer's
disease. Despite the medicinal importance of H. serrata, little genomic or
transcriptomic data are available in public databases. A cDNA library
was therefore generated from RNA isolated from the leaves of H. serrata.
A total of 4012 clones were randomly selected from the library, and
3451 high-quality expressed sequence tags (ESTs) were assembled to yield
1510 unique sequences with an average length of 712 bp. The majority
(79.4%) of the unique sequences were assigned to the putative functions
based on the BLAST searches against the public databases. The functions
of these unique sequences covered a broad set of molecular functions,
biological processes and biochemical pathways according to GO and KEGG
assignments. The transcripts involved in the secondary metabolite biosynthesis
of alkaloids, terpenoids and flavone/flavonoids, such as cytochrome P450,
lysine decarboxylase (LDC), flavanone 3-hydroxylase, squalene synthetase
and 2-oxoglutarate 3-dioxygenase, were well represented by 34 unique
sequences in this EST dataset. The corresponding peptide sequence of the
LDC contained the Pfam 03641 domain and was annotated as a putative
LDC. The unique sequences encoding transcription factors, phytohormone
biosynthetic enzymes and signaling components were also found in this EST
collection. In addition, a total of 501 potential SSR-motif microsatellite loci
were identified from 393 H. serrata leaf unique sequences. This set of
nonredundant ESTs and the molecular markers obtained in this study will establish
valuable resources for a wide range of applications including gene discovery
and identification, genetic mapping and analysis of genetic diversity, cultivar
identification and marker-assisted selections in this important medicinal plant.
Abbreviations – BLAST, Basic Local Alignment Search Tool; CYP450, cytochrome P450-dependent monooxygenase; ESTs,
expressed sequence tags; GA, gibberellin; GO, Gene Ontology; HupA, huperzine A; KEGG, Kyoto Encyclopedia of Genes and
Genomes; LDC, lysine decarboxylase; NCBI, National Center for Biotechnology Information; SSR, simple sequence repeat.
Physiol. Plant. 139, 2010
Introduction
Huperzia serrata (Thunb.) Trev. is a member of the Huperziaceae family.
The Huperziaceae are among the oldest medicinally important vascular plants
(Ching 1978). The whole plant of H. serrata is named Qian Ceng Ta and has
been used in Chinese folk medicine for the treatment of contusions, strains,
swellings, schizophrenia, myasthenia gravis and organophosphate poisoning
(Ma et al. 2006). H. serrata produces various types of lycopodium alkaloids,
including lycopodines, lycodines and fawcettimines. Some of these alkaloids
are valuable for pharmaceutical applications (Ma and Gang 2004). In
particular, the lycodine HupA has been used as an anti-Alzheimer's disease
drug candidate in China because it selectively inhibits acetylcholinesterase,
and as a dietary supplement in the USA (Liu et al. 1986, Ma et al. 2006,
Tang 1996).
The amount of HupA in the H. serrata plant differs across tissues: the
highest content was found in the leaves, followed by the stems, and the
lowest levels were in the roots and sporangia (Ma et al. 2005). H. serrata
plants grow very slowly in specific habitats and normally need approximately
15–20 years to reach maturity after spore germination (Ma et al. 2006). The
plants are widely distributed along the Yangtze River and throughout the
southern parts of China, usually in tropical or subtropical habitats
(Ma et al. 2006). The whole plant body of H. serrata is harvested for HupA
collection when the height of the sporophytes reaches 5–15 cm (Ma et al.
2006). The plants are currently in danger of extinction in China because of
extensive collection for the production of HupA (Ma et al. 2007).
Many studies investigated the proposed biosynthesis
pathway from pelletierine coupled with 4PAA to
synthesize HupA and the related lycopodium alkaloids
(Comins and Al-awar 1995, Ma and Gang 2004,
Nyembo et al. 1978). However, no enzymes have been
identified in the plants of the Huperziaceae family that
might be involved in the biosynthesis of lycopodium
alkaloids. Thus, the biosynthetic processes leading to
the production of HupA have not been elucidated in the
Huperziaceae.
Investigations of the natural products of H. serrata show that secondary
metabolites including triterpenoids and flavone/flavonoids also accumulate
in this medicinal plant (Yang et al. 2008, Zhou et al. 2003, 2004, Zhu
et al. 1994). Several recent studies identified polyketide synthase 1 (PKS1)
and its corresponding gene from H. serrata. PKS1 is a novel type III
polyketide synthase and shows unusually broad catalytic promiscuity,
producing various aromatic tetraketides (Morita et al. 2007, Wanibuchi
et al. 2007). However, as of January 2009, only 10 nucleotide sequences
from H. serrata were available in the NCBI database. The limited genetic
information on this plant prompted us to construct a cDNA library from the
H. serrata leaf. The objective of this study was to identify functional
genes in H. serrata, especially those involved in the biosynthesis of
secondary metabolites, by expressed sequence tag (EST) analysis.
EST analysis is a cost-effective and rapid tool used
for the isolation of genes. This analysis has provided
a means of identifying novel genes and characterizing
the transcriptome in various tissues (Adams et al. 1991,
Paraoan et al. 2000, Shu et al. 2009). EST sequencing has
also been used to establish phylogenetic relationships
and identify simple sequence repeats (SSRs), the useful
markers for creating genetic maps in plants (Morgante
et al. 2002). This technique has also led to the discovery
of genes involved in the biosynthesis of secondary
metabolites (Ohlrogge and Benning 2000). Recently, the
genes encoding enzymes involved in the biosynthesis
of ginsenoside (Jung et al. 2003), monoterpenoid indole
alkaloids (Murata et al. 2006), triterpene saponin (Suzuki
et al. 2002) and diterpenes (Brandle et al. 2002) were
identified using the EST gene discovery approach.
This report describes the first EST analysis of the H. serrata leaf and
identifies several candidate transcripts with significant sequence
similarity to cytochrome P450s and lysine decarboxylase (LDC), which may
be involved in HupA biosynthesis. Other functional transcripts that might
be associated with H. serrata developmental regulation or involved in
serratane (triterpenoid) and flavone/flavonoid biosynthesis are also
discussed. The SSRs identified in the sequence data will be useful in
marker-assisted breeding programs. This study presents for the first time a
profile of expressed genes based on the EST analysis of H. serrata. The EST
data will contribute to a better understanding of the molecular basis of
secondary metabolite biosynthesis and developmental regulation, and to
marker-assisted selection, in H. serrata and related species in the
Huperziaceae family.
Materials and methods
RNA extraction and cDNA library construction
H. serrata plants that had reached a height of 10–12 cm (grown in the wild
for about 10 years) were collected at Bawangling at an altitude of 1320 m
(109°10′ E, 19°7′ N) in Hainan Province (November 3, 2008). The plants were
authenticated by Professor Yu-Lin Lin of the Institute of Medicinal Plant
Development (IMPLAD), Chinese Academy of Medical Sciences. The whole plants
were collected and rinsed with water 5–8 times. The plants, including the
leaves, were then dried gently and quickly with absorbent paper. The cleaned
leaves (1–3 mm petiole) were isolated from the plants, frozen immediately
in liquid nitrogen and stored at −70 °C until RNA isolation.
Total RNA was extracted from 0.5 g of leaves using the RNeasy plant kit
(BioTeke, Beijing, China). RNA concentration was measured using a
GeneQuant100 spectrophotometer (GE Healthcare, Chalfont St Giles, UK). The
RNA quality was tested on ethidium bromide-stained agarose gels.
Library construction was performed using the Creator™ SMART™ cDNA library
construction kit (Clontech, Mountain View, CA) in accordance with the
manufacturer's recommendations. Size-selected double-stranded cDNA
(fragments >500 bp) was directionally ligated into the SfiI restriction
site of the pDNR-lib vector (Clontech, Mountain View, CA) and electroporated
into a DH5α E. coli strain (TaKaRa, Shiga, Japan) using an ECM630
electroporator (NatureGene Corp., Medford, NJ).
EST sequencing, assembly and annotation
Randomly selected clones were cultured in liquid LB medium containing
34 mg l⁻¹ chloramphenicol and incubated overnight at 220 rpm and 37 °C.
Plasmid DNA was prepared using an Axyprep-96 Plasmid Kit (Axygen, Union
City, CA). Bacterial clones were prepared for sequencing using the BigDye
3.1 sequencing chemistry and then sequenced from the 5′ end using the M13
forward primer and an ABI3730 DNA sequencer.
The ABI-formatted chromatogram files were processed automatically using a
local EST analysis pipeline. Base calling in this pipeline was performed
with the Phred algorithm. High-quality EST sequences were generated after
vector, low-quality and short sequences (<100 bp) were removed and the
polyA/T tails were trimmed using Cross_match. These ESTs were assembled
into contigs (clusters of assembled ESTs) and singletons (sequences found
only once) by the Phrap program.
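The length filter and tail trimming described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline and it does not reproduce Phred, Cross_match or Phrap; the function names, the minimum tail length of 10 bases and the trim-then-filter order are assumptions made only for illustration. The 100 bp cutoff is the one stated in the text.

```python
import re

MIN_LENGTH = 100  # bp; reads shorter than this are discarded (cutoff from the text)


def trim_poly_tails(seq: str, min_run: int = 10) -> str:
    """Strip a leading poly(T) run and a trailing poly(A) run.

    min_run is an assumed threshold; the paper does not state one.
    """
    seq = seq.upper()
    seq = re.sub(r"^T{%d,}" % min_run, "", seq)
    seq = re.sub(r"A{%d,}$" % min_run, "", seq)
    return seq


def clean_ests(ests: dict) -> dict:
    """Trim tails, then keep only reads of at least MIN_LENGTH bases."""
    cleaned = {}
    for name, seq in ests.items():
        trimmed = trim_poly_tails(seq)
        if len(trimmed) >= MIN_LENGTH:
            cleaned[name] = trimmed
    return cleaned
```

Calling `clean_ests` on a dictionary of read names and sequences returns only the reads that survive both steps, with their tails removed.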
The unique sequences were searched against the SwissProt database
(released in December 2008) using the BLASTX algorithm with an E-value
cutoff of 10⁻⁵. If the unique sequences did not match any sequences in the
SwissProt database, they were used to search the NCBI non-redundant protein
(nr) database using BLASTX (E-value <10⁻⁵). After the BLASTX analysis, the
unique sequences that did not match any sequences in the above analyses
were then used to search the NCBI non-redundant nucleotide (nt) database
using BLASTN (E-value <10⁻⁵).
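The three-step search order above (SwissProt, then nr, then nt) amounts to a cascade in which each sequence is annotated by the first database that returns a significant hit. A minimal sketch follows, assuming the best hit per sequence has already been parsed into dictionaries; the data layout and the names `annotate` and `best_hits` are hypothetical and are not part of BLAST.

```python
E_VALUE_CUTOFF = 1e-5                      # cutoff stated in the text
DATABASE_ORDER = ("swissprot", "nr", "nt")  # search order stated in the text


def annotate(seq_ids, best_hits):
    """Assign each sequence to the first database with a significant hit.

    best_hits maps database name -> {sequence id: (description, e_value)}.
    Returns {sequence id: (database, description)}, or None when no
    database gives a hit below the cutoff.
    """
    annotations = {}
    for seq_id in seq_ids:
        annotations[seq_id] = None
        for db in DATABASE_ORDER:
            hit = best_hits.get(db, {}).get(seq_id)
            if hit is not None and hit[1] < E_VALUE_CUTOFF:
                annotations[seq_id] = (db, hit[0])
                break  # later databases are searched only for unmatched sequences
    return annotations
```

Sequences left as `None` correspond to the unique sequences that remain unannotated after all three searches.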
The integrated protein domain recognition program
InterProScan was run locally to search the translated
unique consensus sequences against all of the InterPro protein domains. The functional categories of
these unique sequences were further identified using
the Gene Ontology (GO) Database (Ashburner et al.
2000) based on the existing mappings of InterPro
domains to the GO hierarchy. The biochemical pathway assignments were performed according to the
Kyoto Encyclopedia of Genes and Genomes (KEGG)
mapping (http://www.genome.ad.jp/kegg/kegg2.html).
Enzyme Commission (EC) numbers were assigned to
the unique sequences based on BLASTX searching of
protein databases with a cutoff value of E <10⁻⁵.
SSR detection
The detection of SSRs from the total unique sequences
of the H. serrata leaf was performed using the
Simple Sequence Repeat Identification Tool (SSRIT)
(http://www.gramene.org/db/markers/ssrtool). The SSRIT
accepts FASTA-formatted sequence files and reports the
sequence ID, SSR motif, number of repeats (di-, tri-,
tetra-, penta- or hexa-nucleotide repeat units), repeat
length and position of the SSR and the total length of
the sequence in which the SSRs were found (Temnykh
et al. 2001). The frequency of repeat classes (e.g. di-,
tri-, tetra-, penta- or hexa-nucleotide) was combined
by type; for example, GA repeats also encompassed
repeats identified as AG and their complementary
sequences TC or CT repeats. The search parameters for
the maximum motif-length group were set to hexamer
and those for the minimum number of repeats were
set to five. [Accession numbers: The EST data reported
in this paper are available in the GenBank databases
under the Accession Nos. (GO248777-GO248876 and
GO911766-GO915116)].
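The search parameters above (motifs of two to six nucleotides, at least five tandem repeats) and the grouping of motifs with their rotations and complementary sequences can be sketched in Python. This is a regular-expression stand-in written for illustration, not the SSRIT implementation; the function names are hypothetical.

```python
import re

# A 2- to 6-nt motif followed by at least four more copies (five repeats in total).
# The lazy quantifier makes the shortest possible motif win (AG rather than AGAG).
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_ssrs(seq):
    """Return (motif, repeat_count, start) for each SSR found in seq."""
    results = []
    for match in SSR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip mononucleotide runs such as AAAAAAAAAA
        repeats = len(match.group(0)) // len(motif)
        results.append((motif, repeats, match.start()))
    return results


def canonical_motif(motif):
    """Collapse a motif, its rotations and its reverse complement into one class label."""
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    rotations = [
        m[i:] + m[:i] for m in (motif, reverse_complement) for i in range(len(m))
    ]
    return min(rotations)
```

With this grouping, GA, AG, TC and CT all map to the label `AG`, which mirrors how the repeat classes are combined by type in the text.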
Results and discussion
Construction and general characteristics of the
H. serrata leaf cDNA library
To identify the genes and the expression profiles involved in the cellular
development of the H. serrata leaf and in the biosynthesis of its secondary
metabolites, a cDNA library was constructed from leaves of H. serrata. This
library had a titer of 4.5 × 10⁵ colony-forming units per milliliter. A
total of 4012 cDNA clones were randomly chosen from the library for
sequencing, generating 3451 high-quality ESTs with an average sequence
length of 685 bp after base calling and removal of vector, short sequences
and low-quality sequences (Table 1). The size distribution of EST length
without vector sequences in the H. serrata leaf cDNA library is given in
Fig. 1.
These high-quality ESTs were assembled into 394 contigs and 1116 singletons
by a cluster analysis, yielding a total of 1510 unique sequences with an
average length of 712 bp (Table 1). The contigs were composed of multiple
ESTs, ranging from 2 to 75, with sequence lengths between 140 and 1513 bp.
More than 13.5% of the contigs consisted of two sequences, followed by 7.2%
with 3–5 sequences and 5.4% with 6–70 sequences. Redundancy ranged from one
contig with 163 sequences to 204 contigs with two sequences, and 190
contigs with more than three sequences. The average GC content of these
high-quality ESTs was 46% (Table 1), which is higher than that reported for
Arabidopsis ESTs (43.4%) (Asamizu et al. 2000). This EST dataset provides
the first available information about the H. serrata leaf transcriptome.
Table 1. Overview of the results from the H. serrata leaf cDNA library. (a) The
unique sequences were annotated by BLAST analysis against the public
databases.
Description                                  Number
Total number of clones sequenced             4012
Total number of high-quality ESTs            3451
Average length per EST (bp)                  685
G+C content (%)                              46.0
Total number of unique sequences             1510
Average length per unique sequence (bp)      712
Number of contigs                            394
Number of singletons                         1116
Number of annotated unique sequences (a)     1225
Number of non-annotated sequences            285
Fig. 1. The size distribution of EST length without vector sequences in the
H. serrata leaf cDNA library. [Histogram; x-axis: EST length (bp), y-axis: percentage (%).]
Annotation of unique sequences by BLAST analysis
The unique sequences were used in similarity analysis against the public
databases. A total of 768 (50.86%) unique sequences had significant hits
(E-value <10⁻⁵) to sequences in the SwissProt database using BLASTX. Of the
remaining 742 unannotated unique sequences, 431 were homologous to
sequences in the NCBI non-redundant protein (nr) database by BLASTX
analysis. Finally, of the 311 still unannotated sequences, 26 showed
similarities to sequences in the NCBI non-redundant nucleotide (nt)
database using the BLASTN algorithm. Together, 1225 (81.1%) unique
sequences were assigned putative identities based on significant sequence
similarities to at least one sequence in the NCBI protein or DNA databases
(Table 1). The other 285 (18.9%) unique sequences showed no similarities to
any sequences in the public databases and likely represent novel
transcripts (Table 1).
Functional categories of the unique sequences
by GO analysis
Putative functions were assigned to 700 (46.36%) unique sequences in the
cellular component, molecular function and biological process categories by
GO analysis (Fig. 2). When mapped against the cellular component GO terms,
283 unique sequences (40.4%) were directly associated with each of the
'Cell' and 'Cell part' terms. A total of 155 unique sequences (22.1%) were
assigned to 'Macromolecular complex' (Fig. 2A). In the molecular function
category, the majority of unique sequences were assigned to 'Catalytic
activity' (279 unique sequences, 39.9%) and 'Binding activity' (257 unique
sequences, 36.7%) (Fig. 2B). When mapped against the biological process
category, 381 unique sequences (54.4%) were involved in 'Metabolic
processes', 354 unique sequences (50.6%) were involved in 'Cellular
processes' and 31 unique sequences (4.4%) were involved in 'Responses to
stimuli' (Fig. 2C). These results provide the first global gene expression
profile of the H. serrata leaf.
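The counts reported for the BLAST cascade and the GO assignments can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text; the variable names are illustrative, and the GO percentages are taken relative to the 700 GO-annotated sequences, which is the denominator that reproduces the published values.

```python
TOTAL_UNIQUE = 1510   # unique sequences (Table 1)
GO_ANNOTATED = 700    # unique sequences with GO assignments

# Sequential BLAST hits: each database was searched only with sequences
# left unannotated by the previous one.
blast_hits = {"SwissProt (BLASTX)": 768, "nr (BLASTX)": 431, "nt (BLASTN)": 26}

# GO term counts quoted in the text.
go_counts = {
    "Cell": 283,
    "Macromolecular complex": 155,
    "Catalytic activity": 279,
    "Binding": 257,
    "Metabolic process": 381,
    "Cellular process": 354,
    "Response to stimulus": 31,
}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


annotated = sum(blast_hits.values())
unannotated = TOTAL_UNIQUE - annotated
go_pct = {term: pct(n, GO_ANNOTATED) for term, n in go_counts.items()}
```

Because a sequence can carry several GO terms, the percentages within one GO category are not expected to sum to 100.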
Fig. 2. Distribution of GO-based cellular component (A), molecular
function (B) and biological process (C) for unique sequences from the
H. serrata leaf cDNA library. [Bar charts; x-axis: percentage of total unique sequences (%).]
Functional classification based on KEGG analysis
The unique sequences were assigned to biochemical pathways described in
KEGG based on their EC numbers. A total of 1058 unique sequences (70.1%)
showed sequence similarities to genes in the KEGG database. Of these, only
332 unique sequences (31.4%) were assigned EC numbers and mapped to 89
unique KEGG biochemical pathways. The KEGG metabolic pathways that were
well represented by the 196 unique sequences (18.5%) of H. serrata included
energy metabolism (41 enzymes), carbohydrate metabolism (23 enzymes), lipid
metabolism (20 enzymes), amino acid metabolism (12 enzymes), the
biosynthesis of secondary metabolites (9 enzymes) and xenobiotic
biodegradation and metabolism (9 enzymes) (Table 2).
For the biosynthesis pathways of secondary metabolites, the related EC
numbers are listed in Appendix S1, Supporting information. These enzymes
are involved in alkaloid biosynthesis, terpenoid and diterpenoid
biosynthesis, and phenylpropanoid and flavone/flavonoid biosynthesis.
A total of 105 unique sequences (9.9%) belonged to the genetic information
processing category, which includes folding, sorting and degradation
(8 enzymes), replication and repair (2 enzymes), transcription (1 enzyme)
and translation (2 enzymes) (Table 2). The KEGG pathways of environmental
information processing included membrane transport (1 enzyme) and signal
transduction (2 enzymes) (Table 2). The smallest number of unique sequences
(9 unique sequences, 1 enzyme) mapped to the cellular processes category
(Table 2). Additionally, most of the unique sequences (726 unique
sequences, 68.6%) remained unassigned to any known biochemical pathway
(Table 2).
Table 2. Mappings of H. serrata unique sequences to KEGG biochemical pathways. (a) Percentage based on unique sequences (1058) with
significant similarities to sequences in KEGG database. (b) Unassigned
unique sequences are those that have significant similarities to known
sequences in KEGG database whose functions are unclear.
KEGG categories represented                   Unique sequences      Percentage (a)
                                              (no. of enzymes)      (%)
Metabolism                                    196 (131)             18.5
  Amino acid metabolism                       15 (12)               1.4
  Biosynthesis of secondary metabolites       11 (9)                1.0
  Carbohydrate metabolism                     36 (23)               3.4
  Energy metabolism                           68 (41)               6.4
  Glycan biosynthesis and metabolism          6 (6)                 0.6
  Lipid metabolism                            31 (20)               2.9
  Metabolism of cofactors and vitamins        3 (2)                 0.3
  Metabolism of other amino acids             7 (5)                 0.7
  Nucleotide metabolism                       4 (4)                 0.4
  Xenobiotics biodegradation and metabolism   15 (9)                1.4
Genetic information processing                105 (13)              9.9
  Folding, sorting and degradation            30 (8)                2.8
  Replication and repair                      2 (2)                 0.2
  Transcription                               6 (1)                 0.6
  Translation                                 67 (2)                6.3
Environmental information processing          12 (3)                1.1
  Membrane transport                          7 (1)                 0.7
  Signal transduction                         5 (2)                 0.5
Cellular processes                            9 (1)                 0.8
  Cell growth and death                       1 (0)                 0.1
  Cell motility                               2 (0)                 0.2
  Endocrine system                            6 (1)                 0.6
Human diseases                                10 (6)                1.0
  Cancers                                     2 (1)                 0.2
  Infectious diseases                         2 (1)                 0.2
  Neurodegenerative disorders                 6 (4)                 0.6
Unassigned (b)                                726                   68.6
Highly expressed transcripts in the H. serrata
leaf library
An abundant representation of a specific sequence in a cDNA library
generally correlates with a high level of expression in the original
biological sample (Audic and Claverie 1997). The transcripts with the
highest levels of expression were represented by more than 15 ESTs in the
H. serrata leaf cDNA library (Table 3). These transcripts were mostly
involved in redox reactions, metabolism, phytohormone responses and
photosystem reactions, and some of them represent 'house-keeping' genes.
Table 3. Assembled clusters that contain more than 15 ESTs in
H. serrata leaf.
No. of ESTs   BLASTX annotation
75            Unnamed protein product
56            Thioredoxin H-type
47            Photosystem II 10 kDa polypeptide
36            Pathogenesis-related protein 5 precursor
29            Auxin-repressed 12.5 kDa protein
26            Chlorophyll a-b binding protein 6A
21            Glutathione S-transferase ERD13
20            Glutathione S-transferase ERD13
17            BRASSINOSTEROID INSENSITIVE 1-associated receptor kinase 1 precursor
17            Germin-like protein subfamily 1 member 7
17            Endo-1,3;1,4-beta-D-glucanase precursor
16            Photosystem I reaction center subunit N
Among the ESTs with assigned functions, thioredoxin, photosystem
polypeptides, glutathione S-transferase and pathogenesis-related protein
were abundant in this EST collection. A unique sequence consisting of 56
ESTs showed sequence similarity to H-type thioredoxin, which is an
important regulatory element in plant metabolism (Schürmann and Jacquot
2000). A number of metabolism-related transcripts encoding the cytochrome
b559 subunit alpha (13 ESTs), mannitol dehydrogenase (14 ESTs) and
fructose-bisphosphate aldolase (13 ESTs) were also highly expressed (data
not shown). Many unique sequences matched glutathione S-transferase ERD13,
which likely plays a role in the detoxification of toxic compounds formed
during stress (Oono et al. 2003). Other transcripts encoded
pathogenesis-related protein 5 (36 ESTs) and germin-like protein (17 ESTs),
which may function in defense responses (Doll et al. 2003).
The unique sequences associated with the phytohormone response and signal
transduction, such as the auxin-repressed 12.5 kDa protein (29 ESTs) and
the brassinosteroid insensitive 1-associated receptor kinase 1 (17 ESTs),
were expressed at high levels. Additionally, several unique sequences
encoding well-documented photosynthesis-related proteins that may be
involved in the photosystem reactions were also abundant in this EST
dataset (Table 3).
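Because EST counts are used here as a proxy for expression level, it can help to express each cluster size relative to the library size. The sketch below converts a few cluster sizes from Table 3 into ESTs per thousand reads and sums clusters that share an annotation. It is only a descriptive normalization with hypothetical function names; it is not the statistical test of Audic and Claverie (1997).

```python
from collections import defaultdict

TOTAL_ESTS = 3451  # high-quality ESTs in the library (Table 1)

# (number of ESTs, BLASTX annotation) for a few clusters from Table 3
clusters = [
    (56, "Thioredoxin H-type"),
    (36, "Pathogenesis-related protein 5 precursor"),
    (29, "Auxin-repressed 12.5 kDa protein"),
    (21, "Glutathione S-transferase ERD13"),
    (20, "Glutathione S-transferase ERD13"),
]


def ests_per_thousand(count, total=TOTAL_ESTS):
    """Cluster size expressed per 1000 sequenced ESTs."""
    return 1000 * count / total


def abundance_by_annotation(rows):
    """Sum clusters sharing an annotation, then normalize to the library size."""
    totals = defaultdict(int)
    for count, annotation in rows:
        totals[annotation] += count
    return {name: ests_per_thousand(n) for name, n in totals.items()}
```

Summing the two glutathione S-transferase ERD13 clusters in this way treats them as one functional group, which is a simplification: they may represent distinct genes.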
Table 4. Summary of di- and tri-nucleotide repeats in the unique
sequences of H. serrata leaf. (a) Number of the unique sequences
containing SSRs. (b) The relative percentage of the repeat compositions in
di- and tri-nucleotide repeats, respectively.
Repeat composition            Number (a)   Percentage (b)
Dinucleotide
AC/CA/GT/TG                   48           12.5
AG/GA/CT/TC                   318          82.6
AT/TA                         15           3.9
CG/GC                         4            1.0
Total no. of dinucleotides    385          100
Trinucleotide
AAC/CAA/ACA/GTT/TTG/TGT       4            3.8
AAG/GAA/AGA/CTT/TTC/TCT       22           21.0
AAT/TAA/ATA/ATT/TTA/TAT       5            4.8
ACC/CAC/CCA/GGT/GTG/TGG       1            1.0
ACG/CGA/GAC/CGT/GTC/TCG       4            3.8
ACT/CTA/TAC/AGT/TAG/GTA       1            1.0
AGC/CAG/GCA/TGC/CTG/GCT       46           43.8
AGG/GGA/GAG/TCC/CTC/CCT       9            8.6
ATC/CAT/TCA/GAT/ATG/TGA       13           12.4
CCG/CGC/GCC/GGC/GCG/CGG       0            0
Total no. of trinucleotides   105          100
SSR detection
SSRs, also known as microsatellites, consist of short (1–6 bp) tandemly
repeated sequences and have been shown to be one of the most powerful
genetic marker systems in biology. A total of 501 potential SSR-motif
microsatellite loci were identified from 393 H. serrata leaf unique
sequences (see Appendix S2, Supporting information). Approximately 26%
(393/1510) of the H. serrata leaf unique sequences contained one or more
di-, tri-, tetra-, penta- or hexanucleotide SSRs. Di- and trinucleotide
motifs accounted for 76.8 and 20.9% of the total (385 and 105 loci),
respectively (Table 4).
The relative frequency of repeats with different dinucleotide compositions
was biased among the four possible repeat classes (Table 4). AG repeats
were by far the most common dinucleotide repeat, constituting 82.6% of
dinucleotide repeats. This bias toward AG repeats in the H. serrata leaf
unique sequences was similar to the relative frequency in Arabidopsis (83%)
(Zhang et al. 2004). The next most common dinucleotide repeats in
H. serrata were AC repeats, with a relative frequency of 12.5% compared
with 4% in Arabidopsis, followed by AT repeats at 3.9% in H. serrata
compared with 8% in Arabidopsis. Although AT repeats are thought to be very
abundant in the genomic sequences of plants (Lagercrantz et al. 1993), this
was not the case for the H. serrata leaf unique sequences. CG repeats were
infrequent, at 1.0% in H. serrata (Table 4) compared with 0.14% in
Arabidopsis.
Among the trinucleotide repeats, AGC/CAG/GCA/TGC/CTG/GCT was the largest
repeat class (43.8%), followed by AAG/GAA/AGA/CTT/TTC/TCT (21.0%) and
ATC/CAT/TCA/GAT/ATG/TGA (12.4%) (Table 4). The tetra- and hexanucleotide
repeats accounted for much smaller shares of the total (1.8 and 1.2%,
respectively). The pentanucleotide repeats constituted only 0.8% of the
total.
The majority (77.1%) of the SSR-containing unique
sequences had a single SSR per sequence, while 90
(22.9%) of them contained two or more putative SSRs
per sequence. Just 44.7% of the repeats were between 9
and 14 bases in length and 30.1% of the repeats were
longer than 20 nucleotides in length (see Appendix S2,
Supporting information). This result may depend on the
length of unique sequences detected in this study.
Identification of candidate genes
This EST collection contains signatures of many genes involved in important
traits in H. serrata. The unique sequences were grouped into functional
categories, which helped to visualize the leaf transcriptome and
accelerated the identification of candidate transcripts associated with
secondary metabolite biosynthesis and developmental regulation. Because the
lycopodium alkaloids, especially HupA, have been investigated extensively,
we identified the transcripts related to the biosynthesis of lycopodium
alkaloids. We were also able to identify transcripts specific to the
biosynthesis of triterpenoids and flavone/flavonoids in this EST dataset.
In addition, transcripts associated with environmental responses, including
phytohormone biosynthesis and signal transduction pathways, which may play
key roles in regulating the development of H. serrata, were also
discovered.
The biosynthesis of secondary metabolites
Lycopodium alkaloid biosynthesis
H. serrata, the original resource of production of
HupA, also produces various lycopodium alkaloids.
Several types of these alkaloids have valuable medical
applications. However, the biosynthetic mechanisms
of these secondary metabolites in H. serrata remain
unclear. Lycopodium alkaloids originate from the
coupling of the pelletierine and 4PAA/4PAACoA
(Castillor et al. 1970, Ma and Gang 2004). Initially, Llysine is decarboxylated by LDC to form cadaverine.
Thus, LDC is the first enzyme that participates
in lycopodium alkaloids biosynthesis. Fortunately, a
unique sequence (GO914645) encoding a full-length
LDC -like gene with unknown function was found
in the EST collection, which was named HsLDC
(Table 5). The corresponding peptide sequence of
HsLDC contained the Pfam 03641 domain, which could
define a family including proteins annotated as putative
LDC (http://pfam.sanger.ac.uk/). The members of this
family share a highly conserved motif PGGXGTXXE
that is probably functionally important (Fig. 3). Given
the fact that the activity of LDC is rather low in
higher plants, there was few report of the purification
and characterization of any plant LDC (Herminghaus
et al. 1991). LDC activity could be regarded as
one limiting factor in the synthesis of cadaverinederived secondary metabolites (Herminghaus et al.
1991). Berlin et al. (1998) reported that the biosynthesis
of phenylpropanoid-polyamine conjugates can be
stimulated by overexpression of a heterologous bacterial
Table 5. H. serrata leaf unique sequences with significant sequence similarities to genes possibly involved in alkaloids (including HupA) biosynthesis.
GenBank accession no.
Cytochrome P450
GO914428
GO912402
GO913165,GO913064
GO913720
GO914381,GO911886
GO911852
Decarboxylase
GO912010
GO914645
Dioxygenase
GO912724
Methyltransferase
GO913407
GO914756
Physiol. Plant. 139, 2010
Sequence with highest similarity
E-value
Cytochrome P450 71A1 (Persea Americana)
Cytochrome P450 72A1 (Catharanthus roseus)
Cytochrome P450 72A1 (Catharanthus roseus)
Cytochrome P450 74A (Arabidopsis thaliana)
Cytochrome P450 77A1 (Solanum melongena)
Cytochrome P450 90A1 (Arabidopsis thaliana)
2.00E–33
2.00E–21
4.00E–24
5.00E–58
4.00E–80
1.00E–53
Diaminopimelate decarboxylase (Archaeoglobus fulgidus)
Lysine decarboxylase (Arabidopsis thaliana) (HsLDC)
5.00E–18
2.00E–79
Hyoscyamine 6-dioxygenase (Hyoscyamus niger)
5.00E–36
Methyltransferase type 11 (Cyanothece sp. PCC 7425)
(RS)-norcoclaurine 6-O-methyltransferase (Coptis japonica)
2.00E–38
7.00E–29
7
Fig. 3. Protein sequence alignment of HsLDC with putative lysine decarboxylases. The multiple sequence alignment shows the conserved
PGGXGTXXE motif, marked by the asterisk, in the amino acid sequences of HsLDC and its closest Arabidopsis (At1g50575) and rice (Os03g0587100)
homologues. These proteins contain the Pfam 03641 domain, which characterizes putative LDCs (http://pfam.sanger.ac.uk/).
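One way to check a candidate peptide for the conserved PGGXGTXXE motif is a regular-expression scan in which X matches any residue. The Python sketch below assumes uppercase one-letter amino acid codes. The function names and the example peptide are hypothetical; the peptide was invented for illustration and is not the HsLDC sequence.

```python
import re


def motif_to_regex(motif, wildcard="X"):
    """Build a regex in which the wildcard letter matches any single residue."""
    parts = ["[A-Z]" if ch == wildcard else re.escape(ch) for ch in motif]
    return re.compile("".join(parts))


def find_motif(peptide, motif="PGGXGTXXE"):
    """Return (1-based start, matched text) for every occurrence, overlaps included."""
    pattern = motif_to_regex(motif)
    hits = []
    for start in range(len(peptide)):
        match = pattern.match(peptide, start)  # anchored at position `start`
        if match:
            hits.append((start + 1, match.group()))
    return hits


# Hypothetical peptide fragment, invented for illustration only.
example_peptide = "MKSVLPGGYGTLEELLEV"
print(find_motif(example_peptide))
```

Scanning every start position, rather than calling `finditer` once, keeps overlapping occurrences, which matters little for this motif but makes the helper reusable for shorter patterns.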
LDC. Detection of HsLDC activity will facilitate the
elucidation of the biosynthesis of lycopodium
alkaloids in H. serrata.
In general, a total of 11 (0.7%) unique sequences
in this library showed sequence similarities to uncharacterized enzymes that may be associated with
alkaloid biosynthesis (Table 5). These transcripts may
be involved in a series of catalytic processes, namely decarboxylation, oxidation, hydroxylation and methylation,
mediated by decarboxylases, dioxygenases, cytochrome
P450-dependent monooxygenases, methyltransferases
and other enzymes, leading to the production of a series
of precursors to HupA and related alkaloids. These transcripts will be further characterized for their biological
functions in the biosynthesis of lycopodium alkaloids.
Terpenoid biosynthesis
Serratanes are a unique family of pentacyclic triterpenoids possessing seven tertiary methyl moieties
or functional groups and a central seven-membered
C-ring, originally isolated from plants of the Pinaceae
family and the Lycopodium genus (Conner et al. 1981).
Previous studies have revealed that
H. serrata produces serratane-type triterpenoids (Zhou
et al. 2003, 2004, Zhu et al. 1994). In plants, both
the mevalonic acid (MVA) pathway in the cytosol
and the methylerythritol phosphate (MEP) pathway in the
chloroplast are used to synthesize isoprenoids, the
original precursors of triterpenoids. Based on progress in the investigation of triterpenoid biosynthesis
in plants, we found several unique sequences encoding 1-deoxy-D-xylulose 5-phosphate reductoisomerase,
squalene synthetase, farnesyl pyrophosphate synthetase
and isopentenyl-diphosphate delta-isomerase I in our
EST dataset (Table 6), most of which may participate
in the biosynthesis of serratane-type triterpenoids in
H. serrata.
Table 6. H. serrata leaf unique sequences with significant sequence similarities to genes involved in terpenoids (including serratane) biosynthesis.
GenBank accession no. | Sequence with highest similarity | E-value
GO913429 | 1-deoxy-D-xylulose 5-phosphate reductoisomerase (Oryza sativa subsp. japonica) | 1.00E–53
GO914776 | 10-deacetylbaccatin III 10-O-acetyltransferase (Taxus cuspidata) | 1.00E–25
GO911926 | Acetyl-CoA acetyltransferase (Arabidopsis thaliana) | 4.00E–89
GO248858 | Farnesyl pyrophosphate synthetase 2 (Lupinus albus) | 8.00E–69
GO912322 | Isopentenyl-diphosphate Delta-isomerase I (Arabidopsis thaliana) | 8.00E–70
GO913346 | Isopentenyl-diphosphate Delta-isomerase I (Camptotheca acuminata) | 1.00E–102
GO912569 | Squalene synthetase (Nicotiana benthamiana) | 2.00E–92
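To relate the Table 6 hits to the MVA and MEP routes mentioned in the text, the entries can be grouped by the pathway stage at which each enzyme acts. The Python sketch below is illustrative only: the stage labels in `STAGE_KEYWORDS` follow general isoprenoid biochemistry and are an assumption of this example, not an annotation reported in the study, and the helper names are hypothetical.

```python
# Rows copied from Table 6: (accession, best-matching enzyme).
TABLE6 = [
    ("GO913429", "1-deoxy-D-xylulose 5-phosphate reductoisomerase"),
    ("GO914776", "10-deacetylbaccatin III 10-O-acetyltransferase"),
    ("GO911926", "Acetyl-CoA acetyltransferase"),
    ("GO248858", "Farnesyl pyrophosphate synthetase 2"),
    ("GO912322", "Isopentenyl-diphosphate Delta-isomerase I"),
    ("GO913346", "Isopentenyl-diphosphate Delta-isomerase I"),
    ("GO912569", "Squalene synthetase"),
]

# Illustrative stage labels keyed by a substring of the enzyme name.
# These labels are an assumption of this sketch, not data from the paper.
STAGE_KEYWORDS = [
    ("xylulose", "MEP pathway"),
    ("Acetyl-CoA acetyltransferase", "MVA pathway"),
    ("Isopentenyl-diphosphate", "IPP/DMAPP isomerization"),
    ("Farnesyl", "prenyl chain elongation"),
    ("Squalene", "triterpenoid backbone"),
]


def assign_stage(enzyme):
    """Return the first stage whose keyword occurs in the enzyme name."""
    for keyword, stage in STAGE_KEYWORDS:
        if keyword.lower() in enzyme.lower():
            return stage
    return "other"


def group_by_stage(rows):
    """Map each stage label to the list of accessions assigned to it."""
    groups = {}
    for accession, enzyme in rows:
        groups.setdefault(assign_stage(enzyme), []).append(accession)
    return groups


stages = group_by_stage(TABLE6)
for stage, accessions in stages.items():
    print(f"{stage}: {', '.join(accessions)}")
```

Entries that match no keyword, such as the taxane acetyltransferase homologue, fall into the "other" group rather than being forced into a pathway.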
Flavone/flavonoid and anthocyanin biosynthesis
The identification of a gene specific to the flavonoid
biosynthetic pathway and the isolation of a flavone glycoside from H. serrata suggest that flavonoids, including
anthocyanins and flavones, are natural products commonly present in this medicinal plant (Wanibuchi
et al. 2007, Yang et al. 2008). The unique sequences
related to the flavone/flavonoid and anthocyanin pathways
in H. serrata are shown in Table 7, including transcripts encoding flavanone 3-hydroxylase, isoflavone
reductase homolog, leucoanthocyanidin dioxygenase,
NAD(P)H-dependent 6'-deoxychalcone synthase, anthocyanidin 3-O-glucosyltransferase and trans-cinnamate
4-monooxygenase. The use of ESTs has greatly facilitated the identification of candidate genes and lays the
foundation for elucidating the molecular basis of
flavonoid metabolism in H. serrata.
Phytohormone metabolism and signal transduction
As expected, a subset of transcripts had similarities to
genes previously implicated in secondary metabolite
biosynthesis. Another subset encodes protein kinases,
enzymes and transcription factors that may have a role
in phytohormone biosynthesis and signal transduction
as well as in gene regulation events specific to
developmental processes.
Table 7. H. serrata leaf unique sequences with significant sequence similarities to genes involved in flavonoids/anthocyanin biosynthesis.
GenBank accession no. | Sequence with highest similarity | E-value
GO913969 | Anthocyanidin 3-O-glucosyltransferase (Manihot esculenta) | 8.00E–19
GO914425 | Caffeic acid 3-O-methyltransferase (Populus tremuloides) | 1.00E–44
GO912462,GO912981 | Flavanone 3-hydroxylase (Arabidopsis thaliana) | 8.00E–33
GO914123,GO913199 | Flavanone 3-hydroxylase (Arabidopsis thaliana) | 1.00E–37
GO911813,GO911948 | Flavanone 3-hydroxylase (Eustoma grandiflorum) | 3.00E–12
GO913598 | Isoflavone reductase homolog (Lupinus albus) | 1.00E–53
GO912262,GO91202,GO912774,GO91405,GO914108 | Isoflavone reductase homolog P3 (Arabidopsis thaliana) | 2.00E–91
GO912570 | Leucoanthocyanidin dioxygenase (Malus domestica) | 2.00E–32
GO912551 | Naringenin,2-oxoglutarate 3-dioxygenase (Callistephus chinensis) | 6.00E–27
GO914476,GO914480 | Naringenin,2-oxoglutarate 3-dioxygenase (Callistephus chinensis) | 6.00E–23
GO912837 | Naringenin,2-oxoglutarate 3-dioxygenase (Dianthus caryophyllus) | 2.00E–38
GO912280,GO911884,GO913762,GO913220 | Naringenin,2-oxoglutarate 3-dioxygenase (Dianthus caryophyllus) | 3.00E–20
GO914964,GO912007 | NAD(P)H-dependent 6'-deoxychalcone synthase (Glycine max) | 1.00E–38
GO248797 | Tocopherol O-methyltransferase (Arabidopsis thaliana) | 4.00E–39
GO914186 | Tocopherol O-methyltransferase (Arabidopsis thaliana) | 1.00E–34
GO913685 | Trans-cinnamate 4-monooxygenase (Helianthus tuberosus) | 5.00E–81
Table 8. H. serrata leaf unique sequences with significant sequence similarities to genes involved in phytohormone metabolism and signal transduction.
GenBank accession no. | Sequence with highest similarity | E-value
Phytohormone metabolism
GO248843 | 1-aminocyclopropane-1-carboxylate oxidase homolog 1 (Arabidopsis thaliana) | 2.00E–29
GO914176 | 1-aminocyclopropane-1-carboxylate oxidase homolog 1 (Arabidopsis thaliana) | 3.00E–34
GO912668 | Cell elongation protein diminuto (Pisum sativum) | 2.00E–56
GO914650 | Cytokinin-O-glucosyltransferase 3 (Arabidopsis thaliana) | 9.00E–12
GO912348,GO912347 | Gibberellin 2-beta-dioxygenase 1 (Pisum sativum) | 4.00E–12
GO914646,GO914444 | Gibberellin 20 oxidase 1 (Arabidopsis thaliana) | 1.00E–35
GO912348,GO912347 | Gibberellin 20 oxidase 1 (Arabidopsis thaliana) | 3.00E–14
GO912059 | Gibberellin 20-oxidase-like protein (Selaginella moellendorffii) | 3.00E–09
GO912753,GO912652 | Jasmonate O-methyltransferase (Brassica rapa subsp. pekinensis) | 3.00E–23
GO911852 | Steroid 23-alpha-hydroxylase (Arabidopsis thaliana) | 1.00E–53
Phytohormone signal transduction
GO911793 | Gibberellin receptor GID1L2 (Arabidopsis thaliana) | 6.00E–23
GO913328 | Gibberellin receptor GID1 (Oryza sativa subsp. japonica) | 2.00E–26
GO912769 | Brassinosteroid insensitive 1-associated receptor kinase 1 precursor (Arabidopsis thaliana) | 4.00E–42
GO913169,GO912555,GO912751,GO913255,GO913656,GO913492,GO914630,GO914889,GO912625,GO913562,GO913449,GO912193,GO912954,GO911951,GO914472,GO912160,GO913618 | Brassinosteroid insensitive 1-associated receptor kinase 1 precursor (Arabidopsis thaliana) | 1.00E–51
Phytohormones, which are synthesized and transported throughout the plant and act at low concentrations, play important roles in the regulation
of plant development and environmental responses
(Achard et al. 2006, Bishopp et al. 2006, Gray 2004).
Key enzyme-encoding transcripts involved in phytohormone biosynthesis were found in this EST study of
the H. serrata leaf (Table 8). These transcripts encode
1-aminocyclopropane-1-carboxylate oxidase, gibberellin 20 oxidase 1 and gibberellin 2-beta-dioxygenase 1,
which are involved in the ethylene and gibberellin
biosynthetic pathways, respectively (Hamilton et al.
1990, Olszewski et al. 2002, Thomas et al. 1999). In
addition, unique sequences with similarities to cytokinin-O-glucosyltransferase 3
and jasmonate O-methyltransferase, which may be
associated with the metabolism of cytokinin
and jasmonate, respectively (Sakakibara 2006, Seo et al. 2001),
are also present in our EST dataset. Transcripts involved in gibberellin and brassinosteroid signal transduction were also identified, including those
encoding the gibberellin receptors GID1L2 and GID1
and brassinosteroid insensitive 1-associated receptor
kinase 1 (Belkhadir and Chory 2006, Ikeda et al. 2001,
Li and Jin 2007, Ueguchi-Tanaka et al. 2005) (Table 8).
These transcripts may play essential roles in regulating the development and environmental responses
of H. serrata.
Our EST dataset contained a few unique sequences
that may participate in transcriptional regulation. These unique
sequences show high similarity to known
transcription factors, including WRKY (GO913556),
zinc finger A20 and AN1 domain-containing proteins
(GO911767, GO911913), TRAF-type zinc finger protein
(GO914042) and RING-H2 finger protein (GO914999,
GO914647, GO914685, GO913341). WRKY proteins are a
class of transcription factors found only in plants,
characterized by a highly conserved amino acid motif,
WRKYGQK, at the N-terminus and a metal-chelating
zinc finger signature at the C-terminus (Mangelsen et al.
2008). WRKY proteins play roles in pathogen defense,
plant development regulation and sugar signal
transduction (Lagace and Matton 2004, Lai et al. 2008,
Sun et al. 2003, Zhang et al. 2008). The A20/AN1 zinc
finger proteins represent common elements of the stress
response in plants and have been identified in rice
and Arabidopsis (Vij and Tyagi 2008). These disease- or stress-associated transcripts may be involved in
responses to the environment during the development
of H. serrata.
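Because ESTs are nucleotide sequences, a conserved peptide signature such as WRKYGQK has to be searched for in translated reading frames. The following Python code is a minimal sketch of such a six-frame scan, assuming the standard genetic code. The function names and the short example sequence are hypothetical and were not part of the annotation performed in this study.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from position 0; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_hits(dna, motif="WRKYGQK"):
    """Return (frame, 1-based residue position) for each motif occurrence.

    Frames +1 to +3 are read on the given strand, -1 to -3 on its
    reverse complement.
    """
    dna = dna.upper()
    hits = []
    for sign, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            protein = translate(strand[offset:])
            pos = protein.find(motif)
            while pos != -1:
                hits.append((f"{sign}{offset + 1}", pos + 1))
                pos = protein.find(motif, pos + 1)
    return hits


# Hypothetical 25-nt fragment built for illustration: two leading bases,
# codons for W-R-K-Y-G-Q-K, and two trailing bases.
example_est = "AC" + "TGGCGTAAATATGGTCAAAAA" + "GC"
print(six_frame_hits(example_est))
```

An exact-match scan of this kind only flags candidates; similarity searches against characterized proteins, as used for the tables above, remain the basis for annotation.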
Conclusion
This is the first report of the use of EST analysis
to study gene expression profiles in H. serrata, a
representative member of the Huperziaceae family. The
EST dataset will provide a significant resource for gene
identification and molecular breeding in H. serrata. A
number of unique sequences reported in this study will
be completely sequenced and characterized, which
will improve our understanding of H. serrata in the areas
of secondary metabolite biosynthesis, development
regulation and SSR-associated genetic selection. In
addition, the 501 SSR markers developed from the
library will facilitate genetic mapping in populations
of the Huperziaceae family.
Although we investigated the transcripts associated with the
biosynthesis of secondary metabolites and with developmental regulation in H. serrata, the scale of gene
detection is limited. To identify more transcripts involved
in the biosynthesis of bioactive compounds in H. serrata,
we will use tissue-specific normalized cDNA library construction and high-throughput EST generation based on ‘next-generation’
sequencing technology in the future. These studies will
facilitate the elucidation of the entire biosynthetic pathways of lycopodium alkaloids, the main pharmaceutical
resource, and improve understanding of the developmental mechanisms
of H. serrata at the molecular level.
Acknowledgements – This study was supported by the
National Natural Science Foundation of China (30900113).
We thank Professor Yu-Lin Lin (Institute of Medicinal
Plant Development, Chinese Academy of Medical Sciences
& Peking Union Medical College, China) for his kind
help in the authentication of the H. serrata plants. We
thank Professor Chang Liu (Molecular Chinese Medicine
Laboratory, LKS Faculty of Medicine, The University of
Hong Kong, Hong Kong, China) for his kind advice on the
revisions of this manuscript.
References
Achard P, Cheng H, De Grauwe L, Decat J, Schoutteten H,
Moritz T, Van Der Straeten D, Peng J, Harberd NP (2006)
Integration of plant responses to environmentally
activated phytohormonal signals. Science 311: 91–94
Adams MD, Kelley JM, Gocayne JD, Dubnick M,
Polymeropolos MH, Xiao H, Merril CR, Wu A, Olde B,
Moreno RF, Mccombe WR, Venter JC (1991)
Complementary DNA sequencing: expressed sequence
tags and human genome project. Science 252:
1651–1656
Asamizu E, Nakamura Y, Sato S, Tabata S (2000) A large
scale analysis of cDNA in Arabidopsis thaliana:
generation of 12,028 non-redundant expressed
sequence tags from normalized and size-selected cDNA
libraries. DNA Res 7: 175–180
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H,
Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig J,
Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S,
Matese JC, Richardson JE, Ringwald M, Rubin GM,
Sherlock G. The Gene Ontology Consortium (2000)
Gene Ontology: tool for the unification of biology. Nat
Genet 25: 25–29
Audic S, Claverie JM (1997) The significance of digital
gene expression profiles. Genome Res 7: 986–995
Belkhadir Y, Chory J (2006) Brassinosteroid signaling: a
paradigm for steroid hormone signaling from the cell
surface. Science 314: 1410–1411
Berlin J, Mollenschott C, Herminghaus S, Fecker LF (1998)
Lysine decarboxylase transgenic tobacco root cultures
biosynthesize novel hydroxycinnamoylcadaverines.
Phytochemistry 48: 79–84
Bishopp A, Mahonen AP, Helariutta Y (2006) Signs of
change: hormone receptors that regulate plant
development. Development 133: 1857–1869
Brandle JE, Richman A, Swanson AK, Chapman BP (2002)
Leaf ESTs from Stevia rebaudiana: a resource for gene
discovery in diterpene synthesis. Plant Mol Biol 50:
613–622
Castillo M, Gupta N, Ho K, MacLean DB, Spenser ID
(1970) Biosynthesis of lycopodine. Incorporation of
Δ1-piperideine and of pelletierine. Can J Chem 48:
2911–2918
Ching RC (1978) The Chinese fern families and genera:
systematic arrangement and historical origin. Acta
Phytotax Sin 16: 1–9
Comins DL, Al-awar RS (1995) Model studies toward the
synthesis of the lycopodium alkaloid, phlegmarine.
J Org Chem 60: 711–716
Conner AH, Haromy TP, Sundaralingam M (1981)
30-Nor-3.beta.-methoxyserrat-14-en-21-one: first
reported natural occurrence of a norserratene triterpene.
J Org Chem 46: 2987–2988
Doll J, Hause B, Demchenko K, Pawlowski K, Krajinski F
(2003) A member of the germin-like protein family is a
highly conserved mycorrhiza-specific induced gene.
Plant Cell Physiol 44: 1208–1214
Gray WM (2004) Hormonal regulation of plant growth and
development. PLoS Biol 2: 1270–1273
Hamilton AJ, Lycett GW, Grierson D (1990) Antisense
gene that inhibits synthesis of the hormone ethylene in
transgenic plants. Nature 346: 284–287
Herminghaus S, Schreier PH, McCarthy JEG, Landsmann J,
Botterman J, Berlin J (1991) Expression of a bacterial
lysine decarboxylase gene and transport of the protein
into chloroplasts of transgenic tobacco. Plant Mol Biol
17: 475–486
Ikeda A, Ueguchi-Tanaka M, Sonoda Y, Kitano H,
Koshioka M, Futsuhara Y, Matsuoka M, Yamaguchi J
(2001) Slender rice, a constitutive gibberellin response
mutant, is caused by a null mutation of the SLR1 gene,
an ortholog of the height-regulating gene
GAI/RGA/RHT/D8. Plant Cell 13: 999–1010
Jung JD, Park HW, Hahn Y, Hur CG, In DS, Chung HJ,
Liu JR, Choi DW (2003) Discovery of genes for
ginsenoside biosynthesis by analysis of ginseng
expressed sequence tags. Plant Cell Rep 22: 224–230
Lagace M, Matton DP (2004) Characterization of a WRKY
transcription factor expressed in late torpedo-stage
embryos of Solanum chacoense. Planta 219: 185–189
Lagercrantz U, Ellegren H, Andersson L (1993) The
abundance of various polymorphic microsatellite motifs
differs between plants and vertebrates. Nucleic Acids
Res 21: 1111–1115
Lai ZB, Vinod K, Zheng ZY, Fan BF, Chen ZX (2008)
Roles of Arabidopsis WRKY3 and WRKY4 transcription
factors in plant responses to pathogens. BMC Plant Biol
8: 68
Li J, Jin H (2007) Regulation of brassinosteroid signaling.
Trends Plant Sci 12: 37–41
Liu JS, Zhu YL, Yu CM, Zhou YZ, Han YY, Wu FW, Qi BF
(1986) The structures of huperzine A and B, two new
alkaloids exhibiting marked anticholinesterase activity.
Can J Chem 64: 837–839
Ma XQ, Gang DR (2004) The lycopodium alkaloids. Nat
Prod Rep 21: 752–772
Ma XQ, Tan CH, Zhu DY, Gang DR (2005) Is there a
better source of huperzine A than Huperzia serrata?
huperzine A content of Huperziaceae species in China.
J Agric Food Chem 53: 1393–1398
Ma XQ, Tan CH, Zhu DY, Gang DR (2006) A survey of
potential huperzine A natural resources in China: The
Huperziaceae. J Ethnopharmacol 104: 54–67
Ma XQ, Tan CH, Zhu DY, Gang DR, Xiao PG (2007)
Huperzine A from Huperzia species – An
ethnopharmacological review. J Ethnopharmacol 113:
15–34
Mangelsen E, Kilian J, Berendzen KW, Kolukisaoglu ÜH,
Harter K, Jansson C, Wanke D (2008) Phylogenetic
and comparative gene expression analysis of barley
(Hordeum vulgare) WRKY transcription factor family
reveals putatively retained functions between monocots
and dicots. BMC Genomics 9: 194
Morgante M, Hanafey M, Powell W (2002) Microsatellites
are preferentially associated with nonrepetitive DNA in
plant genomes. Nat Genet 30: 194–200
Morita H, Kondo S, Kato R, Wanibuchi K, Noguchi H,
Sugio S, Abe I, Kohno T (2007) Crystallization and
preliminary crystallographic analysis of an
acridone-producing novel multifunctional type III
polyketide synthase from Huperzia serrata. Acta Cryst
63: 576–578
Murata J, Bienzle D, Brandle JE, Sensen CW, Luca VD
(2006) Expressed sequence tags from Madagascar
periwinkle (Catharanthus roseus). FEBS Lett 580:
4501–4507
Nyembo L, Goffin A, Hootele C, Braekman JC (1978)
Phlegmarine, a likely key intermediate in the
biosynthesis of the lycopodium alkaloids. Can J Chem
56: 851–856
Ohlrogge J, Benning C (2000) Unraveling plant
metabolism by EST analysis. Curr Opin Plant Biol 3:
224–228
Olszewski N, Sun TP, Gubler F (2002) Gibberellin
signaling: biosynthesis, catabolism, and response
pathways. Plant Cell 14: S61–S80
Oono Y, Seki M, Nanjo T, Narusaka M, Fujita M, Satoh R,
Satou M, Sakurai T, Ishida J, Akiyama K, Iida K,
Maruyama K, Satoh S, Yamaguchi-Shinozaki K,
Shinozaki K (2003) Monitoring expression profiles of
Arabidopsis gene expression during rehydration process
after dehydration using ca. 7000 full-length cDNA
microarray. Plant J 34: 868–887
Paraoan L, Grierson I, Maden BE (2000) Analysis of
expressed sequence tags of retinal pigment epithelium:
cystatin C is an abundant transcript. Int J Biochem Cell
Biol 32: 417–426
Sakakibara H (2006) Cytokinins: activity, biosynthesis, and
translocation. Annu Rev Plant Biol 57: 431–449
Schürmann P, Jacquot JP (2000) Plant thioredoxin systems
revisited. Annu Rev Plant Physiol Mol Biol 51: 371–400
Seo HS, Song JT, Cheong JJ, Lee YH, Lee YW, Hwang I,
Lee JS, Choi YD (2001) Jasmonic acid carboxyl
methyltransferase: a key enzyme for
jasmonate-regulated plant responses. Proc Natl Acad Sci
USA 98: 4788–4793
Shu QY, Wischnitzki E, Liu ZA, Ren HX, Han XY, Hao Q,
Gao FF, Xu SX, Wang LS (2009) Functional annotation
of expressed sequence tags as a tool to understand the
molecular mechanism controlling flower bud
development in tree peony. Physiol Plant 135: 436–449
Sun C, Palmqvist S, Olsson H, Borén M, Ahlandsberg
S, Jansson C (2003) A novel WRKY transcription factor,
SUSIBA2, participates in sugar signaling in barley by
binding to the sugar-responsive elements of the iso1
promoter. Plant Cell 15: 2076–2092
Suzuki H, Achnine L, Xu R, Matsuda SPT, Dixon RA
(2002) A genomics approach to the early stages of
triterpene saponin biosynthesis in Medicago truncatula.
Plant J 32: 1033–1048
Tang XC (1996) Huperzine A (Shuangyiping): a promising
drug for Alzheimer’s disease. Acta Pharmacol Sin 17:
481–484
Temnykh S, DeClerck G, Lukashova A, Lipovich L,
Cartinhour S, McCouch S (2001) Computational and
experimental analysis of microsatellites in Rice (Oryza
sativa L.): frequency, length variation, transposon
associations, and genetic marker potential. Genome Res
11: 1441–1452
Thomas SG, Phillips AL, Hedden P (1999) Molecular
cloning and functional expression of gibberellin
2-oxidases, multifunctional enzymes involved in
gibberellin deactivation. Proc Natl Acad Sci USA 96:
4698–4703
Ueguchi-Tanaka M, Ashikari M, Nakajima M, Itoh H,
Katoh E, Kobayashi M, Chow TY, Hsing YC, Kitano H,
Yamaguchi I, Matsuoka M (2005) GIBBERELLIN
INSENSITIVE DWARF1 encodes a soluble receptor for
gibberellin. Nature 437: 693–698
Vij S, Tyagi AK (2008) A20/AN1 zinc-finger
domain-containing proteins in plants and animals
represent common elements in stress response. Funct
Integr Genomics 8: 301–307
Wanibuchi K, Zhang P, Abe T, Morita H, Kohno T,
Chen GS, Noguchi H, Abe I (2007) An
acridone-producing novel multifunctional type III
polyketide synthase from Huperzia serrata. FEBS J 274:
1073–1082
Yang YB, Yang XQ, Xu YQ, Tai ZG, Ding ZT (2008) A new
flavone glycoside from Huperzia serrata. Chin J Nat Med
6: 408–410
Zhang L, Yuan D, Yu S, Li Z, Cao Y, Miao Z, Qian H,
Tang K (2004) Preference of simple sequence repeats in
coding and non-coding regions of Arabidopsis thaliana.
Bioinformatics 20: 1081–1086
Zhang J, Peng YL, Guo ZJ (2008) Constitutive expression of
pathogen-inducible OsWRKY31 enhances disease
resistance and affects root growth and auxin response in
transgenic rice plants. Cell Res 18: 508–521
Zhou H, Jiang SH, Tan CH, Wang BD, Zhu DY (2003)
New epoxyserratanes from Huperzia serrata. Planta Med
69: 91–94
Zhou H, Li YS, Tong XT, Liu HQ, Jiang SH, Zhu DY (2004)
Serratane-type triterpenoids from Huperzia serrata. Nat
Prod Res 18: 453–459
Zhu DY, Jiang SH, Huang MF, Lin LZ, Cordell GA (1994)
Huperserratinine from Huperzia serrata. Phytochemistry
36: 1069–1072
Supporting Information
Additional Supporting Information may be found in the
online version of this article:
Appendix S1. Key enzyme discovery in the unique
sequences of H. serrata leaf. The unique sequences
encoding key enzymes involved in the biosynthesis of
alkaloids, brassinosteroids, flavone/flavonoids, phenylpropanoids, terpenoids and steroids.
Appendix S2. SSR discovery in the unique sequences
of H. serrata leaf. The unique sequences containing the
putative SSRs per sequence are listed in this table with
unique sequence-ID, motif, number of motif repeats, SSR
start coordinate, SSR end coordinate, unique sequence
length, repeat length and the GenBank accession
no. of the corresponding ESTs composed of unique
sequences.
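The fields listed for Appendix S2 suggest one record per SSR. As a rough sketch of how such records could be produced, the Python function below scans a sequence for perfect di- to hexanucleotide repeats with a regular expression. The minimum repeat counts in `MIN_REPEATS`, the function names and the example sequence are assumptions made for illustration; the search criteria actually applied to the H. serrata unique sequences are not restated here, and the GenBank accession field is omitted.

```python
import re

# Hypothetical minimum repeat counts, keyed by repeat-unit length (bp).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(unit):
    """True if the unit is not a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(unit)
    return not any(
        n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq_id, seq, min_repeats=MIN_REPEATS):
    """Return one dict per perfect SSR, using 1-based inclusive coordinates."""
    seq = seq.upper()
    records = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One unit followed by at least (min_rep - 1) identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if not is_primitive(unit):
                continue  # skip homopolymer runs and shorter-unit repeats
            repeat_length = match.end() - match.start()
            records.append({
                "sequence_id": seq_id,
                "motif": unit,
                "repeats": repeat_length // unit_len,
                "start": match.start() + 1,
                "end": match.end(),
                "sequence_length": len(seq),
                "repeat_length": repeat_length,
            })
    return records


# Hypothetical sequence: seven AG repeats flanked by unrelated bases.
example_records = find_ssrs("demo", "GATTACA" + "AG" * 7 + "CCGTA")
print(example_records)
```

Only perfect repeats are reported by this sketch; compound or interrupted SSRs would need additional handling.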
Edited by D. Campbell