MADS-Box Gene Diversity in Seed Plants 300 Million Years Ago
Annette Becker, Kai-Uwe Winter, Britta Meyer, Heinz Saedler, and Günter Theißen
Max-Planck-Institut für Züchtungsforschung, Abteilung Molekulare Pflanzengenetik, Köln, Germany
MADS-box genes encode a family of transcription factors which control diverse developmental processes in flowering plants ranging from root development to flower and fruit development. Through phylogeny reconstructions,
most of these genes can be subdivided into defined monophyletic gene clades whose members share similar expression patterns and functions. Therefore, the establishment of the diversity of gene clades was probably an
important event in land plant evolution. In order to determine when these clades originated, we isolated cDNAs of
19 different MADS-box genes from Gnetum gnemon, a gymnosperm model species and thus a representative of
the sister group of the angiosperms. Phylogeny reconstructions involving all published MADS-box genes were then
used to identify gene clades containing putative orthologs from both angiosperm and gymnosperm lineages. Thus,
the minimal number of MADS-box genes that were already present in the last common ancestor of extant gymnosperms and angiosperms was determined. Comparative expression studies involving pairs of putatively orthologous genes revealed a diversity of patterns that has been largely conserved since the time when the angiosperm
and gymnosperm lineages separated. Taken together, our data suggest that there were already at least seven different
MADS-box genes present at the base of extant seed plants about 300 MYA. These genes were probably already
quite diverse in terms of both sequence and function. In addition, our data demonstrate that the MADS-box gene
families of extant gymnosperms and angiosperms are of similar complexities.
Introduction
MADS-box genes (Schwarz-Sommer et al. 1990)
encode transcription factors which play important roles
in developmental control in plants, animals, and fungi
(Shore and Sharrocks 1995; Theißen and Saedler 1995;
Theißen, Kim, and Saedler 1996; Riechmann and Meyerowitz 1997; Theißen et al. 2000). Some plant MADS-box genes, such as DEFICIENS (DEF) from Antirrhinum majus and AGAMOUS (AG) from Arabidopsis thaliana, work as organ identity (homeotic selector) genes
during flower development (Sommer et al. 1990; Yanofsky et al. 1990). Floral organ identity genes can be
subdivided into four different classes, termed A-, B-, C-, and D-function genes, whose members provide four different homeotic functions, with A specifying sepals;
A+B, petals; B+C, stamens; C, carpels; and D, ovules
(Weigel and Meyerowitz 1994; Angenent and Colombo
1996). Most floral organ identity genes that could be
cloned to date belong to the family of MADS-box genes
(for recent reviews, see Theißen, Kim, and Saedler
1996; Riechmann and Meyerowitz 1997; Theißen et al.
2000). The MADS-type floral homeotic genes of Arabidopsis are APETALA1 (AP1; A-function), APETALA3
and PISTILLATA (AP3 and PI; B-function), and AG (C-function). An Arabidopsis D-function gene has not been
described to date.
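The combinatorial logic of the four homeotic functions described above can be written as a small lookup. The following Python sketch is purely illustrative: the function name `organ_identity` and the representation of active functions as sets of letters are assumptions made here, not part of the cited models.

```python
# Illustrative encoding of the floral homeotic functions described in the text:
# A -> sepals, A+B -> petals, B+C -> stamens, C -> carpels, D -> ovules.
ORGAN_BY_FUNCTIONS = {
    frozenset("A"): "sepals",
    frozenset("AB"): "petals",
    frozenset("BC"): "stamens",
    frozenset("C"): "carpels",
    frozenset("D"): "ovules",
}


def organ_identity(active_functions):
    """Return the organ specified by a combination of active functions.

    Returns None for combinations that the text does not describe.
    (Hypothetical helper, for illustration only.)
    """
    return ORGAN_BY_FUNCTIONS.get(frozenset(active_functions))


print(organ_identity({"A", "B"}))  # petals
print(organ_identity({"C"}))       # carpels
```

Combinations that are not listed in the text, such as A together with C, simply return `None` in this sketch.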
Key words: MADS-box gene, gymnosperm, angiosperm, Gnetales, development, evolution.
Address for correspondence and reprints: Günter Theißen, Max-Planck-Institut für Züchtungsforschung, Abteilung Molekulare Pflanzengenetik, Carl-von-Linné-Weg 10, D-50829 Köln, Germany. E-mail:
[email protected].
Mol. Biol. Evol. 17(10):1425–1434. 2000
© 2000 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038
Besides providing floral homeotic functions,
MADS-box genes have many other roles within the gene
networks that ‘‘control’’ reproductive development in
angiosperms such as Arabidopsis (for reviews, see Okada and Shimura 1994; Theißen and Saedler 1995, 1998,
1999; Theißen, Kim, and Saedler 1996). FLC, for example, is a ‘‘flowering time gene’’ which mediates, depending on environmental factors such as cold, the
switch from vegetative to reproductive development
(Michaels and Amasino 1999). Flowering time genes
exert their function by influencing meristem identity
genes. Inflorescence meristem identity genes, such as the
MADS-box gene FRUITFULL (FUL), and floral meristem identity genes, such as the MADS-box genes AP1
and CAULIFLOWER (CAL), specify the identities of inflorescence and floral meristems, respectively, and thus
‘‘control’’ the transition from one meristem type to the
other. Within floral meristems, cadastral genes set the
boundaries of floral organ identity gene functions, thus
defining the different floral whorls. Besides its role as a
floral organ identity gene, AG also has a cadastral function, because it prevents the A-function from being expressed in the third and fourth floral whorls. Some ‘‘intermediate genes,’’ such as AGL2, AGL4, and AGL9,
possibly mediate between floral meristem and organ
identity genes. The floral organ identity genes specify
the organ identity within each whorl of the flower by
activating ‘‘realizator genes’’. After fertilization of the
flower, MADS-box genes such as AGL1, AGL5, and
FUL are required for proper fruit development (Gu et
al. 1998; Liljegren et al. 1998).
Moreover, transcription of a number of MADS-box
genes outside flowers and fruits suggests that members
of this gene family play regulatory roles also during
vegetative development, such as embryo, root, and leaf
development (Ma, Yanofsky, and Meyerowitz 1991;
Huang et al. 1995; Rounsley, Ditta, and Yanofsky 1995;
Theißen et al. 2000). Analysis of a transgenic mutant
indicated that the MADS-box gene ANR1 is a key component of the signal transduction chain by which nitrate
stimulates lateral root proliferation (Zhang and Forde
1998). The existence of MADS-box genes in gymnosperms and ferns, which form neither flowers nor fruits,
further demonstrates that the role of these genes in
plants is not restricted to flower or fruit development
(Tandre et al. 1995; Münster et al. 1997; Mouradov et
al. 1998, 1999; Rutledge et al. 1998; Sundström et al.
1999; Winter et al. 1999).
There are reasons to assume that changes in number, expression, and interaction of developmental control
genes all have contributed to the evolution of plant form
(Theißen and Saedler 1995; Theißen, Kim, and Saedler
1996; Theißen et al. 2000). Since MADS-box genes play
important roles in the gene networks that ‘‘control’’
plant development, understanding the phylogeny of
MADS-box genes may strongly improve our understanding of plant evolution. When and how was the diversity of MADS-box genes present in flowering plants
such as Arabidopsis generated during evolution? Did it
appear during angiosperm evolution, or is it considerably older? Do changes in the expression and function
of these genes reflect morphological innovations during
plant evolution? To answer these questions, the phylogeny of MADS-box genes has to be reconstructed and
superimposed on the phylogeny of land plant taxa. As
a prerequisite, the MADS-box gene families of phylogenetically informative key taxa have to be
characterized.
For the angiosperm model species A. thaliana, we
are already quite close to a complete knowledge of all
MADS-box genes: more than 40 of them have been isolated so far (Liljegren et al. 1998), and based on a large
fraction of the A. thaliana genome which has already
been sequenced, their total number in the genome can
be estimated to be about 60. However, only for a minority of genes is the function defined by a mutant phenotype. In the other key taxa of land plants (gymnosperms, pteridophytes, bryophytes), the sampling of
MADS-box genes is far less complete (Theißen et al.
2000). Nevertheless, some key insights into the phylogeny of plant MADS-box genes have already been
obtained.
Phylogeny reconstructions revealed that the
MADS-box gene family is composed of several defined
gene clades (J. J. Doyle 1994; Purugganan et al. 1995;
Theißen, Kim, and Saedler 1996; Theißen et al. 2000).
Almost all plant MADS-box genes known to date are
members of a monophyletic superclade of genes with a
conserved structural organization, the so-called MIKC-type domain structure, including a MADS (M-), an intervening (I-), a keratin-like (K-), and a C-terminal (C-)
domain (Ma, Yanofsky, and Meyerowitz 1991; Theißen,
Kim, and Saedler 1996; Hasebe and Banks 1997; Münster et al. 1997). The highly conserved MADS-domain
is the major determinant of DNA binding, but it also
performs dimerization and accessory factor binding
functions (Shore and Sharrocks 1995). The I-domain is
only relatively weakly conserved among plant MADS-domain proteins (Purugganan et al. 1995), but it may
constitute a key molecular determinant for the selective
formation of DNA-binding dimers (Riechmann and
Meyerowitz 1997). The K-domain is characterized by a
conserved regular spacing of hydrophobic residues,
which is proposed to allow for the formation of an amphipathic helix involved in protein dimerization (Ma,
Yanofsky, and Meyerowitz 1991; Shore and Sharrocks
1995). The most variable region, both in sequence and
in length, is the C-domain at the C-terminal end of the
MADS-domain proteins. In some MADS-domain proteins, it is involved in transcriptional activation or the
formation of multimeric transcription factor complexes,
respectively (Cho et al. 1999; Egea-Cortines, Saedler,
and Sommer 1999).
The MIKC-type gene superclade can be further
subdivided into several well-defined gene clades whose
members share similar expression patterns and highly
related functions. For example, all A-, B-, C- and D-function genes known to date fall into separate clades,
namely SQUAMOSA- (A-function), DEFICIENS- or
GLOBOSA- (B-function), and AGAMOUS-like genes
(C- and D-function) (J. J. Doyle 1994; Purugganan et
al. 1995; Theißen and Saedler 1995; Angenent and Colombo 1996; Theißen, Kim, and Saedler 1996; Münster
et al. 1997; Theißen et al. 2000). Therefore, the establishment of the mentioned gene clades by gene duplication, diversification, and fixation was probably an important step toward the establishment of the floral homeotic functions (Theißen, Kim, and Saedler 1996).
Former studies of MADS-box gene evolution,
based mainly on sequences from ferns and angiosperms,
have led to the conclusion that the last common ancestor
of ferns and seed plants about 400 MYA had at least
two different MIKC-type genes, but did not yet have orthologs of any
of the MADS-box genes present in angiosperms
(Münster et al. 1997; Theißen et al. 2000). In the
last common ancestor of monocots and dicots about 200
MYA, however, the vast majority of MADS-box gene
types known from Arabidopsis, including orthologs of
all types of floral homeotic genes, were already established (Theißen et al. 2000). These gene lineages therefore must have been established in the lineage that led
to the flowering plants after the separation from the fern
lineage but very likely before the radiation of the flowering plants. The only extant taxa whose lineage
branched off from the lineage that led to flowering
plants during this critical time interval 400–200 MYA
are the gymnosperms. Molecular data are currently converging on the view that extant gymnosperms are a
monophyletic group (Goremykin et al. 1996; Chaw et
al. 1997; Qiu et al. 1999; Samigullin et al. 1999; Soltis
et al. 1999) which separated from the lineage that led
to angiosperms about 300 MYA (Wolfe et al. 1989; Savard et al. 1994; Goremykin, Hansmann, and Martin
1997).
In order to determine how many gene clades are
shared by angiosperms and gymnosperms, we screened
for MADS-box genes in the gymnosperm Gnetum gnemon, a member of the gnetophytes. Phylogeny reconstructions and comparison of expression patterns were
then used to determine the minimal number and type of
MADS-box genes already present in the last common
ancestor of gymnosperms and angiosperms, providing a
minimal estimate for the structural and functional diversity of this complex regulatory gene family in plants
300 MYA.
FIG. 1.—Southern blot analysis of MADS-box genes in G. gnemon. DNA isolated from leaves of an individual tree growing in the botanical
garden of Bochum was digested with XbaI (X) or HindIII (H) as indicated above the lanes, electrophoresed, blotted onto nylon membranes,
and hybridized under stringent conditions with probes specific for the different GGM genes (1–19) as depicted in the figure. At the left margin,
the lengths of some marker molecules (in kb) are indicated. In some lanes, two bands can be seen, either due to the presence of internal XbaI
or HindIII sites in the respective genes (GGM2, GGM7, GGM10, GGM19) or due to the presence of two very similar genes or two different
alleles in the Gnetum gnemon genome (GGM18) as confirmed by sequence analysis and Southern blots using four additional restriction enzymes
(data not shown).
Materials and Methods
Plant Material
Leaves and cones of male and female G. gnemon
trees growing in the botanical gardens of the University
of Bochum or Karlsruhe, Germany, were used throughout this study.
Sequence Alignments and Construction of
Phylogenetic Trees
Multiple alignments of conceptual amino acid sequences were generated by using the PILEUP program
of the GCG package (version 10.0) with a gap creation
penalty of 8 and a gap extension penalty of 2 (default
parameters). Based on alignments of the MADS-domain
(60 amino acids) plus the 110 amino acids downstream
of the MADS-domain (termed ‘‘MADS+110’’ or
‘‘170’’ domain sequence; see Theißen, Kim, and Saedler
1996; Winter et al. 1999), phylogenetic trees were constructed by the neighbor-joining method (Saitou and Nei
1987), version 3.5, as implemented by the PHYLIP program package (Felsenstein 1993). The neighbor-joining
method was chosen because it is known to be quite efficient in obtaining reliable trees from large sets of data
(Zhang and Nei 1996). Distance matrices were generated using the protein distance algorithm, version 3.55c,
which is based on the PAM model of amino acid transition (Dayhoff 1979). To assess support for the inferred
relationships, 100 bootstrap samples were generated as
described (Münster et al. 1997).
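The bootstrap step mentioned above rests on resampling alignment columns with replacement. The following Python sketch illustrates only that resampling idea under simple assumptions; it is not the PHYLIP implementation used in this study, and the function name `bootstrap_alignments` and the toy sequences are hypothetical. Each replicate would then be passed to the distance and tree-building steps.

```python
import random


def bootstrap_alignments(alignment, n_replicates=100, seed=0):
    """Yield bootstrap replicates of an alignment.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    Each replicate has the same number of columns as the input, drawn with
    replacement; the same column indices are used for every sequence.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    for _ in range(n_replicates):
        columns = [rng.randrange(length) for _ in range(length)]
        yield {name: "".join(alignment[name][c] for c in columns) for name in names}


# Toy alignment with made-up sequences, for illustration only.
toy = {"seq1": "MGRGK", "seq2": "MGRKK"}
replicates = list(bootstrap_alignments(toy, n_replicates=100))
print(len(replicates))  # 100
```

The bootstrap percentage of a clade is then the share of replicate trees in which that clade is recovered.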
Isolation of cDNAs
cDNAs were isolated by rapid amplification of cDNA
ends (RACE) procedures as described (Winter et al. 1999).
As template, poly(A)+ RNA isolated from leaves or cones
of male or female G. gnemon trees was used. Sequences
of primers used during the 3′ and 5′ RACE procedures
can be downloaded from our home page (http://www.mpiz-koeln.mpg.de/~theissen/grouphome/index.html).
The obtained cDNAs were sequenced on both strands using automatic sequencers. The nucleotide sequence data of
the cDNAs have been deposited in the EMBL, GenBank,
and DDBJ nucleotide sequence databases under the accession numbers AJ132207–AJ132219 (GGM1–GGM13) and
AJ251554–AJ251559 (GGM14–GGM19).
Hybridization Studies
Hybridization probes were obtained from the region downstream of the MADS-box to avoid cross-hybridization with other gene family members. For Southern analyses, DNA gel blots were prepared by standard
methods (Sambrook, Fritsch, and Maniatis 1989) with
10 µg DNA per lane, isolated from G. gnemon leaf material, and digested with restriction enzymes BamHI,
EcoRI, EcoRV, HindII, HindIII, or XbaI. For the synthesis of probes, linear PCR was employed essentially
as described (Fischer et al. 1995), but PCR products of
GGM gene cDNAs were used as templates, and different
gene-specific oligonucleotides were used as primers.
The filters were hybridized and washed as described
elsewhere (Münster et al. 1997). Northern analyses were
carried out as described (Winter et al. 1999) using several different Northern blots, onto which, however, aliquots of the same RNA preparations from leaves and
male as well as female cones had always been
transferred.
Results
cDNA Cloning and Structural Evaluation of MADS-Box Genes from G. gnemon
We cloned and sequenced the cDNAs of 19 different MADS-box genes, termed GGM1–GGM19, from the
gnetophyte G. gnemon (for GGM1–GGM13, see also
Winter et al. 1999). Hybridization of Southern blots containing genomic DNA of an individual G. gnemon tree
with different probes specific for each of the GGM genes
under stringent conditions indicated that GGM1–
GGM19 represent 18 different single-copy genes, plus
one gene (GGM18) that either is present in two different
alleles or has a duplicate locus with a very similar sequence in the G. gnemon genome (fig. 1).
Sequence comparisons revealed that the products
encoded by the GGM genes have a MIKC-type domain
structure (fig. 2), like almost all MADS-domain proteins
isolated from vascular plants. The high conservation
known for the MADS-domains of angiosperm proteins
is also obvious for the sequences from Gnetum (fig. 2).
FIG. 2.—Conserved domain structure of MADS-domain proteins from Gnetum gnemon. Conceptual amino acid sequences of GGM genes
were aligned by a computer program. A ‘‘<’’ sign at the beginning of some sequences indicates that they are incomplete at the N-terminus due
to the cloning procedures used. The MADS-, I-, K-, and C-domains are indicated. Within the K-domain and in its vicinity, hydrophobic amino
acids (L, I, V, M) are shown in bold, and in addition, positions at which more than 75% of the sequences have a hydrophobic residue are
marked by a star.
The presence of K-domains in the GGM proteins makes
it conceivable that they all interact with other K-domain–containing proteins via these regions, very likely
the same or other types of MADS-domain proteins.
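The convention used in figure 2 for marking conserved hydrophobic positions can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually used for the figure: the function name `mark_hydrophobic_columns` is hypothetical, the toy sequences are made up, and gap characters are simply counted as non-hydrophobic.

```python
HYDROPHOBIC = set("LIVM")  # residues treated as hydrophobic, as in fig. 2


def mark_hydrophobic_columns(aligned_seqs, threshold=0.75):
    """Return a marker line with '*' under conserved hydrophobic columns.

    A column is marked when more than `threshold` of the sequences carry
    L, I, V, or M at that position. Gaps count as non-hydrophobic
    (an assumption of this sketch).
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    marks = []
    for i in range(length):
        hydrophobic = sum(seq[i] in HYDROPHOBIC for seq in aligned_seqs)
        marks.append("*" if hydrophobic / len(aligned_seqs) > threshold else " ")
    return "".join(marks)


# Made-up five-residue toy alignment, for illustration only.
toy = ["LKEIA", "LREVA", "LKQMA", "IKELA", "QKELA"]
for seq in toy:
    print(seq)
print(mark_hydrophobic_columns(toy))
```

In the toy alignment the first column is hydrophobic in four of five sequences (80%) and the fourth column in all five, so both are marked.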
The similarity between the GGM genes and all other MIKC-type genes with respect to overall domain
structure indicates that they share a common ancestor
from which they were derived by gene duplication, sequence diversification, and fixation. However, the I- and
C-domains are quite diverse in both length and sequence
(fig. 2), suggesting a functional diversification of the
GGM proteins in the selective formation of DNA-binding dimers or tetramers or in transcriptional activation.
This structural diversity could be due to rapid sequence
evolution or an ancient origin of the corresponding
genes. To determine the minimal number of gene clades
containing sequences from both gymnosperms and angiosperms, and thus to distinguish between these possibilities, the phylogeny of the MADS-box gene family
was reconstructed involving the novel G. gnemon genes.
A Large Fraction of the GGM Genes Have Putative
Orthologs in Flowering Plants
In initial phylogeny reconstructions, all available
MIKC-type MADS-domain proteins were used, many of
which have been published only in databases to date. In
this way, genes with putative orthologs in gymnosperms
were determined on the basis of the total available evidence (data not shown). For simplicity, we then constructed trees in which the majority of genes which had
no putative orthologs in gymnosperms and angiosperms
were omitted. Moreover, we reduced the gene sets of
huge clades to a few representative members. In addition,
GGM6, GGM15, and GGM16 were omitted from most
tree reconstruction procedures, because uncertainties in
sequence alignments made the tree topology sensitive to
gene sampling. Our preliminary data suggest, however,
that GGM15 is closely related to the DAL12 gene from
the gymnosperm Picea abies (Sundström et al. 1999) and
that GGM6 and GGM16 are not members of any of the
MADS-box gene subfamilies described so far.
An informative phylogenetic tree is shown in figure
3. More comprehensive trees are accessible via the World
Wide Web (http://www.mpiz-koeln.mpg.de/mads/). Five
different clades of putatively orthologous genes from
both gymnosperms and angiosperms which had been reported before (Tandre et al. 1995; Mouradov et al. 1998,
1999; Rutledge et al. 1998; Sundström et al. 1999; Winter et al. 1999) could be confirmed in this study. These
were the AG-, AGL2-, AGL6-, DEF/GLO-, and TM3-like
genes, comprising the G. gnemon genes GGM1–GGM3,
GGM9, and GGM11 (fig. 3). In addition, the tree in
figure 3 identifies GGM12 as a STMADS11-like gene.
Moreover, cloning of a GGM13-like gene from maize
(Zea mays ssp. mays) (unpublished data) also established a novel clade containing both gymnosperm and
angiosperm sequences, termed GGM13-like genes (fig.
3). The GGM13-like genes are closely related to the
DEF and GLO-like genes (fig. 3), which provide the
floral homeotic B-function in angiosperms.
The other GGM genes do not fall into any of the
subfamilies described in the literature (J. J. Doyle 1994;
Purugganan et al. 1995; Theißen and Saedler 1995;
Theißen, Kim, and Saedler 1996; Münster et al. 1997;
Winter et al. 1999; Theißen et al. 2000), if bootstrap
support of >50% is used as a criterion (fig. 3 and unpublished data). Whether they have orthologs in angiosperms remains to be seen in future studies. The topology of many phylogenetic trees (e.g., fig. 3) would be
compatible with the view that GGM10 is an AGL12-like
gene and GGM19 is an AGL15-like gene. However,
bootstrap support for the respective relationships is so
low that additional evidence will be needed to clarify
these cases.
FIG. 3.—Phylogenetic tree showing the relationships between a subset of the MADS-domain proteins known. Genus names of species from
which the respective genes were isolated are given in parentheses beside the protein names. Gnetum proteins are indicated by inverted boxes,
and proteins from nongnetalean gymnosperms are indicated by shaded boxes. Proteins from ferns are highlighted by open boxes. Proteins that
are not boxed represent angiosperm sequences. The numbers next to some nodes give bootstrap percentages, shown only for relevant nodes and
those defining gene subfamilies (Theißen and Saedler 1995; Theißen, Kim, and Saedler 1996; Münster et al. 1997; Winter et al. 1999).
Subfamilies are labeled by brackets at the right margin. Bootstrap values and subfamily names corresponding to minimal clades containing
sequences from both gymnosperms and angiosperms are boxed.
Differential Expression of GGM Genes
Similar expression patterns may corroborate hypotheses about orthology if these expression patterns are
found for most (or even all) members of the clade of
putatively orthologous genes and are rarely (or not at
all) found outside the respective gene clade (Winter et
al. 1999). Moreover, knowledge about the expression
patterns of Gnetum MADS-box genes may provide first
clues concerning the functions of the genes. Therefore,
we worked out an overview of the expression of the
GGM genes employing Northern hybridization.
For GGM5, GGM10, and GGM17, expression has
not been found so far in the investigated organs, very
likely because it is too weak there to be detected by
hybridization (data not shown). The expression patterns
of the other GGM genes appear to be quite diverse, even
at our low level of spatial resolution (fig. 4; for the expression patterns of GGM1–GGM3, GGM9, and
GGM11, see Winter et al. 1999). The expression of most
genes is restricted to male or female reproductive units,
but only GGM3, GGM8, and GGM9 seem to be expressed there in approximately equal amounts. Some
other genes (GGM4, GGM7, GGM11) are more strongly
expressed in female than in male cones. Since the cones
of male G. gnemon plants contain a certain amount of
sterile female reproductive units assumed to be involved
in pollinator attraction (Hufford 1996), it seems possible
that expression of these genes is restricted to female
reproductive units, comprising the fertile ovules of female cones and the sterile ovules of male cones. Expression of GGM2, GGM6, GGM15, GGM16, and
GGM18 has been detected so far only in male cones. In
contrast, GGM13 expression appears to be specific for
female cones. Only four genes show considerable expression in vegetative leaves, with GGM1 being almost
ubiquitously expressed in vegetative and reproductive
organs, GGM12 being more strongly expressed in leaves
than in cones, GGM14 being more strongly expressed
in female cones than in male cones and leaves, and
GGM19 being expressed in leaves and female cones but
not in male cones. Taken together, these data suggest a
considerable diversity of GGM gene functions in both
vegetative and reproductive organ development in G.
gnemon.
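The expression patterns described in this section can be collected into a simple table for querying. The Python sketch below is one possible encoding and makes several assumptions: organs are abbreviated as in figure 4 (L, M, F), only the presence or absence of a detected signal is recorded, relative signal strengths are ignored, and genes described as more strongly expressed in one organ than in another are treated as detected in both. The names `DETECTED` and `genes_where` are hypothetical.

```python
# L = leaves, M = male cones, F = female cones (abbreviations as in fig. 4).
# Each set lists the organs in which the text reports detectable expression.
DETECTED = {
    "GGM1": {"L", "M", "F"},   # almost ubiquitous
    "GGM2": {"M"},
    "GGM3": {"M", "F"},
    "GGM4": {"M", "F"},        # stronger in female cones
    "GGM5": set(),             # not detected so far
    "GGM6": {"M"},
    "GGM7": {"M", "F"},        # stronger in female cones
    "GGM8": {"M", "F"},
    "GGM9": {"M", "F"},
    "GGM10": set(),            # not detected so far
    "GGM11": {"M", "F"},       # stronger in female cones
    "GGM12": {"L", "M", "F"},  # stronger in leaves than in cones
    "GGM13": {"F"},
    "GGM14": {"L", "M", "F"},  # strongest in female cones
    "GGM15": {"M"},
    "GGM16": {"M"},
    "GGM17": set(),            # not detected so far
    "GGM18": {"M"},
    "GGM19": {"L", "F"},
}


def genes_where(predicate):
    """Return gene names, in numerical order, whose organ set satisfies predicate."""
    hits = (gene for gene, organs in DETECTED.items() if predicate(organs))
    return sorted(hits, key=lambda gene: int(gene[3:]))


leaf_expressed = genes_where(lambda organs: "L" in organs)
cone_restricted = genes_where(lambda organs: bool(organs) and "L" not in organs)
male_only = genes_where(lambda organs: organs == {"M"})

print(leaf_expressed)        # ['GGM1', 'GGM12', 'GGM14', 'GGM19']
print(len(cone_restricted))  # 12
```

Under this encoding, 12 of the 19 genes show detected expression only in cones, which matches the statement that expression of most genes is restricted to reproductive units.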
Discussion
Our studies on 19 different members of the MADSbox gene family from the gnetophyte G. gnemon reveal
that this gene family is quite complex in terms of gene
number, sequence diversity, and expression patterns. In
line with this, PCR cloning of a 61-bp segment using
degenerate primers targeted to the MADS-box suggested
the presence of over 27 different MADS-box genes
within black spruce (Picea mariana), a gymnosperm belonging to the conifers (Rutledge et al. 1998). Both studies thus suggest that the complexity of the MADS-box
gene family in gymnosperms is similar to that in
angiosperms.
FIG. 4.—Northern blot analysis of GGM gene expression. The
names of the respective genes are indicated at the right margin. At the
left margin, the apparent lengths of the major bands are indicated in
kilobases. RNA sources were young leaves (L) and male (M) or female
(F) cones from Gnetum gnemon trees, as indicated. The expression
pattern of GGM1 has already been shown elsewhere (Winter et al.
1999), but since the gene is quite ubiquitously expressed, it is included
here as a control for RNA loading. At the top, a section of an ethidium
bromide–stained gel containing rRNA is shown before membrane blotting as an additional control for equal RNA loading.
In contrast to the sequence fragments available for
the black spruce genes, which are too small for reliable
phylogeny reconstructions, the (almost) full-length sequences from G. gnemon presented here could be used
to show that seven of them fall into six distinct gene
clades which also contain members from angiosperm
species. Together with the clade of AGL2-like genes
which contains the conifer gene PRMADS1 (Mouradov
et al. 1998), we thus have defined seven different gene
clades which contain both gymnosperm and angiosperm
members at the >70% level of bootstrap support. Six
of these clades have bootstrap support >80%, and four
even have bootstrap support of ≥90% (fig. 3).
FIG. 5.—The origin of MADS-box gene clades in the evolution of vascular plants. A phylogenetic tree of some major taxa of vascular
plants is shown. The ages (in MYA) given at two nodes of the tree are rough estimates. At the left side of the root and some branches of the
tree, three important stages in the evolution of the megasporangium are schematically depicted. From bottom to top: a sporangium that is not
covered by an integument, a condition still found in extant ferns; a sporangium that is covered by an integument (ovule); and a sporangium
that, in addition, is surrounded by a carpel. The gene names beside the branches denote gene subfamilies, not single genes. These were established,
at the latest, during the time interval represented by the respective branches of the phylogenetic tree. This could be concluded from the presence
of respective clade members in extant taxa. For example, AG-, AGL2-, AGL6-, DEF/GLO-, GGM13-, STMADS11-, and TM3-like genes have
already been isolated from angiosperms and gymnosperms, but not from ferns. ‘‘2 ≤ MIKC-type genes’’ symbolizes that the last common
ancestor of ferns and seed plants already had at least two MIKC-type MADS-box genes (Münster et al. 1997). Information about some gene
clades shown here but not described in this paper has been reviewed elsewhere (Theißen et al. 2000).
Due to their membership in distinct subclades of
the MADS-box gene tree, the respective GGM genes
and PRMADS1 are not just homologs, but even putative
orthologs of the respective clade members from angiosperms, meaning that the ancestors of these genes were
established during a speciation event(s) that separated
the lineage(s) that led to the respective extant gymnosperm groups from the lineage that led to extant
angiosperms.
Further implications of these findings depend on
the phylogenetic position of the gnetophytes and conifers within the seed plants. Extant seed plants comprise
angiosperms and four different groups of gymnosperms,
i.e., gnetophytes (with only three genera, Gnetum, Ephedra, and Welwitschia), conifers, cycads, and Ginkgo biloba. Although some phylogenetic analyses of morphological data suggested that gnetophytes are a sister group
to angiosperms among extant gymnosperms (J. A. Doyle
1994, 1996), recent phylogeny reconstructions based on
molecular markers indicated that gnetophytes were more
closely related to conifers than to angiosperms (Hansen
et al. 1999; Winter et al. 1999) and, moreover, that all
extant gymnosperms represent a monophyletic group
(Goremykin et al. 1996; Chaw et al. 1997; Qiu et al.
1999; Samigullin et al. 1999; Soltis et al. 1999). Therefore, ancestors of orthologous genes shared by angiosperms and any gymnosperm were very likely already
present in the last common ancestor of all extant seed
plants. Although the earliest fossil evidence of gymnosperms dates back to about 350–365 MYA (Beck 1988;
Taylor and Taylor 1993), the last common ancestor of
extant seed plants probably existed about 300 MYA—
recent estimations based on molecular data range from
285 to 348 MYA (Savard et al. 1994; Goremykin, Hansmann, and Martin 1997).
Assuming monophyly of all extant gymnosperms,
we thus conclude from our data that the last common
ancestor of extant gymnosperms and angiosperms about
300 MYA already contained at least seven different
MADS-box genes, namely, distinct representatives of
the clades of AG-, AGL2-, AGL6-, DEF/GLO-, GGM13-,
STMADS11-, and TM3-like genes (fig. 5). Aside from
the possibility that some gene types might have been
lost in some seed plant lineages, representatives of all
of the clades found in gnetophytes and angiosperms can
thus also be expected for conifers, Ginkgo, and cycads.
Indeed, AG-, AGL6-, DEF/GLO-, and TM3-like genes
have already been isolated from conifer species (Tandre
et al. 1995; Mouradov et al. 1998, 1999; Rutledge et al.
1998; Sundström et al. 1999), and an AGL6-like gene
has also been found in Ginkgo (Winter et al. 1999;
Theißen et al. 2000).
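The counting argument behind this minimal estimate can be made explicit: every clade that contains members from both lineages implies at least one gene in the last common ancestor. The Python sketch below illustrates this under stated assumptions. The gymnosperm assignments follow the text; the angiosperm entries are only representative members (clade namesakes or genes named in this paper, with TM3 listed as namesake of its clade), and the SQUA-like entry is included solely as a hypothetical contrast case of a clade for which no gymnosperm member is reported here. The names `CLADE_MEMBERS` and `shared_clades` are illustrative.

```python
# Representative clade membership; lists are illustrative, not exhaustive.
CLADE_MEMBERS = {
    "AG-like": {"gymnosperm": ["GGM3"], "angiosperm": ["AG"]},
    "AGL2-like": {"gymnosperm": ["PRMADS1"], "angiosperm": ["AGL2"]},
    "AGL6-like": {"gymnosperm": ["GGM9", "GGM11"], "angiosperm": ["AGL6"]},
    "DEF/GLO-like": {"gymnosperm": ["GGM2"], "angiosperm": ["DEF", "GLO"]},
    "GGM13-like": {"gymnosperm": ["GGM13"], "angiosperm": ["ZMM17"]},
    "STMADS11-like": {"gymnosperm": ["GGM12"], "angiosperm": ["STMADS11", "STMADS16"]},
    "TM3-like": {"gymnosperm": ["GGM1"], "angiosperm": ["TM3"]},
    # Contrast case: no gymnosperm member is reported in this text.
    "SQUA-like": {"gymnosperm": [], "angiosperm": ["AP1"]},
}


def shared_clades(clades):
    """Return the names of clades with members in both lineages."""
    return sorted(
        name for name, members in clades.items()
        if members["gymnosperm"] and members["angiosperm"]
    )


shared = shared_clades(CLADE_MEMBERS)
# Each shared clade implies at least one gene in the last common ancestor.
minimal_ancestral_genes = len(shared)

# Gnetum genes with putative angiosperm orthologs, and the clades they occupy.
ggm_genes = [
    gene for name in shared
    for gene in CLADE_MEMBERS[name]["gymnosperm"] if gene.startswith("GGM")
]
ggm_clades = [
    name for name in shared
    if any(gene.startswith("GGM") for gene in CLADE_MEMBERS[name]["gymnosperm"])
]

print(minimal_ancestral_genes)          # 7
print(len(ggm_genes), len(ggm_clades))  # 7 6
```

The result is a lower bound only: gene loss in one lineage or undetected orthology would lower the count without changing the true ancestral number.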
According to our data, the precursors of the GGM
genes in the ancient clades of paralogous genes (GGM1,
GGM2, GGM3, GGM9, GGM11, GGM12, GGM13)
separated more than 300 MYA, which explains, at least
in part, the diversity of extant GGM proteins in the I- and C-domains (fig. 2).
Hybridization of Southern blots containing G. gnemon total DNA with GGM probes at moderate stringency indicates that our sampling of GGM genes was not
exhaustive (data not shown). Since MADS-box gene
sampling is far from being complete for any gymnosperm model species, we consider our determination of
the number of genes in the last common ancestor of
extant seed plants a minimal estimate. This is also true
because some gene types could have been lost after the
separation of the lineages that led to extant angiosperms
and gymnosperms in at least one of the lineages, and
because some genuine orthologous relationships may
have escaped our detection methods due to the long time
since the separation of the angiosperm and gymnosperm
lineages and/or due to rapid sequence evolution.
Comparison between the expression patterns of orthologous angiosperm and Gnetum MADS-box genes
reveals some striking similarities. For the TM3-like gene
GGM1, the DEF/GLO-like gene GGM2, the AG-like
gene GGM3, and the AGL6-like genes GGM9 and
GGM11, these similarities have already been outlined
elsewhere (Winter et al. 1999).
The STMADS11-like gene GGM12 is expressed
strongly in leaves, but only weakly in reproductive
cones (fig. 4). Its putative orthologs from the potato,
STMADS11 and STMADS16, are expressed in all vegetative organs of the plant, but not in flowers (Carmona,
Ortega, and Garcia-Maroto 1998; Garcia-Maroto et al.
2000). Thus, the STMADS11-like genes identified so far
show a preference for expression in vegetative organs,
which is a very unusual feature for MADS-box genes
from seed plants.
GGM13 expression was found exclusively in female cones (fig. 4). The only GGM13-like gene that has
been isolated so far from an angiosperm, the maize gene
ZMM17, is predominantly expressed in female inflorescences (maize cobs), where at late developmental stages
expression is restricted to carpels (unpublished data).
Thus, the two GGM13-like genes known to date have
in common an expression pattern which is focused on
female reproductive structures.
The most parsimonious explanation for these similarities in expression patterns of putatively orthologous
genes (also including the cases involving GGM1–
GGM3, GGM9, and GGM11; see Winter et al. 1999) is
that the MADS-box genes that were present in the last
common ancestor of extant seed plants had already
adopted their gene clade-specific expression patterns,
which then were conserved to a certain extent in the
different angiosperm and gymnosperm lineages. Thus,
some of the MADS-box genes which were present in
the last common ancestor of extant seed plants not only
already had some of the sequence characteristics typical
for extant clade members, but also were already diversified and fixed in terms of expression patterns and (by
inference) function. While some of the ancestral genes
300 MYA were very likely already specialized in male
(DEF/GLO-like genes) or female reproductive organ development (GGM13-like genes) or both (AG-, AGL2-,
and AGL6-like genes), others were probably involved in
vegetative development or the switch from vegetative to
reproductive development (STMADS11- and TM3-like
genes). Interestingly, all of these MADS-box gene types
probably originated during the period when the ovule
was established during evolution, i.e., 400–300 MYA
(fig. 5; Beck 1988; Taylor and Taylor 1993). Since
GGM13- and AG-like genes are expressed in ovules, and
some AG-like genes are key control genes of ovule development (Angenent and Colombo 1996; Western and
Haughn 1999), the establishment of these genes may
have been an important step in the evolution of the ovule
(see also Münster et al. 1997; Theißen et al. 2000). Since
the members of at least five other clades of MADS-box
genes have been conserved for a similar period of time
(fig. 5), they also may well have been important for
developmental and structural key innovations of the
seed plants, e.g., the evolution of microsporophylls (antherophores, stamina) in the case of DEF-, GLO-, and DEF/GLO-like genes.
Acknowledgments
We thank Thomas Stützel (Spezielle Botanik, Ruhr-Universität Bochum), Angelika Piernitzky, and Manfred
H. Weisenseel (Botanischer Garten der Universität
Karlsruhe) for plant material from G. gnemon. We also
thank the Automatic DNA Isolation and Sequencing
team of our institute for sequencing the cDNA clones.
Many thanks to Jan Kim for his help with computer
work and to Thomas Münster for valuable advice and
discussions. Financial support from the DFG to G.T. (Th
417/3-1) and to A.B. (Graduiertenkolleg “Molekulare
Analyse von Entwicklungsprozessen bei Pflanzen”) is
gratefully acknowledged.
LITERATURE CITED
ANGENENT, G. C., and L. COLOMBO. 1996. Molecular control
of ovule development. Trends Plant Sci. 1:228–232.
BECK, C. B. 1988. Origin and evolution of gymnosperms. Columbia University Press, New York.
CARMONA, M. J., N. ORTEGA, and F. GARCIA-MAROTO. 1998.
Isolation and molecular characterization of a new vegetative
MADS-box gene from Solanum tuberosum L. Planta 207:
181–188.
CHAW, S.-M., A. ZHARKIKH, H.-M. SUNG, T.-C. LAU, and W.-H.
LI. 1997. Molecular phylogeny of extant gymnosperms and
seed plant evolution: analysis of 18S rRNA sequences. Mol.
Biol. Evol. 14:56–78.
CHO, S., S. JANG, S. CHAE, K. M. CHUNG, Y.-H. MOON, G.
AN, and S. K. JANG. 1999. Analysis of the C-terminal region of Arabidopsis thaliana APETALA1 as a transcription
activation domain. Plant Mol. Biol. 40:419–429.
DAYHOFF, M. O. 1979. Atlas of protein sequences and structure. Vol. 5, Suppl. 3. National Biomedical Research Foundation, Washington, D.C.
DOYLE, J. A. 1994. Origin of the angiosperm flower: a phylogenetic perspective. Plant Syst. Evol. 8(Suppl.):7–29.
———. 1996. Seed plant phylogeny and the relationships of
Gnetales. Int. J. Plant Sci. 157(Suppl. 6):S3–S39.
DOYLE, J. J. 1994. Evolution of a plant homeotic multigene
family: towards connecting molecular systematics and molecular developmental genetics. Syst. Biol. 43:307–328.
EGEA-CORTINES, M., H. SAEDLER, and H. SOMMER. 1999. Ternary complex formation between the MADS-box proteins
SQUAMOSA, DEFICIENS and GLOBOSA is involved in
the control of floral architecture in Antirrhinum majus.
EMBO J. 18:5370–5379.
FELSENSTEIN, J. 1993. PHYLIP (phylogeny inference package).
Version 3.5. Distributed by the author, Department of Genetics, University of Washington, Seattle.
FISCHER, A., N. BAUM, H. SAEDLER, and G. THEIßEN. 1995.
Chromosomal mapping of the MADS-box multigene family
in Zea mays reveals dispersed distribution of allelic genes
as well as transposed copies. Nucleic Acids Res. 23:1901–
1911.
GARCIA-MAROTO, F., N. ORTEGA, R. LOZANO, and M.-J. CARMONA. 2000. Characterization of the potato MADS-box
gene STMADS16 and expression analysis in tobacco transgenic plants. Plant Mol. Biol. 42:499–513.
GOREMYKIN, V., V. BOBROVA, J. PAHNKE, A. TROITSKY, A.
ANTONOV, and W. MARTIN. 1996. Noncoding sequences
from the slowly evolving chloroplast inverted repeat in addition to rbcL data do not support Gnetalean affinities of
angiosperms. Mol. Biol. Evol. 13:383–396.
GOREMYKIN, V., S. HANSMANN, and W. F. MARTIN. 1997. Evolutionary analysis of 58 proteins encoded in six completely
sequenced chloroplast genomes: revised molecular estimates of two seed plant divergence times. Plant Syst. Evol.
206:337–351.
GU, Q., C. FERRÁNDIZ, M. F. YANOFSKY, and R. MARTIENSSEN.
1998. The FRUITFULL MADS-box gene mediates cell differentiation during Arabidopsis fruit development. Development 125:1509–1517.
HANSEN, A., S. HANSMANN, T. SAMIGULLIN, A. ANTONOV, and
W. MARTIN. 1999. Gnetum and the angiosperms: molecular
evidence that their shared morphological characters are convergent, rather than homologous. Mol. Biol. Evol. 16:1006–
1009.
HASEBE, M., and J. A. BANKS. 1997. Evolution of MADS gene
family in plants. Pp. 179–197 in K. IWATSUKI and P. H.
RAVEN. Evolution and diversification of land plants. Springer-Verlag, Tokyo.
HUANG, H., M. TUDOR, C. A. WEISS, Y. HU, and H. MA. 1995.
The Arabidopsis MADS-box gene AGL3 is widely expressed and encodes a sequence-specific DNA-binding protein. Plant Mol. Biol. 28:549–567.
HUFFORD, L. 1996. The morphology and evolution of male
reproductive structures of Gnetales. Int. J. Plant Sci.
157(Suppl. 6):S95–S112.
LILJEGREN, S. J., C. FERRÁNDIZ, E. R. ALVAREZ-BUYLLA, S.
PELAZ, and M. F. YANOFSKY. 1998. MADS-box genes involved in fruit dehiscence. Flowering Newsl. 25:9–19.
MA, H., M. F. YANOFSKY, and E. M. MEYEROWITZ. 1991.
AGL1–AGL6, an Arabidopsis gene family with similarity to
floral homeotic and transcription factor genes. Genes Dev.
5:484–495.
MICHAELS, S. D., and R. M. AMASINO. 1999. FLOWERING
LOCUS C encodes a novel MADS domain protein that acts
as a repressor of flowering. Plant Cell 11:949–956.
MOURADOV, A., T. V. GLASSIK, B. A. HAMDORF, L. C. MURPHY, S. S. MARLA, Y. YANG, and R. TEASDALE. 1998. Family of MADS-box genes expressed early in male and female
reproductive structures of Monterey pine. Plant Physiol.
117:55–61.
MOURADOV, A., B. HAMDORF, R. D. TEASDALE, J. KIM, K.-U.
WINTER, and G. THEIßEN. 1999. A DEF/GLO-like MADS-box gene from a gymnosperm: Pinus radiata contains an
ortholog of angiosperm B class floral homeotic genes. Dev.
Genet. 25:245–252.
MÜNSTER, T., J. PAHNKE, A. DI ROSA, J. T. KIM, W. MARTIN,
H. SAEDLER, and G. THEIßEN. 1997. Floral homeotic genes
were recruited from homologous MADS-box genes preexisting in the common ancestor of ferns and seed plants.
Proc. Natl. Acad. Sci. USA 94:2415–2420.
OKADA, K., and Y. SHIMURA. 1994. Genetic analyses of signalling in flower development using Arabidopsis. Plant
Mol. Biol. 26:1357–1377.
PURUGGANAN, M. D., S. D. ROUNSLEY, R. J. SCHMIDT, and M.
YANOFSKY. 1995. Molecular evolution of flower development: diversification of the plant MADS-box regulatory
gene family. Genetics 140:345–356.
QIU, Y.-L., J. LEE, F. BERNASCONI-QUADRONI, D. E. SOLTIS, P.
S. SOLTIS, M. ZANIS, E. A. ZIMMER, Z. CHEN, V. SAVOLAINEN, and M. W. CHASE. 1999. The earliest angiosperms:
evidence from mitochondrial, plastid and nuclear genomes.
Nature 402:404–407.
RIECHMANN, J. L., and E. M. MEYEROWITZ. 1997. MADS domain proteins in plant development. Biol. Chem. 378:1079–
1101.
ROUNSLEY, S. D., G. S. DITTA, and M. F. YANOFSKY. 1995.
Diverse roles for MADS box genes in Arabidopsis development. Plant Cell 7:1259–1269.
RUTLEDGE, R., S. REGAN, O. NICOLAS, P. FOBERT, C. COTÉ,
W. BOSNICH, C. KAUFFELDT, G. SUNOHARA, A. SÉGUIN, and
D. STEWART. 1998. Characterization of an AGAMOUS homologue from the conifer black spruce (Picea mariana) that
produces floral homeotic conversions when expressed in Arabidopsis. Plant J. 15:625–634.
SAITOU, N., and M. NEI. 1987. The neighbor-joining method:
a new method for reconstructing phylogenetic trees. Mol.
Biol. Evol. 4:406–425.
SAMBROOK, J., E. F. FRITSCH, and T. MANIATIS. 1989. Molecular cloning: a laboratory manual. 2nd edition. Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, N.Y.
SAMIGULLIN, T. K., W. F. MARTIN, A. V. TROITSKY, and A. S.
ANTONOV. 1999. Molecular data from the chloroplast rpoC1
gene suggest a deep and distinct dichotomy of contemporary spermatophytes into two monophyla: gymnosperms
(including Gnetales) and angiosperms. J. Mol. Evol. 49:
310–315.
SAVARD, L., P. LI, S. H. STRAUSS, M. W. CHASE, M. MICHAUD,
and J. BOUSQUET. 1994. Chloroplast and nuclear gene sequences indicate Late Pennsylvanian time for the last common ancestor of extant seed plants. Proc. Natl. Acad. Sci.
USA 91:5163–5167.
SCHWARZ-SOMMER, Z., P. HUIJSER, W. NACKEN, H. SAEDLER,
and H. SOMMER. 1990. Genetic control of flower development by homeotic genes in Antirrhinum majus. Science
250:931–936.
SHORE, P., and A. D. SHARROCKS. 1995. The MADS-box family of transcription factors. Eur. J. Biochem. 229:1–13.
SOLTIS, P. S., D. E. SOLTIS, P. G. WOLF, D. L. NICKRENT, S.
M. CHAW, and R. L. CHAPMAN. 1999. The phylogeny of
land plants inferred from 18S rDNA sequences: pushing the
limits of rDNA signal. Mol. Biol. Evol. 16:1774–1784.
SOMMER, H., J.-P. BELTRÁN, P. HUIJSER, H. PAPE, W.-E.
LÖNNIG, H. SAEDLER, and Z. SCHWARZ-SOMMER. 1990. Deficiens, a homeotic gene involved in the control of flower
morphogenesis in Antirrhinum majus: the protein shows homology to transcription factors. EMBO J. 9:605–613.
SUNDSTRÖM, J., A. CARLSBECKER, M. SVENSSON, M. SVENSON,
J. URBAN, G. THEIßEN, and P. ENGSTRÖM. 1999. MADS-box genes active in developing pollen cones of Norway
spruce (Picea abies) are homologous to the B-class floral
homeotic genes in angiosperms. Dev. Genet. 25:253–266.
TANDRE, K., V. A. ALBERT, A. SUNDAS, and P. ENGSTRÖM.
1995. Conifer homologues to genes that control floral development in angiosperms. Plant Mol. Biol. 27:69–78.
TAYLOR, T. N., and E. L. TAYLOR. 1993. The biology and
evolution of fossil plants. Prentice Hall, Englewood Cliffs,
N.J.
THEIßEN, G., A. BECKER, A. DI ROSA, A. KANNO, J. T. KIM,
T. MÜNSTER, K.-U. WINTER, and H. SAEDLER. 2000. A short
history of MADS-box genes in plants. Plant Mol. Biol. 42:
115–149.
THEIßEN, G., J. KIM, and H. SAEDLER. 1996. Classification and
phylogeny of the MADS-box multigene family suggest defined roles of MADS-box gene subfamilies in the morphological evolution of eukaryotes. J. Mol. Evol. 43:484–516.
THEIßEN, G., and H. SAEDLER. 1995. MADS-box genes in plant
ontogeny and phylogeny: Haeckel’s ‘biogenetic law’ revisited. Curr. Opin. Genet. Dev. 5:628–639.
———. 1998. Molecular architects of plant body plans. Prog.
Bot. 59:227–256.
———. 1999. The golden decade of molecular floral development (1990–1999): a cheerful obituary. Dev. Genet. 25:
181–193.
WEIGEL, D., and E. M. MEYEROWITZ. 1994. The ABCs of floral
homeotic genes. Cell 78:203–209.
WESTERN, T. L., and G. W. HAUGHN. 1999. BELL1 and AGAMOUS genes promote ovule identity in Arabidopsis thaliana. Plant J. 18:329–336.
WINTER, K.-U., A. BECKER, T. MÜNSTER, J. T. KIM, H. SAEDLER, and G. THEIßEN. 1999. MADS-box genes reveal that
gnetophytes are more closely related to conifers than to
flowering plants. Proc. Natl. Acad. Sci. USA 96:7342–7347.
WOLFE, K. H., M. GOUY, Y.-W. YANG, P. M. SHARP, and W.-H. LI. 1989. Date of the monocot-dicot divergence estimated from chloroplast DNA sequence data. Proc. Natl.
Acad. Sci. USA 86:6201–6205.
YANOFSKY, M. F., H. MA, J. L. BOWMAN, G. N. DREWS, K.
A. FELDMAN, and E. M. MEYEROWITZ. 1990. The protein
encoded by the Arabidopsis homeotic gene agamous resembles transcription factors. Nature 346:35–39.
ZHANG, H., and B. G. FORDE. 1998. An Arabidopsis MADS
box gene that controls nutrient-induced changes in root architecture. Science 279:407–409.
ZHANG, J., and M. NEI. 1996. Evolution of Antennapedia-class
homeobox genes. Genetics 142:295–303.
WILLIAM MARTIN, reviewing editor
Accepted June 5, 2000