MADS-Box Gene Diversity in Seed Plants 300 Million Years Ago

Annette Becker, Kai-Uwe Winter, Britta Meyer, Heinz Saedler, and Günter Theißen

Max-Planck-Institut für Züchtungsforschung, Abteilung Molekulare Pflanzengenetik, Köln, Germany

MADS-box genes encode a family of transcription factors which control diverse developmental processes in flowering plants ranging from root development to flower and fruit development. Through phylogeny reconstructions, most of these genes can be subdivided into defined monophyletic gene clades whose members share similar expression patterns and functions. Therefore, the establishment of the diversity of gene clades was probably an important event in land plant evolution. In order to determine when these clades originated, we isolated cDNAs of 19 different MADS-box genes from Gnetum gnemon, a gymnosperm model species and thus a representative of the sister group of the angiosperms. Phylogeny reconstructions involving all published MADS-box genes were then used to identify gene clades containing putative orthologs from both angiosperm and gymnosperm lineages. Thus, the minimal number of MADS-box genes that were already present in the last common ancestor of extant gymnosperms and angiosperms was determined. Comparative expression studies involving pairs of putatively orthologous genes revealed a diversity of patterns that has been largely conserved since the time when the angiosperm and gymnosperm lineages separated. Taken together, our data suggest that there were already at least seven different MADS-box genes present at the base of extant seed plants about 300 MYA. These genes were probably already quite diverse in terms of both sequence and function. In addition, our data demonstrate that the MADS-box gene families of extant gymnosperms and angiosperms are of similar complexities.

Introduction

MADS-box genes (Schwarz-Sommer et al.
1990) encode transcription factors which play important roles in developmental control in plants, animals, and fungi (Shore and Sharrocks 1995; Theißen and Saedler 1995; Theißen, Kim, and Saedler 1996; Riechmann and Meyerowitz 1997; Theißen et al. 2000). Some plant MADS-box genes, such as DEFICIENS (DEF) from Antirrhinum majus and AGAMOUS (AG) from Arabidopsis thaliana, work as organ identity (homeotic selector) genes during flower development (Sommer et al. 1990; Yanofsky et al. 1990). Floral organ identity genes can be subdivided into four different classes, termed A-, B-, C-, and D-function genes, whose members provide four different homeotic functions, with A specifying sepals; A+B, petals; B+C, stamens; C, carpels; and D, ovules (Weigel and Meyerowitz 1994; Angenent and Colombo 1996). Most floral organ identity genes that have been cloned to date belong to the family of MADS-box genes (for recent reviews, see Theißen, Kim, and Saedler 1996; Riechmann and Meyerowitz 1997; Theißen et al. 2000). The MADS-type floral homeotic genes of Arabidopsis are APETALA1 (AP1; A-function), APETALA3 and PISTILLATA (AP3 and PI; B-function), and AG (C-function). An Arabidopsis D-function gene has not been described to date.

Key words: MADS-box gene, gymnosperm, angiosperm, Gnetales, development, evolution.
Address for correspondence and reprints: Günter Theißen, Max-Planck-Institut für Züchtungsforschung, Abteilung Molekulare Pflanzengenetik, Carl-von-Linné-Weg 10, D-50829 Köln, Germany. E-mail: [email protected].
Mol. Biol. Evol. 17(10):1425–1434. 2000
© 2000 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

Besides providing floral homeotic functions, MADS-box genes have many other roles within the gene networks that “control” reproductive development in angiosperms such as Arabidopsis (for reviews, see Okada and Shimura 1994; Theißen and Saedler 1995, 1998, 1999; Theißen, Kim, and Saedler 1996). FLC, for example, is a “flowering time gene” which mediates, depending on environmental factors such as cold, the switch from vegetative to reproductive development (Michaels and Amasino 1999). Flowering time genes exert their function by influencing meristem identity genes. Inflorescence meristem identity genes, such as the MADS-box gene FRUITFULL (FUL), and floral meristem identity genes, such as the MADS-box genes AP1 and CAULIFLOWER (CAL), specify the identities of inflorescence and floral meristems, respectively, and thus “control” the transition from one meristem type to the other. Within floral meristems, cadastral genes set the boundaries of floral organ identity gene functions, thus defining the different floral whorls. Besides its role as a floral organ identity gene, AG also has a cadastral function, because it prevents the A-function from being expressed in the third and fourth floral whorls. Some “intermediate genes,” such as AGL2, AGL4, and AGL9, possibly mediate between floral meristem and organ identity genes. The floral organ identity genes specify the organ identity within each whorl of the flower by activating “realizator genes.” After fertilization of the flower, MADS-box genes such as AGL1, AGL5, and FUL are required for proper fruit development (Gu et al. 1998; Liljegren et al. 1998). Moreover, transcription of a number of MADS-box genes outside flowers and fruits suggests that members of this gene family also play regulatory roles during vegetative development, such as embryo, root, and leaf development (Ma, Yanofsky, and Meyerowitz 1991; Huang et al. 1995; Rounsley, Ditta, and Yanofsky 1995; Theißen et al. 2000). Analysis of a transgenic mutant indicated that the MADS-box gene ANR1 is a key component of the signal transduction chain by which nitrate stimulates lateral root proliferation (Zhang and Forde 1998).
The existence of MADS-box genes in gymnosperms and ferns, which form neither flowers nor fruits, further demonstrates that the role of these genes in plants is not restricted to flower or fruit development (Tandre et al. 1995; Münster et al. 1997; Mouradov et al. 1998, 1999; Rutledge et al. 1998; Sundström et al. 1999; Winter et al. 1999).

There are reasons to assume that changes in number, expression, and interaction of developmental control genes all have contributed to the evolution of plant form (Theißen and Saedler 1995; Theißen, Kim, and Saedler 1996; Theißen et al. 2000). Since MADS-box genes play important roles in the gene networks that “control” plant development, understanding the phylogeny of MADS-box genes may strongly improve our understanding of plant evolution. When and how was the diversity of MADS-box genes present in flowering plants such as Arabidopsis generated during evolution? Did it appear during angiosperm evolution, or is it considerably older? Do changes in the expression and function of these genes reflect morphological innovations during plant evolution?

To answer these questions, the phylogeny of MADS-box genes has to be reconstructed and superimposed on the phylogeny of land plant taxa. As a prerequisite, the MADS-box gene families of phylogenetically informative key taxa have to be characterized. For the angiosperm model species A. thaliana, we are already quite close to a complete knowledge of all MADS-box genes: more than 40 of them have been isolated so far (Liljegren et al. 1998), and based on a large fraction of the A. thaliana genome which has already been sequenced, their total number in the genome can be estimated to be about 60. However, only for a minority of genes is the function defined by a mutant phenotype. In the other key taxa of land plants (gymnosperms, pteridophytes, bryophytes), the sampling of MADS-box genes is far less complete (Theißen et al. 2000).
Nevertheless, some key insights into the phylogeny of plant MADS-box genes have already been obtained. Phylogeny reconstructions revealed that the MADS-box gene family is composed of several defined gene clades (J. J. Doyle 1994; Purugganan et al. 1995; Theißen, Kim, and Saedler 1996; Theißen et al. 2000). Almost all plant MADS-box genes known to date are members of a monophyletic superclade of genes with a conserved structural organization, the so-called MIKC-type domain structure, including a MADS (M-), an intervening (I-), a keratin-like (K-), and a C-terminal (C-) domain (Ma, Yanofsky, and Meyerowitz 1991; Theißen, Kim, and Saedler 1996; Hasebe and Banks 1997; Münster et al. 1997). The highly conserved MADS-domain is the major determinant of DNA binding, but it also performs dimerization and accessory factor binding functions (Shore and Sharrocks 1995). The I-domain is only relatively weakly conserved among plant MADS-domain proteins (Purugganan et al. 1995), but it may constitute a key molecular determinant for the selective formation of DNA-binding dimers (Riechmann and Meyerowitz 1997). The K-domain is characterized by a conserved regular spacing of hydrophobic residues, which is proposed to allow for the formation of an amphipathic helix involved in protein dimerization (Ma, Yanofsky, and Meyerowitz 1991; Shore and Sharrocks 1995). The most variable region, both in sequence and in length, is the C-domain at the C-terminal end of the MADS-domain proteins. In some MADS-domain proteins, it is involved in transcriptional activation or the formation of multimeric transcription factor complexes, respectively (Cho et al. 1999; Egea-Cortines, Saedler, and Sommer 1999). The MIKC-type gene superclade can be further subdivided into several well-defined gene clades whose members share similar expression patterns and highly related functions.
For example, all A-, B-, C-, and D-function genes known to date fall into separate clades, namely SQUAMOSA- (A-function), DEFICIENS- or GLOBOSA- (B-function), and AGAMOUS-like genes (C- and D-function) (J. J. Doyle 1994; Purugganan et al. 1995; Theißen and Saedler 1995; Angenent and Colombo 1996; Theißen, Kim, and Saedler 1996; Münster et al. 1997; Theißen et al. 2000). Therefore, the establishment of the mentioned gene clades by gene duplication, diversification, and fixation was probably an important step toward the establishment of the floral homeotic functions (Theißen, Kim, and Saedler 1996).

Former studies of MADS-box gene evolution, based mainly on sequences from ferns and angiosperms, have led to the conclusion that the last common ancestor of ferns and seed plants about 400 MYA had at least two different MIKC-type genes, but did not yet have orthologs of any of the MADS-box genes present in angiosperms (Münster et al. 1997; Theißen et al. 2000). In the last common ancestor of monocots and dicots about 200 MYA, however, the vast majority of MADS-box gene types known from Arabidopsis, including orthologs of all types of floral homeotic genes, were already established (Theißen et al. 2000). These gene lineages therefore must have been established in the lineage that led to the flowering plants after the separation from the fern lineage but very likely before the radiation of the flowering plants. The only extant taxa whose lineage branched off from the lineage that led to flowering plants during this critical time interval 400–200 MYA are the gymnosperms. Molecular data are currently converging on the view that extant gymnosperms are a monophyletic group (Goremykin et al. 1996; Chaw et al. 1997; Qiu et al. 1999; Samigullin et al. 1999; Soltis et al. 1999) which separated from the lineage that led to angiosperms about 300 MYA (Wolfe et al. 1989; Savard et al. 1994; Goremykin, Hansmann, and Martin 1997).
In order to determine how many gene clades are shared by angiosperms and gymnosperms, we screened for MADS-box genes in the gymnosperm Gnetum gnemon, a member of the gnetophytes. Phylogeny reconstructions and comparison of expression patterns were then used to determine the minimal number and type of MADS-box genes already present in the last common ancestor of gymnosperms and angiosperms, providing a minimal estimate for the structural and functional diversity of this complex regulatory gene family in plants 300 MYA.

FIG. 1. Southern blot analysis of MADS-box genes in G. gnemon. DNA isolated from leaves of an individual tree growing in the botanical garden of Bochum was digested with XbaI (X) or HindIII (H) as indicated above the lanes, electrophoresed, blotted onto nylon membranes, and hybridized under stringent conditions with probes specific for the different GGM genes (1–19) as depicted in the figure. At the left margin, the lengths of some marker molecules (in kb) are indicated. In some lanes, two bands can be seen, either due to the presence of internal XbaI or HindIII sites in the respective genes (GGM2, GGM7, GGM10, GGM19) or due to the presence of two very similar genes or two different alleles in the Gnetum gnemon genome (GGM18) as confirmed by sequence analysis and Southern blots using four additional restriction enzymes (data not shown).

Materials and Methods

Plant Material

Leaves and cones of male and female G. gnemon trees growing in the botanical gardens of the University of Bochum or Karlsruhe, Germany, were used throughout this study.

Sequence Alignments and Construction of Phylogenetic Trees

Multiple alignments of conceptual amino acid sequences were generated by using the PILEUP program of the GCG package (version 10.0) with a gap creation penalty of 8 and a gap extension penalty of 2 (default parameters).
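To make the two alignment parameters above concrete, the following is a minimal sketch, assuming the common affine form in which a gap of n residues costs creation + n × extension. The exact scoring convention used by PILEUP is not restated in this paper, so this form and the function name `affine_gap_penalty` are illustrative assumptions, not the GCG implementation.

```python
def affine_gap_penalty(gap_length, creation=8, extension=2):
    """Cost of a single gap under an assumed affine scheme.

    creation and extension default to the values quoted in the text
    (gap creation penalty 8, gap extension penalty 2). The formula
    creation + extension * gap_length is an assumption for illustration.
    """
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    if gap_length == 0:
        return 0  # no gap, no penalty
    return creation + extension * gap_length


# Compare gaps of different lengths under the assumed scheme.
for n in (1, 3, 10):
    print(n, affine_gap_penalty(n))
```

Under this assumed form, lengthening an existing gap costs less than opening a new one, so the alignment tends toward fewer, longer gaps.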
Based on alignments of the MADS-domain (60 amino acids) plus the 110 amino acids downstream of the MADS-domain (termed “MADS+110” or “170” domain sequence; see Theißen, Kim, and Saedler 1996; Winter et al. 1999), phylogenetic trees were constructed by the neighbor-joining method (Saitou and Nei 1987), version 3.5, as implemented by the PHYLIP program package (Felsenstein 1993). The neighbor-joining method was chosen because it is known to be quite efficient in obtaining reliable trees from large sets of data (Zhang and Nei 1996). Distance matrices were generated using the protein distance algorithm, version 3.55c, which is based on the PAM model of amino acid transition (Dayhoff 1979). To assess support for the inferred relationships, 100 bootstrap samples were generated as described (Münster et al. 1997).

Isolation of cDNAs

cDNAs were isolated by rapid amplification of cDNA ends (RACE) procedures as described (Winter et al. 1999). As template, poly(A)+ RNA isolated from leaves or cones of male or female G. gnemon trees was used. Sequences of primers used during the 3′ and 5′ RACE procedures can be downloaded from our home page (http://www.mpiz-koeln.mpg.de/~theissen/grouphome/index.html). The obtained cDNAs were sequenced on both strands using automatic sequencers. The nucleotide sequence data of the cDNAs have been deposited in the EMBL, GenBank, and DDBJ nucleotide sequence databases under the accession numbers AJ132207–AJ132219 (GGM1–GGM13) and AJ251554–AJ251559 (GGM14–GGM19).

Hybridization Studies

Hybridization probes were obtained from the region downstream of the MADS-box to avoid cross-hybridization with other gene family members. For Southern analyses, DNA gel blots were prepared by standard methods (Sambrook, Fritsch, and Maniatis 1989) with 10 µg DNA per lane, isolated from G. gnemon leaf material, and digested with restriction enzymes BamHI, EcoRI, EcoRV, HindII, HindIII, or XbaI.
For the synthesis of probes, linear PCR was employed essentially as described (Fischer et al. 1995), but PCR products of GGM gene cDNAs were used as templates, and different gene-specific oligonucleotides were used as primers. The filters were hybridized and washed as described elsewhere (Münster et al. 1997). Northern analyses were carried out as described (Winter et al. 1999) using several different Northern blots, onto which, however, aliquots of the same RNA preparations from leaves and male as well as female cones had always been transferred.

Results

cDNA Cloning and Structural Evaluation of MADS-Box Genes from G. gnemon

We cloned and sequenced the cDNAs of 19 different MADS-box genes, termed GGM1–GGM19, from the gnetophyte G. gnemon (for GGM1–GGM13, see also Winter et al. 1999). Hybridization of Southern blots containing genomic DNA of an individual G. gnemon tree with different probes specific for each of the GGM genes under stringent conditions indicated that GGM1–GGM19 represent 18 different single-copy genes, plus one gene (GGM18) that either is present in two different alleles or has a duplicate locus with a very similar sequence in the G. gnemon genome (fig. 1). Sequence comparisons revealed that the products encoded by the GGM genes have a MIKC-type domain structure (fig. 2), like almost all MADS-domain proteins isolated from vascular plants. The high conservation known for the MADS-domains of angiosperm proteins is also obvious for the sequences from Gnetum (fig. 2).

FIG. 2. Conserved domain structure of MADS-domain proteins from Gnetum gnemon. Conceptual amino acid sequences of GGM genes were aligned by a computer program. A “<” sign at the beginning of some sequences indicates that they are incomplete at the N-terminus due to the cloning procedures used. The MADS-, I-, K-, and C-domains are indicated.
Within the K-domain and in its vicinity, hydrophobic amino acids (L, I, V, M) are shown in bold, and in addition, positions at which more than 75% of the sequences have a hydrophobic residue are marked by a star.

The presence of K-domains in the GGM proteins makes it conceivable that they all interact with other K-domain–containing proteins via these regions, very likely the same or other types of MADS-domain proteins. The similarity between the GGM genes and all other MIKC-type genes with respect to overall domain structure indicates that they share a common ancestor from which they were derived by gene duplication, sequence diversification, and fixation. However, the I- and C-domains are quite diverse in both length and sequence (fig. 2), suggesting a functional diversification of the GGM proteins in the selective formation of DNA-binding dimers or tetramers or in transcriptional activation. This structural diversity could be due to rapid sequence evolution or an ancient origin of the corresponding genes. To determine the minimal number of gene clades containing sequences from both gymnosperms and angiosperms, and thus to distinguish between these possibilities, the phylogeny of the MADS-box gene family was reconstructed involving the novel G. gnemon genes.

A Large Fraction of the GGM Genes Have Putative Orthologs in Flowering Plants

In initial phylogeny reconstructions, all available MIKC-type MADS-domain proteins were used, many of which have been published only in databases to date. In this way, genes with putative orthologs in gymnosperms were determined on the basis of the total available evidence (data not shown). For simplicity, we then constructed trees in which the majority of genes which had no putative orthologs in gymnosperms and angiosperms were omitted. Moreover, we reduced the gene sets of huge clades to a few representative members.
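The counting logic behind "the minimal number of gene clades containing sequences from both gymnosperms and angiosperms" can be sketched as follows. This is only an illustration under stated assumptions: the `Clade` record, the function `shared_clades`, and the toy clade names and bootstrap values are hypothetical and are not data from this study; the bootstrap cutoff is left as a parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clade:
    """Hypothetical record for one clade of a bootstrapped gene tree."""
    name: str
    bootstrap: float      # bootstrap support in percent
    lineages: frozenset   # plant lineages represented among clade members


def shared_clades(clades, min_bootstrap):
    """Names of clades supported above the cutoff that contain members
    from both gymnosperms and angiosperms."""
    required = {"gymnosperm", "angiosperm"}
    return [
        c.name
        for c in clades
        if c.bootstrap > min_bootstrap and required <= c.lineages
    ]


# Toy input with made-up values, for illustration only.
toy_clades = [
    Clade("clade_A", 95.0, frozenset({"gymnosperm", "angiosperm"})),
    Clade("clade_B", 88.0, frozenset({"angiosperm"})),
    Clade("clade_C", 40.0, frozenset({"gymnosperm", "angiosperm"})),
    Clade("clade_D", 72.0, frozenset({"gymnosperm", "angiosperm"})),
]

shared = shared_clades(toy_clades, min_bootstrap=50.0)
print(shared, len(shared))
```

Each clade that passes both tests implies at least one gene in the last common ancestor of the two lineages, so the length of the returned list is a lower bound, not an exact count.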
In addition, GGM6, GGM15, and GGM16 were omitted from most tree reconstruction procedures, because uncertainties in sequence alignments made the tree topology sensitive to gene sampling. Our preliminary data suggest, however, that GGM15 is closely related to the DAL12 gene from the gymnosperm Picea abies (Sundström et al. 1999) and that GGM6 and GGM16 are not members of any of the MADS-box gene subfamilies described so far. An informative phylogenetic tree is shown in figure 3. More comprehensive trees are accessible via the World Wide Web (http://www.mpiz-koeln.mpg.de/mads/).

Five different clades of putatively orthologous genes from both gymnosperms and angiosperms which had been reported before (Tandre et al. 1995; Mouradov et al. 1998, 1999; Rutledge et al. 1998; Sundström et al. 1999; Winter et al. 1999) could be confirmed in this study. These were the AG-, AGL2-, AGL6-, DEF/GLO-, and TM3-like genes, comprising the G. gnemon genes GGM1–GGM3, GGM9, and GGM11 (fig. 3). In addition, the tree in figure 3 identifies GGM12 as a STMADS11-like gene. Moreover, cloning of a GGM13-like gene from maize (Zea mays ssp. mays) (unpublished data) also established a novel clade containing both gymnosperm and angiosperm sequences, termed GGM13-like genes (fig. 3). The GGM13-like genes are closely related to the DEF- and GLO-like genes (fig. 3), which provide the floral homeotic B-function in angiosperms. The other GGM genes do not fall into any of the subfamilies described in the literature (J. J. Doyle 1994; Purugganan et al. 1995; Theißen and Saedler 1995; Theißen, Kim, and Saedler 1996; Münster et al. 1997; Winter et al. 1999; Theißen et al. 2000), if bootstrap support of >50% is used as a criterion (fig. 3 and unpublished data). Whether they have orthologs in angiosperms remains to be seen in future studies. The topology of many phylogenetic trees (e.g., fig. 3) would be compatible with the view that GGM10 is an AGL12-like gene and GGM19 is an AGL15-like gene. However, bootstrap support for the respective relationships is so low that additional evidence will be needed to clarify these cases.

FIG. 3. Phylogenetic tree showing the relationships between a subset of the MADS-domain proteins known. Genus names of species from which the respective genes were isolated are given in parentheses beside the protein names. Gnetum proteins are indicated by inverted boxes, and proteins from nongnetalean gymnosperms are indicated by shaded boxes. Proteins from ferns are highlighted by open boxes. Proteins that are not boxed represent angiosperm sequences. The numbers next to some nodes give bootstrap percentages, shown only for relevant nodes and those defining gene subfamilies (Theißen and Saedler 1995; Theißen, Kim, and Saedler 1996; Münster et al. 1997; Winter et al. 1999). Subfamilies are labeled by brackets at the right margin. Bootstrap values and subfamily names corresponding to minimal clades containing sequences from both gymnosperms and angiosperms are boxed.

Differential Expression of GGM Genes

Similar expression patterns may corroborate hypotheses about orthology if these expression patterns are found for most (or even all) members of the clade of putatively orthologous genes and are rarely (or not at all) found outside the respective gene clade (Winter et al. 1999). Moreover, knowledge about the expression patterns of Gnetum MADS-box genes may provide first clues concerning the functions of the genes. Therefore, we worked out an overview of the expression of the GGM genes employing Northern hybridization. For GGM5, GGM10, and GGM17, expression has not been found so far in the investigated organs, very likely because it is too weak there to be detected by hybridization (data not shown).
The expression patterns of the other GGM genes appear to be quite diverse, even at our low level of spatial resolution (fig. 4; for the expression patterns of GGM1–GGM3, GGM9, and GGM11, see Winter et al. 1999). The expression of most genes is restricted to male or female reproductive units, but only GGM3, GGM8, and GGM9 seem to be expressed there in approximately equal amounts. Some other genes (GGM4, GGM7, GGM11) are more strongly expressed in female than in male cones. Since the cones of male G. gnemon plants contain a certain amount of sterile female reproductive units assumed to be involved in pollinator attraction (Hufford 1996), it seems possible that expression of these genes is restricted to female reproductive units, comprising the fertile ovules of female cones and the sterile ovules of male cones. Expression of GGM2, GGM6, GGM15, GGM16, and GGM18 has been detected so far only in male cones. In contrast, GGM13 expression appears to be specific for female cones. Only four genes show considerable expression in vegetative leaves, with GGM1 being almost ubiquitously expressed in vegetative and reproductive organs, GGM12 being more strongly expressed in leaves than in cones, GGM14 being more strongly expressed in female cones than in male cones and leaves, and GGM19 being expressed in leaves and female cones but not in male cones. Taken together, these data suggest a considerable diversity of GGM gene functions in both vegetative and reproductive organ development in G. gnemon.

Discussion

Our studies on 19 different members of the MADS-box gene family from the gnetophyte G. gnemon reveal that this gene family is quite complex in terms of gene number, sequence diversity, and expression patterns. In line with this, PCR cloning of a 61-bp segment using degenerate primers targeted to the MADS-box suggested the presence of over 27 different MADS-box genes within black spruce (Picea mariana), a gymnosperm belonging to the conifers (Rutledge et al. 1998).
Both studies thus suggest that the complexity of the MADS-box gene family in gymnosperms is similar to that in angiosperms. In contrast to the sequence fragments available for the black spruce genes, which are too small for reliable phylogeny reconstructions, the (almost) full-length sequences from G. gnemon presented here could be used to show that seven of them fall into six distinct gene clades which also contain members from angiosperm species. Together with the clade of AGL2-like genes which contains the conifer gene PRMADS1 (Mouradov et al. 1998), we thus have defined seven different gene clades which contain both gymnosperm and angiosperm members at the >70% level of bootstrap support. Six of these clades have bootstrap support >80%, and four even have bootstrap support of ≥90% (fig. 3). Due to their membership in distinct subclades of the MADS-box gene tree, the respective GGM genes and PRMADS1 are not just homologs, but even putative orthologs of the respective clade members from angiosperms, meaning that the ancestors of these genes were established during a speciation event(s) that separated the lineage(s) that led to the respective extant gymnosperm groups from the lineage that led to extant angiosperms.

FIG. 4. Northern blot analysis of GGM gene expression. The names of the respective genes are indicated at the right margin. At the left margin, the apparent lengths of the major bands are indicated in kilobases. RNA sources were young leaves (L) and male (M) or female (F) cones from Gnetum gnemon trees, as indicated. The expression pattern of GGM1 has already been shown elsewhere (Winter et al. 1999), but since the gene is quite ubiquitously expressed, it is included here as a control for RNA loading. At the top, a section of an ethidium bromide–stained gel containing rRNA is shown before membrane blotting as an additional control for equal RNA loading.

FIG. 5. The origin of MADS-box gene clades in the evolution of vascular plants. A phylogenetic tree of some major taxa of vascular plants is shown. The ages (in MYA) given at two nodes of the tree are rough estimates. At the left side of the root and some branches of the tree, three important stages in the evolution of the megasporangium are schematically depicted. From bottom to top: a sporangium that is not covered by an integument, a condition still found in extant ferns; a sporangium that is covered by an integument (ovule); and a sporangium that, in addition, is surrounded by a carpel. The gene names beside the branches denote gene subfamilies, not single genes. These were established, at the latest, during the time interval represented by the respective branches of the phylogenetic tree. This could be concluded from the presence of respective clade members in extant taxa. For example, AG-, AGL2-, AGL6-, DEF/GLO-, GGM13-, STMADS11-, and TM3-like genes have already been isolated from angiosperms and gymnosperms, but not from ferns. “2 ≤ MIKC-type genes” symbolizes that the last common ancestor of ferns and seed plants already had at least two MIKC-type MADS-box genes (Münster et al. 1997). Information about some gene clades shown here but not described in this paper has been reviewed elsewhere (Theißen et al. 2000).

Further implications of these findings depend on the phylogenetic position of the gnetophytes and conifers within the seed plants. Extant seed plants comprise angiosperms and four different groups of gymnosperms, i.e., gnetophytes (with only three genera, Gnetum, Ephedra, and Welwitschia), conifers, cycads, and Ginkgo biloba. Although some phylogenetic analyses of morphological data suggested that gnetophytes are a sister group to angiosperms among extant gymnosperms (J. A.
Doyle 1994, 1996), recent phylogeny reconstructions based on molecular markers indicated that gnetophytes were more closely related to conifers than to angiosperms (Hansen et al. 1999; Winter et al. 1999) and, moreover, that all extant gymnosperms represent a monophyletic group (Goremykin et al. 1996; Chaw et al. 1997; Qiu et al. 1999; Samigullin et al. 1999; Soltis et al. 1999). Therefore, ancestors of orthologous genes shared by angiosperms and any gymnosperm were very likely already present in the last common ancestor of all extant seed plants. Although the earliest fossil evidence of gymnosperms dates back to about 350–365 MYA (Beck 1988; Taylor and Taylor 1993), the last common ancestor of extant seed plants probably existed about 300 MYA; recent estimations based on molecular data range from 285 to 348 MYA (Savard et al. 1994; Goremykin, Hansmann, and Martin 1997). Assuming monophyly of all extant gymnosperms, we thus conclude from our data that the last common ancestor of extant gymnosperms and angiosperms about 300 MYA already contained at least seven different MADS-box genes, namely, distinct representatives of the clades of AG-, AGL2-, AGL6-, DEF/GLO-, GGM13-, STMADS11-, and TM3-like genes (fig. 5). Aside from the possibility that some gene types might have been lost in some seed plant lineages, representatives of all of the clades found in gnetophytes and angiosperms can thus also be expected for conifers, Ginkgo, and cycads. Indeed, AG-, AGL6-, DEF/GLO-, and TM3-like genes have already been isolated from conifer species (Tandre et al. 1995; Mouradov et al. 1998, 1999; Rutledge et al. 1998; Sundström et al. 1999), and an AGL6-like gene has also been found in Ginkgo (Winter et al. 1999; Theißen et al. 2000).

According to our data, the precursors of the GGM genes in the ancient clades of paralogous genes (GGM1, GGM2, GGM3, GGM9, GGM11, GGM12, GGM13) separated more than 300 MYA, which explains, at least
in part, the diversity of extant GGM proteins in the I- and C-domains (fig. 2). Hybridization of Southern blots containing G. gnemon total DNA with GGM probes at moderate stringency indicates that our sampling of GGM genes was not exhaustive (data not shown). Since MADS-box gene sampling is far from being complete for any gymnosperm model species, we consider our determination of the number of genes in the last common ancestor of extant seed plants a minimal estimate. This is also true because some gene types could have been lost after the separation of the lineages that led to extant angiosperms and gymnosperms in at least one of the lineages, and because some genuine orthologous relationships may have escaped our detection methods due to the long time since the separation of the angiosperm and gymnosperm lineages and/or due to rapid sequence evolution.

Comparison between the expression patterns of orthologous angiosperm and Gnetum MADS-box genes reveals some striking similarities. For the TM3-like gene GGM1, the DEF/GLO-like gene GGM2, the AG-like gene GGM3, and the AGL6-like genes GGM9 and GGM11, these similarities have already been outlined elsewhere (Winter et al. 1999). The STMADS11-like gene GGM12 is expressed strongly in leaves, but only weakly in reproductive cones (fig. 4). Its putative orthologs from the potato, STMADS11 and STMADS16, are expressed in all vegetative organs of the plant, but not in flowers (Carmona, Ortega, and Garcia-Maroto 1998; Garcia-Maroto et al. 2000). Thus, the STMADS11-like genes identified so far show a preference for expression in vegetative organs, which is a very unusual feature for MADS-box genes from seed plants. GGM13 expression was found exclusively in female cones (fig. 4). The only GGM13-like gene that has been isolated so far from an angiosperm, the maize gene ZMM17, is predominantly expressed in female inflorescences (maize cobs), where at late developmental stages expression is restricted to carpels (unpublished data).
Thus, the two GGM13-like genes known to date have in common an expression pattern which is focused on female reproductive structures. The most parsimonious explanation for these similarities in expression patterns of putatively orthologous genes (also including the cases involving GGM1–GGM3, GGM9, and GGM11; see Winter et al. 1999) is that the MADS-box genes that were present in the last common ancestor of extant seed plants had already adopted their gene clade-specific expression patterns, which then were conserved to a certain extent in the different angiosperm and gymnosperm lineages. Thus, some of the MADS-box genes which were present in the last common ancestor of extant seed plants not only already had some of the sequence characteristics typical for extant clade members, but also were already diversified and fixed in terms of expression patterns and (by inference) function. While some of the ancestral genes 300 MYA were very likely already specialized in male (DEF/GLO-like genes) or female reproductive organ development (GGM13-like genes) or both (AG-, AGL2-, and AGL6-like genes), others were probably involved in vegetative development or the switch from vegetative to reproductive development (STMADS11- and TM3-like genes).
Interestingly, all of these MADS-box gene types probably originated during the period when the ovule was established during evolution, i.e., 400–300 MYA (fig. 5; Beck 1988; Taylor and Taylor 1993). Since GGM13- and AG-like genes are expressed in ovules, and some AG-like genes are key control genes of ovule development (Angenent and Colombo 1996; Western and Haughn 1999), the establishment of these genes may have been an important step in the evolution of the ovule (see also Münster et al. 1997; Theißen et al. 2000). Since the members of at least five other clades of MADS-box genes have been conserved for a similar period of time (fig.
5), they also may well have been important for developmental and structural key innovations of the seed plants, e.g., the evolution of microsporophylls (antherophores, stamens) in the case of DEF-, GLO-, and DEF/GLO-like genes.
Acknowledgments
We thank Thomas Stützel (Spezielle Botanik, Ruhr-Universität Bochum), Angelika Piernitzky, and Manfred H. Weisenseel (Botanischer Garten der Universität Karlsruhe) for plant material from G. gnemon. We also thank the Automatic DNA Isolation and Sequencing team of our institute for sequencing the cDNA clones. Many thanks to Jan Kim for his help with computer work and to Thomas Münster for valuable advice and discussions. Financial support from the DFG to G.T. (Th 417/3-1) and to A.B. (Graduiertenkolleg “Molekulare Analyse von Entwicklungsprozessen bei Pflanzen”) is gratefully acknowledged.
LITERATURE CITED
ANGENENT, G. C., and L. COLOMBO. 1996. Molecular control of ovule development. Trends Plant Sci. 1:228–232.
BECK, C. B. 1988. Origin and evolution of gymnosperms. Columbia University Press, New York.
CARMONA, M. J., N. ORTEGA, and F. GARCIA-MAROTO. 1998. Isolation and molecular characterization of a new vegetative MADS-box gene from Solanum tuberosum L. Planta 207:181–188.
CHAW, S.-M., A. ZHARKIKH, H.-M. SUNG, T.-C. LAU, and W.-H. LI. 1997. Molecular phylogeny of extant gymnosperms and seed plant evolution: analysis of 18S rRNA sequences. Mol. Biol. Evol. 14:56–78.
CHO, S., S. JANG, S. CHAE, K. M. CHUNG, Y.-H. MOON, G. AN, and S. K. JANG. 1999. Analysis of the C-terminal region of Arabidopsis thaliana APETALA1 as a transcription activation domain. Plant Mol. Biol. 40:419–429.
DAYHOFF, M. O. 1979. Atlas of protein sequences and structure. Vol. 5, Suppl. 3. National Biomedical Research Foundation, Washington, D.C.
DOYLE, J. A. 1994. Origin of the angiosperm flower: a phylogenetic perspective. Plant Syst. Evol. 8(Suppl.):7–29.
———. 1996. Seed plant phylogeny and the relationships of Gnetales. Int. J. Plant Sci. 157(Suppl. 6):S3–S39.
DOYLE, J. J. 1994. Evolution of a plant homeotic multigene family: towards connecting molecular systematics and molecular developmental genetics. Syst. Biol. 43:307–328.
EGEA-CORTINES, M., H. SAEDLER, and H. SOMMER. 1999. Ternary complex formation between the MADS-box proteins SQUAMOSA, DEFICIENS and GLOBOSA is involved in the control of floral architecture in Antirrhinum majus. EMBO J. 18:5370–5379.
FELSENSTEIN, J. 1993. PHYLIP (phylogeny inference package). Version 3.5. Distributed by the author, Department of Genetics, University of Washington, Seattle.
FISCHER, A., N. BAUM, H. SAEDLER, and G. THEIßEN. 1995. Chromosomal mapping of the MADS-box multigene family in Zea mays reveals dispersed distribution of allelic genes as well as transposed copies. Nucleic Acids Res. 23:1901–1911.
GARCIA-MAROTO, F., N. ORTEGA, R. LOZANO, and M.-J. CARMONA. 2000. Characterization of the potato MADS-box gene STMADS16 and expression analysis in tobacco transgenic plants. Plant Mol. Biol. 42:499–513.
GOREMYKIN, V., V. BOBROVA, J. PAHNKE, A. TROITSKY, A. ANTONOV, and W. MARTIN. 1996. Noncoding sequences from the slowly evolving chloroplast inverted repeat in addition to rbcL data do not support Gnetalean affinities of angiosperms. Mol. Biol. Evol. 13:383–396.
GOREMYKIN, V., S. HANSMANN, and W. F. MARTIN. 1997. Evolutionary analysis of 58 proteins encoded in six completely sequenced chloroplast genomes: revised molecular estimates of two seed plant divergence times. Plant Syst. Evol. 206:337–351.
GU, Q., C. FERRÁNDIZ, M. F. YANOFSKY, and R. MARTIENSSEN. 1998. The FRUITFULL MADS-box gene mediates cell differentiation during Arabidopsis fruit development. Development 125:1509–1517.
HANSEN, A., S. HANSMANN, T. SAMIGULLIN, A. ANTONOV, and W. MARTIN. 1999. Gnetum and the angiosperms: molecular evidence that their shared morphological characters are convergent, rather than homologous. Mol. Biol. Evol. 16:1006–1009.
HASEBE, M., and J. A. BANKS. 1997. Evolution of MADS gene family in plants. Pp. 179–197 in K. IWATSUKI and P. H. RAVEN, eds. Evolution and diversification of land plants. Springer-Verlag, Tokyo.
HUANG, H., M. TUDOR, C. A. WEISS, Y. HU, and H. MA. 1995. The Arabidopsis MADS-box gene AGL3 is widely expressed and encodes a sequence-specific DNA-binding protein. Plant Mol. Biol. 28:549–567.
HUFFORD, L. 1996. The morphology and evolution of male reproductive structures of Gnetales. Int. J. Plant Sci. 157(Suppl. 6):S95–S112.
LILJEGREN, S. J., C. FERRÁNDIZ, E. R. ALVAREZ-BUYLLA, S. PELAZ, and M. F. YANOFSKY. 1998. MADS-box genes involved in fruit dehiscence. Flowering Newsl. 25:9–19.
MA, H., M. F. YANOFSKY, and E. M. MEYEROWITZ. 1991. AGL1–AGL6, an Arabidopsis gene family with similarity to floral homeotic and transcription factor genes. Genes Dev. 5:484–495.
MICHAELS, S. D., and R. M. AMASINO. 1999. FLOWERING LOCUS C encodes a novel MADS domain protein that acts as a repressor of flowering. Plant Cell 11:949–956.
MOURADOV, A., T. V. GLASSIK, B. A. HAMDORF, L. C. MURPHY, S. S. MARLA, Y. YANG, and R. TEASDALE. 1998. Family of MADS-box genes expressed early in male and female reproductive structures of Monterey pine. Plant Physiol. 117:55–61.
MOURADOV, A., B. HAMDORF, R. D. TEASDALE, J. KIM, K.-U. WINTER, and G. THEIßEN. 1999. A DEF/GLO-like MADS-box gene from a gymnosperm: Pinus radiata contains an ortholog of angiosperm B class floral homeotic genes. Dev. Genet. 25:245–252.
MÜNSTER, T., J. PAHNKE, A. DI ROSA, J. T. KIM, W. MARTIN, H. SAEDLER, and G. THEIßEN. 1997. Floral homeotic genes were recruited from homologous MADS-box genes preexisting in the common ancestor of ferns and seed plants. Proc. Natl. Acad. Sci. USA 94:2415–2420.
OKADA, K., and Y. SHIMURA. 1994. Genetic analyses of signalling in flower development using Arabidopsis. Plant Mol. Biol. 26:1357–1377.
PURUGGANAN, M. D., S. D. ROUNSLEY, R. J. SCHMIDT, and M. YANOFSKY. 1995. Molecular evolution of flower development: diversification of the plant MADS-box regulatory gene family. Genetics 140:345–356.
QIU, Y.-L., J. LEE, F. BERNASCONI-QUADRONI, D. E. SOLTIS, P. S. SOLTIS, M. ZANIS, E. A. ZIMMER, Z. CHEN, V. SAVOLAINEN, and M. W. CHASE. 1999. The earliest angiosperms: evidence from mitochondrial, plastid and nuclear genomes. Nature 402:404–407.
RIECHMANN, J. L., and E. M. MEYEROWITZ. 1997. MADS domain proteins in plant development. Biol. Chem. 378:1079–1101.
ROUNSLEY, S. D., G. S. DITTA, and M. F. YANOFSKY. 1995. Diverse roles for MADS box genes in Arabidopsis development. Plant Cell 7:1259–1269.
RUTLEDGE, R., S. REGAN, O. NICOLAS, P. FOBERT, C. COTÉ, W. BOSNICH, C. KAUFFELDT, G. SUNOHARA, A. SÉGUIN, and D. STEWART. 1998. Characterization of an AGAMOUS homologue from the conifer black spruce (Picea mariana) that produces floral homeotic conversions when expressed in Arabidopsis. Plant J. 15:625–634.
SAITOU, N., and M. NEI. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425.
SAMBROOK, J., E. F. FRITSCH, and T. MANIATIS. 1989. Molecular cloning: a laboratory manual. 2nd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
SAMIGULLIN, T. K., W. F. MARTIN, A. V. TROITSKY, and A. S. ANTONOV. 1999. Molecular data from the chloroplast rpoC1 gene suggest a deep and distinct dichotomy of contemporary spermatophytes into two monophyla: gymnosperms (including Gnetales) and angiosperms. J. Mol. Evol. 49:310–315.
SAVARD, L., P. LI, S. H. STRAUSS, M. W. CHASE, M. MICHAUD, and J. BOUSQUET. 1994. Chloroplast and nuclear gene sequences indicate Late Pennsylvanian time for the last common ancestor of extant seed plants. Proc. Natl. Acad. Sci. USA 91:5163–5167.
SCHWARZ-SOMMER, Z., P. HUIJSER, W. NACKEN, H. SAEDLER, and H. SOMMER. 1990. Genetic control of flower development by homeotic genes in Antirrhinum majus. Science 250:931–936.
SHORE, P., and A. D. SHARROCKS. 1995. The MADS-box family of transcription factors. Eur. J. Biochem. 229:1–13.
SOLTIS, P. S., D. E. SOLTIS, P. G. WOLF, D. L. NICKRENT, S. M. CHAW, and R. L. CHAPMAN. 1999. The phylogeny of land plants inferred from 18S rDNA sequences: pushing the limits of rDNA signal. Mol. Biol. Evol. 16:1774–1784.
SOMMER, H., J.-P. BELTRÁN, P. HUIJSER, H. PAPE, W.-E. LÖNNIG, H. SAEDLER, and Z. SCHWARZ-SOMMER. 1990. Deficiens, a homeotic gene involved in the control of flower morphogenesis in Antirrhinum majus: the protein shows homology to transcription factors. EMBO J. 9:605–613.
SUNDSTRÖM, J., A. CARLSBECKER, M. SVENSSON, M. SVENSON, J. URBAN, G. THEIßEN, and P. ENGSTRÖM. 1999. MADS-box genes active in developing pollen cones of Norway spruce (Picea abies) are homologous to the B-class floral homeotic genes in angiosperms. Dev. Genet. 25:253–266.
TANDRE, K., V. A. ALBERT, A. SUNDAS, and P. ENGSTRÖM. 1995. Conifer homologues to genes that control floral development in angiosperms. Plant Mol. Biol. 27:69–78.
TAYLOR, T. N., and E. L. TAYLOR. 1993. The biology and evolution of fossil plants. Prentice Hall, Englewood Cliffs, N.J.
THEIßEN, G., A. BECKER, A. DI ROSA, A. KANNO, J. T. KIM, T. MÜNSTER, K.-U. WINTER, and H. SAEDLER. 2000. A short history of MADS-box genes in plants. Plant Mol. Biol. 42:115–149.
THEIßEN, G., J. KIM, and H. SAEDLER. 1996. Classification and phylogeny of the MADS-box multigene family suggest defined roles of MADS-box gene subfamilies in the morphological evolution of eukaryotes. J. Mol. Evol. 43:484–516.
THEIßEN, G., and H. SAEDLER. 1995. MADS-box genes in plant ontogeny and phylogeny: Haeckel’s ‘biogenetic law’ revisited. Curr. Opin. Genet. Dev. 5:628–639.
———. 1998. Molecular architects of plant body plans. Prog. Bot. 59:227–256.
———. 1999. The golden decade of molecular floral development (1990–1999): a cheerful obituary. Dev. Genet. 25:181–193.
WEIGEL, D., and E. M. MEYEROWITZ. 1994. The ABCs of floral homeotic genes. Cell 78:203–209.
WESTERN, T. L., and G. W. HAUGHN. 1999. BELL1 and AGAMOUS genes promote ovule identity in Arabidopsis thaliana. Plant J. 18:329–336.
WINTER, K.-U., A. BECKER, T. MÜNSTER, J. T. KIM, H. SAEDLER, and G. THEIßEN. 1999. MADS-box genes reveal that gnetophytes are more closely related to conifers than to flowering plants. Proc. Natl. Acad. Sci. USA 96:7342–7347.
WOLFE, K. H., M. GOUY, Y.-W. YANG, P. M. SHARP, and W.-H. LI. 1989. Date of the monocot-dicot divergence estimated from chloroplast DNA sequence data. Proc. Natl. Acad. Sci. USA 86:6201–6205.
YANOFSKY, M. F., H. MA, J. L. BOWMAN, G. N. DREWS, K. A. FELDMAN, and E. M. MEYEROWITZ. 1990. The protein encoded by the Arabidopsis homeotic gene agamous resembles transcription factors. Nature 346:35–39.
ZHANG, H., and B. G. FORDE. 1998. An Arabidopsis MADS box gene that controls nutrient-induced changes in root architecture. Science 279:407–409.
ZHANG, J., and M. NEI. 1996. Evolution of Antennapedia-class homeobox genes. Genetics 142:295–303.
WILLIAM MARTIN, reviewing editor
Accepted June 5, 2000