Biological Journal of the Linnean Society, 2002, 76, 21–37. With 2 figures

Phylogeny of a paradigm lineage: the Drosophila melanogaster species group (Diptera: Drosophilidae)

VALERIE SCHAWAROCH1,2*

1 Department of Biology, City College, CUNY, New York, NY 10031, USA
2 Division of Invertebrate Zoology, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024-5192, USA

Received 28 May 2001; accepted for publication 3 January 2002

Although Drosophila melanogaster is a paradigm eukaryote for biology, relationships of this species and the other 174 species in the melanogaster species group are poorly explored and ambiguous. Gene regions of Cytochrome oxidase II (mt:CoII), Alcohol dehydrogenase (Adh) and hunchback (hb) were sequenced and analysed phylogenetically to test prior hypotheses of relationships for the group based on chromosomes, morphology, and 28S rRNA gene sequences. A simultaneous cladistic analysis of the three newly sequenced gene regions produced a single well-resolved phylogeny for 49 exemplar species representing eight subgroups. Monophyly of each of the ananassae, melanogaster, montium, and takahashii subgroups is supported; the suzukii subgroup is polyphyletic. This phylogeny is consistent with variation in significant morphological structures, such as the male sex comb on the fore tarsus. The broad range of morphological variation among these species is interpreted and the applicability to evolution and developmental investigations is discussed. This phylogeny facilitates comparative investigations, such as gene family evolution, transposable element transmission, and evolution of morphological structures.

© 2002 The Linnean Society of London, Biological Journal of the Linnean Society, 2002, 76, 21–37.

ADDITIONAL KEYWORDS: Adh – DNA – hb – molecular phylogeny – mt:CoII – sex comb.

INTRODUCTION

THE D. MELANOGASTER PARADIGM

Evolutionary studies rely on well-established phylogenies.
Drosophila melanogaster has traditionally served, and continues to serve, as a model organism for virtually all aspects of biology, but especially for genetics and development (Lawrence, 1992; Kohler, 1994; Sullivan, Ashburner & Hawley, 2000). This species, however, is only one of 174 species within the melanogaster species group. Techniques developed and information gathered in D. melanogaster-based studies can be extended to the other closely related species. Previous investigations employing Drosophila, such as those on the evolution of gene families (Drosopoulou & Scouras, 1995; Inomata et al., 1997b; Inomata et al., 1997a), chromosome structure (Mavragani-Tsipidou et al., 1994; Drosopoulou & Scouras, 1995; Scouras, 1995), and transposable element transmission (Tanda et al., 1988; Daniels et al., 1990; Clark & Kidwell, 1997; Clark et al., 1998), would benefit from a well-established phylogeny. However, phylogenetic relationships for species within the melanogaster species group are poorly understood.

*Current address: Division of Invertebrate Zoology, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024–5192, USA.

PREVIOUS HYPOTHESES OF RELATIONSHIPS

The melanogaster species group is one of eight species groups within the subgenus Sophophora of the genus Drosophila, and represents one of the largest radiations of species within the genus Drosophila (Bock, 1980). The obscura group has been established as sister to the melanogaster group based on numerous morphological and biochemical investigations (Sturtevant, 1942; Throckmorton, 1975; Powell & DeSalle, 1995). Males in both groups possess a comb of thick, sclerotized teeth (the ‘sex comb’) on the fore tarsi, which is the most overt feature of the groups, although scarcely developed in some species (Fig. 1). This character has also been used to establish and maintain the melanogaster group and its 12 named subgroups.

Figure 1. Phylogenetic hypothesis generated from a most parsimonious simultaneous analysis of the three gene regions (length = 1540 steps and CI = 0.35). The ingroup (i.e. the melanogaster group) exhibits monophyly where applicable for all traditional taxonomic groupings, except the suzukii subgroup. Bold lines on the cladogram represent nodes where both the Bremer support is ≥3 and bootstrap is >50%. Morphological structures (sex comb, epandrium and mid-tibia) that are discussed in the text and that corroborate cladogram structure are illustrated. The evolutionary changes seen in the male sex comb are labelled with hatched bars on the phylogeny. Sex comb orientation is denoted as Hor. for horizontal, Obl. for oblique or Long. for longitudinal. The number of foreleg tarsal segments containing sex comb teeth is listed as 1 tar., 2 tar., or 3 tar.

There have been five major studies on the melanogaster species group (Hsu, 1949; Okada, 1954; Bock & Wheeler, 1972; Bock, 1980 and a regional revision by Toda, 1991). Only Hsu (1949), Okada (1954) and Bock & Wheeler (1972) proposed relationships among the subgroups using morphological characters, but without phylogenetic bases. These classifications were presented when only 12–50% of the current species in the group were known. Bock & Wheeler (1972) proposed that the ananassae and montium species subgroups form one lineage. Within the remaining subgroups they also proposed a cluster of closely related subgroups (i.e. eugracilis, ficusphila, suzukii and takahashii) which could be distinguished by the small hooked bristles on the mid-tibiae of the males and characters from the male genitalia. Recent hypotheses provide some support for this classification. Ashburner et al.
(1984), using chromosomes and morphology, discerned three lineages: (1) an ananassae species subgroup; (2) a montium species subgroup; and (3) a lineage comprised of the elegans, eugracilis, ficusphila, melanogaster, suzukii, and takahashii species subgroups. However, Ashburner et al. (1984) were unsure how these lineages were interrelated or where to place the remaining species subgroups.

The first major DNA sequencing studies of the melanogaster group were by Pélandakis et al. (1991) and Pélandakis & Solignac (1993) and contained the greatest taxon sampling. Unfortunately, their findings are controversial. Pélandakis et al. (1991) and Pélandakis & Solignac (1993) used 28S rRNA sequences with parsimony and Neighbour-Joining analyses to reconstruct relationships within the genus Drosophila and the subgenus Sophophora, including 21 species in the melanogaster group. The parsimony analysis produced numerous trees whose consensus lacked resolution; therefore, the proposed phylogeny was based on Neighbour-Joining. The Neighbour-Joining phylogeny had extensive paraphyly at various levels (within the genus Drosophila, the subgenus Sophophora, and the melanogaster group), contrary to previous hypotheses based on morphological characters. With respect to the melanogaster group, Pélandakis et al. (1991) and Pélandakis & Solignac (1993) proposed three lineages: (1) the obscura and fima groups allied with the ananassae subgroup; (2) a montium subgroup lineage; and (3) a lineage comprised of the melanogaster subgroup plus the ‘Oriental’ (Asian) elegans, eugracilis, ficusphila, suzukii, and takahashii subgroups. The Neighbour-Joining tree had a low bootstrap value of 6 at the node uniting the first with the second lineage. Pélandakis et al.
(1991) and Pélandakis & Solignac (1993) attributed the lack of resolution in their parsimony analyses, as well as the paraphyly and poor bootstrap support in their Neighbour-Joining phylogeny, to the small number of characters in the 28S gene supporting the ananassae + montium subgroups node. The paucity of 28S characters, they believe, is the result of a rapid speciation event. Methodologically, Neighbour-Joining produces a single ‘resolved’ tree; however, its solutions are not considered optimal and should not be used for final phylogenetic hypotheses (Hillis et al., 1996). A basal trichotomous relationship for the subgroups is unlikely because distinctive morphological characters exist uniting some of the subgroups, such as hooked setae on the mid-tibia, and presence of both a surstylar clasper and cercal clasper. Rather, the lack of characters here probably reflects the gene region used for the analyses. Within the melanogaster group the D1 and D2 regions of 28S rRNA are extremely invariant (see figs in Pélandakis & Solignac, 1993; Schawaroch, 2000). In fact, the D3 expansion region of 28S rDNA evolves so slowly as to resolve relationships among the holometabolous insect orders – divergence events an order of magnitude older than in drosophilids (see Whiting et al., 1997).

The final hypothesis of relationships proposed by Pélandakis et al. (1991) and Pélandakis & Solignac (1993) largely agrees with Ashburner et al. (1984) in recognizing three distinct lineages within the melanogaster group: (1) the ananassae subgroup; (2) the montium subgroup; and (3) the melanogaster plus Asian subgroups. There was no proposal for how these lineages were interrelated or where to place the remaining subgroups.

THE PRESENT STUDY’S GOALS

The primary goal of the present study is to resolve relationships within the melanogaster group by choosing more variable gene regions (i.e. Alcohol dehydrogenase [Adh], hunchback [hb], and Cytochrome oxidase II [mt:CoII]) and by sampling more taxa (i.e.
43 species representing eight subgroups [Fig. 1]). Morphological characters employed by earlier studies were examined in the context of this new molecular phylogeny. Relationships established by the molecular phylogeny will be used to describe the evolution of sex comb transformation for the melanogaster species group.

MATERIAL AND METHODS

TAXON SAMPLING

Flies were obtained from the National Drosophila Species Resource Center at Bowling Green and from D. Lachaise (Appendix A). A total of 49 taxa were used in this study. The ingroup contains representative taxa from eight of the 12 subgroups within the melanogaster group. Outgroup taxa consist of six species from the obscura group, two from each of the subgroups of obscura, pseudoobscura and affinis in the obscura group (Barrio et al., 1994).

GENE REGIONS

One mitochondrial (mt:CoII) and two single copy nuclear (Adh and hb) gene regions were chosen on their ability to provide characters (synapomorphies) at the taxonomic level of investigation (Schawaroch, 2000). All of these regions have conserved areas and variable areas that are a source of characters at the species level. Although each of these genes exhibits population-level variability, species have their own distinct sequences that are shared among populations (Davis & Nixon, 1992). Any DNA sites found to be heterozygous were coded as such. The DNA regions chosen are used in a wide variety of taxa, especially within Drosophila, thus facilitating comparisons with previous studies (e.g. Beckenbach et al., 1993; Thomas & Hunt, 1993; Baker & DeSalle, 1997). A simultaneous analysis of multiple unlinked gene regions was employed to avoid problems of single gene phylogenies (Doyle, 1992).

MOLECULAR TECHNIQUES

Genomic DNA was prepared using single fly preps (DeSalle et al., 1993).
DNA was PCR amplified using PE Taq polymerase with primers described previously (Thomas & Hunt, 1993; Brower, 1994; Baker & DeSalle, 1997; plus two new Adh primers 5′-TGGGCGGCATTGGNYTNGAYAC-3′ and 5′-AGCCAGGARTTGAAYTTRTG-3′, Schawaroch, 2000). PCR products were purified using Gene Clean II (Bio 101). The Adh fragment was cloned using Invitrogen’s TA cloning kit for many of the taxa. Gene regions were sequenced in both directions by either manual or automated sequencing methods. Most of the Adh and mt:CoII sequences were generated manually. The remainder, including all the hb sequences, were done using an ABI 373 automated sequencer. Manual sequencing of double stranded PCR products and clones was done using 35S and United States Biochemical’s Sequenase version 2.0 DNA sequencing kit, according to manufacturer’s instructions. Automated sequencing of double stranded PCR product was accomplished according to ABI Prism DNA sequencing kit, purified by sephadex columns and run using Applied Biosystems 373A machine and DNA sequence protocols. Sequences were checked and corrected using SEQUENCHER 3.0 (Gene Codes Corp.) sequence analysis software. Most of the DNA sequence was generated by this study (GenBank accession numbers – Adh: AF459744-AF459786; mt:CoII: AF461268-AF461308; hb: AF461309-AF461356) with the following exceptions: affinis: mt:CoII (M95140); ambigua: mt:CoII (M95145), Adh (X54813); bifasciata: mt:CoII (M95147); melanogaster: mt:CoII (AF200828), Adh (M11290), hb (Y00274); persimilis: mt:CoII (M95143), Adh (M60997); pseudoobscura: mt:CoII (M95150), Adh (M60989); teissieri: Adh (X54118); tolteca: mt:CoII (M95147); and yakuba: mt:CoII (X00924), Adh (X57376).

CHARACTER ASSIGNMENT

The Adh and mt:CoII DNA sequences of 290 and 384 bp, respectively, contained no insertions or deletions; therefore, alignment was straightforward. Homology assessment for hb was more complicated because the total length of the hb sequence varied from 513 bp in D.
bifasciata to 456 bp in D. takahashii and D. elegans. It was necessary to convert hb nucleotide sequence to amino acid sequence for recognition of homology (i.e. topological identity sensu Brower & Schawaroch, 1996). Alignments were performed on hb amino acid sequences using the Clustal method in MEGALIGN (DNASTAR, version 1.02). A sensitivity analysis (Wheeler, 1995) was performed varying the gap penalty from 8 to 30. Because topological identity could not be established, alignment-ambiguous sites were removed from the analysis (Gatsey et al., 1994). The remaining gaps were coded as a combination of question marks and 5th state depending upon the alignment context (Appendix B).

MOLECULAR CHARACTER ASSESSMENT

Character data were assessed for heterogeneity in base composition. Total nucleotide composition for each gene region was tested for heterogeneity using the HKY (Hasegawa et al., 1985) likelihood model under PUZZLE (ver. 4.0.2, Strimmer & von Haeseler, 1999). To quantify the congruence between each of the data partitions and the combined analysis tree, the incongruence length difference (ILD) (Mickevich & Farris, 1981) was calculated for phylogenetically informative characters. ILD values were tested for significance (Farris et al., 1994; Farris et al., 1995) using the partition-homogeneity test for 111 iterations with 10 random addition tree bisection-reconnection (TBR) searches in PAUP* 4.0 (Beta ver. 4.0b2a-4.0b4a, Swofford, 2002).

PHYLOGENETIC ANALYSES

A combined analysis (Kluge, 1989) of all three genes for all the taxa was generated. The tree was rooted with the six outgroup species chosen from the sister taxon, the obscura group. Only informative characters were used to generate trees and tree statistics. Heuristic tree searches were performed using PAUP* 4.0 (Beta ver.
4.0b2a-4.0b4a, Swofford, 2002) with random addition of taxa and TBR branch swapping, repeated 20 times. The characters were given an equal weight of one and run unordered.

CHARACTER SUPPORT AT NODES

Support for nodes in the combined analyses was evaluated by Bremer support (BS) (Bremer, 1988; Bremer, 1994) and bootstrap (B) values (Felsenstein, 1985). BS values were calculated using AUTODECAY (ver. 2.9.8, Eriksson, 1997). B analyses employed 1000 replicates with each replicate containing 10 heuristic searches with random addition of taxa and TBR branch swapping.

RESULTS

MOLECULAR CHARACTER ASSESSMENT

Approximately 30% of all the nucleotides sequenced were phylogenetically informative (Table 1). All the DNA regions used are protein coding. As expected, third position sites contributed the greatest number of phylogenetically informative characters, and second position sites contributed the least.

Table 1. Contributions of the various gene regions and nucleotide positions to the number of phylogenetically informative characters in the dataset

                                             Phylogenetically informative characters
Data                     Total characters    All     First position   Second position   Third position
All three gene regions   1108                342     52               21                267
Adh                      290                 92      18               8                 66
hb                       434                 138     16               12                108
mt:CoII                  384                 112     18               1                 93

BASE COMPOSITION

Within separate gene regions the total nucleotide base composition exhibited biases. Nuclear regions have a bias in favour of the C nucleotide base (Adh: A = 24.6%, C = 30.2%, G = 25.8%, T = 19.5%; hb: A = 25.5%, C = 36.7%, G = 23.9%, T = 14%). As in other insect mitochondrial studies (Clary & Wolstenholme, 1985; DeSalle et al., 1987; Liu & Beckenbach, 1992), the mt:CoII gene region has a strong A and T bias with relatively little C and G nucleotide base content (A = 33.8%, C = 13.6%, G = 15.2%, T = 37.4%). No correction was needed for this nucleotide bias because each gene region passed a heterogeneity chi-square test at the 5% level (PUZZLE ver. 4.0.2, Strimmer & von Haeseler, 1999).

INCONGRUENCE LENGTH DIFFERENCE (ILD)

The data partitions are significantly heterogeneous when compared to the combined data (P = 0.009). This paper, however, employs a combined data analysis for phylogeny reconstruction for the following reasons. Sources of evidence (i.e. characters) should be varied, thereby negating the problem of single character or gene phylogenies (or homogenized data) (e.g. Doyle, 1992). By not including all the data, resolution could be lost, especially if data partitions contribute information for different levels of the analysis (e.g. Hillis, 1987). Studies have demonstrated that simultaneous analyses provide greater resolution (Olmstead & Sweere, 1994; Miller et al., 1997; Remsen & DeSalle, 1998).

TOTAL EVIDENCE ANALYSIS

The simultaneous analysis (mt:CoII + Adh + hb) resulted in a well resolved, single most parsimonious (SMP) cladogram (Length = 1540 steps, CI = 0.35) (Fig. 1). The CI value is within the standardized estimate for the number of taxa (Sanderson & Donoghue, 1989). In this tree monophyly was strongly supported for the melanogaster group (BS = 36; B = 100%). Monophyly was also supported for the ananassae, melanogaster, montium and takahashii subgroups with BS = 9, 8, 14, 1 and B = 98%, 90%, 99%, <50%, respectively. As the small elegans and ficusphila subgroups were each represented by a single taxon in this study, monophyly could not be tested. Monophyly is untestable for the monotypic eugracilis subgroup. In this analysis, the suzukii subgroup was polyphyletic. For the suzukii subgroup representatives: D. mimetica was basal in the takahashii clade (BS = 3; B = 59%), D. biarmipes was basal in the clade containing the melanogaster and eugracilis subgroups (BS = 2; B < 50%), and D. lucipennis forms a clade with D. elegans (BS = 12; B = 99%).
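The character counts in Table 1 can be rechecked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the counts are transcribed from Table 1, while the names TABLE1 and percent are hypothetical and introduced only for illustration. It confirms that the per-gene rows sum to the combined row and that the overall proportion of informative characters is close to the "approximately 30%" stated in the text.

```python
# Counts transcribed from Table 1 of the text:
# gene -> (total characters, informative, 1st, 2nd, 3rd codon position)
# TABLE1 and percent() are illustrative names, not from the paper.
TABLE1 = {
    "Adh":     (290, 92, 18, 8, 66),
    "hb":      (434, 138, 16, 12, 108),
    "mt:CoII": (384, 112, 18, 1, 93),
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Column sums across the three gene regions.
total = sum(row[0] for row in TABLE1.values())
informative = sum(row[1] for row in TABLE1.values())
by_position = [sum(row[i] for row in TABLE1.values()) for i in (2, 3, 4)]

print(f"all regions: {informative}/{total} = {percent(informative, total):.1f}% informative")
for gene, (n_total, n_inf, *_positions) in TABLE1.items():
    print(f"{gene}: {n_inf}/{n_total} = {percent(n_inf, n_total):.1f}% informative")
print("informative characters by codon position (1st, 2nd, 3rd):", by_position)
```

The column sums reproduce the "All three gene regions" row (1108 total, 342 informative, and 52, 21 and 267 by codon position). For hb the three codon-position counts sum to 136 rather than 138; the table does not say how the remaining two characters were scored, so the sketch leaves that difference as it stands.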
This analysis also detected three major lineages within the melanogaster group: the ananassae subgroup, the montium subgroup and the melanogaster + Asian subgroups, confirming the classification of Ashburner et al. (1984). The ananassae and montium subgroups appear to be sister taxa, although there is weak BS and B support (BS = 1; B < 50%).

DISCUSSION

Phylogenetic hypotheses are constrained by the taxa and characters sampled. The importance of taxon choice and its influence on cladogram structure has been demonstrated (Lecointre et al., 1993; Graybeal, 1998; Hillis, 1998; Poe, 1998). To obtain an adequate DNA sample for small species such as drosophilids the whole specimen must be sacrificed. Fortunately, a number of drosophilid species are maintained in laboratory cultures. These cultures, however, limit the scope of the molecular studies. Most, but not all, of the available stocks have been included in the present study, which included 25% of the currently known species in the melanogaster group – the most comprehensive taxon sampling for the melanogaster group for any biochemical investigation thus far. In some instances the resulting phylogenetic hypotheses may seem to contradict traditional views or appear poorly supported; however, they are actually in agreement and are corroborated by other types of data (e.g. morphological and electrophoretic) and methods of analysis. This corroboration indicates stability of the hypothesis of relationships proposed here.

THE SPECIES SUBGROUPS

The ananassae subgroup

The ananassae subgroup has been characterized by the presence of a cercal clasper and a surstylar clasper with two sets of teeth (Figs 1 and 2). Among members of this subgroup, D. varians is morphologically unique in lacking the cercal clasper and possessing a cercal plate with bristles similar to species of the suzukii subgroup (Fig. 1). Drosophila varians has been included within the ananassae subgroup based on chromosomal characters (Bock & Wheeler, 1972).
In the present study six species were chosen to represent the ananassae subgroup, three from the ananassae complex (D. ananassae, pallidosa and phaeopleura), one from the bipectinata complex (D. malerkotliana), one from the ercepeae complex (D. ercepeae), and one unassigned species, D. varians. The SMP cladogram supported monophyly for the ananassae subgroup with very high BS and B values (BS = 9, and B = 98%). Within the ananassae subgroup clade, species of the ananassae complex form a cluster. As the remaining species complexes have only single representatives it cannot be determined if they are natural groups. It is interesting to note that despite previous studies questioning its inclusion, the morphologically aberrant D. varians does turn out to be nested within the ananassae subgroup.

Figure 2. Foreleg and male periphallic structures; terms refer to structures discussed in the text.

The melanogaster subgroup

The melanogaster subgroup has been well investigated and many data have been used to indicate its monophyly (Lachaise et al., 1988). This study sampled the familiar D. melanogaster and the two species of the yakuba complex, D. yakuba and D. teissieri. The melanogaster subgroup forms a well-supported clade (BS = 8 and B = 90%) located in a relatively more derived position within various Asian species and subgroups. The yakuba complex has even greater Bremer and bootstrap values (BS = 16 and B = 100%). The SMP cladogram agrees with the current hypothesis of relationships for the yakuba complex + D. melanogaster.

The montium subgroup

The montium subgroup is monophyletic with very high BS and B values (BS = 14, and B = 99%) at the basal node. A monophyletic montium subgroup is in agreement with morphological and biochemical studies for which various synapomorphies have been proposed (e.g.
Bock & Wheeler, 1972; Ashburner et al., 1984; Scouras, 1995). This contrasts with Tsacas & David (1978) who felt that the montium subgroup could not be monophyletic due to its enormous size and widespread distribution, criteria not applicable to defining monophyletic groups. Toda (1991) removed three species (i.e. D. rhopaloa, palmata and longissima) from the montium subgroup, for which he established the rhopaloa and longissima subgroups. Unfortunately, none of these species or other representatives of these subgroups were included in the present study. One way to assess this classification in lieu of sequencing is to evaluate the diagnostic value of morphological characters, as determined by total evidence of the analysed taxa. The two species in the longissima subgroup possess a sex comb identical to ones in the montium subgroup, which consists of two long, longitudinally arranged rows of numerous teeth on tarsomeres 1 and 2. As this feature corroborates the molecular phylogeny presented here, the longissima subgroup may either be a complex within the montium subgroup or a sister taxon to the montium subgroup. Many morphological features of the rhopaloa subgroup are variable and its monophyly is presently questionable. A thorough discussion of the complexes and relationships within this large complicated subgroup is provided in Schawaroch (2000). It can be noted that: (1) D. barbarae is not a member of the clade that contains representatives of either jambulina or kikkawai complexes; and (2) much of the resolution within the montium subgroup is supported by BS values less than or equal to 2 and B values less than 50%.

The takahashii subgroup

The SMP cladogram supports monophyly for the takahashii subgroup, even though BS and B values are low (BS = 1 and B < 50%). The takahashii subgroup species grouped as two complexes: paralutea + prostipennis and takahashii + lutescens. This differs from the hypothesized affinity of D.
lutescens, paralutea, pseudotakahashii, takahashii and trilutea based on hybridization tests (Bock & Wheeler, 1972; Watanabe & Kawanishi, 1983; Lemeunier et al., 1986). A morphological character (i.e. number of rows of male sex comb teeth on the second tarsal segment) (Figs 1 and 2) and electrophoresis of allozymes by Parkash et al. (1994) correspond with the division of the takahashii subgroup seen in the current study’s molecular cladogram. Although included within the melanogaster + Asian subgroups clade, the takahashii subgroup is not the sister taxon to the melanogaster subgroup as Bock (1980) hypothesized.

The suzukii subgroup

Members of this subgroup apparently exhibit the greatest disparity of morphological characters, particularly for sex comb, phallic and periphallic structures, which accounts for polyphyly of the ‘subgroup.’ The putative synapomorphies for this ‘subgroup’ are generalized male genitalic characters, such as surstylar clasper with several sets of distinctly different teeth, cercal plate with lower bristles differentiated from upper bristles, and large posterior paramere (Toda, 1991). For these reasons, monophyly of the suzukii ‘subgroup’ has been questioned (Bock & Wheeler, 1972; Bock, 1980; Toda, 1991). In the SMP cladogram the suzukii subgroup was polyphyletic. The three suzukii ‘subgroup’ representatives used in this study, D. mimetica, biarmipes and lucipennis, exhibit the complete range in variation with respect to sex comb structure. The SMP molecular-based analysis places each of these three representative species in a separate clade that corresponds well with their sex comb morphology. Drosophila mimetica is sister to the takahashii clade (BS = 3; B = 59%), and all taxa have similar sex combs – a horizontal row each on the first and second tarsal segments. Drosophila biarmipes is the most basal member in a clade containing the melanogaster and eugracilis subgroups (BS = 2; B < 50%).
All taxa have a sex comb located on the first tarsal segment. However, the D. biarmipes and the melanogaster subgroup sex comb has an oblique orientation, whereas the eugracilis sex comb contains two teeth of either oblique or longitudinal orientation. Drosophila lucipennis forms a clade with D. elegans (BS = 12; B = 99%). Drosophila lucipennis has completely lost the sex comb. In D. elegans, the sex comb is a series of horizontal rows along the first three tarsal segments. Bock & Wheeler (1972: 27–28) established a subgroup for D. elegans alone, because it “. . . differs substantially in the structure of the male genitalia [from suzukii subgroup].” Only the D. lucipennis plus D. elegans clade has high BS and B values. Low BS and B values were also seen for the clades containing D. mimetica and D. biarmipes (BS = 2; B < 50%), and the sister to this clade, which contains D. lucipennis (BS = 3; B = 63%). Further work on the status and/or redefinition of the suzukii ‘subgroup’ is indicated.

RELATIONSHIPS AMONG CLADES

The three lineages

In the SMP cladogram the melanogaster group is subdivided into three major clades: the ananassae subgroup, the montium subgroup, and the melanogaster + Asian subgroups. This agrees with Ashburner et al. (1984). Ashburner et al. (1984), however, did not hypothesize any relationships among these lineages. My hypothesis also agrees with Pélandakis et al. (1991), who interpreted their findings to support the three lineages of Ashburner et al. (1984) with the exception of the obscura and fima groups within the melanogaster group. The SMP cladogram hypothesized relationships among the three lineages as: (ananassae, montium) melanogaster + Asian subgroups.

The ananassae and montium subgroups are sister taxa

The SMP cladogram proposes the ananassae and montium subgroups as sister taxa.
The node supporting this relationship has low BS and B values (BS = 1 and B < 50%). Despite the weak support, the sister relationship between the ananassae and montium subgroups also has a morphological basis, indicated even by the early studies of Hsu (1949) and Okada (1954). Bock & Wheeler (1972) hypothesized that the ananassae and montium subgroups formed a lineage based on the presence of both a surstylar clasper and cercal clasper (Figs 1 and 2). Ashburner et al. (1984) hypothesized no resolution among the three lineages. Pélandakis et al. (1991), and even Pélandakis & Solignac (1993), presented the relationship: subgenus Drosophila [melanogaster + Asian (montium, ananassae + fima + obscura)]. Despite the paraphyly and the lack of characters within the Pélandakis et al. (1991) and Pélandakis & Solignac (1993) studies, the ananassae and montium subgroups are still seen as having a greater affinity.

Relationships within the melanogaster and Asian subgroups

The melanogaster + Asian subgroups clade seems well supported with a BS value of 3 and a B value of 85%. The relationships of the subgroups within this clade can be summarized as: ((((melanogaster, eugracilis) suzukii) (takahashii, suzukii)) (elegans, suzukii)) ficusphila. The support for these relationships varies but they have relatively low BS (< 3) and B (< 50%) values. Okada (1964) and Bock & Wheeler (1972) each felt that the eugracilis, ficusphila, suzukii and takahashii subgroups had a close affinity due to the hooked setae on the mid-tibiae of males and other characters of the male genitalia. The SMP cladogram does not exactly support this hypothesis because the melanogaster and elegans subgroups are nested within that clade. Mapping the hooked setae character on the SMP cladogram (i.e. eugracilis, ficusphila, suzukii, and takahashii representatives are coded for presence, all other taxa absence) increases the length of the tree by three steps and does not affect the CI value (343 informative characters, vs. the previous 342, therefore L = 1543, CI = 0.35). According to the SMP cladogram, the presence of hooked setae on the mid-tibia may have evolved once at the base of the melanogaster + Asian subgroups clade and was lost twice: once at the node for the melanogaster subgroup clade and a second time at the terminal for the elegans subgroup. Therefore, this character may actually be a synapomorphy for the melanogaster + Asian subgroups lineage.

The SMP cladogram disagrees with previous hypotheses. Hsu (1949) and Okada (1954) placed the suzukii subgroup at the base of the melanogaster group, whereas within the SMP cladogram none of the suzukii representatives are basal either within the melanogaster group or the melanogaster + Asian subgroups clade. Contrary to Bock (1980), the SMP cladogram places melanogaster as sister to the eugracilis subgroup rather than the takahashii subgroup. The electrophoretic and hybridization (breeding) studies of Kim & Lee (1991), Kim et al. (1992), Lee et al. (1993), and Lee et al. (1994) hypothesized a hierarchy of (melanogaster, takahashii) suzukii which was not supported by the SMP cladogram. Also in contrast to the SMP cladogram was the mitochondrial DNA-based scheme of Nigro et al. (1991) for the melanogaster + Asian subgroups: D. eugracilis (D. takahashii (melanogaster subgroup)). However, the SMP cladogram’s subdivisions within the melanogaster + Asian subgroups clade best explain changes in sex comb morphology (see previous discussion for suzukii subgroup).

CONCLUSION

POTENTIAL FOR COMPARATIVE DEVELOPMENTAL EVOLUTION OF MORPHOLOGICAL STRUCTURES

Phylogenies can be used to establish or to test evolutionary hypotheses explaining variation exhibited by a group of organisms. Within the melanogaster species group the male sex comb is a well-documented, highly variable structure. A male sex comb, possessed in varying degrees, is a synapomorphy for the melanogaster and obscura groups within the subgenus Sophophora and unique within the family Drosophilidae (Grimaldi, 1990). Within the melanogaster species group the sex comb character has been used to define species subgroups (e.g. Bock & Wheeler, 1972; Bock, 1980; Toda, 1991). The male sex comb functions in species mate recognition, courtship and copulation (Spieth & Ringo, 1983). By mapping the sex comb character on the current molecular phylogeny a hypothesis can be generated describing sex comb structural transformation.

The sex comb varies in general length, number of tarsal segments and orientation (Fig. 1). Mapping the sex comb types on the SMP phylogeny illustrates the usefulness of this character for diagnosing the subgroups, as it has been used in previous studies (Fig. 1). Loss of sex comb, seen here in D. lucipennis, also occurs in five other melanogaster group species belonging to four subgroups. Homology statements should ideally not be based on loss (absence) of a character; therefore, other character information, including molecular data, will be necessary for phylogenetic reconstruction. Evolution of the sex comb among groups and subgroups is best understood on the basis of two characters – orientation and number of tarsal segments.
When the orientation of the sex comb was mapped (MacClade ver. 3.01, Maddison & Maddison, 1992) on to the SMP phylogeny using the accelerated transformation (ACCTRAN) model of character evolution there were minimally two origins for longitudinal orientation (one for ficusphila and the other for the montium subgroup), or minimally two origins for a horizontal orientation (one at the node shared by the takahashii subgroup-mimetica clade + the elegans subgroup and the other for the ananassae subgroup). There was a reversal in sex comb orientation to the plesiomorphic condition of oblique for the biarmipes-eugracilis-melanogaster subgroup clade. A sex comb occupying two tarsal segments is the plesiomorphic condition for the melanogaster group on the SMP phylogeny, which is not surprising because the outgroup (the obscura species group) shows this state. Employing an ACCTRAN evolutionary hypothesis produces two instances where the sex comb was reduced to a single tarsal segment (one in the biarmipes-eugracilis-melanogaster clade and the other in the derived montium subgroup species, D. nikananu). Drosophila nikananu is a member of a species complex characterized by short sex combs. Due to the similarity of the sex combs to those of the melanogaster subgroup, the nikananu complex has been proposed as most basal within the montium subgroup, linking the montium with the melanogaster subgroup (Tsacas, 1979; Tsacas, 1984; Tsacas & Chassagnard, 1992). However, it is important to note that this 'similar' character differs in orientation and is therefore structurally and phylogenetically convergent. A sex comb occupying three tarsal segments occurs in two lineages: the elegans subgroup and the ananassae subgroup (with a reversal appearing in D. malerkotliana whose sex comb covers only two tarsal segments). Future studies might explore comparative developmental mechanisms involved in sex comb formation.
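The character-mapping arguments in this discussion rest on counting the minimum number of state changes a character requires on a fixed tree. A minimal sketch of such a count, using Fitch parsimony for the hooked mid-tibial setae character discussed earlier, is given below. The subgroup-level topology is a simplification taken from the summary of relationships in the text, with the ananassae + montium lineage placed as sister (an assumption); the function name and nested-tuple tree encoding are illustrative, and this is not the MacClade or PAUP* procedure used in the study.

```python
# Fitch parsimony: minimum number of state changes for one unordered
# character on a fixed, rooted binary tree.
# The tree below is a simplified subgroup-level sketch (assumption),
# not the full 49-taxon SMP cladogram.

def fitch(node, states):
    """Return (state_set, steps) for the subtree rooted at `node`.

    `node` is a leaf name (str) or a 2-tuple of child nodes.
    `states` maps leaf name -> observed character state.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    shared = left_set & right_set
    if shared:
        return shared, left_steps + right_steps
    # No state in common: one change is required at this node.
    return left_set | right_set, left_steps + right_steps + 1


# suzukii is polyphyletic, so its three placements get separate labels.
ingroup = (
    (
        ((("melanogaster", "eugracilis"), "suzukii_1"), ("takahashii", "suzukii_2")),
        ("elegans", "suzukii_3"),
    ),
    "ficusphila",
)
tree = (("ananassae", "montium"), ingroup)

# 1 = hooked setae present, 0 = absent (coding as described in the text).
hooked_setae = {
    "ananassae": 0, "montium": 0, "melanogaster": 0, "elegans": 0,
    "eugracilis": 1, "ficusphila": 1, "takahashii": 1,
    "suzukii_1": 1, "suzukii_2": 1, "suzukii_3": 1,
}

root_states, steps = fitch(tree, hooked_setae)
print(steps)
```

On this simplified tree the count is three steps, which is consistent with the one gain and two losses described for the hooked setae character.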
Studies exist which explain the relation between genes and phenotypic expression in Drosophila (Liu et al., 1996; True et al., 1997; Rutherford & Lindquist, 1998; Zeng et al., 2000). Many genes are involved in sex comb development, including Sex combs reduced (Scr) and cramped (crm). Scr affects the number of teeth in a sex comb (Pattatucci et al., 1991), a feature that varies at the species level within subgroups of the melanogaster group. The crm gene causes proximal-distal transformations that place sex comb teeth on the second and third tarsal segments (Yamamoto et al., 1997), a feature used in some cases to define species subgroups. Reliable evolutionary hypotheses of important structures (e.g. sex comb and Balbiani rings) or biochemical entities (e.g. gene families and transposable elements) are possible only with informative, stable phylogenies like the one presented here for the melanogaster group.

ACKNOWLEDGEMENTS

An NSF Doctoral Dissertation Improvement Grant (NSF DEB 9423508) and two City College Biology Dissertation Grants funded this research. An AMNH-CUNY Doctoral Training Program Fellowship provided a stipend. Research was conducted at the AMNH Molecular Systematics Laboratory. I am most grateful to the following individuals: Rob DeSalle, who supported the molecular work; Dave Grimaldi, for tutelage regarding Drosophila; Carole Griffiths and Gail Simmons, for providing informative commentary; and Steve Thurston, for the figures. I would like to thank Michael Ashburner and two anonymous reviewers who suggested changes that improved this paper.

REFERENCES

Ashburner M, Bodmer M, Lemeunier F. 1984. On the evolutionary relationships of Drosophila melanogaster. Developmental Genetics 4: 295–312.
Baker RH, DeSalle R. 1997. Multiple sources of molecular characters and the phylogeny of Hawaiian drosophilids. Systematic Biology 46: 654–673.
Barrio E, Latorre A, Moya A. 1994. Phylogeny of the Drosophila obscura species group deduced from mitochondrial DNA sequences. Journal of Molecular Evolution 39: 478–488.
Beckenbach AT, Wei YW, Liu H. 1993. Relationships in the Drosophila obscura species group, inferred from mitochondrial cytochrome oxidase II sequences. Molecular Biology and Evolution 10: 619–634.
Bock IR. 1980. Current status of the Drosophila melanogaster species-group (Diptera). Systematic Entomology 5: 341–356.
Bock IR, Wheeler MR. 1972. The Drosophila melanogaster species group. University of Texas Publications 7213: 1–102.
Bremer K. 1988. The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 42: 795–803.
Bremer K. 1994. Branch support and tree stability. Cladistics 10: 295–304.
Brower AVZ. 1994. Phylogeny of Heliconius butterflies inferred from mitochondrial DNA sequences (Lepidoptera: Nymphalidae). Molecular Phylogenetics and Evolution 3: 159–174.
Brower AVZ, Schawaroch V. 1996. Three steps of homology assessment. Cladistics 12: 265–272.
Clark JB, Kidwell MG. 1997. A phylogenetic perspective on P transposable element evolution in Drosophila. Proceedings of the National Academy of Sciences, U.S.A. 94: 11428–11433.
Clark J, Kim PC, Kidwell MG. 1998. Molecular evolution of P transposable elements in the genus Drosophila. 3. The melanogaster species group. Molecular Biology and Evolution 15: 746–755.
Clary DO, Wolstenholme DR. 1985. The mitochondrial DNA molecule of Drosophila yakuba. Nucleotide sequence, gene organization, and genetic code. Journal of Molecular Evolution 22: 252–271.
Danforth BN, Sauquet H, Packer L. 1999. Phylogeny of the bee genus Halictus (Hymenoptera: Halictidae) based on parsimony and likelihood analyses of nuclear EF-1 alpha sequence data. Molecular Phylogenetics and Evolution 13: 605–618.
Daniels SB, Chovnick A, Boussy I. 1990.
Distribution of hobo transposable elements in the genus Drosophila. Molecular Biology and Evolution 7: 589–606.
Davis JI, Nixon KC. 1992. Populations, genetic variation, and the delimitation of phylogenetic species. Systematic Biology 41: 421–435.
Dayhoff MO. 1978. Atlas of protein sequence and structure. Silver Spring, Maryland: National Biomedical Research Foundation.
DeSalle R, Freedman T, Prager EM, Wilson AC. 1987. Tempo and mode of sequence evolution in mitochondrial DNA of Hawaiian Drosophila. Journal of Molecular Evolution 26: 157–164.
DeSalle R, Williams AK, George M. 1993. Isolation and characterization of animal mitochondrial DNA. In: Zimmer EA, White TJ, Cann RL, Wilson AC, eds. Methods in enzymology. London: Academic Press, 176–204.
Doyle JJ. 1992. Gene trees and species trees: molecular systematics as one-character taxonomy. Systematic Botany 17: 144–163.
Drosopoulou E, Scouras ZG. 1995. The β-tubulin gene family evolution in the Drosophila montium subgroup of the melanogaster species group. Journal of Molecular Evolution 41: 293–298.
Eriksson T. 1997. Autodecay 2.9.8. Stockholm: Stockholm University, Botaniska Institutionen.
Farris JS, Kallersjo M, Kluge AG, Bult C. 1994. Testing significance of congruence. Cladistics 10: 315–320.
Farris JS, Kallersjo M, Kluge AG, Bult C. 1995. Constructing a significance test for incongruence. Systematic Biology 44: 570–572.
Felsenstein J. 1985. Confidence limits of phylogenies: an approach using the bootstrap. Evolution 39: 783–791.
Gatsey J, DeSalle R, Wheeler WC. 1994. Alignment-ambiguous nucleotide sites and the exclusion of systematic data. Molecular Phylogenetics and Evolution 2: 152–157.
Graybeal A. 1998. Is it better to add taxa or characters to a difficult phylogenetic problem? Systematic Biology 47: 9–17.
Grimaldi DA. 1990. A phylogenetic, revised classification of genera in the Drosophilidae (Diptera). Bulletin of the American Museum of Natural History 197: 1–139.
Hasegawa M, Kishino H, Yano T. 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution 21: 160–174.
Hillis DM. 1987. Molecular versus morphological approaches to systematics. Annual Review of Ecology and Systematics 18: 23–42.
Hillis DM. 1998. Taxonomic sampling, phylogenetic accuracy and investigator bias. Systematic Biology 47: 3–8.
Hillis DM, Mable BK, Moritz C. 1996. Applications of molecular systematics. The state of the field and a look to the future. In: Hillis DM, Moritz C, Mable BK, eds. Molecular systematics, 2nd edn. Sunderland, MA: Sinauer Associates Inc., 515–543.
Hsu TC. 1949. The external genital apparatus of male Drosophilidae in relation to systematics. University of Texas Publications 4920: 80–142.
Inomata N, Tachida H, Yamazaki T. 1997a. Molecular evolution of the Amy multigenes in the subgenus Sophophora of Drosophila. Molecular Biology and Evolution 14: 942–950.
Inomata N, Tachida H, Yamazaki T. 1997b. Molecular evolution of the Amy multigenes in the subgenus Sophophora of Drosophila. Molecular Biology and Evolution 14: 1338.
Kim NW, Lee TJ. 1991. Genetic relationships among the eight species of the Drosophila melanogaster species group by allozyme analysis. Korean Journal of Genetics 13: 297–309.
Kim NW, Lee TJ, Song ES. 1992. Evolutionary genetic study on the eight species of the Drosophila melanogaster group from Korea: reproductive isolation and protein analysis. Korean Journal of Zoology 35: 211–218.
Kluge AG. 1989. A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes). Systematic Zoology 38: 7–25.
Kohler RE. 1994. Lords of the fly: Drosophila genetics and the experimental life. Chicago: University of Chicago Press.
Lachaise D, Cariou M-L, David JR, Lemeunier F, Tsacas L, Ashburner M. 1988. Historical biogeography of the Drosophila melanogaster species subgroup.
In: Hecht MK, Wallace B, Prance GT, eds. Evolutionary biology. New York: Plenum Press, 159–225.
Lawrence PA. 1992. The making of a fly: the genetics of animal design. Oxford: Blackwell Scientific Publications.
Lecointre GH, Philippe H, Vân Lê HL, Guyader HL. 1993. Species sampling has a major impact on phylogenetic inference. Molecular Phylogenetics and Evolution 2: 205–224.
Lee TJ, Hong KJ, Kim NW. 1994. Genetic relationships and protein variations during development within the Drosophila melanogaster species group. II. Analysis of soluble protein by 2DE. Korean Journal of Zoology 37: 249–254.
Lee TJ, Hong KJ, Song ES. 1993. Genetic relationships and protein variations during development within the Drosophila melanogaster species group. I. Analysis of soluble protein by SDS-PAGE. Korean Journal of Genetics 15: 269–276.
Lemeunier F, David JR, Tsacas L, Ashburner M. 1986. The melanogaster species group. In: Ashburner M, Carson HL, Thompson JN, eds. The genetics and biology of Drosophila vol. 3e. London: Academic Press, 147–256.
Liu H, Beckenbach AT. 1992. Evolution of the mitochondrial cytochrome oxidase II gene among ten orders of insects. Molecular Phylogenetics and Evolution 1: 41–52.
Liu J, Mercer JM, Stam LF, Gibson GC, Zeng Z-B, Laurie CC. 1996. Genetic analysis of a morphological shape difference in the male genitalia of Drosophila simulans and D. mauritiana. Genetics 142: 1129–1145.
Maddison DR, Maddison WP. 1992. MacClade 3.01. Analysis of phylogeny and character evolution. Sunderland, Massachusetts: Sinauer Associates.
Mavragani-Tsipidou P, Zambetaki A, Kleanthous K, Pangou E, Scouras ZG. 1994. Cytotaxonomic differentiation of the Afrotropical Drosophila montium subgroup: D. diplacantha and D. seguyi. The major role of reverse tandem duplications. Genome 37: 935–944.
Mickevich M, Farris S. 1981. The implications of congruence in Menidia. Systematic Biology 30: 351–370.
Miller JS, Brower AVZ, DeSalle R. 1997. Phylogeny of the neotropical moth tribe Josiini (Notodontidae: Dioptinae). Evidence from DNA sequences and morphology. Biological Journal of the Linnean Society 60: 297–316.
Nigro L, Solignac M, Sharp PM. 1991. Mitochondrial sequence divergence in the melanogaster and Oriental species subgroups of Drosophila. Journal of Molecular Evolution 33: 156–162.
Okada T. 1954. Comparative morphology of the drosophilid flies. I. Phallic organs of the melanogaster group. Kontyû 22: 36–46.
Okada T. 1964. Drosophilidae (Diptera) of Southeast Asia collected by the Thai-Japanese Biological Expedition 1961–62. In: Kira T, Umesao T, eds. Nature and life in southeast Asia. Kyoto: Fauna and Flora Research Society, 439–466.
Olmstead RG, Sweere JA. 1994. Combining data in phylogenetic systematics: An empirical approach using three molecular data sets in the Solanaceae. Systematic Biology 43: 467–481.
Parkash R, Iyoutsna, Vanda [sic]. 1994. Allozyme phylogeny of five species of takahashii species subgroup of Drosophila. Korean Journal of Genetics 16: 187–196.
Pattatucci AM, Ottenson DC, Kaufman T. 1991. A functional and structural analysis of the sex combs reduced locus of Drosophila melanogaster. Genetics 129: 423–441.
Pélandakis M, Higgins DG, Solignac M. 1991. Molecular phylogeny of the subgenus Sophophora of Drosophila derived from the large subunit of ribosomal RNA sequences. Genetica 84: 87–94.
Pélandakis M, Solignac M. 1993. Molecular phylogeny of Drosophila based on ribosomal RNA sequences. Journal of Molecular Evolution 37: 525–543.
de Pinna MCC. 1991. Concepts and tests of homology in the cladistic paradigm. Cladistics 7: 367–394.
Platnick NI, Griswold CE, Coddington JA. 1991. On missing entries in cladistic analysis. Cladistics 7: 337–343.
Poe S. 1998. Sensitivity of phylogenetic estimation to taxonomic sampling. Systematic Biology 47: 18–31.
Powell JR, DeSalle R. 1995. Drosophila molecular phylogenies and their uses. Evolutionary Biology 28: 87–138.
Remsen J, DeSalle R. 1998. Character congruence of multiple data partitions and the origin of the Hawaiian Drosophilidae. Molecular Phylogenetics and Evolution 9: 225–235.
Rutherford SL, Lindquist S. 1998. Hsp90 as a capacitor for morphological evolution. Nature 396: 336–342.
Sanderson MJ, Donoghue MJ. 1989. Patterns of variation in levels of homoplasy. Evolution 43: 1781–1795.
Schawaroch VA. 2000. Molecular phylogeny of the Drosophila melanogaster species group with special emphasis on the montium subgroup. PhD Thesis, The City University of New York. XII, 339.
Scouras ZG. 1995. The Drosophila montium subgroup species: recent cytogenetic, molecular, development and evolutionary studies. Bios (Thessaloniki) 3: 125–158.
Spieth HT, Ringo JM. 1983. Mating behavior and sexual isolation in Drosophila vol 3c. In: Ashburner M, Carson HL, Thompson JN, eds. The genetics and biology of Drosophila. New York: Academic Press, 223–341.
Strimmer K, von Haeseler A. 1999. PUZZLE. 4.0.2.
Sturtevant AH. 1942. The classification of the genus Drosophila, with descriptions of nine new species. University of Texas Publications 4213: 5–51.
Sullivan W, Ashburner M, Hawley RS. 2000. Drosophila protocols. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press.
Swofford DL. 2000. PAUP*. Phylogenetic Analysis Using Parsimony (*and other methods), version 4. Sunderland, MA: Sinauer Associates, Inc.
Tanda S, Shrimpton AE, Ling-Ling C, Itayama H, Matsubayashi H, Saigo K, Tobari YN, Langley CH. 1988. Retrovirus-like features and site specific insertions of a transposable element, Tom, in Drosophila ananassae. Molecular Genetics 214: 405–411.
Thomas RH, Hunt JA. 1993. Phylogenetic relationships in Drosophila. A conflict between molecular and morphological data. Molecular Biology and Evolution 10: 362–374.
Throckmorton LH. 1975. The phylogeny, ecology, and geography of Drosophila. In: King RC, ed. Handbook of genetics. New York: Plenum, 421–469.
Toda MJ. 1991. Drosophilidae (Diptera) in Myanmar (Burma) VII. The Drosophila melanogaster species-group, excepting the D. montium species-subgroup. Oriental Insects 25: 69–94.
True JR, Liu J, Stam LF, Zeng Z-B, Laurie CC. 1997. Quantitative genetic analysis of divergence in male secondary sexual traits between Drosophila simulans and Drosophila mauritiana. Evolution 51: 816–832.
Tsacas L. 1979. Contribution des données africaines à la compréhension de la biogéographie et de l’évolution du sous-genre Drosophila (Sophophora) Sturtevant (Diptera, Drosophilidae). Compté rendu des séances de la Société de biogéographie 48: 29–51.
Tsacas L. 1984. Nouvelles données sur la biogéographie et l’évolution du groupe Drosophila melanogaster en Afrique. Description de six nouvelles espèces. (Diptera, Drosophilidae). Annales de la Societe Entomologique de France (N.S.) 20: 419–438.
Tsacas L, Chassagnard M-T. 1992. Le complex Drosophila nikananu: description d’une nouvelle espèce africaine et analyse de quelques caracters morphologiques du groupe melanogaster (Diptera, Drosophilidae). Nouvelle Revue d’Entomologie (N. S.) 8: 385–398.
Tsacas L, David J. 1978. Systematics and biogeography of the Drosophila kikkawai-complex, with descriptions of new species (Diptera, Drosophilidae). Annales de la Societe Entomologique de France (N.S.) 13: 675–693 (published in 1977).
Watanabe TK, Kawanishi M. 1983. Stasipatric speciation in Drosophila. Japanese Journal of Genetics 58: 269–274.
Wheeler WC. 1993. The triangle inequality and character analysis. Molecular Biology and Evolution 10: 707–712.
Wheeler WC. 1995. Sequence alignment, parameter sensitivity, and the phylogenetic analysis of molecular data. Systematic Biology 44: 321–331.
Whiting MF, Carpenter JC, Wheeler QD, Wheeler WC. 1997. The Strepsiptera problem: phylogeny of the holometabolous insect orders inferred from 18S and 28S ribosomal DNA sequences and morphology. Systematic Biology 46: 1–68.
Yamamoto Y, Girard F, Bello B, Affolter M, Gehring WJ. 1997. The cramped gene of Drosophila is a member of the Polycomb-group, and interacts with mus209, the gene encoding proliferating cell nuclear antigen. Development 124: 3385–3394.
Zeng Z-B, Liu J, Stam LF, Kao C-H, Mercer JM, Laurie CC. 2000. Genetic architecture of a morphological shape difference between two Drosophila species. Genetics 154: 299–310.

APPENDIX A

Species with culture numbers for all of the stocks used in this study. All species were obtained from the National Drosophila Species Resource Center at Bowling Green with the exception of D. teissieri Brazzaville isofemale line 16, which was a gift from D. Lachaise to G. Simmons. All species identifications were confirmed based on male genitalic dissections. The following three species were originally mislabelled by the National Drosophila Species Resource Center at Bowling Green. D. ficusphila was incorrectly labelled as D. pennae 14028–0631.0. D. ercepeae was incorrectly labelled as D. greeni 14028–0712.0. D. greeni was incorrectly labelled as D. ercepeae 14024–0432.0. Representatives of all taxa sampled have been placed in the collections at the American Museum of Natural History. The species D. rajasekari (14023–0361.3) was ordered from the stock centre; however, D. rajasekari Reddy & Krishnamurthy, 1968 and D. raychaudhurii Gupta, 1969 were made junior synonyms of D. biarmipes Malloch, 1924 by Bock (1980). D. jambulina Parshad & Paika, 1964 has been found to be an Indian endemic species. Collections made in Indochina (Thailand and Cambodia, e.g. D. jambulina 14028–0531.1) are actually D. watanabei Gupta & Gupta, 1992 (see Table A1).
Table A1.

Species                 Culture/Stock
D. ambigua              14011–0091.0
D. bifasciata           14012–0181.0
D. pseudoobscura        14011–0121.0
D. persimilis           14011–0111.0
D. affinis              14012–0141.0
D. tolteca              14012–0210.0
D. elegans              14027–0461.0
D. eugracilis           14026–0451.0
D. ficusphila           misID 14028–0631.0
D. lucipennis           14023–0331.0
D. mimetica             14023–0341.0
D. biarmipes            14023–0361.3
D. teissieri            gift D. Lachaise
D. yakuba               14021–0261.0
D. takahashii           14022–0311.5
D. lutescens            14022–0271.0
D. paralutea            14022–0281.0
D. prostipennis         14022–0291.0
D. ananassae            14024–0371.0
D. ercepeae             misID 14028–0712.0
D. m. malerkotliana     14024–0391.0
D. pallidosa            14024–0433.0
D. phaeopleura          14024–0434.0
D. varians              14024–0431.0
D. auraria              14028–0471.1
D. barbarae             14028–0491.2
D. baimaii              14028–0481.1
D. biauraria            14028–0501.0
D. bicornuta            14028–0511.0
D. birchii              14028–0521.0
D. diplacantha          14028–0586.0
D. greeni               misID 14024–0432.0
D. watanabei            14028–0531.1
D. kanapiae             14028–0541.0
D. kikkawai             14028–0561.3
D. lini                 14028–0581.0
D. mayri                14028–0591.0
D. nikananu             14028–0601.0
D. orosa                14028–0611.0
D. parvula              14028–0621.0
D. punjabiensis         14028–0641.0
D. rufa                 14028–0661.0
D. quadraria            14028–0651.0
D. seguyi               14028–0671.0
D. serrata              14028–0681.0
D. triauraria           14028–0691.0
D. tsacasi              14028–0701.0
D. vulcana              14028–0711.0

APPENDIX B

HB CHARACTER ASSIGNMENT

By aligning the DNA or amino acid sequence, molecular systematists establish topological identity for the primary/putative homology statement (de Pinna, 1991; Brower & Schawaroch, 1996). The total length of the hb sequence varied from 513 bp in D. bifasciata to 456 bp in D. takahashii and D. elegans. This made the alignment of the hb gene region more complicated than that of the Adh and mt:CoII regions, which had no indels (insertions or deletions).

Alignment of hb

It was necessary to convert the hb nucleotide sequence to amino acid sequence for recognition of homology (i.e. topological identity sensu Brower & Schawaroch, 1996). Alignments were performed on hb amino acid sequences, using the Clustal method in MEGALIGN (DNASTAR, version 1.02). To determine alignment ambiguous sites (Gatsey et al., 1994) the cost parameters were set as follows: (1) the gap length penalty was set at a value of 10; (2) the amino acid change cost was according to the PAM250 residue weight table (Dayhoff, 1978); and (3) the gap penalty value varied from 8 to 30.

Evaluating the alignment

Three stretches of the amino acid sequence exhibited alignment ambiguity (positions 5–20, 106–113, and 151–166 in Table B1). Future investigations with increased taxon sampling may resolve the ambiguity in the third stretch (amino acids 151–166), which would then become an excellent source of characters. It is interesting to note that the hypervariable region dominated by Q's and H's at amino acid positions 30–51 was not alignment ambiguous. This may reflect the low alignment cost to switch between Q and H (a value of 2 for a range from 0 to 22). The multiple repeats of Q's and H's most probably occurred by a slippage mechanism and their putative homology statements in this region seem questionable at best. This region's alignment, however, was conserved across all the parameters tested. Therefore, this region remained in the matrix for analysis.
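The identification of alignment ambiguous sites described above can be sketched as a comparison of alignments obtained under different gap penalties: a column is retained only if the same residues are aligned together in every run. The sketch below assumes that criterion; the function names and toy sequences are hypothetical, and it is neither the MEGALIGN procedure nor the exact protocol of Gatsey et al. (1994).

```python
def column_homologies(alignment):
    """Describe each column by the residue index it holds in every sequence.

    `alignment` is a list of equal-length gapped strings, one per taxon,
    in the same taxon order for every alignment. A gap is recorded as None.
    """
    counters = [0] * len(alignment)
    columns = []
    for col in range(len(alignment[0])):
        key = []
        for row, seq in enumerate(alignment):
            if seq[col] == "-":
                key.append(None)
            else:
                key.append(counters[row])
                counters[row] += 1
        columns.append(tuple(key))
    return columns


def stable_columns(reference, others):
    """Return indices of reference columns that recur in all other alignments."""
    other_sets = [set(column_homologies(aln)) for aln in others]
    stable = []
    for idx, key in enumerate(column_homologies(reference)):
        if all(key in cols for cols in other_sets):
            stable.append(idx)
    return stable


# Toy example (hypothetical sequences): two alignments of the same pair of
# sequences that differ only in where the single gap is placed.
run_a = ["QQHQS", "QQ-QS"]
run_b = ["QQHQS", "Q-QQS"]

kept = stable_columns(run_a, [run_b])
print(kept)
```

In this toy case the first column and the last two columns are reproduced in both runs, while the two columns around the shifting gap would be flagged as alignment ambiguous.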
hb sequence used in phylogenetic analysis

After removal of the alignment ambiguous sites, the remaining aligned hb amino acid sequence was reconverted to nucleotide sequence in an effort to maximize possible character information. The aligned hb nucleotide sequence, now 441 bp long, was inserted back as primary data in the matrix (Table B1).

GAP CODING

Gaps within molecular sequence have traditionally been coded as question marks. Morphological characters coded by a question mark can be the result of one of three conditions: the character is ambiguous, inapplicable or missing (Platnick et al., 1991). In this study gaps were neither ambiguities (due to polymorphisms) nor missing (stretches of DNA not sequenced) but rather were inapplicable (the taxon does not have the structure [stretch

1 ambigua persimilis pseudoobscura affinis bifasciata tolteca diplacantha watanabei punjabiensis greeni kanapiae parvula seguyi vulcana nikananu kikkawai lini serrata tsacasi orosa auraria triauraria rufa quadraria biauraria barbarae birchii mayri bicornuta baimaii ananassae phaeopleura malerkotliana pallidosa varians ercepeae ficusphila elegans paralutea prostipennis takahashii lutescens lucipennis mimetica biarmipes eugracilis yakuba teissieri melanogaster SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SLAS SLAS SLAS SLAS SLAS SLTS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS SVAS * * GSPSPRQSPLPSP--GSPSPRQSPLPSP--GSPSPRQSPLXSP--GSPSPRQSPLPSP--GSPSPRQSPLASP--GSPSPRQSPLPSP-----SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAX ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA
---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLPA ---IPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPLPSPLAA ---SPRQSPIPSPMNP ---SPRQSPIPSPMNP ---SPRQSPIPSPMNP ---SPRQSPIPSPMNP ---SPRQSPIPSPLNP ---SPRQSPIPSPLNP ---SPRQSPIPS------SPRQSPIPS------SPRQSPIPS------SPRQSPIPS------SPRQSPIPS------SPRQSPIPS------SPRQSPIPS------SPRQSPIPS------SPRQSPIPS------SPRQSPIPS------SPRQSPIPS------SPRQSPIPS------SPRQSPIPS---¨æAmbiguousæÆ * * * * * * * 97 GNHLEQYLKQQQQQ--HHQQQQLQ-----QQPMDTLCGAAMTPSPSQNDQNSLQHFDVTLHQQLLQQQQYQQHFQAA GNHLEQYLKQQQQQ--HHQQQQLQ-----QQPMDTLCGAAMTPSPSQNDQNSLQHFDVTLQQQLLQQQQYQQHFQAA GNHLEQYLKQQQQQ--HHQQQQLQ-----QQPMDTLCGAAMTPSPSQNDQNSLQHFDVTLQQQLLQQQQYQQHFQAA GNHLEQYLKQQQQ----HQQQQLQ-----QQPMDTMCGAAMTPSPNQNDQNSLQHFDVTLQQQLLQQQQYQQHFQAA GNHLEQYLKQQQQQQQHQHQQQLQ-----QQPMDTLCGAAMTPSPSQNDQNSLQHFDVTLQQQLLQQQQYQQHFQAA GNHLEQYLKQQQHQQQ-QQQQQLQ-----QQPMDTMCGAAMTPSPSQNDQNSLQHFDVTLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-HHHQQQQQQQQ-HQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-QHHQQQQQ----HQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-QHHQQQQ-----HQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-HHHQQQQQQQQQHQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-HHHQQQQ-----HQTHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA NSQLEQFLKQQHHHQQQQQ-----HQTHQQQPMDTMC--TMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQQQHHHQQQ---QQHQSHQQQPMDXMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-HHHQQQQ---QQHQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA NSQLEQFLKQQQHH----QQQQQQHQSHQQQPMDTMC--AMTPSPSQXDQNSLQHFDATLQQQFLQQQQYQQHFQAA SSQLEQFLKQQ-QHHQQQQ---QHHQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-HHHQQQQ---QQHQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-QHHQQQQQQQQQHQSHHQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-HHHQQQQ---QQHQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDGTLQQQLLQQQQYQQHFQAA NSQLEQFLKQQQHHHQQQQQQQQQHQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA 
SSQLEQFLKQQQHH-QQQQ---QQHQPHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQQHH-QQQQ---QQHQPHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA GSQLEQFLKQQQHH-QQQQ---QQHQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQQHH-QQQQ---QQHQPHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQQHHHQQQQ---QQHQPHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-QHHQQQ------HQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQQHHHHQQQQ----HQSQQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQHHH---QQHQEQQHQSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQ-HHQQQQQQQQQQHQSHQQQLMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA SSQLEQFLKQQQHHQQQQQHQ---HPSHQQQPMDTMC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA GNQLEQFLKQQ-HHQQQ------------QQPMDTLC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA GNQLEQFLKQQ-HHQQQ----------HQQQPMDTLC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA GNQLEQFLKQQ-QSHHQ----------QQQQPMDTLC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA GNQLEQFLKQQ-HHQQQ------------QQPMDTLC--AMTPSPSQNDQNSLQHFDATLQQQLLQQQQYQQHFQAA ANQLEQFLKQQQHHHQQ----------QQQQPMDTLC--AMTPSPSQNDQNSLQHFDATLQQQILQQQQYQQHFQAA GNQLEQFLKQQ-HQQHH----------HQQQPMDTLC--AMTPSPSQNDQNSLQHFDATLQQQLMQQQQYQQHFQAA TNHLEQFLKQQQQQ-------------HQQQPMDTLC--AMTPSPSQNDQNSLQHYDANLQQQLLQQQQYQQHFQAA TNHLEQFLKQQQ---------------HQQQPMDTLC--AMTPSPSQNDQNSLQHYDAGLQQQLLQQQQYQQHFQAA TSHLEQFLKQQQQ--------------HQQQPMDTLC--AMTPSPSQNDQNSLQHYDASLQQQLLQQQQYQQHFQAA TSHLEQFLKQQQQ--------------HQQQPMDTLC--AMTPSPSQNDQNSLQHYDASLQQQLLQQQQYQQHFQAA TNHLEQFLKQQQHQ--------------QQQPMDTLC--AMTPSPSQNDQNSLQHYDASLQQQLLQQQQYQQHFQAA TNHLEQFLKQQQ---------------HQQQPMDTLC--AMTPSPSQNDQNSLQHYDASLQQQLLQQQQYQQHFQAA TNHLEQFLKQQHQQ--------------QQQPMDTLC--AMTPSPSQNDQNSLQHYDASLQQQLLQQQQYQQHFQAA TNHLEQFLKQQHHQ-------------QQQQPMDTLC--AMTPSPSQNDQNSLQXYDANLQQQLLQQQQYQQHFQAA TNHLEQFLKQQQ---------------HQQQPMDTLC--AMTPSPSQNDQNSLQHYDANLQQQLLQQQQYQQHFQAA 
TNHLEQFLKQQHQQ--------------QQQPMDTLC--AMTPSPSQNDQNSLQHYDANLQQQLLQQQQYQQHFQAA TNHLEQFLKQQQQQQ------------HQQQPMDTLC--AMTPSPSQNDQNSLQHYDASLQQQLLQQQQYQQHFQAA TNHLEQFLKQQQQQQ------------HQQQPMDTLC--AMTPSPSQNDQNSLQHYDASLQQQLLQQQQYQQHFQAA TNHLEQFLKQQQQQL-------------QQQPMDTLC--AMTPSPSQNDQNSLQHYDANLQQQLLQQQQYQQHFQAA V. SCHAWAROCH © 2002 The Linnean Society of London, Biological Journal of the Linnean Society, 2002, 76, 21–37 Species 34 Table B1a. Alignment of hb amino acid sequence for 49 taxa. The exemplar alignment is the one that resulted from using a gap penalty value of 8 98* QQQQQQQA QQQQQQQA QQQQQQQA QQQQQQQA QQQQQQQA QQQQQQ-A QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----QQQ----- HHHHHHLG HHHHHHLG HHHHHHLG HHHHHHLG HHHHHHLG HHHHHHLG HHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHLHHHHHHL¨ Amb. 
Æ * * * * LGGFNPLTPPGLPNPMQHFYAGNLGRPSPQPTPTATQ LGGFNPLTPPGLPNPMQHFYAGNLGRPSPQPTPTATQ LGGFNPLTPPGLPNPMQHFYAGNLGRPSPQPTPTATQ LGGFNPLTPPGLPNPMQHFYAGNLGRPSPQPTPTATQ LGGFNPLTPPGXPNPMQHFYAGNLGRPSPQPTPTATQ LGGFNPLTPPGLPNPMQHFYAGNLGRPSPQPTPTATQ MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPTPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGSL-RPSPQPTPTT-MTGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTT-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTN-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTN-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTN-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTN-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGSL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGSL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGTL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGSL-RPSPQPTPTA-MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTAMA MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTAPS MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTSAS MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTSAS MGGFNPLTPPGLPNPMQHFYGGSL-RPSPQPTPTSAS MGGFNPLTPPGLPNPMQHFYGGSL-RPSPQPTPTSAS MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTSVA MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTSAS MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTSVS MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTAAA MGGFNPLTPPGXPNPMQHFYGGNL-RPSPQPTPTSAS MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTSVS MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTSAS MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTSAS MGGFNPLTPPGLPNPMQHFYGGNL-RPSPQPTPTSAS * * * 187 VVAPTQV--------G EKLQALTPPMDVTPPKSPAKS VVAPTQV--------G EKLQALTPPMDVTPPKSPAKS VVAPTQV--------G EKLQALTPPMDVTPPKSPAKA VVAPTQV--------G EKLQALTPPMDVTPPKSPAKS VVAPTQV--------G 
EKLQALTPPMDVTPPKSPAKS VVAPTQV--------G EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS DKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -GVAVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -GVAVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -GVAVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -GVAVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -GVAVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AVA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-AIA---PVAVATS EKLQALTPPMDVTPPKSPAKS -G-TVA---TVAVATS EKLQALTPPMDVTPPKSPAKS PSAA-----SVTSTTS EKLQALTPPMDVTPPKSPAKS PSAA-----SVTSATS EKLQALTPPMDVTPPKSPAKS PSAA-----SVTSATS EKLQALTPPMDVTPPKSPAKS PSAA-----SVTSTTS EKLQALTPPMDVTPPKSPAKS SSAA-----PVTTATS EKLQALTPPMDVTPPKSPAKS AGTAVA---AGTAVTS EKLQALTPPMDVTPPKSPAKS TVAS---AVPVGSATS EKLQALTPPMDVTPPKSPAKS TIAPVAVPN-GTS--- EKLQALTPPMDVTPPKSPAKS AVAPVALATGSSSSSS EKLQALTPPMDVTPPKSPAKS XVAPXAXATGSSSSS- EKLQALTPPMDVTPPKSPAKS APVAIA-----SSNNS EKLQALTPPMDVTPPKSPAKS AVAPVAIATGSSSS-- EKLQALTPPMDVTPPKSPAKS AVAPVAVA-NGTS--- EKLQALTPPMDVTPPKSPAKS T-APIAVPTSSSNSSS EKLQALTPPMDVTPPKSPAKS SVAPVAVANGGSSS-- EKLQALTPPMDVTPPKSPAKS TVAPVAVAASSSS--- EKLQALTPPMDVTPPKSPAKS TVAPVAVAT-GSS--- EKLQALTPPMDVTPPKSPAKS TVAPVAVAT-GSS--- EKLQALTPPMDVTPPKSPAKS TIAPVAVAT-GSS--- EKLQALTPPMDVTPPKSPAKS ¨æ Ambiguous æÆ 35 ambigua persimilis pseudoobscura affinis bifasciata tolteca diplacantha watanabei punjabiensis greeni kanapiae parvula seguyi vulcana nikananu kikkawai lini serrata tsacasi orosa auraria triauraria rufa quadraria biauraria barbarae birchii 
mayri bicornuta baimaii ananassae phaeopleura malerkotliana pallidosa varians ercepeae ficusphila elegans paralutea prostipennis takahashii lutescens lucipennis mimetica biarmipes eugracilis yakuba teissieri melanogaster *

Table B1b. Continued

Table B2. Gaps at amino acid positions 58–59 and 138 were recoded as characters then reinserted as the primary data in the matrix. Each of these amino acid sequences is present in the six outgroup taxa and is lost in the taxa of the melanogaster group. Thus this deletion supports the hypothesis of ingroup monophyly. In this instance, the presence of a gap is an informative character. There is no variation exhibited in the size of the gap; therefore, the deletion producing the gap may have occurred only once. These data were condensed into a single unordered binary or multistate character.

                 Positions 58–59                  Position 138
Species          Sequence  Coding A  Matrix B     Sequence  Coding A  Matrix B
ambigua          GGGGCA    000000    0            GGT       000       0
persimilis       GGGGCA    000000    0            GGT       000       0
pseudoobscura    GGGGCA    000000    0            GGT       000       0
affinis          GGGGCG    000001    1            GGT       000       0
bifasciata       GGGGCA    000000    0            GGT       000       0
tolteca          GGGGCG    000001    1            GGT       000       0
paralutea        ------    111112    2            ---       111       1
prostipennis     ------    111112    2            ---       111       1
takahashii       ------    111112    2            ---       111       1
lutescens        ------    111112    2            ---       111       1
lucipennis       ------    111112    2            ---       111       1
mimetica         ------    111112    2            ---       111       1
biarmipes        ------    111112    2            ---       111       1
ananassae        ------    111112    2            ---       111       1
varians          ------    111112    2            ---       111       1
phaeopleura      ------    111112    2            ---       111       1
greeni           ------    111112    2            ---       111       1
malerkotliana    ------    111112    2            ---       111       1
pallidosa        ------    111112    2            ---       111       1
eugracilis       ------    111112    2            ---       111       1
teissieri        ------    111112    2            ---       111       1
yakuba           ------    111112    2            ---       111       1
melanogaster     ------    111112    2            ---       111       1
bicornuta        ------    111112    2            ---       111       1
diplacantha      ------    111112    2            ---       111       1
watanabei        ------    111112    2            ---       111       1
punjabiensis     ------    111112    2            ---       111       1
seguyi           ------    111112    2            ---       111       1
vulcana          ------    111112    2            ---       111       1
nikananu         ------    111112    2            ---       111       1
auraria          ------    111112    2            ---       111       1
barbarae         ------    111112    2            ---       111       1
birchii          ------    111112    2            ---       111       1
kikkawai         ------    111112    2            ---       111       1
lini             ------    111112    2            ---       111       1
quadraria        ------    111112    2            ---       111       1
serrata          ------    111112    2            ---       111       1
triauraria       ------    111112    2            ---       111       1
tsacasi          ------    111112    2            ---       111       1
baimaii          ------    111112    2            ---       111       1
biauraria        ------    111112    2            ---       111       1
kanapiae         ------    111112    2            ---       111       1
mayri            ------    111112    2            ---       111       1
orosa            ------    111112    2            ---       111       1
parvula          ------    111112    2            ---       111       1
rufa             ------    111112    2            ---       111       1
ercepeae         ------    111112    2            ---       111       1
elegans          ------    111112    2            ---       111       1
ficusphila       ------    111112    2            ---       111       1

of DNA] which possesses the characters). Previously, Wheeler (1993) and Danforth et al. (1999) have evaluated gaps with respect to alignment context. Gaps at two stretches (amino acid positions 58–59 and 138) were coded as a character because they are flanked by conserved sequence and appear to convey grouping information. The remaining gaps were coded in the traditional method as question marks.

COMBINATION GAP CODING

Evaluating gap coding methods

The gaps treated as characters were coded as binary or multistate characters according to matrix B (Table B2). The one-to-one correspondence between nucleotide characters and coded characters in ‘matrix A’ could inflate the character information, because each gap is scored as several characters although it most probably occurred as a single event. In contrast, ‘matrix B’ summarizes the nucleotide variation within a single coded character. Coding by either matrix produces the same tree topology. The effects of the coding methods (i.e. traditional coding of all gaps as question marks and the combination gap coding presented here) on resulting tree topologies were compared for the combined data (i.e. mt:CoII + Adh + hb) and for the hb data alone. Tree topology was unaffected by treating gaps as either all ‘missing’ or as a combination of ‘missing’ and 5th state.
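The difference between the two codings in Table B2 can be illustrated with a short sketch. This is not the procedure used in the study; the function names `code_matrix_a` and `code_matrix_b` are hypothetical, and the scoring rule (bases numbered 0, 1, ... in order of first appearance, with the gap taking the next state) is an assumption inferred from the values shown in Table B2.

```python
def code_matrix_a(segments):
    """Site-by-site coding (assumed rule for 'matrix A').

    At each site, bases are numbered 0, 1, ... in order of first
    appearance and a gap receives the next unused state.
    """
    n_sites = len(segments[0])
    coded = [""] * len(segments)
    for site in range(n_sites):
        bases = []
        for seg in segments:
            if seg[site] != "-" and seg[site] not in bases:
                bases.append(seg[site])
        gap_state = len(bases)
        for i, seg in enumerate(segments):
            state = gap_state if seg[site] == "-" else bases.index(seg[site])
            coded[i] += str(state)
    return coded


def code_matrix_b(segments):
    """Whole-segment coding (assumed rule for 'matrix B').

    Each distinct sequence variant is one state, and the gap is the
    final state, so the deletion counts as a single character.
    """
    variants = []
    for seg in segments:
        if seg.strip("-") and seg not in variants:
            variants.append(seg)
    return [variants.index(seg) if seg.strip("-") else len(variants)
            for seg in segments]


# Six outgroup segments (positions 58-59) in Table B2 order, plus one
# representative ingroup taxon carrying the gap.
segments = ["GGGGCA", "GGGGCA", "GGGGCA", "GGGGCG", "GGGGCA", "GGGGCG", "------"]
matrix_a = code_matrix_a(segments)
matrix_b = code_matrix_b(segments)
```

Under matrix A the six-base deletion contributes six characters that all separate ingroup from outgroup, whereas under matrix B it contributes one, which is the inflation concern raised above.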
CONCLUSION

Although these results show that combination gap coding did not alter tree topology relative to the traditional coding method (gaps as ‘missing’), combination coding reflects the information conveyed by the gaps present in the hb sequence. Matrix B summarizes the numerical coding of matrix A under the assumption that these gaps were single events. All phylogenetic analyses in this study used combination coding of the gaps (both ‘missing’ and the 5th state). PAUP was executed with the option ‘gaps as missing’, and the gaps treated as a 5th state were coded in the PAUP matrix using numerical values according to matrix B (Table B2).
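As a rough illustration of combination coding, the sketch below scores selected gap regions as single numeric characters and marks every other gap as missing. It is a simplified assumption about how such a matrix might be assembled, not the study's actual workflow: the function name `combination_code`, the toy alignment, and the choice to drop the coded columns and append the numeric state at the end of each row are all hypothetical.

```python
def combination_code(alignment, coded_regions):
    """Return taxon -> recoded row.

    alignment:     dict of taxon name -> aligned sequence (equal lengths)
    coded_regions: list of (start, end) half-open column ranges whose
                   gaps are scored as one character each
    """
    taxa = list(alignment)
    coded_columns = set()
    for start, end in coded_regions:
        coded_columns.update(range(start, end))

    # One numeric state per coded region: sequence variants are numbered
    # in order of first appearance, and the gap takes the final state.
    extra = {taxon: "" for taxon in taxa}
    for start, end in coded_regions:
        segments = [alignment[taxon][start:end] for taxon in taxa]
        variants = []
        for seg in segments:
            if seg.strip("-") and seg not in variants:
                variants.append(seg)
        for taxon, seg in zip(taxa, segments):
            state = variants.index(seg) if seg.strip("-") else len(variants)
            extra[taxon] += str(state)

    # Every gap outside the coded regions is written as '?' (missing).
    recoded = {}
    for taxon in taxa:
        rest = "".join("?" if base == "-" else base
                       for col, base in enumerate(alignment[taxon])
                       if col not in coded_columns)
        recoded[taxon] = rest + extra[taxon]
    return recoded


# Toy alignment (invented for illustration): columns 2-4 hold a
# deletion shared by the ingroup taxon; column 6 holds an ordinary gap.
toy = {"out1": "ACGGTA-T", "out2": "ACGGTAGT", "in1": "AC---A-T"}
rows = combination_code(toy, [(2, 5)])
```

Here the shared deletion becomes one character (state 0 in the two outgroup rows, state 1 in the ingroup row), while the isolated gap in column 6 carries no grouping information and is left as missing data.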