Zoological Journal of the Linnean Society (1982) 74: 337-344. With 2 figures
Phylogenetic reconstruction and phenetic
taxonomy
J. McNEILL
Biosystematics Research Institute, Agriculture Canada,
Ottawa, Canada K1A 0C6
Accepted for publication June 1981
Cladistic analysis should not be equated with phylogenetic reconstruction. Instead it is a means of
describing character-state distributions among organisms and in this it resembles phenetic analysis.
However, the claim that cladistic methods meet phenetic (‘Gilmour-natural’) criteria for classification
as well as or better than traditional phenetic ones is shown to be based on an inadequate interpretation
of these criteria. Instead, a new measure of naturalness is proposed in which the most natural
classification is that which describes the distribution of all character states by the smallest number of
statements. The possibility of extending this measure to provide a criterion for an optimally simple
classification is noted. It is concluded that phylogenetic reconstruction must not only reflect the
branching patterns suggested by cladistic analysis but also take account of the evolutionary history that
is reflected in an optimal phenogram.
KEY WORDS:- Cladistics - phenetics - natural taxa - natural classification.
CONTENTS
Cladistic analysis differs from phylogenetic reconstruction
Cladistics resembles phenetics
Cladistics, phenetics and Gilmour-natural classifications
    Farris’s assessment of naturalness
    A new measure of naturalness
Phenetics and phylogeny
References
CLADISTIC ANALYSIS DIFFERS FROM PHYLOGENETIC RECONSTRUCTION
I am very glad to have found during this Symposium that I am not alone in
refusing to equate phylogenetic reconstruction with cladistic analysis. Forey
(1982), for example, makes a clear distinction between cladograms and
evolutionary trees, the one the result of cladistic analysis and the other claiming to
be a best attempt at phylogenetic reconstruction. Forey, of course, expresses his
preference for a cladogram as a medium for expressing evolutionary events, but
both he and others have made it clear that cladistic analysis and phylogenetic
reconstruction are two very different things.
*Present address: Department of Botany, University of Ottawa, Ottawa, Canada K1N 6N5.
© 1982 The Linnean Society of London
Cladistic analyses produce cladograms that conform to certain defined rules
and, within the constraints of these rules, best summarize the data that have gone
into their construction. Phylogenetic reconstruction, on the other hand, is the
attempt to deduce, by whatever means seems appropriate, what actually
happened, i.e. the real evolutionary history of the group under investigation.
Because very many investigators (e.g. Hill & Crane, 1982) have used cladistic
techniques as a basis for phylogenetic reconstruction and because its product (the
cladogram) is a tree diagram, it is tempting to view cladistics simply as a method,
or perhaps even as the method, of reconstructing the course of organic evolution. I
believe that this is a misleading view of cladistics and my belief is supported by the
most recent cladistic literature.
CLADISTICS RESEMBLES PHENETICS
It was Hull (1980) who pointed out, in his contribution to the Hennig Memorial
Symposium held in December 1977, that, whereas early cladistic analysis was
concerned primarily with species, and the nodes in a cladogram were interpreted
as real or ‘hypothetical’ ancestral species, more recently emphasis has been on
characters, with cladograms representing the order of emergence of uniquely
derived characters, so that the nodes represent “minimum sets of synapomorphic
characters” (Platnick, 1977). This idea has been developed further by Platnick
(1980), who notes that “cladistic methods obviously do not depend on the
recognition of historically primitive or historically derived character states (i.e.,
they do not depend on the actual reconstruction of evolutionary history)” but
“merely attempt to discriminate more general from less general characters.”
Patterson (1980) spells out the implications of this when he says “cladistics . . . is
not necessarily about evolution . . . It is about a simpler and more basic matter,
the pattern in nature.” There is a striking parallel here to McNeill’s (1980)
account of phenetics as being dependent on evolution for its success but otherwise
making “no attempt to reflect evolution” but seeking “to describe the distribution
among organisms of as many of their character-states as possible.”
I would suggest that cladistic analysis and phenetic analysis are alike in that
both are made possible by evolution (or by some analogous “external influence”)
but that neither provides evolutionary trees per se and neither, by itself,
permits phylogenetic reconstruction.
The phenograms and the derived ‘natural’ classifications of a pheneticist and the
cladograms and the identically mapped classifications of a cladist are both in the
form of rooted trees, but both are produced not to reconstruct phylogeny but to
describe character state distributions among organisms. Of course, there are
differences between phenetics and cladistics in addition to the basic one that
phenetic methods start from overall similarity (though subsequent character
selection may be made on the basis of initial group recognition, cf. McNeill, 1980:
475-477), whereas cladistic techniques use only character states considered to
show derived patristic similarity (synapomorphies). Amongst these secondary
differences is the fact that pheneticists usually distinguish between a phenogram
and a classification, adopting, usually intuitively, some simplifying process in going
from the one to the other (cf. McNeill, 1979). But these differences are secondary
and I would suggest that there is, in fact, a very close relationship between modern
cladistic analysis and genuine phenetic analysis (as described in e.g. Jardine &
Sibson, 1971 : 1 3 w 38; McNeill, 1980).
Figure 1. Diagram of two possible evolutionary histories for the terminal taxa A, B, C, D described by
the characters and character-states listed in Table 1. The nodes E, F and G (A) and E', F' and G (B)
represent points of divergence of lineages and the numbers attached to the internodes represent the
number of character-state changes postulated as having occurred in each lineage.
Much of the time the two approaches will give the same answer and, what is
more, neither need correspond to what really happened in the evolutionary history
of the organisms concerned. Consider, for example, the four organisms A, B, C & D
(Fig. 1). Figures 1A and 1B show two possible evolutionary histories for these
organisms but the characters and character states are the same in both. Let us
suppose that they are those listed in Table 1. Even if the true evolutionary history
were that depicted in Fig. 1B, both phenetic and cladistic techniques would
generate a tree diagram with the topology of Fig. 1A. Moreover, although neither
corresponds to a correct phylogenetic reconstruction, both the phenogram and the
cladogram are correct according to their own criteria, in that both accurately
describe the distribution of the character states amongst these four organisms: the
members of the pairs A B and C D are phenetically more similar to each other than
either is to a member of the other pair; the nodes E and F provide a more
parsimonious nesting of the synapomorphies than do the nodes E' and F'.

Table 1. Characters (a-n) and character-states (1/0) for the terminal taxa (A, B,
C, D) and nodes (E, F, G and E', F', G) in Fig. 1

       A   B   C   D      E   F      E'  F'     G
a      1   1   0   0      1   0      0   0      0
b      1   1   0   0      1   0      0   0      0
c      1   1   0   0      1   0      0   0      0
d      1   1   0   0      1   0      0   0      0
e      1   1   0   0      1   0      0   0      0
f      1   1   1   0      1   0      1   0      0
g      1   0   0   0      0   0      0   0      0
h      0   1   1   1      0   1      1   1      0
i      0   0   1   1      0   1      0   0      0
j      0   0   1   1      0   1      0   0      0
k      0   0   1   1      0   1      0   0      0
l      0   0   1   1      0   1      0   0      0
m      0   0   1   1      0   1      0   0      0
n      0   0   0   1      0   0      0   0      0
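
The comparison of the two histories can be checked with a short calculation. The following is a minimal Python sketch, not part of the original paper: it encodes the Table 1 states as strings (characters a to n in order), counts pair-wise differences among the terminal taxa, and sums the character-state changes along each branch. The branching order assumed for Fig. 1B (G to A and F'; F' to D and E'; E' to B and C) is an inference from the node states in Table 1, and the names `STATES`, `FIG_1A`, `FIG_1B`, `changes` and `tree_length` are hypothetical.

```python
# Character states a-n from Table 1, one string per taxon or node.
STATES = {
    "A":  "11111110000000",
    "B":  "11111101000000",
    "C":  "00000101111110",
    "D":  "00000001111111",
    "E":  "11111100000000",
    "F":  "00000001111110",
    "E'": "00000101000000",
    "F'": "00000001000000",
    "G":  "00000000000000",
}

# Parent-child branches. FIG_1B is an assumed topology inferred from Table 1.
FIG_1A = [("G", "E"), ("G", "F"), ("E", "A"), ("E", "B"), ("F", "C"), ("F", "D")]
FIG_1B = [("G", "A"), ("G", "F'"), ("F'", "D"), ("F'", "E'"), ("E'", "B"), ("E'", "C")]


def changes(x, y):
    """Number of characters whose state differs between x and y."""
    return sum(a != b for a, b in zip(STATES[x], STATES[y]))


def tree_length(branches):
    """Total number of character-state changes postulated along all branches."""
    return sum(changes(parent, child) for parent, child in branches)


def pairwise_differences(taxa=("A", "B", "C", "D")):
    """Differences between every pair of terminal taxa."""
    return {
        (x, y): changes(x, y)
        for i, x in enumerate(taxa)
        for y in taxa[i + 1:]
    }


if __name__ == "__main__":
    print("Pair-wise differences:", pairwise_differences())
    print("Changes required by Fig. 1A:", tree_length(FIG_1A))
    print("Changes required by Fig. 1B:", tree_length(FIG_1B))
```

Under these assumptions A and B differ in two characters, as do C and D, while every cross-pair comparison differs in ten or more; the Fig. 1A history requires fewer postulated changes than the Fig. 1B history, which is why both approaches recover the Fig. 1A topology.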
CLADISTICS, PHENETICS AND GILMOUR-NATURAL CLASSIFICATIONS
Farris’s assessment of naturalness
Of course, in other situations phenetic and cladistic methods can give different
results with the same data set. Farris (1980) presents one such data set for four taxa
and 12 characters; this is reproduced in Table 2. Most phenetic methods will give
a set of pair-wise similarities and a phenogram such as is shown in Fig. 2A, whereas
cladistic methods, including Farris’s ‘special similarity’ measure, will give pair-wise
similarities and a cladogram like that in Fig. 2B. Farris (1980) has argued that
even by phenetic criteria the cladogram is to be preferred to the phenogram and
suggests that pheneticists should abandon the use of what he calls ‘raw similarity’
(i.e. overall similarity) in favour of his ‘special similarity’ (i.e. similarity assessed on
the basis of synapomorphies only).
Table 2. Characters and character-states for the four hypothetical taxa forming
Farris’s (1980) Data Set 2. Alternative phenetic and cladistic groupings of these
taxa appear in Fig. 2

                        Characters
Taxa   1a  1b  1c  2   3   4a  4b  4c  4d  4e  5   6
A      1   1   1   0   0   0   0   0   0   0   1   1
B      0   0   0   1   0   0   0   0   0   0   1   1
C      0   0   0   0   1   0   0   0   0   0   0   1
D      0   0   0   0   0   1   1   1   1   1   0   0

As Farris (1977) and McNeill (1980) point out, one of the criteria of a phenetic
classification is that it should be natural in the sense in which that term was used by
Gilmour and others in the formative period of phenetic philosophy (cf. Gilmour,
1937, 1940, 1961; Gilmour & Walters, 1963; Sneath, 1957; Cain & Harrison,
1958). Farris (1977) describes a ‘Gilmour-natural’ classification as one “whose
constituent groups describe the distribution among organisms of as many features
as possible.” In analysing the data set reproduced in Fig. 2, Farris (1980) interprets
this as meaning that each group recognized in a classification should be
characterized by at least one character state unique to that group, i.e. a
monothetic criterion. The group (B, C) recognized in the phenogram does not
meet this criterion, for, although nine out of 12 character states are shared by B
and C, all are possessed also by either A or D. Farris argues from this that (B, C) is
not a natural group, whereas the alternative group (A, B), recognized in the
cladogram, is Gilmour-natural, because it uniquely describes the 1 state of
character 5. Farris concludes from this that overall similarity may produce
classifications less desirable phenetically than those produced by cladistic
techniques.
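
Farris's monothetic reading of the criterion can be made concrete with the Table 2 data. The sketch below is an illustration only, assuming the states as tabulated (characters in the order 1a, 1b, 1c, 2, 3, 4a, 4b, 4c, 4d, 4e, 5, 6); the function names `shared_states` and `unique_states` are hypothetical and do not come from Farris (1980).

```python
CHARACTERS = ["1a", "1b", "1c", "2", "3", "4a", "4b", "4c", "4d", "4e", "5", "6"]

# Character states from Table 2, one string per taxon.
TABLE2 = {
    "A": "111000000011",
    "B": "000100000011",
    "C": "000010000001",
    "D": "000001111100",
}


def shared_states(group):
    """Map each character on which all members of the group agree to its state."""
    shared = {}
    for i, name in enumerate(CHARACTERS):
        states = {TABLE2[taxon][i] for taxon in group}
        if len(states) == 1:
            shared[name] = states.pop()
    return shared


def unique_states(group):
    """Shared states of the group that no taxon outside the group possesses."""
    outside = [taxon for taxon in TABLE2 if taxon not in group]
    return {
        name: state
        for name, state in shared_states(group).items()
        if all(TABLE2[taxon][CHARACTERS.index(name)] != state for taxon in outside)
    }


if __name__ == "__main__":
    for group in [("B", "C"), ("A", "B")]:
        print(group, "shares", len(shared_states(group)),
              "states; unique:", unique_states(group))
```

With these data (B, C) shares nine states but none is unique to it, whereas (A, B) is the only group possessing state 1 of character 5, which is the contrast on which Farris's argument rests.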
[Figure 2: dendrograms not reproduced; the two similarity matrices are as follows.]

A                      B
     A   B   C              A   B   C
B    8                 B    2
C    7   9             C    1   1
D    2   4   5         D    0   0   0

Figure 2. Similarity matrices and dendrograms derived from the four taxon character-state matrix in
Table 2. A, Derived by a phenetic method such as the simple matching coefficient with group average
clustering (UPGMA); B, derived by a cladistic method such as Farris’s (1980) ‘special similarity’
measure.
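
The two kinds of similarity can be recomputed from Table 2 as a check. This is a minimal sketch under stated assumptions: overall similarity is taken as the number of matching states out of 12 (the numerator of the simple matching coefficient), and the count of shared 1 states is used as a stand-in for Farris's ‘special similarity’, on the assumption that state 1 is the derived state of every character. The function names are hypothetical, and no clustering step is included.

```python
from itertools import combinations

# Character states from Table 2 (order: 1a 1b 1c 2 3 4a 4b 4c 4d 4e 5 6).
TABLE2 = {
    "A": "111000000011",
    "B": "000100000011",
    "C": "000010000001",
    "D": "000001111100",
}


def simple_matching(x, y):
    """Number of characters in which x and y have the same state (0 or 1)."""
    return sum(a == b for a, b in zip(TABLE2[x], TABLE2[y]))


def shared_derived(x, y):
    """Number of characters in which x and y both show state 1 (assumed derived)."""
    return sum(a == b == "1" for a, b in zip(TABLE2[x], TABLE2[y]))


def most_similar_pair(measure):
    """The pair of taxa with the highest value of the given similarity measure."""
    return max(combinations(sorted(TABLE2), 2), key=lambda pair: measure(*pair))


if __name__ == "__main__":
    for measure in (simple_matching, shared_derived):
        values = {pair: measure(*pair) for pair in combinations(sorted(TABLE2), 2)}
        print(measure.__name__, values, "closest:", most_similar_pair(measure))
```

On these assumptions overall similarity joins B with C first, while similarity restricted to shared derived states joins A with B first, which is the disagreement the two dendrograms express.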
It appears to me that Farris’s requirement for naturalness, namely that each
group be characterized by at least one unique character state, is not a necessary
corollary of the requirement that Gilmour-natural classifications describe the
distribution among organisms of as many features as possible. Gilmour-natural
classifications are concerned with the distribution of all character-states, whether
‘primitive’ or ‘derived’, and a classification should therefore be judged on the
efficiency of the classification as a whole in communicating information on such
character-state distribution. Farris (1980) does discuss information transmission as
a criterion of naturalness but uses a character-type by character-type approach for
each node in the phenogram. Not altogether surprisingly, with what is really a
cladistic approach, the cladogram in Fig. 2B is shown to have greater ‘information
content’ than the phenogram.
A new measure of naturalness
For a classification to describe the distribution of character-states among
organisms most efficiently, it should permit the character states to be described in
as few statements as possible. The data matrix in Table 2 with four taxa and 12
characters requires 48 statements, if for the present discussion we exclude, as Farris
also does, consideration of taxon labels. Table 3 summarizes the numbers of
statements needed if the cladogram and the phenogram are each treated as
classifications. A ‘compromise’ classification of only two ranks is also assessed.
In all cases, the first grouping is (D) and (A, B, C). The character-states of (D)
must be specified in full (12 statements) if no information is to be lost. Likewise at
least six must be specified for (A, B, C). The remaining six are variable (marked by
asterisks in Table 3) and three alternative ways of making statements about them
might be considered: I, reference to the character-state could be omitted (0
statements); II, the character-state could be specified as variable (6 statements);
or III, for each character the most common state and the exceptions could be
specified (12 statements). For all three classifications the number of statements will
be the same: 18, 24 or 30, depending upon whether Method I, II or III is adopted.
Table 3. Numbers of statements required to describe completely, with three different classifications, the character-state
distributions among the four taxa listed in Table 2. The uppermost set uses the cladogram (Fig. 2B), the middle, the phenogram
(Fig. 2A), and the lowermost a compromise classification with only two ranks. I, II and III represent three alternative methods
of computing the number of statements required (for details see text)

             Character states about which statements must be made     Statements
Taxa         1a  1b  1c  2   3   4a  4b  4c  4d  4e  5   6            I    II   III

Cladogram (Fig. 2B)
(D)          0   0   0   0   0   1   1   1   1   1   0   0            12   12   12
(A, B, C)    *   *   *   *   *   0   0   0   0   0   *   1            6    12   18
(C)          0   0   0   0   1                       0                6    6    6
(A, B)       *   *   *   *   0                       1                2    6    10
A            1   1   1   0                                            4    4    4
B            0   0   0   1                                            4    4    4
Totals                                                                34   44   54

Phenogram (Fig. 2A)
(D)          0   0   0   0   0   1   1   1   1   1   0   0            12   12   12
(A, B, C)    *   *   *   *   *   0   0   0   0   0   *   1            6    12   18
(A)          1   1   1   0   0                       1                6    6    6
(B, C)       0   0   0   *   *                       *                3    6    9
B                        1   0                       1                3    3    3
C                        0   1                       0                3    3    3
Totals                                                                33   42   51

Compromise classification (two ranks)
(D)          0   0   0   0   0   1   1   1   1   1   0   0            12   12   12
(A, B, C)    *   *   *   *   *   0   0   0   0   0   *   1            6    12   18
A            1   1   1   0   0                       1                6    6    6
B            0   0   0   1   0                       1                6    6    6
C            0   0   0   0   1                       0                6    6    6
Totals                                                                36   42   48

At each of the subordinate levels no specification of the character-states that
were constant in the more inclusive group is necessary: part of the information
storage component of a classification. Consequently, for the cladogram, six
character-states must be specified for (C) and two, plus four variable ones, for (A,
B) giving 8, 12 or 16 statements depending on the method. By comparison (A) and
(B, C) require 9, 12 or 15 statements. The main difference between the cladogram
and the phenogram comes, however, in the eight statements needed to specify A
and B within (A, B) as against the six required for B and C within (B, C). The
totals in Table 3 show that whether Method I, II or III is chosen, the phenogram
is more efficient at summarizing all the character-state distributions than is the
cladogram.
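
The counting procedure described above can be sketched in Python as a check on the totals. This is an illustrative reading of the text rather than the author's own algorithm: each group below the root costs one statement per character that is constant within it and not already stated for a more inclusive group, and each still-variable character costs 0, 1 or 2 statements under Methods I, II and III respectively. The nested-tuple encoding of the classifications and the names `count_statements` and `total_statements` are assumptions made for the example.

```python
# Character states from Table 2 (order: 1a 1b 1c 2 3 4a 4b 4c 4d 4e 5 6).
TABLE2 = {
    "A": "111000000011",
    "B": "000100000011",
    "C": "000010000001",
    "D": "000001111100",
}
N_CHARACTERS = 12

# Statements made per character that is still variable within a group.
VARIABLE_COST = {"I": 0, "II": 1, "III": 2}

# Classifications as nested tuples; the outermost tuple is the undescribed root.
CLADOGRAM = ("D", (("A", "B"), "C"))
PHENOGRAM = ("D", ("A", ("B", "C")))
COMPROMISE = ("D", ("A", "B", "C"))


def members(node):
    """All terminal taxa contained in a node."""
    if isinstance(node, str):
        return [node]
    return [taxon for child in node for taxon in members(child)]


def count_statements(node, method, settled=frozenset()):
    """Statements needed for this group and, recursively, for its subgroups."""
    taxa = members(node)
    open_chars = [i for i in range(N_CHARACTERS) if i not in settled]
    constant = {i for i in open_chars if len({TABLE2[t][i] for t in taxa}) == 1}
    variable = len(open_chars) - len(constant)
    total = len(constant) + VARIABLE_COST[method] * variable
    if not isinstance(node, str):
        for child in node:
            # States constant here need not be restated at lower levels.
            total += count_statements(child, method, settled | constant)
    return total


def total_statements(classification, method):
    """Total over the first-level groups; the root itself is not described."""
    return sum(count_statements(child, method) for child in classification)


if __name__ == "__main__":
    for name, tree in [("cladogram", CLADOGRAM), ("phenogram", PHENOGRAM),
                       ("compromise", COMPROMISE)]:
        print(name, [total_statements(tree, m) for m in ("I", "II", "III")])
```

Run on the Table 2 data, this reading reproduces the totals of Table 3 for all three classifications under each method.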
Method III bears some relationship to Gower’s (1974) ‘maximally predictive’
procedure of summing all the matches with the ‘predictor state’ for each group, a
procedure which Farris (1980) has criticized in the hierarchical context as involving
repetition of certain character states (in this case those associated with A, B, C) and
not others (those associated with D). This criticism seems valid and certainly
Method III does involve unnecessarily repetitive statements about character-states:
consequently I do not regard it as an appropriate method in this context,
and will not consider it further. The choice between Method I and Method II is
less clear; taxonomic practice, as reflected in descriptions of taxa, tends to favour
Method I, but keys and diagnoses that are to be strictly comparable at each
hierarchical level require Method II.
The ‘compromise’ classification of only two ranks (i.e. with (ABC) not further
divided) cannot, strictly speaking, be compared with the other two unless
statement lengths for taxon labels and ranks are determined and included. Purely
from the number of character statements, however, it is never superior to the
phenogram and is only more concise than the cladogram if statements must be
made about characters that vary within groups (i.e. Methods II & III). In a real
classificatory situation, the two-rank classification might be preferred to either
three-rank one, because the statement length (or set of symbols) to be associated
with a taxon name and specification of rank might be adjudged to be so great as to
outweigh the advantage of the slightly more concise character statements. This
suggests an alternative strategy to the structural value criterion proposed by
McNeill (1979) for the simplification of dendrograms to ‘practical classifications’,
but further consideration of this is outside the scope of the present study.
PHENETICS AND PHYLOGENY
What, then, are the implications of this for phenetic taxonomy and phylogenetic
reconstruction? Firstly, it establishes that there is a component to
Gilmour-naturalness that is not reflected in the nested synapomorphies of a cladogram. The
description of character-state distributions among organisms provided by a
phenogram is as good as or better than that of a cladogram, if the criterion used is
that of minimizing the number of statements that need be made about all
character-states. (If, for some reason, certain character-states are considered
inappropriate for description in the classification, as, for example, species absence
usually is in ecological classification, then account would be taken of this in the
selection of the appropriate phenetic method, e.g. by using the Jaccard coefficient
for calculating similarity; the phenogram would remain the best descriptor of those
character-states being considered.) Phenetic taxonomy, thus, seeks to express in
communicable, and hence simplified, form (see McNeill, 1980) information about
character-state distributions among organisms that is lacking in a cladogram. This
information is one of the products of the evolutionary history of the organisms.
Even though it is likely to be more a product of anagenesis than cladogenesis, it
remains a component that phylogenetic reconstruction must explain if it is to
attempt to describe the actual course of evolution.
REFERENCES
CAIN, A. J. & HARRISON, G. A., 1958. An analysis of the taxonomist’s judgment of affinity. Proceedings of the
Zoological Society of London, 131: 85-98.
FARRIS, J. S., 1977. On the phenetic approach to vertebrate classification. In M. K. Hecht, P. C. Goody &
B. M. Hecht (Eds), Major Patterns in Vertebrate Evolution: 823-850. New York: Plenum Press.
FARRIS, J. S., 1980. The information content of the phylogenetic system. Systematic Zoology, 28: 483-519.
FOREY, P. L., 1982. Palaeontological stories versus neontological analysis. In K. A. Joysey & A. E. Friday (Eds),
Problems of Phylogenetic Reconstruction: 114-157. London: Academic Press.
GILMOUR, J. S. L., 1937. A taxonomic problem. Nature, London, 139: 1040-1042.
GILMOUR, J. S. L., 1940. Taxonomy and philosophy. In J. Huxley (Ed.), The New Systematics: 461-474.
Oxford: Clarendon Press.
GILMOUR, J. S. L., 1961. Taxonomy. In A. M. MacLeod and L. S. Cobley (Eds), Contemporary Botanical
Thought: 27-45. Edinburgh: Oliver & Boyd.
GILMOUR, J. S. L. & WALTERS, S. M., 1963. Philosophy and classification. In W. B. Turrill (Ed.), Vistas in
Botany. IV. Recent Researches in Plant Taxonomy: 1-22. Oxford: Pergamon Press.
GOWER, J. C., 1974. Maximal predictive classification. Biometrics, 30: 643-654.
HILL, C. R. & CRANE, P. R., 1982. Cladistics and the origin of angiosperms. In K. A. Joysey & A. E. Friday
(Eds), Problems of Phylogenetic Reconstruction: 269-361. London: Academic Press.
HULL, D. L., 1980. The limits of cladism. Systematic Zoology, 28: 416-440.
JARDINE, N. & SIBSON, R., 1971. Mathematical Taxonomy. London: Wiley.
McNEILL, J., 1979. Structural value: a concept used in the construction of taxonomic classifications. Taxon, 28:
481-504.
McNEILL, J., 1980. Purposeful phenetics. Systematic Zoology, 28: 465-482.
PATTERSON, C., 1980. Cladistics. Biologist, London, 27: 234-240.
PLATNICK, N. I., 1977. Monotypy and the origin of higher taxa: a reply to E. O. Wiley. Systematic Zoology, 26:
355-357.
PLATNICK, N. I., 1980. Philosophy and the transformation of cladistics. Systematic Zoology, 28: 537-546.
SNEATH, P. H. A., 1957. Some thoughts on bacterial classification. Journal of General Microbiology, 17: 184-200.