THE EVALUATION OF EXPERIMENTAL RESULTS
Adapted from Keeton and Gould, 3rd edition
The need for statistical analysis: We have mentioned ratios such as 1:2:1 or 3:1 or 1:1 or 9:3:3:1,
expected in the results of various types of crosses. Geneticists frequently use the phenotypic ratios obtained
in breeding experiments to deduce the underlying genetic phenomena. Evolutionary biologists, on the other
hand, look for evidence of evolution - that is, changes in gene frequencies - in the form of departures from
the expected Mendelian ratios. For example, when selection is operating against a homozygous dihybrid
recessive, the size of the fourth class of the 9:3:3:1 distribution is reduced. But chance deviations from a
predicted distribution are common: although four coin tosses have an expected distribution of two heads and
two tails, a 3:1 ratio is hardly surprising. But if the four-toss sequence were repeated 100 times, yielding
300 heads and 100 tails, most of us would suspect that the coin was biased in some way. To judge whether a
phenotypic distribution indicates the operation of selection (or of some unexpected genetic process), we
therefore must consider both the degree of discrepancy from the expected ratio and the sample size.
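The coin-toss intuition can be checked with exact binomial arithmetic. The sketch below is only an illustration, assuming a fair coin and independent tosses; the helper name prob_at_least_heads is ours and does not come from the text.

```python
from fractions import Fraction
from math import comb

def prob_at_least_heads(k, n):
    """Exact probability of k or more heads in n tosses of a fair coin."""
    favorable = sum(comb(n, i) for i in range(k, n + 1))
    return Fraction(favorable, 2 ** n)

# Four tosses: three or more heads happens often with a fair coin.
p_small = prob_at_least_heads(3, 4)

# 400 tosses: 300 or more heads is vanishingly unlikely with a fair coin.
p_large = prob_at_least_heads(300, 400)

print("3+ heads in 4 tosses:    ", float(p_small))
print("300+ heads in 400 tosses:", float(p_large))
```

The same 3:1 ratio is unremarkable in the small sample and practically impossible by chance in the large one, which is why sample size has to enter the judgment.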
Scientists in all fields of research constantly encounter the same fundamental question - whether the
deviations they observe in their experimental results are significant or not. They cannot rely simply on a
guess. They cannot say, “That looks pretty close to what I predicted,” or “That looks odd”. To help them
arrive at a decision, they can refer to a system of standards based on the mathematical probability that any
observed deviation in their sample could have occurred by chance alone. This type of mathematical
treatment of data is known as statistical analysis. Statisticians have devised many mathematical tests for
evaluating experimental or observational data. Though these tests differ in their form and in the sorts of data
to which they can validly be applied, all are simply ways of calculating the probability that the deviations of
the observed values might be due to chance alone.
The chi-square test: One test of statistical significance, devised by Karl Pearson of the University of
London in 1900, represented a fundamental breakthrough for evaluating the results of experimental science.
Pearson’s so-called chi-square (χ²) test is particularly applicable to many genetic experiments. This test
measures whether any deviation from the predicted norm that occurs in experimental results exceeds the
deviation that might occur by chance. The formula for chi-square is
χ² = ∑ (d²/e)
where d is the deviation of the observed value from the expected value (d = o - e), e is the expected value,
and ∑ means “the sum of” over all classes.
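The formula translates almost directly into code. The following is a minimal sketch rather than a library routine; the function name chi_square and the 48:52 counts are hypothetical and chosen only for illustration.

```python
def chi_square(observed, expected):
    """Sum over all classes of (squared deviation / expected value)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        d = o - e            # deviation of observed from expected
        total += d * d / e   # d squared divided by e
    return total

# Hypothetical 1:1 cross: 48 and 52 offspring where 50 and 50 were expected.
print(chi_square([48, 52], [50, 50]))
```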
Consider two hypothetical crosses in which we expect that the phenotypic ratio should be 1:1 in the absence
of selection for or against one of the phenotypes. In one cross we actually got values of 45 and 55 instead of
50 and 50, and in the other we got values of 5 and 15 instead of 10 and 10. We want to know in each case
whether the deviation of the observed from the expected values can reasonably be attributed to chance, or
implies that selection is at work.
First we must determine the chi-square value for the two crosses.
                           First Phenotype     Second Phenotype
Observed values (o)              45                   55
Expected values (e)              50                   50
Deviation (d)                    -5                   +5
Deviation squared (d²)           25                   25
d²/e                        25/50 = 0.5          25/50 = 0.5

χ² = ∑ (d²/e) = 0.5 + 0.5 = 1
Next let us determine the chi-square value for the 5:15 experiment, following the same procedure:
                           First Phenotype     Second Phenotype
Observed values (o)               5                   15
Expected values (e)              10                   10
Deviation (d)                    -5                   +5
Deviation squared (d²)           25                   25
d²/e                        25/10 = 2.5          25/10 = 2.5

χ² = ∑ (d²/e) = 2.5 + 2.5 = 5
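Both worked examples can be reproduced in a few lines. This sketch assumes the expected counts are derived from the total sample and the predicted 1:1 ratio; the helper names expected_from_ratio and chi_square are ours.

```python
def expected_from_ratio(total, ratio):
    """Split a sample total into expected counts according to a ratio."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]

def chi_square(observed, expected):
    """Sum of squared deviations, each divided by its expected value."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

crosses = {
    "first cross (45:55)": [45, 55],
    "second cross (5:15)": [5, 15],
}

for name, observed in crosses.items():
    expected = expected_from_ratio(sum(observed), (1, 1))
    print(name, "expected:", expected, "chi-square:", chi_square(observed, expected))
```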
Notice that in each of these experiments the absolute deviations of the observed values from the expected
values are the same: a deviation of 5 in each phenotype. But notice also that the chi-squares obtained in the
two crosses are very different - the one based on a sample of 20 being five times as large as the one based on
a sample of 100. This illustrates well how sensitive chi-square is to sample size: the difference in sample
size alone has made the great difference in the two chi-square values. To interpret the values, however, we
need to know a little more.
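To see the effect of sample size in isolation, one can hold the absolute deviation at 5 and vary only the total. This is an illustrative sketch assuming a 1:1 expectation; the sample of 1000 is a hypothetical addition to the two samples discussed in the text.

```python
def chi_square_for_fixed_deviation(total, deviation=5):
    """Chi-square for a 1:1 cross in which each class is off by `deviation`."""
    expected = total / 2
    observed = [expected - deviation, expected + deviation]
    return sum((o - expected) ** 2 / expected for o in observed)

for total in (20, 100, 1000):
    value = chi_square_for_fixed_deviation(total)
    print(f"sample of {total:4d}: chi-square = {value:.2f}")
```

The deviation never changes, yet chi-square shrinks as the sample grows, because each squared deviation is divided by a larger expected value.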
Each of these crosses involves only two classes, in this case two different phenotypes. Hence their
chi-square values were calculated on the basis of only two squared deviations. But suppose we had been
analyzing a cross involving three different phenotypes. Then the chi-square would have been calculated on
the basis of three squared deviations, and it is only reasonable to expect that the chi-square value obtained
would have been higher than one based on only two.
It is clear, then, that in evaluating chi-square values we must also take into account the number of classes on
which they are based. By convention, the number of independent classes in a chi-square test is termed the
degrees of freedom. The number of independent classes is usually one fewer than the total number of classes
in the cross. Thus, in our crosses involving two phenotypes, there is only one independent class (and so one
degree of freedom), while in a cross involving three phenotypes there would be two independent classes and
two degrees of freedom. A moment’s thought will tell you why this is so.
In our cross based on a sample of 100, once we know that 45 offspring show the first phenotype, we
automatically know that 55 must show the other phenotype. Since we know the total, the number in one
class automatically tells us the number in the other class. In other words, the number in the second class is
dependent upon the number in the first class. Therefore, only the first class is an independent class. The
same reasoning applies if we perform a cross involving three different phenotypes, and the total number of
observations in our sample is 100; once we know the number showing the first and second phenotypes, we
automatically know the number showing the third phenotype, because the number in the third class is
dependent upon the number in the first two classes.
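The dependence of the last class on the others can be made concrete. In this sketch the function names are ours, and the three-phenotype counts (30 and 45 out of 100) are hypothetical.

```python
def last_class_count(total, known_counts):
    """With the total fixed, the final class is whatever remains."""
    return total - sum(known_counts)

def degrees_of_freedom(num_classes):
    """Independent classes: one fewer than the total number of classes."""
    return num_classes - 1

# Two phenotypes, sample of 100: knowing 45 fixes the other class.
print(last_class_count(100, [45]), degrees_of_freedom(2))

# Three phenotypes, sample of 100: knowing two classes fixes the third.
print(last_class_count(100, [30, 45]), degrees_of_freedom(3))
```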
We now know the chi-square values (1.0 and 5.0) and the degrees of freedom (one for each experiment) for
our two hypothetical crosses. The next step is to consult a table of chi-square values. The table below gives
four different chi-square values for each of a series of different degrees of freedom, and gives the probability
(P) that a deviation as great as or greater than that represented by each chi-square value would occur simply
by chance.
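The probabilities in such a table can also be computed rather than looked up. The sketch below is not the printed table; it relies on two standard closed forms that hold only for one and two degrees of freedom (the complementary error function and a simple exponential, respectively). The function names are ours.

```python
import math

def p_value_df1(chi2):
    """P(chi-square >= chi2) for one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def p_value_df2(chi2):
    """P(chi-square >= chi2) for two degrees of freedom."""
    return math.exp(-chi2 / 2.0)

# Chi-square values commonly tabulated for P = 0.20 and P = 0.05.
for value in (1.64, 3.84):
    print(f"df=1  chi-square={value:.2f}  P={p_value_df1(value):.3f}")
print(f"df=2  chi-square=5.99  P={p_value_df2(5.99):.3f}")
```

For more degrees of freedom a statistics library or a printed table is the practical choice.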
Now let us evaluate the results obtained in the first of our hypothetical crosses. Here the deviation of our
results from those expected was such as to yield a chi-square value of 1.0. The cross had one degree of
freedom. According to the table, a value as high as or higher than 1.64 has a chance probability of 0.20 (20
percent); that is, deviation from the expected as great as or greater than that represented by 1.64 will occur
about once in five trials by chance alone. Our chi-square is less than 1.64; hence the deviation in the
experiment can be expected to occur by chance even more often than once in five trials. Most biologists
agree that deviations having a chance probability as great as or greater than 0.05 (5 percent, or 1 in 20) will
not be considered statistically significant. Since the deviation in our experiment has a chance probability
much greater than 5 percent, it is not regarded as statistically significant, and is presumed to be a chance
deviation, which can be disregarded.
In our second experiment, the chi-square value representing the deviation from the expected results turned
out to be 5.0. Again there was one degree of freedom. Looking at the listing in the table for one degree of
freedom, we find that the value of 5.0 is greater than 3.84, which has a probability of 0.05 (5 percent), but
less than 6.64, which has a probability of 0.01 (1 percent). Hence the probability that the deviation in this
cross resulted purely from chance is less than 5 percent but greater than 1 percent.
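The evaluation of both crosses can be summarized as a simple decision rule. This is a sketch under stated assumptions: one degree of freedom, the conventional 0.05 cutoff, and an exact probability computed with the complementary error function instead of a table lookup. The names are ours.

```python
import math

SIGNIFICANCE_LEVEL = 0.05  # conventional cutoff cited in the text

def p_value_one_df(chi2):
    """P(chi-square >= chi2) for one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def is_significant(chi2, alpha=SIGNIFICANCE_LEVEL):
    """True when a deviation this large is rarer than alpha by chance."""
    return p_value_one_df(chi2) < alpha

for label, chi2 in (("45:55 cross", 1.0), ("5:15 cross", 5.0)):
    p = p_value_one_df(chi2)
    verdict = "significant" if is_significant(chi2) else "not significant"
    print(f"{label}: chi-square={chi2}, P={p:.3f}, {verdict}")
```

The first cross comes out well above the 0.20 level, and the second falls between 0.01 and 0.05, in agreement with the table-based reading.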
According to biological convention, then, the deviation from the expected results in the second cross is
significant: some factor other than chance was probably involved in producing the disagreement between
result and prediction. At this point, a geneticist would begin the search for a reasonable explanation: the original
observations are always open to scrutiny; selection may have acted against one of the phenotypes, so that
some of those individuals died, thus leading to fewer representatives of this class than were expected; or
perhaps the assumptions concerning the genetics involved in this cross need modification. In this particular
case, one of the first things to do is to perform a similar experiment using a larger sample to minimize
chance error. After all, as we saw when we calculated the outcome of a dihybrid cross, the probability of
two independent events happening together is the product of their individual probabilities of happening
alone. The probability of a deviation this large occurring twice by chance is therefore less than
0.05 × 0.05, or 0.25 percent.
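The product rule used here is easy to verify. The sketch assumes the two experiments are independent; the function name joint_probability is ours.

```python
def joint_probability(*probabilities):
    """Probability that all of several independent events occur together."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result

# Two independent experiments, each with a chance probability of 0.05.
twice = joint_probability(0.05, 0.05)
print(f"{twice:.4f} = {twice * 100:.2f} percent")
```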
APBio/EvalExpRes/tk/2004