Astronomy, Brewing, the Probable Error,
and the .05 Criterion of Statistical Significance
One of the articles on your reading list is: Cowles, M., & Davis, C. (1982). On the
origins of the .05 level of statistical significance. American Psychologist, 37, 553-558.
Here are three key paragraphs from that article:
William Gosset (who wrote under the pen name of "Student") began his
employment with the Guinness Brewery in Dublin in 1899. Scientific methods were just
starting to be applied to the brewing industry. Among Gosset's tasks was the
supervision of what were essentially quality control experiments. The necessity of using
small samples meant that his results were, at best, only approximations to the
probability values derived from the normal curve. Therefore the circumstances of his
work led Gosset to formulate the small-sample distribution that is called the t
distribution.
With respect to the determination of a level of significance, Student's (1908)
article, in which he published his derivation of the t test, stated that "three times the
probable error in the normal curve, for most purposes, would be considered
significant" (p. 13).
A few years later, another important article was published under the joint
authorship of an agronomist and an astronomer (Wood & Stratton, 1910). This paper
was essentially to provide direction in the use of probability in interpreting experimental
results. These authors endorse the use of PE as a measure: "The astronomer . . . has
devised a method of estimating the accuracy of his averages . . . the agriculturist cannot
do better than follow his example" (p. 425). They recommend "taking 30 to 1 as the
lowest odds which can be accepted as giving practical certainty that a difference
is significant" (p. 433). Such odds applied to the normal probability curve correspond
to a difference from the mean of 3.2 PE (for practical purposes this was probably
rounded to 3 PE).
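The 3.2 PE figure can be checked with a short calculation. The sketch below assumes that odds of 30 to 1 mean a two-tailed probability of 1/31 that a deviation at least this large arises by chance; the function name odds_to_pe is purely illustrative and does not come from the article.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

# One probable error (PE) in standard-deviation units: the point that leaves
# 25% in the upper tail, so that +/- 1 PE brackets the middle 50%.
PE_IN_SD = STD_NORMAL.inv_cdf(0.75)


def odds_to_pe(odds_against: float) -> float:
    """Deviation from the mean, in PE units, matching the given odds.

    Assumption: odds of `odds_against` to 1 correspond to a two-tailed
    probability of 1 / (odds_against + 1).
    """
    p_two_tailed = 1.0 / (odds_against + 1.0)
    # Split the probability evenly between the two tails and find the
    # standard-normal cutoff for the upper tail.
    z = STD_NORMAL.inv_cdf(1.0 - p_two_tailed / 2.0)
    return z / PE_IN_SD


print(round(odds_to_pe(30), 2))
```

Under that reading of the odds, the result comes out a little under 3.2 PE, consistent with the figure quoted above.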
You already know that if you go out one PE in each direction from the mean of a
normal distribution you mark off the middle 50% of the distribution. You also know that
if you go out about 2/3 of a standard deviation in each direction from the mean you mark
off the middle 50% of a normal distribution. Accordingly, one PE is equivalent to about
2/3 of a standard deviation, and 3 PE is equivalent to about 2 standard deviations. If you go out about
two standard deviations from the mean in a normal distribution, you have marked off
about the middle 95%, leaving about 5% in the tails, the “rejection region” for the
traditional .05 criterion of statistical significance.
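The conversion from probable errors to standard deviations, and the tail area that results, can be verified numerically. This is a minimal sketch using Python's standard library; the variable and function names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# One PE is the distance from the mean that marks off the middle 50%,
# i.e. the 75th percentile of the standard normal distribution.
pe_in_sd = std_normal.inv_cdf(0.75)   # about 2/3 of a standard deviation
three_pe_in_sd = 3 * pe_in_sd         # about 2 standard deviations


def two_tailed_area(z: float) -> float:
    """Proportion of a normal distribution lying beyond +/- z standard deviations."""
    return 2 * (1 - std_normal.cdf(z))


print(f"1 PE = {pe_in_sd:.4f} SD")
print(f"3 PE = {three_pe_in_sd:.4f} SD")
print(f"Area beyond 3 PE: {two_tailed_area(three_pe_in_sd):.4f}")
print(f"Area beyond 2 SD: {two_tailed_area(2):.4f}")
```

Both tail areas fall a little below .05, which is why "about 5%" is a fair description; the cutoff that leaves exactly 5% in the two tails is 1.96 standard deviations.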
Another reason we might feel comfortable with the .05 criterion is that we have
five fingers on each hand. My reasoning here is the same as I employ when I argue that
the reason we usually use a base-ten number system is that we have ten fingers with
which to count. If we had 12 fingers we would have probably ended up using a base-twelve number system.
Karl L. Wuensch,
August, 2010.