Supplement for Statistics 260
P-Values
Significance Testing: The P-Value Approach
Hypothesis testing questions involve deciding between two contradictory hypotheses: the Null
Hypothesis Ho and the Alternative Hypothesis Ha. The general approach is to put Ho on trial and
conclude that Ho is plausible unless statistical evidence is found that substantially discredits Ho in favour
of Ha. That is, Ho is to be retained unless statistical evidence proves Ha beyond reasonable doubt. The
statistical evidence is summarized in the form of an observed value of a Test Statistic, which is a random
variable having a known distribution under Ho and having a much stronger tendency to take extreme
values under Ha than under Ho.
The classical approach to hypothesis testing involves the construction of a level α rejection region,
which is a set of extreme values that has only a small chance α of including the observed value of the
test statistic under Ho. The user of a level α test rejects Ho if and only if the observed value of the test
statistic falls inside the level α rejection region. For a given hypothesis testing question, the
significance level α, which is the probability of rejecting Ho when Ho is true (or Type I error
probability), is specified by the user at a sufficiently low level (typically .10, or .05, or .01) to give the
desired control over the chance of wrongly rejecting Ho.
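As an illustration of the classical approach, the following Python sketch computes the boundary of a level α rejection region. It assumes the test statistic is standard normal under Ho (as in a large-sample test for a mean); the function name and the labels "two-sided", "greater" and "less" are hypothetical choices made for this sketch.

```python
from statistics import NormalDist

def rejection_region(alpha, alternative):
    """Return (rule, critical value) for a level-alpha rejection region.

    Assumes the test statistic Z is standard normal under Ho.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":
        # Reject for large |Z|: alpha/2 of the probability sits in each tail.
        return ("|Z| >=", std_normal.inv_cdf(1 - alpha / 2))
    if alternative == "greater":
        # Reject for large Z: all of alpha sits in the upper tail.
        return ("Z >=", std_normal.inv_cdf(1 - alpha))
    if alternative == "less":
        # Reject for small Z: all of alpha sits in the lower tail.
        return ("Z <=", std_normal.inv_cdf(alpha))
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

rule, cutoff = rejection_region(0.05, "two-sided")
print(f"Reject Ho when {rule} {cutoff:.3f}")
```

Under these assumptions, a smaller α pushes the cutoff further into the tail, which is how the user controls the chance of wrongly rejecting Ho.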
Although the textbook emphasizes the classical approach to hypothesis testing, the P-value approach
will be emphasized in this course. Here, in order to judge the strength of statistical evidence against Ho
in favour of Ha, we ask: “If Ho is true and we were to rerun the experiment, what would be the chance of
finding evidence against Ho in favour of Ha at least as strong as the observed evidence?” This chance is
the P-value. The smaller the P-value, the rarer the observed result would be if Ho were true, and the
stronger the statistical case against Ho.
For a given hypothesis testing question, the P-value is calculated from the observed value of the test
statistic as follows:
P-value = the probability, computed under the assumption that Ho is true, that a rerun of the experiment
would yield a value of the test statistic that is at least as extreme (in the direction of Ha) as the observed
value.
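The definition can be sketched in Python for the common case where the test statistic is (approximately) standard normal under Ho. That distributional assumption, the function name and the labels for the direction of Ha are all choices made for this sketch, not part of the handout.

```python
from statistics import NormalDist

def p_value(z_obs, alternative):
    """P-value for an observed statistic z_obs, assuming Z ~ N(0, 1) under Ho.

    'alternative' gives the direction of Ha and decides which tail(s)
    count as "at least as extreme" as the observed value.
    """
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: parameter > null value
        return 1 - std_normal.cdf(z_obs)
    if alternative == "less":         # Ha: parameter < null value
        return std_normal.cdf(z_obs)
    if alternative == "two-sided":    # Ha: parameter != null value
        return 2 * (1 - std_normal.cdf(abs(z_obs)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

print(round(p_value(1.96, "two-sided"), 3))
```

The same observed value can give different P-values under different alternatives, because "extreme" is always measured in the direction of Ha.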
The P-Value Approach, step-by-step:
1. Define parameter(s) to be tested. Use standard notation.
2. Specify Ho and Ha.
3. Specify Test Statistic and identify its (approximate) distribution under Ho.
4. Compute observed value of Test Statistic.
5. Compute P-value.
6. Report strength of evidence (very strong if P-value < .01, strong if .01 ≤ P-value < .05,
moderate if .05 ≤ P-value < .10, little or none if P-value ≥ .10) against Ho in favour of Ha, and
report the estimated value of parameter being tested plus the estimated standard error.
7. If asked to test Ho at level α, compare α with P-value and reject Ho if and only if
P-value ≤ α. By doing this, a classical level α test can be carried out without ever
constructing a level α rejection region.
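Steps 6 and 7 can be sketched as two small Python functions. The cutoffs .01, .05 and .10 used here are an assumption: they are a common convention and match the typical significance levels mentioned earlier, but they should be checked against the course's own scale. Both function names are hypothetical.

```python
def strength_of_evidence(p, cutoffs=(0.01, 0.05, 0.10)):
    """Translate a P-value into a verbal strength of evidence against Ho.

    The default cutoffs are an assumed convention, not taken from the handout.
    """
    very_strong, strong, moderate = cutoffs
    if p < very_strong:
        return "very strong"
    if p < strong:
        return "strong"
    if p < moderate:
        return "moderate"
    return "little or none"

def reject_at_level(p, alpha):
    """Level-alpha decision made directly from the P-value (step 7)."""
    return p <= alpha

for p in (0.004, 0.03, 0.07, 0.25):
    print(p, strength_of_evidence(p), reject_at_level(p, 0.05))
```

Reporting the verbal strength of evidence keeps more information than the yes/no decision alone, since P-values of .004 and .03 both lead to rejection at level .05.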
Example 1: The relative conductivity of a semiconductor device is determined by the amount of
impurity “doped” into the device during its manufacture. A silicon diode to be used for a specific
purpose requires a cut-on voltage of .60 volts, and if this is not achieved then the mechanism governing
the amount of impurity must be adjusted. A random sample of 80 such diodes yielded a sample
average voltage of .58 volts and a sample standard deviation of .12 volts. Do these data indicate that
the true average cut-on voltage is something other than .60?
Solution for Example 1:
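1. μ = true mean cut-on voltage of diode (in volts)
2. Ho: $\mu = .60$ versus Ha: $\mu \neq .60$
3. Test Statistic $Z = \dfrac{\bar{X} - .60}{S/\sqrt{n}}$, which is approximately standard normal under Ho because n = 80 is large
4. $z = \dfrac{.58 - .60}{.12/\sqrt{80}} \approx -1.49$
5. P-value $= P(|Z| \geq 1.49) = 2\,P(Z \leq -1.49) \approx 2(.0681) = .1362$
6. There is little or no evidence against the null hypothesis that $\mu = .60$. A mean cut-on voltage of .60
is plausible given the data. The estimated value of μ is $\bar{x} = .58$ volts,
with estimated standard error $= s/\sqrt{n} = .12/\sqrt{80} \approx .0134$ volts.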
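The arithmetic for Example 1 can be checked with a short Python sketch. It assumes a large-sample Z test (test statistic approximately standard normal under Ho) with a two-sided alternative, which is how the question "something other than .60" is read here; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 1
n, xbar, s = 80, 0.58, 0.12
mu0 = 0.60                      # value of the mean specified by Ho

se = s / sqrt(n)                # estimated standard error of the sample mean
z_obs = (xbar - mu0) / se       # observed value of the test statistic

# Two-sided alternative: both tails count as "at least as extreme".
p = 2 * (1 - NormalDist().cdf(abs(z_obs)))

print(f"SE = {se:.4f}, z = {z_obs:.2f}, P-value = {p:.3f}")
```

With these inputs the P-value comes out a little above .13, so the data give little or no evidence that the mean cut-on voltage differs from .60.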
Example 2: A mixture of pulverized fuel ash and Portland cement to be used for grouting should have a
compressive strength of more than 1300 kN/m². The mixture will not be used unless experimental
evidence indicates conclusively that the strength specification has been met. A random sample of 45
specimens of this mixture yielded a sample average compressive strength of 1324 kN/m² and a sample
standard deviation of 62 kN/m². Do these data indicate that this mixture meets the strength
specification? Test the relevant hypotheses using α = .05.
Solution for Example 2:
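1. μ = true mean compressive strength of mixture (in kN/m²)
2. Ho: $\mu = 1300$ versus Ha: $\mu > 1300$
3. Test Statistic $Z = \dfrac{\bar{X} - 1300}{S/\sqrt{n}}$, which is approximately standard normal under Ho because n = 45 is large
4. $z = \dfrac{1324 - 1300}{62/\sqrt{45}} \approx 2.60$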
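5. P-value $= P(Z \geq 2.60) \approx .0047$
6. There is very strong evidence against the null hypothesis in favour of Ha: $\mu > 1300$. The estimated
value of μ is $\bar{x} = 1324$ kN/m², with estimated standard error $= 62/\sqrt{45} \approx 9.24$ kN/m².
7. Since P-value $\approx .0047 \leq .05 = \alpha$, reject Ho at level α = .05.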
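For Example 2, the following Python sketch carries out the level .05 test in two ways, by comparing the P-value with α and by checking a classical rejection region, to illustrate that the two routes give the same decision. It assumes a large-sample Z test with the upper-tailed alternative μ > 1300; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 2
n, xbar, s = 45, 1324.0, 62.0
mu0, alpha = 1300.0, 0.05
std_normal = NormalDist()

z_obs = (xbar - mu0) / (s / sqrt(n))      # observed test statistic

# P-value approach: upper tail only, because Ha says the mean exceeds 1300.
p = 1 - std_normal.cdf(z_obs)
reject_by_p = p <= alpha

# Classical approach: reject when z_obs falls in the region Z >= z_crit.
z_crit = std_normal.inv_cdf(1 - alpha)
reject_by_region = z_obs >= z_crit

print(f"z = {z_obs:.2f}, P-value = {p:.4f}, critical value = {z_crit:.3f}")
print("Reject Ho:", reject_by_p, reject_by_region)
```

Both decisions agree because the event "P-value ≤ α" is the same event as "the observed statistic lies in the level α rejection region".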
Example 3: A new design for the braking system on a certain type of car has been proposed. For the
current system, the true average braking distance at 65 kph under specified conditions is known to be
40 m. It is proposed that the new design be implemented only if sample data strongly indicate a
reduction in true average braking distance for the new design. A random sample of 46 observations
using the new design yielded a sample average braking distance of 38.9 m and a sample standard
deviation of 3.7 m. Do these data indicate that the new design should be implemented?
Solution for Example 3:
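1. μ = true mean braking distance at 65 kph using the new design (in metres)
2. Ho: $\mu = 40$ versus Ha: $\mu < 40$
3. Test Statistic $Z = \dfrac{\bar{X} - 40}{S/\sqrt{n}}$, which is approximately standard normal under Ho because n = 46 is large
4. $z = \dfrac{38.9 - 40}{3.7/\sqrt{46}} \approx -2.02$
5. P-value $= P(Z \leq -2.02) \approx .0217$
6. There is strong evidence against the null hypothesis in favour of Ha: $\mu < 40$. The estimated value of μ
is $\bar{x} = 38.9$ m, with estimated standard error $= 3.7/\sqrt{46} \approx .546$ m.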
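The "rerun the experiment" reading of the P-value can be illustrated for Example 3 with a small simulation in Python. This is only a sketch under added assumptions that are not in the handout: braking distances are taken to be normally distributed, the population standard deviation is set equal to the sample value 3.7 m, and the seed and number of reruns are arbitrary.

```python
import random
from math import sqrt

random.seed(260)  # arbitrary seed so the run is repeatable

n, mu0 = 46, 40.0
sigma = 3.7       # assumption: sample SD used as the population SD
z_obs = (38.9 - mu0) / (3.7 / sqrt(n))   # observed test statistic

def rerun_statistic():
    """Simulate one rerun of the experiment under Ho and return its statistic."""
    xs = [random.gauss(mu0, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s = sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    return (xbar - mu0) / (s / sqrt(n))

reruns = 10_000
# Ha says the mean is below 40, so "at least as extreme" means "at least as small".
as_extreme = sum(rerun_statistic() <= z_obs for _ in range(reruns))
p_sim = as_extreme / reruns

print(f"z = {z_obs:.2f}, simulated P-value = {p_sim:.3f}")
```

The simulated proportion should land in the neighbourhood of the normal-table value for a lower-tailed test with z near -2.02, that is, a few percent.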