Lecture Notes (Italics = Handouts)
Chapter 18 (Moore)
Inference about a Population Proportion
Note: We are covering the last half of this book in a different order
and this chapter follows Chapter 12.
Statistical Inferences
Review – the potato picture
(population:parameter::sample:statistic)
Types of inference we will look at
Estimation of parameters
Hypothesis Testing (Tests of Significance)
Estimates of Parameters
Types – point vs interval
Confidence Interval Estimates (see boxes on pages 233 – 234)
level of confidence (confidence level, TI: C-Level)
margin of error, m: the maximum likely difference between
the parameter being estimated and our estimate
Estimation of a population proportion
Sampling Distribution of a proportion, p̂
Requirements:
SRS (Independence of sampled values)
The count of successes in the sample is binomial
The sample is sufficiently large
The sampling distribution of the sample proportion, p̂ (note that
Moore does not use the standard notation μ_p̂ and σ_p̂, but I find
it convenient, so I will use it):
E(p̂) = μ_p̂ = p
SD(p̂) = σ_p̂ = √( p(1 − p) / n )
This distribution is approximately normal for a random
sample of size n, sufficiently large (Moore states this by
saying that np and n(1 − p) are both ≥ 15).
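The normal approximation can be checked by simulation. The following is a quick sketch in Python (the language and the values p = 0.30, n = 100 are my own illustration, not from Moore): it draws many SRSs and compares the simulated mean and standard deviation of p̂ against the formulas μ_p̂ = p and σ_p̂ = √(p(1 − p)/n).

```python
import random
import statistics
from math import sqrt

# Illustrative values (not from the text): population proportion and sample size.
p, n, trials = 0.30, 100, 20_000
random.seed(1)

# Each simulated p-hat is the count of "successes" in an SRS of size n, divided by n.
phats = [sum(random.random() < p for _ in range(n)) / n for _ in range(trials)]

print(statistics.mean(phats))    # close to p = 0.30
print(statistics.stdev(phats))   # close to the formula value on the next line
print(sqrt(p * (1 - p) / n))     # sigma of p-hat, about 0.0458
```

Here np = 30 and n(1 − p) = 70, both ≥ 15, so a histogram of the simulated p̂ values looks approximately normal.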
When we can’t compute the standard deviation of a sampling
distribution because we don’t know the value of a parameter
(e.g. p), we use a statistic that estimates the parameter (e.g.
p̂ ); the resulting measure is called the “standard error”, SE, of
the sampling distribution.
SE(p̂) = √( p̂(1 − p̂) / n ) ≈ √( p(1 − p) / n ) = SD(p̂)
Note it is common notation to use q = 1 – p
Confidence Interval (CI) estimation of a population proportion, p
Margin of error, m = z* SE(p̂) = z* √( p̂(1 − p̂) / n )
A CI estimate of p is ( p̂ − m, p̂ + m) (see page 331)
This can also be written as p̂ ± m
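As a concrete sketch, the interval can be computed in a few lines of Python (the data, 330 successes in an SRS of 500, are hypothetical; z* = 1.96 is the critical value for 95% confidence):

```python
from math import sqrt

x, n = 330, 500        # hypothetical: 330 "successes" in an SRS of 500
z_star = 1.96          # critical value for a 95% confidence level

p_hat = x / n                        # sample proportion, 0.66
se = sqrt(p_hat * (1 - p_hat) / n)   # standard error SE(p-hat)
m = z_star * se                      # margin of error

print(f"p-hat = {p_hat:.3f}, m = {m:.3f}")
print(f"95% CI: ({p_hat - m:.3f}, {p_hat + m:.3f})")
```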
Sample size for a CI for p
n = ( z* / m )² · p*(1 − p*), where p* is your best estimate for p. If
you have no estimate for p, use p* = 0.5.
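For instance, to be 95% confident with a margin of error of 0.03 and no prior estimate of p (the numbers here are my own illustration), the formula can be checked in Python:

```python
from math import ceil

z_star, m = 1.96, 0.03   # 95% confidence, desired margin of error (illustrative)
p_star = 0.5             # no prior estimate of p available

n = (z_star / m) ** 2 * p_star * (1 - p_star)
print(ceil(n))           # round UP so the margin of error is guaranteed: 1068
```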
Significance tests for a population proportion, p
The basic idea of a hypothesis test
A hypothesis is a statement or claim about a property of a
population
The general idea:
• An assumption is made about the nature of the population.
• Based on this assumption, an expectation about the value of
the statistic computed from the sample is formed.
• Sample data are gathered and sample statistics are
computed.
• The difference between what we expected, based on our
assumption about the population, and what we observed in
the sample is examined.
• If the probability of the observation is extremely small, we
conclude that the assumption is probably not correct.
Example 1:
We assume that a coin is “fair” (P(head) = p = 0.50).
If we were to flip the coin a large number of times, say
400, we would expect the sample proportion, p̂ , to be
near 0.50.
If we observed 300 heads in the 400 flips of this coin then
p̂ = 0.75.
The likelihood of this observation with a fair coin is
extremely small; the probability is less than 10⁻²²!
Based on this observation we would conclude that the
assumption that this is a “fair” coin is incorrect.
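The “less than 10⁻²²” claim can be verified with the normal approximation; a small Python sketch (using math.erfc for the standard normal upper tail, since P(Z > z) = erfc(z/√2)/2):

```python
from math import erfc, sqrt

p0, n = 0.50, 400       # fair-coin assumption and number of flips
p_hat = 300 / 400       # observed proportion of heads, 0.75

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # (0.75 - 0.50) / 0.025 = 10
p_value = erfc(z / sqrt(2)) / 2              # upper-tail P(Z > z)

print(z, p_value)       # about 7.6e-24, indeed below 10**-22
```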
Example 2:
We assume that the mean life of 60 watt GE light bulbs is
more than 1000 hours.
If we sample 120 of these light bulbs and test them we’d
expect the sample mean, x̄, to be greater than 1000.
Our sample mean is x̄ = 950 hours with s = 78 hours.
The likelihood of this observation if μ > 1000 is extremely
small; the probability is less than 10⁻¹²!
Based on this observation we would conclude that the
assumption that the mean is greater than 1000 hours is
incorrect.
Choosing a null and alternative hypothesis and the steps of a
hypothesis test. (Hypothesis Testing Format).
We will use the P-value method (today it is the method most commonly
seen in research and journals). I use a 5-step format; others use
anywhere from 4 to 8 steps by breaking it up differently, but
the procedures are essentially the same.
Null and Alternative Hypotheses
Test statistic: z = ( p̂ − p₀ ) / √( p₀(1 − p₀) / n )
P-value: use normalcdf to compute
Conclusions
When testing H0: p = p0 against
HA: p > p0 P-value = P(Z > zobserved)
HA: p < p0 P-value = P(Z < zobserved)
HA: p ≠ p0 P-value = 2 P(Z > |zobserved|)
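Put together, the test statistic and the three P-value rules can be coded directly. This Python sketch replaces the TI normalcdf with math.erfc (P(Z > z) = erfc(z/√2)/2); the data, 270 successes in 500 tested against p₀ = 0.5, are hypothetical:

```python
from math import erfc, sqrt

def one_prop_ztest(x, n, p0, alternative="two-sided"):
    """One-proportion z-test: returns (z, P-value).

    alternative is "greater" (HA: p > p0), "less" (HA: p < p0),
    or "two-sided" (HA: p != p0)."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    upper = erfc(z / sqrt(2)) / 2            # P(Z > z)
    if alternative == "greater":
        return z, upper
    if alternative == "less":
        return z, 1 - upper
    return z, erfc(abs(z) / sqrt(2))         # 2 * P(Z > |z|)

# Hypothetical data: 270 successes in an SRS of 500, testing H0: p = 0.5.
z, p = one_prop_ztest(270, 500, 0.5)
print(f"z = {z:.3f}, two-sided P-value = {p:.4f}")
```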
Caution: “Fail to reject” is not the same as “accept”. Recall the trial
analogy of acquittal vs guilty: “acquittal” is not the same as “innocent.”
Note that the P-value represents the strength of the evidence against
the null hypothesis (H0) and in favor of the alternative hypothesis
(HA) on a continuous scale. (The terminology is my own
interpretation and is by no means universal or even common.)
P-value   Strength of the evidence against H0   Statistically significant
0.200     Extremely weak                        Never
0.150     Very weak                             Rarely
0.100     Weak                                  Infrequently
0.050     Moderate                              Marginally
0.025     Moderately strong                     Usually
0.010     Strong                                Almost always
0.001     Very Strong                           Always
In traditional (old school) hypothesis testing a P-value of 0.049
would have been considered statistically significant and the null
hypothesis would be rejected in favor of the alternative, while a
P-value of 0.051 would have been considered not statistically
significant and H0 would not have been rejected. Most statisticians
would consider this somewhat misleading since P-values of 0.049
and 0.051 represent essentially the same degree of evidence. This is
why the more modern approach to presenting the results of a test of
significance is to simply give the P-value and some statement about
the strength of the evidence, e.g. “The data provide fairly strong
statistical evidence (P = 0.018) that the proportion of women in favor
of stronger gun control laws is higher than the proportion of men in
favor of these laws.”
Decision             H0 true            H0 false
Reject H0            Error (Type I)     Correct
Fail to reject H0    Correct            Error (Type II)
Examples of Type I and Type II errors.
P(Type I error) = α (alpha) and P(Type II error) = β (beta)
Controlling α and β
Statistical vs Practical significance:
Statistical significance simply means that there is sufficient
statistical evidence to reject the null hypothesis in favor of the
alternative.
Practical significance means the actual difference between the
value in the null hypothesis and reality is large enough to be
meaningful or useful in the real world. For example, someone
claims that only about 40% of all dogs in Solano County are
licensed. A random sample of 800 dogs is taken and the sample
proportion is 43%, which is statistically significantly more
than 40%; but is it of any practical importance that, out of every
100 dogs, 3 more are licensed?
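Running the z-test on these (hypothetical) dog-licensing numbers makes the point concrete, again using math.erfc for the normal upper tail:

```python
from math import erfc, sqrt

p0, p_hat, n = 0.40, 0.43, 800    # claim, sample proportion, sample size

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = erfc(z / sqrt(2)) / 2   # one-sided P(Z > z) for HA: p > 0.40

print(f"z = {z:.2f}, P-value = {p_value:.3f}")
# Statistically significant at the 5% level, yet the practical difference
# is only about 3 more licensed dogs per 100.
```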
The power of a test is the probability of rejecting a false null
hypothesis (making a correct decision when the null hypothesis is
false); this probability is 1 − β. Computing power can be done with
some computer programs like Minitab.
Power is important because if a test is too powerful it will reject H0
when the difference is so small that it is of no practical importance.
On the other hand, if the power is too small we won’t even be able to
detect large differences between our assumption and reality.
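When software like Minitab isn’t at hand, power can also be estimated by simulation. The sketch below is my own illustration: a one-sided 5% test of H0: p = 0.40 when the true proportion is actually 0.43 (as in the dog example), showing how power grows with sample size.

```python
import random
from math import sqrt

def estimated_power(p_true, p0, n, trials=2000):
    """Estimate power: the fraction of simulated SRSs in which
    H0: p = p0 is rejected (one-sided HA: p > p0, alpha = 0.05)
    when the true population proportion is p_true."""
    z_crit = 1.645                    # one-sided 5% critical value
    rejections = 0
    for _ in range(trials):
        x = sum(random.random() < p_true for _ in range(n))
        z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
        rejections += z > z_crit
    return rejections / trials

random.seed(2)
power_small = estimated_power(0.43, 0.40, 200)    # low power: often misses the gap
power_large = estimated_power(0.43, 0.40, 3000)   # high power: detects the 3-point gap
print(power_small, power_large)
```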
Chapter 18: Exercises: 1, 3, 5, 9, 10, 13 – 19, 21, 25, 31, 33, 35