Notes Chapter 19: Confidence Interval for a Single Proportion
When we record categorical variables, our data consists of counts or of percents obtained
from counts. We are doing inference on the parameters of population proportions.
$\hat{p} = \frac{x}{n}$
Confidence tells how “confident” we are that our calculations captured the true proportion
- it lends credibility to our inferences.
A Confidence Interval in General:
statistic ± (critical value)·(standard deviation of statistic)
A Confidence Interval for a Proportion:
$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
z* is the critical value from the standard normal distribution – we can look it up in the
bottom row (infinite degrees of freedom) of the t-distribution table.
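The interval formula above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function name `proportion_ci` and the data (120 successes out of 300) are made up, and z* is computed from the standard normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, confidence=0.95):
    """One-proportion z-interval: p_hat ± z* · sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = x / n
    # z* leaves (1 - confidence)/2 in each tail of the standard normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data (not from the notes): 120 successes in a sample of 300.
low, high = proportion_ci(120, 300, confidence=0.95)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The conditions listed next still have to be checked before the interval can be trusted.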
Conditions: (these are the same as they were for sampling distributions)
1) Randomization. The sample should be a simple random sample (SRS) of the
population. (This is often difficult to achieve in reality. We at least need to be very
confident that the sampling method was not biased and that the sample is representative of
the population.)
2) 10% Rule. To ensure independence, we cannot take a sample that is too large
without replacement. As long as our sample is no more than 10% of our population size,
we protect independence.
3) Success/Failure. To ensure that the sample size is large enough for the sampling
distribution to be approximately normal, we must expect at least 10 successes and at least 10 failures.
np ≥ 10 and n(1 – p) ≥ 10
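The two numerical conditions can be checked mechanically. The sketch below is an assumption-laden illustration: the function name, the dictionary keys, and the sample values are all hypothetical. Randomization cannot be verified from numbers, so it is deliberately left out.

```python
def check_conditions(n, p, population_size=None):
    """Check the 10% rule and the success/failure condition.

    Randomization has to be judged from how the data were collected,
    so it is not checked here.
    """
    results = {
        "successes_at_least_10": n * p >= 10,
        "failures_at_least_10": n * (1 - p) >= 10,
    }
    if population_size is not None:
        # The sample should be no more than 10% of the population.
        results["ten_percent_rule"] = n <= 0.10 * population_size
    return results


# Hypothetical values: a sample of 50 from a population of 10,000 with p = 0.1.
conditions = check_conditions(n=50, p=0.1, population_size=10_000)
print(conditions)
```

When p is unknown, as it is for a confidence interval, p-hat is typically used in its place.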
Your two key phrases when making statements:
Phrase 1: Interprets the confidence level:
Saying that we are “95% confident” means that if many samples of this size were taken and
an interval were constructed from each in this manner, we would expect approximately 95% of
those intervals to contain the true proportion of ____(context).
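A small simulation can make Phrase 1 concrete. This is a sketch under stated assumptions: the true proportion (0.4), sample size (200), number of repetitions (2000), and the function name `coverage` are all invented for illustration. In real inference the true proportion is unknown.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(true_p, n, confidence, trials, seed=0):
    """Share of simulated intervals that contain the true proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the successes.
        x = sum(rng.random() < true_p for _ in range(n))
        p_hat = x / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials


rate = coverage(true_p=0.4, n=200, confidence=0.95, trials=2000)
print(f"Share of intervals containing the true proportion: {rate:.3f}")
```

The printed share should land near 0.95, though not exactly on it, since the normal approximation and the finite number of trials both add some noise.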
Phrase 2: Interprets a single confidence interval:
We are #% confident that the true proportion of ____(context) lies in the interval ……
The more observations we have (n) the more we reduce our margin of error. However,
taking measurements can be difficult or costly. We must balance our desire for a small
margin of error with practical judgment.
There is also a relationship between the confidence level and the margin of error. As our
confidence increases so does our margin of error. A great deal of confidence is not very
helpful when it makes the margin of error so large that the interval tells us nothing.
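Both relationships can be seen by tabulating the margin of error for a few sample sizes and confidence levels. The values below (p-hat = 0.5, n = 100, 400, 1600) are hypothetical and chosen only to show the pattern; `margin_of_error` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence):
    """z* times the standard error of p_hat."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


levels = (0.90, 0.95, 0.99)
print("n     " + "  ".join(f"{c:.0%}" for c in levels))
for n in (100, 400, 1600):
    row = "  ".join(f"{margin_of_error(0.5, n, c):.3f}" for c in levels)
    print(f"{n:<5} {row}")
```

Reading across a row, the margin grows with the confidence level. Reading down a column, quadrupling n cuts the margin in half, which is why small margins of error get expensive quickly.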
Remember that this is inference. It is not a promise or a certainty.
**When trying to achieve a certain margin of error,
if p-hat has not been established yet, you can estimate it to get an approximate sample
size. If no good estimate is available, p-hat = .5 is the most conservative estimate of p-hat,
because p-hat(1 – p-hat) is largest there, and will ensure a margin of error no larger than that desired.
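Solving the margin of error formula for n gives n = (z*)² · p(1 – p) / ME², rounded up. A minimal sketch follows; the function name is an assumption, and the inputs (95% confidence, margin 0.03, estimate 0.2) are borrowed from the automobile manufacturer example later in these notes.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most `margin`.

    p_guess = 0.5 is the conservative choice when no estimate is available.
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would give a margin slightly too large.
    return ceil(z_star**2 * p_guess * (1 - p_guess) / margin**2)


n_conservative = required_sample_size(0.03)              # no estimate: use 0.5
n_with_estimate = required_sample_size(0.03, p_guess=0.2)
print(n_conservative, n_with_estimate)
```

The conservative choice asks for a noticeably larger sample, which is the price of not having a prior estimate.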
Notes Chapter 20: Testing Hypotheses about Proportions
The second type of inference: used to assess the evidence provided by data in favor of
some claim about the population.
We will begin by supposing for the sake of argument that the “effect” is not present. The
first step is to state a claim that we will try to find evidence against.
This is called the null hypothesis.
The null hypothesis, Ho, is the statement being tested. The test is then designed to assess
the strength of the evidence against the null hypothesis. Usually it is a statement of “no
effect or no difference”.
The alternative hypothesis, Ha, is the statement we hope or suspect is true instead of Ho.
Hypotheses always refer to some population, not a particular outcome. We must state Ho
and Ha in terms of population parameters.
Ha can be one sided or two sided.
P-Values
- A test of significance assesses the evidence against the null hypothesis in terms of
probability.
- Ha determines what kinds of outcomes count as evidence against Ho and in favor of Ha.
Definition of P-value
the probability (computed assuming that Ho is true) that the test statistic would take a value
as extreme or more extreme than that actually observed. The smaller the P-value, the
stronger the evidence against Ho provided by the data.
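For the tests in these notes the test statistic is a z-score, so "as extreme or more extreme" translates into a tail area under the standard normal curve, and which tail depends on Ha. The sketch below assumes that setting; the function name and the labels 'greater', 'less', and 'two-sided' are illustrative choices, not notation from the notes.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """Tail area beyond z under the standard normal curve.

    alternative: 'greater' (Ha: p > p0), 'less' (Ha: p < p0),
    or 'two-sided' (Ha: p differs from p0).
    """
    std_normal = NormalDist()
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "two-sided":
        # Count both tails, each as far out as |z|.
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# Hypothetical test statistic, for illustration only.
for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value(1.5, alt), 4))
```

The same z gives very different P-values under different alternatives, which is one reason Ha must be chosen before looking at the data.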
Significance Level – a fixed value that we regard as decisive. The level must be
chosen before any calculations are done; otherwise the choice is biased by the results. Denoted α (alpha).
If the P-value is as small as or smaller than α, we say that the data are statistically
significant at level α.
Common Steps to all Significance Tests:
1) State Ho and Ha.
2) Specify significance level, α.
3) Identify correct test and conditions.
4) Calculate the value of the test statistic
5) Find the P-value for the observed data
(If the P-value is less than or equal to α, the test
result is “statistically significant at level α.”)
6) Answer the question in context.
In general for Hypothesis Testing:
Standardized test statistic:
(statistic – parameter) / (standard deviation of statistic)
Test Statistic for a proportion:
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$
Conditions are the same as those for confidence intervals of proportions.
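Note that the standard deviation in the test statistic uses the hypothesized p0, not p-hat, because the calculation assumes Ho is true. A minimal sketch, with a made-up function name and made-up data (66 successes in 100 trials, Ho: p = 0.5, Ha: p > 0.5):

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(x, n, p0):
    """z = (p_hat - p0) / sqrt(p0(1 - p0)/n), computed assuming Ho is true."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


# Hypothetical data (not from the notes): 66 successes in 100 trials.
z = one_proportion_z(66, 100, p0=0.5)
# Ha: p > p0, so the P-value is the area to the right of z.
p_val = 1 - NormalDist().cdf(z)
print(f"z = {z:.2f}, P-value = {p_val:.4f}")
```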
To write a set of hypotheses,
Ho: p = po (the pop. proportion is the true center)
and one of the following:
Ha : p > po (seeks evidence that the pop. prop is larger)
Ha: p < po (seeks evidence that the pop. prop is smaller)
Ha: p ≠ po (seeks evidence that the pop. prop is different)
(po is replaced with a numerical value of interest)
A few things to remember…
*Don’t base your null hypotheses on what you see in the data. You must always think
about the situation you are investigating and make your null hypothesis describe the
“nothing interesting” or “nothing has changed” scenario. No peeking at the data!
* Don’t base your alternative hypotheses on what you see in the data either. You must
again think about the situation you are investigating and decide on your alternative based
on what results would be of interest to you, not what you might see in the data.
* Don’t make your null hypothesis what you want to show to be true. Remember, the
null is the status quo, the “nothing is strange here” position. You wonder whether the data
casts doubt on that. You can reject the null hypothesis, but you can never “accept” or
“prove” the null.
Notes Chapter 21: More About Hypothesis Testing
We have talked some about α (alpha) levels. I like to think of an alpha level like a “line in
the sand”. It identifies for us up front, how extreme we think the sample statistics must be,
in order to be considered “significant”. The most common levels of alpha are .10, .05, and
.01. We choose an alpha based on the consequences of an incorrect conclusion. Those
incorrect conclusions are…
Type I and Type II Errors
| “My decision” \ “The truth” | H0 is true | H0 is false |
|---|---|---|
| Fail to reject H0 | Confidence Level (1 - α): this is a good decision; it is the probability of stating no difference when there is none. | Type II Error (β): the probability of stating there is no difference when there actually is one. |
| Reject H0 | Type I Error (α): the probability of stating there is a difference when there actually isn’t one. | Power (1 - β): this is also a good decision; it is the probability of stating there is a difference when there actually is one. |
The power of a test is defined as the probability of correctly rejecting a false null hypothesis.
The distance between the null hypothesis value po and the true p is called the effect size.
Ideally we would like to reduce the probability of making Type I and Type II errors while at
the same time having a powerful test. Unfortunately it’s not that simple. As we alter one, we
often have an effect on the other. Here are some things you should know about Type I,
Type II, and Power…
*Increasing the sample size (which decreases the variability) will increase the power
(1 - β).
*Increasing the effect size will increase the power (1 - β).
*Increasing alpha (α) will increase the power (1 - β).
* Anything that increases the power (1 - β ) will automatically decrease the Type II error
(β).
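The bullet points above can be checked numerically. The sketch below approximates power for a one-sided test (Ha: p > po) using the normal approximation; the function name and all of the numbers (po = 0.5, true p = 0.55, n = 200, α = 0.05) are hypothetical and serve only to show the direction of each change.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided(p0, true_p, n, alpha):
    """Approximate power of the test of Ho: p = p0 versus Ha: p > p0."""
    std_normal = NormalDist()
    # Reject Ho when p_hat exceeds this cutoff (computed assuming Ho is true).
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Probability that p_hat lands beyond the cutoff when the truth is true_p.
    se_true = sqrt(true_p * (1 - true_p) / n)
    return 1 - std_normal.cdf((cutoff - true_p) / se_true)


base = power_one_sided(0.5, 0.55, n=200, alpha=0.05)
bigger_n = power_one_sided(0.5, 0.55, n=800, alpha=0.05)
bigger_effect = power_one_sided(0.5, 0.60, n=200, alpha=0.05)
bigger_alpha = power_one_sided(0.5, 0.55, n=200, alpha=0.10)

for label, value in [("baseline", base), ("larger n", bigger_n),
                     ("larger effect size", bigger_effect),
                     ("larger alpha", bigger_alpha)]:
    print(f"{label:<20} power = {value:.3f}  beta = {1 - value:.3f}")
```

Each change raises the power above the baseline and lowers β by the same amount, but raising α also raises the chance of a Type I error, which is the balancing act described next.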
It is like a balancing act between all of these!! There are no guarantees for a correct
decision.
On the AP Test you do not have to calculate power. You must understand power
conceptually and understand how changing other values affects power.
Chapter 19 Examples:
The Princeton Metro Times reported that 48% of a random sample of 369 students at the College of New
Jersey indicated that they were “binge drinkers”. Binge drinking was defined as consuming 5-6 drinks in
1 sitting for men and 4-5 drinks in 1 sitting for women. Construct and interpret a 90% confidence interval
for the proportion of students at the College of New Jersey who are binge drinkers.
Suppose a new treatment for a certain disease is given to a random sample of 200 patients with the
disease. The treatment was successful for 166 of the patients.
A) Construct and interpret a 99% confidence interval for the proportion of patients with this disease who
were successfully treated.
B) In the context of this situation, explain what it means to be 99% confident in any interval.
C) If the traditional treatment for this disease has a success rate of about 70%, does this interval give
evidence that the new treatment is better? Explain.
An automobile manufacturer would like to know what proportion of its customers are not satisfied with
the service provided by their local dealer. The customer relations department will survey a random sample
of customers and compute a 95% confidence interval for the proportion who are not satisfied. From past
studies they believe that this proportion will be about 0.2. Find the sample size needed if the margin of
error of this confidence interval is to be about 0.03.
Chapter 20 Examples:
Shaquille O’Neal of the Los Angeles Lakers, the NBA’s most valuable player for the 2000 season,
showed a significant weakness in free throw shooting, shooting only 53.3% from the free throw line.
During the off season after 2000, Shaq worked with assistant coach Tex Winter on his free throw
technique. During the first two games of the next season, Shaq made 26 out of 39 free throws.
Do these results provide evidence that Shaq has improved his free throw shooting?
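One way to work through the Shaq example with the common steps is sketched below. The hypotheses Ho: p = 0.533 and Ha: p > 0.533 follow from the wording of the question; the significance level α = 0.05 is an assumption, since the exercise does not give one, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Steps 1-2: Ho: p = 0.533, Ha: p > 0.533; alpha = 0.05 is assumed here.
p0, alpha = 0.533, 0.05
made, attempts = 26, 39

# Step 3: success/failure condition, using p0 because Ho is assumed true.
conditions_ok = attempts * p0 >= 10 and attempts * (1 - p0) >= 10

# Step 4: test statistic.
p_hat = made / attempts
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / attempts)

# Step 5: P-value, the area to the right of z because Ha is p > p0.
p_val = 1 - NormalDist().cdf(z)

print(f"conditions ok: {conditions_ok}")
print(f"z = {z:.3f}, P-value = {p_val:.4f}, significant: {p_val <= alpha}")
```

Step 6 is still up to the reader. The result sits close to the assumed cutoff, and free throws from the first two games are not a random sample, so the randomization condition deserves a comment in any written answer.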
The manufacturer of a particular brand of microwave popcorn claims that only 2% of its kernels of corn
fail to pop. A competitor, believing that the actual percentage is larger, tests 2000 kernels and finds that
44 failed to pop. Do these results provide sufficient evidence to support the competitor’s belief?
About 10% of the adult population is left handed. Suppose that a researcher speculates that artists are
more likely to be left handed than are other people in the general population. The researcher surveys 150
artists and finds that 18 of them are left handed. Is this sufficient evidence to support the researcher’s
claim?
Chapter 21 Examples:
Medical researchers now believe there may be a link between baldness and heart attacks in men.
A) State the null and alternative hypotheses for a study used to investigate whether or not there is such a
relationship.
B) In the context of this situation, what would a Type I error be and what would be a consequence of that
decision?
C) In the context of this situation, what would a Type II error be and what would be a consequence of that
decision?
The marketing department for a computer company must determine the selling price for a new model of
personal computer. In order to make a reasonable profit, the company would like the computer to sell for
$3200. If more than 30% of the potential customers would be willing to pay this price, the company will
adopt it. A survey of potential customers is to be carried out; it will include a question asking the
maximum amount that the respondent would be willing to pay for a computer with the features of the new
model. Let p denote the proportion of all potential customers who would be willing to pay $3200 or
more. Then the hypotheses to be tested are
Ho: p = .3 versus Ha: p > .3.
In the context of this example, describe type I and type II errors. Discuss the possible consequences of
each type of error.
Occasionally, warning flares of the type contained in most automobile emergency kits fail to ignite. A
consumer advocacy group is to investigate a claim against a manufacturer of flares brought by a person
who claims that the proportion of defectives is much higher than the value of .1 claimed by the
manufacturer. A large number of flares will be tested and the results used to decide between
Ho: p = .1 versus Ha: p > .1,
where p represents the true proportion of defectives for flares made by this manufacturer. If Ho is
rejected, charges of false advertising will be filed against the manufacturer.
A) Explain why the alternative hypothesis was chosen to be Ha: p > .1.
B) In this context, describe type I and type II errors and discuss the consequences of each.