Notes Chapter 19: Confidence Interval for a Single Proportion

When we record categorical variables, our data consist of counts or of percents obtained from counts. We are doing inference on population proportions. The sample proportion is $\hat{p} = x/n$, where x is the number of successes in a sample of size n.

Confidence tells how “confident” we are that our calculations captured the true proportion; it lends credibility to our inferences.

A Confidence Interval in General: statistic ± (critical value)·(standard deviation of statistic)

A Confidence Interval for a Proportion: $\hat{p} \pm z^* \sqrt{\hat{p}(1 - \hat{p})/n}$

z* is the critical value. We look it up in the bottom row of the t-distribution table (infinite degrees of freedom), which lists the standard normal critical values.

Conditions: (these are the same as they were for sampling distributions)
1) Randomization. The sample should be a simple random sample (SRS) of the population. (This is often difficult to achieve in reality. We at least need to be very confident that the sampling method was not biased and that the sample is representative of the population.)
2) 10% Rule. To ensure independence, we cannot take too large a sample without replacement. As long as our sample is no more than 10% of the population, we protect independence.
3) Success/Failure. To ensure that the sample size is large enough for the normal approximation, we must expect at least 10 successes and at least 10 failures: $np \ge 10$ and $n(1 - p) \ge 10$.

Your two key phrases when making statements:
Phrase 1: Interprets the confidence level: Saying that we are “95% confident” means that if many samples were taken and an interval were constructed from each in this manner, we would expect approximately 95% of those intervals to contain the true proportion of ____ (context).
Phrase 2: Interprets a single confidence interval: We are #% confident that the true proportion of ____ (context) lies in the interval ……

The more observations we have (n), the more we reduce our margin of error. However, taking measurements can be difficult or costly. We must balance our desire for a small margin of error with practical judgment.
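The interval formula and the effect of n on the margin of error can be sketched in a few lines of Python. This is only an illustration: the function name proportion_ci and the counts are made up for the example, and the code checks only the Success/Failure condition (using the observed counts, since p is unknown). Randomization and the 10% Rule still have to be judged from how the data were collected.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """One-proportion z-interval: p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    # Success/Failure check uses the observed counts, since p is unknown.
    if successes < 10 or n - successes < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin


# Hypothetical samples that all have p_hat = 0.60 but different sizes.
for successes, n in [(60, 100), (240, 400), (960, 1600)]:
    low, high, margin = proportion_ci(successes, n)
    print(f"n={n:5d}  interval=({low:.3f}, {high:.3f})  margin={margin:.3f}")
```

Because the margin of error is proportional to 1/sqrt(n), each fourfold increase in sample size in this sketch cuts the margin in half, which is why shrinking the margin further gets expensive quickly.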
There is also a relationship between the confidence level and the margin of error. As our confidence increases, so does our margin of error. A great deal of confidence is not very helpful when it makes the margin of error so large that the interval tells us nothing. Remember that this is inference. It is not a promise or a certainty.

**When trying to achieve a certain margin of error ME, solve $ME = z^* \sqrt{\hat{p}(1 - \hat{p})/n}$ for n and round up. If p-hat has not been established yet, you can estimate it to get an approximate sample size. If no good estimate is available, p-hat = 0.5 is the most conservative choice: it makes $\hat{p}(1 - \hat{p})$ as large as possible, so it ensures a margin of error no larger than the one desired.

Notes Chapter 20: Testing Hypotheses about Proportions

Hypothesis testing is the second type of inference. It is used to assess the evidence provided by data in favor of some claim about the population. We begin by supposing, for the sake of argument, that the “effect” is not present.

The first step is to state a claim that we will try to find evidence against. This is called the null hypothesis. The null hypothesis, H0, is the statement being tested. The test is then designed to assess the strength of the evidence against the null hypothesis. Usually it is a statement of “no effect” or “no difference”.

The alternative hypothesis, Ha, is the statement we hope or suspect is true instead of H0.

Hypotheses always refer to some population, not to a particular outcome. We must state H0 and Ha in terms of population parameters. Ha can be one-sided or two-sided.

P-Values
- A test of significance assesses the evidence against the null hypothesis in terms of probability.
- Ha determines what kinds of outcomes count as evidence against H0 and in favor of Ha.

Definition of P-value: the probability (computed assuming that H0 is true) that the test statistic would take a value as extreme as or more extreme than the one actually observed. The smaller the P-value, the stronger the evidence against H0 provided by the data.

Significance Level: setting a fixed value that we regard as decisive.
The level must be chosen in advance of the calculations; otherwise the choice is biased by them. It is denoted α (alpha). If the P-value is as small as or smaller than α, we say that the data are statistically significant at level α.

Common Steps to all Significance Tests:
1) State H0 and Ha.
2) Specify the significance level, α.
3) Identify the correct test and check its conditions.
4) Calculate the value of the test statistic.
5) Find the P-value for the observed data. (If the P-value is less than or equal to α, the test result is “statistically significant at level α”.)
6) Answer the question in context.

In general for Hypothesis Testing:
Standardized test statistic: (statistic - parameter) / (standard deviation of statistic)

Test Statistic for a proportion: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}}$

Conditions are the same as those for confidence intervals of proportions, except that Success/Failure is checked with p0.

To write a set of hypotheses, use H0: p = p0 (the pop. proportion is the true center) and one of the following:
Ha: p > p0 (seeks evidence that the pop. prop. is larger)
Ha: p < p0 (seeks evidence that the pop. prop. is smaller)
Ha: p ≠ p0 (seeks evidence that the pop. prop. is different)
(p0 is replaced with a numerical value of interest)

A few things to remember…
* Don’t base your null hypothesis on what you see in the data. You must always think about the situation you are investigating and make your null hypothesis describe the “nothing interesting” or “nothing has changed” scenario. No peeking at the data!
* Don’t base your alternative hypothesis on what you see in the data either. You must again think about the situation you are investigating and decide on your alternative based on what results would be of interest to you, not what you might see in the data.
* Don’t make your null hypothesis what you want to show to be true. Remember, the null is the status quo, the “nothing is strange here” position. You wonder whether the data cast doubt on that. You can reject the null hypothesis, but you can never “accept” or “prove” the null.
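The calculation steps of a significance test for a proportion (test statistic, P-value, comparison with α) can be sketched as follows. The function name one_proportion_z_test, its alternative argument, and the data are assumptions made for this illustration, not part of the notes. Stating the hypotheses and answering in context still have to be done by hand.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="greater"):
    """Return (z, p_value) for H0: p = p0 against the chosen alternative."""
    # Success/Failure is checked with p0, the value assumed under H0.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("expected successes and failures must both be at least 10")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: p > p0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: p < p0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":  # Ha: p != p0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hypothetical data: 58 successes in 100 trials, H0: p = 0.5 vs Ha: p > 0.5.
alpha = 0.05  # chosen before looking at the data
z, p_value = one_proportion_z_test(58, 100, p0=0.5, alternative="greater")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

Note that the alternative is passed in as an argument rather than inferred from the data, mirroring the rule that Ha is decided before looking at the results.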
Notes Chapter 21: More About Hypothesis Testing

We have talked some about α (alpha) levels. I like to think of an alpha level as a “line in the sand”. It identifies for us, up front, how extreme the sample statistic must be in order to be considered “significant”. The most common alpha levels are .10, .05, and .01. We choose an alpha based on the consequences of an incorrect conclusion. Those incorrect conclusions are…

Type I and Type II Errors

“My decision” versus “the truth”:
Fail to reject H0 when H0 is true: Confidence Level (1 - α). This is a good decision; it is the probability of stating no difference when there is none.
Reject H0 when H0 is true: Type I Error (α). The probability of stating there is a difference when there actually isn’t one.
Fail to reject H0 when H0 is false: Type II Error (β). The probability of stating there is no difference when there actually is one.
Reject H0 when H0 is false: Power (1 - β). This is also a good decision; it is the probability of stating there is a difference when there actually is one.

The power of a test is defined as the probability of correctly rejecting a false null hypothesis. The distance between the null hypothesis value p0 and the true p is called the effect size.

Ideally we would like to reduce the probability of making Type I and Type II errors while at the same time having a powerful test. Unfortunately it’s not that simple. As we alter one, we often have an effect on the other. Here are some things you should know about Type I errors, Type II errors, and power…
* Increasing the sample size (which decreases the variability) will increase the power (1 - β).
* Increasing the effect size will increase the power (1 - β).
* Increasing alpha (α) will increase the power (1 - β).
* Anything that increases the power (1 - β) will automatically decrease the probability of a Type II error (β).

It is like a balancing act between all of these!! There are no guarantees of a correct decision.

On the AP Test you do not have to calculate power. You must understand power conceptually and understand how changing other values affects power.
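Although calculating power is not required, a small calculation can make the relationships above concrete. The sketch below approximates the power of a one-sided test of H0: p = p0 versus Ha: p > p0 using the normal model. The function name power_one_sided and all of the numbers are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided(p0, p_true, n, alpha=0.05):
    """Approximate power of H0: p = p0 vs Ha: p > p0 when the truth is p_true."""
    std_normal = NormalDist()
    # Smallest sample proportion that leads to rejecting H0 at level alpha.
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Chance that p_hat lands beyond the cutoff when the true proportion is p_true.
    sd_true = sqrt(p_true * (1 - p_true) / n)
    return 1 - std_normal.cdf((cutoff - p_true) / sd_true)


base = power_one_sided(p0=0.5, p_true=0.6, n=100, alpha=0.05)
print(f"baseline power:      {base:.3f}")
print(f"larger sample:       {power_one_sided(0.5, 0.6, 400, 0.05):.3f}")
print(f"larger effect size:  {power_one_sided(0.5, 0.7, 100, 0.05):.3f}")
print(f"larger alpha:        {power_one_sided(0.5, 0.6, 100, 0.10):.3f}")
```

Each of the three changes raises the power above the baseline, and β = 1 - power falls by the same amount. When p_true equals p0 the function returns α, since rejecting a true null hypothesis is a Type I error.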
Chapter 19 Examples:

The Princeton Metro Times reported that 48% of a random sample of 369 students at the College of New Jersey indicated that they were “binge drinkers”. Binge drinking was defined as consuming 5-6 drinks in 1 sitting for men and 4-5 drinks in 1 sitting for women. Construct and interpret a 90% confidence interval for the proportion of students at the College of New Jersey who are binge drinkers.

Suppose a new treatment for a certain disease is given to a random sample of 200 patients with the disease. The treatment was successful for 166 of the patients.
A) Construct and interpret a 99% confidence interval for the proportion of patients with this disease who were successfully treated.
B) In the context of this situation, explain what it means to be 99% confident in any interval.
C) If the traditional treatment for this disease has a success rate of about 70%, does this interval give evidence that the new treatment is better? Explain.

An automobile manufacturer would like to know what proportion of its customers are not satisfied with the service provided by their local dealer. The customer relations department will survey a random sample of customers and compute a 95% confidence interval for the proportion who are not satisfied. From past studies they believe that this proportion will be about 0.2. Find the sample size needed if the margin of error of this confidence interval is to be about 0.03.

Chapter 20 Examples:

Shaquille O’Neal of the Los Angeles Lakers, the NBA’s most valuable player for the 2000 season, showed a significant weakness in free throw shooting, shooting only 53.3% from the free throw line. During the off season after 2000, Shaq worked with assistant coach Tex Winter on his free throw technique. During the first two games of the next season, Shaq made 26 out of 39 free throws. Do these results provide evidence that Shaq has improved his free throw shooting?
The manufacturer of a particular brand of microwave popcorn claims that only 2% of its kernels of corn fail to pop. A competitor, believing that the actual percentage is larger, tests 2000 kernels and finds that 44 failed to pop. Do these results provide sufficient evidence to support the competitor’s belief?

About 10% of the adult population is left-handed. Suppose that a researcher speculates that artists are more likely to be left-handed than are other people in the general population. The researcher surveys 150 artists and finds that 18 of them are left-handed. Is this sufficient evidence to support the researcher’s claim?

Chapter 21 Examples:

Medical researchers now believe there may be a link between baldness and heart attacks in men.
A) State the null and alternative hypotheses for a study used to investigate whether or not there is such a relationship.
B) In the context of this situation, what would a Type I error be and what would be a consequence of that decision?
C) In the context of this situation, what would a Type II error be and what would be a consequence of that decision?

The marketing department for a computer company must determine the selling price for a new model of personal computer. In order to make a reasonable profit, the company would like the computer to sell for $3200. If more than 30% of the potential customers would be willing to pay this price, the company will adopt it. A survey of potential customers is to be carried out; it will include a question asking the maximum amount that the respondent would be willing to pay for a computer with the features of the new model. Let p denote the proportion of all potential customers who would be willing to pay $3200 or more. Then the hypotheses to be tested are H0: p = 0.3 versus Ha: p > 0.3. In the context of this example, describe Type I and Type II errors. Discuss the possible consequences of each type of error.
Occasionally, warning flares of the type contained in most automobile emergency kits fail to ignite. A consumer advocacy group is to investigate a claim against a manufacturer of flares brought by a person who claims that the proportion of defectives is much higher than the value of 0.1 claimed by the manufacturer. A large number of flares will be tested and the results used to decide between H0: p = 0.1 and Ha: p > 0.1, where p represents the true proportion of defectives for flares made by this manufacturer. If H0 is rejected, charges of false advertising will be filed against the manufacturer.
A) Explain why the alternative hypothesis was chosen to be Ha: p > 0.1.
B) In this context, describe Type I and Type II errors and discuss the consequences of each.