Chapter 9 Inferring Population Means
Copyright © 2013 Pearson Education, Inc. All rights reserved.

Central Limit Theorem, Mean Values
- App1: Theory
- App2: Confidence Intervals
- App3: Hypothesis Testing (One-Population, Two-Population Unmatched, Two-Population Matched)

9.1 Sample Means of Random Samples

Central Limit Theorem, Mean Values: Theory

Inferring a Population's Mean Value from a Sample's Mean Value
In this chapter we apply the Central Limit Theorem to mean values in the same way we did for proportions (a.k.a. percentages). Remember that proportions are used for categorical data and means are used for quantitative data.
Example: "Do you smoke?" vs. "How many cigarettes do you smoke?"
The three applications covered for mean values are:
- Finding the mean value of a group from within a population, given the population's mean.
- Confidence intervals for mean values.
- Hypothesis testing for means, for both one and two populations.

Notation: Statistics, Parameters, Means and Proportions
- Use the mean and standard deviation if the survey question has a numerical variable.
- Use a proportion if the survey question is Yes/No.
- The confidence interval and hypothesis test always refer to the population, not the sample.

Sample Proportions and the Central Limit Theorem
Recall that the proportions (%) from all samples of the same size from a population form an approximately Normal distribution. Likewise, the mean values from all samples of the same size form an approximately Normal distribution, even if the population is skewed, provided the sample size is large enough!
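The claim that sample means look Normal even when the population is skewed can be checked with a small simulation. The sketch below is only an illustration: the population is a hypothetical exponential distribution (mean 1, strongly right-skewed), and the helper names `skewness` and `simulate_sample_means` are made up for this example. It compares the skewness of individual values with the skewness of the means of many samples of size 30.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score: near 0 for a symmetric distribution, positive if right-skewed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def simulate_sample_means(n, repetitions, rng):
    """Draw `repetitions` random samples of size n and return the list of their means."""
    # Hypothetical population: exponential with mean 1 (an assumption for illustration).
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(repetitions)]

rng = random.Random(2013)  # fixed seed so the run is repeatable
population_draws = [rng.expovariate(1.0) for _ in range(20000)]
sample_means = simulate_sample_means(n=30, repetitions=2000, rng=rng)

print("skewness of individual values:", round(skewness(population_draws), 2))
print("skewness of sample means     :", round(skewness(sample_means), 2))
print("mean of the sample means     :", round(statistics.fmean(sample_means), 3))
```

The individual values should show a large positive skewness, while the sample means should be much closer to symmetric and centered near the population mean of 1.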
[Figure: Population -> Individual Samples -> Histogram of all Samples' Means -> Normal Distribution Model used]

Example: Cherry Blossom Ten Mile Run
The race times follow a Normal distribution model with a mean of 97 minutes and a standard deviation of 17 minutes: N(97, 17). The population parameters are, in symbols,
µ = 97 minutes
σ = 17 minutes
Now let's randomly sample 30 runners and calculate the average of their finishing times. Let's repeat this many times and build a histogram of the mean values from each sample.

Example: Cherry Blossom Ten Mile Run
[Figure: the population of all runners' times]
[Figure: dot plot of the means of around 100 samples (remember n = 30)]
- The center matches the population mean
- Symmetric
- Unimodal
- (Ideally would be thinner)

Simulating Many Sample Means
As the sample size increases:
- Accuracy does not change: the sample means stay centered on the population mean.
- Precision improves: sampling variability diminishes.

CLThm: Standard Error for Means
The standard deviation of the sampling distribution gets smaller with larger sample size. This is true for any population distribution.
The standard error is the standard deviation of the sampling distribution. For the sample mean it is:
\( SE = \sigma / \sqrt{n} \)
To complete the picture:
- µ represents the mean of the population
- σ represents the standard deviation of the population
- n represents the sample size

EXAMPLE 1: iTunes Library Statistics
A student's iTunes library of mp3s has a very large number of songs. The mean length of the songs is 243 seconds, and the standard deviation is 93 seconds. The distribution of song lengths is right-skewed. Using his mp3 player, this student will create a playlist that consists of 25 randomly selected songs.
Q1: Is the mean value of 243 seconds an example of a parameter or a statistic? Explain.
A: The mean of 243 seconds is an example of a parameter, because it is the mean of the population that consists of all of the songs in the student's library.

EXAMPLE 1: iTunes Library Statistics (cont.)
Q2: What should the student expect the average song length to be for his playlist?
A: The sample mean length can vary, but it is typically close to the population mean: 243 seconds.
Q3: What is the standard error for the mean song length of all samples of 25 randomly selected songs?
A: The standard error is: \( SE = \sigma / \sqrt{n} = 93 / \sqrt{25} = 18.6 \) seconds.

Example: The mean cost per item at a grocery store is $2.75 and the standard deviation is $1.26. A shopper randomly puts 36 items in her cart.
- Is 2.75 a parameter or a statistic? Parameter.
- Predict the average cost per item in the shopper's cart. $2.75.
- Find the standard error for carts with 36 items. \( SE = 1.26 / \sqrt{36} = 0.21 \), that is, 21 cents.

Comparing Standard Errors
The mean income for residents of the city is $47,000 and the standard deviation is $12,000. Find the standard error for the following sample sizes:
- n = 1 → $12,000
- n = 4 → $6,000
- n = 16 → $3,000
- n = 100 → $1,200

9.2 The Central Limit Theorem for Sample Means

Application 1: Grouped Data, Means
The Central Limit Theorem for Sample Means
The Central Limit Theorem (CLT) assures us that no matter what the shape of the population distribution, if a sample is selected such that the following conditions are met, then the distribution of sample means follows an approximately Normal distribution.
- Random Sample and Independence. Each observation is collected randomly from the population, and observations are independent of each other.
- Normal. Either the population distribution is Normal or the sample size is large.
- Big Population. If the sample is collected without replacement, then the population must be at least 10 times larger than the sample size.
(There is no "10 successes / 10 failures" condition for means. It would not make sense, since no proportions are involved here.)

What is a Large Enough Sample Size?
- If the population distribution is not too far from Normal, then the sample size can be small.
- For most population distributions, n = 25 or higher gives sufficient accuracy.
- If the population distribution is far from Normal, a larger sample size is needed.

Example: Visualizing the Central Limit Theorem
[Figure: distribution of annual tuitions and fees at all two-year colleges in the US (2008–2009 academic year); skewed right]
[Figure: histogram of the means of 30 randomly selected samples from the population. Each sample is of size 30. Already looking unimodal, but not yet symmetric.]

Visualizing the Central Limit Theorem
[Figure: distribution of the means of 200 randomly selected samples from the population. Each sample is of size 30.]
Now increase the sample size.
[Figure: distribution of the means of 200 randomly selected samples from the population. Each sample is of size 60.]
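The effect of going from samples of size 30 to samples of size 60 can also be seen numerically. The sketch below does not use the tuition data (which is not reproduced in these slides); it assumes a hypothetical right-skewed population, an exponential distribution with standard deviation 1, and the function name `sd_of_sample_means` is illustrative. It compares the simulated spread of 2000 sample means with the standard error formula σ/√n.

```python
import math
import random
import statistics

SIGMA = 1.0  # SD of the hypothetical exponential population (rate 1), an assumption

def sd_of_sample_means(n, repetitions, rng):
    """Simulate `repetitions` samples of size n and return the SD of their means."""
    means = [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
             for _ in range(repetitions)]
    return statistics.stdev(means)

rng = random.Random(60)  # fixed seed so the run is repeatable
results = {}
for n in (30, 60):
    observed = sd_of_sample_means(n, repetitions=2000, rng=rng)
    predicted = SIGMA / math.sqrt(n)  # standard error formula
    results[n] = (observed, predicted)
    print(f"n = {n}: simulated SD of sample means = {observed:.3f}, "
          f"sigma/sqrt(n) = {predicted:.3f}")
```

Doubling the sample size should shrink the spread of the sample means by a factor of about √2, not 2, which is why precision improves slowly as n grows.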
Application 1: Grouped Data, Means
Flagship Example

EXAMPLE 2: Pulse Rates Are Not Normal
A large study in the US finds the mean resting pulse rate of adult women is 74 beats per minute, with a standard deviation of 13 bpm. The distribution is skewed right.
QUESTION: If we take a random sample of 36 women from this population, what is the probability that the average pulse rate of this group will be below 71 bpm?
(a) Standard Error: \( SE = \sigma / \sqrt{n} = 13 / \sqrt{36} = 2.167 \)
(b) z-Score: \( z = (71 - 74) / 2.167 = -1.38 \)
(c) Area to the Left: P( z < -1.38 ) = 0.0838
The probability that a randomly selected group of 36 women has a mean pulse rate less than 71 bpm is about 8.4%.

EXAMPLE 2: Pulse Rates - with Software
A large study in the US finds the mean resting pulse rate of adult women is 74 beats per minute, with a standard deviation of 13 bpm. The distribution is skewed right.
Q: If 38 women are selected, find the probability that the mean of the group is less than 72.

Individual vs. Grouped
EXAMPLE: The EPA finds that consumer car mileage follows a Normal distribution with a mean of 24 mpg and an SD of 6 mpg.

An Individual from a Normally Distributed Population
Q1: What is the probability a random car will get < 20 mpg?
A: Find the area to the left of 20 mpg:
- z-score for 20 mpg: z = ( 20 - 24 ) / 6 = -0.67
- Z-Table: => 0.2514 or 25.1%

A Group from a Normally Distributed Population
Q2: What is the probability that a randomly selected group of 8 cars has an average < 20 mpg?
- SE = 6 / sqrt(8) = 2.121
- z = ( 20 - 24 ) / 2.121 = -1.89
- Z-Table: => 0.0294 or 2.9%

Contrasting Individual to Grouped ND
If you do not have a Normally distributed population, you cannot find probabilities from the z-table for an individual. But you can find probabilities from the z-table for groups from within the population, so long as the conditions are met, namely n > 25 or so.
If you do have a Normally distributed population, you can find probabilities from the z-table for an individual. And you can find probabilities from the z-table for groups from within that population without any sample-size condition, even for small group sizes like n = 8.

Reviewing the different types of distributions
- The population distribution is the distribution of all individuals that exist.
- The distribution within a sample is the distribution of the individuals that were surveyed. Its mean, standard deviation, and shape are likely to be close to those of the population distribution.
- The sampling distribution is the distribution of all possible sample means of sample size n. Its mean will be the same as the population mean, but its shape will be approximately Normal and its standard deviation will be smaller.

Reviewing the different types of distributions
[Figure: Population Distribution; An Individual Sample's Distribution; Distribution of a bunch of Samples' Means; Distribution Model for Samples' Means]

9.3 Answering Questions about the Mean of a Population: Confidence Intervals and Hypothesis Tests

Application 2: Confidence Intervals
One-Population, Means

Gosset's "Student's t" Distribution
In the early 1900s William S. Gosset, an employee of the Guinness Brewery in Dublin, Ireland, worked long and hard to find the sampling model for means.
The sampling model that Gosset found is known as Student's t-model.
The Student's t-models form a whole family of related distributions that depend on a parameter known as degrees of freedom, a value related to sample size.
We often denote degrees of freedom as df, and a particular instance of the model as t_df.

Introduction to the t-Distribution
The t-statistic will be: \( t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} \)
Compare this to the z-statistic: \( z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)
If σ is unknown, we cannot find the z-score.
The difference is that the SE is an estimate: it uses the sample standard deviation s in place of the unknown σ.

Introduction to the t-Distribution
We will need to replace the z-table we use to look up our areas. The new table (distribution) is called the t-distribution.
This is not just a single table like the N(0,1) was. It is a family of z-like tables, one for each value of the degrees of freedom (df), which depends on the sample size. In other words, each df has its own z-like table.
Because it is unwieldy to show a few dozen tables in the text, we will use software or only the critical values from these tables.

Introduction to the t-Distribution
The t-distribution is broader and shorter than the Normal distribution, which makes extreme values less rare.

Introduction to the t-Distribution
As the sample size increases, the t-distribution narrows and its shape approaches the Normal distribution, so for large n we could use the Normal distribution.
[Figure] For n > 10 the two are almost indistinguishable.

Introduction to the t-Distribution
[Figure] For n > 40 the two are indistinguishable.

Application 2: Confidence Intervals
A confidence interval is a useful answer to the following questions: "What's the typical value for a variable in this large group of objects or people? How far away from the truth might this estimate of the typical value be?"
You should provide a confidence interval whenever you are estimating the value of a population parameter on the basis of a random sample from that population.

Confidence Level
The confidence level is a measure of how well the method used to produce the confidence interval performs. It gives the percentage of confidence intervals, over many repeated samples, that contain the population mean.
As with proportions, the confidence level is associated with a critical value, in this case a t*-value.

Confidence Levels
[Figure] Here are 100 samples' confidence intervals. The 5 red ones miss the true population value.
It appears that a 95% confidence level was used.

Confidence Intervals for Means: The Steps
1. Conditions: Independence? Randomization Condition? 10% Condition? Nearly Normal Condition?
2. Do the Math
   - Calculate the standard error
   - Pick the critical value (the multiplier), called t*
   - Form the margin of error: ME = t* x SE
   - Make the interval: (mean - ME, mean + ME)
3. Interpret the results

Assumptions and Conditions
- Independence Assumption: The data values should be independent.
- Randomization Condition: The data arise from a random sample or suitably randomized experiment. Randomly sampled data (particularly from an SRS) are ideal.
- 10% Condition: When a sample is drawn without replacement, the sample should be no more than 10% of the population.
- Nearly Normal Condition: If the sample appears skewed, seek n > 25.

Finding the Multiplier t*
The degrees of freedom represent the number of 'directions' in which a sample of size n can 'vary'. For a single population mean, df = n - 1.
Once the df is calculated, we can select one member of the family of t-distribution tables. Next we use the chosen confidence level. The left and right t-values that capture a "confidence level's worth of area" in the middle of the distribution define the critical value t*.

Recall Critical Values for z-scores
Many z-tables call out critical z* scores and the area they contain between their + and - values. We do likewise for each of the df-families of the t-distribution and collect the results in a single table of critical values.

Finding t-Values By Hand
The Student's t-model has a different table for each value of degrees of freedom. Because of this, statistics books usually have one table of only the t-model's critical values for selected confidence levels, instead of page after page of all t-values.
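Software can stand in for the printed table of critical values. The sketch below finds t* using only Python's standard library: it integrates the Student's t density numerically (Simpson's rule) and then bisects until the central area equals the confidence level. The function names are illustrative, and this is not necessarily how Statdisk or any other package computes the value.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def central_area(t, df, steps=2000):
    """Area under the t density between -t and +t, by Simpson's rule (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # integral over [0, t], doubled by symmetry

def t_star(confidence, df):
    """Critical value t*: the central area between -t* and +t* equals `confidence`."""
    lo, hi = 0.0, 200.0
    for _ in range(60):  # bisection
        mid = (lo + hi) / 2
        if central_area(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# A 90% interval from a sample of n = 15 uses df = 14.
print(round(t_star(0.90, 14), 3))
# A 90% interval from a sample of n = 35 uses df = 34.
print(round(t_star(0.90, 34), 3))
```

The results should agree with the corresponding rows of a printed t-table to three decimal places.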
EXAMPLE 7: Finding the Multiplier t*
Suppose we collect a sample of 15 iPads and wish to calculate a 90% confidence interval for the mean battery life.
Q: Find the critical value, t*, for a 90% C.I. when n = 15.
SOLUTION: df = n - 1, so here df = 15 - 1 = 14. A CI is a two-tail problem, and the complement of 90% is 10%: row 14, col 0.10 => t* = 1.761

Application 2: Confidence Intervals
One-Population, Means
Flagship Example

EXAMPLE 8: College Tuition Costs
A random sample of 35 U.S. junior colleges had a mean tuition of $2380 and a standard deviation of $1160. Find a 90% confidence interval for the mean tuition of all U.S. junior colleges based on this sample.
(i) Conditions:
- Independence: Assumed
- Random: Stated
- Big Pop: There are more than 350 junior colleges in the U.S.
- Nearly Normal: Not mentioned, but not needed since the sample is large
(ii) Do the Math:
- t*: df = 35 - 1 = 34, 100% - 90% = 10% => t*-Table = 1.691
- SE: \( SE = s / \sqrt{n} = 1160 / \sqrt{35} = 196.0758 \)
- Interval: 2380 ± (1.691 * 196.0758) = 2380 ± 331.56
(iii) Interpret: A 90% CI for the mean tuition of all U.S. junior colleges is ($2048, $2712).

EXAMPLE 8: College Tuition Costs
Statdisk: Analysis -> Confidence Interval -> Mean - One Sample

Another CI Example
45 randomly selected college students worked on homework for an average of 9 hours per week. Their standard deviation was 2 hours. Find a 90% confidence interval for the population mean.
- d.f. = 45 - 1 = 44 → t* = 1.68
- Standard Error: \( SE = 2 / \sqrt{45} \approx 0.30 \)
- Interval:
  Lower Bound: 9 - 1.68 x 0.30 ≈ 8.5
  Upper Bound: 9 + 1.68 x 0.30 ≈ 9.5

Example (cont.) - CI Interpretations
Interpretation of Confidence Interval: We are 90% confident that the population mean number of hours worked on homework for all college students is between 8.5 and 9.5 hours.
Interpretation of Confidence Level: If many groups of 45 randomly selected students were surveyed, about 90% of the resulting confidence intervals would contain the actual population mean number of hours worked on homework, and about 10% would not.

Sample Size
To find the sample size needed for a particular confidence level with a particular margin of error (ME), solve this equation for n:
\( ME = t^{*}_{n-1} \times \dfrac{s}{\sqrt{n}} \)
The problem with using the equation above is that we don't know most of the values. There are two ways to overcome this:
- We can use s from a small pilot study.
- We can use z* in place of the necessary t value.

Application 3: Hypothesis Tests, One-Population

Part II: Hypothesis Test for a Population Mean
The same four steps apply for a hypothesis test for a population mean:
1. Hypothesize. State your hypotheses about the population parameter.
2. Prepare. Get ready to test: choose and state a significance level; choose a test statistic appropriate for the hypotheses; check conditions and state any assumptions that must be made.
3. Compute to compare. Compute the observed value of the test statistic in order to compare the null hypothesis value to our observed value. Find the p-value to measure your level of surprise.
4. Interpret.
Do you reject or fail to reject the null hypothesis? What does this mean?

Test Statistic for One-Population Means
Compare the observed value of the sample mean, x-bar, to the value claimed by the null hypothesis, µ0:
\( t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}} \)
This test, called the one-sample t-test, is very similar in structure to the test for one proportion.

Conditions
"Anyone can make a decision, but only a statistician can measure the probability that the decision is right or wrong."
We need to know the sampling distribution of our test statistic. The test statistic follows the t-distribution under these conditions:
- Condition 1: Random Sample and Independence. The data must be a random sample from a population, and observations must be independent of one another.
- Condition 2: Normal. The population distribution must be Normal or the sample size must be large. For most situations, 25 is large enough.

Application 3: Hypothesis Test, One-Population
Flagship Example

Example: Word Length
Textbook authors must be careful that the reading level of their book is appropriate for the target audience. Some methods of assessing reading level require estimating the average word length. We have randomly chosen 20 words from a randomly selected page in Stats: Modeling the World and counted the number of letters in each word:
5, 5, 2, 11, 1, 5, 3, 8, 5, 4, 7, 2, 9, 4, 8, 10, 4, 5, 6, 6
Suppose that our editor was hoping that the book would have a mean word length of 6.5 letters. Does this sample indicate that the authors failed to meet this goal? Test an appropriate hypothesis and state your conclusion.

Example: Word Length (cont.)
Step 1: Hypothesis
Let µ represent the population mean of word lengths.
Then
H0: µ = 6.5 (the claim)
Ha: µ ≠ 6.5
We will thus use a two-tail t-test.
The null hypothesis says that the mean is on target for the intended audience at 6.5 letters per word. The alternative is that the mean is off target: the mean word length is either too low or too high.

Example: Word Length (cont.)
Step 2: Conditions
We will test using a 5% significance level.
- Random Sample? Stated.
- Independence? Assumed.
- 10% Condition? The textbook has more than 200 words.
- Normal? A histogram of the observed word lengths looks roughly unimodal and symmetric, so the population of all word lengths may be approximately Normal.
It is appropriate to use a one-sample t-test.

Example: Word Length (cont.)
Step 3: Do The Math
Enter the word lengths into a calculator: x-bar = 5.50, s = 2.685, which together give:
\( t = \dfrac{5.50 - 6.5}{2.685 / \sqrt{20}} = -1.67 \)
t*-Table: Referring to the two-tail header (since we have a two-tail alternative hypothesis), we see on row 19 that 1.67 falls between combined tail areas of 0.10 and 0.20 (technology yields 0.11). This means that our p-value, between 0.10 and 0.20, is larger than the significance level of 0.05, so we fail to reject the null.

Example: Word Length (cont.)
Step 4: Interpret
Statistician's statement: The p-value is higher than 0.05, and thus we fail to reject the null hypothesis!
LA Times: So we conclude that this sample does not provide evidence that the average word length differs from the goal of 6.5 letters.

Example: Word Length (cont.)
Statdisk: Analysis -> Hypothesis Testing -> Mean - One Sample

EXAMPLE 9: Dieting
Is the Weight Watchers diet effective? Researchers examined 40 subjects who were randomly assigned to this diet and recorded the change in weight after 12 months. Only 26 of the 40 subjects stayed with the diet for that long, so we have data on only these 26 people. Test the hypothesis that people on the Weight Watchers diet tend to lose weight. (A negative weight change means the person lost weight.)
Data: After a year, the average change in weight of the 26 people who stayed on the diet was -4.6 kilograms (about 10 pounds), with a standard deviation of 5.4 kg.

EXAMPLE 9: Dieting (cont.)
Step 1: Hypothesis
Let µ represent the mean weight change of the population.
H0: µ = 0
Ha: µ < 0 => Left-Tail t-test
The null hypothesis says that the mean is 0, because the neutral position here is that no change occurs, on average. This is the same as saying that the diet is ineffective. The alternative is that the mean change is negative: people lost weight!

EXAMPLE 9: Dieting (cont.)
Step 2: Conditions
We will test using a 5% significance level.
- Condition 1: Random Sample and Independence? The subjects in this study were not selected randomly from the population of all dieters. Independence is assumed.
- Condition 2: Normal? The distribution of the sample does not look Normal, so we suspect the population distribution is not Normal. But because the sample size is larger than 25, this condition is satisfied.

EXAMPLE 9: Dieting (cont.)
Step 3: Do The Math
\( SE = 5.4 / \sqrt{26} = 1.0590 \)
t = ( -4.6 - 0 ) / 1.0590 = -4.34
We use a t-distribution with n - 1 = 25 degrees of freedom. On row 25 we see that 4.34 lies beyond the 0.005 column, that is, our p-value is less than 0.005 (technology yields 0.0001).

EXAMPLE 9: Dieting (cont.)
Step 4: Interpret
The p-value is much smaller than 0.05, and thus we reject the null hypothesis! So we conclude that the mean weight change is in fact negative, meaning that people do tend to lose weight after one year on this diet.

EXAMPLE 9: Dieting (cont.)
Statdisk: Analysis -> Hypothesis Testing -> Mean - One Sample

Hypothesis Test Example (by Formula)
Ford claims that its 2012 Focus gets 40 mpg on the highway. Does your Focus' mpg differ from 40 mpg? You chart your Focus over 35 randomly selected highway trips and find it got 39.5 mpg with a standard deviation of 1.4 mpg.
1. Hypothesize
   H0: µ = 40, Ha: µ ≠ 40
2. Prepare
   Choose α = 0.05. Use the t-statistic: random and large sample.
3. Compute to Compare
   \( t = \dfrac{39.5 - 40}{1.4 / \sqrt{35}} = -2.11 \), with df = 34
4. Interpret
   p-value = 0.04 < α = 0.05
   Reject H0. Accept Ha. There is statistically significant evidence to conclude that your Focus does not get 40 mpg on average.

9.4 Comparing Two Population Means

Two-Population Means

Comparing Two Populations, Means
We finish this chapter by comparing two populations to each other.
- We first create a confidence interval for the difference between two groups.
- We next test samples to challenge a claim about two groups (a hypothesis test).
We will do this for two types of data sets: Independent and Dependent (a.k.a. unmatched and matched, or unpaired and paired).

Independent vs. Dependent (Paired)
Two samples are dependent or paired if each observation from one group is coupled with a particular observation from the other group.
- Before and After
- Identical Twins
- Husband and Wife
- Older Sibling and Younger Sibling
If there is no pairing then the samples are independent.

Quiz: Independent or Dependent?
- Do women perform better on average than men on their statistics final? 60 women and 40 men were surveyed.
- 40 people's blood pressure was measured before and after giving a public speech. Does blood pressure change on average?
- Is the average tip percent greater for dinner than lunch?
  35 wait staff who worked both lunch and dinner looked at their receipts.
- Are Americans more stressed out on average compared to the French? 50 from each country were given a stress test.

Test: Independent or Dependent? (Answers)
- Do women perform better on average than men on their statistics final? 60 women and 40 men were surveyed. → Independent
- 40 people's blood pressure was measured before and after giving a public speech. Does blood pressure change on average? → Dependent
- Is the average tip percent greater for dinner than lunch? 35 wait staff who worked both lunch and dinner looked at their receipts. → Dependent
- Are Americans more stressed out on average compared to the French? 50 from each country were given a stress test. → Independent

Comparing Two Means
First, when comparing populations, examine the side-by-side boxplots.
Our parameter of interest: µ1 - µ2.

Comparing Two Means (cont.)
Recall that Variance = SD^2.
For independent random quantities, variances add:
\( SD^2_{total} = SD_1^2 + SD_2^2 \)
So, the standard deviation of the sum or difference of two independent quantities is the square root of the sum of their variances. For the difference of two sample means:
\( SD(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} \)
We still don't know the true standard deviations of the two groups, so we need to estimate them and use the standard error:
\( SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} \)

Comparing Two Means (cont.)
Because we are working with means and estimating the standard error of their difference using data, we use the t-distribution model.
- The confidence interval we build is called a two-sample t-interval.
- The hypothesis test is called a two-sample t-test.

Sampling Distribution for the Difference Between Two Means
When the conditions are met, the standardized sample difference between the means of two independent groups is
\( t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{SE(\bar{x}_1 - \bar{x}_2)} \)
The standard error is estimated by
\( SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} \)

Assumptions and Conditions
Each condition needs to be checked for both groups.
- Independence Assumption: Is each member of a sample independent of the others?
- Randomization Condition: Were the data collected with suitable randomization (representative random samples or a randomized experiment)?
- 10% Condition: We don't usually check this condition for differences of means. We will check it for means only if we have a very small population or an extremely large sample.

Assumptions and Conditions (cont.)
- Normal Population Assumption (Nearly Normal Condition): This must be checked for both groups. A violation by either one violates the condition.
- Independent Groups Assumption: The two groups we are comparing must be independent of each other. See the next part (matched/paired) if the groups are not independent of one another.

App2: Two-Population Confidence Interval
When the conditions are met, we are ready to find the confidence interval for the difference between the means of two independent groups. The confidence interval is
\( (\bar{x}_1 - \bar{x}_2) \pm t^{*}_{df} \times SE(\bar{x}_1 - \bar{x}_2) \)
where the standard error of the difference of the means is
\( SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} \)
The critical value depends on the particular confidence level, C, that you specify and on the number of degrees of freedom, which we get from the sample sizes and a special formula.

App2: Two-Population Confidence Interval
The special formula for the degrees of freedom (or row #) for our t*-value is:
\( df = \dfrac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{1}{n_1 - 1}\left( \dfrac{s_1^2}{n_1} \right)^2 + \dfrac{1}{n_2 - 1}\left( \dfrac{s_2^2}{n_2} \right)^2} \)
We will let technology calculate degrees of freedom for us!
(For exams, we will use the smaller of the two samples' degrees of freedom: n1 - 1 and n2 - 1.)

Application 2: Confidence Interval, Two-Populations
Flagship Example

EXAMPLE 11: Comparing Men's and Women's Senses of Smell
We compare men and women whose index of smell was measured while they were lying down. The summary statistics are:
Men: x-bar = 10.0694, s = 3.3583, n = 18
Women: x-bar = 11.1250, s = 2.7295, n = 18
[Figure: side-by-side box plots give a visual comparison]
Q: Find a 95% confidence interval for the mean difference in smelling ability between men and women.
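The two-sample interval and the degrees-of-freedom formula can be sketched in Python from the summary statistics given for the men and women. The function names `welch_df` and `two_sample_interval` are illustrative, and the critical value is passed in by hand: 2.110 is the standard t-table entry for df = 17 at 95% confidence, which corresponds to the conservative "smaller df" rule rather than the df that technology would use.

```python
import math

def welch_df(s1, n1, s2, n2):
    """Degrees of freedom from the 'special formula' for two independent samples."""
    a, b = s1 ** 2 / n1, s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

def two_sample_interval(mean1, s1, n1, mean2, s2, n2, t_star):
    """Return (lower, upper, SE) for the difference mean1 - mean2."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)  # standard error of the difference
    diff = mean1 - mean2
    margin = t_star * se                          # margin of error
    return diff - margin, diff + margin, se

# Summary statistics from the smell example (men first, then women).
men = (10.0694, 3.3583, 18)
women = (11.1250, 2.7295, 18)

lower, upper, se = two_sample_interval(*men, *women, t_star=2.110)
df_technology = welch_df(men[1], men[2], women[1], women[2])

print(f"SE = {se:.3f}")
print(f"95% interval for (men - women): ({lower:.2f}, {upper:.2f})")
print(f"df from the special formula: {df_technology:.1f}")
```

Using the smaller-df rule gives a slightly wider interval than technology would, because a smaller df means a larger t*.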
EXAMPLE 11: Comparing Men's and Women's Senses of Smell
(i) Conditions
- Independence Assumption: These data consist of two independent samples: 18 men and 18 women.
- Randomization Condition: Not mentioned.
- 10% Condition: Not needed if the population is large.
- Nearly Normal Condition: The box plots show Q2 and Q3 as being symmetric, but long tails exist. Proceed with caution, since both sample sizes (n = 18) are below 25.

EXAMPLE 11: Comparing Men's and Women's Senses of Smell
(ii) Do the Math
- df = min of (18 - 1 and 18 - 1) = 17 (technology yields df = 33)
- critical value: t-table row 17, col 0.05, two-tail => t* = 2.110
- ME = t* x SE = 2.110 x 1.020 = 2.1522
- Interval: -1.0556 +/- 2.1522, or about (-3.2, 1.1)
(iii) Conclusion
Because the interval contains zero, we cannot rule out the possibility that the mean difference in the population is 0. This suggests that men and women may not differ in their ability to smell.

EXAMPLE 11: Comparing Men's and Women's Senses of Smell
We compare men and women whose index of smell ...
Men: x-bar = 10.0694, s = 3.3583, n = 18
Women: x-bar = 11.1250, s = 2.7295, n = 18
Statdisk: Analysis -> Confidence Interval -> Mean 2-Ind Samples

Conf Int 2-Pop Indep: Engr & Psych
38 randomly selected engineering majors and 42 randomly selected psychology majors were observed to estimate the difference in how long it takes to graduate.
Use this data:

Q: Find a 95% confidence interval for the difference.

The two populations are independent since there is no pairing between each engineering major and each psychology major. The students were selected randomly and independently, and the sample sizes are both greater than 25.

Statdisk: Analysis -> Conf Intvl -> Mean 2-Ind Samples

We are 95% confident that the average time it takes to graduate is between 0.3 and 0.7 years longer for psychology majors than for engineering majors.

Application 3: Hypothesis Tests, Two-Populations

II. Hypothesis Test for the Difference Between Two Means

We test the hypothesis H0: µ1 - µ2 = Δ0, where the hypothesized difference, Δ0, is almost always 0, using the statistic

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{SE(\bar{x}_1 - \bar{x}_2)}$$

The standard error is

$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

When the conditions are met and the null hypothesis is true, this statistic can be closely modeled by a Student’s t-model with a number of degrees of freedom given by a special formula. We use that model to obtain a P-value.

Back Into the Pool

Remember that when we know a proportion, we know its standard deviation. Thus, when testing the null hypothesis that two proportions were equal, we could assume their variances were equal as well. This led us to pool our data for the hypothesis test.

Back Into the Pool (cont.)

For means, there is also a pooled t-test. Like the two-proportions z-test, this test assumes that the variances in the two groups are equal. But be careful: there is no link between a mean and its standard deviation…

We will avoid pooled situations in two-population hypothesis tests for means.

What Can Go Wrong?

Watch out for paired data.
The Independent Groups Assumption deserves special attention. If the samples are not independent, you can’t use two-sample methods.

Look at the boxplots. Check for outliers and non-normal distributions by making and examining boxplots.

HT 2-pop Paired: Chocolate & Memory

Does eating chocolate improve memory? 12 people were given a memory test before and after eating chocolate. The data for the number of words recalled out of 50 are shown below. Assume Normality.

Before: 16 33 9 42 38 27 30 41 24
After: 20 29 11 42 39 25 34 44 26

1. Hypothesize
H0: µdiff = 0, Ha: µdiff ≠ 0

2. Prepare
α = 0.05, T-Statistic, large sample

3. Compute to Compare
Stat → T Statistics → Paired

4. Interpret
P-value = 0.13 > 0.05 = α
Fail to reject H0.
Conclusion: There is insufficient evidence to conclude that the mean number of words recalled increases after eating chocolate.

HT 2-pop Indp Samples: Hot & Cold Batteries

Do batteries last longer in colder climates than in warmer ones? The table shows some randomly selected battery lives in months.

Florida: 19 22 25 21 18 19 27 25
Montreal: 37 49 22 26 47 41 38 37
28 15

1. Hypothesize
H0: µF = µM
Ha: µF < µM

2. Prepare
α = 0.05
Independent samples; assume Normal distributions.

3. Compute to Compare
Stat → T Statistics → Two sample → with data

4. Interpret
P-value = 0.0009 < 0.05 = α
Reject H0. Accept Ha.
Conclusion: There is statistically significant evidence to support the claim that, on average, batteries last longer in Montreal than in Florida.
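The paired test in the chocolate and memory example can be sketched in plain Python. This is only an illustration under stated assumptions: it uses the nine before/after pairs that can be read from the slide text (the slide mentions 12 people, so the table may be incomplete), it assumes the values pair up column by column, and `t_upper_tail` is a hypothetical helper that integrates the Student's t density numerically instead of calling statistical software.

```python
import math
from statistics import mean, stdev

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t with df degrees of freedom.

    Uses Simpson's rule on the t density from 0 to |t| (steps must be even).
    """
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3            # P(0 < T < |t|)
    return 0.5 - area if t >= 0 else 0.5 + area

# Assumed pairing: person k has before[k] and after[k]
before = [16, 33, 9, 42, 38, 27, 30, 41, 24]
after = [20, 29, 11, 42, 39, 25, 34, 44, 26]

diffs = [a - b for a, b in zip(after, before)]   # after minus before
n = len(diffs)
se = stdev(diffs) / math.sqrt(n)                 # standard error of the mean difference
t_stat = mean(diffs) / se
df = n - 1

p_one_sided = t_upper_tail(t_stat, df)           # Ha: mean difference > 0
p_two_sided = 2 * t_upper_tail(abs(t_stat), df)  # Ha: mean difference != 0

print("t =", round(t_stat, 3), "df =", df)
print("one-sided p =", round(p_one_sided, 3))
print("two-sided p =", round(p_two_sided, 3))
```

On these nine pairs the one-sided p-value, which corresponds to asking whether recall improves, comes out close to the 0.13 reported on the slide, while the two-sided value is twice as large. Either way the p-value exceeds α = 0.05, so the decision not to reject H0 is unchanged.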
Chapter 9 Guided Exercise 1

HT 1-pop: Is the Mean Body Temperature really 98.6?

A random sample of 10 independent healthy people showed body temperatures (in degrees Fahrenheit) as follows:
98.5, 98.2, 99.0, 96.3, 98.3, 98.7, 97.2, 99.1, 98.7, 97.2
Use α = 0.05.

1. Hypothesize
H0: µ = 98.6
Ha: µ ≠ 98.6

2. Prepare
Not far from normal. Sample collected randomly. Use the t-statistic.

3. Do the Math
t ≈ -1.65
p-value ≈ 0.13 > 0.05 = α

4. Interpret
We cannot reject 98.6 as the population mean body temperature from these data at the 0.05 level.

Chapter 9 Guided Exercise 2

HT 2-pop Indep: TV Watching and Wealth

We run a two-sample t-test on the number of televisions owned in the households of random samples of students at two different community colleges. Assume independence. One of the schools is in a wealthy community (MC), and the other (OC) is in a less wealthy community.

1. Hypothesize
Let µoc be the population mean number of televisions owned by families of students in the less wealthy community (OC), and let µmc be the population mean number of televisions owned by families of students in the wealthy community (MC).
H0: µoc = µmc
Ha: µoc ≠ µmc

2. Prepare
Choose an appropriate t-test. Because the sample sizes are 30, the Normality condition of the t-test is satisfied.
State the other conditions, indicate whether they hold, and state the significance level that will be used.
Use a t-test with two independent samples. The households were chosen randomly and independently. The population of all households of each type is more than 10 times the sample sizes.

3. Do the Math
t = 0.95
p-value = 0.345

4. Interpret
Since the p-value = 0.345 is greater than 0.05, we fail to reject H0. At the 5% significance level, we cannot reject the hypothesis that the mean number of televisions of all students in the wealthier community is the same as the mean number of televisions of all students in the less wealthy community.

Chapter 9 Guided Exercise 3

HT 2-pop Paired: Pulse Before and After Fright

Test the hypothesis that the mean of college women’s pulse rates is higher after a fright, using α = 0.05.

1. Hypothesize
H0: µbefore = µafter
Ha: µbefore < µafter

2. Prepare
Choose a test: Should it be a paired t-test or a two-sample t-test? Why? Assume that the sample was random and that the distribution of differences is sufficiently Normal. Mention the level of significance.
Paired t-test, since each woman is measured before and after.
Level of Significance: α = 0.05.

3. Compute to Compare
t ≈ 4.9 (computed from the after minus before differences)
p-value = 0.002
0.002 < 0.05

4. Interpret
Reject or do not reject H0. Then write a sentence that includes “significant” or “significantly” in it. Report the sample mean pulse rate before the scream and the sample mean pulse rate after the scream.
Reject H0.
There is statistically significant evidence to support the claim that the mean pulse rate is higher after a fright.
Sample means: x-bar before ≈ 74.8, x-bar after ≈ 83.7
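Guided Exercise 1 is the only guided exercise that lists its raw data, so its test statistic can be recomputed directly. The sketch below uses only the Python standard library and compares |t| with a critical value instead of computing a p-value; the value 2.262 is the standard t-table entry for 9 degrees of freedom at a two-tailed α of 0.05 and is entered by hand. Variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Body temperatures (degrees Fahrenheit) from Guided Exercise 1
temps = [98.5, 98.2, 99.0, 96.3, 98.3, 98.7, 97.2, 99.1, 98.7, 97.2]
mu0 = 98.6                       # hypothesized population mean (H0)

n = len(temps)
x_bar = mean(temps)              # sample mean
s = stdev(temps)                 # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)            # standard error of the sample mean
t_stat = (x_bar - mu0) / se      # one-sample t statistic
df = n - 1

t_crit = 2.262                   # t-table: df = 9, two-tailed alpha = 0.05
reject_h0 = abs(t_stat) > t_crit

print("x-bar =", round(x_bar, 2), "s =", round(s, 3))
print("t =", round(t_stat, 2), "df =", df)
print("reject H0:", reject_h0)
```

Comparing |t| with t* leads to the same decision as comparing the p-value with α: since |t| is below 2.262, H0 is not rejected.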