Econ 3790: Business and Economic Statistics
Instructor: Yogesh Uppal
Email: [email protected]

Chapter 11: Inferences About Population Variances
• Inference about a Population Variance
• Chi-Square Distribution
• Interval Estimation of $\sigma^2$
• Hypothesis Testing

Chi-Square Distribution
We will use the notation $\chi^2_{\alpha}$ to denote the value for the chi-square distribution that provides an area of $\alpha$ to the right of the stated value. For example, the chi-square value with 5 degrees of freedom (d.f.) at $\alpha = .05$ is $\chi^2_{.05} = 11.07$.

[Figure: chi-square distribution with 5 d.f.; 95% of the possible $\chi^2$ values lie between 0 and $\chi^2_{.05} = 11.07$, leaving an upper-tail area of .05.]

Interval Estimation of $\sigma^2$
Interval Estimate of a Population Variance

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{(1-\alpha/2)}}$$

where the $\chi^2$ values are based on a chi-square distribution with $n - 1$ degrees of freedom and $1 - \alpha$ is the confidence coefficient.

Interval Estimation of $\sigma$
Interval Estimate of a Population Standard Deviation

$$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{(1-\alpha/2)}}}$$

Taking the square root of the upper and lower limits of the variance interval provides the confidence interval for the population standard deviation.

Example: Buyer's Digest (A)
Buyer's Digest rates thermostats manufactured for home temperature control. In a recent test, 10 thermostats manufactured by ThermoRite were selected and placed in a test room that was maintained at a temperature of 68°F. We will use the 10 temperature readings below to develop a 95% confidence interval estimate of the population variance.

Thermostat      1     2     3     4     5     6     7     8     9     10
Temperature   67.4  67.8  68.2  69.3  69.5  67.0  68.1  68.6  67.9  67.2

For $n - 1 = 10 - 1 = 9$ d.f.
and $\alpha = .05$:

Selected Values from the Chi-Square Distribution Table

Degrees of                        Area in Upper Tail
Freedom      .99     .975    .95     .90     .10      .05      .025     .01
   5        0.554   0.831   1.145   1.610    9.236   11.070   12.832   15.086
   6        0.872   1.237   1.635   2.204   10.645   12.592   14.449   16.812
   7        1.239   1.690   2.167   2.833   12.017   14.067   16.013   18.475
   8        1.647   2.180   2.733   3.490   13.362   15.507   17.535   20.090
   9        2.088   2.700   3.325   4.168   14.684   16.919   19.023   21.666
  10        2.558   3.247   3.940   4.865   15.987   18.307   20.483   23.209

Our $\chi^2_{.975}$ value (9 d.f.) is 2.700, and our $\chi^2_{.025}$ value is 19.023.

Sample variance $s^2$ provides a point estimate of $\sigma^2$.

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{6.3}{9} = .70$$

A 95% confidence interval for the population variance is given by:

$$\frac{(10-1)(.70)}{19.02} \le \sigma^2 \le \frac{(10-1)(.70)}{2.70}$$

$$.33 \le \sigma^2 \le 2.33$$

Hypothesis Testing About a Population Variance
Left-Tailed Test
•Hypotheses
$$H_0: \sigma^2 \ge \sigma_0^2 \qquad H_a: \sigma^2 < \sigma_0^2$$
where $\sigma_0^2$ is the hypothesized value for the population variance
•Test Statistic
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$
•Rejection Rule
Critical value approach: Reject $H_0$ if $\chi^2 \le \chi^2_{(1-\alpha)}$
p-Value approach: Reject $H_0$ if p-value $< \alpha$
where $\chi^2_{(1-\alpha)}$ is based on a chi-square distribution with $n - 1$ d.f.

Hypothesis Testing About a Population Variance
Right-Tailed Test
•Hypotheses
$$H_0: \sigma^2 \le \sigma_0^2 \qquad H_a: \sigma^2 > \sigma_0^2$$
where $\sigma_0^2$ is the hypothesized value for the population variance
•Test Statistic
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$
•Rejection Rule
Critical value approach: Reject $H_0$ if $\chi^2 \ge \chi^2_{\alpha}$
p-Value approach: Reject $H_0$ if p-value $< \alpha$
where $\chi^2_{\alpha}$ is based on a chi-square distribution with $n - 1$ d.f.
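Returning to the Buyer's Digest (A) interval estimate, the arithmetic can be cross-checked with a short Python sketch. It assumes the two critical values (19.023 and 2.700 for 9 d.f.) are read from the chi-square table rather than computed, and the function name `variance_interval` is only illustrative; it is not part of the course material.

```python
from math import sqrt
from statistics import variance

# Ten thermostat readings from the Buyer's Digest (A) example.
readings = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]


def variance_interval(sample, chi2_upper, chi2_lower):
    """Interval estimate of the population variance.

    chi2_upper is the chi-square value with area alpha/2 in the upper tail,
    chi2_lower the value with area 1 - alpha/2 in the upper tail; both use
    n - 1 degrees of freedom and are assumed to come from a printed table.
    """
    n = len(sample)
    s2 = variance(sample)  # sample variance, divisor n - 1
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower


# Critical values for 9 d.f. taken from the table (assumed inputs).
low, high = variance_interval(readings, chi2_upper=19.023, chi2_lower=2.700)

print(f"s^2 = {variance(readings):.2f}")
print(f"95% interval for the variance: {low:.2f} to {high:.2f}")
print(f"95% interval for the std dev:  {sqrt(low):.2f} to {sqrt(high):.2f}")
```

With these inputs the variance interval should come out at roughly .33 to 2.33, in line with the hand calculation; taking square roots of the two limits gives the corresponding interval for the standard deviation.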
Hypothesis Testing About a Population Variance
Two-Tailed Test
•Hypotheses
$$H_0: \sigma^2 = \sigma_0^2 \qquad H_a: \sigma^2 \ne \sigma_0^2$$
where $\sigma_0^2$ is the hypothesized value for the population variance
•Test Statistic
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$
•Rejection Rule
Critical value approach: Reject $H_0$ if $\chi^2 \le \chi^2_{(1-\alpha/2)}$ or $\chi^2 \ge \chi^2_{\alpha/2}$
p-Value approach: Reject $H_0$ if p-value $< \alpha$
where $\chi^2_{(1-\alpha/2)}$ and $\chi^2_{\alpha/2}$ are based on a chi-square distribution with $n - 1$ d.f.

Hypothesis Testing About a Population Variance
Example: Buyer's Digest (B)
Recall that Buyer's Digest is rating ThermoRite thermostats. Buyer's Digest gives an "acceptable" rating to a thermostat with a temperature variance of 0.5 or less. Using the 10 readings, we will conduct a hypothesis test (with $\alpha = .10$) to determine whether the ThermoRite thermostat's temperature variance is "acceptable".

Thermostat      1     2     3     4     5     6     7     8     9     10
Temperature   67.4  67.8  68.2  69.3  69.5  67.0  68.1  68.6  67.9  67.2

•Hypotheses
$$H_0: \sigma^2 \le 0.5 \qquad H_a: \sigma^2 > 0.5$$
•Rejection Rule
Reject $H_0$ if $\chi^2 > 14.684$

For $n - 1 = 10 - 1 = 9$ d.f.
and $\alpha = .10$:

Selected Values from the Chi-Square Distribution Table

Degrees of                        Area in Upper Tail
Freedom      .99     .975    .95     .90     .10      .05      .025     .01
   5        0.554   0.831   1.145   1.610    9.236   11.070   12.832   15.086
   6        0.872   1.237   1.635   2.204   10.645   12.592   14.449   16.812
   7        1.239   1.690   2.167   2.833   12.017   14.067   16.013   18.475
   8        1.647   2.180   2.733   3.490   13.362   15.507   17.535   20.090
   9        2.088   2.700   3.325   4.168   14.684   16.919   19.023   21.666
  10        2.558   3.247   3.940   4.865   15.987   18.307   20.483   23.209

Our $\chi^2_{.10}$ value (9 d.f.) is 14.684.

Rejection Region
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{9s^2}{.5}$$
[Figure: chi-square distribution with 9 d.f.; the rejection region (Reject $H_0$) is the upper tail of area .10 to the right of 14.684.]

Test Statistic
The sample variance is $s^2 = 0.7$.
$$\chi^2 = \frac{9(.7)}{.5} = 12.6$$

Conclusion
Because $\chi^2 = 12.6$ is less than 14.684, we cannot reject $H_0$. The sample variance $s^2 = .7$ is insufficient evidence to conclude that the temperature variance for ThermoRite thermostats is unacceptable.

Chapter 13, Part A: Analysis of Variance and Experimental Design
• Introduction to Analysis of Variance
• Analysis of Variance: Testing for the Equality of k Population Means

Introduction to Analysis of Variance
Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means. We want to use the sample results to test the following hypotheses:
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_a$: Not all population means are equal
If $H_0$ is rejected, we cannot conclude that all population means are different. Rejecting $H_0$ means that at least two population means have different values.

Assumptions for Analysis of Variance
• For each population, the response variable is normally distributed.
• The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
• The observations must be independent.

Test for the Equality of k Population Means
Hypotheses
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots$
$= \mu_k$
$H_a$: Not all population means are equal

Test Statistic
$$F = \frac{\mathrm{MSTR}}{\mathrm{MSE}}$$

Between-Treatments Estimate of Population Variance
A between-treatments estimate of $\sigma^2$ is called the mean square treatment and is denoted MSTR.
$$\mathrm{MSTR} = \frac{\mathrm{SSTR}}{k - 1}$$
The numerator is the sum of squares due to treatments and is denoted SSTR. The denominator represents the degrees of freedom associated with SSTR.

Within-Samples Estimate of Population Variance
The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square error and is denoted MSE.
$$\mathrm{MSE} = \frac{\mathrm{SSE}}{n_T - k}$$
The numerator is the sum of squares due to error and is denoted SSE. The denominator represents the degrees of freedom associated with SSE.

Test for the Equality of k Population Means
$k$: number of subpopulations being compared.
$n_T$: total number of observations.
Rejection Rule
Reject $H_0$ if $F > F_{\alpha}$
where the value of $F_{\alpha}$ is based on an F distribution with $k - 1$ numerator d.f. and $n_T - k$ denominator d.f.

Selected Values from the F Distribution Table

Denominator   Area in          Numerator Degrees of Freedom
Degrees of    Upper
Freedom       Tail         7       8       9      10      15
    8         .10        2.62    2.59    2.56    2.54    2.46
              .05        3.50    3.44    3.39    3.35    3.22
              .025       4.53    4.43    4.36    4.30    4.10
              .01        6.18    6.03    5.91    5.81    5.52
    9         .10        2.51    2.47    2.44    2.42    2.34
              .05        3.29    3.23    3.18    3.14    3.01
              .025       4.20    4.10    4.03    3.96    3.77
              .01        5.61    5.47    5.35    5.26    4.96

Comparing the Variance Estimates: The F Test
If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an F distribution with numerator (MSTR) d.f. equal to $k - 1$ and denominator (MSE) d.f. equal to $n_T - k$.
If the means of the k populations are not equal, the value of MSTR/MSE will be inflated because MSTR overestimates $\sigma^2$. Hence, we will reject $H_0$ if the resulting value of MSTR/MSE appears to be too large to have been selected at random from the appropriate F distribution.
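The claim that MSTR overestimates $\sigma^2$ when the means differ can be made concrete with the expected mean squares of the one-way fixed-effects model. This is a standard result stated here as a sketch, not something taken from the slides; the symbols $n_j$ (size of sample $j$), $\mu_j$ (mean of population $j$), and $\bar{\mu}$ (the weighted average of the population means) are notation introduced only for this illustration.

```latex
% Expected mean squares under the ANOVA assumptions
% (normal populations, common variance \sigma^2, independent observations).
\begin{align*}
  E(\mathrm{MSE})  &= \sigma^2 \\
  E(\mathrm{MSTR}) &= \sigma^2 + \frac{1}{k-1}\sum_{j=1}^{k} n_j\,(\mu_j - \bar{\mu})^2,
  \qquad \bar{\mu} = \frac{1}{n_T}\sum_{j=1}^{k} n_j\,\mu_j
\end{align*}
% Under H_0 every \mu_j equals \bar{\mu}, the sum vanishes, and both mean
% squares estimate \sigma^2, so MSTR/MSE should be close to 1.
% If any \mu_j differs, the extra term is positive and MSTR/MSE tends to be large.
```

Because the extra term can never be negative, only large values of MSTR/MSE count as evidence against $H_0$, which is why the rejection region sits in the upper tail only.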
ANOVA Table

Source of     Sum of     Degrees of    Mean
Variation     Squares    Freedom       Squares    F
Treatment     SSTR       k - 1         MSTR       MSTR/MSE
Error         SSE        n_T - k       MSE
Total         SST        n_T - 1

SST is partitioned into SSTR and SSE. SST's degrees of freedom (d.f.) are partitioned into SSTR's d.f. and SSE's d.f.

SST divided by its degrees of freedom $n_T - 1$ is the overall sample variance that would be obtained if we treated the entire set of observations as one data set. With the entire data set as one sample, the formula for computing the total sum of squares, SST, is:

$$\mathrm{SST} = \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = \mathrm{SSTR} + \mathrm{SSE}$$

$$\mathrm{SSTR} = \sum_{j=1}^{k} n_j(\bar{x}_j - \bar{\bar{x}})^2$$

$$\mathrm{SSE} = \sum_{j=1}^{k} (n_j - 1)s_j^2$$

ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error. Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates and the F value used to test the hypothesis of equal population means.

Test for the Equality of k Population Means
Example: Reed Manufacturing
Janet Reed would like to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants (in Buffalo, Pittsburgh, and Detroit). A simple random sample of five managers from each of the three plants was taken, and the number of hours worked by each manager for the previous week is shown below. Conduct an F test using $\alpha = .05$.

                   Plant 1    Plant 2       Plant 3
Observation        Buffalo    Pittsburgh    Detroit
1                    48          73           51
2                    54          63           63
3                    57          66           61
4                    54          64           54
5                    62          74           56
Sample Mean          55          68           57
Sample Variance      26.0        26.5         24.5

Test for the Equality of k Population Means
p-Value and Critical Value Approaches
1. Develop the hypotheses.
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_a$: Not all the means are equal
where:
$\mu_1$ = mean number of hours worked per week by the managers at Plant 1
$\mu_2$ = mean number of hours worked per week by the managers at Plant 2
$\mu_3$ = mean number of hours worked per week by the managers at Plant 3

Compute the test statistic using the ANOVA table.

Source of     Sum of     Degrees of    Mean
Variation     Squares    Freedom       Squares    F
Treatment       490          2         245        9.55
Error           308         12          25.67
Total           798         14

Critical Value Approach
4. Compute the critical value.
With 2 numerator d.f. and 12 denominator d.f., $F_{.05} = 3.89$.
5. Determine whether to reject $H_0$.
Because $F = 9.55 > F_{.05} = 3.89$, we reject $H_0$. We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all three plants.
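The Reed Manufacturing ANOVA table can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_way_anova` is hypothetical, and the critical value 3.89 is taken from the example rather than computed.

```python
from statistics import mean, variance

# Hours worked by five managers at each plant (Reed Manufacturing example).
hours = {
    "Buffalo": [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit": [51, 63, 61, 54, 56],
}


def one_way_anova(groups):
    """Return (SSTR, SSE, MSTR, MSE, F) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-treatments sum of squares: n_j * (group mean - grand mean)^2.
    sstr = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-samples sum of squares: (n_j - 1) * sample variance of group j.
    sse = sum((len(g) - 1) * variance(g) for g in groups)
    mstr = sstr / (k - 1)
    mse = sse / (n_total - k)
    return sstr, sse, mstr, mse, mstr / mse


sstr, sse, mstr, mse, f_stat = one_way_anova(list(hours.values()))

F_CRITICAL = 3.89  # F value for alpha = .05 with 2 and 12 d.f., from the example

print(f"SSTR = {sstr:.0f}, SSE = {sse:.0f}, SST = {sstr + sse:.0f}")
print(f"MSTR = {mstr:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
print("Reject H0" if f_stat > F_CRITICAL else "Do not reject H0")
```

The sums of squares should match the table (490, 308, and 798), and the F ratio of about 9.55 exceeds 3.89, so the sketch reaches the same decision as the worked example.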