EXAMPLE

A recent national survey found that high school students watched an average (mean) of 6.8 videos per month. A random sample of 36 college students revealed that the mean number of videos watched last month was 6.2, with a standard deviation of 0.5. At the .05 significance level, can we conclude (test) that college students watch fewer videos a month than high school students?

CS = College Students, HS = High School Students

Solution

This is a one-tailed test, because the claim is that college students watch fewer videos a month than high school students.

Information: X̄ = 6.2, μ = 6.8, α = .05, σ = .5, n = 36

Step 1: State the hypotheses
H0: CS ≥ 6.8 (the high school mean)
Ha: CS < 6.8

Step 2: Significance level
α = .05

Step 3: Test statistic
Z test, because n > 30.

Step 4: Critical value and rejection criterion
Because this is a one-tailed test, the area between 0 and the critical value is .5 - .05 = .45. The closest entry in the Z table is .4505, which gives Z = 1.65. The rejection region is in the lower tail, so the critical value is -1.65.
Rejection criterion: reject the null hypothesis if the calculated value of the test statistic is less than the critical value.

Step 5: Calculate the Z test result

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{6.2 - 6.8}{0.5/\sqrt{36}} = -7.2 \]

Z Test of Hypothesis for the Mean
Data
  Null Hypothesis μ =               6.8
  Level of Significance             0.05
  Population Standard Deviation     0.5
  Sample Size                       36
  Sample Mean                       6.2
Intermediate Calculations
  Standard Error of the Mean        0.083333333
  Z Test Statistic                  -7.2
Lower-Tail Test
  Lower Critical Value              -1.644853627
  p-Value                           3.01063E-13
  Reject the null hypothesis

Number line:

  Reject H0            |  Do not reject H0
  ----(-7.2)--------(-1.65)--------0--------(1.65)----  z
                 critical value

Conclusion: Since the calculated value of Z (-7.2) is less than the critical value (-1.65), we reject the null hypothesis. The sample provides enough evidence to support the claim that college students watch fewer videos a month than high school students.

SET 2

Problem # 2

A recent national survey found that high school students watched an average (mean) of 5 videos per month.
A random sample of 48 college students revealed that the mean number of videos watched last month was 5.7, with a standard deviation of 0.4. At the .05 significance level, can we conclude (test) that college students watch fewer videos a month than high school students? Use the template I followed in the first problem and show all the problem information and all 5 steps of the hypothesis testing process. Show a number line along with your final decision, please.

Answer

Step 1: State the hypotheses
H0: CS ≥ 5 (the high school mean)
Ha: CS < 5

Step 2: Significance level
α = .05

Step 3: Test statistic
Z test, because n > 30.

Step 4: Critical value and rejection criterion
Critical value = -1.65 (since it is a lower-tailed test).
Rejection criterion: reject the null hypothesis if the calculated value of the test statistic is less than the critical value.

Step 5: Calculate the Z test result

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{5.7 - 5}{0.4/\sqrt{48}} = 12.1244 \]

Details

Z Test of Hypothesis for the Mean
Data
  Null Hypothesis μ =               5
  Level of Significance             0.05
  Population Standard Deviation     0.4
  Sample Size                       48
  Sample Mean                       5.7
Intermediate Calculations
  Standard Error of the Mean        0.057735027
  Z Test Statistic                  12.12435565
Lower-Tail Test
  Lower Critical Value              -1.644853627
  p-Value                           1
  Do not reject the null hypothesis

[Distribution plot: standard normal density (mean 0, standard deviation 1) with the lower 0.05 tail shaded below Z = -1.64.]

The value of the test statistic, 12.12435565, is far above the critical value of -1.65, so it falls in the acceptance region. The sample mean (5.7) is in fact higher than the hypothesized mean (5), so the data cannot support a claim of "fewer".

Conclusion: Fail to reject the null hypothesis. The sample does not provide enough evidence to support the claim that college students watch fewer videos a month than high school students.

Problem A.

At the time she was hired as a server at the Grumney Family Restaurant, Beth Brigden was told, “You can average more than $20 a day in tips.” Over the first 35 days she was employed at the restaurant, the mean daily amount of her tips was $24.85, with a standard deviation of $3.24. At the .05 significance level, can Ms. Brigden conclude that she is earning an average of more than $20 in tips?
Show all 5 steps.

Answer

Step 1: State the hypotheses
H0: The mean daily amount of her tips ≤ $20
Ha: The mean daily amount of her tips > $20

Step 2: Significance level
α = .05

Step 3: Test statistic
Z test, because n > 30.

Step 4: Critical value and rejection criterion
Critical value = 1.65 (since it is an upper-tailed test).
Rejection criterion: reject the null hypothesis if the calculated value of the test statistic is greater than the critical value.

Step 5: Calculate the Z test result

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{24.85 - 20}{3.24/\sqrt{35}} = 8.8559 \]

Details

Z Test of Hypothesis for the Mean
Data
  Null Hypothesis μ =               20
  Level of Significance             0.05
  Population Standard Deviation     3.24
  Sample Size                       35
  Sample Mean                       24.85
Intermediate Calculations
  Standard Error of the Mean        0.547659957
  Z Test Statistic                  8.855860169
Upper-Tail Test
  Upper Critical Value              1.644853627
  p-Value                           0
  Reject the null hypothesis

[Distribution plot: standard normal density (mean 0, standard deviation 1) with the upper 0.05 tail shaded above Z = 1.64.]

The value of the test statistic, 8.855860169, falls in the critical region.

Conclusion: Reject the null hypothesis. The sample provides enough evidence to support the claim that the mean daily amount of her tips is greater than $20.

Problem B.

According to the local union president, the mean gross income of plumbers in the Salt Lake City area is normally distributed, with a mean of $30,000 and a standard deviation of $3,000. A recent investigative reporter for KYAK TV found, for a sample of 18 plumbers, the mean gross income was $30,500. At the .10 significance level, is it reasonable to conclude that the mean income is not equal to $30,000? Show all 5 steps.
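Before the written answer, the arithmetic for Problem B can be sketched in a few lines of Python. This is only an illustrative check, not part of the original solution: the helper name `one_sample_z` is hypothetical, and the sketch treats the $3,000 figure as a known population standard deviation (so it uses a z critical value from the standard library's `statistics.NormalDist`), whereas the worked answer that follows uses a t test with 17 degrees of freedom. Both approaches give the same standard error and test statistic.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """Return (standard error, z statistic) for a one-sample test of the mean."""
    se = sigma / sqrt(n)
    return se, (sample_mean - mu0) / se


# Problem B figures: H0 mean 30000, standard deviation 3000, n = 18, sample mean 30500
se, z = one_sample_z(30500, 30000, 3000, 18)

alpha = 0.10
# Two-tailed test: split alpha between the two tails (assumes sigma is known)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"SE = {se:.4f}, z = {z:.4f}, critical = +/-{z_crit:.4f}, p = {p_value:.4f}")
print("Reject H0" if abs(z) > z_crit else "Do not reject H0")
```

The same helper reproduces the earlier z statistics, for example `one_sample_z(24.85, 20, 3.24, 35)` for Problem A.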
Answer

The null hypothesis tested is H0: Mean income = $30,000
The alternative hypothesis is H1: Mean income ≠ $30,000

Significance level = 0.10

The test statistic used is the Student t test (two-tailed):

\[ t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]

Critical value: ±1.739606716 (17 degrees of freedom)

Rejection criterion: reject the null hypothesis if the calculated value of the test statistic |t| is greater than the critical value.

Details

t Test for Hypothesis of the Mean
Data
  Null Hypothesis μ =               30000
  Level of Significance             0.1
  Sample Size                       18
  Sample Mean                       30500
  Sample Standard Deviation         3000
Intermediate Calculations
  Standard Error of the Mean        707.1067812
  Degrees of Freedom                17
  t Test Statistic                  0.707106781
Two-Tail Test
  Lower Critical Value              -1.739606716
  Upper Critical Value              1.739606716
  p-Value                           0.48908054
  Do not reject the null hypothesis

[Distribution plot: t density with df = 17; 0.05 shaded in each tail, below t = -1.74 and above t = 1.74.]

The value of the test statistic, 0.707106781, falls in the acceptance region.

Conclusion: Fail to reject the null hypothesis. The sample does not support the claim that the mean income is not equal to $30,000.

Note: the problem states the $3,000 standard deviation for the population, so a Z test (critical values ±1.645) is also defensible. The statistic is the same, 0.7071, and the decision does not change.

Problem C.

Tina Dennis is the comptroller for Meek Industries. She believes that the current cash-flow problem at Meek is due to the slow collection of accounts receivable. She believes that more than 60 percent of the accounts are in arrears more than three months. A random sample of 200 accounts showed that 140 were more than three months old. At the .05 significance level, can she conclude that more than 60 percent of the accounts are in arrears for more than three months?
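The arithmetic for this one-proportion test can be sketched in Python as a quick check on the written answer. This is a minimal sketch using only the standard library; the function name `one_proportion_z` is hypothetical and introduced just for illustration. The standard error is computed under the null value p0 = 0.60, as in the usual large-sample test.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """Return (sample proportion, standard error under H0, z statistic)."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error uses the hypothesized p0
    return p_hat, se, (p_hat - p0) / se


# Problem C figures: 140 of 200 accounts in arrears, H0: p <= 0.60
p_hat, se, z = one_proportion_z(140, 200, 0.60)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-tailed critical value
p_value = 1 - NormalDist().cdf(z)          # upper-tail area beyond z

print(f"p_hat = {p_hat:.2f}, SE = {se:.6f}, z = {z:.4f}")
print(f"critical = {z_crit:.4f}, p-value = {p_value:.6f}")
print("Reject H0" if z > z_crit else "Do not reject H0")
```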
Answer:

The null hypothesis tested is H0: The proportion of accounts that are in arrears for more than three months is less than or equal to 0.60 (p ≤ 0.60)
The alternative hypothesis is H1: The proportion of accounts that are in arrears for more than three months is greater than 0.60 (p > 0.60)

Significance level: 0.05 (upper-tailed Z test)

The test statistic used is

\[ Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}} \]

Rejection criterion: reject the null hypothesis if the calculated value of the test statistic Z is greater than the critical value of Z.

Critical value: 1.645

Details

Z Test of Hypothesis for the Proportion
Data
  Null Hypothesis p =               0.6
  Level of Significance             0.05
  Number of Successes               140
  Sample Size                       200
Intermediate Calculations
  Sample Proportion                 0.7
  Standard Error                    0.034641016
  Z Test Statistic                  2.886751346
Upper-Tail Test
  Upper Critical Value              1.644853627
  p-Value                           0.001946209
  Reject the null hypothesis

[Distribution plot: standard normal density (mean 0, standard deviation 1) with the upper 0.05 tail shaded above Z = 1.64.]

The value of the test statistic Z (2.886751346) falls in the critical region.

Conclusion: Reject the null hypothesis. The sample provides enough evidence to support the claim that the proportion of accounts that are in arrears for more than three months is greater than 0.60.

SET 4

Problem 1.

What is the cutoff/critical F value for an ANOVA problem where the degrees of freedom in the numerator are equal to 6 and the degrees of freedom for the denominator are equal to 15? Use the 0.01 significance level, and then use the 1 percent F table.

[Distribution plot: F density with df1 = 6, df2 = 15; upper 0.01 tail shaded above X = 4.32.]

Critical value of F(6, 15) at the 0.01 significance level = 4.318

Problem 2.

For the three treatments shown below, conduct an ANOVA test. Use the 0.01 significance level. Use Excel and show all 5 steps of the hypothesis test. If you do not have the Excel Data Analysis tool, use F=21.9 for this problem.
If you have the tool, show me the work, and of course it should match F=21.9.

  Treatment 1   Treatment 2   Treatment 3
       8             3             3
       6             2             4
      10             4             5
       9             3             4

Answer

The null hypothesis tested is
H0: There is no significant difference in the means of the three treatments
H1: There is a significant difference in the means of the three treatments

Significance level: α = 0.01

Test statistic: F test (ANOVA)

Rejection criterion: reject the null hypothesis if the calculated value of F is greater than the critical value of F at the 0.01 significance level.

Details

SUMMARY
  Groups        Count   Sum   Average   Variance
  Treatment 1     4      33    8.25     2.916667
  Treatment 2     4      12    3        0.666667
  Treatment 3     4      16    4        0.666667

ANOVA
  Source of Variation   SS         df   MS         F          P-value    F crit
  Between Groups        62.16667    2   31.08333   21.94118   0.000346   8.021517
  Within Groups         12.75       9   1.416667
  Total                 74.91667   11

Conclusion: The calculated F (21.94) is greater than the critical value (8.02), so we reject the null hypothesis. The sample provides enough evidence to support the claim that the means of the three treatments are different.

Problem 3.

The following sample observations were randomly selected. Determine the coefficient of correlation and the coefficient of determination. Interpret. I HAVE THIS ONE BUT WANT TO CHECK IT!

X: 4 5 3 6 10
Y: 4 6 5 7 7

          X     Y     X²    Y²    XY
          4     4     16    16    16
          5     6     25    36    30
          3     5      9    25    15
          6     7     36    49    42
         10     7    100    49    70
  Total  28    29    186   175   173

The correlation is given by the formula

\[ r = \frac{n\sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{\sqrt{\left[n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2\right]\left[n\sum_{i=1}^{n} Y_i^2 - \left(\sum_{i=1}^{n} Y_i\right)^2\right]}} \]

where n = 5.

The calculated value of r = 0.752246, which indicates a fairly strong positive linear relationship between X and Y.

Coefficient of determination = r² = 0.752246 × 0.752246 = 0.565874

Thus about 56.6% of the variation in Y can be explained using X as the independent variable.

Problem 4.

Determine the regression equation for this data. Use the .05 significance level.

X: 5 5 4 3 6 10 11
Y: 5 6 5 7 7 9 12

Answer

The general form of simple linear regression is Y = a + bX, where Y is the dependent variable and X is the independent variable. a and b are known as the regression coefficients. They are estimated by the method of least squares.
The estimates of a and b are given by

\[ \hat{b} = \frac{n\sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}, \qquad \hat{a} = \bar{Y} - \hat{b}\bar{X} \]

The parameter b measures the impact of a unit change in X on the dependent variable Y. It is the slope of the regression line. The parameter a is the value of Y when X = 0. It is known as the intercept term.

The regression equation can be used to predict the value of Y for a given X. The predicted value of Y is given by

\[ \hat{Y} = \hat{a} + \hat{b}X \]

The square of the correlation between X and Y is known as the coefficient of determination (R²). R² gives the proportion of the variation in Y that can be explained using the regression equation.

          X     Y     X²    Y²    XY
          5     5     25    25    25
          5     6     25    36    30
          4     5     16    25    20
          3     7      9    49    21
          6     7     36    49    42
         10     9    100    81    90
         11    12    121   144   132
  Total  44    51    332   409   360

The estimated values are a = 2.8144 and b = 0.7113.

The regression line is Y = 2.8144 + 0.7113X

[Scatter diagram of Y against X with the fitted line y = 0.7113x + 2.8144, R² = 0.7494.]

The significance of the regression coefficients can be tested using Student's t test. The test statistic used is

\[ t = \frac{\hat{\beta}_i}{SE(\hat{\beta}_i)} \sim t_{n-2} \]

The null hypothesis H0: β_i = 0 is rejected when the absolute value of the test statistic is greater than the critical value t_{α/2, n-2}. With α = 0.05 and n - 2 = 5 degrees of freedom, the critical value = 2.57.

              Coefficients   Standard Error   t Stat     P-value
  Intercept   2.814433       1.267077         2.221201   0.077012
  X           0.71134        0.183985         3.86629    0.011805

Thus the regression coefficient of X (the slope) is significant at the 0.05 significance level (t = 3.87 > 2.57, p = 0.0118). The intercept is not significant at this level (t = 2.22 < 2.57, p = 0.0770).
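The least-squares figures for Problem 4 can be reproduced with a short Python sketch. It is an illustrative check rather than the original Excel workflow: it uses the sum formulas for the slope and intercept, the standard textbook expressions for the standard errors (residual variance with n - 2 degrees of freedom), and takes the critical value 2.57 from the text instead of computing it, since the standard library has no t distribution. All variable names are chosen here for illustration.

```python
from math import sqrt

x = [5, 5, 4, 3, 6, 10, 11]
y = [5, 6, 5, 7, 7, 9, 12]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(v * v for v in x)
sum_yy = sum(v * v for v in y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Building blocks shared by the slope and correlation formulas
sxx = n * sum_xx - sum_x ** 2
syy = n * sum_yy - sum_y ** 2
sxy = n * sum_xy - sum_x * sum_y

b = sxy / sxx                      # slope
a = (sum_y - b * sum_x) / n        # intercept = Ybar - b * Xbar
r2 = sxy ** 2 / (sxx * syy)        # coefficient of determination

# Residual variance with n - 2 degrees of freedom
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)

# Standard errors; sxx / n equals the sum of squared deviations of X
se_b = sqrt(s2 / (sxx / n))
se_a = sqrt(s2 * sum_xx / sxx)

t_b, t_a = b / se_b, a / se_a
t_crit = 2.57  # two-tailed, alpha = 0.05, 5 df (value quoted in the text)

print(f"a = {a:.6f}, b = {b:.6f}, R^2 = {r2:.4f}")
print(f"SE(a) = {se_a:.6f}, SE(b) = {se_b:.6f}")
print(f"t(a) = {t_a:.4f}, t(b) = {t_b:.4f}")
print("slope significant:", abs(t_b) > t_crit)
print("intercept significant:", abs(t_a) > t_crit)
```

Replacing `x` and `y` with the Problem 3 data gives the same `r2` quantity used there for the coefficient of determination.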