MAT 214 Lab 10: Hypothesis Testing: Large and Small Samples and Assessing Normality
In this lab we will practice writing up hypothesis tests for the mean. As we have discussed in class, our
hypothesis testing procedure has several steps.
1. State the null and alternative hypotheses in words and symbols.
2. Choose a level of significance, α, the test to use, and the corresponding critical value(s).
3. State the test statistic that is to be used.
4. State the decision rule. Calculate the rejection region. (optional)
5. State the assumptions that were used.
6. Calculate the test statistic and p-value.
7. State the decision and write your conclusion in terms of the problem.
8. Discuss the type of error possible and its probability.
Hypothesis Testing – Review
H0: μ = μ0 (for some particular value μ0)
Ha: μ < μ0 (One-tailed test to the left)
Choose α and calculate appropriate statistic.
This is a one-tailed test, so reject H0 if statistic lies in the left-tail rejection region (containing area α).
The statistic should be negative.
H0: μ = μ0 (for some particular value μ0)
Ha: μ > μ0 (One-tailed test to the right)
Choose α and calculate appropriate statistic.
This is a one-tailed test, so reject H0 if statistic lies in the right-tail rejection region (containing area α).
The statistic should be positive.
H0: μ = μ0 (for some particular value μ0)
Ha: μ ≠ μ0 (Two-tailed test, either to the left or to the right)
Choose α and calculate appropriate statistic.
This is a two-tailed test, so reject H0 if statistic lies in the left-tail rejection region (containing area α/2)
OR in the right-tail rejection region (containing area α/2). The statistic could be negative or positive.
When we calculate the statistic, we can also determine the p-value: the total area in the appropriate tail or
tails (depending upon the hypotheses) determined by the calculated value of the statistic.
Quick-and-dirty decision-making: If the p-value is less than α, we reject H0.
If the p-value is not less than α, we cannot reject H0.
We will use this approach throughout this lab.
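The p-value rule above can be sketched in a few lines of Python for the case of a z statistic. This is only an illustration and not part of the Minitab procedure: the function names `p_value` and `decide` and the statistic z = 1.80 are made up for the example.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """Tail area under the standard normal curve for a z statistic."""
    cdf = NormalDist().cdf(z)
    if alternative == "less":        # Ha: mu < mu0, left tail
        return cdf
    if alternative == "greater":     # Ha: mu > mu0, right tail
        return 1.0 - cdf
    if alternative == "two-sided":   # Ha: mu != mu0, both tails
        return 2.0 * min(cdf, 1.0 - cdf)
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


def decide(p, alpha):
    """Reject H0 only when the p-value is less than alpha."""
    return "reject H0" if p < alpha else "cannot reject H0"


z = 1.80  # hypothetical value of the test statistic
for alt in ("less", "greater", "two-sided"):
    p = p_value(z, alt)
    print(f"{alt:>9}: p-value = {p:.4f} -> {decide(p, 0.05)}")
```

Note that the same statistic can lead to different decisions depending on which alternative hypothesis was stated, which is why the hypotheses must be fixed before looking at the data.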
I. This time you will learn how MINITAB can help a statistician test hypotheses for the mean using a
large sample. As usual, any information or graphs you create should be copied to your write-up. The
data file 1STGRADE.MTW can be found in the MINITAB data folder.
While running hypothesis tests, please follow the steps from the handout.
1) Hypothesis Testing – Using Minitab to aid in the calculations, i.e. perform the descriptive statistics.
a) The U.S. government claims that the mean height of a first grader is 42 inches. A researcher feels
that the actual mean height of first grade students exceeds 42 inches. Set up a hypothesis test for this
experiment and use the descriptive statistics of Height in the 1STGRADE.MTW file as the results of
the experiment. Do the test at the α = 0.05 level.
b) Would the conclusions have changed if we had used α = 0.01? What about α = 0.1? Make sure
you state the new rejection regions.
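As a way to check rejection regions found from a z table, the critical value of a right-tailed z test can be computed directly. The sketch below assumes a right-tailed test (Ha: μ > μ0) with a z statistic; the function name `right_tail_critical_value` is made up for the example.

```python
from statistics import NormalDist


def right_tail_critical_value(alpha):
    """z value that leaves area alpha in the right tail of N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - alpha)


for alpha in (0.10, 0.05, 0.01):
    z_crit = right_tail_critical_value(alpha)
    print(f"alpha = {alpha:.2f}: reject H0 if z > {z_crit:.3f}")
```

The smaller α is, the farther out the critical value lies, so the rejection region shrinks and stronger evidence is needed to reject H0.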
2) Hypothesis Testing – Using Minitab to do the entire test.
a) We will redo the above test using Minitab to calculate the z-values as well as calculate the p-value of
the test.
b) Consider the above hypothesis test for the height (1a).
i) Do the following.
(1) →Stat → Basic Statistics → One-Sample t.
Even though we are using the normal distribution here, Minitab will not use s for σ unless we
use the t-test. We will discuss in class this week what the t-distribution is and how it relates
to the z-distribution.
In the Samples in columns box select the Hgt variable.
(2) In the test mean box type in 42, which is μ0 → Options.
(3) Make sure the confidence level is set to 95.0, which means that α = 0.05.
(4) Change the Alternative to greater than → OK → OK.
ii) What are the T and P values for the test?
iii) Thinking of the T as the Z value, what is your conclusion?
iv) Change the significance level to α = 0.01 and α = 0.1. What are the T and P values for each
test? Thinking of the T as the Z value, what is your conclusion?
What have you noticed about the p-value when you change the significance level? Recall from
class that when p-value < α we reject the null hypothesis. Compare each of the three values of α
used in each of the two tests with the p-value to determine your conclusion.
c) Next use Minitab to test the hypotheses for the following situation:
The U.S. government claims that the mean weight of a first grader is 43 pounds. A researcher feels
that the actual mean weight of first grade students exceeds 43 pounds. Set up a hypothesis test for
this experiment using Weight. Do the test at the α = 0.05 level.
d) As the final answer to the hypothesis tests about the mean height and weight of first graders, state the
conclusion and the type of error associated with it.
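The large-sample calculation in Part I, where s stands in for σ and the T value is read as a Z value, can be sketched as follows. The heights listed are invented for illustration and are not the values in 1STGRADE.MTW; the function name `large_sample_test` is also made up. The p-value here comes from the normal curve, so it will differ slightly from the one Minitab reports from the t-distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def large_sample_test(data, mu0):
    """Right-tailed test of H0: mu = mu0 using s in place of sigma."""
    n = len(data)
    standard_error = stdev(data) / sqrt(n)       # s / sqrt(n)
    stat = (mean(data) - mu0) / standard_error   # (xbar - mu0) / (s / sqrt(n))
    p = 1.0 - NormalDist().cdf(stat)             # right-tail area, normal approximation
    return stat, p


# Hypothetical heights in inches (36 made-up values, NOT the lab data).
heights = [
    41.2, 43.5, 42.8, 44.1, 40.9, 42.3,
    43.0, 41.7, 42.6, 43.9, 42.1, 41.4,
    44.3, 42.9, 43.2, 41.8, 42.5, 43.7,
    40.6, 42.0, 43.4, 42.7, 41.9, 44.0,
    42.4, 43.1, 41.5, 42.2, 43.6, 42.8,
    41.1, 43.3, 42.6, 44.2, 41.6, 42.9,
]

stat, p = large_sample_test(heights, mu0=42)
for alpha in (0.10, 0.05, 0.01):
    verdict = "reject H0" if p < alpha else "cannot reject H0"
    print(f"statistic = {stat:.3f}, p-value = {p:.4f}, alpha = {alpha}: {verdict}")
```

Changing α does not change the statistic or the p-value; it only changes the threshold the p-value is compared against.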
II. Next we will consider small samples. Using the t-test for small samples requires that the data be
(approximately) normal. Before checking our data set for normality, let’s see how some of the tests for
normality work and how the tests detect normal and non-normal data. We want to get a sense of what these
tests produce when applied to several distributions. We will look at the normal, Cauchy, and exponential
distributions. Use a new worksheet and name columns C1, C2, and C3 as N(0,12), C(0,12), and Exp(12),
respectively.
A. Generate the Random Samples:
1. Put a random sample of size 25 from N(0,12) in column N(0,12):
Click Calc
Click Random Data
Click Normal
Generate 25 rows of data
Store the data in N(0,12)
Let the mean equal 0
Let the standard deviation equal 12
Click OK
2. Put a random sample of size 25 from Cauchy(0,12) in column C(0,12):
Click Calc
Click Random Data
Click Cauchy
Generate 25 rows of data
Store the data in C(0,12)
Let the location equal 0
Let the scale equal 12
Click OK
3. Put a random sample of size 25 from Exp(12) in column Exp(12)
Click Calc
Click Random Data
Click Exponential
Generate 25 rows of data
Store the data in Exp(12)
Let the mean equal 12
Click OK
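For readers who want to see what the three Calc → Random Data steps produce, the sampling can be imitated in Python. This is a sketch under stated assumptions: the seed 214 is arbitrary, the helper names are made up, and the random numbers will not match those Minitab generates.

```python
import math
import random


def normal_sample(n, mean, sd, rng):
    """n draws from a normal distribution with the given mean and standard deviation."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def cauchy_sample(n, location, scale, rng):
    """n Cauchy draws by the inverse-CDF method: location + scale * tan(pi * (U - 0.5))."""
    return [location + scale * math.tan(math.pi * (rng.random() - 0.5))
            for _ in range(n)]


def exponential_sample(n, mean, rng):
    """n exponential draws; expovariate takes the rate, which is 1 / mean."""
    return [rng.expovariate(1.0 / mean) for _ in range(n)]


rng = random.Random(214)  # arbitrary seed so the run can be repeated
columns = {
    "N(0,12)": normal_sample(25, 0, 12, rng),
    "C(0,12)": cauchy_sample(25, 0, 12, rng),
    "Exp(12)": exponential_sample(25, 12, rng),
}
for name, values in columns.items():
    print(name, [round(v, 1) for v in values[:5]], "...")
```

The Cauchy column typically contains a few values far from 0, and the exponential column is never negative; both features should show up in the normality checks that follow.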
B. Check each column for normality. We will do this by producing a probability plot and the
Anderson-Darling test for each column.
A probability plot is a scatter plot of the data on “normal” graph paper together with a fitted
straight line. If you are sampling from a normal distribution, the scatter plot should look like a
straight line. The Anderson-Darling test tests the null hypothesis that the data are normal against
the alternative hypothesis that the data are not normal. A small p-value means the test suggests
the data are not normal.
1. Producing the plot and the Anderson-Darling Test:
Click Stat
Click Basic Statistics
Click Normality Test
Double Click the column name, so that it appears in the variable box
Select the Anderson-Darling Test and Click OK
Discuss both the probability plot and the Anderson-Darling test for each column. Conclude which of the
generated samples come from a normal population.
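To see what the Anderson-Darling test measures, the statistic can be computed by hand. The sketch below is an assumption-laden illustration, not Minitab's implementation: it fits a normal curve using the sample mean and standard deviation, applies a commonly used small-sample adjustment, and compares the result with 0.752, a commonly tabulated 5% critical value for this case. Minitab reports a p-value instead. The two data sets are idealized quantile patterns (not random samples), chosen so that one is shaped like a normal sample and the other like a Cauchy sample.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(data):
    """Adjusted Anderson-Darling statistic for normality (mean and sd estimated)."""
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    eps = 1e-12  # keep probabilities away from 0 and 1 so log() is defined
    cdf = [min(max(fitted.cdf(v), eps), 1.0 - eps) for v in x]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)  # small-sample adjustment


CRITICAL_5_PERCENT = 0.752  # assumed tabulated value for the adjusted statistic

# Idealized 25-point patterns: evenly spaced quantiles of each distribution.
probs = [(i + 0.5) / 25 for i in range(25)]
normal_like = [NormalDist(0, 12).inv_cdf(p) for p in probs]
cauchy_like = [12 * math.tan(math.pi * (p - 0.5)) for p in probs]

for name, sample in (("normal-like", normal_like), ("Cauchy-like", cauchy_like)):
    stat = anderson_darling_normal(sample)
    verdict = "reject normality" if stat > CRITICAL_5_PERCENT else "cannot reject normality"
    print(f"{name}: adjusted A^2 = {stat:.3f} -> {verdict}")
```

A large statistic corresponds to a small p-value in Minitab's output: the heavier the tails or the stronger the skew, the farther the points fall from the straight line on the probability plot and the larger the statistic becomes.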
III. Go to the help menu and look up the data file BANKING.MTW. Copy and paste the information from
the help system into your Word document. Also, load the BANKING.MTW file into Minitab.
A. Complete a hypothesis test for the mean of the 1991 data. Test the null hypothesis that the mean
equals 450 against the alternative that the mean is less than 450. Use the α = 0.05 level.
1. Using the sign test
Click Stat
Click Nonparametrics
Click 1-sample sign
Double Click 1991 (so that it appears in the variable box)
Click Test Median, type 450 in the box
Click the Alternative box, and select “less than”
Click OK
2. Using the Wilcoxon test
Click Stat
Click Nonparametrics
Click 1-sample Wilcoxon
Double Click 1991 (so that it appears in the variable box)
Click Test Median, type 450 in the box
Click the Alternative box, select “less than”
Click OK
3. Using the t-test
Click Stat
Click Basic Statistics
Click 1-sample t-test
Double Click 1991 (so that it appears in the variable box)
Click Test Mean, type 450 in the box
Click Options
Click the Alternative box, select “less than”
Click OK
For each test, decide whether H0 would be rejected at the 0.05 level of significance. How do the p-values for
each test compare? Do all of the above tests lead to the same conclusion?
B. Repeat part A, steps 1-3, for the mean of the 1990 data. Test the null hypothesis that the mean
equals 350 against the alternative that the mean is greater than 350. Use the α = 0.05 level.
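Of the three tests in Part III, the sign test is the easiest to compute by hand, since it only counts how many observations fall above and below the hypothesized median. The sketch below computes an exact binomial p-value for either one-sided alternative. The function name `sign_test` and the data values are made up for illustration and are not the BANKING.MTW values; observations equal to the hypothesized median are dropped, which is the usual convention.

```python
from math import comb


def sign_test(data, median0, alternative):
    """Exact one-sided sign test of H0: median = median0."""
    above = sum(1 for v in data if v > median0)
    below = sum(1 for v in data if v < median0)
    n = above + below  # values equal to median0 are dropped

    def binom_cdf(k):
        # P(X <= k) for X ~ Binomial(n, 1/2)
        return sum(comb(n, j) for j in range(k + 1)) / 2 ** n

    if alternative == "less":       # few values above supports median < median0
        p = binom_cdf(above)
    elif alternative == "greater":  # few values below supports median > median0
        p = binom_cdf(below)
    else:
        raise ValueError("alternative must be 'less' or 'greater'")
    return below, above, p


# Hypothetical data (NOT the BANKING.MTW values).
sample = [412, 438, 447, 455, 429, 401, 462, 436, 444, 420]
below, above, p = sign_test(sample, 450, "less")
print(f"below = {below}, above = {above}, p-value = {p:.4f}")
```

Because the sign test ignores how far each value lies from 450, it usually gives a larger p-value than the Wilcoxon test or the t-test on the same data, which is one reason the three tests can disagree.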
IV. Check the 1990 and the 1991 data sets for normality. Was the t-test used in Part III appropriate for
these data sets? Explain.
V. To use the (nonparametric) Wilcoxon test, we need to know that the data come from a symmetric
continuous distribution. Discuss how you might assess symmetry, and then try to address whether or
not the Wilcoxon test used in Part III was appropriate for the 1990 and 1991 data sets. (Hint: Minitab
can make a histogram, and there is also a symmetry plot if you click on Stat and then Quality Tools.)
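One way to think about assessing symmetry is to pair off observations from the two ends of the sorted data and compare their distances from the median, which is the idea behind a symmetry plot. The sketch below does this and also computes a simple sample skewness. The function names and the two small data sets are made up for illustration, and the exact construction of Minitab's symmetry plot may differ.

```python
from statistics import mean, median


def symmetry_pairs(data):
    """Pairs (distance below median, distance above median), working inward from the ends."""
    x = sorted(data)
    m = median(x)
    n = len(x)
    return [(m - x[i], x[n - 1 - i] - m) for i in range(n // 2)]


def sample_skewness(data):
    """Moment-based skewness: third central moment over second central moment to the 3/2."""
    n = len(data)
    xbar = mean(data)
    m2 = sum((v - xbar) ** 2 for v in data) / n
    m3 = sum((v - xbar) ** 3 for v in data) / n
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]      # made-up symmetric data
right_skewed = [1, 2, 3, 4, 20]  # made-up data with a long right tail

for name, sample in (("symmetric", symmetric), ("right-skewed", right_skewed)):
    print(name, symmetry_pairs(sample), f"skewness = {sample_skewness(sample):.3f}")
```

For symmetric data the two distances in each pair are roughly equal and the skewness is near 0. When the second distance is consistently larger, the data have a long right tail, and the symmetry assumption behind the Wilcoxon test is doubtful.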