Stat 501 Lab 03
1 Prediction and confidence intervals for response variable
The exercise in this section is intended to review the methods we learned for predicting a new response Y
and estimating the mean response E(Y) for a given value x_h of the predictor variable.
1.1 Alcoholism and arm strength study
Urbano-Marquez et al. (1989) reported on strength tests for a group of 50 alcoholic men. Their daily intake
of alcohol ranged from 118 to 350 grams (with a mean of 243 grams) for an average of 16 years. The total
lifetime consumption of alcohol (X = alcohol), in kg/kg of body weight, was determined for each person
in the study. The response (Y = strength) was the strength of the deltoid muscle, in kg, in each person’s
nondominant arm. The response was determined by making five measurements over a 20-minute period,
using an electric myometer, which measures muscular force against a fixed resistance. The resulting data
are stored in alcoholarm.txt.
1. Using Minitab, determine a 95% confidence interval for the mean arm strength of all alcoholic men who
have consumed a lifetime dose of 21 kg/kg of body weight of alcohol. Write a sentence that summarizes
the findings of the interval.
2. Using Minitab, determine a 95% prediction interval for the arm strength of an alcoholic man who has
consumed a lifetime dose of 21 kg/kg of body weight of alcohol. Write a sentence that summarizes the
findings of the interval.
3. Which interval – the prediction interval or confidence interval – is wider? Will this always be the case?
4. Would a prediction interval (or a confidence interval) for a lifetime dose of 35 kg/kg of body weight
be wider or narrower than the above intervals? Why? (Hint: To calculate the mean lifetime
consumption of alcohol for the 50 men in the sample, use Calc >> Column statistics... Select
Mean. Specify the input variable. Select OK.)
5. Using Minitab, create a fitted line plot with 95% confidence bands and 95% prediction bands. One of
the data points (or observations) falls outside the prediction bands. Should this be surprising?
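For reference, the two intervals used in this exercise are usually written as below. This is the standard textbook form for simple linear regression rather than something stated in this handout; the symbols b_0, b_1 (estimated intercept and slope), MSE (mean squared error), \bar{x} (sample mean of the predictor) and t_{(1-\alpha/2;\,n-2)} (t critical value) are notation assumed here.

```latex
\hat{y}_h = b_0 + b_1 x_h

% Confidence interval for the mean response E(Y) at x_h
\hat{y}_h \;\pm\; t_{(1-\alpha/2;\,n-2)}
  \sqrt{MSE\left(\frac{1}{n} + \frac{(x_h-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)}

% Prediction interval for a new response Y at x_h
\hat{y}_h \;\pm\; t_{(1-\alpha/2;\,n-2)}
  \sqrt{MSE\left(1 + \frac{1}{n} + \frac{(x_h-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)}
```

Both half-widths depend on x_h only through (x_h - \bar{x})^2, which is why the hint in question 4 asks for the sample mean of the predictor.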
2 Analysis of variance approach for testing H0: β1 = 0
The first exercise in this section is intended to review the analysis of variance (ANOVA) approach to testing
for a linear association between a predictor and a response. As you know, the ANOVA approach requires the
use of the F distribution. The second exercise is intended to help you understand what the F distribution
is, and how it depends on the numerator and denominator degrees of freedom.
2.1 Boiling point of water in the Alps
The data set alpswater.txt contains the barometric pressure (in inches of mercury) and the boiling point
(in degrees Fahrenheit) of water in the Alps. Treating Y = boiling as the response and X = pressure as
the predictor, and using Stat >> Regression >> Regression..., perform a basic regression analysis in
Minitab.
1. Using a significance level α = 0.05 and the P-value reported by Minitab that is associated with the
F-test, what should we conclude?
2. What relationship exists that guarantees that the P-value associated with the t-test will always be the
same as the P-value associated with the F-test? Confirm that this relationship exists with this data
set.
3. Use the F-test in conjunction with the critical value approach to test H0: β1 = 0. Again, set α = 0.05.
Is your conclusion consistent with your conclusion made using the P-value approach?
4. The regression sum of squares comprises what proportion of the total sum of squares? Does this figure
appear anywhere in the standard Minitab regression output?
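The quantities behind these questions can also be computed by hand. The following is a minimal Python sketch, not Minitab output: the function name `simple_regression_anova` is hypothetical, and the six (x, y) pairs are made-up numbers used only to exercise the code. They are not the alpswater.txt data.

```python
import math

def simple_regression_anova(x, y):
    """Least-squares fit of y on x plus the ANOVA decomposition (simple linear regression)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                 # estimated slope
    b0 = y_bar - b1 * x_bar        # estimated intercept
    fitted = [b0 + b1 * xi for xi in x]

    ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # regression sum of squares
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # error sum of squares
    ssto = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares

    mse = sse / (n - 2)            # error mean square, n - 2 degrees of freedom
    f_stat = (ssr / 1) / mse       # MSR / MSE, with 1 numerator degree of freedom
    t_stat = b1 / math.sqrt(mse / sxx)  # t statistic for H0: beta1 = 0

    return {"b0": b0, "b1": b1, "ssr": ssr, "sse": sse, "ssto": ssto,
            "f": f_stat, "t": t_stat, "ssr_over_ssto": ssr / ssto}

# Made-up illustrative data (NOT the alpswater.txt values).
x = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
y = [194.1, 198.0, 201.2, 204.9, 208.1, 211.8]

result = simple_regression_anova(x, y)
print("F =", result["f"], " t^2 =", result["t"] ** 2)
print("SSR / SSTO =", result["ssr_over_ssto"])
```

Comparing the printed F with the printed t² on any data set is one way to check the relationship asked about in question 2.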
2.2 The F distribution
In this exercise, you are instructed to create one graph that contains two F distributions – the first one
with 10 numerator and 15 denominator degrees of freedom and the second one with 1 numerator and 24
denominator degrees of freedom.
1. For both of the F distributions, most of the F values fall between 0 and 3. So, let’s first create
a column containing the F values. Label the first column (C1) ‘F’. Then, create the column of numbers
between 0 and 3 by selecting Calc >> Make Patterned Data >> Simple set of numbers... Now, fill
in the boxes:
• Store patterned data in: ‘F’
• From first value: 0.1
• To last value: 3
• In steps of: 0.1
• List each value: 1 time
• List each sequence: 1 time
After selecting OK, ‘F’ should contain a set of numbers between 0.1 and 3, inclusive.
2. Now, let’s create a second column containing the corresponding heights of the probability distribution
of the first F distribution with 10 numerator and 15 denominator degrees of freedom. Label the second
column (C2) ‘Ht1’. Then, select Calc >> Probability Distributions >> F... Select Probability
density. In the box labeled Numerator degrees of freedom, type 10. In the box labeled Denominator
degrees of freedom, type 15. Click on the button labeled Input column. In the box labeled Input
column, select the variable ‘F,’ and in the box labeled Optional storage, select the variable ‘Ht1’. Select
OK. The variable ‘Ht1’ should now contain the height of the first probability distribution.
3. Now, we just need to create a third column containing the corresponding heights of the probability
distribution of the second F distribution with 1 numerator and 24 denominator degrees of freedom. Label the
third column (C3) ‘Ht2’. Then, select Calc >> Probability Distributions >> F... Select Probability
density. In the box labeled Numerator degrees of freedom, type 1. In the box labeled Denominator
degrees of freedom, type 24. Click on the button labeled Input column. In the box labeled Input
column, select the variable ‘F,’ and in the box labeled Optional storage, select the variable ‘Ht2’. Select
OK. The variable ‘Ht2’ should now contain the height of the second probability distribution.
4. Now, let’s plot the two F distributions on the same graph. Select Graph >> Plot... For Graph 1,
select ‘Ht1’ as the y variable and ‘F’ as the x variable. For Graph 2, select ‘Ht2’ as the y variable and ‘F’
as the x variable. In the area labeled Data Display, under Display, select Connect. Under Frame, select
Multiple Graphs... Under Generation of Multiple Graphs, select Overlay graphs on the same
page. Select OK. Select OK. The graph with the two F distributions should appear in a new window.
5. Which of the two F distributions looks like the type of F distribution we will encounter when testing
H0: β1 = 0 in the simple linear regression setting?
6. Use Minitab to find the probability that an F statistic with 10 numerator and 15 denominator degrees of
freedom will be greater than 1. Also use Minitab to find the probability that an F statistic with 1 numerator
and 24 denominator degrees of freedom will be greater than 1. Do the calculated probabilities look reasonable
in light of your above graphs? (To find the probabilities, select Calc >> Probability Distributions >>
F... Select Cumulative probability. Specify the appropriate numerator and denominator degrees
of freedom. Select Input constant, and specify 1. Select OK. The output will appear in the session
window.)
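The columns built in steps 1 to 3 and the tail probabilities in step 6 can be cross-checked outside Minitab. The sketch below uses only the Python standard library; the helper names `f_pdf` and `prob_f_greater_than` are hypothetical, and the tail probability is approximated by numerical integration (midpoint rule) rather than by Minitab's own routine. It relies on the fact that if F has d1 and d2 degrees of freedom, then 1/F has d2 and d1 degrees of freedom.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 numerator and d2 denominator df, for x > 0."""
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_density = (
        0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
               - (d1 + d2) * math.log(d1 * x + d2))
        - math.log(x)
        - log_beta
    )
    return math.exp(log_density)

def prob_f_greater_than(c, d1, d2, slices=20000):
    """Approximate P(F > c) for F with (d1, d2) df.

    Uses P(F > c) = P(1/F < 1/c), where 1/F has (d2, d1) df, and integrates
    that density over (0, 1/c) with the midpoint rule. Assumes d2 > 2 so the
    integrand is well behaved near zero.
    """
    upper = 1.0 / c
    width = upper / slices
    return sum(f_pdf((i + 0.5) * width, d2, d1) for i in range(slices)) * width

# Column 'F': 0.1, 0.2, ..., 3.0 (same pattern as step 1).
grid = [round(0.1 * i, 1) for i in range(1, 31)]
ht1 = [f_pdf(x, 10, 15) for x in grid]   # column 'Ht1'
ht2 = [f_pdf(x, 1, 24) for x in grid]    # column 'Ht2'

p1 = prob_f_greater_than(1.0, 10, 15)
p2 = prob_f_greater_than(1.0, 1, 24)
print("P(F > 1), df (10, 15):", round(p1, 4))
print("P(F > 1), df (1, 24): ", round(p2, 4))
```

The lists `ht1` and `ht2` should track the Minitab columns ‘Ht1’ and ‘Ht2’, and the two printed probabilities can be compared with one minus the cumulative probabilities that Minitab reports.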
3 Minitab help
3.1 Confidence intervals and prediction intervals for response
1. Select Stat >> Regression >> Regression...
2. Specify the response and the predictor(s).
3. Select Options... In the box labeled “Prediction intervals for new observations,” specify either
the x value or a column name containing multiple x values. Specify the Confidence level (the default
is 95%).
4. Select OK. Select OK. The output will appear in the session window.
3.2 Creating a fitted line plot with confidence bands and/or prediction bands
1. Select Stat >> Regression >> Fitted line plot...
2. Specify the response and the predictor.
3. Under Options..., select Display confidence bands and select Display prediction bands.
Specify the desired confidence level (95% is the default).
4. Select OK. Select OK. A new window containing the fitted line plot will appear.
3.3 To perform a standard regression analysis
1. Select Stat >> Regression >> Regression...
2. In the box labeled Response (Y), select the desired response variable.
3. In the box labeled Predictor (X), select the desired predictor variable.
4. Select OK. The standard regression analysis output will be displayed in the session window.
3.4 To find an F critical value
1. Select Calc >> Probability Distributions >> F...
2. Click the button labeled Inverse cumulative probability.
3. Type the number of numerator degrees of freedom in the box labeled Numerator degrees of freedom. Type the number of denominator degrees of freedom in the box labeled Denominator degrees
of freedom.
4. Click the button labeled Input constant. In the box, type the cumulative probability for which you
want to find the associated F critical value. Note that since the F test is a right-tailed test, if α = 0.05,
then the appropriate cumulative probability is 0.95.
5. Select OK. The output will appear in the session window.
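As a cross-check on the value Minitab returns, the inverse cumulative probability can be approximated numerically. The following is a rough standard-library Python sketch, not Minitab's algorithm: the names `f_pdf`, `upper_tail` and `f_critical` are hypothetical, the tail area is computed by midpoint-rule integration, and the search is a plain bisection. It assumes a denominator degrees of freedom greater than 2 and is meant for upper-tail critical values such as cumulative probability 0.95.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 numerator and d2 denominator df, for x > 0."""
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_density = (
        0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
               - (d1 + d2) * math.log(d1 * x + d2))
        - math.log(x)
        - log_beta
    )
    return math.exp(log_density)

def upper_tail(c, d1, d2, slices=4000):
    """Approximate P(F > c), using the fact that 1/F has (d2, d1) df."""
    width = (1.0 / c) / slices
    return sum(f_pdf((i + 0.5) * width, d2, d1) for i in range(slices)) * width

def f_critical(cum_prob, d1, d2, lo=1e-6, hi=100.0, iterations=50):
    """Value c with P(F <= c) approximately equal to cum_prob, found by bisection."""
    target_tail = 1.0 - cum_prob
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if upper_tail(mid, d1, d2) > target_tail:
            lo = mid   # tail area still too large, so the critical value is larger
        else:
            hi = mid
    return (lo + hi) / 2

# alpha = 0.05 for a right-tailed test, so the cumulative probability is 0.95.
crit_1_24 = f_critical(0.95, 1, 24)
crit_10_15 = f_critical(0.95, 10, 15)
print("F critical value, df (1, 24): ", round(crit_1_24, 3))
print("F critical value, df (10, 15):", round(crit_10_15, 3))
```

For 1 and 24 degrees of freedom the result should be close to the tabulated value of about 4.26, which is also the square of the two-sided 5% t critical value with 24 degrees of freedom.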