Computer Assignment 5
Due: 04/29/1999
Question 1
The data for this problem is taken from Table 8.1 on page 234 of your textbook. The data
set contains 40 measurements of the heights of one-year-old red pine seedlings in
centimeters. Suppose that the observed heights constitute a simple random sample from a
population with unknown mean µ and unknown standard deviation σ. The file
stat101cp5.dat can be downloaded from the course web page:
http://www.stat.unc.edu/students/owzar/stat101.html
(a) Use PROC MEANS to find the sample mean and sample standard deviation of the
40 measurements.
(b) Use the result from 1(a) to create a 95% confidence interval for the mean µ of the
form
$$\bar{x} \pm t(\alpha/2,\, n-1)\,\frac{s}{\sqrt{n}}.$$
Under what assumptions about the distribution of the tree heights is a confidence
interval of this form justified?
(c) Use PROC MEANS with the CLM option (consult the "Little SAS Book" or
Tutorial VIII for instructions) to compute the confidence interval in 1(b) directly.
(d) Suppose that in a previous set of measurements, the mean height was found to be
1.9 centimeters. Using PROC MEANS with the PRT option, test the hypothesis
that the new mean height is different from the old one (i.e., H0: µ = 1.9 vs. H1: µ ≠
1.9). This procedure gives the p-value of the test given the data. (Note: PRT tests
whether the mean is zero, so make sure to center the data as described in the
tutorial.) Should H0 be rejected at a 0.05 level of significance? What if the level of
significance is 0.005?
(e) Turn in your output.
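One possible layout for the SAS program behind parts (a) through (d) is sketched below. The data set names (pine, stats, tci), the variable names (height, centered), the a:\ path, and the assumption that the file holds nothing but the 40 heights are illustrative guesses, not part of the assignment; adjust them to match the actual file and the tutorial.

```sas
/* Assumption: stat101cp5.dat contains only the 40 heights, separated by
   blanks or line breaks; the path and all names are hypothetical. */
DATA pine;
    INFILE 'a:\stat101cp5.dat';
    INPUT height @@;
    centered = height - 1.9;   /* centered copy for the test in part (d) */
RUN;

/* (a) sample size, mean, and standard deviation */
PROC MEANS DATA=pine N MEAN STD;
    VAR height;
RUN;

/* (b) save the summary statistics, then apply the formula by hand */
PROC MEANS DATA=pine NOPRINT;
    VAR height;
    OUTPUT OUT=stats N=n MEAN=xbar STD=s;
RUN;

DATA tci;
    SET stats;
    tcrit = TINV(0.975, n - 1);   /* t(alpha/2, n-1) with alpha = 0.05 */
    lower = xbar - tcrit*s/SQRT(n);
    upper = xbar + tcrit*s/SQRT(n);
RUN;

PROC PRINT DATA=tci;
    VAR n xbar s tcrit lower upper;
RUN;

/* (c) the same 95% interval straight from PROC MEANS */
PROC MEANS DATA=pine N MEAN STD CLM ALPHA=0.05;
    VAR height;
RUN;

/* (d) T and PRT test H0: mean = 0, so run them on the centered variable */
PROC MEANS DATA=pine N MEAN STD T PRT;
    VAR centered;
RUN;
```

If the p-value printed for part (d) is smaller than the chosen level of significance, H0 is rejected at that level; otherwise it is not.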
Question 2
The data for this problem (stat101cp1.dat) is the Law School data used in previous
assignments. It can be downloaded from the course web page:
http://www.stat.unc.edu/students/owzar/stat101.html.
Use PROC MEANS with the CLM option to construct a 90% confidence interval for the
mean LSAT. Do not forget to set your confidence level 1 − α (consult the "Little SAS
Book" or Tutorial VIII for instructions). Turn in your output.
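As a minimal sketch, the call might look like the following. It assumes the Law School data has already been read into a data set called law with a numeric variable called lsat; both names are hypothetical, so substitute whatever your earlier assignments used.

```sas
/* Assumption: a data set named law with a numeric variable lsat
   already exists from an earlier assignment (hypothetical names). */
PROC MEANS DATA=law N MEAN STD CLM ALPHA=0.10;
    VAR lsat;
RUN;
```

ALPHA=0.10 corresponds to a confidence level of 1 − α = 0.90. If ALPHA= is left out, PROC MEANS uses 0.05 and reports a 95% interval instead.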
Question 3
For this question, we will once again consider the heights of one-year-old red pine
seedlings from Question 1. We are interested in estimating the population proportion of
pine seedlings that are at least 2 cm.
(a) Find the observed proportion, call it p̂, of seedlings in this sample that are at least
2 cm.
(b) Using the observed proportion, construct a 90% confidence interval for the
population proportion of the form
$$\hat{p} \pm z(\alpha/2)\,\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}.$$
You do not have to use SAS for this part of the question.
(c) Turn in your calculated confidence interval from 3(b).
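SAS is not required for this question, but a hand calculation can be checked with a sketch along the following lines. It assumes a data set named pine that holds the 40 heights in a variable named height; these names, and the names tall, atleast2, prop, and propci, are hypothetical.

```sas
/* Assumption: pine is a data set holding the 40 heights in a
   variable named height (hypothetical names). */
DATA tall;
    SET pine;
    atleast2 = (height >= 2);   /* 1 if the seedling is at least 2 cm, else 0 */
RUN;

/* The mean of a 0/1 variable is the sample proportion p-hat */
PROC MEANS DATA=tall NOPRINT;
    VAR atleast2;
    OUTPUT OUT=prop N=n MEAN=phat;
RUN;

DATA propci;
    SET prop;
    z = PROBIT(0.95);   /* z(alpha/2) with alpha = 0.10 */
    lower = phat - z*SQRT(phat*(1 - phat)/n);
    upper = phat + z*SQRT(phat*(1 - phat)/n);
RUN;

PROC PRINT DATA=propci;
    VAR n phat z lower upper;
RUN;
```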
Question 4
The data set for this problem (stat101cp3.dat) can be downloaded from the course
web page: http://www.stat.unc.edu/students/owzar/stat101.html. The data set contains 50 entries.
Each entry contains two numbers. The first number is the sample average of 75
independent normal random variables, each with mean zero and variance one. The
second number is the sample standard deviation of the same 75 random variables.
(a) Construct a 90% confidence interval for each of the 50 averages using the
formula
$$\bar{x} \pm z(\alpha/2)\,\frac{s}{\sqrt{n}},$$
where n = 75 is the number of observations behind each average. You will need to
create two new variables: one for the lower bound of the confidence interval, and
one for the upper bound of the confidence interval. To check your output, use
PROC PRINT.
(b) Construct a new variable that is 1 if the actual mean of zero is within these bounds
and 0 otherwise.
(c) Use PROC MEANS to find the number of the calculated confidence intervals that
contain the actual mean.
(d) Turn in your list of 90% confidence intervals, from 4(a), and the number of
intervals that contain the mean, from 4(c).
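NOTE: The code shown below can be used to carry out 4(a) and 4(b).
DATA ci;
    INFILE 'a:\stat101cp3.dat';
    INPUT xbar s;
    /* 90% interval: z(0.05) = 1.645; each average is based on n = 75 values */
    lower = xbar - 1.645*s/SQRT(75);
    upper = xbar + 1.645*s/SQRT(75);
    /* good = 1 when the interval contains the true mean of zero */
    IF (lower < 0 AND upper > 0) THEN good = 1;
    ELSE good = 0;
RUN;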
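The remaining parts, the listing for 4(a) and the count for 4(c), could be produced with something like the sketch below. It assumes the DATA step in the note above has already been run, so that the data set ci and its variables exist.

```sas
/* Assumes the DATA step from the note above has created the data set ci */
PROC PRINT DATA=ci;
    VAR xbar s lower upper good;
RUN;

/* SUM of the 0/1 variable good = number of intervals that contain zero;
   MEAN = observed proportion of intervals that contain zero */
PROC MEANS DATA=ci N SUM MEAN;
    VAR good;
RUN;
```

With 90% intervals, roughly 45 of the 50 intervals would be expected to contain the true mean on average, though the count in any one data set will vary.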