Exercises
1. Z ~ N(0, 1). Find P(-1.96 < Z < 1.96).
2. Z ~ N(0, 1). Find the value of c such that P(-c < Z < c) = 0.95.
3. Z ~ N(0, 1). Find the value of c such that P(-c < Z < c) = 0.90.
4. X ~ N(500, 15). Find the values of c and d such that
P(c < X < d) = 0.95.
week 9
1
5. X ~ N(μ, σ). Find the values of c and d (in terms of μ and σ)
such that P(c < X < d) = 0.95.
6. X ~ N(μ, σ). Find the values of c and d (in terms of μ and σ)
such that P(c < X < d) = 0.90.
7. X ~ N(500, 15). Let X̄ be the mean of a random sample of
size 9. Find the values of c and d such that
P(c < X̄ < d) = 0.95.
8. X ~ N(μ, σ). Let X̄ be the mean of a random sample of size n.
Find the values of c and d such that P(c < X̄ < d) = 0.95.
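The numerical exercises above can be checked with a short Python sketch. It uses only the standard library's `statistics.NormalDist`. It assumes that the second argument in N(500, 15) is the standard deviation (not the variance) and that c and d are chosen symmetrically about the mean; the helper name `central_cutoff` is made up for this illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Exercise 1: P(-1.96 < Z < 1.96)
prob = Z.cdf(1.96) - Z.cdf(-1.96)

def central_cutoff(level):
    """Return c such that P(-c < Z < c) = level (hypothetical helper)."""
    return Z.inv_cdf(0.5 + level / 2)

# Exercises 2 and 3
c95 = central_cutoff(0.95)
c90 = central_cutoff(0.90)

# Exercise 4: X ~ N(500, 15), ASSUMING 15 is the standard deviation
mu, sigma = 500, 15
c, d = mu - c95 * sigma, mu + c95 * sigma

# Exercise 7: the mean of n = 9 observations has standard deviation sigma / sqrt(n)
n = 9
se = sigma / n ** 0.5
c_bar, d_bar = mu - c95 * se, mu + c95 * se

print(f"P(-1.96 < Z < 1.96) = {prob:.4f}")
print(f"c (95%) = {c95:.3f}, c (90%) = {c90:.3f}")
print(f"Exercise 4: c = {c:.1f}, d = {d:.1f}")
print(f"Exercise 7: c = {c_bar:.1f}, d = {d_bar:.1f}")
```

Exercises 5, 6 and 8 follow the same pattern with symbols in place of numbers: c and d are μ ∓ c·σ for a single observation and μ ∓ c·σ/√n for the sample mean.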
Point Estimates and CI
• A basic tool in statistical inference is the point estimate of a
population parameter. However, an estimate without an
indication of its variability is of little value.
• Example:
Parameter    Estimate    Std. Error
μ            X̄
σ²           S²
p            p̂
• A level C confidence interval for a parameter is an interval
computed from sample data by a method that has probability C
of producing an interval containing the true value of the
parameter.
Confidence interval for the population mean
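• Choose an SRS of size n from a population having unknown
mean μ and known stdev. σ. A level C confidence interval for μ
is an interval of the form
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}, \quad \text{i.e.} \quad \left( \bar{x} - z^* \frac{\sigma}{\sqrt{n}},\ \bar{x} + z^* \frac{\sigma}{\sqrt{n}} \right)$$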
• Here z* is the value on the standard normal curve with area C
between −z* and z*. The interval is exact when the population
distribution is normal and approximately correct for large n in
other cases.
• In general CIs have the form: Estimate ± margin of error
• In the above case,
Margin of error = $m = z^* \frac{\sigma}{\sqrt{n}}$
• Note, in the above formula for the CI for the population mean,
$\sigma/\sqrt{n}$ is the stdev. of the sample mean X̄ (this is also known as
the std. error of the sample mean X̄), so the CI can also be written
as
x̄ ± z*·Std.Error(X̄)
• The width of any CI is L = 2m i.e. twice the margin of error.
• Here are three ways to reduce the margin of error (and the
width of the CI):
  - Use a lower level of confidence (smaller C).
  - Increase the sample size n.
  - Reduce σ (usually not possible).
Sample size for desired margin of error
• The CI for the population mean will have a specified margin of
error m when the sample size is
$$n = \left( \frac{z^* \sigma}{m} \right)^2$$
• Example:
A limnologist wishes to estimate the mean phosphate content
per unit volume of lake water. It is known from previous
studies that the stdev. has a fairly stable value of 4 mg. How
many water samples must the limnologist analyze to be 90%
certain that the error of estimation does not exceed 0.8 mg?
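As a rough check of the sample-size formula on this example, here is a minimal Python sketch. It reads "90% certain" as a 90% confidence level and rounds n up to the next whole sample; the function name `sample_size_for_mean` is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, level):
    """Smallest n with z* * sigma / sqrt(n) <= margin (hypothetical helper)."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # critical value for level C
    return math.ceil((z_star * sigma / margin) ** 2)

# Limnologist example: sigma = 4 mg, margin of error 0.8 mg, 90% confidence
n_needed = sample_size_for_mean(sigma=4.0, margin=0.8, level=0.90)
print(n_needed)
```

With z* ≈ 1.645 the formula gives (1.645 · 4 / 0.8)² ≈ 67.6, so under these assumptions 68 water samples are needed.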
Exercise 6.10 on p397 IPS
• You want to rent an unfurnished one-bedroom apartment for
next semester. The mean monthly rent for a random sample of
10 apartments advertised in the local newspaper is $580.
Assume that the stdev. is $90. Find a 95% CI for the mean
monthly rent for unfurnished one-bedroom apartments
available for rent in this community.
• How large a sample of one-bedroom apartments would be
needed to estimate the mean µ within ±$20 with 90%
confidence?
Exercise 6.19 on p398 IPS
• The question gives the data on the Degree of Reading Power
(DRP) scores for 44 students. A 95% CI for the population mean
score is given in the MINITAB output below.
DRP Scores
40 26 39 47 19 26 52 25 35 47 35
48 14 35 35 22 42 34 33 33 18 15
29 41 25 44 34 51 43 40 41 27 46
38 49 14 27 31 28 54 19 46 52 45

Z Confidence Intervals
The assumed sigma = 11.0
Variable     N    Mean   StDev   SE Mean   95.0 % CI
DRP Scor    44   35.09   11.19      1.66   (31.84, 38.34)
• MINITAB Command
Stat > Basic Statistics > 1 Sample Z and select ‘Confidence interval’
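The MINITAB interval can be reproduced by hand. The sketch below is one way to do it in Python with the standard library only, using the 44 DRP scores listed on this slide and the assumed σ = 11.0; the variable names are illustrative.

```python
from statistics import NormalDist, mean

# The 44 DRP scores from the slide
drp = [40, 26, 39, 47, 19, 26, 52, 25, 35, 47, 35,
       48, 14, 35, 35, 22, 42, 34, 33, 33, 18, 15,
       29, 41, 25, 44, 34, 51, 43, 40, 41, 27, 46,
       38, 49, 14, 27, 31, 28, 54, 19, 46, 52, 45]

sigma = 11.0   # assumed known population stdev, as in the MINITAB output
level = 0.95

n = len(drp)
x_bar = mean(drp)
se = sigma / n ** 0.5                            # std. error of the sample mean
z_star = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
margin = z_star * se
ci = (x_bar - margin, x_bar + margin)

print(f"n = {n}, mean = {x_bar:.2f}, SE mean = {se:.2f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Rounded to two decimals this agrees with the MINITAB output: mean 35.09, SE mean 1.66, CI (31.84, 38.34).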
Exercise
A random sample of 85 students in Chicago city high schools takes
a course designed to improve SAT scores. Based on
these students, a 90% CI for the mean improvement in SAT
scores for all Chicago high school students is computed as
(72.3, 91.4) points.
Which of the following statements are true?
a) 90% of the students in the sample improved their scores by
between 72.3 and 91.4 points.
b) 90% of the students in the population improved their scores
by between 72.3 and 91.4 points.
c) A 95% CI will contain the value 72.3.
d) The margin of error of the 90% CI above is 9.55.
e) A 90% CI based on a sample of 340 (85 × 4) students will have
margin of error 9.55/4.
CIs for the population proportion p
• Choose an SRS of size n from a population having unknown
proportion p of successes. An approximate level C confidence
interval for p is
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
Again z* is the value on the standard normal curve with area C
between -z* and z*.
• Note 1: Std. error of the sample proportion is $SE(\hat{p}) = \sqrt{\hat{p}(1 - \hat{p})/n}$
• Note 2: Margin of error of this CI is $m = z^* \sqrt{\hat{p}(1 - \hat{p})/n}$
• The above CI can be written as p̂ ± z*·SE(p̂)
• Use this interval when the number of successes and the number of
failures are both at least 15.
• When the sample size is small, use either tables of exact CIs
or approximate CIs based on Wilson’s estimate, given by
$$\tilde{p} \pm z^* SE(\tilde{p})$$
where
$$\tilde{p} = \frac{X + 2}{n + 4} \quad \text{and} \quad SE(\tilde{p}) = \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}}$$
• Note, Wilson’s estimate is called the plus four estimate.
• Read Pages 539-540 in IPS.
Example
• In a sample of 400 computer memory chips made at Digital
Devices, Inc., 40 were found to be defective. Give a 95%
confidence interval for the proportion of defective chips in the
population from which the sample was taken.
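One way to work this example is sketched below in Python (standard library only). The function `proportion_ci` and its `plus_four` flag are hypothetical names for this illustration; the flag switches from the large-sample interval to the plus four (Wilson) interval described on the previous slide, by adding 2 successes and 4 observations.

```python
from statistics import NormalDist

def proportion_ci(successes, n, level, plus_four=False):
    """Approximate level-C z interval for a proportion (hypothetical helper).

    With plus_four=True the plus four estimate (X + 2) / (n + 4) is used.
    """
    if plus_four:
        successes, n = successes + 2, n + 4
    p = successes / n
    se = (p * (1 - p) / n) ** 0.5
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    return p - z_star * se, p + z_star * se

# 40 defective chips out of 400, 95% confidence
large = proportion_ci(40, 400, 0.95)
wilson = proportion_ci(40, 400, 0.95, plus_four=True)

print(f"large-sample CI: ({large[0]:.4f}, {large[1]:.4f})")
print(f"plus four CI:    ({wilson[0]:.4f}, {wilson[1]:.4f})")
```

Here there are 40 successes and 360 failures, both well above 15, so the large-sample interval applies; the plus four interval is shown only for comparison and differs by a fraction of a percentage point.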
Sample size for desired margin of error
• The level C confidence interval will have margin of error
approximately equal to the specified margin of error m when
the sample size n is
$$n = \left( \frac{z^*}{m} \right)^2 p^*(1 - p^*)$$
• Here z* is the critical value for the confidence level C and p*
is a guessed value for the proportion of successes in a future
sample.
• The margin of error will be less than or equal to m if p* is
chosen to be 0.5. The sample size required is then given by
$$n = \left( \frac{z^*}{2m} \right)^2$$
Example
The Gallup Poll asked a sample of 1785 U.S. adults, “Did you,
yourself, happen to attend church or synagogue in the last 7
days?” Of the respondents, 750 said “Yes.” Suppose (it is not,
in fact, true) that Gallup's sample was an SRS.
(a) Give a 99% confidence interval for the proportion of all U.S.
adults who attended church or synagogue during the week
preceding the poll.
(b) Do the results provide good evidence that less than half of the
population attended church or synagogue?
(c) How large a sample would be required to obtain a margin of
error of 0.01 in a 99% confidence interval for the proportion
who attend church or synagogue? (Use Gallup's result as the
guessed value of p).
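The calculations for parts (a) and (c) can be sketched as follows, treating the sample as an SRS as the example asks. This is an illustrative Python sketch using only the standard library; the variable names are not from the slides, and sample sizes are rounded up to whole respondents.

```python
import math
from statistics import NormalDist

x, n, level = 750, 1785, 0.99
z_star = NormalDist().inv_cdf(0.5 + level / 2)   # about 2.576 for 99%

# (a) large-sample CI for p
p_hat = x / n
se = math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - z_star * se, p_hat + z_star * se)

# (c) sample size for margin of error m, using Gallup's p_hat as the guess p*
m = 0.01
n_guess = math.ceil((z_star / m) ** 2 * p_hat * (1 - p_hat))
# conservative version with p* = 0.5
n_conservative = math.ceil((z_star / (2 * m)) ** 2)

print(f"p_hat = {p_hat:.4f}, 99% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"n with p* = p_hat: {n_guess}, n with p* = 0.5: {n_conservative}")
```

For part (b), note whether the whole interval lies below 0.5; a one-sided test of H0: p = 0.5 against Ha: p < 0.5 would address the question more directly.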
Exercise
Assume that a U.S. study and a Canadian study to estimate the
proportion of adults in favour of capital punishment are
conducted using simple random samples (not really practical).
Assume the true unknown proportions in the 2 countries are
fairly similar. The U.S. survey uses a sample 9 times bigger
than the Canadian sample. Both samples are quite large. The
U.S. population is 9 times bigger than the Canadian
population. The Canadian confidence interval will be:
a) 9 times wider than the U.S. confidence interval
b) 3 times wider than the U.S. confidence interval
c) the same width as the U.S. confidence interval
d) 9 times smaller than the U.S. confidence interval
e) 3 times smaller than the U.S. confidence interval
Statistical tests for the population mean (σ known)
• A significance test is a formal procedure for comparing
observed data with a hypothesis whose truth we want to
assess. The hypothesis is a statement about the parameters in a
population or model.
• Null hypothesis
The statement being tested in a test of significance is called the
null hypothesis. The test of significance is designed to assess
the strength of the evidence against the null hypothesis.
Usually the null hypothesis is a statement of “no effect” or “no
difference”.
• We abbreviate “null hypothesis” as H0 .
Example
Each of the following situations requires a significance test about a
population mean μ. State the appropriate null hypothesis H0 and alternative
hypothesis Ha in each case.
(a) The mean area of the several thousand apartments in a new development is
advertised to be 1250 square feet. A tenant group thinks that the apartments
are smaller than advertised. They hire an engineer to measure a sample of
apartments to test their suspicion.
(b) Larry's car averages 32 miles per gallon on the highway. He
now switches to a new motor oil that is advertised as increasing gas
mileage. After driving 3000 highway miles with the new oil, he wants to
determine if his gas mileage actually has increased.
(c) The diameter of a spindle in a small motor is supposed to be 5 millimeters.
If the spindle is either too small or too large, the motor will not perform
properly. The manufacturer measures the diameter in a sample of motors to
determine whether the mean diameter has moved away from the target.
Test Statistic
• The test is based on a statistic that estimates the parameter that
appears in the hypotheses. Usually this is the same estimate we
would use in a confidence interval for the parameter. When H0
is true, we expect the estimate to take a value near the
parameter value specified in H0.
• Values of the estimate far from the parameter value specified
by H0 give evidence against H0. The alternative hypothesis
determines which directions count against H0.
• A test statistic measures compatibility between the null
hypothesis and the data.
• We use it for the probability calculation that we need for our
test of significance.
• It is a random variable with a distribution that we know.
Example
• An air freight company wishes to test whether or not the mean
weight of parcels shipped on a particular route exceeds 10
pounds. A random sample of 49 shipping orders was examined
and found to have an average weight of 11 pounds. Assume that the
stdev. of the weights (σ) is 2.8 pounds.
• The null and alternative hypotheses in this problem are:
H0: μ = 10 ;
Ha: μ > 10 .
• The test statistic for this problem is the standardized version of X̄:
$$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}, \quad \mu_0 = 10$$
• Decision: ?
P-value and Significance level
• The probability computed under the assumption that H0 is true,
that the test statistic would take a value as extreme or more
extreme than that actually observed is called the P-value of the
test. The smaller the P-value the stronger the evidence against H0
provided by the data.
• The decisive value of P is called the significance level. It is
denoted by α.
• Statistical significance
If the P-value is as small as or smaller than α, we reject H0 and say
that the data are statistically significant at level α.
• The P-value is the smallest level α at which the data are
significant.
Z Test for a population mean (σ known)
• To test the hypothesis H0: µ = µ0 based on an SRS of size n
from a population with unknown mean µ and known stdev σ,
compute the test statistic
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
• In terms of a standard Normal variable Z, the P-value for the
test of H0 against
Ha : µ > µ0 is P( Z ≥ z )
Ha : µ < µ0 is P( Z ≤ z )
Ha : µ ≠ µ0 is 2·P( Z ≥ |z| )
• These P-values are exact if the population distribution is
normal and are approximately correct for large n in other
cases.
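The three P-value rules above can be collected in one small function. The following is a minimal Python sketch (standard library only); the function name `z_test` and the labels "greater", "less" and "two-sided" are choices made for this illustration. It is applied to the air freight example from the earlier slide (x̄ = 11, μ0 = 10, σ = 2.8, n = 49, Ha: μ > 10).

```python
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, alternative):
    """Return (z, P-value) for H0: mu = mu0 with known sigma (hypothetical helper).

    alternative is 'greater' (Ha: mu > mu0), 'less' (Ha: mu < mu0)
    or 'two-sided' (Ha: mu != mu0).
    """
    z = (x_bar - mu0) / (sigma / n ** 0.5)
    Z = NormalDist()
    if alternative == "greater":
        p = 1 - Z.cdf(z)                 # P(Z >= z)
    elif alternative == "less":
        p = Z.cdf(z)                     # P(Z <= z)
    elif alternative == "two-sided":
        p = 2 * (1 - Z.cdf(abs(z)))      # 2 * P(Z >= |z|)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p

# Air freight example
z, p = z_test(x_bar=11, mu0=10, sigma=2.8, n=49, alternative="greater")
print(f"z = {z:.2f}, P-value = {p:.4f}")
```

In this example z = (11 − 10)/(2.8/7) = 2.5 and the P-value is about 0.006, which is small enough to reject H0 at the usual 5% and 1% levels.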
Critical value approach
• We can base our test conclusions on a fixed significance level
α without computing the P-value.
• For this we need to find a critical value z* from the standard
normal distribution with a specified tail area (to the right or
left depending on Ha). This tail area is called the rejection
region.
• If the test statistic falls in the rejection region we reject H0 and
conclude that the data are statistically significant at level α.
• A P-value is more informative than a reject-or-not finding at a
fixed significance level because it tells us about the strength
of the evidence we found against H0.
Example
• The Pfft Light Bulb Company claims that the mean life of its 2
watt bulbs is 1300 hours. Suspecting that the claim is too
high, Nalph Rader gathered a random sample of 64 bulbs and
tested each. He found the average life to be 1295 hours. Test
the company's claim using α = 0.01. Assume σ = 20 hours.
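A sketch of the critical value approach for this example, in Python with the standard library only. It assumes the hypotheses are H0: μ = 1300 against Ha: μ < 1300 (the claim is suspected to be too high), so the rejection region is the lower tail of area α; the variable names are illustrative.

```python
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 1295, 1300, 20, 64, 0.01

# Test statistic: standardized sample mean under H0
z = (x_bar - mu0) / (sigma / n ** 0.5)

# Lower-tail critical value: P(Z <= z_crit) = alpha (ASSUMES Ha: mu < mu0)
z_crit = NormalDist().inv_cdf(alpha)
reject = z <= z_crit

# The P-value gives the same decision: reject exactly when p_value <= alpha
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
print(f"P-value = {p_value:.4f}")
```

Here z = −5/2.5 = −2, which does not fall below the critical value of about −2.326, so under these assumptions H0 is not rejected at α = 0.01. The P-value of about 0.023 shows the result would be significant at the 5% level, which illustrates why the P-value is more informative than a fixed-level decision.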
Exercise
• A standard intelligence examination has been given for several
years with an average score of 80 and a standard deviation of
7. If 25 students, taught with special emphasis on reading skill,
obtain a mean grade of 83 on the examination, is there reason
to believe that the special emphasis changes the result on the
test? Use α = 0.05.
Exercise 6.57 on p421 IPS
• The question gives the data on the Degree of Reading Power
(DRP) scores for 44 students. The MINITAB output for the test is
given below.
Z-Test
Test of mu = 32.00 vs mu > 32.00
The assumed sigma = 11.0
Variable     N    Mean   StDev   SE Mean      Z       P
DRP Scor    44   35.09   11.19      1.66   1.86   0.031
• MINITAB Command
Stat > Basic Statistics > 1 Sample Z and select ‘Test mean’
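The Z and P columns of this output can be reproduced from the summary values it reports. A minimal Python sketch (standard library only), using N = 44, Mean = 35.09, the assumed σ = 11.0 and the one-sided alternative mu > 32:

```python
from statistics import NormalDist

n, x_bar, sigma, mu0 = 44, 35.09, 11.0, 32.0   # values taken from the MINITAB output

se = sigma / n ** 0.5                  # SE Mean
z = (x_bar - mu0) / se                 # Z statistic
p_value = 1 - NormalDist().cdf(z)      # upper-tail P-value for Ha: mu > 32

print(f"SE Mean = {se:.2f}, Z = {z:.2f}, P = {p_value:.3f}")
```

Rounded as MINITAB does, this gives SE Mean 1.66, Z 1.86 and P 0.031.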
Confidence Intervals and two-sided tests
• A level α two-sided significance test rejects a hypothesis
H0: μ = μ0 exactly when the value μ0 falls outside the level 1 − α
confidence interval for μ.
• Example
For the intelligence examination exercise above, a 95% CI is
83 ± 1.96·(7/5) = (80.256, 85.744)
The value 80 is not in this interval and so we reject H0: μ = 80
at the 5% level of significance.
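The agreement between the interval and the two-sided test can be checked numerically. The sketch below (Python, standard library only, illustrative variable names) computes both for the intelligence examination exercise, with x̄ = 83, μ0 = 80, σ = 7, n = 25 and α = 0.05.

```python
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 83, 80, 7, 25, 0.05
se = sigma / n ** 0.5

# Level 1 - alpha confidence interval for mu
z_star = NormalDist().inv_cdf(1 - alpha / 2)
ci = (x_bar - z_star * se, x_bar + z_star * se)
outside_ci = not (ci[0] <= mu0 <= ci[1])

# Two-sided z test of H0: mu = mu0 at level alpha
z = (x_bar - mu0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
rejects = p_value <= alpha

print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), mu0 outside CI: {outside_ci}")
print(f"z = {z:.3f}, P-value = {p_value:.4f}, reject H0: {rejects}")
```

Both routes lead to the same decision: 80 lies outside the interval, and the two-sided P-value (about 0.03) is below 0.05.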
Large sample signif. tests for a population proportion
• Draw an SRS of size n from a large population with unknown
proportion p of successes. To test the null hypothesis
H0: p = p0, compute the z statistic
$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$
• In terms of a standard normal random variable Z, the
approximate P-value for the test of H0 against
Ha : p > p0 is P( Z ≥ z )
Ha : p < p0 is P( Z ≤ z )
Ha : p ≠ p0 is 2·P( Z ≥ |z| )
• Use the large-sample z test as long as the expected number of
successes, np0, and the expected number of failures, n(1 − p0),
are both greater than 10.
Example
Leroy, a starting player for a major college basketball team, made
only 38.4% of his free throws last season. During the summer he
worked on developing a softer shot in the hope of improving his
free-throw accuracy. In the first eight games of this season Leroy
made 25 free throws in 40 attempts. Let p be his probability of
making each free throw he shoots this season.
(a) State the null hypothesis H0 that Leroy's free-throw probability has
remained the same as last year and the alternative Ha that his work
in the summer resulted in a higher probability of success.
(b) Calculate the z statistic for testing H0 versus Ha.
(c) Do you accept or reject H0 for α = 0.05? Find the P-value.
(d) Give a 90% confidence interval for Leroy's free-throw success
probability for the new season. Are you convinced that he is now a
better free-throw shooter than last season?
(e) What assumptions are needed for the validity of the test and
confidence interval calculations that you performed?
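Parts (b) to (d) can be worked with the large-sample formulas as in the sketch below (Python, standard library only, illustrative variable names). It takes H0: p = 0.384 against Ha: p > 0.384. Note that the test uses p0 in the standard error while the confidence interval uses p̂.

```python
from statistics import NormalDist

x, n, p0, alpha = 25, 40, 0.384, 0.05
p_hat = x / n

# Check the large-sample condition: n*p0 and n*(1 - p0) both greater than 10
condition_ok = n * p0 > 10 and n * (1 - p0) > 10

# (b) z statistic, with the standard error computed under H0
z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5

# (c) upper-tail P-value for Ha: p > p0
p_value = 1 - NormalDist().cdf(z)
reject = p_value <= alpha

# (d) 90% large-sample CI, with the standard error computed from p_hat
z_star = NormalDist().inv_cdf(0.95)
se_hat = (p_hat * (1 - p_hat) / n) ** 0.5
ci = (p_hat - z_star * se_hat, p_hat + z_star * se_hat)

print(f"condition ok: {condition_ok}, z = {z:.2f}, P-value = {p_value:.4f}")
print(f"reject H0 at alpha = {alpha}: {reject}")
print(f"90% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

These are normal-approximation results, so they need not match output from software that uses exact binomial calculations.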
• MINITAB gives the exact p-value for the test.
• Command: Stat > Basic Statistics > 1 Proportion
• MINITAB output for the above example is given below.
Test and Confidence Interval for One Proportion
Test of p = 0.384 vs p > 0.384
                                                       Exact
Sample    X    N   Sample p   90.0 % CI               P-Value
1        25   40   0.625000   (0.482752, 0.752705)    0.002