STAT 211
Handout 7
(Chapter 7: Statistical Intervals based on a Single Sample)
A point estimate of a population characteristic is a single number that is based on sample data and
represents a plausible value of the characteristic.
The best statistic (MVUE) is the unbiased statistic with the smallest standard deviation.
A confidence interval for a population characteristic (parameter) is an interval of plausible values
for the characteristic. It is constructed so that, with a chosen degree of confidence, the value of the
characteristic will be captured inside the interval. The confidence level, \(1-\alpha\), associated with a
confidence interval estimate is the success rate of the method used to construct the interval.
If we repeatedly sample from a population and calculate a confidence interval each time with the
data available, then over the long run the proportion of the confidence intervals that actually contain
the true value of the population characteristic will be \(100(1-\alpha)\%\) (95%, 90%, or 99% for \(\alpha\)=0.05,
0.10, or 0.01, respectively).
The general form of a confidence interval:
(point estimate for a specified statistic) \(\pm\) (critical value) \(\times\) (standard error for the point estimate).
What is the best estimator for each of the parameters \(\mu\), \(\sigma^2\), p? _____________
The Empirical Rule tells us that about 95% of all values of \(\bar{x}\) will be within about 2 (more precisely, 1.96)
standard deviations of the mean.
What is \(1-\alpha\) when you compute a 95% confidence interval? ___________________
What is \(\alpha\) when you compute a 95% confidence interval? ___________________
What is \(z_{\alpha/2}\) when you compute a 95% confidence interval? ___________________
Confidence Interval for a Population Mean, \(\mu\)
Suppose that the parameter of interest is the population mean, \(\mu\), and that
a. the population distribution is normal
b. the value of the population standard deviation \(\sigma\) is known
Let X1, X2, ...., Xn be a random sample. Then a \(100(1-\alpha)\%\) confidence interval for \(\mu\) is
\[ \left( \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right) \quad \text{where} \quad P\left( -z_{\alpha/2} < \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} < z_{\alpha/2} \right) = 1-\alpha \]
Thus, in 95% of all possible samples, \(\mu\) will be captured in the following calculated confidence
interval: \(\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}\)
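The long-run interpretation of the confidence level can be checked with a small simulation. The sketch below is only an illustration: the population values (mean 10, standard deviation 2), the sample size 25, the number of trials, and the function name `coverage` are all assumptions chosen for the demonstration, not values from the handout. It draws repeated normal samples, builds the known-sigma z interval each time, and reports the fraction of intervals that contain the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(mu=10.0, sigma=2.0, n=25, level=0.95, trials=10_000, seed=1):
    """Fraction of simulated z intervals (sigma known) that contain mu."""
    rng = random.Random(seed)                      # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value z_{alpha/2}
    bound = z * sigma / sqrt(n)                    # half-width of every interval
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - bound <= mu <= xbar + bound:
            hits += 1
    return hits / trials

print(coverage())            # should be close to 0.95
print(coverage(level=0.90))  # should be close to 0.90
```

The observed fractions will not equal the nominal levels exactly, since they are themselves estimates from a finite number of simulated samples.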
Choosing the sample size: The bound on the error of estimation is \(B = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\); that is, \(\bar{x}\) will be within
\(z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\) of \(\mu\). The sample size required to estimate a population mean \(\mu\) to within an amount B
with \(100(1-\alpha)\%\) confidence is \(n = \left( \dfrac{z_{\alpha/2}\,\sigma}{B} \right)^2\). The same formula can be written using
the interval width, \(w = 2 z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\); then \(n = \left( \dfrac{2 z_{\alpha/2}\,\sigma}{w} \right)^2\).
Example 1: Each of the following is a confidence interval for the true average amount of time spent by
patients using a physical therapy device, computed from the sample data: (10.90, 25.44), (13.58, 22.76)
(a) What is the value of the sample mean time spent by the patients using the physical therapy device?
(b) The confidence level for one of these intervals is 95% and for the other is 99%. Which of the
intervals has the 95% confidence level, and why?
Example 2: Suppose we want to estimate the average number of violent acts on TV per hour for a specific
network. Data were collected by viewing a random selection of 50 prime time hours, and an average of
11.7 violent acts per hour was recorded. Suppose it is known that \(\sigma\)=5.
The 95% CI for \(\mu\) is (10.3141 , 13.0859)
The 95% confidence interval for \(\mu\) if 100 prime time hours had been viewed, with the same mean
and variance, is (10.72 , 12.68)
The 90% CI for \(\mu\) is (10.5368 , 12.8632)
The width of the 90% confidence interval for \(\mu\) is 2.3264
The bound on the error of estimation of the 90% confidence interval for \(\mu\) is 1.1632
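The intervals in Example 2 can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `z_interval` is hypothetical. It computes the critical value exactly rather than reading it from a table, so the last decimal place may differ slightly from answers obtained with rounded table values such as 1.645.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, level):
    """Two-sided z interval for a population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value z_{alpha/2}
    bound = z * sigma / sqrt(n)                    # bound on the error of estimation
    return xbar - bound, xbar + bound

# Example 2: xbar = 11.7, sigma = 5, for the three cases listed above
for n, level in [(50, 0.95), (100, 0.95), (50, 0.90)]:
    lo, hi = z_interval(11.7, 5, n, level)
    print(f"n={n}, level={level}: ({lo:.4f}, {hi:.4f}), "
          f"width={hi - lo:.4f}, bound={(hi - lo) / 2:.4f}")
```

Doubling the sample size from 50 to 100 narrows the 95% interval, and lowering the confidence level from 95% to 90% narrows it as well.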
Example 3: Investigators would like to estimate the average taxable income of apartment dwellers
to within $500, using a 95% CI. Suppose that previous studies show that the standard deviation is
$8000. How many people should they study? (Answer: 984)
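The sample size calculation in Example 3 can be sketched as follows. The function name `sample_size_for_mean` is hypothetical. The result of the formula is rounded up, because a fractional number of people cannot be sampled and rounding down would leave the bound slightly larger than requested.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, bound, level):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= bound."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((z * sigma / bound) ** 2)

# Example 3: sigma = 8000, bound B = 500, 95% confidence
print(sample_size_for_mean(8000, 500, 0.95))  # 984
```

If the requirement is stated as an interval width w instead of a bound B, the same function can be called with `bound = w / 2`.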
Large Sample Confidence Interval for \(\mu\)
Suppose that the parameter of interest is the population mean, \(\mu\), and that
a. X1, X2, ..., Xn is a random sample from a population distribution with mean \(\mu\) and standard
deviation \(\sigma\).
b. For a large sample size n, the CLT implies that \(\bar{X}\) has approximately a normal distribution for
any population distribution.
c. The value of the population standard deviation \(\sigma\) may not be known. Instead, the value of the
sample standard deviation s may be known.
If n is sufficiently large (n>40), a \(100(1-\alpha)\%\) large sample confidence interval for \(\mu\) is
\[ \left( \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} \right) \quad \text{where} \quad P\left( \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} \right) \approx 1-\alpha \]
Example 4: One method for solving the electric power shortage employs the construction of
floating nuclear power plants located a few miles offshore in the ocean. Because there is concern
about the possibility of a ship collision with the floating plant, an estimate of the density of ship traffic in
the area is needed. The number of ships passing within 10 miles of the proposed power-plant
location per day, recorded for 60 days during July and August, had a sample mean and variance of
7.2 and 8.8, respectively.
(a) Find a 98% confidence interval for the mean number of ships passing within 10 miles of the
proposed power-plant location during any one-day time period. (Answer: (6.3077, 8.0923))
(b) Suppose a precision of \(\pm 1\) ship is desired in the 98% confidence interval
for the mean number of ships passing within 10 miles of the proposed power-plant location
during any one-day time period. What should the sample size be? (Answer: 48)
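The arithmetic behind Example 4 can be sketched in Python. One assumption is made here and should be read as such: the stated answer in part (a) is reproduced when the rounded table value 2.33 is used for the 98% critical value, so the code shows both that rounded value and the exact one. The names `large_sample_interval`, `z_table`, and `z_exact` are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

n, xbar, variance = 60, 7.2, 8.8
s = sqrt(variance)                     # sample standard deviation

z_exact = NormalDist().inv_cdf(0.99)   # 98% two-sided: alpha/2 = 0.01
z_table = 2.33                         # rounded table value (assumed to match the stated answer)

def large_sample_interval(xbar, s, n, z):
    """Large-sample interval xbar +/- z * s / sqrt(n)."""
    bound = z * s / sqrt(n)
    return xbar - bound, xbar + bound

lo, hi = large_sample_interval(xbar, s, n, z_table)
print(f"with z = 2.33:   ({lo:.4f}, {hi:.4f})")
lo_exact, hi_exact = large_sample_interval(xbar, s, n, z_exact)
print(f"with exact z:    ({lo_exact:.4f}, {hi_exact:.4f})")

# (b) sample size for a bound of 1 ship, using s as the planning value for sigma
sample_size = ceil((z_table * s / 1.0) ** 2)
print(sample_size)
```

The two intervals differ only in the third decimal place, and both critical values lead to the same rounded-up sample size in part (b).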
Example 5: I want to see how long on average, it takes Drano to unclog a sink. In a recent
commercial, the stated claim was that it takes on average, 15 minutes. I wanted to see if that claim
was true, so I tested Drano on 64 randomly selected sinks. I found that it took an average of 18
minutes with standard deviation of 2.5 minutes. Was their claim false?
99% CI for \(\mu\) is (17.1953 , 18.8047)
90% CI for \(\mu\) is (17.4859 , 18.5141)
What is different in one-sided confidence intervals? Discussion
Example 6: Determine the confidence level for each of the following large sample one-sided
confidence bounds:
(a) Upper bound: \(\bar{x} + 0.93\,\dfrac{s}{\sqrt{n}}\) (Answer: 0.8238)
(b) Lower bound: \(\bar{x} - 1.75\,\dfrac{s}{\sqrt{n}}\) (Answer: 0.9599)
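For a one-sided large-sample bound, the whole error probability sits in one tail, so the confidence level is the standard normal probability below the multiplier. A minimal sketch for Example 6 follows; the function name `one_sided_level` is hypothetical.

```python
from statistics import NormalDist

def one_sided_level(z):
    """Confidence level of a one-sided bound with multiplier z, i.e. P(Z < z)."""
    return NormalDist().cdf(z)

print(round(one_sided_level(0.93), 4))  # part (a)
print(round(one_sided_level(1.75), 4))  # part (b)
```

By contrast, a two-sided interval with the same multiplier z would have the lower confidence level 2 P(Z < z) - 1.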
A General Large Sample Confidence Interval
When the estimator \(\hat{\theta}\) satisfies the following properties,
a. The estimator has approximately a normal distribution
b. It is at least approximately unbiased
c. The standard deviation of the estimator, \(\sigma_{\hat{\theta}}\), is known
the confidence interval for \(\theta\) can be constructed as \(\hat{\theta} \pm z_{\alpha/2}\,\sigma_{\hat{\theta}}\), where
\[ P\left( -z_{\alpha/2} < \frac{\hat{\theta}-\theta}{\sigma_{\hat{\theta}}} < z_{\alpha/2} \right) \approx 1-\alpha \]
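Example 7: A large sample confidence interval for the parameter \(\lambda\) of a Poisson distribution is
\[ \left( \bar{x} - z_{\alpha/2}\sqrt{\frac{\bar{x}}{n}},\ \bar{x} + z_{\alpha/2}\sqrt{\frac{\bar{x}}{n}} \right) \quad \text{where} \quad P\left( -z_{\alpha/2} < \frac{\bar{x}-\lambda}{\sqrt{\lambda/n}} < z_{\alpha/2} \right) \approx 1-\alpha \]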
Large Sample Confidence Interval for a population proportion, p
If n is sufficiently large, a \(100(1-\alpha)\%\) large sample confidence interval for p is
\[ \left( \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \right) \quad \text{where} \quad P\left( -z_{\alpha/2} < \frac{\hat{p}-p}{\sqrt{\hat{p}(1-\hat{p})/n}} < z_{\alpha/2} \right) \approx 1-\alpha \]
Check whether \(n\hat{p} \ge 10\) and \(n(1-\hat{p}) \ge 10\) to see if you have a large sample. Otherwise, there is a
formula (7.10) in your textbook, which can be used without checking whether the sample is large; that is,
formula (7.10) can be used for both large and small samples.
Choosing the sample size: The bound on the error of estimation is \(z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\); that is, \(\hat{p}\) will be
within \(z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\) of p. The sample size required to estimate a population proportion p to
within an amount \(B = z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\) with \(100(1-\alpha)\%\) confidence is \(n = \dfrac{z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{B^2}\). The same
formula can be written using the interval width, \(w = 2 z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\); then \(n = \dfrac{4 z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{w^2}\).
The conservative sample size can be found when \(\hat{p} = 1-\hat{p} = 0.5\).
What is different in one-sided confidence intervals? Discussion
Example 8: We are interested in the proportion of all students enrolled in Stat211 who listen to country
music. Using our class as our random sample from Stat211 students, we see that ___________ out
of ___________ of you listen to country music. Estimate the true proportion of all Stat211 students
that listen to country music using a 90% confidence interval.
What parameter are we estimating?_______________
Example 9: Scripps News Service reported that 4% of the members of the American Bar
Association (ABA) are African American. Suppose that this figure is based on a random sample of
400 ABA members.
(a) Is the sample size large enough to justify the use of the large-sample confidence interval for a
population proportion?
(b) Construct and interpret a 90% confidence interval for the true proportion of all ABA members
who are African American. (Answer: (0.0239 , 0.0561))
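Both parts of Example 9 can be checked with a short sketch. The function name `proportion_interval` is hypothetical, and the critical value is computed exactly instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, level):
    """Large-sample interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    bound = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - bound, p_hat + bound

p_hat, n = 0.04, 400

# (a) large-sample check: both expected counts should be at least 10
large_enough = n * p_hat >= 10 and n * (1 - p_hat) >= 10
print(large_enough)

# (b) 90% confidence interval
lo, hi = proportion_interval(p_hat, n, 0.90)
print(f"({lo:.4f}, {hi:.4f})")
```

Here n times the sample proportion is about 16 and n times its complement is about 384, so the large-sample condition holds.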
Example 10: I want to estimate the proportion of freshmen Aggies who will drop out before
graduation. How many Aggies should I include in my study in order to estimate p within 0.05 with
95% confidence? (Answer: 385)
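Since Example 10 gives no prior estimate of the proportion, the conservative choice of 0.5 applies. A minimal sketch, with a hypothetical function name `sample_size_for_proportion`:

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(bound, level, p_guess=0.5):
    """Smallest n with z * sqrt(p(1-p)/n) <= bound; p_guess = 0.5 is the conservative choice."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil(z ** 2 * p_guess * (1 - p_guess) / bound ** 2)

# Example 10: estimate p to within 0.05 with 95% confidence
print(sample_size_for_proportion(0.05, 0.95))  # 385
```

The product p(1 - p) is largest at p = 0.5, so any other planning value passed as `p_guess` gives a sample size no larger than the conservative one.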
Intervals based on a Normal Population Distribution:
When the sample size is small, we have to make specific assumptions to find the confidence
intervals.
Assumption: The population of interest is normal, so that X1, X2, ..., Xn constitutes a random sample
from a normal distribution with both \(\mu\) and \(\sigma\) unknown.
When \(\bar{x}\) is the sample mean of a random sample of size n from a normal distribution with mean \(\mu\), and
s is the sample standard deviation, the random variable \(T = \dfrac{\bar{x}-\mu}{s/\sqrt{n}}\) has a probability distribution called a
t-distribution with n-1 degrees of freedom.
Properties of t-distribution: discussion (page 296 of your textbook) and t-distribution table is on
page 725, Table A.5.
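A \(100(1-\alpha)\%\) confidence interval for \(\mu\) is
\[ \left( \bar{x} - t_{\alpha/2;\,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2;\,n-1}\frac{s}{\sqrt{n}} \right) \quad \text{where} \quad P\left( -t_{\alpha/2;\,n-1} < \frac{\bar{x}-\mu}{s/\sqrt{n}} < t_{\alpha/2;\,n-1} \right) = 1-\alpha \]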
Example 11: Students were weighed in kilograms at the beginning and end of a semester-long fitness
class. Assume the population of weight changes follows a normal distribution. A random sample
of 12 female students yielded a mean of 0.45 and a standard deviation of 1.5.
The 99% CI for the true mean weight change is (-0.8949 , 1.7949).
Would you believe me if I claimed the average weight change was 0?
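The interval in Example 11 can be reproduced as below. Python's standard library has no t quantile function, so this sketch takes the critical value as an input read from a t table; the value 3.106 for 11 degrees of freedom and upper-tail area 0.005 is supplied here as an assumption and should be checked against Table A.5. The function name `t_interval` is hypothetical.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Interval xbar +/- t_crit * s / sqrt(n); t_crit is looked up with n - 1 degrees of freedom."""
    bound = t_crit * s / sqrt(n)
    return xbar - bound, xbar + bound

T_CRIT_99_DF11 = 3.106   # table value for alpha/2 = 0.005 and 11 df (assumed, check Table A.5)

lo, hi = t_interval(0.45, 1.5, 12, T_CRIT_99_DF11)
print(f"({lo:.4f}, {hi:.4f})")
print(lo <= 0 <= hi)     # True means 0 is a plausible value for the mean weight change
```

Because 0 lies inside the interval, the data do not rule out an average weight change of 0 at the 99% confidence level.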
What is different in one-sided confidence intervals? Discussion
A Prediction Interval for a Single Future Value:
Let X1, X2, ..., Xn be a random sample from a normal population distribution, and suppose we wish to predict
the value of \(X_{n+1}\), a single future observation. A \(100(1-\alpha)\%\) prediction interval for \(X_{n+1}\) is
\[ \left( \bar{x} - t_{\alpha/2;\,n-1}\, s\sqrt{1+\frac{1}{n}},\ \bar{x} + t_{\alpha/2;\,n-1}\, s\sqrt{1+\frac{1}{n}} \right) \quad \text{where} \quad P\left( -t_{\alpha/2;\,n-1} < \frac{\bar{x}-x_{n+1}}{s\sqrt{1+\frac{1}{n}}} < t_{\alpha/2;\,n-1} \right) = 1-\alpha \]
Example 12: What is the 99% prediction interval for the weight of an individual student from the
population distribution in example 11? (Answer: (-4.3992 , 5.2992))
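The prediction interval in Example 12 uses the same summary statistics as Example 11 (n = 12, mean 0.45, standard deviation 1.5) but a wider standard error. In this sketch the t critical value 3.106 (11 degrees of freedom, upper-tail area 0.005) is an assumed table value supplied as input, and the function name `prediction_interval` is hypothetical.

```python
from math import sqrt

def prediction_interval(xbar, s, n, t_crit):
    """Interval for one future observation: xbar +/- t_crit * s * sqrt(1 + 1/n)."""
    bound = t_crit * s * sqrt(1 + 1 / n)
    return xbar - bound, xbar + bound

T_CRIT_99_DF11 = 3.106   # assumed table value, check Table A.5

lo, hi = prediction_interval(0.45, 1.5, 12, T_CRIT_99_DF11)
print(f"({lo:.4f}, {hi:.4f})")
```

The prediction interval is much wider than the confidence interval for the mean because it must account for the variability of a single new observation in addition to the uncertainty in the sample mean.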
Tolerance Intervals: Let k be a number between 0 and 100. A tolerance interval for capturing at
least k% of the values in a normal distribution with a confidence level of \(100(1-\alpha)\%\) has the form
\[ \left( \bar{x} - (\text{critical value})\cdot s,\ \bar{x} + (\text{critical value})\cdot s \right) \]
Table A.6 (page 726) gives the tolerance critical values for k=90, 95, 99 and \(\alpha\)=0.05,
0.01 in one- and two-sided intervals.
Example 13: Use example 11 and calculate an interval that includes at least 95% of the student
weights in the population distribution using a confidence level of 99%. (Answer: (-5.355 , 6.255))
Confidence Intervals for the Variance, \(\sigma^2\), and Standard Deviation, \(\sigma\), of a Normal Population:
The population of interest is normal, so that X1, X2, ..., Xn constitutes a random sample from a
normal distribution with parameters \(\mu\) and \(\sigma^2\). Then the random variable
\[ \frac{(n-1)\,s^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sigma^2} \]
has a chi-squared (\(\chi^2\)) probability distribution with n-1 degrees of freedom.
A \(100(1-\alpha)\%\) confidence interval for \(\sigma^2\) is
\[ \left( \frac{(n-1)\,s^2}{\chi^2_{\alpha/2;\,n-1}},\ \frac{(n-1)\,s^2}{\chi^2_{1-\alpha/2;\,n-1}} \right) \quad \text{where} \quad P\left( \chi^2_{1-\alpha/2;\,n-1} < \frac{(n-1)\,s^2}{\sigma^2} < \chi^2_{\alpha/2;\,n-1} \right) = 1-\alpha. \]
Taking square roots of the two endpoints gives a confidence interval for \(\sigma\).
The details of the chi-squared (\(\chi^2\)) probability distribution will be discussed in class and the table
of critical values (Table A.7, page 727) will be demonstrated.
Example 14: Determine the following:
(a) The 95th percentile for the chi-squared distribution with n=20.
(b) The 5th percentile for the chi-squared distribution with n=20.
(c) \(P(10.117 \le \chi^2 \le 30.143)\) where \(\chi^2\) is a chi-squared r.v. with n=20.
(d) \(P(\chi^2 < 10.283 \text{ or } \chi^2 > 35.478)\) where \(\chi^2\) is a chi-squared r.v. with n=22.
Exercise 7.44:
(a) Is it plausible to assume that the data come from a normal population distribution?
(b) Calculate an upper confidence bound with confidence level 95% for the population standard
deviation of turbidity.
(c) Calculate a 95% CI for the population standard deviation of turbidity.
Variable     n    Mean     Median   TrMean   StDev   SE Mean
turbidity    15   25.313   25.800   25.438   1.579   0.408

Variable     Minimum   Maximum   Q1       Q3
turbidity    21.700    27.300    24.100   26.700
[Figure: Normal Probability Plot for turbidity. Vertical axis: Percent; horizontal axis: Data. ML Estimates: Mean: 25.3133, StDev: 1.52528]
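Parts (b) and (c) of Exercise 7.44 can be sketched from the summary statistics n = 15 and StDev = 1.579. Python's standard library has no chi-squared quantile function, so the critical values for 14 degrees of freedom are supplied as inputs. The three values below are assumed table values and should be checked against Table A.7. The function names are hypothetical, and the calculation presumes the answer to part (a) is that normality is plausible.

```python
from math import sqrt

def sigma_interval(s, n, chi2_upper, chi2_lower):
    """Two-sided CI for sigma.

    chi2_upper is the critical value with upper-tail area alpha/2,
    chi2_lower is the critical value with upper-tail area 1 - alpha/2.
    """
    ss = (n - 1) * s ** 2
    return sqrt(ss / chi2_upper), sqrt(ss / chi2_lower)

def sigma_upper_bound(s, n, chi2_lower):
    """One-sided upper confidence bound for sigma (upper-tail area 1 - alpha)."""
    return sqrt((n - 1) * s ** 2 / chi2_lower)

n, s = 15, 1.579
CHI2_025_DF14 = 26.119   # upper-tail area 0.025, 14 df (assumed table value)
CHI2_975_DF14 = 5.629    # upper-tail area 0.975, 14 df (assumed table value)
CHI2_950_DF14 = 6.571    # upper-tail area 0.950, 14 df (assumed table value)

lo, hi = sigma_interval(s, n, CHI2_025_DF14, CHI2_975_DF14)
upper = sigma_upper_bound(s, n, CHI2_950_DF14)
print(f"95% CI for sigma: ({lo:.3f}, {hi:.3f})")
print(f"95% upper confidence bound for sigma: {upper:.3f}")
```

The one-sided upper bound is smaller than the upper endpoint of the two-sided interval because all of the 5% error probability is placed on one side.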
Discussion on finding the confidence interval for the linear combination of the population means
Exercise 7.51: 95% CI for \(\theta = \tfrac{1}{3}(\mu_1 + \mu_2 + \mu_3) - \mu_4\), where \(\mu_i\) is the ith true average yield.

Treatment        n_i    x̄_i    s_i
1 (pesticide)    100    10.5    1.5
2 (pesticide)    90     10.0    1.3
3 (pesticide)    100    10.1    1.8
4 (ladybugs)     120    10.7    1.6
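One way to carry out the calculation in Exercise 7.51 is the general large-sample interval applied to a linear combination of sample means. The sketch below rests on assumptions that are not stated in the handout: the four samples are independent, the sample sizes are large enough for the normal approximation, and each sample standard deviation is used in place of the unknown population value. Under those assumptions the estimator's variance is the sum of the squared coefficients times s_i squared over n_i.

```python
from math import sqrt
from statistics import NormalDist

# (n_i, xbar_i, s_i) for treatments 1-4, taken from the table above
data = [(100, 10.5, 1.5), (90, 10.0, 1.3), (100, 10.1, 1.8), (120, 10.7, 1.6)]
coeffs = [1 / 3, 1 / 3, 1 / 3, -1]   # theta = (mu1 + mu2 + mu3)/3 - mu4

# Point estimate: the same linear combination of the sample means
theta_hat = sum(c * xbar for c, (_, xbar, _) in zip(coeffs, data))

# Estimated standard error, assuming independent samples
se = sqrt(sum(c ** 2 * s ** 2 / n for c, (n, _, s) in zip(coeffs, data)))

z = NormalDist().inv_cdf(0.975)      # 95% two-sided critical value
lo, hi = theta_hat - z * se, theta_hat + z * se
print(f"theta_hat = {theta_hat:.2f}, 95% CI: ({lo:.4f}, {hi:.4f})")
```

The whole interval lies below 0, which under these assumptions suggests that the average of the three pesticide yields is lower than the ladybug yield.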