Point Estimation

Introduction
Point Estimation
Sampling Distribution of the Sample Mean (X̄)
Sampling Distribution for the Difference between Two Means
Sampling Distribution of the Sample Proportion (p̂)
Sampling Distribution for the Difference between Two Proportions

Introduction

A sampling distribution is the probability distribution of a sample statistic based on all possible simple random samples of the same size from the same population.
If we take several samples and compute the mean of each one, the distribution of those sample means is called the sampling distribution of the sample mean, X̄.
For example, suppose you sample 50 students from your college and compute their mean GPA. If you obtained many different samples of 50, you would compute a different mean for each sample. We are interested in the distribution of all the mean GPAs we might calculate for any given sample of 50 students.
If we take several samples and compute, for each one, the proportion of the sample that has a specific characteristic, the distribution of those sample proportions is called the sampling distribution of the sample proportion, p̂.
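The GPA example can be imitated with a small simulation. This is only a sketch: the population of 2,000 GPAs is made-up data, and the function name `simulate_sampling_distribution` is a hypothetical helper, not something from the slides.

```python
import random
import statistics

def simulate_sampling_distribution(population, sample_size, num_samples, seed=0):
    """Draw repeated simple random samples and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]

# Hypothetical population: 2,000 GPAs (made-up values, clipped to the 0.0-4.0 range).
_rng = random.Random(42)
population = [min(4.0, max(0.0, _rng.gauss(3.0, 0.5))) for _ in range(2000)]

# 1,000 different samples of 50 students, one mean GPA per sample.
sample_means = simulate_sampling_distribution(population, sample_size=50, num_samples=1000)

print("population mean      :", round(statistics.mean(population), 3))
print("mean of sample means :", round(statistics.mean(sample_means), 3))
print("SD of sample means   :", round(statistics.stdev(sample_means), 3))
```

The list `sample_means` is an empirical stand-in for the sampling distribution of X̄: its values cluster around the population mean and vary much less than individual GPAs do.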
Point Estimation
Point estimation is a form of statistical inference.
In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.
X̄ is the point estimator of the population mean μ.
s is the point estimator of the population standard deviation σ.
p̂ is the point estimator of the population proportion p.
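As a minimal illustration, the three point estimates can be computed from a single sample. The data values and the threshold of 4.5 used to define the "characteristic" are invented for this sketch.

```python
import statistics

# Hypothetical sample of 8 measurements (illustrative values only).
sample = [4.2, 5.1, 3.8, 4.9, 5.4, 4.4, 4.7, 5.0]

x_bar = statistics.mean(sample)   # point estimate of the population mean
s = statistics.stdev(sample)      # sample standard deviation (divides by n - 1)

# Sample proportion with a characteristic, here "value exceeds 4.5"
# (an arbitrary threshold chosen for the illustration).
successes = sum(1 for value in sample if value > 4.5)
p_hat = successes / len(sample)   # point estimate of the population proportion

print("x_bar =", x_bar)
print("s     =", round(s, 4))
print("p_hat =", p_hat)
```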
Sampling Distribution of the Sample Mean (X̄)

The probability distribution of X̄ is called its sampling distribution. It lists the various values that X̄ can assume and the probability of each value.
If the population is normally distributed with mean μ and standard deviation σ, the sampling distribution of the sample mean X̄ is also normally distributed, with:
i) Mean: $\mu_{\bar{X}} = \mu$
ii) Standard deviation (standard error): $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
where n is the sample size.


If the population is not normally distributed, apply the central limit theorem.
Central Limit Theorem:
◦ Even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough (n ≥ 30).
◦ As the sample size n gets large enough, the sampling distribution becomes almost normal regardless of the shape of the population.
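The central limit theorem can be checked empirically. The sketch below assumes an exponential population (chosen here only because it is clearly non-normal) and uses a hypothetical helper `exponential_sample_means`; it compares the simulated sample means against the values μ and σ/√n.

```python
import math
import random
import statistics

def exponential_sample_means(n, num_samples, rate=1.0, seed=0):
    """Return num_samples sample means, each computed from n exponential draws."""
    rng = random.Random(seed)
    return [
        statistics.mean([rng.expovariate(rate) for _ in range(n)])
        for _ in range(num_samples)
    ]

n = 30
means = exponential_sample_means(n, num_samples=5000)

# An exponential population with rate 1 has mean 1 and standard deviation 1.
mu, sigma = 1.0, 1.0
standard_error = sigma / math.sqrt(n)

# Share of sample means within 1.96 standard errors of mu; a normal
# sampling distribution would put about 95% of them in that band.
within = sum(abs(m - mu) <= 1.96 * standard_error for m in means) / len(means)

print("mean of sample means:", round(statistics.mean(means), 3))
print("SD of sample means  :", round(statistics.stdev(means), 3))
print("sigma / sqrt(n)     :", round(standard_error, 3))
print("share within 1.96 SE:", round(within, 3))
```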

Z-value for the sampling distribution of the mean (X̄):
$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Suppose a population has mean μ = 8 and standard deviation σ = 3.
A random sample of size n = 36 is selected. What is the probability
that the sample mean is between 7.8 and 8.2?
Solution:
Even if the population is not normally distributed, the central limit theorem can be used (n > 30).
So the sampling distribution of the sample mean is approximately normal with mean
$$\mu_{\bar{X}} = \mu = 8$$
and standard deviation
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$$
Hence,
$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$$
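The result for μ = 8, σ = 3, n = 36 can be cross-checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library in place of a Z table.

```python
import math
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

standard_error = sigma / math.sqrt(n)           # 3 / 6 = 0.5
sampling_dist = NormalDist(mu, standard_error)  # distribution of the sample mean

# P(7.8 < X-bar < 8.2) as a difference of two cumulative probabilities
probability = sampling_dist.cdf(8.2) - sampling_dist.cdf(7.8)

print(round(probability, 4))  # 0.3108
```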



If n ≥ 30 (large), the sampling distribution of the sample mean is approximately normal:
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
Note: If σ² is unknown, it is estimated by s².
If n < 30 (small) and σ² is known, the sampling distribution of the sample mean is normal provided the sample is from a normal population:
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
If n < 30 and σ² is unknown, the t distribution with n − 1 degrees of freedom is used:
$$T = \frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t_{n-1}$$
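The three cases can be written as a small decision rule. This is a sketch of the rules as stated here, not a general-purpose test selector; the function name and its return format are hypothetical, and the case of a small sample from a non-normal population raises an error because no rule is given for it.

```python
def choose_sampling_distribution(n, sigma_known, population_normal):
    """Return (distribution name, degrees of freedom or None) for the sample mean."""
    if n >= 30:
        # Large sample: normal; if sigma^2 is unknown it is estimated by s^2.
        return ("normal", None)
    if not population_normal:
        # Small sample from a non-normal population: not covered by these rules.
        raise ValueError("no rule given for n < 30 with a non-normal population")
    if sigma_known:
        # Small sample, normal population, known variance: normal.
        return ("normal", None)
    # Small sample, normal population, unknown variance: t with n - 1 df.
    return ("t", n - 1)

print(choose_sampling_distribution(36, sigma_known=True, population_normal=False))
print(choose_sampling_distribution(16, sigma_known=True, population_normal=True))
print(choose_sampling_distribution(16, sigma_known=False, population_normal=True))
```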
The amount of time required to change the oil and filter of a vehicle is normally distributed with a mean of 45 minutes and a standard deviation of 10 minutes. A random sample of 16 cars is selected.
a) What is the standard error of the sample mean?
b) What is the probability that the sample mean is between 45 and 52 minutes?
c) What is the probability that the sample mean is between 39 and 48 minutes?
d) Find the two values between which the middle 95% of all sample means fall.

Solution:
X: the amount of time required to change the oil and filter of a vehicle
$$X \sim N(45, 10^2), \quad n = 16$$
X̄: the mean amount of time required to change the oil and filter of a vehicle
$$\bar{X} \sim N\left(45, \frac{10^2}{16}\right)$$
a) The standard error:
$$\sigma_{\bar{X}} = \frac{10}{\sqrt{16}} = 2.5$$
b)
$$P(45 < \bar{X} < 52) = P\left(\frac{45 - 45}{2.5} < Z < \frac{52 - 45}{2.5}\right) = P(0 < Z < 2.8) = 0.4974$$
c)
$$P(39 < \bar{X} < 48) = P\left(\frac{39 - 45}{2.5} < Z < \frac{48 - 45}{2.5}\right) = P(-2.4 < Z < 1.2) = 0.4918 + 0.3849 = 0.8767$$
d)
$$P(a < \bar{X} < b) = 0.95$$
$$P\left(\frac{a - 45}{2.5} < Z < \frac{b - 45}{2.5}\right) = 0.95$$
$$P(z_a < Z < z_b) = 0.95$$
From the table: $z_a = -1.96$ and $z_b = 1.96$, so
$$\frac{a - 45}{2.5} = -1.96 \Rightarrow a = 40.1$$
$$\frac{b - 45}{2.5} = 1.96 \Rightarrow b = 49.9$$
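The four answers for the oil-change problem (mean 45, standard deviation 10, n = 16) can be reproduced without a table. This sketch uses `statistics.NormalDist`, whose `inv_cdf` method stands in for the reverse table lookup in part d).

```python
import math
from statistics import NormalDist

mu, sigma, n = 45, 10, 16

# a) standard error of the sample mean
standard_error = sigma / math.sqrt(n)
x_bar_dist = NormalDist(mu, standard_error)

# b) P(45 < X-bar < 52)
prob_b = x_bar_dist.cdf(52) - x_bar_dist.cdf(45)

# c) P(39 < X-bar < 48)
prob_c = x_bar_dist.cdf(48) - x_bar_dist.cdf(39)

# d) bounds of the middle 95%: 2.5% is cut off in each tail
lower = x_bar_dist.inv_cdf(0.025)
upper = x_bar_dist.inv_cdf(0.975)

print(standard_error)                      # 2.5
print(round(prob_b, 4))                    # 0.4974
print(round(prob_c, 4))                    # 0.8767
print(round(lower, 1), round(upper, 1))    # 40.1 49.9
```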
Exercise:
A certain type of thread is manufactured with a mean tensile strength of 78.3 kg and a standard deviation of 5.6 kg. Assuming that the strength of this type of thread is approximately normally distributed, find:
a) The probability that the mean strength of a random sample of 10 such threads falls between 77 kg and 78 kg.
b) The probability that the mean strength is greater than 79 kg.
c) The probability that the mean strength is less than 76 kg.
d) The value of X̄ to the right of which 15% of the means computed from random samples of size 10 would fall.

Sampling Distribution for the Difference between Two Means

Suppose we have two populations, X₁ and X₂, which are normally distributed:
$$X_1 \sim N(\mu_1, \sigma_1^2) \quad \text{and} \quad X_2 \sim N(\mu_2, \sigma_2^2)$$
Sampling distributions for X̄₁ and X̄₂:
$$\bar{X}_1 \sim N\left(\mu_1, \frac{\sigma_1^2}{n_1}\right) \quad \text{and} \quad \bar{X}_2 \sim N\left(\mu_2, \frac{\sigma_2^2}{n_2}\right)$$
Now we are interested in the sampling distribution of the difference between the two sample means, X̄₁ − X̄₂. For independent samples:
$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$
A taxi company purchased two brands of tires, brand A and brand B. It is known that the mean distance travelled before the tires wear out is 36300 km for brand A, with a standard deviation of 200 km, and 36100 km for brand B, with a standard deviation of 300 km. A random sample of 36 tires of brand A and a random sample of 49 tires of brand B are taken. What is the probability that
a) the difference between the mean distances travelled before the tires of brand A and brand B wear out is at most 300 km?
b) the mean distance travelled by brand A tires before they wear out is larger than the mean distance travelled by brand B tires?
Solution:
X̄₁: the mean distance travelled before the brand A tires wear out
X̄₂: the mean distance travelled before the brand B tires wear out
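One way to set up the calculation for the tire problem is sketched below, using the distribution of the difference between two sample means and `statistics.NormalDist`. Two points are assumptions of this sketch: the samples are treated as independent, and "at most 300 km" in part a) is read as P(X̄₁ − X̄₂ ≤ 300), with the absolute-difference reading computed alongside for comparison.

```python
import math
from statistics import NormalDist

mean_a, sd_a, n_a = 36300, 200, 36   # brand A
mean_b, sd_b, n_b = 36100, 300, 49   # brand B

# Distribution of the difference between the two sample means
diff_mean = mean_a - mean_b
diff_sd = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
diff_dist = NormalDist(diff_mean, diff_sd)

# a) difference at most 300 km, read as P(X1-bar - X2-bar <= 300)
prob_a = diff_dist.cdf(300)
# Alternative reading of a): P(|X1-bar - X2-bar| <= 300)
prob_a_abs = diff_dist.cdf(300) - diff_dist.cdf(-300)

# b) P(X1-bar > X2-bar) = P(X1-bar - X2-bar > 0)
prob_b = 1 - diff_dist.cdf(0)

print("SD of the difference:", round(diff_sd, 3))
print("a) one-sided reading:", round(prob_a, 4))
print("a) absolute reading :", round(prob_a_abs, 4))
print("b)                  :", round(prob_b, 4))
```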
Exercise:
The mean final examination score for students taking SM2703 is 30 marks (out of 50 marks) with a standard deviation of 6 marks. Assume that the final scores are approximately normal. Two random samples were taken, consisting of 32 and 50 students respectively. What is the probability that:
a) The mean final examination scores will differ by more than 3 marks?
b) The mean final examination score from group 1 is larger than that from group 2?
Sampling Distribution of the Sample Proportion (p̂)

The probability distribution of the sample proportion p̂ is called its sampling distribution.
The population and sample proportions are denoted by p and p̂ respectively, and are calculated as:
$$p = \frac{X}{N} \quad \text{and} \quad \hat{p} = \frac{x}{n}$$
where
◦ N = total number of elements in the population;
◦ X = number of elements in the population that possess a specific characteristic;
◦ n = total number of elements in the sample; and
◦ x = number of elements in the sample that possess a specific characteristic.

For large values of n (n ≥ 30), the sampling distribution of p̂ is very closely approximated by a normal distribution:
$$\hat{p} \sim N\left(p, \frac{pq}{n}\right)$$
where q = 1 − p.
The mean of the sample proportion is denoted by $\mu_{\hat{p}}$ and is equal to the population proportion p:
$$\mu_{\hat{p}} = p$$
The standard deviation of the sample proportion is denoted by $\sigma_{\hat{p}}$:
$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$$
If the true proportion of voters who support Proposition A is
p = 0.40, what is the probability that a sample of size 200 yields a
sample proportion between 0.40 and 0.45?
Solution:
$$\hat{p} \sim N\left(0.4, \frac{(0.4)(0.6)}{200}\right)$$
$$\hat{p} \sim N(0.4, 0.0012)$$
$$P(0.40 < \hat{p} < 0.45) = P\left(\frac{0.4 - 0.4}{\sqrt{0.0012}} < Z < \frac{0.45 - 0.4}{\sqrt{0.0012}}\right) = P(0 < Z < 1.44) = 0.4251$$
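The Proposition A result (p = 0.40, n = 200) can be checked with `statistics.NormalDist`. This is a sketch of the same normal approximation, computed without rounding the Z value to two decimals.

```python
import math
from statistics import NormalDist

p, n = 0.40, 200

# Standard deviation of the sample proportion: sqrt(pq / n) = sqrt(0.0012)
standard_error = math.sqrt(p * (1 - p) / n)
p_hat_dist = NormalDist(p, standard_error)

# P(0.40 < p-hat < 0.45)
probability = p_hat_dist.cdf(0.45) - p_hat_dist.cdf(0.40)

print(round(standard_error, 4))
print(round(probability, 4))
```

The computed probability is about 0.4255. It differs slightly from the table answer of 0.4251 because the table lookup rounds Z to 1.44.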
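Sampling Distribution for the Difference between Two Proportions

Now say we have two binomial populations with proportions of successes p₁ and p₂. For large samples, the sample proportions are distributed as:
$$\hat{p}_1 \sim N\left(p_1, \frac{p_1 q_1}{n_1}\right) \quad \text{and} \quad \hat{p}_2 \sim N\left(p_2, \frac{p_2 q_2}{n_2}\right)$$
The sampling distribution of the difference between two sample proportions, P̂₁ − P̂₂, for independent samples:
$$\hat{P}_1 - \hat{P}_2 \sim N\left(p_1 - p_2, \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}\right)$$
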
A change in the process for manufacturing component parts was considered. It was found that 75 out of 1500 items from the existing procedure were defective and 80 out of 2000 items from the new procedure were defective. If a random sample of 49 items is taken from the existing procedure and a random sample of 64 items is taken from the new procedure, what is the probability that
a) the proportion of defective items from the new procedure exceeds the proportion of defective items from the existing procedure?
b) the proportions differ by at most 0.015?
c) the proportion of defective items from the new procedure exceeds the proportion of defective items from the existing procedure by at least 0.02?
Solution:
P̂_N: the proportion of defective items from the new procedure
P̂_E: the proportion of defective items from the existing procedure
$$p_N = \frac{80}{2000} = 0.04, \quad p_E = \frac{75}{1500} = 0.05$$
$$\hat{P}_N \sim N\left(0.04, \frac{0.04(0.96)}{64}\right), \quad \hat{P}_E \sim N\left(0.05, \frac{0.05(0.95)}{49}\right)$$
$$\hat{P}_N - \hat{P}_E \sim N\left(0.04 - 0.05, \frac{0.05(0.95)}{49} + \frac{0.04(0.96)}{64}\right)$$
$$\hat{P}_N - \hat{P}_E \sim N(-0.01, 0.0016)$$
a)
$$P(\hat{P}_N > \hat{P}_E) = P(\hat{P}_N - \hat{P}_E > 0) = P\left(Z > \frac{0 - (-0.01)}{\sqrt{0.0016}}\right) = P(Z > 0.25) = 0.4013$$
b)
$$P(|\hat{P}_N - \hat{P}_E| \le 0.015) = P(-0.015 \le \hat{P}_N - \hat{P}_E \le 0.015)$$
$$= P\left(\frac{-0.015 - (-0.01)}{\sqrt{0.0016}} \le Z \le \frac{0.015 - (-0.01)}{\sqrt{0.0016}}\right) = P(-0.125 \le Z \le 0.625) = 0.2838$$
c)
$$P(\hat{P}_N \ge \hat{P}_E + 0.02) = P(\hat{P}_N - \hat{P}_E \ge 0.02) = P\left(Z \ge \frac{0.02 - (-0.01)}{\sqrt{0.0016}}\right) = P(Z \ge 0.75) = 0.2266$$
Exercise:
Usually 3% of the diskettes produced by machine A are defective, while 2% of the diskettes produced by machine B are defective. If a random sample of 50 diskettes produced by machine A and a random sample of 50 diskettes produced by machine B are chosen, what is the probability that
a) The difference between the sample proportion of defective diskettes produced by machine A and the sample proportion of defective diskettes produced by machine B does not exceed 0.1?
b) The difference between the sample proportion of defective diskettes produced by machine A and the sample proportion of defective diskettes produced by machine B is at least 0.15?
c) The sample proportion of defective diskettes produced by machine A exceeds the sample proportion of defective diskettes produced by machine B?