Sampling distributions
chapter 6 ST 315
Nutan S. Mishra
Department of Mathematics and Statistics
University of South Alabama
Useful links
• http://oak.cats.ohiou.edu/~wallacd1/ssample.html
• http://garnet.acns.fsu.edu/~jnosari/05.PDF
• http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/
Sampling distribution
In chapter 2 we defined a population parameter as a function of all the population
values.
If the population consists of N observations, then the population mean and the population
standard deviation are parameters:
$\mu = f(x_1, x_2, \ldots, x_N) = \dfrac{\sum_{i=1}^{N} x_i}{N}$
$\sigma = g(x_1, x_2, \ldots, x_N) = \sqrt{\dfrac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}}$
For a given population, the parameters are fixed values.
Sampling distribution
On the other hand, if we draw a sample of size n from a population of size N,
then a function of the sample values is called a statistic.
For example, the sample mean and the sample standard deviation are sample
statistics:
$\bar{x} = f(x_1, x_2, \ldots, x_n) = \dfrac{\sum_{i=1}^{n} x_i}{n}$
$s = g(x_1, x_2, \ldots, x_n) = \sqrt{\dfrac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}}$
Since we can draw a large number of different samples from the population, the value of
a sample statistic varies from sample to sample.
Sampling distribution
Since the value of a sample statistic varies from sample to sample, the
statistic itself is a random variable and has a probability distribution.
For example, the sample mean x̄ is a random variable and it has a probability
distribution.
Example: Start with a toy example
Let the population consist of 5 students who took a math quiz worth 5
points.
The names of the students and their scores are as follows:
Name of the student   A   B   C   D   E
Score                 2   3   4   4   5
For this population the mean is µ = 3.6 and the standard deviation is σ = 1.02.
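The two population parameters quoted for the quiz scores can be checked directly. The following is a minimal sketch in Python (the variable names are illustrative, not from the slides); it computes σ both from the definition and from the shortcut form (Σx² − (Σx)²/N)/N under the square root.

```python
from math import sqrt

scores = [2, 3, 4, 4, 5]  # quiz scores of students A..E from the example
N = len(scores)

# Population mean: sum of all values divided by N
mu = sum(scores) / N

# Population standard deviation, definitional form: root mean squared deviation
sigma_def = sqrt(sum((x - mu) ** 2 for x in scores) / N)

# Same quantity via the shortcut form: (sum x^2 - (sum x)^2 / N) / N
sum_x = sum(scores)
sum_x2 = sum(x * x for x in scores)
sigma_short = sqrt((sum_x2 - sum_x ** 2 / N) / N)

print(round(mu, 2), round(sigma_def, 2), round(sigma_short, 2))  # 3.6 1.02 1.02
```

Both forms divide by N because all five students make up the whole population.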
Sampling distribution
Now we repeatedly draw samples of size three from the population of size
5. There are 10 possible samples, as listed below.
The population parameters are µ = 3.6 and s.d. σ = 1.02.
Sample no.   Sample   Sample values   x̄      s
1            A,B,C    2,3,4           3      1
2            A,B,D    2,3,4           3      1
3            A,B,E    2,3,5           3.33   1.53
4            A,C,D    2,4,4           3.33   1.15
5            A,C,E    2,4,5           3.67   1.53
6            A,D,E    2,4,5           3.67   1.53
7            B,C,D    3,4,4           3.67   0.58
8            B,C,E    3,4,5           4      1
9            B,D,E    3,4,5           4      1
10           C,D,E    4,4,5           4.33   0.58
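The list of possible samples can be generated mechanically. The sketch below is one way to do it in Python, assuming the standard-library `itertools.combinations` for the enumeration and `statistics.stdev` (which uses the n − 1 divisor) for s; the `rows` structure is just an illustrative container.

```python
from itertools import combinations
from statistics import mean, stdev

# Population from the toy example: student name -> quiz score
population = {"A": 2, "B": 3, "C": 4, "D": 4, "E": 5}

rows = []
for names in combinations(population, 3):    # every sample of size 3, without replacement
    values = [population[k] for k in names]
    # stdev divides by n - 1, i.e. it is the sample standard deviation s
    rows.append((",".join(names), values, mean(values), stdev(values)))

for i, (names, values, xbar, s) in enumerate(rows, start=1):
    print(f"{i:2d}  {names}  {values}  {xbar:.2f}  {s:.2f}")
```

The number of rows is the number of ways to choose 3 students out of 5, which is 10.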
Sampling distribution
X = score of a student in the math quiz
Population distribution
x      f    P(x)
2      1    .2
3      1    .2
4      2    .4
5      1    .2
Sampling distribution of sample mean
x̄      f    P(x̄)
3      2    .2
3.33   2    .2
3.67   3    .3
4      2    .2
4.33   1    .1
Thus we see that the sample mean x̄ is a new random variable and has a
probability distribution.
Question: What is the mean of this random variable and what is its variance?
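As a first step toward that question, the sampling distribution of the sample mean can be rebuilt by counting how often each value of x̄ occurs among the 10 possible samples. This is a small sketch, assuming exact fractions (`fractions.Fraction`) so that equal means are grouped without rounding issues; the names `freq` and `dist` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

scores = [2, 3, 4, 4, 5]  # population of quiz scores
n = 3                     # sample size

# Exact sample mean of every possible sample of size n
means = [Fraction(sum(c), n) for c in combinations(scores, n)]
freq = Counter(means)     # f: how many samples give each mean
total = len(means)        # number of possible samples (10 here)

# P(xbar) = f / number of possible samples
dist = {m: Fraction(f, total) for m, f in sorted(freq.items())}

for m, p in dist.items():
    print(f"{float(m):.2f}  {freq[m]}  {float(p):.1f}")
```

The probabilities add up to 1, as they must for a probability distribution.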
Sampling distribution
Let N be the size of the population and n be the size of the
sample.
If n/N > .05:
mean of sample mean $\mu_{\bar{x}} = \mu$
and standard deviation of sample mean $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}$
And if n/N ≤ .05:
mean of sample mean $\mu_{\bar{x}} = \mu$
and standard deviation of sample mean $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
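The two cases can be wrapped in one small function. The sketch below is an illustration only (the function name and its optional `N` argument are assumptions, not part of the slides): it applies the finite population correction factor √((N − n)/(N − 1)) when n/N > .05 and uses plain σ/√n otherwise.

```python
from math import sqrt

def sd_of_sample_mean(sigma, n, N=None):
    """Standard deviation of the sample mean.

    The finite population correction is applied only when the population
    size N is given and n/N > 0.05; otherwise sigma / sqrt(n) is returned.
    """
    se = sigma / sqrt(n)
    if N is not None and n / N > 0.05:
        se *= sqrt((N - n) / (N - 1))
    return se

# Toy example: population variance 5.2 / 5 = 1.04, N = 5, n = 3 (n/N = 0.6)
sigma = sqrt(1.04)
print(round(sd_of_sample_mean(sigma, n=3, N=5), 3))  # 0.416
```

For the quiz example n/N = 0.6, so the correction matters: without it the result would be σ/√3 ≈ 0.589 instead of 0.416.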
Sampling distribution of sample mean
Theorem
Let X be a random variable with population mean µ and population standard
deviation σ. If we collect samples of size n, then the new random
variable, the sample mean x̄, has mean equal to µ and standard
deviation σ/√n.
We can denote them as follows:
$\text{mean of } \bar{x} = \mu_{\bar{x}} = \mu$
$\text{standard deviation of } \bar{x} = \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Sampling distribution of sample mean
$\text{mean of } \bar{x} = \mu_{\bar{x}} = \mu$
$\text{standard deviation of } \bar{x} = \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
It is easy to see that the standard deviation of the sample
mean decreases as the sample size increases.
The mean of the sample mean is unaffected by a
change in sample size.
The sample mean is called an estimator of the population
mean,
because whenever the population mean is unknown we
use the sample mean in its place.
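How quickly the standard deviation of the sample mean shrinks can be seen from a few values of n. The numbers below are a hypothetical illustration (σ = 12 is an assumed value, not from the slides).

```python
from math import sqrt

sigma = 12.0               # hypothetical population standard deviation
sizes = [4, 16, 36, 144]   # sample sizes to compare

# Standard deviation of the sample mean for each sample size: sigma / sqrt(n)
ses = [sigma / sqrt(n) for n in sizes]

for n, se in zip(sizes, ses):
    print(n, se)
# 4 6.0
# 16 3.0
# 36 2.0
# 144 1.0
```

Because n sits under a square root, the sample size has to be quadrupled to cut the standard deviation of the sample mean in half.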
Sampling distribution of sample mean
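x̄      P(x̄)
3      .2
3.33   .2
3.67   .3
4      .2
4.33   .1
When we compute the mean and variance from the above table,
they are $\mu_{\bar{x}} = 3.6$ and $\sigma_{\bar{x}}^2 \approx 0.173$ (so $\sigma_{\bar{x}} \approx 0.42$).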
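The arithmetic behind the mean and variance of this table can be written out as a worked computation. It uses the usual expected-value formulas for a discrete random variable, with the exact values 10/3, 11/3 and 13/3 in place of the rounded 3.33, 3.67 and 4.33.

```latex
\begin{aligned}
\mu_{\bar{x}} &= \sum \bar{x}\, P(\bar{x})
  = 3(0.2) + \tfrac{10}{3}(0.2) + \tfrac{11}{3}(0.3) + 4(0.2) + \tfrac{13}{3}(0.1) = 3.6 \\
\sum \bar{x}^{2} P(\bar{x})
  &= 9(0.2) + \tfrac{100}{9}(0.2) + \tfrac{121}{9}(0.3) + 16(0.2) + \tfrac{169}{9}(0.1) \approx 13.1333 \\
\sigma_{\bar{x}}^{2} &= \sum \bar{x}^{2} P(\bar{x}) - \mu_{\bar{x}}^{2}
  \approx 13.1333 - 12.96 = 0.1733 \\
\sigma_{\bar{x}} &\approx \sqrt{0.1733} \approx 0.416
\end{aligned}
```

The mean agrees with the population mean µ = 3.6, and the variance agrees with (σ²/n)(N − n)/(N − 1) = (1.04/3)(2/4) ≈ 0.1733, the finite-population form that applies here because n/N = 0.6 > .05.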
Sampling distribution of sample mean
We have seen that the distribution of the sample mean
x̄ is derived from the distribution of x.
Thus the distribution of x is called the parent distribution.
The next question is to investigate the
relationship between the parent distribution and
the sampling distribution of x̄.
Sampling distribution of sample mean
Saying that the distribution of x is normal with mean µ and standard
deviation σ is equivalent to saying that
the parent population is normal with mean µ and
standard deviation σ.
If we draw a sample of size n from such a population then
• The mean of x̄, that is $\mu_{\bar{x}}$, is equal to the mean of the
population µ.
• The standard deviation of x̄, that is $\sigma_{\bar{x}}$, is equal to σ/√n.
• The shape of the distribution of x̄ is normal, whatever
the value of n.
Sampling distribution of sample mean
If X ~ N(µ, σ) then
x̄ ~ N(µ, σ/√n)
where n is the size of the sample
drawn from the population.
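This result makes probabilities about the sample mean easy to compute. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `sample_mean_dist` and the numbers (parent N(100, 15), n = 9) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_dist(mu, sigma, n):
    """Distribution of the sample mean when the parent is N(mu, sigma)."""
    return NormalDist(mu, sigma / sqrt(n))

# Hypothetical parent population N(100, 15) and samples of size 9
xbar = sample_mean_dist(100, 15, 9)

# P(95 < xbar < 105): one standard deviation of xbar on either side of the mean
p = xbar.cdf(105) - xbar.cdf(95)
print(xbar.stdev, round(p, 4))  # 5.0 0.6827
```

Note that the second argument of `NormalDist` is a standard deviation, which matches the N(µ, σ) notation used here.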
Central Limit Theorem
For a large sample size, the sampling distribution
of x is approximately normal, irrespective of the
shape of the population distribution.
What size of the sample is considered to be large?
A sample of size ≥ 30 is considered to be large.
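A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions: the parent population is exponential with mean 1 and standard deviation 1 (clearly not normal), the sample size is 30, and the seed, the number of repetitions and all names are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so that the run is repeatable

def sample_means(n, reps):
    """Means of `reps` random samples of size n from an exponential population."""
    return [mean([random.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

n = 30
means = sample_means(n, reps=5000)

# Theory: mean of xbar is mu = 1, standard deviation of xbar is sigma / sqrt(n)
print(round(mean(means), 3), round(stdev(means), 3), round(1 / sqrt(n), 3))

# For a normal curve about 68% of values lie within one standard deviation of the mean
within = sum(abs(m - 1.0) <= 1 / sqrt(n) for m in means) / len(means)
print(round(within, 3))
```

With these settings the simulated mean and standard deviation of x̄ should land close to 1 and 1/√30 ≈ 0.183, and the share within one standard deviation should be near 0.68, even though the parent distribution is strongly skewed.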
Sampling distribution of sample mean
Assume that the population standard deviation σ is
known.
If the random sample comes from a normal population, the
sampling distribution of the sample mean is normal regardless
of the size of the sample.
If the shape of the parent population is not known or not normal,
then the distribution of the sample mean is approximately normal
whenever n is large (≥ 30). (This is the central limit theorem.)
If the shape of the parent population is not known or not normal
and the sample size is small, then we cannot readily say anything about
the shape of the sampling distribution.
Sampling distribution of sample mean
When the population standard deviation is
unknown
• If the sample size is large, the sampling
distribution of the sample mean is still approximately
normal.
• If the sample size is small, then
$t = \dfrac{\bar{X} - \mu}{S / \sqrt{n}}$
is a random variable having a t-distribution
with parameter ν = n − 1,
where $S^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
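The t statistic defined above is straightforward to compute for a given sample. The following is a minimal sketch; the function name `t_statistic`, the sample values and the value of µ are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu):
    """Return (t, degrees of freedom) for a sample and a population mean mu."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)  # square root of S^2, which uses the n - 1 divisor
    return (xbar - mu) / (s / sqrt(n)), n - 1

# Hypothetical small sample and hypothetical population mean
t, df = t_statistic([4, 6, 8, 10], mu=5)
print(round(t, 3), df)  # 1.549 3
```

Here x̄ = 7 and S² = 20/3, so t = 2 / (√(20/3)/2) ≈ 1.549 with ν = 4 − 1 = 3 degrees of freedom.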
About t-distribution
• t is a special continuous distribution.
• It is symmetric about zero.
• It has a bell-shaped curve like the normal.
• Its variance depends on the parameter ν.
• ν is called the degrees of freedom and is the only
parameter of the t-distribution.
• The variance of t approaches 1 as n → ∞.
• In other words, t approaches Z as n → ∞.
• The t-values are tabulated for different values of
the right-tail areas and degrees of freedom.
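The statement about the variance can be made concrete with a standard result that the slides do not state explicitly: for ν > 2 the variance of the t-distribution is ν/(ν − 2). The sketch below simply tabulates that expression; the function name is illustrative.

```python
def t_variance(df):
    """Variance of the t-distribution, nu / (nu - 2); this formula applies only for df > 2."""
    if df <= 2:
        raise ValueError("the formula nu / (nu - 2) applies only for df > 2")
    return df / (df - 2)

for df in (3, 5, 10, 30, 100, 1000):
    print(df, round(t_variance(df), 3))
# 3 3.0
# 5 1.667
# 10 1.25
# 30 1.071
# 100 1.02
# 1000 1.002
```

The variance is always larger than 1 (heavier tails than Z) and drops toward 1 as the degrees of freedom grow, which is why t-table values approach the corresponding Z values for large n.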
Sampling distribution of sample mean
                n ≥ 30            n < 30
σ known         Normal            Normal
σ unknown       Approx. normal    t
For the t-distribution: assume that the parent population is approximately normal.
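The cases discussed for known and unknown σ can be collected into a small decision helper. This is only a sketch of the rules as stated in these notes: the function name, its arguments and the returned strings are assumptions, and the cutoff n ≥ 30 for a "large" sample is the one used in the central limit theorem discussion.

```python
def sampling_distribution(sigma_known, n, parent_normal=False):
    """Which distribution to use for the sample mean, following the rules in these notes."""
    large = n >= 30
    if sigma_known:
        if parent_normal:
            return "normal"                   # normal parent: any sample size
        return "approximately normal" if large else "not determined"
    if large:
        return "approximately normal"         # sigma unknown, large sample
    if parent_normal:
        return "t with n - 1 degrees of freedom"
    return "not determined"                   # small sample, non-normal or unknown parent

print(sampling_distribution(sigma_known=True, n=10, parent_normal=True))   # normal
print(sampling_distribution(sigma_known=False, n=50))                      # approximately normal
print(sampling_distribution(sigma_known=False, n=12, parent_normal=True))  # t with n - 1 degrees of freedom
```

The "not determined" outcome corresponds to the case of a small sample from a population whose shape is unknown or not normal, where nothing can readily be said about the shape of the sampling distribution.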