Download Fundamental Sampling Distributions and Points Estimations

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

History of statistics wikipedia , lookup

Bootstrapping (statistics) wikipedia , lookup

Taylor's law wikipedia , lookup

German tank problem wikipedia , lookup

Central limit theorem wikipedia , lookup

Gibbs sampling wikipedia , lookup

Resampling (statistics) wikipedia , lookup

Student's t-test wikipedia , lookup

Transcript
STAT 319 – Probability and Statistics For Engineers
Lecture 6
Fundamentals of Sampling
Distributions and Point
Estimations
Engineering College, Hail University, Saudi Arabia
6.1 Definitions
1
Definitions
A population: Consist of the totality of
observations with which we are concerned
A sample is a subset of a population
Let X1 , X2 , … , Xn be n independent random
variables, each having the same probability
distribution f(x). We then define X1 , X2 , … ,
Xn to be random sample of size n from the
population f(x).
Random Samples
The rv’s X1,…,Xn are said to form a (simple
random sample of size n if
1. The Xi’s are independent rv’s.
2. Every Xi has the same probability
distribution.
distribution
2
Statistic
A statistic is any quantity whose value can be
calculated from sample data. Prior to obtaining
data, there is uncertainty as to what value of any
particular statistic will result.
or
Any function of the random variables constituting
of random sample called a statistic.
If X1 , X2 , … , Xn represent a random sample of
size n, then:
p mean is defined by
y the statistic
1)) the sample
X =
1 n
∑X i
n i =1
2) the sample variance is defined by the statistic
S2=
3
1 n
(X i − X ) 2
∑
n − 1 i =1
6.2 Sampling Distributions
and the Central Limit
Theorem
Sample Mean
Let X1,…, Xn be a random sample from a
distribution with mean value μ and standard
deviation σ . Then
( )
2
2. V ( X ) = σ X2 = σ
n
1. E X = μ X = μ
In addition, with To = X1 +…+ Xn,
E (To ) = n μ , V (To ) = nσ 2 , and σTo = n σ .
4
Normal Population Distribution
Let X1,…, Xn be a random sample from a
normal distribution with mean value μ and
standard deviation σ . Then for any n, X is
normally distributed with mean μ and
standard deviation σ = σ / n .
X
The Central Limit Theorem
Let X1,…, Xn be a random sample from a
distribution with mean value μ and variance σ 2 .
Then if n sufficiently large, X has
approximately a normal distribution with
2
μ X = μ and σ X2 = σ n , and To also has
approximately a normal distribution with
μTo = nμ , σ To = nσ 2 . The larger the value of
n, the better the approximation.
5
The Central Limit Theorem
X small to
moderate n
X large n
Population
distribution
μ
If n > 30, the Central Limit Theorem can
be used.
Developing the Distribution
Of the Sample Mean
6
Developing a
Sampling Distribution
• Assume there is a population …
• Population size N=4
A
B
C
D
• Random variable, x,
is age of individuals
• Values of x: 18,
18 20,
20
22, 24 (years)
Developing a
Sampling Distribution
(continued)
Summary Measures for the Population Distribution:
∑x
μ=
=
N
.3
18 + 20 + 22 + 24
= 21
4
σ=
7
P(x)
i
∑ (x
i
− μ)2
N
.2
.1
0
= 2.236
18
20
22
24
A
B
C
D
Uniform Distribution
x
Developing a
Sampling Distribution
Now consider all possible samples of size n=2
1st
Obs
2nd Observation
Ob
ti
18
20
22
24
(continued)
16 Sample
Means
18 18,18 18,20 18,22 18,24
20 20,18 20,20 20,22 20,24
1st 2nd Observation
Obs 18 20 22 24
22 22,18 22,20 22,22 22,24
18 18 19 20 21
24 24,18 24,20 24,22 24,24
20 19 20 21 22
16 possible samples
(sampling with
replacement)
22 20 21 22 23
24 21 22 23 24
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc.
Developing a
Sampling Distribution
Sampling Distribution of All Sample
Means
Sample Means
Distribution
16 Sample Means
1st 2nd Observation
Obs 18 20 22 24
18 18 19 20 21
8
(continued)
P(x)
.3
20 19 20 21 22
.2
22 20 21 22 23
.1
24 21 22 23 24
0
18 19
20 21 22 23
(no longer uniform)
24
_
x
Developing a
Sampling Distribution
(continued)
Summary Measures of this Sampling Distribution:
μx =
σx =
=
∑x
N
i
=
∑ (x
i
18 + 19 + 21 + L + 24
= 21
16
− μ x )2
N
(18 - 21)2 + (19 - 21)2 + L + (24 - 21)2
= 1.58
16
Comparing the Population with
its Sampling Distribution
Population
N=4
μ = 21
μx = 21
σ = 2.236
P(x)
.3
P(x)
.3
.2
.2
.1
.1
0
9
Sample Means Distribution
n=2
18
20
22
24
A
B
C
D
x
0
18 19
σ x = 1.58
20 21 22 23
24
_
x
Central Limit Theorem
a Normal Population
If a population is normal with mean μ and
standard deviation σ,
σ the sampling distribution
of
x
is also normally distributed with:
μx = μ
and
σx =
σ
n
Central Limit Theorem
a Normal Population
•Z value for the sampling distribution of
z=
( x − μ)
σ
n
Where
10
x
the sample mean
μ
the population mean
σ
population standard deviation
n
sample size
x
Sampling Distribution Properties
μx = μ
(i.e.
Normal Population
Distribution
x is unbiased )
μ
x
μx
x
Normal Sampling
Distribution
(has the same mean)
Sampling Distribution Properties
(continued)
•For sampling with replacement
σ x Decreases
As n increases
Smaller
sample
p size
Larger
sample size
μ
11
x
If the Population is not Normal
We can apply the Central Limit Theorem:
E
Even
if the
th population
l ti is
i nott normal,
l
…sample means from the population will be
approximately normal as long as the sample
size is large enough
…and the sampling distribution will have
σx =
μx = μ
σ
n
Central Limit Theorem
As the
A
th
sample
size gets
large
enough…
n↑
the sampling
distribution
becomes
almost normal
regardless of
shape of
population
x
12
If the Population is not Normal
(continued)
Sampling distribution
properties:
Population Distribution
Central Tendency
μx = μ
Variation
σx =
σ
n
Sampling Distribution
(becomes normal as n increases)
Larger
sample
size
Smaller
sample size
(Sampling with
replacement)
μx
Central Limit Theorem
13
x
μ
x
Example 6.1
In an Electronics company that manufactures
circuit boards,
boards the average imperfection
(defects) on a board is μ = 5 with a standard
deviation of σ = 2.34 when the production
process is under statistical control.
A random sample of n = 36 circuit boards has
been taken for inspection and a mean of
x = 6 defects per board was found.
What is the probability of getting a value of
x ≤ 6 if the process is under control?
Solution 6.1
Because the sample size is greater than 30, the Central Limit Theorem
can be used in this case even though the number of defects per
board follows a Poisson distribution.
Therefore, the distribution of the sample mean x is approximately
normal with the standard deviation :
The result Z = 2.56 corresponds
to 0.4948 on the table of
normal curve areas:
14
Z=2.56
P( ≤ 2.56)
P(z
2 56) = 0.9948
0 9948
Minitab Solution Example 6.1
15
Minitab Solution Example 6.1
Minitab
16
Continued
Example 6.2
An Electronics company that manufactures
resistors, the average resistance is μ = 100
Ohms with a standard deviation of σ = 10.
Find the probability that a random sample of n
= 25 resistors will have an average
resistance less than 95 Ohms
Example 6.3
The average number of parts that reach the end
of a production line defect
defect-free
free at any given
hour of the first shift is 372 parts with a
standard deviation of 7.
What is the probability that a random sample of
34 different productions’ first-shift hours would
yield a sample mean between 369 and 371
parts that reach the end of the line defect-free?
17
Example 6.4
Suppose
pp
a population
p p
has mean μ = 8 and
standard deviation σ = 3. Suppose a
random sample of size n = 36 is selected.
What is the probability that the sample mean
is between 7.8
7 8 and 8.2?
8 2?
6.4 Point Estimation
18
Point Estimator
A point estimator of a parameter θ is a single
number that can be regarded as a sensible
value forθ . A point estimator can be
obtained by selecting a suitable statistic and
computing its value from the given sample
data
data.
Unbiased Estimator
A point estimator θˆ is said to be an
unbiased estimator of θ if E (θˆ) = θ for
every possible value of θ . If θˆ is not
biased, the difference E (θˆ) − θ is called the
bias of θˆ .
19
The pdf’s of a biased estimatorθˆ1and an
unbiased estimatorθˆ for a parameter
2
θ.
pdf of θˆ2
pdf of θˆ1
θ
Bias of θ1
The pdf’s of a biased estimatorθˆ1and an
unbiased estimatorθˆ2 for a parameter θ .
pdf of θˆ2
pdf of θˆ1
θ
Bias of θ1
20
Some Unbiased
Estimators
If X1, X2,…, Xn is a random sample from a
distribution with mean μ , then X is an
unbiased estimator of μ .
When X is a binomial rv with pparameters n and
p, the sample proportion pˆ = X / n is an
unbiased estimator of p.
Some Unbiased
Estimators
Let X1, X2,,…,, Xn be a random sample
p from a
distribution with mean μ and variance σ .2
Then the estimator
σˆ = S
2
2
X
(
∑
=
is an unbiased estimator.
21
i
−X
n −1
)
2
Principle of Minimum Variance
Unbiased Estimation (MVUE)
Among all estimators of θ that are unbiased,
choose the one that has the minimum
variance. The resultingθˆ is called the
minimum variance unbiased estimator
(MVUE) off θ.
Graphs of the pdf’s of two
different unbiased estimators
pdf of θˆ1
pdf of θˆ2
θ
22
MVUE for a Normal Distribution
Let X1, X2,,…,, Xn be a random sample
p from a
normal distribution with parameters μ and σ
Then the estimator μˆ = X is the MVUE for
.
μ
A biased estimator that is preferable
to the MVUE
pdf of θˆ1 (biased)
pdf of θˆ2
(the MVUE)
θ
23
Standard Error
The standard error of an estimator
isθˆ
its standard deviation
σ θˆ = V (.θˆIf) the
standard error itself involves unknown
parameters whose values can be estimated,
substitution into
yields the estimated
σ θˆ
standard error of the estimator, denoted
σˆθˆ or sθˆ .
Confidence Intervals
An alternative to reporting a single value
for the parameter being estimated is to
calculate and report an entire interval of
plausible values – a confidence interval
(CI). A confidence level is a measure of
the degree of reliability of the interval.
interval
24
95% Confidence Interval
If after observing X1 = x1,…, Xn = xn, we
compute the observed sample mean x , then a
95% confidence interval forμ the mean of
normal population can be expressed if σ known
as:
σ
σ ⎞
⎛
x
−
1.96
⋅
,
x
+
1.96
⋅
⎜
⎟
n
n⎠
⎝
Other Levels of Confidence
z curve
shaded area = α / 2
1−α
− zα / 2
0
zα / 2
P ( − zα / 2 ≤ Z ≤ zα / 2 ) = 1 − α
25
Other Levels of Confidence
A 100(1 − α )% confidence interval for the
mean μ of a normal population when the
value of σ is known is given by
σ
σ ⎞
⎛
x
−
z
⋅
,
x
+
z
⋅
α /2
α /2
⎜
⎟
n
n⎠
⎝
Sample Size
The general formula for the sample size n
necessary to ensure an interval width w is
σ⎞
⎛
n = ⎜zα /2 ⋅ ⎟
w ⎠
⎝
26
2