Ch7 - YSU
Chapter 7
Sampling Distributions
Chapter Outline
• Selecting A Sample
• Point Estimation
• Introduction to Sampling Distributions
• Sampling Distribution of x̄
• Sampling Distribution of p̄
Introduction
• The reason we select a sample is to collect data to answer a research question about a population.
• Studying the entire population directly is often too costly and too time-consuming, while studying a portion of it, i.e., a sample, is much more manageable.
• The sample results provide only estimates of the values of the population characteristics.
• With proper sampling methods, the sample results can provide 'good' estimates of the population characteristics.
Sampling from A Finite Population
• A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.
• Replacing each sampled element before selecting subsequent elements is called sampling with replacement. Sampling without replacement is the procedure used most often.
• In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.
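As a minimal sketch of how computer-generated random numbers can automate the selection, the Python snippet below draws a simple random sample with or without replacement. The function name `simple_random_sample`, the account IDs 1 to 600, and the seed are illustrative assumptions, not part of the original slides.

```python
import random

def simple_random_sample(population, n, replace=False, seed=None):
    """Draw a simple random sample of size n from a finite population.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    if replace:
        # Sampling with replacement: an element may be selected more than once.
        return [rng.choice(population) for _ in range(n)]
    # Sampling without replacement: every subset of size n is equally likely.
    return rng.sample(population, n)

# Assumed population: account IDs 1..600 (IDs are made up for the example).
accounts = list(range(1, 601))
sample_without = simple_random_sample(accounts, 121, seed=42)
sample_with = simple_random_sample(accounts, 121, replace=True, seed=42)
```

Without replacement, no account can appear twice in the sample; with replacement, duplicates are possible.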
Sampling from An Infinite Population
• It's impossible to obtain a list of all elements in an infinite population.
• Such populations are often generated by an ongoing process where there is no upper limit on the number of units that can be generated.
• A random sample must be selected with the following conditions satisfied:
   - Each element selected comes from the population of interest.
   - Each element is selected independently.
Point Estimation
• Point estimation is a form of statistical inference.
• In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.
Sample statistics serve as point estimates of population parameters:
Sample mean: x̄  →  Population mean: μ
Sample variance: s²  →  Population variance: σ²
Sample proportion: p̄  →  Population proportion: p
Point Estimation
• Example: Checking Accounts
A local small bank has a total of 600 checking accounts. The average daily balance of all the checking accounts is $310 with a standard deviation of $66. The proportion of accounts with a daily balance of no less than $500 is 30%.
To find point estimates for the population, a random sample of 121 checking accounts at the bank is chosen. The sample shows an average daily balance of $306, a standard deviation of $61, and a proportion of accounts with a daily balance of no less than $500 of 27%.
The following table summarizes the point estimates from the sample of 121 checking accounts.
Summary of Point Estimates of A Simple Random Sample of 121 Checking Accounts
Population Parameter | Parameter Value | Point Estimator | Point Estimate
μ = Population mean account balance | $310 | x̄ = Sample mean account balance | $306
σ = Population std. deviation for account balance | $66 | s = Sample std. deviation for account balance | $61
p = Population proportion of account balance no less than $500 | .3 | p̄ = Sample proportion of account balance no less than $500 | .27
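The three point estimators can be computed directly from sample data. The sketch below is illustrative only: the ten balances in `toy_sample` are made up (the slides do not list the 121 sampled balances), and `point_estimates` is a hypothetical helper name.

```python
import statistics

def point_estimates(balances, threshold=500):
    """Return (sample mean, sample std. deviation, sample proportion).

    The proportion counts balances no less than `threshold`.
    """
    n = len(balances)
    x_bar = statistics.mean(balances)
    s = statistics.stdev(balances)  # sample std. deviation (n - 1 divisor)
    p_bar = sum(1 for b in balances if b >= threshold) / n
    return x_bar, s, p_bar

# Made-up balances for illustration; NOT the bank data from the example.
toy_sample = [120, 510, 305, 640, 250, 90, 500, 310, 275, 400]
x_bar, s, p_bar = point_estimates(toy_sample)
print(f"x_bar = {x_bar}, s = {s:.2f}, p_bar = {p_bar}")
```

Each returned value is a point estimate of the corresponding population parameter (μ, σ, and p).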
Sampling Distribution of x̄
The sampling distribution of x̄ is the probability distribution of all possible values of the sample mean x̄.
• Expected Value of x̄
E(\bar{x}) = \mu
where: μ = the population mean
When the expected value of the point estimator equals the population parameter, we say the point estimator is unbiased.
Sampling Distribution of x̄
• Standard Deviation of x̄
Finite Population:
\sigma_{\bar{x}} = \sqrt{\frac{N - n}{N - 1}} \left( \frac{\sigma}{\sqrt{n}} \right)
Infinite Population:
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
where:
σ_x̄ = the standard deviation of x̄
σ = the standard deviation of the population
n = the sample size
N = the population size
• A finite population is treated as being infinite if n/N < .05.
• \sqrt{(N - n) / (N - 1)} is the finite population correction factor.
• σ_x̄ is referred to as the standard error of the mean.
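The two standard error formulas can be checked numerically. The following is a small sketch, with `standard_error_mean` as a hypothetical helper name; the inputs σ = 66, n = 121, and N = 600 are taken from the checking accounts example.

```python
import math

def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    If the population size N is given, the finite population
    correction factor sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

se_infinite = standard_error_mean(66, 121)        # infinite-population formula
se_finite = standard_error_mean(66, 121, N=600)   # with the correction factor
print(f"infinite: {se_infinite:.3f}, finite: {se_finite:.3f}")
```

The correction factor is always at most 1, so the finite-population standard error is never larger than the infinite-population one.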
Central Limit Theorem
When the population from which we are selecting a random sample does not have a normal distribution, the central limit theorem is helpful in identifying the shape of the sampling distribution of x̄.
CENTRAL LIMIT THEOREM
In selecting random samples of size n from a population, the sampling distribution of the sample mean can be approximated by a normal distribution as the sample size becomes large.
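A quick simulation can make the theorem concrete. The sketch below assumes an exponential population with mean 1 (a strongly right-skewed choice made here for illustration, not taken from the slides) and compares the skewness of simulated sample means for n = 2 and n = 30. The helper names, seed, and replication count are likewise assumptions.

```python
import random
import statistics

def simulate_sample_means(draw, n, reps, seed=0):
    """Return `reps` sample means, each computed from n independent draws."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

def skewness(values):
    """Population skewness: mean of cubed standardized values."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])

def draw_exponential(rng):
    # Assumed non-normal population: exponential with mean 1.
    return rng.expovariate(1.0)

means_small = simulate_sample_means(draw_exponential, n=2, reps=5000)
means_large = simulate_sample_means(draw_exponential, n=30, reps=5000)
print(f"skewness n=2:  {skewness(means_small):.2f}")
print(f"skewness n=30: {skewness(means_large):.2f}")
```

A normal distribution has skewness 0, so the smaller skewness at n = 30 indicates that the sampling distribution of the mean is moving toward a normal shape as n grows.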
Sampling Distribution of x̄
• When the population follows a normal distribution, the sampling distribution of x̄ is normally distributed for any sample size.
• According to the Central Limit Theorem, when the sample size is large enough (at least 30 in most cases), the sampling distribution of x̄ can be approximated by a normal distribution.
• The point of studying the sampling distribution of x̄ is to better estimate the population mean μ.
Sampling Distribution of x̄
• Process of Statistical Inference
1. Population with mean μ = ?
2. A simple random sample of n elements is selected from the population.
3. The sample data provide a value for the sample mean x̄.
4. The value of x̄ is used to make inferences about the value of μ.
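Sampling Distribution of x̄
• Example: Checking Accounts
Suppose the population of checking accounts follows a normal distribution. Then the sampling distribution of the sample mean x̄ for random samples of 121 checking accounts is also normal, with a mean of $310 and a standard error of 6:
E(\bar{x}) = \$310
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{66}{\sqrt{121}} = 6
The example uses the infinite-population formula. Because n/N = 121/600 ≈ .20 exceeds .05, applying the finite population correction factor would give a smaller standard error, about 5.37.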
Sampling Distribution of x̄
• Example: Checking Accounts
What is the probability that a simple random sample of 121 checking accounts will provide a point estimate of the population mean account balance within ±$5 of the actual population mean (i.e., between $305 and $315)?
[Figure: sampling distribution of x̄ with the cutoff points $305 and $315 marked.]
Sampling Distribution of x̄
• Example: Checking Accounts
Since x̄ follows a normal distribution, we can first calculate the z values of both cutoff points as follows:
z = \frac{\bar{x} - E(\bar{x})}{\sigma_{\bar{x}}}
For \bar{x} = 305: z = \frac{305 - 310}{6} = -0.83
For \bar{x} = 315: z = \frac{315 - 310}{6} = 0.83
Sampling Distribution of x̄
• Example: Checking Accounts
Next, we look up the areas corresponding to the z values in the cumulative probability table for the standard normal distribution. Recall that the numbers in the table represent the areas under the standard normal curve to the LEFT of the z values.
z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
...
.5    .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
.6    .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
.7    .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
.8    .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
.9    .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
...
Sampling Distribution of x̄
• Example: Checking Accounts
The area under the standard normal curve to the left of z = 0.83:
Area = .7967
Sampling Distribution of x̄
• Example: Checking Accounts
The area under the standard normal curve to the left of z = -0.83:
Area = 1 - .7967 = .2033
Sampling Distribution of x̄
• Example: Checking Accounts
The area we are looking for is the area under the normal curve between the cutoff points. Therefore, the probability is calculated as
P(-.83 < z < .83) = P(z < .83) - P(z < -.83)
= .7967 - .2033
= .5934
The probability that the sample mean account balance will be between $305 and $315 is:
P(305 < x̄ < 315) = .5934
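The table lookup can be cross-checked in software. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) and takes the mean of 310 and standard error of 6 from the example. It computes the probability twice: once with z rounded to two decimals, as in the table, and once without rounding. The variable names are illustrative.

```python
from statistics import NormalDist

mu, se = 310, 6  # E(x-bar) and standard error from the example

# Table-style calculation: round z to two decimals, then use the
# standard normal cumulative distribution.
z = round((315 - mu) / se, 2)
std_normal = NormalDist()
p_table = std_normal.cdf(z) - std_normal.cdf(-z)

# Direct calculation on the sampling distribution, with no rounding of z.
sampling_dist = NormalDist(mu=mu, sigma=se)
p_exact = sampling_dist.cdf(315) - sampling_dist.cdf(305)

print(f"z = {z}, table-style P = {p_table:.4f}, unrounded P = {p_exact:.4f}")
```

The two results differ slightly in the third decimal place because the table method rounds z = 5/6 to 0.83 before the lookup.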
Sampling Distribution of x̄
• Example: Checking Accounts
[Figure: sampling distribution of x̄ centered at $310 with σ_x̄ = 6; the area between $305 and $315 is .5934.]
Relationship Between the Sample Size and the Sampling Distribution of x̄
• As the sample size increases, the standard error σ_x̄ becomes smaller:
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
• With a smaller standard error, the values of x̄ have less variability and tend to be closer to the population mean.
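Relationship Between the Sample Size and the Sampling Distribution of x̄
• Example: Checking Accounts
With n = 121: \sigma_{\bar{x}} = 66 / \sqrt{121} = 6
With n = 225: \sigma_{\bar{x}} = 66 / \sqrt{225} = 4.4
[Figure: the two sampling distributions of x̄, both centered at E(x̄) = $310.]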
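To see the effect of sample size on the earlier probability question, the sketch below recomputes the standard error and the probability that x̄ falls within ±$5 of μ for both sample sizes. It assumes a normal sampling distribution and the infinite-population formula, as in the example. `prob_within` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def prob_within(sigma, n, margin):
    """Return (standard error, P(|x_bar - mu| < margin)).

    Assumes a normal sampling distribution and se = sigma / sqrt(n).
    """
    se = sigma / math.sqrt(n)
    z = margin / se
    std_normal = NormalDist()
    return se, std_normal.cdf(z) - std_normal.cdf(-z)

se_121, p_121 = prob_within(66, 121, 5)
se_225, p_225 = prob_within(66, 225, 5)
print(f"n=121: se={se_121:.2f}, P={p_121:.4f}")
print(f"n=225: se={se_225:.2f}, P={p_225:.4f}")
```

The larger sample has the smaller standard error, so a greater share of possible sample means falls within $5 of the population mean.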
Sampling Distribution of p̄
The sampling distribution of p̄ is the probability distribution of all possible values of the sample proportion p̄.
• Expected Value of p̄
E(\bar{p}) = p
where:
p = the population proportion
Sampling Distribution of p̄
• Standard Deviation of p̄
Finite Population:
\sigma_{\bar{p}} = \sqrt{\frac{N - n}{N - 1}} \sqrt{\frac{p(1 - p)}{n}}
Infinite Population:
\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}}
• σ_p̄ is referred to as the standard error of the proportion.
• \sqrt{(N - n) / (N - 1)} is the finite population correction factor. When n/N is ≤ 5%, the correction factor is close to 1.
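As with the mean, both formulas are easy to evaluate in code. A minimal sketch follows, with `standard_error_proportion` as a hypothetical helper name; the inputs p = .3, n = 121, and N = 600 come from the checking accounts example.

```python
import math

def standard_error_proportion(p, n, N=None):
    """Standard error of the sample proportion.

    If the population size N is given, the finite population
    correction factor sqrt((N - n) / (N - 1)) is applied.
    """
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

se_p_infinite = standard_error_proportion(0.3, 121)
se_p_finite = standard_error_proportion(0.3, 121, N=600)
print(f"infinite: {se_p_infinite:.4f}, finite: {se_p_finite:.4f}")
```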
Sampling Distribution of p̄
The sampling distribution of p̄ can be approximated by a normal distribution whenever the sample size is large enough to satisfy the two conditions:
np > 5
and
n(1 - p) > 5
. . . because when these conditions are satisfied, the probability distribution of x in the sample proportion, p̄ = x/n, can be approximated by a normal distribution (and because n is a constant). Here x is the number of elements in the sample that have the characteristic of interest, which is a binomial count.
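The two conditions can be expressed as a one-line check. This is only a sketch: the function name `normal_approximation_ok` and the `threshold` parameter are assumptions made for illustration.

```python
def normal_approximation_ok(n, p, threshold=5):
    """Check the rule of thumb: n*p > threshold and n*(1 - p) > threshold."""
    return n * p > threshold and n * (1 - p) > threshold

# n = 121, p = .3 are the values from the checking accounts example.
print(normal_approximation_ok(121, 0.3))
# A small sample fails the check: 10 * 0.3 = 3 is not greater than 5.
print(normal_approximation_ok(10, 0.3))
```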
Sampling Distribution of p̄
• Making Inferences about a Population Proportion
1. Population with proportion p = ?
2. A simple random sample of n elements is selected from the population.
3. The sample data provide a value for the sample proportion p̄.
4. The value of p̄ is used to make inferences about the value of p.
Sampling Distribution of p̄
• Example: Checking Accounts
• In the example, n = 121 and p = .3. The sampling distribution of p̄ can be approximated by a normal distribution because:
np = 121(.30) = 36.3 > 5
and
n(1 - p) = 121(.70) = 84.7 > 5
• Because n/N = 121/600 ≈ .20 exceeds .05, the finite population formula is used. The standard error of p̄ is:
\sigma_{\bar{p}} = \sqrt{\frac{600 - 121}{600 - 1}} \sqrt{\frac{.3(1 - .3)}{121}} \approx 0.037
Sampling Distribution of p̄
• Example: Checking Accounts
[Figure: sampling distribution of p̄ centered at E(p̄) = 0.3 with σ_p̄ = 0.037.]