Download Sampling Theory

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

History of statistics wikipedia , lookup

Statistics wikipedia , lookup

Transcript
Chapter 7
Sampling Methods and
the Central Limit
Theorem
Prepared by:
Jean-Paul Olivier
Red River College
© 2009 McGraw-Hill Ryerson Limited
1 of 39
Learning Objectives
1. Explain why a sample is often the only feasible way
to learn something about a population
2. Describe methods to select a sample
3. Define and construct a sampling distribution of
the sample mean
4. Explain the central limit theorem
5. Use the central limit theorem to find probabilities
of selecting possible sample means from a
specified population
6. Define and construct a sampling distribution of a
proportion
© 2009 McGraw-Hill Ryerson Limited
2 of 39
Reasons To Sample
1. To contact the whole population would be
time consuming
2. The cost of studying all the items in a
population may be prohibitive
3. The physical impossibility of checking all
items in the population
4. The destructive nature of certain tests
5. The sample results are adequate
© 2009 McGraw-Hill Ryerson Limited
3 of 39
Methods of Sampling
1.
2.
3.
4.
Simple random sampling
Systematic random sampling
Stratified random sampling
Cluster sampling
© 2009 McGraw-Hill Ryerson Limited
4 of 39
Simple Random Sampling
• A sample selected so that each item or person in the
population has the same chance of being included
• Can use tables of random numbers
• Suppose a population consists of 845 employees of
Nitra Industries
– A sample of 20 employees is to be selected from that
population
© 2009 McGraw-Hill Ryerson Limited
5 of 39
Simple Random Sampling In
Excel
© 2009 McGraw-Hill Ryerson Limited
6 of 39
You Try It Out!
The following class roster lists the students enrolling in an introductory course in business statistics. Three
students are to be randomly selected and asked various questions regarding course content and method of
instruction.
a)
The numbers 00 through 45 are handwritten on slips of paper and placed in a bowl. The three numbers
selected are 31, 7, and 25. Which students would be included in the sample?
b)
Now use the table of random digits, Appendix E, to select your own sample
c)
What would you do if you encountered the number 59 in the table of random digits?
© 2009 McGraw-Hill Ryerson Limited
7 of 39
Systematic Random Sampling
• The items or individuals of the population are
arranged in some order. A random starting
point is selected and then every kth member of
the population is selected for the sample
• k is calculated as the population size divided by
the sample size
• When the physical order is related to the
population characteristic, then systematic
random sampling should not be used
© 2009 McGraw-Hill Ryerson Limited
8 of 39
Stratified Random Sampling
• A population is divided into subgroups, called strata,
and a sample is randomly selected from each stratum
• Once the strata are defined, we can apply simple
random sampling within each group or strata to
collect the sample
© 2009 McGraw-Hill Ryerson Limited
9 of 39
Cluster Sampling
A population is divided into clusters using naturally
occurring geographic or other boundaries. Then,
clusters are randomly selected and a sample is
collected by randomly selecting from each cluster
© 2009 McGraw-Hill Ryerson Limited
10 of 39
You Try It Out!
The following class roster lists the students enrolling in an introductory course in business
statistics. Three students are to be randomly selected and asked various questions regarding
course content and method of instruction.
a) Suppose a systematic random sample will select every ninth student enrolled in the class.
Initially, the fourth student on the list was selected at random. That student is numbered 03.
Remembering that the random numbers start with 00, which students will be chosen to be
members of the sample?
© 2009 McGraw-Hill Ryerson Limited
11 of 39
Sampling Error
• The difference between a sample statistic and
its corresponding population parameter
• Since the sample is a part or portion of the
population, it is unlikely that the sample mean
would be exactly equal to the population mean
• Similarly, it is unlikely that the sample standard
deviation would be exactly equal to the
population standard deviation
• We can therefore expect a difference between a
sample statistic and its corresponding population
parameter
© 2009 McGraw-Hill Ryerson Limited
12 of 39
EXAMPLE – Sampling Error
• Jane and Joe Miley operate
a bed and breakfast called
the Foxtrot Inn. There are
eight rooms available for
rent
• Rentals for June are
displayed
• During the month of June
there were 94 rentals so
the mean number of units
rented per night is 3.13
• The population mean,
μ=3.13
© 2009 McGraw-Hill Ryerson Limited
13 of 39
EXAMPLE – Sampling Error
• We take three random samples of five nights rented
– The first random sample of five nights resulted in the
following number of rooms rented: 4, 7, 4, 3, and 1
• The mean of this sample is 3.8 rooms, so the sampling error is
3.8-3.13=+0.67
– The second random sample of five nights resulted in the
following number of rooms rented: 3, 3, 2, 3, and 6
• The mean of this sample is 3.4 rooms, so the sampling error is
3.4-3.13=+0.27
– In the third sample, the sampling error was found to be
-1.33
© 2009 McGraw-Hill Ryerson Limited
14 of 39
Sampling Distribution of the
Sample Mean
• The sample means in the previous example
varied from one sample to the next. The mean
of the first sample of 5 days was 3.8 rooms,
and the second sample was 3.4 rooms. The
population mean was 3.13 rooms.
• If we organize the means of all possible
samples of five days into a probability
distribution, the result is called the sampling
distribution of the sample mean – a
probability distribution of all possible sample
means of a given sample size
© 2009 McGraw-Hill Ryerson Limited
15 of 39
EXAMPLE – Sampling
Distribution of the Sample Mean
Tartus Industries has seven
production employees
(considered the population).
The hourly earnings of each
employee are given in the table.
1.
2.
3.
4.
What is the population mean?
What is the sampling distribution of the sample mean for samples of
size 2?
What is the mean of the sampling distribution?
What observations can be made about the population and the sampling
distribution?
© 2009 McGraw-Hill Ryerson Limited
16 of 39
EXAMPLE – Sampling
Distribution of the Sample Mean
© 2009 McGraw-Hill Ryerson Limited
17 of 39
EXAMPLE – Sampling
Distribution of the Sample Mean
© 2009 McGraw-Hill Ryerson Limited
18 of 39
EXAMPLE – Sampling
Distribution of the Sample Mean
© 2009 McGraw-Hill Ryerson Limited
19 of 39
Relationship Between Population
Distribution and Sampling Distribution
1. The mean of the sample means is exactly
equal to the population mean
2. The variance of the sample means is equal to
the population variance divided by n.
3. The sampling distribution of the sample
means tends to become bell-shaped and to
approximate the normal probability
distribution
– This approximation improves with larger samples
© 2009 McGraw-Hill Ryerson Limited
20 of 39
You Try It Out!
The lengths of service of all the executives employed by
Standard Chemicals are:
a)
Using the combination formula, how many samples of
size 2 are possible?
b)
List all possible samples of two executives from the
population and compute their means.
c)
Organize the means into a sampling distribution.
d)
Compare the population mean and the mean of the
sample means.
e)
Compare the dispersion in the population with that in
the distribution of the sample mean.
f)
A chart portraying the population values follows. Is the
distribution of population values normally distributed
(bell-shaped)?
g)
Is the distribution of the sample mean computed in
part (c) starting to show some tendency toward being
bell-shaped?
© 2009 McGraw-Hill Ryerson Limited
21 of 39
The Central Limit Theorem
• If all samples of a particular size are
selected from any population, the sampling
distribution of the sample mean is
approximately a normal distribution
• This approximation improves with larger
samples
© 2009 McGraw-Hill Ryerson Limited
22 of 39
Illustration
© 2009 McGraw-Hill Ryerson Limited
23 of 39
EXAMPLE – The Central
Limit Theorem
Spence Sprockets Inc. faces some major decisions regarding health care
for these employees. Ed Spence decides to form a committee of five
representative employees to study the health care issue carefully and
make a recommendation. Ed feels the views of newer employees
toward health care may differ from those of more experienced
employees. If Ed randomly selects this committee, what can he expect
in terms of the mean years with Spence Sprockets for those on the
committee? How does the shape of the distribution of years of
experience of all employees (the population) compare with the shape
of the sampling distribution of the mean? The lengths of service of
the 40 employees currently on the Spence Sprockets Inc. payroll are:
© 2009 McGraw-Hill Ryerson Limited
24 of 39
EXAMPLE – The Central
Limit Theorem
This sample mean is 3.80 years.
© 2009 McGraw-Hill Ryerson Limited
25 of 39
EXAMPLE – The Central
Limit Theorem
© 2009 McGraw-Hill Ryerson Limited
26 of 39
Standard Error of the Mean
• The standard deviation of the sampling
distribution of the sample mean
• n is the number of observations in each sample
© 2009 McGraw-Hill Ryerson Limited
27 of 39
Important Conclusions
1. The mean of the distribution of the sample mean will
be exactly equal to the population mean if we are able to
select all possible samples of a particular size from a
given population. That is:
  X
Even if we do not select all samples, we can expect the mean of
the distribution of the sample mean to be close to the
population mean.
2. There will be less dispersion in the sampling
distribution of the sample mean than in the population.
If the standard deviation of the population is σ, the
standard deviation of the distribution of the sample
mean is σ/√n. Note that when we increase the size of
the sample, the standard error of the mean decreases.
© 2009 McGraw-Hill Ryerson Limited
28 of 39
Using the Sampling Distribution
of the Sample Mean (σ Known)
• If a population follows the normal distribution,
the sampling distribution of the sample mean will
also follow the normal distribution
• To determine the probability a sample mean falls
within a particular region, use:
© 2009 McGraw-Hill Ryerson Limited
29 of 39
EXAMPLE -- Using the Sampling
Distribution of the Sample Mean (σ Known)
The quality control department for Cola Inc. maintains records regarding
the amount of cola in its “Jumbo” bottle. The actual amount of cola
in each bottle is critical, but varies a small amount from one bottle to
the next. Cola Inc. does not wish to underfill the bottles, because it will
have a problem with truth in labelling. On the other hand, it cannot
overfill each bottle, because it would be giving cola away, hence
reducing its profits. Its records indicate that the amount of cola
follows a normal probability distribution. The mean amount per bottle
is 1 L and the population standard deviation is 12.8 ml. At 8 a.m. today
the quality control technician randomly selected 16 bottles from the
filling line. The mean amount of cola contained in the bottles is 1.006
L. Is this an unlikely result? Is it likely the process is putting too much
cola in the bottles?
© 2009 McGraw-Hill Ryerson Limited
30 of 39
EXAMPLE -- Using the Sampling
Distribution of the Sample Mean (σ Known)
Step 1: Find the z-value corresponding to the sample
mean of 1006ml given µ=1000ml and σ=12.8ml
© 2009 McGraw-Hill Ryerson Limited
31 of 39
EXAMPLE -- Using the Sampling
Distribution of the Sample Mean (σ Known)
Step 2: Find the probability of observing a Z equal
to or greater than 1.875
© 2009 McGraw-Hill Ryerson Limited
32 of 39
EXAMPLE -- Using the Sampling
Distribution of the Sample Mean (σ Known)
What do we conclude? It is unlikely, about a 3
percent chance, we could select a sample of 16
observations from a normal population with a
mean of 1 L and a population standard deviation
of 12.8 ml and find the sample mean equal to or
greater than 1.006 L. We conclude the process is
putting too much cola in the bottles. The quality
control technician should see the production
supervisor about reducing the amount of cola in
each bottle.
© 2009 McGraw-Hill Ryerson Limited
33 of 39
You Try It Out!
Refer to the Cola Inc. information. Suppose
the quality technician selected a sample of
16 Jumbo Cola bottles that averaged 0.996L.
What can you conclude about the filling
process?
© 2009 McGraw-Hill Ryerson Limited
34 of 39
Sampling Distribution of The
Proportion
• For the nominal scale of measurement
• A proportion is the fraction, ratio, or
percent indicating the part of the sample or
the population having a particular trait of
interest
© 2009 McGraw-Hill Ryerson Limited
35 of 39
EXAMPLE – Using the Sampling
Distribution of a Proportion
Alpha Corporation receives a shipment of flour
every morning from its supplier. The flour is in
40 kg bags and Alpha will reject any shipment
that is more than 5 percent underweight. The
foreman samples 50 bags with each shipment
and if the bags average more than 5 percent
underweight, the whole shipment is returned to
the supplier. What is the probability that in a
sample of 150 bags, the foreman will find that
less than 3 percent are underweight?
© 2009 McGraw-Hill Ryerson Limited
36 of 39
EXAMPLE – Using the Sampling
Distribution of a Proportion
© 2009 McGraw-Hill Ryerson Limited
37 of 39
You Try It Out!
Refer to the Alpha Corporation information.
Compute the probability that in a sample of
200, the foreman will find more than 4
percent of the bags underweight.
© 2009 McGraw-Hill Ryerson Limited
38 of 39
Chapter Summary
• There are many reasons for sampling a population
• In an unbiased sample, all members of the
population have a chance of being selected for the
sample.
• There are several probability sampling methods
including simple random sample, systematic sample,
stratified sample, and cluster sampling
• The sampling error is the difference between a
population parameter and a sample statistic
• The sampling distribution of the sample mean is a
probability distribution of all possible sample means
of a given size from a population
© 2009 McGraw-Hill Ryerson Limited
39 of 39