7.0 Sampling and Sampling Distribution
7.1 Sampling Methods
• Why?
• Types
• Sampling Frame
• Sampling Plan
7.2 Introduction to Sampling Distribution
Institut Matematik Kejuruteraan, UniMAP
BQT 173
Why sample?
• less time-consuming
• more practical
• less costly
• less cumbersome
Sampling Frame
• A listing of the items that make up the population
• Data sources such as population lists, directories, or maps
Sampling Plan
• The way a sample is selected
• Determines the quantity of information in the sample
• Allows you to measure the reliability or goodness of your inference
Types of Samples
• Nonprobability samples
  - Convenience
  - Judgement
• Probability samples
  - Simple Random
  - Systematic
  - Stratified
  - Cluster
Convenience Sampling
• Selection of elements is left primarily to the interviewer.
• Easy, inexpensive, and convenient to sample.
• Limitation: the sample is not representative of the population.
• Recommended for pre-testing questionnaires and for generating ideas, insights, or hypotheses.
• Eg: a survey conducted by a local TV station involving a small number of housewives, white-collar workers, and blue-collar workers. The survey attempts to elicit the respondents' responses towards a particular drama series aired on the channel.
Judgement Sampling
• The population elements are selected based on the judgment of the researcher.
• The researcher judges the selected elements to be representative of the population of interest.
• Eg: testing consumers' response towards a brand of instant coffee, Indocafe, at a wholesale market.
Definition:
If a sample of n units is drawn from a population of N units in such a way that every possible sample of size n has the same chance of being selected, the sample obtained is called a simple random sample.
N – number of units in the population
n – number of units in the sample
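A simple random sample can be sketched in a few lines of Python. This is only an illustration: the frame of unit ids 1 to 500, the sample size of 25, and the function name `simple_random_sample` are assumptions made for the example, not part of the course material.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)      # a seed makes the draw reproducible
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame: unit ids 1..N
N = 500
frame = list(range(1, N + 1))
sample = simple_random_sample(frame, n=25, seed=1)
print(sorted(sample))
```

Sampling without replacement is what gives every possible sample of size n the same chance of selection.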
Advantages of simple random sampling:
• Free of selection bias (every element is treated equally).
• Suitable when the target population is homogeneous in nature (the units have similar characteristics).
• Eg: canteen operators in primary schools, operators of cyber cafes, etc.
Disadvantages:
• The sampling frame may not be up to date, and sampling frames are costly to produce.
• Impractical for a large study area.
• Definition:
A sample obtained by randomly selecting one element from the first k elements in the frame and every kth element thereafter is called a 1-in-k systematic sample with a random start.
• k – interval size
• k = population size / sample size = N/n
Systematic Sampling
• Eg:
Say there are a total of N = 500 primary school canteen operators in the Klang Valley in 1997 who are registered with the Ministry of Education. We require a sample of n = 25 operators for a particular study.
Step 1: Make sure that the list is in random order (for example, the names sorted alphabetically).
Step 2: Divide the operators into intervals containing k operators each:
k = population size / sample size = 500/25 = 20
For every 20 operators, only one is selected to represent that interval.
Step 3: In the 1st interval only, select a number r at random. Say r = 7. The operator with id no. 7 will be the 1st sample. The operators selected in the remaining intervals depend on this number.
Step 4: After 7 has been selected, the remaining selections are the operators with id no. 27, 47, 67, and so on (every 20th id), up to 487.
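The four steps can be sketched in Python as follows. The function name `systematic_sample` and its `start` parameter are hypothetical, and the sketch assumes N is an exact multiple of n and that units carry ids 1 to N.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the ids of a 1-in-k systematic sample with a random start."""
    k = N // n                                   # interval size, k = N/n
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start r in 1..k
    # take the start, then every kth id after it
    return [start + j * k for j in range(n)]

# Canteen operator example: N = 500, n = 25, random start r = 7
ids = systematic_sample(500, 25, start=7)
print(ids[:4], "...", ids[-1])   # [7, 27, 47, 67] ... 487
```

Only the first id is random; once r is fixed, the whole sample is determined.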
Stratified Sampling
• Definition:
A sample obtained by separating the population elements into non-overlapping groups, called strata, and then selecting a random sample from each stratum.
• Suitable when there is large variation within the population.
• Eg: academic staff who can be categorized as lecturers, senior lecturers, associate professors, and professors.
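A minimal Python sketch of stratified sampling is given below. The slide does not say how many units to take from each stratum, so proportional allocation (the same fraction from every stratum) is an assumption here, as are the staff counts, the ids, and the function name `stratified_sample`.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    picked = {}
    for name, units in strata.items():
        size = max(1, round(len(units) * fraction))  # at least one unit per stratum
        picked[name] = rng.sample(units, size)
    return picked

# Hypothetical strata of academic staff (counts are made up for illustration)
strata = {
    "lecturer": [f"L{i}" for i in range(60)],
    "senior lecturer": [f"S{i}" for i in range(25)],
    "associate prof": [f"A{i}" for i in range(10)],
    "prof": [f"P{i}" for i in range(5)],
}
picked = stratified_sample(strata, fraction=0.2, seed=0)
print({name: len(units) for name, units in picked.items()})
```

Because every stratum contributes to the sample, small groups such as professors cannot be missed by chance.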
Cluster Sampling
• Definition:
A probability sample in which each sampling unit is a collection, or cluster, of elements.
• Advantages:
  - can be applied to a large study area
  - practical and economical
  - cost can be reduced: the interviewer only needs to stay within a specific area instead of travelling across the whole study area
• Disadvantages:
  - higher sampling error
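One common form, one-stage cluster sampling, can be sketched in Python as below: whole clusters are drawn at random and every element inside a chosen cluster is surveyed. The district names, the operator ids, and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Select m whole clusters at random and keep all of their elements."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)    # the sampling unit is the cluster
    return {name: clusters[name] for name in chosen}

# Hypothetical study area split into districts (clusters) of operators
clusters = {
    "District A": ["a1", "a2", "a3"],
    "District B": ["b1", "b2"],
    "District C": ["c1", "c2", "c3", "c4"],
    "District D": ["d1", "d2", "d3"],
}
surveyed = cluster_sample(clusters, m=2, seed=0)
for district, operators in surveyed.items():
    print(district, operators)
```

The interviewer visits only the chosen districts, which is where the cost saving comes from.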
7.2 Introduction to Sampling Distribution
• A sampling distribution is the probability distribution of a sample statistic based on all possible simple random samples of the same size from the same population.
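7.2.1 Sampling Distribution of the Mean ($\bar{X}$)
Sample mean:
$\bar{X} = \dfrac{X_1 + X_2 + X_3 + \dots + X_n}{n}$
The sample mean will have a theoretical sampling distribution with
mean, $\mu_{\bar{x}} = E(\bar{X}) = \mu_x$
variance, $\sigma_{\bar{x}}^2 = Var(\bar{X}) = \dfrac{\sigma_x^2}{n}\left(\dfrac{N-n}{N-1}\right)$
and the standard error of the sample mean $\bar{X}$ is
$\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$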
• The spread of the sampling distribution of $\bar{X}$ is smaller than the spread of the corresponding population distribution.
• The standard deviation of the sampling distribution of $\bar{X}$ decreases as the sample size increases.
• Consistent estimator: the standard deviation of a sample statistic decreases as the sample size increases.
Example:
• The mean wage per hour for all 5000 employees who work at a large company is RM17.50 and the standard deviation is RM29.90. Let $\bar{X}$ be the mean wage per hour for a random sample of employees selected from this company. Find the mean and standard deviation of $\bar{X}$ for a sample size of
(a) 30
(b) 75
(c) 200
• Solution
Population mean, $\mu_x = 17.50$
Population standard deviation, $\sigma_x = 29.90$
(a)
Mean, $\mu_{\bar{x}} = \mu_x = 17.50$
Standard deviation, $\sigma_{\bar{x}} = \dfrac{29.90}{\sqrt{30}}\sqrt{\dfrac{5000-30}{5000-1}} = 5.4590 \times 0.9971 = 5.4431$
Central Limit Theorem
• If we are sampling from a population that has an unknown probability distribution, the sampling distribution of the sample mean will still be approximately normal with mean $\mu_{\bar{x}} = \mu_x$ and standard deviation $\sigma_{\bar{x}}$, if the sample size is large.
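A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (clearly non-normal), draws are independent so no finite population factor applies, and the function name `sample_means` is hypothetical.

```python
import random
import statistics

def sample_means(draw, n, reps, seed=0):
    """Return the means of `reps` independent samples of size n."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]

# Skewed population: exponential with mean 1 and standard deviation 1
means = sample_means(lambda rng: rng.expovariate(1.0), n=50, reps=2000)

print("mean of sample means:", round(statistics.fmean(means), 3))
print("sd of sample means:  ", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):     ", round(1 / 50 ** 0.5, 3))
```

The mean of the sample means should land near 1 and their standard deviation near $1/\sqrt{50}$; a histogram of `means` would look roughly bell-shaped even though the population is skewed.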
Example
• An electronics firm has a total of 350 workers with a mean age of 37.6 years and a standard deviation of 8.3 years. If a sample of 45 workers is chosen at random from these workers, what is the probability that this sample will yield an average age of less than 40 years?
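Solution
Population mean, $\mu_x = 37.6$ years
Population SD, $\sigma_x = 8.3$ years
Population size, $N = 350$
Random sample size, $n = 45$
So the mean of the sample mean, $\mu_{\bar{x}} = 37.6$ years
Standard error, $\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = \dfrac{8.3}{\sqrt{45}}\sqrt{\dfrac{350-45}{350-1}} = 1.157$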
• Now, as the sample size $n = 45 \ge 30$, by the Central Limit Theorem we have
$\bar{X} \sim N(37.6,\ 1.157^2)$ and $Z = \dfrac{\bar{X} - 37.6}{1.157}$
$P(\bar{X} < 40) = P\left(Z < \dfrac{40 - 37.6}{1.157}\right) = P(Z < 2.07) = 0.9808$
• The probability that the sample will yield an average age of less than 40 years is 0.9808.
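The table lookup can be cross-checked with Python's standard library. This is a sketch: the variable names are assumptions, and because z is not rounded to two decimals before the lookup, the result can differ slightly from the table value 0.9808 in the last decimal places.

```python
import math
from statistics import NormalDist

N, n = 350, 45            # population size and sample size
mu, sigma = 37.6, 8.3     # population mean and standard deviation (years)

# Standard error with the finite population factor
se = sigma / math.sqrt(n) * math.sqrt((N - n) / (N - 1))
z = (40 - mu) / se        # standardise the cut-off of 40 years
p = NormalDist().cdf(z)   # P(Z < z) for a standard normal variable

print("SE =", round(se, 3), " z =", round(z, 2), " P =", round(p, 3))
```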
7.2.2 Sampling Distribution of Proportion ($\hat{p}$)
• The population and sample proportions, denoted by $p$ and $\hat{p}$ respectively, are calculated as
$p = \dfrac{X}{N}$ and $\hat{p} = \dfrac{x}{n}$
where
• N = total number of elements in the population;
• X = number of elements in the population that possess a specific characteristic;
• n = total number of elements in the sample; and
• x = number of elements in the sample that possess a specific characteristic.
Example
In a recent survey of 150 households, 54 had central air conditioning. Find $\hat{p}$ and $\hat{q}$, where $\hat{p}$ is the proportion of households that have central air conditioning and $\hat{q} = 1 - \hat{p}$.
Solution
Since $x = 54$ and $n = 150$,
$\hat{p} = \dfrac{x}{n} = \dfrac{54}{150} = 0.36 = 36\%$
$\hat{q} = \dfrac{n - x}{n} = \dfrac{150 - 54}{150} = \dfrac{96}{150} = 0.64 = 64\%$
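As a quick check, the two proportions can be computed in Python. The function name `sample_proportion` is hypothetical; it returns the sample proportion and its complement.

```python
def sample_proportion(x, n):
    """Return (p_hat, q_hat): the sample proportion and its complement."""
    p_hat = x / n
    return p_hat, 1 - p_hat

# Survey example: 54 of 150 households have central air conditioning
p_hat, q_hat = sample_proportion(54, 150)
print(f"{p_hat:.2f} {q_hat:.2f}")   # 0.36 0.64
```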