LECTURE 3
SAMPLING THEORY
EPSY 640
Texas A&M University
POPULATIONS
• A finite population consists of the actual
group of objects or persons, which we know
is potentially countable and finite.
• An infinite population is a
mathematical abstraction that is useful
because the properties of the population are
assumed or defined carefully.
POPULATIONS
• Parameter = characteristic of the
population.
• If a sample is drawn and the characteristic
computed, it will be a statistic for the
sample.
POPULATIONS
• Accessible vs. Target Populations.
• Target Population: the population we wish
to represent.
• We often cannot sample the target population
directly. Instead, we might be able to draw from all
public school grade 3 students in class
during a particular week in the school year.
This is our Accessible Population, the
population we have access to.
[Figure 4.1: Inferences from sample to populations. Diagram labels: Sample, Accessible population, Target population.]
Sampling Methods
• RANDOM SAMPLING
– SIMPLE
– STRATIFIED
– MULTISTAGE
– CLUSTER
• SYSTEMATIC SAMPLING
• CONVENIENCE (NONRANDOM) SAMPLING
RANDOM SAMPLING
• A sample is random if every member of a population has an
equal chance of being selected.
• Random sampling involves being able to define and count the
population.
• We can then use a process called randomization
to select the sample.
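As a minimal sketch of randomization in software, the snippet below draws a simple random sample from a numbered roster. The roster of 75 student identifiers, the function name `simple_random_sample`, and the seed value are illustrative assumptions, not part of the lecture.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)          # a fixed seed makes the draw replicable
    return rng.sample(list(population), n)

# Hypothetical roster: students identified by the numbers 1..75
roster = range(1, 76)
sample = simple_random_sample(roster, 20, seed=640)
print(sorted(sample))
```

Recording the seed plays the same role as recording the start location in a printed random number table: someone else can repeat the draw and check it.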
Table of Random Numbers
Location  RN    Location  RN
234       75    308       01   …..
235       13    309       26   …..
236       95    310       31   …..
237       22    311       69   …..
238       46    312       29   …..
239       86    313       98   …..
240       55    314       34   …..
241       59    315       17   …..
In selecting a sample of 20 students from a list of 75,
a random start point was selected as shown above. The
ad hoc rule was to go down the column to the bottom
and up the next. Random numbers above 75 (here 95, 86,
and 98) identify no one on the list and are skipped.
Thus, children with identifiers 75, 1, 13,
26, 31, 69, 22, 46, 29, 55, 34, 59, and 17 have been
selected within this section of the random number table.
The location value allows checking and replication of a
random sample selection process.
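The down-one-column, up-the-next rule can be sketched in code. The two lists are transcribed from the table section (locations 234-241 and 308-315). The function name `select_ids` is hypothetical, and skipping out-of-range or repeated numbers is an assumption about how the rule is applied.

```python
# Random numbers from the table: first column (locations 234-241)
# and second column (locations 308-315), each listed top to bottom.
col_a = [75, 13, 95, 22, 46, 86, 55, 59]
col_b = [1, 26, 31, 69, 29, 98, 34, 17]

def select_ids(columns, list_size):
    """Read down the first column, up the next, and so on.
    Keep a number only if it names someone on the list and is not a repeat."""
    selected = []
    for k, col in enumerate(columns):
        ordered = col if k % 2 == 0 else list(reversed(col))
        for rn in ordered:
            if 1 <= rn <= list_size and rn not in selected:
                selected.append(rn)
    return selected

ids = select_ids([col_a, col_b], list_size=75)
print(ids)       # → [75, 13, 22, 46, 55, 59, 17, 34, 29, 69, 31, 26, 1]
print(len(ids))  # → 13
```

This section of the table supplies 13 of the 20 identifiers needed, so the same rule would continue into further columns until 20 students are chosen.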
finite population correction
• fpc = 1 - n/N = 1 - f
where n = number in sample,
N = number in population, and
f = n/N is the sampling fraction
Finding survey sample size
(z/d)2
n =
________________________
1 + (1/N)(z/d)2
z = z-score for probability for confidence interval
required (usually 1.96 for .05 or 2.59 for .01)
 = SD of distribution (can be 1.0 for arbitrary units)
d = desired degree of error in SD units
Finding survey sample size: example
α = .05, N = 1,000,000, d = .1, σ = 1
$$ n = \frac{(1.96/.1)^2}{1 + (1/1000000)(1.96/.1)^2} = \frac{384.16}{1.00038} \approx 384.01 $$
which rounds to 384.
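The sample size calculation can be checked with a few lines of code. This is a sketch that assumes the formula n = (zσ/d)² / [1 + (1/N)(zσ/d)²]. The function name `survey_sample_size` and its default arguments are illustrative.

```python
def survey_sample_size(N, z=1.96, d=0.1, sigma=1.0):
    """Sample size for a population of size N:
    n = (z*sigma/d)^2 / (1 + (1/N) * (z*sigma/d)^2)."""
    n0 = (z * sigma / d) ** 2     # size needed if the population were infinite
    return n0 / (1 + n0 / N)      # reduced for a finite population

print(round(survey_sample_size(1_000_000), 2))   # → 384.01
for N in (100, 1000, 10000):
    print(N, round(survey_sample_size(N)))       # → 100 79, 1000 278, 10000 370
```

For small populations the denominator matters a great deal (79 of 100), while for very large populations the result levels off near (z/d)² = 384.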
                   Sample Size Required for d = .1
Population Size    for α = .05    for α = .01
20                 19             19
30                 28             29
40                 36             38
50                 44             46
75                 62             67
100                79             87
125                94             105
150                108            122
175                120            138
200                132            154
225                142            168
250                151            182
275                160            194
300                168            207
350                183            229
400                196            250
500                217            285
600                234            315
750                254            352
1000               278            399
1500               306            460
2000               322            498
2500               333            525
5000               357            586
7500               365            610
10000              370            623
100000             383            660
1000000            384            663
Table 4.2: Sample sizes required for various population sizes for
95% and 99% confidence intervals
Mean and standard deviation for
simple random sampling
• $E(\bar{x}_{\cdot}) = \mu$ (sample mean estimates
population mean unbiasedly)
• $V(\bar{x}_{\cdot}) = (1/n)\,s^2\,(1-f)$ (variance must be
corrected by the fpc, $1-f$)
• $s_{\bar{x}_{\cdot}} = \sqrt{V(\bar{x}_{\cdot})}$
= standard error of the mean = $s_m$
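A short sketch of the standard error calculation, with and without the finite population correction. The function name `standard_error_mean` is illustrative. The inputs (s = 5, n = 50, N = 77) are those of subpopulation A in Table 4.3.

```python
import math

def standard_error_mean(s, n, N=None):
    """Standard error of the mean for a simple random sample.
    s = sample SD, n = sample size, N = population size (None = infinite)."""
    f = 0.0 if N is None else n / N        # sampling fraction
    variance_of_mean = (s ** 2 / n) * (1 - f)
    return math.sqrt(variance_of_mean)

print(round(standard_error_mean(5, 50, 77), 3))   # → 0.419 (with fpc)
print(round(standard_error_mean(5, 50), 3))       # → 0.707 (no fpc)
```

Sampling 50 of only 77 cases leaves little of the population unobserved, so the correction shrinks the standard error considerably.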
[Figure: two curves, "Original Data Distribution" and "Distribution of Means", each with the horizontal axis marked at -1.96sm, -sm, μ, sm, and 1.96sm, and with the "Mean from a particular sample" indicated.]
Confidence interval
• Mean ± $z\,s_{\bar{x}_{\cdot}}$
• z = # SDs of normal distribution for the chosen
confidence level, usually 99% (α = .01) or 95% (α = .05)
• for real data: $\bar{x}_{\cdot} \pm 1.96\,s_{\bar{x}_{\cdot}}$ gives a
95% confidence interval around the mean:
– Interpretation: in 95 of 100 times we do the
study, the population mean will be in the
interval we construct.
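As a small worked sketch, the interval is the sample mean plus or minus z standard errors. The function name `confidence_interval` is illustrative. The inputs (mean 10, standard error .419) are taken from subpopulation A in Table 4.3.

```python
def confidence_interval(mean, se, z=1.96):
    """Return (lower, upper) limits: mean - z*se and mean + z*se."""
    half_width = z * se
    return mean - half_width, mean + half_width

low, high = confidence_interval(mean=10, se=0.419)
print(round(low, 2), round(high, 2))   # → 9.18 10.82
```

Passing z = 2.576 instead widens the interval to the 99% level.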
[Figure: "Distribution of Means" with a confidence interval drawn around the mean from a particular sample; axis marked at -1.96sm, -sm, μ, sm, and 1.96sm.]
Interpretation: in one event μ is either IN or
OUT of the confidence interval; for 100
intervals, it should be IN 95 times on
average.
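The "95 of 100 intervals" reading can be illustrated by simulation. This sketch draws many samples from a normal population and counts how often the interval covers μ. The population values (μ = 50, σ = 10), the sample size, the number of trials, and the seed are arbitrary assumptions chosen for illustration.

```python
import math
import random
import statistics

def coverage(mu=50.0, sigma=10.0, n=30, trials=2000, z=1.96, seed=3):
    """Fraction of simulated intervals (mean ± z*se) that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)  # infinite population, so no fpc
        if mean - z * se <= mu <= mean + z * se:      # is mu IN this interval?
            hits += 1
    return hits / trials

print(coverage())
```

The printed proportion should come out close to .95. It tends to fall slightly below that for small n, because z = 1.96 is used with an estimated SD.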
Stratified random sampling
• The population is divided into subpopulations, called strata.
• We then use simple random sampling for
each stratum.
• We can decide to sample proportionately
or disproportionately.
Stratified random sampling
• Proportionate sampling: percentage in
sample is same as in population
• Disproportionate sampling: percentage
in sample is different from that of
population
• Example: males and females (50% each in pop.), sample of 100.
– Proportional: 50 males, 50 females
– Disproportional: 75 males, 25 females
Stratified random sampling
• Example: Ethnicity of students in District:
80% Anglo, 10% Hispanic, 5% African American, 5% Native American
• Proportional for 200 student sample:
– 160 Anglo, 20 Hispanic, 10 African-American, 10 Native American
– May give poor estimates for the Hispanic, African American, and
Native American samples
• Disproportional:
– 50 Anglo, 50 Hispanic, 50 African-American,
50 Native American
– Will give estimates with similar confidence
intervals for all groups
– May need fpc for some groups
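The two allocation schemes can be sketched as follows, using the district percentages from the example. The function names `proportional_allocation` and `equal_allocation` are illustrative. Equal allocation is only one possible disproportional design.

```python
def proportional_allocation(shares, n):
    """Give each stratum the same share of the sample as it has of the population.
    Rounding can make the total differ slightly from n for other inputs."""
    return {group: round(share * n) for group, share in shares.items()}

def equal_allocation(shares, n):
    """A disproportional design: the same number of cases from every stratum."""
    per_group = n // len(shares)
    return {group: per_group for group in shares}

district = {"Anglo": 0.80, "Hispanic": 0.10,
            "African American": 0.05, "Native American": 0.05}

print(proportional_allocation(district, 200))
# → {'Anglo': 160, 'Hispanic': 20, 'African American': 10, 'Native American': 10}
print(equal_allocation(district, 200))
# → {'Anglo': 50, 'Hispanic': 50, 'African American': 50, 'Native American': 50}
```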
Mean for stratified random
sample.
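$$ \bar{x}_{\cdot\cdot\,\mathrm{est}} = \frac{\sum_{i=1}^{s} N_i\,\bar{x}_{i\cdot}}{N} $$
where $N_i$ = number of cases in the population stratum i,
$\bar{x}_{i\cdot}$ = sample mean for stratum i,
N = total number of cases in the entire population, and
s = number of strata.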
Mean for stratified random
sample- example
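3 strata, $N_1 = 1000$, $N_2 = 2000$, $N_3 = 3000$, so $N = 6000$
$\bar{x}_{1\cdot} = 70$, $\bar{x}_{2\cdot} = 80$, $\bar{x}_{3\cdot} = 90$
$$ \bar{x}_{\cdot\cdot\,\mathrm{est}} = \frac{\sum_{i=1}^{s} N_i\,\bar{x}_{i\cdot}}{N} = \frac{(1000 \times 70) + (2000 \times 80) + (3000 \times 90)}{6000} = 83.33 $$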
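The weighted mean in this example can be verified with a short sketch. The function name `stratified_mean` is illustrative.

```python
def stratified_mean(stratum_sizes, stratum_means):
    """Population-size-weighted mean: sum(N_i * mean_i) / N."""
    N = sum(stratum_sizes)
    weighted_total = sum(Ni * xi for Ni, xi in zip(stratum_sizes, stratum_means))
    return weighted_total / N

print(round(stratified_mean([1000, 2000, 3000], [70, 80, 90]), 2))   # → 83.33
```

The unweighted average of the three stratum means is 80. The estimate is higher because the largest stratum has the highest mean.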
SD for stratified random sample.
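• $V(\bar{x}_{\cdot\cdot\,\mathrm{est}}) = \sum_{i=1}^{s} N_i^2\, s^2_{\bar{x}_{i\cdot}} / N^2$
• $s_{\bar{x}_{\cdot\cdot}} = \sqrt{V(\bar{x}_{\cdot\cdot\,\mathrm{est}})}$,
• where $s^2_{\bar{x}_{i\cdot}} = V(\bar{x}_{i\cdot})$, the variance error of the
mean using the simple random sample
formula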
SD for stratified random sample.
SUBPOPULATION   N_i   n_i   Mean   s_i   s_m
A               77    50    10     5     .419
B               229   50    11     6     .751
C               738   50    12     7     .956
$\bar{x}_{\cdot\cdot\,\mathrm{est}} = (77 \times 10 + 229 \times 11 + 738 \times 12)/1044 = 11.63$
$V(\bar{x}_{\cdot\cdot\,\mathrm{est}}) = (77^2 \times .419^2 + 229^2 \times .751^2 + 738^2 \times .956^2)/1044^2 = .485$
$s(\bar{x}_{\cdot\cdot\,\mathrm{est}}) = .696$
Table 4.3: Calculation of stratified sample
mean and variance error of the mean
SD for stratified random sample.
The $s_m$ values in Table 4.3 come from $s_m^2 = (1/n_i)\,s_i^2\,(1-f_i)$, with $f_i = n_i/N_i$:
A: $.419^2 = (1/50)\,5^2\,(1-50/77)$
B: $.751^2 = (1/50)\,6^2\,(1-50/229)$
C: $.956^2 = (1/50)\,7^2\,(1-50/738)$
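The whole Table 4.3 calculation can be reproduced in a few lines. This is a sketch: the dictionary layout and the helper name `se_squared` are illustrative. The stratum standard errors are recomputed from s_i, n_i, and N_i rather than taken from the rounded table entries.

```python
import math

# Stratum data from Table 4.3: population size, sample size, sample mean, sample SD
strata = {
    "A": {"N": 77,  "n": 50, "mean": 10, "s": 5},
    "B": {"N": 229, "n": 50, "mean": 11, "s": 6},
    "C": {"N": 738, "n": 50, "mean": 12, "s": 7},
}

def se_squared(s, n, N):
    """Variance error of a stratum mean: (1/n) * s^2 * (1 - n/N)."""
    return (s ** 2 / n) * (1 - n / N)

N_total = sum(st["N"] for st in strata.values())
mean_est = sum(st["N"] * st["mean"] for st in strata.values()) / N_total
var_est = sum(st["N"] ** 2 * se_squared(st["s"], st["n"], st["N"])
              for st in strata.values()) / N_total ** 2
se_est = math.sqrt(var_est)

print(round(mean_est, 2), round(var_est, 3), round(se_est, 3))   # → 11.63 0.485 0.696
```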