Selecting Sampling Strategy
Chris Olsen
[email protected]
5/25/2017
Sampling Strategies
1
The sampling question du jour: just how tall IS Iowa
corn?
Professional basketball players’ view of Iowa Corn
(In our dreams…but what about reality??)
In order to measure the height of a stalk of corn we
must chop it down.
Measuring all the corn stalks is not on the table;
farmers being what they are, we have only one
cornstalk per Iowa county that we can utilize.
The economy being what it is, we can only afford to
chop down a small number of cornstalks.
Our problem: identify the counties.
The generation of a subset of a population is known as
“sampling” from the population.
We want our sample to be “representative” of the
population – if it is, we can make credible statements
about our population by generalizing from the sample.
We maximize the probability of getting a representative
sample by generating the sample randomly.
The randomization scheme allows the calculation of
probability distributions (“sampling” distributions) of
statistics.
A random (“probability”) sample is one such that each
population member has a greater-than-zero probability of
selection.
The basic random sampling strategy is the “simple”
random sample.
A simple random sample of size n from a population of size
N is a sample taken in such a way that each of the possible
C(N, n) (“N choose n”) samples is equally likely.
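As a rough illustration of the definition above, the sketch below counts the possible samples and draws one simple random sample in Python. The sample size of 10, the seed, and the variable names are assumptions made for illustration; only the figure of 99 counties comes from the slides.

```python
import math
import random

N = 99  # population size: Iowa counties, per the slides
n = 10  # sample size (assumed for illustration)

# Number of distinct samples of size n: "N choose n".
num_samples = math.comb(N, n)

# Draw one simple random sample. random.sample draws without
# replacement, so every n-subset of the counties is equally likely.
rng = random.Random(2017)  # arbitrary seed, only for repeatability
srs = sorted(rng.sample(range(1, N + 1), n))

print(num_samples)
print(srs)
```

Because the draw is without replacement, no county can appear twice in the sample.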
Iowa has 99 counties -- perfect for a random number
table…
Bubble, bubble, toil and trouble…
(All other states use calculators)
Math → Prb → randInt(1, 99) (A random county)
Math → Prb → randInt(1, 99, 10) (10 random counties)
Math → Prb → randInt(1, 99, 15) (15 random counties, anticipating bad luck)
Math → Prb → randInt(1, 99, 15) → L1 (Put in List1)
Oops?
A possible improvement on the simple random
sampling strategy is to take a stratified random
sample.
Stratified random sampling capitalizes on known
(or possibly suspected, but be careful) pockets of
homogeneity in the population.
Possible pockets: Golden Gopher Droppings?
If certain areas of the state have been contaminated by
a certain other state, and this might affect corn height,
we would want to take note of this in our sampling – in
advance!
We would not want to have each element of the sample
from a non-contaminated county;
we would not want to have each element of the sample
from a contaminated county;
we would want each part – contaminated and not –
represented in our sample.
To accomplish this representation, we could use a
stratified random sample.
Strata: pristine counties and contaminated counties.
60 pristine, 39 contaminated…
MathPrbrandInt(1, 60, 6)
(6 random pristine counties)
MathPrbrandInt(1, 39, 4)
(4 random contaminated counties)
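The same stratified draw can be sketched in Python: take a separate simple random sample inside each stratum. The function name `stratified_sample`, the dictionary keys, and the seed are assumptions for illustration. As on the slide, counties are numbered 1 to 60 within the pristine stratum and 1 to 39 within the contaminated one.

```python
import random

def stratified_sample(strata_sizes, allocation, rng):
    """Draw a simple random sample, without replacement, within each stratum."""
    return {
        name: sorted(rng.sample(range(1, size + 1), allocation[name]))
        for name, size in strata_sizes.items()
    }

rng = random.Random(60)  # arbitrary seed, only for repeatability

strata_sizes = {"pristine": 60, "contaminated": 39}  # counts from the slides
allocation = {"pristine": 6, "contaminated": 4}      # draws per stratum, from the slides

sample = stratified_sample(strata_sizes, allocation, rng)
print(sample)
```

The 6 and 4 split is close to proportional: a total sample of 10 from 99 counties would allocate about 10 × 60/99 ≈ 6.1 to the pristine stratum and 10 × 39/99 ≈ 3.9 to the contaminated one.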
Bravo!
In some circumstances we might have reason to
believe that the variability in the state is captured
in each region of the state.
As an example, consider the quadrennial blitz
known as the presidential primary season. All the
candidates don boots and overalls and milk the
standard cow.
This event generally causes all the news channels to
take a poll of Iowans on their opinions about the
milking technique of the candidates.
Newspersons would probably want to spend little time
“down on the farm,” and simple random sampling
could result in lots of drive time! So some sort of
improvement on the simple random sample is desired.
If the variability and representativeness (?) in the
state are captured in each region, why not just
randomly pick a few regions in the state?
Why not, that is, take a “cluster sample” (a random
sample of regions)?
randInt(1, 9, 2)L1
A special cluster sample: The transect.
Some newspersons might be unable to follow
complex directions. It is possible, however, they
can at least count up to some relatively small
number.
In this situation, systematic random sampling
might be considered.
The Systematic Sample: Getting it done
1. Decide on the sampling fraction. (Judgment)
2. Decide on a starting point. (Random!)
3. Count off by n’s… (Arithmetic)
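The three steps above can be sketched in Python. Here the counting interval is called k rather than n, and the function name `systematic_sample` and the seed are assumptions for illustration. The starting point is drawn at random from the first k units, and every k-th unit after it is taken.

```python
import random

def systematic_sample(N, k, rng):
    """Take every k-th unit from a list numbered 1..N, starting at a random point."""
    start = rng.randint(1, k)              # step 2: random start in 1..k
    return list(range(start, N + 1, k))    # step 3: count off by k

rng = random.Random(11)  # arbitrary seed, only for repeatability

# Step 1 (judgment): sample every 11th of the 99 counties.
every_11th = systematic_sample(99, 11, rng)
print(every_11th)
```

With 99 counties and an interval of 11, every possible starting point yields exactly 9 counties. When the interval does not divide the population size evenly, the sample size can differ by one depending on the start.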
Systematic – every 11
Systematic – every 5
Systematic – alphabetical, every 9
Questions before practice?
A Review of Sampling Strategies:
Simple Random Sample: The Basic Strategy, requires
a list
Stratified Random Sample: Capitalizes on pockets of
homogeneity
Cluster sample: Capitalizes on there being NO pockets
of homogeneity
Systematic sample: (Alleged) Population arrives
serially
Problem #1: The Cultured Crowd
Problem #2: Some populations are elusive
and/or difficult to sample:
Problem #2:
Pick your difficult population…
1. Homeless
2. Illegal aliens
3. Teen texters in school
Problem #3:
The Case of the Fiddler Crab…
(Uca pugilator)
Path integration, eh?
The Sex Ratio of Fiddler Crabs?
Just to be clear, the sex ratio we’re talking
about is
• males / females,
• NOT # events / time!!!
Just the facts, ma’am…
That big claw is for courtship & fighting, but is
dysfunctional for foraging. (Males fight & forage more?)
Crabs outside burrows are susceptible to predation.
Males are territorial and promiscuous.
Females forage closer to water sources than males.
Breeding females may be smaller.
The end!