The Importance of Understanding Sampling
In Research with a Focus on Business and
Human Resource Development in Thailand
By
Arthur Dryver, Wasita Boonsathorn,
and Kanogporn Narktabtee
National Institute of Development Administration
Outline
• What is sampling?
  - A representative sample
• Various sampling designs
• The role of sampling in quantitative research
• Example of the dangers of convenience sampling in relation to quantitative research
  - With a focus on Business and Human Resource Development in Thailand
• Concluding remarks
What is sampling?
• A sample, in the most general sense, is a set of units observed from all possible units.
• The aim in taking a sample is to learn about a larger group, the population.
• The sampling frame is the set of units from which the researcher will take the sample.
  - Ideally the sampling frame is the same as the population of interest.
  - In reality this is often not possible.
• The sampling design is the method by which the data is collected.
• The sampling design can aid in obtaining a representative sample of the population, that is, a sample whose attributes are similar to those of the population of interest.
Sampling
• More important than sample size is how the sample was taken.
Example:
• Imagine a survey of 10,000 people about their attitude toward the sky train and how often they take it, given to people as they enter or exit different sky train stations.
• Imagine the same survey given to 10,000 people living in Bang Na.
• Imagine the same survey given to 2,000 people from various randomly chosen locations throughout Bangkok.
• From these examples it is clear that how the data is collected has a great impact on the findings.
• Which survey results would you trust to represent people living in Bangkok?
Various sampling designs - Simple Random Sampling (SRS)
• Simple Random Sampling (SRS)
  - A simple random sample is a sample in which every possible sample of the chosen size is equally likely, so all units in the sampling frame have an equal probability of selection.
  - Many statistical tests rely on assumptions that are often met when a simple random sample is taken.
  - If the researcher wanted to collect a simple random sample of people in Bangkok, the researcher would need a list of all people in Bangkok.
  - Where would this list come from?
  - A telephone list is only a list of the people in Bangkok with a telephone.
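A minimal sketch of drawing a simple random sample, assuming a complete sampling frame is already available as a list. The frame contents, the sample size of 200, and the seed are all hypothetical and only serve the illustration.

```python
import random

# Hypothetical sampling frame: in practice this is the hard-to-obtain
# list of every unit in the population of interest.
frame = [f"person_{i:05d}" for i in range(10_000)]

# Fixed seed only so that the sketch is reproducible.
rng = random.Random(42)

# Draw 200 units without replacement; every subset of size 200
# is equally likely to be chosen.
sample = rng.sample(frame, k=200)

print(sample[:5])
```

Anyone missing from the frame (for example, people without a telephone when the frame is a telephone list) can never be selected, no matter how the draw is made.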
Various sampling designs - Stratified Sampling
• Stratified Sampling
  - The population is separated into groups, or strata, and within each stratum an SRS is taken.
  - Again, where would the list for each stratum come from in order to perform an SRS within it?
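The stratified design can be sketched in the same way, assuming a separate frame exists for each stratum. The stratum names, frame sizes, and per-stratum sample sizes below are hypothetical; proportional allocation is one common choice, not something the slides specify.

```python
import random

# Hypothetical frames: each stratum needs its own complete list of units.
frame_by_stratum = {
    "small": [f"small_{i}" for i in range(500)],
    "medium": [f"medium_{i}" for i in range(300)],
    "large": [f"large_{i}" for i in range(200)],
}

# Assumed allocation: 10% of each stratum (proportional allocation).
n_per_stratum = {"small": 50, "medium": 30, "large": 20}

rng = random.Random(7)  # fixed seed only for reproducibility

# Take an independent SRS, without replacement, inside every stratum.
stratified_sample = {
    name: rng.sample(units, k=n_per_stratum[name])
    for name, units in frame_by_stratum.items()
}

for name, units in stratified_sample.items():
    print(name, len(units))
```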
Various sampling designs - Convenience Sampling
• Convenience Sampling
  - A sample collected in whatever way is convenient.
  - For example, collecting surveys at a shopping mall, which yields a lot of data at a low price.
  - Note: statistical tests are inappropriate when performed on a convenience sample, because the probabilities of selection are unknown.
The role of sampling in quantitative research
• Statistics is at the heart of quantitative research, and sampling is a very important part of statistics.
• There is an old saying: “Garbage in, garbage out” (G.I.G.O.).
• To understand G.I.G.O. with reference to statistics and sampling, the reader can think of how a “garbage” sample would yield “garbage” statistical results.
The role of sampling in quantitative research (continued)
• For many research projects, collecting data takes a large portion of the overall project time.
• Once the data has been collected and entered, statistical software packages such as SPSS or Minitab can calculate the statistics within minutes.
• A very important fact, though, is that getting an answer and getting the right answer are not the same thing.
  - This is most evident when thinking of exams.
• Think about G.I.G.O. before deciding how to collect the data.
Example
• The authors have created a fictitious population consisting of 6 companies with large, medium, and small market capitalizations and varying annual revenue.
• The example research question is to estimate the average annual revenue of all companies in the population.
• The population mean annual revenue, μ, equals 5,283,333 for this example.
Example (continued) SRS of size 3
Example (continued) SRS of size 3 excluding large Market Cap Companies
• A bad sampling frame.
• A convenience sample often gives many units in the population a probability of selection equal to 0. In a true convenience sample (which this is not), the researcher does not know the probability of selection.
Example (continued) Stratified sample of size 3, Strata are Market Cap (small, medium, large)
• With a different sampling design, a different formula for estimating the population mean is required.
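For reference, the standard textbook estimators can be written as follows. The slide's own formula is not in the transcript, so the notation here is an assumption: L strata, N_h units in stratum h, N units in total, and \bar{y}_h the sample mean within stratum h.

```latex
\hat{\mu}_{\text{SRS}} = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i ,
\qquad
\hat{\mu}_{\text{st}} = \sum_{h=1}^{L} \frac{N_h}{N}\,\bar{y}_h
```

Under SRS every sampled unit gets the same weight, while under stratified sampling each stratum mean is weighted by that stratum's share of the population.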
Example (continued) Comparison of the sampling strategies (Pop. Mean, μ = 5,283,333)
[Comparison table not reproduced in the transcript. Slide annotations: "Unbiased"; "The maximum is less than the population mean."]
• SRS is unbiased but has a large standard deviation.
• Stratified is unbiased with a much smaller standard deviation.
• SRS excluding large market cap. is very biased, by more than 50%, and has the smallest standard deviation, which adds to how misleading the results are.
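The comparison can be sketched by enumerating every possible sample under each design. The authors' six revenue figures are not in the transcript, so the values below are invented for illustration; they were chosen only so that the population mean matches the stated 5,283,333, and the two-companies-per-stratum split is likewise an assumption.

```python
from itertools import combinations, product
from statistics import fmean, pstdev

# Hypothetical revenues (NOT the authors' data), grouped by market cap.
strata = {
    "small": [300_000, 400_000],
    "medium": [2_000_000, 3_000_000],
    "large": [11_000_000, 15_000_000],
}
population = [x for units in strata.values() for x in units]
mu = fmean(population)  # about 5,283,333

def summarize(estimates):
    """Return (bias, standard deviation, maximum) over all equally likely samples."""
    return fmean(estimates) - mu, pstdev(estimates), max(estimates)

# Design 1: SRS of size 3 from the full frame (20 possible samples).
srs = [fmean(s) for s in combinations(population, 3)]

# Design 2: SRS of size 3 from a bad frame that omits the large companies.
bad_frame = strata["small"] + strata["medium"]
restricted = [fmean(s) for s in combinations(bad_frame, 3)]

# Design 3: one unit per stratum; estimate = sum of (N_h / N) * stratum sample mean.
N = len(population)
stratified = [
    sum(len(strata[name]) / N * y for name, y in zip(strata, pick))
    for pick in product(*strata.values())
]

results = {
    "SRS": summarize(srs),
    "SRS excluding large": summarize(restricted),
    "Stratified": summarize(stratified),
}
for design, (bias, sd, biggest) in results.items():
    print(f"{design:20s} bias={bias:>13,.0f} sd={sd:>12,.0f} max={biggest:>12,.0f}")
```

With these invented values the enumeration shows the same qualitative pattern as the slide: both SRS and the stratified design have zero bias, the stratified design has the smaller standard deviation of the two, and the restricted frame has the smallest standard deviation of all while every one of its estimates falls below the population mean.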
Concluding Remarks
• In real life, sampling is often driven by funding, that is, by monetary concerns.
  - It is easy to understand cost and sample size, but not as easy to understand the importance of proper sampling versus convenience sampling.
• In real life only a single sample is taken, and the difference between the estimate and the truth cannot be quantified.
  - This is another reason many people go for quantity.
• Before collecting data, think G.I.G.O.: quality over quantity.
  - Statistical tests and p-values are often calculated on convenience samples, but they have no meaning when calculated on a convenience sample.
• Finally, the researchers note that taking a proper sample is much easier said than done.