Chapter Eleven: Sampling Fundamentals

Sampling Fundamentals
• Population – The set of all objects that possess a common set of characteristics with respect to a marketing research problem
• Sample – A subset of the population of interest
• Census – When the entire population is surveyed
• Parameter – A statistic generated from a census
• Statistic – A statistic generated from the sample
• Question: What would you survey if you wanted zero sampling error?

The One and Only Goal in Sampling!!
Select a sample that is as representative as possible, so that an accurate inference about the population can be made – the goal of marketing research

Sampling Fundamentals
• When Is a Census Appropriate?
  – when the population size is small
  – when the cost of making a sampling error is unacceptable
• When Is a Sample Appropriate?
  – when the population size is large and cannot be accessed in a reasonable amount of time and cost
  – when the population is reasonably homogeneous

Error in Sampling
• Total Error – Difference between the true value (in the population) and the observed value (in the sample) of a variable
• Sampling Error – Error due to sampling (depends on how the sample is selected, and its size)
• Non-sampling Error (dealt with in chapter 4) – Measurement Error, Data Recording Error, Data Analysis Error, Nonresponse Error
• The sample size decision involves a trade-off between sampling and non-sampling errors

Sampling Process: Identify Population
• Example: For a toy store – all households with children living in Charlotte
• This definition is ambiguous with respect to sampling units (children's ages) and geographical coverage (MSA or just metro area)
• Specify who in the household (HH) will provide the information
• Be as specific as possible
• Question: For a small bookstore in RH specializing in romance novels, define the population.
Sampling Process: Determine sampling frame
• List of population members from which the sample is drawn
• Example – to address a population of all advertising agencies in the US, the sampling frame would be the Standard Directory of Advertising Agencies
• Availability of lists is limited; lists may be obsolete and incomplete

Problems with sampling frames
• Subset problem
  – The sampling frame is smaller than the population
  – Another sampling frame needs to be tapped
• Superset problem
  – Sampling frame is larger than the population
  – A filter question needs to be posed
• Intersection problem
  – A combination of the subset and superset problems
  – Most serious of the three

Problems with sampling frames
• Population: all RH residents. Sampling frame: Phone book. Which problem do we have?
• Population: all advertising agencies with annual billings over $1 million. Sampling frame: The Standard Directory of Advertising Agencies. Which problem do we have? If not all ad agencies are required to register with the Directory, which problem do we have?

Sampling Process: Sampling Procedure
Probability Sampling
• Each member of the population stands a known, non-zero chance of getting into the sample (an equal chance under simple random sampling)
• Preferred due to greater representativeness
Nonprobability Sampling
• Convenience sampling – some members stand a better chance of being sampled than others

Sampling Procedure
Sampling Procedures
• Probability Sampling
  – Simple Random Sampling
  – Systematic Sampling
  – Stratified Sampling
  – Cluster Sampling
• Non-Probability Sampling
  – Convenience Sampling
  – Judgmental Sampling
  – Snowball Sampling
  – Quota Sampling
Here's the difference! Probability Sampling: Each subject has a known, non-zero probability of getting into the sample!
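The three sampling-frame problems described above (subset, superset, intersection) can be read as comparisons between two sets. The sketch below is only an illustration: the function name `classify_frame` and the example names are hypothetical, and a real frame would rarely be small enough to compare member by member.

```python
def classify_frame(population, frame):
    """Name the sampling-frame problem, if any, for a frame versus a population."""
    population, frame = set(population), set(frame)
    missing = population - frame  # population members the frame leaves out
    extra = frame - population    # frame entries that are not in the population
    if missing and extra:
        return "intersection"     # both problems at once, the most serious case
    if missing:
        return "subset"           # frame is smaller: tap another frame
    if extra:
        return "superset"         # frame is larger: pose a filter question
    return "exact match"


# Hypothetical data: town residents versus phone-book listings.
residents = {"Ann", "Bob", "Cal", "Dee"}
phone_book = {"Ann", "Bob", "Eve"}  # Cal and Dee are unlisted; Eve has moved away
print(classify_frame(residents, phone_book))
```

With these made-up sets the frame both omits residents and lists a non-resident, so the function reports the intersection problem.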
Probability Sampling Techniques
Simple Random Sampling
• Each population member, and each possible sample, has equal probability of being selected
• Using a random numbers table can help
  – Computer generated
  – Knowledge of a string of ten numbers gives no indication of what the eleventh number could be

Probability Sampling Techniques
• Accuracy – cost trade-off
• Sampling Efficiency = Accuracy/Cost
  – Sampling efficiency can be increased by either reducing the cost, increasing the accuracy or doing both
  – This has led to modifying simple random sampling procedures

Probability Sampling Techniques
Stratified Sampling
• The chosen sample is forced to contain units from each of the segments or strata of the population
• Sometimes groups (strata) are naturally present in the population
• Between-group differences on the variable of interest are high and within-group differences are low
• Then it makes better sense to do simple random sampling within each group and vary within-group sample size according to
  – Variation on variable of interest
  – Cost of generating the sample

Stratified Sampling – what strata are naturally present
• Winthrop students: their attitudes towards romance novels
• Winthrop students: their attitude towards marketing
• Winthrop students: their knowledge about football
• Winthrop students: their attitudes towards the food in Thompson Hall
• Increases accuracy at a faster rate than cost

Directly Proportionate Stratified Sampling
Consumer type      Group size   10 percent directly proportional stratified sample size
Brand-loyal        400          40
Variety-seeking    200          20
Total              600          60

Inversely Proportional Stratified Sampling
• 600 consumers in the population:
  – 200 are heavy drinkers
  – 400 are light drinkers
• If heavy drinkers' opinions are valued more and a sample size of 60 is desired, a 10 percent inversely proportional stratified sampling is employed.
Selection probabilities are computed as follows:
• Denominator: 600/200 + 600/400 = 3 + 1.5 = 4.5
• Heavy drinkers proportion and sample size: 3 / 4.5 = 0.667; 0.667 * 60 = 40
• Light drinkers proportion and sample size: 1.5 / 4.5 = 0.333; 0.333 * 60 = 20

Probability Sampling Techniques
Cluster Sampling
• Involves dividing population into subgroups
• Random sample of subgroups/clusters is selected and all members of subgroups are interviewed
• Advantages
  – Decreases cost at a faster rate than accuracy
  – Effective when sub-groups representative of the population can be identified

Cluster Sampling
• Geography knowledge of all middle school children in the US
• Attitudes to cell phones amongst all college students in the US
• Attitudes to football amongst all college students in the US
• Combine cluster and stratified sampling

A Comparison of Stratified and Cluster Sampling
Stratified sampling                                  Cluster sampling
Homogeneity within group                             Homogeneity between groups
Heterogeneity between groups                         Heterogeneity within groups
All groups are included                              Random selection of groups
Sampling efficiency improved by increasing           Sampling efficiency improved by decreasing
accuracy at a faster rate than cost                  cost at a faster rate than accuracy

Probability Sampling Techniques
• Systematic Sampling
  – Systematically spreads the sample through the entire list of population members
  – E.g. every tenth person in a phone book
  – Bias can be introduced when the members in the list are ordered according to some logic. E.g. listing women members first in a list at a dance club.
  – If the list is randomly ordered then systematic sampling results closely approximate simple random sampling
  – If the list is cyclically ordered then systematic sampling efficiency is lower than that of simple random sampling

Non-Probability Sampling
• Benefits
  – Driven by convenience
  – Costs may be less
• Common Uses
  – Exploratory research
  – Pre-testing questionnaires
  – Surveying homogeneous populations
  – Operational ease required

Non-Probability Sampling Techniques
• Judgmental
  – Selected according to 'expert' judgment
• Snowball
  – Each sample member is asked to recommend another
  – Used when populations are highly specialized / niched
• Convenience
  – 'whosoever is convenient to find'
• Quota
  – Judgment sampling with a stipulation that the sample include a minimum number from each specified sub-group

Sampling Process
• Identify Target Population
• Determine Sampling Frame
• Select Sampling Procedure
• Determine Sample Size

Determining Sample Size – Ad Hoc Methods
• Rule of thumb
  – Each group should have at least 100 respondents
  – Each sub-group should have 20 – 50 respondents
• Budget constraints
  – The question then is whether the study can be modified or cancelled
• Comparable studies
  – Find similar studies and use their sample sizes as guides

Factors determining sample size
• Number of groups and sub-groups in the sample that are to be analyzed
• Value of the study and accuracy required
• Cost of generating the sample
• Variability in the population

Revisit definitions
• Mean: the arithmetic average of scores on a variable
  – Only for interval / ratio level data
  – For categorical data, use the mode
• Variance: the average of the squared deviations of scores from the mean, a measure of the dispersion (spread) of scores on a variable.
  Based on how each response differs from the average response
• Standard deviation: Square root of the variance

Basic Statistics
                     Population   Sample
Mean                 µ            X bar
Variance             σ²           s²
Standard deviation   σ            s
Size                 N            n

Sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$

Sampling distribution of the means
• In most MR problems we are interested in knowing the mean (e.g. mean attitude scores, mean sales, etc.).
• We want a good estimate of the population mean
• Since the population mean is generally unknown, we must select the sample with care so that the sample mean will be the closest approximation to the population mean

Sampling distribution of means
• E(X bar) = µ
  – Sampling distribution of means
  – Larger sample sizes give a better approximation of the sampling distribution of means to the normal distribution

Sampling distribution of means
• σ(X bar) = σ / sq. root of n
  – I.e. the standard deviation of the sampling distribution of means (a.k.a. standard error of the mean) will equal the standard deviation of the population divided by the square root of the sample size
  – I.e. the greater the n, the smaller the σ of the sampling distribution, and the closer a sample mean tends to be to the population mean
• Therefore random sampling with a larger sample size gives a more accurate estimate

Normal Distribution
The entire area under the curve adds up to 100%

[Figure: example histogram of variable Q11ATRAN; vertical axis: Percent (0 to 40); horizontal axis: 1.00 to 7.00]

Interval Estimation
• X bar varies from sample to sample
• The difference between the sample mean (X bar) and the population mean (µ) is the sampling error
• X bar ± sampling error = interval estimate of population mean

Interval Estimate of the Population Mean
• X bar ± sampling error
• Or $\bar{X} \pm z\,\sigma/\sqrt{n}$, where σ is the population standard deviation and n is the sample size

Size of Interval Estimate
• Confidence level (e.g. 90%, 95%, 99%, etc.)
  – The percentage of repeated samples for which the confidence interval would contain the population mean
  – Lower confidence levels allow narrower intervals (for a given sample size) or smaller sample sizes (for a given interval width); higher confidence levels require wider intervals or larger sample sizes
• Population standard deviation
  – Generally unknown
  – Estimated from a previous study, a pilot, judgment or a worst case scenario

Sample-Size Question
• Size of the sampling error that is desired
• Confidence level
• Sample size: $n = \frac{z^2 \sigma^2}{(\text{sampling error})^2}$
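The interval estimate and the sample-size formula above can be tried out numerically. The following is a minimal sketch, assuming the population standard deviation is known (or estimated as the slide suggests) and taking z as the two-sided standard normal value for the chosen confidence level. The function names and the example figures (standard deviation 1.5, desired sampling error 0.2, sample mean 4.2) are hypothetical and not taken from the slides.

```python
import math
from statistics import NormalDist


def z_value(confidence):
    """Two-sided standard normal z for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def interval_estimate(sample_mean, sigma, n, confidence=0.95):
    """Return (low, high) for X bar +/- z * sigma / sqrt(n)."""
    half_width = z_value(confidence) * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


def required_sample_size(sigma, sampling_error, confidence=0.95):
    """n = z^2 * sigma^2 / (sampling error)^2, rounded up to a whole respondent."""
    z = z_value(confidence)
    return math.ceil((z * sigma / sampling_error) ** 2)


# Hypothetical inputs: sigma guessed from a pilot study, error of 0.2 scale points.
n = required_sample_size(sigma=1.5, sampling_error=0.2, confidence=0.95)
low, high = interval_estimate(sample_mean=4.2, sigma=1.5, n=n, confidence=0.95)
print(n, round(low, 2), round(high, 2))
```

Rounding up rather than to the nearest integer keeps the achieved sampling error at or below the desired one. Raising the confidence level while holding the sampling error fixed increases the required n, which matches the trade-off described above.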