Chapter 8 – Sample Size Determination

Introduction

What is a sample size? What does it have to do with market research and the population involved in the research? Why is sample size important? These questions are examined through various methods. The sample size depends on the budget and on the desired degree of confidence in the results. The reliability, sampling error, and confidence level of a sample all depend on how much time, money, and effort the researchers are willing to invest. Sample size is crucial because the research must be based on a feasible number of samples; in other words, the sample can be neither too small nor too large. It must sit at an optimal point between the two extremes for the research to be carried out successfully.

Purpose of Sample Size Determination

The purpose of sample size determination is to determine what percentage of a population should be reached for information and data collection. Sample size determination is an important step in planning statistical and professional marketing research; it helps researchers define the research project and accomplish it successfully. Researchers need to provide the relevant information, such as data analysis and questionnaire surveys, in order to determine the sample size. The main purpose of sample size determination is to reduce the need for empirical observation by asking: how can a small group serve as a sample without losing its usefulness? In other words, what is the smallest number of test subjects in a group that is enough to provide reliable data? In short, the aim of sample size determination is to plan how many subjects a statistical study needs to achieve a desired precision.
Factors Determining the Sample Size

The key to any qualitative or quantitative research is to generate enough data so that the objective(s) of the study are satisfied. Therefore, it is essential to obtain an appropriate sample size that will generate sufficient data. Sample size is often determined by, but is not limited to, the following factors:

- The purpose and expected results of the research (Shukla 58)
- The type of study that is being done – for example, qualitative studies and case studies tend to use very small samples, whereas descriptive studies and correlational studies often require very large samples (Nisha 8)
- The degree of variability in the population – the greater the variability, the larger the sample size will need to be (Shukla 58)
- The likely response rates – if these are believed to be low, the sample will need to be larger (Nisha 7)
- The required limit of accuracy or sampling error – the higher the accuracy, the larger the sample size required (Shukla 58)
- The required level of confidence that the results will fall within a certain range – the higher the level required, the larger the sample size required (Shukla 58)
- The incidence rate of the characteristic being researched – if this is common, the sample may be smaller (Nisha 7)
- The number of subgroups within the data – smaller groups will have larger sampling errors, and a larger sample might be needed to ensure that subgroups can be effectively analyzed (Shukla 58)
- Budget – always a factor in marketing decisions; the larger the sample size, the greater the cost (Nisha 8)
- Timings – the larger the sample size, the longer it takes to gather data and complete the analysis (Nisha 8)
- The risks attached to any decision – the greater the risk, the higher the level of accuracy required (Nisha 7)
- The nature of the research may indicate complex analysis of sub-samples (Shukla 58), for example women as opposed to men buying a certain product; if this is the case the sub-samples need to be large enough to
ensure statistical reliability

Sub-Sampling Sizes

Definition of subsampling: a second round of sampling drawn from the original (parent) sample. It can be carried out once or more than once.

- Generally, the subsample must be smaller than the parent sample. A large subsample deviates little from the parent sample; a small subsample deviates more. Taking a group of subsamples reduces the deviation problem of small subsamples.
- Sometimes the subsample sizes need to reflect the research topic of the parent sample. For example, suppose the topic is “average income after tax in the USA”. Because the population differs from state to state, the size of each state's subsample is set according to that state's share of the population. This brings the analysis outcome closer to reality.

Table 1. Relationship between total population, sample, and subsamples.
*The biggest circle is the total population; the medium-size circle is the sample; the three small circles represent subsamples.

Why subsampling?

- When the parent sample is large. In this situation, provided the parent sample was collected by random sampling, subsampling is an effective way to improve the efficiency of statistics and analysis. For example, a researcher gathers 10,000 street questionnaires on the topic “diet structure (vegetarian or not) in the age range 20-45 in BC” and randomly picks 500 surveys as a subsample. We expect the result of the subsample to represent the outcome of analysing the full sample; some deviation exists but is acceptable. If more than one subsample is taken, the results of the subsamples should be very close to one another.
- When the sample has a hierarchical structure. For example, for the research topic “structure of income after tax in Canada”, researchers need to classify interviewees into categories such as rich, middle class, ordinary workers, and people with no job. A subsample is taken from each category; each subsample narrows down the topic and helps further analysis of the research topic.
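The two subsampling ideas above (sizing subsamples by population share, and randomly drawing a small subsample from a large parent sample) can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names, the stratum labels, and the counts are hypothetical, and the largest-remainder rounding is one reasonable choice among several.

```python
import random


def proportional_allocation(stratum_sizes, subsample_size):
    """Split subsample_size across strata in proportion to each stratum's size.

    stratum_sizes: dict mapping a stratum label (hypothetical, e.g. a state)
    to the number of units in that stratum.
    """
    total = sum(stratum_sizes.values())
    # Exact (possibly fractional) share for each stratum.
    shares = {name: subsample_size * size / total
              for name, size in stratum_sizes.items()}
    # Round every share down, then give the leftover units to the strata
    # with the largest fractional remainders so the sizes sum correctly.
    allocation = {name: int(share) for name, share in shares.items()}
    leftover = subsample_size - sum(allocation.values())
    by_remainder = sorted(shares,
                          key=lambda name: shares[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


def simple_random_subsample(records, subsample_size, seed=None):
    """Draw subsample_size records without replacement from the parent sample."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(records, subsample_size)


if __name__ == "__main__":
    # Hypothetical parent sample of 10,000 questionnaires in three strata.
    print(proportional_allocation({"A": 5000, "B": 3000, "C": 2000}, 500))
    questionnaires = list(range(10000))  # stand-in for questionnaire records
    print(len(simple_random_subsample(questionnaires, 500, seed=1)))
```

With these made-up counts, the allocation is 250, 150, and 100 questionnaires for strata A, B, and C, and the random draw returns 500 distinct records.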
Preliminary Sampling

Purpose of Preliminary Sampling

Pretesting is an important step in determining the sample size of a population. It gives researchers an idea of what results to expect and how to construct the sample in order to obtain results that are accurate and interpretable. For example, pretesting is commonly used in studies that use questionnaires to obtain results. Questionnaires as a data collection method are prone to many problems, such as misunderstanding of the questions, confusion over a specific word, navigation from question to question, and the overall formatting of questions. Researchers hope that by conducting a test run, they can make the necessary adjustments in time for the real test (What Is A Survey).

Benefits of Preliminary Sampling

There are many benefits to conducting a preliminary sample. First, preliminary sampling helps confirm that the population being observed is also the one being sampled and that its distribution is being measured. Second, preliminary sampling helps reduce the amount of useless data collected, which commonly results from poor sample preparation and ineffective equipment. This saves time when collecting and organizing data and ensures that labour hours are put to use in other, more important tasks. Lastly, any issues with the sampling design or methods that become apparent during the pretest can be corrected to achieve the most accurate results during the actual sampling process (Preliminary Sampling).

What is a sample size?

A sample size is the number of respondents needed in any given study to give an accurate representation of the attitudes, opinions, beliefs, habits or characteristics of a given population. The appropriate sample size is directly related to the type of research being conducted. The accuracy of the research conducted tends to improve as the sample size increases.
If your sample size is too small, your findings may misrepresent the population, but if your sample size is too large, you could waste valuable resources.

The Confidence Level

The confidence level reflects the certainty that the answers of the sample truly reflect the answers of the total population (Israel). Most often, a 95% confidence level is sufficient for making business decisions. However, some companies choose a 90% confidence level due to timing, budgets, and response rates (Israel). Confidence intervals give us an estimate of the amount of error involved in our data. They tell us about the precision of the statistical estimates (e.g., means, standard deviations, correlations) we have computed (Israel). Confidence intervals are related to the concept of power: the wider the confidence interval, the less power a study has to detect differences between respondents in survey research (Israel).

Here are the z-scores for the most common confidence levels:

90% – Z Score = 1.645
95% – Z Score = 1.96
99% – Z Score = 2.576

The formula for the confidence interval is thus as follows:

$$\bar{x} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

where $\bar{x}$ is the sample mean and the second term is the margin of error described next.

Finding Error or Confidence Interval

When sample data is collected and the sample mean is calculated, that sample mean is typically different from the population mean (How to). This difference between the sample and population means can be thought of as an error (How to). The margin of error $E$ is the maximum difference between the observed sample mean and the true value of the population mean:

$$E = Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

where:

$Z_{\alpha/2}$ is known as the critical value, the positive value that is at the vertical boundary for the area of $\alpha/2$ in the right tail of the standard normal distribution.
$\sigma$ is the population standard deviation.
$n$ is the sample size.

Sample Question: Suppose you’re the manager of an ice cream shop, and you’re training new employees to be able to fill the large-size cones with the proper amount of ice cream (10 ounces each).
You want to estimate the average weight of the cones they make over a one-day period, including a margin of error. Instead of weighing every single cone made, you ask each of your new employees to spot check the weights of a random sample of the large cones they make and record those weights on a notepad. For n = 50 cones sampled, the sample mean was found to be 10.3 ounces. Suppose the population standard deviation is 0.6 ounces. What’s the margin of error? (Assume you want a 95% level of confidence.) (Rumsey)

Solution: (Rumsey)

$$E = Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{0.6}{\sqrt{50}} = (1.96)(0.0849) = 0.17$$

The margin of error is therefore about 0.17 ounces.

Formula For Calculating A Sample

$$n = \left[\frac{Z_c\,\sigma}{E}\right]^2$$

Where:

$n$ is the sample size.
$Z_c$ is known as the critical value, the positive value that is at the vertical boundary of the right-tail area of the standard normal distribution for the confidence level $c$.
$\sigma$ is the population standard deviation.
$E$ is the margin of error.

Sample Question: The college president asks the statistics teacher to estimate the average age of the students at their college. How large a sample is necessary? The statistics teacher would like to be 99% confident that the estimate is accurate within 1 year. From a previous study, the standard deviation of the ages is known to be 3 years. (Bluman)

Solution: Substituting in the formula, one gets

$$n = \left[\frac{Z_c\,\sigma}{E}\right]^2 = \left[\frac{(2.58)(3)}{1}\right]^2 = 59.9$$

which is rounded up to 60. Therefore, to be 99% confident that the estimate is within 1 year of the true mean age, the teacher needs a sample size of at least 60 students. (Bluman)

Determining Sample Size for a Proportion

The population of interest has a certain characteristic that appears in a certain proportion p. To estimate p we draw a representative sample of size n from the population and count the number of individuals (X) in the sample with the characteristic we are looking for. This gives us the estimate X/n:

$$\hat{p} = \frac{X}{n} = \frac{\text{count}}{\text{total}}$$

That is: total number of successes/total number of observations in the sample.
$$n = \frac{(Z_{\alpha/2})^2\,p(1-p)}{E^2}$$

Where:

$n$ is the sample size.
$Z_{\alpha/2}$ is known as the critical value, the positive value that is at the vertical boundary for the area of $\alpha/2$ in the right tail of the standard normal distribution.
$E$ is the margin of error.
$p$ is the proportion that supports your hypothesis.
$1 - p$ is the proportion that doesn’t support your hypothesis.

Sample Question: (Anderson) A national survey of 900 women golfers was conducted to learn how women golfers view their treatment at golf courses in the United States. The survey found that 450 of the women golfers were satisfied with the availability of tee times.

Solution: The computation uses $p = 450/900 = 0.5$, $Z_{\alpha/2} = 1.96$ (the 95% confidence level), and $E = 0.025$:

$$n = \frac{(Z_{\alpha/2})^2\,p(1-p)}{E^2} = \frac{(1.96)^2(0.5)(1-0.5)}{(0.025)^2} = 1536.6$$

which is rounded up to 1537.
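The chapter's three formulas (margin of error, sample size for a mean, and sample size for a proportion) can be checked with a short Python sketch. The function names are illustrative assumptions rather than anything from the cited sources, and the critical values are computed with the standard library's `statistics.NormalDist` instead of being read from a table.

```python
from math import ceil, sqrt
from statistics import NormalDist


def critical_value(confidence):
    """Two-sided critical value for a confidence level such as 0.95."""
    alpha = 1 - confidence
    # The value with an area of alpha/2 to its right under the standard normal.
    return NormalDist().inv_cdf(1 - alpha / 2)


def margin_of_error(z, sigma, n):
    """E = z * sigma / sqrt(n), for a mean with known population sigma."""
    return z * sigma / sqrt(n)


def sample_size_for_mean(z, sigma, error):
    """n = (z * sigma / E)^2, rounded up to a whole respondent."""
    return ceil((z * sigma / error) ** 2)


def sample_size_for_proportion(z, p, error):
    """n = z^2 * p * (1 - p) / E^2, rounded up to a whole respondent."""
    return ceil(z ** 2 * p * (1 - p) / error ** 2)


if __name__ == "__main__":
    for level in (0.90, 0.95, 0.99):
        print(level, round(critical_value(level), 3))
    # Ice cream cones: z = 1.96, sigma = 0.6 ounces, n = 50.
    print(round(margin_of_error(1.96, 0.6, 50), 2))
    # Student ages: z = 2.58, sigma = 3 years, E = 1 year.
    print(sample_size_for_mean(2.58, 3, 1))
    # Women golfers: z = 1.96, p = 0.5, E = 0.025.
    print(sample_size_for_proportion(1.96, 0.5, 0.025))
```

Run as a script, it prints critical values of 1.645, 1.96, and 2.576 for the 90%, 95%, and 99% levels, followed by 0.17, 60, and 1537 for the three worked examples. Rounding up with `ceil` reflects the convention that a fractional respondent cannot be sampled.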