Survey
Cluster Sampling
Module 3 Session 8

Purpose of the session
• To demonstrate how a cluster sample is selected in practice
• To demonstrate how parameters are estimated under cluster sampling
We do this for clusters of the same size and for clusters of different sizes. The practicalities of cluster sampling are also discussed.

Introduction
Simple random sampling (SRS) is not always appropriate!
Example
• Population of N=324 households
• Households arranged into 36 "villages" of 9 households each
• Costly to travel between villages
• Cheap to travel between households within a village
• Taking an SRS of n=27 households is a "costly" strategy

Cluster sampling
Example (cont.)
• Each village is a primary sampling unit (PSU)
• Each household in a village is a secondary sampling unit (SSU)
• Take a sample of villages
• Sample all households within the selected villages
This is one-stage cluster sampling.

Cluster sampling
Cluster sampling is useful when:
• The structure of the units is hierarchical (e.g. villages and households within villages)
• A sampling frame does not exist at SSU level (it may only exist at PSU level)
• Cost matters: in the example, cluster sampling is cheaper than SRS for the same sampling effort

Illustration: Estimation
Cluster sampling: 3 villages out of 36 selected using SRS.
Income from sale of goods is recorded for each household and totalled for each village.
Estimates:
• Mean village income is 256.7
• Total income for the area is 9240 (36 villages × the mean village income)

In practice…
• Units within a cluster tend to be more similar to each other than to units in other clusters
• Cluster sampling therefore often leads to less precise estimates than SRS (the opposite concept to stratification)
• There is a trade-off between convenience and precision: if cluster sampling is cheap to do, a larger sample could be taken to help improve precision
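The estimates on the "Illustration: Estimation" slide can be reproduced with a short sketch. The slides do not list the three sampled village totals, so the values below are hypothetical and were chosen only because they sum to 770, which is what a mean of 256.7 over 3 villages implies. The function name `cluster_estimates` is also just an illustrative choice.

```python
def cluster_estimates(sampled_totals, clusters_in_population):
    """One-stage cluster sampling with equal-sized clusters chosen by SRS.

    Returns (mean total per cluster, estimated population total).
    """
    mean_per_cluster = sum(sampled_totals) / len(sampled_totals)
    # Scale the mean cluster total up to all clusters in the population.
    estimated_total = clusters_in_population * mean_per_cluster
    return mean_per_cluster, estimated_total


# HYPOTHETICAL village income totals (not given in the slides).
sampled_village_totals = [250, 240, 280]

mean_village_income, estimated_area_total = cluster_estimates(
    sampled_village_totals, clusters_in_population=36
)
print(round(mean_village_income, 1), round(estimated_area_total))
```

Because every village has 9 households, dividing the mean village income by 9 would give the estimated mean income per household.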
Selecting the PSUs
• In this first (unrealistic) example, the villages all have the same number of households, so we select villages using simple random sampling
• In general the PSUs (villages) may not have the same number of SSUs (households)
• We might then want to select PSUs with probability proportional to size (PPS). This gives a large PSU a greater probability of being selected than a small PSU

PPS sampling (with replacement)
Example: M=8 villages (PSUs) of different sizes. We want to sample 3 of them (m=3).
Assume interest is still in income from sale of goods (recorded for households and totalled for each village).
Larger villages are likely to have higher incomes, and smaller villages lower incomes.

PPS sampling (cont.)
240 households (SSUs) in the population are arranged in the villages as follows:

PSU (village no.)            1     2     3     4     5     6     7     8
SSUs (no. of households)    10    10    20    20    40    40    50    50

The probability of each village being selected (p_i) is its number of households divided by 240:

PSU      1      2      3      4      5      6      7      8
p_i    1/24   1/24   1/12   1/12   1/6    1/6    5/24   5/24

PPS sampling (cont.)
Step 1: Calculate the cumulative sum of the SSUs

PSU      1     2     3     4     5     6     7     8
Sum     10    20    40    60   100   140   190   240

Step 2: Draw a number at random from 1, 2, …, 240
This determines which village is selected: the village is the first one whose cumulative sum is at least the number drawn, e.g. 48 falls in Village 4, and 190 in Village 7.

PPS sampling (cont.)
Step 3: Replace the number and repeat to select the other villages
The three numbers might be 33, 174 and 137, giving Villages 3, 7 and 6
Step 4: Sample all households in the selected villages
The calculation of the estimated total income for the area then weights each village according to its size.
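Steps 1 to 3 can be sketched in Python as follows, using the village sizes from the example. The function names are illustrative. The slides do not spell out the weighted estimator, so `estimated_total` assumes the standard with-replacement PPS (Hansen-Hurwitz) form, the average of each sampled village total divided by its selection probability. The income figures used to exercise it are hypothetical.

```python
import random
from bisect import bisect_left
from fractions import Fraction
from itertools import accumulate

village_sizes = [10, 10, 20, 20, 40, 40, 50, 50]    # households per village
total_households = sum(village_sizes)               # 240
selection_probs = [Fraction(s, total_households) for s in village_sizes]

# Step 1: cumulative sum of the SSUs.
cumulative = list(accumulate(village_sizes))


def village_for(number):
    """Step 2: map a number in 1..240 to a village number (1-based).

    The village is the first one whose cumulative sum is >= number.
    """
    return bisect_left(cumulative, number) + 1


def pps_sample(m, rng):
    """Step 3: draw m numbers with replacement and look up their villages."""
    draws = [rng.randint(1, total_households) for _ in range(m)]
    return draws, [village_for(d) for d in draws]


def estimated_total(sampled_totals, sampled_villages):
    """ASSUMED estimator (Hansen-Hurwitz): mean of total_i / p_i."""
    m = len(sampled_totals)
    weighted = (t / selection_probs[v - 1]
                for t, v in zip(sampled_totals, sampled_villages))
    return float(sum(weighted) / m)


# The draws quoted in the slides.
example_villages = [village_for(d) for d in (33, 174, 137)]
print(example_villages)

# A fresh random sample of m=3 villages (seed chosen arbitrarily).
draws, villages = pps_sample(3, random.Random(2024))

# HYPOTHETICAL village income totals: 2 units of income per household.
print(estimated_total([40, 100, 80], example_villages))
```

With the hypothetical totals, every village contributes the same value of total divided by probability, so the estimate equals 240 households × 2 = 480 exactly. This is the situation in which PPS sampling is most efficient: village totals roughly proportional to village size.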