Chapter 8 Random Sampling: Planning Ahead for Data Gathering

Chapter 8
Random Sampling and Sampling Distributions
Using a Small Group to Represent the Population

Importance of TS 3

“…the need of the great society for instruments of analysis by which an invisible and most stupendously difficult environment can be made intelligible.”
Walter Lippmann, Public Opinion

Desired Properties of the Instrument
1. Efficiency
   a. Cost
   b. Speed (short cycle)
2. Reasonable Reliability
3. Fairness

Target Populations for TQM

[Figure 1: Statistics for Total Quality Management. Diagram labels: Global Competition; Customer Satisfaction; Joiner Triangle (Quality, Scientific Approach, All One Team); Scientific Approach: 1. PDCA Cycle, 2. Data Driven; Statistical Process Management; Employee Survey; Customer Survey.]

Statistical Thinking for Management
1. Identify the relevant population.
2. Collect data using a valid design.
3. Make inferences on the key characteristics of the target population.
4. Formulate actions using sensible criteria, including experimentation for process improvement.

Strategy for Tracking Target Populations
1. Focus on a few “key parameters”:
   • The population mean
   • The population proportion
2. Collect a set of “representative” observations; the number of observations is the sample size, n.
3. “Approximate” the parameter value with reasonable accuracy.
“Statistical Inference” is the instrument for implementing this strategy.

Examples
1. The CFO of a credit card company wants to determine the proportion of cardholders who pay more than the required minimum monthly payment.
2. A national credit company wants to learn the proportion of people using the company’s credit cards for rental payments.
3. A marketing manager of a cruise company would like to know the average family vacation expenditure for the target customer group.
4. A newly opened upscale supermarket wants to find out the percentage of customers in its trade area who redeem store coupons for free grocery items.
5. A Human Resources Director wants to learn the percentage of employees who believe that top management reads employees' suggestions.
6. TIME/CNN wants to report the popularity of a certain political measure among US voters.

Summary of Examples

#   Target Population                        Parameter                            Estimator
1   All cardholders of the bank              Proportion                           Sample Proportion
2   All cardholders of the credit company    Proportion                           Sample Proportion
3   All customers of the target group        Mean vacation expenditure, $\mu$     Sample Mean
4   All customers in the trade area          Proportion                           Sample Proportion
5   All employees                            Proportion                           Sample Proportion
    All reported maintenance expenditures    Total expenditure = N × Mean         Sample Mean
6   Your own?

INF_TS: Statistical Inference Tool Set
1. Random Sampling
   For fair representation, and also for assuring reasonable accuracy.
2. Sampling Distribution
   “What is the expected size of the estimation error?”
3. Inference Procedure
   How do we account for the estimation error in the statement about the population parameter?
   (a) Confidence Interval
   (b) Hypothesis Testing

Relating Key Concepts of Ch. 8 to Probability Notions in Ch. 7

Chapter 8                    Chapter 7
1. Random Sampling           1. Random Experiment (betting on a roulette game n times)
2. Estimator / Statistic     2. Random Variable (the average gain per bet)
3. Sampling Distribution     3. Probability Distribution

INF_TS1: Random Sampling
Definition: A random sample must satisfy:

Tools for Selecting a Random Sample
(a) Frame
(b) Random number table

Random Number Table
(a) Properties
(b) How to use
    Example: N = 870 (3 digits)
    • Starting point
    • Selection route

Table 8.2.1
• For example, starting in row 21, column 3, we find 52794, then 01466.

Excerpt (rows 19 to 24, columns 1 to 6):

Row      1      2      3      4      5      6
19   17594  10116  55483  96219  85493  96955
20   09584  23476  09243  65568  89128  36747
21   81677  62634  52794  01466  85938  14565
22   45849  01177  13773  43523  69825  03222
23   97252  92257  90419  01241  52516  66293
24   26232  77422  76289  57587  42831  87047

Full table (rows 1 to 50, columns 1 to 10):

Row      1      2      3      4      5      6      7      8      9     10
 1   51449  39284  85527  67168  91284  19954  91166  70918  85957  19492
 2   16144  56830  67507  97275  25982  69294  32841  20861  83114  12531
 3   48145  48280  99481  13050  81818  25282  66466  24461  97021  21072
 4   83780  48351  85422  42978  26088  17869  94245  26622  48318  73850
 5   95329  38482  93510  39170  63683  40587  80451  43058  81923  97072
 6   11179  69004  34273  36062  26234  58601  47159  82248  95968  99722
 7   94631  52413  31524  02316  27611  15888  13525  43809  40014  30667
 8   64275  10294  35027  25604  65695  36014  17988  02734  31732  29911
 9   72125  19232  10782  30615  42005  90419  32447  53688  36125  28456
10   16463  42028  27927  48403  88963  79615  41218  43290  53618  68082
11   10036  66273  69506  19610  01479  92338  55140  81097  73071  61544
12   85356  51400  88502  98267  73943  25828  38219  13268  09016  77465
13   84076  82087  55053  75370  71030  92275  55497  97123  40919  57479
14   76731  39755  78537  51937  11680  78820  50082  56068  36908  55399
15   19032  73472  79399  05549  14772  32746  38841  45524  13535  03113
16   72791  59040  61529  74437  74482  76619  05232  28616  98690  24011
17   11553  00135  28306  65571  34465  47423  39198  54456  95283  54637
18   71405  70352  46763  64002  62461  41982  15933  46942  36941  93412
19   17594  10116  55483  96219  85493  96955  89180  59690  82170  77643
20   09584  23476  09243  65568  89128  36747  63692  09986  47687  46448
21   81677  62634  52794  01466  85938  14565  79993  44956  82254  65223
22   45849  01177  13773  43523  69825  03222  58458  77463  58521  07273
23   97252  92257  90419  01241  52516  66293  14536  23870  78402  41759
24   26232  77422  76289  57587  42831  87047  20092  92676  12017  43554
25   87799  33602  01931  66913  63008  03745  93939  07178  70003  18158
26   46120  62298  69126  07862  76731  58527  39342  42749  57050  91725
27   53292  55652  11834  47581  25682  64085  26587  92289  41853  38354
28   81606  56009  06021  98392  40450  87721  50917  16978  39472  23505
29   67819  47314  96988  89931  49395  37071  72658  53947  11996  64631
30   50458  20350  87362  83996  86422  58694  71813  97695  28804  58523
31   59772  27000  97805  25042  09916  77569  71347  62667  09330  02152
32   94752  91056  08939  93410  59204  04644  44336  55570  21106  76588
33   01885  82054  45944  55398  55487  56455  56940  68787  36591  29914
34   85190  91941  86714  76593  77199  39724  99548  13827  84961  76740
35   97747  67607  14549  08215  95408  46381  12449  03672  40325  77312
36   43318  84469  26047  86003  34786  38931  34846  28711  42833  93019
37   47874  71365  76603  57440  49514  17335  71969  58055  99136  73589
38   24259  48079  71198  95859  94212  55402  93392  31965  94622  11673
39   31947  64805  34133  03245  24546  48934  41730  47831  26531  02203
40   37911  93224  87153  54541  57529  38299  65659  00202  07054  40168
41   82714  15799  93126  74180  94171  97117  31431  00323  62793  11995
42   82927  37884  74411  45887  36713  52339  68421  35968  67714  05883
43   65934  21782  35804  36676  35404  69987  52268  19894  81977  87764
44   56953  04356  68903  21369  35901  86797  83901  68681  02397  55359
45   16278  17165  67843  49349  90163  97337  35003  34915  91485  33814
46   96339  95028  48468  12279  81039  56531  10759  19579  00015  22829
47   84110  49661  13988  75909  35580  18426  29038  79111  56049  96451
48   49017  60748  03412  09880  94091  90052  43596  21424  16584  67970
49   43560  05552  54344  69418  01327  07771  25364  77373  34841  75927
50   25206  15177  63049  12464  16149  18759  96184  15968  89446  07168

INF_TS2: Sampling Distribution
Definition: The probability distribution of the estimator. It describes the performance of the estimator over repeated independent samplings from the target population.

Visualizing:
[Figure: From the population, the one sample of n units that you actually draw yields one value of the statistic (estimator). The many other samples of n units that you imagine drawing each yield their own value of the statistic (estimator).]
A histogram of these imagined values represents the sampling distribution of this statistic.

The Sampling Distribution: Key Results

A. Expected Value and SD

Parameter desired               Estimator                  Expected value   SD
Population mean, $\mu$          Sample mean, $\bar{X}$     $\mu$            $\sigma_{\bar{X}} = \sigma/\sqrt{n}$
Population proportion, $\pi$    Sample proportion, $p$     $\pi$            $\sigma_p = \sqrt{\pi(1-\pi)/n}$

1. The sample mean (proportion) is unbiased.
2. The accuracy of the sample mean or the sample proportion increases as the sample size increases.

B. Central Limit Theorem
Predicts the shape of the sampling distribution.
• Individuals in the population: highly non-normal distribution; mean $\mu$, standard deviation $\sigma$.
• Averages of n = 3 individuals: non-normal, but less so; same mean; smaller standard deviation, $\sigma_{\bar{X}} = \sigma/\sqrt{3}$.
• Averages of n = 10 individuals: close to normal; same mean; smaller standard deviation, $\sigma_{\bar{X}} = \sigma/\sqrt{10}$.
[Figure: histograms of the three distributions; horizontal axis in dollars, with ticks at $1k and $2k.]

Relating to Las Vegas Roulette
Playing roulette n times and computing the average gain per play is equivalent to taking a random sample of size n and computing the sample mean.

Example: Survey of Family Vacation Expenditure
The corresponding roulette game:
• The wheel has N faces; each face shows the vacation expenditure of one household.
• $x_i$, the value on face i, is the expenditure of household i.
• You spin the wheel so that each face is equally likely to appear, with probability 1/N.
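The roulette analogy and the key results on expected value and SD can be checked by simulation. The following is a minimal sketch, not part of the original slides: the population of 1000 household expenditures is made up (drawn from a right-skewed exponential distribution), and the function names `spin_wheel_average` and `simulate_sampling_distribution` are hypothetical. It spins the N-faced wheel n times, records the average, repeats this many times, and compares the mean and SD of those averages with $\mu$ and $\sigma/\sqrt{n}$.

```python
import random
import statistics


def spin_wheel_average(faces, n, rng):
    """Spin an N-faced wheel n times (each face has probability 1/N)
    and return the average of the n outcomes."""
    return statistics.fmean([rng.choice(faces) for _ in range(n)])


def simulate_sampling_distribution(faces, n, repetitions, rng):
    """Repeat the n-spin game many times and collect the averages."""
    return [spin_wheel_average(faces, n, rng) for _ in range(repetitions)]


rng = random.Random(8)

# Hypothetical population: N = 1000 right-skewed expenditures (made-up data).
population = [rng.expovariate(1 / 600) for _ in range(1000)]
mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

for n in (1, 3, 10):
    averages = simulate_sampling_distribution(population, n, 5000, rng)
    print(
        f"n={n:2d}  mean of averages={statistics.fmean(averages):7.1f}  "
        f"SD of averages={statistics.pstdev(averages):6.1f}  "
        f"sigma/sqrt(n)={sigma / n ** 0.5:6.1f}"
    )
```

Under these assumptions, the mean of the averages should stay close to `mu` for every n, while their SD should shrink roughly like `sigma / sqrt(n)`. Plotting a histogram of `averages` for each n would show the shape moving toward normal, as the Central Limit Theorem predicts.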
Roulette for Random Sampling
Game: You play n times.
[Figure: roulette wheel with faces labeled $x_1$, $x_2$, $x_i$.]

Analysis:
Let X = the outcome of one spin.

$E(X) = \frac{1}{N}x_1 + \frac{1}{N}x_2 + \cdots + \frac{1}{N}x_N = \mu$ (population mean)

$\mathrm{Var}(X) = \frac{1}{N}(x_1-\mu)^2 + \frac{1}{N}(x_2-\mu)^2 + \cdots + \frac{1}{N}(x_N-\mu)^2 = \sigma^2$ (population variance)

$SD(X) = \sigma$ (population standard deviation)

The total for playing n games:

$E(\text{Total}) = n\mu$
$SD(\text{Total}) = \sqrt{n}\,\sigma$

The average per game:

$E(\bar{X}) = E\!\left(\frac{\text{Total}}{n}\right) = \frac{n\mu}{n} = \mu$
$SD(\bar{X}) = SD\!\left(\frac{\text{Total}}{n}\right) = \frac{\sqrt{n}\,\sigma}{n} = \frac{\sigma}{\sqrt{n}}$

INF_TS2A: Standard Error
Definition: Estimated standard deviation of the estimator.

A. Standard Error of the Mean
Standard deviation of the average: $\sigma_{\bar{X}} = \sigma/\sqrt{n}$
Problem: It cannot be computed without the population parameter $\sigma$.
Standard error of the mean: $S_{\bar{X}} = S/\sqrt{n}$

Example:
Sample size n = 100
Sample average $\bar{X}$ = $633.91
Sample standard deviation S = $311.49
What is the expected estimation error?

B. Standard Error of the Proportion
Standard deviation of p: $\sigma_p = \sqrt{\pi(1-\pi)/n}$
Problem: As with the mean, it cannot be computed without the population parameter $\pi$.
Standard error of the proportion: $S_p = \sqrt{p(1-p)/n}$
Example:

Appendix

Terminology of Sampling
• Population Size = N
• Sample Size = n
• Census
• Representative Sample (Random Sample)
[Figure: diagram labeled Population and Sample.]

Terminology of Sampling (cont.)
• Biased Sample: non-random sample
• Sampling With/Without Replacement
• Frame
• Pilot Study

Parameter, Statistic & Estimator
• Parameter and Statistic
• Estimator and Estimate
• Estimation Error (Sampling Error)
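Returning to the standard error example in INF_TS2A (n = 100, S = $311.49), the expected estimation error can be worked out directly from $S/\sqrt{n}$. The sketch below is an illustration rather than part of the original slides. The function names are hypothetical, and the proportion inputs (p = 0.40, n = 100) are made up, because the slides give no numerical example for part B.

```python
import math


def standard_error_of_mean(s, n):
    """Standard error of the sample mean: S / sqrt(n)."""
    return s / math.sqrt(n)


def standard_error_of_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


# Values from the slide example: n = 100, S = $311.49.
se_mean = standard_error_of_mean(311.49, 100)
print(f"Standard error of the mean: ${se_mean:.2f}")  # $31.15

# Made-up inputs for illustration only: p = 0.40 observed in n = 100.
se_prop = standard_error_of_proportion(0.40, 100)
print(f"Standard error of the proportion: {se_prop:.3f}")  # 0.049
```

So the sample average of $633.91 is expected to differ from the population mean by roughly $31. Because n appears under a square root, quadrupling the sample size would only halve this figure.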