Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
統計學 授課教師:統計系余清祥 日期:2014年11月13日 第七章:抽樣與抽樣分配 Fall 2014 STATISTICS in PRACTICE • MeadWestvaco Corporation, a leading producer of packaging, coated and specialty papers, and specialty chemicals. • MeadWestvaco’s internal consulting group uses sampling to provide a variety of information that enables the company to obtain significant productivity benefits and remain competitive. • Data collected from sample plots throughout the forests are the basis for learning about the population of trees owned by the company. 2 Chapter 7 Sampling and Sampling Distributions • Selecting a Sample • Point Estimation • Introduction to Sampling Distributions • Sampling Distribution of x • Sampling Distribution of p • Properties of Point Estimators • Other Sampling Methods 3 Introduction -1 An element is the entity on which data are collected. A population is a collection of all the elements of interest. A sample is a subset of the population. The sampled population is the population from which the sample is drawn. A frame is a list of the elements that the sample will be selected from. 4 Introduction -2 The reason we select a sample is to collect data to answer a research question about a population. The sample results provide only estimates of the values of the population characteristics. The reason is simply that the sample contains only a portion of the population. With proper sampling methods, the sample results can provide “good” estimates of the population characteristics. 5 7.1 The Electronics Associates Sampling Problem • The director of personnel for Electronics Associates, Inc. (EAI), has been assigned the task of developing a profile of the company’s 2500 managers. • Suppose that the necessary information on all the EAI managers was not readily available in the company’s database. • The question is how the firm’s director of personnel can obtain estimates of the population parameters by using a sample of managers rather than all 2500 managers in the population. • Suppose that a sample of 30 managers will be used. How we can identify a sample of 30 managers. 6 7.2 Selecting a Sample • Sampling from a Finite Population • Sampling from an Infinite Population 7 Sampling from a Finite Population -1 • Finite populations are often defined by lists such as: – Organization membership roster – Credit card account numbers – Inventory product numbers • A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected. 8 Sampling from a Finite Population -2 • Replacing each sampled element before selecting subsequent elements is called sampling with replacement. • Sampling without replacement is the procedure used most often. • In large sampling projects, computergenerated random numbers are often used to automate the sample selection process. • Excel provides a function for generating random numbers in its worksheets. 9 Sampling from a Finite Population -3 (Random Numbers) 10 Sampling from a Finite Population -4 • Example: To select a simple random sample from the finite population of EAI managers, 1. We first construct a frame by assigning each manager a number, the numbers 1 to 2500. 2. We refer to the table of random numbers. We may start the selection of random numbers anywhere in the table and move systematically in a direction of our choice. 3. For example, 6327 1599 8671 7445 1102 1514 1807, include 1599, 1102, 1514, 1807 in the random sample ignore the numbers 6327, 8671 and 7445. 11 Sampling from a Finite Population -5 • Example: St. Andrew’s College – St. Andrew’s College received 900 applications for admission in the upcoming year from prospective students. The applicants were numbered, from 1 to 900, as their applications arrived. The Director of Admissions would like to select a simple random sample of 30 applicants. 12 Sampling from a Finite Population -6 • Example: St. Andrew’s College – Step 1: Assign a random number to each of the 900 applicants. The random numbers generated by Excel’s RAND function follow a uniform probability distribution between 0 and 1. – Step 2: Select the 30 applicants corresponding to the 30 smallest random numbers. 13 Sampling from an Infinite Population -1 • Sometimes we want to select a sample, but find it is not possible to obtain a list of all elements in the population. • As a result, we cannot construct a frame for the population. • Hence, we cannot use the random number selection procedure. • Most often this situation occurs in infinite population cases. 14 Sampling from an Infinite Population -2 • Populations are often generated by an ongoing process where there is no upper limit on the number of units that can be generated. • Some examples of on-going processes, with infinite populations, are: – – – – parts being manufactured on a production line transactions occurring at a bank telephone calls arriving at a technical help desk customers entering a store 15 Sampling from an Infinite Population -3 • In the case of an infinite population, we must select a random sample in order to make valid statistical inferences about the population from which the sample is taken. • A random sample from an infinite population is a sample selected such that the following conditions are satisfied. 1. Each element selected comes from the population of interest. 2. Each element is selected independently. To prevent selection bias. 16 7.3 Point Estimation -1 Point estimation is a form of statistical inference. In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter. We refer to mean µ. as the point estimator of the population s is the point estimator of the population standard deviation σ. is the point estimator of the population proportion p. 17 Point Estimation -2 • Example: the EAI Problem, annual salary and training program status for a simple random sample of 30 EAI managers 18 Point Estimation -3 • Example: the EAI Problem, to estimate the population mean, the population standard deviation and population proportion. 19 Point Estimation -4 • Example: St. Andrew’s College – Recall that St. Andrew’s College received 900 applications from prospective students. The application form contains a variety of information including the individual’s Scholastic Aptitude Test (SAT) score and whether or not the individual desires on-campus housing. – At a meeting in a few hours, the Director of Admissions would like to announce the average SAT score and the proportion of applicants that want to live on campus, for the population of 900 applicants. 20 Point Estimation -5 • Example: St. Andrew’s College – However, the necessary data on the applicants have not yet been entered in the college’s computerized database. So, the Director decides to estimate the values of the population parameters of interest based on sample statistics. The sample of 30 applicants is selected using computer-generated random numbers. 21 Point Estimation -6 • Example: St. Andrew’s College – x as Point Estimator of µ xi 50, 520 ∑ = x = = 1684 30 30 – s as Point Estimator of σ = s 2 ( x − x ) ∑= i 29 210, 512 = 85.2 29 – p as Point Estimator of p = p 20 = 30 0.67 – Note: Different random numbers would have identified a different sample which would have resulted in different point estimates. 22 Point Estimation -7 • Example: St. Andrew’s College – Once all the data for the 900 applicants were entered in the college’s database, the values of the population parameters of interest were calculated. – Population Mean SAT Score = µ x ∑ = i 900 1697 – Population Standard Deviation for SAT Score = σ 2 ( x − µ ) ∑ i = 87.4 900 – Population Proportion Wanting On-Campus Housing 648 = p = 0.72 900 23 Summary of Point Estimates Obtained from a Simple Random Sample Population Parameter Parameter Value µ = Population mean 1697 SAT score Point Estimator Point Estimate x = Sample mean 1684 85.2 SAT score σ = Population std. 87.4 s = Sample standard deviation for SAT score p = Population proportion wanting campus housing 0.72 0.67 p = Sample proportion wanting campus housing deviation for SAT score 24 Practical Advice The target population is the population we want to make inferences about. The sampled population is the population from which the sample is actually taken. Whenever a sample is used to make inferences about a population, we should make sure that the targeted population and the sampled population are in close agreement. 25 7.4 Introduction to Sampling Distributions -1 • Example: the EAI Problem, Values of x and p from 500 Simple Random Samples of 30 EAI Managers 26 Introduction to Sampling Distributions -2 • Example: the EAI Problem, Frequency and Relative Frequency Distributions of from 500 Simple Random Samples of 30 EAI 27 Introduction to Sampling Distributions -3 • Example: the EAI Problem, Relative Frequency Histogram of x Values from 500 Simple Random Samples of 30 each. 28 Introduction to Sampling Distributions -4 • Example: the EAI Problem, Relative Frequency Histogram of p Values from 500 Simple Random Samples of 30 each. 29 Introduction to Sampling Distributions -5 • From the approximation we observe the bellshaped appearance of the distribution. • In practice, we select only one simple random sample from the population. • We repeated the sampling process 500 times in this section simply to illustrate that many different samples are possible and that the different samples generate a variety of values for the sample statistics x and p . 30 7.5 Sampling Distribution of x -1 • Expected Value of x • Standard Deviation of x • Form of the Sampling Distribution of x • Practical Value of the Sampling Distribution of x • Relationship Between the Sampling Size and the Sampling Distribution of x 31 Sampling Distribution of x -2 • Process of Statistical Inference Population with mean µ=? The value of x is used to make inferences about the value of µ. A simple random sample of n elements is selected from the population. The sample data provide a value for the sample mean x . 32 Expected Value of x • The sampling distribution of x is the probability distribution of all possible values of the sample mean x . • Expected Value of x E( x ) = µ where: µ = the population mean • When the expected value of the point estimator equals the population parameter, we say the point estimator is unbiased. 33 Standard Deviation of x -1 • We will use the following notation to define the standard deviation of the sampling distribution of x . σx = the standard deviation of x σ = the standard deviation of the population n = the sample size N = the population size 34 Standard Deviation of x -2 • Standard Deviation of x – Finite Population N −n σ σx = ( ) N −1 n Infinite Population σx = σ n – A finite population is treated as being infinite if n/N ≤ 0.05. – ( N − n) /( N − 1) is the finite population correction factor. – σ x is referred to as the standard error of the mean. 35 Standard Deviation of x -3 • Using the following expression to compute the Standard Deviation of σx = where: σ n 1. The population is infinite; or 2. The population is finite and the sample size is less than or equal to 5% of the population size; that is, n/N ≤ 0.05. 36 Standard Deviation of x -4 • Example: the EAI problem, the standard deviation of annual salary for the population of 2500 EAI managers is σ = 4000. The population is finite with N = 2500. With n = 30, we have n/N = 30/2500 = 0.012. • The sample size = 0.112 ≤ 5%, we can compute the standard error: σ = x σ 4000 = = 730.3 30 n 37 From the Sampling Distribution of x -1 When the population has a normal distribution, the sampling distribution of is normally distributed for any sample size. In most applications, the sampling distribution of can be approximated by a normal distribution whenever the sample is size 30 or more. In cases where the population is highly skewed or outliers are present, samples of size 50 may be needed. 38 From the Sampling Distribution of x -2 The sampling distribution of can be used to provide probability information about how close the sample mean is to the population mean µ . 39 Central Limit Theorem • When the population from which we are selecting a random sample does not have a normal distribution, the central limit theorem is helpful in identifying the shape of the sampling distribution of x . CENTRAL LIMIT THEOREM In selecting random samples of size n from a population, the sampling distribution of the sample mean can be approximated by a normal distribution as the sample size becomes large. 40 Central Limit Theorem • Illustration of The Central Limit Theorem 41 Central Limit Theorem • Illustration of The Central Limit Theorem 42 Sampling Distribution of x for the EAI Problem • Example: the EAI problem, we previously showed that E( x ) = $51,800 and σ x = 730.3. 43 Practical Value of the Sampling Distribution of x -1 • Example: the EAI problem – What is the probability that the sample mean computed using a simple random sample of 30 EAI managers will be within $500 of the population mean? – Refer to the sampling distribution of shown again in Figure 7.5. With a population mean of $51,800, the personnel director wants to know the probability that x is between $51,300 and $52,300. 44 Practical Value of the Sampling Distribution of x -2 • Example: the EAI problem, Probability of a Sample Mean being within $500 of the Population Mean for a Simple Random Sample of 30 EAI Managers 45 Practical Value of the Sampling Distribution of x -3 • Example: the EAI problem – At upper endpoint x = 52300, 52300 − 51800 z = 0.68 = 730.30 – We find a cumulative probability P(z ≤ 0.68) = 0.7517 – At lower endpoint x = 51300, 51300 − 51800 z= = −0.68 730.30 – We find a cumulative probability P(z ≤ –0.68) = 0.2483 46 Practical Value of the Sampling Distribution of x -4 • Example: the EAI problem – Calculate the area under the curve between the lower and upper endpoints of the interval. P(– 0.68 ≤ z ≤ 0.68) = P(z ≤ 0.68) – P(z ≤ –0.68) = 0.7517 – 0.2483 = 0.5034 – A simple random mean of 30 EAI managers has a 0.5034 probability of providing a sample mean x that with $500 of the population mean. – There is 1 – 0.5034 = 0.4966 probability that the difference between x and µ = $51800 will be more that $500. 47 Practical Value of the Sampling Distribution of x -5 • Example: St. Andrew’s College Sampling Distribution of x for SAT Scores E( x ) = 1697 σ = x σ 87.4 = = 15.96 n 30 x 48 Practical Value of the Sampling Distribution of x -6 • Example: St. Andrew’s College – What is the probability that a simple random sample of 30 applicants will provide an estimate of the population mean SAT score that is within +/-10 of the actual population mean µ ? – In other words, what is the probability that x will be between 1687 and 1707? 49 Practical Value of the Sampling Distribution of x -7 • Example: St. Andrew’s College – Step 1: Calculate the z-value at the upper endpoint of the interval. z = (1707 – 1697)/15.96= 0.63 – Step 2: Find the area under the curve to the left of the upper endpoint. P(z ≤ 0.63) = 0.7357 50 Practical Value of the Sampling Distribution of x -8 • Example: St. Andrew’s College Cumulative Probabilities for the Standard Normal Distribution 51 Practical Value of the Sampling Distribution of x -9 • Example: St. Andrew’s College Sampling Distribution of x for SAT Scores σ x = 15.96 Area = 0.7357 1697 1707 x 52 Practical Value of the Sampling Distribution of x -10 • Example: St. Andrew’s College – Step 3: Calculate the z-value at the lower endpoint of the interval. z = (1687 – 1697)/15.96 = – 0.63 – Step 4: Find the area under the curve to the left of the lower endpoint. P(z ≤ – 0.63) = 0.2643 53 Practical Value of the Sampling Distribution of x -11 • Example: St. Andrew’s College Sampling Distribution of x for SAT Scores σ x = 15.96 Area = 0.2643 1687 1697 x 54 Practical Value of the Sampling Distribution of x -12 • Example: St. Andrew’s College – Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval. P(– 0.68 ≤ z ≤ 0.68) = P(z ≤ 0.68) – P(z ≤ –0.68) = 0.7357 - 0.2643 = 0.4714 – The probability that the sample mean SAT score will be between 1687 and 1707 is: P(1687 ≤ x ≤ 1707) = 0.4714 55 Practical Value of the Sampling Distribution of x -13 • Example: St. Andrew’s College Sampling Distribution of x for SAT Scores σ x = 15.96 Area = 0.4714 1687 1697 1707 x 56 Relationship Between the Sample Size and the Sampling Distribution of x -1 • Example: the EAI problem – A Comparison of The Sampling Distributions of x for Simple Random Samples of n = 30 and n = 100 EAI Managers 57 Relationship Between the Sample Size and the Sampling Distribution of x -2 • Example: the EAI problem – Suppose that in the EAI sampling problem we select a simple random sample of 100 EAI managers instead of the 30 originally considered. – E( x ) = µ regardless of the sample size. In our example, E( x ) remains at 730.3. 58 Relationship Between the Sample Size and the Sampling Distribution of x -3 • Example: the EAI problem – Whenever the sample size is increased, the standard error of the mean σ x is decreased. With the increase in the sample size to n = 30, the standard error of the mean is 730. – With the increase in the sample size to n = 100, the standard error of the mean id decrease to σ 4000 σ = = = 400 x n 100 59 Relationship Between the Sample Size and the Sampling Distribution of x -4 • Example: the EAI problem, Probability of a Sample Mean Being Within $500 of the Population Mean for a Simple Random Sample of 100 EAI Managers 60 Relationship Between the Sample Size and the Sampling Distribution of x -5 • Example: St. Andrew’s College – Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered. – E( x ) = µ regardless of the sample size. In our example, E( x ) remains at 1697. – Whenever the sample size is increased, the standard error of the mean σ x is decreased. With the increase in the sample size to n = 100, the standard error of the mean is decreased from 15.96 to: = σx N −n σ = N − 1 n 900 − 100 87.4 = 0.94333(8.74) = 8.2 900 − 1 100 61 Relationship Between the Sample Size and the Sampling Distribution of x -6 • Example: St. Andrew’s College With n = 100, σ x = 8.2 With n = 30, σ x = 15.96 E( x ) = 1697 x 62 Relationship Between the Sample Size and the Sampling Distribution of x -7 • Example: St. Andrew’s College Sampling Distribution of x for SAT Scores σ x = 8.2 Area = 0.7776 1687 1697 1707 x 63 7.6 Sampling Distribution of p -1 • Expected Value of p • Standard Deviation of p • Form of the Sampling Distribution of p • Practical Value of the Sampling Distribution of p 64 Sampling Distribution of p -2 • Example: Relative Frequency Histogram of Sample Proportion Values from 500 Simple Random Samples of 30 each. 65 Sampling Distribution of p -3 • Making Inferences about a Population Proportion Population with proportion p=? The value of p is used to make inferences about the value of p. A simple random sample of n elements is selected from the population. The sample data provide a value for the sample proportion p. 66 Expected Value of p -1 • The sampling distribution of p is the probability distribution of all possible values of the sample proportion p . • Expected Value of p E ( p) = p where: p = the population proportion 67 Expected Value of p -2 • Example: the EAI problem – Because E( p ) = p , p is an unbiased estimator of p. – We noted that p = 0.60 for the EAI problem, where p is the proportion for the population of managers who participated in the company’s management training program. – Thus, the expected value of p for the EAI sampling problem is 0.60. 68 Standard Values of p -1 • Standard Values of p – Finite Population σp = N −n N −1 p(1 − p) n Infinite Population σp = p (1 − p ) n – σ p is referred to as the standard error of the proportion. – ( N − n) /( N − 1) is the finite population correction factor. 69 Standard Values of p -2 • Example: the EAI problem – For the EAI study, p = 0.60. With n/N = 30/2500 = 0.012, we can ignore the finite population correction factor when we compute the standard error of the population. – For the simple random sample of 30 managers, σp = p(1 − p ) = n 0.60(1 − 0.60) = 0.0894 30 70 Form of the Sampling Distribution of p -1 The sampling distribution of p can be approximated by a normal distribution whenever the sample size is large enough to satisfy the two conditions: np > 5 and n(1 – p) > 5 . . . because when these conditions are satisfied, the probability distribution of x in the sample proportion, p = x/n, can be approximated by normal distribution (and because n is a constant). 71 Form of the Sampling Distribution of p -2 • Example: the EAI problem – The proportion for the population of managers who participated in the company’s management training program is p = 0.60. – With a simple random sample of size 30, we have np = 30(0.60) = 18 and n(1 – p) = 30(0.40) = 12. 72 Form of the Sampling Distribution of p -3 • Example: the EAI problem, Sampling Distribution of p for the Proportion of EAI Managers who Participated in the Management training program 73 Practical Value of the Sampling Distribution of p -1 • Example: St. Andrew’s College – Recall that 72% of the prospective students applying to St. Andrew’s College desire oncampus housing. – What is the probability that a simple random sample of 30 applicants will provide an estimate of the population proportion of applicant desiring on-campus housing that is within plus or minus 0.05 of the actual population proportion? 74 Practical Value of the Sampling Distribution of p -2 • Example: St. Andrew’s College – For our example, with n = 30 and p = 0.72, the normal distribution is an acceptable approximation because: np = 30(0.072) = 21.6 > 5 and n(1 – p) = 30(0.28) = 8.4 > 5 75 Practical Value of the Sampling Distribution of p -3 • Example: St. Andrew’s College Sampling Distribution of p = σp E( p ) = 0.72 0.72(1 − 0.72) = 0.082 30 p 76 Practical Value of the Sampling Distribution of p -4 • Example: St. Andrew’s College – Step 1: Calculate the z-value at the upper endpoint of the interval. z = (0.77 – 0.72)/0.082 = 0.61 – Step 2: Find the area under the curve to the left of the upper endpoint. P(z < 0.61) = 0.7291 77 Practical Value of the Sampling Distribution of p -5 • Example: St. Andrew’s College Cumulative Probabilities for the Standard Normal Distribution 78 Practical Value of the Sampling Distribution of p -6 • Example: St. Andrew’s College Sampling Distribution of p σ p = 0.082 Area = 0.7291 p 0.72 0.77 79 Practical Value of the Sampling Distribution of p -7 • Example: St. Andrew’s College – Step 3: Calculate the z-value at the lower endpoint of the interval. z = (0.67 – 0.72)/0.082 = – 0.61 – Step 4: Find the area under the curve to the left of the lower endpoint. P(z ≤ –0.61) = 0.2709 80 Practical Value of the Sampling Distribution of p -8 • Example: St. Andrew’s College Sampling Distribution of p σ p = .082 Area = 0.2709 p 0.67 0.72 81 Practical Value of the Sampling Distribution of p -9 • Example: St. Andrew’s College – Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval. P(–0.61 ≤ z ≤ 0.61) = P(z ≤ 0.61) – P(z ≤ – 0.61) = 0.7291 – 0.2709 = 0.4582 – The probability that the sample proportion of applicants wanting on-campus housing will be within +/–0.05 of the actual population proportion : P(0.67 ≤ p ≤ 0.77) = 0.4582 82 Practical Value of the Sampling Distribution of p -10 • Example: St. Andrew’s College Sampling Distribution of p σ p = 0.082 Area = 0.4582 0.67 0.72 0.77 p 83 Practical Value of the Sampling Distribution of p -11 • Example: the EAI problem – Suppose that the personnel director wants to know the probability of obtaining a value of p that is within 0.05 of the population proportion of EAI managers who participated in the training program. – That is, What is the probability of obtaining a sample with a sample proportion p between 0.50 and 0.65? 84 Practical Value of the Sampling Distribution of p -12 • Example: the EAI problem, Probability of obtaining p between 0.55 and 0.65 85 7.7 Properties of Point Estimators -1 • Several different sample statistics can be used as point estimator of different population parameters, we use the following general notation in this section θ = the population parameter of interest θˆ = the sample statistics or point estimator of θ • The notation θ is the Greek letter theta and the notation θˆ is pronounced “theta-hat”. 86 Properties of Point Estimators -2 • Before using a sample statistic as a point estimator, statisticians check to see whether the sample statistic has the following properties associated with good point estimators. Unbiased Efficiency Consistency 87 樣本統計量好壞的判斷 母體的特性參數(Parameter) 樣本中用以推測母體特性的估計值 統計量(Statistic) 對統計量的要求: (1) 不偏(Unbiased): E(統計量) = 參數 (2) 變異數(Variance)愈小愈好 變異數與風險(Risk)有相似的涵意 Precise Biased Unbiased Imprecise Properties of Point Estimators -3 Unbiased • If the expected value of the sample statistics is equal to the population parameter being estimated, the sample statistics is said to be an unbiased estimator of the population parameter. • The sample statistics θˆ is an unbiased estimator of the population parameters θ if where E(θˆ ) = θ E(θˆ ) = the expected value of the sample statistics θˆ 90 Properties of Point Estimators -4 • Examples of Unbiased and Biased Point Estimators 91 Properties of Point Estimators -5 Efficiency • Given the choice of two unbiased estimators of the same population parameter, we would prefer to use the point estimator with the smaller standard deviation, since it tends to provide estimates closer to the population parameter. • The point estimator with the smaller standard deviation is said to have greater relative efficiency than the other. 92 Properties of Point Estimators -6 • Example: Sampling Distributions of Two Unbiased Point Estimators. 93 Properties of Point Estimators -7 Consistency • A point estimator is consistent if the values of the point estimator tend to become closer to the population parameter as the sample size becomes larger. 94 7.8 Other Sampling Methods • Stratified Random Sampling • Cluster Sampling • Systematic Sampling • Convenience Sampling • Judgment Sampling 95 Stratified Random Sampling -1 The population is first divided into groups of elements called strata. Each element in the population belongs to one and only one stratum. Best results are obtained when the elements within each stratum are as much alike as possible (i.e. a homogeneous group). 96 Stratified Random Sampling -2 • Diagram for Stratified Random Sampling 97 分層隨機抽樣(Stratified Random Sampling) 第一層 ○○○○○○ ○○○○○ ○○ 第二層 XXXXX XXXX X 第三層 ∆∆∆∆ ∆∆∆ ∆∆ ○○○○ ○○ XXX XX ∆∆ ∆∆ 抽樣 Stratified Random Sampling -3 A simple random sample is taken from each stratum. Formulas are available for combining the stratum sample results into one population parameter estimate. Advantage: If strata are homogeneous, this method is as “precise” as simple random sampling but with a smaller total sample size. Example: The basis for forming the strata might be department, location, age, industry type, and so on. 99 Cluster Sampling -1 The population is first divided into separate groups of elements called clusters. Ideally, each cluster is a representative small-scale version of the population (i.e. heterogeneous group). A simple random sample of the clusters is then taken. All elements within each sampled (chosen) cluster form the sample. 100 Cluster Sampling -2 • Diagram for Cluster Sampling 101 集體隨機抽樣(Cluster Random Sampling) ○○○○○○○ ××× △△△△△ A ○○○○○○○ ××× △△△△△ B ○○○○○○○ ××× △△△△△ C 抽出A、D ○○○○○○○ ××× △△△△△ ==================⇒ ○○○○○○○ ××× △△△△△ D ○○○○○○○ ××× △△△△△ E ○○○○○○○ ××× △△△△△ F ○○○○○○○ ××× △△△△△ Cluster Sampling -3 Example: A primary application is area sampling, where clusters are city blocks or other well-defined areas. Advantage: The close proximity of elements can be cost effective (i.e. many sample observations can be obtained in a short time). Disadvantage: This method generally requires a larger total sample size than simple or stratified random sampling. 103 Systematic Sampling -1 If a sample size of n is desired from a population containing N elements, we might sample one element for every n/N elements in the population. We randomly select one of the first n/N elements from the population list. We then select every n/Nth element that follows in the population list. 104 Systematic Sampling -2 This method has the properties of a simple random sample, especially if the list of the population elements is a random ordering. Advantage: The sample usually will be easier to identify than it would be if simple random sampling were used. Example: Selecting every 100th listing in a telephone book after the first randomly selected listing 105 系統抽樣(Systematic Sampling) ∣○●○○∣○●○○∣○●○○∣○●○○∣○●○○∣母體 ↓ ●●●●● 樣本 較常見的非隨機抽樣法 • 立意抽樣:不依隨機原則抽取樣本,而由母 體中選取部份具有典型代表樣本。(e.g. 專家 意見) • 便利抽樣:事先不預定樣本,碰到即問或樣 本自動回答。(e.g. 街頭調查) • 滾球抽樣:利用樣本尋找樣本,對於特定族 群樣本取得不易時採用。(e.g. 愛滋病的罹病 人數) • 配額抽樣:規定具有某種特性的樣本比例, 類似分層隨機抽樣。 Convenience Sampling -1 It is a nonprobability sampling technique. Items are included in the sample without known probabilities of being selected. The sample is identified primarily by convenience. Example: A professor conducting research might use student volunteers to constitute a sample. 108 Convenience Sampling -2 Advantage: Sample selection and data collection are relatively easy. Disadvantage: It is impossible to determine how representative of the population the sample is. 109 Judgment Sampling -1 The person most knowledgeable on the subject of the study selects elements of the population that he or she feels are most representative of the population. It is a nonprobability sampling technique. Example: A reporter might sample three or four senators, judging them as reflecting the general opinion of the senate. 110 Judgment Sampling -2 Advantage: It is a relatively easy way of selecting a sample. Disadvantage: The quality of the sample results depends on the judgment of the person selecting the sample. 111 Recommendation It is recommended that probability sampling methods (simple random, stratified, cluster, or systematic) be used. For these methods, formulas are available for evaluating the “goodness” of the sample results in terms of the closeness of the results to the population parameters being estimated. An evaluation of the goodness cannot be made with non-probability (convenience or judgment) sampling methods. 112 End of Chapter 7 113