Dr. Neal, WKU
MATH 382

The Sample Deviation

Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ of a measurement with unknown mean $\mu$ and unknown standard deviation $\sigma$. Let $\bar{x}$ be the sample mean. We then define the sample variance $S^2$ by

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 .$$

The sample deviation is given by $S = \sqrt{S^2}$.

Why Do We Divide by $n - 1$?

When $\{x_i\}$ is a census of measurements, then we obtain the true variance by

$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 ,$$

which is the average squared distance from the mean $\mu$. To obtain this average, we necessarily divide by $n$. But when $\{x_i\}$ is only a random sample of measurements and when $\mu$ is unknown, then we cannot obtain $\sigma^2$. But we can use $\bar{x}$ as an unbiased estimate of $\mu$, where the average of all possible $\bar{x}$ equals $\mu$.

Likewise, we wish to define an unbiased estimator of $\sigma^2$. We should naturally begin with the expression

$$V^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 .$$

However, the average of all possible such $V^2$ over all possible random samples of size $n$ does not equal $\sigma^2$. It can be shown that the average of all possible $V^2$ equals $\frac{n-1}{n}\,\sigma^2$. To adjust the average, we multiply $V^2$ by $\frac{n}{n-1}$ to obtain

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 .$$

Now the average of all possible $S^2$ over all possible random samples of size $n$ equals $\sigma^2$.

By dividing by $n - 1$ in the definition of $S^2$, then $S^2$ becomes an unbiased estimator of $\sigma^2$; that is, $E[S^2] = \sigma^2$.

How Good is the Estimator $S^2$?

In order to determine how well $S^2$ estimates $\sigma^2$, we would like to know the magnitude of the variance of all possible $S^2$. That is, are the $S^2$ widely spread out with some much less than $\sigma^2$ and others much more than $\sigma^2$? Or do the $S^2$ have small variance which makes them consistently close to their average $\sigma^2$?

Only in certain cases can we find the variance of all possible $S^2$. When sampling from an arbitrary unknown population, then we generally cannot determine the variance of $S^2$.
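The unbiasedness claim $E[S^2] = \sigma^2$ from the discussion of dividing by $n - 1$ can be checked by simulation. The following is a minimal sketch and not part of the original notes: it assumes a normal population with mean 0 and $\sigma = 2$ (hypothetical values chosen for illustration), draws many random samples of size $n = 5$, and averages both $V^2$ and $S^2$. The function name `average_variance_estimates` is likewise only illustrative.

```python
import random
import statistics


def average_variance_estimates(n, sigma, trials, seed=0):
    """Average V^2 (divide by n) and S^2 (divide by n - 1) over many samples.

    Assumption: the population is normal with mean 0 and standard
    deviation `sigma`; this choice is for illustration only.
    """
    rng = random.Random(seed)
    total_v2 = 0.0
    total_s2 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        # Sum of squared distances from the sample mean
        ss = sum((x - xbar) ** 2 for x in sample)
        total_v2 += ss / n        # V^2: divide by n
        total_s2 += ss / (n - 1)  # S^2: divide by n - 1
    return total_v2 / trials, total_s2 / trials


avg_v2, avg_s2 = average_variance_estimates(n=5, sigma=2.0, trials=20000)
print("average V^2:", avg_v2)
print("average S^2:", avg_s2)
```

With these settings the average of $V^2$ should land near $\frac{n-1}{n}\,\sigma^2 = 3.2$ and the average of $S^2$ near $\sigma^2 = 4$, up to simulation noise.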
However, when sampling from a normally distributed population, we do know the following results:

Theorem. When choosing random samples of size $n$ from a normally distributed measurement with mean $\mu$ and standard deviation $\sigma$, then
(i) The sample means $\bar{x}$ over all possible samples are normally distributed,
(ii) The variance of all possible sample variances $S^2$ is given by

$$\mathrm{Var}(S^2) = \frac{2\sigma^4}{n-1} .$$

Example. Suppose composite ACT scores are found to be normally distributed with mean $\mu = 22.4$ and standard deviation $\sigma = 4.2$. To check for discrepancies, various random samples of size $n = 400$ are collected in various regions. The sample means $\bar{x}$ and sample variances $S^2$ are noted in each case. What are the average, variance, and standard deviation of all possible sample means and of all possible sample variances?

Solution. Assuming that the population of test takers is of size $N$ that is much larger than $n = 400$, we can say

$$\mu_{\bar{x}} = \mu = 22.4, \qquad \sigma_{\bar{x}}^2 \approx \frac{\sigma^2}{n} = \frac{4.2^2}{400} = 0.0441, \qquad \sigma_{\bar{x}} = \sqrt{\sigma_{\bar{x}}^2} \approx 0.21 .$$

Thus $\bar{x}$ is normally distributed with a mean of 22.4 and a standard deviation of about 0.21. So about 68.27% of the time, an $\bar{x}$ from a random sample of size 400 should lie within $22.4 \pm 0.21$. That is, $P(22.19 \le \bar{x} \le 22.61) \approx 0.6827$.

Because $S^2$ is an unbiased estimator, we can say

$$E[S^2] = \sigma^2 = 4.2^2 = 17.64 .$$

Applying the theorem, we can further say

$$\mathrm{Var}(S^2) = \frac{2\sigma^4}{n-1} = \frac{2 \times (4.2)^4}{399} \approx 1.5597 \qquad \text{and} \qquad \sigma_{S^2} \approx 1.249 .$$

So all the sample variances $S^2$ should average out to 17.64 (the true measurement variance), with a standard deviation of about 1.249.

Arithmetic Relationships Between $S$ and $\sigma$

Your calculator or spreadsheet should display the sample deviation $S$ along with the basic statistics. Here are a few computational facts:

When $x_1, x_2, \ldots, x_n$ is a census, then $\bar{x} = \mu$. Thus,

$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \qquad \text{and} \qquad S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 .$$

Therefore,

$$\sigma^2 = \frac{n-1}{n}\, S^2 \qquad \text{and} \qquad \sigma = \sqrt{\frac{n-1}{n}} \times S .$$
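The census relationship $\sigma = \sqrt{\frac{n-1}{n}} \times S$ can be checked on a small data set. This is a minimal sketch, assuming Python's standard `statistics` module plays the role of the calculator or spreadsheet; the six data values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math
import statistics

# Hypothetical measurements, treated as a census for this check
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
n = len(data)

s = statistics.stdev(data)       # sample deviation S (divides by n - 1)
sigma = statistics.pstdev(data)  # census deviation sigma (divides by n)

# Convert S to sigma using the factor sqrt((n - 1) / n)
sigma_from_s = math.sqrt((n - 1) / n) * s

print("S                  :", s)
print("sigma (direct)     :", sigma)
print("sigma (from S)     :", sigma_from_s)
```

For this data the mean is 5 and the squared distances from the mean sum to 24, so $\sigma^2 = 24/6 = 4$ and $S^2 = 24/5 = 4.8$; the two computed values of $\sigma$ should agree up to floating-point rounding.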
Thus if you have the value of $S$, then you can multiply $S$ by $\sqrt{\frac{n-1}{n}}$ to obtain $\sigma$ if you actually have a census of data.

Moreover, if you wish to show work in computing $S^2$ and $S$ "by hand," then we know that

$$\sigma^2 = \frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n} - \mu^2 .$$

Thus

$$S^2 = \frac{n}{n-1}\, \sigma^2 \qquad \text{and} \qquad S = \sqrt{\frac{n}{n-1}} \times \sigma .$$

Exercises

1. A group of WKU freshmen were asked to give the number of hours that they spend on Facebook per week. The results were:

Hrs     0    3    6   10   12   15   20   25   30
# Fr   12    8   16   14   20   14    9    5    2

Compute sample mean and sample deviation.

2. Adult heights are found to be normally distributed with mean $\mu = 68$ inches and standard deviation $\sigma = 3.6$ inches. Suppose various random samples of size $n = 225$ are collected.
(a) What are the average, variance, and standard deviation of all possible sample means $\bar{x}$?
(b) What are the average, variance, and standard deviation of all possible sample variances $S^2$?
(c) What is the probability that a sample mean $\bar{x}$ is
    (i) at most 67.9
    (ii) at least 68.05
    (iii) from 67.99 to 68.01?
(d) Find the bounds in between which lie 80% of all sample mean heights from random samples of size $n = 200$.
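The kind of computation asked for in Exercise 2 follows the pattern of the ACT example in the notes. As a hedged sketch, the code below reproduces the ACT numbers ($\mu = 22.4$, $\sigma = 4.2$, $n = 400$) using `statistics.NormalDist` from the Python standard library. The helper name `sampling_summary` and its dictionary keys are hypothetical, and the formulas assume a normally distributed population that is much larger than the sample, as in the notes.

```python
import math
from statistics import NormalDist


def sampling_summary(mu, sigma, n):
    """Summaries of all possible sample means and sample variances.

    Assumes a normal population that is much larger than n.
    """
    var_xbar = sigma ** 2 / n          # variance of the sample means
    var_s2 = 2 * sigma ** 4 / (n - 1)  # variance of the sample variances
    return {
        "mean_xbar": mu,
        "var_xbar": var_xbar,
        "sd_xbar": math.sqrt(var_xbar),
        "mean_s2": sigma ** 2,
        "var_s2": var_s2,
        "sd_s2": math.sqrt(var_s2),
    }


# ACT example from the notes
act = sampling_summary(mu=22.4, sigma=4.2, n=400)
for name, value in act.items():
    print(name, round(value, 4))

# Distribution of the sample mean, and P(22.19 <= xbar <= 22.61)
xbar_dist = NormalDist(mu=act["mean_xbar"], sigma=act["sd_xbar"])
prob = xbar_dist.cdf(22.61) - xbar_dist.cdf(22.19)
print("P(22.19 <= xbar <= 22.61):", round(prob, 4))

# Bounds containing the central 80% of all sample means
lower = xbar_dist.inv_cdf(0.10)
upper = xbar_dist.inv_cdf(0.90)
print("central 80% bounds:", lower, upper)
```

The printed values can be compared with the worked solution (0.0441, 0.21, 17.64, 1.5597, 1.249, and 0.6827). Calling `sampling_summary` with other values of `mu`, `sigma`, and `n` adapts the sketch to a different problem.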