Sample Distributions

Suppose I take a sample of size n from a population and measure a statistic, say x̄. This number will, in general, differ from the true value μ. If I repeat the experiment many times, each time using the same fixed sample size, then I will obtain a collection of values x̄1, x̄2, etc., which I can arrange in the usual way into a relative frequency histogram. As the number of samples (for a fixed sample size) becomes infinitely large, I obtain the theoretical sample distribution for x̄. Although we can talk about the sample distribution of any statistic (e.g. the median or the standard deviation), it's the sample distribution of the mean that we're mainly interested in.

The standard deviation of a sample distribution is called the standard error. When the statistic is the mean x̄, the standard deviation is called the standard error of the mean, which I'll write as SE(x̄), or simply SE when there's no chance of confusion.

Theorem (Central Limit Theorem). Suppose a population has mean μ and standard deviation σ. Take samples of size n.

(i) When n is large, the sample distribution of the mean approaches a normal distribution. The mean of this distribution is the same as the population mean, and the standard error is given by:

    SE = σ/√n

(ii) If the population is normally distributed, then the sample distribution is normal (with the above values for mean and standard error) regardless of the sample size.

Notes:

• Note that the theorem says that even if the population is not normal, the sample distribution approaches a normal distribution for n large enough. (Rule of thumb: "large enough" is usually taken to mean n ≥ 30.)

• An increase in σ will make SE increase as well, but an increase in the sample size n will cause SE to decrease. In other words, you can measure the mean more accurately by taking larger samples.

Large Sample Estimation

Example 1.
Suppose I know from previous studies that the average height of adult males is 69 inches, with a standard deviation of 2.5 inches.

a. What is the probability that a randomly selected adult male will have a height between 68.5 and 69.5 inches?

b. What is the probability that a random sample of 40 adult males will have a sample mean x̄ between 68.5 and 69.5 inches?

Solution. I'm not going to work the first part, since we've already worked questions like this one. I included it so you can see the difference between the two questions. The first part uses the distribution of the population (which we would have to assume to be normal in order to work the problem). The second question uses an entirely different distribution, the sample distribution, which has a different standard deviation: the standard error. So to answer part b we need to calculate z-scores for this distribution using the Central Limit Theorem:

    z = (x̄ − μ)/SE = (x̄ − μ)/(σ/√n)

(you fill in the details). Notice in this example that we can work part b even if the shape of the population distribution is unknown, because we chose a sample size ≥ 30.

M118/Fall 2006/David Lane

Now let's work the same problem backwards.

Example 2. Given the setup from the previous problem: 95% of the time, we would expect our calculated sample mean x̄ to fall between what two height values?

Solution. First take the relationship z = (x̄ − μ)/SE and solve for x̄:

    x̄ = μ + z · SE

We were given μ and can calculate SE, so all we need are the z-values which correspond to cumulative proportions of .025 and .975 (so that the area between them is .95). From the table we can read off these values as z = ±1.96. It follows that

    μ − 1.96 SE < x̄ < μ + 1.96 SE

or ... (fill in details).

What if we don't know σ? As long as n ≥ 30, you can estimate σ using the sample standard deviation s we learned before.
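As a sketch of the "fill in the details" steps in Examples 1 and 2, here is one way to carry out the arithmetic with Python's standard-library NormalDist; the variable names are mine, but the numbers are the ones from the problem:

```python
from statistics import NormalDist

mu, sigma, n = 69, 2.5, 40
se = sigma / n ** 0.5                    # SE = sigma / sqrt(n) ≈ 0.395

# Example 1, part b: P(68.5 < xbar < 69.5) under the sampling distribution
sampling = NormalDist(mu, se)
prob = sampling.cdf(69.5) - sampling.cdf(68.5)   # ≈ 0.79

# Example 2: the central 95% range for xbar, mu ± 1.96·SE
z = NormalDist().inv_cdf(0.975)          # ≈ 1.96
low, high = mu - z * se, mu + z * se     # ≈ 68.23 to 69.77 inches
```

Note that part a would instead use NormalDist(mu, sigma), the population distribution, and would give a much smaller probability, since individual heights are far more variable than sample means.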
The formula for the standard error becomes

    SE ≈ s/√n

Confidence Intervals

What if, in an experiment like the ones above, you don't know in advance the true values of either the population mean or the standard deviation? We can measure x̄ and s, but then what can we say about μ? Whatever μ is, we know that 95% of the time x̄ is within 1.96 standard errors of μ, which is the same as saying that 95% of the time μ is within 1.96 standard errors of x̄. (This sounds confusing, but it's just a statement about distances. If the distance from A to B is less than C, then the distance from B to A is also less than C. Duh!) The interval we obtain through this procedure, from x̄ − 1.96 SE to x̄ + 1.96 SE, is called the 95% confidence interval.

Example 3. Suppose you are studying a species of seabird and you measure the weights of 32 adult females, obtaining a sample mean of 2.6 kg with a (sample) standard deviation of .38 kg. What is the 95% confidence interval?

Solution. The standard error is approximated by

    SE ≈ s/√n = .38/√32 ≈ .067

So

    x̄ − 1.96 SE = 2.6 − (1.96)(.067) ≈ 2.47
    x̄ + 1.96 SE = 2.6 + (1.96)(.067) ≈ 2.73

The 95% confidence interval is from 2.47 to 2.73 kg.

Question. Will the 90% confidence interval be narrower or wider than the 95% interval? Check your answer. (You will need to find the appropriate z-values from the table. Do you remember how we found the ±1.96?)

Confidence Intervals for Small Samples

In the examples above I've been careful to give you a sample size n ≥ 30. Is the same technique valid for a smaller sample size? In order to use a z-table for a small sample:

1. The characteristic you are measuring must follow a normal distribution.
2. You must know the population standard deviation σ.

Put another way: suppose we conduct a series of measurements of x̄ with a fixed sample size n, and we know the standard deviation σ and that the characteristic we're measuring is normal.
Then the Central Limit Theorem says that the distribution of the variable

    z = (x̄ − μ)/(σ/√n)

will be the standard normal distribution (which justifies the use of the letter z). On the other hand, if we don't know σ, then we have to calculate the "t" variable

    t = (x̄ − μ)/(s/√n)

Each time we take a new sample, both x̄ and s will vary, so this variable t is more spread out than z. The variable t does not follow a standard normal distribution, but follows something called a t-distribution instead, which has the following properties:

• There is only one (standard) normal distribution, with μ = 0 and σ = 1, but there is an entire family of t-distributions, one for each positive integer, called the degrees of freedom, or df.

• For a given df, the t-distribution is centered at 0 and is bell shaped.

• A t-distribution is more spread out than the normal distribution. As df increases, the "spreading out" decreases, and for large df the t-distribution approaches the standard normal distribution.

Theorem. Take a random sample of size n from a normal population distribution. Then the sampling distribution of the variable

    t = (x̄ − μ)/(s/√n)

is the t-distribution with df = n − 1.

Example 4. Suppose in your seabird study (the previous example) you measure the weights of 12 adult females and find that x̄ = 2.6 kg and s = .38 kg. What is the 95% confidence interval?

Solution. When we used the z-distribution we calculated the values

    x̄ − 1.96 · s/√n  and  x̄ + 1.96 · s/√n

The ±1.96 are the "z-critical values" between which the total area is 95%. We need the corresponding t-critical values for df = n − 1 = 11, which are found in Table 4 in the back of the book. This table lists critical t-values for various percentages and for different values of df. To find the critical value which gives a 95% two-tailed confidence interval, we need the 2.5% single-tailed critical value, which the book calls t.025. So from the table, df = 11 gives t.025 = 2.201.
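With t.025 = 2.201 in hand, a sketch of the interval computation in Python (the helper name is mine, and the critical value is hard-coded from the table, since Python's standard library has no t-distribution quantiles):

```python
def t_interval(xbar, s, n, t_crit):
    """Confidence interval xbar ± t* · s/sqrt(n) for a small normal sample."""
    se = s / n ** 0.5
    return xbar - t_crit * se, xbar + t_crit * se

# Example 4: n = 12, df = 11, t_.025 = 2.201 (from Table 4)
low, high = t_interval(2.6, 0.38, 12, 2.201)   # ≈ 2.36 to 2.84 kg
```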
Notice that as df increases, this t-critical value approaches the corresponding z-critical value of 1.96. So the correct calculation for the 95% confidence interval is

    x̄ − 2.201 · s/√n  and  x̄ + 2.201 · s/√n

or from 2.36 to 2.84 kg.

The Language of Hypothesis Testing

Continuing with the seabird example: suppose you know (from previous studies) that the average weight of our seabird is supposed to be 2.4 kilograms. We want to conduct an experiment to either confirm this figure or establish that it's in error. In the language of statistics, we have a null hypothesis H0 (μ = 2.4) which we want to test. If we believe the hypothesis is true, then we "accept" H0, and if we believe the hypothesis is false, then we "reject" H0 (and accept the alternate hypothesis Ha).

Now of course when we conduct the actual experiment, we don't expect to get a value for x̄ which is exactly equal to μ. But we need to decide whether the difference between our measured value and the hypothetical value is significant or not. (That's why this is called "significance testing.") There are two possible errors we could make:

• We might reject the null hypothesis when in fact it was correct. This type of mistake is called a Type I error. (You might call this a "false alarm" mistake.)

• A Type II error occurs when we fail to reject the null hypothesis when we should have. (You might call this a "failed alarm.")

Jury Trial Analogy. Suppose a person is accused of a crime, and his (or her) case goes to a jury trial. Let's take as our null hypothesis that the person is innocent, and as our alternate hypothesis that the person is guilty. If the jury inadvertently convicts an innocent person, then this is a Type I error: the null (innocence) hypothesis has been incorrectly rejected. On the other hand, the jury may acquit someone who is guilty, which is a Type II error. Unlike a subjective jury trial, in statistics we attach numbers to this process.
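One way to see what "attaching numbers" means is a small simulation: run many experiments in which the null hypothesis really is true, and count how often a two-tailed z-test falsely rejects it. This is a sketch, with illustrative parameters borrowed from the seabird example; the function name is mine:

```python
import random
import statistics

def type_i_error_rate(mu0, sigma, n, num_trials, seed=1):
    """Simulate experiments where H0 (mean = mu0) is actually true and
    count how often |z| > 1.96 leads us to falsely reject it."""
    rng = random.Random(seed)
    se = sigma / n ** 0.5
    rejections = 0
    for _ in range(num_trials):
        xbar = statistics.fmean(rng.gauss(mu0, sigma) for _ in range(n))
        if abs((xbar - mu0) / se) > 1.96:
            rejections += 1
    return rejections / num_trials

rate = type_i_error_rate(mu0=2.4, sigma=0.38, n=32, num_trials=5000)
```

The estimated rate comes out near 5%, which is exactly the false-alarm probability the ±1.96 cutoff was designed to give.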
We define α and β, respectively, to be the probabilities of committing the two types of error described. α is called the significance level. Since Type I errors are considered very bad, we'll take α to be quite small, usually either 5% or 1%. Hence 1 − α (the probability of not committing a Type I error) will be high (either 95% or 99%). The quantity 1 − β is called the power of a hypothesis test. This is the probability of not committing a Type II error. In other words, the power measures the test's ability to (correctly) reject the null hypothesis when it is actually false.

Let's set a significance level of 5%. In the seabird example (Example 4) we calculated that there's a 95% chance that the distance between μ and x̄ is less than 2.201 · s/√n ≈ 0.24 kg. If the hypothesized value of μ is 2.4 and we obtain a sample value x̄ = 2.6, this difference is not significant enough to reject the null hypothesis (if we want the probability of committing a Type I error to be less than 5%).

Estimating the Difference Between Two Population Means

Motivating Problem. Suppose that there are two populations of seabirds on different islands. We'll use subscripts to distinguish between the two populations, so μ1 is the population mean on island #1, and μ2 on island #2. If we take samples and measure x̄1 and x̄2, these numbers will of course not be equal. But is the observed difference significant, or is it due to variation in the samples? In other words, should we accept or reject the null hypothesis

    H0: μ1 = μ2

It seems like there are two sample distributions in play here, one for each island. But we can combine these into a single distribution if we take our statistic to be the difference x̄1 − x̄2. In other words, suppose we take a sample of size n1 from the first island and calculate x̄1. Then we take a sample of size n2 from the second island, calculate x̄2, and subtract the two sample means.
If we repeat this process infinitely many times, then we will obtain a sample distribution for the quantity x̄1 − x̄2. We need a theorem (similar to the Central Limit Theorem) which describes this distribution (see p. 318 in the book).

Theorem (Properties of the Sampling Distribution of x̄1 − x̄2). Suppose independent random samples of sizes n1 and n2 have been taken from populations with means μ1 and μ2 and variances σ1² and σ2².

(i) If the two populations are normally distributed, then the sample distribution of x̄1 − x̄2 is also normally distributed, regardless of the sample sizes. If the populations are not normally distributed, then the distribution of x̄1 − x̄2 approaches a normal distribution when n1 and n2 are large (≥ 30).

(ii) The mean of x̄1 − x̄2 is the difference of the population means: μ(x̄1 − x̄2) = μ1 − μ2.

(iii) The standard error (i.e. the standard deviation of the sample distribution) is given by

    SE = √(σ1²/n1 + σ2²/n2)

When the population variances are unknown, we can use the sample variances, as long as n1 and n2 are large:

    SE ≈ √(s1²/n1 + s2²/n2)

So in a problem like the one described above, there are three different cases:

• If both samples are large, then we calculate an appropriate z variable and use the normal table to analyze the problem (Section 8.6 in the book).

• If the samples are small, the populations follow a normal distribution, and the variances are approximately the same, then we can use an appropriate t-distribution (Section 10.4).

• If neither of these criteria is met, then the problem is beyond the scope of the course, and we won't talk about it.

Example 5. You take a sample of 35 birds on Island #1 and find x̄1 = 2.4 kg with sample standard deviation s1 = .6 kg. A similar study of 30 birds on Island #2 gives x̄2 = 2.05 kg with sample standard deviation s2 = .7 kg. Is this difference significant? (Assume a significance level α = .05.)

Solution.
We're trying to determine whether to reject the null hypothesis

    H0: μ1 − μ2 = 0

(I re-phrased H0 in terms of the difference, since that's the statistic we're using.) Since n1 and n2 are both ≥ 30, it's OK to use a z-test. To find the z variable we need to calculate

    z = (statistic − hypothesized value)/SE = ((x̄1 − x̄2) − (μ1 − μ2))/SE = (x̄1 − x̄2)/SE

Let's calculate the standard error separately:

    SE = √(s1²/n1 + s2²/n2) = √(.6²/35 + .7²/30) ≈ .1632

Therefore z is equal to

    z = (2.4 − 2.05)/.1632 ≈ 2.145

Since this is outside the range −1.96 < z < 1.96, we are justified in rejecting the null hypothesis and concluding that the measured difference is statistically significant.

Note that in this problem we could have reached the same conclusion by finding a 95% confidence interval for the measured variable x̄1 − x̄2. If 0 (the hypothesized μ1 − μ2) is in this interval, we accept the null hypothesis; if 0 is outside this interval, we reject it. The confidence interval is given by

    x̄1 − x̄2 ± 1.96 SE = .35 ± (1.96)(.1632) = .35 ± .32

or from .03 to .67.

Comparing Two Means When the Sample Size Is Small

In the small-sample case we assume the following:

1. Both population distributions are normal.
2. The population standard deviations are equal: σ1 = σ2. Call this common value σ (without the subscript).

If we happen to know σ, then we can use a z variable as before, using the formula

    z = ((x̄1 − x̄2) − (μ1 − μ2))/√(σ²/n1 + σ²/n2) = ((x̄1 − x̄2) − (μ1 − μ2))/(σ√(1/n1 + 1/n2))

Normally we won't know σ. We have to estimate it using the sample standard deviation(s), and this will lead to the use of a t-distribution rather than the normal distribution:

    t = ((x̄1 − x̄2) − (μ1 − μ2))/√(s²(1/n1 + 1/n2))

So there are two questions: (1) What number should we use for the variance s², since the experiment comes with two different values s1² and s2²? And (2) what should we use as the "degrees of freedom" df in the t-tables?
Since we're assuming that σ1 = σ2, we expect that the value of s1 will be very close to s2. If the sample sizes are the same, then we can just average the two sample variances to get the best possible s². But if one sample is larger than the other, we should give its variance more credence, so we take a type of weighted average (see p. 401):

    s² = ((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2)

The sample variance calculated in this manner is called the pooled estimate for σ².

Theorem. If two populations are normally distributed with the same standard deviation, and the sample sizes are small, then the variable

    t = ((x̄1 − x̄2) − (μ1 − μ2))/√(s²(1/n1 + 1/n2))

follows a t-distribution with df = n1 + n2 − 2. For s² use the pooled estimate defined above.
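The two two-sample recipes, the large-sample z statistic of Example 5 and the pooled small-sample t, can be sketched side by side. The helper names are mine; the z-test numbers are Example 5's, while the small-sample weights are purely illustrative, not from the text:

```python
def two_sample_z(x1, s1, n1, x2, s2, n2):
    """Large-sample z statistic for H0: mu1 = mu2."""
    se = (s1**2 / n1 + s2**2 / n2) ** 0.5
    return (x1 - x2) / se

def pooled_t(x1, s1, n1, x2, s2, n2):
    """Small-sample t statistic using the pooled variance; df = n1 + n2 - 2."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = (sp2 * (1 / n1 + 1 / n2)) ** 0.5
    return (x1 - x2) / se, n1 + n2 - 2

# Example 5: z ≈ 2.145, outside ±1.96, so H0 is rejected
z = two_sample_z(2.4, 0.6, 35, 2.05, 0.7, 30)

# Hypothetical small samples from the two islands (illustrative numbers)
t, df = pooled_t(2.6, 0.38, 12, 2.35, 0.41, 10)
```

For the hypothetical small samples, the resulting t would be compared against the t-critical value for df = 20 from the table, not against ±1.96.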