Probability and Statistics 2
Lecture Notes 2
Tuğba EFENDİGİL, Ph.D.

Agenda
- Sampling distributions
- Sampling distributions of means and the Central Limit Theorem
- Sampling distributions of the difference between two means
- Sampling distributions of a proportion
- Sampling distributions of S²
  - When σ is known (chi-squared distribution)
  - When σ is unknown (t distribution)
- Sampling distributions of two sample variances (F distribution)

Lecture #2------Tuğba Efendigil, Ph.D.

Sampling distributions

Statistical methods are used to make decisions and draw conclusions about populations. This aspect of statistics is generally called statistical inference. These techniques use the information in a sample to draw conclusions. Statistical inference may be divided into two major areas: parameter estimation and hypothesis testing.

Statistical inference is always focused on drawing conclusions about one or more parameters of a population. An important part of this process is obtaining estimates of the parameters. Suppose that we want to obtain a point estimate (a reasonable value) of a population parameter. Before the data are collected, the observations are considered to be random variables, say $X_1, X_2, \ldots, X_n$. Therefore, any function of the observations, or any statistic, is also a random variable.

Since a statistic is a random variable, it has a probability distribution. We call the probability distribution of a statistic a sampling distribution. The sampling distribution of a statistic depends on the distribution of the population, the size of the samples, and the method of choosing the samples. The probability distribution of $\bar{X}$ is called the sampling distribution of the mean.
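The idea that a statistic has its own probability distribution can be illustrated by simulation. The sketch below is only an illustration and is not taken from the lecture: it assumes an exponential population with mean 1 and standard deviation 1 (a hypothetical choice), draws many samples of size n, and summarises the resulting sample means. The names `sample_means`, `mean_and_sd`, `n`, and `reps` are illustrative.

```python
import math
import random


def sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from an exponential population
    (mean 1, standard deviation 1; an assumed, illustrative population)
    and return the list of sample means."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]


def mean_and_sd(values):
    """Return the mean and the (n - 1)-denominator standard deviation."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return m, math.sqrt(var)


n, reps = 30, 5000
means = sample_means(n, reps)
centre, spread = mean_and_sd(means)

print(f"mean of the sample means: {centre:.3f}")
print(f"sd of the sample means:   {spread:.3f}")
print(f"sigma / sqrt(n):          {1 / math.sqrt(n):.3f}")
```

Changing the population, the sample size, or the way samples are drawn changes the distribution of the simulated means, which is the dependence described above.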
Sampling distributions of Mean and the Central Limit Theorem

$\lim_{n \to \infty} f(\bar{X}) \sim N(\mu, \sigma^2/n)$

That is, for large n the sampling distribution of $\bar{X}$ is approximately normal with mean $\mu$ and variance $\sigma^2/n$.

Ex.: An electrical firm manufactures light bulbs that have a length of life that is approximately normally distributed, with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have an average life of less than 775 hours.

Solution: The sampling distribution of $\bar{X}$ is approximately normal with $\mu_{\bar{X}} = 800$ and $\sigma_{\bar{X}} = 40/\sqrt{16} = 10$. Then $z = (775 - 800)/10 = -2.5$, so $P(\bar{X} < 775) = P(Z < -2.5) = 0.0062$.

Ex.: Traveling between two campuses of a university in a city via shuttle bus takes, on average, 28 minutes with a standard deviation of 5 minutes. In a given week, a bus transported passengers 40 times. What is the probability that the average transport time was more than 30 minutes? Assume the mean time is measured to the nearest minute.

Solution: Here $\mu = 28$, $\sigma = 5$ and $n = 40$. Because the mean time is measured to the nearest minute, an average greater than 30 is treated as $\bar{x} \ge 30.5$. Then $z = (30.5 - 28)/(5/\sqrt{40}) = 3.16$, so $P(\bar{X} > 30) = P(Z \ge 3.16) = 0.0008$.

Sampling distributions of the Difference between Two Means

A scientist or engineer may be interested in a comparative experiment in which two manufacturing methods, 1 and 2, are to be compared. The basis for that comparison is $\mu_1 - \mu_2$, the difference in the population means.

For independent samples of sizes $n_1$ and $n_2$, the sampling distribution of $\bar{X}_1 - \bar{X}_2$ is approximately normal with mean $\mu_1 - \mu_2$ and variance $\sigma_1^2/n_1 + \sigma_2^2/n_2$.

Ex.: Two independent experiments are run in which two different types of paint are compared. Eighteen specimens are painted using type A, and the drying time, in hours, is recorded for each. The same is done with type B. The population standard deviations are both known to be 1.0. Assuming that the mean drying time is equal for the two types of paint, find $P(\bar{X}_A - \bar{X}_B > 1.0)$, where $\bar{X}_A$ and $\bar{X}_B$ are average drying times for samples of size $n_A = n_B = 18$.

Solution: With equal population means, $\mu_{\bar{X}_A - \bar{X}_B} = 0$ and $\sigma^2_{\bar{X}_A - \bar{X}_B} = 1/18 + 1/18 = 1/9$. Then $z = (1.0 - 0)/\sqrt{1/9} = 3.0$, so $P(\bar{X}_A - \bar{X}_B > 1.0) = P(Z > 3.0) = 0.0013$.

Ex.: The television picture tubes of manufacturer A have a mean lifetime of 6.5 years and a standard deviation of 0.9 year, while those of manufacturer B have a mean lifetime of 6.0 years and a standard deviation of 0.8 year. What is the probability that a random sample of 36 tubes from manufacturer A will have a mean lifetime that is at least 1 year more than the mean lifetime of a sample of 49 tubes from manufacturer B?

Solution: $\mu_{\bar{X}_A - \bar{X}_B} = 6.5 - 6.0 = 0.5$ and $\sigma_{\bar{X}_A - \bar{X}_B} = \sqrt{0.81/36 + 0.64/49} = 0.189$. Then $z = (1.0 - 0.5)/0.189 = 2.65$, so $P(\bar{X}_A - \bar{X}_B \ge 1.0) = P(Z \ge 2.65) = 0.0040$.

Sampling distributions of a Proportion

The estimator of a population proportion of successes is the sample proportion. That is, we count the number of successes in a sample and compute $\hat{P} = X/n$, where X is the number of successes and n is the sample size. Note that n and p are the parameters of a binomial distribution.

The sampling distribution of $\hat{P}$ is approximately normal with mean p and variance $p(1 - p)/n$ if p is not too close to either 0 or 1 and if n is relatively large. Typically, to apply this approximation we require that $np$ and $n(1 - p)$ both be greater than or equal to 5.
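The probabilities asked for in the examples above can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`. The helper names (`mean_dist`, `diff_dist`, `prop_dist`) are illustrative, and the proportion inputs (p = 0.5, n = 100) are hypothetical values chosen only to exercise the normal approximation. The shuttle-bus example is left out because its answer depends on how the "nearest minute" rounding is handled.

```python
from math import sqrt
from statistics import NormalDist


def mean_dist(mu, sigma, n):
    """Approximate sampling distribution of the sample mean."""
    return NormalDist(mu, sigma / sqrt(n))


def diff_dist(mu1, sigma1, n1, mu2, sigma2, n2):
    """Approximate sampling distribution of the difference between two
    independent sample means."""
    return NormalDist(mu1 - mu2, sqrt(sigma1**2 / n1 + sigma2**2 / n2))


def prop_dist(p, n):
    """Normal approximation for the sample proportion; the rule of thumb
    np >= 5 and n(1 - p) >= 5 is checked first."""
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("normal approximation not recommended")
    return NormalDist(p, sqrt(p * (1 - p) / n))


# Light bulbs: P(mean life < 775) with mu = 800, sigma = 40, n = 16
p_bulbs = mean_dist(800, 40, 16).cdf(775)

# Paint: P(mean_A - mean_B > 1.0), equal means, sigma = 1.0, n = 18 each
p_paint = 1 - diff_dist(0, 1.0, 18, 0, 1.0, 18).cdf(1.0)

# Picture tubes: P(mean_A - mean_B >= 1.0)
p_tubes = 1 - diff_dist(6.5, 0.9, 36, 6.0, 0.8, 49).cdf(1.0)

# Hypothetical proportion case: P(sample proportion > 0.6), p = 0.5, n = 100
p_prop = 1 - prop_dist(0.5, 100).cdf(0.6)

print(round(p_bulbs, 4), round(p_paint, 4), round(p_tubes, 4), round(p_prop, 4))
```

Working with a distribution object rather than a z table avoids rounding the z-value to two decimals, so results can differ from table-based answers in the last digit.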
Sampling distributions of S² When σ is Known: Chi-squared distribution

[Figure: probability density functions of several chi-squared distributions]

Chi-squared Table
Rows: degrees of freedom (v). Columns: upper-tail probability α.

 v   0.99  0.975   0.95   0.90   0.75   0.50   0.10   0.05  0.025   0.01
 1   0.00   0.00   0.00   0.02   0.10   0.45   2.71   3.84   5.02   6.63
 2   0.02   0.05   0.10   0.21   0.58   1.39   4.61   5.99   7.38   9.21
 3   0.11   0.22   0.35   0.58   1.21   2.37   6.25   7.81   9.35  11.34
 4   0.30   0.48   0.71   1.06   1.92   3.36   7.78   9.49  11.14  13.28
 5   0.55   0.83   1.15   1.61   2.67   4.35   9.24  11.07  12.83  15.09
 6   0.87   1.24   1.64   2.20   3.45   5.35  10.64  12.59  14.45  16.81
 7   1.24   1.69   2.17   2.83   4.25   6.35  12.02  14.07  16.01  18.48
 8   1.65   2.18   2.73   3.49   5.07   7.34  13.36  15.51  17.53  20.09
 9   2.09   2.70   3.33   4.17   5.90   8.34  14.68  16.92  19.02  21.67
10   2.56   3.25   3.94   4.87   6.74   9.34  15.99  18.31  20.48  23.21
11   3.05   3.82   4.57   5.58   7.58  10.34  17.28  19.68  21.92  24.72
12   3.57   4.40   5.23   6.30   8.44  11.34  18.55  21.03  23.34  26.22
13   4.11   5.01   5.89   7.04   9.30  12.34  19.81  22.36  24.74  27.69
14   4.66   5.63   6.57   7.79  10.17  13.34  21.06  23.68  26.12  29.14
15   5.23   6.26   7.26   8.55  11.04  14.34  22.31  25.00  27.49  30.58
16   5.81   6.91   7.96   9.31  11.91  15.34  23.54  26.30  28.85  32.00
17   6.41   7.56   8.67  10.09  12.79  16.34  24.77  27.59  30.19  33.41
18   7.01   8.23   9.39  10.86  13.68  17.34  25.99  28.87  31.53  34.81
19   7.63   8.91  10.12  11.65  14.56  18.34  27.20  30.14  32.85  36.19
20   8.26   9.59  10.85  12.44  15.45  19.34  28.41  31.41  34.17  37.57
21   8.90  10.28  11.59  13.24  16.34  20.34  29.62  32.67  35.48  38.93
22   9.54  10.98  12.34  14.04  17.24  21.34  30.81  33.92  36.78  40.29
23  10.20  11.69  13.09  14.85  18.14  22.34  32.01  35.17  38.08  41.64
24  10.86  12.40  13.85  15.66  19.04  23.34  33.20  36.42  39.36  42.98
25  11.52  13.12  14.61  16.47  19.94  24.34  34.38  37.65  40.65  44.31
26  12.20  13.84  15.38  17.29  20.84  25.34  35.56  38.89  41.92  45.64
27  12.88  14.57  16.15  18.11  21.75  26.34  36.74  40.11  43.19  46.96
28  13.56  15.31  16.93  18.94  22.66  27.34  37.92  41.34  44.46  48.28
29  14.26  16.05  17.71  19.77  23.57  28.34  39.09  42.56  45.72  49.59
30  14.95  16.79  18.49  20.60  24.48  29.34  40.26  43.77  46.98  50.89
40  22.16  24.43  26.51  29.05  33.66  39.34  51.81  55.76  59.34  63.69
50  29.71  32.36  34.76  37.69  42.94  49.33  63.17  67.50  71.42  76.15
60  37.48  40.48  43.19  46.46  52.29  59.33  74.40  79.08  83.30  88.38

Sampling distributions of S²: Chi-squared values based on probability values
Rows: degrees of freedom (v). Columns: probabilities (1-α).

 v   0.01  0.025   0.05    0.1    0.2    0.9   0.95  0.975   0.99
 1   0.00   0.00   0.00   0.02   0.06   2.71   3.84   5.02   6.63
 2   0.02   0.05   0.10   0.21   0.45   4.61   5.99   7.38   9.21
 3   0.11   0.22   0.35   0.58   1.01   6.25   7.81   9.35  11.34
 4   0.30   0.48   0.71   1.06   1.65   7.78   9.49  11.14  13.28
 5   0.55   0.83   1.15   1.61   2.34   9.24  11.07  12.83  15.09
 6   0.87   1.24   1.64   2.20   3.07  10.64  12.59  14.45  16.81
 7   1.24   1.69   2.17   2.83   3.82  12.02  14.07  16.01  18.48
 8   1.65   2.18   2.73   3.49   4.59  13.36  15.51  17.53  20.09
 9   2.09   2.70   3.33   4.17   5.38  14.68  16.92  19.02  21.67
10   2.56   3.25   3.94   4.87   6.18  15.99  18.31  20.48  23.21
11   3.05   3.82   4.57   5.58   6.99  17.28  19.68  21.92  24.72
12   3.57   4.40   5.23   6.30   7.81  18.55  21.03  23.34  26.22
13   4.11   5.01   5.89   7.04   8.63  19.81  22.36  24.74  27.69
14   4.66   5.63   6.57   7.79   9.47  21.06  23.68  26.12  29.14
15   5.23   6.26   7.26   8.55  10.31  22.31  25.00  27.49  30.58
16   5.81   6.91   7.96   9.31  11.15  23.54  26.30  28.85  32.00
17   6.41   7.56   8.67  10.09  12.00  24.77  27.59  30.19  33.41
18   7.01   8.23   9.39  10.86  12.86  25.99  28.87  31.53  34.81
19   7.63   8.91  10.12  11.65  13.72  27.20  30.14  32.85  36.19
20   8.26   9.59  10.85  12.44  14.58  28.41  31.41  34.17  37.57
21   8.90  10.28  11.59  13.24  15.44  29.62  32.67  35.48  38.93
22   9.54  10.98  12.34  14.04  16.31  30.81  33.92  36.78  40.29
23  10.20  11.69  13.09  14.85  17.19  32.01  35.17  38.08  41.64
24  10.86  12.40  13.85  15.66  18.06  33.20  36.42  39.36  42.98
25  11.52  13.12  14.61  16.47  18.94  34.38  37.65  40.65  44.31
26  12.20  13.84  15.38  17.29  19.82  35.56  38.89  41.92  45.64
27  12.88  14.57  16.15  18.11  20.70  36.74  40.11  43.19  46.96
28  13.56  15.31  16.93  18.94  21.59  37.92  41.34  44.46  48.28
29  14.26  16.05  17.71  19.77  22.48  39.09  42.56  45.72  49.59
30  14.95  16.79  18.49  20.60  23.36  40.26  43.77  46.98  50.89
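A table entry can be sanity-checked by simulation. The sketch below assumes the standard result, which the text above does not spell out, that $(n-1)S^2/\sigma^2$ follows a chi-squared distribution with v = n − 1 degrees of freedom when sampling from a normal population. It draws samples of size n = 5 (so v = 4) and estimates how often the statistic exceeds 9.49, the tabulated value for v = 4 with 0.05 in the upper tail. The population parameters (μ = 10, σ = 2), the function names, and the seed are hypothetical.

```python
import random


def chi2_statistic(sample, sigma):
    """Return (n - 1) * S^2 / sigma^2 for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (n - 1) * s2 / sigma**2


def upper_tail_fraction(n, critical, reps=20000, mu=10.0, sigma=2.0, seed=1):
    """Estimate P((n - 1) S^2 / sigma^2 > critical) by simulation,
    assuming a normal population with the given (illustrative) mu, sigma."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if chi2_statistic(sample, sigma) > critical:
            hits += 1
    return hits / reps


frac = upper_tail_fraction(n=5, critical=9.49)
print(f"estimated upper-tail probability: {frac:.3f}")
```

The estimate should land near 0.05, up to simulation noise. The same function can be pointed at other rows of the table by changing `n` and `critical`.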
Sampling distributions of S² When σ is Unknown: t distribution

In many experimental scenarios, knowledge of σ is certainly no more reasonable than knowledge of the population mean μ. Often, in fact, an estimate of σ must be supplied by the same sample information that produced the sample average $\bar{x}$. As a result, a natural statistic to consider to deal with inferences on μ is

$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$,

since S is the sample analog to σ.

If the sample size is large enough, say n ≥ 30, the distribution of T does not differ considerably from the standard normal. However, for n < 30, it is useful to deal with the exact distribution of T. In developing the sampling distribution of T, we shall assume that our random sample was selected from a normal population. We can then write

$T = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{\sqrt{S^2/\sigma^2}} = \frac{Z}{\sqrt{V/(n-1)}}$,

where $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has the standard normal distribution and $V = \frac{(n-1)S^2}{\sigma^2}$ has a chi-squared distribution with $v = n - 1$ degrees of freedom.

Ex.: A chemical engineer claims that the population mean yield of a certain batch process is 500 grams per milliliter of raw material. To check this claim he samples 25 batches each month. If the computed t-value falls between $-t_{0.05}$ and $t_{0.05}$, he is satisfied with this claim. What conclusion should he draw from a sample that has a mean $\bar{x} = 518$ grams per milliliter and a sample standard deviation s = 40 grams? Assume the distribution of yields to be approximately normal.

Solution: From a t table, $t_{0.05} = 1.711$ for $v = 24$ degrees of freedom, so the engineer is satisfied if the computed t-value lies between −1.711 and 1.711. If μ = 500, then $t = (518 - 500)/(40/\sqrt{25}) = 2.25$, which is above 1.711. A t-value this large has a probability of only about 0.02 when μ = 500, so the engineer is likely to conclude that the process produces a better product than he thought.
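The engineer's decision rule can be sketched in code. The critical value 1.711 (t with 0.05 in the upper tail, v = 24) is taken from a standard t table and is an assumption here rather than something derived. The simulation assumes a normal population with true mean 500 and standard deviation 40 (hypothetical values) to show that, when the claim is true, T falls between the two critical values about 90% of the time. Function names and the seed are illustrative.

```python
import math
import random


def t_statistic(xbar, mu0, s, n):
    """T = (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


def fraction_within(n, t_crit, reps=20000, mu=500.0, sigma=40.0, seed=2):
    """Simulate T for normal samples whose true mean is `mu` and return
    the fraction of values falling between -t_crit and t_crit."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        if abs(t_statistic(xbar, mu, s, n)) < t_crit:
            inside += 1
    return inside / reps


T_CRIT = 1.711  # upper 0.05 point of t with v = 24, from a t table (assumption)

t = t_statistic(xbar=518, mu0=500, s=40, n=25)
satisfied = -T_CRIT < t < T_CRIT
coverage = fraction_within(25, T_CRIT)

print(f"t = {t:.2f}, inside (-{T_CRIT}, {T_CRIT}): {satisfied}")
print(f"simulated P(-t_0.05 < T < t_0.05): {coverage:.3f}")
```

Because the observed t-value falls outside an interval that captures T roughly 90% of the time under the claim, the sample casts doubt on μ = 500.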
Sampling distributions of two sample variances (F distribution)

Variance ratio distribution
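The variance ratio can be explored by simulation. This sketch assumes the standard definition, which the text above does not spell out: for independent samples of sizes n1 and n2 from normal populations, $F = (S_1^2/\sigma_1^2)/(S_2^2/\sigma_2^2)$ has an F distribution with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom, whose mean is v2/(v2 − 2) for v2 > 2. The sample sizes, standard deviations, function names, and seed are hypothetical.

```python
import random


def sample_variance(sample):
    """Sample variance S^2 with the (n - 1) denominator."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / (n - 1)


def simulate_f(n1, n2, sigma1, sigma2, reps=20000, seed=3):
    """Simulate (S1^2 / sigma1^2) / (S2^2 / sigma2^2) for independent
    normal samples of sizes n1 and n2."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        s1_sq = sample_variance([rng.gauss(0.0, sigma1) for _ in range(n1)])
        s2_sq = sample_variance([rng.gauss(0.0, sigma2) for _ in range(n2)])
        ratios.append((s1_sq / sigma1**2) / (s2_sq / sigma2**2))
    return ratios


n1, n2 = 5, 12
ratios = simulate_f(n1, n2, sigma1=3.0, sigma2=1.5)
simulated_mean = sum(ratios) / len(ratios)
theoretical_mean = (n2 - 1) / (n2 - 3)  # v2 / (v2 - 2) with v2 = n2 - 1

print(f"simulated mean of F:   {simulated_mean:.3f}")
print(f"theoretical mean of F: {theoretical_mean:.3f}")
```

Dividing each sample variance by its own population variance is what makes the ratio free of σ1 and σ2, so the simulated values depend only on the two degrees of freedom.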