Download Statistics Chapter 8 Estimation

Chapter 8 Estimation

ESTIMATING 𝝁 WHEN 𝝈 IS KNOWN

Note:
• Because of time and money constraints, difficulty in finding population members, and so forth, we usually do not have access to all measurements of an entire population. Therefore, we rely on information from a sample.
• We will learn techniques for estimating the population mean using sample data.

Assumptions about the random variable x
• 1. We have a simple random sample of size n drawn from a population of x values.
• 2. The value of $\sigma$, the population standard deviation of x, is known.
• 3. If the x distribution is normal, then our methods work for any sample size n.
• 4. If x has an unknown distribution, then we require a sample size $n \ge 30$. However, if the x distribution is distinctly skewed and definitely not mound-shaped, a sample size of 50 or even 100 or higher may be necessary.

Point estimate
• An estimate of a population parameter given by a single number is called a point estimate.
• $\bar{x}$ is the point estimate for $\mu$.

Note:
• Even with a large random sample, the value of $\bar{x}$ is usually not exactly equal to the population mean $\mu$.
• We therefore need a measure of the margin of error.

Margin of Error
• When using $\bar{x}$ as a point estimate for $\mu$, the margin of error is the magnitude of $\bar{x} - \mu$, that is, $|\bar{x} - \mu|$.

Note:
• We cannot say exactly how close $\bar{x}$ is to $\mu$ when $\mu$ is unknown. Therefore, the exact margin of error is unknown when the population parameter is unknown.
• Instead, we use a confidence interval, which attaches a probability to the estimate and so tells us how reliable it is.

Confidence Interval
• For a confidence level c, the critical value $z_c$ is the number such that the area under the standard normal curve between $-z_c$ and $z_c$ equals c.
• $P(-z_c < z < z_c) = c$

Review!
• Find the z value such that 99% of the area under the standard normal curve lies between −z and z.
• Another way to say this:
• $P(-z_{0.99} < z < z_{0.99}) = 0.99$

Group work
• Find the z value such that 95% of the area under the standard normal curve lies between −z and z.
• $P(-z_{0.95} < z < z_{0.95}) = 0.95$

Levels of confidence and their corresponding critical values

Level of Confidence c     Critical Value $z_c$
0.70, or 70%              1.04
0.75, or 75%              1.15
0.80, or 80%              1.28
0.85, or 85%              1.44
0.90, or 90%              1.645
0.95, or 95%              1.96
0.98, or 98%              2.33
0.99, or 99%              2.58

Let’s put everything together
• From the central limit theorem, we know $\bar{x}$ is approximately normal with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ when n is large.
• Also from the central limit theorem, $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
• Combined with $P(-z_c < z < z_c) = c$
• We get:
• $P\left(-z_c \dfrac{\sigma}{\sqrt{n}} < \bar{x} - \mu < z_c \dfrac{\sigma}{\sqrt{n}}\right) = c$

E is known as the maximal margin of error
• $E = z_c \dfrac{\sigma}{\sqrt{n}}$
• E is also known as the error tolerance. It is the bound on the margin of error.

Let’s do more math manipulation
• We get:
• $P(-E < \bar{x} - \mu < E) = c$
• Followed by
• $P(\bar{x} - E < \mu < \bar{x} + E) = c$
• This is called a c confidence interval for $\mu$.

Confidence Interval for $\mu$
• $\bar{x} - E$ to $\bar{x} + E$
• It is an interval computed from sample data in such a way that c is the probability of generating an interval containing the actual value of $\mu$. In other words, c is the proportion of confidence intervals, based on random samples of size n, that actually contain $\mu$.

Example:
• Julia enjoys jogging. She has been jogging over a period of several years, during which time her physical condition has remained consistently good. Usually, she jogs 2 miles per day. The standard deviation of her times is $\sigma = 1.80$ minutes. During the past year, Julia has recorded her times to run 2 miles. She has a random sample of 90 of these times. For these 90 times, the mean was $\bar{x} = 15.60$ minutes. Let $\mu$ be the mean jogging time for the entire distribution of Julia’s 2-mile running times.
Find a 0.95 confidence interval for $\mu$.

Answer
• $E = z_c \dfrac{\sigma}{\sqrt{n}} = 1.96 \cdot \dfrac{1.80}{\sqrt{90}}$
• $E \approx 0.37$
• $\bar{x} - E < \mu < \bar{x} + E$
• $15.60 - 0.37 < \mu < 15.60 + 0.37$
• $15.23 < \mu < 15.97$
• We can conclude with 95% confidence that the interval from 15.23 minutes to 15.97 minutes is one that contains the population mean $\mu$ of jogging times for Julia.

Group work
• Mr. Liu enjoys talking to people and getting to know them. Usually he talks to 5 people per day. The standard deviation of his talk times is $\sigma = 3.20$ minutes. During the past year, he has recorded the time it takes him to talk to 5 people. He has a random sample of n = 150 of these times. For those 150 times, the mean was $\bar{x} = 12.46$ minutes. Let $\mu$ be the mean talking time for the entire distribution. Find a 0.95 confidence interval for $\mu$.

Group Work
• Walter meets Julia at the track. He prefers to jog 3 miles. He knows that $\sigma = 3.67$ minutes. For a random sample of 90 of his times, the mean time was $\bar{x} = 23.45$ minutes. Let $\mu$ be the mean jogging time for the entire distribution of Walter’s 3-mile running times. Find a 0.99 confidence interval for $\mu$.

How to find the sample size n for estimating $\mu$ when $\sigma$ is known
• Assume the distribution of $\bar{x}$ is approximately normal.
• $n = \left(\dfrac{z_c \sigma}{E}\right)^2$
• E = specified maximal error of estimate
• $\sigma$ = population standard deviation
• $z_c$ = critical value from the normal distribution for the desired confidence level c
• If n is not a whole number, increase n to the next higher whole number. Note that n is the minimal sample size for a specified confidence level and maximal error of estimate E.

Example:
• A wildlife study is designed to find the mean weight of salmon caught by an Alaskan fishing company. A preliminary study of a random sample of 50 salmon showed $s \approx 2.15$ pounds. How large a sample should be taken to be 99% confident that the sample mean $\bar{x}$ is within 0.20 pound of the true mean weight $\mu$?
Answer
• Since the preliminary sample of 50 fish is large enough (50 > 30), s is a good approximation for $\sigma$, so we use $\sigma \approx 2.15$.
• $n = \left(\dfrac{z_c \sigma}{E}\right)^2$
• $n = \left(\dfrac{2.58 \times 2.15}{0.20}\right)^2 = 769.2$
• So the sample should contain at least 770 fish.

Group Work
• A study is designed to find the mean number of boyfriends/girlfriends a person has in his/her lifetime. A preliminary study with n = 60 showed that $s \approx 5.16$ people. How large a sample should be taken to be 99% confident that the sample mean $\bar{x}$ is within 0.50 people of the true mean $\mu$?

Homework Practice
• Pg 338 #1-20 eoo (check answers in the back)

ESTIMATING 𝝁 WHEN 𝝈 IS UNKNOWN

Well…here is the situation
• We have just learned how to estimate $\mu$ when $\sigma$ is known. But much of the time, when $\mu$ is unknown, $\sigma$ is also unknown.
• In such cases, we use the sample standard deviation s to approximate $\sigma$.
• When we use s to approximate $\sigma$, the standardized sample mean no longer follows the standard normal distribution. It follows a new distribution called a Student’s t distribution.

Note:
• The method we are about to learn is the one used most often in practice.
• The situation from the last section ($\sigma$ known) almost never happens!

Student’s t distribution
• Assume that x has a normal distribution with mean $\mu$. For samples of size n with sample mean $\bar{x}$ and sample standard deviation s, the t variable
• $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
• has a Student’s t distribution with degrees of freedom d.f. = n − 1.

What is degrees of freedom?
• d.f. = n − 1
• Degrees of freedom is the number of values free to vary when a statistic or parameter is fixed.
• Example: if a student needs a 90 average based on three tests, and the first two scores are 82 and 95, then the last score is fixed. It must be a 93; in other words, only the first two scores were “free to vary”.

Properties of a Student’s t distribution
• 1) The distribution is symmetric about the mean 0.
• 2) The distribution depends on the degrees of freedom, d.f. (d.f. = n − 1 for $\mu$ confidence intervals).
• 3) The distribution is bell-shaped, but has thicker tails than the standard normal distribution.
• 4) As the degrees of freedom increase, the t distribution approaches the standard normal distribution.

Now you have to be careful
• When you look up critical values for confidence intervals, make sure you read the t critical value $t_c$ for the correct degrees of freedom, and not the wrong one.

Confidence Interval
• $P(-t_c < t < t_c) = c$

Note!!!!
• If the degrees of freedom d.f. you need are not in the table, use the closest d.f. in the table that is smaller. This procedure results in a critical value that is more conservative in the sense that it is larger. The resulting confidence interval will be longer and have a probability that is slightly higher than c.

Example: Activity time!
• Using the t-chart
• Go to Table 6 of Appendix II pg. A24
• Find the critical value $t_c$ for a 0.99 confidence level for a t distribution with sample size n = 5
• Procedure:
– First, we find the column with c heading 0.990.
– Then we compute the number of degrees of freedom: d.f. = n − 1 = 5 − 1 = 4.
– Last, we read down the column under the heading c = 0.990 until we reach the row headed by 4.
– The answer should be: 4.604

Group Activity
• A) Find the critical value $t_c$ for a 0.95 confidence level for a t distribution with sample size n = 13
• B) Find the critical value $t_c$ for a 0.99 confidence level for a t distribution with sample size n = 32
• C) Find the critical value $t_c$ for a 0.90 confidence level for a t distribution with sample size n = 7

Answer
• A) 2.179
• B) 2.750 (d.f. = 31 is not in the table, so use the closest smaller entry, d.f. = 30)
• C) 1.943

Maximal margin of error, E
• $E = t_c \dfrac{s}{\sqrt{n}}$

Confidence Interval
• $P\left(-t_c \dfrac{s}{\sqrt{n}} < \bar{x} - \mu < t_c \dfrac{s}{\sqrt{n}}\right) = c$
• $P(\bar{x} - E < \mu < \bar{x} + E) = c$
• See the previous section for the derivation.

Summary:
• Confidence interval for $\mu$ when $\sigma$ is unknown
• $\bar{x} - E < \mu < \bar{x} + E$
• where $\bar{x}$ = sample mean of a simple random sample
• $E = t_c \dfrac{s}{\sqrt{n}}$
• c = confidence level (0 < c < 1)
• $t_c$ = critical value for confidence level c and degrees of freedom d.f. = n − 1

Example:
• Suppose an archaeologist discovers only 7 fossil skeletons from a previously unknown species of miniature horse. Reconstructions of the skeletons of these 7 miniature horses show the shoulder heights (in cm) to be:
• 45.3 47.1 44.2 46.8 46.5 45.5 47.6
• A) Find the mean and the sample standard deviation.
• B) Find a 99% confidence interval for $\mu$.

Answer
• A) $\bar{x} = 46.14$, $s = 1.19$
• B) d.f. = n − 1 = 7 − 1 = 6
• $t_{0.99} = 3.707$
• $E = t_c \dfrac{s}{\sqrt{n}} = 3.707 \cdot \dfrac{1.19}{\sqrt{7}} = 1.67$
• $\bar{x} - E < \mu < \bar{x} + E$
• $46.14 - 1.67 < \mu < 46.14 + 1.67$
• $44.47 < \mu < 47.81$

Group Work
• A company has a new process for manufacturing large artificial sapphires. In a trial run, 37 sapphires are produced. The mean weight for these 37 gems is 6.75 carats and the sample standard deviation is 0.33 carat. Let $\mu$ be the mean weight for the distribution of all sapphires produced by the new process. Find a 95% confidence interval for $\mu$ and interpret it.

Answer
• $6.64 < \mu < 6.86$
• The company can be 95% confident that the interval from 6.64 to 6.86 carats is an interval that contains the population mean weight of sapphires produced by the new process.

Group Work
• See’s Candies uses a new process to create its chocolate candies.
In a trial run, the weights per box (in lb) are: 12.1 10.9 15.2 11.3 12.5 11.8
• Find a 90% confidence interval for the population mean weight per box.

Homework Practice
• Pg 349 #1-14 eoe

ESTIMATING p IN THE BINOMIAL DISTRIBUTION

Note:
• Remember that a binomial distribution is completely determined by the number of trials n and the probability p of success on a single trial.
• For most experiments, the number of trials is chosen in advance, so the distribution is then completely determined by p.

The point estimates for p and q are
• $\hat{p} = \dfrac{r}{n}$
• $\hat{q} = 1 - \hat{p}$
• where n = number of trials and r = number of successes.

Margin of Error for Binomial Distribution
• $|\hat{p} - p|$
• Or, as a maximal margin of error at confidence level c,
• $E = z_c \sqrt{pq/n}$

Summary on how to find a confidence interval for a proportion p
• p is the population probability of success and q = 1 − p is the population probability of failure. Let r be a random variable that represents the number of successes out of the n binomial trials.
• The point estimates for p and q are $\hat{p} = \dfrac{r}{n}$ and $\hat{q} = 1 - \hat{p}$.
• The number of trials n should be sufficiently large so that both $n\hat{p} > 5$ and $n\hat{q} > 5$.
• Confidence interval for p: $\hat{p} - E < p < \hat{p} + E$
• $E \approx z_c \sqrt{\hat{p}\hat{q}/n}$
• c = confidence level (0 < c < 1)
• $z_c$ = critical value for confidence level c based on the standard normal distribution

Example:
• Suppose that 800 students are selected at random from a student body of 20000 and that they are each given a shot to prevent a certain type of flu. These 800 students are then exposed to the flu, and 600 of them do not get the flu. Let p represent the probability that the shot will be successful for any single student selected at random from the entire population of 20000.
• A) What is the number of trials n? What is the value of r?
• B) What are the point estimates for p and q?
• C) Would it seem that the number of trials is large enough to justify a normal approximation to the binomial?
• D) Find a 99% confidence interval for p.

Answer
• A) n = 800, r = 600
• B) $\hat{p} = \dfrac{600}{800} = 0.75$, $\hat{q} = 0.25$
• C) $n\hat{p} = 800(0.75) = 600 > 5$ and $n\hat{q} = 800(0.25) = 200 > 5$, so a normal approximation is justified.
• D) $E \approx z_{0.99}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} \approx 2.58\sqrt{\dfrac{(0.75)(0.25)}{800}} \approx 0.0395$
• The 99% confidence interval is then $\hat{p} - E < p < \hat{p} + E$
• $0.75 - 0.0395 < p < 0.75 + 0.0395$
• $0.71 < p < 0.79$

Group Work
• A random sample of 190 books purchased at a local bookstore showed that 71 of the books were science fiction. Let p represent the proportion of books sold by this store that are science fiction.
• A) What is a point estimate for p?
• B) Find a 90% confidence interval for p.
• C) Interpret the confidence interval.
• D) Can a normal approximation be justified?

Group Work
• A random sample of 260 hand sanitizers showed that 102 of them kill the bacteria. Let p represent the proportion of hand sanitizers that kill the bacteria.
• A) What are the point estimates?
• B) Find a 95% confidence interval for p.
• C) Can a normal approximation be justified?
• D) Interpret the confidence interval.

General interpretation of poll results
• 1) When a poll states the results of a survey, the proportion reported to respond in the designated manner is $\hat{p}$, the sample estimate of the population proportion.
• 2) The margin of error is the maximal error E of a 95% confidence interval for p.
• 3) A 95% confidence interval for the population proportion p is
• poll report $\hat{p}$ − margin of error E < p < poll report $\hat{p}$ + margin of error E

Example:
• A) What confidence level corresponds to the phrase “chances are 19 of 20 that if…”
• B) What margin of error corresponds to the phrase “results by no more than 2.6 percentage points in either direction”?

Answer
• A) 19/20 = 0.95, or 95%
• B) 2.6%

How to find the sample size n for estimating a proportion p
• $n = p(1 - p)\left(\dfrac{z_c}{E}\right)^2$ if you have a preliminary estimate for p
• $n = \dfrac{1}{4}\left(\dfrac{z_c}{E}\right)^2$ if you do not have a preliminary estimate for p
• If n is not a whole number, increase n to the next higher whole number.
• Also, if necessary, increase the sample size n to ensure that both np > 5 and nq > 5. Note that n is the minimal sample size for a specified confidence level and maximal error of estimate.

Example
• A company is in the business of selling wholesale popcorn to grocery stores. The company buys directly from farmers. A buyer for the company is examining a large amount of corn from a certain farmer. Before the purchase is made, the buyer wants to estimate p, the probability that a kernel will pop.
• Suppose a random sample of n kernels is taken and r of these kernels pop. The buyer wants to be 95% sure that the point estimate $\hat{p} = \dfrac{r}{n}$ for p will be in error either way by less than 0.01.
• A) If no preliminary study is made to estimate p, how large a sample should the buyer use?
• B) A preliminary study showed that p was approximately 0.86. If the buyer uses the results of the preliminary study, how large a sample should be used?

Answer
• $z_{0.95} = 1.96$
• A) $n = \dfrac{1}{4}\left(\dfrac{z_c}{E}\right)^2 = \dfrac{1}{4}\left(\dfrac{1.96}{0.01}\right)^2 = 0.25(38416) = 9604$
• B) $n = p(1 - p)\left(\dfrac{z_c}{E}\right)^2 = (0.86)(0.14)\left(\dfrac{1.96}{0.01}\right)^2 = 4625.29$, so n = 4626

Group Work
• Blah blah blah blah…99% sure that the point estimate $\hat{p} = \dfrac{r}{n}$ for p will be in error either way by less than 0.48
• A) If no preliminary study is made to estimate p, how large a sample should we use?
• B) A preliminary study showed that p was approximately 0.42. If we use the results of the preliminary study, how large a sample should be used?

Homework practice
• Pg 361 #1-21 odd

ESTIMATING $\mu_1 - \mu_2$ AND $p_1 - p_2$

Note:
• How can we tell if two populations are different?
• One way is to compare the difference in population means or the difference in population proportions.
• Two samples are independent if sample data drawn from one population are completely unrelated to the selection of sample data from the other population.
• Two samples are dependent if each data value in one sample can be paired with a corresponding data value in the other sample.

Dependent Samples
• Dependent samples and data pairs occur very naturally in situations in which the same object or item is measured twice.

Independent Samples
• Independent samples occur very naturally when we draw two random samples, one from the first population and one from the second population. Since both samples are random samples, there is no pairing of measurements between the two populations.

Situation 1: confidence intervals for $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are known
• Let $\sigma_1$ and $\sigma_2$ be the population standard deviations of populations 1 and 2. Obtain two independent random samples from populations 1 and 2, where
• $\bar{x}_1$ and $\bar{x}_2$ are sample means from populations 1 and 2
• $n_1$ and $n_2$ are sample sizes from populations 1 and 2
• If you can assume that both population distributions 1 and 2 are normal, any sample sizes $n_1$ and $n_2$ will work. If you cannot assume this, then use sample sizes greater than or equal to 30: $n_1 \ge 30$ and $n_2 \ge 30$.
• Confidence interval for $\mu_1 - \mu_2$
• $(\bar{x}_1 - \bar{x}_2) - E < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + E$
• where $E = z_c\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
• c = confidence level (0 < c < 1)
• $z_c$ = critical value for confidence level c

Example
• Suppose you are a biologist studying fishing data from Yellowstone streams before and after a major disaster (a fire). Fishing reports include the number of trout caught per day per fisherman. A random sample of $n_1 = 167$ reports from the period before the fire showed that the average catch was $\bar{x}_1 = 5.2$ trout per day. Assume the standard deviation of daily catch per fisherman was $\sigma_1 = 1.9$. Another random sample of $n_2 = 125$ fishing reports 5 years after the disaster showed that the average catch per day was $\bar{x}_2 = 6.8$ trout. Assume the standard deviation during this period was $\sigma_2 = 2.3$.
• A) What is the population for each sample? Are they independent or dependent?
• B) Compute a 95% confidence interval for $\mu_1 - \mu_2$.
• C) Interpret the result.

Answer
• A) The populations are the daily catches per fisherman before the fire and 5 years after it. The samples are independent because they come from two different periods and there is no pairing between the reports.
• B) We have $n_1 = 167$, $\bar{x}_1 = 5.2$, $\sigma_1 = 1.9$, $n_2 = 125$, $\bar{x}_2 = 6.8$, $\sigma_2 = 2.3$, $z_{0.95} = 1.96$
• $E = z_c\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = 1.96\sqrt{\dfrac{1.9^2}{167} + \dfrac{2.3^2}{125}} = 0.4955 \approx 0.50$
• Therefore the 95% confidence interval is
• $(\bar{x}_1 - \bar{x}_2) - E < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + E$
• $(5.2 - 6.8) - 0.50 < \mu_1 - \mu_2 < (5.2 - 6.8) + 0.50$
• $-2.10 < \mu_1 - \mu_2 < -1.10$
• C) Since the interval contains only negative values, we are 95% confident that $\mu_1 < \mu_2$, that is, that the average catch before the fire was less than the average catch after the fire.

Situation 2 (most common): confidence intervals for $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are unknown
• Obtain two independent random samples from populations 1 and 2, where
• $\bar{x}_1$ and $\bar{x}_2$ are sample means from populations 1 and 2
• $s_1$ and $s_2$ are sample standard deviations from populations 1 and 2
• $n_1$ and $n_2$ are sample sizes from populations 1 and 2
• If you can assume that both population distributions 1 and 2 are normal or at least mound-shaped and symmetric, then any sample sizes $n_1$ and $n_2$ will work. If not, use sample sizes greater than or equal to 30: $n_1 \ge 30$ and $n_2 \ge 30$.
• Confidence interval for $\mu_1 - \mu_2$
• $(\bar{x}_1 - \bar{x}_2) - E < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + E$
• where $E = t_c\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
• c = confidence level (0 < c < 1)
• $t_c$ = critical value for confidence level c
• d.f. = degrees of freedom, the smaller of $n_1 - 1$ and $n_2 - 1$
• Example: if you have $n_1 - 1 = 25$ and $n_2 - 1 = 15$, you use 15 as the d.f.

Example
• Suppose that a random sample of 29 college students was randomly divided into two groups. The first group of $n_1 = 15$ people was given ½ liter of red wine before going to sleep. The second group of $n_2 = 14$ people was given no alcohol before going to sleep. Everyone in both groups went to sleep at 11 P.M. The average brain wave activity was determined for each individual in the groups. The results follow:
• Group 1: 16.0 19.6 19.9 20.9 20.3 20.1 16.4 20.6 20.1 22.3 18.8 19.1 17.4 21.1 22.1
• Group 2: 8.2 5.4 6.8 6.5 4.7 5.9 2.9 7.6 10.2 6.4 8.8 5.4 8.3 5.1
• A) Do you think the samples are independent or dependent?
Explain.
• B) What assumptions are we making about the data?
• C) Compute a 90% confidence interval for $\mu_1 - \mu_2$.

Answer
• A) Since the random sample of 29 students was randomly divided into two groups, it is reasonable to say that the samples are independent.
• B) We are assuming the populations of $x_1$ and $x_2$ are approximately normally distributed.
• C) $\bar{x}_1 = 19.65$, $s_1 = 1.86$
• $\bar{x}_2 = 6.59$, $s_2 = 1.91$
• d.f. = 13 (the smaller of 14 and 13), so $t_{0.90} = 1.771$
• $E = t_c\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = 1.771\sqrt{\dfrac{1.86^2}{15} + \dfrac{1.91^2}{14}} = 1.24$
• $(19.65 - 6.59) - 1.24 < \mu_1 - \mu_2 < (19.65 - 6.59) + 1.24$
• $11.82 < \mu_1 - \mu_2 < 14.30$

Situation 3: confidence intervals for $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are unknown but we believe that $\sigma_1 = \sigma_2$
• Obtain two independent random samples from populations 1 and 2, where
• $\bar{x}_1$ and $\bar{x}_2$ are sample means from populations 1 and 2
• $s_1$ and $s_2$ are sample standard deviations from populations 1 and 2
• $n_1$ and $n_2$ are sample sizes from populations 1 and 2
• If you can assume that both population distributions 1 and 2 are normal or at least mound-shaped and symmetric, then any sample sizes $n_1$ and $n_2$ will work. If not, use sample sizes greater than or equal to 30: $n_1 \ge 30$ and $n_2 \ge 30$.
• Confidence interval for $\mu_1 - \mu_2$ when $\sigma_1 = \sigma_2$
• $(\bar{x}_1 - \bar{x}_2) - E < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + E$
• where $E = t_c\, s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$ and s is the pooled standard deviation $s = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
• c = confidence level (0 < c < 1)
• $t_c$ = critical value for confidence level c
• d.f. = degrees of freedom, d.f. = $n_1 + n_2 - 2$

Example (when you find the s.d. they seem very close to each other)
• Height of Asians in Asia (in feet):
• 5.14 5.75 5.29 5.86 5.92 6.12 5.77 5.81 5.80 5.78
• Height of Asians in US (in feet):
• 5.16 5.72 5.30 5.84 5.95 6.00 5.79 5.80 5.85 5.81
• A) Create an 85% confidence interval for the difference in population mean height for Asians in Asia and in the US.
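
The pooled (Situation 3) interval can be sketched in code as a way to check hand calculations. This is only an illustration, not part of the slides: the function names `t_pdf`, `central_area`, `t_critical`, and `pooled_interval` are made up here, and instead of reading $t_c$ from Table 6 the sketch approximates it numerically (Simpson's rule plus bisection), which is an assumption of this example rather than the textbook's procedure.

```python
import math
from statistics import mean, stdev


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def central_area(t, df, steps=2000):
    """Area under the t density between -t and t.

    Uses Simpson's rule on [0, t] and doubles it (the density is symmetric).
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3


def t_critical(c, df):
    """Approximate t_c such that P(-t_c < t < t_c) = c, found by bisection.

    Intended for the small d.f. used in this chapter (math.gamma overflows
    for very large d.f.).
    """
    lo, hi = 0.0, 1.0
    while central_area(hi, df) < c:  # widen the bracket until it contains t_c
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_area(mid, df) < c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def pooled_interval(sample1, sample2, c):
    """c confidence interval for mu1 - mu2, assuming sigma1 = sigma2."""
    n1, n2 = len(sample1), len(sample2)
    s1, s2 = stdev(sample1), stdev(sample2)  # sample standard deviations
    df = n1 + n2 - 2
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    E = t_critical(c, df) * s_pooled * math.sqrt(1 / n1 + 1 / n2)
    diff = mean(sample1) - mean(sample2)
    return diff - E, diff + E


# Height data from the example above (in feet).
asia = [5.14, 5.75, 5.29, 5.86, 5.92, 6.12, 5.77, 5.81, 5.80, 5.78]
us = [5.16, 5.72, 5.30, 5.84, 5.95, 6.00, 5.79, 5.80, 5.85, 5.81]
low, high = pooled_interval(asia, us, 0.85)
print(f"{low:.3f} < mu1 - mu2 < {high:.3f}")
```

As a sanity check, `t_critical(0.99, 4)` should land close to the table value 4.604 found in the activity earlier in the chapter. On an exam you would still read $t_c$ from the table.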

How to find a confidence interval for $p_1 - p_2$
• Binomial Experiment 1
• $n_1$ = number of trials
• $r_1$ = number of successes out of $n_1$ trials
• $\hat{p}_1 = \dfrac{r_1}{n_1}$
• $p_1$ = population probability of success
• Binomial Experiment 2
• $n_2$ = number of trials
• $r_2$ = number of successes out of $n_2$ trials
• $\hat{p}_2 = \dfrac{r_2}{n_2}$
• $p_2$ = population probability of success

The number of trials should be sufficiently large so that all four of the following are true:
$n_1\hat{p}_1 > 5$; $n_1\hat{q}_1 > 5$; $n_2\hat{p}_2 > 5$; $n_2\hat{q}_2 > 5$

Confidence interval for $p_1 - p_2$
$(\hat{p}_1 - \hat{p}_2) - E < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + E$
$E = z_c\hat{\sigma} = z_c\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$

Example:
• Suppose two groups of subjects were randomly chosen for a sleep study. In group I, before going to sleep, the subjects spent 1 hour watching a comedy movie. In this group there were a total of 175 dreams recorded, of which 49 were dreams with feelings of anxiety, fear, or aggression. In group II, the subjects just went to sleep. There were a total of 180 dreams recorded, of which 63 were dreams with feelings of anxiety, fear, or aggression.
• A) Why could groups I and II be considered independent binomial distributions? Do we have large enough samples?
• B) Compute a 95% confidence interval for $p_1 - p_2$.

Answer
• A) The subjects were chosen randomly and the two groups do not overlap, so the samples are independent. The samples are also large enough: $n_1\hat{p}_1 = 49$, $n_1\hat{q}_1 = 126$, $n_2\hat{p}_2 = 63$, and $n_2\hat{q}_2 = 117$ are all greater than 5.
• B) $\hat{p}_1 = 49/175 = 0.28$ and $\hat{p}_2 = 63/180 = 0.35$
• $E = z_c\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}} = 1.96\sqrt{\dfrac{(0.28)(0.72)}{175} + \dfrac{(0.35)(0.65)}{180}}$
• E = 0.096
• $(\hat{p}_1 - \hat{p}_2) - E < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + E$
• $(0.28 - 0.35) - 0.096 < p_1 - p_2 < (0.28 - 0.35) + 0.096$
• $-0.166 < p_1 - p_2 < 0.026$

Homework Practice
• Pg 377 #1-23 eoo
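
The two-proportion interval from the sleep-study example can also be checked with a short script. This is a minimal sketch, not part of the slides: the helper names `z_critical` and `two_proportion_interval` are hypothetical, and $z_c$ is computed with Python's `statistics.NormalDist` instead of being read from the table (so it uses 1.95996 rather than the rounded 1.96).

```python
import math
from statistics import NormalDist


def z_critical(c):
    """Critical value z_c with P(-z_c < z < z_c) = c for the standard normal."""
    return NormalDist().inv_cdf((1 + c) / 2)


def two_proportion_interval(r1, n1, r2, n2, c):
    """c confidence interval for p1 - p2 from two independent binomial samples."""
    p1, p2 = r1 / n1, r2 / n2  # point estimates p-hat
    q1, q2 = 1 - p1, 1 - p2    # point estimates q-hat
    # The normal approximation requires all four counts to exceed 5.
    if min(n1 * p1, n1 * q1, n2 * p2, n2 * q2) <= 5:
        raise ValueError("samples too small for the normal approximation")
    E = z_critical(c) * math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    diff = p1 - p2
    return diff - E, diff + E


# Sleep study: 49 of 175 dreams in group I, 63 of 180 dreams in group II.
low, high = two_proportion_interval(49, 175, 63, 180, 0.95)
print(f"{low:.3f} < p1 - p2 < {high:.3f}")
```

Because the interval includes 0, the data at the 95% level do not show which group has the larger proportion of anxious dreams.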
