Kuwait University
College of Business Administration
Department of Quantitative Methods and Information System

Tutorial Stat 220

Chapter 7: Sampling and sampling distribution
Chapter 8: Confidence interval
Chapter 9: Hypothesis testing
Chapter 10: Statistical inference based on two samples
Chapter 11: ANOVA
Chapter 12: Chi-square test for independence
Chapter 13: Simple linear regression
Chapter 14: Multiple regression and model building

Done by: T.A Dalal Al-Odah, T.A Narjes Akbar, T.A Dalal AL-Banwan
Supervised by Dr. Mohammed Qadry Grraph summer 2014/2015

Chapter 7

I. Sampling distribution of the sample mean \(\bar{X}\)

Exactly normal:
• If the population data follow Normal\((\mu, \sigma)\), then
• the sampling distribution is \(\bar{x} \sim N\left(\mu_{\bar{x}} = \mu,\ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\right)\).

Approximately normal:
• If the population data follow any distribution (not normal) and the sample size is large (\(n \ge 30\)), then
• the sampling distribution is \(\bar{x} \sim N\left(\mu_{\bar{x}} = \mu,\ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\right)\) (by the central limit theorem, CLT).

Where:
\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]
\(\bar{x}\): sample mean
\(\mu\): population mean
\(\sigma\): population standard deviation
\(\sigma_{\bar{x}}\): standard deviation of the sample mean (standard error)
\(\mu_{\bar{x}}\): mean of the sample mean

Note: mean = average = rate = expected

EXERCISES

1. The amounts of electric bills for all households in a city have a skewed probability distribution with a mean of $80 and a standard deviation of $25. Let \(\bar{x}\) be the mean amount of electric bills for a random sample of 75 households selected from this city. Find:
   a. The mean of the sampling distribution of \(\bar{x}\).
   b. The variance of the sampling distribution of \(\bar{x}\).
   c. The standard deviation (error) of the sampling distribution of \(\bar{x}\).
   d. The sampling distribution of the sample mean \(\bar{x}\).
   e. The probability that the mean amount of electric bills for a random sample of 75 households selected from this city will be
      i. Between $72 and $77
      ii. Within $6 of the population mean
      iii. More than the population mean by at least $5
      iv. Less than the population mean by at least $2
      v.
Either less than $72 or more than $77

2. The print on the package of 100-watt General Electric soft-white light bulbs claims that these bulbs have an average life of 750 hours. Assume that the lives of such bulbs have a normal distribution with a mean of 750 hours and a standard deviation of 55 hours. Let \(\bar{x}\) be the mean life of a random sample of 25 such bulbs.
   a. Find the mean and standard deviation of \(\bar{x}\) and describe its sampling distribution.
   b. Find the probability that the mean life of a random sample of 25 such bulbs will be within 15 hours of the population mean.
   c. Find the fraction of sample means \(\bar{x}\) that will be less than the population mean by 20 hours or more.
   d. Find the percentage of sample means \(\bar{x}\) that will be more than the population mean by at least 20 hours.
   e. Find the probability that \(\bar{x}\) will be within 1.5 standard deviations (standard errors) of the population mean.

REVIEW

1. A quality control inspector periodically checks a production process. This inspector selects simple random samples of 30 finished products and computes the sample mean product weight \(\bar{x}\). Test results over a long period of time show that 2.5% of the \(\bar{x}\) values are over 2.1 kg and 2.5% are less than 1.9 kg.
   a. What are the mean and standard deviation for the population of products produced with this process? (\(\mu = 2\), \(\sigma \approx 0.279\))
   b. Find the probability that \(\bar{x}\) will be within one standard deviation (error) of the population mean. (\(P(\mu - \sigma_{\bar{x}} \le \bar{x} \le \mu + \sigma_{\bar{x}}) = P(-\sigma_{\bar{x}} \le \bar{x} - \mu \le \sigma_{\bar{x}}) = P(-1 \le Z \le 1) = 0.683\))

2. A machine makes 3-inch-long nails. The probability distribution of the lengths of these nails is normal with a mean of 3 inches and a standard deviation of 0.1 inch. The quality control inspector takes a sample of 25 nails once a week and calculates the mean length of these nails. If the mean of this sample is either less than 2.95 inches or greater than 3.05 inches, the inspector concludes that the machine needs an adjustment.
   a.
What is the mean, standard deviation (error) and sampling distribution of the sample mean? (\(\bar{x} \sim N(\mu_{\bar{x}} = 3,\ \sigma_{\bar{x}} = 0.02)\))
   b. What is the probability that based on a sample of 35 nails the inspector will conclude that the machine needs an adjustment? (\(P(\bar{x} < 2.95) + P(\bar{x} > 3.05) = [\ldots]\))

II. Sampling distribution of the sample proportion \(\hat{P}\)

Approximately normal:
• If \(np \ge 5\) and \(n(1 - p) \ge 5\), then
• the sampling distribution is \(\hat{p} \sim N\left(\mu_{\hat{p}} = p,\ \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}\right)\) (by the central limit theorem, CLT).

Where:
\[ Z = \frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}} \]
\(p\): population proportion
\(\hat{p}\): sample proportion
\(\mu_{\hat{p}}\): mean of the sample proportion
\(\sigma_{\hat{p}}\): standard deviation of the sample proportion (standard error)

EXERCISES

1. A corporation makes auto batteries. The company claims that 80% of its batteries are good for 70 months or longer. Assume that this claim is true. Let \(\hat{p}\) be the proportion in a sample of 100 batteries that are good for 70 months or longer.
   a. What are the mean, standard deviation (error), and sampling distribution of the sample proportion?
   b. What is the probability that the proportion is less than 0.09?
   c. What is the probability that \(\hat{p}\) is within 0.05 of the population proportion?
   d. What is the probability that \(\hat{p}\) is not within 0.05 of the population proportion?
   e. What is the probability that \(\hat{p}\) is less than the population proportion by 0.06 or more?

REVIEW

1. Suppose that among the undergraduate students at a very large university 5.9% are international students and 57.8% are female.
   a. If 28 students are randomly sampled, what is the probability that fewer than 14 are female? (\(P(X_F < 14) = P\left(\frac{X_F}{28} < \frac{14}{28}\right) = P(\hat{P}_F < 0.5) = P(Z < -0.84) = 0.2005\))
   b. If 10 students are randomly sampled, what is the probability that more than 10% are international students? (\(P(\hat{P}_I > 0.1) = P(Z > 0.55) = 0.2912\))

2. A machine that is used to make CDs is known to produce 6% defective CDs.
The quality control inspector selects a sample of 100 CDs every week and inspects them for being good or defective. If 8% or more of the CDs in the sample are defective, the process is stopped and the machine will be adjusted. What is the probability that based on a sample of 100 CDs the process will be stopped to adjust the machine? (\(P(\hat{p} > 0.08) = P(Z > 0.84) = 0.2005\))

Chapter 8

I. Population mean (\(\mu\))

• Point estimate of \(\mu\): \(\hat{\mu} = \bar{x}\)
• Confidence interval (C.I.) of \(\mu\):

   \(\sigma\) known:               \(\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\)
   \(\sigma\) unknown, \(n \ge 30\):  \(\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}\)
   \(\sigma\) unknown, \(n < 30\):    \(\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\)

   In each case the term after ± is the margin of error (max error / error).

• To find the sample size:
\[ n = \left(\frac{z_{\alpha/2} \cdot \text{st.dev}}{E}\right)^2 \]
• If we have a C.I. (L, U), then
   o Sample mean = \(\frac{L + U}{2}\)
   o Margin of error = \(\frac{U - L}{2} = \frac{\text{width (length)}}{2}\)

Exercises:

1. A random sample of 16 mid-sized cars, which were tested for fuel consumption, gave a mean of 26.4 miles/gallon with a standard deviation of 2.3 miles/gallon.
   a. Find a 95% confidence interval for the average fuel consumption of a mid-sized car.
   b. What assumption(s) are necessary for your answer in (a) to be valid?
   c. Find the margin of error of this interval.
   d. If we choose a sample of 100 mid-sized cars, repeat part (a).
   e. What sample size would be required to reduce the margin of error by 50%?

2. An economist wants to find a 90% confidence interval for the mean sale price of houses in a state. How large a sample should he or she select so that the estimate is within $3500 of the population mean? Assume that the standard deviation for the sale prices of all houses in this state is $31500.

3. IQ tests are designed to yield results that are approximately normally distributed. Researchers think that the population standard deviation is 15. A reporter is interested in estimating the average IQ of employees in a large high-tech firm in California. She gathers the IQ information on 22 employees of this firm and records the sample mean IQ as 106.
   a.
Compute a 90% confidence interval for the average IQ in this firm.
   b. If the C.I is (97.77, 114.23), find the confidence level.

4. In analyzing the operating cost for a huge fleet of delivery trucks, a manager takes a sample of 25 cars and calculates the sample mean and variance of the operating cost. Under the assumption that the operating cost has a normal distribution, he found that the 95% confidence interval for the mean operating cost is between 253 and 300 K.D.
   a. Find the maximum error of estimate (error bound) for this interval.
   b. Find the sample mean and variance.
   c. The manager said he is 95% confident that the sample mean lies within this interval. Do you agree? Why?
   d. Construct a 90% confidence interval for the true mean. Find the margin of error of this interval.

Review:

1. To measure the time taken to manufacture a device, a random sample was chosen. The following is the assembly time (the time taken to fix each device, in minutes) for the sample:

   8 10 12 15 17

   If the sample information is used to estimate the population mean of the assembly time, then:
   a. Give a point estimate and a 99% confidence interval for the population mean. State the necessary assumptions. (\(\hat{\mu} = \bar{x} = 12.4\); \(\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\); population is normally distributed)
   b. If the population standard deviation is known to be 3, how large a sample is needed to estimate the mean assembly time with 0.99 confidence and an error margin of one minute? (\(n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 \approx 60\))

2. Determine the margin of error for a confidence interval estimate for the population mean of a normal distribution given the following information: confidence level = 0.98, n = 13, S = 15.68. (M.E = 11.66)

II. Population proportion

• Point estimate of P: \(\hat{p}\)
• Confidence interval of P:
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]
   The term after ± is the margin of error (max error / error).
• To find the sample size:
\[ n = \frac{z_{\alpha/2}^2\,\hat{p}\,\hat{q}}{E^2}, \quad \text{where } \hat{q} = 1 - \hat{p} \]

Exercises:

1.
It is said that happy and healthy workers are efficient and productive. A company that manufactures exercising machines wanted to know the percentage of large companies that provide on-site health club facilities. A sample of 240 such companies showed that 96 of them provide such facilities on site.
   a. What is the point estimate of the percentage of all such companies that provide such facilities on site? What is the margin of error associated with this point estimate?
   b. Construct a 97% C.I for the percentage of all such companies that provide such facilities on site.

2. A consumer agency wants to estimate the proportion of all drivers who wear seat belts while driving. Assume that a preliminary study has shown that 76% of drivers wear seat belts while driving. How large should the sample size be so that the 99% C.I for the population proportion has a maximum error of 0.03?

3. A college registrar has received complaints about the online registration procedure at her college. She wants to estimate the proportion of all students at this college who are dissatisfied with the online registration procedure. What is the most conservative estimate of the sample size that would limit the maximum error to be within 0.05 of the population proportion for a 90% C.I?

Review

1. A researcher wanted to know the percentage of judges who are in favor of the death penalty. He took a random sample of 15 judges and asked them whether or not they are in favor of the death penalty. The responses of these judges are given here:

   Yes No Yes Yes No No No Yes Yes No Yes Yes Yes No Yes

   a. What is the point estimate of the population proportion? What is the margin of error associated with this point estimate? (\(\hat{P} = \frac{9}{15} = 0.6\), \(M.E = z_{\alpha/2}\sqrt{\frac{\hat{P}(1 - \hat{P})}{n}} = 0.2479\))
   b. Make a 95% C.I for the percentage of all judges who are in favor of the death penalty. (\(0.6 \pm 0.24792\))

2. a.
How large a sample should be selected so that the maximum error of estimate for a 99% C.I for the population proportion is 0.035, when the value of the sample proportion obtained from a preliminary sample is 0.29?
   b. Find the most conservative sample size that will produce a maximum error of 0.035 for a 99% C.I for p. (Hint: if \(\hat{p}\) is not given, take \(\hat{p} = 0.5\); \(n = [\ldots]\))

Chapter 9

I. Testing hypothesis for \(\mu\)

1. State the null hypothesis (\(H_0\)) and the alternative hypothesis (\(H_1\)):

   \(H_0: \mu =\) value (or \(\ge\), \(\le\))   vs   \(H_1: \mu \ne\) value (or \(<\), \(>\))

2. Calculate the test statistic (T.S) for \(\mu\):

   \(\sigma\) known:               \(Z_C = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\)
   \(\sigma\) unknown, \(n \ge 30\):  \(Z_C = \frac{\bar{x} - \mu}{s/\sqrt{n}}\)
   \(\sigma\) unknown, \(n < 30\):    \(t_C = \frac{\bar{x} - \mu}{s/\sqrt{n}}\)

Rejection-region approach:

3. Determine the rejection region (R.R). The critical values are:
   Two-tailed:   \(\pm z_{\alpha/2}\) (or \(\pm t_{\alpha/2,\,n-1}\))
   Left-tailed:  \(-z_{\alpha}\) (or \(-t_{\alpha,\,n-1}\))
   Right-tailed: \(z_{\alpha}\) (or \(t_{\alpha,\,n-1}\))
4. Conclusion: we reject \(H_0\) if the T.S lies in the R.R.

P-value approach:

3. Calculate the p-value:
   Two-tailed:   the area below \(-|Z_C|\) plus the area above \(|Z_C|\)
   Left-tailed:  the area to the left of \(Z_C\)
   Right-tailed: the area to the right of \(Z_C\)
4. Conclusion: we reject \(H_0\) if p-value \(< \alpha\).

• Type I error:
   o What is the type I error? Rejecting \(H_0\) when \(H_0\) is true.
   o What is the probability of type I error? \(P(\text{type I error}) = \alpha = P(\text{reject } H_0 \mid H_0 \text{ is true})\)
   o Note: \(1 - \alpha = P(\text{do not reject } H_0 \mid H_0 \text{ is true})\)
• Type II error:
   o What is the type II error? Not rejecting \(H_0\) when \(H_0\) is false.
   o What is the probability of type II error? \(P(\text{type II error}) = \beta = P(\text{do not reject } H_0 \mid H_0 \text{ is false})\)
   o Note: \(1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false})\) = power of the test

   Your decision based on       The null hypothesis is
   a random sample              true               false
   Reject                       Type I error       Correct decision
   Do not reject                Correct decision   Type II error

Exercises:

1. According to a survey by the National Retail Association, the average amount that households in the United States planned to spend on gifts, decorations, greeting cards, and food during the 2001 holiday season was $940.
Suppose that a recent random sample of 324 households showed that they plan to spend an average of $1005 on such items during this year's holiday season, with a standard deviation of $330.
   a. Test at the 1% significance level whether the mean of such planned holiday-related expenditures for households for this year differs from $940.
      1)
      2)
      3)
      4)
   b. Find a 99% C.I of \(\mu\).
   c. Use the C.I from part (b) to test \(H_0: \mu = 940\) vs \(H_1: \mu \ne 940\).

2. A drug company is considering marketing a new local anesthetic. The effective time of the anesthetic that the drug company currently produces has a normal distribution with an average of 7.4 minutes and a standard deviation of 1.2 minutes. To market the new anesthetic, the mean effective time should be less than 7.4 minutes. A sample of size 36 results in a sample mean of 7.1. A hypothesis test will be done to help make a decision.
   a. State the null and the alternative hypotheses.
   b. Compute the test statistic.
   c. Compute the p-value of the test.
   d. What is your recommendation to the drug company using a level of significance of 0.01?

3. Insurance companies track life expectancy information to assist in determining the cost of life insurance policies. Last year the average life expectancy of all policyholders was 77 years. A company wants to determine if its clients now have a longer life expectancy on average, so it randomly samples 20 of its recently paid policies. The sample has a mean of 78.6 years and a standard deviation of 4.48 years.
   a. Write the null and alternative hypotheses.
   b. What is the value of the test statistic?
   c. State your conclusion using \(\alpha = 0.05\).
   d. Considering the result of the test, which type of error in hypothesis testing could you have made?
   e. State your assumptions.

II. Testing hypothesis for P

1. State the null hypothesis (\(H_0\)) and the alternative hypothesis (\(H_1\)):

   \(H_0: P =\) value (or \(\ge\), \(\le\))   vs   \(H_1: P \ne\) value (or \(<\), \(>\))

2. Calculate the test statistic (T.S) for P:
\[ Z_C = \frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}} \]
3. Calculate the p-value (p-value approach), or
3.
Determine the rejection region (R.R) (rejection-region approach). The critical values are:
   Two-tailed:   \(\pm z_{\alpha/2}\)
   Left-tailed:  \(-z_{\alpha}\)
   Right-tailed: \(z_{\alpha}\)
   For the p-value approach, the p-value is:
   Two-tailed:   the area below \(-|Z_C|\) plus the area above \(|Z_C|\)
   Left-tailed:  the area to the left of \(Z_C\)
   Right-tailed: the area to the right of \(Z_C\)
4. Conclusion: we reject \(H_0\) if the T.S lies in the R.R (rejection-region approach), or if p-value \(< \alpha\) (p-value approach).

Exercises:

1. A food company is planning to market a new type of frozen yogurt. However, before marketing this yogurt, the company wants to find what percentage of the people like it. The company's management has decided that it will market this yogurt only if at least 35% of the people like it. The company's research department selected a random sample of 400 persons and asked them to taste this yogurt. Of these 400 persons, 112 said they like it.
   a. Testing at the 2.5% significance level, can you conclude that the company should market this yogurt?
      1)
      2)
      3)
      4)
   b. What will your decision be in part (a) if the probability of making a type I error is zero?

2. A study by Consumer Reports showed that 64 percent of supermarket shoppers believe supermarket brands to be as good as national name brands.
   a. Formulate the hypotheses that can be used to determine whether the percentage of supermarket shoppers who believe supermarket brands to be as good as national name brands is different from 64%.
   b. If, in a sample of 100 shoppers, 52 stated that the supermarket brand was as good as the national brand, what is the value of the test statistic?
   c. What is the p-value?
   d. At \(\alpha = 0.05\), what is your conclusion? Justify your answer.

3. Suppose that in a sample of 1000 employees 23% said that losing their job is the major reason of concern for them.
   a. Find a 98% confidence interval for the percentage of employees who said losing their job is the major reason of concern for them.
   b. According to your confidence interval obtained in (a), do you believe that the percentage is different from 19%? Why or why not?

Review:

1.
The policy of a company is to deliver on time at least 90% of all the orders it receives from its customers. The quality control inspector at the company usually takes samples of orders delivered and checks if this policy is maintained. A sample of 90 orders taken by this inspector showed that 75 of them were delivered on time. At the 2% significance level, can you conclude that the company's policy is maintained? Use the p-value to conduct the test. (\(H_1: P < 0.9\), \(Z_C = -2.11\), p-value = 0.0174, reject \(H_0\))

III. One population variance \(\sigma^2\) and one population st. deviation \(\sigma\)

• Point estimate:
   One population variance \(\sigma^2\): \(s^2\)
   One population st. deviation \(\sigma\): \(s\)

• Confidence interval:
   One population variance \(\sigma^2\):
\[ \frac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}} \]
   One population st. deviation \(\sigma\):
\[ \sqrt{\frac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}} \]

• Hypothesis test about one variance \(\sigma^2\):
1. \(H_0: \sigma^2 =\) value (or \(\ge\), \(\le\))   vs   \(H_1: \sigma^2 \ne\) value (or \(<\), \(>\))
2. Calculate the test statistic (T.S):
\[ \chi^2_C = \frac{(n - 1)s^2}{\sigma^2} \]
3. Determine the rejection region. The critical values are:
   Two-tailed:   \(\chi^2_{1-\alpha/2,\,n-1}\) and \(\chi^2_{\alpha/2,\,n-1}\)
   Left-tailed:  \(\chi^2_{1-\alpha,\,n-1}\)
   Right-tailed: \(\chi^2_{\alpha,\,n-1}\)
4. Conclusion: reject \(H_0\) if the T.S \(\chi^2_C\) lies in the rejection region.

Note: a hypothesis test about one population st. deviation (\(\sigma\)) is the same as a hypothesis test about one population variance (\(\sigma^2\)), but you need to convert the hypothesis from \(\sigma\) to \(\sigma^2\).

Exercises:

1. A professor claims that the variance of the lengths of his lectures is within 2 square minutes. A random sample of 23 of these lectures was timed, and the variance of the lengths of these lectures was found to be 2.7 square minutes. Assume that the lengths of all such lectures by the professor are approximately normally distributed.
   a) Find the point estimate of the population variance.
   b) Make the 98% confidence intervals for the variance and standard deviation of the lengths of all lectures by the professor.
   c) Test at the 1% significance level whether the variance of the lengths of all such lectures by the professor exceeds 2 square minutes.

2.
An assembly line produces units with a mean weight of 10 and a standard deviation of 0.20. A new process supposedly will produce units with the same mean and a smaller standard deviation. A sample of 20 units produced by the new method has a sample standard deviation of 0.126. At a significance level of 10%, can we conclude that the new process has less variation than the old?

Review

1. Automotive parts must be machined to close tolerances to be acceptable to customers. Production specifications call for a maximum variance in the lengths of the parts of 0.0004. Suppose the sample variance for 30 parts turns out to be \(s^2 = 0.0005\). Using \(\alpha = 0.05\), test to see whether the population variance specification is being violated. (\(H_1: \sigma^2 > 0.0004\), \(\chi^2_C = 36.25\), \(\chi^2_{0.05,\,29} = 42.557\), do not reject \(H_0\))

Chapter 10

I. The difference between two population means (\(\mu_1 - \mu_2\)) for independent samples

Case 1: \(\sigma_1\) and \(\sigma_2\) known
• Point estimate: \(\bar{X}_1 - \bar{X}_2\)
• C.I: \((\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\)
• Test statistic: \(Z_C = \frac{(\bar{X}_1 - \bar{X}_2) - D}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}\), with \(Z_C \sim N(0, 1)\)

Case 2: \(\sigma_1\) and \(\sigma_2\) unknown, \(n_1\) and \(n_2\) large
• Point estimate: \(\bar{X}_1 - \bar{X}_2\)
• C.I: \((\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}\)
• Test statistic: \(Z_C = \frac{(\bar{X}_1 - \bar{X}_2) - D}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}\), with \(Z_C \sim N(0, 1)\)

Case 3: \(\sigma_1\) and \(\sigma_2\) unknown, \(n_1\) and \(n_2\) small, \(\sigma_1^2 = \sigma_2^2\) (homogeneous)
• Point estimate: \(\bar{X}_1 - \bar{X}_2\)
• C.I: \((\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\)
• Pooled estimate: \(s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\)
• Test statistic: \(t_C = \frac{(\bar{X}_1 - \bar{X}_2) - D}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}\), with \(t_C \sim t(n_1 + n_2 - 2)\)

Case 4: \(\sigma_1\) and \(\sigma_2\) unknown, \(n_1\) and \(n_2\) small, \(\sigma_1^2 \ne \sigma_2^2\)
• Point estimate: \(\bar{X}_1 - \bar{X}_2\)
• C.I: \((\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2,\,\nu^*}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}\)
• Test statistic: \(t_C = \frac{(\bar{X}_1 - \bar{X}_2) - D}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}\), with \(t_C \sim t(\nu^*)\)
• Degrees of freedom:
\[ \nu^* = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} \]

II. Hypothesis test about homogeneity (equal population variances \(\sigma_1^2 = \sigma_2^2\))

1. \(H_0: \sigma_1^2 = \sigma_2^2\)   vs   \(H_1: \sigma_1^2 \ne \sigma_2^2\)
2. Calculate the test statistic (T.S):
\[ F_C = \frac{s^2_{\text{larger}}}{s^2_{\text{smaller}}} \]
3.
Determine the critical value \(F_{\alpha/2,\,n-1,\,n-1}\), where the first \(n - 1\) is the numerator degrees of freedom and the second is the denominator degrees of freedom.
4. Conclusion: reject \(H_0\) if the T.S (\(F_C\)) lies in the rejection region (under the shaded area).

Exercises:

1. El-Mraay Dairy Company claims that its 8-ounce low-fat yogurt cups contain on average fewer calories than the 8-ounce low-fat yogurt cups produced by its competitor, El-Safy Company. In order to check this claim, a sample of 50 such cups produced by El-Mraay showed that they contain on average 144 calories per cup with a standard deviation of 5.4 calories. A sample of 40 cups of the El-Safy product showed that they contain on average 147 calories per cup with a standard deviation of 6.3 calories.
   a. Make a 98% confidence interval for the difference between the mean number of calories in the 8-ounce low-fat yogurt cups produced by the two companies.
   b. Find the standard error and the margin of error of part (a).
   c. Does the C.I obtained in part (a) support the hypothesis that the two means are different? What is the probability of type I error in that case?
   d. Test the El-Mraay Dairy Company claim with \(\alpha = 0.05\).

2. Two brands (A and B) of tires are tested to compare their durability. The management of the company claims that brand A is more durable than brand B. Twelve tires from each brand are tested on a machine. The mileages (in hundreds of miles for each tire) have been recorded, giving the following information.

   Mileages in hundreds
   Brand A: 157 139 188 143 172 144 191 128 177 160 175 162
   Brand B: 160 118 150 165 158 159 127 133 170 164 152 142

                        Brand A   Brand B
   Mean                 161.3     149.8
   Standard deviation   20.01     16.45

   Assuming that the two populations are normally distributed:
   a. At the 5% level of significance, test the hypothesis that the two populations are homogeneous (equal variances).
   b. Assuming homogeneous populations, test the management claim using \(\alpha = 0.05\).

3. A company claims that its medicine brand A provides faster relief from pain than another company's medicine brand B.
A researcher tested both brands of medicine on two groups of randomly selected patients. The results of the test are given in the following table. The mean and the standard deviation of relief time are in minutes.

   Brand   Sample size   Mean of relief time   Standard deviation of relief time
   A       21            44                    12.5
   B       17            49                    7.5

   Assuming that relief time is normally distributed:
   a. Assuming equal variances, test the company claim at the 0.05 level of significance.
   b. Using \(\alpha = 0.05\), test the hypothesis of homogeneous populations (equal variances).

Review:

1. In order to study the performance of CBA students in Stat. 120, the QMIS department randomly selected 13 female and 12 male students. Their final scores were recorded, giving the following summary statistics:

            Sample size   Mean    STD
   Female   13            84.15   9.90
   Male     12            76.58   11.27

   Assuming that the scores have homogeneous normal distributions, test the hypothesis that female students score on average more than male students (\(\alpha = 0.05\)). (\(H_1: \mu_F > \mu_M\), \(t_C = 1.788\), \(t_{0.05,\,23} = 1.714\), reject \(H_0\))

2. A sample of 18 fathers who were company executives showed that they spend an average of 2.3 hours per week playing with their children, with a standard deviation of 0.54 hours. Another sample of 24 fathers who were medical professionals gave a mean of 4.6 hours per week with a standard deviation of 0.8 hours. Assume that the times spent per week playing with their children by all fathers who are executives and all fathers who are medical professionals have normal distributions with equal standard deviations.
   a. Construct a 95% C.I for the difference between the mean time spent per week playing with their children by all fathers who are executives and all fathers who are medical professionals. (-2.741, -1.858)
   b. Using the above C.I, test whether the mean time spent by all fathers who are executives is equal to that for all fathers who are medical professionals. (\(H_1: \mu_1 \ne \mu_2\), reject \(H_0\))

3.
A firm is studying the delivery times of two raw material suppliers. The firm is basically satisfied with supplier A and is prepared to stay with that supplier if the mean delivery time is the same as or less than that of supplier B. However, if the firm finds that the mean delivery time of supplier B is less than that of supplier A, it will begin making raw material purchases from supplier B.
   a. What are the null and alternative hypotheses for this situation? (\(H_1: \mu_A > \mu_B\))
   b. Assume the independent samples show the following delivery times for the two suppliers:

      Supplier A                  Supplier B
      \(n_1 = 10\)                \(n_2 = 20\)
      \(\bar{x}_1 = 14\) days     \(\bar{x}_2 = 12.5\) days
      \(s_1 = 4\)                 \(s_2 = 2\)

      With \(\alpha = 0.05\) and using a t-test with pooled variance, what is your conclusion for the hypothesis from part (a)? What do you recommend in terms of supplier selection? (\(t_C = 1.38\), \(t_{0.05,\,28} = 1.701\), do not reject \(H_0\))

4. In a random sample of nine gasoline stations in city "A", the prices per gallon of unleaded gas have a standard deviation of $0.08 per gallon. In a random sample of 14 gasoline stations in city "B", the prices per gallon have a standard deviation of $0.03 per gallon. Use the 10% significance level to test the null hypothesis that the price per gallon of gasoline is equally variable in the two cities. (\(H_1: \sigma_1^2 \ne \sigma_2^2\), \(F_C = 7.11\), \(F_{0.05,\,8,\,13} = 2.77\), reject \(H_0\))

5. On the basis of data provided by a salary survey, the variance in annual salaries for seniors in accounting firms is approximately 2.1 and the variance in annual salaries for managers in accounting firms is approximately 11.1. The salary data were provided in thousands of euros. Assuming the salary data were based on samples of 25 seniors and 26 managers, test the hypothesis that the population variances in the salaries are equal. At \(\alpha = 0.05\), what is your conclusion? (\(H_1: \sigma_1^2 \ne \sigma_2^2\), \(F_C = 5.29\), \(F_{0.025,\,25,\,24} = 2.26\), reject \(H_0\))

III.
The difference between two population means (\(\mu_1 - \mu_2 = \mu_d\)) for dependent (paired) samples

• Point estimate of \(\mu_1 - \mu_2 = \mu_d\): \(\hat{\mu}_d = \bar{d}\)
• Confidence interval (C.I):
\[ \bar{d} \pm t_{\alpha/2,\,n-1}\,\frac{s_d}{\sqrt{n}} \quad \text{if } \sigma_d \text{ is unknown and } n < 30 \]
   Where
   • \(d\): difference between the two variables
   • \(\bar{d} = \frac{\sum d}{n}\)
   • \(s_d^2 = \frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}\)  or  \(s_d^2 = \frac{\sum d^2 - n\bar{d}^2}{n - 1}\)  or  \(s_d^2 = \frac{\sum (d - \bar{d})^2}{n - 1}\)
   • \(s_d = \sqrt{s_d^2}\)
• Hypothesis test:
\[ t_C = \frac{\bar{d} - 0}{s_d/\sqrt{n}} \quad \text{if } \sigma_d \text{ is unknown and } n < 30, \qquad t_C \sim t_{n-1} \]

Note:
• We will use Z instead of T in both the C.I and the hypothesis test if
   o \(\sigma\) is known, or
   o \(\sigma\) is unknown with \(n \ge 30\).
• There are three ways to do the test, as mentioned in chapter 9 of these notes.

Exercises:

1. To test the difference between two body shop garages, 10 randomly chosen damaged cars were sent to these two garages (A and B). The following are the estimated repair charges of these garages.

   A          236  137  379  255  279  321  369  333  137  390
   B          310  187  392  232  321  318  389  288  167  432
   d = A - B
   \(d^2\)

   Assuming that the repair charges are normally distributed:
   a. Test the hypothesis that the repair charge at garage A is lower than that at garage B. State your assumptions.
   b. Construct a 95% C.I of the difference between the two means.

2. The manufacturer of a gasoline additive claims that the use of this additive increases gasoline mileage. A random sample of 6 cars were driven for one week with the gasoline additive and then for one week without the gasoline additive. The following table provides the obtained information about the gasoline mileage.

   Gasoline mileage     With    Without   D = with - without
   Mean                 25.12   23.4      1.717
   Standard deviation   5.87    5.42      1.427

   a. Compute a 99% confidence interval for the mean difference in gasoline mileage.
   b. Is it possible to say that the manufacturer's claim is true? Why? Use \(\alpha = 0.01\).

Review:

1. A company claims that its 12-week special exercise program significantly reduces weight. A random sample of six persons was selected, and these persons were put on this exercise program for 12 weeks.
The following table gives the weights (in pounds) of those six persons before and after the program. Assume that the population of all paired differences is (approximately) normally distributed.

   Before   180  195  177  221  208  199
   After    183  187  161  204  197  189

   a. Make a 95% confidence interval for the mean of the population paired differences, where a paired difference is equal to the weight before joining this exercise program minus the weight at the end of the 12-week program. (2.278, 17.382)
   b. Using the 1% significance level, can you conclude that the mean weight loss for all persons due to this special exercise program is greater than zero? (\(H_1: \mu_d > 0\), \(t_C = 3.35\), \(t_{0.01,\,5} = 3.365\), do not reject \(H_0\))

2. A study was used to test whether a training course is helpful for students to pass a mathematics course. To evaluate the effectiveness of the training course, the test scores of eight students were compared before and after taking the training course. The results are as follows:

   Student   1   2   3   4   5   6   7   8
   Before    46  52  64  67  58  55  60  60
   After     50  50  71  70  54  61  62  68

   a. Compute a 90% confidence interval for the mean difference in scores. (0.25, 5.75)
   b. Is it possible to say that the training course is helpful? Why? (\(H_1: \mu_d > 0\), \(t_C = 2.00\), critical value \([\ldots]\))

3. A company is considering installing new machines to assemble its products. The company is considering two types of machines, but it will buy only one type. The company selected 11 assembly workers and asked them to use these two types of machines to assemble products.
The time in minutes to assemble one unit of the product on each type of machine for each of these eleven workers was recorded and given to the company statistician, who supplied the following information:

   Machine I    23  26  19  24  27  22  20  18  17  21  25
   Machine II   21  24  23  25  24  28  24  21  17  26  23

   Assuming normality, use a confidence interval for the difference between the average assembly times for the two machines to test the hypothesis that the two machines are the same at \(\alpha = 0.05\). (\(H_1: \mu_d \ne 0\), C.I \(= ([\ldots], [\ldots])\), do not reject \(H_0\))

IV. The difference between two population proportions (\(P_1 - P_2\))

• Point estimate of \(P_1 - P_2\): \(\hat{p}_1 - \hat{p}_2\)
• Confidence interval (C.I):
\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \]
• Hypothesis test:
\[ Z_C = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \]
   Where the combined sample proportion is
\[ \bar{p} = \frac{x_1 + x_2}{n_1 + n_2} \quad \text{or} \quad \bar{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2} \]

Exercises:

1. A company has two restaurants in two different areas in Kuwait. The company wants to estimate the percentage of customers who think that the food and service at each of these restaurants are excellent. A sample of 200 customers taken from the restaurant in area A showed that 118 think that the food and service are excellent in this restaurant. Another sample of 250 customers taken from the restaurant in area B showed that 160 think that the food and service are excellent in this restaurant.
   a. Find the point estimate of the difference between the two proportions.
   b. Construct a 97% C.I of the difference between the two proportions.
   c. Find the p-value to test the hypothesis that the proportion of customers who think that the food and service are excellent at the restaurant in area A is lower than the corresponding proportion at the restaurant in area B.
   d. What is your conclusion if \(\alpha = 0.025\)?

Review:

1. In a random sample of 800 men aged between 25 and 35, 24% said they live with one parent. In another sample of 850 women of the same age group, 18% said that they live with one parent.
Construct a 95% confidence interval for the difference between the two population proportions. (0.021, 0.099)

2. A company that has many department stores wanted to find, at two such stores, the percentage of sales for which at least one of the items was returned. A sample of 800 sales randomly selected from store A showed that for 240 of them at least one of the items was returned. Another sample of 900 sales randomly selected from store B showed that for 279 of them at least one of the items was returned.
a. Construct a 98% confidence interval for the difference between the proportions of all sales at the two stores for which at least one of the items was returned. (-0.0621, 0.0421)
b. Find the standard error and the margin of error of the C.I. (0.02236, 0.05211)
c. Using the 1% significance level, can you conclude that at the two stores the proportions of all sales for which at least one of the items was returned are different? (\( H_1: P_A \neq P_B \), \( z_c = -0.45 \), don't reject \( H_0 \))
d. Find the p-value for the test mentioned in part (c). (0.6528)
e. Find the standard error of the test. (0.02237)

Chapter 11
ANOVA

• Assumptions:
  1. "k" random independent samples from
  2. normal populations with
  3. equal variances (homogeneous populations)
• Hypothesis test:
  1. \( H_0: \mu_1 = \mu_2 = \dots = \mu_k \) vs \( H_1 \): at least one population mean is different,
     where k: # of samples or groups or populations
  2. T.S: \( F_c = \frac{MSB\,(MSF)}{MSW\,(MSE)} \), calculated from the ANOVA table
  3. Determine the critical value \( F_{\alpha,\,k-1,\,n-k} \)
  4. Conclusion: Reject \( H_0 \) if T.S (\( F_c \)) lies in the rejection region R.R (under shaded area).
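The four test steps above can be sketched numerically. The following is a minimal Python sketch, not a definitive implementation: it works from group summary statistics, taking the sample sizes, sample means and SSW from the figures quoted in review exercise 1 of this chapter, and the critical value 5.09 from the tabulated \( F_{0.01,3,18} \) quoted there. It assumes the between-groups sum of squares is computed as \( SSB = \sum n_i(\bar{x}_i - \bar{x})^2 \); all variable names are illustrative.

```python
# One-way ANOVA F test from summary statistics (illustrative sketch).
# Figures are those quoted in review exercise 1 of this chapter.
sizes = [7, 4, 6, 5]            # n_i for methods M1..M4
means = [79, 83.75, 70, 72.8]   # sample means
ssw = 861.55                    # within-group (error) sum of squares, given
f_crit = 5.09                   # F(0.01; 3, 18) read from an F table, not computed here

k = len(sizes)                  # number of groups
n = sum(sizes)                  # total sample size

# The grand mean is the size-weighted average of the group means.
grand_mean = sum(ni * xi for ni, xi in zip(sizes, means)) / n

# Between-groups sum of squares: SSB = sum of n_i * (mean_i - grand_mean)^2
ssb = sum(ni * (xi - grand_mean) ** 2 for ni, xi in zip(sizes, means))

msb = ssb / (k - 1)             # mean square between, d.f. = k - 1
msw = ssw / (n - k)             # mean square within, d.f. = n - k
f_c = msb / msw                 # test statistic

reject_h0 = f_c > f_crit        # reject H0 only if F_c falls in the upper tail
print(f"SSB={ssb:.2f}  MSB={msb:.2f}  MSW={msw:.3f}  Fc={f_c:.3f}  reject H0: {reject_h0}")
```

Run as is, the sketch should agree with the ANOVA table quoted for that exercise (SSB = 570.45, \( F_c \) about 3.973), so \( H_0 \) is not rejected at the 1% level.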
Source of variation   Degrees of freedom (d.f)   Sum of squares (SS)   Mean squares (MS)              F
Between (Factor)      k-1                        SSB (SSF)             MSB(MSF) = SSB(SSF)/(k-1)      Fc = MSB(MSF)/MSW(MSE)
Within (Error)        n-k                        SSW (SSE)             MSW(MSE) = SSW(SSE)/(n-k)
Total                 n-1                        SST

Equivalent formulas:
• \( SSB = SST - SSW \)
• \( SSB = MSB\,(k-1) \)
• \( SSB = \left(\frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \dots + \frac{T_k^2}{n_k}\right) - \frac{T^2}{n} \)
• \( SSB = \sum n_i(\bar{x}_i - \bar{x})^2 \)
• \( SSW = SST - SSB \)
• \( SSW = MSW\,(n-k) \)
• \( SSW = (n_1-1)s_1^2 + (n_2-1)s_2^2 + \dots + (n_k-1)s_k^2 = \sum (n_i-1)s_i^2 \)
• \( SSW = \sum x^2 - \left(\frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \dots + \frac{T_k^2}{n_k}\right) \)
• \( SST = SSB + SSW \)
• \( SST = \sum x^2 - \frac{T^2}{n} \)
• \( MSW\,(MSE) = s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2 + \dots + (n_k-1)s_k^2}{n-k} \)

where
• k: number of samples/groups/populations.
• \( n = n_1 + n_2 + \dots + n_k \) (total sample size).
• \( T_1 = \sum x_1,\ T_2 = \sum x_2,\ \dots,\ T_k = \sum x_k \), or \( T_1 = n_1\bar{x}_1,\ T_2 = n_2\bar{x}_2,\ \dots,\ T_k = n_k\bar{x}_k \)
• \( T = T_1 + T_2 + \dots + T_k \)
• \( T_1^2, T_2^2, \dots, T_k^2 \) (squares of the sample totals)
• \( \sum x^2 = \sum x_1^2 + \sum x_2^2 + \dots + \sum x_k^2 \)

Note:
• \( \sum x^2 \neq (\sum x)^2 \)
• \( T^2 \neq \sum T_i^2 \)
• \( T^2 = (\sum T_i)^2 \)

Exercise
1. A consumer agency wanted to find out if the mean time it takes for each of three brands of medicine to provide relief from a headache is the same. The three drugs were administered to three randomly selected samples. The following table gives the time in minutes taken by each patient to get relief from a headache, followed by the Minitab output for this problem.

Drug I     14   20   18   24
Drug II    25   38   42   65   47   52
Drug III   44   39   54   58   73

Level      N   Mean    StDev
Drug II    6   44.83   13.50
Drug I     4   19.00   4.16
Drug III   5   53.60   13.24

[Minitab character plot: individual 95% CIs for each mean, based on pooled StDev; axis marks at 16, 32, 48]

a. Complete the analysis of variance table

Source   df   SS   MS   F
Factor
Error
Total

b. Test the consumer agency's claim at the 5% level of significance.
c. Suppose that the hypothesis of equal means has been rejected. Which of the drugs is different, and why?

2.
A panel of trained testers judges the flavor quality of different vanilla frozen desserts: frozen yogurts, ice milks, and other frozen desserts, measured on a scale from 0 to 100. The sample sizes are, respectively, \( n_1 = 13 \), \( n_2 = 8 \), and \( n_3 = 6 \). Below is most of the ANOVA output from the computer:

Source             df   SS     MS     F
Factor (Between)   ?    6364   3182   ?
Error (Within)     24   3031   ?
Total              ?    ?

a. Complete the ANOVA table
b. Test whether there is a significant difference in the flavor quality of the three different desserts
• State the null and alternative hypotheses
• Find the value of the test statistic
• Find the critical value(s). Use the 0.05 significance level
• What is your conclusion about the flavor quality of the three different desserts?
• What are the assumptions required to make this test?

3. Six hours were selected from each of 3 radio stations, and analysis of variance was performed on the data. Part of the ANOVA table is shown below:

Source    df   SS   MS        F
Between             1311.02   13.368
Within
Total

a. Complete the ANOVA table
b. At the 0.05 significance level, is there a difference in the station means?

Review:
1. A statistics professor has developed four methods (M1, M2, M3, M4) for teaching a senior level class. He wishes to investigate if there is a difference in the four methods. The professor assigns students to the four teaching methods. The final exam scores for each group were recorded. The four sample sizes and sample means are in the following table:

Method        M1   M2      M3   M4
Sample size   7    4       6    5
Sample mean   79   83.75   70   72.8

You are also given that the error (within) sum of squares "SSE" (SSW) = 861.55.
Carry out an ANOVA test using a 1% level of significance.
• State the null and alternative hypotheses
• Find the value of the test statistic (3.973)
• Find the critical value (5.09)
• What is your conclusion about the four different methods of teaching? (don't reject \( H_0 \))

Source    df   SS       MS       F
Between   3    570.45   190.15   3.973
Within    18   861.55   47.864
Total     21   1432

2.
Samples were selected from three populations. The data obtained are given below:

Sample 1   Sample 2   Sample 3
91         77         88
98         87         75
107        84         73
102        95         84
85         75         82

a. State the assumptions needed to use the analysis of variance to test the equality of the three population means.
b. Test the hypothesis of no difference between the three population means at the 0.05 level of significance. (\( F_c = 11.885 \), \( F_{0.05,2,12} = 3.89 \), reject \( H_0 \))

Source    df   SS       MS       F
Between   2    968.7    484.35   11.885
Within    12   489      40.75
Total     14   1457.7

Chapter 12
Independence

1. \( H_0 \): the two variables are independent (not related) vs \( H_1 \): the two variables are dependent (related)
2. Calculate the test statistic (T.S)
   \[ \chi_c^2 = \sum \frac{(O - E)^2}{E} \]
   where O: observed value, E: expected value,
   \[ E = \frac{\text{row total} \times \text{column total}}{\text{total}} \]
3. Determine the critical value \( \chi^2_{\alpha,\,(r-1)(c-1)} \)
4. Conclusion: Reject \( H_0 \) if T.S (\( \chi_c^2 \)) lies in the rejection region (under shaded area).

Exercises:
1. Let's try an example. 500 elementary school boys and girls are asked which is their favorite color: blue, green, or pink? Results are shown below:

        Boys   Girls   Total
Blue    100    20      120
Green   150    30      180
Pink    20     180     200
Total   300    200     500

Would you conclude that there is a relationship between gender and favorite color?
a. The two hypotheses: \( H_0 \): vs \( H_1 \):
b. The test statistic
c. The critical value(s). Use the 0.05 significance level.
d. The conclusion

2. One hundred auto drivers who were stopped by police for some violation were also checked to see if they were wearing a seat belt. The following table records the results of this survey:

        Wearing seat belt   Not wearing seat belt   Total
Men     34                  21                      55
Women   32                  13                      45
Total   66                  34                      100

For a chi-square test of independence for this contingency table:
a. What is the number of degrees of freedom?
b. What is the total of the second row?
c. How many drivers are in the sample?
d. What are the observed frequencies for the first row?
e. What are the expected frequencies for the second row?
f.
What are the observed frequencies for the second column?
g. What are the expected frequencies for the second column?

CHAPTER 13
Simple Linear Regression

I. The population regression model
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
where
  - \( y \): the dependent variable
  - \( x \): the independent variable
  - \( \beta_0 \): the y-intercept or constant term
  - \( \beta_1 \): the slope
  - \( \varepsilon \): a random error term

• Estimate the population regression model by the sample linear regression model
  \[ \hat{y} = b_0 + b_1 x \]
  This equation is called the least squares regression line or the prediction equation.
• Sum of squares
  \( SS_{xy} = \sum xy - \frac{\sum x \sum y}{n} \)  or  \( SS_{xy} = \sum xy - n\bar{x}\bar{y} \)  or  \( SS_{xy} = \sum (x - \bar{x})(y - \bar{y}) \)
  \( SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} \)  or  \( SS_{xx} = \sum x^2 - n\bar{x}^2 \)  or  \( SS_{xx} = \sum (x - \bar{x})^2 \)
  \( SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} \)  or  \( SS_{yy} = \sum y^2 - n\bar{y}^2 \)  or  \( SS_{yy} = \sum (y - \bar{y})^2 \)
• Estimated values of \( \beta_0 \) and \( \beta_1 \)
  \( \hat{\beta}_1 = b_1 = \frac{SS_{xy}}{SS_{xx}} \)
  \( \hat{\beta}_0 = b_0 = \bar{y} - b_1\bar{x} \)
• Predicted value of y: \( \hat{y} = b_0 + b_1 x \), for a given x
• Residual (error): \( e = y - \hat{y} \), for a given observation

II. How to evaluate the estimated model
1. Coefficient of determination
   \[ r^2 = \frac{b_1 SS_{xy}}{SS_{yy}} = \frac{SSR}{SST}, \qquad 0 \le r^2 \le 1 \]
   It is the proportion of the variation in y explained by the independent variable x.
   Note: as \( r^2 \) increases, SSR increases, which indicates a good model.
2. Coefficient of correlation
   \[ r = \sqrt{r^2} \ \text{(with the same sign as } b_1\text{)} \quad \text{or} \quad r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = b_1\sqrt{\frac{SS_{xx}}{SS_{yy}}}, \qquad -1 \le r \le 1 \]
   It measures the strength of the linear relationship between two variables; r = -1 and r = +1 correspond to perfect linear relationships.

III. Estimation of the variance and standard deviation of random errors
• The estimated variance of errors: \( \hat{\sigma}_\varepsilon^2 = s_e^2 = MSE \)
• The estimated standard deviation of errors: \( \hat{\sigma}_\varepsilon = s_e = \sqrt{MSE} \)
\[ MSE = \frac{SSE}{n-k} = \frac{SS_{yy} - b_1 SS_{xy}}{n-k} \qquad \text{(k: number of parameters)} \]

IV.
Inferences about \( \beta_1 \)

• \( b_1 \sim N\left(\mu_{b_1} = \beta_1,\ s_{b_1} = \frac{s_e}{\sqrt{SS_{xx}}}\right) \)
• Population simple linear regression equation: \( y = \beta_0 + \beta_1 x + \varepsilon \)
• Point estimate of \( \beta_1 \): \( \hat{\beta}_1 = b_1 \)
• Confidence interval of \( \beta_1 \)
  - If \( n < 30 \): \( b_1 \pm t_{\alpha/2,\,n-k}\, s_{b_1} \) (k: number of parameters)
• Hypothesis test about \( \beta_1 \) (test of \( b_1 \), or test of whether there is a good relationship between x and y)
  1. \( H_0: \beta_1 = 0 \) means there is no relationship between X and Y (no significant linear relationship; X will be dropped from the model)
     \( H_1: \beta_1 \neq 0 \) means there is a relationship between X and Y (significant linear relationship)
     \( H_1: \beta_1 > 0 \): positive linear relationship
     \( H_1: \beta_1 < 0 \): negative linear relationship
  2. Calculate the test statistic
     - If \( n < 30 \): \( t_c = \frac{b_1}{s_{b_1}} \)
  3. Determine the rejection region
     - \( H_1: \beta_1 \neq 0 \): critical values \( -t_{\alpha/2,\,n-k} \) and \( t_{\alpha/2,\,n-k} \)
     - \( H_1: \beta_1 < 0 \): critical value \( -t_{\alpha,\,n-k} \)
     - \( H_1: \beta_1 > 0 \): critical value \( t_{\alpha,\,n-k} \)
  4. Conclusion: Reject \( H_0 \) if T.S lies in the rejection region (R.R)

V. Testing the overall model
1. \( H_0: \beta_1 = 0 \): the model is not significant / not useful / not good / not adequate / the data do not fit the model.
   \( H_1: \beta_1 \neq 0 \): the model is significant / useful / good / adequate / the data fit the model.
2. Test statistic: \( F_c = \frac{MSR}{MSE} \) (calculated from the ANOVA table)
3. Critical value: \( F_{\alpha,\,k-1,\,n-k} \)
4. Conclusion: Reject \( H_0 \) if T.S (\( F_c \)) lies in the rejection region R.R (under shaded area).

ANOVA table

Source of variation   d.f   Sum of squares (SS)   Mean squares (MS)   T.S
Regression            k-1   SSR = b1 SSxy         MSR = SSR/(k-1)     Fc = MSR/MSE
Residual Error        n-k   SSE = SST - SSR       MSE = SSE/(n-k)
Total                 n-1   SST = SSyy

k = # of parameters

CHAPTER 14
Multiple Regressions

I. Least squares regression line equation
\[ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots \]
\( \hat{y} \): dependent variable (prediction of y)
\( x_i \): independent variable

II. Hypothesis test
1. \( H_0: \beta_1 = \beta_2 = \beta_3 = \dots = 0 \) (the model is not significant) vs \( H_1 \): at least one \( \beta \) is not equal to zero (the model is significant)
2. Test statistic: \( F_c = \frac{MSR}{MSE} \) (calculated from the ANOVA table)
3. Critical value: \( F_{\alpha,\,k-1,\,n-k} \)
4.
Conclusion: Reject \( H_0 \) if T.S (\( F_c \)) lies in the rejection region R.R (under shaded area).

ANOVA table

Source of variation   d.f   Sum of squares (SS)   Mean squares (MS)   T.S
Regression            k-1   SSR                   MSR = SSR/(k-1)     Fc = MSR/MSE
Residual Error        n-k   SSE = SST - SSR       MSE = SSE/(n-k)
Total                 n-1   SST = SSyy

(The shortcut SSR = b1 SSxy applies only when there is a single independent variable.)

Exercises:
1. The following table gives information on the temperature in a city and the volume of ice cream (in pounds) sold at an ice cream parlor for a random sample of eight days during the summer of 1999.

Temperature      93    86    77    89    98    102   87    79
Ice cream sold   208   175   123   198   232   277   158   117

\( \sum x = 711 \), \( \sum y = 1488 \), \( \sum x^2 = 63713 \), \( \sum y^2 = 297428 \), \( \sum xy = 135466 \), \( \bar{x} = 88.88 \), \( \bar{y} = 186 \)

a. Find the sums of squares (SS)
b. Find the least squares regression line (\( \hat{y} = a + bx \))
c. Give a brief interpretation of the values of a and b
   - a (\( b_0 \)): the initial value of y when x = 0 (the initial value of the volume of ice cream sold (y) is equal to a = -361.5008 when the temperature (x) is equal to zero)
   - b (\( b_1 \)): if we increase x by 1 unit then y will change (increase or decrease) by the value of b (if we increase the temperature (x) by 1 degree then the volume of ice cream sold (y) will increase by b = 6.16)
d. Compute \( r \) and \( r^2 \), and explain what they mean
e. Predict the amount of ice cream sold on a day with a temperature of 95°
f. Compute the standard deviation of errors
g. Construct a 99% confidence interval for \( \beta_1 \)
h. Testing at the 1% significance level, can you conclude that \( \beta_1 \) is different from zero?

2. Regression analysis was applied between sales data (in $1000) and advertising data (in $100) and the following information was obtained:
\( \hat{y} = 12 + 1.8x \), \( n = 17 \), \( SSR = 225 \), \( SSE = 75 \), \( s_{b_1} = 0.2683 \)
a. Based on the above estimated regression equation, if advertising is $3000, then the predicted value of sales (in dollars) is
b. The F statistic computed from the above data is
c. The critical F value at α = 0.05 is
d. Is the estimated regression model significant?
• The two hypotheses
• Your conclusion
e. The t statistic for testing the significance of the slope is
f. Is the linear relationship between X and Y significant? Use the t-test to answer this question.
• The two hypotheses
• The critical value(s)
• Your conclusion
g. Calculate the 95% confidence interval of the slope of the regression line
h. Develop an analysis of variance table

Source of variation   d.f   SS   MS

3. The owner of a bowling establishment is interested in the relationship between the price she charges for a game of bowling and the number of games bowled per day. She collected data on the number of games bowled per day at 15 different prices. Fill in the missing entries in the following MINITAB output that was obtained for these data. In this output, X represents the price of a game and Y is the number of games bowled per day.

The regression equation is
y = ____ ____ x

Predictor   Coef      STDEV   T        P
Constant    691.02    21.70   ____     0.000
x           -141.30   ____    -16.83   0.000

S = _____   R-Sq = _____   R-Sq(adj) = 95.3%

Analysis of Variance
Source           DF    SS         MS        F        P
Regression       ___   148484.0   _______   283.13   0.000
Residual Error   ___   _______    _______
Total            ___   _______

4. A researcher wanted to examine the relation between a dependent variable y and an independent variable x. He randomly selected 10 observations, giving the following partial MINITAB output:

A.
Predictor   Coef     SE Coef
Constant    247.97   15.01
X1          -8.172   1.077

S = 15.2158   R-Sq = 87.8%   R-Sq(adj) = 86.3%

1. Write down the estimated regression equation:
2. Test the hypothesis of no regression (\( \beta_1 = 0 \)) using α = 0.05
3. Find the correlation coefficient between y and x1 and comment on its value

B.
The researcher decided to add another independent variable x2 to the model in (A) and obtained the following results:

The regression equation is
y = 180 - 6.17 x1 + 0.848 x2

Predictor   Coef     SE Coef   T
Constant    180.07   23.31     _____
X1          ______   ______    -6.46
X2          0.8484   0.2622    _____

S = 10.2957   R-Sq = _______   R-Sq(adj) = 93.7%

Analysis of Variance
Source           DF   SS        MS        F
Regression       2    ______    _______   _____
Residual Error   _    742.0     _______
Total            9    15182.9

1. Complete the missing values in the previous MINITAB output
2. Test at α = 0.05 the hypothesis of no regression (i.e. that the overall model does not fit the data).
3. Find a 95% confidence interval for the regression coefficient of x2
4. Test the hypothesis that the true value of the regression coefficient of x1 is equal to 0
5. Which of the two models do you prefer, the one estimated in (A) above or the one estimated in (B) above? Why?
6. Test the hypothesis that the true value of the regression coefficient of \( x_2 \) is positive (α = 0.05).

Inference about \( \beta_i \)

The regression equation is
y = 180 - 6.17 x1 + 0.848 x2     (coefficients \( b_0 \), \( b_1 \), \( b_2 \))

Predictor   Coef          SE Coef         T (t_c = b_i/s_bi)   p-value
Constant    b0 = 180.07   s_b0 = 23.31    7.725                0.00
X1          b1 = -6.17    s_b1 = 0.955    -6.46                0.00
X2          b2 = 0.8484   s_b2 = 0.2622   3.236                0.00

S = 10.2957 (estimated standard deviation of error, \( \sqrt{MSE} \))
R-Sq = 0.9511 (coefficient of determination, \( SSR/SST \))
R-Sq(adj) = 93.7%

For a two-tailed hypothesis test we compare the p-value with α. This tests the significance of \( \beta_i \), that is, the linear relationship between \( x_i \) and \( y \).

Overall model (conduct the F test of model usefulness / test the whole model: \( H_0: \beta_1 = \beta_2 = 0 \) vs \( H_1 \): at least one \( \beta \neq 0 \))

Analysis of Variance
Source           DF   SS        MS        F
Regression       2    14440.9   7220.45   68.117
Residual Error   7    742.0     106
Total            9    15182.9

Error mean square: \( MSE = \frac{SSE}{n-k} \) or \( MSE = s^2 \)

5. The owner of ShowTime Movie Theaters would like to estimate weekly gross revenue as a function of advertising expenditure.
Historical data for a sample of 8 weeks follow:

Weekly gross revenue (Y)   Television advertising (X1)   Newspaper advertising (X2)
($1000s)                   ($1000s)                      ($1000s)
96                         5.0                           1.5
90                         2.0                           2.0
95                         4.0                           1.5
92                         2.5                           2.5
95                         3.0                           3.3
94                         3.5                           2.3
94                         2.5                           4.2
94                         3.0                           2.5

A portion of the MINITAB output follows:

The regression equation is
y = 83.2 + 2.29 x1 + 1.30 x2

Predictor   Coef   SE Coef   T
Constant           1.574
X1                 0.3041
X2                 0.3207

S = 0.642587   R-Sq = 91.9%

Analysis of Variance
Source           DF   SS       MS   F
Regression            23.435
Residual Error   5
Total

a. What is the estimate of the weekly gross revenue for a week when $3500 is spent on television advertising and $1800 is spent on newspaper advertising?
b. Find and interpret \( R^2 \)
c. When television advertising was the only independent variable, \( R^2 = 0.653 \) (65.3%). Do you prefer the multiple regression results? Why?
d. Use α = 0.05 to test the hypotheses \( H_0: \beta_1 = \beta_2 = 0 \), \( H_1: \beta_1 \) and/or \( \beta_2 \) is not equal to zero. Did the estimated regression equation provide a good fit to the data? Explain.
e. Find the mean square error. Find the standard error of the estimate (\( \hat{\sigma} \))
f. Use α = 0.05 to test the significance of each independent variable. Should X1 or X2 be dropped from the model?
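The MINITAB quantities in exercise 5 can be reproduced from the eight data rows. The following is a minimal sketch in plain Python, assuming ordinary least squares solved through the normal equations; the helper `solve` and all variable names are illustrative and are not part of MINITAB. It takes k = 3 parameters, so the error degrees of freedom are n - k = 5, as in the output.

```python
from math import sqrt

# ShowTime data from exercise 5 (all figures in $1000s).
y = [96, 90, 95, 92, 95, 94, 94, 94]
x1 = [5.0, 2.0, 4.0, 2.5, 3.0, 3.5, 2.5, 3.0]
x2 = [1.5, 2.0, 1.5, 2.5, 3.3, 2.3, 4.2, 2.5]


def solve(a, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [r] for row, r in zip(a, rhs)]   # augmented matrix
    size = len(m)
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    sol = [0.0] * size
    for r in range(size - 1, -1, -1):              # back substitution
        known = sum(m[r][c] * sol[c] for c in range(r + 1, size))
        sol[r] = (m[r][size] - known) / m[r][r]
    return sol


# Normal equations (X'X) b = X'y, with an intercept column of ones.
rows = [[1.0, a, b] for a, b in zip(x1, x2)]
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
b0, b1, b2 = solve(xtx, xty)

n, k = len(y), 3                                   # k counts b0, b1, b2
y_bar = sum(y) / n
fitted = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]

sst = sum((yi - y_bar) ** 2 for yi in y)           # total sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ssr = sst - sse                                    # regression sum of squares
msr, mse = ssr / (k - 1), sse / (n - k)
f_c = msr / mse                                    # overall F statistic
r_sq = ssr / sst                                   # coefficient of determination
s = sqrt(mse)                                      # standard error of the estimate

print(f"y = {b0:.2f} + {b1:.2f} x1 + {b2:.2f} x2")
print(f"SSR={ssr:.3f}  SSE={sse:.3f}  SST={sst:.3f}  F={f_c:.2f}  R-Sq={r_sq:.3f}  S={s:.4f}")
```

The fitted coefficients should agree with the printed equation y = 83.2 + 2.29 x1 + 1.30 x2, and the regression sum of squares, R-Sq and S with the values 23.435, 91.9% and 0.642587 shown in the partial output.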