
Inference About One Population

1. Introduction
• In this chapter we use the approach developed earlier to describe a population:
– Identify the parameter to be estimated or tested.
– Specify the parameter's estimator and its sampling distribution.
– Construct a confidence interval estimator or perform a hypothesis test.
• We develop techniques to estimate and test three population parameters:
– Population mean $\mu$
– Population variance $\sigma^2$
– Population proportion $p$

This chapter covers:
1. Hypothesis tests for a population mean
2. Hypothesis tests for a population variance
3. Hypothesis tests for a population proportion

[Flowchart: choosing a technique by problem objective, data type, and experimental design]
• Describe a population
– Nominal data: z-test and estimator of $p$
– Interval data, central location: t-test and estimator of $\mu$
– Interval data, variability: $\chi^2$-test and estimator of $\sigma^2$
• Compare two populations
– Nominal data: z-test and estimator of $p_1 - p_2$
– Interval data, central location, independent samples, equal population variances: t-test and estimator of $\mu_1 - \mu_2$ (equal variances)
– Interval data, central location, independent samples, unequal population variances: t-test and estimator of $\mu_1 - \mu_2$ (unequal variances)
– Interval data, central location, matched pairs: t-test and estimator of $\mu_D$
– Interval data, variability: F-test and estimator of $\sigma_1^2/\sigma_2^2$
• Compare two or more populations
– Interval data, independent samples: ANOVA (one-way or two-factor) if the populations are normal, Kruskal-Wallis test if nonnormal
– Interval data, blocks: ANOVA (randomized blocks) if the populations are normal, Friedman test if nonnormal
– Ordinal data: Kruskal-Wallis test for independent samples, Friedman test for blocks
– Nominal data: $\chi^2$-test of a contingency table

2. Inference About a Population Mean When the Population Standard Deviation Is Unknown
Recall that when $\sigma$ is known we use the following statistic to estimate and test a population mean:
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
When $\sigma$ is unknown, we use its point estimator $s$, and the z-statistic is replaced by the t-statistic.

The t-Statistic
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
When the sampled population is normally distributed, the t-statistic is Student t distributed.

The t-Statistic
The t distribution is mound-shaped and symmetrical around zero. The "degrees of freedom" (a function of the sample size) determine how spread out the distribution is compared to the normal distribution.
[Figure: two t distributions centered at 0, with d.f. = $\nu_1$ and d.f. = $\nu_2$, where $\nu_1 < \nu_2$]

Testing $\mu$ when $\sigma$ is unknown
• Example 1: Productivity of newly hired trainees
– In order to determine the number of workers required to meet demand, the productivity of newly hired trainees is studied.
– It is believed that trainees can process and distribute more than 450 packages per hour within one week of hiring.
– Can we conclude that this belief is correct, based on productivity observations of 50 trainees?
• Solution
– The problem objective is to describe the population of the number of packages processed in one hour.
– The data are interval.
– The hypotheses are $H_0: \mu = 450$ and $H_1: \mu > 450$.
– The t statistic is $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ with d.f. $= n - 1 = 49$.
• Solution continued (solving by hand)
– The rejection region is $t > t_{\alpha, n-1}$, where $t_{.05,49} \approx t_{.05,50} = 1.676$.
From the data we have $\sum x_i = 23,019$ and $\sum x_i^2 = 10,671,357$, thus
$\bar{x} = \frac{23,019}{50} = 460.38$ and $s^2 = \frac{\sum x_i^2 - (\sum x_i)^2/n}{n - 1} = 1507.55$, so $s = \sqrt{1507.55} = 38.83$.

Testing $\mu$ when $\sigma$ is unknown
• The test statistic is
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{460.38 - 450}{38.83/\sqrt{50}} = 1.89$
• Since 1.89 > 1.676, we reject the null hypothesis in favor of the alternative.
• There is sufficient evidence to infer that the mean productivity of trainees one week after being hired is greater than 450 packages at the .05 significance level.
[Figure: t distribution with rejection region $t > 1.676$ and the test statistic at 1.89]

Testing $\mu$ when $\sigma$ is unknown
• Since the p-value .0323 < .05, we reject the null hypothesis in favor of the alternative, which is the same conclusion as above.
[Figure: upper-tail area .05 with the p-value area .0323 inside it]

Estimating $\mu$ when $\sigma$ is unknown
• Confidence interval estimator of $\mu$ when $\sigma$ is unknown:
$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$, d.f. $= n - 1$

t Distribution
[Figure: standard normal distribution compared with t distributions with 20 and 10 degrees of freedom, all centered at 0]

t Distribution
For more than 100 degrees of freedom, the standard normal z value provides a good approximation to the t value. The standard normal z values can be found in the infinite degrees of freedom ($\infty$) row of the t distribution table.

t Distribution Table (areas in the upper tail)

Degrees of    Area in Upper Tail
Freedom       .20     .10     .05     .025    .01     .005
...
50            .849    1.299   1.676   2.009   2.403   2.678
60            .848    1.296   1.671   2.000   2.390   2.660
80            .846    1.292   1.664   1.990   2.374   2.639
100           .845    1.290   1.660   1.984   2.364   2.626
∞             .842    1.282   1.645   1.960   2.326   2.576   (standard normal z values)

Interval Estimation of a Population Mean: $\sigma$ Unknown
• Interval estimate: $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
where:
$1 - \alpha$ = the confidence coefficient
$t_{\alpha/2}$ = the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with $n - 1$ degrees of freedom
$s$ = the sample standard deviation

Interval Estimation of a Population Mean: $\sigma$ Unknown
At 95% confidence, $\alpha = .05$ and $\alpha/2 = .025$. $t_{.025}$ is based on $n - 1 = 50 - 1 = 49$ degrees of freedom. Using the 50 d.f. row of the t distribution table, $t_{.025} \approx 2.009$.
$\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}} \approx 460.38 \pm 2.009 \frac{38.83}{\sqrt{50}} = (449.35, 471.41)$

Summary of Interval Estimation Procedures for a Population Mean
Can the population standard deviation $\sigma$ be assumed known?
– Yes ($\sigma$ known case): use $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
– No ($\sigma$ unknown case): use the sample standard deviation $s$ to estimate $\sigma$, and use $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$

Estimating $\mu$ when $\sigma$ is unknown
• Example 2
– An investor is trying to estimate the return on investment in companies that won quality awards last year.
– A random sample of 83 such companies is selected, and the return he would have earned by investing in them is calculated.
– Construct a 95% confidence interval for the mean return.
• Solution (solving by hand)
– The problem objective is to describe the population of annual returns from buying shares of quality award-winners.
– The data are interval.
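Before working through Example 2, the Example 1 hand calculation can be reproduced from the reported sums. The following is a minimal sketch in plain Python, not part of the original slides; the function names are hypothetical, and the critical values 1.676 and 2.009 are simply copied from the t table (50 d.f. row) rather than computed.

```python
import math

def mean_and_variance_from_sums(sum_x, sum_x2, n):
    """Sample mean and sample variance from the sums of x and of x squared."""
    mean = sum_x / n
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return mean, variance

def t_statistic(mean, variance, n, mu0):
    """One-sample t statistic for H0: mu = mu0."""
    return (mean - mu0) / math.sqrt(variance / n)

def t_interval(mean, variance, n, t_crit):
    """Confidence interval: mean +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * math.sqrt(variance / n)
    return mean - half_width, mean + half_width

# Example 1 sums as reported in the slides (n = 50 trainees).
n = 50
mean, variance = mean_and_variance_from_sums(23019, 10671357, n)
t = t_statistic(mean, variance, n, mu0=450)

# Critical values copied from the t table (assumption: 50 d.f. row stands in for 49 d.f.).
T_05 = 1.676    # upper-tail area .05, used for the one-tail test
T_025 = 2.009   # upper-tail area .025, used for the 95% interval

reject_h0 = t > T_05
lower, upper = t_interval(mean, variance, n, T_025)

print(f"mean = {mean:.2f}, s^2 = {variance:.2f}, t = {t:.2f}, reject H0: {reject_h0}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The results agree with the hand calculation for Example 1: t is about 1.89 and the interval is about (449.35, 471.41).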
– Solving by hand
• From the data we determine $\bar{x} = 15.02$, $s^2 = 68.98$, $s = \sqrt{68.98} = 8.31$.
• With $t_{.025,82} \approx t_{.025,80} = 1.990$:
$\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}} \approx 15.02 \pm 1.990 \frac{8.31}{\sqrt{83}} = (13.20, 16.83)$

Checking the required conditions
• We need to check that the population is normally distributed, or at least not extremely nonnormal.
• There are statistical methods to test for normality (one to be introduced later in the book).
• From the sample histograms we see…
[Figure: histogram for Example 1, packages (400 to 575 and more)]
[Figure: histogram for Example 2, returns (-4 to 30 and more)]

3. Inference About a Population Variance
• Sometimes we are interested in making inference about the variability of processes.
• Examples:
– The consistency of a production process for quality control purposes.
– Investors use variance as a measure of risk.
• To draw inference about variability, the parameter of interest is $\sigma^2$.

3. Inference About a Population Variance
• The sample variance $s^2$ is an unbiased, consistent and efficient point estimator for $\sigma^2$.
• The statistic $\frac{(n-1)s^2}{\sigma^2}$ has a distribution called chi-squared, if the population is normally distributed:
$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$, d.f. $= n - 1$
[Figure: chi-squared densities with d.f. = 5 and d.f. = 10]

Testing and Estimating a Population Variance
• From the probability statement $P(\chi^2_{1-\alpha/2} < \chi^2 < \chi^2_{\alpha/2}) = 1 - \alpha$ we obtain, by substituting $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$:
$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$

Chi-Square Distribution
• A chi-square distribution with 19 degrees of freedom
[Figure: chi-square density with 19 degrees of freedom]

Chi-Square Distribution
• Selected values from the chi-square distribution table
[Table: selected chi-square values]

Interval Estimation of $\sigma^2$
$\chi^2_{.975} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{.025}$
[Figure: 95% of the possible $\chi^2$ values lie between $\chi^2_{.975}$ and $\chi^2_{.025}$, with area .025 in each tail]

Interval Estimation of $\sigma^2$
• There is a $(1 - \alpha)$ probability of obtaining a $\chi^2$ value such that
$\chi^2_{1-\alpha/2} \le \chi^2 \le \chi^2_{\alpha/2}$
• Substituting $(n-1)s^2/\sigma^2$ for $\chi^2$ we get
$\chi^2_{1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2}$
• Performing algebraic manipulation we get
$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$

Interval Estimation of $\sigma^2$
• Interval estimate of a population variance:
$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
where the $\chi^2$ values are based on a chi-square distribution with $n - 1$ degrees of freedom and $1 - \alpha$ is the confidence coefficient.

Interval Estimation of $\sigma$
• Interval estimate of a population standard deviation: taking the square root of the upper and lower limits of the variance interval provides the confidence interval for the population standard deviation.
$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}$

Testing the Population Variance
• Example 3 (operations management application)
– A container-filling machine is believed to fill 1-liter containers so consistently that the variance of the filling will be less than 1 cc (.001 liter).
– To test this belief a random sample of 25 1-liter fills was taken, and the results recorded.
– Do these data support the belief that the variance is less than 1 cc at the 5% significance level?

Testing the Population Variance
• Solution
– The problem objective is to describe the population of 1-liter fills from a filling machine.
– The data are interval, and we are interested in the variability of the fills.
– The complete test is: $H_0: \sigma^2 = 1$, $H_1: \sigma^2 < 1$
– The test statistic is $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$.
– The rejection region is $\chi^2 < \chi^2_{1-\alpha, n-1}$.

Testing the Population Variance
• Solving by hand
– Note that $(n-1)s^2 = \sum (x_i - \bar{x})^2 = \sum x_i^2 - (\sum x_i)^2/n$.
– From the sample we can calculate $\sum x_i = 24,996.4$ and $\sum x_i^2 = 24,992,821.3$.
– Then $(n-1)s^2 = 24,992,821.3 - (24,996.4)^2/25 = 20.78$, so
$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{20.78}{1^2} = 20.78$, and $\chi^2_{1-\alpha, n-1} = \chi^2_{.95, 25-1} = 13.8484$.
– Since 20.78 > 13.8484, the test statistic does not fall in the rejection region, so we do not reject the null hypothesis.
– There is insufficient evidence to conclude that the variance is less than 1.
[Figure: chi-square distribution with $\alpha = .05$, $1 - \alpha = .95$, rejection region $\chi^2 < 13.8484$, and the test statistic at 20.8: do not reject the null hypothesis]

Estimating the Population Variance
• Example 3.1
– Estimate the variance of fills in the Example 3 data with 99% confidence.
• Solution
– We have $(n-1)s^2 = 20.78$. From the chi-squared table we have
$\chi^2_{\alpha/2, n-1} = \chi^2_{.005, 24} = 45.5585$
$\chi^2_{1-\alpha/2, n-1} = \chi^2_{.995, 24} = 9.88623$

Estimating the Population Variance
• The confidence interval estimate is
$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
$\frac{20.78}{45.5585} < \sigma^2 < \frac{20.78}{9.88623}$
$.46 < \sigma^2 < 2.10$

4. Inference About a Population Proportion
• When the population consists of nominal data, the only inference we can make is about the proportion of occurrence of a certain value.
• The parameter $p$ was used before to calculate these probabilities under the binomial distribution.

4. Inference About a Population Proportion
• Statistic and sampling distribution
– The statistic used when making inference about $p$ is $\hat{p} = \frac{x}{n}$, where $x$ = the number of successes and $n$ = the sample size.
– Under certain conditions [$np > 5$ and $n(1-p) > 5$], $\hat{p}$ is approximately normally distributed, with mean $\mu = p$ and variance $\sigma^2 = p(1-p)/n$.

Testing and Estimating the Proportion
• Test statistic for $p$:
$Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$, where $np \ge 5$ and $n(1-p) \ge 5$
• Interval estimator for $p$ ($1 - \alpha$ confidence level):
$\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$, provided $n\hat{p} \ge 5$ and $n(1-\hat{p}) \ge 5$

Testing the Proportion
• Example 4.
(Predicting the winner on election day)
– Voters are asked by a certain network to participate in an exit poll in order to predict the winner on election day.
– Based on the data presented, where 1 = Democrat and 2 = Republican, can the network conclude that the Republican candidate will win the state college vote?

Testing the Proportion
• Solution
– The problem objective is to describe the population of votes in the state.
– The data are nominal.
– The parameter to be tested is $p$.
– Success is defined as "vote Republican".
– The hypotheses are: $H_0: p = .5$, $H_1: p > .5$ (more than 50% vote Republican)

Testing the Proportion
– Solving by hand
• The rejection region is $z > z_{\alpha} = z_{.05} = 1.645$.
• From the file we count 407 successes. The number of voters participating is 765.
• The sample proportion is $\hat{p} = 407/765 = .532$.
• The value of the test statistic is
$Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{.532 - .5}{\sqrt{.5(1-.5)/765}} = 1.77$
• The p-value is $P(Z > 1.77) = .038$.

Testing the Proportion
There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. At the 5% significance level we can conclude that more than 50% voted Republican.

Selecting the Sample Size to Estimate the Proportion
• Recall: the confidence interval for the proportion is $\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$.
• Thus, to estimate the proportion to within $W$, we can write $W = z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$.

Selecting the Sample Size to Estimate the Proportion
• The required sample size is
$n = \left( \frac{z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})}}{W} \right)^2$

Sample Size to Estimate the Proportion
• Example
– Suppose we want to estimate the proportion of customers who prefer our company's brand to within .03 with 95% confidence.
– Find the sample size.
– Solution
$W = .03$ and $1 - \alpha = .95$, therefore $\alpha/2 = .025$, so $z_{.025} = 1.96$ and
$n = \left( \frac{1.96 \sqrt{\hat{p}(1-\hat{p})}}{.03} \right)^2$
Since the sample has not yet been taken, the sample proportion is still unknown.
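The exit-poll test in Example 4 and the sample-size formula can be sketched in a few lines of plain Python. This is an illustrative sketch rather than part of the original slides: the function names are hypothetical, the critical values 1.645 and 1.96 are copied from the text, and the planning value 0.5 passed to the sample-size function is an assumption chosen because $\hat{p}(1-\hat{p})$ is largest there.

```python
import math
from statistics import NormalDist

def proportion_z(successes, n, p0):
    """z statistic for H0: p = p0, using the sample proportion successes / n."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def upper_tail_p_value(z):
    """P(Z > z) for a standard normal Z."""
    return 1 - NormalDist().cdf(z)

def required_sample_size(z_crit, p_planning, w):
    """Smallest n for which z_crit * sqrt(p(1-p)/n) is at most w."""
    n_exact = (z_crit * math.sqrt(p_planning * (1 - p_planning)) / w) ** 2
    return math.ceil(n_exact)

# Example 4: 407 Republican votes among 765 voters, H1: p > .5.
z = proportion_z(407, 765, 0.5)
p_value = upper_tail_p_value(z)
reject_h0 = z > 1.645  # z_.05 as given in the text

# Sample size to estimate p to within .03 at 95% confidence.
# Assumption: 0.5 is used as a conservative planning value for p-hat.
n_needed = required_sample_size(1.96, 0.5, 0.03)

print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
print(f"required n with planning value .5: {n_needed}")
```

Rounding the sample size up rather than to the nearest integer keeps the interval half-width at or below the target $W$.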
We proceed using one of the following two methods:

Sample Size to Estimate the Proportion
• Method 1: there is no knowledge about the value of $\hat{p}$
– Let $\hat{p} = .5$. This results in the largest possible $n$ needed for a $1 - \alpha$ confidence interval of the form $\hat{p} \pm .03$:
$n = \left( \frac{1.96 \sqrt{.5(1-.5)}}{.03} \right)^2 = 1,068$
– If the sample proportion does not equal .5, the actual $W$ will be narrower than .03 with the $n$ obtained from this formula.
• Method 2: there is some idea about the value of $\hat{p}$
– Use the value of $\hat{p}$ to calculate the sample size. For example, with $\hat{p} = .2$:
$n = \left( \frac{1.96 \sqrt{.2(1-.2)}}{.03} \right)^2 = 683$

Inference about Two Populations

[Flowchart of statistical techniques repeated: describe a population, compare two populations]

1. Introduction
• A variety of techniques are presented whose objective is to compare two populations.
• We are interested in:
– The difference between two means.
– The ratio of two variances.
– The difference between two proportions.

2. Inference about the Difference between Two Means: Independent Samples
• Two random samples are drawn from the two populations of interest.
• Because we compare two population means, we use the statistic $\bar{x}_1 - \bar{x}_2$.

The Sampling Distribution of $\bar{x}_1 - \bar{x}_2$
1. $\bar{x}_1 - \bar{x}_2$ is normally distributed if the (original) population distributions are normal.
2. $\bar{x}_1 - \bar{x}_2$ is approximately normally distributed if the (original) populations are not normal, but the sample sizes are sufficiently large (greater than 30).
3. The expected value of $\bar{x}_1 - \bar{x}_2$ is $\mu_1 - \mu_2$.
4. The variance of $\bar{x}_1 - \bar{x}_2$ is $\sigma_1^2/n_1 + \sigma_2^2/n_2$.

Making an inference about $\mu_1 - \mu_2$
• If the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is normal or approximately normal we can write:
$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
• Z can be used to build a test statistic or a confidence interval for $\mu_1 - \mu_2$.

Making an inference about $\mu_1 - \mu_2$
• In practice, the Z statistic is hardly used, because the population variances are not known.
• Instead, we construct a t statistic using the sample variances ($s_1^2$ and $s_2^2$) in place of $\sigma_1^2$ and $\sigma_2^2$:
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$

Making an inference about $\mu_1 - \mu_2$
• Two cases are considered when producing the t-statistic:
– The two unknown population variances are equal.
– The two unknown population variances are not equal.

Inference about $\mu_1 - \mu_2$: Equal variances
• Calculate the pooled variance estimate by:
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Example: $s_1^2 = 25$; $s_2^2 = 30$; $n_1 = 10$; $n_2 = 15$.
Then, $s_p^2 = \frac{(10 - 1)(25) + (15 - 1)(30)}{10 + 15 - 2} = 28.0435$, which lies between $s_1^2$ and $s_2^2$ and closer to $s_2^2$, the variance of the larger sample.

Inference about $\mu_1 - \mu_2$: Equal variances
• Construct the t-statistic as follows:
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$, d.f. $= n_1 + n_2 - 2$
• Perform a hypothesis test:
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 > 0$, or $< 0$, or $\ne 0$
• Build a confidence interval:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$
where $1 - \alpha$ is the confidence level.

Inference about $\mu_1 - \mu_2$: Unequal variances
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
d.f. $= \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$

Inference about $\mu_1 - \mu_2$: Unequal variances
Conduct a hypothesis test as needed, or build a confidence interval:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
where $1 - \alpha$ is the confidence level.

Example: Making an inference about $\mu_1 - \mu_2$
• Example 1
– Do people who eat high-fiber cereal for breakfast consume, on average, fewer calories for lunch than people who do not eat high-fiber cereal for breakfast?
– A sample of 150 people was randomly drawn. Each person was identified as a consumer or a non-consumer of high-fiber cereal.
– For each person the number of calories consumed at lunch was recorded.

Example: Making an inference about $\mu_1 - \mu_2$
Data (excerpt):
Consumers: 568, 498, 589, 681, 540, 646, 636, 739, 539, 596, 607, 529, 637, 617, 633, 555, ...
Non-consumers: 705, 819, 706, 509, 613, 582, 601, 608, 787, 573, 428, 754, 741, 628, 537, 748, ...
Solution:
• The data are interval.
• The parameter to be tested is the difference between two means.
• The claim to be tested is: the mean caloric intake of consumers ($\mu_1$) is less than that of non-consumers ($\mu_2$).
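The pooled-variance and unequal-variances formulas above can be checked numerically. The sketch below is a minimal illustration in plain Python, not part of the original slides. The function names are hypothetical, and the Example 1 summary statistics used (sample means 604.02 and 633.23, sample variances 4103 and 10,670, sample sizes 43 and 107) are the values this document reports for the cereal data.

```python
import math

def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Pooled variance estimate for the equal-variances case."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def unequal_variances_df(s1_sq, n1, s2_sq, n2):
    """Degrees of freedom for the unequal-variances t statistic."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

def unequal_variances_t(x1, s1_sq, n1, x2, s2_sq, n2, diff0=0.0):
    """t statistic for H0: mu1 - mu2 = diff0 without pooling the variances."""
    return ((x1 - x2) - diff0) / math.sqrt(s1_sq / n1 + s2_sq / n2)

# Pooled-variance illustration: s1^2 = 25, s2^2 = 30, n1 = 10, n2 = 15.
sp_sq = pooled_variance(25, 10, 30, 15)

# Example 1 (cereal data), summary statistics as reported in this document.
df = unequal_variances_df(4103, 43, 10670, 107)
t = unequal_variances_t(604.02, 4103, 43, 633.23, 10670, 107)

print(f"pooled variance = {sp_sq:.4f}")
print(f"unequal-variances d.f. = {df:.1f}, t = {t:.3f}")
```

With equal sample variances and equal sample sizes, `unequal_variances_df` reduces to $n_1 + n_2 - 2$, the equal-variances degrees of freedom, which is a quick sanity check on the formula.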
Example: Making an inference about $\mu_1 - \mu_2$
• The hypotheses are:
$H_0: (\mu_1 - \mu_2) = 0$
$H_1: (\mu_1 - \mu_2) < 0$
– To check whether the population variances are equal, we use computer output to find the sample variances. We have $s_1^2 = 4103$ and $s_2^2 = 10,670$.
– It appears that the variances are unequal.

Example: Making an inference about $\mu_1 - \mu_2$
• Compute: manually
– From the data we have: $\bar{x}_1 = 604.02$, $\bar{x}_2 = 633.23$, $s_1^2 = 4,103$, $s_2^2 = 10,670$
$\nu = \frac{(4103/43 + 10670/107)^2}{\frac{(4103/43)^2}{43 - 1} + \frac{(10670/107)^2}{107 - 1}} = 122.6 \approx 123$

Example: Making an inference about $\mu_1 - \mu_2$
• Compute: manually
– The rejection region is $t < -t_{\alpha, \nu} = -t_{.05, 123} \approx -1.658$
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(604.02 - 633.23) - 0}{\sqrt{\frac{4103}{43} + \frac{10670}{107}}} = -2.091$

Example: Making an inference about $\mu_1 - \mu_2$
Since $-2.091 < -1.658$, at the 5% significance level there is sufficient evidence to reject the null hypothesis.

Example: Making an inference about $\mu_1 - \mu_2$
• Compute: manually
The confidence interval estimator for the difference between two means is
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (604.02 - 633.23) \pm 1.9794 \sqrt{\frac{4103}{43} + \frac{10670}{107}} = -29.21 \pm 27.65 = (-56.86, -1.56)$

Example: Making an inference about $\mu_1 - \mu_2$
• Example 2
– An ergonomic chair can be assembled using two different sets of operations (Method A and Method B).
– The operations manager would like to know whether the assembly times under the two methods differ.
– Two samples are randomly and independently selected:
• A sample of 25 workers assembled the chair using Method A.
• A sample of 25 workers assembled the chair using Method B.
• The assembly times were recorded.
– Do the assembly times of the two methods differ?

Example: Making an inference about $\mu_1 - \mu_2$
Assembly times in minutes (excerpt):

Method A    Method B
6.8         5.2
5.0         6.7
7.9         5.7
5.2         6.6
7.6         8.5
5.0         6.5
5.9         5.9
5.2         6.7
6.5         6.6
...         ...

Solution
• The data are interval.
• The parameter of interest is the difference between two population means.
• The claim to be tested is whether a difference between the two methods exists.

Example: Making an inference about $\mu_1 - \mu_2$
• Compute: manually
– The hypotheses are:
$H_0: (\mu_1 - \mu_2) = 0$
$H_1: (\mu_1 - \mu_2) \ne 0$
– To check whether the two unknown population variances are equal we calculate $s_1^2$ and $s_2^2$.
– We have $s_1^2 = 0.8478$ and $s_2^2 = 1.3031$.
– The two population variances appear to be equal.

Example: Making an inference about $\mu_1 - \mu_2$
• Compute: manually
– To calculate the t-statistic we have: $\bar{x}_1 = 6.288$, $\bar{x}_2 = 6.016$, $s_1^2 = 0.8478$, $s_2^2 = 1.3031$
$s_p^2 = \frac{(25 - 1)(0.848) + (25 - 1)(1.303)}{25 + 25 - 2} = 1.076$
$t = \frac{(6.288 - 6.016) - 0}{\sqrt{1.076 \left( \frac{1}{25} + \frac{1}{25} \right)}} = 0.927$, d.f. $= 25 + 25 - 2 = 48$

Example: Making an inference about $\mu_1 - \mu_2$
• For $\alpha = 0.05$, the rejection region is $t < -t_{\alpha/2, \nu} = -t_{.025, 48} = -2.009$ or $t > t_{\alpha/2, \nu} = t_{.025, 48} = 2.009$.
• The test: since $-2.009 < 0.927 < 2.009$, there is insufficient evidence to reject the null hypothesis.
[Figure: rejection regions below -2.009 and above 2.009, with the test statistic between them]

Example: Making an inference about $\mu_1 - \mu_2$
• Conclusion: there is no evidence to infer at the 5% significance level that the two assembly methods are different in terms of assembly time.

Example: Making an inference about $\mu_1 - \mu_2$
A 95% confidence interval for $\mu_1 - \mu_2$ is calculated as follows:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = (6.288 - 6.016) \pm 2.0106 \sqrt{1.075 \left( \frac{1}{25} + \frac{1}{25} \right)} = 0.272 \pm 0.5897 = [-0.3177, 0.8617]$
Thus, at the 95% confidence level, $-0.3177 < \mu_1 - \mu_2 < 0.8617$.
Notice: zero is included in the confidence interval.

Checking the required conditions for the equal variances case (Example 2)
The data appear to be approximately normal.
[Figure: histograms of assembly times for Design A and Design B]

4. Matched Pairs Experiment
• What is a matched pairs experiment?
• Why are matched pairs experiments needed?
• How do we deal with data produced in this way?
The following example demonstrates a situation where a matched pairs experiment is the correct approach to testing the difference between two population means.

4. Matched Pairs Experiment
Example 3
– To investigate the job offers obtained by MBA graduates, a study focusing on salaries was conducted.
– In particular, the salaries offered to finance majors were compared to those offered to marketing majors.
– Random samples of 25 graduates in each of the two disciplines were selected, and the highest salary offer was recorded for each graduate.
– Can we infer that finance majors obtain higher salary offers than marketing majors among MBAs?

4. Matched Pairs Experiment
• Solution
– Compare two populations of interval data.
– The parameter tested is $\mu_1 - \mu_2$, where $\mu_1$ is the mean of the highest salary offered to Finance MBAs and $\mu_2$ is the mean of the highest salary offered to Marketing MBAs.
– $H_0: (\mu_1 - \mu_2) = 0$, $H_1: (\mu_1 - \mu_2) > 0$
Data (excerpt):
Finance: 61,228; 51,836; 20,620; 73,356; 84,186; ...
Marketing: 73,361; 36,956; 63,627; 71,069; 40,203; ...

4. Matched Pairs Experiment
• Solution, continued
There is insufficient evidence to conclude that Finance MBAs are offered higher salaries than Marketing MBAs.

The effect of a large sample variability
• Question
– The difference between the sample means is 65,624 – 60,423 = 5,201.
– So why could we not reject $H_0$ in favor of $H_1$: $(\mu_1 - \mu_2) > 0$?
• Answer
– $s_p^2$ is large (because the sample variances are large): $s_p^2 = 311,330,926$.
– A large variance reduces the value of the t statistic, and it becomes more difficult to reject $H_0$:
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$

Reducing the variability
[Figure: the range of observations in sample A]
The values each sample consists of might markedly vary...
[Figure: the range of observations in sample B]

Reducing the variability
...but the differences between pairs of observations might be quite close to one another, resulting in a small variability of the differences.
[Figure: the range of the differences, located near 0]

The matched pairs experiment
• Since the difference of the means is equal to the mean of the differences, we can rewrite the hypotheses in terms of $\mu_D$ (the mean of the differences) rather than in terms of $\mu_1 - \mu_2$.
• This formulation has the benefit of a smaller variability.

Group 1          Group 2          Difference
10               12               -2
15               11               +4
Mean1 = 12.5     Mean2 = 11.5     Mean of differences = 1

Mean1 – Mean2 = 1

The matched pairs experiment
• Example 4
– It was suspected that salary offers were affected by students' GPA (which caused $s_1^2$ and $s_2^2$ to increase).
– To reduce this variability, the following procedure was used:
• 25 ranges of GPAs were predetermined.
• Students from each major were randomly selected, one from each GPA range.
• The highest salary offer for each student was recorded.
– From the data presented, can we conclude that Finance majors are offered higher salaries?

The matched pairs hypothesis test
• Solution (by hand)
– The parameter tested is $\mu_D$ $(= \mu_1 - \mu_2)$.
– The hypotheses: $H_0: \mu_D = 0$, $H_1: \mu_D > 0$
– The rejection region is $t > t_{.05, 25-1} = 1.711$.
– The t statistic: $t = \frac{\bar{x}_D - \mu_D}{s_D/\sqrt{n_D}}$, with degrees of freedom $= n_D - 1$

The matched pairs hypothesis test
• Solution: from the data we calculate the differences.

GPA Group    Finance    Marketing    Difference
1            95171      89329        5842
2            88009      92705        -4696
3            98089      99205        -1116
4            106322     99003        7319
5            74566      74825        -259
6            87089      77038        10051
7            88664      78272        10392
8            71200      59462        11738
9            69367      51555        17812
10           82618      81591        1027
...          ...        ...          ...
The matched pairs hypothesis test
• Solution
– From the differences we obtain $\bar{x}_D = 5,065$ and $s_D = 6,647$.
– Calculate t:
$t = \frac{\bar{x}_D - \mu_D}{s_D/\sqrt{n}} = \frac{5065 - 0}{6647/\sqrt{25}} = 3.81$

The matched pairs hypothesis test
Conclusion: since 3.81 > 1.711, there is sufficient evidence to infer at the 5% significance level that the Finance MBAs' highest salary offer is, on average, higher than that of the Marketing MBAs.

The matched pairs mean difference estimation
Confidence interval estimator of $\mu_D$:
$\bar{x}_D \pm t_{\alpha/2, n-1} \frac{s_D}{\sqrt{n}}$
The 95% confidence interval of the mean difference in Example 4 is
$5065 \pm 2.064 \frac{6647}{\sqrt{25}} = 5,065 \pm 2,744 = [2321, 7809]$

Checking the required conditions for the paired observations case
• The validity of the results depends on the normality of the differences.
[Figure: frequency histogram of the differences, 0 to 20,000]

5. Inference about the ratio of two variances
• In this section we draw inference about the ratio of two population variances.
• This question is interesting because:
– Variances can be used to evaluate the consistency of processes.
– The relationship between population variances determines which of the equal-variances or unequal-variances t-test and estimator of the difference between means should be applied.

Parameter and Statistic
• The parameter to be tested is $\sigma_1^2/\sigma_2^2$.
• The statistic used is
$F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$
• Sampling distribution of $s_1^2/s_2^2$
– The statistic $[s_1^2/\sigma_1^2] / [s_2^2/\sigma_2^2]$ follows the F distribution with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ degrees of freedom.

Parameter and Statistic
– Our null hypothesis is always $H_0: \sigma_1^2/\sigma_2^2 = 1$.
– Under this null hypothesis the F statistic $F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$ becomes
$F = \frac{s_1^2}{s_2^2}$

Testing the ratio of two population variances
Example 1.1 (revisiting Example 1): calorie intake at lunch
In order to perform a test regarding the average consumption of calories at people's lunch in relation to the inclusion of high-fiber cereal in their breakfast, the variance ratio of the two samples has to be tested first.
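Returning briefly to the matched pairs results, the Example 4 test statistic and interval can be reproduced from the reported summary statistics, and the small two-group table shows why the hypotheses can be restated in terms of the mean difference. This is a minimal sketch in plain Python, not part of the original slides; the function names are hypothetical and the critical value 2.064 is copied from the text rather than computed.

```python
import math
from statistics import mean

# Two-group illustration: the difference of the means equals the mean of the differences.
group1 = [10, 15]
group2 = [12, 11]
differences = [a - b for a, b in zip(group1, group2)]
diff_of_means = mean(group1) - mean(group2)
mean_of_diffs = mean(differences)

def paired_t(mean_d, sd_d, n, mu_d0=0.0):
    """t statistic for H0: mu_D = mu_d0, with n - 1 degrees of freedom."""
    return (mean_d - mu_d0) / (sd_d / math.sqrt(n))

def paired_interval(mean_d, sd_d, n, t_crit):
    """Confidence interval: mean_d +/- t_crit * sd_d / sqrt(n)."""
    half_width = t_crit * sd_d / math.sqrt(n)
    return mean_d - half_width, mean_d + half_width

# Example 4 summary statistics as reported: mean difference 5,065, s_D = 6,647, n = 25.
t = paired_t(5065, 6647, 25)
lower, upper = paired_interval(5065, 6647, 25, t_crit=2.064)  # t_.025,24 from the text

print(f"difference of means = {diff_of_means}, mean of differences = {mean_of_diffs}")
print(f"t = {t:.2f}, 95% CI: ({lower:.0f}, {upper:.0f})")
```

The point estimate is the same whether one works with $\mu_1 - \mu_2$ or $\mu_D$; only the standard error changes, because it is computed from the spread of the differences rather than from the two samples separately.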
Data (excerpt):
Consumers: 568, 498, 589, 681, 540, 646, 636, 739, 539, 596, 607, 529, 637, 617, 633, 555, ...
Non-consumers: 705, 819, 706, 509, 613, 582, 601, 608, 787, 573, 428, 754, 741, 628, 537, 748, ...
The hypotheses are:
$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1$
$H_1: \frac{\sigma_1^2}{\sigma_2^2} \ne 1$
Recall from Example 1 that the sample variances are $s_1^2 = 4103$ and $s_2^2 = 10,670$, which appear to be unequal.

F Distribution
• Selected values from the F distribution table
[Table: selected F distribution values]

Testing the ratio of two population variances
• Solving by hand
– The rejection region is $F > F_{\alpha/2, \nu_1, \nu_2}$ or $F < 1/F_{\alpha/2, \nu_2, \nu_1}$:
$F > F_{\alpha/2, \nu_1, \nu_2} = F_{.025, 42, 106} \approx F_{.025, 40, 120} = 1.61$
$F < \frac{1}{F_{\alpha/2, \nu_2, \nu_1}} = \frac{1}{F_{.025, 106, 42}} \approx \frac{1}{F_{.025, 120, 40}} = \frac{1}{1.72} = .58$
– The F statistic value is $F = s_1^2/s_2^2 = .3845$.
– Conclusion: because .3845 < .58 we reject the null hypothesis in favor of the alternative hypothesis, and conclude that there is sufficient evidence at the 5% significance level that the population variances differ.

Testing the ratio of two population variances
Example 6 (revisiting Example 1)
In order to perform a test regarding the average consumption of calories at people's lunch in relation to the inclusion of high-fiber cereal in their breakfast, the variance ratio of the two samples has to be tested first.
The hypotheses are: $H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1$, $H_1: \frac{\sigma_1^2}{\sigma_2^2} \ne 1$
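Before turning to the computer output, the hand calculation of the F test above can be reproduced as follows. This is an illustrative sketch in plain Python, not part of the original slides; the function names are hypothetical, and the table values 1.61 and 1.72 are the approximations quoted in the text, not computed critical values.

```python
def variance_ratio(s1_sq, s2_sq):
    """F statistic under H0: sigma1^2 / sigma2^2 = 1."""
    return s1_sq / s2_sq

def two_tail_f_rejects(f_stat, f_upper, f_reversed_df):
    """Two-tail decision: reject if F > f_upper or F < 1 / f_reversed_df.

    f_upper is F_{alpha/2, v1, v2}; f_reversed_df is F_{alpha/2, v2, v1}.
    Returns the decision together with the lower critical value.
    """
    lower_crit = 1 / f_reversed_df
    return (f_stat > f_upper or f_stat < lower_crit), lower_crit

# Sample variances from Example 1: consumers 4103, non-consumers 10,670.
f_stat = variance_ratio(4103, 10670)

# Table approximations quoted in the text (assumption: nearest tabulated d.f. used).
F_UPPER = 1.61        # F_.025,40,120, standing in for F_.025,42,106
F_REVERSED_DF = 1.72  # F_.025,120,40, standing in for F_.025,106,42

reject_h0, lower_crit = two_tail_f_rejects(f_stat, F_UPPER, F_REVERSED_DF)
print(f"F = {f_stat:.4f}, lower critical value = {lower_crit:.2f}, reject H0: {reject_h0}")
```

Because the F table lists only upper-tail values, the lower critical value is obtained as the reciprocal of the upper-tail value with the degrees of freedom swapped, which is what the second function encodes.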
F-Test Two-Sample for Variances

                       Consumers    Nonconsumers
Mean                   604          633
Variance               4103         10670
Observations           43           107
df                     42           106
F                      0.3845
P(F<=f) one-tail       0.0004
F Critical one-tail    0.6371

Estimating the Ratio of Two Population Variances
• From the statistic $F = [s_1^2/\sigma_1^2] / [s_2^2/\sigma_2^2]$ we can isolate $\sigma_1^2/\sigma_2^2$ and build the following confidence interval:
$\left( \frac{s_1^2}{s_2^2} \right) \frac{1}{F_{\alpha/2, \nu_1, \nu_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \left( \frac{s_1^2}{s_2^2} \right) F_{\alpha/2, \nu_2, \nu_1}$
where $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$.

Estimating the Ratio of Two Population Variances
• Example 1.2
– Determine the 95% confidence interval estimate of the ratio of the two population variances in Example 1.
– Solution
• We find $F_{\alpha/2, \nu_1, \nu_2} = F_{.025, 40, 120} = 1.61$ (approximately) and $F_{\alpha/2, \nu_2, \nu_1} = F_{.025, 120, 40} = 1.72$ (approximately).
• LCL $= (s_1^2/s_2^2)[1/F_{\alpha/2, \nu_1, \nu_2}] = (4102.98/10,669.77)[1/1.61] = .2388$
• UCL $= (s_1^2/s_2^2)[F_{\alpha/2, \nu_2, \nu_1}] = (4102.98/10,669.77)[1.72] = .6614$

6. Inference about the difference between two population proportions
• In this section we deal with two populations whose data are nominal.
• For nominal data we compare the population proportions of the occurrence of a certain event.
• Examples
– Comparing the effectiveness of a new drug versus an older one
– Comparing market share before and after an advertising campaign
– Comparing defective rates between two machines

Parameter and Statistic
• Parameter
– When the data are nominal, we can only count the occurrences of a certain event in the two populations, and calculate proportions.
– The parameter is therefore $p_1 - p_2$.
• Statistic
– An unbiased estimator of $p_1 - p_2$ is $\hat{p}_1 - \hat{p}_2$ (the difference between the sample proportions).

Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
• Two random samples are drawn from two populations.
• The number of successes in each sample is recorded.
• The sample proportions are computed.
| | Sample 1 | Sample 2 |
|---|---|---|
| Sample size | $n_1$ | $n_2$ |
| Number of successes | $x_1$ | $x_2$ |
| Sample proportion | $\hat{p}_1 = x_1/n_1$ | $\hat{p}_2 = x_2/n_2$ |

Sampling distribution of $\hat{p}_1 - \hat{p}_2$
• The statistic $\hat{p}_1 - \hat{p}_2$ is approximately normally distributed if $n_1p_1$, $n_1(1 - p_1)$, $n_2p_2$, $n_2(1 - p_2)$ are all greater than or equal to 5.
• The mean of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$.
• The variance of $\hat{p}_1 - \hat{p}_2$ is $\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}$.

The z-statistic
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}}$$
Because $p_1$ and $p_2$ are unknown, the standard error must be estimated using the sample proportions. The method depends on the null hypothesis.

Testing $p_1 - p_2$
• There are two cases to consider:

Case 1: $H_0: p_1 - p_2 = 0$. Calculate the pooled proportion
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
Then
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Case 2: $H_0: p_1 - p_2 = D$ ($D$ is not equal to 0). Do not pool the data:
$$\hat{p}_1 = \frac{x_1}{n_1}, \qquad \hat{p}_2 = \frac{x_2}{n_2}$$
Then
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$$

Testing $p_1 - p_2$ (Case 1)
• Example 5.
– The marketing manager needs to decide which of two new packaging designs to adopt, to help improve sales of his company's soap.
– A study is performed in two supermarkets:
• Brightly-colored packaging is distributed in supermarket 1.
• Simple packaging is distributed in supermarket 2.
– The first design is more expensive; therefore, to be financially viable it has to outsell the second design.
• Summary of the experiment results
– Supermarket 1: 180 purchasers of Johnson Brothers soap out of a total of 904
– Supermarket 2: 155 purchasers of Johnson Brothers soap out of a total of 1,038
– Use a 5% significance level and perform a test to find which type of packaging to use.
• Solution
– The problem objective is to compare the populations of sales of the two packaging designs.
– The data are nominal (Johnson Brothers or other soap).
Population 1: purchases at supermarket 1
Population 2: purchases at supermarket 2
– The hypotheses are
$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 > 0$
– We identify this application as case 1.

Testing $p_1 - p_2$ (Case 1)
• Compute: Manually
– For a 5% significance level the rejection region is $z > z_\alpha = z_{.05} = 1.645$.
The sample proportions are $\hat{p}_1 = 180/904 = .1991$ and $\hat{p}_2 = 155/1{,}038 = .1493$.
The pooled proportion is $\hat{p} = (x_1 + x_2)/(n_1 + n_2) = (180 + 155)/(904 + 1{,}038) = .1725$.
The z statistic becomes
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{.1991 - .1493}{\sqrt{.1725(1 - .1725)\left(\dfrac{1}{904} + \dfrac{1}{1{,}038}\right)}} = 2.90$$
Conclusion: There is sufficient evidence to conclude at the 5% significance level that the brightly-colored design will outsell the simple design.

Testing $p_1 - p_2$ (Case 2)
• Example 5.1. (Revisit Example 5.)
– Management needs to decide which of two new packaging designs to adopt, to help improve sales of a certain soap.
– A study is performed in two supermarkets.
– For the brightly-colored design to be financially viable it has to outsell the simple design by at least 3%.
• Summary of the experiment results
– Supermarket 1: 180 purchasers of Johnson Brothers' soap out of a total of 904
– Supermarket 2: 155 purchasers of Johnson Brothers' soap out of a total of 1,038
– Use a 5% significance level and perform a test to find which type of packaging to use.
• Solution
– The hypotheses to test are
$H_0: p_1 - p_2 = .03$
$H_1: p_1 - p_2 > .03$
– We identify this application as case 2 (the hypothesized difference is not equal to zero).
• Compute: Manually
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}} = \frac{\left(\dfrac{180}{904} - \dfrac{155}{1{,}038}\right) - .03}{\sqrt{\dfrac{.1991(1 - .1991)}{904} + \dfrac{.1493(1 - .1493)}{1{,}038}}} = 1.15$$
The rejection region is $z > z_\alpha = z_{.05} = 1.645$.
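The two manual calculations can be checked with a short script. This is only a sketch of the case 1 (pooled) and case 2 (unpooled) formulas; the function names are illustrative and not from the slides. With unrounded proportions the case 2 statistic comes out near 1.14 rather than 1.15, a difference that is purely rounding.

```python
import math

def z_pooled(x1, n1, x2, n2):
    """Case 1 (H0: p1 - p2 = 0): standard error from the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def z_unpooled(x1, n1, x2, n2, d):
    """Case 2 (H0: p1 - p2 = d, d != 0): standard error without pooling."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2 - d) / se

Z_CRITICAL = 1.645  # z_.05 for the one-tailed tests in the slides

z_case1 = z_pooled(180, 904, 155, 1038)
z_case2 = z_unpooled(180, 904, 155, 1038, 0.03)

for label, z in (("Case 1", z_case1), ("Case 2", z_case2)):
    decision = "reject H0" if z > Z_CRITICAL else "do not reject H0"
    print(f"{label}: z = {z:.2f}, {decision}")
```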
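Conclusion: Since 1.15 < 1.645, do not reject the null hypothesis. There is insufficient evidence to infer that the brightly-colored design will outsell the simple design by 3% or more.

Testing $p_1 - p_2$ (Case 2)
z-Test: Two Proportions

| | Supermarket 1 | Supermarket 2 |
|---|---|---|
| Sample Proportions | 0.1991 | 0.1493 |
| Observations | 904 | 1038 |
| Hypothesized Difference | 0.03 | |
| z Stat | 1.14 | |
| P(Z<=z) one tail | 0.1261 | |
| z Critical one-tail | 1.6449 | |
| P(Z<=z) two-tail | 0.2522 | |
| z Critical two-tail | 1.96 | |

The Lady Tasting Tea

CH17 Analysis of Variance

One summer afternoon in the 1920s, a group of university researchers, their wives, and their guests were sitting at an outdoor table in Cambridge, England, leisurely enjoying afternoon tea. One lady claimed that the order of preparation greatly affects the flavor: tea poured into milk tastes completely different from milk poured into tea. The scientifically minded gentlemen at the table scoffed at the idea. How could there be any difference?

At that point a short, thin gentleman with a small moustache said excitedly, "Let us test this proposition," and immediately set about preparing an experiment. He prepared many cups of tea, some with the tea poured first and the milk added, some with the milk first and the tea added, and handed them one by one to the lady who claimed the taste was different.

The gentleman with the moustache was Sir R. A. Fisher. The question Fisher considered was this: if she is given only one cup, she has a 50 percent chance of guessing how it was prepared even if she cannot actually tell the difference. If she is given two cups, she may still guess correctly; in fact, if she knows that the two cups were prepared differently, she will get either both right or both wrong.

Likewise, even if she really can tell the difference, she may still make a mistake. Perhaps the tea and milk in one cup were not fully mixed, or the water was not hot enough when the tea was brewed, which affected the taste. She might taste ten cups and identify nine correctly and only one incorrectly.

This is a classic example of experimental design. One has to consider the different experiments that could be designed to find out whether the lady can distinguish the teas: how many cups to prepare, in what order to present them, and whether to tell her the order of tasting, and then compute the probability of each outcome according to whether she answers correctly.

R. A. Fisher
An historical note: In David Salsburg's The Lady Tasting Tea: How Statistics Revolutionized Science In The 20th Century, Hugh Smith, a witness on that summer afternoon in Cambridge, does not recall the exact number of trials in the experiment, but does say that the lady in question passed with flying colors, correctly identifying milk-in-tea or tea-in-milk every time; as for the cause of the taste difference, pouring hot tea into cold milk makes the milk curdle, but not so pouring cold milk into hot tea.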
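Experimentation is a tool for accumulating knowledge, but many people do not appreciate this. First-rate scientists can carry out valuable experiments that produce new knowledge, whereas second-rate scientists merely busy themselves with experiments and collect large amounts of data that contribute little to the accumulation of knowledge. Although science developed out of careful thought, observation, and experiment, no one had ever said how experiments should actually be done, and complete experimental results were usually not published for everyone to see.

How should an experiment be designed? Fisher's conclusion was that scientists should start from a mathematical model of the potential experimental outcomes. A mathematical model is a set of equations in which some symbols stand for the data we intend to collect through the experiment and the remaining symbols stand for the results of the experiment. In considering a scientific question, the scientist must first obtain data from the experiment and then compute appropriate results from those numbers.

Fisher pointed out that the first step in designing such an experiment is to set up a set of mathematical equations describing the relationship between the data to be collected and the results to be estimated. A useful experiment is therefore one that can provide estimates.

For example, an agricultural scientist who wants to know how a certain artificial fertilizer affects the growth of different varieties of potato must run an experiment that provides the data needed to estimate that effect.

If we look at a field planted with crops, we find that the soil in some areas is more fertile than in others; in some corners the crop grows tall and lush, while in others the same crop is thin and sparse. The cause may be the direction of water flow, the type of soil, the presence of unknown nutrients, some factor that suppresses the growth of weeds, or other factors not previously known.

If the agricultural scientist wants to test the difference between two fertilizer components, he can apply the two fertilizers to different parts of the same field, but then the results produced by the fertilizers are confounded with the results produced by other factors such as soil or drainage and cannot be separated. If the test is run on the same plot but in different years, the effect of the fertilizer is confounded with year-to-year changes in the weather.

If, however, we apply different fertilizers to two adjacent plants in the same year, the soil differences are minimized. Because the soil conditions of the treated plants cannot be exactly the same, some soil difference still remains. Fisher decided to treat the different rows of crops within a block at random. The farm is divided into small blocks, the crop in each block is planted in rows, and each row is treated at random. Because the treatment is random it has no fixed pattern, so the possible soil differences cancel one another and average out.

This is a method for separating the results of different treatments in a carefully designed scientific experiment. Fisher called it "Analysis of Variance," abbreviated ANOVA.

Analysis of variance was first applied to agricultural experiments and is now widely used in scientific research and decision making:
1. Choosing among ten display locations for a product in a store, for example whether one location sells better than the others.
2. Three methods a, b, c on a production line: whether the method used affects output.
3. The effect of four flavors A, B, C, D and three additives I, II, III on product sales.

All of the studies above ask whether three or more population means are equal. The ten locations, three methods, four flavors, and three additives are known, or controlled by the experimenter (researcher), and are called independent variables or factors. In the production-line study the method is the factor, the three methods a, b, c are three treatments, each treatment is regarded as a population, and the items produced in the experiment are called experimental units. Average product revenue, average production-line output, and average product sales are the response variables the experimenter (researcher) wishes to observe, called dependent variables.

Analysis of Variance
This chapter introduces analysis of variance, which is used to test the hypothesis that three or more population means are equal. According to the number of factors, it is divided into one-factor and two-factor analysis of variance.

Because analysis of variance tests the hypothesis that three or more population means are equal, the name may seem inappropriate: we are testing population means, not variances. In fact, however, the testing procedure is based on analyzing the variation in the sample data.

One-Factor Analysis of Variance
An apple juice manufacturer has launched a new concentrated juice that has three advantages over the old canned juice: 1. convenience, 2. better quality, 3. lower price. The marketing manager does not know which feature to use to advertise the new product, so he selects three very similar cities and advertises one feature (convenience, quality, or price) in each city to see whether average juice sales differ with the advertised feature.

The steps of the test are as follows: 1.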
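State the two hypotheses.
$H_0: \mu_1 = \mu_2 = \mu_3$ (the three advertised features are equally effective; the numbers of units sold are equal)
$H_A$: not all $\mu_i$ are equal (the numbers of units sold differ)
How, then, do we carry out this hypothesis test?

One-Way ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_A$: Not all $\mu_j$ are the same
[Figure: when the null hypothesis is true, $\mu_1 = \mu_2 = \mu_3$ and the three population distributions coincide; when the null hypothesis is NOT true, the $\mu_j$ are not all equal and the distributions are shifted apart.]

First, each of the three cities is observed independently for 20 weeks and the weekly sales are recorded, giving
$\bar{X}_1 = 577.55$, $\bar{X}_2 = 653.00$, $\bar{X}_3 = 608.65$.
It appears that the three advertised features sell different numbers of units. Can we conclude from this result alone that the three advertised features lead to different sales?

The answer is no, because the differences among the three means may come from random sampling error as well as from differences in advertising effect.

2. Choose the test statistic.
The total variation in the number of units sold under the different advertised features comes from two sources:
• differences between features (differences between populations)
• differences within the same feature

Differences between features (between populations): the populations attracted by convenience, quality, and price buy different numbers of units. This is called factor variation or between-treatments variation. If the null hypothesis is true, the three population means are equal; the sample means will still differ, but only slightly, so the between-treatments variation must also be small. If the three population means are not equal, the sample means differ more and the between-treatments variation is larger; that is, the difference in means comes from the populations being different.

Differences within the same feature: within one feature (population), sales differ from week to week. This is called within-treatments variation, or random variation; that is, the variation in sales arises from chance.

[Figure: two dot plots of Treatments 1, 2, and 3, each with sample means $\bar{x}_1 = 10$, $\bar{x}_2 = 15$, $\bar{x}_3 = 20$.]
A small variability within the samples makes it easier to draw a conclusion about the population means.
The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.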
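To make the contrast between between-treatments and within-treatments variation concrete, the sketch below uses two small made-up data sets (hypothetical numbers, not the values plotted in the figure). Both have treatment means of 10, 15, and 20, so the between-treatments variation is identical, while the within-treatments variation differs greatly.

```python
def between_within(groups):
    """Return (between-treatments, within-treatments) sums of squares."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return between, within

# Hypothetical samples: same treatment means (10, 15, 20), different spread
tight = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]
loose = [[1, 10, 19], [7, 15, 23], [10, 20, 30]]

tight_ss = between_within(tight)
loose_ss = between_within(loose)

print("tight samples: between = %.1f, within = %.1f" % tight_ss)
print("loose samples: between = %.1f, within = %.1f" % loose_ss)
```

With the same between-treatments variation, the tight samples give much stronger evidence that the population means differ, which is the intuition the F ratio formalizes.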
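Following the decomposition of total population variation described above, the total variation in the sample is split into variation caused by the factor (between-treatments) and random (within-treatments) variation:
$$x_{ij} - \bar{x} = (\bar{x}_j - \bar{x}) + (x_{ij} - \bar{x}_j)$$
The first term on the right is the between-treatments deviation and the second is the within-treatments deviation.

Taking sums of squares of the expression above gives
$$\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij} - \bar{x})^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(\bar{x}_j - \bar{x})^2 + \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij} - \bar{x}_j)^2$$
SST = SSB + SSE
SST (Total Sum of Squares): total variation
SSB (Sum of Squares for Treatment): variation caused by the factor
SSE (Sum of Squares for Error): random variation

If every population mean equals the overall population mean, the factor variation is zero and the random variation is unaffected. If some population mean differs from the overall population mean, the factor variation is not zero while the random variation stays the same. Analysis of variance uses the sample data to compare the sizes of these two sources of variation, in order to test whether the variation caused by the factor (SSB) is large enough to reject the null hypothesis.

If the null hypothesis holds, SSB arises from sampling error, so SSB will not be too large relative to SSE; if the null hypothesis does not hold, SSB will be larger relative to SSE. In addition, SSB and SSE are affected by the number of observations, so they cannot be compared directly; we must first compute mean squares:
MSB = SSB/(k−1)
MSE = SSE/(n−k)
where
MSB (Mean Square for Treatment): mean variation caused by the factor
MSE (Mean Square for Error): mean random variation

We can now compare the two variances MSB and MSE. When MSB is large relative to MSE, the factor affects the dependent variable, so we use MSB/MSE as the test statistic. How large must its value be before we reject $H_0$? To answer this we first need the distribution of the test statistic F = MSB/MSE.

Sampling distribution of F = MSB/MSE
Analysis of variance rests on the following assumptions:
1. The effect of the factor on the dependent variable is fixed: it is a constant, not a random variable.
2. The populations are all normally distributed.
3. Homogeneity of variance: every population has the same variance.
4. Sampling is independent simple random sampling: independent random samples are drawn from each of the k populations.

Under these four assumptions, when $H_0$ is true the sampling distribution of F = MSB/MSE is an F distribution with k−1 and n−k degrees of freedom:
$$F = \frac{MSB}{MSE} \sim F(k-1,\, n-k)$$

One-Way ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
$H_a$: At least one of the means is different from the others
$F = MSB/MSE$
If $F > F_c$, reject $H_0$. If $F \le F_c$, do not reject $H_0$.

3. Determine the decision rule.
At the chosen significance level $\alpha$, the decision rule is: if $F > F_{\alpha,k-1,n-k}$, reject $H_0$; if $F < F_{\alpha,k-1,n-k}$, do not reject $H_0$. This is a right-tailed test, because when all the $\mu_i$ equal $\mu$, E(MSB) = E(MSE) and the ratio should be close to 1, whereas E(MSB) > E(MSE) indicates that the $\mu_i$ are not all equal. A large value of MSB/MSE therefore leads to rejecting $H_0$; that is, the rejection region lies in the right tail.

4.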
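Compute the test statistic and compare it with the critical value.
5. Apply the decision rule to obtain the test result, then state the conclusion.

Example: the apple juice example
$\bar{X}_1 = 577.55$, $\bar{X}_2 = 653.00$, $\bar{X}_3 = 608.65$, grand mean $= 613.07$
SSB = 57512.23, SSE = 506967.88, k = 3, n = 60
MSB = SSB/(k−1) = 28756.12, MSE = SSE/(n−k) = 8894.17
$F = MSB/MSE = 3.23 > F_{0.05,2,57} = 3.15$, so reject $H_0$.
From this result we can conclude: "The differences in apple juice sales among the three cities come from the differences in the advertised features."

ANOVA

| Source of Variation | Sums of Squares | Degrees of Freedom | Mean Squares | F-Statistic | P-Value |
|---|---|---|---|---|---|
| Treatments | SSB | k−1 | MSB = SSB/(k−1) | F = MSB/MSE | |
| Error | SSE | n−k | MSE = SSE/(n−k) | | |
| Total | SS(Total) | n−1 | | | |

[Figure: single factor ANOVA output]

In the apple juice example, suppose that besides the three features (convenience, quality, and price) there are two advertising media in each city: newspaper and television. The marketing manager also wants to know which medium is likely to be more effective. How should the experiment be run?

The first approach is to choose six cities, observe each for 10 weeks, and record the weekly sales.
1. Convenience, television
2. Convenience, newspaper
3. Quality, television
4. Quality, newspaper
5. Price, television
6. Price, newspaper
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6$
$H_A$: not all $\mu_i$ are equal
If the result is $F = 2.45 > F_{0.05,5,54}$, reject $H_0$. Conclusion: "Apple juice sales differ among these six cities."

How can the marketing manager use this result in his marketing strategy? How can he tell whether to promote through newspaper or television, or which combination of strategies would give higher sales?

Two-Factor Analysis of Variance
The second approach is a two-factor analysis of variance. Suppose the two factors (feature, medium) have a and b treatments respectively, and each treatment combination has r replicates. Such an experiment is called a complete a×b factorial experiment; if every treatment combination has the same number r of replicates, the design is called balanced.

If the sample means differ across the treatment combinations, we want to know whether the differences are due to factor A, to factor B, or to both, and if both, whether the effects are independent or interacting.
1. Factors A and B both have an effect, but there is no interaction.
2. Factor A has an effect; factor B has none.
3. Factor B has an effect; factor A has none.
4. Factors A and B interact.

To test these four possibilities we need three F tests, which determine whether the differences in means come from the interaction, from factor A, or from factor B.

Example: the apple juice example
Factor A (feature) has three treatments: convenience, quality, price.
Factor B (medium) has two treatments: television, newspaper.
In a two-factor analysis of variance, the test for interaction must be done first, unless it is known or assumed before the experiment that the two factors do not interact.

If the test finds a significant interaction, there is no need to go on to test factor A and factor B separately. An interaction means that certain combinations of a factor A treatment and a factor B treatment produce different means. In that situation a test of factor A or factor B will usually come out significant as well, but that significance may be misleading, because the differences in the treatment means of a factor may be caused only by the interaction and not by the factor itself. So if there is an interaction, the other two tests for factors A and B are not needed.

In a two-factor analysis of variance, therefore, test the interaction first; if there is no interaction, then test factor A and factor B to see whether each affects the dependent variable.

Two-Way Factorial Design
[Figure: grid of cells defined by row treatments and column treatments]

Two-Way ANOVA
• Assumptions
– Normality
• Populations are normally distributed
– Homogeneity of Variance
• Populations have equal variances
– Independence of Errors
• Independent random samples are drawn

Two-Way ANOVA: Hypotheses
Row Effects:
Ho: Row Means are all equal.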
Ha: At least one row mean is different from the others.
Column Effects:
Ho: Column Means are all equal.
Ha: At least one column mean is different from the others.
Interaction Effects:
Ho: The interaction effects are zero.
Ha: There is an interaction effect.

Two-Way ANOVA: Total Variation Partitioning
Total Variation (SST, d.f. = N−1)
= Variation Due to Factor A (SSA, d.f. = r−1)
+ Variation Due to Factor B (SSB, d.f. = c−1)
+ Variation Due to Interaction (SSAB, d.f. = (r−1)(c−1))
+ Variation Due to Random Sampling (SSE, d.f. = rc(n−1))

Formulas for Computing a Two-Way ANOVA
$$SSR = nc\sum_{i=1}^{r}(\bar{X}_i - \bar{X})^2 \qquad df_R = r - 1$$
$$SSC = nr\sum_{j=1}^{c}(\bar{X}_j - \bar{X})^2 \qquad df_C = c - 1$$
$$SSI = n\sum_{i=1}^{r}\sum_{j=1}^{c}(\bar{X}_{ij} - \bar{X}_i - \bar{X}_j + \bar{X})^2 \qquad df_I = (r-1)(c-1)$$
$$SSE = \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n}(X_{ijk} - \bar{X}_{ij})^2 \qquad df_E = rc(n-1)$$
$$SST = \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n}(X_{ijk} - \bar{X})^2 \qquad df_T = N - 1$$
$$MSR = \frac{SSR}{r-1}, \quad MSC = \frac{SSC}{c-1}, \quad MSI = \frac{SSI}{(r-1)(c-1)}, \quad MSE = \frac{SSE}{rc(n-1)}$$
$$F_R = \frac{MSR}{MSE}, \quad F_C = \frac{MSC}{MSE}, \quad F_I = \frac{MSI}{MSE}$$
where:
n = number of observations per cell
c = number of column treatments
r = number of row treatments
i = row treatment level
j = column treatment level
k = cell member
$X_{ijk}$ = individual observation
$\bar{X}_{ij}$ = cell mean
$\bar{X}_i$ = row mean
$\bar{X}_j$ = column mean
$\bar{X}$ = grand mean

Two-Way ANOVA: The F Test Statistic
F Test for Factor A Main Effect
$H_0: \mu_{1\cdot\cdot} = \mu_{2\cdot\cdot} = \dots = \mu_{r\cdot\cdot}$
$H_1$: Not all $\mu_{i\cdot\cdot}$ are equal
$F = \dfrac{MSA}{MSE}$, $MSA = \dfrac{SSA}{r-1}$. Reject if $F > F_U$.
F Test for Factor B Main Effect
$H_0: \mu_{\cdot 1\cdot} = \mu_{\cdot 2\cdot} = \dots = \mu_{\cdot c\cdot}$
$H_1$: Not all $\mu_{\cdot j\cdot}$ are equal
$F = \dfrac{MSB}{MSE}$, $MSB = \dfrac{SSB}{c-1}$. Reject if $F > F_U$.
F Test for Interaction Effect
$H_0$: the interaction effect for cell $(i, j)$ is 0 (for all i and j)
$H_1$: the interaction effect is not 0 for some cell $(i, j)$
$F = \dfrac{MSAB}{MSE}$, $MSAB = \dfrac{SSAB}{(r-1)(c-1)}$. Reject if $F > F_U$.

Two-Way ANOVA Summary Table

| Source of Variation | Degrees of Freedom | Sum of Squares | Mean Squares | F Statistic |
|---|---|---|---|---|
| Factor A (Row) | r – 1 | SSA | MSA = SSA/(r – 1) | MSA/MSE |
| Factor B (Column) | c – 1 | SSB | MSB = SSB/(c – 1) | MSB/MSE |
| AB (Interaction) | (r – 1)(c – 1) | SSAB | MSAB = SSAB/[(r – 1)(c – 1)] | MSAB/MSE |
| Error | rc(n' – 1) | SSE | MSE = SSE/[rc(n' – 1)] | |
| Total | rcn' – 1 | SST | | |

[Figure: four plots of mean response against the levels of factor A (1, 2, 3), with one line per level of factor B. Panel titles: "Difference between the levels of factor A, and difference between the levels of factor B; no interaction"; "Difference between the levels of factor A. No difference between the levels of factor B"; "No difference between the levels of factor A. Difference between the levels of factor B"; "Interaction".]

[Figures: cell means by column (C1, C2, C3) for rows R1 and R2 in a 2 × 3 factorial design: with interaction, with some interaction, and with no interaction.]

ANOVA

| Source of Variation | Sums of Squares | Degrees of Freedom | Mean Squares | F-Statistic | P-Value |
|---|---|---|---|---|---|
| Factor A | SS(A) | a−1 | MS(A) = SS(A)/(a−1) | F = MS(A)/MSE | |
| Factor B | SS(B) | b−1 | MS(B) = SS(B)/(b−1) | F = MS(B)/MSE | |
| Interaction | SS(AB) | (a−1)(b−1) | MS(AB) = SS(AB)/[(a−1)(b−1)] | F = MS(AB)/MSE | |
| Error | SSE | n−ab | | | |
| Total | SS(Total) | n−1 | | | |

F tests for the Two-way ANOVA
• Example - continued
– Test for interaction between factors A and B
$H_0$: $\mu_{\text{TV*conv.}}$
$= \mu_{\text{TV*quality}} = \dots = \mu_{\text{newsp.*price}}$
$H_1$: At least two means differ
Interaction AB = Marketing*Media
F = MS(AB)/MSE = MS(Marketing*Media)/MSE = .087
$F_{\text{critical}} = F_{\alpha,(a-1)(b-1),n-ab} = F_{.05,(3-1)(2-1),60-(3)(2)} = 3.17$ (p-value = .917)
– At the 5% significance level there is insufficient evidence to infer that the two factors interact to affect the mean weekly sales.

F tests for the Two-way ANOVA
• Example – continued
– Test of the difference in mean sales between the three marketing strategies
$H_0: \mu_{\text{conv.}} = \mu_{\text{quality}} = \mu_{\text{price}}$
$H_1$: At least two mean sales are different
Factor A = Marketing strategies
F = MS(A)/MSE = MS(Marketing strategy)/MSE = 5.325
$F_{\text{critical}} = F_{\alpha,a-1,n-ab} = F_{.05,3-1,60-(3)(2)} = 3.17$ (p-value = .008)
– At the 5% significance level there is evidence to infer that differences in weekly sales exist among the marketing strategies.

F tests for the Two-way ANOVA
• Example - continued
– Test of the difference in mean sales between the two advertising media
$H_0: \mu_{\text{TV}} = \mu_{\text{Newspaper}}$
$H_1$: The two mean sales differ
Factor B = Advertising media
F = MS(B)/MSE = MS(Media)/MSE = 1.419
$F_{\text{critical}} = F_{\alpha,b-1,n-ab} = F_{.05,2-1,60-(3)(2)} = 4.02$ (p-value = .239)
– At the 5% significance level there is insufficient evidence to infer that differences in weekly sales exist between the two advertising media.

1. Test whether factors A and B interact:
$F = 0.087 < 3.17 = F_{0.05,2,54}$, so do not reject $H_0$. Conclusion: "There is no interaction."
2. Next test factor A (feature):
$F = 5.325 > 3.17 = F_{0.05,2,54}$, so reject $H_0$. Conclusion: "The differences in apple juice sales come from the differences in the advertised features."
3. Next test factor B (medium):
$F = 1.419 < 4.02 = F_{0.05,1,54}$, so do not reject $H_0$. Conclusion: "The advertising medium has no effect on apple juice sales."

CH14 Chi Squared Tests

1. Introduction
• Two statistical techniques are presented to analyze nominal data.
– A goodness-of-fit test for the multinomial experiment.
– A contingency table test of independence.
• Both tests use the $\chi^2$ as the sampling distribution of the test statistic.

2. Chi-Squared Goodness-of-Fit Test
• The hypothesis tested involves the probabilities $p_1, p_2, \dots, p_k$ of a multinomial distribution.
• The multinomial experiment is an extension of the binomial experiment.
– There are n independent trials.
– The outcome of each trial can be classified into one of k categories, called cells.
– The probability $p_i$ that the outcome falls into cell i remains constant for each trial. Moreover, $p_1 + p_2 + \dots + p_k = 1$.
– Trials of the experiment are independent.
• We test whether there is sufficient evidence to reject a pre-specified set of values for $p_i$.
• The hypotheses:
$H_0: p_1 = a_1, p_2 = a_2, \dots, p_k = a_k$
$H_1$: At least one $p_i \neq a_i$
• The test builds on comparing the actual frequency and the expected frequency of occurrences in all the cells.

The multinomial goodness of fit test Example
• Example 1.
– Two competing companies A and B have enjoyed a dominant position in the market. The companies conducted aggressive advertising campaigns.
– Market shares before the campaigns were:
• Company A = 45%
• Company B = 40%
• Other competitors = 15%.
• Example 1. – continued
– To study the effect of the campaign on the market shares, a survey was conducted.
– 200 customers were asked to indicate their preference regarding the product advertised.
– Survey results:
• 102 customers preferred company A's product,
• 82 customers preferred company B's product,
• 16 customers preferred the competitors' product.

The multinomial goodness of fit test Example
• Example 1.
– continued
Can we conclude at the 5% significance level that the market shares were affected by the advertising campaigns?
• Solution
– The population investigated is the brand preferences.
– The data are nominal (A, B, or other).
– This is a multinomial experiment (three categories).
– The question of interest: Are $p_1$, $p_2$, and $p_3$ different after the campaign from their values before the campaign?
• The hypotheses are:
$H_0: p_1 = .45, p_2 = .40, p_3 = .15$
$H_1$: At least one $p_i$ changed.
The expected frequency for each category (cell) if the null hypothesis is true is shown below:

| Cell | Expected frequency | Actual frequency |
|---|---|---|
| 1 | 90 = 200(.45) | 102 |
| 2 | 80 = 200(.40) | 82 |
| 3 | 30 = 200(.15) | 16 |

• The statistic is
$$\chi^2 = \sum_{i=1}^{k}\frac{(f_i - e_i)^2}{e_i} \quad \text{where } e_i = np_i$$
• The rejection region is $\chi^2 > \chi^2_{\alpha,k-1}$.
• Example 1. – continued
$$\chi^2 = \frac{(102-90)^2}{90} + \frac{(82-80)^2}{80} + \frac{(16-30)^2}{30} = 8.18$$
$\chi^2_{\alpha,k-1} = \chi^2_{.05,3-1} = 5.99147$
The p-value = 0.01679
• Example 1. – continued
[Figure: $\chi^2$ density with 2 degrees of freedom; the rejection region ($\alpha$) lies to the right of 5.99 and the p-value is the area to the right of 8.18.]
Conclusion: Since 8.18 > 5.99, there is sufficient evidence at the 5% significance level to reject the null hypothesis. At least one of the probabilities $p_i$ is different. Thus, at least two market shares have changed.

Required conditions – the rule of five
• The test statistic used to perform the test is only approximately chi-squared distributed.
• For the approximation to apply, the expected cell frequency has to be at least 5 for all the cells ($np_i \ge 5$).
• If the expected frequency in a cell is less than 5, combine it with other cells.

3. Chi-squared Test of a Contingency Table
• This test is used to test whether…
– two nominal variables are related?
– there are differences between two or more populations of a nominal variable
• To accomplish the test objectives, we need to classify the data according to two different criteria.

Contingency table $\chi^2$ test – Example
• Example 2.
– In an effort to better predict the demand for courses offered by a certain MBA program, it was hypothesized that students' academic background affects their choice of MBA major and thus their course selection.
– A random sample of last year's MBA students was selected. The following contingency table summarizes relevant data.

The observed values:

| Degree | Accounting | Finance | Marketing | Total |
|---|---|---|---|---|
| BA | 31 | 13 | 16 | 60 |
| BENG | 8 | 16 | 7 | 31 |
| BBA | 12 | 10 | 17 | 39 |
| Other | 10 | 5 | 7 | 22 |
| Total | 61 | 44 | 47 | 152 |

Contingency table $\chi^2$ test – Example
• Solution
– The hypotheses are:
$H_0$: The two variables are independent
$H_1$: The two variables are dependent
– The test statistic
$$\chi^2 = \sum_{i=1}^{k}\frac{(f_i - e_i)^2}{e_i}$$
where k is the number of cells in the contingency table. Since $e_i = np_i$ but $p_i$ is unknown, we need to estimate the unknown probability from the data, assuming $H_0$ is true.
– The rejection region is $\chi^2 > \chi^2_{\alpha,(r-1)(c-1)}$.

Estimating the expected frequencies

| MBA Major | Total | Probability |
|---|---|---|
| Accounting | 61 | 61/152 |
| Finance | 44 | 44/152 |
| Marketing | 47 | 47/152 |

| Undergraduate Degree | Total | Probability |
|---|---|---|
| BA | 60 | 60/152 |
| BENG | 31 | 31/152 |
| BBA | 39 | 39/152 |
| Other | 22 | 22/152 |

Sample size: 152
Under the null hypothesis the two variables are independent: P(Accounting and BA) = P(Accounting)*P(BA) = [61/152][60/152].
The number of students expected to fall in the cell "Accounting - BA" is
$e_{\text{Acct-BA}} = n(p_{\text{Acct-BA}}) = 152(61/152)(60/152) = [61 \cdot 60]/152 = 24.08$
The number of students expected to fall in the cell "Finance - BBA" is
$e_{\text{Finance-BBA}} = np_{\text{Finance-BBA}} = 152(44/152)(39/152) = [44 \cdot 39]/152 = 11.29$

The expected frequencies for a contingency table
• The expected frequency of the cell in row i and column j of the contingency table is calculated by
$$e_{ij} = \frac{(\text{Column } j \text{ total})(\text{Row } i \text{ total})}{\text{Sample size}}$$

Calculation of the $\chi^2$ statistic
• Solution – continued
Observed frequencies, with the expected frequency in parentheses:

| Undergraduate Degree | Accounting | Finance | Marketing | Total |
|---|---|---|---|---|
| BA | 31 (24.08) | 13 (17.37) | 16 (18.55) | 60 |
| BENG | 8 (12.44) | 16 (8.97) | 7 (9.58) | 31 |
| BBA | 12 (15.65) | 10 (11.29) | 17 (12.06) | 39 |
| Other | 10 (8.83) | 5 (6.39) | 7 (6.80) | 22 |
| Total | 61 | 44 | 47 | 152 |

$$\chi^2 = \sum_{i=1}^{k}\frac{(f_i - e_i)^2}{e_i} = \frac{(31 - 24.08)^2}{24.08} + \dots + \frac{(5 - 6.39)^2}{6.39} + \dots + \frac{(7 - 6.80)^2}{6.80} = 14.70$$

Contingency table $\chi^2$ test – Example
• Solution – continued
– The critical value in our example is:
$\chi^2_{\alpha,(r-1)(c-1)} = \chi^2_{.05,(4-1)(3-1)} = 12.5916$
• Conclusion: Since $\chi^2 = 14.702 > 12.5916$, there is sufficient evidence to infer at the 5% significance level that students' undergraduate degree and MBA students' course selection are dependent.

Required condition: Rule of five
– The $\chi^2$ distribution provides an adequate approximation to the sampling distribution under the condition that $e_{ij} \ge 5$ for all the cells.
– When $e_{ij} < 5$, rows or columns must be added together such that the condition is met.
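The whole contingency-table calculation, including the rule-of-five check, can be sketched as follows. The observed counts and the critical value 12.5916 are taken from the example; the variable names are illustrative only.

```python
# Chi-squared test of independence for the MBA major example.
# Rows: undergraduate degree; columns: Accounting, Finance, Marketing.
observed = {
    "BA":    [31, 13, 16],
    "BENG":  [8, 16, 7],
    "BBA":   [12, 10, 17],
    "Other": [10, 5, 7],
}

rows = list(observed.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
n = sum(row_totals)

# e_ij = (row i total)(column j total) / sample size
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(rows, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(rows) - 1) * (len(col_totals) - 1)

# Rule of five: every expected frequency should be at least 5
rule_of_five_ok = all(e >= 5 for exp_row in expected for e in exp_row)

CRITICAL = 12.5916  # chi-squared critical value at alpha = .05 with 6 df, from the slides
print(f"chi-squared = {chi_sq:.3f}, df = {df}, rule of five met: {rule_of_five_ok}")
print("Reject H0" if chi_sq > CRITICAL else "Do not reject H0")
```

Computing the expected frequencies from the marginal totals, rather than typing them in, avoids the small rounding differences that appear in hand-built tables.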
Example
Observed frequencies, with expected frequencies in parentheses:

| Column 1 | Column 2 | Column 3 |
|---|---|---|
| 10 (10.1) | 14 (12.8) | 4 (5.1) |
| 12 (12.7) | 16 (16.0) | 7 (6.3) |
| 8 (7.2) | 8 (9.2) | 4 (3.6) |

We combine columns 2 and 3:

| Column 1 | Columns 2 + 3 |
|---|---|
| 10 (10.1) | 14 + 4 = 18 (12.8 + 5.1 = 17.9) |
| 12 (12.7) | 16 + 7 = 23 (16 + 6.3 = 22.3) |
| 8 (7.2) | 8 + 4 = 12 (9.2 + 3.6 = 12.8) |

CH15 Regression Analysis

• Multiple regression is a dependence method with a single criterion variable. Its purpose is to understand and establish the relationship between one continuous (metric) criterion variable and a set of continuous (metric) predictor variables.
• Multiple regression can be described by the following general form:
$Y$ (continuous) $= X_1 + X_2 + \dots + X_m$ (continuous)

Example
• A distributor of frozen dessert pies wants to evaluate factors thought to influence demand
– Dependent variable: Pie sales (units per week)
– Independent variables: Price (in $), Advertising ($100's)
• Data are collected for 15 weeks

Pie Sales Example

| Week | Pie Sales | Price ($) | Advertising ($100s) |
|---|---|---|---|
| 1 | 350 | 5.50 | 3.3 |
| 2 | 460 | 7.50 | 3.3 |
| 3 | 350 | 8.00 | 3.0 |
| 4 | 430 | 8.00 | 4.5 |
| 5 | 350 | 6.80 | 3.0 |
| 6 | 380 | 7.50 | 4.0 |
| 7 | 430 | 4.50 | 3.0 |
| 8 | 470 | 6.40 | 3.7 |
| 9 | 450 | 7.00 | 3.5 |
| 10 | 490 | 5.00 | 4.0 |
| 11 | 340 | 7.20 | 3.5 |
| 12 | 300 | 7.90 | 3.2 |
| 13 | 440 | 5.90 | 4.0 |
| 14 | 450 | 5.00 | 3.5 |
| 15 | 300 | 7.00 | 2.7 |

Multiple regression analysis is used to answer the following three questions:
• Description: Can we find a linear combination that concisely describes the relationship between a set of predictor variables (X) and a criterion variable (Y)? If so, how strong is the relationship?
• Inference: Is the overall relationship statistically significant? Which predictor variables matter most in explaining the variation in the criterion variable?
• Prediction: How well does the linear combination of predictor variables predict the criterion variable?

Pie Sales Example
Coefficients (Model 1)

| | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 306.526 | 114.254 | | 2.683 | .020 |
| price | -24.975 | 10.832 | -.461 | -2.306 | .040 |
| advertising | 74.131 | 25.967 | .570 | 2.855 | .014 |

a. Dependent variable: piesales

Example: Programmer Salary Survey
A software firm collected data for a sample of 20 computer programmers. We want to determine whether salary is related to the years of experience and the score on the firm's programmer aptitude test.
The years of experience, score on the aptitude test, and corresponding annual salary ($1,000s) for the sample of 20 programmers are shown below.

| Exper. | Score | Salary | Exper. | Score | Salary |
|---|---|---|---|---|---|
| 4 | 78 | 24 | 9 | 88 | 38 |
| 7 | 100 | 43 | 2 | 73 | 26.6 |
| 1 | 86 | 23.7 | 10 | 75 | 36.2 |
| 5 | 82 | 34.3 | 5 | 81 | 31.6 |
| 8 | 86 | 35.8 | 6 | 74 | 29 |
| 10 | 84 | 38 | 8 | 87 | 34 |
| 0 | 75 | 22.2 | 4 | 79 | 30.1 |
| 1 | 80 | 23.1 | 6 | 94 | 33.9 |
| 6 | 83 | 30 | 3 | 70 | 28.2 |
| 6 | 91 | 33 | 3 | 89 | 30 |

Example: Programmer Salary Survey
Coefficients (Model 1)

| | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 3.174 | 6.156 | | .516 | .613 |
| exper | 1.404 | .199 | .741 | 7.070 | .000 |
| score | .251 | .077 | .340 | 3.243 | .005 |

a. Dependent variable: salary

MBA Program Admission Policy
• The dean of a large university wants to raise the admission standards to the popular MBA program.
• He plans to develop a method that can predict an applicant's performance in the program.
• He believes a student's success can be predicted by:
– Undergraduate GPA
– Graduate Management Admission Test (GMAT) score
– Number of years of work experience
• A random sample of students who completed the MBA was selected.

| MBA GPA | UnderGPA | GMAT | Work |
|---|---|---|---|
| 8.43 | 10.89 | 584 | 9 |
| 6.58 | 10.38 | 483 | 7 |
| 8.15 | 10.39 | 484 | 4 |
| 8.88 | 10.73 | 646 | 6 |
| . . | . . | . . | . . |

• Develop a plan to decide which applicant to admit.

MBA Program Admission Policy
Coefficients (Model 1)

| | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | .466 | 1.506 | | .310 | .758 |
| UnderGPA | .063 | .120 | .042 | .524 | .602 |
| GMAT | .011 | .001 | .650 | 8.159 | .000 |
| Work | .093 | .031 | .238 | 2.996 | .004 |

a.
Dependent variable: MBAGPA
MBA GPA = 0.466 + 0.063×UnderGPA + 0.011×GMAT + 0.093×Work

Multiple Regression Decision Process
Stage 1: Objectives of the research problem
Stage 2: Research design
Stage 3: Regression assumptions
Stage 4: Estimating the regression model
Stage 5: Assessing explanatory power
Stage 6: Validating the regression results

Objectives of the research problem
Multiple regression is a very widely used multivariate technique. The problems it can address fall into two broad classes: explanation and prediction. These two classes are not mutually exclusive; researchers can apply multiple regression to an explanation problem or a prediction problem alone, or handle both at once. The purpose of multiple regression is to establish the relationship between one criterion variable and a set of predictor variables, so the researcher must first decide which variable is the criterion variable and which are the predictor variables. The choice of criterion variable is usually determined by the research problem. The choice of predictor variables also depends on the research problem, but it should preferably have a theoretical basis, so that irrelevant or unsuitable predictors are not brought into the regression model.

La Quinta Motor Inns Example
• Where to locate a new motor inn?
– La Quinta Motor Inns is planning an expansion.
– Management wishes to predict which sites are likely to be profitable.
– Several areas where predictors of profitability can be identified are:
• Competition
• Market awareness
• Demand generators
• Demographics
• Physical quality

La Quinta Motor Inns Example
Profitability is measured by Operating Margin. The predictors, grouped by area:
• Competition: Rooms (number of hotel/motel rooms within 3 miles of the site)
• Market awareness: Nearest (distance to the nearest La Quinta inn)
• Customers: Office space, College enrollment
• Community: Income (median household income)
• Physical: Disttwn (distance to downtown)

Research design
Issues to consider:
• Sample size,
• Unique elements of the dependence relationship – can use dummy variables as independents.

Sample Size Considerations
• Simple regression can be effective with a sample size of 20, but multiple regression requires a minimum sample of 50 and preferably 100 observations for most research situations.
• The minimum ratio of observations to variables is 5 to 1, but the preferred ratio is 15 or 20 to 1, and this should increase when stepwise estimation is used.
• Maximizing the degrees of freedom improves generalizability and addresses both model parsimony and sample size concerns.
La Quinta Motor Inns Example
• Data were collected from 100 randomly selected inns that belong to La Quinta and were used to fit the following suggested model:

| Margin | Number | Nearest | Office Space | Enrollment | Income | Distance |
|---|---|---|---|---|---|---|
| 55.5 | 3203 | 4.2 | 549 | 8 | 37 | 2.7 |
| 33.8 | 2810 | 2.8 | 496 | 17.5 | 35 | 14.4 |
| 49 | 2890 | 2.4 | 254 | 20 | 35 | 2.6 |
| 31.9 | 3422 | 3.3 | 434 | 15.5 | 38 | 12.1 |
| 57.4 | 2687 | 0.9 | 678 | 15.5 | 42 | 6.9 |
| 49 | 3759 | 2.9 | 635 | 19 | 33 | 10.8 |

Variable Transformations
• Nonmetric variables can only be included in a regression analysis by creating dummy variables.
• Dummy variables can only be interpreted in relation to their reference category.
• Adding an additional polynomial term represents another inflection point in the curvilinear relationship.
• Quadratic and cubic polynomials are generally sufficient to represent most curvilinear relationships.

Checking the regression assumptions
After the estimated regression model has been obtained, the next step is to check whether the criterion variable, the predictor variables, and the overall regression relationship satisfy the assumptions of multiple regression. If serious violations are found, the necessary corrective action should be taken and the regression model re-estimated. The four basic assumptions of multiple regression are
1. linearity,
2. homoscedasticity (equal variances),
3. independence, and
4. normality.

Checking the regression assumptions
The multiple regression model rests on four basic assumptions, and a regression we build must satisfy all four to count as a valid and appropriate model. The four assumptions are:
1. A linear relationship between the criterion variable and the predictor variables.
2. Equal variance of the error terms.
3. Independence of the error terms.
4. Normality of the error term distribution.
To check whether a multiple regression model satisfies these assumptions, examine the shape of the scatter plot of the residuals.

Residuals Plots
• Histogram of standardized residuals – enables you to determine if the errors are normally distributed.
• Normal probability plot – enables you to determine if the errors are normally distributed. It compares the observed (sample) standardized residuals against the expected standardized residuals from a normal distribution.
• Scatter plot of residuals – can be used to test regression assumptions. It compares the standardized predicted values of the dependent variable against the standardized residuals from the regression equation. If the plot exhibits a random pattern then this indicates no identifiable violations of the assumptions underlying regression analysis.
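A minimal sketch of the quantities behind the residual scatter plot, using the 15-week pie sales data from earlier in this chapter. It fits the regression by solving the normal equations in plain Python. Two choices here are assumptions, not something the slides specify: the standardized residual is taken as the residual divided by the standard error of the estimate, and the standardized predicted value as the z-score of the fitted value. The resulting coefficients can be compared against the coefficient table for the pie sales example.

```python
import math

sales = [350, 460, 350, 430, 350, 380, 430, 470, 450, 490, 340, 300, 440, 450, 300]
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40, 7.00, 5.00,
         7.20, 7.90, 5.90, 5.00, 7.00]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7, 3.5, 4.0,
               3.5, 3.2, 4.0, 3.5, 2.7]

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Design matrix with an intercept column, then the normal equations X'X b = X'y
X = [[1.0, p, a] for p, a in zip(price, advertising)]
k = len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * y for row, y in zip(X, sales)) for i in range(k)]
coef = solve(xtx, xty)

fitted = [sum(c * x for c, x in zip(coef, row)) for row in X]
residuals = [y - f for y, f in zip(sales, fitted)]

# Standard error of the estimate: sqrt(SSE / (n - number of coefficients))
see = math.sqrt(sum(e * e for e in residuals) / (len(sales) - k))
mean_fit = sum(fitted) / len(fitted)
sd_fit = math.sqrt(sum((f - mean_fit) ** 2 for f in fitted) / (len(fitted) - 1))

z_pred = [(f - mean_fit) / sd_fit for f in fitted]   # horizontal axis of the plot
z_resid = [e / see for e in residuals]               # vertical axis of the plot

print("coefficients (constant, price, advertising):", [round(c, 3) for c in coef])
for zp, zr in zip(z_pred, z_resid):
    print(f"{zp:7.3f} {zr:7.3f}")
```

Plotting `z_resid` against `z_pred` and looking for curvature, funnels, or other patterns is the visual check the slides describe; a patternless cloud suggests no identifiable violation.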
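(1) Linearity
The criterion variable (Y) and the predictor variables (X) should be linearly related. This linear relationship can be seen in the scatter of the residuals. Plot the residual ($e_i = Y_i - \hat{Y}_i$, the difference between the observed value and the estimated value) on the vertical axis against the estimated value ($\hat{Y}$) on the horizontal axis. If the plot shows a curved shape, a nonlinear relationship exists between Y and X, and a data transformation can be used to make the relationship linear. A multiple regression model has two or more predictor variables, and the error term reflects the combined effect of all the predictors; it cannot separate the individual effect of each one. To find out whether the relationship is linear, examine the residual scatter plot: if its shape is curved, as shown in the figure, a curvilinear relationship may be present.

(2) Equal variance of the error terms
The second assumption of the multiple regression model is that the error terms have equal variance. Violating this assumption is called heteroscedasticity. To find out whether the error variance is constant, examine the residual scatter plot or use a simple statistical test. If the scatter plot is triangular or diamond shaped, as shown in the figure, heteroscedasticity may be present. If it is, data transformations can again be used to improve the situation.

(3) Independence of the error terms
Another basic assumption of multiple regression is that each predicted value is independent, that is, unrelated to any other predicted value. Independence of the error terms can also be judged from the shape of the residual scatter, as shown in the figure. Data transformations, such as first differences in a time series model, additional indicator variables, or specially designed regression models, can be used when this assumption is not met.

(4) Normality of the error term distribution
The multiple regression model assumes that the predictor variables and the criterion variable are normally distributed. The simplest check is to look at a histogram of the residuals, as shown in the figure; if the histogram is close to a normal distribution, the assumption is usually taken to be met. Although simple, this method does not work for small samples, because the histogram is then not meaningful. In that case use a normal probability plot, which compares the standardized residuals with a normal distribution. When the normality assumption is violated, many data transformations are available to deal with it.

Estimating the regression model
In Stage 4, the researcher must accomplish three basic tasks:
1. Select a method for specifying the regression model to be estimated,
2. Assess the statistical significance of the overall model in predicting the dependent variable, and
3. Determine whether any of the observations exert an undue influence on the results.

Variable Selection Approaches:
• Confirmatory (Simultaneous)
• Sequential Search Methods:
– Stepwise (variables not removed once included in regression equation).
– Forward Inclusion & Backward Elimination.
– Hierarchical.
• Combinatorial (All-Possible-Subsets)

Regression Analysis Terms
• Explained variance = $R^2$ (coefficient of determination).
• Standard Error of the Estimate (SEE) = a measure of the accuracy of the regression predictions. It estimates the variation of the dependent variable values around the regression line. It should get smaller as we add more independent variables, if they predict well.
• Unexplained variance = residuals (error).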
• Adjusted R-Square = reduces the R² by taking into account the sample size and the number of independent variables in the regression model (it becomes smaller as we have fewer observations per independent variable).

Regression Analysis Terms

Model Summary
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      .681a  .464      .445               .78794
a. Predictors: (Constant), Work, UnderGPA, GMAT

ANOVA(b)
Model 1      Sum of Squares  df  Mean Square  F       Sig.
Regression   45.597          3   15.199       24.481  .000a
Residual     52.772          85  .621
Total        98.369          88
a. Predictors: (Constant), Work, UnderGPA, GMAT
b. Dependent variable: MBAGPA

Coefficients(a)
Model 1      B      Std. Error  Beta   t      Sig.
(Constant)   .466   1.506              .310   .758
UnderGPA     .063   .120        .042   .524   .602
GMAT         .011   .001        .650   8.159  .000
Work         .093   .031        .238   2.996  .004
(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.)
a. Dependent variable: MBAGPA

Regression Analysis Terms Continued . . .
• Total Sum of Squares (SST) = total amount of variation that exists to be explained by the independent variables. SST = SSE + SSR.
• Sum of Squared Errors (SSE) = the variation in the dependent variable not accounted for by the regression model (the residual). The objective is to obtain the smallest possible sum of squared errors as a measure of prediction accuracy.
• Sum of Squares Regression (SSR) = the amount of improvement in explanation of the dependent variable attributable to the independent variables.

Assessing Multicollinearity:
The researcher’s task is to:
• Assess the degree of multicollinearity,
• Determine its impact on the results, and
• Apply the necessary remedies if needed.

Multicollinearity Diagnostics:
• Variance Inflation Factor (VIF) – measures how much the variance of the regression coefficients is inflated by multicollinearity problems. The smallest possible VIF is 1: if VIF equals 1, there is no correlation between the independent measures. A VIF somewhat above 1 is an indication of some association between predictor variables, but generally not enough to cause problems. A maximum acceptable VIF value would be 10; anything higher would indicate a problem with multicollinearity.
• Tolerance – the amount of variance in an independent variable that is not explained by the other independent variables. If the other variables explain a lot of the variance of a particular independent variable, we have a problem with multicollinearity. Thus, small values for tolerance indicate problems of multicollinearity. The minimum cutoff value for tolerance is typically .10. That is, the tolerance value must be smaller than .10 to indicate a problem of multicollinearity.

Assessing Explanatory Power
• Coefficient of Determination
• Regression Coefficients
• Variables Entered
• Multicollinearity?

Validating the Regression Results
After the best regression model has been identified, the final step is to validate the regression results so that the model can represent the population. The best approach is to draw a new sample from the same population. There are then two ways to assess the validity of the original model: use the original model to predict the values in the new sample and compute the predictive fit, or estimate a separate regression model from the new sample and compare the original and new models on certain characteristics (such as the significant variables included; the sign, size, and relative importance of the variables; and predictive accuracy).

Validating the Regression Results
Researchers are often constrained by cost, time pressure, or other factors and cannot collect new data. In that case, the sample can be split into an estimation subsample and a validation subsample: the regression model is estimated on the estimation subsample and then tested or validated on the validation subsample.

Example. Where to locate a new motor inn?
– La Quinta Motor Inns is planning an expansion.
– Management wishes to predict which sites are likely to be profitable.
– Several areas where predictors of profitability can be identified are:
  • Competition
  • Market awareness
  • Demand generators
  • Demographics
  • Physical quality

Example

Factor            Variable            Description
Profitability     Margin              Operating margin.
Competition       Rooms               Number of hotel/motel rooms within 3 miles from the site.
Market awareness  Nearest             Distance to the nearest La Quinta inn.
Customers         Office space
Customers         College enrollment
Community         Income              Median household income.
Physical          Disttwn             Distance to downtown.
Example
• Data were collected from 100 randomly selected inns belonging to La Quinta and fitted to the following suggested model:

$\text{Margin} = \beta_0 + \beta_1 \text{Rooms} + \beta_2 \text{Nearest} + \beta_3 \text{Office} + \beta_4 \text{College} + \beta_5 \text{Income} + \beta_6 \text{Disttwn}$

Margin  Number  Nearest  Office Space  Enrollment  Income  Distance
55.5    3203    4.2      549           8           37      2.7
33.8    2810    2.8      496           17.5        35      14.4
49      2890    2.4      254           20          35      2.6
31.9    3422    3.3      434           15.5        38      12.1
57.4    2687    0.9      678           15.5        42      6.9
49      3759    2.9      635           19          33      10.8

Model Diagnostics
[Two slides of diagnostic plots; figures not reproduced]

Regression Analysis
Margin = 38.139 − 0.008 Number + 1.646 Nearest + 0.020 Office Space + 0.212 Enrollment + 0.413 Income − 0.225 Distance

Model Assessment
• The model is assessed using three tools:
  – The standard error of estimate
  – The coefficient of determination
  – The F-test of the analysis of variance
• The standard error of estimate is also used in constructing the other tools.

Standard Error of Estimate
• The standard deviation of the error is estimated by the standard error of estimate:

$s_\varepsilon = \sqrt{\dfrac{SSE}{n - k - 1}}$

• The magnitude of $s_\varepsilon$ is judged by comparing it to $\bar{y}$.

Standard Error of Estimate
• From the printout, $s_\varepsilon$ = 5.5121.
• Calculating the mean value of y, we have $\bar{y}$ = 45.739.
• It seems $s_\varepsilon$ is not particularly small.
• Question: Can we conclude the model does not fit the data well?

Coefficient of Determination
• The definition is

$R^2 = 1 - \dfrac{SSE}{\sum (y_i - \bar{y})^2}$

• From the printout, R² = 0.5251.
• 52.51% of the variation in operating margin is explained by the six independent variables. 47.49% remains unexplained.
• When adjusted for degrees of freedom, Adjusted R² = 1 − [SSE/(n − k − 1)] / [SS(Total)/(n − 1)] = 49.4%.

Testing the Validity of the Model
• We pose the question: Is there at least one independent variable linearly related to the dependent variable?
• To answer the question we test the hypotheses
  H0: β1 = β2 = … = βk = 0
  H1: At least one βi is not equal to zero.
• If at least one βi is not equal to zero, the model has some validity.
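The three assessment tools can be checked by hand from the sums of squares. The sketch below is an illustration rather than the printout's own calculation: it takes SSR = 3123.8 and SSE = 2825.6 from the La Quinta ANOVA table shown later in this deck, with n = 100 and k = 6, and the variable names are chosen here for readability.

```python
import math

# Values from the La Quinta ANOVA printout in this deck.
n, k = 100, 6
ssr, sse = 3123.8, 2825.6
sst = ssr + sse  # total variation; the printout shows 5949.5 after rounding

# Standard error of estimate: sqrt(SSE / (n - k - 1)).
se = math.sqrt(sse / (n - k - 1))

# Coefficient of determination and its degrees-of-freedom adjusted version.
r2 = 1 - sse / sst
adj_r2 = 1 - (sse / (n - k - 1)) / (sst / (n - 1))

# F statistic for overall validity: MSR / MSE.
f_stat = (ssr / k) / (sse / (n - k - 1))

print(f"se={se:.4f}  R2={r2:.4f}  adjR2={adj_r2:.4f}  F={f_stat:.2f}")
```

The results agree with the printed values of 5.5121, 0.5251, 49.4% and 17.14 up to rounding.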
Testing the Validity of the La Quinta Inns Regression Model
• The hypotheses are tested by an ANOVA procedure (F = MSR/MSE).

ANOVA       df              SS      MS     F      Significance F
Regression  k = 6           3123.8  520.6  17.14  0.0000
Residual    n − k − 1 = 93  2825.6  30.4
Total       n − 1 = 99      5949.5

SSR = 3123.8, SSE = 2825.6, MSR = SSR/k, MSE = SSE/(n − k − 1)

Testing the Validity of the La Quinta Inns Regression Model
[Variation in y] = SSR + SSE. A large F results from a large SSR. Then much of the variation in y is explained by the regression model; the model is useful, and thus the null hypothesis should be rejected. Therefore, the rejection region is

$F = \dfrac{SSR / k}{SSE / (n - k - 1)}$, rejection region $F > F_{\alpha,\,k,\,n-k-1}$

Testing the Validity of the La Quinta Inns Regression Model

ANOVA       df  SS      MS     F      Significance F
Regression  6   3123.8  520.6  17.14  0.0000
Residual    93  2825.6  30.4
Total       99  5949.5

$F_{\alpha,\,k,\,n-k-1} = F_{0.05,\,6,\,100-6-1} = 2.17$
F = 17.136 > 2.17
Also, the p-value (Significance F) = 0.0000. Reject the null hypothesis.
Conclusion: There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. At least one of the βi is not equal to zero. Thus, at least one independent variable is linearly related to y. This linear regression model is valid.

Interpreting the Coefficients
• b0 = 38.139. This is the intercept, the value of y when all the variables take the value zero. Since the data range of the independent variables does not cover the value zero, do not interpret the intercept.
• b1 = −0.008. In this model, for each additional room within 3 miles of the La Quinta inn, the operating margin decreases on average by .008% (assuming the other variables are held constant).

Interpreting the Coefficients
• b2 = 1.646. In this model, for each additional mile between the nearest competitor and a La Quinta inn, the operating margin increases on average by 1.646% when the other variables are held constant.
• b3 = 0.020.
For each additional 1000 sq-ft of office space, the operating margin will increase on average by .02% when the other variables are held constant.
• b4 = 0.212. For each additional thousand students, the operating margin increases on average by .212% when the other variables are held constant.

Interpreting the Coefficients
• b5 = 0.413. For each additional $1000 increase in median household income, the operating margin increases on average by .413% when the other variables remain constant.
• b6 = −0.225. For each additional mile to the downtown center, the operating margin decreases on average by .225% when the other variables are held constant.

Testing the Coefficients
• The hypotheses for each βi are
  H0: βi = 0
  H1: βi ≠ 0
• Test statistic

$t = \dfrac{b_i - \beta_i}{s_{b_i}}$, d.f. = n − k − 1

• Excel printout

              Coefficients  Standard Error  t Stat  P-value
Intercept     38.14         6.99            5.45    0.0000
Number        -0.0076       0.0013          -6.07   0.0000
Nearest       1.65          0.63            2.60    0.0108
Office Space  0.020         0.0034          5.80    0.0000
Enrollment    0.21          0.13            1.59    0.1159
Income        0.41          0.14            2.96    0.0039
Distance      -0.23         0.18            -1.26   0.2107

Using the Linear Regression Equation
• The model can be used for making predictions by
  – Producing a prediction interval estimate for a particular value of y, for given values of xi.
  – Producing a confidence interval estimate for the expected value of y, for given values of xi.
• The model can be used to learn about relationships between the independent variables xi and the dependent variable y, by interpreting the coefficients bi.

La Quinta Inns, Predictions
• Predict the average operating margin of an inn at a site with the following characteristics:
  – 3815 rooms within 3 miles,
  – Closest competitor .9 miles away,
  – 476,000 sq-ft of office space,
  – 24,500 college students,
  – $35,000 median household income,
  – 11.2 miles distance to downtown center.
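The point prediction for this site is the regression equation evaluated at the listed characteristics. A minimal sketch follows, assuming the coefficients as printed in the Excel printout above (not the three-decimal values in the fitted equation) and the units implied by the coefficient interpretations: office space in thousands of sq-ft, enrollment in thousands of students, income in thousands of dollars. The dictionary names are chosen here for illustration.

```python
# Coefficients as printed in the Excel printout.
coefficients = {
    "Intercept": 38.14,
    "Number": -0.0076,
    "Nearest": 1.65,
    "Office Space": 0.020,
    "Enrollment": 0.21,
    "Income": 0.41,
    "Distance": -0.23,
}

# Site characteristics, in the units the model was fitted with.
site = {
    "Number": 3815,       # rooms within 3 miles
    "Nearest": 0.9,       # miles to the closest competitor
    "Office Space": 476,  # thousands of sq-ft
    "Enrollment": 24.5,   # thousands of students
    "Income": 35,         # thousands of dollars
    "Distance": 11.2,     # miles to downtown
}

predicted_margin = coefficients["Intercept"] + sum(
    coefficients[name] * value for name, value in site.items()
)
print(round(predicted_margin, 2))
```

The result is sensitive to how the Number coefficient is rounded, because it multiplies a value in the thousands.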
MARGIN = 38.139 − 0.008(3815) + 1.646(.9) + 0.020(476) + 0.212(24.5) + 0.413(35) − 0.225(11.2) = 37.1%
(The coefficients as rounded here give about 35.7%; the reported 37.1% is consistent with using the unrounded printout value of about −0.0076 for the Number coefficient.)

MBA Program Admission Policy
• The dean of a large university wants to raise the admission standards to the popular MBA program.
• She plans to develop a method that can predict an applicant’s performance in the program.
• She believes a student’s success can be predicted by:
  – Undergraduate GPA
  – Graduate Management Admission Test (GMAT) score
  – Number of years of work experience

MBA Program Admission Policy
• A random sample of students who completed the MBA was selected.

MBA GPA  UnderGPA  GMAT  Work
8.43     10.89     584   9
6.58     10.38     483   7
8.15     10.39     484   4
8.88     10.73     646   6
.        .         .     .

• Develop a plan to decide which applicant to admit.

MBA Program Admission Policy
• Solution
  – The model to estimate is:
    $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
    y = MBA GPA
    x1 = undergraduate GPA [UnderGPA]
    x2 = GMAT score [GMAT]
    x3 = years of work experience [Work]
  – The estimated model:
    MBA GPA = b0 + b1 UnderGPA + b2 GMAT + b3 Work

Regression Diagnostics
• The conditions required for the model assessment to apply must be checked.
  – Is the error variable normally distributed? Draw a histogram of the residuals.
  – Is the error variance constant? Plot the residuals versus ŷ.
  – Are the errors independent? Plot the residuals versus the time periods.
  – Is multicollinearity (intercorrelation) a problem?

Model Diagnostics
[Two slides of diagnostic plots; figures not reproduced]

Model Diagnostics

Coefficients(a)
Model 1      B      Std. Error  Beta   t      Sig.   Tolerance  VIF
(Constant)   .466   1.506              .310   .758
UnderGPA     .063   .120        .042   .524   .602   .998       1.002
GMAT         .011   .001        .650   8.159  .000   .996       1.004
Work         .093   .031        .238   2.996  .004   .998       1.002
(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Tolerance and VIF are collinearity statistics.)
a. Dependent variable: MBAGPA

MBA Program Admission Policy – Model Assessment

SUMMARY OUTPUT
Regression Statistics
Multiple R         0.6808
R Square           0.4635
Adjusted R Square  0.4446
Standard Error     0.788
Observations       89

• 46.35% of the variation in MBA GPA is explained by the model.
• The model is valid (p-value = 0.0000…)
• GMAT score and years of work experience are linearly related to MBA GPA.
• There is insufficient evidence of a linear relationship between undergraduate GPA and MBA GPA.

ANOVA       df  SS     MS     F      Significance F
Regression  3   45.60  15.20  24.48  0.0000
Residual    85  52.77  0.62
Total       88  98.37

           Coefficients  Standard Error  t Stat  P-value
Intercept  0.466         1.506           0.31    0.7576
UnderGPA   0.063         0.120           0.52    0.6017
GMAT       0.011         0.001           8.16    0.0000
Work       0.093         0.031           3.00    0.0036

Example: Programmer Salary Survey
A software firm collected data for a sample of 20 computer programmers. A suggestion was made that regression analysis could be used to determine if salary was related to the years of experience and the score on the firm’s programmer aptitude test. The years of experience, score on the aptitude test, and corresponding annual salary ($1,000s) for the 20 programmers are shown on the next slide.

Example: Programmer Salary Survey

Exper.  Score  Salary    Exper.  Score  Salary
4       78     24        9       88     38
7       100    43        2       73     26.6
1       86     23.7      10      75     36.2
5       82     34.3      5       81     31.6
8       86     35.8      6       74     29
10      84     38        8       87     34
0       75     22.2      4       79     30.1
1       80     23.1      6       94     33.9
6       83     30        3       70     28.2
6       91     33        3       89     30

Example: Programmer Salary Survey
• Multiple Regression Model
Suppose we believe that salary (y) is related to the years of experience (x1) and the score on the programmer aptitude test (x2) by the following regression model:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
where
y = annual salary ($1,000)
x1 = years of experience
x2 = score on programmer aptitude test

Example: Programmer Salary Survey
• Multiple Regression Equation
Using the assumption E(ε) = 0, we obtain
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
• Estimated Regression Equation
b0, b1, b2 are the least squares estimates of β0, β1, β2. Thus
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$

Example: Programmer Salary Survey
• Solving for the Estimates of β0, β1, β2
[Diagram: input data feed a computer package, which returns the least squares output]
Input data (x1, x2, y): 4 78 24; 7 100 43; . . . ;
3 89 30 → computer package for solving multiple regression problems → least squares output: b0, b1, b2, R², etc.

Model Diagnostics
[Two slides of diagnostic plots; figures not reproduced]

Example: Programmer Salary Survey
• Computer Output
The regression is Salary = 3.17 + 1.40 Exper + 0.251 Score

Predictor  Coef    Stdev   t-ratio  p
Constant   3.174   6.156   .52      .613
Exper      1.4039  .1986   7.07     .000
Score      .25089  .07735  3.24     .005

s = 2.419   R-sq = 83.4%   R-sq(adj) = 81.5%

Example: Programmer Salary Survey
• Computer Output (continued)
Analysis of Variance

SOURCE      DF  SS      MS      F      P
Regression  2   500.33  250.16  42.76  0.000
Error       17  99.46   5.85
Total       19  599.79

Example: Programmer Salary Survey
• F Test
  – Hypotheses
    H0: β1 = β2 = 0
    Ha: One or both of the parameters is not equal to zero.
  – Rejection Rule
    For α = .05 and d.f. = 2, 17: F.05 = 3.59. Reject H0 if F > 3.59.
  – Test Statistic
    F = MSR/MSE = 250.16/5.85 = 42.76
  – Conclusion
    We can reject H0.

Example: Programmer Salary Survey
• t Test for Significance of Individual Parameters
  – Hypotheses
    H0: βi = 0
    Ha: βi ≠ 0
  – Rejection Rule
    For α = .05 and d.f. = 17, t.025 = 2.11. Reject H0 if |t| > 2.11.
  – Test Statistics
    $\dfrac{b_1}{s_{b_1}} = \dfrac{1.4039}{.1986} = 7.07 \qquad \dfrac{b_2}{s_{b_2}} = \dfrac{.25089}{.07735} = 3.24$
  – Conclusions
    Reject H0: β1 = 0. Reject H0: β2 = 0.

Qualitative Independent Variables
• In many situations we must work with qualitative independent variables such as gender (male, female), method of payment (cash, check, credit card), etc.
• For example, x2 might represent gender where x2 = 0 indicates male and x2 = 1 indicates female.
• In this case, x2 is called a dummy or indicator variable.
• If a qualitative variable has k levels, k − 1 dummy variables are required, with each dummy variable being coded as 0 or 1.
• For example, a variable with levels A, B, and C would be represented by x1 and x2 values of (0, 0), (1, 0), and (0, 1), respectively.
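The k − 1 coding rule above can be sketched in a few lines. This is a minimal illustration, assuming the first level listed is the reference category (all dummies zero); the function name `dummy_code` is hypothetical and not taken from any package.

```python
def dummy_code(values, levels):
    """Code a qualitative variable with k levels as k - 1 zero/one dummy variables.

    The first entry of `levels` is treated as the reference category.
    """
    reference, *others = levels
    coded = []
    for v in values:
        if v not in levels:
            raise ValueError(f"unknown level: {v!r}")
        coded.append(tuple(1 if v == level else 0 for level in others))
    return coded

# Levels A, B, C map to (0, 0), (1, 0), (0, 1), as in the slide.
codes = dummy_code(["A", "B", "C", "B"], levels=["A", "B", "C"])
print(codes)
```

Each dummy coefficient is then read as the difference from the reference category, holding the other variables constant.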
Example: Programmer Salary Survey (B)
As an extension of the problem involving the computer programmer salary survey, suppose that management also believes that the annual salary is related to whether or not the individual has a graduate degree in computer science or information systems. The years of experience, the score on the programmer aptitude test, whether or not the individual has a relevant graduate degree, and the annual salary ($1,000) for each of the sampled 20 programmers are shown on the next slide.

Example: Programmer Salary Survey (B)

Exp.  Score  Degr.  Salary    Exp.  Score  Degr.  Salary
4     78     No     24        9     88     Yes    38
7     100    Yes    43        2     73     No     26.6
1     86     No     23.7      10    75     Yes    36.2
5     82     Yes    34.3      5     81     No     31.6
8     86     Yes    35.8      6     74     No     29
10    84     Yes    38        8     87     Yes    34
0     75     No     22.2      4     79     No     30.1
1     80     No     23.1      6     94     Yes    33.9
6     83     No     30        3     70     No     28.2
6     91     Yes    33        3     89     No     30

Example: Programmer Salary Survey (B)
• Multiple Regression Equation
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$
• Estimated Regression Equation
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$
where
y = annual salary ($1,000)
x1 = years of experience
x2 = score on programmer aptitude test
x3 = 0 if individual does not have a grad. degree, 1 if individual does have a grad. degree
Note: x3 is referred to as a dummy variable.
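To make the estimation step concrete, here is a minimal sketch of a least squares fit for this model using the 20 observations in the table above, with Degr. coded as 0 for No and 1 for Yes. It solves the normal equations (X'X)b = X'y with plain Gaussian elimination, which is an illustrative choice and not necessarily what the computer package in the deck does. If the data are transcribed correctly, the estimates should come out close to the computer output reported for this example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

# Data from the Programmer Salary Survey (B) table; degree: 0 = No, 1 = Yes.
exper = [4, 7, 1, 5, 8, 10, 0, 1, 6, 6, 9, 2, 10, 5, 6, 8, 4, 6, 3, 3]
score = [78, 100, 86, 82, 86, 84, 75, 80, 83, 91,
         88, 73, 75, 81, 74, 87, 79, 94, 70, 89]
degree = [0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0]
salary = [24, 43, 23.7, 34.3, 35.8, 38, 22.2, 23.1, 30, 33,
          38, 26.6, 36.2, 31.6, 29, 34, 30.1, 33.9, 28.2, 30]

# Design matrix with a leading column of ones for the intercept.
X = [[1.0, e, s, d] for e, s, d in zip(exper, score, degree)]
p = len(X[0])

# Normal equations: (X'X) b = X'y.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
xty = [sum(row[i] * y for row, y in zip(X, salary)) for i in range(p)]
b = solve(xtx, xty)

fitted = [sum(bi * xi for bi, xi in zip(b, row)) for row in X]
residuals = [y - f for y, f in zip(salary, fitted)]
mean_y = sum(salary) / len(salary)
sse = sum(e * e for e in residuals)
sst = sum((y - mean_y) ** 2 for y in salary)
r2 = 1 - sse / sst

print([round(v, 4) for v in b], round(r2, 3))
```

Solving the normal equations directly is fine for a problem this small; statistical packages generally use more numerically stable decompositions.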
Model Diagnostics
[Two slides of diagnostic plots; figures not reproduced]

Example: Programmer Salary Survey (B)
• Computer Output
The regression is Salary = 7.95 + 1.15 Exp + 0.197 Score + 2.28 Deg

Predictor  Coef    Stdev  t-ratio  p
Constant   7.945   7.381  1.08     .298
Exp        1.1476  .2976  3.86     .001
Score      .19694  .0899  2.19     .044
Deg        2.280   1.987  1.15     .268

s = 2.396   R-sq = 84.7%   R-sq(adj) = 81.8%

Example: Programmer Salary Survey (B)
• Computer Output (continued)
Analysis of Variance

SOURCE      DF  SS      MS      F      P
Regression  3   507.90  169.30  29.48  0.000
Error       16  91.89   5.74
Total       19  599.79

Diagnostics: Multicollinearity
• Example: Predicting house price
  – A real estate agent believes that a house selling price can be predicted using the house size, number of bedrooms, and lot size.
  – A random sample of 100 houses was drawn and data recorded.

Price   Bedrooms  H Size  Lot Size
124100  3         1290    3900
218300  4         2080    6600
117800  3         1250    3750
.       .         .       .

  – Analyze the relationship among the four variables.

Model Diagnostics
[Two slides of diagnostic plots; figures not reproduced]

Diagnostics: Multicollinearity
• The proposed model is
PRICE = β0 + β1 BEDROOMS + β2 H-SIZE + β3 LOTSIZE + ε

SUMMARY OUTPUT
Regression Statistics
Multiple R         0.7483
R Square           0.5600
Adjusted R Square  0.5462
Standard Error     25023
Observations       100

ANOVA       df  SS            MS           F      Significance F
Regression  3   76501718347   25500572782  40.73  0.0000
Residual    96  60109046053   626135896
Total       99  136610764400

            Coefficients  Standard Error  t Stat  P-value
Intercept   37718         14177           2.66    0.0091
Bedrooms    2306          6994            0.33    0.7423
House Size  74.30         52.98           1.40    0.1640
Lot Size    -4.36         17.02           -0.26   0.7982

The model is valid, but no variable is significantly related to the selling price?!

Diagnostics: Multicollinearity
• Multicollinearity is found to be a problem.

Correlations  Price   Bedrooms  H Size  Lot Size
Price         1
Bedrooms      0.6454  1
H Size        0.7478  0.8465    1
Lot Size      0.7409  0.8374    0.9936  1

• Multicollinearity causes two kinds of difficulties:
  – The t statistics appear to be too small.
  – The b coefficients cannot be interpreted as “slopes”.
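The correlation matrix above is enough to quantify the problem. As a minimal sketch, the VIF of each predictor equals the corresponding diagonal element of the inverse of the predictors' correlation matrix, and tolerance is its reciprocal. The code below uses the three predictor correlations from the slide. Because those correlations are rounded to four decimals and the matrix is nearly singular, the results are only approximate, and the `invert` helper is written here for illustration.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Correlations among the predictors, taken from the slide.
names = ["Bedrooms", "H Size", "Lot Size"]
corr = [
    [1.0, 0.8465, 0.8374],
    [0.8465, 1.0, 0.9936],
    [0.8374, 0.9936, 1.0],
]

inverse = invert(corr)
vif = {name: inverse[i][i] for i, name in enumerate(names)}
tolerance = {name: 1.0 / value for name, value in vif.items()}

for name in names:
    print(f"{name}: VIF = {vif[name]:.2f}, tolerance = {tolerance[name]:.3f}")
```

House size and lot size come out far above the usual VIF cutoff of 10, which is consistent with their 0.9936 correlation.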
Model Diagnostics

Coefficients(a)
Model 1      B          Std. Error  Beta   t      Sig.  Tolerance  VIF
(Constant)   37717.595  14176.742          2.661  .009
Bedrooms     2306.081   6994.192    .042   .330   .742  .282       3.540
HouseSize    74.297     52.979      .865   1.402  .164  .012       83.067
Lotsize      -4.364     17.024      -.154  -.256  .798  .013       78.841
(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Tolerance and VIF are collinearity statistics.)
a. Dependent variable: Price

Durbin-Watson Test: Are the Errors Autocorrelated?
• This test detects first order autocorrelation between consecutive residuals in a time series.
• If autocorrelation exists, the error variables are not independent.

$d = \dfrac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$, where $e_i$ is the residual at time i.

The range of d is 0 ≤ d ≤ 4.

Positive First Order Autocorrelation
[Figure: residuals plotted against time]
Positive first order autocorrelation occurs when consecutive residuals tend to be similar. Then the value of d is small (less than 2).

Negative First Order Autocorrelation
[Figure: residuals plotted against time]
Negative first order autocorrelation occurs when consecutive residuals tend to differ markedly. Then the value of d is large (greater than 2).

One-Tail Test for Positive First Order Autocorrelation
• If d < dL, there is enough evidence to show that positive first order correlation exists.
• If d > dU, there is not enough evidence to show that positive first order correlation exists.
• If d is between dL and dU, the test is inconclusive.
[Figure: number line for d; positive first order correlation exists below dL, the test is inconclusive between dL and dU, positive first order correlation does not exist above dU]

One-Tail Test for Negative First Order Autocorrelation
• If d > 4 − dL, negative first order correlation exists.
• If d < 4 − dU, negative first order correlation does not exist.
• If d falls between 4 − dU and 4 − dL, the test is inconclusive.
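The statistic and the two one-tail rules above translate directly into code. The following is a minimal sketch: the function names and the two sample residual series are hypothetical, and the critical values dL and dU still have to be looked up in a Durbin-Watson table for the given n and k.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def positive_autocorrelation_check(d, d_lower, d_upper):
    """One-tail rule for positive first order autocorrelation."""
    if d < d_lower:
        return "positive autocorrelation"
    if d > d_upper:
        return "no evidence of positive autocorrelation"
    return "inconclusive"

def negative_autocorrelation_check(d, d_lower, d_upper):
    """One-tail rule for negative first order autocorrelation."""
    if d > 4 - d_lower:
        return "negative autocorrelation"
    if d < 4 - d_upper:
        return "no evidence of negative autocorrelation"
    return "inconclusive"

# Hypothetical residual series, for illustration only.
trending = [-3, -2, -1, 0, 1, 2, 3]  # neighbours are similar, so d is small
alternating = [1, -1, 1, -1]         # neighbours differ sharply, so d is large

d_trending = durbin_watson(trending)
d_alternating = durbin_watson(alternating)
print(round(d_trending, 4), d_alternating)
```

The trending series gives a d well below 2 and the alternating series a d well above 2, matching the description of positive and negative autocorrelation.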
[Figure: number line for d; negative first order correlation does not exist below 4 − dU, the test is inconclusive between 4 − dU and 4 − dL, negative first order correlation exists above 4 − dL]

Two-Tail Test for First Order Autocorrelation
• If d < dL or d > 4 − dL, first order autocorrelation exists.
• If d falls between dL and dU or between 4 − dU and 4 − dL, the test is inconclusive.
• If d falls between dU and 4 − dU, there is no evidence for first order autocorrelation.
[Figure: number line for d from 0 to 4 marking dL, dU, 2, 4 − dU, and 4 − dL; first order correlation exists at both ends, the test is inconclusive in the two adjacent bands, and first order correlation does not exist in the middle]

Testing the Existence of Autocorrelation, Example
• Example
  – How does the weather affect the sales of lift tickets in a ski resort?
  – Data on ticket sales for the past 20 years, along with the total snowfall and the average temperature during Christmas week in each year, were collected.
  – The model hypothesized was
    TICKETS = β0 + β1 SNOWFALL + β2 TEMPERATURE + ε
  – Regression analysis yielded the following results:

The Regression Equation – Assessment (I)

SUMMARY OUTPUT
Regression Statistics
Multiple R         0.3465
R Square           0.1200
Adjusted R Square  0.0165
Standard Error     1712
Observations       20

ANOVA       df  SS        MS       F     Signif. F
Regression  2   6793798   3396899  1.16  0.3373
Residual    17  49807214  2929836
Total       19  56601012

           Coefficients  Standard Error  t Stat  P-value
Intercept  8308.0        903.73          9.19    0.0000
Snowfall   74.59         51.57           1.45    0.1663
Tempture   -8.75         19.70           -0.44   0.6625

The model seems to be very poor:
• R-square = 0.1200
• It is not valid (Signif. F = 0.3373)
• No variable is linearly related to Sales

Diagnostics: The Error Distribution
[Figure: histogram of the errors]
The errors may be normally distributed.

Diagnostics: Heteroscedasticity
[Figure: residuals vs. predicted y]
It appears there is no problem of heteroscedasticity (the error variance seems to be constant).
Diagnostics: First Order Autocorrelation
[Figure: residuals over time]
The errors are not independent!!

Diagnostics: First Order Autocorrelation
Durbin-Watson Statistic: d = 0.5931
The residuals: −2793.99, −1723.23, −2342.03, −956.955, −1963.73, . . .
Test for positive first order autocorrelation: n = 20, k = 2. From the Durbin-Watson table we have dL = 1.10, dU = 1.54. The statistic is d = 0.5931.
Conclusion: Because d < dL, there is sufficient evidence to infer that positive first order autocorrelation exists.

The Modified Model: Time Included
The modified regression model
TICKETS = β0 + β1 SNOWFALL + β2 TEMPERATURE + β3 TIME + ε
• All the required conditions are met for this model.
• The fit of this model is high: R² = 0.7410.
• The model is valid. Significance F = .0001.
• SNOWFALL and TIME are linearly related to ticket sales.
• TEMPERATURE is not linearly related to ticket sales.

References
• Statistics (統計學), 謝邦昌: Ch. 12 – Ch. 17
