MBA Statistics 51-651-00
COURSE #2
Do we have winning conditions?

Decision making from statistical inference
Very often, a decision is taken following a quantitative analysis of certain parameters.
• You are offered two advertising concepts to launch a new product. You will choose the one that obtains the best effectiveness score in your target market.
• If the resistance or the average durability of a new product is significantly larger than that of the best competing product, you will put this product on the market.
• If the « winning conditions » were present and more than 50% of people in Quebec would vote yes in a referendum on sovereignty, then Bernard Landry would decide to hold one.

In general, the parameters that interest us are estimated using a sample, and our decision is made following a hypothesis test.
• Example: We ask 1000 residents of the Province of Quebec, chosen at random among those who have the right to vote, whether they would vote yes today in a Quebec referendum on sovereignty.

What would Bernard Landry do if:
1. 432 voters answered yes? (432/1000 = 43.2%)
   • He would most probably not hold a referendum.
2. 517 voters answered yes? (517/1000 = 51.7%)
   • Is 51.7% significantly larger than 50%?
3. 612 voters answered yes? (612/1000 = 61.2%)
   • 61.2% is probably significantly larger than 50%. Therefore he would decide to hold a referendum on the sovereignty of Quebec.

Basic notions of hypothesis tests
• To help us decide (especially in case 2 of the previous slide), we will try to quantify the term « significantly different », statistically speaking, by associating a probability of error with it.
• In other words, we want to know, starting from the results obtained in the sample, the probability that the Premier is making a mistake by deciding to hold a referendum on sovereignty.

Basic notions of hypothesis tests (cont'd)
• If the probability of making a mistake is small (for example, lower than 5%), he will decide to hold a referendum on sovereignty soon.
• If this probability is large (for example, higher than 5%), he will wait until the « winning conditions » are present before holding a referendum.

Basic notions of hypothesis tests (cont'd)
There are essentially two possibilities:
1. 50% or less of the voters would vote yes if a referendum took place today;
2. more than 50% of the voters would vote yes.
• The first possibility is called the null hypothesis (denoted H0).
• The second possibility is called the alternate hypothesis (denoted H1).

Notation:
• Let « p » be the true proportion of voters who would vote yes in a referendum. We then have the following two possibilities:
  H0: p ≤ 50% vs H1: p > 50%
• Often, the alternate hypothesis is what we want to show « beyond any reasonable doubt », i.e. we want the probability of making a mistake by deciding in favour of H1, based on the sample results, to be small.

Choosing H1
• The choice of H1 is determined by the question you need to answer.
• H1 must be chosen in such a way that you can answer yes (resp. no) to the question if one accepts H1, and no (resp. yes) if one accepts H0.
• Typically there are three choices for H1: θ > θ0, θ < θ0 or θ ≠ θ0

Choosing H1 (continued)
The question Bernard Landry is asking himself is: Do I have a chance of winning?
• H1: p < ½ is not good. If one accepts H0, then one can only conclude that p ≥ ½, so the answer to his question is neither a clear yes nor a clear no! The same is true for the choice H1: p ≠ ½.
• But H1: p > ½ is the right choice. If H1 is accepted, the answer is yes, while if H0 is accepted, then p ≤ ½, so the answer is no.

Possible errors in decision making starting from a sample:
Type I error:
• To reject H0 in favour of H1 (i.e. to take the decision H1) when actually H0 is true.
• The probability of Type I error is the probability of observing the « value » obtained in our sample, or a value even « further away » from H0, if H0 is true. In statistical jargon, this probability is often called the « p-value ».
Type II error:
• Not rejecting H0 in favour of H1 when actually H1 is true.

Is the defendant guilty or not guilty?
| Truth \ Jury decision | H0: not guilty | H1: guilty |
|---|---|---|
| H0: not guilty | ✓ | Type I error |
| H1: guilty | Type II error | ✓ |

Control of Type I and Type II errors
• Given the results obtained in the sample, we calculate the probability of Type I error (p-value).
• If this probability is relatively small (for example, p-value < 5%), we reject H0 and make the decision H1. If not, we do not reject H0.

P-value
• Measures the confidence you should have in H0.
• A small p-value indicates that you should be less confident in H0.
• How small should the p-value be to reject H0 in favour of H1?
• It depends on you…
• Illustration: p-value.xls

Real life analog
• One of your friends just lied to you. Is he/she still your friend?
• Then he/she lies again, and again, and again.
• When will you stop considering him/her as a friend?

Control of Type I and Type II errors (continued)
• For a Type I error probability fixed in advance (e.g. 5%), we control the Type II error through the sample size, before undertaking the study.
• We define the power of the hypothesis test as the quantity: (1 − probability of a Type II error)

In the next few hours, we will see basic statistical tests:
1. Test of a proportion.
2. Test of a mean.
3. Test of a difference between two means from the same sample (similar to case 2).

1. Test of a proportion
Example: Two years ago, a company put a new product on the market. The top management of the firm plans to increase advertising expenses if less than 70% of the population knows the product.

What are the possible hypotheses we want to examine?
Let « p » be the true proportion of individuals in the population who know the product, and « p0 » the value that corresponds to our hypothesis or decision (p0 = 70% in the previous example). We have to choose between:
• H0: p ≤ p0 vs H1: p > p0 (right-tailed test)
• H0: p ≥ p0 vs H1: p < p0 (left-tailed test)
• H0: p = p0 vs H1: p ≠ p0 (two-tailed test)

• One must choose the hypothesis H1 so that the answer to the question is yes or no.
• In this case, the question is: should we increase advertising expenses?

• H0: p ≤ 70% vs H1: p > 70%
• If H1 is accepted, the answer is No. If H0 is accepted, the answer is unclear (it could be Yes or No)!
• H1: p > 70% is not appropriate.

• H0: p = 70% vs H1: p ≠ 70%
• If H0 is accepted, the answer is No. If H1 is accepted, the answer is unclear (it could be Yes or No)!
• H1: p ≠ 70% is not appropriate.

• H0: p ≥ 70% vs H1: p < 70%
• If H0 is accepted, the answer is No. If H1 is accepted, the answer is Yes!
• H1: p < 70% is the appropriate choice.

Procedure:
We take a sample of n individuals from the target population and calculate the proportion of individuals who know the product. We will reject the null hypothesis H0 at the α level if we have sufficient proof against it, i.e. enough evidence in favour of the alternate hypothesis H1, i.e. p-value < α.

The test statistic is given by:
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$
where $\hat{p}$ is the proportion observed in the sample. If the null hypothesis H0 is true and the sample size is large, the statistic z approximately follows a normal distribution with mean 0 and variance 1 [denoted N(0,1)].

In order to make a decision, we calculate the p-value:
• Right-tailed test: p-value = Prob[N(0,1) > z]
• Left-tailed test: p-value = Prob[N(0,1) < z]
• Two-tailed test: p-value = 2 x Prob[N(0,1) > |z|]
The p-value is calculated with proportion-1t.xls

The company contacted 500 people from the target population by telephone.
• 330 individuals answered that they know the product (330/500 = 66%).
• H0: p ≥ 70% vs H1: p < 70%
  $z = \dfrac{0.66 - 0.70}{\sqrt{0.70(1 - 0.70)/500}} = -1.9518$
• p-value = 0.0255
• We reject H0 (or accept H1) at the 5% level.
• Therefore we will decide to raise the advertising budget for this product.

Intentions to vote example:
• We choose at random 1000 residents of Quebec who have the right to vote and ask them whether they would vote yes today in a referendum on sovereignty. In the sample, 517 voters answered that they would vote yes.
• H0: p ≤ 50% vs H1: p > 50%
  $z = \dfrac{0.517 - 0.5}{\sqrt{0.5(1 - 0.5)/1000}} = 1.0752$
• p-value = 0.1411
• We will not reject H0 at the 5% level.
• Bernard Landry will not hold a referendum in the near future.

Intentions to vote example:
• We choose at random 1000 residents of Quebec who have the right to vote and ask them whether they would vote yes today in a referendum on sovereignty. In the sample, 612 voters answered that they would vote yes.
• H0: p ≤ 50% vs H1: p > 50%
  $z = \dfrac{0.612 - 0.5}{\sqrt{0.5(1 - 0.5)/1000}} = 7.0835$
• p-value ≈ 7.0E-13
• We will reject H0 at the 5% level.
• Bernard Landry will hold a referendum in the near future.

Exercise
• Recall the last example in the estimation section.
• Can you now answer the question satisfactorily?

Remark: Test vs Confidence interval
• Testing H0: p = p0 vs H1: p ≠ p0 is equivalent to constructing a confidence interval for p.
• H0 is rejected if and only if p0 is not in the interval.

2. Test of one mean
• Example: You are in charge of the department that produces 170 g bags of chips (brand CCC). To verify that, on average, the filling process is maintained at 170 g, each day one of your employees takes a random sample of 100 bags and calculates the average weight of the sample. The filling process will be stopped if the average weight is significantly different from 170 g.

What are the possible hypotheses we want to examine?
Let « μ » be the true mean of a characteristic in the population. This mean is unknown, as is the variance σ².
Let « μ0 » be the value of the mean that corresponds to our hypothesis or decision (μ0 = 170 g in the previous example). We have to choose between:
• H0: μ ≤ μ0 vs H1: μ > μ0 (right-tailed test)
• H0: μ ≥ μ0 vs H1: μ < μ0 (left-tailed test)
• H0: μ = μ0 vs H1: μ ≠ μ0 (two-tailed test)

Procedure:
We take a sample of size n from the target population and calculate the mean $\bar{X}$ and the standard deviation s. We will reject the null hypothesis H0 at the α level if we have sufficient proof against it, i.e. enough evidence in favour of the alternate hypothesis H1, i.e. p-value < α.

The test statistic is given by:
$t = \dfrac{\bar{X} - \mu_0}{\sqrt{s^2/n}}$
If the null hypothesis H0 is true, the t statistic follows a Student distribution with n-1 degrees of freedom [denoted t(n-1)].

In order to make a decision, we calculate the p-value:
• Right-tailed test: p-value = Prob[t(n-1) > t]
• Left-tailed test: p-value = Prob[t(n-1) < t]
• Two-tailed test: p-value = 2 x Prob[t(n-1) > |t|]
• (1-α) confidence interval for μ: $\bar{X} \pm t_{(n-1);\,\alpha/2}\,\sqrt{s^2/n}$
The p-value is calculated using mean-1t.xls

Example:
• The sample mean of the 100 bags of chips is 169.9 grams and the standard deviation is s = 0.27.
• H0: μ = 170 g vs H1: μ ≠ 170 g
• p-value = 0.0003
• We reject H0 without being afraid of being wrong!
• 95% confidence interval for μ: [169.846 ; 169.953]
• The interval does not contain the value 170, so we reject H0 at the 5% level.

• Suppose instead that the mean of the sample of 100 bags of chips is 170.011 grams and the standard deviation is s = 0.27.
• H0: μ = 170 g vs H1: μ ≠ 170 g
• p-value = 0.69
• We will not reject H0.
• 95% confidence interval for μ: [169.957 ; 170.064]
• The interval contains the value 170, so we will not reject H0 at the 5% level.

Case study
• The average annual salary of a group of employees in a city is 45 000$.
• One of the main issues in the negotiations is that the union representative states that this particular group is paid much less than in other comparable cities.
• One decides to verify that hypothesis. If the union is right, the employer will increase the salaries in such a way that the average salary will not be significantly lower than in the other cities.
• Both parties agree to take a risk of 5%.

Case study (continued)
• To perform the comparison, 50 comparable cities were chosen at random. The mean of the 50 (average) annual salaries was 50 000$, and the standard deviation was 16 000$.
• a) What is the conclusion?
• b) The city proposes to increase the average annual salary to 46 500$. Is this an honest offer?

Remark: Test vs Confidence interval
• Testing H0: μ = μ0 vs H1: μ ≠ μ0 is equivalent to constructing a confidence interval for μ.
• H0 is rejected if μ0 is not in the interval.

3. Test of a difference of two means from the same sample
Example: The human resources director of a company wants to suggest that the management implement a special training program for the employees assigned to the assembly department. To evaluate the effectiveness of this 3-week program, we chose 15 employees at random and observed the number of parts they assembled during this period of time. Thereafter, these 15 employees participated in the training program and, once again, we observed the number of parts assembled during the same period of time.

The results obtained (hr.xls) were as follows:
| individual | before | after | difference |
|---|---|---|---|
| 1 | 15 | 17 | 2 |
| 2 | 13 | 16 | 3 |
| 3 | 8 | 10 | 2 |
| 4 | 9 | 9 | 0 |
| 5 | 7 | 9 | 2 |
| 6 | 12 | 13 | 1 |
| 7 | 11 | 14 | 3 |
| 8 | 12 | 15 | 3 |
| 9 | 11 | 14 | 3 |
| 10 | 9 | 11 | 2 |
| 11 | 10 | 14 | 4 |
| 12 | 12 | 11 | -1 |
| 13 | 11 | 13 | 2 |
| 14 | 7 | 10 | 3 |
| 15 | 12 | 13 | 1 |

The results of the statistical analysis using Excel were as follows:
[Excel output]

This test is equivalent to a test of the mean difference between after and before:
T test for a mean (unknown sigma)
| X-bar | Mu0 | n | s | t statistic |
|---|---|---|---|---|
| 2 | 0 | 15 | 1.309 | 5.916 |

| p-value (2-tailed test) | Confidence level | CI: lower limit | CI: upper limit |
|---|---|---|---|
| 0.0000 | 95.0% | 1.3 | 2.7 |

p-value for H1: Mu > Mu0: 0.0000
p-value for H1: Mu < Mu0: 1.0000

• Thus, the average productivity is significantly higher after the program.
• If the costs of the training program are less than the gains in productivity, then the program will be adopted.
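
The worked examples in these slides rely on Excel workbooks (proportion-1t.xls, mean-1t.xls, hr.xls). As a rough cross-check, the sketch below recomputes two of the proportion tests and the before/after training test using only the Python standard library. The function names are illustrative and do not come from the course material. The Student tail probability is approximated here by Simpson's rule on the t density, which is an assumption of this sketch, not necessarily what the workbooks do.

```python
import math
from statistics import NormalDist, mean, stdev


def proportion_z_test(successes, n, p0, tail):
    """Large-sample z test of a proportion; tail is 'right', 'left' or 'two'."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf
    if tail == "right":
        p_value = 1 - cdf(z)             # Prob[N(0,1) > z]
    elif tail == "left":
        p_value = cdf(z)                 # Prob[N(0,1) < z]
    else:
        p_value = 2 * (1 - cdf(abs(z)))  # 2 x Prob[N(0,1) > |z|]
    return z, p_value


def t_upper_tail(t, df, steps=2000):
    """Approximate Prob[t(df) > t] with Simpson's rule (steps must be even)."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps
    total = density(0.0) + density(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    area = total * h / 3                 # Prob[0 < t(df) < |t|]
    tail = 0.5 - area
    return tail if t >= 0 else 1 - tail


def paired_t_test(before, after):
    """Right-tailed test of H1: mean(after - before) > 0 on paired data."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, t_upper_tail(t, n - 1)


# Product awareness: 330 of 500, H1: p < 70% (slides report z = -1.9518, p-value = 0.0255)
z_aware, p_aware = proportion_z_test(330, 500, 0.70, "left")

# Intentions to vote: 517 of 1000, H1: p > 50% (slides report p-value = 0.1411)
z_vote, p_vote = proportion_z_test(517, 1000, 0.50, "right")

# Training program data from the slides (slides report t statistic = 5.916)
before = [15, 13, 8, 9, 7, 12, 11, 12, 11, 9, 10, 12, 11, 7, 12]
after = [17, 16, 10, 9, 9, 13, 14, 15, 14, 11, 14, 11, 13, 10, 13]
t_train, p_train = paired_t_test(before, after)

print(f"awareness: z = {z_aware:.4f}, p-value = {p_aware:.4f}")
print(f"vote 517:  z = {z_vote:.4f}, p-value = {p_vote:.4f}")
print(f"training:  t = {t_train:.3f}, right-tailed p-value = {p_train:.6f}")
```

In each case the decision rule is the one used throughout the slides: reject H0 at the 5% level when the p-value is below 0.05.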
