Hypothesis Testing

• What is Hypothesis Testing?
• Testing for the population mean
• One-tailed testing
• Two-tailed testing
• Tests Concerning Proportions
• Types of Errors

Hypothesis Testing

A hypothesis is a statement about the value of a population parameter, developed for the purpose of testing. Examples of hypotheses made about a population parameter are:
– The mean monthly income for systems analysts is $3,625 (a statement about a population mean μ).
– Twenty percent of all restaurant customers return for another meal within a month (a statement about a population proportion π).

Hypothesis Testing

Hypothesis testing is a procedure, based on sample evidence and probability theory, used to determine whether the hypothesis is a reasonable statement and should not be rejected, or is unreasonable and should be rejected.
• Identify the hypotheses and the significance level, which are given or implied by the problem.
• Decide whether the z or the t distribution is to be used.
• Find the critical values and the accept/reject regions.
• Find the z or t value of the sample and check whether it falls in an accept or a reject region.

Hypothesis Testing

o H0: null hypothesis; H1: alternate hypothesis
o H0 and H1 are mutually exclusive and collectively exhaustive
o H0 is always presumed to be true
o H1 has the burden of proof

Hypothesis Testing

In problem solving, look for key words and convert them into symbols. Some key words include: "improved", "better than", "as effective as", "different from", "has changed", …

3 possible situations:
H0: μ = value    H1: μ ≠ value
H0: μ ≤ value    H1: μ > value
H0: μ ≥ value    H1: μ < value

Possible Keywords:
• Is there a change?
• has not changed
• is larger than
• is better than
• has improved
• is less than
• is less effective

Hypothesis Testing (Two-Tailed)

H0: μ = value
H1: μ ≠ value

Reject H0 if:
\[ |Z| > Z_{\alpha/2} \quad \text{or} \quad |t| > t_{\alpha/2,\ df = n-1} \]
where
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}, \qquad t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \]
The critical values are -Zα/2 and Zα/2, with an area of α/2 in each tail. Accept H0 if the sample value falls between them.

Hypothesis Testing (One-Tailed)

Left-tailed test:
H0: μ ≥ value
H1: μ < value
Reject H0 if:
\[ Z < -Z_{\alpha} \quad \text{or} \quad t < -t_{\alpha,\ df = n-1} \]
The critical value is -Zα, with an area of α in the left tail. Accept H0 to the right of it.

Right-tailed test:
H0: μ ≤ value
H1: μ > value
Reject H0 if:
\[ Z > Z_{\alpha} \quad \text{or} \quad t > t_{\alpha,\ df = n-1} \]
The critical value is Zα, with an area of α in the right tail. Accept H0 to the left of it.

Hypothesis Testing (for the mean μ)

Hypothesis Testing (for the proportion π)

Example 26, page 356

50% of students change their major within the first year. A random sample of 100 students revealed that 48 students changed their major within the first year. Has there been a significant decrease in the proportion of students who change their major? Test at the 0.05 level of significance.

H0: π ≥ 0.5
H1: π < 0.5
p = 48/100 = 0.48
α = 0.05 => the z critical value is -1.65 => H0 is rejected if z < -1.65

\[ z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}} = \frac{0.48 - 0.5}{\sqrt{\dfrac{0.5(1-0.5)}{100}}} = -0.4 \]

-0.4 > -1.65 => H0 is not rejected. There is no significant decrease in the proportion of students changing their major.

Decisions and Consequences

Null Hypothesis    Researcher accepts H0    Researcher rejects H0
H0 is true         Correct decision         Type I error (α)
H0 is false        Type II error (β)        Correct decision

Type II Error (β)

β is the probability that the null hypothesis is NOT rejected when it is actually false.

Example (page 356): At a plant manufacturing pins from steel, past experience indicates that the mean tensile strength of all incoming shipments, µ0, is 10,000 psi and the standard deviation σ is 400 psi. To make a decision, the manufacturer gives the quality control inspector the following rule: Take a sample of 100 steel bars. At the .05 significance level, if the sample mean strength X-bar falls between 9,922 psi and 10,078 psi, accept the lot. Otherwise the lot is rejected.
Suppose that the unknown population mean of an incoming lot, designated by µ1, is really 9,900 psi. What is the probability that the inspector will fail to reject the shipment (Type II error)?

Type II Error (β)

The z value of the lower acceptance limit X-bar_c = 9,922 is:
\[ z = \frac{\bar{X}_c - \mu_1}{\sigma/\sqrt{n}} = \frac{9{,}922 - 9{,}900}{400/\sqrt{100}} = \frac{22}{40} = 0.55 \]
The area between z = 0 and z = 0.55 is .2088, so P(z > 0.55) = 0.5 - .2088 = .2912. The upper acceptance limit, 10,078, lies 4.45 standard errors above µ1, so the area beyond it is negligible. The probability of a Type II error, β, is therefore about .2912.

Always draw a picture of the normal curve and areas to solve!

Chapter 11
Two-Sample Tests of Hypothesis

Our Objectives

• Conduct a test of hypothesis about the difference between two independent population means.
• Conduct a test of a hypothesis about the difference between two population proportions.
• Conduct a test of a hypothesis about the mean difference between paired or dependent observations.
• Understand the difference between dependent and independent samples.

Two-Sample Tests of Hypothesis

We take samples from two populations and compare the population means. In the one-sample test of hypothesis, we took a sample from one population and compared the sample statistic to the population parameter.
Example: Is there a difference in the mean value of residential real estate sold by male agents and female agents in a particular area?

Two-Sample Test of Hypothesis: Independent Samples

Example: A financial accountant wishes to know whether there is a difference in the mean rate of return for high yield mutual funds and global mutual funds. There are two independent populations: the high yield mutual funds and the global mutual funds.

If there is a difference between the population means, then we expect a difference between the sample means. If both samples have more than 30 observations, we can reason that the distribution of the difference in the sample means is Normal.

Mean of the distribution of the differences:
• If it is zero, we conclude that there is no difference between the two populations.
• If it is positive or negative, we conclude that the two populations do not have the same mean.
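The Type II error calculation for the steel-bar example earlier in this section can be checked numerically. The following is a minimal sketch, assuming Python's standard-library statistics.NormalDist; the function name type_ii_error and its parameters are hypothetical and not part of the slides. Unlike the table lookup, it also subtracts the very small area above the upper acceptance limit.

```python
from statistics import NormalDist


def type_ii_error(mu_true, sigma, n, lower, upper):
    """Probability that the sample mean lands inside the acceptance
    region [lower, upper] when the true population mean is mu_true."""
    standard_error = sigma / n ** 0.5
    sampling_dist = NormalDist(mu_true, standard_error)
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)


# Values from the steel-bar example: sigma = 400 psi, n = 100,
# acceptance region 9,922 to 10,078 psi, true mean 9,900 psi.
beta = type_ii_error(mu_true=9900, sigma=400, n=100, lower=9922, upper=10078)
print(f"beta is approximately {beta:.4f}")
```

The result should agree with the table-based value of about .2912 to three decimal places.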
Two-Sample Test of Hypothesis: Independent Samples with known population SD σ

H0: μ1 = μ2    H1: μ1 ≠ μ2
or, equivalently,
H0: μ1 - μ2 = 0    H1: μ1 - μ2 ≠ 0

If
\[ \bar{X}_1 \sim \mathrm{Norm}\!\left(\mu_1, \frac{\sigma_1}{\sqrt{n_1}}\right) \quad \text{and} \quad \bar{X}_2 \sim \mathrm{Norm}\!\left(\mu_2, \frac{\sigma_2}{\sqrt{n_2}}\right) \]
then
\[ (\bar{X}_1 - \bar{X}_2) \sim \mathrm{Norm}\!\left(\mu_1 - \mu_2, \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right) \]
where the second argument of Norm is the standard deviation.

Two-Sample Test of Hypothesis: Independent Samples with known population SD σ

Standardize the distribution of the differences. The test statistic for the difference between two means is:
\[ z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]
The variance of the distribution of differences in sample means is:
\[ \sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]

• Identify the hypotheses and the significance level, which are given or implied by the problem.
• Decide whether the z or the t distribution is to be used.
• Find the critical values and the accept/reject regions.
• Find the z or t value of the sample and check whether it falls in an accept or a reject region.

Example 2, page 374

First population: A sample of 65 observations is selected. Population SD = 0.75. Sample mean = 2.67.
Second population: A sample of 50 observations is selected. Population SD = 0.66. Sample mean = 2.59.
Use a 0.08 significance level.

H0: μ1 ≤ μ2
H1: μ1 > μ2

a) Is this a one-tailed or a two-tailed test?
b) State the decision rule.
c) Compute the value of the test statistic.
d) What is your decision regarding H0?
e) What is the p-value?

Example 2

a) It is a one-tailed test.
b) For α = 0.08 and a one-tailed test, we reject H0 if z > 1.41 (CV = 1.41).
c)
\[ z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{2.67 - 2.59}{\sqrt{\dfrac{0.75^2}{65} + \dfrac{0.66^2}{50}}} = 0.607 \]
d) 0.607 < 1.41, so we fail to reject H0.
e) The p-value of the sample is P(z > 0.607) = 0.5 - 0.2291 = 0.2709.

Two-Sample Test of Hypothesis (Proportions)

Standardize the distribution of the differences.
The test statistic for the difference between two proportions is:
\[ z = \frac{p_1 - p_2}{\sqrt{\dfrac{p_c(1-p_c)}{n_1} + \dfrac{p_c(1-p_c)}{n_2}}} \]

Two-Sample Test of Hypothesis (Proportions)

• p1 is the first sample proportion (p1 = x1/n1)
• p2 is the second sample proportion (p2 = x2/n2)
• pc is the pooled proportion

The pooled estimate of the population proportion is computed using the formula:
\[ p_c = \frac{x_1 + x_2}{n_1 + n_2} \]

Example 12, page 378

Single people: A sample of 400 people is selected. 120 had at least one accident in the past three years.
Married people: A sample of 600 people is selected. 150 had at least one accident in the past three years.
Use a 0.05 significance level. Is there a significant difference in the proportion of single and married people having accidents?

Example 12

H0: πm = πs
H1: πm ≠ πs
At the 0.05 significance level, the z critical values are z = 1.96 and z = -1.96 (two-tailed test). The accept region is between -1.96 and 1.96.

\[ p_c = \frac{x_1 + x_2}{n_1 + n_2} = \frac{120 + 150}{400 + 600} = \frac{270}{1000} = 0.27 \]

\[ z = \frac{p_1 - p_2}{\sqrt{\dfrac{p_c(1-p_c)}{n_1} + \dfrac{p_c(1-p_c)}{n_2}}} = \frac{(120/400) - (150/600)}{\sqrt{\dfrac{0.27(1-0.27)}{400} + \dfrac{0.27(1-0.27)}{600}}} = 1.74 \]

Since 1.74 lies between -1.96 and 1.96, H0 is not rejected. There is no significant difference in the proportion of married and single people who have accidents.

Two-Sample Test of Hypothesis: Independent Samples with unknown population SD σ and at least one of the samples is less than 30.

o Use the following t distribution if:
o The samples are independent
o Both populations have unknown but equal SD
o At least one of the samples is less than 30

We use the t statistic. We compute the t value using the formula:
\[ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]
where s_p^2 is the pooled estimate of the population variance. We use n1 + n2 - 2 degrees of freedom.

So to find the value of t, 3 steps are performed:
1. compute s1 and s2
2. compute s_p^2
3. determine t

The pooled variance is computed using the formula:
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]
where s_1^2 is the variance of the 1st sample and s_2^2 is the variance of the 2nd sample.
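The three steps for the pooled-variance t statistic can be sketched in a few lines of Python. This is an illustrative sketch only, using the standard-library statistics module; the function name pooled_t and the small data set are hypothetical and do not come from the slides.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t, df, sp2) for two independent samples, assuming the
    two populations have equal (unknown) standard deviations."""
    n1, n2 = len(sample1), len(sample2)
    # Step 1: sample variances s1^2 and s2^2 (divisor n - 1).
    s1_sq, s2_sq = variance(sample1), variance(sample2)
    # Step 2: pooled estimate of the population variance.
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    # Step 3: t statistic for the difference between the sample means.
    t = (mean(sample1) - mean(sample2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df, sp2


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
t, df, sp2 = pooled_t([1, 2, 3], [2, 4, 6])
print(f"t = {t:.3f}, df = {df}, pooled variance = {sp2:.2f}")
```

The computed t is then compared with the critical value from a t table for df = n1 + n2 - 2; the sketch leaves that lookup out because the standard library has no t distribution.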
Example 15, page 384

Men Examination Scores:   72 69 98 66 85 76 79 80 77
Women Examination Scores: 81 67 90 78 81 80 76

Is it reasonable to conclude that women score higher than men? Use the 0.01 significance level.

Example 15, page 384

H0: μf ≤ μm
H1: μf > μm
Use Appendix B2 to obtain critical values. For significance level 0.01, a one-tailed test and df = n1 + n2 - 2 = 14, we obtain t = 2.624 for the critical value. Accept H0 if the sample t is less than 2.624.

s_f = 6.88, s_m = 9.49, X-bar_f = 79, X-bar_m = 78

\[ s_p^2 = \frac{(n_f - 1)s_f^2 + (n_m - 1)s_m^2}{n_f + n_m - 2} = \frac{(7-1)(6.88)^2 + (9-1)(9.49)^2}{7 + 9 - 2} = 71.749 \]

\[ t = \frac{\bar{X}_f - \bar{X}_m}{\sqrt{s_p^2\left(\dfrac{1}{n_f} + \dfrac{1}{n_m}\right)}} = \frac{79 - 78}{\sqrt{71.749\left(\dfrac{1}{7} + \dfrac{1}{9}\right)}} = 0.234 \]

t < 2.624, so we accept H0.

Two-Sample Test of Hypothesis: Independent Samples with unknown population SD σ and at least one of the samples is less than 30.

o Use the following t distribution if:
o The samples are independent
o Both populations have unknown SD, and we cannot assume the SDs are equal
o At least one of the samples is less than 30

We use the t statistic. We compute the t value using the formula:
\[ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]
Use the following for the degrees of freedom (round down if not an integer):
\[ df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} \]

Example 22, page 388

Klein Models: 5.0, 4.5, 3.4, 3.4, 6.0, 3.3, 4.5, 4.6, 3.5, 5.2, 4.8, 4.4, 4.6, 3.6, 5.0
Clairborne Models: 3.1, 3.7, 3.6, 4.0, 3.8, 3.8, 5.9, 4.9, 3.6, 3.6, 2.3, 4.0

Is it reasonable to conclude that Klein Models earn more? Use the 0.05 significance level and assume the population standard deviations are not the same.

Example 22, page 388

H0: μk ≤ μc
H1: μk > μc
Use Appendix B2 to obtain critical values, for significance level 0.05, a one-tailed test and df = 22:

\[ df = \frac{\left(\dfrac{0.795^2}{15} + \dfrac{0.881^2}{12}\right)^2}{\dfrac{(0.795^2/15)^2}{14} + \dfrac{(0.881^2/12)^2}{11}} = 22.5 \]

Rounding down to df = 22, we obtain t = 1.717 for the critical value. Accept H0 if the sample t is less than 1.717.

\[ t = \frac{4.387 - 3.858}{\sqrt{\dfrac{0.795^2}{15} + \dfrac{0.881^2}{12}}} = 1.619 \]

t < 1.717, so we fail to reject the null hypothesis.
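The unequal-variance calculation in Example 22 can be reproduced from the raw data. The following is a sketch under stated assumptions: it uses only Python's standard library, the function name unequal_variance_t is hypothetical, and it works from unrounded sample statistics, so the t value may differ from the slide's 1.619 in the third decimal place.

```python
from math import floor, sqrt
from statistics import mean, variance


def unequal_variance_t(sample1, sample2):
    """Return (t, df) for two independent samples without assuming
    equal population standard deviations; df is rounded down."""
    n1, n2 = len(sample1), len(sample2)
    # Each term is s_i^2 / n_i, the estimated variance of a sample mean.
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2
    t = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, floor(df)


klein = [5.0, 4.5, 3.4, 3.4, 6.0, 3.3, 4.5, 4.6, 3.5, 5.2,
         4.8, 4.4, 4.6, 3.6, 5.0]
clairborne = [3.1, 3.7, 3.6, 4.0, 3.8, 3.8, 5.9, 4.9, 3.6, 3.6, 2.3, 4.0]

t, df = unequal_variance_t(klein, clairborne)
critical_value = 1.717  # from the t table: alpha = 0.05, one-tailed, df = 22
print(f"t = {t:.3f}, df = {df}, reject H0: {t > critical_value}")
```

The critical value is hard-coded from the table lookup in the example rather than computed, since the standard library does not provide the t distribution.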