Hypothesis Testing
ESM 206
6 Feb. 2002

Example: gas mileage
  Do “Small” cars have a different average gas mileage than “Compact” cars?
  Data on mileage of 13 small and 15 compact cars.
  [Figure: Mileage (axis from 20 to 35) plotted by car Type]

  SMALL               COMPACT
  Eagle Summit        Audi 80
  Ford Escort         Buick Skylark
  Ford Festiva        Chevrolet LeBaron
  Honda Civic         Ford Tempo
  Mazda Protégé       Honda Accord
  Mercury Tracer      Mazda 626
  Nissan Sentra       Mitsubishi Galant
  Pontiac LeMans      Mitsubishi Sigma
  Subaru Loyale       Nissan Stanza
  Subaru Justy        Oldsmobile Calais
  Toyota Corolla      Peugeot 405
  Toyota Tercel       Subaru Legacy
  Volkswagen Jetta    Toyota Camry

Example: gas consumption
  $G = \beta_0 + \beta_1 P + \beta_2 I + \beta_3 N + \beta_4 U$
  Which coefficients are different from zero?
  Data from 36 years in the US.

Hypothesis testing
  Define the null hypothesis (H0)
  Does direction matter?
  Choose a test statistic, T
  Find the distribution of T under H0
  Calculate the observed value of the test statistic, S
  Find the probability of obtaining a value at least as extreme as S under H0 (P)
  If P is small, reject H0

The null hypothesis
  A statement about underlying parameters of the population
  We will either reject or fail to reject H0
  Usually a statement of no pattern, or of not exceeding some criterion
  Examples

The alternate hypothesis
  Written HA
  Is the logical complement of H0
  Examples

One- and two-sided tests
  One-sided test: direction matters
    Pick a direction based on regulatory criteria or knowledge of processes
    Direction must be chosen a priori
  Two-sided test: all that matters is a difference
  A one-sided test has greater power
  Must make the decision before analyzing the data

Comparing means: the t-test
  Compare a sample mean to a fixed value (eqs. 1-4)
  Compare a regression coefficient to a fixed value (eq. 5)
  Compare the difference between two sample means to a fixed value (usually 0) (eqs. 6-7)

Assumptions of the t-test
  The data in each sample are normally distributed
  The populations have the same variance
    Can correct for violations of this with the Welch modification of df
    Test for a difference among variances with the F-test

The P-value
  P is the probability of observing data at least as extreme as yours if the null hypothesis is true
  Loosely, P is the risk of error you accept if you reject the null hypothesis: the type I error rate of a test whose threshold is set at P
  P is not the probability that the null hypothesis is true

Critical values of P
  Reject H0 if P is less than a threshold
  P < 0.05 is commonly used
    An arbitrary choice
    Other values: 0.1, 0.01, 0.001
  Always report P, so others can draw their own conclusions

Gas mileage: variances are unequal

              Small        Compact
  Min:        25.000000    21.000000
  1st Qu.:    28.000000    23.000000
  Mean:       31.000000    24.133333
  Median:     32.000000    24.000000
  3rd Qu.:    33.000000    25.500000
  Max:        37.000000    27.000000
  Total N:    13.000000    15.000000
  NA's:        0.000000     0.000000
  Variance:   14.500000     3.552381
  Std Dev.:    3.807887     1.884776

Gas mileage
  Test Name: Welch Modified Two-Sample t-Test
  Estimated Parameter(s): mean of x = 31, mean of y = 24.13333
  Data: x: Small in DS2, and y: Compact in DS2
  Test Statistic: t = 5.905054
  Test Statistic Parameter: df = 16.98065
  P-value: 0.00001738092
  95 % Confidence Interval: LCL = 4.413064, UCL = 9.32027

Gas consumption

                    Value     Std. Error   t value   Pr(>|t|)
  (Intercept)      -0.0898    0.0508       -1.7687   0.0868
  GasPrice         -0.0424    0.0098       -4.3058   0.0002
  Income            0.0002    0.0000       23.4189   0.0000
  New.Car.Price    -0.1014    0.0617       -1.6429   0.1105
  Used.Car.Price   -0.0432    0.0241       -1.7913   0.0830

Interpreting model coefficients
  Is there statistical evidence that the independent variable has an effect?
    Is the parameter estimate significantly different from zero?
  Is the coefficient large enough that the effect is important?
    Must take into account the variation in the independent variable
    Use a linear measure of variation: SD, interquartile range, etc.

Types of error
  Type I: reject the null hypothesis when it is really true
    Desired level: α
  Type II: fail to reject the null hypothesis when it is really false
    Desired level: β
    Is associated with a given effect size
    E.g., want a probability of 0.1 of failing to reject when the true difference between means is 0.35

Types of error

                                  In reality, H0 is:
  Your test says H0 should be:    True                  False
  Accepted                        Correct conclusion    Type II error
  Rejected                        Type I error          Correct conclusion

Controlling error levels
  α is controlled by setting the critical P-value
  β is controlled by α, sample size, sample variance, and effect size
  Tradeoff between α and β
  Need to balance the costs associated with type I and type II errors
  Power is 1 - β
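
Worked example: the Welch test on gas mileage
  The Welch result reported for the gas mileage data can be checked from the summary statistics alone (group sizes, means, and variances). The sketch below is not the software used for the slides; it is a minimal standard-library Python version, and the function names welch_t, t_pdf, and two_sided_p are illustrative choices. The P-value is obtained by numerically integrating the t density with Simpson's rule, which is an assumption made to avoid external libraries.

```python
import math


def welch_t(mean_x, var_x, n_x, mean_y, var_y, n_y):
    """Return (t, df) for the Welch two-sample t-test from summary statistics."""
    vx = var_x / n_x                      # squared standard error of mean x
    vy = var_y / n_y                      # squared standard error of mean y
    t = (mean_x - mean_y) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (n_x - 1) + vy ** 2 / (n_y - 1))
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution with (possibly fractional) df."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=20000):
    """Two-sided P-value: 1 - 2 * (area under the t density from 0 to |t|).

    The area is computed with Simpson's rule; steps must be even.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * total * h / 3.0


# Summary statistics taken from the gas mileage table (Small vs. Compact)
t_stat, df = welch_t(31.0, 14.5, 13, 24.13333, 3.552381, 15)
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.4f}, df = {df:.4f}, P = {p_value:.3g}")
```

  The printed t and df should agree with the reported t = 5.905054 and df = 16.98065 up to rounding of the inputs, and P should be close to the reported 0.00001738092. Because the two sample variances differ by a factor of about four, the Welch df (about 17) is well below the pooled value of 13 + 15 - 2 = 26.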
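
Worked example: power and sample size
  The type II error example (a probability of 0.1 of failing to reject when the true difference between means is 0.35) can be turned into a rough sample-size calculation. The sketch below rests on several assumptions that are not in the slides: a two-sided test at a type I level (alpha) of 0.05, equal group sizes, a common standard deviation sigma = 1.0 chosen purely for illustration, and a normal approximation in place of the exact t-based calculation. The function names two_sample_power and n_for_power are hypothetical.

```python
from statistics import NormalDist


def two_sample_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test (normal approximation).

    delta: true difference between means; sigma: common standard deviation.
    """
    z = NormalDist()
    se = sigma * (2.0 / n_per_group) ** 0.5   # SE of the difference in means
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)     # two-sided critical value
    shift = delta / se
    # Probability of landing in either rejection region under the alternative
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def n_for_power(delta, sigma, target_power, alpha=0.05):
    """Smallest per-group sample size whose approximate power meets the target."""
    if delta == 0:
        raise ValueError("delta must be nonzero: power never exceeds alpha")
    n = 2
    while two_sample_power(delta, sigma, n, alpha) < target_power:
        n += 1
    return n


# Type II error level (beta) of 0.1 means power = 1 - beta = 0.9.
# sigma = 1.0 is an illustrative assumption, not a value from the slides.
n_needed = n_for_power(delta=0.35, sigma=1.0, target_power=0.9)
print("per-group n:", n_needed)
print("power at that n:", round(two_sample_power(0.35, 1.0, n_needed), 4))
```

  With a zero effect size the function returns alpha, which is the type I error rate. Raising n, raising alpha, lowering sigma, or increasing the effect size each increases power, matching the factors listed under controlling error levels.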