Hypothesis Testing
GTECH 201, Lecture 16

Overview of Today's Topic
- Formulation
- Evaluation
- Refining and restating
- Statistical tests

What is a Hypothesis?
- An unproven or unsubstantiated statement
- You need to know the literature before you can formulate a hypothesis statement
- Data collection should support hypothesis testing and evaluation
- If the hypothesis is tested and found to be correct, the results can be refined (different scenarios can be tested)
- If it is only partially correct, the hypothesis statement needs to be refined (reworded)

Hypothesis Testing
- A multi-step procedure that leads the researcher from the hypothesis statement to the decision regarding the hypothesis
- 6-step process:
  1. State null and alternate hypotheses
  2. Select appropriate statistical test
  3. Select level of significance
  4. Delineate regions of rejection and nonrejection of hypotheses
  5. Calculate test statistic
  6. Make decision regarding null hypothesis

Step 1: State Null and Alternate Hypotheses
- Null hypothesis
  - A hypothesis to be tested
  - Usually represented as $H_0$
- Alternative hypothesis
  - A hypothesis considered as an alternate to the null hypothesis
  - Usually represented as $H_A$

Guidelines for Setting up H0, HA
- Hypothesis tests concerning one parameter: the population mean, $\mu$
- A null hypothesis for a hypothesis test concerning a population mean should always specify a single value for that parameter, $\mu_0$
- An equals (=) sign must appear in the null hypothesis
- Therefore: $H_0: \mu = \mu_0$

Guidelines, part 2
- Alternative hypothesis
  - The choice of the alternative hypothesis depends on, and should reveal, the purpose of the hypothesis test
  - The null hypothesis and the alternative hypothesis are mutually exclusive
- Three choices are possible:
  - $H_A: \mu \ne \mu_0$ (nondirectional)
  - $H_A: \mu < \mu_0$ (directional)
  - $H_A: \mu > \mu_0$ (directional)

Guidelines, part 3
- $H_A: \mu \ne \mu_0$
  - A test whose alternate hypothesis contains a $\ne$ sign is called a two-tailed test
  - It states that the population mean, $\mu$, is different from a specified value, $\mu_0$
- When a < sign appears in the alternate hypothesis, the test is called a left-tailed test
- When a > sign appears in the alternate hypothesis, the test is called a right-tailed test

Setting up Hypotheses
A snack food company produces 454 g bags of pretzels. Although the actual weights deviate slightly from 454 g and vary from one bag to another, the quality control team insists that the mean net weight of the bags be maintained at 454 g. If the mean net weight of the bags is lower or higher, it is likely to cause problems. If you work for the quality control team and you want to decide whether the packaging machine is working properly, how would you set up a hypothesis test?

Stating Hypotheses
- $H_0: \mu = 454$ g. The packaging machine IS working properly
- $H_A: \mu \ne 454$ g. The packaging machine IS NOT working properly

Step 2: Select Appropriate Test
One-sample difference of means t test
- Objective: compare a random sample mean to a population mean for difference
- Requirements and assumptions:
  - Random sample
  - Normally distributed population
  - Variable is measured at interval or ratio scale
- Hypotheses: $H_0$ and $H_A$ as set up in Step 1
- Test statistic: $\dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$, where
  - $\bar{X}$ = sample mean
  - $\mu$ = population mean
  - $\sigma_{\bar{X}}$ = standard error of the mean, $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ for a sample of size $n$
  - $\sigma$ = population standard deviation

Step 3: Level of Significance
- $\alpha$ = 0.10 (90%); 0.05 (95%); 0.01 (99%), with the corresponding confidence level in parentheses

Errors
- Type I error: rejecting the null hypothesis when it is in fact true
- Type II error: not rejecting the null hypothesis when it is in fact false

                                 Null hypothesis is TRUE    Null hypothesis is FALSE
  Do not reject null hypothesis  correct decision           Type II error
  Reject null hypothesis         Type I error               correct decision

Step 4: Identify Regions of Rejection of the Null Hypothesis
- Two-tailed
- Left-tailed (directional)
- Right-tailed (directional)

Step 5: Calculate test statistic

Step 6: Make decision regarding null or alternate hypothesis

To Work in Class
- We want to investigate demographic change in an area of 3500 households (HH)
- You take a sample of 250 HH
- Sample mean = 2.68; sample variance = 4.3
- $\alpha$ = 0.10 (90%)
- We want to find out whether the mean HH size in this one area is typical or representative of the national mean household size (2.61)
- Use the six-step process to compare how closely the sample you have taken matches the national average HH size of 2.61

Limits of Hypothesis Testing
- Pre-selecting the level of significance
  - Lacks a theoretical basis
  - Used for convenience
- Binary nature of null and alternative hypotheses
- P-value, or probability value
  - Accepted approach
  - The exact significance level associated with the calculated test statistic is determined

More About P-Value
- We can define the P-value as the exact probability of getting a test statistic value of at least the observed magnitude, IF the null hypothesis is true
- It answers the question: what is the probability of making a Type I error?
- A Type I error occurs when the null hypothesis is rejected using the hypothesis testing procedure, even though in reality the null hypothesis is true

Comparing Classical and P-Value Approaches
- Classical
  1. State hypotheses
  2. Decide on significance level
  3. Select test
  4. Delineate regions of rejection/nonrejection
  5. Calculate the test statistic
  6. State your conclusion in words
- P-value
  1. State hypotheses
  2. Decide on significance level
  3. Compute the value of the test statistic
  4. Determine the P-value
  5. If $P \le \alpha$, reject the null hypothesis; otherwise do not reject
  6. State your conclusion in words

Guidelines for Using P-Value

  P-value               Evidence against H0
  $P > 0.10$            Weak or none
  $0.05 < P \le 0.10$   Moderate
  $0.01 < P \le 0.05$   Strong
  $P \le 0.01$          Very strong

Example
A random sample of 18 people with income below the poverty level reveals their daily intake of calcium:
- Mean: 747.4 mg
- Standard deviation: 188 mg
Use the P-value approach to determine whether the data provide sufficient evidence, at the 5% significance level, to conclude that the mean calcium intake of all Americans with income below the poverty level is less than the required daily allowance of 800 mg.

Parametric and Nonparametric Tests
- Parametric tests
  - Require knowledge about population parameters
  - Make assumptions about the population distribution (e.g., the population is normally distributed)
  - Sample data are measured on an interval/ratio scale
- Nonparametric tests
  - Require no knowledge about population parameters
  - Distribution-free
  - Some nonparametric tests are designed to be applied to nominal and ordinal data ($\chi^2$); we will talk about these in the next lecture

Choices/Options
- Run only a parametric test
- Run only a nonparametric test
- Run both tests

Goal
- State the problem
- Decide what inferential technique will be useful
- Identify formulae associated with the technique
- Interpret the results
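
Worked sketch of the two exercises

The household exercise and the calcium example above can be checked numerically. The block below is a minimal sketch, not the lecture's own solution, and it rests on several assumptions: the function name `one_sample_mean_test` and its parameters are made up for illustration; the standard normal distribution is used in place of the t distribution (reasonable for n = 250, and for the calcium data only if the 188 mg is treated as the population standard deviation; with a sample standard deviation and n = 18, a t distribution with 17 degrees of freedom would apply instead); and no finite population correction is made for the 3500 households.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def one_sample_mean_test(sample_mean, mu0, sd, n, tail, alpha):
    """Hypothetical helper: one-sample test of a mean, normal approximation.

    tail is "two", "left", or "right", matching the sign in H_A.
    Returns (statistic, critical value, P-value, reject H0?).
    """
    standard_error = sd / sqrt(n)                 # sigma_xbar = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error      # Step 5: test statistic

    if tail == "two":                             # H_A: mu != mu0
        critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
        reject = abs(z) > critical
    elif tail == "left":                          # H_A: mu < mu0
        critical = STD_NORMAL.inv_cdf(alpha)
        p_value = STD_NORMAL.cdf(z)
        reject = z < critical
    elif tail == "right":                         # H_A: mu > mu0
        critical = STD_NORMAL.inv_cdf(1 - alpha)
        p_value = 1 - STD_NORMAL.cdf(z)
        reject = z > critical
    else:
        raise ValueError("tail must be 'two', 'left', or 'right'")

    return z, critical, p_value, reject


# In-class exercise: H0: mu = 2.61 against H_A: mu != 2.61, alpha = 0.10.
# The slide gives the sample variance (4.3), so the standard deviation is its root.
z_hh, crit_hh, p_hh, reject_hh = one_sample_mean_test(
    sample_mean=2.68, mu0=2.61, sd=sqrt(4.3), n=250, tail="two", alpha=0.10
)
print(f"Households: z = {z_hh:.3f}, critical = +/-{crit_hh:.3f}, "
      f"P = {p_hh:.3f}, reject H0: {reject_hh}")

# Calcium example: H0: mu = 800 against H_A: mu < 800, alpha = 0.05.
# Assumption: 188 mg is treated as the population standard deviation.
z_ca, crit_ca, p_ca, reject_ca = one_sample_mean_test(
    sample_mean=747.4, mu0=800, sd=188, n=18, tail="left", alpha=0.05
)
print(f"Calcium:    z = {z_ca:.3f}, critical = {crit_ca:.3f}, "
      f"P = {p_ca:.3f}, reject H0: {reject_ca}")
```

Under these assumptions the household statistic is about 0.53, well inside the two-tailed nonrejection region of roughly ±1.645, so the null hypothesis is not rejected. The calcium statistic is about -1.19 with a P-value near 0.12, which is greater than 0.05 and falls in the "weak or none" row of the guideline table, so the null hypothesis is not rejected there either.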