Topic (11) – CONCEPTS OF HYPOTHESIS TESTING

Recall the definition of the Scientific Method:
1. Knowledge is obtained in a systematic and objective manner in order to extend our understanding.
2. Based on this knowledge we form a HYPOTHESIS – a tentative or postulated explanation of the phenomenon. Hence it is a statement about a population characteristic (e.g., a mean µ, a proportion π, or the difference between population means µ1 − µ2).
3. To evaluate the hypothesis we DESIGN and execute an objectively planned experiment.
4. The resulting data are TESTED to determine if they support or do not support the hypothesis.

A) Construct Hypotheses

Almost all statistical testing procedures are based on testing two competing claims: the null and alternative hypotheses.

Defn: H0 is the NULL HYPOTHESIS. This is the status quo, i.e., it is treated as the truth until disproven by testing.
HA is the ALTERNATIVE HYPOTHESIS. This is the competing claim made by the researcher*.

* There are exceptions, usually for testing equivalency or goodness of fit to a probability distribution.

The alternative hypothesis, HA, lists the outcomes claimed to be true by the scientist. The null hypothesis, H0, then lists the remaining or unclaimed cases.

The testing procedure results in either
1) rejecting the null hypothesis in favor of the alternative hypothesis because the data support the alternative, or
2) failing to reject the null hypothesis because there is insufficient evidence to show it is wrong.

Step 1) State Your Claim In Words

EXAMPLES:
A. In a study of the effect of a new drug for reducing serum cholesterol levels in men at risk of heart disease, the company’s claim is that the drug reduces serum cholesterol levels in the target population.
So we might write:
   H0: the drug does not reduce serum cholesterol levels
   HA: the drug reduces serum cholesterol levels

B. An entomologist believes that there is sexual dimorphism in body size of the periodical cicada. One measure of size is the length of the hind tibia. So her hypotheses might be stated:
   H0: there is no sexual dimorphism in the periodical cicada
   HA: there is sexual dimorphism in the periodical cicada

Note that these are informal statements that need to be clarified and made more rigorous.

Step 2) State The Hypotheses In Terms Of The Relevant Population Characteristics (parameters)

EXAMPLES
A. In the study of the effect of the new drug for reducing serum cholesterol levels we had
   H0: the drug does not reduce serum cholesterol levels
   HA: the drug reduces serum cholesterol levels

What variable(s) are being measured?
   X = blood serum cholesterol level
What population characteristic is being modified by the drug?
   If the drug leads to a reduction in blood levels, the population mean should go down. (We assume variability of values doesn’t change.)
What is the value of the characteristic without the drug?
   Without the drug, the target population has a mean blood serum cholesterol level of 250.

So we can write the hypotheses more specifically as
   H0: µ = 250
   HA: µ < 250

B. Sexual dimorphism in the morphology of the periodical cicada
Since the study measures hind tibia length, we should first clarify the hypotheses in words:
   HA: there is a difference in hind tibia lengths between males and females
   H0: there is no difference in hind tibia lengths between the sexes
What population characteristic(s) is (are) different between the 2 sexes?
   The means for hind tibia length in the 2 sexes.
Hence we can write the hypotheses as:
   H0: µfemales = µmales
   HA: µfemales ≠ µmales

C.
The scientist studying the proportion of fiddler crabs with dominant left pincers hypothesized that isolation on the island led to a proportion of left-pincered crabs larger than the typical 10%. Her hypotheses in words are:
   H0: 10% of the population of fiddler crabs on the island are left-pincered
   HA: more than 10% of the population of fiddler crabs on the island are left-pincered
and in symbols are
   H0: π = 0.10
   HA: π > 0.10

Important Point: Hypotheses must be carefully structured since, as was implied earlier, statistical tests can only disprove the null hypothesis; they CANNOT prove that H0 is true.

Forms Of Hypotheses For Single Populations

The null hypothesis is always stated as
   H0: population parameter = hypothesized value
where the hypothesized value is given by the problem (usually it’s the value being challenged).

The alternative has one of the following forms:
1) 2-sided test alternative
   HA: parameter ≠ hypothesized value
2) 1-sided upper tail alternative
   HA: parameter > hypothesized value
3) 1-sided lower tail alternative
   HA: parameter < hypothesized value

Note that the alternative never includes the hypothesized value.

Having defined your hypotheses, the next step is:

B) Design The Experiment:
a) identify the appropriate statistical test to be used
b) identify type of data to be collected (variables)
c) construct the sampling or experimental design so that it actually provides the data needed for the test

Comment: These three cannot be separated as distinct activities.

Point: most of the tests we’ll learn require random sampling, independent sampling among different populations, and sample sizes sufficiently large so we can argue that our statistics are approximately normally distributed.
Point: if the data are categorical then the test is different from that for continuous data.

Example of Experimental Design: In a study of the effect of temperature on seedling growth, the scientist put 100 plants at one temperature in a greenhouse with its windows painted over and another 100 plants at a different temperature in a greenhouse with no shading. He saw a statistical difference in growth and thus claimed it was due to temperature. Is this good experimental design?

C) Run The Experiment And Collect The Data

D) Perform The Statistical Test And Draw Your Conclusions

Important Fact: When you take a sample from the population of interest (or run an experiment on a subset of the population) you are basing your conclusions about that population on incomplete information. As a result your conclusion could be WRONG (just like CIs)!

   A A A A C C A A C A A A C C A C A A A A

The population has N = 20 elements with a proportion of successes (success = C) of 30%.
   H0: π = 0.25
   HA: π < 0.25
Take a sample of 8 elements (with replacement) and get {A,A,A,A,A,A,A,C}. Would your sample lead you to reject H0? If so, you made the wrong conclusion, because the true proportion (0.30) is not below 0.25.

There are two possible types of errors that can occur whenever you use a sample to test a hypothesis about a population:
• TYPE I error – reject the null hypothesis when it is, in fact, true for the population
• TYPE II error – fail to reject the null hypothesis when, in fact, the alternative hypothesis is true for the population

EXAMPLE
A jury trial is very similar to a hypothesis test. A person is assumed to be not guilty until it is proven otherwise.
So,

                                     True Situation
   Jury Decision                     Not Guilty (H0 is true)    Guilty (HA is true)
   Not Guilty (do not reject H0)     correct                    Type II error
   Guilty (reject H0)                Type I error               correct

True Situation ≡ True Value of Population Parameter
Jury Decision ≡ The Outcome of a Hypothesis Test

In a statistical test, one of the types of error (but never both) can be controlled in a somewhat limited fashion.

E.g. Marg claims she has ESP. To test the claim we get a set of 3 cards with symbols on them (a circle, a square and a triangle). Pick a card at random and look at it (don’t show Marg!). Ask Marg to say which card you are looking at. Repeat 25 times. Record the number of times she was correct. This is a Binomial experiment with 25 trials.

Now, if Marg does not have ESP then the probability of a success on any one trial is 1/3, but if she does have ESP, her success rate would be higher.
   H0: Marg does not have ESP (π = 0.33)
   HA: Marg does have ESP (π > 0.33)

If the null hypothesis is true, then the distribution of p, the sample proportion, is approximately Normal with mean 0.33 and s.d. √(π(1 − π)/n) = √(0.33 × 0.67/25) ≈ 0.094.

We could decide that we will reject the null hypothesis only if the experimental data overwhelmingly support the alternative. For example, we may state that our rejection rule is that Marg must demonstrate a success proportion greater than 0.33 + 2(0.094) ≈ 0.52.

Statistical testing procedures have been designed mostly to control this type of risk, the risk of a type I error. That is, the scientist specifies how much of a risk of a type I error he/she is willing to take.

Defn: The PROBABILITY OF COMMITTING A TYPE I ERROR is denoted with the Greek letter α (alpha) and is called the SIGNIFICANCE LEVEL of the test.

The PROBABILITY OF COMMITTING A TYPE II ERROR is denoted with the Greek letter β (beta).
The value (1 − β) is called the POWER OF THE TEST.

INTERPRETATION: The significance level can be thought of as follows: if the null hypothesis were true and we could repeat the experiment ad nauseam, performing the test each time, then 100α% of the tests would lead us to falsely reject the null hypothesis.

The only way to make sure it is virtually impossible to commit a type I error is to either
1) census the entire population. Then you’ll know for sure what the true value of the parameter is.
or
2) do your test controlling for as small a Type I error as you can (e.g. use α = 0.01 or even α = 0.001).

By doing (2), you will only reject the null hypothesis when the sample data overwhelmingly support the alternative hypothesis. Does this increase your chances of a Type II error though? Yes, since now you might miss data that are supportive of HA. Your choice of a really small significance level states that you will only accept overwhelming evidence (not reasonable evidence) to reject the null hypothesis.

Important Point: Choose the largest α tolerable for the problem and use that in testing.
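
To see the trade-off between α and β in numbers, here is a minimal Python sketch based on the ESP example (25 guesses, π = 1/3 under H0). It uses exact Binomial probabilities rather than the Normal approximation used in the notes. The function names, the stricter 3-s.d. rule used for comparison, and the alternative success rate of 0.5 are all assumptions chosen only for illustration; none of them come from the original example.

```python
from math import comb, floor, sqrt

N_TRIALS = 25      # number of guesses in the ESP experiment
P_NULL = 1 / 3     # success probability if Marg is only guessing (H0)


def binom_upper_tail(n, k_min, p):
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def rejection_count(num_sd):
    """Smallest number of correct guesses whose proportion exceeds P_NULL + num_sd * s.d."""
    sd = sqrt(P_NULL * (1 - P_NULL) / N_TRIALS)  # s.d. of the sample proportion under H0
    cutoff = P_NULL + num_sd * sd                # reject H0 if sample proportion > cutoff
    return floor(cutoff * N_TRIALS) + 1


def type_i_error(num_sd):
    """alpha: probability of rejecting H0 when H0 is true."""
    return binom_upper_tail(N_TRIALS, rejection_count(num_sd), P_NULL)


def type_ii_error(num_sd, p_alt):
    """beta: probability of failing to reject H0 when the true success rate is p_alt."""
    return 1 - binom_upper_tail(N_TRIALS, rejection_count(num_sd), p_alt)


# Compare the 2-s.d. rule from the notes with a stricter, hypothetical 3-s.d. rule.
# p_alt = 0.5 is an assumed "true" success rate, used only for illustration.
for num_sd in (2, 3):
    k = rejection_count(num_sd)
    alpha = type_i_error(num_sd)
    beta = type_ii_error(num_sd, p_alt=0.5)
    print(f"{num_sd}-s.d. rule: reject if correct >= {k}, "
          f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {1 - beta:.4f}")
```

Running the sketch shows the stricter rule giving a smaller α but a larger β (lower power) at the assumed alternative, which is the trade-off described above.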