Hypothesis testing (The One-Sample Tests)

Today:
• Paired-Samples t-Test
• Independent Samples t-Test

The H0 Hypothesis
The logic of statistical hypothesis testing is based on indirect proof: rejecting or retaining (keeping) the H0 hypothesis based on a test statistic.
H1: µestimated < µ0
H0: µestimated = µ0
H2: µestimated > µ0
We assume NO difference (H0) UNTIL we COLLECT ENOUGH EVIDENCE to conclude that there is a significant DIFFERENCE.

Level of significance and possible errors of the decision
When rejecting H0:
- Possible error: rejecting H0 when it is actually true
- Name of error: Type I error
- Probability = the level of significance (α) (getting statistically significant results when you shouldn't: an erroneous conclusion)
When retaining (keeping) H0:
- Possible error: a false H0 is retained
- Name of error: Type II error
- Probability = usually unknown (β) (failure to discover existing differences: an erroneous conclusion)
By reducing the probability of a Type I error, you raise the probability of a Type II error, all else being equal.
CHOOSE YOUR LEVEL OF SIGNIFICANCE PRIOR TO ANALYSES!

If I choose p < 0.05 as the level of significance for my statistical decision, the probability of a Type I error is 5% or smaller. The confidence for my decision in this case is 100% − 5% = 95%.
If I choose p < 0.01 as the level of significance for my statistical decision, the probability of a Type I error is 1% or smaller.
The p < 0.1 level is called a TENDENCY. P values above 0.05 are NOT SIGNIFICANT!
Most statistical programs (e.g. SPSS) provide the exact level of significance for a given test, e.g. p = 0.003 (significant), p = 0.23 (not significant), p = 0.062 (tendency), p = 0.9 (not significant), etc.

Base your conclusions on relevant statistics, and report:
• the number and most important characteristics of your observational units
• the statistical test you used and its parameters (level of significance, test value, degrees of freedom, hypothesized mean, etc.
…)
• your precise conclusion (answer the research question by justifying / rejecting the research statement, or report no significant difference)

Comparing the Means of DEPENDENT Samples: The Paired-Samples t Test
Chapter 15 (pp. 275-276, 278-285)
• What are “dependent samples”?
• The “direct difference” method
• df (degrees of freedom, revisited)
• Type I and Type II errors

What are “dependent samples”?
The observations from one sample are related in some way to those from another!
• Repeated-measures design: the samples come from the same individuals
• Matched-subjects design: pairs are selected based on a certain criterion
• Any other case in which the observations in the samples are NOT independent (e.g. husband and wife, mother and son, etc.)

E.g. Samples (Dependent or Independent?):
• IQ scores of twins
• Females and males of a school
• Students who sit together in the classroom
• Pulse rate before and after chemotherapy

Case-variable matrix for dependent samples:
Subject   Condition 1   Condition 2   Difference score
1.        5             10            5
2.        6             9             3
3.        5             6             1
4.        9             12            3
• Calculating difference scores for each unit of observation
• Calculating the average difference score for the sample
• Calculating the SD of the difference scores
• t tests work with these values: this is the “direct difference” method!

What to use: z Test, or one of the t Tests?
Testing statistical hypotheses about µ:
• IF σ is known: z Test, $z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
• IF σ is not known: t Test, $t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$
Comparing the means of DEPENDENT samples:
• t Test, $t = \frac{\bar{D}}{s_D / \sqrt{n}}$
E.g.
• Do patients have a higher pulse rate before chemotherapy than after?
• Do mothers in Malaysia have their first pregnancy earlier than the world average, which is 22 (±3) years?
• Is memory better in the morning, or in the evening?
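The direct difference method applied to the four-subject case-variable matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SPSS output: the function name `paired_t` is hypothetical, D is taken as Condition 2 minus Condition 1, and the SD of the difference scores is assumed to be the sample SD (n − 1 denominator).

```python
import math
import statistics

def paired_t(x, y):
    """Return (mean D, SD of D, t, df) for paired samples, with D = Y - X."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [yi - xi for xi, yi in zip(x, y)]   # one difference score per subject
    n = len(d)                              # number of pairs
    mean_d = statistics.mean(d)             # average difference score
    sd_d = statistics.stdev(d)              # sample SD of D (n - 1 denominator, an assumption)
    t = mean_d / (sd_d / math.sqrt(n))      # t = mean(D) / (s_D / sqrt(n))
    return mean_d, sd_d, t, n - 1           # df = number of pairs minus 1

condition_1 = [5, 6, 5, 9]
condition_2 = [10, 9, 6, 12]
mean_d, sd_d, t, df = paired_t(condition_1, condition_2)
print(f"mean D = {mean_d}, s_D = {sd_d:.3f}, t({df}) = {t:.3f}")
```

For these four pairs the difference scores are 5, 3, 1 and 3, so the mean difference is 3 and t comes out at roughly 3.67 with df = 3. The decision then depends on the tabulated critical value for df = 3.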
The “direct difference” method
Comparing the means of DEPENDENT samples, and testing hypotheses for the populations of these samples.
t Test: $t = \frac{\bar{D}}{s_D / \sqrt{n}}$
Is there a difference between the characteristics of
• husbands and wives,
• the IQ of twins,
• memory performance in the morning and in the afternoon, etc.?
D = Y − X is the “difference” variable; mean: $\bar{D}$, standard deviation: $s_D$
Criteria: quantitative, normally distributed variables; the SDs of the samples are similar.
Steps:
• calculating difference scores for each pair: D1, D2, etc.
• calculating the mean and SD of D1, D2, etc.
• calculating the t Test
• determining df (NUMBER of PAIRS minus 1!!!)
• checking the corresponding t0.05 value in the table
Conclusions are made for the couple, the twins, the time of day, etc.

Generalization of hypothesis testing with t-tests
Result of the t-test on the sample:
• t ≤ −t0.05: region of rejection, H1: µ < µ0
• |t| < t0.05: region of retention, H0. In this case we cannot say anything certain about the estimated population mean!
• t ≥ +t0.05: region of rejection, H2: µ > µ0
For each problem, FIND the t0.05 critical value for the appropriate df in the statistical tables of Student's t distribution (df = n − 1).

Do you think it is more difficult to recall words in English that start with vowels or consonants?
• Design an experiment with a repeated-measures design to be performed in class!
• Draft the case-variable matrix to be used!
• Formulate the research hypothesis!
• What will be the null hypothesis?

Is it more difficult to recall words in English starting with vowels or consonants?
Statement: It is easier to recall words in English that start with…!
Experimental design: units of observation, variables…
Research hypothesis: the estimated population mean of recalling English words with consonants will be significantly higher/lower than the estimated population mean of recalling English words with vowels.
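The three-way decision rule of the generalized t-test can be written as a small function. This is a sketch only: the name `decide` and the returned labels are hypothetical, and the critical value is assumed to be looked up by hand in a t table for the appropriate df. The values in the usage line (t = 2.5, t0.05 = 2.131 for df = 15) are the ones quoted for the vowel/consonant example in this lecture.

```python
def decide(t, t_crit):
    """Apply the three-way decision rule to a t value.

    t_crit is the positive tabulated critical value (e.g. t0.05 for the given df).
    """
    if t_crit <= 0:
        raise ValueError("t_crit must be the positive tabulated critical value")
    if t <= -t_crit:
        return "H1: mu < mu0"   # lower region of rejection
    if t >= t_crit:
        return "H2: mu > mu0"   # upper region of rejection
    return "H0 retained"        # |t| < t_crit: region of retention

print(decide(2.5, 2.131))       # prints "H2: mu > mu0"
```

Keeping the table lookup outside the function mirrors the procedure on the slides: the test value is calculated first, the critical value is found for df = n − 1, and only then is the decision made.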
Statistical hypothesis, e.g.: H1: µconsonants > µvowels
Experiment:
• Half of the group: please recall as many words as you can in English starting with consonants; you have 1 minute…
• Half of the group: please recall as many words as you can in English starting with vowels; you have 1 minute…
• Switch tasks…
Calculate the difference score for your raw data!

The case-variable data matrix, demonstrating what SPSS will calculate…
StatisticsLecturesTopic06demoVowelConsonantExample.xls

Subject   vowels   consonants   difference
1         8        17           9
2         11       18           7
3         6        7            1
4         9        12           3
5         7        8            1
6         8        6            -2
7         10       5            -5
8         12       8            -4
9         7        9            2
10        8        10           2
11        9        7            -2
12        6        12           6
13        6        10           4
14        9        16           7
15        13       19           6
16        10       17           7
mean      8.7      11.3         2.625 (mean of difference scores)
StDev     2.1      4.7          4.3 (StDev of difference scores)

t Test: $t = \frac{\bar{D}}{s_D / \sqrt{n}} = 2.5$
D = Y − X is the “difference” variable; mean: $\bar{D}$, standard deviation: $s_D$

The decision is based on generalized hypothesis testing. Conclude from the statistical test results below:
t = 2.5, df = 15, t0.05 = 2.131
StatisticalTablesWithCriticalValuesInExcel.xls

Average number of English words recalled in one minute: starting with vowels = 8.7; starting with consonants = 11.3.
Foreign language students recall English words starting with consonants more easily, based on the performance of 16 students: they recalled significantly more words with consonants (11.3 ± 4.7) than words with vowels (8.7 ± 2.1), based on the dependent-samples t-test: t(15) = 2.5 (p < 0.05).

Testing the H0 Hypothesis
Random sample
Test statistic, a mathematical rule for decision:
• Calculate the statistical value based on the formula of the appropriate statistical test
• Determine the critical values
• Test whether the calculated statistical value falls in the region of retention (keeping H0) or in the region of rejection (accepting H1 or H2)
Level of Confidence (e.g.
95%)
Critical value (−) and critical value (+) separate the region of retention (H0) from the two regions of rejection (H1 and H2):
H1: µNew Jersey < µUSA
H0: µNew Jersey = µUSA
H2: µNew Jersey > µUSA

Level of significance and possible errors of the decision
E.g. We did not report significant differences between the language proficiency of class A and class B. What type of error might we have made? A Type II error!

The one-sample z-test (rarely used)
Testing a statistical hypothesis about µ when the standard deviation of the population (σ) is known:
$z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
• z ≤ −1.96: region of rejection, H1: µ < µ0
• |z| < 1.96: region of retention, H0. In this case we cannot say anything certain about the estimated population mean!
• z ≥ +1.96: region of rejection, H2: µ > µ0
Criteria: quantitative, normally distributed variables; σ is known; µ0 is hypothesized; the SD of the sample is similar to σ.

What if we DO NOT KNOW the standard deviation of the population?!
Testing statistical hypotheses about µ:
• IF σ is known: z Test, $z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
• IF σ is not known: t Test, $t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$
In both tests the mean of the sample is compared to a hypothesized µ0 (e.g.
national standard).
We are testing whether the estimated population mean (based on data from the sample) is significantly higher/lower than µ0.

Student's t-distribution
William Sealy Gosset (published under the pseudonym “Student”)
Used for estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown.
The t-distribution is different for each sample size, and the larger the sample, the more the distribution resembles a normal distribution.
The normal distribution describes the full population, whereas t-distributions describe samples drawn from a full population.
https://onlinecourses.science.psu.edu/stat414/node/175

The one-sample t-test for testing a hypothesis about the population mean
$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$
• t ≤ −t0.05: region of rejection, H1: µ < µ0
• |t| < t0.05: region of retention, H0. In this case we cannot say anything certain about the estimated population mean!
• t ≥ +t0.05: region of rejection, H2: µ > µ0
Criteria: quantitative, normally distributed variables; σ is not known (it is estimated by the sample SD, s); µ0 is hypothesized.

The degrees of freedom
The critical values −t0.05 and +t0.05 of the decision rule above are not fixed: for each problem there is a different critical value, based on df = n − 1, in the statistical tables of Student's t distribution.

E.g. Practice task: Teachers think that the students in their high school have an outstanding IQ, based on their sample: XIQ = (126, 139, 89, 106). What do you think?
• Students show “well above” average IQ based on a sample of four: 115 points; the standard error of the mean is 11 points.
IS THE AVERAGE IQ SIGNIFICANTLY HIGHER THAN 100 IN THIS SCHOOL?
E.g.
SPSS analysis:
Conclusion for the practice task (XIQ = (126, 139, 89, 106), hypothesized mean 100, df = n − 1 = 3):
• Students seem to have a “well above” average IQ based on a sample of four (115 points); however, based on the one-sample t-test they are NOT significantly different from the 100-point average: t(3) = 1.363 (p > 0.1).
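The reported numbers for the practice task can be checked by hand with a short Python sketch. It is an illustration rather than the SPSS procedure itself: the function name `one_sample_t` is hypothetical, and the sample SD is assumed to use the n − 1 denominator.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (mean, standard error, t, df) for a one-sample t-test against mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # s / sqrt(n), with s the sample SD
    t = (mean - mu0) / se                          # t = (mean - mu0) / (s / sqrt(n))
    return mean, se, t, n - 1                      # df = n - 1

iq = [126, 139, 89, 106]
mean, se, t, df = one_sample_t(iq, 100)
print(f"mean = {mean}, SE = {se:.2f}, t({df}) = {t:.3f}")
```

This reproduces the mean of 115, the standard error of about 11 and t(3) of about 1.363. Since |t| is well below the tabulated two-tailed t0.05 for df = 3 (3.182), H0 is retained, in line with the conclusion above.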