Biostatistics, statistical software VII.
Non-parametric tests: Wilcoxon's signed rank test, Mann-Whitney U-test, Kruskal-Wallis test, Spearman's rank correlation.
Krisztina Boda PhD
Department of Medical Informatics, University of Szeged
INTERREG

Parametric tests

Parameter: a number that characterizes an aspect of a population (such as the mean of some variable for the population) or the shape of a theoretical distribution. Usually, population parameters cannot be known exactly; in many cases we make assumptions about them.

Parameters of the normal distribution: μ, σ
Parameters of the binomial distribution: n, p
Parameter of the Poisson distribution: λ

Normal distributions N(μ, σ)

[Figure: probability density functions y=normal(x;μ;σ) and probability distribution functions p=inormal(x;μ;σ) of N(0,1), N(1,1) and N(0,2), plotted for x from -3 to 3.]

μ, σ: parameters (a parameter is a number that describes the distribution)

Binomial distributions

1. Each trial results in one of two possible, mutually exclusive outcomes (success, failure).
2. The probability of a success, p, remains constant from trial to trial.
3. The trials are independent.

We are interested in being able to compute the probability of k successes in n trials. The binomial distribution is useful for describing distributions of binomial events, such as the number of males and females in a random sample of companies, or the number of defective components in samples of 20 units taken from a production process.
The binomial distribution is defined as:

$$P_k = P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, \ldots, n, \qquad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}, \quad n! = 1 \cdot 2 \cdots n$$

p is the probability that the respective event will occur
q is equal to 1-p
n is the number of independent trials

Example

Suppose that it is known that 30% of a certain population are immune to some disease. If a random sample of size n=10 is selected from this population, what is the probability that it will contain exactly k=4 immune persons?

$$P(X = 4) = \binom{10}{4}\, 0.3^4 \cdot 0.7^6 = \frac{10!}{4!\,6!}\, 0.3^4 \cdot 0.7^6 = 210 \cdot 0.0081 \cdot 0.117649 = 0.200121$$

Binomial distribution, n=10, probability of "success" p=0.3

Number of successes   Probability distribution   Distribution function
0                     0.028247525                0.028247525
1                     0.121060821                0.149308346
2                     0.233474441                0.382782786
3                     0.266827932                0.649610718
4                     0.200120949                0.849731667
5                     0.102919345                0.952651013
6                     0.036756909                0.989407922
7                     0.009001692                0.998409614
8                     0.001446701                0.999856314
9                     0.000137781                0.999994095
10                    5.9049E-06                 1
Total                 1

[Figure: bar charts of the probability distribution and the distribution function of the binomial distribution with n=10; p can be changed, k=0,1,…,10.]

Poisson distribution

The Poisson distribution is also sometimes referred to as the distribution of rare events. Examples of Poisson distributed variables are the number of accidents per person, the number of sweepstakes won per person, or the number of catastrophic defects found in a production process.

If n tends to infinity while np=λ is kept constant, the binomial distribution approaches a fixed distribution:

$$\lim_{n \to \infty} P_k = \lim_{n \to \infty} \binom{n}{k} p^k q^{n-k} = f(k) = \frac{\lambda^k}{k!}\, e^{-\lambda}$$

Example. For a certain disease, the average number of new occurrences in a month is 3.
Assuming that the number of new occurrences follows a Poisson distribution, what is the probability that
- nobody becomes ill? (0.0498)
- there are exactly 2 new occurrences? (0.224)

Average number of events: λ = 3

Number of events   Probability    Distribution function
0                  0.049787068    0.049787068
1                  0.149361205    0.199148273
2                  0.224041808    0.423190081
3                  0.224041808    0.647231889
4                  0.168031356    0.815263245
5                  0.100818813    0.916082058
6                  0.050409407    0.966491465
7                  0.021604031    0.988095496
8                  0.008101512    0.996197008
9                  0.002700504    0.998897512
10                 0.000810151    0.999707663
Total              0.999707663

[Figure: bar charts of the probability and the distribution function of the Poisson distribution with λ=3, for k=0,1,…,10.]

Parametric tests

The null hypothesis contains a parameter of a distribution. The tests assume that the samples are drawn from a normally distributed population.
- One sample t-test: H0: μ=c
- Two sample t-test: H0: μ1=μ2, assumption: σ1=σ2

Nonparametric tests

We do not need to make specific assumptions about the distribution of the data. They can be used when
- the distribution is not normal
- the shape of the distribution is not evident
- data are measured on an ordinal scale (low, normal, high; passed, acceptable, good, very good)

Ranking data

Nonparametric tests do not use estimates of population parameters; they use ranks instead. Each original sample value is replaced by its rank.

To show the ranking procedure, suppose we have the following sample of measurements: 199, 126, 81, 68, 112, 112.
Sort the data in ascending order: 68, 81, 112, 112, 126, 199
Give ranks from 1 to n: 1, 2, 3, 4, 5, 6
Cases 5 and 6 are equal, so each is assigned the rank 3.5, the average of ranks 3 and 4. We say that cases 5 and 6 are tied.
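The ranking steps above (sort, number from 1 to n, average the ranks of tied values) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `average_ranks` is an assumption of this example, not something taken from the slides or from any statistical package.

```python
def average_ranks(values):
    """Return the ranks 1..n in the original case order; tied values share their mean rank."""
    # Indices of the cases, sorted by ascending data value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values equal to the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Sorted positions pos..end (0-based) correspond to ranks pos+1..end+1.
        mean_rank = (pos + 1 + end + 1) / 2
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


sample = [199, 126, 81, 68, 112, 112]
ranks = average_ranks(sample)
print(ranks)       # -> [6.0, 5.0, 2.0, 1.0, 3.5, 3.5]
print(sum(ranks))  # -> 21.0
```

The ranks are returned in the order of the original cases, so case 4 (value 68) gets rank 1 and the two tied cases each get 3.5.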
Ranks corrected for ties: 1, 2, 3.5, 3.5, 5, 6

Result of ranking data

Case   Data   Rank   Rank corrected for ties
4      68     1      1
3      81     2      2
5      112    3      3.5
6      112    4      3.5
2      126    5      5
1      199    6      6

The sum of all ranks must be

$$\sum_{i=1}^{n} r_i = \frac{n(n+1)}{2}$$

Using this formula we can check our computations. Here the sum of ranks is 21, and 6(7)/2 = 21.

Nonparametric tests for paired data (nonparametric alternatives of the paired t-test)

- Sign test
- Wilcoxon's matched pairs test
Null hypothesis: the paired samples are drawn from the same population.

The sign test

Example: the reading speed and comprehension of 13 students were measured at the end of a course and again 1 month later. Suppose we have reason to believe that the two distributions of reading scores are not normal.

Student   Score at course ending   Score after 1 month   Difference   Sign
1         50                       52                    -2           -
2         48                       51                    -3           -
3         46                       46                    0
4         50                       49                    1            +
5         62                       50                    2            +
6         80                       70                    10           +
7         23                       21                    2            +
8         30                       33                    -3           -
9         45                       46                    -1           -
10        53                       53                    0
11        49                       48                    1            +
12        51                       48                    3            +
13        46                       48                    -2           -

Number of positive signs: 6
Number of negative signs: 5
Cases with no change are omitted.

Table of the sign test

The table contains the acceptance region for a given sample size and α.

Decision based on table

If the distributions of the two variables are the same (if the null hypothesis is true), the numbers of positive and negative differences should be similar. The null hypothesis is accepted if both numbers lie in the interval given in the table for the sign test.
Number of positive signs: 6
Number of negative signs: 5
For n=11 and α=0.05, this interval is 1-10. As both 5 and 6 lie in the interval 1-10, we accept the null hypothesis at the 5% level.

The Wilcoxon signed rank test

Example: the reading speed and comprehension of 13 students were measured at the end of a course and again 1 month later. Suppose we have reason to believe that the two distributions of reading scores are not normal.
Student   Score at course ending   Score after 1 month   Difference   Rank ignoring signs
1         50                       52                    -2           5.5
2         48                       51                    -3           9
3         46                       46                    0
4         50                       49                    1            2
5         62                       50                    2            5.5
6         80                       70                    10           11
7         23                       21                    2            5.5
8         30                       33                    -3           9
9         45                       46                    -1           2
10        53                       53                    0
11        49                       48                    1            2
12        51                       48                    3            9
13        46                       48                    -2           5.5

Sum of ranks belonging to positive signs: R+ = 35
Sum of ranks belonging to negative signs: R- = 31
Cases with no change are omitted.

Table of the Wilcoxon signed rank test

The table contains the acceptance region for a given sample size and α.

Decision based on table

If the distributions of the two variables are the same (if the null hypothesis is true), the sums of the positive and negative ranks should be similar. The null hypothesis is accepted if both sums lie in the interval given in the table for the test.
Sum of ranks belonging to positive signs: R+ = 35
Sum of ranks belonging to negative signs: R- = 31
For n=11 and α=0.05, this interval is 10-56. As both rank sums are in this interval, we do not reject the null hypothesis and claim that the difference is not significant at the 5% level.

The case of large samples

When the sample size is large, we can compute the mean and standard deviation of the rank sum under the null hypothesis and use the normal distribution to get the p-value. Computer packages use this normal approximation also in the case of small sample sizes.

$$z = \frac{R_+ - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}} \sim N(0,1)$$

Here R+ is the sum of ranks belonging to positive signs and n is the number of cases with a nonzero difference.

Nonparametric test for data in independent groups (nonparametric alternative of the two sample t-test)

Mann-Whitney U test
Null hypothesis: the samples are drawn from the same population.

Hypothetical example

The changes in body weight are compared in two groups: patients on a special diet and control patients.
Null hypothesis: the diet is not effective; the data are drawn from the same population.
The original data are ranked and the sum of ranks in each group is computed.
If the null hypothesis is true, the sums of ranks in the two groups are similar.

Patient   Change in body weight (kg)   Group     Rank   Rank corrected for ties
1.        -1                           Diet      3      3
2.        5                            Diet      16     16.5
3.        3                            Diet      12     13
4.        10                           Diet      21     21
5.        6                            Diet      18     19
6.        4                            Diet      15     15
7.        0                            Diet      4      5.5
8.        1                            Diet      8      9
9.        6                            Diet      19     19
10.       6                            Diet      20     19
Sum of ranks, R1                                        140
11.       2                            Control   11     11
12.       0                            Control   5      5.5
13.       1                            Control   9      9
14.       0                            Control   6      5.5
15.       3                            Control   13     13
16.       1                            Control   10     9
17.       5                            Control   17     16.5
18.       0                            Control   7      5.5
19.       -2                           Control   1      1.5
20.       -2                           Control   2      1.5
21.       3                            Control   14     13
Sum of ranks, R2                                        91

Table of the Mann-Whitney U test

[Table: acceptance region of the test statistic for given sample sizes and α.]

Decision based on table

If the distributions of the two variables are the same (if the null hypothesis is true), the sums of ranks in the two groups should be similar. The test statistic T is the sum of the ranks in the smaller group. The null hypothesis is accepted if T lies in the interval given in the table for the test.
Sum of ranks in the first group (n=10): R1 = 140
Sum of ranks in the second group (n=11): R2 = 91
The smaller group is the first one, so T = 140.
For n1=10, n2=11 and α=0.05, this interval is 81-139. As T lies outside this interval, we reject the null hypothesis and claim that the difference is significant at the 5% level.

An alternative test statistic

The statistic U (due to Mann and Whitney) is the number of all possible pairs of observations comprising one from each sample, say x_i and y_j, for which x_i < y_j. Thus, if the sample sizes are n1 and n2, then U/(n1 n2) is the proportion of all such pairs, and so it is also the estimated probability that a new observation from the first population will be less than a new observation sampled from the second population.
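The rank sums of the hypothetical diet example and the pair-counting definition of U can be checked with a short sketch. The data are the 21 weight changes from the table; the helper name `average_ranks` is an assumption of this example, and counting a tied pair (x equal to y) as one half is a common convention that the slides do not state explicitly.

```python
def average_ranks(values):
    """Rank each value within 'values'; tied values receive the mean of their ranks."""
    ranks = []
    for v in values:
        less = sum(1 for w in values if w < v)
        equal = sum(1 for w in values if w == v)
        # The 'equal' tied values occupy ranks less+1 .. less+equal; take their mean.
        ranks.append(less + (equal + 1) / 2)
    return ranks


diet = [-1, 5, 3, 10, 6, 4, 0, 1, 6, 6]
control = [2, 0, 1, 0, 3, 1, 5, 0, -2, -2, 3]

# Rank the pooled data, then sum the ranks separately for each group.
ranks = average_ranks(diet + control)
r1 = sum(ranks[:len(diet)])
r2 = sum(ranks[len(diet):])
print(r1, r2)  # -> 140.0 91.0

# U: number of (diet, control) pairs with x < y; ties count 1/2 (assumed convention).
u = sum(1.0 if x < y else 0.5 if x == y else 0.0 for x in diet for y in control)
print(u)  # -> 25.0

# Estimated probability that a diet observation is smaller than a control observation.
print(u / (len(diet) * len(control)))
```

With T = R1 = 140 the test statistic falls just outside the tabulated interval 81-139, in agreement with the decision above.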
$$U = n_1 n_2 + \frac{1}{2}\, n_1 (n_1 + 1) - T$$

The case of large samples

When the sample size is large, the test statistic T has an approximately normal distribution, and we can calculate the test statistic z according to the following formula (n_s and n_L are the sample sizes in the smaller and larger group, respectively, and R_s = T is the sum of ranks in the smaller group):

$$z = \frac{R_s - n_s (n_s + n_L + 1)/2}{\sqrt{n_s n_L (n_s + n_L + 1)/12}} \sim N(0,1)$$

Computer packages use this normal approximation also in the case of small sample sizes.

Comparing several independent groups: the Kruskal-Wallis test

It is also called nonparametric one-way ANOVA.
It tests whether k independent samples that are defined by a grouping variable are from the same population.
This test assumes that there is no a priori ordering of the k populations from which the samples are drawn.
As a result, it gives one p-value. If the null hypothesis is rejected, further tests are required to make pairwise comparisons. These pairwise comparisons are generally not available in standard statistical packages.
Pairwise comparisons can be performed by Mann-Whitney U tests, and the p-values can be corrected by Bonferroni correction.

Comparison of several related samples: the Friedman test

The Friedman test is the nonparametric equivalent of a one-sample repeated measures design or a two-way analysis of variance with one observation per cell.
The Friedman test tests the null hypothesis that k related variables come from the same population. For each case, the k variables are ranked from 1 to k. The test statistic is based on these ranks.
As a result, it gives one p-value. If the null hypothesis is rejected, further tests are required to make pairwise comparisons. These pairwise comparisons are generally not available in standard statistical packages.
Pairwise comparisons can be performed by Wilcoxon signed rank tests, and the p-values can be corrected by Bonferroni correction.
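The Bonferroni correction mentioned for the pairwise comparisons can be sketched as follows. The group labels and the unadjusted p-values below are hypothetical numbers chosen only for illustration, not results from the lecture; in practice they would come from the pairwise Mann-Whitney or Wilcoxon tests. The function name `bonferroni` is likewise an assumption of this example.

```python
from itertools import combinations


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical example: k = 3 groups give k(k-1)/2 = 3 pairwise comparisons.
groups = ["A", "B", "C"]
pairs = list(combinations(groups, 2))

# Hypothetical unadjusted p-values, one per pair (illustrative only).
raw = [0.010, 0.020, 0.400]
adjusted = bonferroni(raw)

for pair, p_raw, p_adj in zip(pairs, raw, adjusted):
    print(pair, p_raw, round(p_adj, 3))
```

A comparison is then declared significant at the 5% level only if its adjusted p-value is below 0.05, which is equivalent to comparing the unadjusted p-value with 0.05 divided by the number of comparisons.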
Review questions and exercises

Problems to be solved by hand calculations: ..\Handouts\Problems hand VII.doc
Solutions: ..\Handouts\Problems hand VII solutions.doc
Problems to be solved using computer: none

Useful WEB pages

http://www-stat.stanford.edu/~naras/jsm
http://www.ruf.rice.edu/~lane/rvls.html
http://my.execpc.com/~helberg/statistics.html
http://www.math.csusb.edu/faculty/stanton/m262/index.html