13 Nonparametric Methods

Introduction

So far, the underlying probability distribution functions (pdf) have been assumed to be known, such as the standard normal distribution (SND), the t distribution, or the chi-squared distribution.
Parametric techniques: based on parameters of the assumed distribution (mean, s.d.).
Nonparametric techniques: make few assumptions about the nature of the underlying pdf, and do not require the pdf to be a SND.

13.1 The sign test

Investigate the amount of energy expended at rest by patients with the congenital disease cystic fibrosis (CF) and by healthy individuals matched to the patients (on characteristics such as age, sex, height, and weight).

Table 13.1 Resting energy expenditure (kcal/day) for patients with CF and healthy persons

Pair    CF      Healthy   Difference   Sign
 1      1153     996        157         +
 2      1132    1080         52         +
 3      1165    1182        -17         -
 4      1460    1452          8         +
 5      1634    1162        472         +
 6      1493    1619       -126         -
 7      1358    1140        218         +
 8      1453    1123        330         +
 9      1185    1113         72         +
10      1824    1463        361         +
11      1793    1632        161         +
12      1930    1614        316         +
13      2075    1836        239         +

Totals: 2 - signs, 11 + signs

Compare the resting energy expenditure (REE) for persons with CF and for healthy individuals, when we are not comfortable assuming that REE, or the differences between the paired measurements, follow a normal distribution.

H0: the median difference is 0
H1: the median difference is not 0

It is a two-sided test.

For each pair, the difference (REE)CF - (REE)healthy is > 0, < 0, or = 0, which is recorded as a + sign, a - sign, or no information (excluded from the analysis).

Under the null hypothesis, we would expect approximately equal numbers of + and - signs. That is, the probability that a particular difference is + is 1/2, and the probability that it is - is 1/2. Each sign is a Bernoulli random variable with probability of success p = 0.5.

Let D = the total number of + signs.

The mean number of + signs in a sample of size n is $np = n/2$, and the s.d. is $\sqrt{np(1-p)} = \sqrt{n/4}$.

If D is either much larger or much smaller than n/2, we would want to reject H0.

Evaluate the null hypothesis by considering the test statistic

$$ z_+ = \frac{D - n/2}{\sqrt{n/4}} $$

If the sample size is large, $z_+$ follows an approximate normal distribution with mean 0 and s.d. 1. This test is called the sign test.
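The steps above (take the sign of each paired difference, count the + signs, standardize the count) can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the original notes: the function names `sign_test_z` and `two_sided_normal_p` are hypothetical, the data are the CF and healthy columns of Table 13.1, and the two-sided p-value is taken from the standard normal curve, which assumes the large-sample approximation is acceptable.

```python
import math

# Paired REE values (kcal/day) from Table 13.1, as (CF, healthy).
pairs = [
    (1153, 996), (1132, 1080), (1165, 1182), (1460, 1452), (1634, 1162),
    (1493, 1619), (1358, 1140), (1453, 1123), (1185, 1113), (1824, 1463),
    (1793, 1632), (1930, 1614), (2075, 1836),
]

def sign_test_z(pairs):
    """Return (D, n, z) for the large-sample sign test.

    D is the number of + signs, n the number of nonzero differences,
    and z = (D - n/2) / sqrt(n/4). Hypothetical helper for illustration.
    """
    diffs = [cf - healthy for cf, healthy in pairs]
    nonzero = [d for d in diffs if d != 0]  # zero differences carry no information
    n = len(nonzero)
    D = sum(1 for d in nonzero if d > 0)
    z = (D - n / 2) / math.sqrt(n / 4)
    return D, n, z

def two_sided_normal_p(z):
    """Two-sided tail area under the standard normal curve: 2 * P(Z > |z|)."""
    return math.erfc(abs(z) / math.sqrt(2))

D, n, z = sign_test_z(pairs)
print(f"D = {D}, n = {n}, z = {z:.2f}, two-sided p = {two_sided_normal_p(z):.4f}")
```

Because the code uses the unrounded value of z, its p-value can differ in the third decimal place from one read off a printed normal table with z rounded to two decimals.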
For the data in Table 13.1, D = 11 and n = 13, so

$$ z_+ = \frac{11 - 6.5}{\sqrt{13/4}} = 2.50 $$

The area to the right of 2.50 plus the area to the left of -2.50 is p = 2(0.006) = 0.012 < 0.05.
Reject the null hypothesis: the median difference among pairs is not equal to 0.
REE is higher among persons with CF.

If the sample size is small, less than about 20, the test statistic cannot be assumed to follow a SND. In that case, we use the binomial distribution to calculate the probability of observing D positive differences.

$$ P(D \ge 11) = P(D = 11) + P(D = 12) + P(D = 13) $$
$$ = \binom{13}{11}(0.5)^{11}(0.5)^{2} + \binom{13}{12}(0.5)^{13} + \binom{13}{13}(0.5)^{13} $$
$$ = 0.0095 + 0.0016 + 0.0001 = 0.0112 $$

Since 0.0112 < 0.05, we would reject the null hypothesis at the 5% level. This is the one-sided tail probability; doubling it for the two-sided test gives approximately 0.022, which is also below 0.05.

13.2 The Wilcoxon Signed-Rank Test

Takes into account the magnitude of the pair differences, not only their signs.
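Returning briefly to the small-sample sign test of Section 13.1, the exact binomial tail probability can be checked numerically. The following is a minimal Python sketch and not part of the original notes: the helper name `upper_tail` is hypothetical, and doubling the one-sided tail to get a two-sided value is an assumed convention (it is natural here because the binomial distribution with p = 0.5 is symmetric).

```python
from math import comb

def upper_tail(n, d, p=0.5):
    """P(D >= d) for D ~ Binomial(n, p). Hypothetical helper for illustration."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(d, n + 1))

n, d = 13, 11  # 13 pairs, 11 + signs

# Individual terms P(D = 11), P(D = 12), P(D = 13) under p = 0.5.
terms = [comb(n, k) * 0.5**n for k in range(d, n + 1)]

one_sided = upper_tail(n, d)
two_sided = min(1.0, 2 * one_sided)  # assumed convention: double the tail, cap at 1

print([round(t, 4) for t in terms])  # [0.0095, 0.0016, 0.0001]
print(round(one_sided, 4))           # 0.0112
print(round(two_sided, 4))           # 0.0225
```

The three terms are 78/8192, 13/8192, and 1/8192, so the tail probability is exactly 92/8192.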