NONPARAMETRIC STATISTICS

In general, a statistical technique is categorized as nonparametric statistics (NPS) if it has at least one of the following characteristics:

1. The method is used on nominal data.
2. The method is used on ordinal data.
3. The method is used on interval-scale or ratio-scale data, but no assumption is made about the probability distribution of the population from which the sample is selected.

The following procedures are covered:

- Sign Test
- Mann-Whitney Test
- Spearman's Rank Correlation Test

Nonparametric versus Parametric Statistics

Parametric statistical procedures are inferential procedures that rely on testing claims about parameters such as $\mu$, $\sigma$ and $p$. In some circumstances, parametric procedures require certain assumptions to be satisfied, such as normality of the population distribution.

Nonparametric statistical procedures are inferential procedures that are not based on parameters and require fewer assumptions to be satisfied before a test can be performed. They do not require the population to follow a specific type of distribution, which is why they are also known as distribution-free procedures.

Inferences:

- Parametric procedures make inferences regarding the mean.
- Nonparametric procedures make inferences regarding the median.

Advantages of Nonparametric Procedures

- Most of the tests require very few assumptions.
- The computation is fairly easy.
- The procedures can be used for count data or rank data.

Disadvantages of Nonparametric Procedures

- Because fewer requirements must be satisfied to conduct these tests, researchers sometimes use them when parametric procedures could be used.
- They are less efficient than parametric procedures.
- They require a larger sample size than the corresponding parametric procedure.

Sign Test

The sign test is used to test the null hypothesis that two groups (the observations above and the observations below a hypothesized median) are equally sized. In other words, it is a test of the population proportion $p = 0.5$ in a small sample (usually $n \le 20$). It is based on the direction of the observations (their + and - signs) and not on their numerical magnitude.
It is also called the binomial sign test with null proportion 0.5, because it uses the binomial distribution as the decision rule. A binomial experiment consists of $n$ identical trials with probability of success $p$ in each trial. The probability of $x$ successes in $n$ trials is given by

$$P(X = x) = {}^{n}C_{x}\, p^{x} q^{n-x}, \qquad x = 0, 1, 2, \ldots, n,$$

where ${}^{n}C_{x} = \dfrac{n!}{(n-x)!\,x!}$ and $q = 1 - p$. For the sign test, $X \sim B(n, p)$ with $p = 0.5$.

There are two types of sign test:

1. One sample sign test
2. Paired sample sign test

Hypothesis Testing:

           Two-Tailed       Left-Tailed     Right-Tailed
  $H_0$    $M = M_0$        $M = M_0$       $M = M_0$
  $H_1$    $M \ne M_0$      $M < M_0$       $M > M_0$

One Sample Sign Test

Procedure:

1. Put a + sign for a value greater than the median value, a - sign for a value less than the median value, and a 0 for a value equal to the median value.
2. Calculate:
   i. The number of + signs, denoted by $x$.
   ii. The sample size, denoted by $n$ (discard the data with value 0).
3. Run the test:
   i. State the null and alternative hypotheses.
   ii. Determine the level of significance, $\alpha$.
   iii. Reject $H_0$ if p-value $< \alpha$.
   iv. Determine the p-value for the test for $n$, $x$ and $p = 0.5$ from the binomial probability table, based on the type of test being conducted:

      Type of test    Sign in $H_0$   Sign in $H_1$   p-value
      Two-tailed      $=$             $\ne$           $2P(X \le x)$ if $x < n/2$; $2P(X \ge x)$ if $x > n/2$
      Right-tailed    $=$             $>$             $P(X \ge x) = 1 - P(X \le x - 1)$
      Left-tailed     $=$             $<$             $P(X \le x)$

   v. Make a decision.

Example 5.1:
The following data constitute a random sample of 15 measurements of the octane rating of a certain kind of gasoline:

  99.0  102.3  99.8  100.5  99.7  96.2  99.1  102.5  103.3  97.4  100.4  98.9  98.3  98.0  101.6

Test the null hypothesis $m = 98.0$ against the alternative hypothesis $m > 98.0$ at the 0.01 level of significance.

Solution:
With $m_0 = 98.0$, the signs are:

  Value   99.0  102.3  99.8  100.5  99.7  96.2  99.1  102.5  103.3  97.4  100.4
  Sign     +     +      +     +      +     -     +     +      +      -     +

  Value   98.9  98.3  98.0  101.6
  Sign     +     +     0     +

Number of + signs, $x = 12$
Sample size, $n = 14$ (15 - 1, after discarding the value equal to 98.0)
$p = 0.5$

1. $H_0: m = 98.0$
   $H_1: m > 98.0$
2. $\alpha = 0.01$. Reject $H_0$ if p-value $< 0.01$.
3. From the binomial probability table for $x = 12$, $n = 14$ and $p = 0.5$, with $X \sim b(14, 0.5)$:
   p-value $= P(X \ge 12) = 1 - P(X \le 11) = 1 - 0.9935 = 0.0065$
4. Since p-value $= 0.0065 < 0.01$, we reject $H_0$ and conclude that the median octane rating of the given kind of gasoline exceeds 98.0.

Example 5.2:
A recent article in the school newspaper reported that the typical credit card debt of a student is $500. A researcher claims that the median credit card debt of students is different from $500. To test this claim, he obtains a random sample of 20 students enrolled at the college and asks them to disclose their credit card debt. The results are given below. Test at the 5% significance level.

  6000  1060  250  200  0  0  580  400  200  1200
  1000  800   0    200  0  700  400  250  0  1000

Solution:

  Value   6000  1060  250  200  0  0  580  400  200  1200
  Sign     +     +     -    -   -  -   +    -    -    +

  Value   1000  800  0  200  0  700  400  250  0  1000
  Sign     +     +   -   -   -   +    -    -   -   +

1. $H_0: m = 500$
   $H_1: m \ne 500$
2. $\alpha = 0.05$. Reject $H_0$ if p-value $< 0.05$.
3. From the binomial probability table for $x = 8$, $n = 20$ and $p = 0.5$: since $x = 8 < n/2 = 20/2 = 10$, with $X \sim b(20, 0.5)$,
   p-value $= 2P(X \le 8) = 2(0.2517) = 0.5034$
4. Since p-value $= 0.5034 > 0.05$, do not reject $H_0$.
5. There is not sufficient evidence to support the researcher's claim that the median credit card debt of students is different from $500.

Exercise 5.1:
The PQR Company claims that the lifetime of a type of battery that it manufactures is different from 250 hours. A consumer advocate wishing to determine whether the claim is justified measures the lifetime of 24 of the company's batteries. The results are listed below. Determine whether the company's claim is justified at the 5% significance level.

  271  262  252  230  288  198  275  282  225  284
  219  236  291  253  224  264  253  295  216  211

Ans: Do not reject $H_0$

Paired Sample Sign Test

Procedure:

1. Calculate the differences $d_i = x_i - y_i$ and record the sign of each $d_i$.
2. Calculate:
   i. The number of + signs, denoted by $x$.
   ii. The sample size, denoted by $n$ (discard the data with difference 0).
   The probability of success is 0.5 ($p = 0.5$).
3. Run the test:
   i. State the null and alternative hypotheses.
   ii. Determine the level of significance, $\alpha$.
   iii. Reject $H_0$ if p-value $< \alpha$.
   iv. Determine the p-value for the test for $n$, $x$ and $p = 0.5$ from the binomial probability table, based on the type of test being conducted:

      Type of test    Sign in $H_0$   Sign in $H_1$   p-value
      Two-tailed      $=$             $\ne$           $2P(X \le x)$ if $x < n/2$; $2P(X \ge x)$ if $x > n/2$
      Right-tailed    $=$             $>$             $P(X \ge x) = 1 - P(X \le x - 1)$
      Left-tailed     $=$             $<$             $P(X \le x)$

   v. Make a decision.

Example 5.3:
Ten engineering students went on a diet program in an attempt to lose weight, with the following results:

  Name      Weight before   Weight after
  Abu       69              58
  Ah Lek    82              73
  Sami      76              70
  Kassim    89              71
  Chong     93              82
  Raja      79              66
  Busu      72              75
  Wong      68              71
  Ali       83              67
  Tan       103             73

Is the diet program an effective means of losing weight? Do the test at significance level 0.10.

Solution:
Let the + sign indicate (weight before - weight after) > 0 and the - sign indicate (weight before - weight after) < 0. Thus:

  Name      Weight before   Weight after   Sign
  Abu       69              58             +
  Ah Lek    82              73             +
  Sami      76              70             +
  Kassim    89              71             +
  Chong     93              82             +
  Raja      79              66             +
  Busu      72              75             -
  Wong      68              71             -
  Ali       83              67             +
  Tan       103             73             +

1. The + sign indicates that the diet program is effective in reducing weight.
   $H_0: m_1 = m_2$
   $H_1: m_1 > m_2$
2. $\alpha = 0.10$, so we reject $H_0$ if p-value $< 0.10$.
3. Number of + signs, $x = 8$
   Sample size, $n = 10$
   $p = 0.5$
   p-value $= P(X \ge 8) = 1 - P(X \le 7) = 1 - 0.9453 = 0.0547$
4. Since p-value $= 0.0547 < 0.10$, we reject $H_0$ and conclude that there is sufficient evidence that the diet program is an effective means of reducing weight.
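The p-values in Examples 5.1 to 5.3 can also be checked without a binomial table. The following is a minimal sketch in Python using only the standard library. The function names `binom_cdf`, `count_signs` and `sign_test_p_value` are illustrative and do not come from the notes, and the handling of $x = n/2$ in a two-tailed test (p-value set to 1) is an assumption, since the procedure above does not cover that case.

```python
from math import comb


def binom_cdf(k, n, p=0.5):
    """Return P(X <= k) for X ~ B(n, p)."""
    if k < 0:
        return 0.0
    k = min(k, n)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def count_signs(data, m0):
    """Return (x, n): the number of + signs and the sample size
    after discarding values equal to m0."""
    x = sum(1 for v in data if v > m0)
    n = sum(1 for v in data if v != m0)
    return x, n


def sign_test_p_value(x, n, tail):
    """Exact sign-test p-value with p = 0.5; tail is 'two', 'left' or 'right'."""
    if tail == "right":
        return 1.0 - binom_cdf(x - 1, n)  # P(X >= x)
    if tail == "left":
        return binom_cdf(x, n)  # P(X <= x)
    if tail == "two":
        if 2 * x < n:
            return min(1.0, 2.0 * binom_cdf(x, n))
        if 2 * x > n:
            return min(1.0, 2.0 * (1.0 - binom_cdf(x - 1, n)))
        return 1.0  # assumption: x == n/2 is not covered in the notes
    raise ValueError("tail must be 'two', 'left' or 'right'")


# Example 5.1: one sample, right-tailed, m0 = 98.0
octane = [99.0, 102.3, 99.8, 100.5, 99.7, 96.2, 99.1, 102.5,
          103.3, 97.4, 100.4, 98.9, 98.3, 98.0, 101.6]
x, n = count_signs(octane, 98.0)
print(x, n, round(sign_test_p_value(x, n, "right"), 4))  # 12 14 0.0065

# Example 5.2: one sample, two-tailed, m0 = 500
debts = [6000, 1060, 250, 200, 0, 0, 580, 400, 200, 1200,
         1000, 800, 0, 200, 0, 700, 400, 250, 0, 1000]
x, n = count_signs(debts, 500)
print(x, n, round(sign_test_p_value(x, n, "two"), 4))  # 8 20 0.5034

# Example 5.3: paired samples, right-tailed, signs of (before - after)
before = [69, 82, 76, 89, 93, 79, 72, 68, 83, 103]
after = [58, 73, 70, 71, 82, 66, 75, 71, 67, 73]
diffs = [b - a for b, a in zip(before, after)]
x, n = count_signs(diffs, 0)
print(x, n, round(sign_test_p_value(x, n, "right"), 4))  # 8 10 0.0547
```

The paired test reuses the one-sample functions by applying them to the differences with a reference value of 0, which mirrors step 1 of the paired procedure.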
Exercise 5.2:
Eight panels of wood are painted, one side of each panel with paint containing the new additive and the other side with paint containing the regular additive. The drying times, in hours, were recorded as follows:

  Panel   New Additive   Regular Additive
  1       6.4            6.6
  2       5.8            5.8
  3       7.4            7.8
  4       5.5            5.7
  5       6.3            6.0
  6       7.8            8.4
  7       8.6            8.8
  8       8.2            8.4

Use the sign test at the 0.05 level to test the hypothesis that the new additive has the same drying time as the regular additive.

Ans: Do not reject $H_0$

Mann-Whitney Test

The Mann-Whitney test is a nonparametric procedure used to test the equality of two population medians from independent samples, that is, to determine whether a difference exists between two populations. It is sometimes called the Wilcoxon rank sum test.

Null and alternative hypotheses:

  Type of test    $H_0$          $H_1$            Rejection area
  Two-tailed      $m_1 = m_2$    $m_1 \ne m_2$    $W \notin (T_L, T_U)$
  Left-tailed     $m_1 = m_2$    $m_1 < m_2$      $W \le T_L$
  Right-tailed    $m_1 = m_2$    $m_1 > m_2$      $W \ge T_U$

The hypotheses to be tested are:

$H_0$: The distributions for populations 1 and 2 are identical.
$H_1$: The distributions for populations 1 and 2 are different (two-tailed test).
$H_1$: The distribution for population 1 lies to the left of that for population 2 (left-tailed test).
$H_1$: The distribution for population 1 lies to the right of that for population 2 (right-tailed test).

Procedure:

1. Rank all $n_1 + n_2$ observations from smallest to largest.
2. Test statistic: find $W = T_1$, the sum of the ranks for sample 1.
3. Rejection region: reject $H_0$ if the test statistic falls in the rejection area for the type of test being conducted.
   Critical values: $T_L$ is obtained from the Wilcoxon rank sum test table, and $T_U = n_1(n_1 + n_2 + 1) - T_L$.
4. Conclusion.

Example 5.4:
The data below show the marks obtained by electrical engineering students in an examination:

  Gender   Marks
  Male     60
  Male     62
  Male     78
  Male     83
  Female   40
  Female   65
  Female   70
  Female   88
  Female   92

Can we conclude that the achievements of male and female students are identical at significance level 0.1?

Solution:

1. Rank the observations and state the hypotheses.

  Gender   Marks   Rank
  Male     60      2
  Male     62      3
  Male     78      6
  Male     83      7
  Female   40      1
  Female   65      4
  Female   70      5
  Female   88      8
  Female   92      9

   $H_0$: Male and female achievements are the same.
   $H_1$: Male and female achievements are not the same.
2. Test statistic: we have $n_1 = 4$, $n_2 = 5$ and
   $W = T_1 = \sum R_1 = 2 + 3 + 6 + 7 = 18$
3. From the table of the Mann-Whitney test for $\alpha = 0.1$ (two-tailed test) with $n_1 = 4$, $n_2 = 5$:
   $T_L = 13$ for $n = 4$, $m = 5$, $\alpha/2 = 0.05$ (refer to the table)
   $T_U = 4(4 + 5 + 1) - 13 = 27$
   so the critical values are $(13, 27)$.
4. Reject $H_0$ if $W \notin (T_L, T_U)$, that is, if $W \notin (13, 27)$.
5. Since $18 \in (13, 27)$, we fail to reject $H_0$ and conclude that the achievements of male and female students are not significantly different.

Exercise 5.3:
Using high school records, Johnson High School administrators selected a random sample of four high school students who attended Garfield Junior High and another random sample of five students who attended Mulbery Junior High. The ordinal class standings for the nine students are listed in the table below. Test, using the Mann-Whitney test at the 0.05 level of significance, whether the class standings of students from Garfield Junior High are the same as those of students from Mulbery Junior High.

  Garfield J. High               Mulbery J. High
  Student   Class standing       Student   Class standing
  Fields    8                    Hart      70
  Clark     52                   Phipps    202
  Jones     112                  Kirwood   144
  Tibbs     21                   Abbott    175
                                 Guest     146

Spearman's Rank Correlation

Spearman's rank correlation is a nonparametric procedure for measuring the relationship between two variables ($X$ and $Y$). The data must be ranked before the Spearman rank correlation coefficient is calculated.

Example 5.5:
The data below show the effect of the mole ratio of sebacic acid on the intrinsic viscosity of copolyesters.

  Mole ratio   1.0   0.9   0.8   0.7   0.6   0.5   0.4   0.3
  Viscosity    0.45  0.20  0.34  0.58  0.70  0.57  0.55  0.44

Find the Spearman rank correlation coefficient to measure the relationship between the mole ratio of sebacic acid and the viscosity of copolyesters.
Solution:
$X$: mole ratio of sebacic acid
$Y$: viscosity of copolyesters

  Mole ratio   Viscosity   $R(x_i)$   $R(y_i)$   $d_i = R(x_i) - R(y_i)$   $d_i^2$
  1.0          0.45        8          4          4                         16
  0.9          0.20        7          1          6                         36
  0.8          0.34        6          2          4                         16
  0.7          0.58        5          7          -2                        4
  0.6          0.70        4          8          -4                        16
  0.5          0.57        3          6          -3                        9
  0.4          0.55        2          5          -3                        9
  0.3          0.44        1          3          -2                        4

  $T = \sum d_i^2 = 110$

Thus

$$r_s = 1 - \frac{6T}{n(n^2 - 1)} = 1 - \frac{6(110)}{8(64 - 1)} = -0.3095,$$

which shows a weak negative correlation between the mole ratio of sebacic acid and the viscosity of copolyesters.

Exercise 5.4:
The following data were collected and ranked during an experiment to determine the change in thrust efficiency, $y$, as the divergence angle of a rocket nozzle, $x$, changes:

  Rank X   1  2  3  4  5  6  7  8  9   10
  Rank Y   2  3  1  5  7  9  4  6  10  8

Find the Spearman rank correlation coefficient to measure the relationship between the divergence angle of a rocket nozzle and the change in thrust efficiency.

Exercise 5.5:
Suppose eight elementary school science teachers have been ranked by a judge according to their teaching ability, and all have taken a "national teachers' examination." Calculate the Spearman rank correlation coefficient to measure the relationship between Judge Rank ($x_i$) and Test Rank ($y_i$).

  Judge Rank ($x_i$)   Test Rank ($y_i$)
  7                    1
  4                    5
  2                    3
  6                    4
  1                    8
  3                    7
  8                    2
  5                    6
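The calculation in Example 5.5 can be reproduced with a short sketch in Python. The helper names `ranks` and `spearman_rs` are illustrative and not part of the notes, and the ranking helper assumes there are no tied values, which holds for the data in Example 5.5 but would need a tie rule (such as average ranks) for other data.

```python
def ranks(values):
    """Rank values from smallest (1) to largest (n); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rs(x, y):
    """Spearman rank correlation r_s = 1 - 6T / (n(n^2 - 1)),
    where T is the sum of squared rank differences."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    t = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * t / (n * (n**2 - 1))


# Data from Example 5.5
mole_ratio = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
viscosity = [0.45, 0.20, 0.34, 0.58, 0.70, 0.57, 0.55, 0.44]

print(ranks(mole_ratio))  # [8, 7, 6, 5, 4, 3, 2, 1]
print(ranks(viscosity))   # [4, 1, 2, 7, 8, 6, 5, 3]
print(round(spearman_rs(mole_ratio, viscosity), 4))  # -0.3095
```

When the data are already given as ranks, as in Exercises 5.4 and 5.5, the same function can be applied directly, because ranking a set of distinct ranks returns them unchanged.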