Download Mann-Whitney Test

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
NONPARAMETRIC STATISTICS
In general, a statistical technique is categorized as NPS if it has at
least one of the following characteristics:
1.
2.
3.



The method is used on nominal data
The method is used in ordinal data
The method is used in interval scale or ratio scale data but there is no
assumption regarding the probability distribution of the population
where the sample is selected.
Sign Test
Mann-Whitney Test
Spearman’s Rank Correlation Test
Nonparametric versus Parametric Statistics

Parametric statistical procedures are inferential procedures
that rely on testing claims regarding parameters such as  ,  , p.
In some circumstances, the use of parametric procedures
required
some certain assumptions that need to be
satisfied such as check for the normality distribution.

Nonparametric statistical procedures are inferential
procedures that are not based on parameters and required
fewer assumptions that need to satisfy to perform tests.
They do not required the population to follow a specific
type of distribution or in other words known as
distribution free procedures.

Inferences : Parametric makes inference regarding the
mean.
Nonparametric makes inference regarding the median.
Advantages of Nonparametric



Most of the tests required very few assumptions.
The computation is fairly easy.
The procedures can be used for count data or rank data.
Disadvantages of NonParametric


Because of there are fewer requirements that must be
satisfied to conduct these tests, researches sometimes use
these procedures when parametric procedures can be
used.
Less efficient than parametric procedures. Required a
larger sample size for nonparametric procedure.
Sign Test




The sign test is used to test the null hypothesis and whether or not two
groups are equally sized.
In other word, to test of the population proportion for testing p  0.5
in a small sample (usually n  20 )
It based on the direction of the + and – sign of the observation and not
their numerical magnitude.
It also called the binomial sign test with the null proportion is 0.5 (Uses
the binomial distribution as the decision rule).
A binomial experiment consist of n identical trial with probability of
success, p in each trial.The probability of x success in n trials is given by
P ( X  x )  nC x p x q n  x ;
x  0,1, 2....n
n
n!
where Cx    
 x   n  x ! x !
X ~ B(n, p) where p  0.5
n
 There
are two types of sign test :
1. One sample sign test
2. Paired sample sign test
Hypothesis Testing:
Two Tailed
Left-Tailed
Right-Tailed
H0 : M  M 0
H0 : M  M 0
H0 : M  M 0
H1 : M  M 0
H1 : M  M 0
H1 : M  M 0
One Sample Sign Test
Procedure:
1.
Put a + sign for a value greater than the median value
Put a - sign for a value less than the median value
Put a 0 as the value equal to the median value
2.
Calculate:
i.
The number of + sign, denoted by x
ii. The number of sample, denoted by n (discard/ignore the data with
value 0)
3. Run the test
i. State the null and alternative hypothesis
ii. Determine level of significance, 
iii. Reject H 0 if p  value  
iv. Determining the p – value for the test for n, x and p = 0.5, from
binomial probability table base on the type of test being conducted
Sign of H 0
Sign of H1
Two tail test
p - value
n
:
2
2P  X  x 
if x 
=

n
:
2
2P  X  x 
if x 
Right tail test
@
>
@
<
Left tail test
v. Make a decision
P  X  x   1  P  X  x 1
P  X  x
Example 5.1:
The following data constitute a random sample of 15 measurement of the
octane rating of a certain kind gasoline:
99.0 102.3 99.8 100.5 99.7 96.2 99.1 102.5 103.3 97.4 100.4
98.9 98.3 98.0 101.6
Test the null hypothesis m0  98.0 against the alternative hypothesis m0  98.0
at the 0.01 level of significance.
Solution:
m0  98.0
99.0 102.3 99.8 100.5 99.7 96.2 99.1 102.5 103.3 97.4 100.4
+
+
+
+
+
+
+
+
+
98.9 98.3 98.0 101.6
+
+
0
+
Number of + sign, x = 12
Number of sample, n = 14 (15 -1)
p = 0.5
1. H 0 : m 0  98.0
H1 : m 0  98.0
2.   0.01, Reject H o if p  value < 0.01
3. From binomial probability table for x = 12, n = 14 and p = 0.5
X ~ b 14,0.5  , p  value  P  X  12   1  P  X  11  1  0.9935  0.0065
4. Since p  value  0.0065  0.01   , thus we reject H 0 and conclude that the
median octane rating of the given kind of gasoline exceeds 98.0
Example 5.2
A recent article in the school newspaper reported that the
typical credit card debt of a student is $500. A researcher
claims that the median credit card debts of students is
different from $500. To test this claim, he obtain a random
sample of 20 students enrolled at the college and asks
them to disclose their credit-card debt. The results are as
below. Test at 5% significance level.
6000
1060
250
200
0
0
580
400
200 1200
1000
800
0
200
0
700 400
250
0
1000
Solution:
6000
1060
250
200
0
0
+
+
-
-
-
-
+
-
-
+
1000
800
0
200
0
700
400
250
0
1000
+
+
-
-
-
+
-
-
-
+
1.
2.
3.
580 400 200
1200
H 0 : m0  500
H1 : m0  500
  0.05, Reject H o if p  value < 0.01
From binomial probability table for x
= 8, n = 20 and p = 0.5,
n 20

 10, As x  8;  10
2 2
X ~ b  20,0.5  , p  value  2 P  X  8   2( 0.2517 )  0.5034
4. p-value = 0.5034 >0.05, Do not reject
5. There is not sufficient evidence to
support the researcher claim that the
median credit-card debt of students is
different from $500.
Exercise 5.2
The PQR Company claims that the lifetime of a type of
battery that is manufactures is different from 250 hours. A
consumer advocate wishing to determine whether the
claim is justified measures the lifetime of 24 of the
company batteries, the results are listed below. Determine
whether the company claim is justified at 5% significance
level.
271
262
252
230
288
198 275 282 225 284 219
236 291 253
224 264
253
295
216
211
Ans: Do not reject
Paired Sample Sign Test
Procedure:
1.
Calculate the difference, di  xi  yi and record the sign of d i
2.
i. Calculate the number of + sign and denoted as x
ii. The number of sample, denoted by n (discard/ignore data with value
0) * probability is 0.5 (p = 0.5)
3.
Run the test
i. State the null hypothesis and alternative hypothesis
ii. Determine the level of significance 
iii. Reject H 0 if p value  
iv. Determining the p value for the test for n, x and p = 0.5 from
binomial probability table base on type of test being conducted.
Sign of H 0

Right tail test
=

Left tail test

<
Two tail test
v.
Sign of H1
Make decision
>
p - value
2P  X  x 
P  X  x   1  P  X  x 1
P  X  x
Sign of H 0
Sign of H1
Two tail test
p - value
n
:
2
2P  X  x 
if x 
=

n
:
2
2P  X  x 
if x 
Right tail test
@
>
@
<
Left tail test
v. Make a decision
P  X  x   1  P  X  x 1
P  X  x
Example 5.3:
10 engineering students went on a diet program in an attempt to loose weight
with the following results:
Name
Weight before
Weight after
Abu
69
58
Ah Lek
82
73
Sami
76
70
Kassim
89
71
Chong
93
82
Raja
79
66
Busu
72
75
Wong
68
71
Ali
83
67
Tan
103
73
Is the diet program an effective means of losing weight? Do the test at
significance level   0.10
Solution:
Let the sign + indicates weight before – weight after > 0 and – indicates
weight before – weight after < 0
Thus
Name
Weight before
Weight after
Sign
Abu
69
58
+
Ah Lek
82
73
+
Sami
76
70
+
Kassim
89
71
+
Chong
93
82
+
Raja
79
66
+
Busu
72
75
-
Wong
68
71
-
Ali
83
67
+
Tan
103
73
+
1
. The + sign indicates the diet program is effective in reducing weight
H 0 : m1  m2
H1 : m1  m2
2.   0.10 . So we reject H 0 if p value  0.10
3. Number of + sign, x  8
Number of sample, n  10
p  0 .5
P  X  8  1  P  X  7   1  0.9453  0.0547
4. Since p  value  0.0547  0.10   . So we can reject H 0 and we can
conclude that there is sufficient evidence that the diet program
is an effective programme to reduce weight.
Exercise 5.2:
8 panels of wood are painted with one side of each panel with paint
containing the new additive and the other side with paint containing the
regular additive.The drying time, in hours, were recorded as follows:
Panel
1
2
3
4
5
6
7
8
Drying Times
New Additive
Regular Additive
6.4
5.8
7.4
5.5
6.3
7.8
8.6
8.2
6.6
5.8
7.8
5.7
6.0
8.4
8.8
8.4
Use the sign test at the 0.05 level to test the hypothesis that the new
additive have the same drying time as the regular additive.
Ans: Do not reject
Mann-Whitney Test
Mann-Whitney Test is a nonparametric procedure
that is used to test the equality of two population
medians from the independent samples.
 To determine whether a difference exist between
two populations
 Sometimes called as Wilcoxon rank sum test

1. Null and alternative hypothesis
H0
H1
Rejection area
Two tail test
Left tail test
Right tail test
m1  m2
m1  m2
m1  m2
m1  m2
W  TL
m1  m2
m1  m2
W  TL ,TU 
W  TU
Hypotheses to be tested are:
H 0 : The distributions for populations 1 and 2 are identical
H1 : The distributions for populations 1 and 2 are different (two tailed test)
H1 : The distributions for populations 1 and 2 lies to the left of that
population 2 (left tailed test)
H1 : The distributions for populations 1 and 2 lies to the right of that
population 2 (right tailed test)
1.Rank all n1  n2 observations from small to large.
2. Test Statistic:
Find T1 (Rank for sample 1).
3. Rejection region : Reject H 0 ,if test statistic less than critical value.
Critical value: TL =get from table Wilcoxon Sum Rank Test table.
TU  n1 ( n1  n2  1)  TL
4. Conclusion.
Example 5.4:
Data below show the marks obtained by electrical engineering students in an
examination:
Gender
Marks
Male
Male
Male
Male
Female
Female
Female
Female
Female
60
62
78
83
40
65
70
88
92
Can we conclude the achievements of male and female students identical at
significance level   0.1
Solution:
1.
Gender
Marks
Rank
Male
Male
Male
Male
Female
Female
Female
Female
Female
60
62
78
83
40
65
70
88
92
2
3
6
7
1
4
5
8
9
H 0 : Male and Female achievement are the same
H1 : Male and Female achievement are not the same
2. Test Statistic:
We have n1  4, n2  5, W  T1   R1  2  3  6  7  18
3.
From the table of Mann Whitney test for
  0.1  two tail test  , n1  4,n2  5, so critical value  13,32
TL  n  4, m  5,   0.05  13 (Refer Table)
TU  4(4  5  1)  13  27
 13, 27 
4.
Reject H 0 if W   a,b  or W  13,27
5.
Since 18  13,27  , thus we fail to reject H 0 and conclude that the
achievements of male and female are not significantly different.
Exercise 5.3:
Using high school records, Johnson High school administrators selected a
random sample of four high school students who attended Garfield Junior
High and another random sample of five students who attended Mulbery
Junior High. The ordinal class standings for the nine students are listed in the
table below. Test using Mann-Whitney test at 0.05 level of significance
whether the Garfield Junior High is the same with Mulbery Junior High.
Garfield J. High
Mulbery J. High
Student
Class standing
Student
Class standing
Fields
8
Hart
70
Clark
52
Phipps
202
Jones
112
Kirwood
144
TIbbs
21
Abbott
175
Guest
146
Rank Spearman

A nonparametric procedure regarding the
relationship of variables (X and Y).

The procedure in Spearman Rank is have
to rank the data before calculate the
Spearman Rank.
Example 5.5:
The data below show the effect of the mole ratio of sebacic acid on the
intrinsic viscosity of copolyesters.
Mole ratio
1.0
0.9
0.8
0.7
0.6
0.5
0.4
0.3
Viscosity
0.45
0.20
0.34
0.58
0.70
0.57
0.55
0.44
Find the Spearman rank correlation coefficient to measure the relationship of
mole ratio of sebacic acid and the viscosity of copolyesters.
Solutions:
X: mole ratio of sebacic
Y: viscosity of copolyesters
Mole ratio
Viscosity
R  xi 
R  yi 
1.0
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.45
0.20
0.34
0.58
0.70
0.57
0.55
0.44
8
7
6
5
4
3
2
1
4
1
2
7
8
6
5
3
di  R  xi   R  yi 
4
6
4
-2
-4
-3
-3
-2
di 2
16
36
16
4
16
9
9
4
T = 110
Thus
6 110 
6T
rs  1 
 1
 0.3095
2
8  64  1
n  n  1
which shows a weak negative correlation between the mole ratio of sebacic
acid and the viscosity of copolyesters
Exercise 5.4:
The following data were collected and rank during an experiment to
determine the change in thrust efficiency, y as the divergence angle of a
rocket nozzle, x changes:
Rank X
1
2
3
4
5
6
7
8
9
10
Rank Y
2
3
1
5
7
9
4
6
10
8
Find the Spearman rank correlation coefficient to measure the relationship
between the divergence angle of a rocket nozzle and the change in thrust
efficiency.
Exercise 5.5
Suppose eight elementary school science teachers have been ranked by
a judge according to their teaching ability and all have taken a “national
teachers’ examination.” Calculate the Spearman rank correlation
coefficient to measure the relationship between Judge Rank (xi ) and Test
Rank (yi ).
Judge Rank
( xi )
Test Rank
( yi )
7
1
4
5
2
3
6
4
1
8
3
7
8
2
5
6