NESUG '91 Proceedings: Statistics

TESTS OF NORMALITY AND OTHER GOODNESS-OF-FIT TESTS
Ralph B. D'Agostino, Albert J. Belanger, and Ralph B. D'Agostino Jr.
Boston University Mathematics Department, Statistics and Consulting Unit

Probability plots and goodness-of-fit tests are useful tools in determining the underlying distribution of a population (D'Agostino and Stephens, 1986, chapter 2). Probability plotting is an informal procedure for describing data and for identifying deviations from the hypothesized distribution. Goodness-of-fit tests are formal procedures which can be used to test for specific hypothesized distributions. We will present macros using SAS for creating probability plots for six distributions: the uniform, normal, lognormal, logistic, Weibull, and exponential. In addition these macros will compute the skewness ($\sqrt{b_1}$), kurtosis ($b_2$), and D'Agostino-Pearson $K^2$ statistics for testing if the underlying distribution is normal (or lognormal) and the Anderson-Darling (EDF) $A^2$ statistic for testing for the normal, lognormal, and exponential distributions. The latter can be modified for general distributions.

PROBABILITY PLOTS

Say we desire to investigate if the underlying cumulative distribution of a population is F(x), where this distribution depends upon a location parameter $\mu$ and scale parameter $\sigma$, not necessarily the mean and standard deviation. Further, let

$$F(x) = G\left(\frac{x-\mu}{\sigma}\right) = G(z)$$

where $z = (x-\mu)/\sigma$. Further, say we have a random sample of observations of size n with ordered observations $X_{(1)} \le X_{(2)} \le \ldots \le X_{(n)}$. A probability plot is a plot of

$$X_{(i)} \text{ on } Z_{(i)} = G^{-1}(F_n(X_{(i)})) = G^{-1}(p_i)$$

where $G^{-1}(\cdot)$ is the inverse transformation of the standardized distribution of the population (hypothesized distribution) under consideration and $F_n(\cdot)$ is the empirical cumulative defined here as:

$$F_n(X_{(i)}) = p_i = \frac{i-0.5}{n} \qquad (1)$$

(see D'Agostino and Stephens, 1986, p. 34). In our plots we place the data ($X_{(i)}$) on the vertical axis and the standardized deviates, $z_i$, on the horizontal axis. In Table 1, we list the formulas for the distributions we plot in our macro. If the underlying distribution is F(x), the resulting plot will be approximately a straight line.

TABLE 1
Plotting formulas for the six distributions plotted in our macro ($p_i = (i-0.5)/n$).

Distribution | cdf F(x) | Vertical axis | Horizontal axis $z_i$
Uniform | $(x-\mu)/\sigma$ for $\mu < x < \mu+\sigma$ | $X_{(i)}$ | $p_i$
Normal | $\Phi((x-\mu)/\sigma)$ | $X_{(i)}$ | $\Phi^{-1}\left(\frac{i-3/8}{n+1/4}\right)$
Lognormal | $\Phi((\ln x-\mu)/\sigma)$ | $\ln X_{(i)}$ | $\Phi^{-1}\left(\frac{i-3/8}{n+1/4}\right)$
Weibull | $1-\exp\{-(x/\theta)^{k}\}$ | $\ln X_{(i)}$ | $\ln(-\ln(1-p_i))$
Logistic | $[1+\exp\{-\pi(x-\mu)/(\sigma\sqrt{3})\}]^{-1}$ | $X_{(i)}$ | $(\sqrt{3}/\pi)\ln(p_i/(1-p_i))$
Exponential | $1-\exp(-x/\theta)$ for $x > 0$ | $X_{(i)}$ | $-\ln(1-p_i)$

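The horizontal-axis formulas can be sketched in a few lines. The following is an illustration in Python rather than SAS, using only the standard library; the function name `horizontal_scores` is hypothetical, and the logistic scaling follows the $\sqrt{3}/\pi$ factor used in the macro in Appendix B.

```python
import math
from statistics import NormalDist

def horizontal_scores(n):
    """Horizontal-axis values for ranks i = 1..n, keyed by distribution."""
    phi_inv = NormalDist().inv_cdf              # standard normal quantile function
    scores = {"uniform": [], "normal": [], "logistic": [],
              "weibull": [], "exponential": []}
    for i in range(1, n + 1):
        p = (i - 0.5) / n                       # formula (1)
        blom = (i - 0.375) / (n + 0.25)         # Blom's approximation
        scores["uniform"].append(p)
        scores["normal"].append(phi_inv(blom))  # also used for the lognormal plots
        scores["logistic"].append(math.sqrt(3) / math.pi * math.log(p / (1 - p)))
        scores["weibull"].append(math.log(-math.log(1 - p)))
        scores["exponential"].append(-math.log(1 - p))
    return scores
```

Plotting the ordered data (or their logs, for the lognormal and Weibull rows) against these scores gives the probability plots described here; this sketch assumes no tied observations, which the rank procedure in the macro handles separately.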
The macro PROBPLOT takes as input the data set and produces probability plots for the six distributions mentioned above. We use the rank procedure to order the data and produce the ranks. When observations are equal we use the mean of the ranks (ties=mean option) as in D'Agostino and Stephens, chapter 2. Using these ranks, i, we compute $p_i = (i-0.5)/n$ and then the inverse transformations for the six distributions. Probability plots are then produced.

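The ranking step can be illustrated as follows. This is a minimal Python sketch of mean ranks for ties followed by the plotting positions $p_i = (i-0.5)/n$; the function names are hypothetical and it is not the SAS rank procedure itself.

```python
def mean_ranks(values):
    """Ranks 1..n in input order; tied observations share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1         # mean of 1-based positions start+1..end+1
        for k in order[start:end + 1]:
            ranks[k] = average
        start = end + 1
    return ranks

def plotting_positions(values):
    """p_i = (i - 0.5) / n, with i the (mean) rank of each observation."""
    n = len(values)
    return [(r - 0.5) / n for r in mean_ranks(values)]
```

For example, `mean_ranks([3, 1, 3, 2])` gives ranks 3.5, 1, 3.5, 2, so the two tied observations receive the same plotting position.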
Since the normal probability plot is the most widely used we describe it in detail now. This plot consists of the ordered observations on the vertical axis and the standard normal deviates on the horizontal axis. We use Blom's approximation when defining the normal cumulative in order to enhance the linearity of the plot. The plot is thus

$$X_{(i)} \text{ on } Z = \Phi^{-1}\left(\frac{i-3/8}{n+1/4}\right)$$

where $X_{(i)}$ is the ith ordered observation from the ordered sample and Z is such that

$$\frac{i-3/8}{n+1/4} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{Z} e^{-t^2/2}\,dt \quad \text{for } i=1,\ldots,n.$$

The figure in Appendix A contains a normal probability plot of sample data with the expected straight line going through the +'s on the graph. In programming the macro to create this plot we took advantage of two options in the Proc Rank procedure. The first was the "ties=mean" option, which chooses the mean rank when there are observations with the same value (see D'Agostino and Stephens, chap. 2, for further discussion), and the second was the "normal=blom" option, which will find the standardized cumulative normal Blom rank automatically. The pagesize and linesize options allowed the axis to be wider than the traditional Proc Univariate normal probability plot.

For the lognormal distribution we provide two plots, the first after taking logs of the raw data and the second after taking the logs of (observed data - estimated lambda). Lambda corresponds to the third parameter of a three parameter lognormal distribution whose density is

$$f(x) = \frac{1}{(x-\lambda)\,\sigma\sqrt{2\pi}}\exp\left\{-\frac{[\ln(x-\lambda)-\mu]^2}{2\sigma^2}\right\}, \quad x > \lambda.$$

Unless lambda is close to zero, the probability plot will not be a straight line for a lognormal distribution when one takes the logs of the data. The macro will automatically produce both plots and gives as output the estimated value of lambda (D'Agostino and Stephens, 1986, p. 53) so the user can decide which plot is more appropriate. If the raw data contains values less than or equal to zero, the macro will automatically add the absolute value of the minimum plus .01 to each value of the data set for calculating the logs of the data.

Probability plots will form approximately a straight line if the underlying distribution is the hypothesized distribution. Deviations from linearity help to determine properties of the underlying distribution, such as whether it is skewed and/or thick tailed. Other deviations may reflect the presence of outliers, mixtures in data, or truncation (censoring) in the data. The reader is referred to D'Agostino and Stephens (1986) chapter 2 and D'Agostino, Belanger and D'Agostino (1990) for further details.

Probability plots are only informal techniques for evaluating the underlying distribution of data. Next we provide several statistical tests which provide a more formal approach.

GOODNESS-OF-FIT TESTS

A population, or its random variable X, is said to be normally distributed if its density function is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \quad -\infty < x < \infty.$$

Here $\mu$ and $\sigma$ are the mean and standard deviation, respectively, of it. Of interest here are the third and fourth standardized moments given by

$$\sqrt{\beta_1} = \frac{E(X-\mu)^3}{[E(X-\mu)^2]^{3/2}} = \frac{E(X-\mu)^3}{\sigma^3}$$

and

$$\beta_2 = \frac{E(X-\mu)^4}{[E(X-\mu)^2]^2} = \frac{E(X-\mu)^4}{\sigma^4}$$

where E is the expected value operator. These moments measure skewness and kurtosis, respectively, and for the normal distribution they equal 0 and 3, respectively. A positive third moment corresponds to skewness to the right (i.e., a longer right tail) and a negative third moment corresponds to skewness to the left. Kurtosis (the word means curvature) is a measure of tail thickness. A kurtosis larger than 3 on a unimodal distribution indicates thicker or heavier tails than the normal distribution, while a kurtosis less than 3 on a unimodal distribution indicates lighter tails than the normal.

The sample estimates of these moments have been shown to be useful statistics to test whether data are normally distributed (D'Agostino et al., 1990). For a sample of size n, $X_1, \ldots, X_n$, the sample estimates of $\sqrt{\beta_1}$ and $\beta_2$ are, respectively,

$$\sqrt{b_1} = \frac{m_3}{m_2^{3/2}} \quad \text{and} \quad b_2 = \frac{m_4}{m_2^{2}}$$

where

$$m_k = \frac{\sum (X_i-\bar{X})^k}{n}$$

and $\bar{X} = \sum X_i / n$ is the sample mean.

Values of 0 (for the third moment) and 3 (for the fourth moment) would indicate that the underlying population of a data set was normally distributed. Their expected values under normality are 0 and 3(n-1)/(n+1), respectively. These statistics can be used to test formally if the underlying distribution is normal (D'Agostino and Stephens, 1986, chapter 9). If they lead to rejecting the normal distribution they automatically indicate the type of nonnormality present in the data. For instance, if the third moment is negative this indicates that the data are negatively skewed, or if the fourth moment is greater than 3(n-1)/(n+1) this indicates heavy tails in the population distribution. Thus, the signs and magnitudes of these statistics are both useful here.

We present tests for normality using these statistics in our macro as well as an omnibus test using the $K^2$ statistic. (Omnibus here means that the test will detect deviations from normality due to either skewness or kurtosis.) Much of the programming for these tests involved finding the third and fourth moments using the output from SAS's Proc Univariate procedure. The skewness and kurtosis statistics calculated in the procedure are the Fisher g statistics defined as:

$$g_1 = \text{skewness} = \frac{n}{(n-1)(n-2)}\,\frac{\sum (X_i-\bar{X})^3}{s^3}$$

and

$$g_2 = \text{kurtosis} = \frac{n(n+1)}{(n-1)(n-2)(n-3)}\,\frac{\sum (X_i-\bar{X})^4}{s^4} - \frac{3(n-1)^2}{(n-2)(n-3)}$$

where $s^2$ is the sample variance. These are related to $\sqrt{b_1}$ and $b_2$ via the following:

$$\sqrt{b_1} = \frac{(n-2)}{\sqrt{n(n-1)}}\,g_1 \quad \text{and} \quad b_2 = \frac{(n-2)(n-3)}{(n+1)(n-1)}\,g_2 + \frac{3(n-1)}{(n+1)}.$$

Thus, once we have transformed the statistics we can perform the normality tests.

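As an illustration of these tests, the sketch below computes $\sqrt{b_1}$, $b_2$, their normal approximations and $K^2$ in Python. The transformation formulas are transcribed from the macro in Appendix B; the data are read off the stem-and-leaf plot in Table 2, which is an assumption of this sketch, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

# Systolic blood pressures as read from the stem-and-leaf plot in Table 2
# (stem = hundreds and tens, leaf = units); an assumed reading of that table.
STEM_LEAF = {21: "0", 19: "08", 18: "0006", 17: "003", 16: "0008", 15: "0046",
             14: "00124688", 13: "000002244689", 12: "000024467888",
             11: "0000446889", 10: "046888", 9: "0"}
SBP = [10 * stem + int(leaf) for stem, leaves in STEM_LEAF.items() for leaf in leaves]

def sample_moments(x):
    """Return (sqrt_b1, b2) from the central moments m2, m3, m4 (divisor n)."""
    n = len(x)
    mean = sum(x) / n
    m2, m3, m4 = (sum((v - mean) ** k for v in x) / n for k in (2, 3, 4))
    return m3 / m2 ** 1.5, m4 / m2 ** 2

def skewness_z(sqrt_b1, n):
    """Normal approximation for sqrt(b1) under normality."""
    y = sqrt_b1 * math.sqrt((n + 1) * (n + 3) / (6 * (n - 2)))
    beta2 = (3 * (n * n + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1 + math.sqrt(2 * (beta2 - 1))
    delta = 1 / math.sqrt(math.log(math.sqrt(w2)))
    alpha = math.sqrt(2 / (w2 - 1))
    return delta * math.log(y / alpha + math.sqrt((y / alpha) ** 2 + 1))

def kurtosis_z(b2, n):
    """Normal approximation for b2 under normality."""
    mean_b2 = 3 * (n - 1) / (n + 1)
    var_b2 = 24 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))
    x = (b2 - mean_b2) / math.sqrt(var_b2)
    moment = (6 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9))
              * math.sqrt(6 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
    a = 6 + 8 / moment * (2 / moment + math.sqrt(1 + 4 / moment ** 2))
    ratio = (1 - 2 / a) / (1 + x * math.sqrt(2 / (a - 4)))  # assumed positive here
    return (1 - 2 / (9 * a) - ratio ** (1 / 3)) / math.sqrt(2 / (9 * a))

def normality_tests(x):
    n = len(x)
    sqrt_b1, b2 = sample_moments(x)
    z1, z2 = skewness_z(sqrt_b1, n), kurtosis_z(b2, n)
    k2 = z1 * z1 + z2 * z2
    two_sided = lambda z: 2 * (1 - NormalDist().cdf(abs(z)))
    return {"sqrt_b1": sqrt_b1, "b2": b2,
            "z_skew": z1, "p_skew": two_sided(z1),
            "z_kurt": z2, "p_kurt": two_sided(z2),
            "k2": k2, "p_k2": math.exp(-k2 / 2)}  # chi-square (2 df) upper tail
```

Calling `normality_tests(SBP)` should give a skewness near .768 and a kurtosis near 3.08, in line with the descriptive statistics reported with Table 2.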
The second type of formal tests we programmed into the macro are EDF (Empirical Distribution Function) tests. For a random sample of size n, with data $X_1, \ldots, X_n$ and the order statistics defined as $X_{(1)} \le X_{(2)} \le \ldots \le X_{(n)}$, let the distribution of X be F(x). EDF statistics measure the difference between $F_n(x)$ and F(x), where $F_n(x)$ is defined here by:

$$F_n(x) = \frac{\text{number of observations} \le x}{n},$$

more precisely

$$F_n(x) = \begin{cases} 0, & x < X_{(1)} \\ i/n, & X_{(i)} \le x < X_{(i+1)},\ i=1,\ldots,n-1 \\ 1, & X_{(n)} \le x. \end{cases}$$

Note, $F_n(x)$ here is defined differently than for the probability plots (formula (1)).

In our macro we used the Anderson-Darling (1954) $A^2$ statistic, which uses a quadratic measure of discrepancy between $F_n(x)$ and F(x). This test falls in the class given by the Cramer-von Mises family

$$Q = n\int_{-\infty}^{\infty}\{F_n(x)-F(x)\}^2\,\psi(x)\,dF(x)$$

where, for $A^2$, $\psi(x)$ is $[\{F(x)\}\{1-F(x)\}]^{-1}$. See D'Agostino and Stephens, chapter 4, for an in depth discussion of EDF statistics.

In order to compute the $A^2$ statistic, we used the computing formulas suggested in D'Agostino and Stephens (chap. 4). Using the Probability Integral Transformation (PIT), Z=F(X), we know that if F(x) is the true distribution of X then Z will be uniformly distributed on [0,1]. We calculate values of $Z_i = F(X_i)$, i=1,...,n, from our sample $X_1, \ldots, X_n$. $F_n(z)$, the EDF of the values of Z, is then found. Using these values we compute $A^2$ as follows:

$$A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}\left[(2i-1)\ln Z_{(i)} + (2n+1-2i)\ln(1-Z_{(i)})\right].$$

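A short Python sketch of this computing formula follows, applied to the normal case with the mean and standard deviation estimated from the sample, and with the modification factor $1 + .75/n + 2.25/n^2$ used in the macro in Appendix B. The function names are hypothetical, and the data are an assumed reading of the stem-and-leaf plot in Table 2.

```python
import math
from statistics import NormalDist, mean, stdev

# Assumed reading of the Table 2 stem-and-leaf plot (stem = tens, leaf = units).
STEM_LEAF = {21: "0", 19: "08", 18: "0006", 17: "003", 16: "0008", 15: "0046",
             14: "00124688", 13: "000002244689", 12: "000024467888",
             11: "0000446889", 10: "046888", 9: "0"}
SBP = [10 * stem + int(leaf) for stem, leaves in STEM_LEAF.items() for leaf in leaves]

def anderson_darling(z):
    """A^2 from probability-integral-transformed values z_i = F(x_i)."""
    z = sorted(z)                               # Z_(1) <= ... <= Z_(n)
    n = len(z)
    total = sum((2 * i - 1) * math.log(z[i - 1])
                + (2 * n + 1 - 2 * i) * math.log(1 - z[i - 1])
                for i in range(1, n + 1))
    return -n - total / n

def a2_normal(x):
    """Return (A^2, modified A^2) for normality with mean and variance estimated."""
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))      # stdev uses the n-1 divisor
    a2 = anderson_darling([fitted.cdf(v) for v in x])
    return a2, a2 * (1 + 0.75 / n + 2.25 / n ** 2)

a2_raw, a2_raw_mod = a2_normal(SBP)                            # test for normality
a2_log, a2_log_mod = a2_normal([math.log(v) for v in SBP])     # test for lognormality
```

The modified statistic is the one compared with the critical values printed by the macro; the lognormal case is handled here simply by applying the normal test to the logs, without the lambda adjustment.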
We compute this statistic for testing the normal, lognormal, and exponential distributions. For the exponential distribution we calculated the statistic assuming the origin was zero and then with estimates for the origin. We follow the procedure in D'Agostino and Stephens (1986, page 141), where we have the exponential distribution

$$F(x) = 1-\exp\{-(x-\alpha)/\beta\}, \quad x > \alpha.$$

For the first case we assume $\alpha$ is 0 and use the sample mean for $\beta$, and in the second we estimate $\alpha$ and $\beta$; the macro in Appendix B computes $\hat{\beta} = n(\bar{X}-X_{(1)})/(n-1)$ (the printed formula for $\hat{\alpha}$ is illegible in the source).

For the lognormal distribution we calculate the EDF statistic only without the $\lambda$ we computed in the probability plotting. We also present the $A^2$ statistics both in their unmodified and modified forms, where the modifications are made using suggestions by Stephens in D'Agostino and Stephens (chap. 4). The user could test for other distributions as well if they modify the macro so they can input the $Z_i$ values for the distribution of interest.

An example of the output from this macro is presented below. Table 2 contains a sample of 67 systolic blood pressures from subjects in the Framingham Heart Study. The data are presented in a stem-and-leaf plot, along with descriptive statistics.

From examining the seven probability plots computed one can see that the data appear most likely to follow the lognormal distribution. After examining the normal probability plot one can see that the data seem to form two straight lines, one for z-values below zero, and one for z-values above zero. This could be an indication of a mixture of two normals. The lognormal probability plot forms nearly a straight line; however, when we examine the lognormal plot with the estimate of lambda the data form a straight line except at the lower tail. This should prompt the investigator to check the lowest point as a possible outlier. Finally, the other four plots (the uniform, logistic, exponential and Weibull) are all far from linear, indicating that these data probably do not follow any of these distributions.

When we look at what the goodness-of-fit tests produced, we see that the skewness test rejects normality (p=.01), while the kurtosis test does not (p=.59). The $K^2$ test, which combines the two, rejects normality at p=.03. The Anderson-Darling $A^2$ tests confirm these results, with the statistic to test for normality having a p value <.005. The test for the lognormal distribution does not reject, with .15<p<.25. Finally, the tests for the exponential distribution reject at p<.01, both when the origin is assumed known and when it is estimated.

From this output, the investigator might decide that the data were lognormally distributed and proceed with an analysis after taking logarithms of the data. The output from the macro can be found in the figures and in Table 3 in Appendix A. The macro itself is listed in Appendix B.

Thus, we have provided a useful macro written for use with SAS which will provide both a graphical display of the data and several formal tests for determining the underlying distribution of a population.

Table 2
Systolic Blood Pressure Data from the Framingham Heart Study

Stem-and-leaf plot (stem = tens, leaf = units)

Stem | Leaf         | Number
21   | 0            | 1
20   |              |
19   | 08           | 2
18   | 0006         | 4
17   | 003          | 3
16   | 0008         | 4
15   | 0046         | 4
14   | 00124688     | 8
13   | 000002244689 | 12
12   | 000024467888 | 12
11   | 0000446889   | 10
10   | 046888       | 6
9    | 0            | 1

The descriptive statistics are: sample size, n=67; mean=137.15; standard deviation=25.63; skewness=.768; kurtosis=3.08.

This work was funded by a grant from the National Heart, Lung and Blood Institute to R. B. D'Agostino (R01 HL 40423-03).

REFERENCES
D'Agostino, R.B., Belanger, A.J., and D'Agostino Jr., R.B. (1990), "A Suggestion for Using Powerful and Informative Tests of Normality," The American Statistician, 44, 316-321.
D'Agostino, R.B., and Stephens, M.A. (1986), Goodness-of-Fit Techniques, New York: Marcel Dekker.

APPENDIX A: OUTPUT FROM THE MACRO

[Figures: printer-style probability plots of the systolic blood pressure data produced by the macro, with the observed values (or their logs) on the vertical axis and the z-value on the horizontal axis, including the plot against the log-normal z-value with estimated lambda. The plotted points are not recoverable from the source.]

TABLE 3
(? marks a digit that is illegible in the source.)

NORMALITY TEST FOR VARIABLE SBP  N=67
    G1=0.78592   SQRTB1=0.76821   Z=2.56017  P=0.0105
    G2=0.184??   B2=3.082?6       Z=0.525??  P=0.599?
                 K**2=CHISQ (2 DF) = 6.830??  P=0.0329

EDF TESTS USING ANDERSON-DARLING A-SQUARED STATISTIC
CRITICAL VALUES FOR NORMAL DISTRIBUTION WITH MEAN AND VARIANCE UNKNOWN
VARIABLE = SBP

SIGNIFICANCE LEVEL ALPHA | .50  | .25  | .15  | .10  | .05  | .025 | .01   | .005
UPPER TAIL               | .341 | .470 | .561 | .631 | .752 | .873 | 1.035 | 1.159
LOWER TAIL               | .341 | .249 | .226 | .188 | .160 | .139 | .119  |

EDF STATISTIC FOR THE NORMAL DISTRIBUTION (MODIFIED)        1.1714
EDF STATISTIC FOR THE NORMAL DISTRIBUTION (UNMODIFIED)      1.15??
EDF STATISTIC FOR THE LOG-NORMAL DISTRIBUTION (MODIFIED)    0.5124
EDF STATISTIC FOR THE LOG-NORMAL DISTRIBUTION (UNMODIFIED)  0.5065

CRITICAL VALUES FOR THE EXPONENTIAL DISTRIBUTION, ORIGIN KNOWN AND SCALE UNKNOWN

SIGNIFICANCE LEVEL ALPHA | .25  | .20  | .15  | .10   | .05   | .025  | .01   | .005  | .0025
UPPER TAIL               | .736 | .816 | .916 | 1.062 | 1.321 | 1.591 | 1.959 | 2.244 | 2.534
LOWER TAIL               | .342 | .312 | .280 | .249  | .208  | .178  | .150  |       |

EDF STATISTIC FOR THE EXPONENTIAL DISTRIBUTION             [illegible]
EDF STATISTIC FOR THE EXPONENTIAL DISTRIBUTION (MODIFIED)  [illegible]

CRITICAL VALUES FOR THE EXPONENTIAL DISTRIBUTION, ORIGIN AND SCALE UNKNOWN

SIGNIFICANCE LEVEL ALPHA, UPPER TAIL
N        | .25  | .15  | .10   | .05   | .025  | .01
5        | .460 | .555 | .621  | .725  | .848  | .989
10       | .545 | .660 | .747  | .920  | 1.068 | 1.352
15       | .575 | .720 | .816  | 1.009 | 1.198 | 1.495
20       | .608 | .757 | .861  | 1.062 | 1.267 | 1.580
25       | .625 | .784 | .890  | 1.097 | 1.317 | 1.635
50       | .680 | .838 | .965  | 1.197 | 1.440 | 1.775
100      | .710 | .875 | 1.005 | 1.250 | 1.510 | 1.855
infinity | .736 | .916 | 1.062 | 1.321 | 1.591 | 1.959

ESTIMATES FOR ALPHA = 3.3216 AND BETA = 47.8636
EDF STATISTIC FOR THE EXPONENTIAL DISTRIBUTION             6.0107
EDF STATISTIC FOR THE EXPONENTIAL DISTRIBUTION (MODIFIED)  [illegible]

APPENDIX B: THE MACRO

%MACRO PROBPLOT(VAR,DATA);
DATA; SET &DATA; T=1; KEEP T &VAR;
PROC SORT; BY &VAR;
PROC RANK TIES=MEAN OUT=AA; VAR &VAR;
  RANKS I;
PROC UNIVARIATE DATA=AA NORMAL PLOT;
  VAR &VAR; OUTPUT OUT=XXSTAT MIN=MIN N=N
  MEAN=XBAR STD=S P5=P5 P95=P95 MEDIAN=MEDIAN
  SKEWNESS=G1 KURTOSIS=G2;
DATA XXSTAT; SET XXSTAT; T=1;
  /* third parameter of the three-parameter lognormal */
  LAMBDA=((P95*P5)-(MEDIAN*MEDIAN))/(P95+P5-(2*MEDIAN));
  /* origin and scale estimates for the exponential distribution */
  ALPHA=MIN-1/N; BETA=N*(XBAR-MIN)/(N-1);
  DROP P95 P5 MEDIAN;
DATA AA; MERGE AA XXSTAT; BY T;
  LOGVARE=LOG(&VAR-LAMBDA);
  IF MIN>0 THEN LOGVAR=LOG(&VAR);
  ELSE DO; LOGVAR=LOG(&VAR+ABS(MIN)+.01);
    FILE PRINT;
    PUT 'WARNING SOME OF THE DATA HAS VALUES <=0 ALL' /
        'DATA HAS BEEN TRANSFORMED IN THE LOG-NORMAL AND' /
        'WEIBULL PLOTS BY MAKING ALL VALUES POSITIVE.'; END;
  CALL SYMPUT('LAMB',LEFT(PUT(LAMBDA,BEST8.)));
PROC UNIVARIATE DATA=AA NOPRINT; VAR LOGVARE LOGVAR;
  OUTPUT OUT=BB MEAN=LXBARE LXBAR STD=LSE LS;
DATA BB; SET BB; T=1;
DATA AA; MERGE AA BB; BY T;
  /* plotting positions and horizontal-axis scores */
  PI=(I-.5)/N;
  UNIFORMZ=PI;
  WEIBULLZ=LOG(-LOG(1-PI));
  LOGISTZ=(SQRT(3)/3.14159265)*LOG(PI/(1-PI));
  EXPONENZ=-LOG(1-PI);
  /* probability integral transforms for the EDF tests */
  NORMALZI=PROBNORM((&VAR-XBAR)/S);
  LOGNORZI=PROBNORM((LOGVAR-LXBAR)/LS);
  EXPONEZI=1-EXP(-(&VAR/XBAR));
  EXPONMZI=1-EXP(-((&VAR-ALPHA)/BETA));
  /* terms of the Anderson-Darling sum; _N_ is the order of the sorted data */
  NORMALAS=(1/N)*((2*_N_-1)*LOG(NORMALZI)+
           (2*N+1-2*_N_)*LOG(1-NORMALZI));
  LOGNORAS=(1/N)*((2*_N_-1)*LOG(LOGNORZI)+
           (2*N+1-2*_N_)*LOG(1-LOGNORZI));
  EXPONEAS=(1/N)*((2*_N_-1)*LOG(EXPONEZI)+
           (2*N+1-2*_N_)*LOG(1-EXPONEZI));
  EXPONMAS=(1/N)*((2*_N_-1)*LOG(EXPONMZI)+
           (2*N+1-2*_N_)*LOG(1-EXPONMZI));
PROC MEANS NOPRINT; VAR NORMALAS LOGNORAS EXPONEAS EXPONMAS;
  OUTPUT OUT=ANDERSON
         SUM=NASQUARE LASQUARE EASQUARE EMASQUAR N=N;
DATA ANDERSON; MERGE ANDERSON XXSTAT; BY N;
  NASQUARE=-N-NASQUARE;
  MASQUARE=NASQUARE*(1+(.75/N)+(2.25/(N*N)));
  LASQUARE=-N-LASQUARE;
  MLASQUAR=LASQUARE*(1+(.75/N)+(2.25/(N*N)));
  EASQUARE=-N-EASQUARE;
  MEASQUAR=EASQUARE*(1+(.6/N));
  EMASQUAR=-N-EMASQUAR;
  MMEASQUA=EMASQUAR*(1+(.6/N));
  DROP T _FREQ_ _TYPE_;
/* three points on the expected straight line for the normal plot */
DATA; SET XXSTAT;
  DO _Z_=-1,0,1; _X_=XBAR+_Z_*S; OUTPUT;
  END; KEEP _X_ _Z_;
DATA; MERGE AA _LAST_;
PROC RANK TIES=MEAN NORMAL=BLOM OUT=AA;
  VAR &VAR LOGVAR LOGVARE;
  RANKS BLOMRANK LOGBLOM LOGEBLOM;
OPTIONS LS=80 PS=60;
PROC PLOT NOLEGEND;
  LABEL BLOMRANK='NORMALIZED RANK'
        LOGBLOM ='LOG-NORMAL RANK'
        LOGEBLOM="LOG-NORMAL RANK (LAMBDA=&LAMB)"
        &VAR ='OBSERVED VALUE'
        LOGVAR ='LOG OF OBSERVED VALUE'
        LOGVARE ='LOG OF OBSERVED VALUE (USING LAMBDA)'
        UNIFORMZ='UNIFORM RANK'
        WEIBULLZ='WEIBULL RANK'
        LOGISTZ ='LOGISTIC RANK'
        EXPONENZ='EXPONENTIAL RANK';
  PLOT &VAR*BLOMRANK='*' _X_*_Z_='+' /
       OVERLAY HAXIS=-3 TO 3 BY .5;
  PLOT LOGVAR*LOGBLOM='*' / HAXIS=-3 TO 3 BY .5;
  PLOT LOGVARE*LOGEBLOM='*' / HAXIS=-3 TO 3 BY .5;
  PLOT &VAR*UNIFORMZ='*' / HAXIS=0 TO 1 BY .1;
  PLOT &VAR*LOGISTZ='*' / HAXIS=-3 TO 3 BY .5;
  PLOT &VAR*EXPONENZ='*' / HAXIS=0 TO 4 BY .5;
  PLOT LOGVAR*WEIBULLZ='*' / HAXIS=-3 TO 3 BY .5;
DATA ANDERSON; SET ANDERSON;
  /* convert the Fisher g statistics to sqrt(b1) and b2 */
  SQRTB1=(N-2)/SQRT(N*(N-1))*G1;
  Y=SQRTB1*SQRT((N+1)*(N+3)/(6*(N-2)));
  BETA2=3*(N*N+27*N-70)*(N+1)*(N+3)/
        ((N-2)*(N+5)*(N+7)*(N+9));
  W=SQRT(-1+SQRT(2*(BETA2-1)));
  DELTA=1/SQRT(LOG(W));
  /* reuses the name ALPHA, replacing the exponential origin estimate from XXSTAT */
  ALPHA=SQRT(2/(W*W-1));
  Z_B1=DELTA*LOG(Y/ALPHA+SQRT((Y/ALPHA)**2+1));
  B2=3*(N-1)/(N+1)+(N-2)*(N-3)/((N+1)*(N-1))*G2;
  MEANB2=3*(N-1)/(N+1);
  VARB2=24*N*(N-2)*(N-3)/((N+1)*(N+1)*(N+3)*(N+5));
  X=(B2-MEANB2)/SQRT(VARB2);
  MOMENT=6*(N*N-5*N+2)/((N+7)*(N+9))*
         SQRT(6*(N+3)*(N+5)/(N*(N-2)*(N-3)));
  A=6+8/MOMENT*(2/MOMENT+SQRT(1+4/(MOMENT**2)));
  Z_B2=(1-2/(9*A)-((1-2/A)/
       (1+X*SQRT(2/(A-4))))**(1/3))/SQRT(2/(9*A));
  PRZB1=2*(1-PROBNORM(ABS(Z_B1)));
  PRZB2=2*(1-PROBNORM(ABS(Z_B2)));
  CHITEST=Z_B1*Z_B1+Z_B2*Z_B2;
  PRCHI=1-PROBCHI(CHITEST,2);
  FILE PRINT;
  PUT @2 "NORMALITY TEST FOR VARIABLE &VAR " N= /
      @20 G1= 8.5 @33 SQRTB1= 8.5 @50 'Z=' Z_B1 8.5 ' P=' PRZB1 6.4 /
      @20 G2= 8.5 @33 B2= 8.5 @50 'Z=' Z_B2 8.5 ' P=' PRZB2 6.4 /
      @33 'K**2=CHISQ (2 DF) =' CHITEST 8.5 ' P=' PRCHI 6.4 /
      @10 '-------------------------------------------------------' /
      @10 'EDF TESTS USING ANDERSON-DARLING A-SQUARED STATISTIC' /
      @5 'CRITICAL VALUES FOR NORMAL DISTRIBUTION WITH MEAN AND VARIANCE' /
      @20 "UNKNOWN          VARIABLE = &VAR " /
      @10 '-------------------------------------------------------' /
      @10 '|            SIGNIFICANCE LEVEL ALPHA                 |' /
      @10 '| .50   .25   .15   .10   .05   .025  .01   .005      |' /
      @10 '|                  UPPER TAIL                         |' /
      @10 '| .341  .470  .561  .631  .752  .873  1.035 1.159     |' /
      @10 '|                  LOWER TAIL                         |' /
      @10 '| .341  .249  .226  .188  .160  .139  .119            |' /
      @10 '-------------------------------------------------------' //
      @10 'EDF STATISTIC FOR THE NORMAL DISTRIBUTION (MODIFIED) '
          MASQUARE 6.4 /
      @10 'EDF STATISTIC FOR THE NORMAL DISTRIBUTION (UNMODIFIED) '
          NASQUARE 6.4 //
      @10 'EDF STATISTIC FOR THE LOG-NORMAL DISTRIBUTION (MODIFIED) '
          MLASQUAR 6.4 /
      @10 'EDF STATISTIC FOR THE LOG-NORMAL DISTRIBUTION (UNMODIFIED) '
          LASQUARE 6.4 //
      @10 '-------------------------------------------------------' /
      @10 'CRITICAL VALUES FOR THE EXPONENTIAL DISTRIBUTION, ORIGIN ' /
      @20 'KNOWN AND SCALE UNKNOWN' /
      @10 '-------------------------------------------------------' /
      @10 '|            SIGNIFICANCE LEVEL ALPHA                 |' /
      @10 '| .25  .20  .15  .10   .05   .025  .01   .005  .0025  |' /
      @10 '|                  UPPER TAIL                         |' /
      @10 '| .736 .816 .916 1.062 1.321 1.591 1.959 2.244 2.534  |' /
      @10 '|                  LOWER TAIL                         |' /
      @10 '| .342 .312 .280 .249  .208  .178  .150               |' /
      @10 '-------------------------------------------------------' //
      @10 'EDF STATISTIC FOR THE EXPONENTIAL DISTRIBUTION '
          EASQUARE 6.4 /
      @10 'EDF STATISTIC FOR THE EXPONENTIAL DISTRIBUTION (MODIFIED) '
          MEASQUAR 6.4 /
      @10 '-------------------------------------------------------' /
      @10 'CRITICAL VALUES FOR THE EXPONENTIAL DISTRIBUTION, ORIGIN ' /
      @20 'AND SCALE UNKNOWN' /
      @10 '-------------------------------------------------------' /
      @10 '|            SIGNIFICANCE LEVEL ALPHA                 |' /
      @10 '| N         .25   .15   .10   .05   .025  .01         |' /
      @10 '|                  UPPER TAIL                         |' /
      @10 '| 5         .460  .555  .621  .725  .848  .989        |' /
      @10 '| 10        .545  .660  .747  .920  1.068 1.352       |' /
      @10 '| 15        .575  .720  .816  1.009 1.198 1.495       |' /
      @10 '| 20        .608  .757  .861  1.062 1.267 1.580       |' /
      @10 '| 25        .625  .784  .890  1.097 1.317 1.635       |' /
      @10 '| 50        .680  .838  .965  1.197 1.440 1.775       |' /
      @10 '| 100       .710  .875  1.005 1.250 1.510 1.855       |' /
      @10 '| infinity  .736  .916  1.062 1.321 1.591 1.959       |' /
      @10 '-------------------------------------------------------' //
      @10 'ESTIMATES FOR ALPHA = ' ALPHA 7.4 ' AND BETA = ' BETA 7.4 /
      @10 'EDF STATISTIC FOR THE EXPONENTIAL DISTRIBUTION '
          EMASQUAR 6.4 /
      @10 'EDF STATISTIC FOR THE EXPONENTIAL DISTRIBUTION (MODIFIED) '
          MMEASQUA 6.4;
RUN;
%MEND PROBPLOT;

/*
   Example of how to execute the macro above:

   %PROBPLOT(SBP,DATA1)
*/

NESUG '91 Proceedings