Using SAS® Software for the Analysis of Means
Donna O. Fulenwider, SAS Institute Inc., Cary, NC

ABSTRACT
This tutorial is designed as a sequel to the presentation entitled 'Application of the Analysis of Means' given by Dr. Peter R. Nelson in the Econometrics, Operations Research, and Quality Control Section. It includes a very brief review of the concept of analysis of means but focuses primarily on how SAS® software can be used to perform analysis of means.

INTRODUCTION
Analysis of means (ANOM) is a technique for comparing a group of k treatment means with their grand mean while controlling the Type I risk, α. ANOM can also be viewed as a multiple comparison procedure that constructs simultaneous confidence intervals for the contrasts of the individual population means versus their overall mean (Nelson 1988).

Graphically, ANOM can be thought of as an extension of a Shewhart-type control chart and is viewed in this context throughout this tutorial. The dependent or process variable is plotted versus the classification or subgroup variable. Decision limits are plotted to test, statistically and visually, the hypothesis of differences in means. Consequently, ANOM provides a graphical measure of statistical significance as well as a graphical measure of quantitative differences.

ANOM is appropriate for factors involving fixed effects only, as discussed by Ramig (1983). ANOM can be applied to equal and unequal sample size data. The choice of the appropriate critical value to be used for computing the decision lines is the major difference in handling the equal and unequal sample size cases. When the k means are based on equal sample sizes, their deviations from their grand mean are equicorrelated and exact critical values, $h_\alpha$, can be computed. For purposes of this tutorial, the $h_\alpha$ values are computed using an approximation developed by L.S. Nelson (1983). Nelson reports that the approximation is accurate to three significant digits. The sample code for the approximation is given in Appendix 1.

For the unequal sample size case, the deviations of the group means from their grand mean are not necessarily equicorrelated, and therefore exact critical values cannot be computed. Instead, $h_\alpha^{*}$, an upper bound on $h_\alpha$ suggested by L.S. Nelson (1983), can be used. The upper bound is calculated as the upper $\alpha^{*}/2$ percentage point of the Student's t distribution, where

   $\alpha^{*}/2 = [1 - (1 - \alpha)^{1/k}]/2$    (1)

   $k$ = number of means being compared.    (2)

The sample code for $h_\alpha^{*}$ can be found in Appendix 2.

Analysis of means is applicable to both variables and attribute data. Variables data are those quality characteristics of a sample that are expressed as a continuous numerical measure, such as weight or volume. Conversely, attribute data are quality characteristics that cannot be expressed as a continuous measure. These types of characteristics are usually noted by observing the presence (or absence) of some attribute of the sample, such as the number or proportion defective. This tutorial presents an example of variables data from Ramig (1983). Comparison of the step-by-step explanation of ANOM given by Ramig with the step-by-step explanation of the SAS code presented by this tutorial should provide a guide for the implementation of the ANOM technique using SAS software. The SAS code for the examples used in this tutorial is given in Appendices 3 and 4.

THE BASIC STEPS
The following steps present, in their most general form, the ANOM technique. Listed for each basic step is the SAS software tool that can be used to perform the task.

• Step 1: Compute the group means.
  - PROC SHEWHART or PROC MEANS and the DATA step
• Step 2: Compute the grand mean.
  - PROC SHEWHART or PROC MEANS and the DATA step
• Step 3: Compute an estimate of variance.
  - PROC SHEWHART or PROC MEANS and the DATA step
• Step 4: Obtain the appropriate critical value, $h_\alpha$.
  - the DATA step
• Step 5: Compute the upper and lower decision lines (UDL and LDL).
  - the DATA step
• Step 6: Plot the group means, the central grand mean line, and the decision lines. If any mean falls outside of the decision lines, declare that there is a statistically significant difference among the means.
  - PROC SHEWHART

VARIABLES DATA
Equal Sample Sizes
For simplicity, the ANOM technique is applied to a one-way classification design with equal samples of size n. The data used in this example are taken from Walpole and Myers (1972, p. 366). Five different concrete aggregates are used to investigate the effect of aggregate on the mean absorption of moisture in concrete. Six samples of each aggregate were exposed to moisture for forty-eight hours, for a total of thirty observations. The data are read into a SAS data set, EXAMPLE1, with variable names AGGREGT and MOISTURE. The data are presented in Table 1.
Table 1
Moisture Absorption (Wgt %) for Concrete Aggregate

   Aggregt   Moisture
      1      551  457  450  731  499  632
      2      595  580  508  583  633  517
      3      639  615  511  573  648  677
      4      417  449  517  438  415  555
      5      563  631  522  613  656  679

Step 1 in the ANOM process is to compute the group means, in this case the group means of MOISTURE for each value, or level, of AGGREGT. As noted in the previous section, this step can be accomplished by using either PROC SHEWHART or PROC MEANS and the DATA step. PROC SHEWHART is chosen for the following reasons:

• Less programming code is required using PROC SHEWHART. Several DATA steps and several PROC MEANS statements are required to provide the same information given by a single PROC SHEWHART statement.
• PROC SHEWHART provides options for creating two output data sets, the OUTHISTORY and OUTLIMITS data sets. The OUTHISTORY data set is properly formatted for reuse in a later PROC SHEWHART statement. The OUTLIMITS data set contains necessary variable information that must be entered separately if the PROC MEANS statement is used.

The following PROC SHEWHART statements produce output data sets that contain most of the information needed for the ANOM technique:

   PROC SHEWHART DATA=EXAMPLE1;
      XCHART MOISTURE*AGGREGT /
             NOCHART
             STDDEVS
             SMETHOD=RMSDF
             OUTHISTORY=HIST1
             OUTLIMITS=LIM1;

The XCHART statement is chosen because it produces output data sets that contain information about the subgroup means of the data. In the syntax of PROC SHEWHART, MOISTURE is the process variable and AGGREGT is the subgroup variable.

Several options are used in the XCHART statement. The NOCHART option is specified in order to suppress the creation of a chart. At this point in the analysis, the information needed to create the ANOM chart is not complete.

The STDDEVS option requests the use of standard deviations for creating the control limits for the X chart. By default, PROC SHEWHART uses the subgroup ranges for estimating the control limits. The SMETHOD=RMSDF option requests the use of the weighted root mean square method for estimating the subgroup standard deviations. The use of standard deviations for this analysis is consistent with the example in Ramig (1983) and is necessary when using the exact critical values, $h_\alpha$.

The OUTHISTORY and OUTLIMITS options create the working data sets from which the ANOM technique is performed. The OUTHISTORY data set contains the MOISTURE means for each level of AGGREGT, or more simply the subgroup means. The subgroup standard deviation estimates requested by the STDDEVS option are automatically saved into the OUTHISTORY data set, HIST1. The OUTLIMITS data set contains the information necessary to produce control limits for an X chart of this data. It also provides the information necessary to calculate the decision limits, or lines, required by the ANOM technique. The contents of the OUTHISTORY data set, HIST1, are shown in Table 2; the contents of the OUTLIMITS data set, LIM1, are shown in Table 3.

Table 2
Contents of HISTORY Data Set for AGGREGT

   OBS   AGGREGT   MOISUREX   MOISURES   MOISUREN
    1       1       553.333    110.154       6
    2       2       569.333     47.986       6
    3       3       610.500     59.946       6
    4       4       465.167     57.607       6
    5       5       610.667     58.783       6

Table 3
Contents of LIMITS Data Set for AGGREGT (one observation)

   _VAR_      MOISTURE
   _SUBGRP_   AGGREGT
   _TYPE_     ESTIMATE
   _LIMITN_   6
   _ALPHA_    0.0026998
   _SIGMAS_   3
   _LCLX_     475.697
   _MEAN_     561.8
   _UCLX_     647.903
   _LCLS_     2.03115
   _S_        66.8952
   _UCLS_     131.759
   _STDDEV_   70.3026

A note about the variable names saved into the OUTHISTORY data set: if the process variable name is eight characters or more, PROC SHEWHART creates output variable names by concatenating the first four letters and the last three letters of the variable name. The procedure then appends a suffix letter to the variable name that indicates the statistic that the variable represents. In this example, the process variable name MOISTURE in the input data set leads to creation of variables in the OUTHISTORY data set such as MOISUREN. MOISUREN is a summary variable that contains n, the number of values of MOISTURE for each level of AGGREGT.

In Step 2 the grand mean is calculated as

   $\bar{\bar{X}} = \dfrac{n_1\bar{X}_1 + \cdots + n_k\bar{X}_k}{n_1 + \cdots + n_k}$    (3)

This value corresponds to the grand mean of MOISTURE for all levels of AGGREGT. PROC SHEWHART saves the value of the grand mean in the variable _MEAN_ in the OUTLIMITS data set LIM1.

Step 3 is to compute an estimate of the common but unknown variance, $\sigma^2$. For this tutorial, $s^2$, the mean square for error, is used as an estimate of the true population variance. The estimate of variance is computed as

   $s^2 = \dfrac{(n_1 - 1)s_1^2 + \cdots + (n_k - 1)s_k^2}{n_1 + \cdots + n_k - k}$    (4)

To compute $s^2$, the value provided by the _STDDEV_ variable contained in LIM1 is used. Recall that the weighted root mean square method was used to calculate the subgroup standard deviations in Step 1. This weighted root mean square estimate is computed as

   $\hat{\sigma} = \dfrac{\left[(n_1 - 1)s_1^2 + \cdots + (n_k - 1)s_k^2\right]^{1/2}}{c_4(n)\,(n_1 + \cdots + n_k - k)^{1/2}}$    (5)
This method provides an unbiased estimate of the subgroup standard deviations. To compute an unbiased estimate of the population variance, multiply the unbiased estimate contained in _STDDEV_ by the unbiasing factor, $c_4(n)$, as defined in the section Methods for Estimating the Standard Deviation σ in Chapter 5 of the SAS/QC® User's Guide, Version 5 Edition. SAS/QC software provides a DATA step function, C4, for calculating the control chart constant $c_4$. The square of this product provides the estimate of variance, or MSE, equal to the estimate of variance obtained from a one-way analysis of variance table. The following DATA step statements are used to compute the MSE and its corresponding degrees of freedom:

   DATA HIST1A;
      SET HIST1 END=EOF;
      RETAIN N;
      IF _N_=1 THEN DO;
         N=MOISUREN;
         CALL SYMPUT('N',LEFT(PUT(MOISUREN,4.0)));
      END;
      IF EOF THEN DO;
         SET LIM1(KEEP=_STDDEV_);
         MSE=(_STDDEV_ * C4(N*_N_ - (_N_-1)))**2;
         MSEDF=_N_ * (N-1);
         CALL SYMPUT('MSE',LEFT(PUT(MSE,8.3)));
         CALL SYMPUT('MSEDF',LEFT(PUT(MSEDF,8.0)));
         CALL SYMPUT('NTRT',LEFT(PUT(_N_,8.0)));
      END;

The DATA step above is used to store the value of MOISUREN into a macro variable for later use in the analysis. Since the example has equal sample sizes for each level of AGGREGT, only one common value is needed. This DATA step is also used to store the MSE, its associated degrees of freedom, and the number of levels of AGGREGT. This information is needed in the calculation of the decision limits. The CALL SYMPUT routine is used to avoid tedious data manipulation and merging. For this example, N=6, MSE=4960.81, MSEDF=25, and NTRT=5.

Step 4 in the ANOM technique involves the calculation of the critical value, $h_\alpha$. A small macro is provided to compute the critical values for significance levels of .10, .05, .01, and .001. As discussed earlier in this tutorial, the approximation provided by Nelson (1983) was implemented for computing the critical values.

The macro has the following syntax:

   %ANOMH(alpha,df,k)

where

   alpha   is the Type I risk, α, for computing the decision lines.
   df      is the degrees of freedom associated with $s^2$.
   k       is the number of means being compared or, in this context, the number of subgroups.

For α=.05, k=5, and df=25, the $h_\alpha$ critical value equals 2.739. This critical value, which the macro stores into a macro variable, &HALPHA, is used in a subsequent step for computing the desired decision lines.

Step 5 involves the computation of the appropriate decision lines for the ANOM technique. For equal sample sizes, they are computed as

   $\mathrm{UDL} = \bar{\bar{X}} + h_\alpha\, s \sqrt{(k-1)/(kn)}$    (6)

   $\mathrm{LDL} = \bar{\bar{X}} - h_\alpha\, s \sqrt{(k-1)/(kn)}$    (7)

To produce the appropriate decision lines, the OUTLIMITS data set, LIM1, is altered to contain decision limits instead of control limits. The following DATA step is used to create the appropriate LIMITS data set for producing an ANOM chart with PROC SHEWHART:

   DATA LIM1A;
      SET LIM1;
      HALPHA=&HALPHA;
      HDELTA=&HALPHA * SQRT(&MSE) * SQRT((&NTRT-1)/(&NTRT*&N));
      _LCLX_=_MEAN_ - HDELTA;
      _UCLX_=_MEAN_ + HDELTA;
      _STDDEV_=SQRT(&MSE);
      _ALPHA_=.05;

The creation of the ANOM chart, Step 6, is the final step in the ANOM technique. The ANOM graph in Figure 1 is produced using the following PROC SHEWHART statements:

   PROC SHEWHART HISTORY=HIST1A LIMITS=LIM1A GRAPHICS;
      XCHART MOISTURE*AGGREGT /
             STDDEVS
             TABLEOUT
             READLIMITS
             READALPHA
             CT=WHITE
             CLIMITS=WHITE
             CA=WHITE
             CFRAME=TAN
             FONT=XSWISS
             NOCONNECT
             CNEEDLES=GREEN
             COUT=ORANGE
             NOLEGEND
             UCLLABEL='UDL'
             LCLLABEL='LDL'
             HAXIS=(' ' '1' '2' '3' '4' '5' ' ');
      LABEL MOISUREX='MOISTURE ABSORPTION'
            AGGREGT='CONCRETE AGGREGATE';

Figure 1
ANOM Chart for Example 1

Several options are required to produce the ANOM graph. The STDDEVS option is needed since the HISTORY data set contains values based on the subgroup standard deviations. The READLIMITS option indicates to PROC SHEWHART that the limits for the chart should be read from the LIMITS= data set, LIM1A. Otherwise, the procedure recalculates the control limits based on the information provided by the HISTORY data set, HIST1A.

Various other options are specified to provide helpful information or to designate colors and demarcation for the chart. The TABLEOUT option creates a table, shown in Table 4, that identifies the points on the graph that exceed the decision lines. The CNEEDLES= option produces green line segments that connect the subgroup means with their grand mean. The READALPHA option is used to produce the note on the graph that shows the value of α used in the analysis. The HAXIS= option that scales the horizontal axis is new to PROC SHEWHART. It is an enhancement to be found in the next maintenance release of SAS/QC software. For details, see SAS Technical Report P-175, Changes and Enhancements to the SAS System, Release 5.18, under OS and CMS. The listing of the SAS code for this example can be found in Appendix 3.

Table 4
Resulting Table from TABLEOUT Option for Example 1

   ANALYSIS OF MEANS
   EXAMPLE 1

   Subgroup   Sample   Lower Limit for     Subgroup Mean   Upper Limit for     Mean Limit
   AGGREGT    Size     Mean (With n=6)     for MOISTURE    Mean (With n=6)     Exceeded
      1         6        491.35700          553.33333        632.24300
      2         6        491.35700          569.33333        632.24300
      3         6        491.35700          610.50000        632.24300
      4         6        491.35700          465.16667        632.24300          Lower
      5         6        491.35700          610.66667        632.24300

Figure 1 shows that the mean moisture absorption for Aggregate 4 falls below the lower decision line, so Aggregate 4 is significantly different from at least one other aggregate at an alpha level of .05. Table 4 provides the same information in tabular form. Other multiple comparison tests, such as Duncan's Multiple Range Test and Fisher's LSD, lead to the same conclusions as ANOM for this example. Ramig (1983) notes that Walpole and Myers (1972) reached the same conclusions using orthogonal contrasts with the analysis of variance.

Unequal Sample Sizes
To illustrate the use of the ANOM technique for unequal sample sizes, suppose that several samples of aggregate from Example 1 were not measurable. Table 5 contains the new unbalanced data set, EXAMPLE2. The data set contains a total (N) of 22 observations.

Table 5
Aggregate Data from Example 1 with Missing Samples

   Aggregt   Moisture
      1      551  457  450  731  499  632
      2      595  580  508  583  633
      3      639  615  511  573
      4      417  449  517  438
      5      563  631  522

Execution of Steps 1 and 2 in the ANOM technique for unequal sample sizes is no different in concept from the equal sample size case. OUTHISTORY and OUTLIMITS data sets, HIST2 and LIM2 respectively, are created. A notable difference between the one-way classification with equal sample sizes and the one with unequal sample sizes is in the coding required for the calculation of the MSE.

Step 3 in the ANOM process requires the calculation of $s^2$. As stated previously, the pooled mean square for error, or MSE, is chosen as this estimate of variance. The following DATA step is used to produce this estimate of MSE:

   DATA HIST2A;
      RETAIN SUMNI;
      SET HIST2 END=EOF;
      _PHASE_=AGGREGT;
      SUMNI+MOISUREN;
      IF EOF THEN DO;
         SET LIM2(KEEP=_STDDEV_);
         MSE=(_STDDEV_*C4(SUMNI - (_N_ - 1)))**2;
         MSEDF=SUMNI-_N_;
         CALL SYMPUT('MSE',LEFT(PUT(MSE,8.3)));
         CALL SYMPUT('MSEDF',LEFT(PUT(MSEDF,8.)));
         CALL SYMPUT('TOTN',LEFT(PUT(SUMNI,8.)));
      END;

The data set HIST2A is used as the HISTORY data set for input to PROC SHEWHART. The creation of the _PHASE_ variable is key in the establishment of the appropriate decision lines for the ANOM technique. Its purpose is discussed in Step 5.

Step 4 in the ANOM technique involves the choice of the appropriate critical value for calculating the decision lines. As discussed previously, an upper bound on $h_\alpha$, $h_\alpha^{*}$, is used. As in the equal sample size case, a small macro, ANOMH2, is provided to compute the critical values.

The macro has the following syntax:

   %ANOMH2(alpha,df,k)

where

   alpha   is the Type I risk, α, for computing the decision lines.
   df      is the degrees of freedom associated with $s^2$.
   k       is the number of means being compared.

For this example, with α=.05, k=5, and df=17, the critical value, $h_\alpha^{*}$, is 2.889. As before, the macro stores the critical value into a macro variable, &HALPHA. This critical value is required in Step 5.

Step 5 presents the major difference in the execution of the ANOM technique for unequal sample sizes. The ANOM decision lines are dependent on the individual sample sizes, $n_i$, of the grouping variable AGGREGT. Therefore, the LIMITS data set for input to PROC SHEWHART must contain an observation for each value of AGGREGT, five observations in this example. The decision lines for unequal sample sizes are calculated as

   $\mathrm{UDL} = \bar{\bar{X}} + h_\alpha^{*}\, s \sqrt{(N - n_i)/(N n_i)}$    (9)

   $\mathrm{LDL} = \bar{\bar{X}} - h_\alpha^{*}\, s \sqrt{(N - n_i)/(N n_i)}$    (10)

The following DATA step is executed to produce the LIMITS data set, LIM2A:

   DATA LIM2A;
      RETAIN _VAR_ _SUBGRP_ _SIGMAS_ _ALPHA_ _MEAN_;
      LENGTH _INDEX_ $ 4;
      IF _N_=1 THEN SET LIM2;
      SET HIST2(KEEP=MOISUREN);
      _INDEX_=_N_;
      HALPHA=&HALPHA;
      HDELTA=&HALPHA * SQRT(&MSE) * SQRT((&TOTN - MOISUREN) /
             (&TOTN*MOISUREN));
      _UCLX_=_MEAN_ + HDELTA;
      _LCLX_=_MEAN_ - HDELTA;
      _STDDEV_=SQRT(&MSE);
      _LIMITN_=MOISUREN;

Due to the presence of the unequal sample sizes, the ANOM technique requires varying decision lines. Two new variables, _PHASE_ and _INDEX_, are needed to produce these varying decision lines. The _PHASE_ variable resides in the HISTORY data set, and the _INDEX_ variable is contained in the LIMITS data set. Tables 6 and 7 contain the contents of the HISTORY and LIMITS data sets.
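The decision lines of equations (6), (7), (9), and (10) can be cross-checked outside SAS. The following is a minimal Python sketch, not part of the original SAS code: the function name `anom_limits` and the list names are hypothetical, the data are the values of Tables 1 and 5, and the critical values 2.739 and 2.889 are simply the ones reported in this tutorial. It relies on the fact that (N - n)/(Nn) reduces to (k - 1)/(kn) when all k samples have the same size n, so one formula serves both examples.

```python
from math import sqrt


def anom_limits(groups, h):
    """Return (grand_mean, mse, rows) for a one-way layout.

    rows holds one (group_mean, ldl, udl) tuple per group, using
    grand_mean -/+ h * s * sqrt((N - n_i) / (N * n_i)).
    """
    sizes = [len(g) for g in groups]
    total_n = sum(sizes)
    k = len(groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / total_n
    # Pooled mean square for error, equation (4).
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mse = sse / (total_n - k)
    rows = []
    for m, n in zip(means, sizes):
        half_width = h * sqrt(mse) * sqrt((total_n - n) / (total_n * n))
        rows.append((m, grand_mean - half_width, grand_mean + half_width))
    return grand_mean, mse, rows


# Table 1 (equal sample sizes) and Table 5 (unequal sample sizes).
EXAMPLE1 = [
    [551, 457, 450, 731, 499, 632],
    [595, 580, 508, 583, 633, 517],
    [639, 615, 511, 573, 648, 677],
    [417, 449, 517, 438, 415, 555],
    [563, 631, 522, 613, 656, 679],
]
EXAMPLE2 = [
    [551, 457, 450, 731, 499, 632],
    [595, 580, 508, 583, 633],
    [639, 615, 511, 573],
    [417, 449, 517, 438],
    [563, 631, 522],
]

gm1, mse1, rows1 = anom_limits(EXAMPLE1, 2.739)  # h from %ANOMH
gm2, mse2, rows2 = anom_limits(EXAMPLE2, 2.889)  # h* from %ANOMH2

for label, rows in (("Example 1", rows1), ("Example 2", rows2)):
    for i, (mean, ldl, udl) in enumerate(rows, start=1):
        flag = "Lower" if mean < ldl else "Upper" if mean > udl else ""
        print(f"{label} aggregate {i}: {ldl:.3f} {mean:.3f} {udl:.3f} {flag}")
```

Under these assumptions the sketch agrees with the values quoted in this tutorial: a grand mean of 561.8 and an MSE of about 4960.81 for Example 1, with only Aggregate 4 below its lower decision line, and no aggregate outside its decision lines for Example 2.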
Table 6
Contents of HISTORY Data Set for Example 2

   OBS   AGGREGT   MOISUREX   MOISURES   MOISUREN   MOISVAR   _PHASE_
    1       1       553.333    110.154       6      12133.9      1
    2       2       579.800     45.351       5       2056.7      2
    3       3       584.500     56.080       4       3145.0      3
    4       4       455.250     43.254       4       1870.9      4
    5       5       572.000     55.055       3       3031.0      5

Table 7
Contents of LIMITS Data Set for Example 2

   OBS   _VAR_      _SUBGRP_   _SIGMAS_   _ALPHA_     _MEAN_    _LCLX_    _UCLX_
    1    MOISTURE   AGGREGT       3       0.0026998   549.727   476.541   622.914
    2    MOISTURE   AGGREGT       3       0.0026998   549.727   467.088   632.367
    3    MOISTURE   AGGREGT       3       0.0026998   549.727   454.655   644.799
    4    MOISTURE   AGGREGT       3       0.0026998   549.727   454.655   644.799
    5    MOISTURE   AGGREGT       3       0.0026998   549.727   436.939   662.515

   OBS   _STDDEV_   _INDEX_   _TYPE_     _LIMITN_   HALPHA   HDELTA
    1    72.7631       1      ESTIMATE      6       2.889     73.187
    2    72.7631       2      ESTIMATE      5       2.889     82.639
    3    72.7631       3      ESTIMATE      4       2.889     95.072
    4    72.7631       4      ESTIMATE      4       2.889     95.072
    5    72.7631       5      ESTIMATE      3       2.889    112.788

Within PROC SHEWHART, the variables _PHASE_ and _INDEX_ are most often used in the generation of historical control charts. In the context of ANOM, however, these variables are used to signal PROC SHEWHART that varying decision lines exist.

Step 6 of the ANOM technique produces the ANOM chart. The existence of the _PHASE_ and _INDEX_ variables in the HISTORY and LIMITS data sets facilitates the use of the READPHASES and READINDEXES options in the XCHART statement. With the exception of these two options, Step 6 for the one-way classification with unequal sample sizes is identical to the equal sample size case.

   PROC SHEWHART HISTORY=HIST2A LIMITS=LIM2A GRAPHICS;
      XCHART MOISTURE*AGGREGT /
             STDDEVS
             TABLEOUT
             READPHASES=('1' '2' '3' '4' '5')
             READINDEXES=('1' '2' '3' '4' '5')
             READLIMITS
             NOCONNECT
             NOLEGEND
             CT=WHITE
             CLIMITS=WHITE
             CA=WHITE
             CNEEDLES=GREEN
             FONT=XSWISS
             UCLLABEL='UDL'
             LCLLABEL='LDL'
             CFRAME=LIO
             COUT=ORANGE
             HAXIS=(' ' '1' '2' '3' '4' '5' ' ');
      LABEL MOISUREX='MOISTURE ABSORPTION'
            AGGREGT='CONCRETE AGGREGATE';

The READPHASES and READINDEXES options direct PROC SHEWHART to plot the information given in the HISTORY and LIMITS data sets for those values of _PHASE_ and _INDEX_ that match the character values given in the READPHASES and READINDEXES lists. In this example, the _PHASE_ and _INDEX_ variables contain character representations of the numeric values 1 through 5, and the character list given in the XCHART options is an exhaustive one.

Figure 2
ANOM Chart for Example 2

Figure 2 presents the ANOM chart for AGGREGT. It is apparent from the graph that there is no significant difference among aggregates in terms of their effect on moisture absorption in concrete. Table 8, created by the TABLEOUT option, verifies the conclusion. A listing of the SAS code used to apply the ANOM technique to the one-way classification Example 2 is given in Appendix 4.

Table 8
Resulting Table from TABLEOUT Option for Example 2

   Phase   AGGREGT   Subgroup      Lower Limit   Subgroup Mean   Upper Limit   Mean Limit
                     Sample Size   for Mean      for MOISTURE    for Mean      Exceeded
     1        1          6          476.54067      553.33333      622.91387
     2        2          5          467.08797      579.80000      632.36658
     3        3          4          454.65509      584.50000      644.79945
     4        4          4          454.65509      455.25000      644.79945
     5        5          3          436.93915      572.00000      662.51540

SUMMARY
SAS software can easily be used to perform the ANOM technique. The SHEWHART procedure facilitates the use of the analysis of means with its variety of chart statements and options. This tutorial provides the SAS software tools necessary to perform analysis of means.

Appendix 1
/****************************************************************/
/*          S A S   S A M P L E   L I B R A R Y                 */
/*                                                              */
/*    NAME: ANOMH                                               */
/*   TITLE: MACRO FOR PROVIDING CRITICAL VALUES FOR ANALYSIS    */
/*          OF MEANS TECHNIQUE                                  */
/*     REF: L.S. NELSON (1983), 'EXACT CRITICAL VALUES FOR      */
/*          USE WITH THE ANALYSIS OF MEANS'. JOURNAL OF QUALITY */
/*          TECHNOLOGY 15, PP. 40-44.                           */
/****************************************************************/
/* THIS MACRO IS DESIGNED TO PROVIDE THE CRITICAL VALUES        */
/* NEEDED FOR USE WITH THE ANALYSIS OF MEANS. THE VALUES        */
/* ARE VALID FOR THE ANALYSIS OF MEANS OF EQUAL SAMPLE          */
/* SIZES FOR SIGNIFICANCE LEVELS OF .10, .05, .01, AND          */
/* .001. THE VALUES GENERATED ARE APPROXIMATE VALUES            */
/* WITH THE ABSOLUTE MAXIMUM DEVIATION FROM THE TRUE            */
/* TABLE VALUES TO BE LESS THAN ONE IN THE THIRD                */
/* SIGNIFICANT DIGIT.                                           */
/****************************************************************/
%MACRO ANOMH(ALPHA,DF,K);
%GLOBAL HALPHA;
DATA _NULL_;
   /* CHECK FOR ERRORS IN ARGUMENTS OF THE FUNCTION */
   IF &DF LT 3 THEN DO;
      PUT ' ERROR: THE DEGREES OF FREEDOM ARE LESS THAN 3';
      ABORT;
   END;
   IF &K GT &DF THEN DO;
      PUT ' ERROR: THE NUMBER OF MEANS IS GREATER THAN THE NUMBER OF'
          ' DEGREES OF FREEDOM. DEGREES OF FREEDOM FOR ERROR SHOULD BE'
          ' K(N-1). CHECK YOUR INPUT.';
      ABORT;
   END;
   /* BUILD ARRAYS TO CONTAIN THE CONSTANTS TO BE USED FOR */
   /* APPROXIMATING THE HALPHA VALUES.                     */
   IF &ALPHA=.10 THEN DO;
      ARRAY B100(8) B1-B8;
      B100(1)= 1.2092;
      B100(2)= 0.7992;
      B100(3)= 0.6238;
      B100(4)= 0.4797;
      B100(5)= 1.6819;
      B100(6)=-0.2155;
      B100(7)= 0.4529;
      B100(8)=-0.6095;
   END;
   IF &ALPHA=.05 THEN DO;
      ARRAY B050(8) B1-B8;
      B050(1)= 1.7011;
      B050(2)= 0.6047;
      B050(3)= 0.7102;
      B050(4)= 1.4605;
      B050(5)= 1.9102;
      B050(6)= 0.2250;
      B050(7)= 0.6300;
      B050(8)=-0.2202;
   END;
   IF &ALPHA=.01 THEN DO;
      ARRAY B010(8) B1-B8;
      B010(1)= 2.3539;
      B010(2)= 0.5176;
      B010(3)= 0.7437;
      B010(4)= 4.3161;
      B010(5)= 2.3629;
      B010(6)= 4.6400;
      B010(7)= 1.8640;
      B010(8)= 0.3204;
   END;
   IF &ALPHA=.001 THEN DO;
      ARRAY B001(8) B1-B8;
      B001(1)= 3.1981;
      B001(2)= 0.3619;
      B001(3)= 0.7886;
      B001(4)= 8.3489;
      B001(5)= 3.1003;
      B001(6)=21.1005;
      B001(7)= 5.1211;
      B001(8)= 0.7271;
   END;
   K1 = LOG(&K);
   K2 = LOG(&K-2);
   V1 = 1/(&DF-1);
   HALPHA = B1 + B2*(K1**B3) + (B4 + B5*K1)*V1 +
            (B6 + B7*K2 + B8*K2**2)*V1**2;
   CALL SYMPUT('HALPHA',LEFT(PUT(HALPHA,8.3)));
%MEND ANOMH;

Appendix 2
/****************************************************************/
/*          S A S   S A M P L E   L I B R A R Y                 */
/*                                                              */
/*    NAME: ANOMH2                                              */
/*   TITLE: MACRO FOR PROVIDING UPPER BOUNDS ON THE TRUE        */
/*          VALUE OF THE CRITICAL VALUE NECESSARY FOR THE       */
/*          ANALYSIS OF MEANS TECHNIQUE                         */
/*     REF: L.S. NELSON (1983), 'EXACT CRITICAL VALUES FOR      */
/*          USE WITH THE ANALYSIS OF MEANS'. JOURNAL OF QUALITY */
/*          TECHNOLOGY 15, PP. 40-44.                           */
/****************************************************************/
/* THIS MACRO IS DESIGNED TO PROVIDE UPPER BOUNDS ON THE        */
/* CRITICAL VALUES NEEDED FOR USE WITH THE ANALYSIS OF MEANS    */
/* TECHNIQUE FOR DATA WITH UNEQUAL SAMPLE SIZES.                */
/* THE VALUES GENERATED ARE CALCULATED AS THE UPPER 'ALPHA*'/2  */
/* PERCENTAGE POINTS OF A STUDENT'S T DISTRIBUTION, WHERE       */
/*    'ALPHA*'/2 = [1-(1-ALPHA)**(1/K)]/2                       */
/*    ALPHA = THE DESIRED SIGNIFICANCE LEVEL                    */
/*    K = THE NUMBER OF MEANS                                   */
/* IT SHOULD ALSO BE USED TO PROVIDE HALPHA VALUES WHERE THE    */
/* NUMBER OF DEGREES OF FREEDOM FOR ERROR IS LESS THAN K.       */
/* THE ARGUMENTS OF THE FUNCTION ARE THE SAME AS FOR THE        */
/* BALANCED DATA CASE. (SEE THE SAMPLE MEMBER, ANOMH)           */
/****************************************************************/
%MACRO ANOMH2(ALPHA,DF,K);
DATA _NULL_;
   ASTAR = 1 - (1-&ALPHA)**(1/&K);
   /* SINCE ASTAR CORRESPONDS TO A TWO-SIDED SIGNIFICANCE LEVEL, */
   /* THE ONE-SIDED LEVEL WOULD BE ASTAR/2                       */
   ASTAR = ASTAR/2;
   /* COMPUTE THE PROBABILITY VALUE NECESSARY FOR THE TINV FUNCTION */
   PROB = 1 - ASTAR;
   HALPHA = TINV(PROB,&DF);
   CALL SYMPUT('HALPHA',LEFT(PUT(HALPHA,8.3)));
%MEND ANOMH2;

Appendix 3
%INCLUDE ANOMH;
RUN;
DATA EXAMPLE1;
   INPUT AGGREGT $ 1 @;
   DO I=1 TO 6;
      INPUT MOISTURE @;
      OUTPUT;
   END;
   DROP I;
   CARDS;
1 551 457 450 731 499 632
2 595 580 508 583 633 517
3 639 615 511 573 648 677
4 417 449 517 438 415 555
5 563 631 522 613 656 679
PROC SHEWHART DATA=EXAMPLE1;
   XCHART MOISTURE*AGGREGT /
          NOCHART
          STDDEVS
          SMETHOD=RMSDF
          OUTHISTORY=HIST1
          OUTLIMITS=LIM1;
DATA HIST1A;
   SET HIST1 END=EOF;
   RETAIN N;
   IF _N_=1 THEN DO;
      N=MOISUREN;
      CALL SYMPUT('N',LEFT(PUT(MOISUREN,4.)));
   END;
   IF EOF THEN DO;
      SET LIM1(KEEP=_STDDEV_);
      MSE=(_STDDEV_ * C4(N*_N_ - (_N_-1)))**2;
      MSEDF=_N_*(N-1);
      CALL SYMPUT('NTRT',LEFT(PUT(_N_,4.)));
      CALL SYMPUT('MSE',LEFT(PUT(MSE,8.3)));
      CALL SYMPUT('MSEDF',LEFT(PUT(MSEDF,8.)));
   END;
RUN;
/* NOTE: ALPHA LEVEL IS .05, HALPHA IS CALCULATED */
%ANOMH(.05,&MSEDF,&NTRT);
RUN;
DATA LIM1A;
   SET LIM1;
   HALPHA=&HALPHA;
   HDELTA=&HALPHA*SQRT(&MSE)*SQRT((&NTRT-1)/(&NTRT*&N));
   _LCLX_=_MEAN_-HDELTA;
   _UCLX_=_MEAN_+HDELTA;
   _STDDEV_=SQRT(&MSE);
   _ALPHA_=.05;
RUN;
GOPTIONS NOTEXT82;
SYMBOL1 V=NONE H=3 C=WHITE W=20 F=;
TITLE1 FONT=XSWISS H=1.5 C=WHITE 'ANALYSIS OF MEANS';
TITLE2 FONT=XSWISS H=.9 C=WHITE 'EXAMPLE 1';
PROC SHEWHART HISTORY=HIST1A LIMITS=LIM1A GRAPHICS;
   XCHART MOISTURE*AGGREGT /
          STDDEVS
          SMETHOD=RMSDF
          CT=WHITE
          CFRAME=TAN
          CLIMITS=WHITE
          CA=WHITE
          COUT=ORANGE
          FONT=XSWISS
          READLIMITS
          READALPHA
          NOCONNECT
          CNEEDLES=GREEN
          NOLEGEND
          UCLLABEL='UDL'
          LCLLABEL='LDL'
          HAXIS=(' ' '1' '2' '3' '4' '5' ' ');
   LABEL MOISUREX='MOISTURE ABSORPTION'
         AGGREGT='CONCRETE AGGREGATE';
RUN;

Appendix 4
%INCLUDE ANOMH2;
RUN;
DATA EXAMPLE2;
   INPUT AGGREGT $ MOISTURE;
   CARDS;
1 551
1 457
1 450
1 731
1 499
1 632
2 595
2 580
2 508
2 583
2 633
3 639
3 615
3 511
3 573
4 417
4 449
4 517
4 438
5 563
5 631
5 522
PROC SHEWHART DATA=EXAMPLE2;
   XCHART MOISTURE*AGGREGT /
          NOCHART
          STDDEVS
          SMETHOD=RMSDF
          OUTHISTORY=HIST2
          OUTLIMITS=LIM2;
RUN;
DATA HIST2A;
   RETAIN SUMNI;
   SET HIST2 END=EOF;
   _PHASE_=AGGREGT;
   SUMNI+MOISUREN;
   IF EOF THEN DO;
      SET LIM2(KEEP=_STDDEV_);
      MSE=(_STDDEV_*C4(SUMNI - (_N_ - 1)))**2;
      MSEDF=SUMNI-_N_;
      CALL SYMPUT('MSE',LEFT(PUT(MSE,8.3)));
      CALL SYMPUT('MSEDF',LEFT(PUT(MSEDF,8.)));
      CALL SYMPUT('NTRT',LEFT(PUT(_N_,8.)));
      CALL SYMPUT('TOTN',LEFT(PUT(SUMNI,8.)));
   END;
RUN;
/* NOTE: ALPHA LEVEL IS .05, HALPHA IS CALCULATED */
%ANOMH2(.05,&MSEDF,&NTRT);
RUN;
DATA LIM2A;
   RETAIN _VAR_ _SUBGRP_ _SIGMAS_ _ALPHA_ _MEAN_;
   LENGTH _INDEX_ $ 4;
   IF _N_=1 THEN SET LIM2;
   SET HIST2(KEEP=MOISUREN);
   _INDEX_=_N_;
   HALPHA=&HALPHA;
   HDELTA=&HALPHA*SQRT(&MSE)*SQRT((&TOTN-MOISUREN) /
          (&TOTN*MOISUREN));
   _LCLX_=_MEAN_-HDELTA;
   _UCLX_=_MEAN_+HDELTA;
   _STDDEV_=SQRT(&MSE);
   _LIMITN_=MOISUREN;
   OUTPUT;
RUN;
GOPTIONS NOTEXT82;
SYMBOL1 V=NONE H=3 W=20 F=;
TITLE1 FONT=XSWISS H=1.5 C=WHITE 'ANALYSIS OF MEANS';
TITLE2 FONT=XSWISS H=.9 C=WHITE 'EXAMPLE 2';
PROC SHEWHART HISTORY=HIST2A LIMITS=LIM2A GRAPHICS;
   XCHART MOISTURE*AGGREGT /
          STDDEVS
          TABLEOUT
          READPHASES=('1' '2' '3' '4' '5')
          READINDEXES=('1' '2' '3' '4' '5')
          CT=WHITE
          CLIMITS=WHITE
          CA=WHITE
          FONT=XSWISS
          READLIMITS
          NOCONNECT
          CNEEDLES=GREEN
          COUT=ORANGE
          CFRAME=LIO
          NOLEGEND
          UCLLABEL='UDL' LCLLABEL='LDL'
          HAXIS=(' ' '1' '2' '3' '4' '5' ' ');
   LABEL MOISUREX='MOISTURE ABSORPTION'
         AGGREGT='CONCRETE AGGREGATE';
RUN;
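The two critical values quoted in this tutorial can be cross-checked without SAS. The following is a minimal Python sketch using only the standard library. It is an illustration and not the author's macros: the function names are hypothetical, only the α = .05 constants of Appendix 1 are included, and the Student's t quantile that TINV supplies in Appendix 2 is replaced here by a simple Simpson's-rule integration with bisection, which is an assumption made to keep the example self-contained.

```python
from math import gamma, log, pi, sqrt

# Constants B1..B8 for alpha = .05, as listed in Appendix 1.
B05 = (1.7011, 0.6047, 0.7102, 1.4605, 1.9102, 0.2250, 0.6300, -0.2202)


def anom_h_approx(k, df, b=B05):
    """Approximate h_alpha for k equal-sized groups (Appendix 1 formula)."""
    k1 = log(k)
    k2 = log(k - 2)
    v1 = 1.0 / (df - 1)
    return (b[0] + b[1] * k1 ** b[2]
            + (b[3] + b[4] * k1) * v1
            + (b[5] + b[6] * k2 + b[7] * k2 ** 2) * v1 ** 2)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0 via Simpson's rule on [0, x] (steps must be even)."""
    width = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * width, df)
    return 0.5 + total * width / 3


def t_quantile(p, df):
    """Quantile for 0.5 <= p < 1, found by bisection on t_cdf."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def anom_h_upper(alpha, k, df):
    """Upper bound h*_alpha: upper alpha*/2 point of t, as in Appendix 2."""
    astar = 1 - (1 - alpha) ** (1 / k)
    return t_quantile(1 - astar / 2, df)


h_equal = anom_h_approx(k=5, df=25)
h_unequal = anom_h_upper(alpha=0.05, k=5, df=17)
print(f"{h_equal:.3f} {h_unequal:.3f}")
```

With k=5 and df=25 the approximation gives a value that rounds to the 2.739 quoted for Example 1, and with k=5 and df=17 the upper bound agrees with the 2.889 quoted for Example 2 to about three decimal places.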
REFERENCES
Nelson, L. S. (1974), "Factors for the Analysis of Means," Journal of Quality Technology, 6, 175-181.
Nelson, L. S. (1983), "Exact Critical Values for Use With the Analysis of Means," Journal of Quality Technology, 15, 40-44.
Nelson, P. R. (1983), "A Comparison of Sample Sizes for the Analysis of Means and the Analysis of Variance," Journal of Quality Technology, 15, 33-39.
Nelson, P. R. (1985), "Power Curves for the Analysis of Means," Technometrics, 27, 65-73.
Nelson, P. R. (1988), "Multiple Comparisons of Means Using Simultaneous Confidence Intervals," Submitted for Publication.
Ott, E. R. (1967), "Analysis of Means - A Graphical Procedure," Industrial Quality Control, 24, 101-109.
Ramig, P. R. (1983), "Applications of the Analysis of Means," Journal of Quality Technology, 15, 19-25.
Walpole, R. E. and Myers, R. H. (1972), Probability and Statistics for Engineers and Scientists. New York: The Macmillan Co.

SAS and SAS/QC are registered trademarks of SAS Institute Inc., Cary, NC, USA.