USING SAS SOFTWARE IN THE DESIGN AND ANALYSIS OF TWO-LEVEL FRACTIONAL FACTORIAL EXPERIMENTS

Joanne R. Wendelberger, General Motors Research Laboratories

OVERVIEW

The design and analysis of fractional factorial experiments will be described. Use of SAS software for the design and analysis of fractional factorial experiments will be discussed and illustrated with examples from the author's experience.

INTRODUCTION

Scientific studies often investigate the effect of changing the settings of one or more variables on some measured quantity of interest. Experimental design deals with the selection of settings of the input variables at which the output or response variable is measured. A class of designs called the 2^(k-p) fractional factorials is very cost effective, especially in the initial variable screening process. These designs have been discussed in several statistical publications including Box and Hunter (1961), Box, Hunter, and Hunter (1978), Daniel (1976), and Davies (1978). Although these designs are not new, they are greatly underutilized.

Use of these fractional factorial designs allows efficient use of resources for obtaining information. Instead of requiring measurements on every possible combination of experimental settings, the fractional factorial allows the experimenter to conduct a reduced set of experiments, while still retaining the ability to obtain information about the effects of changing settings of the experimental variables. The structure of these designs results in uncorrelated estimates of effects, a desirable situation which is hard to attain when measurements are collected without controlling input variables.

In this paper, the design and analysis of 2^(k-p) fractional factorial experiments will be reviewed. To illustrate the structure of fractional factorial designs and the subsequent analysis of data obtained from such designs, an example will be discussed in detail, including use of SAS. Additional applications from the author's experience will also be provided.

FACTORIAL DESIGNS

In a factorial design experiment, factors or variables are selected which are thought to have some effect on a measured quantity of interest known as the response or dependent variable. Specific fixed levels are chosen for each of the variables, and the response is measured for different combinations of the variable settings. In a full factorial, the response is measured at every possible combination of the variable settings. Often, the full factorial requires a large number of experiments or runs. For example, if X has 2 levels, Y has 3 levels, and Z has 5 levels, the full factorial would require 2 x 3 x 5 = 30 runs. When there are many variables, or if variables have many levels, the number of runs required for a full factorial may be too large to be practical.

In a two-level factorial design each variable has 2 settings, a low value and a high value. A full two-level factorial with k variables requires 2^k runs. To illustrate, consider the example of a 2^4 factorial experiment: suppose an experimenter wants to study the effect of the variables temperature, pressure, current and time on the yield, in grams, of a chemical process. Each of the four input variables has a high and a low level as shown in Table I.

Table I
EXPERIMENTAL SETTINGS FOR A 4 VARIABLE EXPERIMENT

    VARIABLE               LOW    HIGH
    Temperature (°C)       100    120
    Pressure (lbs/in^2)     20     30
    Current                 On    Off
    Time (seconds)          30     35

Letting the minus sign (-) represent the low level and the plus sign (+) the high level for each variable, the design may be represented by a matrix of -'s and +'s indicating the levels of the four input variables for each experimental run. The letters A, B, C, and D will be used to represent the experimental variables temperature, pressure, current, and time. In the analysis phase of the experiment, regression analysis will be used to estimate the effects of varying each variable from its low to its high level. The -'s and +'s will become -1's and +1's for the experimental variable settings, recorded in columns, and these columns will then be used as independent variables for regression modeling of the dependent variable yield.

A full factorial design for this experiment requires 2^4 or 16 runs in order to obtain every possible combination. The resulting design matrix is given in Table II. Note that the pattern of the plus and minus signs can be easily generated in an orderly manner. The first variable column has alternating plus and minus signs, the second variable column has alternating pairs of plus and minus signs, and so on.

Table II
DESIGN MATRIX AND EXPERIMENTAL SETTINGS FOR A 2^4 FACTORIAL EXPERIMENT

    A  B  C  D   TEMPERATURE  PRESSURE  CURRENT  TIME
    -  -  -  -       100         20       on      30
    +  -  -  -       120         20       on      30
    -  +  -  -       100         30       on      30
    +  +  -  -       120         30       on      30
    -  -  +  -       100         20       off     30
    +  -  +  -       120         20       off     30
    -  +  +  -       100         30       off     30
    +  +  +  -       120         30       off     30
    -  -  -  +       100         20       on      35
    +  -  -  +       120         20       on      35
    -  +  -  +       100         30       on      35
    +  +  -  +       120         30       on      35
    -  -  +  +       100         20       off     35
    +  -  +  +       120         20       off     35
    -  +  +  +       100         30       off     35
    +  +  +  +       120         30       off     35

In this example, k = 4, so the matrix has k = 4 columns to represent the 4 variables and 2^k = 2^4 = 16 rows which contain the 16 possible combinations of the low and high settings for the four experimental variables. Once the data collection phase of the experiment is completed, the column for a given variable, replaced with plus and minus 1's, can be used to obtain an estimate of that variable's main effect. The main effect of an input variable measures the difference between the values of the response variable when measured at the input variable's high and low settings.

In addition to main effects, interaction effects of combinations of variables may be of interest. An interaction occurs when the effect of a variable depends on levels of one or more other variables. Interaction effects of two or more variables can be estimated using products of the columns associated with the experimental variables. For example, the interaction AB between the variables A and B is associated with the column obtained by multiplying the A and B columns together. The interaction columns will be used in the regression analysis as additional independent variables for modeling the dependent variable yield. With this design, it is possible to obtain uncorrelated estimates of the main effects A, B, C, D, the two-factor interactions AB, AC, AD, BC, BD, CD, the three-factor interactions ABC, ABD, ACD, BCD, and the four-factor interaction ABCD, plus the mean response. Note that the number of parameters which can be estimated is limited by the total number of runs, in this case 16. The complete design matrix containing all the main effects and interactions is given in Table III.

FRACTIONAL FACTORIAL DESIGNS

A class of two-level factorial designs which allows reduction of the number of runs while still allowing estimation of important effects is the family of two-level fractional factorial designs. The construction of two-level fractional factorial designs is based on the structure of two-level full factorial designs.

In practice, interactions involving three or more factors are often negligible.
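The full factorial construction described above (the orderly sign pattern, interaction columns formed as products, and a main effect as a difference of averages) can be sketched outside SAS as well. The Python fragment below is only an illustrative sketch and is not part of the original paper: the function names `full_factorial`, `column_product` and `main_effect` are made up for this example, and the response values are arbitrary numbers chosen for illustration, not data from the experiment.

```python
def full_factorial(k):
    """Return the 2**k runs of a two-level full factorial as tuples of -1/+1.

    Column j changes sign in blocks of 2**j runs and starts at the low level,
    so column 0 alternates singly, column 1 in pairs, and so on.
    """
    return [tuple(1 if (i >> j) & 1 else -1 for j in range(k))
            for i in range(2 ** k)]


def column_product(design, cols):
    """Elementwise product of the chosen columns (an interaction column)."""
    out = []
    for run in design:
        p = 1
        for c in cols:
            p *= run[c]
        out.append(p)
    return out


def main_effect(column, response):
    """Average response at +1 minus average response at -1."""
    hi = [y for s, y in zip(column, response) if s == 1]
    lo = [y for s, y in zip(column, response) if s == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)


design = full_factorial(4)             # columns 0..3 stand for A, B, C, D
A = column_product(design, [0])
B = column_product(design, [1])
AB = column_product(design, [0, 1])    # interaction column = A column times B column

# Hypothetical response (illustration only): baseline 60, A shifts it by +/-5, B by +/-2.
y = [60 + 5 * a + 2 * b for a, b in zip(A, B)]
effect_A = main_effect(A, y)
effect_AB = main_effect(AB, y)
```

With the -1/+1 coding, the effect of A in this made-up response is twice the coefficient 5, and the AB contrast is zero because the hypothetical response contains no interaction term.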
We can take advantage of the fact that some of the effects estimable in the full factorial are negligible to reduce the number of runs required for the experiment. In the example given above, assume that the 3-way and 4-way interactions are negligible. This information can then be used either to study additional variables without increasing the number of runs or to reduce the number of runs for the present study.

Suppose a fifth variable E is added to the 4 variable study described above. For example, suppose two different catalysts are being studied. Catalyst 1 can be designated as the low level and Catalyst 2 as the high level. A full factorial with 5 variables requires 2^5 = 32 runs. Instead of adding a fifth column, which would require twice as many runs, the new fifth variable is assigned to the column for the ABCD interaction to obtain its settings. Interaction effects involving the new variable E are obtained as before by multiplying columns of the desired effects. Instead of representing just one effect, each column now may represent multiple effects. These effects are said to be aliased or confounded. Table IV gives the new design matrix, experimental settings, and sample fictitious data for the response variable yield.

Table III
COMPLETE DESIGN MATRIX FOR A 2^4 FACTORIAL EXPERIMENT

    A  B  C  D  AB AC AD BC BD CD ABC ABD ACD BCD ABCD
    -  -  -  -  +  +  +  +  +  +   -   -   -   -   +
    +  -  -  -  -  -  -  +  +  +   +   +   +   -   -
    -  +  -  -  -  +  +  -  -  +   +   +   -   +   -
    +  +  -  -  +  -  -  -  -  +   -   -   +   +   +
    -  -  +  -  +  -  +  -  +  -   +   -   +   +   -
    +  -  +  -  -  +  -  -  +  -   -   +   -   +   +
    -  +  +  -  -  -  +  +  -  -   -   +   +   -   +
    +  +  +  -  +  +  -  +  -  -   +   -   -   -   -
    -  -  -  +  +  +  -  +  -  -   -   +   +   +   -
    +  -  -  +  -  -  +  +  -  -   +   -   -   +   +
    -  +  -  +  -  +  -  -  +  -   +   -   +   -   +
    +  +  -  +  +  -  +  -  +  -   -   +   -   -   -
    -  -  +  +  +  -  -  -  -  +   +   +   -   -   +
    +  -  +  +  -  +  +  -  -  +   -   -   +   -   -
    -  +  +  +  -  -  -  +  +  +   -   -   -   +   -
    +  +  +  +  +  +  +  +  +  +   +   +   +   +   +

Table V
ALIASING PATTERN FOR A 2^(5-1) FRACTIONAL FACTORIAL DESIGN

    I  = ABCDE
    A  = BCDE
    B  = ACDE
    C  = ABDE
    D  = ABCE
    E  = ABCD
    AB = CDE
    AC = BDE
    AD = BCE
    AE = BCD
    BC = ADE
    BD = ACE
    BE = ACD
    CD = ABE
    CE = ABD
    DE = ABC

relations or generators: A table in Box, Hunter, and Hunter (1978) gives generators for several 2^(k-p) designs. A U. S.
Department of Commerce Publication (1957) gives defining relationships for several designs, from which generators can be obtained. Aliasing patterns are also obtained from the generators.

The algebra used to multiply words together is as follows. If the total number of times a letter appears in the words is odd, then it appears in the resulting product. If the total is even, the letter does not appear in the product. For example, if ABCD and CDE are multiplied using this rule, the result is ABE since A, B, and E each appear once, and C and D each appear twice. Any letter times itself is equal to the identity I. For example AB x AB = A^2 B^2 = I x I = I. This algebra can be used to generate the entire aliasing pattern and is described in detail in Box, Hunter, and Hunter (1978).

Given the generators of a specific design, the design matrix can be easily generated with SAS software using DATA steps. In the SAS program steps +1's and -1's are used instead of plus and minus signs. First, the design matrix of a full factorial is created for the first k - p variables either by direct input, or by programming statements as shown in the following DATA step for the 2^(5-1) example defined by I = ABCDE. Note the pattern forming the argument for the INT function, which holds for any number of variables.

    DATA FULLFACT;
      DO I=1 TO 16;
        A=(-1)**I;
        B=(-1)**(INT((I+1)/2));
        C=(-1)**(INT((3+I)/4));
        D=(-1)**(INT((7+I)/8));
        OUTPUT;
      END;

The generator relationships are then used to obtain one or more columns for additional variables by multiplying together columns of the full factorial in assignment statements.

    DATA FRACTION;
      SET FULLFACT;
      E=A*B*C*D;

Columns corresponding to the interactions are also generated by multiplication in assignment statements. Note that only one column is generated for each pair of aliased effects. For example, since AC = BDE, only the AC column is generated.

    DATA DESIGN;
      SET FRACTION;
      AB=A*B; AC=A*C; AD=A*D; AE=A*E;
      BC=B*C; BD=B*D; BE=B*E;
      CD=C*D; CE=C*E;
      DE=D*E;

Table IV
DESIGN MATRIX AND EXPERIMENTAL SETTINGS FOR A 2^(5-1) FRACTIONAL FACTORIAL EXPERIMENT

    A  B  C  D  E   TEMP  PRESS  CUR  TIME  CAT   YLD
    -  -  -  -  +   100    20    on    30    2    52.4
    +  -  -  -  -   120    20    on    30    1    15.4
    -  +  -  -  -   100    30    on    30    1    68.9
    +  +  -  -  +   120    30    on    30    2    61.1
    -  -  +  -  -   100    20    off   30    1    58.8
    +  -  +  -  +   120    20    off   30    2    69.3
    -  +  +  -  +   100    30    off   30    2    62.9
    +  +  +  -  -   120    30    off   30    1    11.6
    -  -  -  +  -   100    20    on    35    1    59.6
    +  -  -  +  +   120    20    on    35    2    69.1
    -  +  -  +  +   100    30    on    35    2    63.0
    +  +  -  +  -   120    30    on    35    1    72.5
    -  -  +  +  +   100    20    off   35    2    53.1
    +  -  +  +  -   120    20    off   35    1    15.8
    -  +  +  +  -   100    30    off   35    1    69.9
    +  +  +  +  +   120    30    off   35    2    66.9

This design is called a half-fraction of the 32 run full factorial design. Since it has 5 variables, with one of them defined in terms of the other 4, it is a 2^(5-1) fractional factorial design. Note that the number of runs is equal to 2^(5-1) = 2^4 = 16, half the number required for a full 5-variable factorial.

Table V shows the complete aliasing pattern for the 2^(5-1) design resulting from setting E = ABCD. I is used to represent the identity element and consists of a column of 16 plus signs. The I column will be used to estimate the response average in the regression analysis of the data. Note that each main effect and 2-factor interaction is aliased with a 3 or 4-way interaction which was assumed to be negligible. Thus, ignoring effects expected to be unimportant, we can still obtain estimates of all main effects and 2-way interactions.

GENERATING DESIGN MATRICES

The portion of the aliasing pattern involving I, the effect associated with the mean, is called the defining relation.
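The word algebra and the aliasing pattern that follows from the defining relation I = ABCDE can be sketched in a few lines. This is an illustrative Python sketch rather than anything from the paper: the function names `multiply_words` and `alias_of` are hypothetical, and the sketch assumes that the letter I is reserved for the identity and is never used as a variable name.

```python
def multiply_words(*words):
    """Multiply effect 'words': a letter is kept only if it occurs an odd
    number of times in total; an empty product is the identity 'I'."""
    counts = {}
    for word in words:
        for letter in word:
            if letter != "I":                 # 'I' is the identity (assumption)
                counts[letter] = counts.get(letter, 0) + 1
    product = "".join(sorted(l for l, n in counts.items() if n % 2 == 1))
    return product or "I"


def alias_of(effect, defining_word):
    """Alias of an effect in a half fraction with defining relation I = defining_word."""
    return multiply_words(effect, defining_word)


effects = ["I", "A", "B", "C", "D", "E",
           "AB", "AC", "AD", "AE", "BC", "BD", "BE", "CD", "CE", "DE"]
alias_table = {e: alias_of(e, "ABCDE") for e in effects}
```

For designs with more than one generator, each effect would be multiplied by every word of the full defining relation; that extension is not shown here.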
Depending on the degree of fractionation, the defining relation contains one or more effects or words. The words in the defining relation, or some subset of these words, can be used to generate the entire aliasing pattern. The words in such a set are known as generators. Publications listing experimental designs may include defining

response which are collected from repeated runs of the experiment at the same experimental settings. If replicate measurements are not available and some alias sets can be assumed negligible, then the residual mean square error of the regression model can be used to compute an alternate error estimate. Further details on obtaining error estimates may be found in Box, Hunter, and Hunter (1978).

The total number of variables for the complete design should be at most 2^(k-p) - 1.

RANDOMIZATION

The order in which experimental runs are done should be random to distribute the effects of unknown, hidden variables. Randomization can be done in SAS using the PLAN procedure, or for more complex situations using DATA steps, generating random numbers, and then using RANK and SORT procedures. For the 2^(5-1) chemical yield experiment example, a completely randomized run order can be obtained using PROC PLAN. Output from the procedure consists of a list of the numbers from 1 to 16 in random order.

Once an estimate of the standard error for a coefficient has been obtained, one can assess whether individual coefficients are important. To determine the significance of a particular coefficient, the ratio of the estimated coefficient to its standard error is compared to the percentage points of Student's t-distribution. Traditionally, this computation has been done using tables. With SAS software, the probability value for a given value of t and a specified number of degrees of freedom may be obtained with the PROBT function.
Use of the PROBT function for the two-sided t-test appropriate for testing whether an effect is significantly different from zero is documented in SAS User's Guide: Basics under SAS functions.

    PROC PLAN;
      TITLE 'COMPLETELY RANDOMIZED RUN ORDER';
      FACTORS RUN=16;

Sample Run Order: 13, 4, 9, 3, 14, 7, 15, 5, 16, 6, 12, 8, 11, 2, 1, 10

ESTIMATION OF EFFECTS

Estimation of the effects of variables and their interactions may be done by simple averaging techniques, the Yates algorithm, or regression. Averaging methods for small studies using the simple geometrical structure of factorials shed light on the interpretation of what is meant by a main effect or an interaction. The Yates algorithm provides a shortcut useful for hand calculations. However, for routine calculations the regression technique is most convenient, particularly for large studies. The response variable is modeled as a linear function of the variables and their interactions. The regression coefficients, which can be obtained from the SAS regression procedure PROC REG, will correspond to the effects on the response of varying the settings of input variables from low to high values. However, with the -1 and +1 scaling, regression estimates will be equal to one half times the value of the effect of changing from the low setting to the high setting. Thus, to obtain values which can be interpreted directly as effects, the coefficients of the input variables must be multiplied by two. The intercept term corresponds to the average response and does not require doubling.

Suppose that for the chemical yield data, each data value was actually the average of two measurements, Y1 and Y2, obtained in separate experimental runs and that the resulting standard error of each individual coefficient has been determined to be .91. Then to determine the significance of the A effect, consider the test statistic t, the ratio of the estimated A effect to its standard error. The value t = 10.9 is compared to a t-distribution with 16 degrees of freedom.
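For readers without access to the PROBT function, the two-sided probability for this t-ratio can be approximated directly from the t density. The following Python sketch is an illustration only and is not the paper's method: it uses the A effect of 9.96 from Table VI and the standard error of .91 quoted above, and it integrates the density numerically with Simpson's rule, which is a choice made here for the sake of a self-contained example. The names `t_pdf` and `two_sided_p` and the `steps` parameter are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=20000):
    """Approximate P(|T| >= |t|) using Simpson's rule on [0, |t|].

    steps must be even; 20000 is an arbitrary, generous choice.
    """
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3               # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central)


t_stat = 9.96 / 0.91                      # estimated A effect over its standard error
p_value = two_sided_p(t_stat, 16)         # 16 degrees of freedom, as in the text
```

The resulting probability is far below .001, in line with the conclusion drawn in the text that the A effect is significant.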
Using the SAS PROBT function, the probability of obtaining a value as large in magnitude as 10.9, if A has no effect, is found to be less than .001. Since a value which occurs less than one time in a thousand seems highly unlikely, one concludes that the variable A does have a significant effect. Similarly, the effects of B, AB, and E are found to be significant. The remaining effects had t-ratios which were not large enough to be considered unlikely and hence were not considered significant.

Normal probability plots may be useful in displaying the importance of different effects. In a normal probability plot, data values are plotted against their normal scores, a transformed version of the data. If the data behaves like a random sample from a normal distribution, the points should lie along a reference line which is generated using the mean and standard deviation of the data. If a set of effects is plotted, and some of the effects are important, then the data will not follow the reference line. Instead, the points for the nonsignificant effects usually lie along some different line, and the points of the significant effects will be offset. Removal of significant effects and replotting of the remaining effects should result in the expected linear pattern.

For the above example, regression analysis can be used to obtain estimates of effects, regression diagnostics, residuals and predicted values for further analysis. Given a data set containing the complete design matrix for the above example and the measured values for the response variable, the estimation of regression coefficients can be accomplished with the statements below. Estimates of the average and variable effects obtained from the intercept and doubled regression coefficients are given in Table VI.

Normal probability plots may be generated using the RANK, MEANS and GPLOT (or PLOT) procedures. In a SAS Institute Technical Report (1983), Chilko discusses probability plotting with SAS software.
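The normal scores and reference line used in such a plot can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the paper's program: it uses Blom's approximation, in which the score for the value of rank r among n values is the standard normal quantile of (r - 3/8)/(n + 1/4), it does not average tied ranks, and the effect values in the example are hypothetical. The function names are made up for this sketch.

```python
from statistics import NormalDist, mean, stdev


def blom_scores(values):
    """Blom normal scores: normal quantile of (rank - 3/8) / (n + 1/4).

    Ties are ranked in input order here (a simplification).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = NormalDist().inv_cdf((rank - 0.375) / (n + 0.25))
    return scores


def reference_line(values, scores):
    """Reference line through the scores: mean + score * standard deviation."""
    m, s = mean(values), stdev(values)
    return [m + z * s for z in scores]


effects = [0.3, -1.2, 2.5, 0.0, -0.4]     # hypothetical effect estimates
scores = blom_scores(effects)
line = reference_line(effects, scores)
# Plotting effects against scores, with line overlaid, gives the probability plot.
```

Points that fall well away from the line are the candidates for significant effects, as described above.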
Half-normal probability plots, discussed by Daniel (1959), may also be useful.

Figure 1 contains a normal probability plot of all of the effects of the model estimated above for the chemical yield study. Most of the effects lie along a line different from the reference line. The A, B, AB, and E effects are offset from this line indicating that they are significant. Figure 2 contains a second plot of the effects after the significant effects are removed. The points lie along the reference line as expected. SAS program statements to generate a normal probability plot of the effects and a reference line from a dataset EFFECTS containing the effect values in the variable EFFECT are as follows.

    PROC RANK DATA=EFFECTS NORMAL=BLOM OUT=RANKS;
      VAR EFFECT;
      RANKS EFFRANK;
    PROC MEANS;
      VAR EFFECT;
      OUTPUT OUT=SUMMARY MEAN=MEAN STD=STD;
    DATA LINE;
      SET SUMMARY;
      LOOP: SET RANKS;
      LINE=MEAN+EFFRANK*STD;
      OUTPUT;
      GOTO LOOP;
    PROC GPLOT;

    PROC REG;
      MODEL Y=A B C D E AB AC AD AE BC BD BE CD CE DE;

Table VI
ESTIMATES OF EFFECTS FOR 2^(5-1) FRACTIONAL FACTORIAL

    EFFECT     VALUE
    Average    66.06
    A           9.96
    B           3.a6
    C           -.04
    D            .48
    E          -6.02
    AB         -6.60
    AC          -.24
    AD          -.16
    AE           .44
    BC          -.04
    BD          -.06
    BE           .26
    CD           .24
    CE           .06
    DE          -.26

    t = 9.96 / .91 = 10.9

EXAMINATION OF EFFECTS

Individual coefficients for each of the effects must be examined relative to the natural random variation inherent in the data. An error estimate may be obtained from replicate measurements of the

Figure 1. NORMAL PROBABILITY PLOT OF EFFECTS, ALL EFFECTS INCLUDED (effects plotted against normal scores)
Figure 2. NORMAL PROBABILITY PLOT OF EFFECTS (effects plotted against normal scores)
Figure 3. RESIDUALS VS PREDICTED YIELD VALUES
Figure 4. NORMAL PROBABILITY PLOT OF RESIDUALS (residuals plotted against normal scores)

of residuals vs. predicted values is displayed in Figure 3. No obvious inadequacies are evident. The following statements can be used to plot residuals against predicted values and against each of the experimental variables.
    PROC GPLOT;
      PLOT RESIDUAL*(PREDICT A B C D E);

The normal probability plots discussed earlier for analysis of effects may also be used as graphical tools for studying residuals. A normal probability plot of the residuals provides a check of whether the residuals are normally distributed, as required for some statistical tests. Points which deviate strongly from the line of a normal probability plot may indicate lack of fit in the model or presence of outliers. Figure 4 shows a normal plot of the residuals of the chemical yield data which does not suggest any inadequacies.

      PLOT EFFECT*EFFRANK LINE*EFFRANK / OVERLAY;
      SYMBOL1 V=STAR I=NONE C=BLACK;
      SYMBOL2 V=NONE I=JOIN C=BLACK;

Based on the results of statistical tests and graphical displays of the effects, effects which are significant are selected. These effects may then be used to predict the response value within the range of experimental conditions studied. One method of obtaining predicted values is to evaluate the response from a regression model containing only the significant effects. Residuals, which are obtained by subtracting the predicted response values from the actual measured response values, may also be obtained. The SAS REG procedure is used again with the chemical data but this time an OUTPUT statement is included to save the predicted values and residuals in an output dataset called REGOUT.

    PROC REG;
      MODEL Y=A B AB E;
      OUTPUT OUT=REGOUT P=PREDICT R=RESIDUAL;

RESIDUAL ANALYSIS

An important part of the modeling process is the model checking phase. Careful examination of the residuals or deviations of the model from the data can reveal flaws or inconsistencies. Plots of residuals versus predicted response values and of residuals versus the experimental values provide useful diagnostic tools. For a model to be considered adequate, these plots should appear to be random, without evidence of obvious patterns such as trends or increasing spread in the values. Draper and Smith (1981) discuss the use of residual analysis for model checking.

Plots of residuals versus predicted values and experimental settings may be obtained using the output from PROC REG if residuals and predicted values are saved in an output dataset with the OUTPUT statement. For the chemical yield example, a plot

behavior, and results were difficult to interpret. Investigation finally led the experimenter back to the original recording sheets from the experiment, and a transcription error was discovered. When the incorrect value was corrected the probability plot of residuals appeared reasonably normal, and effects of variables made sense. The graphical tool used in the analysis helped uncover an outlier which could have led to mistaken conclusions.

INTERPRETATION OF EFFECTS

Interpretation of main effects depends on the presence or absence of significant interaction effects. When a variable is not involved in any significant interactions, its effect may be interpreted as the difference in the response which occurs when the input is varied from its low value to its high value. However, if the variable does significantly interact with one or more other variables, the value of the main effect by itself is not meaningful. Rather, the main effect takes on a different value at each level of the other variables with which it interacts.

Returning to the chemical yield experiment, the significant effects can be examined in light of the original experimental variables. The E effect corresponds to the Catalyst variable. Since the value of the E effect is -6.02, using Catalyst 2 instead of Catalyst 1 results in a drop in yield of 6 grams. A and B correspond to the variables Temperature and Pressure. The interaction term AB was also significant. Therefore the main effects of A and B are not meaningful by themselves. Instead, we say that the effect of changing the temperature from 100° to 120° is different at the two different
pressure levels used in the experiment. Since the variables C and D do not appear in any of the significant effects, we may conclude that the effects of current and time on yield are negligible within the experimental ranges studied.

ADDITIONAL EXAMPLES

Example 1: Experiment Screening Design for Industrial Process

A set of process variables was identified as potentially important factors in determining the quality of the finished product. A screening design was desired to try to select a subset of important variables for further study. Generators for a 2^(13-8) design were determined based on the aliasing pattern of a design in a Department of Commerce (1957) publication, "Fractional Factorial Experiment Designs for Factors at Two Levels." Use of blocking variables allowed the runs to be grouped into 4 blocks, corresponding to the 4 days of the experiment. Blocks are used to try to account for variation between the different days. Blocking variables are assigned to interaction columns in the same way as additional experimental variables. Within each block, run order was randomized. Appendix A gives the program used to prepare the randomized design. The different stages of the design are produced as output from the program.

Example 2: Central Composite Design for Optimizing Equipment Settings

In some situations, an experimenter may need to estimate quadratic effects in addition to the main effects and two-factor interactions which can be estimated with a fractional factorial design. An extension of factorial and fractional factorial designs which allows estimation of quadratic effects is the central composite design. The central composite design is a useful tool for exploring surfaces which arise when a response variable is modeled as a quadratic function of experimental variables. In a composite design, a factorial or fractional factorial design is augmented by two additional types of design points. Center points occur at the center of the range of settings for each variable. Star points fix one variable at a non-central value and all the other variables at the center. The factorial points, star points, and center points are combined to form a composite design.

An experimenter was interested in studying a mechanical system to determine optimum operating conditions. Eight settings could be varied on the equipment. A 2^(8-3) fractional factorial design would provide information on main effects and some two-factor interactions. Adding star points and center points to the fractional factorial produced a central composite design which provided additional information to estimate quadratic effects. The program given in Appendix B illustrates generation of the design matrix and run randomization.

Further information on composite designs and other response surface techniques is available in Myers (1976). Recent work by Draper (1985) improves on traditional central composite designs by further reducing the number of required runs in certain cases.

Example 3: Outlier Detection in a Biomechanics Study

A 2^(5-2) fractional factorial design was run to study the effects of 5 variables on a biomechanical system. At the end of the 8 runs, a concern that some other variables had not been accounted for led to augmentation by another set of 8 runs, resulting in a 2^(7-3) design. The combined data was analyzed as described in the chemical yield example.

CONCLUSION

The 2^(k-p) fractional factorial designs are useful for studying the effects of several variables simultaneously. SAS software provides several useful tools for designing and analyzing data from fractional factorials. The use of 2^(k-p) fractional factorials and other statistical experimental designs should be encouraged. The programs described here could be modified to accommodate more general problems, rather than just specific cases. Use of other SAS facilities, such as the macro language and SAS/AF, which are documented by SAS Institute (1985), might be useful in implementing a system for assisting experimenters with the design and analysis of experiments.

SAS and SAS/GRAPH are registered trademarks of SAS Institute Inc., Cary, NC, USA. SAS/AF is a trademark of SAS Institute Inc.

REFERENCES

1. Box, G. E. P., and Hunter, J. S. (1961), "The 2^(k-p) Fractional Factorial Designs, Part I," Technometrics, 3, 311-351.
2. Box, G. E. P., and Hunter, J. S. (1961), "The 2^(k-p) Fractional Factorial Designs, Part II," Technometrics, 3, 449-458.
3. Box, Hunter, and Hunter (1978), Statistics for Experimenters, New York: John Wiley.
4. Chilko, Daniel (1983), Probability Plots, SAS Technical Report A-106, SAS Institute Inc., Cary, NC.
5. Daniel, C. (1959), "Use of Half-Normal Plots in Interpreting Factorial Two-Level Experiments," Technometrics, 1, 149.
6. Daniel, C. (1976), Applications of Statistics to Industrial Experimentation, New York: John Wiley.
7. Davies, O. L. (1978), The Design and Analysis of Industrial Experiments, London: Longman Group.
8. Draper, N. R. (1985), "Small Composite Designs," Technometrics, 27, 173-180.
9. Draper, N. R. and Smith, H. (1981), Applied Regression Analysis, New York: John Wiley.
10. Myers, R. H. (1976), Response Surface Methodology, Blacksburg, Virginia: Virginia Polytechnic Institute and State University.
11. SAS Institute Inc. (1985), SAS User's Guide: Basics, Version 5 Edition, Cary, NC.
12. SAS Institute Inc. (1985), SAS User's Guide: Statistics, Version 5 Edition, Cary, NC.
13. SAS Institute Inc. (1985), SAS/GRAPH User's Guide, Version 5 Edition, Cary, NC.
14. SAS Institute Inc. (1985), SAS/AF User's Guide, Version 5 Edition, Cary, NC.
15. U. S. Department of Commerce (1957), Fractional Factorial Experiment Designs for Factors at Two Levels, National Technical Information Service.
A probability plot of the residuals indicated non-normal

APPENDIX A - SCREENING DESIGN

    * PROGRAM TO GENERATE EXPERIMENTAL DESIGN FOR 2**(13-8) FRACTIONAL
      FACTORIAL IN 4 BLOCKS OF 10 UNITS EACH. RUNS ARE RANDOMIZED
      WITHIN EACH BLOCK. EACH BLOCK INCLUDES 2 REPLICATE POINTS;

    * GENERATE 2**5 FACTORIAL DESIGN;
    data full;
      do i=1 to 32;
        a=(-1)**i;
        b=(-1)**(int((1+i)/2));
        c=(-1)**(int((3+i)/4));
        e=(-1)**(int((7+i)/8));
        f=(-1)**(int((15+i)/16));
        output;
      end;
    proc print;
      title '2**5 FACTORIAL DESIGN';

    * GENERATE 2**(13-8) FRACTIONAL FACTORIAL IN 4 BLOCKS OF 8;
    data fraction;
      set full;
      d=a*b*c; g=b*e*f; h=b*c*e; j=a*b*c*e*f;
      k=c*e*f; l=b*c*f; m=a*b*e; n=a*b*f;
      blk1=f*g; blk2=g*h;
      exp=_n_;
    title '2**(13-8) FRACTIONAL FACTORIAL DESIGN IN 4 BLOCKS OF 8';
    proc print;
      var a b c d e f g h j k l m n blk1 blk2;

    * GENERATE REPLICATE POINTS;
    data reps;
      input exp a b c d e f g h j k l m n blk1 blk2;
      cards;
    33 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -1
    34 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -1
    35 0 0 0 0 0 0 0 0 0 0 0 0 0 -1  1
    36 0 0 0 0 0 0 0 0 0 0 0 0 0 -1  1
    37 0 0 0 0 0 0 0 0 0 0 0 0 0  1 -1
    38 0 0 0 0 0 0 0 0 0 0 0 0 0  1 -1
    39 0 0 0 0 0 0 0 0 0 0 0 0 0  1  1
    40 0 0 0 0 0 0 0 0 0 0 0 0 0  1  1
    ;

    * ADD REPLICATE POINTS TO BLOCKED FRACTIONAL FACTORIAL;
    data blocks;
      set fraction reps;
    proc sort;
      by blk1 blk2;
    title '2**(13-8) FRACTIONAL FACTORIAL DESIGN';
    title2 'IN 4 BLOCKS OF 10';
    title3 'INCLUDING REPLICATE POINTS';
    proc print;
      var exp a b c d e f g h j k l m n blk1 blk2;

    * RANDOMIZE THE ORDER OF BLOCKS AND RUNS WITHIN BLOCKS;
    proc sort data=blocks;
      by blk1 blk2;
    data randblk;
      do x1=1 to 4;
        block=ranuni(0);
        output;
      end;
    proc rank out=ranks;
      var block;
    data ranks;
      set ranks;
      do x2=1 to 10;
        run=ranuni(0);
        output;
      end;
    data design;
      merge blocks ranks;
    proc sort;
      by block;
    proc rank out=final;
      var run;
      by block;
    proc sort;
      by block run;
    title2 'RANDOMIZED WITHIN 4 BLOCKS OF 10';
    proc print;
      var exp block run a b c d e f g h j k l m n;
    run;

APPENDIX B - CENTRAL COMPOSITE DESIGN

    * PROGRAM WHICH GENERATES A 2**(8-3) FRACTIONAL FACTORIAL DESIGN,
      ADDS STAR POINTS AND CENTER POINTS TO FORM A CENTRAL COMPOSITE
      DESIGN, AND RANDOMIZES RUN ORDER;

    * GENERATE A FULL 2**5 FACTORIAL;
    data full;
      do i=1 to 32;
        a=(-1)**i;
        b=(-1)**(int((1+i)/2));
        c=(-1)**(int((3+i)/4));
        d=(-1)**(int((7+i)/8));
        e=(-1)**(int((15+i)/16));
        output;
      end;
    title1 '2**5 FACTORIAL DESIGN';
    proc print;

    * ADD VARIABLES TO OBTAIN FRACTIONAL FACTORIAL;
    data fraction;
      set full;
      f=a*b*c; g=a*b*d; h=b*c*d*e;
    title1 '2**(8-3) FRACTIONAL FACTORIAL DESIGN';
    proc print;

    * GENERATE STAR POINTS AND CENTER POINTS;
    data star;
      input a b c d e f g h;
      cards;
     2  0  0  0  0  0  0  0
    -2  0  0  0  0  0  0  0
     0  2  0  0  0  0  0  0
     0 -2  0  0  0  0  0  0
     0  0  2  0  0  0  0  0
     0  0 -2  0  0  0  0  0
     0  0  0  2  0  0  0  0
     0  0  0 -2  0  0  0  0
     0  0  0  0  2  0  0  0
     0  0  0  0 -2  0  0  0
     0  0  0  0  0  2  0  0
     0  0  0  0  0 -2  0  0
     0  0  0  0  0  0  2  0
     0  0  0  0  0  0 -2  0
     0  0  0  0  0  0  0  2
     0  0  0  0  0  0  0 -2
     0  0  0  0  0  0  0  0
     0  0  0  0  0  0  0  0
    ;

    * FORM COMPOSITE DESIGN;
    data compos;
      set fraction star;
    title1 '8 VARIABLE COMPOSITE DESIGN';
    proc print;

    * GENERATE A RANDOM ORDER FOR THE 50 DESIGN POINTS;
    data rand;
      do exp=1 to 50;
        x=ranuni(0);
        run=x;
        output;
      end;
    run;
    proc rank out=random;
      var run;

    * ASSIGN RANDOM ORDER TO POINTS IN COMPOSITE DESIGN;
    data design;
      merge compos random;
    proc sort;
      by run;
    title1 'CENTRAL COMPOSITE DESIGN IN RANDOM ORDER';
    proc print;
      var run exp a b c d e f g h;
    run;
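The structure of the Appendix B program (fractional factorial points, star points, center points, then a random run order) can also be sketched in Python. This is an illustrative sketch only, not a translation of the author's program: the generators f = abc, g = abd and h = bcde are taken from the appendix, while the star distance of 2 and the two center points are assumptions inferred from the appendix data lines and the stated total of 50 design points. The function names and the `seed` parameter are hypothetical.

```python
import random


def star_points(k, alpha=2):
    """Two star points per variable: one variable at +/-alpha, the rest at 0."""
    pts = []
    for j in range(k):
        for level in (alpha, -alpha):
            p = [0] * k
            p[j] = level
            pts.append(tuple(p))
    return pts


def central_composite(factorial_runs, n_center=2, alpha=2):
    """Factorial points, then star points, then n_center center points."""
    k = len(factorial_runs[0])
    return list(factorial_runs) + star_points(k, alpha) + [(0,) * k] * n_center


def randomized(points, seed=None):
    """Attach a random run number to each point and sort by run number."""
    rng = random.Random(seed)
    order = list(range(1, len(points) + 1))
    rng.shuffle(order)
    return sorted(zip(order, points))


# Full 2**5 factorial in a, b, c, d, e (a changes fastest), then the generated columns.
base = [tuple(1 if (i >> j) & 1 else -1 for j in range(5)) for i in range(32)]
fraction = [(a, b, c, d, e, a * b * c, a * b * d, b * c * d * e)
            for a, b, c, d, e in base]

design = central_composite(fraction)      # 32 factorial + 16 star + 2 center points
plan = randomized(design, seed=1)         # list of (run number, design point)
```

Passing a fixed `seed` makes the run order reproducible; leaving it as `None` gives a fresh random order each time, which is closer to the behavior of RANUNI(0) in the appendix.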