A Reporting Macro to Create Descriptive Statistical Summary Tables

Ellen Forman, GlaxoSmithKline, Collegeville, PA

ABSTRACT

It is often necessary to produce summary tables showing statistical values such as mean, median, standard deviation (sd), minimum, and maximum for numerical data such as lab parameters. These values may be ordered by other variables such as treatment group or visit. This paper describes an easy-to-use, flexible macro that will produce a variety of these tables. The SAS product used in this paper is Base SAS, with no limitation on operating system.

INTRODUCTION

The SAS macro dstat.sas conforms to the author's company standards, which require a linesize of 90 and a pagesize of 32, corresponding to a 6 inch by 9 inch display area for Courier 12 point font. The arguments of this macro are:

ARGUMENT   DESCRIPTION
dsin       the prepared dataset from which the summary will be produced
sortby1    the first sortby variable, which will become column one
sortby1f   the format for the sortby1 variable
noprintv   a variable that provides the correct sort order but does not print on the summary report
grpvar     the grouping variable
grpf       the format for the grpvar variable
sortby2    the second sortby variable
sortby2f   the format for the sortby2 variable
value      the variable used for the statistical summary
chgbase    code 'Y' for "change from baseline" title to appear across statistical columns

Required arguments are dsin, value, and at least one of sortby1, sortby2, or grpvar. All other arguments are optional.

SAS MACRO DESIGN SPECIFICATION

The framework of the program flow is described as follows:

STEP 1
This step prepares the input dataset for the macro and accepts the user's macro argument values. There are ten macro arguments, namely, dsin, sortby1, sortby1f, noprintv, grpvar, grpf, sortby2, sortby2f, value, and chgbase.

STEP 2
This step checks whether the grpvar argument is present. If it is, N is calculated for each group using proc sql.

STEP 3
This step sorts the file by sortby1, grpvar, and sortby2. At least one of these three arguments must be specified or the sort will fail. The statistics (n, mean, median, standard deviation, minimum, and maximum) are then calculated using proc summary, and N, if present, is formatted for display.

STEP 4
This step is the proc report step, which produces the table. The column, define, and break statements are customized depending on the presence of the sortby1, sortby2, and grpvar arguments. Because of the linesize limitation, the width of each column is determined by the number of columns requested.

STEP 5
All temporary files are deleted by proc datasets at the end of the execution of the SAS macro dstat.sas.

SAS CODE FOR DESCRIPTIVE STATISTICAL SUMMARY TABLES

The entire macro code is as follows:

%macro dstat(dsin=, sortby1=, sortby1f=, noprintv=, grpvar=, grpf=,
             sortby2=, sortby2f=, value=, chgbase=);

******************************************;
* Get total (N) for each grpvar group
******************************************;
%if &grpvar ^= %then %do;
  proc sql;
    create table totgrp as
      select &grpvar, count(distinct pid) as totn
      from &dsin
      group by &grpvar;
  quit;
%end;

******************************************;
* Sort the file.
******************************************;
proc sort data=&dsin;
  by &sortby1 &grpvar &sortby2;
run;

******************************************;
* Calculate statistics.
******************************************;
proc summary data=&dsin;
  by &sortby1 &grpvar &sortby2;
  var &value;
  %if &noprintv ^= %then %do;
    id &noprintv;
  %end;
  output out=sumtot n=n mean=mean std=sd median=median min=min max=max;
run;

******************************************;
* Customize variables for proc report. If
* grpvar is not valued use proc datasets
* to rename sumtot to rpttot.
******************************************;
%if &grpvar ^= %then %do;
  proc sql;
    create table rpttot as
      select s.*, t.totn, put(t.totn, 4.) as totnc
      from sumtot s, totgrp t
      where s.&grpvar = t.&grpvar;
  quit;
%end;
%else %do;
  proc datasets library=work nolist;
    change sumtot=rpttot;
  run;
  quit;
%end;

******************************************;
* Sort file for report
******************************************;
proc sort data=rpttot;
  by &sortby1 &grpvar &noprintv &sortby2;
run;

******************************************;
* Produce the table. Optional text is
* wrapped in %do/%end so that the statement
* semicolons are never consumed by %if.
******************************************;
proc report data=rpttot headline headskip split='^';
  column (&sortby1 &grpvar
          %if &grpvar ^= %then %do; totnc %end;
          &noprintv &sortby2
          %if %upcase(&chgbase) = Y %then %do;
            ("Change from Baseline" "--" n mean sd median min max)
          %end;
          %else %do;
            n mean sd median min max
          %end;
         );

  %if &noprintv ^= %then %do;
    define &noprintv / order noprint;
  %end;

  /* grpvar has the same definition in every layout */
  %if &grpvar ^= %then %do;
    define &grpvar / order width=9 left flow
      %if &grpf ^= %then %do; f=&grpf %end;
      ;
  %end;

  /* Column widths depend on which sortby arguments are present */
  %if &sortby1 ^= %then %do;
    %if &sortby2 ^= %then %do;
      define &sortby1 / order width=11 left flow
        %if &sortby1f ^= %then %do; f=&sortby1f %end;
        ;
      define &sortby2 / order width=17 center flow
        %if &sortby2f ^= %then %do; f=&sortby2f %end;
        ;
    %end;
    %else %do;
      define &sortby1 / order width=20 left flow
        %if &sortby1f ^= %then %do; f=&sortby1f %end;
        ;
    %end;
  %end;
  %else %if &sortby2 ^= %then %do;
    define &sortby2 / order width=20 center flow
      %if &sortby2f ^= %then %do; f=&sortby2f %end;
      ;
  %end;

  %if &grpvar ^= %then %do;
    define totnc / order width=4 center 'N';
  %end;
  define n      / display width=4 spacing=1 'n'      f=8.;
  define mean   / display width=6 spacing=1 'MEAN'   f=8.2;
  define sd     / display width=6 spacing=1 'SD'     f=8.2;
  define median / display width=6 spacing=2 'MEDIAN' f=8.1;
  define min    / display width=6 spacing=1 'MIN'    f=8.1;
  define max    / display width=6 spacing=1 'MAX'    f=8.1;

  %if &grpvar ^= %then %do;
    break after &grpvar / skip;
  %end;
  %else %if &sortby1 ^= %then %do;
    break after &sortby1 / skip;
  %end;
  %else %if &sortby2 ^= %then %do;
    break after &sortby2 / skip;
  %end;
run;

******************************************;
* Delete temporary datasets
******************************************;
proc datasets nolist;
  delete &dsin tot tot1 totgrp rpttot;
run;
quit;

%mend dstat;

SAS MACRO INVOCATION AND APPLICATION

Before invoking the macro, use a DATA step to prepare the SAS dataset to be used in the summary report. A few examples of the macro follow.

Example 1:
Example 1 demonstrates the simplest use of the macro, using only one sortby variable. This example can be useful in searching for outliers.

Figure 1. Example using only the sortby1 macro argument.

Example 2:
Example 2 displays the summary statistics for each Electrocardiogram (ECG) parameter, grouped by treatment group.

Figure 2. Example using sortby1 and grpvar macro arguments.

Example 3:
Example 3 shows the summary statistics for each ECG parameter, grouped by visit. Viso was created as a noprintv variable to properly order the Visit Id column since a format was used for the sortby2 argument.

Figure 3. Example using sortby1 and sortby2 macro arguments.

Example 4:
Example 4 displays summary statistics for each ECG parameter, grouped by visit within treatment group. Trxvis was created as a noprintv variable to properly order the Visit Id column since a format was used for the sortby2 argument. The output example is shown in Figure 4.

Figure 4. Example using sortby1, sortby2, and grpvar macro arguments.

Example 5:
Example 5 demonstrates the flexibility of the macro by substituting centre for treatment group as the groupby variable. Ctrvis was created as a noprintv variable to properly order the Visit Id column since a format was used for the sortby2 argument. The output example is shown in Figure 5.
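As a rough sketch of what a call like the one described in Example 5 might look like, the following prepares a small dataset and invokes the macro with sortby1, grpvar, sortby2, and a noprintv ordering variable. The dataset name (ecg), the variable names (centre, visit, param, result), the format (visfmt.), and the data values are all hypothetical and do not come from the paper. Only pid and ctrvis are grounded in the text: the macro counts distinct pid values, and Ctrvis is the ordering variable named in Example 5, although its construction here as a plain copy of visit is an assumption. The sketch also assumes the dstat macro has already been compiled in the session.

```sas
/* Page dimensions stated in the Introduction */
options linesize=90 pagesize=32;

/* Hypothetical visit format (assumption, not from the paper) */
proc format;
  value visfmt 1 = 'Screening'
               2 = 'Week 4';
run;

/* Hypothetical prepared dataset. pid is needed because the macro
   counts distinct pid values within each grpvar group. */
data ecg;
  input pid centre $ visit param $ result;
  ctrvis = visit;   /* unformatted copy of visit, used only for ordering */
  datalines;
101 A 1 QTc 402
101 A 2 QTc 410
102 A 1 QTc 395
102 A 2 QTc 399
201 B 1 QTc 420
201 B 2 QTc 415
;
run;

/* Assumes %dstat is already defined. The macro deletes the
   dataset named in dsin when it finishes. */
%dstat(dsin=ecg,
       sortby1=param,
       grpvar=centre,
       sortby2=visit,
       sortby2f=visfmt.,
       noprintv=ctrvis,
       value=result);
```

Because the macro's final proc datasets step deletes the dataset passed as dsin, it may be safer to pass a working copy rather than the only copy of the prepared data.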
Figure 5. Example using sortby1, sortby2, and grpvar macro arguments.

Example 6:
Example 6 displays the summary statistics for change from baseline in hemoglobin by visit. The example uses the chgbase argument to add the "Change from Baseline" header over the column headings for the summary statistics.

Figure 6. Example using chgbase macro argument to add a title across the statistical columns.

CONCLUSION

This paper describes a SAS macro for generating statistical summary reports that is both flexible and easy to use. Various examples of how to invoke the macro are provided along with corresponding output.

ACKNOWLEDGMENTS

The author would like to thank Shi-Tao Yeh, Biostatistics & Data Management, GlaxoSmithKline, for his help, suggestions, and review.

CONTACT INFORMATION

Your comments and questions are encouraged. Contact the author at:

Ellen Forman
GlaxoSmithKline
Work Phone: (610) 917-6180
Email: [email protected]