A Reporting Macro to Create Descriptive Statistical Summary Tables
Ellen Forman, GlaxoSmithKline, Collegeville, PA
ABSTRACT
It is often necessary to produce summary tables showing
statistical values such as mean, median, standard
deviation (sd), minimum, and maximum for numerical
data such as lab parameters. These values may be
ordered by other variables such as treatment group or
visit.
This paper describes an easy-to-use, flexible macro that
produces a variety of these tables.
The SAS product used in this paper is Base SAS, with no
limitation of operating systems.
INTRODUCTION
The SAS macro dstat.sas conforms to the author's
company standards that require a linesize of 90 and a
pagesize of 32, which corresponds to a 6 inch by 9 inch
display area for Courier 12 point font.
The arguments of this macro are:
ARGUMENT    DESCRIPTION
dsin        the prepared dataset from which the summary
            will be produced
sortby1     the first sortby variable, which will become
            column one
sortby1f    the format for the sortby1 variable
noprintv    a variable that provides the correct sort order
            but does not print on the summary report
grpvar      the grouping variable
grpf        the format for the grpvar variable
sortby2     the second sortby variable
sortby2f    the format for the sortby2 variable
value       the variable used for the statistical summary
chgbase     code 'Y' for "change from baseline" title to
            appear across statistical columns
Required arguments are dsin, value, and at least one of
sortby1, sortby2, or grpvar. All other arguments are
optional.
SAS MACRO DESIGN SPECIFICATION
The framework of the program flow is described as
follows:
STEP 1
This step prepares the input dataset for the macro and
accepts the user's macro argument values. There are ten
macro arguments, namely, dsin, sortby1, sortby1f,
noprintv, grpvar, grpf, sortby2, sortby2f, value, and
chgbase.
STEP 2
This step checks whether the grpvar argument has a value.
If it does, N (the number of distinct pid values) is
calculated for each group using proc sql.
STEP 3
This step sorts the file by sortby1, grpvar, and sortby2. At
least one of these three arguments must have a value or
errors will result from the sort. The statistics (n, mean,
median, standard deviation, minimum, and maximum) are
then calculated using proc summary, and, if grpvar is
present, the group total N is formatted for display.
STEP 4
This step runs proc report, which produces the table. The
column, define, and break statements are customized
depending on the presence of the sortby1, sortby2, and
grpvar arguments. The linesize limitation requires that the
width of each column be determined by the number of
columns requested.
STEP 5
All temporary files are deleted by the proc datasets
procedure at the end of the execution of the SAS macro
dstat.sas.
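STEP 3 notes that the sort fails when none of sortby1, sortby2, and grpvar has a value. A caller could catch that before anything runs. The following is only a minimal sketch under that assumption: dstat_check is a hypothetical helper that is not part of dstat.sas, and the dataset and variable names in the call are invented for illustration.

```sas
/* Hypothetical helper, NOT part of dstat.sas: returns 1 when the
   required arguments are present and 0 otherwise, writing a
   message to the log for each problem found.                     */
%macro dstat_check(dsin=, value=, sortby1=, sortby2=, grpvar=);
  %local ok;
  %let ok = 1;
  /* dsin and value are always required */
  %if %length(&dsin) = 0 or %length(&value) = 0 %then
    %do;
      %put ERROR: dsin and value are required.;
      %let ok = 0;
    %end;
  /* at least one of the three ordering arguments is required */
  %if %length(&sortby1&sortby2&grpvar) = 0 %then
    %do;
      %put ERROR: supply at least one of sortby1, sortby2, grpvar.;
      %let ok = 0;
    %end;
  &ok
%mend dstat_check;

/* No sortby1, sortby2, or grpvar is supplied here, so the check
   reports the problem instead of letting proc sort fail later.   */
%let ok = %dstat_check(dsin=labs, value=aval);
%put ok=&ok;
```

Using %length rather than a comparison against an empty string is a design choice of this sketch; the macro itself tests arguments with the ^= operator.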
SAS CODE FOR DESCRIPTIVE STATISTICAL
SUMMARY TABLES
The entire macro code is as follows:
%macro dstat(dsin=, sortby1=, sortby1f=,
             noprintv=, grpvar=, grpf=, sortby2=,
             sortby2f=, value=, chgbase=);
******************************************;
* Get total (N) for each grpvar group.
* The input dataset must contain pid.
******************************************;
%if &grpvar ^= %then
  %do;
    proc sql;
      create table totgrp as
        select &grpvar, count(distinct pid) as totn
        from &dsin
        group by &grpvar;
    quit;
  %end;
******************************************;
* Sort the file.
******************************************;
proc sort data=&dsin;
  by &sortby1 &grpvar &sortby2;
run;
******************************************;
* Calculate statistics.
******************************************;
proc summary data=&dsin;
  by &sortby1 &grpvar &sortby2;
  var &value;
  %if &noprintv ^= %then
    %do;
      id &noprintv;
    %end;
  output out=sumtot n=n mean=mean std=sd
         median=median min=min max=max;
run;
******************************************;
* Customize variables for proc report. If
* grpvar is not valued use proc datasets
* to rename sumtot to rpttot.
******************************************;
%if &grpvar ^= %then
  %do;
    proc sql;
      create table rpttot as
        select s.*,
               t.totn,
               put(t.totn, 4.) as totnc
        from sumtot s, totgrp t
        where s.&grpvar = t.&grpvar;
    quit;
  %end;
%else
  %do;
    proc datasets library=work nolist;
      change sumtot=rpttot;
    run;
    quit;
  %end;
******************************************;
* Sort file for report
******************************************;
proc sort data=rpttot;
  by &sortby1 &grpvar &noprintv &sortby2;
run;
******************************************;
* Produce the table. Each macro %if below
* consumes the first semicolon after its
* text so a separate semicolon closes the
* generated SAS statement.
******************************************;
proc report data=rpttot headline headskip
            split='^';
  column &sortby1 &grpvar
    %if &grpvar ^= %then totnc;
    &noprintv &sortby2
    %if %upcase(&chgbase) = Y %then
      ("Change from Baseline" "--" n
       mean sd median min max);
    %else n mean sd median min max;
  ;
  %if &noprintv ^= %then
    %do;
      define &noprintv / order noprint;
    %end;
  * sortby1 is narrower when sortby2 is also shown;
  %if &sortby1 ^= %then
    %do;
      define &sortby1 / order
        %if &sortby2 ^= %then width=11;
        %else width=20;
        left flow
        %if &sortby1f ^= %then f=&sortby1f;
      ;
    %end;
  * grpvar has the same definition in every layout;
  %if &grpvar ^= %then
    %do;
      define &grpvar / order width=9 left flow
        %if &grpf ^= %then f=&grpf;
      ;
    %end;
  * sortby2 is narrower when sortby1 is also shown;
  %if &sortby2 ^= %then
    %do;
      define &sortby2 / order
        %if &sortby1 ^= %then width=17;
        %else width=20;
        center flow
        %if &sortby2f ^= %then f=&sortby2f;
      ;
    %end;
  %if &grpvar ^= %then
    %do;
      define totnc / order width=4 center 'N';
    %end;
  define n      / display width=4 spacing=1
                  'n' f=8.;
  define mean   / display width=6 spacing=1
                  'MEAN' f=8.2;
  define sd     / display width=6 spacing=1
                  'SD' f=8.2;
  define median / display width=6
                  spacing=2 'MEDIAN' f=8.1;
  define min    / display width=6 spacing=1
                  'MIN' f=8.1;
  define max    / display width=6 spacing=1
                  'MAX' f=8.1;
  %if &grpvar ^= %then
    %do;
      break after &grpvar / skip;
    %end;
  %else %if &sortby1 ^= %then
    %do;
      break after &sortby1 / skip;
    %end;
  %else %if &sortby2 ^= %then
    %do;
      break after &sortby2 / skip;
    %end;
run;
******************************************;
* Delete temporary datasets. This also
* deletes the prepared input dataset.
******************************************;
proc datasets nolist;
  delete &dsin sumtot totgrp rpttot;
run;
quit;
%mend dstat;
SAS MACRO INVOCATION AND APPLICATION
Before invoking the macro, use a DATA step to prepare
the SAS dataset to be summarized in the report. Several
examples of macro invocations follow.
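Before turning to the paper's examples, the preparation step and two calls might look like the sketch below. Everything other than the macro name and its argument names is an assumption made for illustration: the raw datasets rawlab and rawecg, the variables param, visit, and aval, the ordering variable viso, and the format visf are hypothetical, and the visit labels are invented.

```sas
/* Hypothetical format for a numeric visit code (labels are assumed). */
proc format;
  value visf 1='Screening' 2='Week 4' 3='Week 12';
run;

/* Prepare the dataset to be summarized. rawlab is a hypothetical
   source dataset. pid is kept because the macro counts distinct
   pid values whenever grpvar is supplied.                          */
data labs;
  set rawlab;
  where aval is not missing;
  keep pid param aval;
run;

/* Simplest call: one sortby variable and the analysis variable. */
%dstat(dsin=labs, sortby1=param, value=aval);

/* Second call: visit is displayed through a format, so an
   unformatted copy (viso) is passed as noprintv to control the
   row order without printing.                                      */
data ecg;
  set rawecg;
  where aval is not missing;
  viso = visit;
  keep pid param visit viso aval;
run;

%dstat(dsin=ecg, sortby1=param, sortby2=visit, sortby2f=visf.,
       noprintv=viso, value=aval);
```

Because the macro deletes the dataset named in dsin when it finishes, each call here is preceded by its own DATA step rather than reusing a single prepared dataset.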
Example 1:
Example 1 demonstrates the simplest use of the macro,
using only one sortby variable. This example can be
useful in searching for outliers.
Figure 1 Example using only the sortby1 macro
argument.
Example 2:
Example 2 displays the summary statistics for each
electrocardiogram (ECG) parameter, grouped by
treatment group.
Figure 2 Example using sortby1 and grpvar macro
arguments.
Example 3:
Example 3 shows the summary statistics for each ECG
parameter, grouped by visit. Viso was created as a
noprintv variable to properly order the Visit Id column
since a format was used for the sortby2 argument.
Figure 3 Example using sortby1 and sortby2 macro
arguments.
Example 4:
Example 4 displays summary statistics for each ECG
parameter, grouped by visit within treatment group. Trxvis
was created as a noprintv variable to properly order the
Visit Id column since a format was used for the sortby2
argument. The output example is shown in Figure 4.
Figure 4 Example using sortby1, sortby2, and grpvar
macro arguments.
Example 5:
Example 5 demonstrates the flexibility of the macro by
substituting centre for treatment group as the groupby
variable. Ctrvis was created as a noprintv variable to
properly order the Visit Id column since a format was used
for the sortby2 argument. The output example is shown in
Figure 5.
Figure 5 Example using sortby1, sortby2, and grpvar
macro arguments.
Example 6:
Example 6 displays the summary statistics for change
from baseline in hemoglobin by visit. The example uses
the chgbase argument to add the "Change from Baseline"
header over the column headings for the summary
statistics.
Figure 6 Example using chgbase macro argument to
add a title across the statistical columns.
CONCLUSION
This paper describes a SAS macro for generating
statistical summary reports that is both flexible and easy
to use. Various examples of how to invoke the macro are
provided along with corresponding output.
ACKNOWLEDGMENTS
The author would like to thank Shi-Tao Yeh, Biostatistics
& Data Management, GlaxoSmithKline, for his help,
suggestions, and review.
CONTACT INFORMATION
Your comments and questions are
encouraged. Contact the author at:
Ellen Forman
GlaxoSmithKline
Work Phone: (610) 917-6180
Email: [email protected]