Coder's Corner
THREE METHODS FOR CREATING ONE OBSERVATION FROM MULTIPLE
OBSERVATIONS
Adeline Wilcox, VA Cooperative Studies Program, Perry Point, MD
SAS® Language and Procedures: Usage 2 shows on page 173
how to create one observation from n multiple observations.
Suppose we wish to report, on a single line, all drugs
prescribed for each hypertension patient at each clinic visit.
Our example data set is shown in Table 1. Notice that n
varies, that is, the number of records per visit varies from
one to three.
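To try the programs that follow, the data set in Table 1 can be entered with a DATA step such as the sketch below. The data set name EXAMPLE and the variable names come from the programs in this paper. The variable lengths and the use of list input are assumptions, since the paper does not show how EXAMPLE was created.

```sas
/* Sketch only: builds EXAMPLE from the values printed in Table 1.     */
/* The lengths ($ 9, $ 19) are assumptions chosen to fit those values. */
DATA EXAMPLE;
   LENGTH HOSPITAL $ 9 DRUG $ 19;
   INPUT HOSPITAL $ PATIENT VISIT DRUG $;
   DATALINES;
Bronx 1342 4 Captopril
Bronx 1342 5 Nifedipine
Bronx 1342 6 Nifedipine
Bronx 2762 5 Atenolol
Bronx 2762 6 Atenolol
Cleveland 615 6 Captopril
Cleveland 615 6 Hydrochlorothiazide
Cleveland 615 6 Nifedipine
Cleveland 1287 3 Metoprolol
Cleveland 1287 4 Metoprolol
Cleveland 1287 4 Nifedipine
Cleveland 5075 1 Lisinopril
Cleveland 5075 2 Lisinopril
Cleveland 5075 2 Furosemide
;
RUN;
```

The rows are already in HOSPITAL, PATIENT, VISIT order, which the BY statements in this paper require. PATIENT and VISIT are read as numeric variables so that patient 615 sorts before patient 1287.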
We can produce the report using a modification of the code
in the Usage 2 manual. This is shown below. Note the
addition of the BY statement following the SET statement
and a statement to stop execution of the DO loop when the
last record for each visit has been reached. The report is
shown in Table 2.
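DATA USAGE2;
   * $ 19 HOLDS THE LONGEST DRUG NAME, NEW CHARACTER ELEMENTS DEFAULT TO LENGTH 8;
   ARRAY COMBINE{3} $ 19 COL1-COL3;
   DO I=1 TO 3;
      SET EXAMPLE; BY HOSPITAL PATIENT VISIT;
      COMBINE{I}=DRUG;
      * LEAVE THE LOOP AFTER THE LAST RECORD FOR THE VISIT;
      IF LAST.VISIT THEN I=3;
   END;
RUN;
* A LABEL MADE UP ONLY OF THE SPLIT CHARACTER PRINTS A BLANK HEADING;
PROC PRINT NOOBS SPLIT='*';
   LABEL COL1='*' COL2='Blood Pressure Medications'
         COL3='*';
   VAR HOSPITAL PATIENT VISIT COL1-COL3;
RUN;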
This report could have been produced with verbose code
similar to that on page 66 of SAS Programming Tips: A
Guide to Efficient SAS Processing and the PROC PRINT
shown above. That code is shown in the manual only for
comparison with PROC TRANSPOSE.
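DATA VERBOSE(DROP=COUNT DRUG I);
   * $ 19 HOLDS THE LONGEST DRUG NAME, NEW CHARACTER ELEMENTS DEFAULT TO LENGTH 8;
   ARRAY PILLS{3} $ 19 COL1-COL3;
   RETAIN COUNT 1 PILLS;
   SET EXAMPLE; BY HOSPITAL PATIENT VISIT;
   * BLANK OUT THE RETAINED DRUGS AT THE START OF EACH VISIT;
   IF FIRST.VISIT THEN DO I=1 TO 3;
      PILLS{I}=' ';
   END;
   PILLS{COUNT}=DRUG;
   COUNT+1;
   * WRITE ONE OBSERVATION PER VISIT AND RESET THE COUNTER;
   IF LAST.VISIT THEN DO;
      COUNT=1;
      OUTPUT;
   END;
RUN;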
When we transpose a SAS data set the first row of the data
set becomes the first column and the first column becomes
the first row. Transposition of a data set with three rows
and three columns is shown here.
   a b c        a d g
   d e f   ->   b e h
   g h i        c f i
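The same three-by-three transposition can be reproduced in SAS. In this sketch the data set names SQUARE and SQUARE_T and the variable names X1-X3 are made up for illustration and do not appear in the original programs.

```sas
/* Sketch only: SQUARE, SQUARE_T and X1-X3 are hypothetical names. */
DATA SQUARE;
   INPUT X1 $ X2 $ X3 $;
   DATALINES;
a b c
d e f
g h i
;
RUN;

PROC TRANSPOSE DATA=SQUARE OUT=SQUARE_T;
   VAR X1-X3;   /* character variables are transposed only if listed */
RUN;

PROC PRINT DATA=SQUARE_T NOOBS;
RUN;
```

In the output data set the variable _NAME_ holds the original variable names X1 to X3, and COL1 to COL3 hold the transposed values, so the row for X1 reads a d g.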
In our example data set each row holds a single value of
the variable DRUG, and the number of drugs prescribed at
each visit varies from one to three. Here we are transposing
BY HOSPITAL PATIENT VISIT. Note that the last variable in
the BY list determines the length of the columns to be
transposed, because each visit is transposed separately.
Transposing three rows in a single column produces one row
with three columns.
   a
   d   ->   a d g
   g
Transposing the data from the Bronx amounts to transposing
five data sets each consisting of a single entry. The first
patient from Cleveland was prescribed three drugs at his
only visit so transposing his data results in one row with
three drugs. The second patient from Cleveland was seen at
two visits and had a second drug added to his regimen at his
second visit. So transposing his data gives us two rows with
one drug in the first and two drugs in the second.
PROC TRANSPOSE is recommended for creating one
observation from several observations. A PROC
CONTENTS step and a DATA _NULL_ step have been used to
print the variable names in the data set output by PROC
TRANSPOSE. When the transposed data set is output the
first drug at each visit is named COL1, the second drug
COL2 and the third drug COL3.
PROC TRANSPOSE DATA=EXAMPLE OUT=TRAN1;
   VAR DRUG;
   BY HOSPITAL PATIENT VISIT;
RUN;
NESUG '92 Proceedings
PROC CONTENTS DATA=TRAN1 OUT=CONT1 NOPRINT SHORT;
RUN;
DATA _NULL_; SET CONT1(KEEP=NAME);
   *MAKE SURE COLN=COL3 OR ADD COL4 TO THE VAR STATEMENT IN PROC PRINT;
   PUT NAME @;
RUN;
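The comment in the DATA _NULL_ step above asks the reader to confirm that COL3 is the last COLn variable. As a complementary check, a sketch like the following counts the records at each visit directly from EXAMPLE and writes the largest count to the log. It is not part of the original programs, and the names N, MAXN and EOF are assumptions.

```sas
/* Sketch only: N, MAXN and EOF are hypothetical names.           */
/* EXAMPLE must be sorted BY HOSPITAL PATIENT VISIT.              */
DATA _NULL_;
   SET EXAMPLE END=EOF;
   BY HOSPITAL PATIENT VISIT;
   RETAIN MAXN 0;
   IF FIRST.VISIT THEN N=0;   /* restart the count at each visit */
   N+1;                       /* sum statement, so N is retained */
   IF LAST.VISIT AND N>MAXN THEN MAXN=N;
   IF EOF THEN PUT 'LARGEST NUMBER OF DRUGS AT ONE VISIT: ' MAXN;
RUN;
```

For the data in Table 1 the largest count is 3. A larger value would mean that the array dimension and the COL1-COL3 lists in the programs above need to be widened.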
If there are four drugs reported at a visit the fourth will not
appear on the output when PROC TRANSPOSE is used with
the PROC PRINT shown above, because COL4 is not listed in
the VAR statement. When the Usage 2 method is employed
the fourth drug is printed on the line below the first three.
When the verbose method is used a fourth drug will cause an
error because the array subscript will be out of range.

PROC TRANSPOSE used 1.43 seconds of CPU time, the
USAGE 2 code required 2.47 seconds and the verbose
method 2.15 seconds when run under VMS on a VAX
11/780.

Additional Reading:
Strang, Gilbert. Linear Algebra and Its Applications.
Harcourt Brace Jovanovich, San Diego, 1977, p. 47.

Table 1

OBS  HOSPITAL   PATIENT  VISIT  DRUG
  1  Bronx         1342      4  Captopril
  2  Bronx         1342      5  Nifedipine
  3  Bronx         1342      6  Nifedipine
  4  Bronx         2762      5  Atenolol
  5  Bronx         2762      6  Atenolol
  6  Cleveland      615      6  Captopril
  7  Cleveland      615      6  Hydrochlorothiazide
  8  Cleveland      615      6  Nifedipine
  9  Cleveland     1287      3  Metoprolol
 10  Cleveland     1287      4  Metoprolol
 11  Cleveland     1287      4  Nifedipine
 12  Cleveland     5075      1  Lisinopril
 13  Cleveland     5075      2  Lisinopril
 14  Cleveland     5075      2  Furosemide

Table 2

HOSPITAL   PATIENT  VISIT  Blood Pressure Medications
Bronx         1342      4  Captopril
Bronx         1342      5  Nifedipine
Bronx         1342      6  Nifedipine
Bronx         2762      5  Atenolol
Bronx         2762      6  Atenolol
Cleveland      615      6  Captopril   Hydrochlorothiazide  Nifedipine
Cleveland     1287      3  Metoprolol
Cleveland     1287      4  Metoprolol  Nifedipine
Cleveland     5075      1  Lisinopril
Cleveland     5075      2  Lisinopril  Furosemide