Getting the MAX Out of PROC MEANS
R. Scott Leslie, MedImpact Healthcare Systems, Inc., San Diego, CA
ABSTRACT
The MEANS procedure is most commonly used to generate and view descriptives of a data set. The procedure can
also output these descriptive statistics to assist in later data manipulation steps. This paper shows how the OUTPUT
statement and MAX option can create a subject level data set from a data set with multiple observations per subject.
Additional steps include: introducing a macro to automate this code, using PROC FORMAT to avoid tedious hard
coding, and using PROC TABULATE to display the information in a precise table. In this example, the code uses a
pharmacy claims data set to calculate the percentage of members utilizing 64 different therapeutic classes of
medications.
INTRODUCTION
In addition to its most commonly used descriptive functions, the MEANS procedure has many underutilized functions.
One of these is the ability to output descriptive statistics to a data set. A few quick steps using the
TABULATE and FORMAT procedures can then report these statistics and other values from the data set concisely.
This paper offers code and handy tricks to convert a data set that has multiple observations per subject into a
data set with multiple dummy variables but a single observation per subject.
As one finds very quickly with SAS® or many other programming languages, there are many ways to get to where you
want to go. This paper offers a few routes to accomplish a task.
PROC MEANS TO THE MAX
This example uses a pharmacy claims data set that has multiple observations per patient. Each observation is a
prescription with a member identifier and a general therapeutic drug class code (GTC) indicating the therapeutic class
of the drug prescribed. The steps below create dummy variables indicating the presence or absence of a claim for
each of the 64 drug classes. A cut of the data set shows 8 claims for a patient using 3 different GTCs.
Obs   MEMBER_ID   DRUG               GTC
262   251988065   FOLIC ACID         0143
263   251988065   ZETIA              0109
264   251988065   ACTOS              0130
265   251988065   METFORMIN HCL ER   0130
266   251988065   TRICOR             0109
267   251988065   METFORMIN HCL ER   0130
268   251988065   TRICOR             0109
269   251988065   FOLIC ACID         0143
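To follow along without the original claims file, the eight claims shown above can be typed into a small test data set. This is only a sketch: the variable names d_member_hq_id and pub_gtc_cd are taken from the code in the steps below, the name drug and the variable lengths are assumptions, and the data set is called sample so that the later steps run against it unchanged.

```sas
/* Toy claims data: the eight observations from the listing above.   */
/* The variable name drug and the lengths are assumptions.           */
data sample;
    infile datalines dsd;
    length d_member_hq_id 8 drug $20 pub_gtc_cd $4;
    input d_member_hq_id drug $ pub_gtc_cd $;
    datalines;
251988065,FOLIC ACID,0143
251988065,ZETIA,0109
251988065,ACTOS,0130
251988065,METFORMIN HCL ER,0130
251988065,TRICOR,0109
251988065,METFORMIN HCL ER,0130
251988065,TRICOR,0109
251988065,FOLIC ACID,0143
;
run;
```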
STEP 1
The first step is to make and populate indicator variables for each of the 64 GTCs. An array creates the 64 indicator
variables and sets the value of each to 0. IF-THEN/ELSE statements change the value of each indicator to 1 where
the condition applies to the observation.
data sample; set sample (keep= d_member_hq_id pub_gtc_cd);
array gtc_codes (64) _0100 - _0163;
do i = 1 to 64 ; gtc_codes(i)=0;end;
if pub_gtc_cd in ('0100') then _0100=1;
else if pub_gtc_cd in ('0101') then _0101=1;
else if pub_gtc_cd in ('0102') then _0102=1;
else if pub_gtc_cd in ('0103') then _0103=1;
else if pub_gtc_cd in ('0104') then _0104=1;
else if pub_gtc_cd in ('0105') then _0105=1;
else if pub_gtc_cd in ('0106') then _0106=1;
else if pub_gtc_cd in ('0107') then _0107=1;
else if pub_gtc_cd in ('0108') then _0108=1;
else if pub_gtc_cd in ('0109') then _0109=1;
else if pub_gtc_cd in ('0110') then _0110=1;
else if pub_gtc_cd in ('0111') then _0111=1;
else if pub_gtc_cd in ('0112') then _0112=1;
else if pub_gtc_cd in ('0113') then _0113=1;
else if pub_gtc_cd in ('0114') then _0114=1;
else if pub_gtc_cd in ('0115') then _0115=1;
else if pub_gtc_cd in ('0116') then _0116=1;
else if pub_gtc_cd in ('0117') then _0117=1;
else if pub_gtc_cd in ('0118') then _0118=1;
else if pub_gtc_cd in ('0119') then _0119=1;
else if pub_gtc_cd in ('0120') then _0120=1;
else if pub_gtc_cd in ('0121') then _0121=1;
else if pub_gtc_cd in ('0122') then _0122=1;
else if pub_gtc_cd in ('0123') then _0123=1;
else if pub_gtc_cd in ('0124') then _0124=1;
else if pub_gtc_cd in ('0125') then _0125=1;
else if pub_gtc_cd in ('0126') then _0126=1;
else if pub_gtc_cd in ('0127') then _0127=1;
else if pub_gtc_cd in ('0128') then _0128=1;
else if pub_gtc_cd in ('0129') then _0129=1;
else if pub_gtc_cd in ('0130') then _0130=1;
else if pub_gtc_cd in ('0131') then _0131=1;
else if pub_gtc_cd in ('0132') then _0132=1;
else if pub_gtc_cd in ('0133') then _0133=1;
else if pub_gtc_cd in ('0134') then _0134=1;
else if pub_gtc_cd in ('0135') then _0135=1;
else if pub_gtc_cd in ('0136') then _0136=1;
else if pub_gtc_cd in ('0137') then _0137=1;
else if pub_gtc_cd in ('0138') then _0138=1;
else if pub_gtc_cd in ('0139') then _0139=1;
else if pub_gtc_cd in ('0140') then _0140=1;
else if pub_gtc_cd in ('0141') then _0141=1;
else if pub_gtc_cd in ('0142') then _0142=1;
else if pub_gtc_cd in ('0143') then _0143=1;
else if pub_gtc_cd in ('0144') then _0144=1;
else if pub_gtc_cd in ('0145') then _0145=1;
else if pub_gtc_cd in ('0146') then _0146=1;
else if pub_gtc_cd in ('0147') then _0147=1;
else if pub_gtc_cd in ('0148') then _0148=1;
else if pub_gtc_cd in ('0149') then _0149=1;
else if pub_gtc_cd in ('0150') then _0150=1;
else if pub_gtc_cd in ('0151') then _0151=1;
else if pub_gtc_cd in ('0152') then _0152=1;
else if pub_gtc_cd in ('0153') then _0153=1;
else if pub_gtc_cd in ('0154') then _0154=1;
else if pub_gtc_cd in ('0155') then _0155=1;
else if pub_gtc_cd in ('0156') then _0156=1;
else if pub_gtc_cd in ('0157') then _0157=1;
else if pub_gtc_cd in ('0158') then _0158=1;
else if pub_gtc_cd in ('0159') then _0159=1;
else if pub_gtc_cd in ('0160') then _0160=1;
else if pub_gtc_cd in ('0161') then _0161=1;
else if pub_gtc_cd in ('0162') then _0162=1;
else if pub_gtc_cd in ('0163') then _0163=1;
else _na=1;
run;
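Because the 64 codes run consecutively from 0100 to 0163, the IF-THEN/ELSE chain can probably be replaced by an array index computed from the code itself. The following is a sketch rather than the paper's code. It assumes pub_gtc_cd is a character variable holding four-digit codes, and the names sample_flags and idx are made up for the illustration.

```sas
/* Sketch: set the indicator by array position instead of 64 IF statements. */
/* Assumes pub_gtc_cd is character and the codes are '0100' through '0163'. */
data sample_flags;
    set sample (keep=d_member_hq_id pub_gtc_cd);
    array gtc_codes (64) _0100 - _0163;
    do i = 1 to 64;
        gtc_codes(i) = 0;                    /* start every indicator at 0     */
    end;
    idx = input(pub_gtc_cd, ?? 4.) - 99;     /* '0100' -> 1, ..., '0163' -> 64 */
    if 1 <= idx <= 64 then gtc_codes(idx) = 1;
    else _na = 1;                            /* code outside the 64 classes    */
    drop i idx;
run;
```

The ?? modifier suppresses error messages when a code is not numeric; such observations fall through to the _na flag, as do codes outside the range.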
STEP 2
The next step uses the OUTPUT statement of the MEANS procedure with the MAX= statistic keyword to make a patient
level data set from the claim level data set. Sorting by subject allows the MEANS procedure to process the data in
BY groups and take the maximum of each of the 64 indicator variables for each patient. The NOPRINT option suppresses
unnecessary printed output. The automatic variable _FREQ_ gives the number of observations for each value of the
BY variable. In this case, that is the number of pharmacy claims for each patient, so renaming _FREQ_ to
“NumberClaims” gives the number of claims for each patient. The MAX= keyword in the OUTPUT statement stores the
maximum value of each variable for each patient. The previous step set each indicator value to either 0 or 1.
Therefore any patient with at least one claim for a specific GTC has a value of 1 for that GTC.
proc sort data=sample;by d_member_hq_id;run;
proc means data=sample noprint;
var _0100 - _0163;
by d_member_hq_id;
output out=outdata (rename=(_freq_=NumberClaims) drop=_type_)
max= ;
run;
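If sorting a large claims file is a concern, a CLASS statement with the NWAY option is one possible alternative to the PROC SORT and BY combination. This is a sketch and not part of the original code; the output data set name outdata_class is an assumption chosen to avoid overwriting outdata.

```sas
/* Sketch: same patient level result without a prior PROC SORT.          */
/* NWAY keeps only the rows for individual patients and drops the        */
/* overall summary row that a CLASS statement would otherwise add.       */
proc means data=sample noprint nway;
    class d_member_hq_id;
    var _0100 - _0163;
    output out=outdata_class (rename=(_freq_=NumberClaims) drop=_type_)
           max= ;
run;
```

With MAX= and no variable names after it, the output variables keep the names of the analysis variables, just as in the BY-group version.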
The above code creates a patient level data set with 64 indicator variables, one for each GTC.

Obs   MEMBER_ID   NumberClaims
12    262         8

Obs   0100  0101  0102  0103  0104  0105  0106  0107  0108  0109
12    0     0     0     0     0     0     0     0     0     1

Obs   0110 - 0129   0130   0131 - 0142   0143   0144 - 0163
12    0             1      0             1      0
STEP 3
Now that we have the patient level data set with the 64 variables, labeling the variables gives a description for
each of them. For example, variable “_0100” is labeled as the indicator variable 'INFECTIOUS DISEASE - BACTERIAL'.
data outdata; set outdata;
label
_0100='INFECTIOUS DISEASE - BACTERIAL'
_0101='INFECTIOUS DISEASE - FUNGAL'
_0102='INFECTIOUS DISEASE - VIRAL'
_0103='INFECTIOUS DISEASE - PARASITIC'
_0104='INFECTIOUS DISEASE - MISCELLANEOUS'
_0105='NEOPLASTIC DISEASE'
_0106='CARDIOVASCULAR DISEASE - HYPERTENSION'
_0107='CARDIOVASCULAR DISEASE - ARRHYTHMIA'
_0108='CARDIOVASCULAR DISEASE - VASODILATION'
_0109='CARDIOVASCULAR DISEASE - LIPID IRREGULARITY'
_0110='CARDIOVASCULAR DISEASE - CARDIAC STIMULANT'
_0111='CARDIOVASCULAR DISEASE - MISCELLANEOUS AGENTS'
_0112='PAIN MANAGEMENT - ANALGESICS'
_0113='BEHAVIORAL HEALTH - ANTIDEPRESSANTS'
_0114='SEIZURE DISORDER'
_0115='PARKINSONS DISEASE'
_0116='ANTIEMESIS/ANTIVERTIGO'
_0117='DERMATOLOGY - ANTIINFLAMMATORY'
_0118='DERMATOLOGY - ANTIINFECTIVE'
_0119='DERMATOLOGY - ANTIPRURITIC DRUGS'
_0120='DERMATOLOGY - ACNE'
_0121='DERMATOLOGY - PSORIASIS/ECZEMA'
_0122='DERMATOLOGY - PIGMENTATION DISORDERS'
_0123='DERMATOLOGY - MISCELLANEOUS'
_0124='VAGINAL DISORDERS'
_0125='EAR - GENERAL DISORDERS'
_0126='EYE - GENERAL DISORDERS'
_0127='EYE - GLAUCOMA'
_0128='EYE - MISCELLANEOUS'
_0129='ORAL/PHARYNGEAL DISORDERS'
_0130='DIABETES'
_0131='INFLAMMATORY DISEASE'
_0132='ENDOCRINE DISORDER - THYROID'
_0133='ENDOCRINE DISORDER - FERTILITY'
_0134='ENDOCRINE DISORDER - OTHER'
_0135='UPPER GASTROINTESTINAL DISORDERS - ULCER DISEASE'
_0136='UPPER GASTROINTESTINAL DISORDERS - SPASTIC DISEASE'
_0137='UPPER GASTROINTESTINAL DISORDERS - DIGESTIVE'
_0138='LOWER GASTROINTESTINAL DISORDERS - BOWEL INFLAMMAT'
_0139='LOWER GASTROINTESTINAL DISORDERS - OTHER'
_0140='IMMUNIZATION'
_0141='GOUT AND RELATED DISEASES'
_0142='SKELETAL MUSCLE DISORDER'
_0143='VITAMIN AND/OR MINERAL DEFICIENCY'
_0144='ELECTROLYTE REGULATION'
_0145='HEMATOLOGICAL DISORDERS'
_0146='FLUID REPLACEMENT'
_0147='HORMONAL DEFICIENCY'
_0148='CONTRACEPTION/OXYTOCICS'
_0149='ASTHMA'
_0150='OTHER RESPIRATORY DISORDERS'
_0151='ALLERGY'
_0152='COUGH AND COLD'
_0153='URINARY TRACT - FUNCTIONAL DISORDERS'
_0154='LOCAL ANESTHESIA'
_0155='AUTONOMIC NERVOUS SYSTEM DISORDERS'
_0156='IMMUNOSUPPRESSION/MODULATION'
_0157='MEDICAL SUPPLIES'
_0158='MISCELLANEOUS AGENTS'
_0159='OTHER DRUGS'
_0160='NEUROLOGICAL DISEASE - MISCELLANEOUS'
_0161='SMOKING CESSATION'
_0162='WEIGHT REDUCTION'
_0163='BEHAVIORAL HEALTH - OTHER';
run;
DATA PRESENTATION: OUTPUT DELIVERY SYSTEM AND PROC TABULATE
Presenting the data in a concise table helps convey the data easily to readers. PROC FREQ output would be
cumbersome and difficult to read for those unfamiliar with SAS®. A simple TABULATE procedure step can present the
data in a table. In an additional step, the Output Delivery System (ODS) can output the table to a rich text format,
PDF or HTML file.
With this patient level data set, taking the mean of each indicator variable gives the proportion of patients using
each GTC. A TABULATE procedure step can do this easily, but the mean would be expressed as a proportion
between 0 and 1 (e.g. a mean of 0.379 rather than 37.9%). Formatting the mean statistic can display the mean as a
percent and add the “%” symbol. The FORMAT procedure step below uses the PICTURE statement to create a template
for printing numbers. The format “pctpic” applies to all values (low-high). The digits 0 and 9 in the picture are
place holders for the digits of the value. The MULT= option multiplies each value by 1000 before its digits are
placed in the picture, so a proportion of 0.379 becomes 379 and prints as 37.9%. The result is a clean percent of
patients for each GTC.
proc format;picture pctpic low-high='009.0%' (mult=1000);run;
proc tabulate data=outdata;
class cohort;
var _0100 - _0163;
table
(_0100 _0101 _0102 _0103 _0104 _0105 _0106 _0107 _0108 _0109
_0110 _0111 _0112 _0113 _0114 _0115 _0116 _0117 _0118 _0119
_0120 _0121 _0122 _0123 _0124 _0125 _0126 _0127 _0128 _0129
_0130 _0131 _0132 _0133 _0134 _0135 _0136 _0137 _0138 _0139
_0140 _0141 _0142 _0143 _0144 _0145 _0146 _0147 _0148 _0149
_0150 _0151 _0152 _0153 _0154 _0155 _0156 _0157 _0158 _0159
_0160 _0161 _0162 _0163 )*(mean='')*f=pctpic.
, all='%'/box='PERCENT OF PATIENTS BY THERAPEUTIC CLASS';
run;
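The picture format can be tried on a few values before it is used in a table. The sketch below is an illustration only: the format name pctdemo and the test values are made up, and the ROUND option is an addition that the paper's format does not use. Picture formats truncate rather than round by default, so ROUND may be worth considering when the last printed digit matters.

```sas
/* Sketch: test a percent picture on a handful of proportions.     */
/* pctdemo and the values in the DO list are illustrative only.    */
proc format;
    picture pctdemo (round) low-high = '009.0%' (mult=1000);
run;

data _null_;
    do p = 0.379, 0.035, 0.002, 1;
        put p= p pctdemo.;      /* writes the raw and formatted value to the log */
    end;
run;
```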
Wrap ODS statements around the code and the table is placed in the file of your choice. Notice the custom style
named ‘rtfshrink’. This was generated in a separate TEMPLATE procedure.
ods rtf file = 'c:\sas\wuss07\gtc_table.rtf' style=rtfshrink;
/*insert proc tabulate code from above*/
ods rtf close;
The result is a table with the percent of patients utilizing each of the 64 GTCs for the time period of your data set.
PERCENT OF PATIENTS BY THERAPEUTIC CLASS                 %
INFECTIOUS DISEASE - BACTERIAL                       37.9%
INFECTIOUS DISEASE - FUNGAL                           3.5%
INFECTIOUS DISEASE - VIRAL                            1.4%
INFECTIOUS DISEASE - PARASITIC                        4.7%
INFECTIOUS DISEASE - MISCELLANEOUS                    1.7%
NEOPLASTIC DISEASE                                    0.8%
CARDIOVASCULAR DISEASE - HYPERTENSION                79.1%
CARDIOVASCULAR DISEASE - ARRHYTHMIA                   0.5%
CARDIOVASCULAR DISEASE - VASODILATION                 5.0%
CARDIOVASCULAR DISEASE - LIPID IRREGULARITY          62.0%
CARDIOVASCULAR DISEASE - CARDIAC STIMULANT            3.5%
CARDIOVASCULAR DISEASE - MISCELLANEOUS AGENTS         0.0%
PAIN MANAGEMENT - ANALGESICS                         46.2%
BEHAVIORAL HEALTH - ANTIDEPRESSANTS                  21.7%
SEIZURE DISORDER                                     14.6%
PARKINSONS DISEASE                                    0.8%
ANTIEMESIS/ANTIVERTIGO                                4.7%
DERMATOLOGY - ANTIINFLAMMATORY                        9.2%
DERMATOLOGY - ANTIINFECTIVE                          14.9%
DERMATOLOGY - ANTIPRURITIC DRUGS                      0.0%
DERMATOLOGY - ACNE                                    0.2%
DERMATOLOGY - PSORIASIS/ECZEMA                        0.5%
DERMATOLOGY - PIGMENTATION DISORDERS                  0.0%
DERMATOLOGY - MISCELLANEOUS                           0.8%
VAGINAL DISORDERS                                     5.3%
EAR - GENERAL DISORDERS                               1.1%
EYE - GENERAL DISORDERS                               3.8%
EYE - GLAUCOMA                                        6.2%
EYE - MISCELLANEOUS                                   0.0%
ORAL/PHARYNGEAL DISORDERS                             1.1%
DIABETES                                             98.8%
INFLAMMATORY DISEASE                                 37.6%
ENDOCRINE DISORDER - THYROID                         10.1%
ENDOCRINE DISORDER - FERTILITY                        2.3%
ENDOCRINE DISORDER - OTHER                            3.8%
UPPER GASTROINTESTINAL DISORDERS - ULCER DISEASE     25.9%
UPPER GASTROINTESTINAL DISORDERS - SPASTIC DISEASE    0.5%
UPPER GASTROINTESTINAL DISORDERS - DIGESTIVE          0.0%
LOWER GASTROINTESTINAL DISORDERS - BOWEL INFLAMMAT    0.5%
LOWER GASTROINTESTINAL DISORDERS - OTHER              8.9%
IMMUNIZATION                                          0.0%
GOUT AND RELATED DISEASES                             2.3%
SKELETAL MUSCLE DISORDER                              7.1%
VITAMIN AND/OR MINERAL DEFICIENCY                     9.8%
ELECTROLYTE REGULATION                                8.0%
HEMATOLOGICAL DISORDERS                               8.6%
FLUID REPLACEMENT                                     0.0%
HORMONAL DEFICIENCY                                   2.9%
CONTRACEPTION/OXYTOCICS                               0.8%
ASTHMA                                               13.7%
OTHER RESPIRATORY DISORDERS                           0.0%
ALLERGY                                              17.9%
COUGH AND COLD                                       17.3%
URINARY TRACT - FUNCTIONAL DISORDERS                  7.4%
LOCAL ANESTHESIA                                      0.0%
AUTONOMIC NERVOUS SYSTEM DISORDERS                    0.8%
IMMUNOSUPPRESSION/MODULATION                          0.2%
MEDICAL SUPPLIES                                     10.7%
MISCELLANEOUS AGENTS                                  0.0%
OTHER DRUGS                                           3.2%
NEUROLOGICAL DISEASE - MISCELLANEOUS                  0.0%
SMOKING CESSATION                                     0.2%
WEIGHT REDUCTION                                      0.2%
BEHAVIORAL HEALTH - OTHER                             8.6%
FURTHER STEP: INCORPORATE MACRO
This is a lot of bulky code. Placing the code in a macro automates it, and an %INCLUDE statement keeps the code
out of the program editor. The %INCLUDE statement can pull code into your program from 3 sources: an external
file, internal lines, or keyboard entries. In the first line below, the %INCLUDE statement pulls the code from an
external macro file. This macro contains all the code mentioned above (macro code is available by request). All
that is needed now is a data set that contains a patient identifier and GTC. The 4 macro parameters are:
1. preclms, the input data set which contains the patient identifier and gtc variables
2. d_member_hq_id, the patient identifier
3. pub_gtc_cd, the GTC code
4. gtc_class, the output data set.
%include 'c:\sas\wuss07\therapeutic_class_macro_gtc.sas';
%drug_gtc(preclms, d_member_hq_id, pub_gtc_cd, gtc_class);
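The author's macro is not reproduced in the paper, so the following is only a guess at what a macro with the same four positional parameters might look like. The macro name drug_gtc_sketch, the parameter names, the work data set _flags, and the array-index shortcut are all assumptions; the labeling and reporting steps are left out to keep the sketch short.

```sas
/* Hypothetical sketch, not the author's macro.                          */
/* indata  = input claims data set    idvar   = patient identifier       */
/* gtcvar  = character GTC code       outdata = patient level output     */
%macro drug_gtc_sketch(indata, idvar, gtcvar, outdata);
    /* Step 1: one 0/1 indicator per GTC on every claim */
    data _flags;
        set &indata (keep=&idvar &gtcvar);
        array gtc_codes (64) _0100 - _0163;
        do i = 1 to 64; gtc_codes(i) = 0; end;
        idx = input(&gtcvar, ?? 4.) - 99;   /* assumes codes '0100'-'0163' */
        if 1 <= idx <= 64 then gtc_codes(idx) = 1;
        drop i idx;
    run;

    /* Step 2: collapse to one observation per patient with MAX= */
    proc means data=_flags noprint nway;
        class &idvar;
        var _0100 - _0163;
        output out=&outdata (rename=(_freq_=NumberClaims) drop=_type_)
               max= ;
    run;
%mend drug_gtc_sketch;

%drug_gtc_sketch(preclms, d_member_hq_id, pub_gtc_cd, gtc_class);
```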
ANOTHER WAY (QUICKER)
Another route can obtain similar results. An SQL procedure can calculate the number and percent of patients utilizing
each GTC from the same data set. This would give a GTC level data set or a data set with 64 observations, one for
each GTC. The 3 variables would be GTC, patient count, and percent of patients; hence the data set would list the
number of patients and percent of patients. Notice hard coding of the denominator.
proc sql;
create table gtc_ptid as
select pub_gtc_cd, count(distinct d_member_hq_id) as patient_count,
calculated patient_count / 533 /*insert total member count*/ as percent
from sample
group by pub_gtc_cd;
quit;
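The hard-coded denominator can likely be avoided by letting PROC SQL count the distinct patients itself. This sketch assumes the GTC variable in sample is named pub_gtc_cd, as in Step 1, and that every patient of interest has at least one claim in sample; patients with no claims would have to be counted from a separate membership file.

```sas
/* Sketch: compute the denominator with a scalar subquery instead of     */
/* typing the member count by hand.                                      */
proc sql;
    create table gtc_ptid as
    select pub_gtc_cd,
           count(distinct d_member_hq_id) as patient_count,
           calculated patient_count /
               (select count(distinct d_member_hq_id) from sample) as percent
    from sample
    group by pub_gtc_cd;
quit;
```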
USE OF PROC FORMAT
The FORMAT procedure also has the ability to create a format from a data set. This would avoid the lengthy data
step above where you had to hard code all 64 variable labels. It also reduces errors due to typing.
data gtc;
set ref.gtc_cd_class;
rename pub_gtc_cd=start gtc_desc=label;
fmtname="$gtc";
run;
proc format library=work cntlin=gtc;run;
In the first data step a data set is created with three variables that together hold all 64 GTC codes and the
corresponding descriptions. The variable “start” gives the values to be formatted (the GTC codes) and the variable
“label” contains the formatted values (the descriptions). The fmtname variable names the format.
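The control data set approach can be tried on a small lookup table before pointing it at a full reference file. In this sketch the data set name gtc_lookup, the format name $gtcdemo, and the variable lengths are assumptions; the three codes and descriptions are taken from the label list in Step 3.

```sas
/* Sketch: build a character format from a three-row lookup table. */
data gtc_lookup;
    infile datalines dsd;
    length start $4 label $50 fmtname $8;
    retain fmtname '$gtcdemo';          /* leading $ marks a character format */
    input start $ label $;
    datalines;
0109,CARDIOVASCULAR DISEASE - LIPID IRREGULARITY
0130,DIABETES
0143,VITAMIN AND/OR MINERAL DEFICIENCY
;
run;

proc format library=work cntlin=gtc_lookup;
run;

/* Check the format by applying it to one code and writing to the log */
data _null_;
    code = '0130';
    desc = put(code, $gtcdemo.);
    put code= desc=;
run;
```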
In the next FORMAT procedure, the format is saved in the work library. The CNTLIN= option specifies that the data
set gtc is the source for the format $gtc. Now that the format is defined, all 64 indicator variables can be formatted in
later data and procedure steps. Use a TABULATE procedure step similar to the one above to output the results.
However, since the data set is now GTC level, with one observation for each GTC, the TABULATE procedure is taking
the mean of a single value.
proc tabulate data=gtc_ptid ;
class pub_gtc_cd;
var percent;
table (pub_gtc_cd=''*percent='')*mean=''*f=pctpic.,
all='%'
/box='PERCENT OF PATIENTS BY THERAPEUTIC CLASS';
format pub_gtc_cd $gtc.;
run;
The result is a table similar to the first table.
PERCENT OF PATIENTS BY THERAPEUTIC CLASS                 %
INFECTIOUS DISEASE - BACTERIAL                       37.9%
INFECTIOUS DISEASE - FUNGAL                           3.5%
INFECTIOUS DISEASE - VIRAL                            1.4%
INFECTIOUS DISEASE - PARASITIC                        4.7%
INFECTIOUS DISEASE - MISCELLANEOUS                    1.7%
NEOPLASTIC DISEASE                                    0.8%
CARDIOVASCULAR DISEASE - HYPERTENSION                79.1%
CARDIOVASCULAR DISEASE - ARRHYTHMIA                   0.5%
CARDIOVASCULAR DISEASE - VASODILATION                 5.0%
CARDIOVASCULAR DISEASE - LIPID IRREGULARITY          62.0%
CARDIOVASCULAR DISEASE - CARDIAC STIMULANT            3.5%
PAIN MANAGEMENT - ANALGESICS                         46.2%
BEHAVIORAL HEALTH - ANTIDEPRESSANTS                  21.7%
SEIZURE DISORDER                                     14.6%
PARKINSONS DISEASE                                    0.8%
ANTIEMESIS/ANTIVERTIGO                                4.7%
DERMATOLOGY - ANTIINFLAMMATORY                        9.2%
DERMATOLOGY - ANTIINFECTIVE                          14.9%
DERMATOLOGY - ACNE                                    0.2%
DERMATOLOGY - PSORIASIS/ECZEMA                        0.5%
DERMATOLOGY - MISCELLANEOUS                           0.8%
VAGINAL DISORDERS                                     5.3%
EAR - GENERAL DISORDERS                               1.1%
EYE - GENERAL DISORDERS                               3.8%
EYE - GLAUCOMA                                        6.2%
ORAL/PHARYNGEAL DISORDERS                             1.1%
DIABETES                                             98.8%
INFLAMMATORY DISEASE                                 37.6%
ENDOCRINE DISORDER - THYROID                         10.1%
ENDOCRINE DISORDER - FERTILITY                        2.3%
ENDOCRINE DISORDER - OTHER                            3.8%
UPPER GASTROINTESTINAL DISORDERS - ULCER DISEASE     25.9%
UPPER GASTROINTESTINAL DISORDERS - SPASTIC DISEASE    0.5%
LOWER GASTROINTESTINAL DISORDERS - BOWEL INFLAMMAT    0.5%
LOWER GASTROINTESTINAL DISORDERS - OTHER              8.9%
GOUT AND RELATED DISEASES                             2.3%
SKELETAL MUSCLE DISORDER                              7.1%
VITAMIN AND/OR MINERAL DEFICIENCY                     9.8%
ELECTROLYTE REGULATION                                8.0%
HEMATOLOGICAL DISORDERS                               8.6%
HORMONAL DEFICIENCY                                   2.9%
CONTRACEPTION/OXYTOCICS                               0.8%
ASTHMA                                               13.7%
ALLERGY                                              17.9%
COUGH AND COLD                                       17.3%
URINARY TRACT - FUNCTIONAL DISORDERS                  7.4%
AUTONOMIC NERVOUS SYSTEM DISORDERS                    0.8%
IMMUNOSUPPRESSION/MODULATION                          0.2%
MEDICAL SUPPLIES                                     10.7%
OTHER DRUGS                                           3.2%
SMOKING CESSATION                                     0.2%
WEIGHT REDUCTION                                      0.2%
BEHAVIORAL HEALTH - OTHER                             8.6%
CONCLUSION
This paper shows a few of the many useful options of the MEANS procedure. The FORMAT and TABULATE
procedures are used to create programming efficiencies and report values in a concise, easy to read table. This
code can be used in any scenario where one has multiple observations per subject and needs to make a subject
level data set with multiple indicator variables.
ACKNOWLEDGMENTS
The author would like to thank Eunice Chang for her programming advice.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
R. Scott Leslie
MedImpact Healthcare Systems, Inc.
10680 Treena Street
San Diego, CA 92131
Work Phone: 858-790-6685
Fax: 858-689-1799
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS
Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.