Appendix C: SAS Software
Uses of SAS
CRM
data mining
data warehousing
linear programming
forecasting
econometrics
nonlinear parameter estimation
simulation
marketing models
statistical analysis
Data Types SAS Can Deal with
panel data
relational databases
scanner data
Web log data
questionnaires
Ideal When You Are …
transforming
manipulating
massaging
sorting
merging
lookups
reporting
Mathematical Marketing
Slide C.1
SAS
Two Types of SAS Routines
DATA Steps
• Read and Write Data
• Create a SAS dataset
• Manipulate and Transform Data
• Open-Ended - Procedural Language
• Presence of INPUT statement creates a Loop
PROC Steps
• Analyze Data
• Canned or Preprogrammed Input and Output
Slide C.2
SAS
A Simple Example
data my_study ;
input id gender $ green recycle ;
cards ;
001 m 4 2
002 m 3 1
003 f 3 2
••• ••• ••• •••
;
proc glm data=my_study ; /* PROC REG has no CLASS statement, so PROC GLM is used */
class gender ;
model recycle = green gender ;
Slide C.3
SAS
The Sequence Depends on the Need
data step to read in scanner data;
data step to read in panel data ;
data step to merge scanner and panel records ;
data step to change the level of analysis to the household ;
proc step to create covariance matrix ;
data step to write covariance matrix in LISREL compatible format ;
Slide C.4
SAS
The INPUT Statement - Character Data
List input
$ after a variable - character var
input last_name $ first_name $ initial $ ;
Formatted input
$w. after a variable
input last_name $22. first_name $22. initial $1. ;
Column input
$ start-column - end-column
input last_name $ 1 - 22 first_name $ 23 - 44 initial $ 45 ;
Slide C.5
SAS
The INPUT Statement - Numeric Data
List input
input score_1 score_2 score_3 ;
Formatted input
w.d (field width and number of digits after an implied decimal point) after a variable
input score_1 10. score_2 10. score_3 10. ;
Column input
start-column - end-column
input score_1 1 - 10 score_2 11 - 20 score_3 21 - 30 ;
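To make the column and w.d rules concrete, here is a minimal Python sketch (not SAS) of how a fixed-width numeric field is read. The helper name `read_numeric` and the sample record are assumptions made for illustration. It follows the w.d rule stated above: when the field has no explicit decimal point, the last d digits are taken as decimals.

```python
def read_numeric(field, d=0):
    """Mimic a w.d numeric informat on one already-extracted field.

    A blank field or a lone "." is treated as missing (None). If the field
    holds an explicit decimal point it is used as written; otherwise the
    last d digits are treated as decimals.
    """
    text = field.strip()
    if text in ("", "."):
        return None
    if "." in text:
        return float(text)
    return int(text) / 10 ** d


# Hypothetical 30-column record: three fields, each 10 columns wide.
line = "     12345" + "       678" + "       9.5"

# Column input "score_1 1 - 10" corresponds to the slice [0:10], and so on.
score_1 = read_numeric(line[0:10])
score_2 = read_numeric(line[10:20], d=1)  # like the informat 10.1
score_3 = read_numeric(line[20:30], d=1)  # explicit decimal point is kept
print(score_1, score_2, score_3)
```

The sketch ignores signs in odd positions and scientific notation, which SAS informats also accept.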
Slide C.6
SAS
Grouped INPUT Statements
input (var1-var3) (10. 10. 10.) ;
input (var1-var3) (3*10.) ;
input (var1-var3) (10.) ;
input (name var1-var3) ($10. 3*5.1) ;
Slide C.7
SAS
The Column Pointer in the INPUT Statement
input @3 var1 10. ;
input more @ ;
if more then input @15 x1 x2 ;
input @12 x1 5. +3 x2 ;
Slide C.8
SAS
Documenting INPUT Statements
input
@4  green1 4.   /* greenness scale first item */
@9  green2 4.   /* greenness scale 2nd item */
@20 aware1 5.   /* awareness scale first item */
@20 aware2 5. ; /* awareness scale 2nd item */
Slide C.9
SAS
The Line Pointer
input x1 x2 x3 / x4 x5 x6 ;
input x1 x2 x3 #2 x4 x5 x6 ;
input
    x1 x2 x3
#2  x4 x5 x6 ;
Slide C.10
SAS
The PUT Statement
put x1 x2 x3 @ ;
input x4 ;
put x4 ;
put _all_ ;
put a= b= ;
put x1 #2 x2 ;
put _infile_ ;
put x1 / x2 ;
put _page_ ;
col1 = 22 ; col2 = 14 ;
put @col1 var245 @col2 var246 ;
Slide C.11
SAS
Copying Raw Data
filename in 'c:\old.data' ;
filename out 'c:\new.data' ;
data _null_ ;
infile in ;
file out ;
input ;
put _infile_ ;
Slide C.12
SAS
SAS Constants
'21Dec1981'D
'Charles F. Hofacker'
492992.1223
Slide C.13
SAS
Assignment Statement
x = a + b ;
y = x / 2. ;
prob = 1 - exp(-z**2/2) ;
Slide C.14
SAS
The SAS Array Statement
array y {20} y1-y20 ;
do i = 1 to 20 ;
y{i} = 11 - y{i} ;
end ;
Slide C.15
SAS
The Sum Statement
variable+expression ;
retain variable ;
variable = variable + expression ;
n+1 ;
cumulated + x ;
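The behavior of the sum statement can be imitated outside SAS. The Python sketch below (not SAS) keeps an accumulator alive from one row to the next, the way a retained variable survives across iterations of the DATA step. The function name `running_total` and the use of `None` for a missing value are assumptions for illustration.

```python
def running_total(values):
    """Mimic `cumulated + x ;` over a sequence of rows.

    The total starts at 0, persists from row to row, and a missing x
    (None here) leaves the total unchanged.
    """
    total = 0
    history = []
    for x in values:
        if x is not None:
            total += x
        history.append(total)  # value of `cumulated` after this row
    return history


print(running_total([2, None, 5, 1]))
```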
Slide C.16
SAS
IF Statement
if a >= 45 then a = 45 ;
if 0 < age < 1 then age = 1 ;
if a = 2 or b = 3 then c = 1 ;
if a = 2 and b = 3 then c = 1 ;
if major = "FIN" ;
if major = "FIN" then do ;
a = 1 ;
b = 2 ;
end ;
Slide C.17
SAS
More IF Statement Expressions
if name ne 'smith' then etc ;
if name ~= 'smith' then etc ;
if x eq 1 or x eq 2 then etc ;
if x=1 | x=2 then etc ;
if a <= b | a >= c then etc ;
if a le b or a ge c then etc ;
if a1 and a2 or a3 then etc ;
if (a1 and a2) or a3 then etc ;
Slide C.18
SAS
Concatenating Datasets Sequentially
first:
id  x  y
1   2  3
2   1  2
3   3  1
second:
id  x  y
4   3  2
5   2  1
6   1  1
data both ;
set first second ;
both:
id  x  y
1   2  3
2   1  2
3   3  1
4   3  2
5   2  1
6   1  1
Slide C.19
SAS
Interleaving Two Datasets
proc sort data=store1 ;
by date ;
proc sort data=store2 ;
by date ;
data both ;
set store1 store2 ;
by date ;
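Interleaving differs from plain concatenation in that the combined rows stay in BY order. A minimal Python sketch of the same idea (not SAS) uses `heapq.merge` on two inputs that are each sorted by date. The sample rows and the `sales` field are hypothetical.

```python
import heapq
from operator import itemgetter

# Hypothetical daily records for two stores.
store1 = [{"date": "2024-01-03", "sales": 12}, {"date": "2024-01-01", "sales": 10}]
store2 = [{"date": "2024-01-02", "sales": 7}, {"date": "2024-01-03", "sales": 9}]

by_date = itemgetter("date")

# Sort each input first (the two PROC SORT steps), then interleave by date.
# For equal dates, rows from the first input come before rows from the second.
both = list(
    heapq.merge(sorted(store1, key=by_date), sorted(store2, key=by_date), key=by_date)
)

for row in both:
    print(row["date"], row["sales"])
```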
Slide C.20
SAS
Concatenating Datasets Horizontally
left:
id  y1  y2
1   2   3
2   1   2
3   3   1
right:
id  x1  x2
1   3   2
2   2   1
3   1   1
data both ;
merge left right ;
both:
id  y1  y2  x1  x2
1   2   3   3   2
2   1   2   2   1
3   3   1   1   1
Slide C.21
SAS
Table LookUp
table:
part  desc
0011  hammer
0012  nail
0013  bow
database:
id  part
1   0011
2   0011
3   0013
proc sort data=database out=sorted ;
by part ;
data both ;
merge table sorted ;
by part ;
both:
id  part  desc
1   0011  hammer
2   0011  hammer
3   0013  bow
Within a BY group, the last observation of the smaller input data set is repeated. A plain MERGE also keeps unmatched parts, so 0012 (nail) would appear as well, with a missing id.
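The lookup idea itself can be sketched in Python (not SAS) with a dictionary keyed on the part number. This is a simplification: it only attaches a description to each database row and does not reproduce every detail of a SAS match-merge. The variable names mirror the slide.

```python
# Lookup table: part number -> description.
table = {"0011": "hammer", "0012": "nail", "0013": "bow"}

# Records that need a description attached.
database = [
    {"id": 1, "part": "0011"},
    {"id": 2, "part": "0011"},
    {"id": 3, "part": "0013"},
]

# Attach the description; a part missing from the table would get None.
both = [{**row, "desc": table.get(row["part"])} for row in database]

for row in both:
    print(row["id"], row["part"], row["desc"])
```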
Slide C.22
SAS
Update
master:
part  desc
0011  hammer
0012  nail
0013  bow
transaction:
part  desc
0011  jackhammer
data new_master ;
update master transaction ;
by part ;
new_master:
part  desc
0011  jackhammer
0012  nail
0013  bow
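A minimal Python sketch (not SAS) of the update logic follows. It assumes the default UPDATE behavior, in which a missing transaction value does not overwrite the master value. The function name `apply_update` and the use of `None` for missing are assumptions for illustration.

```python
def apply_update(master, transactions, key="part"):
    """Apply transactions to a master list of rows, matched on `key`.

    Non-missing transaction values replace master values; missing values
    (None) are ignored; unmatched transactions add new rows.
    """
    merged = {row[key]: dict(row) for row in master}
    for tx in transactions:
        target = merged.setdefault(tx[key], {key: tx[key]})
        for name, value in tx.items():
            if value is not None:
                target[name] = value
    return [merged[k] for k in sorted(merged)]


master = [
    {"part": "0011", "desc": "hammer"},
    {"part": "0012", "desc": "nail"},
    {"part": "0013", "desc": "bow"},
]
transaction = [{"part": "0011", "desc": "jackhammer"}]

new_master = apply_update(master, transaction)
for row in new_master:
    print(row["part"], row["desc"])
```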
Slide C.23
SAS
Changing the Level of Analysis 1
Before:
Subject  Time  Score
A        1     A1
A        2     A2
A        3     A3
B        1     B1
B        2     B2
B        3     B3
After:
Subject  Score1  Score2  Score3
A        A1      A2      A3
B        B1      B2      B3
Slide C.24
SAS
Changing the Level of Analysis 1
data after ;
keep subject score1 score2 score3 ;
retain score1 score2 ;
set before ;
if time=1 then score1 = score ;
else if time=2 then score2 = score ;
else if time=3 then do ;
score3 = score ;
output ;
end ;
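The same collapse from one row per time point to one row per subject can be sketched in Python (not SAS). The dictionary `retained` plays the role of the RETAIN statement: it survives from one input row to the next. The function name `to_wide` is an assumption, and the sketch, like the SAS code, assumes every subject has exactly three rows in time order.

```python
def to_wide(before):
    """Turn (subject, time, score) rows into one row per subject."""
    after = []
    retained = {}  # survives from row to row, like RETAIN
    for row in before:
        retained[f"score{row['time']}"] = row["score"]
        if row["time"] == 3:  # last row of a subject: emit one wide row
            after.append({"subject": row["subject"], **retained})
    return after


before = [
    {"subject": "A", "time": 1, "score": "A1"},
    {"subject": "A", "time": 2, "score": "A2"},
    {"subject": "A", "time": 3, "score": "A3"},
    {"subject": "B", "time": 1, "score": "B1"},
    {"subject": "B", "time": 2, "score": "B2"},
    {"subject": "B", "time": 3, "score": "B3"},
]
after = to_wide(before)
for row in after:
    print(row)
```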
Slide C.25
SAS
Changing the Level of Analysis 2
Before:
Day  Score  Student
1    12     A
1    11     B
1    13     C
2    14     A
2    10     B
2    9      C
After:
Day  Highest  Student
1    13       C
2    14       A
Slide C.26
SAS
Changing the Level of Analysis 2
FIRST. and LAST. Variable Modifiers
proc sort data=log ;
by day ;
data find_highest ;
retain highest ;
drop score ;
set log ;
by day ;
if first.day then highest=. ;
if score > highest then highest = score ;
if last.day then output ;
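For comparison, here is a Python sketch (not SAS) of the same by-group reduction. `itertools.groupby` marks where each day starts and ends, which is the job FIRST.day and LAST.day do in the DATA step. Unlike the SAS code, the sketch also carries along the student who earned the highest score. The data are those of the Before table.

```python
from itertools import groupby
from operator import itemgetter

log = [
    {"day": 1, "student": "A", "score": 12},
    {"day": 1, "student": "B", "score": 11},
    {"day": 1, "student": "C", "score": 13},
    {"day": 2, "student": "A", "score": 14},
    {"day": 2, "student": "B", "score": 10},
    {"day": 2, "student": "C", "score": 9},
]

by_day = itemgetter("day")
find_highest = []
# groupby needs sorted input, just as BY processing needs PROC SORT.
for day, rows in groupby(sorted(log, key=by_day), key=by_day):
    best = max(rows, key=itemgetter("score"))  # first row wins on ties
    find_highest.append(
        {"day": day, "highest": best["score"], "student": best["student"]}
    )

for row in find_highest:
    print(row)
```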
Slide C.27
SAS
The KEEP and DROP Statements
keep a b f h ;
drop x1-x99 ;
data a(keep = a1 a2) b(keep = b1 b2) ;
set x ;
if blah then output a ;
else output b ;
Slide C.28
SAS
Changing the Level of Analysis 3
Spreading Out an Observation
Before:
Subject  Score1  Score2  Score3
A        A1      A2      A3
B        B1      B2      B3
After:
Subject  Time  Score
A        1     A1
A        2     A2
A        3     A3
B        1     B1
B        2     B2
B        3     B3
Slide C.29
SAS
Changing the Level of Analysis 3 – SAS Code
data spread ;
drop score1 score2 score3 ;
set tight ;
time = 1 ; score = score1 ; output ;
time = 2 ; score = score2 ; output ;
time = 3 ; score = score3 ; output ;
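The three OUTPUT statements write three rows for every input row. A Python sketch of the same spreading step (not SAS) is given below. The function name `spread` follows the output data set name; the row layout is an assumption for illustration.

```python
def spread(tight):
    """Turn one wide row per subject into one row per subject and time."""
    rows = []
    for obs in tight:
        for time in (1, 2, 3):  # one output row per score column
            rows.append(
                {"subject": obs["subject"], "time": time, "score": obs[f"score{time}"]}
            )
    return rows


tight = [
    {"subject": "A", "score1": "A1", "score2": "A2", "score3": "A3"},
    {"subject": "B", "score1": "B1", "score2": "B2", "score3": "B3"},
]
long_rows = spread(tight)
for row in long_rows:
    print(row["subject"], row["time"], row["score"])
```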
Slide C.30
SAS
Use of the IN= Dataset Indicator
data new ;
set old1 (in=from_old1)
old2 (in=from_old2) ;
if from_old1 then … ;
if from_old2 then … ;
Slide C.31
SAS
Proc Summary for Aggregation
proc summary data=raw_purchases ;
by household ;
class brand ;
var x1 x2 x3 x4 x5 ;
output out=household mean=overall ;
Slide C.32
SAS
Using SAS for Simulations
Simulation
Loop
data monte_carlo ;
keep y1 - y4 ;
array y{4} y1 - y4 ;
array loading{4} l1 - l4 ;
array unique{4} u1 - u4 ;
l1 = 1 ; l2 = .5 ; l3 = .5 ; l4 = .5 ;
u1 = .2 ; u2 = .2 ; u3 = .2 ; u4 = .2 ;
do subject = 1 to 100 ;
eta = rannor(1921) ;
do j = 1 to 4 ;
y{j} = eta*loading{j} + unique{j}*rannor(2917) ;
end ;
output ;
end ;
proc calis data=monte_carlo ;
etc. ;
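The data-generating loop can also be sketched in Python (not SAS). Each subject gets one common factor score, eta, and each of the four measures is that score times its loading plus scaled unique noise. The function name `simulate` is an assumption, and Python's random number generator is not RANNOR, so the same seed will not reproduce the SAS values.

```python
import random


def simulate(n_subjects=100, seed=1921):
    """Generate n_subjects rows of four one-factor indicators y1..y4."""
    rng = random.Random(seed)  # private generator, so runs are repeatable
    loading = [1.0, 0.5, 0.5, 0.5]
    unique = [0.2, 0.2, 0.2, 0.2]
    data = []
    for _ in range(n_subjects):
        eta = rng.gauss(0.0, 1.0)  # common factor score for this subject
        row = [
            eta * l + u * rng.gauss(0.0, 1.0)  # loading*eta + unique part
            for l, u in zip(loading, unique)
        ]
        data.append(row)
    return data


sample = simulate()
print(len(sample), sample[0])
```

The resulting rows could then be passed to a factor-analysis routine, as PROC CALIS is used above.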
Slide C.33
SAS
External Data Sets and Windows/Vista
filename trans 'C:\Documents\june\transactions.data' ;
libname clv 'C:\Documents\customer_projects\' ;
...
data clv.june ;
infile trans ;
input id 3. purch 2. day 3. month $ ;
Slide C.34
SAS