Download Summary - LearnEconometrics.com

SUMMARY OF SAS COMMANDS AND OPTIONS
Below is a summary of SAS statements used in this book. Braces { } contain the SAS documentation source.
I. STATEMENTS USED IN THE DATA STEP { SAS Language Guide }
Data Set Creation
DATA (dataset name);
INFILE 'dataset name';
INPUT (variable names) ($) (start column-end column);
LIST;
CARDS;
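As a minimal sketch of how the data set creation statements above fit together, the following DATA step reads three observations from inline data lines. The data set name `food`, the variable names, and the values are hypothetical and serve only as an illustration.

```sas
/* Hypothetical example: create a data set from inline data lines */
data food;
   input id $ y x;   /* the $ marks id as a character variable */
   cards;
A 52.25 258.3
B 58.32 343.1
C 81.79 425.0
;
run;
```

To read from an external file instead, an INFILE statement naming the file would replace the CARDS statement and the data lines.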
Do-Loops
do (index variable)=(start value) to (stop value) by (increment value);
Other SAS statements
end;
Arithmetic Operators
addition (+)
subtraction (-)
multiplication (*)
division (/)
exponentiation (**)
Comparison Operators
<   (or lt)   less than
<=  (or le)   less than or equal to
>   (or gt)   greater than
>=  (or ge)   greater than or equal to
Functions
ABS(x)          absolute value
CEIL(x)         smallest integer greater than or equal to x
CINV(p,df)      percentile (p) of Chi-square distribution with df degrees of freedom
EXP(x)          raises e to the power x
FINV(p,df1,df2) percentile (p) of F-distribution with df1 and df2 degrees of freedom
LAGn(x)         n'th lag of variable
LOG(x)          natural logarithm
MAX(arg,arg..)  maximum value
MIN(arg,arg..)  minimum value
NORMAL(seed)    standard normal random number
PROBCHI(x,df)   CDF of Chi-square random variable with df degrees of freedom
PROBIT(x)       inverse function of N(0,1) CDF
PROBNORM(x)     CDF of N(0,1) random variable
PROBT(x,df)     CDF of t-distribution with df degrees of freedom
RANNOR(seed)    standard normal random number
RANUNI(seed)    uniform random number in [0,1] interval
SQRT(x)         square root
TINV(p,df)      percentile (p) of t-distribution with df degrees of freedom
UNIFORM(seed)   uniform random number in [0,1] interval
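The percentile and CDF functions listed above are commonly combined to obtain critical values and p-values. The sketch below assumes a hypothetical t-statistic of 2.5 with 20 degrees of freedom and a Chi-square test with 3 degrees of freedom; the numbers are illustrative only.

```sas
/* Hypothetical example: critical values and a two-tail p-value */
data _null_;
   tstat = 2.5;                               /* assumed test statistic    */
   tcrit = tinv(0.975, 20);                   /* 97.5th percentile of t(20) */
   ccrit = cinv(0.95, 3);                     /* 95th percentile of Chi-square(3) */
   pval  = 2*(1 - probt(abs(tstat), 20));     /* two-tail p-value           */
   put tcrit= ccrit= pval=;                   /* write results to the SAS log */
run;
```

Using `data _null_` runs the calculations without creating an output data set.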
Other statements and useful tools
DELETE
deletes observation
SUM statement
variable + expression;
IF-THEN statement
if expression then statement;
IF-THEN; ELSE; statement
if expression then statement;
else statement;
KEEP statement
keep variable1 variable2;
MERGE statement
merge dataset1 dataset2;
_N_ variable
_N_ is observation number
SET statement
set dataset1 dataset2;
SUBSETTING IF statement
if expression;
TITLE statement
title 'descriptive title here';
. (PERIOD)
denotes missing value
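Several of the statements above often appear together in one DATA step. The following is a sketch only; the data set names `sample` and `sample2` and the variable `x` are hypothetical.

```sas
/* Hypothetical example: combine SET, DELETE, _N_, SUM, IF-THEN/ELSE and KEEP */
data sample2;
   set sample;                    /* read observations from an existing data set */
   if x = . then delete;          /* drop observations where x is missing        */
   t = _n_;                       /* _N_ counts passes through the DATA step     */
   total + x;                     /* SUM statement: running total of x           */
   if x > 0 then lnx = log(x);    /* natural log where it is defined             */
   else lnx = .;                  /* otherwise set to missing                    */
   keep t x lnx total;            /* variables written to sample2                */
run;
```

Because `_N_` is incremented on every pass through the DATA step, `t` will skip values for observations removed by DELETE.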
II. STATEMENTS USED IN THE PROC STEP
Print a data set { SAS Procedures Guide }
PROC PRINT (DATA=dataset name);
TITLE 'descriptive title';
VAR (variable names);
RUN;
Sort a data set { SAS Procedures Guide }
PROC SORT (DATA=dataset name);
BY (variable name);
Summary Statistics { SAS Procedures Guide }
PROC MEANS options;
options include:
n       number of observations
mean    mean
min     minimum value
max     maximum value
range   range
sum     sum
var     variance
std     sample standard deviation
stderr  standard error of the estimate
uss     uncorrected sum of squares
css     corrected sum of squares
t       t-value for mean=0
prt     p-value for "t" test
VAR variables;
OUTPUT OUT='dataset name' MEAN=(variable names) VAR=(variable names) ...;
or
PROC UNIVARIATE options;
options include:
n       number of observations
mean    mean
min     minimum value
max     maximum value
range   range
sum     sum
var     variance
std     sample standard deviation
median  median
mode    mode
VAR variables;
OUTPUT OUT='dataset name' MEAN=(variable names) VAR=(variable names) ...;
Correlation { SAS Procedures Guide }
PROC CORR;
VAR (variable names);
Frequency Diagrams { SAS Procedures Guide }
PROC CHART;
VBAR (variable name)/options;
HBAR (variable name)/options;
Rough plots { SAS Procedures Guide }
PROC PLOT;
PLOT (Y variable name)*(X variable name) ....;
Linear Regression { SAS/STAT Users' Guide }
PROC REG options;
MODEL dependent = independent/ options;
model options include:
covb    print Cov(b)
clm     print confidence interval for E[y]
cli     print confidence interval for y
acov    heteroskedasticity consistent cov(b)
noint   no intercept in model
dw      Durbin-Watson test
OUTPUT OUT=SASdataset PREDICTED=varname RESIDUAL=varname ...;
Other variables that can be output include:
L95=varname     lower bound of 95% prediction interval
U95=varname     upper bound of 95% prediction interval
STDI=varname    standard error of forecast
WEIGHT varname;
TEST SAS expressions;
RESTRICT SAS expressions;
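A minimal sketch of a PROC REG step using the statements and options above might look as follows. The data set `food`, the output data set `foodout`, and all variable names are hypothetical.

```sas
/* Hypothetical example: simple regression with diagnostics and saved output */
proc reg data=food;
   model y = x / covb clm cli dw;          /* print Cov(b), intervals, Durbin-Watson */
   output out=foodout predicted=yhat residual=ehat
          l95=lower u95=upper stdi=sef;    /* save predictions and interval bounds   */
   test x = 0;                             /* test that the slope is zero            */
run;
quit;
```

PROC REG is interactive, so the closing `quit;` ends the procedure once the requested output has been produced.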
Systems of Linear Equations { SAS/ETS User's Guide }
PROC SYSLIN options;
PROC SYSLIN options include:
sur        Seemingly Unrelated Regressions
vardef=n   compute variances with no degrees of freedom correction
2sls       two stage least squares
ENDOGENOUS variables;
INSTRUMENTS variables;
IDENTITY equation;
MODEL dependent = independent variables;
STEST SAS expressions;
SRESTRICT SAS expressions;
Autoregressive Models { SAS/ETS User's Guide }
PROC AUTOREG options OUTEST=SASdataset;
Model dependent variable = independent variables/options;
options include:
NLAG=n           order of autoregressive model
METHOD=options   ml=maximum likelihood uls=nonlinear least squares
CONVERGE=n       convergence tolerance
DW=n             Durbin-Watson statistic for order n=1,2,3 or 4
DWPROB           p-value of Durbin-Watson test
LAGDEP=varname   produces Durbin's h-statistic for autocorrelation in the presence of a lagged dependent variable
LAGDEP           produces Durbin's t-statistic for autocorrelation in the presence of a lagged dependent variable
Output out=SASdataset options;
PREDICTED=varname    prediction corrected for autocorrelation
PREDICTEDM=varname   prediction of mean value
LCL=varname          lower bound of 95% interval for PREDICTED value
UCL=varname          upper bound of 95% interval for PREDICTED value
RESIDUALM=varname    residual from prediction of mean value
RESIDUAL=varname     residual from PREDICTED value
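The PROC AUTOREG statements above can be sketched as a single step. This example assumes a hypothetical data set `sales` with variables `y` and `x`, and fits a regression with first-order autoregressive errors by maximum likelihood.

```sas
/* Hypothetical example: regression with AR(1) errors */
proc autoreg data=sales;
   model y = x / nlag=1 method=ml dwprob;   /* AR(1) errors, ML, Durbin-Watson p-value */
   output out=fit predicted=yhat residual=ehat
          lcl=lower ucl=upper;              /* predictions with 95% interval bounds    */
run;
```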
Pooling Time-Series and Cross-Sectional Data { SAS/ETS User's Guide }
PROC TSCSREG TS=t CS=n FULLER;
MODEL dependent = independent variables;
or
PROC MIXED; { SAS/STAT User's Guide }
CLASS ind time;                    identify cross-section and time-series variables
MODEL dependent = independent/s;   the "s" option prints slopes
RANDOM ind time;                   identify which effects are random
Systems of Nonlinear Equations { SAS/ETS User's Guide }
PROC MODEL;
PARAMETERS parameter names;
program statements
ENDOGENOUS varnames;
INSTRUMENTS varnames;
FIT equations/options;
FIT statement options include:
itsur    iterative sur
it2sls   iterative 2sls
ols      ols (default)
sur      seemingly unrelated regressions
Time-Series Analysis { SAS/ETS User's Guide }
PROC ARIMA;
IDENTIFY VAR=varname(d) NLAG=n;    produces diagnostics for variable differenced d times based on n lags
ESTIMATE P=p Q=q METHOD=options;   estimate ARIMA(p,d,q)
FORECAST LEAD=n;                   generate forecasts up to period T+n
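As an illustration of how the three PROC ARIMA statements are used in sequence, the sketch below fits an ARIMA(1,1,1) model and forecasts four periods ahead. The data set `series` and the variable `y` are hypothetical.

```sas
/* Hypothetical example: identify, estimate and forecast an ARIMA(1,1,1) model */
proc arima data=series;
   identify var=y(1) nlag=12;        /* diagnostics for first-differenced y, 12 lags */
   estimate p=1 q=1 method=ml;       /* one AR term and one MA term, ML estimation   */
   forecast lead=4;                  /* forecasts for periods T+1 through T+4        */
run;
quit;
```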
%DFTEST(SASdataset,variable[,options]);
This is a SAS macro that performs the Dickey-Fuller unit root test.
Required arguments are:
the SAS dataset name
variable name.
Options include:
AR=n                  the number of additional AR terms to include. Default=3
DIF=(n)               the degree of differencing to be applied to the series.
DLAG=n                specifies the lag to be tested for the unit root. n=1,2,4 or 12. Default=1.
OUT=datasetname       writes residuals to output dataset.
OUTSTAT=datasetname   writes test statistic, estimates, etc to output dataset. The macro does not print results, so this is necessary to view results
TREND=n               specifies the degree of deterministic trend included. n=0 for no trend, n=1 for intercept, n=2 for intercept and time trend. Default=1.
Polynomial Distributed Lags { SAS/ETS User's Guide }
PROC PDLREG options;
MODEL dependent = variable(n,pmax,pmin,constraint);
where
n=lag length
pmax=degree of polynomial
pmin=minimum degree of polynomial
constraint=FIRST, LAST, or BOTH
FIRST imposes head constraint b(-1)=0
LAST imposes tail constraint b(n+1)=0
BOTH imposes both head and tail constraint
Nonlinear Least Squares Regression { SAS/STAT User's Guide }
PROC NLIN METHOD=options;
MODEL dependent = expression;
PARMS parameter=value...;
other program statements
DER.parameter = expression;
Discrete Dependent Variables { SAS/STAT User's Guide }
PROC PROBIT options;
CLASS variables;
MODEL response = variables/D=(normal or logistic) options;
or
PROC LOGISTIC options;
MODEL response = variables/options;