UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas
SAS -- Scatter Plots (X-Y Graphs) and Correlation Analysis
Proc Gplot and Proc Corr
Proc Gplot -- Scatter Plots in SAS
Scatter Plots are graphs that show the corresponding values for two or more variables. For example, a
graph that shows the (X,Y) ordered pairs from a data set with variables X and Y is a scatter plot.
Proc Gplot is used to create scatter plots in SAS.
IMPORTANT NOTE: Before using Proc Gplot in SAS, you should do two things. First, put the
following command at the top of your program, just below the "options" commands (this
command tells SAS to send all graphics output to your ECN377 folder, rather than to a folder on
the TealWare server computer that you don't have authorization to use):
ods listing gpath="V:\ECN377";
Second, use the "ODS graphics on;" and "ODS graphics off;" commands so that your charts will
look prettier. “ods graphics” is the graphics part of SAS’s “output delivery system.” The ods
commands are usually optional, but they make the output look much nicer.
For example, if we have two variables GDP and Time in dataset01, we can create a scatter plot with GDP
on the vertical (Y) axis and Time on the horizontal (X) axis with the following commands:
ods graphics on;
proc gplot data=dataset01;
plot GDP*Time;
run;
ods graphics off;
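If you want to try these commands before loading your own data, the following sketch builds a small practice version of dataset01 with a data step and then plots it. The five data values are made-up placeholders chosen only for illustration, not real GDP figures. The "quit;" line is an addition to the commands above: Proc Gplot stays active after "run;" until SAS reaches "quit;" or the next procedure or data step.

```sas
/* Practice data: the values below are made-up placeholders, not real GDP figures */
data dataset01;
input Time GDP;
datalines;
1 100
2 104
3 109
4 111
5 118
;
run;

ods graphics on;
/* GDP goes on the vertical (Y) axis, Time on the horizontal (X) axis */
proc gplot data=dataset01;
plot GDP*Time;
run;
quit;
ods graphics off;
```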
In the plot command above, the asterisk "*" between GDP and Time is not a multiplication sign. Rather,
it is simply a connector that tells SAS to plot GDP (on the Y axis) against Time (on the X axis).
You can include multiple "plot" statements between the "proc gplot" and "run" commands if you want to
make more than one plot. For example, if you had variables GDP, InflationRate and Time in your
dataset, you could make several different plots:
ods graphics on;
proc gplot data=dataset01;
plot GDP*Time;
plot InflationRate*Time;
plot GDP*InflationRate;
run;
ods graphics off;
You can graph two or more Y variables on the same graph with the "Overlay" option. For example, if
you want to graph GDP and InflationRate both against Time on the same graph, you can use the
following commands:
ods graphics on;
proc gplot data=dataset01;
plot GDP*Time InflationRate*Time / Overlay;
run;
ods graphics off;
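With Overlay, both series are drawn against the same vertical axis, so it helps to give each one its own marker and to add a legend. The sketch below assumes dataset01 contains GDP, InflationRate and Time; the marker and color choices are arbitrary illustrations. Because GDP and an inflation rate are usually measured on very different scales, the second step shows one possible alternative, the "plot2" statement, which draws the second variable against its own right-hand vertical axis.

```sas
/* symbol statements are global: they stay in effect until changed or reset */
symbol1 value=dot color=blue;
symbol2 value=circle color=red;

/* Option 1: both variables share the left-hand axis, with a legend */
proc gplot data=dataset01;
plot GDP*Time InflationRate*Time / overlay legend;
run;
quit;

/* Option 2: InflationRate gets its own right-hand axis via plot2 */
proc gplot data=dataset01;
plot GDP*Time;
plot2 InflationRate*Time;
run;
quit;
```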
Proc Corr -- Correlation Analysis in SAS
To do correlation analysis in SAS, use PROC CORR.
(Again, make sure that the ods listing gpath="V:\ECN377"; command is at the top of your
program, and use the "ods graphics on;" and "ods graphics off;" commands. Proc Corr draws the
graphs requested by its "plots=" option only while ODS graphics is turned on.)
For example, suppose we have variables "PopCens" and "Age65per10000" in dataset01. To calculate the
correlation coefficient, r, for these variables, use the following SAS commands:
ods graphics on;
proc corr data = dataset01 plots=matrix(histogram nvar=all nwith=all);
var PopCens Age65per10000;
run;
ods graphics off;
To have SAS run correlations between all pairs in a list of variables, simply list all the variables on the
"var" line of the proc corr command. Remember, the variables must be numerical measurement variables.
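For instance, assuming dataset01 also holds the GDP, InflationRate and Time variables from the plotting examples, the following sketch asks for the correlations between all three pairs (GDP with InflationRate, GDP with Time, and InflationRate with Time) in one step:

```sas
proc corr data=dataset01;
/* every variable listed here is correlated with every other one */
var GDP InflationRate Time;
run;
```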
The key output of Proc Corr is a Correlation Matrix, a table that gives, for every pair of variables
selected for analysis:
- the correlation coefficient (the "r" value) between the pair of variables,
- the p-value (for the hypothesis test H0: ρ = 0, H1: ρ ≠ 0), and
- the number of observations, n, used to calculate the correlation coefficient.
The upper-left to lower-right diagonal elements of the table will be exactly 1.00, because any variable is
perfectly correlated with itself. Also, the numbers above the diagonal will be the same as the
corresponding numbers below the diagonal--you only need to look at the numbers above the diagonal or
the numbers below the diagonal, not both. Finally, if there are missing values in the data, then the
number of observations used to calculate each correlation coefficient may vary from one correlation
coefficient to the next in the table.
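If you would rather have every correlation in the table based on the same set of observations, one option is the "nomiss" option of Proc Corr, which drops any observation that has a missing value for any variable on the "var" line before the correlations are calculated. A minimal sketch, assuming dataset01 contains GDP, InflationRate and Time:

```sas
/* nomiss: use only observations with no missing values among the var variables */
proc corr data=dataset01 nomiss;
var GDP InflationRate Time;
run;
```

The trade-off is that fewer observations may be used for some pairs than without the option.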
The "plots=matrix(histogram nvar=all nwith=all)" option of Proc Corr causes SAS to create a
Scatterplot Matrix, a matrix of graphs in which each graph shows a pair of variables from the "var" list
plotted against one another. The "nvar=all" and "nwith=all" settings tell SAS to include every listed
variable; by default the matrix is limited to five variables. In addition, the histograms for all
variables are shown along the diagonal of the scatterplot matrix. Very cool!
IMPORTANT NOTE—Accessing the Scatterplot Matrix: SAS puts the Scatterplot Matrix in the
Results Window of SAS (not the Output Window), in subfolder “Corr: The SAS System”, filename
“MatrixPlot.” In SAS, go to the Results Window by clicking on the Results tab at the bottom of the SAS
screen. Then, in the Results Window, double-click on the “Corr: The SAS System” folder, then
double-click on “MatrixPlot.” SAS will put the Scatterplot Matrix in a new window. You can enlarge or reduce
the MatrixPlot window, and then you can use the “Snip” tool in MS Windows to take a snip of the
Scatterplot Matrix and paste it into a MS Word document.