STAT 6250 Descriptive Statistics Dr. Fan

Supplementary Materials

Based on the data in HTWT.txt

SAS Code:

DATA HTWT;
  INPUT GENDER $ HEIGHT WEIGHT COLLEGE $;
  DATALINES;
M 68.5 155 SCI
F 61.2 99 BSNS
F 63.0 115 BSNS
M 70.0 205 SCI
M 68.6 170 ARTS
F 65.1 125 BSNS
M 72.4 220 ARTS
M 69.5 188 SCI
;
RUN;

For Quantitative Variables: PROC UNIVARIATE

PROC UNIVARIATE gives you extensive statistics for quantitative variables, including normality tests (with the option NORMAL) and stemplots and boxplots (with the option PLOT). You can also get histograms, normal Q-Q plots, and normal probability plots with the HISTOGRAM, QQPLOT, and PROBPLOT statements, respectively.

SAS Syntax:

PROC UNIVARIATE DATA=dataname NORMAL PLOT;
  TITLE "the title you want for the output";
  VAR var1 var2 … vari;
  HISTOGRAM var1 var2 … varj;
  QQPLOT var1 var2 … vark;
  PROBPLOT var1 var2 … varl;
RUN;

Example: We wish to get descriptive statistics, including stemplots and boxplots, for the variable height. Also verify whether the variable height follows a normal distribution.

SAS Code:

PROC UNIVARIATE DATA=HTWT NORMAL PLOT;
  TITLE "DESCRIPTIVE STATISTICS + PLOTS";
  VAR HEIGHT WEIGHT;
  HISTOGRAM HEIGHT / MIDPOINTS=60 TO 75 BY 5 NORMAL;
  INSET MEAN='Mean' (5.2) STD='Standard deviation' (5.2);
  QQPLOT HEIGHT;
  PROBPLOT HEIGHT;
RUN;

Exercise: Obtain descriptive statistics for weight. Is it reasonable to assume weight follows a normal distribution? Does the mean weight of the population significantly differ from 160 lb?

Example: We wish to get the descriptive statistics and histogram for height for each gender separately.

SAS Code:

PROC SORT DATA=HTWT;
  BY GENDER;
RUN;

/** NUMERICAL AND VISUAL SUMMARIES FOR QUANTITATIVE VAR'S **/
PROC UNIVARIATE DATA=HTWT NORMAL PLOT;
  TITLE "MORE DESCRIPTIVE STATISTICS + HISTOGRAMS";
  BY GENDER;
  VAR HEIGHT;
RUN;

For Categorical Variables: PROC FREQ and PROC GCHART

PROC FREQ outputs frequency tables for categorical variables.
PROC GCHART outputs bar charts for categorical variables.

SAS Syntax:

PROC FREQ DATA=dataname;
  TABLES var1 var2 … vari;
RUN;

PROC GCHART DATA=dataname;
  VBAR var1 var2 … vari;
RUN;

Example: Summarize the variables college and gender.

SAS Code:

PROC FREQ DATA=HTWT;
  TITLE "FREQUENCY TABLE";
  TABLES GENDER COLLEGE;
RUN;

/** VISUAL SUMMARY FOR CATEGORICAL VAR'S **/
PROC GCHART DATA=HTWT;
  TITLE "BAR CHART";
  VBAR GENDER COLLEGE;
RUN;

Relationship among Variables:

1 categorical + 1 quantitative variables

Example: Draw a plot to illustrate how the distribution of height for male students differs from that for female students. Describe what you see.

/* Sorting the data to be used in the side by side boxplots */
proc sort data=htwt OUT=HTWT_SORTED;
  by gender;
run;

PROC BOXPLOT DATA=HTWT_SORTED;
  TITLE "SIDE-BY-SIDE BOXPLOT FOR GENDER AND HEIGHT";
  PLOT HEIGHT*GENDER; /* QUANTITATIVE*CATEGORICAL */
RUN;

2 categorical variables

Example: Draw a plot to illustrate how the distribution of the colleges of male students differs from that of female students. Describe what you see.

PROC FREQ DATA=HTWT;
  TITLE "TWO-WAY TABLE";
  TABLES GENDER*COLLEGE;
RUN;

PROC GCHART DATA=HTWT;
  TITLE "SIDE-BY-SIDE BAR CHART FOR GENDER AND COLLEGE";
  VBAR COLLEGE / GROUP=GENDER;
RUN;

2 quantitative variables

Example: Draw a plot to illustrate the relationship between height and weight. Describe what you see.

PROC GPLOT DATA=HTWT;
  TITLE "SCATTERPLOT OF WEIGHT VS. HEIGHT";
  PLOT WEIGHT*HEIGHT; /* Y*X */
RUN;

1 categorical + 2 quantitative variables

Example: Draw a plot to illustrate the relationship between height and weight for different genders. Describe what you see.
PROC GPLOT DATA=HTWT;
  TITLE "SCATTERPLOT FOR WEIGHT BY HEIGHT WITH GENDER GROUP";
  PLOT WEIGHT*HEIGHT=GENDER;
RUN;

2 categorical + 1 quantitative variables

Example: Draw a plot to illustrate how the distribution of height for male students differs from that for female students within each college group. Describe what you see.

/* Sorting the data to be used in the side by side boxplots */
Proc Sort Data=HTWT;
  By COLLEGE GENDER;
run;

/* Printing the side by side boxplots of height by college and gender */
Proc Boxplot data=HTWT;
  Title "Boxplot of height by College and Gender";
  Plot Height*GENDER (COLLEGE);
run;
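
The exercise in the PROC UNIVARIATE section asks whether the mean weight differs significantly from 160 lb. One possible approach, offered as a sketch and not as the handout's intended solution, is the MU0= option of PROC UNIVARIATE, which sets the null-hypothesis value used in the "Tests for Location" table (Student's t, sign, and signed rank). The title text is an illustrative choice, and the DATA step repeats the handout's data only so that the block runs on its own.

```sas
/* Re-create the handout's data set so this block is self-contained */
DATA HTWT;
  INPUT GENDER $ HEIGHT WEIGHT COLLEGE $;
  DATALINES;
M 68.5 155 SCI
F 61.2 99 BSNS
F 63.0 115 BSNS
M 70.0 205 SCI
M 68.6 170 ARTS
F 65.1 125 BSNS
M 72.4 220 ARTS
M 69.5 188 SCI
;
RUN;

/* MU0=160 makes the tests for location use H0: mean weight = 160 */
/* NORMAL adds the normality tests needed to judge the t test     */
PROC UNIVARIATE DATA=HTWT NORMAL MU0=160;
  TITLE "TESTS FOR LOCATION: WEIGHT VS. 160 LB"; /* illustrative title */
  VAR WEIGHT;
RUN;
```

With only eight observations, check the normality tests before relying on the t test; the sign and signed rank tests reported in the same table make weaker distributional assumptions.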