Simple Descriptive Statistics Using SAS
Winter, 2005

/**********************************************
This example shows how to import an Excel file,
set up missing value codes, and create some
new variables.
Filename: descriptives.sas
**********************************************/

PROC IMPORT OUT= WORK.OWEN
     DATAFILE= "C:\temp\labdata\OWEN.XLS"
     DBMS=EXCEL REPLACE;
     SHEET="OWEN$";
     GETNAMES=YES;
     MIXED=NO;
     SCANTEXT=YES;
     USEDATE=YES;
     SCANTIME=YES;
RUN;

proc means data=owen;
  title "Descriptive Statistics for All Numeric Variables";
run;

data owen2;
  set owen;
  /*Set the numeric missing value codes (99 or 999) to SAS missing*/
  if vit_a = 99 then vit_a = .;
  if head_cir = 99 then head_cir = .;
  if fatfold = 99 then fatfold = .;
  if b_weight = 999 then b_weight = .;
  if mot_age = 99 then mot_age = .;
  if b_order_ = 99 then b_order_ = .;
  if m_height = 999 then m_height = .;
  if f_height = 999 then f_height = .;
  rename b_order_ = b_order;
  bwt_g = b_weight*10;
  /*NOTE: THE CODE BELOW TAKES MISSING INTO ACCOUNT*/
  if bwt_g not=. and bwt_g < 2500 then lowbwt=1;
  if bwt_g >=2500 then lowbwt=0;
  if sex=1 and race=1 then sexrace=1;
  if sex=1 and race=2 then sexrace=2;
  if sex=2 and race=1 then sexrace=3;
  if sex=2 and race=2 then sexrace=4;
run;

/*Univariate Statistics and Graphs*/
proc means data=owen2;
  title "Descriptive Statistics for All Numeric Variables";
  title2 "After Fixing Missing Value Codes";
run;

proc contents data=owen2;
  title "Contents of SAS Data Set";
run;

proc format;
  value sexfmt 1="1:Male"
               2="2:Female";
  value lbwtfmt 1="1:Low"
                0="0:Not Low";
  value racefmt 1="1:White"
                2="2:Black";
  /*Labels follow the sexrace coding in the data step above:
    sexrace 1 and 2 are males (sex=1), 3 and 4 are females (sex=2)*/
  value sexrfmt 1="1:White Male"
                2="2:Black Male"
                3="3:White Female"
                4="4:Black Female";
run;

proc freq data=owen2;
  tables childnum sex race sexrace w_rank b_order lowbwt;
  format sex sexfmt. lowbwt lbwtfmt. race racefmt. sexrace sexrfmt.;
  title "Frequencies for Categorical Variables";
run;

proc chart data=owen2;
  hbar sex race w_rank b_order lowbwt / discrete;
  title "Bar Charts for Categorical Variables";
run;

/*Histograms for continuous variables*/
proc univariate data=owen2 plot;
  var age b_weight weight;
  histogram;
  title "Histogram for Continuous Variables";
run;

/*Bivariate Relationships of Two Categorical Variables*/
proc freq data=owen2;
  tables sex*childnum;
  title "Crosstabulation of Sex and Child Number";
run;

/*Bivariate Relationships of Two Continuous Variables*/
proc plot data=owen2;
  plot weight * height;
  title "Scatter Plot of Weight vs Height";
run;

proc gplot data=owen2;
  plot weight * height;
run;

/*Bivariate Relationships of Continuous vs. Categorical Variable*/
proc sort data=owen2;
  by sex;
run;

proc univariate data=owen2 plot;
  by sex;
  var weight;
  title "Side-by-Side Boxplot Using Proc Univariate";
run;

proc boxplot data=owen2;
  plot weight*sex / boxstyle=schematic;
  title "Side-by-Side Boxplot Using Proc Boxplot";
run;


Descriptive Statistics for All Numeric Variables

The MEANS Procedure

Variable  Label        N          Mean       Std Dev       Minimum       Maximum
--------------------------------------------------------------------------------
FAM_NUM_  FAM_NUM   1006       4525.11       1634.03       2000.00       7569.00
CHILDNUM  CHILDNUM  1006     1.3359841     0.5716672     1.0000000     3.0000000
AGE       AGE       1006    44.0248509    16.6610452    12.0000000    73.0000000
SEX       SEX       1006     1.4890656     0.5001291     1.0000000     2.0000000
RACE      RACE      1006     1.2823062     0.4503454     1.0000000     2.0000000
W_RANK    W_RANK    1006     2.2127237     0.9024440     1.0000000     4.0000000
INCOME_C  INCOME_C  1006       1581.31   974.2279710    80.0000000       6250.00
HEIGHT    HEIGHT    1001    99.0429570    11.4300111    70.0000000   130.0000000
WEIGHT    WEIGHT    1000    15.6290800     3.6523446     8.2400000    41.0800000
HEMO      HEMO      1006    12.4606362     1.1578850     6.2000000    24.1000000
VIT_C     VIT_C     1006     1.1302187     0.6599121     0.1000000     3.5000000
VIT_A     VIT_A     1006    51.2465209    28.0530567    15.0000000    99.0000000
HEAD_CIR  HEAD_CIR  1006    49.7216700     4.6155769    39.0000000    99.0000000
FATFOLD   FATFOLD   1006     5.6780318    10.8109068     2.6000000    99.0000000
B_WEIGHT  B_WEIGHT  1006   338.4502982   111.0447134    91.0000000   999.0000000
MOT_AGE   MOT_AGE   1006    30.9990060    12.4970444    17.0000000    99.0000000
B_ORDER_  B_ORDER   1006     5.4304175    15.4013836     1.0000000    99.0000000
M_HEIGHT  M_HEIGHT  1006   185.3499006   132.7438368   122.0000000   999.0000000
F_HEIGHT  F_HEIGHT  1006   203.5119284   142.1009149   152.0000000   999.0000000
--------------------------------------------------------------------------------


Descriptive Statistics for All Numeric Variables
After Fixing Missing Value Codes

The MEANS Procedure

Variable  Label        N          Mean       Std Dev       Minimum       Maximum
--------------------------------------------------------------------------------
FAM_NUM_  FAM_NUM   1006       4525.11       1634.03       2000.00       7569.00
CHILDNUM  CHILDNUM  1006     1.3359841     0.5716672     1.0000000     3.0000000
AGE       AGE       1006    44.0248509    16.6610452    12.0000000    73.0000000
SEX       SEX       1006     1.4890656     0.5001291     1.0000000     2.0000000
RACE      RACE      1006     1.2823062     0.4503454     1.0000000     2.0000000
W_RANK    W_RANK    1006     2.2127237     0.9024440     1.0000000     4.0000000
INCOME_C  INCOME_C  1006       1581.31   974.2279710    80.0000000       6250.00
HEIGHT    HEIGHT    1001    99.0429570    11.4300111    70.0000000   130.0000000
WEIGHT    WEIGHT    1000    15.6290800     3.6523446     8.2400000    41.0800000
HEMO      HEMO      1006    12.4606362     1.1578850     6.2000000    24.1000000
VIT_C     VIT_C     1006     1.1302187     0.6599121     0.1000000     3.5000000
VIT_A     VIT_A      763    36.0380079     8.8951237    15.0000000    78.0000000
HEAD_CIR  HEAD_CIR   999    49.3763764     2.0739057    39.0000000    56.0000000
FATFOLD   FATFOLD    993     4.4562941     1.6683194     2.6000000    42.0000000
B_WEIGHT  B_WEIGHT   986   325.0517241    59.5162936    91.0000000   544.0000000
MOT_AGE   MOT_AGE    981    29.2660550     6.2603025    17.0000000    51.0000000
b_order   B_ORDER    980     2.9479592     2.1939526     1.0000000    16.0000000
M_HEIGHT  M_HEIGHT   980   163.7632653     6.3663343   122.0000000   199.0000000
F_HEIGHT  F_HEIGHT   975   178.2194872     7.3821354   152.0000000   210.0000000
bwt_g                986       3250.52   595.1629357   910.0000000       5440.00
lowbwt               986     0.1075051     0.3099115             0     1.0000000
sexrace             1006     2.2604374     1.0953386     1.0000000     4.0000000
--------------------------------------------------------------------------------


Contents of SAS Data Set

The CONTENTS Procedure

Data Set Name        WORK.OWEN2                              Observations          1006
Member Type          DATA                                    Variables             22
Engine               V9                                      Indexes               0
Created              Monday, January 03, 2005 05:25:10 PM    Observation Length    176
Last Modified        Monday, January 03, 2005 05:25:10 PM    Deleted Observations  0
Protection                                                   Compressed            NO
Data Set Type                                                Sorted                NO
Label
Data Representation  WINDOWS_32
Encoding             wlatin1  Western (Windows)

Engine/Host Dependent Information

Data Set Page Size          16384
Number of Data Set Pages    12
First Data Page             1
Max Obs per Page            92
Obs in First Data Page      74
Number of Data Set Repairs  0
File Name                   C:\DOCUME~1\Kathy\LOCALS~1\Temp\SAS Temporary Files\_TD688\owen2.sas7bdat
Release Created             9.0101M2
Host Created                XP_HOME

Alphabetic List of Variables and Attributes

 #    Variable    Type    Len    Label
 3    AGE         Num       8    AGE
15    B_WEIGHT    Num       8    B_WEIGHT
 2    CHILDNUM    Num       8    CHILDNUM
 1    FAM_NUM_    Num       8    FAM_NUM
14    FATFOLD     Num       8    FATFOLD
19    F_HEIGHT    Num       8    F_HEIGHT
13    HEAD_CIR    Num       8    HEAD_CIR
 8    HEIGHT      Num       8    HEIGHT
10    HEMO        Num       8    HEMO
 7    INCOME_C    Num       8    INCOME_C
16    MOT_AGE     Num       8    MOT_AGE
18    M_HEIGHT    Num       8    M_HEIGHT
 5    RACE        Num       8    RACE
 4    SEX         Num       8    SEX
12    VIT_A       Num       8    VIT_A
11    VIT_C       Num       8    VIT_C
 9    WEIGHT      Num       8    WEIGHT
 6    W_RANK      Num       8    W_RANK
17    b_order     Num       8    B_ORDER
20    bwt_g       Num       8
21    lowbwt      Num       8
22    sexrace     Num       8


Frequencies for Categorical Variables

The FREQ Procedure

                          CHILDNUM

                                      Cumulative  Cumulative
CHILDNUM         Frequency   Percent   Frequency     Percent
------------------------------------------------------------
1                      720     71.57         720       71.57
2                      234     23.26         954       94.83
3                       52      5.17        1006      100.00

                            SEX

                                      Cumulative  Cumulative
SEX              Frequency   Percent   Frequency     Percent
------------------------------------------------------------
1:Male                 514     51.09         514       51.09
2:Female               492     48.91        1006      100.00

                            RACE

                                      Cumulative  Cumulative
RACE             Frequency   Percent   Frequency     Percent
------------------------------------------------------------
1:White                722     71.77         722       71.77
2:Black                284     28.23        1006      100.00

                                      Cumulative  Cumulative
sexrace          Frequency   Percent   Frequency     Percent
------------------------------------------------------------
1:White Male           368     36.58         368       36.58
2:Black Male           146     14.51         514       51.09
3:White Female         354     35.19         868       86.28
4:Black Female         138     13.72        1006      100.00

                           W_RANK

                                      Cumulative  Cumulative
W_RANK           Frequency   Percent   Frequency     Percent
------------------------------------------------------------
1                      237     23.56         237       23.56
2                      406     40.36         643       63.92
3                      275     27.34         918       91.25
4                       88      8.75        1006      100.00

                          B_ORDER

                                      Cumulative  Cumulative
b_order          Frequency   Percent   Frequency     Percent
------------------------------------------------------------
1                      278     28.37         278       28.37
2                      244     24.90         522       53.27
3                      178     18.16         700       71.43
4                      113     11.53         813       82.96
5                       58      5.92         871       88.88
6                       42      4.29         913       93.16
7                       28      2.86         941       96.02
8                        9      0.92         950       96.94
9                        7      0.71         957       97.65
10                       8      0.82         965       98.47
11                       6      0.61         971       99.08
12                       4      0.41         975       99.49
13                       2      0.20         977       99.69
14                       2      0.20         979       99.90
16                       1      0.10         980      100.00

                   Frequency Missing = 26

                                      Cumulative  Cumulative
lowbwt           Frequency   Percent   Frequency     Percent
------------------------------------------------------------
0:Not Low              880     89.25         880       89.25
1:Low                  106     10.75         986      100.00

                   Frequency Missing = 20


Bar Charts for Categorical Variables

[Line-printer horizontal bar charts of SEX, RACE, W_RANK, B_ORDER, and lowbwt against Frequency; the bars are not reproduced. The statistics printed beside each chart are listed here.]

SEX          Freq   Cum. Freq   Percent   Cum. Percent
1             514         514     51.09          51.09
2             492        1006     48.91         100.00

RACE         Freq   Cum. Freq   Percent   Cum. Percent
1             722         722     71.77          71.77
2             284        1006     28.23         100.00

W_RANK       Freq   Cum. Freq   Percent   Cum. Percent
1             237         237     23.56          23.56
2             406         643     40.36          63.92
3             275         918     27.34          91.25
4              88        1006      8.75         100.00

B_ORDER      Freq   Cum. Freq   Percent   Cum. Percent
1             278         278     28.37          28.37
2             244         522     24.90          53.27
3             178         700     18.16          71.43
4             113         813     11.53          82.96
5              58         871      5.92          88.88
6              42         913      4.29          93.16
7              28         941      2.86          96.02
8               9         950      0.92          96.94
9               7         957      0.71          97.65
10              8         965      0.82          98.47
11              6         971      0.61          99.08
12              4         975      0.41          99.49
13              2         977      0.20          99.69
14              2         979      0.20          99.90
16              1         980      0.10         100.00

lowbwt       Freq   Cum. Freq   Percent   Cum. Percent
0             880         880     89.25          89.25
1             106         986     10.75         100.00


Histogram for Continuous Variables

The UNIVARIATE Procedure
Variable: AGE (AGE)

                            Moments

N                        1006    Sum Weights              1006
Mean               44.0248509    Sum Observations        44289
Std Deviation      16.6610452    Variance           277.590427
Skewness           -0.1448644    Kurtosis           -1.0971483
Uncorrected SS        2228795    Corrected SS       278978.379
Coeff Variation    37.8446374    Std Error Mean     0.52529498

                   Basic Statistical Measures

    Location                    Variability
Mean      44.02485     Std Deviation            16.66105
Median    45.00000     Variance                277.59043
Mode      45.00000     Range                    61.00000
                       Interquartile Range      27.00000

                   Tests for Location: Mu0=0

Test            -Statistic-     -----p Value------
Student's t     t  83.80977     Pr > |t|    <.0001
Sign            M       503     Pr >= |M|   <.0001
Signed Rank     S  253260.5     Pr >= |S|   <.0001

Quantiles (Definition 5)

Quantile      Estimate
100% Max            73
99%                 71
95%                 69
90%                 66
75% Q3              58
50% Median          45
25% Q1              31
10%                 20
5%                  16
1%                  13
0% Min              12

Extreme Observations

----Lowest----     ----Highest---
Value      Obs     Value      Obs
   12      968        72       96
   12      528        72      131
   12      375        72      664
   12      111        72      828
   13      979        73       78

[Line-printer histogram, boxplot, and normal probability plot of AGE not reproduced. Histogram rows run from 72.5 down to 12.5 in steps of 5, with counts 36, 106, 91, 86, 88, 105, 96, 79, 81, 78, 60, 75, 25; each * may represent up to 3 counts. The normal probability plot spans normal quantiles -2 to +2.]


The UNIVARIATE Procedure
Variable: B_WEIGHT (B_WEIGHT)

                            Moments

N                         986    Sum Weights               986
Mean               325.051724    Sum Observations       320501
Std Deviation      59.5162936    Variance            3542.1892
Skewness           -0.4343559    Kurtosis           1.25461651
Uncorrected SS      107668459    Corrected SS       3489056.36
Coeff Variation    18.3097917    Std Error Mean     1.89538491

                   Basic Statistical Measures

    Location                    Variability
Mean      325.0517     Std Deviation            59.51629
Median    331.0000     Variance                     3542
Mode      354.0000     Range                   453.00000
                       Interquartile Range      73.00000

                   Tests for Location: Mu0=0

Test            -Statistic-     -----p Value------
Student's t     t  171.4964     Pr > |t|    <.0001
Sign            M       493     Pr >= |M|   <.0001
Signed Rank     S  243295.5     Pr >= |S|   <.0001

Quantiles (Definition 5)

Quantile      Estimate
100% Max           544
99%                453
95%                413
90%                399
75% Q3             363
50% Median         331
25% Q1             290
10%                249
5%                 227
1%                 136
0% Min              91

Extreme Observations

----Lowest----     ----Highest---
Value      Obs     Value      Obs
   91      409       480      547
  104       12       499      441
  108      317       503      440
  108      316       517      669
  109      972       544      684

Missing Values

                      -----Percent Of-----
Missing                            Missing
  Value     Count     All Obs          Obs
      .        20        1.99       100.00

[Line-printer histogram, boxplot, and normal probability plot of B_WEIGHT not reproduced. Histogram rows run from 550 down to 90 in steps of 20, with counts 1, 0, 2, 2, 1, 15, 21, 41, 80, 96, 165, 132, 111, 126, 76, 45, 27, 23, 6, 1, 4, 5, 5, 1; each * may represent up to 4 counts. The normal probability plot spans normal quantiles -2 to +2.]


The UNIVARIATE Procedure
Variable: WEIGHT (WEIGHT)

                            Moments

N                        1000    Sum Weights              1000
Mean                 15.62908    Sum Observations     15629.08
Std Deviation      3.65234459    Variance            13.339621
Skewness           0.97193709    Kurtosis           3.43576311
Uncorrected SS     257594.423    Corrected SS       13326.2814
Coeff Variation    23.3689033    Std Error Mean     0.11549728

                   Basic Statistical Measures

    Location                    Variability
Mean      15.62908     Std Deviation             3.65234
Median    15.20000     Variance                 13.33962
Mode      15.00000     Range                    32.84000
                       Interquartile Range       4.76500

                   Tests for Location: Mu0=0

Test            -Statistic-     -----p Value------
Student's t     t  135.3199     Pr > |t|    <.0001
Sign            M       500     Pr >= |M|   <.0001
Signed Rank     S    250250     Pr >= |S|   <.0001

Quantiles (Definition 5)

Quantile      Estimate
100% Max        41.080
99%             26.250
95%             21.785
90%             20.070
75% Q3          17.870
50% Median      15.200
25% Q1          13.105
10%             11.105
5%              10.280
1%               9.035
0% Min           8.240

Extreme Observations

----Lowest----     ----Highest---
Value      Obs     Value      Obs
 8.24      824     27.64      975
 8.30      986     30.30      618
 8.39      125     31.40      774
 8.60      735     36.28      391
 8.64      968     41.08      909

Missing Values

                      -----Percent Of-----
Missing                            Missing
  Value     Count     All Obs          Obs
      .         6        0.60       100.00

[Line-printer histogram, boxplot, and normal probability plot of WEIGHT not reproduced. Histogram rows run from 41 down to 9 in steps of 2, with counts 1, 0, 1, 0, 0, 2, 0, 7, 10, 24, 61, 140, 174, 256, 167, 122, 35; each * may represent up to 6 counts. The normal probability plot spans normal quantiles -2 to +2.]


Crosstabulation of Sex and Child Number

The FREQ Procedure

Table of SEX by CHILDNUM

SEX(SEX)     CHILDNUM(CHILDNUM)

Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|       3|  Total
---------+--------+--------+--------+
        1|    372 |    113 |     29 |    514
         |  36.98 |  11.23 |   2.88 |  51.09
         |  72.37 |  21.98 |   5.64 |
         |  51.67 |  48.29 |  55.77 |
---------+--------+--------+--------+
        2|    348 |    121 |     23 |    492
         |  34.59 |  12.03 |   2.29 |  48.91
         |  70.73 |  24.59 |   4.67 |
         |  48.33 |  51.71 |  44.23 |
---------+--------+--------+--------+
Total         720      234       52     1006
            71.57    23.26     5.17   100.00


Scatter Plot of Weight vs Height

Plot of WEIGHT*HEIGHT. Legend: A = 1 obs, B = 2 obs, etc.

[Line-printer scatter plot not reproduced: WEIGHT on the vertical axis (0 to 50), HEIGHT on the horizontal axis (70 to 130).]

NOTE: 8 obs had missing values. 2 obs hidden.
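
The scatter plot above comes from PROC PLOT, which draws with printer characters. As a minimal sketch, assuming SAS 9.2 or later (where PROC SGPLOT is available) and the OWEN2 data set created earlier in this handout, the same relationship could be drawn as a graphics plot. PROC SGPLOT is not used in the original program; it is shown here only as one possible alternative. Observations with a missing HEIGHT or WEIGHT are left out of the plot.

```sas
/* Sketch only: assumes SAS 9.2+ and that OWEN2 exists in the WORK library */
title "Scatter Plot of Weight vs Height";
proc sgplot data=owen2;
  scatter x=height y=weight;   /* HEIGHT on the horizontal axis, WEIGHT on the vertical axis */
run;
title;                         /* clear the title so it does not carry over to later output */
```
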


Side-by-Side Boxplot Using Proc Univariate

------------------------------- SEX=1 -------------------------------

The UNIVARIATE Procedure
Variable: WEIGHT (WEIGHT)

                            Moments

N                         510    Sum Weights               510
Mean               15.8952157    Sum Observations      8106.56
Std Deviation      3.62177792    Variance           13.1172753
Skewness           1.34306809    Kurtosis           6.08443256
Uncorrected SS     135532.213    Corrected SS       6676.69313
Coeff Variation    22.7853336    Std Error Mean     0.16037502

                   Basic Statistical Measures

    Location                    Variability
Mean      15.89522     Std Deviation             3.62178
Median    15.42000     Variance                 13.11728
Mode      15.00000     Range                    31.70000
                       Interquartile Range       4.70000

                   Tests for Location: Mu0=0

Test            -Statistic-     -----p Value------
Student's t     t  99.11279     Pr > |t|    <.0001
Sign            M       255     Pr >= |M|   <.0001
Signed Rank     S   65152.5     Pr >= |S|   <.0001

Quantiles (Definition 5)

Quantile      Estimate
100% Max        41.080
99%             24.300
95%             21.860
90%             20.185
75% Q3          18.100
50% Median      15.420
25% Q1          13.400
10%             11.560
5%              10.800
1%               9.980
0% Min           9.380

Extreme Observations

----Lowest----     ----Highest---
Value      Obs     Value      Obs
 9.38      464     26.30      270
 9.50      434     26.36      237
 9.53        4     31.40      400
 9.56      465     36.28      214
 9.60      514     41.08      477

Missing Values

                      -----Percent Of-----
Missing                            Missing
  Value     Count     All Obs          Obs
      .         4        0.78       100.00

[Line-printer histogram, boxplot, and normal probability plot of WEIGHT for SEX=1 not reproduced. Histogram rows run from 41 down to 9 in steps of 2, with counts 1, 0, 1, 0, 0, 1, 0, 2, 3, 14, 38, 74, 94, 130, 90, 51, 11; each * may represent up to 3 counts. The normal probability plot spans normal quantiles -2 to +2.]


------------------------------- SEX=2 -------------------------------

The UNIVARIATE Procedure
Variable: WEIGHT (WEIGHT)

                            Moments

N                         490    Sum Weights               490
Mean               15.3520816    Sum Observations      7522.52
Std Deviation       3.6670949    Variance            13.447585
Skewness           0.62442916    Kurtosis           0.72837857
Uncorrected SS      122062.21    Corrected SS       6575.86908
Coeff Variation    23.8866298    Std Error Mean     0.16566246

                   Basic Statistical Measures

    Location                    Variability
Mean      15.35208     Std Deviation             3.66709
Median    15.00000     Variance                 13.44759
Mode      15.00000     Range                    22.06000
                       Interquartile Range       4.99000

                   Tests for Location: Mu0=0

Test            -Statistic-     -----p Value------
Student's t     t  92.67085     Pr > |t|    <.0001
Sign            M       245     Pr >= |M|   <.0001
Signed Rank     S   60147.5     Pr >= |S|   <.0001

Quantiles (Definition 5)

Quantile      Estimate
100% Max         30.30
99%              26.45
95%              21.64
90%              19.89
75% Q3           17.69
50% Median       15.00
25% Q1           12.70
10%              10.89
5%               10.00
1%                8.64
0% Min            8.24

Extreme Observations

----Lowest----     ----Highest---
Value      Obs     Value      Obs
 8.24      906     26.45      703
 8.30      996     26.62      908
 8.39      574     27.50      944
 8.60      866     27.64      990
 8.64      985     30.30      810

Missing Values

                      -----Percent Of-----
Missing                            Missing
  Value     Count     All Obs          Obs
      .         2        0.41       100.00

[Line-printer histogram, boxplot, and normal probability plot of WEIGHT for SEX=2 not reproduced. Histogram rows run from 30.5 down to 8.5 in steps of 1, with counts 1, 0, 0, 2, 3, 3, 4, 2, 8, 7, 16, 30, 36, 39, 41, 66, 60, 39, 38, 41, 30, 16, 8; each * may represent up to 2 counts. The normal probability plot spans normal quantiles -2 to +2.]


Schematic Plots

[Line-printer side-by-side schematic box plots not reproduced: vertical axis from 5 to 45, one box for SEX=1 and one for SEX=2.]
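
The BY-group analysis above needs PROC SORT first, because BY processing expects the data in SEX order. As a minimal sketch, if only summary statistics for each group are wanted, PROC MEANS with a CLASS statement gives them without sorting. The choice of statistics and the title text here are assumptions made for illustration and are not part of the original program.

```sas
/* Sketch only: summary statistics of WEIGHT for each level of SEX.
   Assumes OWEN2 and the SEXFMT format from the program above. */
proc means data=owen2 n nmiss mean std median min max;
  class sex;              /* CLASS does not require the data to be sorted */
  var weight;
  format sex sexfmt.;     /* reuse the SEXFMT labels defined in PROC FORMAT */
  title "Descriptive Statistics for Weight by Sex";
run;
```

The N and mean reported for each group should agree with the Moments tables in the PROC UNIVARIATE output for SEX=1 and SEX=2.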