Univariate data analysis
• Measures of central tendency
  – Mean, median, mode
• Measures of spread
  – Variance, standard deviation, standard error
• Measures of skew

Variance and standard deviation
• The variance ($\sigma^2$) of a distribution is the mean of the squares of the deviations of each observation from the mean

  $$\sigma^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

• The standard deviation ($\sigma$) of a distribution is the square root of the variance

  $$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Step 1 – look at the data
• Examine the distribution
  – histogram
  – scatter plot
• Identify outliers

Step 2 – Describe the data
• What is the most effective way to present the information contained in the data?
  – Table, histogram, line graph, box plot, etc.
• Median vs. mean
  – Income and age at marriage

Skewed distributions

Bimodal distributions

Box plot: a convenient way to describe the central tendency and spread of a distribution

Descriptive tables
Sandberg, J. 2005. "The Influence of Network Mortality Experience on Nonnumeric Response Concerning Expected Family Size: Evidence From a Nepalese Mountain Village." Demography 42:737-756.

Recoding
[Figure: histogram of AGE OF RESPONDENT in single years (20 to 80); vertical axis: Percent]
Easier to interpret?
[Figure: bar chart of AGE OF RESPONDENT recoded into the groups 18-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89; vertical axis: Percent]

Too much formatting
[Figure: bar chart of Percent by AGE OF RESPONDENT group, 18-29 to 80-89]

Way too much formatting
[Figure: bar chart of Percent by AGE OF RESPONDENT group, 18-29 to 80-89]
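
The measures listed under "Univariate data analysis", the variance and standard deviation formulas, and the age recoding shown in the "Recoding" slide can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce the slides: the `ages` values are invented, the helper `recode_age` is a hypothetical name, and the standard error is assumed to mean the standard error of the mean (standard deviation divided by the square root of n).

```python
import math
import statistics
from collections import Counter

# Hypothetical ages of respondents (illustrative values only, not from the slides).
ages = [18, 22, 25, 25, 31, 38, 44, 47, 52, 63, 71, 84]
n = len(ages)

# Measures of central tendency.
mean = statistics.mean(ages)
median = statistics.median(ages)
mode = statistics.mode(ages)

# Measures of spread, written out with the n - 1 denominator.
variance = sum((x - mean) ** 2 for x in ages) / (n - 1)
std_dev = math.sqrt(variance)
std_err = std_dev / math.sqrt(n)  # assumed: standard error of the mean


def recode_age(age):
    """Map an age in years to a group label such as '30-39'.

    Assumes respondents are at least 18, so everything below 30
    falls in the first group, '18-29'.
    """
    if age < 30:
        return "18-29"
    lower = (age // 10) * 10
    return f"{lower}-{lower + 9}"


# Percent of respondents in each recoded age group.
counts = Counter(recode_age(a) for a in ages)
percent_by_group = {group: 100 * c / n for group, c in sorted(counts.items())}

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"variance={variance:.2f} sd={std_dev:.2f} se={std_err:.2f}")
for group, pct in percent_by_group.items():
    print(f"{group}: {pct:.1f}%")
```

The hand-written variance should agree with `statistics.variance` and `statistics.stdev`, which also divide by n - 1. The grouped percentages are the kind of values a bar chart of recoded age would display.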