Stat 350 Lab Session
GSI: Yizao Wang
Section 016: Mon 2:30-4:00 pm, MH 444-D
Section 043: Wed 2:30-4:00 pm, MH 444-B

Outline
• Introduction
• Syllabus
• A brief review
• Module 1: Activity 1, 2
• Module 2: Activity 2

Something about me
• My name: Yizao Wang
• My brief CV:
  Originally from Beijing
  Studied in Paris for the last three years
  Now a first-year graduate student in the Department of Statistics
• I play Go when I have time… (do you know where the Umich Go club is?)

Introduce yourself
• What is your name?
• Where are you from?
• What is your major?
• Which year are you in?

Syllabus
Any questions?

What is statistics…
Data | Analysis | Inference/conclusion

Let’s start with data
When we are collecting (sampling) data…
• How many types of variables are there? 2
• What are they? Categorical variables and quantitative/numerical variables

Categorical variables
• Raw data: group names (categories) that do not necessarily have a logical order
• Example: gender, eye color
• Graphical summary: bar graph, pie chart
• Numerical summary: frequency table

Quantitative variables
• Raw data: numerical values taken on each individual
• Example: height, test score
• Graphical summary: histogram, boxplot
• Numerical summary: five-number summary (median, quartiles and extremes)

Some big ideas
• Different types of data lead to different statistical methods, numerical summaries and plots.
• Histograms: show the (shape of the) distribution of a quantitative response
• Boxplots: a picture of the five-number summary; most useful for comparing 2+ sets of data

Module 1: Activity 1
Visualizing and exploring a data set
Start up SPSS and open the employee data set.
• What type of variable is gender? Categorical
• What type of graphs would be good to make for this variable? Bar graphs
• What type is current salary? Quantitative
• What type of graphs for it? Histogram

Module 1: Activity 1
Visualizing and exploring a data set
Let’s make a histogram of current salary.
• Don’t forget the title!
• What shape do we see for the distribution of salary? Skewed to the right
• Change the color

Module 1: Activity 1
Visualizing and exploring a data set
Basic summary measures for current salary
• Get the five-number summary
• Save output

Module 1: Activity 2
The Mean and the Median
• Open the applet: http://www.ruf.rice.edu/%7Elane/stat_sim/descriptive/index.html
• Produce a positive skew and a negative skew, and compare the relationship between the mean and the median
• Try different shapes of distribution and compare their standard deviations. Comment?
• Toy question: with N=10, give the distributions with the largest and the smallest standard deviation

Module 1: Activity 2
The Mean and the Median
• In a symmetric distribution, the mean and the median are equal.
• With positively skewed distributions, the mean is generally larger than the median.
• With negatively skewed distributions, the mean is generally smaller than the median.
• In a skewed distribution, which is a good measure of the center of the distribution?
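The mean/median comparison above can also be tried outside the applet. The following is a minimal Python sketch, not part of the lab materials. It assumes three made-up samples of N=10 (they are not from the employee data set or the applet) and, for the toy question, two made-up samples on an assumed 0 to 10 range. The helper name `center_and_spread` is hypothetical.

```python
import statistics

# Made-up toy samples with N=10 each (assumptions for illustration only).
right_skewed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 18]         # long tail to the right
left_skewed = [1, 10, 14, 15, 16, 16, 16, 17, 17, 18]  # long tail to the left
symmetric = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9]             # mirror image around 5


def center_and_spread(values):
    """Return the mean, median and sample standard deviation of a list."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
    }


for name, sample in [
    ("right skewed", right_skewed),
    ("left skewed", left_skewed),
    ("symmetric", symmetric),
]:
    summary = center_and_spread(sample)
    print(name, summary)

# Toy question, assuming values are restricted to the range 0..10:
# no spread at all versus all values pushed to the two ends of the range.
constant = [5] * 10
two_ends = [0] * 5 + [10] * 5
print("sd of constant sample:", statistics.stdev(constant))
print("sd of two-ends sample:", statistics.stdev(two_ends))
```

In these samples the mean sits above the median under right skew, below it under left skew, and on top of it in the symmetric case. The single large (or small) value pulls the mean but leaves the median almost unchanged.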
Module 1: Activity 2
The Mean and the Median
• In a skewed distribution, which is a good measure of the center of the distribution? Median

Module 1: Activity 2
The Mean and the Median
Standard deviation:
• On average, salaries are expected to fall approximately $___ from the mean salary of $___.
• On average, salaries vary by about $___ from the mean salary of $___.

Module 2: Activity 1
How do genders compare on SSHA scores?
Background: the Survey of Study Habits and Attitudes (SSHA) of college freshmen. It is known that scores on the SSHA may explain success in college. Data for both females and males were collected.
Use side-by-side boxplots to examine (compare) the distribution of the scores by gender.

Module 2: Activity 1
How do genders compare on SSHA scores?
• Produce a side-by-side boxplot
• Add a title
• Which gender had the lowest score? Male
• Which had the highest score? Female
• Which gender had the lowest median score? Male
• How to compare the variability? IQRs
• Can you tell the shape from a boxplot? No!

Module 2: Activity 1
How do genders compare on SSHA scores?
• Split file and make histograms (organize output by groups)
• (Get descriptive summaries using the Frequencies option)

Review of lab 1
• What does statistics do?
• Categorical variables and numerical variables
• Using plots to visualize data
• Histogram to see the distribution
• Standard deviation and shape of distribution
• Boxplot with five-number summary
• Are you able to do HW1 with SPSS?

Before we finish today…
• Comments on today’s lab?
• Qwizdom system
• Survey to complete
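As a follow-up to the side-by-side boxplot activity in Module 2, the numbers behind a boxplot can be sketched in Python. This is a minimal sketch, not part of the lab materials. The two groups of scores are made up (they are not the SSHA data), and the helper names `five_number_summary` and `iqr` are hypothetical. Quartile conventions differ between software packages, so the quartiles SPSS reports may differ slightly from the ones computed here.

```python
import statistics


def five_number_summary(values):
    """Return min, Q1, median, Q3 and max for a list of numbers."""
    # method="inclusive" treats the data as the whole population of interest;
    # other quartile conventions can give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def iqr(values):
    """Interquartile range: the width of the box in a boxplot."""
    summary = five_number_summary(values)
    return summary["q3"] - summary["q1"]


# Made-up scores for two groups (assumptions for illustration only).
group_a = [10, 12, 14, 15, 16, 18, 20, 22, 30]
group_b = [8, 9, 11, 12, 13, 14, 15, 16, 17]

for name, scores in [("group A", group_a), ("group B", group_b)]:
    print(name, five_number_summary(scores), "IQR:", iqr(scores))
```

Comparing the two IQRs is the numerical counterpart of comparing the box widths in a side-by-side boxplot. The summary does not reveal the full shape of each distribution, which is why the lab also asks for histograms by group.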