STATS 250 Lab 2
Julie Ghekas
[email protected]
September 15, 2014

Schedule
• Recap from first lab and prelab
• Warm Up
• Lab 2
• Cool Down
• iClicker Questions

Lab Workbook for Fall 2014
• open.umich.edu/
  – Search Stats 250
  – Select Materials Tab for the course
  – Scroll down to have access to handouts, labs, lecture notes, etc.
• open.umich.edu/education/lsa/statistics250/fall2015/materials#Labs
• Or order from Amazon (~$10)
• Recommend taking notes in updated personal workbook

Prelab Results
Right skewed: Mean > Median
Left skewed: Median > Mean
Symmetric: Mean = Median
Remember, the mean is more sensitive to outliers than the median.

Recap for Homework
• Practice homework is graded to the same standard as regular homework
• Answer the question fully
• Put your name on generated graphs
• Show all work
• Include units

Boxplots
• Graph of the 5-number summary
• Outliers denoted with ° and *
• Can be side-by-side
• Does not show shape of distribution

Bar charts
• Display categorical variables
• Y-axis represents counts, proportions, or percentages
• Can rearrange bars in any order

Time Series/Sequence Plots
• Examine data over time
• Check the assumption that observations are from an identically distributed population
• Be careful; time series can be displayed in different formats

Time Series
Source: http://blogs.bgsu.edu/statgraphicsmepaler/2013/03/22/new-havenstemperature-in-4-different-timeseries-plots/
Source: http://www.statcan.gc.ca/edu/powerpouvoir/ch9/bargraphdiagrammeabarres/5214818-eng.htm

Time Series: Trends
• A trend is a consistent, long-term rise or fall.

Time Series: Variation
• Generally, variation is used to describe patterns in the data.
[Figures: Seasonal Variation; Increasing Variation]

Time Series: Stability
• If there are no patterns in the time plot, then it is considered stable.
• Stability helps us confirm or reject the "identically distributed" part of the iid/random sample assumption.
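A rough numerical check can make the idea of stability concrete. The sketch below is an illustration in Python rather than one of the course's R scripts; the function name `halves_summary` and both example series are made up for this example and are not lab data. It splits a series in half and compares the mean and variance of each half. Large differences hint at a trend or changing variation, though the time plot itself remains the main tool.

```python
from statistics import mean, pvariance

def halves_summary(series):
    """Return (mean, variance) for the first half and for the second half."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))

# Hypothetical data: one series hovering around 10, one with an upward trend.
stable = [10, 11, 9, 10, 11, 9, 10, 11, 9, 10, 11, 9]
trending = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

for name, series in [("stable", stable), ("trending", trending)]:
    (m1, v1), (m2, v2) = halves_summary(series)
    print(f"{name}: first half mean={m1:.2f}, var={v1:.2f}; "
          f"second half mean={m2:.2f}, var={v2:.2f}")
```

For the hypothetical stable series the two halves have matching means and variances, while the trending series shows a clear shift in the mean between halves.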
In order for data to be considered stable, both the mean and the variance of the observations need to be constant over time.

Q-Q Plots
• Check the assumption that observations are from a normally distributed parent population
• Q stands for quantiles (percentiles): the graph compares quantiles from the standard normal distribution with quantiles from our sample
• Want a straight line
• Better than a histogram

Q-Q Plot of Data from an Approximately Normal Distribution

Q-Q Plots that do NOT allow us to assume a population with Normal Distribution

R scripts
• Canvas homepage -> R tutorials
• Open timeseries.rdata or qqplot.rdata
  – Canvas homepage -> R tutorials -> Time Series/QQ plots
• Start script with timeseries() or qqplot()
  – Without the underscore printed in the lab workbook

Warm Up

Lab
• With a partner or two, work on the Lab
• You will not get credit if you work alone
• Work with employee data.sav
• If you finish early, complete the Cool Down, R practice, Example Exam Question, or Practice HW problem 3

Statistics: Current Salary

                   All employees   Minority = Yes   Minority = No
N Valid            474             104              370
N Missing          0               0                0
Mean               $34,419.57      $28,713.94       $36,023.31
Median             $28,875.00      $26,625.00       $29,925.00
Std. Deviation     $17,075.661     $11,421.638      $18,044.096
Range              $119,250        $83,650          $119,250
Minimum            $15,750         $16,350          $15,750
Maximum            $135,000        $100,000         $135,000
25th percentile    $24,000.00      $23,587.50       $24,150.00
50th percentile    $28,875.00      $26,625.00       $29,925.00
75th percentile    $37,162.50      $30,712.50       $40,350.00
IQR                $13,162.50      $7,125.00        $16,200.00

Cool Down
• Everyone turns in own ticket
• Work on Cool Down in groups

iClicker
Survey: Students were asked how many hours they study in a typical week.
A five-number summary of the responses is: 2, 10, 14, 20, 60
Fill in the blank: About 75% of the students spent at least ___ hours studying in a typical week.
A. 10
B. 14
C. 20
D. 45

iClicker
Survey: Students were asked how many hours they study in a typical week.
A five-number summary of the responses is: 2, 10, 14, 20, 60
What percent of students reported studying between 10 and 20 hours in a typical week?
A. 68%
B. 50%
C. 25%
D. 75%

iClicker
Which of the following provides the most information about the shape of a data set?
A. Boxplot
B. Pie chart
C. Five number summary
D. Histogram

iClicker
Here is a graph showing revenue for a company. What kind of graph is this?
[Figure: Actual Revenue for Eastman Kodak; y-axis: Revenues ($billions), 0 to 25; x-axis: years 1980 to 1998]
A. Bar chart
B. Histogram
C. Time plot
D. Box plot

iClicker
Cost of a gallon of gas
Can we describe this graph as…
A. Right Skewed
B. Left Skewed
C. Neither

iClicker
In a boxplot, what does a dot represent?
A. Quartile
B. Mean
C. Median
D. Mode
E. Outlier

iClicker
Which time plot is the most stable?
[Figures: time plots A, B, and C]

iClicker
How do you feel about the material covered today?
A. Completely understood everything
B. Understood main ideas, shaky on details
C. Good for the first half, lost for the second
D. Had trouble with some main ideas
E. Difficulty following most materials

Reminders
• Good job setting up LectureBook
• Practice Homework due Thursday 8 am
• Pre-lab 3 due Monday 8 am
• Office Hours
• Food Allergies?
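As a follow-up to the iClicker questions on the five-number summary 2, 10, 14, 20, 60: the five values can be read as quartile cut points, with roughly 25% of the responses falling in each interval between consecutive values. The sketch below, written in Python rather than the course's R scripts, spells out that reasoning. The function names are hypothetical, and the 25%-per-interval reading is the usual approximation, not an exact count.

```python
# Five-number summary from the iClicker survey: min, Q1, median, Q3, max.
MINIMUM, Q1, MEDIAN, Q3, MAXIMUM = 2, 10, 14, 20, 60
CUTS = [MINIMUM, Q1, MEDIAN, Q3, MAXIMUM]

def percent_at_least(cut):
    """Approximate percent of responses at or above a summary value."""
    # Each interval between consecutive summary values holds about 25%.
    return 100 - 25 * CUTS.index(cut)

def percent_between(low, high):
    """Approximate percent of responses between two summary values."""
    return 25 * (CUTS.index(high) - CUTS.index(low))

iqr = Q3 - Q1  # spread of the middle half of the responses

print(percent_at_least(Q1))     # share studying at least 10 hours
print(percent_between(Q1, Q3))  # share studying between 10 and 20 hours
print(iqr)
```

Both helpers accept only the five summary values themselves, since a five-number summary says nothing about how responses are spread inside each interval.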
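The Q-Q plot idea from earlier in the lab, comparing quantiles of the standard normal distribution with quantiles of the sample and looking for a straight line, can also be sketched numerically. The Python example below is an illustration, not the course's `qqplot()` R script. The function names, the plotting positions (i - 0.5)/n, the use of a correlation as a "straightness" score, and both samples are assumptions made for this sketch.

```python
from statistics import NormalDist, mean

def qq_points(sample):
    """Pair standard normal quantiles with the sorted sample values."""
    n = len(sample)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(sample)))

def straightness(points):
    """Pearson correlation of the Q-Q points; values near 1 suggest a line."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical samples: one roughly symmetric, one strongly right skewed.
symmetric = [42, 46, 48, 49, 50, 50, 51, 52, 54, 58]
skewed = [1, 1, 2, 2, 3, 3, 4, 6, 12, 40]

r_symmetric = straightness(qq_points(symmetric))
r_skewed = straightness(qq_points(skewed))
print(f"symmetric sample: r = {r_symmetric:.3f}")
print(f"right-skewed sample: r = {r_skewed:.3f}")
```

The right-skewed sample bends away from a straight line, so its score is noticeably lower. A single number like this is only a rough companion to the plot, which still has to be judged by eye.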