STATS 250 Lab 2
Julie Ghekas
September 15, 2014
Schedule
• Recap from first lab and prelab
• Warm Up
• Lab 2
• Cool Down
• iClicker Questions
Lab Workbook for Fall 2014
• open.umich.edu/
– Search Stats 250
– Select Materials Tab for the course
– Scroll down to have access to handouts, labs, lecture notes, etc.
• open.umich.edu/education/lsa/statistics250/fall2015/materials#Labs
• Or order from Amazon (~$10)
• Recommend taking notes in updated personal workbook
Prelab Results
Right skewed: Mean > Median
Left skewed: Median > Mean
Symmetric: Mean = Median
Remember, the mean is more sensitive to outliers than the median.
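To make the point about outliers concrete, here is a minimal Python sketch. The study-hour values are made up for illustration and do not come from the prelab data.

```python
from statistics import mean, median

# Hypothetical weekly study hours; the values are made up for illustration.
hours = [8, 10, 12, 14, 16]
with_outlier = hours + [60]  # add one unusually large value

# Symmetric data: the mean and the median agree.
print(mean(hours), median(hours))

# One large value pulls the mean up much more than the median,
# which is the pattern expected for right-skewed data.
print(mean(with_outlier), median(with_outlier))
```

Adding the single large value moves the mean from 12 to 20, while the median only moves from 12 to 13.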
Recap for Homework
• Practice homework is graded to the same standard as regular homework
• Answer the question fully
• Put your name on generated graphs
• Show all work
• Include units
Boxplots
• Graph of 5-number summary
• Outliers denoted with ° and *
• Can be side-by-side
• Does not show the full shape of the distribution (for example, how many peaks it has)
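The ingredients of a boxplot can be sketched in a few lines of Python. Two assumptions are made here that the slides do not state: the quartiles use the default method of `statistics.quantiles` (statistical packages differ slightly in how they compute quartiles), and outliers are flagged with the common 1.5 × IQR rule. The data are made up for illustration.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, q2, q3 = quantiles(values, n=4)  # default "exclusive" quartile method
    return min(values), q1, q2, q3, max(values)

# Hypothetical data, made up for illustration.
data = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 60]
low, q1, med, q3, high = five_number_summary(data)

# Assumed convention: flag points more than 1.5 * IQR beyond the quartiles.
iqr = q3 - q1
outliers = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print((low, q1, med, q3, high))
print(outliers)
```

With these values only the observation 60 falls outside the fences, so it is the only point a boxplot would draw separately.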
Bar charts
• Displays categorical variables
• Y-axis represents counts, proportions, or percentages
• Can rearrange bars in any order
Time Series/Sequence Plots
• Examining data over time
• Checks assumption that observations are from an identically distributed population
• Be careful; time series can be displayed in different formats
Time Series
Source:
http://blogs.bgsu.edu/statgraphicsmepaler/2013/03/22/new-havenstemperature-in-4-different-timeseries-plots/
Source:
http://www.statcan.gc.ca/edu/powerpouvoir/ch9/bargraphdiagrammeabarres/5214818-eng.htm
Time Series: Trends
• A trend is a consistent, long-term rise or fall.
Time Series: Variation
• Generally, variation is used to describe patterns in the data.
Seasonal Variation
Increasing Variation
Time Series: Stability
• If there are no patterns in the time plot, then the series is considered stable.
• Stability helps us confirm or reject the identically distributed part of iid/random sample. For data to be considered stable, both the mean and the variance of the observations need to be constant over time.
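As a rough numerical companion to reading a time plot, the sketch below compares the mean and variance of the first and second halves of a series. This is only an illustration under assumed, made-up data; the lab itself judges stability visually, and the helper name `halves_summary` is hypothetical.

```python
from statistics import mean, pvariance

def halves_summary(series):
    """Return (mean, variance) for the first and second half of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))

# Hypothetical series, made up for illustration.
stable = [5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6, 4]
trending = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

# Similar mean and variance in both halves: consistent with stability.
print(halves_summary(stable))

# The mean shifts upward between halves: a trend, so not stable.
print(halves_summary(trending))
```

A series can fail this check through the mean (a trend) or through the variance (increasing variation); either one is evidence against the identically distributed assumption.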
Q-Q Plots
• Checks assumption that observations are from a normally distributed parent population
• Q stands for quantiles (percentiles): the graph compares quantiles from the standard normal distribution with quantiles from our sample
• Want the points to fall along a straight line
• Better than a histogram for judging normality
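The quantile comparison described above can be sketched in Python. This is not the course's `qqplot()` script; it is a minimal illustration that assumes the plotting positions (i − 0.5)/n, which is one of several conventions in use, and a made-up sample.

```python
from statistics import NormalDist

def qq_points(sample):
    """Pair each sorted observation with a standard normal quantile."""
    n = len(sample)
    ordered = sorted(sample)
    # Assumed plotting positions (i - 0.5) / n; software packages differ here.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical sample, made up for illustration.
sample = [48, 52, 50, 47, 53, 49, 51, 55, 45, 50]

# Each row is one point on the Q-Q plot: normal quantile, sample quantile.
for z, x in qq_points(sample):
    print(f"{z:6.2f}  {x}")
```

If the printed pairs rise roughly in step with each other, the plotted points would fall near a straight line, which is what supports the normality assumption.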
Q-Q Plot of Data from an Approximately Normal Distribution
Q-Q Plots that do NOT allow us to assume a population with Normal Distribution
R scripts
• Canvas homepage -> R tutorials
• Open timeseries.rdata or qqplot.rdata
– Canvas homepage -> R tutorials -> Time Series/QQ plots
• Start script with timeseries() or qqplot()
– Without the underscore printed in the lab workbook
Warm Up
Lab
• With a partner or two, work on the Lab
• You will not get credit if you work alone
• Work with employee data.sav
• If you finish early, complete the Cool Down, R practice, Example Exam Question, or Practice HW problem 3
Statistics: Current Salary
N Valid           474
N Missing
Mean              $34,419.57
Median            $28,875.00
Std. Deviation    $17,075.661
Range             $119,250
Minimum           $15,750
Maximum           $135,000
Percentile 25     $24,000.00
Percentile 50     $28,875.00
Percentile 75     $37,162.50
IQR = $13,162.50
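The Range and IQR reported for Current Salary follow directly from the other summary values, as this short Python check shows. The numbers are copied from the table; the variable names are only for illustration.

```python
# Values copied from the Current Salary summary (dollars).
mean_salary, median_salary = 34419.57, 28875.00
minimum, maximum = 15750, 135000
q1, q3 = 24000.00, 37162.50

salary_range = maximum - minimum   # should match the reported Range
iqr = q3 - q1                      # should match the reported IQR

# A mean above the median is consistent with a right-skewed distribution.
right_skew_hint = mean_salary > median_salary

print(salary_range, iqr, right_skew_hint)
```

The mean exceeds the median by more than $5,000, which fits the earlier rule of thumb that right-skewed data have Mean > Median.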
Statistics: Current Salary, by Minority Classification
                  Minority = Yes    Minority = No
N Valid           104               370
N Missing         0                 0
Mean              $28,713.94        $36,023.31
Median            $26,625.00        $29,925.00
Std. Deviation    $11,421.638       $18,044.096
Range             $83,650           $119,250
Minimum           $16,350           $15,750
Maximum           $100,000          $135,000
Percentile 25     $23,587.50        $24,150.00
Percentile 50     $26,625.00        $29,925.00
Percentile 75     $30,712.50        $40,350.00
IQR               $7,125.00         $16,200.00
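One way to see how the two group summaries relate to the overall summary is to combine the group means, weighting each by its group size. The sketch below uses only the counts and means reported for the two Minority Classification groups.

```python
# (count, mean salary) for each group, copied from the group summaries.
groups = {
    "Minority = Yes": (104, 28713.94),
    "Minority = No": (370, 36023.31),
}

total_n = sum(n for n, _ in groups.values())

# Weighted average of the group means, weighted by group size.
pooled_mean = sum(n * m for n, m in groups.values()) / total_n

print(total_n, round(pooled_mean, 2))
```

The combined count and weighted mean agree with the overall Current Salary summary (474 employees, mean $34,419.57). Medians and percentiles cannot be combined this way; they have to be recomputed from the raw data.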
Cool Down
• Everyone turns in own ticket
• Work on Cool Down in groups
iClicker
Survey: Students were asked how many hours they study in a typical
week. A five-number summary of the responses is: 2, 10, 14, 20, 60
Fill in the blank: About 75% of the students spent at least ___ hours
studying in a typical week.
A. 10
B. 14
C. 20
D. 45
iClicker
Survey: Students were asked how many hours they study in a typical
week. A five-number summary of the responses is: 2, 10, 14, 20, 60
What percent of students reported studying between 10 and 20 hours
in a typical week?
A. 68%
B. 50%
C. 25%
D. 75%
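Both survey questions rest on the same fact: the quartiles in a five-number summary each split off roughly a quarter of the responses. A minimal Python sketch of that reasoning, using the summary given in the question:

```python
# Five-number summary from the survey question (hours per week).
minimum, q1, median, q3, maximum = 2, 10, 14, 20, 60

# Approximate share of responses at or above each summary value.
share_at_least = {minimum: 1.00, q1: 0.75, median: 0.50, q3: 0.25}

# Share between the first and third quartiles (the middle of the data).
share_between_q1_and_q3 = share_at_least[q1] - share_at_least[q3]

print(share_at_least[q1])        # about 75% studied at least 10 hours
print(share_between_q1_and_q3)   # about 50% studied between 10 and 20 hours
```

These shares are approximate because ties and small samples can shift the exact percentages.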
iClicker
Which of the following provides the most
information about the shape of a data set?
A. Boxplot
B. Pie chart
C. Five number summary
D. Histogram
iClicker
Here is a graph showing revenue for a company.
What kind of graph is this?
[Graph: Actual Revenue for Eastman Kodak. Y-axis: Revenues ($billions), 0 to 25. X-axis: years 1980 to 1998.]
• A. Bar chart
• B. Histogram
• C. Time plot
• D. Box plot
iClicker
Cost of a gallon of gas
Can we describe this graph as…
A. Right Skewed B. Left Skewed C. Neither
iClicker
In a boxplot, what does a dot represent?
A. Quartile
B. Mean
C. Median
D. Mode
E. Outlier
iClicker
Which time plot is the most stable?
A.
B.
C.
iClicker
How do you feel about the material covered today?
A. Completely understood everything
B. Understood main ideas, shaky on details
C. Good for the first half, lost for the second
D. Had trouble with some main ideas
E. Difficulty following most materials
Reminders
• Good job setting up LectureBook
• Practice Homework due Thursday 8 am
• Pre-lab 3 due Monday 8 am
• Office Hours
• Food Allergies?