Welcome to
Week 04 Tues
MAT135 Statistics
http://media.dcnews.ro/image/201109/w670/statistics.jpg
Review
Descriptive Statistics
Descriptive statistics –
describe our sample – we’ll use
this to make inferences about
the population
Descriptive Statistics
graphs
n
max
min
each observation
frequencies
mean, median, mode
range, variance, standard
deviation, quartiles, IQR
Statistics vs Parameters
Statistic    Parameter
n            N
x̄            μ
s²           σ²
s            σ
Questions?
Exploring Data
We are using the descriptive
statistics to summarize our
sample (and, hopefully, our
population) in just a few
numbers
Exploring Data
The “five-number summary” is:
the min
Q1
the median
Q3
the max
Exploring Data
We know how to get all of
these using our calculators!
Boxplots
There is a graph statisticians
use to show this summary:
the box plot
Boxplots
The boxplot (a.k.a. box and
whisker diagram) is a
standardized way of displaying
the distribution of data based
on the five number summary:
minimum, first quartile, median,
third quartile, and maximum
Boxplots
BOXPLOTS
IN-CLASS PROBLEM
Daily high temperatures Feb 2008
for Fairbanks, Alaska:
14, 12, 17, 25, 10, -1, -8, -15,
-7, 0, 5, 14, 18, 14, 16, 8,
-15, -13, -17, -12, 0, 1, 9, 12,
14, 7, 6, 8
Create a Boxplot
BOXPLOTS
IN-CLASS PROBLEM 1
What do we need for a Boxplot?
BOXPLOTS
IN-CLASS PROBLEM 2
Find the 5-number summary
Min = -17
Q1 = -4
Median = 7.5
Q3 = 14
Max = 25
Notice they’re all in order at the
bottom of your list! YAY!
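For anyone who wants to check the calculator's answer, here is a minimal sketch of the same summary in Python. It assumes the quartiles are the medians of the lower and upper halves of the sorted data, a convention that reproduces the values above; other software may use a different quartile rule and give slightly different Q1 and Q3. The function names are illustrative.

```python
def median(sorted_values):
    """Median of an already-sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving out the middle value when the count is odd.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]     # smallest half of the data
    upper = data[-half:]    # largest half of the data
    return data[0], median(lower), median(data), median(upper), data[-1]


# Daily high temperatures, Feb 2008, Fairbanks, Alaska (from the slide).
temps = [14, 12, 17, 25, 10, -1, -8, -15,
         -7, 0, 5, 14, 18, 14, 16, 8,
         -15, -13, -17, -12, 0, 1, 9, 12,
         14, 7, 6, 8]

print(five_number_summary(temps))  # (-17, -4.0, 7.5, 14.0, 25)
```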
BOXPLOTS
IN-CLASS PROBLEM 3
Min = -17
Q1 = -4
Median = 7.5
Q3 = 14
Max = 25
Now for the box!
[Figure: number line from -24 to 24 in steps of 4; the boxplot is drawn above it one piece at a time]
Min!
Q1!
Median!
Q3!
Max!
Box!
Whiskers!
Questions?
Outliers
Because the min and max may
be outliers, a variation on the
boxplot includes “fences” to
show where most of the data
occurs
Outliers
Lower fence:
Q1 - 1.5 * IQR
Upper fence:
Q3 + 1.5 * IQR
OUTLIERS
IN-CLASS PROBLEM 4
Min = -17
Q1 = -4
Median = 7.5
Q3 = 14
Max = 25
[Figure: the boxplot above a number line from -24 to 24 in steps of 4]
What is the IQR?
IQR=14-(-4)=18
OUTLIERS
IN-CLASS PROBLEM 5
What is the lower fence?
Lower fence = Q1-1.5*IQR
= -4-1.5(18)
= -31
OUTLIERS
IN-CLASS PROBLEM 6
What is the upper fence?
Upper fence = Q3+1.5*IQR
= 14+1.5(18)
= 41
OUTLIERS
IN-CLASS PROBLEM 7
So, do we have any outliers?
Max and Min are
inside the fence!
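The fence arithmetic can be sketched in a few lines of Python. The function names `fences` and `outliers` are illustrative, and the values -40 and 50 in the last line are made up just to show what a flagged point looks like.

```python
def fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) using the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outliers(values, q1, q3):
    """Return the values that fall outside the fences."""
    lower, upper = fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Q1 = -4 and Q3 = 14 from the Fairbanks temperatures.
print(fences(-4, 14))               # (-31.0, 41.0)

# The min and max are inside the fences, so nothing is flagged.
print(outliers([-17, 25], -4, 14))  # []

# Made-up values beyond the fences would be flagged.
print(outliers([-40, 0, 50], -4, 14))  # [-40, 50]
```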
Outliers
How outliers are shown in a
boxplot
Types of Boxplots
Questions?
Boxplots
Boxplots are typically used to
compare different groups
Boxplots
Data Summary Table from a
Ball-bouncing Experiment
          Super Ball   Wiffle Ball   Golf Ball   Splash Ball   Spongy Ball
Minimum   66           38            70          7             44
Q1        71           45            75          14            58
Median    76           48            78          16.5          60
Q3        78           50            80          23            62
Maximum   91           58            90          28            67
Boxplots
Boxplots
BOXPLOTS
IN-CLASS PROBLEM 8
What differences?
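One way to start on the comparison is to line up the center and spread of each ball. The sketch below reads the five columns of the summary table as Super, Wiffle, Golf, Splash, and Spongy, in that order; the helper name `spread` is illustrative.

```python
# Five-number summaries (min, Q1, median, Q3, max) from the summary table.
summaries = {
    "Super":  (66, 71, 76, 78, 91),
    "Wiffle": (38, 45, 48, 50, 58),
    "Golf":   (70, 75, 78, 80, 90),
    "Splash": (7, 14, 16.5, 23, 28),
    "Spongy": (44, 58, 60, 62, 67),
}


def spread(summary):
    """Return the median, IQR, and range for one five-number summary."""
    minimum, q1, med, q3, maximum = summary
    return {"median": med, "iqr": q3 - q1, "range": maximum - minimum}


for ball, summary in summaries.items():
    print(ball, spread(summary))
```

The medians show which balls bounce higher on a typical trial, while the IQR and range show how consistent each ball is.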
Boxplots
Unfortunately it is almost
impossible to get a true boxplot
using Excel
(there are several YouTube
videos showing how to get one…
but they are all wrong…)
Questions?
Exploring Data
There actually IS a useful
graph you can get out of Excel
that includes both an average
and a measure of dispersion
Exploring Data
I use the Hi/Low/Close graph
Exploring Data
BOXPLOTS
IN-CLASS PROBLEM 9
What does this graph show?
BOXPLOTS
IN-CLASS PROBLEM 10
What does this graph show?
Questions?
Normal Probability
The most popular continuous
graph in statistics is the
NORMAL DISTRIBUTION
Empirical Rule
Two descriptive statistics
completely define the shape of
a normal distribution:
Mean µ
Standard deviation σ
Empirical Rule
Suppose we have a normal
distribution, µ = 12 σ = 2
Empirical Rule
If µ = 12
[Figure: normal curve centered at 12]
Empirical Rule
If µ = 12 σ = 2
[Figure: normal curve with the axis marked at 6, 8, 10, 12, 14, 16, 18]
Empirical Rule
More sneaky stuff about the
normal distribution:
Empirical Rule
So now you can calculate even
more percentages!
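A minimal sketch of those percentage calculations follows. It assumes the usual rounded empirical-rule figures (about 68%, 95%, and 99.7% of a normal distribution within 1, 2, and 3 standard deviations of the mean); these are the standard values, not a quote from the slides. The function names are illustrative.

```python
# Usual rounded empirical-rule figures (an assumption, not quoted from the
# slides): percent of a normal distribution within 0, 1, 2, 3 SDs of the mean.
WITHIN = {0: 0.0, 1: 68.0, 2: 95.0, 3: 99.7}


def band_percent(k):
    """Percent between k-1 and k SDs from the mean, on ONE side of the mean."""
    return round((WITHIN[k] - WITHIN[k - 1]) / 2, 2)


def tail_percent():
    """Percent beyond 3 SDs, on ONE side of the mean."""
    return round((100.0 - WITHIN[3]) / 2, 2)


def sd_marks(mu, sigma):
    """Axis marks at mu - 3*sigma, ..., mu, ..., mu + 3*sigma."""
    return [mu + k * sigma for k in range(-3, 4)]


for k in (1, 2, 3):
    print(f"{k - 1} to {k} SD: {band_percent(k)}%")
print(f"beyond 3 SD: {tail_percent()}%")
print(sd_marks(12, 2))  # [6, 8, 10, 12, 14, 16, 18]
```

Because the normal curve is symmetric, each band below the mean has the same percentage as the matching band above it.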
EMPIRICAL RULE
IN-CLASS PROBLEM 11
What % of the data is between
the mean and +1 SD?
EMPIRICAL RULE
IN-CLASS PROBLEM 12
What % is between the mean
and -1 SD?
EMPIRICAL RULE
IN-CLASS PROBLEM 13
What % of the data is between
+1 SD and +2 SD?
EMPIRICAL RULE
IN-CLASS PROBLEM 14
What % is between -1 SD and
-2 SD?
EMPIRICAL RULE
IN-CLASS PROBLEM 15
What % of the data is between
+2 SD and +3 SD?
EMPIRICAL RULE
IN-CLASS PROBLEM 16
What % is between -2 SD and
-3 SD?
EMPIRICAL RULE
IN-CLASS PROBLEM 17
What % of the data is above
+3 SD?
EMPIRICAL RULE
IN-CLASS PROBLEM 18
What % of the data is below
-3 SD?
Questions?
z-scores
For the standard normal
distribution, µ = 0 σ = 1
[Figure: standard normal curve with the axis marked at -3, -2, -1, 0, 1, 2, 3]
z-scores
The standard normal is also
called “z”
z-scores
z = (x - µ)/σ
EMPIRICAL RULE
IN-CLASS PROBLEM 19
A dataset has a normal
distribution with μ = 45 and
σ = 13
Find the z-score for a value
of 65:
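The formula z = (x - µ)/σ translates directly into code. This is a minimal sketch; the function name `z_score` is illustrative.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


# Values from the problem: mu = 45, sigma = 13, x = 65.
z = z_score(65, mu=45, sigma=13)
print(round(z, 2))  # 1.54
```

A value of 65 is 20 above the mean, and 20/13 is a little more than one and a half standard deviations.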
z-scores
With a bit of algebra, we can
use z = (x - µ)/σ to solve for
x given a z-score
z-scores
z = (x - µ)/σ
x =
EMPIRICAL RULE
IN-CLASS PROBLEM 20
A data point from a normal
distribution with μ = 45 and
σ = 13 has a z-score = 2.3
What is the data value?
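Going the other way takes two algebra steps: multiply both sides of z = (x - µ)/σ by σ, then add µ, which gives x = µ + z·σ. A minimal sketch, with illustrative function names:

```python
def z_score(x, mu, sigma):
    """z = (x - mu) / sigma"""
    return (x - mu) / sigma


def value_from_z(z, mu, sigma):
    """Solve z = (x - mu) / sigma for x, giving x = mu + z * sigma."""
    return mu + z * sigma


# Values from the problem: mu = 45, sigma = 13, z = 2.3.
x = value_from_z(2.3, mu=45, sigma=13)
print(round(x, 1))  # 74.9

# Round trip: converting x back should return the original z-score.
print(round(z_score(x, mu=45, sigma=13), 2))  # 2.3
```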
In-class Project
Turn in your homework!
Don’t forget
your homework
due next class!
See you Thursday!