Welcome to
Week 03 Thurs
MAT135 Statistics
http://media.dcnews.ro/image/201109/w670/statistics.jpg
Review
Descriptive Statistics
graphs
n
max
min
each observation
frequencies
“averages”
Descriptive Statistics
And…
Measures of variability!
Descriptive Statistics
Averages tell where the data
tends to pile up
Descriptive Statistics
Another good way to describe
data is how spread out it is
Variability
Measures of variability tell how
close to the “average” the
sample data tend to be
Variability
Just like measures of central
tendency, there are several
measures of variability
Variability
Range = max – min
Variance = [sum of (obs – x̄)²] / (n − 1)
s = √variance
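As a rough illustration of the three measures above, here is a small Python sketch. The five-value sample is made up for the example (it is not class data), and the variable names are arbitrary.

```python
import math
import statistics

# Hypothetical sample, invented for illustration only
sample = [2, 4, 4, 6, 9]

n = len(sample)
x_bar = sum(sample) / n                      # sample mean

data_range = max(sample) - min(sample)       # Range = max - min
variance = sum((obs - x_bar) ** 2 for obs in sample) / (n - 1)
s = math.sqrt(variance)                      # s = square root of the variance

print(data_range, variance, round(s, 3))

# Cross-check against the standard library's sample (n - 1) versions
assert math.isclose(variance, statistics.variance(sample))
assert math.isclose(s, statistics.stdev(sample))
```

Note the division by n − 1 rather than n: this is the sample version of the variance.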
Statistics vs Parameters
Statistic   Parameter
n           N
x̄           μ
s²          σ²
s           σ
Questions?
Descriptive Statistics
Other numbers and calculations
can be used to summarize our
data
Frequencies
Frequency –
the number in a category
Minutes Internet Usage   Number of Users
1-20                     9
21-40                    18
41-60                    15
61-80                    8
81-100                   1
101-120                  0
121+                     1
Frequencies
Cumulative frequency – the
number of observations that
fall in that category or a
previous category
This can only be done if the
categories can be ordered
Cumulative Frequencies
How many observations occur in
a given category and any
previous ordered categories:
Minutes Internet Usage   Number of Users   Cumulative Number of Users
1-20                     9                 9
21-40                    18                27
41-60                    15                42
61-80                    8                 50
81-100                   1                 51
101-120                  0                 51
121+                     1                 52
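The cumulative column in the table above is just a running total of the counts. A minimal Python sketch, using the counts from the table (the variable names are illustrative):

```python
from itertools import accumulate

# Category labels and counts copied from the table above
categories = ["1-20", "21-40", "41-60", "61-80", "81-100", "101-120", "121+"]
counts = [9, 18, 15, 8, 1, 0, 1]

# Running total: each entry adds the current count to all previous ones
cumulative = list(accumulate(counts))

for label, count, running in zip(categories, counts, cumulative):
    print(f"{label:>8} {count:>3} {running:>3}")
```

The running total only makes sense because the categories are listed in order.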
Cumulative Frequencies
The last value is always “n”, the
sample size
Cumulative Frequencies
The graph of a cumulative
frequency distribution is called
an “ogive”
Cumulative Frequencies
Data table:
A B A B A C B B
Cum Freq distribution:
A: 3
A or B: 3+4=7
A, B or C: 7+1=8
n = 8
Histogram:
Cumulative Frequencies
Note that the last category in
a cumulative frequency
ALWAYS has the value n
Cumulative Frequencies
Note also a cumulative
frequency cannot get smaller as
you move up the categories
It can stay the same (if the
category count is 0)
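The two notes above (the last value is n, and the values never get smaller) can be checked on the A/B/C data table. This is only a sketch; it assumes alphabetical order is the intended category order.

```python
from collections import Counter
from itertools import accumulate

# The eight observations from the A/B/C data table
data = "A B A B A C B B".split()
n = len(data)

counts = Counter(data)
ordered = sorted(counts)   # assumption: alphabetical order is the category order
cumulative = list(accumulate(counts[c] for c in ordered))

print(dict(zip(ordered, cumulative)))

# Property 1: a cumulative frequency never gets smaller
never_decreases = all(a <= b for a, b in zip(cumulative, cumulative[1:]))
# Property 2: the last cumulative frequency equals n
ends_at_n = cumulative[-1] == n
print(never_decreases, ends_at_n)
```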
Cumulative Frequencies
An ogive typically forms an “s”
shape
Questions?
Fractiles
Another way of describing
frequency data
A measure of position
Based on the ogive (cumulative
frequency) or ordered data
Fractiles
How to do it:
find n
order the data
divide the data into the #
of pieces you want, each with
an equal # of members
Fractiles
quartile - four pieces
percentile - 100 pieces
FRACTILES
IN-CLASS PROBLEM 6
17 88 33 30 11 41 46 62 5 78 31 54
Step 1: Find n!
n = 12
What’s next?
FRACTILES
IN-CLASS PROBLEM 7
5 11 17 30 31 33 41 46 54 62 78 88
Order the data!
What if you split it into
equal halves?
How many observations would
be in each half?
FRACTILES
IN-CLASS PROBLEM 8
5 11 17 30 31 33 41 46 54 62 78 88
Poof!
6 observations in each half!
This is the 50th percentile
or the “median”
FRACTILES
IN-CLASS PROBLEM 9,12
5 11 17 30 31 33 41 46 54 62 78 88
The 50th percentile
or the “median” = (33+41)/2 = 37
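The steps so far (find n, order the data, split into halves, average the two middle values) might look like this in Python. It is a sketch that assumes n is even, as it is for these 12 observations.

```python
import statistics

# The 12 observations from the in-class problem, in their original order
data = [17, 88, 33, 30, 11, 41, 46, 62, 5, 78, 31, 54]

ordered = sorted(data)          # order the data
n = len(ordered)                # find n
half = n // 2

lower_half, upper_half = ordered[:half], ordered[half:]

# n is even, so the median is the average of the two middle values
median = (ordered[half - 1] + ordered[half]) / 2
print(n, lower_half, upper_half, median)

# The standard library agrees with the hand calculation
assert median == statistics.median(data)
```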
FRACTILES
IN-CLASS PROBLEM 10
5 11 17 30 31 33 41 46 54 62 78 88
What if you wanted
quartiles?
How many observations would
be in each quartile?
Where would the splits be?
FRACTILES
IN-CLASS PROBLEM 11,13
5 11 17 30 31 33 41 46 54 62 78 88
Poof!
3 observations in each
quartile!
FRACTILES
IN-CLASS PROBLEM 11,13
5 11 17 30 31 33 41 46 54 62 78 88
1st quartile = (30+17)/2 = 23.5
3rd quartile = (62+54)/2 = 58
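The quartile splits above amount to taking the median of each half of the ordered data. A minimal sketch of that hand method follows; the function name is made up for this example, and it assumes at least two observations.

```python
import statistics

# The 12 observations, already ordered
data = [5, 11, 17, 30, 31, 33, 41, 46, 54, 62, 78, 88]

def quartiles_by_halves(ordered):
    """Return (Q1, median, Q3) using the median of each half.

    Assumes the list is sorted and has at least two values. For an odd
    number of values, the middle value is left out of both halves.
    """
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

q1, q2, q3 = quartiles_by_halves(data)
print(q1, q2, q3)
```

Software packages often use other quartile conventions (for example, interpolating between values), so their answers may differ slightly from the hand method.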
Fractiles
Quartiles and percentiles are
common, others not so much
The median is also common, but
it is called “the median” rather
than “the 50th percentile” or
“2nd quartile”
Questions?
Variability
Another measure of variability:
Variability
Interquartile range (IQR):
IQR = 3rd quartile – 1st quartile
Variability
The interquartile range is in the
same units as the original data
(like the range and standard
deviation “s”)
FRACTILES
IN-CLASS PROBLEM 14
5 11 17 30 31 33 41 46 54 62 78 88
What is the IQR for our
data?
So the IQR is…
FRACTILES
IN-CLASS PROBLEM 14
5 11 17 30 31 33 41 46 54 62 78 88
1st quartile = (30+17)/2 = 23.5
3rd quartile = (62+54)/2 = 58
IQR = 58 - 23.5 = 34.5
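To double-check the IQR arithmetic, here is a short sketch that recomputes both quartiles as the medians of the lower and upper halves and then subtracts. The range is included for comparison; the variable names are illustrative.

```python
import statistics

# The 12 ordered observations from the in-class problem
data = [5, 11, 17, 30, 31, 33, 41, 46, 54, 62, 78, 88]

half = len(data) // 2
q1 = statistics.median(data[:half])    # median of the lower half
q3 = statistics.median(data[-half:])   # median of the upper half

iqr = q3 - q1                          # IQR = 3rd quartile - 1st quartile
data_range = max(data) - min(data)     # Range = max - min

print(q1, q3, iqr, data_range)
```

Both results are in the same units as the original data.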
Questions?
Continuous Distributions
You can think of smooth
quantitative data graphs as a
series of skinnier and skinnier
bars
Continuous Distributions
When the width of the bars
reaches “zero” the graph is
perfectly smooth
Continuous Distributions
SO, a smooth quantitative
(continuous) graph can be
thought of as a bar chart
where the bars have width zero
Normal Distribution
The most popular continuous
graph in statistics is the
NORMAL DISTRIBUTION
Normal Distribution
Two descriptive statistics
completely define the shape of
a normal distribution:
Mean µ
Standard deviation σ
Normal Distribution
Suppose we have a normal
distribution, µ = 12 σ = 2
Normal Distribution
If µ = 12
12
Normal Distribution
If µ = 12 σ = 2
6 8 10 12 14 16 18
Normal Distribution
PROJECT QUESTION
Suppose we have a normal
distribution, µ = 10
?
Normal Distribution
PROJECT QUESTION
Suppose we have a normal
distribution, µ = 10 σ = 5
? ? ? 10 ? ? ?
Normal Distribution
PROJECT QUESTION
Suppose we have a normal
distribution, µ = 10 σ = 5
-5 0 5 10 15 20 25
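The axis marks in these graphs sit one standard deviation apart, from three below the mean to three above it. A small sketch of that pattern (the function name and the k_max parameter are made up for this example):

```python
def axis_marks(mu, sigma, k_max=3):
    """Values from mu - k_max*sigma to mu + k_max*sigma, one per standard deviation."""
    return [mu + k * sigma for k in range(-k_max, k_max + 1)]

# The two distributions from the slides above
print(axis_marks(12, 2))
print(axis_marks(10, 5))
```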
Normal Distribution
We can change any normally
distributed variable into a
standard normal
One with:
mean = 0
standard deviation = 1
Normal Distribution
PROJECT QUESTION
For the standard normal
distribution, µ = 0 σ = 1
? ? ? ? ? ? ?
Normal Distribution
For the standard normal
distribution, µ = 0 σ = 1
-3 -2 -1 0 1 2 3
Normal Distribution
The standard normal is also
called “z”
Normal Distribution
To calculate a “z-score”:
Take your value x
Subtract the mean µ
Divide by the standard
deviation σ
Normal Distribution
z = (x - µ)/σ
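The formula above translates directly into a one-line function. This sketch uses the earlier example distribution with µ = 12 and σ = 2; the function name is just a label chosen for this example.

```python
def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Using the earlier example distribution with mu = 12 and sigma = 2
for x in (8, 12, 16):
    print(x, z_score(x, 12, 2))
```

A negative z means the value is below the mean; a positive z means it is above.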
Normal Distribution
IN-CLASS PROBLEMS
Suppose we have a normal
distribution, µ = 10 σ = 2
z = (x - µ)/σ = (x-10)/2
Calculate the z values for
x = 9, 10, 15
Normal Distribution
IN-CLASS PROBLEMS
z = (x - µ)/σ = (x-10)/2
x = 9: z = (9-10)/2 = -1/2
x = 10: z = (10-10)/2 = 0
x = 15: z = (15-10)/2 = 5/2
On the graph:
-3 -2 -1 0 1 2 3
(-1/2 falls between -1 and 0; 5/2 falls between 2 and 3)
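To check the in-class answers exactly, the same z-score formula can be applied with Python's Fraction type, which keeps results such as -1/2 and 5/2 as fractions instead of decimals. This is only a sketch of one way to verify the arithmetic.

```python
from fractions import Fraction

mu, sigma = 10, 2   # the distribution from the in-class problems

# z = (x - mu) / sigma, kept as an exact fraction
z_values = {x: Fraction(x - mu, sigma) for x in (9, 10, 15)}

for x, z in z_values.items():
    print(f"x = {x:>2}  z = {z}  ({float(z):+.1f})")
```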
Empirical Rule
Questions?
You survived!
Turn in your classwork!
Don’t forget
your homework
due next week!
Have a great
rest of the week!