SECTION 2.3 – HOW CAN WE DESCRIBE THE CENTER OF QUANTITATIVE DATA?
As guests leave Hershey Park, they are asked
how many rides they have ridden in that day’s
visit. This is an example of what type of
variable?
1.  Binary categorical
2.  Discrete categorical
3.  Binary quantitative
4.  Discrete quantitative
5.  Continuous quantitative
As guests leave Hershey Park, a random
sample of them are asked how many rides
they have ridden in that day’s visit. We will
use the data for some of today’s work:
2 2 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 6 7 7 8 8 10 12 15
Graphs vs. Numerical Summaries
—  Graphs help give a sense of the shape of
the distribution when using quantitative
data.
—  Numerical summaries for quantitative data
typically take two forms.
◦  Center
◦  Spread
The shape of the data
Mean (Average)
—  The mean is the sum of all of the observations
divided by the number of observations:
$\bar{x} = \frac{\sum x}{n}$
—  What is the mean number of rides ridden, in the
sample of Hershey Park visitors?
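As a quick check of the formula, the sketch below computes the mean of the 25 ride counts listed on the earlier slide. The variable names are illustrative and do not come from the slides.

```python
# Ride counts from the Hershey Park sample on the earlier slide.
rides = [2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,
         5, 5, 5, 5, 6, 7, 7, 8, 8, 10, 12, 15]

n = len(rides)          # number of observations
total = sum(rides)      # sum of all observations
mean_rides = total / n  # x-bar = (sum of x) / n

print(n, total, mean_rides)  # 25 136 5.44
```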
Visual interpretation of the mean
The “balance point”
Median
—  The median is the midpoint of the ordered
observations. Half of the observations are
above it and half are below it.
◦  For a data set with an odd number of observations, this is the middle number.
◦  For a data set with an even number of observations, we use the average of the
two middle numbers.
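The odd and even cases can be sketched as a small function. The helper name `median_by_hand` is hypothetical, and the two short lists are made-up inputs chosen only to exercise each case.

```python
def median_by_hand(values):
    """Return the median using the odd/even rule (illustrative helper)."""
    ordered = sorted(values)   # the rule applies to ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle number.
        return ordered[mid]
    # Even count: average of the two middle numbers.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_by_hand([3, 1, 2]))     # 2
print(median_by_hand([4, 1, 3, 2]))  # 2.5
```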
Example 1
—  In a small class a teacher notes that the
grades on a 10-point quiz for her 5 students
are: 10, 10, 7, 6, 4.
◦  What is the mean?
◦  What is the median?
Example 2
—  In a small class a teacher notes that the
grades on a 10-point quiz for her 6 students
are: 10, 10, 9, 7, 6, 4
◦  What is the mean?
◦  What is the median?
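One way to check answers to Examples 1 and 2 is with Python's standard `statistics` module. This is only a sketch for verification; the list names are illustrative.

```python
from statistics import mean, median

example_1 = [10, 10, 7, 6, 4]     # 5 students
example_2 = [10, 10, 9, 7, 6, 4]  # 6 students

mean_1, median_1 = mean(example_1), median(example_1)
mean_2, median_2 = mean(example_2), median(example_2)

print(mean_1, median_1)            # 7.4 7
print(round(mean_2, 2), median_2)  # 7.67 8.0
```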
In the previous example, if the last
score was a 1 instead of a 4, would the
mean change?
1.  Yes
2.  No
In the previous example, if the last
score was a 1 instead of a 4, would the
median change?
1.  Yes
2.  No
Mean vs. Median
—  The mean takes into account the value of every
observation. Thus the mean is sensitive to very
large and very small values.
—  The median only takes into account the middle of
the data. Thus values on either extreme do not
affect it.
—  We call extreme values outliers.
—  A numerical summary which is NOT sensitive to
outliers is said to be resistant.
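To illustrate resistance, the sketch below reuses the six quiz scores from Example 2 and lowers the last score from 4 to 1, as in the earlier questions. The variable names are illustrative.

```python
from statistics import mean, median

original = [10, 10, 9, 7, 6, 4]
changed = [10, 10, 9, 7, 6, 1]  # last score lowered from 4 to 1

# The mean uses every value, so it moves; the median does not.
mean_shift = mean(changed) - mean(original)
median_shift = median(changed) - median(original)

print(round(mean_shift, 2), median_shift)  # -0.5 0.0
```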
Mean vs. Median
—  When the shape is symmetric the mean and
the median are usually close.
—  When the shape is skewed left, the mean is
to the left of the median.
—  When the shape is skewed right, the mean
is to the right of the median.
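The Hershey Park ride counts give a concrete case: a few visitors rode many rides, which suggests a right-skewed shape. The sketch below compares the mean and the median for that sample; the names are illustrative.

```python
from statistics import mean, median

rides = [2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,
         5, 5, 5, 5, 6, 7, 7, 8, 8, 10, 12, 15]

rides_mean = mean(rides)
rides_median = median(rides)

# The large values (10, 12, 15) pull the mean to the right of the median.
print(rides_mean, rides_median)  # 5.44 4
```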
Mean vs. Median
The mean is a good numerical summary when
the data is symmetric and bell-shaped.
[Figure: Height of 25 women in a class]
Mean vs. Median
—  When the shape is irregular, the mean is not as
meaningful a summary.
—  What could account for the irregular shape of this
graph?
[Figure: Height of plants by color (red, pink, blue); vertical axis: number of plants, 0 to 5; horizontal axis: height in centimeters]
Mode
—  For discrete data that takes on only a few
values, the median can become
meaningless.
—  In this case, the mode, which is the most
frequently occurring value, is often used as
the measure of center.
Example 3
25 faculty members at a university are asked
how many children they have. 13 faculty
members have no children, 6 faculty members
have 1 child, and 6 faculty members have two
children.
◦  The median number of children is 0.
◦  If the number of faculty members with one child
was 0 and the number of faculty members with two
children was 12, the median would still be 0.
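The two scenarios in Example 3 can be checked with a short sketch. The lists are built directly from the counts on the slide; the variable names are illustrative.

```python
from statistics import mean, median, mode

# 13 faculty with 0 children, 6 with 1 child, 6 with 2 children.
children = [0] * 13 + [1] * 6 + [2] * 6
# Altered scenario: nobody with 1 child, 12 with 2 children.
altered = [0] * 13 + [2] * 12

print(median(children), median(altered))  # 0 0
print(mode(children), mode(altered))      # 0 0
print(mean(children), mean(altered))      # 0.72 0.96
```

The median stays at 0 in both scenarios even though the data change noticeably, which is why the median says little here.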
SECTION 2.4 – HOW CAN WE DESCRIBE THE SPREAD OF QUANTITATIVE DATA?
Range
—  The range is the difference between the
largest and the smallest observation.
—  If quiz scores are 5, 5, 7, 8, 9, 9, 10, then
the range of scores is 10 – 5 = 5.
—  It is not resistant.
—  It ignores most individual observations.
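A minimal sketch of the range calculation for the quiz scores above. The extra score of 0 in the second list is a hypothetical outlier added only to show that the range is not resistant.

```python
scores = [5, 5, 7, 8, 9, 9, 10]
score_range = max(scores) - min(scores)
print(score_range)  # 5

# Hypothetical outlier: one added score of 0 doubles the range.
with_outlier = scores + [0]
outlier_range = max(with_outlier) - min(with_outlier)
print(outlier_range)  # 10
```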
Standard Deviation
—  How far does each observation fall from
the mean?
—  A deviation is the difference between the
observation and the mean.
◦  Positive deviation when the value is above the
mean.
◦  Negative deviation when the value is below the
mean.
Standard Deviation
—  We square the deviations to make all of the
deviations positive.
—  We then average the squared deviations to
find the variance (for sample data, the sum is
divided by n − 1 rather than n).
—  The standard deviation is the square root of
the variance.
—  Using 1-Var Stats on your TI-83/84 you can
get this and more for a data set.
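In symbols, the steps above can be written as follows. This uses the sample convention of dividing by $n - 1$, which is an assumption here because the slides only say "average"; dividing by $n$ instead gives the population version.

```latex
s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}, \qquad s = \sqrt{s^2}
```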
Example 1, again
—  There are 5 students in class and the scores
on a quiz are 10, 10, 7, 6, 4.
—  Determine the standard deviation of the
scores “by hand.”
—  Determine the standard deviation of the
scores using a TI calculator.
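The "by hand" steps can be mirrored in code as a check. This sketch computes both the sample version (divide by n − 1) and the population version (divide by n), since the slides do not say which is intended; the TI-83/84 1-Var Stats screen reports them as Sx and σx respectively.

```python
from math import sqrt

scores = [10, 10, 7, 6, 4]
n = len(scores)

x_bar = sum(scores) / n                   # mean: 7.4
deviations = [x - x_bar for x in scores]  # observation minus mean
squared = [d ** 2 for d in deviations]    # squared deviations
sum_sq = sum(squared)                     # about 27.2

s = sqrt(sum_sq / (n - 1))  # sample standard deviation (Sx)
sigma = sqrt(sum_sq / n)    # population standard deviation (sigma-x)

print(round(s, 4), round(sigma, 4))  # 2.6077 2.3324
```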
Standard Deviation
—  Large standard deviations represent data
which is spread out.
—  Small standard deviations represent data
which is bunched together.
—  Standard deviation is not resistant.
—  What would a data set with standard
deviation 0 look like?
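One data set that answers the last question is a list in which every value is the same, so every deviation from the mean is 0. The values below are made up for illustration.

```python
from statistics import pstdev, stdev

same = [7, 7, 7, 7]  # hypothetical scores, all identical

sd_sample = stdev(same)
sd_population = pstdev(same)

print(sd_sample, sd_population)  # 0.0 0.0
```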
Empirical Rule
If the distribution of the data is bell-shaped,
then approximately:
◦  68% of the observations fall within one standard
deviation of the mean
◦  95% of the observations fall within two standard
deviations of the mean
◦  99.7% of the observations fall within three
standard deviations of the mean
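The three percentages can be checked against an exact normal curve. This is a sketch that assumes "bell-shaped" means exactly normal, which real data only approximate; it uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later).

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Share of the distribution within k standard deviations of the mean.
shares = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(k, round(share * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The rule's 68% and 95% are rounded versions of these values.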
Example
—  We know that the heights of adult males follow
a generally bell-shaped distribution with a
mean of 68 inches and a standard deviation of 2.5
inches.
—  Sketch the shape of this distribution.
—  Suppose I have a random sample of adult males
whose mothers breastfed them, and the
average height from this sample is 73 inches.
Does this seem like a significant result?
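As a starting point for the sketch and the question, the code below lists the empirical-rule intervals for this distribution and finds how many standard deviations 73 inches lies above the mean. The variable names are illustrative.

```python
mean_height = 68.0  # inches
sd_height = 2.5     # inches

# Intervals within 1, 2, and 3 standard deviations of the mean.
intervals = {k: (mean_height - k * sd_height, mean_height + k * sd_height)
             for k in (1, 2, 3)}
for k, (low, high) in intervals.items():
    print(k, low, high)
# 1 65.5 70.5
# 2 63.0 73.0
# 3 60.5 75.5

# Number of standard deviations that 73 inches lies above the mean.
z_score = (73 - mean_height) / sd_height
print(z_score)  # 2.0
```

This only locates 73 inches relative to individual heights. Whether a sample average that high is significant also depends on the sample size, which the slide does not give.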