Frequency Distributions and Central Tendency

There are many ways to collect data,
and all of them have some drawbacks.
For example, advice columnist Ann
Landers once asked her readers if they
would still have children, given a
chance to do it over again.
She reported in her column a few weeks
later that 70% of parents said having
kids was not worth it.1
1 Yates, D., et al. The Practice of Statistics. 2000.
It is true that 70% of those
responding to this survey said they
would not have kids if they could
do it all over again. But do
you think you can use this data to
draw the conclusion that 70% of
parents feel this way? Why or why
not?
This survey used what is called a
voluntary response sample. Only
those who felt strongly about the issue
chose to respond. So instead of
saying 70% of parents would not have
children again, the best we could say
is 70% of those who feel strongly
about the issue would not have
children again.
One sampling method that is fairly
reliable is a random sample.
Definition: A random sample of n
individuals from a population is a
sample chosen in such a way that each
individual in that population has an
equal chance of being selected.
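To see why a voluntary response sample can mislead where a random sample does not, here is a small simulation sketch in Python. Everything in it is hypothetical: the population size, the assumed 30% share of parents who would not have children again, and the response rates are invented for illustration and are not taken from the Ann Landers survey.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: True means "would NOT have children again".
TRUE_SHARE = 0.30  # assumed for illustration only
population = [rng.random() < TRUE_SHARE for _ in range(100_000)]


def share(sample):
    """Fraction of the sample answering "would not have children again"."""
    return sum(sample) / len(sample)


# Random sample: every parent has the same chance of being chosen.
random_sample = rng.sample(population, 1000)


# Voluntary response: unhappy parents are assumed (hypothetically)
# to be five times as likely to write in as content parents.
def responds(unhappy):
    return rng.random() < (0.20 if unhappy else 0.04)


voluntary_sample = [p for p in population if responds(p)]

print("population:        ", round(share(population), 2))
print("random sample:     ", round(share(random_sample), 2))
print("voluntary response:", round(share(voluntary_sample), 2))
```

Under these assumptions the random sample should land close to the population share, while the voluntary response share comes out far higher, even though nobody answered dishonestly.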
One way to analyze collected data
is to look at how the data is
grouped. To do this, we
will examine the frequency
distribution.
Example: A random sample of 25 college
students is surveyed and asked, on
average, how many minutes a day they
spend reading their math textbook. The
answers are listed below:
14, 7, 1, 11, 2, 3, 11, 6, 10, 13, 11, 11, 16,
12, 9, 11, 9, 10, 7, 12, 9, 6, 4, 5, 9
We will use our calculators to plot the
frequency distribution.
On your calculator, clear all lists. Then
input this data into L1.
Hit 2ND-STAT PLOT (above the Y= key), and
hit ENTER to choose Plot 1. Select ON,
and the bar graph (3rd selection) for Type.
Xlist should be set to L1 by default.
Now hit ZOOM and choose ZOOMSTAT.
Your frequency distribution graph will
appear.
Hit WINDOW. If Xscl is not equal to 1,
change it. You might also want a Yscl
of 1. Then hit GRAPH.
Your plot should look something like this:
Now, the height of the first bar represents
the number of 1’s, the second the number of 2’s, etc.
Using this chart, we can write out a
frequency table:
minutes   frequency
1         1
2         1
3         1
4         1
5         1
6         2
7         2
8         0
9         4
10        2
11        5
12        2
13        1
14        1
15        0
16        1
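The same table can be tallied in software. A minimal Python sketch, assuming the 25 survey answers are typed into a list (the name `minutes` is only for illustration):

```python
from collections import Counter

# The 25 answers from the example (minutes per day).
minutes = [14, 7, 1, 11, 2, 3, 11, 6, 10, 13, 11, 11, 16,
           12, 9, 11, 9, 10, 7, 12, 9, 6, 4, 5, 9]

counts = Counter(minutes)  # maps each value to how many times it occurs

# List every value from 1 to 16, including those with frequency 0.
print("minutes frequency")
for value in range(1, 17):
    print(value, counts[value])
```

A `Counter` returns 0 for values that never occur, which is why 8 and 15 still show up in the printed table.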
We can use our plot of the frequency
distribution to draw a frequency
polygon. A frequency polygon connects
the midpoint of the top of each bar on
the frequency distribution plot, using a 0
value for the tops of the hypothetical 0th
and (n + 1)th bars.
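As a sketch of what the polygon's corner points look like for the reading-time data, the Python below builds the list of vertices. It assumes each bar is positioned at the value it counts; on the calculator screen the bar for a value v is drawn from v to v + 1, so the plotted midpoints sit half a unit to the right.

```python
from collections import Counter

minutes = [14, 7, 1, 11, 2, 3, 11, 6, 10, 13, 11, 11, 16,
           12, 9, 11, 9, 10, 7, 12, 9, 6, 4, 5, 9]
counts = Counter(minutes)

low, high = min(minutes), max(minutes)

# Start at height 0 one step before the first bar (the "0th" bar),
# visit the top of every bar, then return to height 0 one step
# after the last bar (the "(n + 1)th" bar).
polygon = [(low - 1, 0)]
polygon += [(value, counts[value]) for value in range(low, high + 1)]
polygon.append((high + 1, 0))

for point in polygon:
    print(point)
```

Connecting these points in order with straight line segments gives the frequency polygon.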
Your calculator can also regroup the data.
Under WINDOW, change Xscl to 4, and
hit GRAPH (you might also need to change
your Ymax).
Now our rectangles represent the number
of people who spent 1 – 4, 5 – 8, 9 – 12,
and 13 – 16 minutes daily reading their
math textbook.
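The regrouping can be checked by hand or with a short script. This Python sketch assumes the first bar starts at 1 and uses a bar width of 4, matching the Xscl setting above; the variable names are illustrative.

```python
minutes = [14, 7, 1, 11, 2, 3, 11, 6, 10, 13, 11, 11, 16,
           12, 9, 11, 9, 10, 7, 12, 9, 6, 4, 5, 9]

width = 4   # bar width (Xscl)
start = 1   # left edge of the first bar

groups = {}
for m in minutes:
    index = (m - start) // width        # which bar this answer falls in
    lo = start + index * width          # smallest value in that bar
    label = f"{lo}-{lo + width - 1}"    # e.g. "9-12"
    groups[label] = groups.get(label, 0) + 1

# Print the bars from left to right.
for label in sorted(groups, key=lambda s: int(s.split("-")[0])):
    print(label, groups[label])
```

The four counts add up to 25, the size of the sample, which is a quick check that no answer was dropped.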
Another way to analyze data is to
calculate its central tendency.
Definition: The central tendency of a
data set is a value given to the center, or
middle, of the set.
There are three different measures of
central tendency for frequency
distributions: mean, median, and mode.
For probability distributions, the central
tendency was the expected value.
The mean of a frequency distribution is
the most familiar.
Definition: The mean of n numbers, x1,
x2, . . . , xn is
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
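Applied to the 25 reading times from the earlier example, the definition amounts to adding the values and dividing by how many there are. A minimal Python sketch (variable names are illustrative):

```python
minutes = [14, 7, 1, 11, 2, 3, 11, 6, 10, 13, 11, 11, 16,
           12, 9, 11, 9, 10, 7, 12, 9, 6, 4, 5, 9]

n = len(minutes)       # how many numbers
total = sum(minutes)   # x1 + x2 + ... + xn
mean = total / n

print(total, n, mean)
```

Here the sum is 219 and n is 25, so the mean is 219/25 = 8.76 minutes.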
This process is tedious for long lists of
numbers. Luckily, our calculators can do
this. To find the mean of the numbers in
L1, just hit STAT, scroll to CALC and hit
ENTER to get 1-Var Stats. That will
appear on your home screen. Enter L1
(2ND-1) and hit ENTER, and a long list of
statistics appears. x̄ is the mean of the
numbers in L1.
Other useful information on your screen:
∑x = sum of L1
∑x² = sum of squares of values in L1
Sx = sample standard deviation (next
section)
σx = population standard deviation (we
won’t use this)
n = how many numbers in L1
minX = smallest number in L1
Q1 = first quartile, the lower end of the
interquartile range (a measure of spread;
we won’t use this)
Med = median (coming up soon)
Q3 = third quartile, the upper end of the
interquartile range (see Q1)
maxX = largest number in L1
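For comparison with the calculator screen, most of these quantities can be reproduced with Python's standard `statistics` module. This is a sketch rather than a replica of the calculator: the quartiles are left out because different tools compute them with slightly different rules.

```python
import statistics

minutes = [14, 7, 1, 11, 2, 3, 11, 6, 10, 13, 11, 11, 16,
           12, 9, 11, 9, 10, 7, 12, 9, 6, 4, 5, 9]

n = len(minutes)                         # n
total = sum(minutes)                     # sum of the values
total_sq = sum(x * x for x in minutes)   # sum of the squared values
mean = total / n                         # x-bar
s_x = statistics.stdev(minutes)          # Sx: sample standard deviation
sigma_x = statistics.pstdev(minutes)     # sigma-x: population standard deviation
smallest = min(minutes)                  # minX
med = statistics.median(minutes)         # Med
largest = max(minutes)                   # maxX

print("mean =", mean)
print("sum =", total, " sum of squares =", total_sq)
print("Sx =", round(s_x, 3), " sigma_x =", round(sigma_x, 3))
print("n =", n, " min =", smallest, " median =", med, " max =", largest)
```

Sx divides by n - 1 and σx divides by n, which is why Sx always comes out slightly larger for the same list.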
In math, x̄ is the name given to the
mean of a sample. This number can be
used to get an idea of what the mean
value is for the entire population.
The symbol typically used for the mean
of the entire population is µ (mu).
Our goal is to draw appropriate
conclusions about the population by
examining a sample.
The shape of the frequency distribution
is most commonly used to decide the
best measurement of central tendency.
When the frequency distribution is
shaped like a bell curve, the mean is the
best measure of central tendency (more
on this in section 9.3). When the
frequency distribution is skewed, the median
can be a better measure of central
tendency.
Definition: The median of a data set is
the middle entry of that set when the
entries are listed in order.
Note that for our data set of time spent
reading math, the data was slightly
skewed to the left: the longer tail is on
the left, toward the smaller values, while
most of the data sits at the larger end of
our set of numbers. Thus, the
median would be a good measure of
central tendency for this data set.
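One rough numeric check of the skew is to compare the mean with the median. This is only a rule of thumb, not a definition, and it can fail for oddly shaped data: a mean below the median often goes with a longer tail on the left. A Python sketch for the reading-time data:

```python
import statistics

minutes = [14, 7, 1, 11, 2, 3, 11, 6, 10, 13, 11, 11, 16,
           12, 9, 11, 9, 10, 7, 12, 9, 6, 4, 5, 9]

mean = statistics.mean(minutes)
median = statistics.median(minutes)

# Rule of thumb only: a few unusually small values pull the mean
# below the median, and a few unusually large ones pull it above.
if mean < median:
    tail = "left"
elif mean > median:
    tail = "right"
else:
    tail = "none"

print("mean =", mean, " median =", median, " suggested tail:", tail)
```

For this data the mean (8.76) is a little below the median (9), which agrees with the slight left skew seen in the plot.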
Using the 1-Var Stats, we already found that
the median of this data is 9.
Since there are 25 numbers in this list of
data, 12 are less than or equal to 9 and 12
are greater than or equal to 9.
To check, we can sort our list. Under the STAT
menu, choose 2: SortA( (sort ascending). Hit
ENTER, and SortA( will appear on your
home screen. Input L1 and hit ENTER.
“Done” will appear on the
home screen. If you now go to
STAT and hit 1 for Edit, you will
see L1 in ascending order. Check to
see if we got the correct median.
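The same check can be sketched in Python: sort the list, pick the entry in the middle position, and count how many entries fall on each side. Variable names are illustrative.

```python
minutes = [14, 7, 1, 11, 2, 3, 11, 6, 10, 13, 11, 11, 16,
           12, 9, 11, 9, 10, 7, 12, 9, 6, 4, 5, 9]

ordered = sorted(minutes)       # ascending order, like SortA(
middle = len(ordered) // 2      # position 12 counting from 0, i.e. the 13th entry
median = ordered[middle]

below = ordered[:middle]        # the 12 entries before the middle one
above = ordered[middle + 1:]    # the 12 entries after the middle one

print(ordered)
print("median =", median, " entries below:", len(below), " entries above:", len(above))
```

With 25 entries, the 13th entry of the sorted list has 12 entries on each side of it.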
If you have an even number of numbers
in your data set, the median is the mean
of the middle 2.
Example: Find the median of
100, 114, 125, 135, 150, 172.
These numbers are already in ascending
order. There are 2 numbers less than 125
and 2 greater than 135, so 125 and 135
are the middle two, and therefore
$$\frac{125 + 135}{2} = 130$$
is the median.
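Both cases, an odd and an even number of entries, can be handled by one small function. The sketch below is one straightforward way to write it in Python; the function name `median` is chosen here for illustration.

```python
def median(values):
    """Middle entry of the sorted values, or the mean of the middle two."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle entry exists.
        return ordered[mid]
    # Even count: average the two entries around the center.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([100, 114, 125, 135, 150, 172]))  # 130.0
```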
The final measure of central tendency is the
mode. This measure is not often used, but
does tend to cancel out the effect of
unusually large or unusually small elements
in a data set.
Definition: The mode of a data set is the
most frequently occurring element in that
data set.
A data set can have more than one mode.
Examples:
1. Find the mode or modes:
1, 7, 2, 2, 5, 7, 2, 5, 7
2. Find the mode or modes:
21, 32, 46, 32, 49, 32, 49
Classwork: Below is a sampling of ages
of motorcyclists at the time they were
fatally injured in traffic accidents:
17, 38, 27, 14, 18, 34, 16, 42, 28, 24, 40,
20, 23, 31, 37, 21, 30, 25, 17, 28, 33,
25, 23, 19, 51, 18, 29
1. Input the data into L1 in your calculator.
2. Graph a frequency distribution with
bar width (Xscl) of 1.
3. Graph a frequency distribution with
bar width of 10.
4. Find the mean and median of the
data.
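Answers to the two short mode examples can be checked with Python's standard library. This sketch uses `statistics.multimode`, which returns every value tied for most frequent; the list names `first` and `second` are illustrative.

```python
from statistics import multimode

first = [1, 7, 2, 2, 5, 7, 2, 5, 7]
second = [21, 32, 46, 32, 49, 32, 49]

# multimode returns all values that share the highest frequency;
# sorting just makes the printout easier to read.
modes_first = sorted(multimode(first))
modes_second = sorted(multimode(second))

print(modes_first)
print(modes_second)
```

One of these lists has a single mode and the other has two, which illustrates that a data set can have more than one mode.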