Algebra 2
Statistics Notes #5: Describing Data Distributions
Name _____________________________________
MAFS.912.S-ID.1.4
• Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages.
• Recognize that there are data sets for which such a procedure is not appropriate.
• Use calculators and tables to estimate areas under the normal curve.
When describing a data distribution, ALWAYS include 3 things: Shape, Center, Spread.
SHAPE
I. These are the common shapes that distributions have. Name each.
a) Bell
b) Uniform
c) Skewed Left
d) Skewed Right
II. While creating box-and-whisker plots, we found out that sometimes the mean and median are not the same.
a) Which shape(s) would you expect to have the same mean and median? Bell and Uniform (both symmetric)
b) Which shape(s) would you expect to have a different mean and median? Skewed Left and Skewed Right
c) When the mean is greater than the median, what shape is the distribution? Why?
Skewed Right because the average will be pulled toward the greater extreme value.
d) When the mean is less than the median, what shape is the distribution? Why?
Skewed Left because the average will be pulled toward the lower extreme value.
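The pull of an extreme value on the mean can be checked with a quick calculation. The sketch below uses Python's standard `statistics` module on a small made-up data set; the numbers are an assumption chosen for illustration and do not come from these notes.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most values are small, one is much larger.
skewed_right = [2, 3, 3, 4, 4, 5, 20]
# Mirror image of the same data: most values are large, one is much smaller.
skewed_left = [22 - x for x in skewed_right]

for name, data in [("skewed right", skewed_right), ("skewed left", skewed_left)]:
    print(name, "mean =", round(mean(data), 2), "median =", median(data))

# The mean is pulled toward the extreme value; the median stays with the bulk.
right_mean_above_median = mean(skewed_right) > median(skewed_right)
left_mean_below_median = mean(skewed_left) < median(skewed_left)
```

In both lists only one value is extreme, yet it is enough to move the mean away from the median in the direction of the tail.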
CENTER
We have already looked at two different measures of center: mean and median.
Which measure of center best describes the peak of a skewed distribution? Why?
Median, because the mean is affected by extreme values and may not best represent the majority of the data.
SPREAD
Spread is a measure of how far apart the data values are, compared to each other. The simplest measure of spread is the range.
Another measure of spread that is often used is standard deviation. The easiest way to explain standard deviation is: how far apart the data values are when compared to the mean.
Example 1:
Compare data sets A, B, and C:
A: {8, 9, 10, 11, 12}
B: {7, 8, 10, 12, 13}
C: {10, 10, 10, 10, 10}
a) What is the mean of each data set? A: 10 B: 10 C: 10
b) Which data set has values closest to the mean? __C__. This one has the smallest standard deviation.
c) Which data set has values farthest from the mean? __B__. This one has the largest standard deviation.
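The answers to Example 1 can be confirmed numerically. This is a minimal sketch using `statistics.stdev`, which computes the sample standard deviation (dividing by n - 1); the variable names are chosen here for illustration.

```python
from statistics import mean, stdev

data_sets = {
    "A": [8, 9, 10, 11, 12],
    "B": [7, 8, 10, 12, 13],
    "C": [10, 10, 10, 10, 10],
}

# For each data set, store (mean, sample standard deviation).
results = {name: (mean(values), stdev(values)) for name, values in data_sets.items()}

for name, (m, s) in results.items():
    print(f"{name}: mean = {m}, standard deviation = {s:.3f}")

# Pick out the data sets with the smallest and largest standard deviation.
smallest = min(results, key=lambda name: results[name][1])
largest = max(results, key=lambda name: results[name][1])
print("smallest:", smallest, "largest:", largest)
```

All three means are equal, so the standard deviation is the only thing here that separates the data sets.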
IMPORTANT PROPERTIES of STANDARD DEVIATION:
• It is only equal to 0 if all of the data is the same number.
• Larger st. dev. values indicate bigger amounts of variation.
• It will increase dramatically if extreme values (outliers) exist.
• Units of standard deviation are the same as the units of the original data.
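The first three properties can be seen in a short experiment. The sketch below uses made-up lengths in centimeters (an assumption for illustration only) and replaces a single value with an outlier.

```python
from statistics import stdev

# Hypothetical lengths in cm, made up for illustration.
identical = [5, 5, 5, 5, 5]
typical = [4, 5, 5, 6, 5]
with_outlier = [4, 5, 5, 6, 50]   # same data, last value replaced by an outlier

sd_identical = stdev(identical)        # all values equal, so no variation
sd_typical = stdev(typical)
sd_with_outlier = stdev(with_outlier)

print("identical:", sd_identical)
print("typical:", round(sd_typical, 3))
print("with outlier:", round(sd_with_outlier, 3))
```

Because the inputs are in centimeters, each result is also in centimeters.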
This is the formula for standard deviation:
$S_x = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Example 2:
Lately there have been a lot of big caterpillars on the trees outside the school. The science department
decides to take a random sample of 5 caterpillars and measure their lengths in centimeters. The lengths
are shown below.
{𝒙: 8 cm, 6 cm, 5 cm, 7 cm, 4 cm}
a) Find the Mean: $\bar{x} = 6$ cm.
b) Find the Standard Deviation. Fill out the chart to help you:

$x$     $(x - \bar{x})$     $(x - \bar{x})^2$
8       2                   4
6       0                   0
5       -1                  1
7       1                   1
4       -2                  4

sum: $\sum (x - \bar{x})^2 = 10$
sample size: n = 5
$S_x = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{10}{4}} \approx 1.581$ cm
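The chart method from Example 2 can also be written out step by step. The following is a sketch of the same hand calculation, not a replacement for it; the names `rows`, `sum_sq`, and `s_x` are chosen here for illustration.

```python
from math import sqrt

lengths = [8, 6, 5, 7, 4]              # caterpillar lengths in cm (Example 2)
n = len(lengths)
x_bar = sum(lengths) / n                # mean

# One chart row per data value: x, (x - mean), (x - mean)^2
rows = [(x, x - x_bar, (x - x_bar) ** 2) for x in lengths]
for x, dev, sq in rows:
    print(f"{x:>3} {dev:>6.1f} {sq:>6.1f}")

sum_sq = sum(sq for _, _, sq in rows)   # sum of squared deviations
s_x = sqrt(sum_sq / (n - 1))            # sample standard deviation
print(f"sum = {sum_sq}, n = {n}, Sx = {s_x:.3f} cm")
```

Each printed row corresponds to one row of the chart, and the last line repeats the final division by n - 1 and the square root.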
Example 3:
You decide to take another random sample of 5 caterpillars. Find the mean and standard deviation of this
new sample: {𝒙: 6 cm, 8 cm, 4 cm, 3 cm, 9 cm}
a) Find the Mean: $\bar{x} = 6$ cm.
b) Find the Standard Deviation. Fill out the chart to help you:

$x$     $(x - \bar{x})$     $(x - \bar{x})^2$
6       0                   0
8       2                   4
4       -2                  4
3       -3                  9
9       3                   9

sum: $\sum (x - \bar{x})^2 = 26$
$S_x = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{26}{4}} \approx 2.550$ cm
c) Compare the means and standard deviations of both samples. Explain your findings.
The data sets have the same mean (6 cm) but the second data set has a larger standard deviation
because its values are farther from the mean.
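The comparison in part (c) can be double-checked with the two samples side by side. This sketch assumes the lengths from Examples 2 and 3 and uses `statistics.stdev` for the sample standard deviation.

```python
from statistics import mean, stdev

sample_1 = [8, 6, 5, 7, 4]   # Example 2 lengths, in cm
sample_2 = [6, 8, 4, 3, 9]   # Example 3 lengths, in cm

mean_1, mean_2 = mean(sample_1), mean(sample_2)
sd_1, sd_2 = stdev(sample_1), stdev(sample_2)

print(f"sample 1: mean = {mean_1} cm, Sx = {sd_1:.3f} cm")
print(f"sample 2: mean = {mean_2} cm, Sx = {sd_2:.3f} cm")

same_center = mean_1 == mean_2
more_spread = "sample 2" if sd_2 > sd_1 else "sample 1"
print("same mean:", same_center, "| more spread out:", more_spread)
```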
Finding Standard Deviation on your CALCULATOR:
On the TI-84:
1. Type your data into a List: STAT, EDIT
2. Press: STAT, CALC, 1-Var Stats
3. Enter the list your data is in (L1, L2, L3, etc.), ENTER
(If there is a space for FreqList, leave this blank!)
Note: Scroll down to find the mean (𝑥̅ ) and the 5-number summary!
On the TI-nspire:
1. Type your data into a List: menu, New Document, Add Lists & Spreadsheet
2. Press: menu > Statistics > Stat Calculations > One-Variable Statistics
3. Num of Lists: 1, OK, OK
(Leave all other lines blank!)
Note: Scroll down to find the mean (𝑥̅ ) and the 5-number summary!
• Now try Examples 2 and 3 on your calculator!
• Did you get the same numbers for standard deviation, Sx? YES!
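For readers without a calculator at hand, a similar one-variable summary can be sketched in Python. The function name `one_var_stats` is hypothetical, and the quartile rule used here (the median of the lower half and of the upper half, leaving out the middle value when n is odd) is an assumption; other tools may use a different convention and report slightly different quartiles.

```python
from statistics import mean, median, stdev, pstdev

def one_var_stats(data):
    """Return a one-variable summary: mean, spread, and five-number summary."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]              # values below the middle position
    upper = values[half + n % 2:]      # values above the middle position
    return {
        "mean": mean(values),
        "Sx": stdev(values),           # sample standard deviation (divides by n - 1)
        "sigma_x": pstdev(values),     # population standard deviation (divides by n)
        "n": n,
        "min": values[0],
        "Q1": median(lower),           # assumed quartile rule: median of lower half
        "median": median(values),
        "Q3": median(upper),           # assumed quartile rule: median of upper half
        "max": values[-1],
    }

summary = one_var_stats([8, 6, 5, 7, 4])   # Example 2 data
for label, value in summary.items():
    print(label, "=", round(value, 3))
```

The value labeled `Sx` is the one that matches the formula in these notes; `sigma_x` divides by n instead of n - 1 and is therefore slightly smaller.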
PRACTICE PROBLEMS
1. The stem-and-leaf plot below shows the number of minutes the students in Language Arts class said that they
spent writing a report for a class project.
a) What is the approximate shape of the distribution?
b) Find the median.
c) The mean is 146. How close are they? Does this make sense,
based on the shape of the distribution?
Key: 12│4= 124 min
2. The salaries of the 15 employees at a small company (in thousands) are shown below.
24, 24, 25, 25, 25, 25, 28, 28, 30, 30, 32, 38, 65, 85, 120
a) Make a stem-and-leaf plot for this data.
b) Describe the shape of the data
c) What is the center of the data (mean and median)?
d) Compare the mean and median. Which is larger? How does this
support the shape you chose for part (b)?
e) Based on the shape, which measure of center more accurately
represents the data? Explain, using the context of the question.
3. Which of the following histograms would have the smallest standard deviation?
[Histograms (a), (b), (c), and (d) are not reproduced here.]
4. For the distribution shown to the right, which statement is true, based on its shape?
(a) The mean and median are approximately equal.
(b) The mean is greater than the median.
(c) The mean is less than the median.
(d) The mean could be either greater than or less than the median.
5. For the distribution shown to the right, which statement is true, based on its shape?
(a) The mean and median are approximately equal
(b) The mean is greater than the median.
(c) The mean is less than the median.
(d) The median is 80
6. Without performing any calculations, use the stem-and-leaf plots to determine which statement is accurate.
a) Data sets (i) and (ii) have the same standard deviation.
b) Data set (ii) has the greatest standard deviation.
c) Data set (i) has the smallest standard deviation.
d) Data sets (i) and (iii) have the same range.
7. The accompanying box-and-whisker plots can be used to compare the annual incomes of three professions.
a) Which profession has the largest range of income?
b) What is the shape of the distribution for the income of
musicians? Based on this shape, would the mean
or the median income be greater?
c) What is the median income for a nuclear engineer?
d) Which profession’s income has the smallest standard
deviation?