CCAlgI.Unit #10.Lesson #4.Variation within a Data Set.Answer Key

Exercise #4: A marketing company is trying to determine how much diversity there is in the age of people who
drink different soft drinks. They take a sample of people and ask them which soda they prefer. For the two
sodas, the ages of the people who preferred each one are given below.
Soda A: 18, 16, 22, 16, 28, 18, 21, 38, 22, 29, 25, 44, 36, 27, 40
Soda B: 25, 22, 18, 30, 27, 19, 22, 28, 25, 19, 23, 29, 26, 18, 20
(a) Explain why standard deviation is a better measure of the diversity in age than the mean.
The mean only gives us a sense of the typical age of a particular soda drinker. It won't tell
us how wide a variety of ages we have. Standard deviation tells us how far a typical data point is
from the mean, so a higher value means a greater diversity in age than a lower value.
(b) Which soda appears to have a greater diversity in the age of people who prefer it? How did you decide on
Soda A appears to have the greater diversity. The ages range from a low of 16 to a high of 44, with many ages
spread in between. On the other hand, Soda B has a low of 18 and a high of only 30.
(c) Use your calculator to determine the sample standard deviation, normally given as s_x, for both data sets.
Round your answers to the nearest tenth. Did this answer reinforce your pick from (b)? How?
s_A ≈ 9.1 years and s_B ≈ 4.1 years
This did reinforce the answer from part (b). The standard deviation is much higher for Soda A, with a typical
age being 9.1 years away from the mean, while for Soda B the typical age is only 4.1 years away from the
mean, showing much less diversity in age.
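The calculator results in parts (b) and (c) can be checked with a short script. This is only a sketch: the names soda_a, soda_b, and summarize are made up for illustration, and Python's statistics.stdev is assumed to play the role of the calculator's sample standard deviation.

```python
import statistics

# Ages copied from the exercise.
soda_a = [18, 16, 22, 16, 28, 18, 21, 38, 22, 29, 25, 44, 36, 27, 40]
soda_b = [25, 22, 18, 30, 27, 19, 22, 28, 25, 19, 23, 29, 26, 18, 20]


def summarize(ages):
    """Return (low, high, sample standard deviation to the nearest tenth)."""
    # statistics.stdev divides by n - 1, i.e. it is the sample version.
    return min(ages), max(ages), round(statistics.stdev(ages), 1)


summary_a = summarize(soda_a)
summary_b = summarize(soda_b)
print("Soda A:", summary_a)  # (16, 44, 9.1)
print("Soda B:", summary_b)  # (18, 30, 4.1)
```

The low and high values support the reasoning in part (b), and the rounded standard deviations match the values reported in part (c).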
Population Versus Sample Standard Deviation
When we are working with every possible data point of interest, we call this a population and use the
population standard deviation, σ. When we have only a sample of all possible values we use the sample
standard deviation, s. The formulas for these two differ very slightly, so their values tend to be slightly
different.
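To see how small the difference between the two formulas is, here is a minimal sketch that computes both by hand. It assumes the usual definitions: both versions add up the squared distances from the mean, but the population version divides by n while the sample version divides by n - 1. The function name standard_deviation and its sample flag are illustrative, not part of the lesson.

```python
import math


def standard_deviation(data, sample=True):
    """Standard deviation of data; sample=True divides by n - 1, otherwise by n."""
    n = len(data)
    mean = sum(data) / n
    # Sum of the squared distances of each data point from the mean.
    squared_deviations = sum((x - mean) ** 2 for x in data)
    divisor = n - 1 if sample else n
    return math.sqrt(squared_deviations / divisor)


# Soda B ages from Exercise #4, used here only as example data.
soda_b = [25, 22, 18, 30, 27, 19, 22, 28, 25, 19, 23, 29, 26, 18, 20]

s = standard_deviation(soda_b, sample=True)       # sample version
sigma = standard_deviation(soda_b, sample=False)  # population version
print(f"s = {s:.2f}, sigma = {sigma:.2f}")
```

Because the sample version divides by the smaller number, s always comes out a little larger than σ for the same data, and the gap shrinks as the data set grows.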
Exercise #5: Which of the following data sets would have a standard deviation (population) closest to zero?
Do this without your calculator. Explain how you arrived at your answer.
(1) 5,  2,  1, 0, 1, 2, 5
(3) 11, 11, 12, 13, 13
(2) 5, 8, 10, 16, 20
(4) 3, 7, 11, 11, 11,18
Choice (3) has almost no variation within the data set at all: every value is within 1 of the mean of 12.
Choice (4) has three identical values in the middle, but the inclusion of the 3 and 18 will give it a
larger standard deviation than Choice (3).
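Although the exercise is meant to be done without a calculator, the reasoning can be confirmed afterward with a quick check. The sketch below assumes that the first three values of choice (1) are negative (the minus signs appear to have been lost in the source), and the names choices, sigmas, and closest_to_zero are illustrative.

```python
import statistics

# The four data sets from Exercise #5.
choices = {
    1: [-5, -2, -1, 0, 1, 2, 5],  # assumption: the first three values are negative
    2: [5, 8, 10, 16, 20],
    3: [11, 11, 12, 13, 13],
    4: [3, 7, 11, 11, 11, 18],
}

# statistics.pstdev is the population standard deviation (divides by n).
sigmas = {label: statistics.pstdev(values) for label, values in choices.items()}
closest_to_zero = min(sigmas, key=sigmas.get)

for label, sigma in sigmas.items():
    print(f"Choice ({label}): sigma = {sigma:.2f}")
print("Closest to zero: choice", closest_to_zero)
```

Under these assumptions the smallest value belongs to choice (3), in agreement with the answer above.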