MEASURES OF CENTRAL
TENDENCY
Unifix cubes activity
• Get into groups of 6-10 people.
• Each person in the group grabs a handful of Unifix cubes. It can be a large handful or a small handful; variety is good.
• Without paper, pencil, or calculator, devise a plan for redistributing the cubes in your group so that everyone has the same number of cubes.
• Devise a second, possibly more efficient plan.
Mean
• Sample Mean
• The sample mean is the arithmetic mean of a set of sample data, given by
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} \quad \text{or} \quad \bar{x} = \frac{\sum x_i}{n}$$
where $x_i$ is the $i$th data value and $n$ is the number of data values in the sample.
Mean as a balancing point
• Get into groups of 4-6.
• Find the balancing point of the following numbers on a number line using post-it notes:
• 3, 4, 4, 4, 4, 5, 5, 6, 7, 8, 8, 10, 10
• Try it again with:
• 2, 3, 3, 3, 4, 4, 4, 4, 5, 8, 10, 10
• Find a list of 10 numbers that have a mean of 7 using mean as a balancing point.
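The balancing-point idea can be checked numerically: at the mean, the total distance of the values below it equals the total distance of the values above it. The sketch below is only an illustration using the two lists from the activity; the helper name `deviations` is made up for this example.

```python
from statistics import mean

def deviations(values):
    """Return each value's signed distance from the mean of the list."""
    center = mean(values)
    return [v - center for v in values]

first = [3, 4, 4, 4, 4, 5, 5, 6, 7, 8, 8, 10, 10]
second = [2, 3, 3, 3, 4, 4, 4, 4, 5, 8, 10, 10]

for data in (first, second):
    devs = deviations(data)
    left = sum(-d for d in devs if d < 0)   # total distance below the mean
    right = sum(d for d in devs if d > 0)   # total distance above the mean
    print(f"mean = {mean(data)}, left = {left}, right = {right}")
```

For both lists the two totals match, which is what "balancing" means on the number line.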
Median
• Finding the Median of a Data Set
1. List the data in ascending (or descending) order,
making an ordered array.
2. If the data set contains an ODD number of values,
the median is the middle value in the ordered array.
3. If the data set contains an EVEN number of values,
the median is the arithmetic mean of the two
middle values in the ordered array. Note that this
implies that the median may not be a value in the
data set.
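The three steps above can be sketched as a short function. This is a minimal illustration, not a replacement for a library routine; the function name and the two small test lists are made up for the example.

```python
def median(values):
    """Median following the three steps: sort, then pick or average the middle."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)              # step 1: ordered array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                        # step 2: odd number of values
        return ordered[mid]
    # step 3: even number of values, average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 1, 2]))       # odd count: the middle value, 2
print(median([4, 1, 3, 2]))    # even count: 2.5, which is not in the data set
```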
Calculating Measures of Center—
Mean, Median, and Mode
• Given the recent economy and change of attitude in society, many people chose to take on another job after retiring from one. Below is a sample of ages at which people truly retired; that is, they stopped working for pay. Calculate the mean, median, and mode for the data.
• 84, 80, 82, 77, 78, 80, 79, 42
Calculating Measures of Center—
Mean, Median, and Mode (cont.)
• Mean: Remember, the mean is the sum of all the data points divided by the number of points.
$$\bar{x} = \frac{\sum x_i}{n} = \frac{84 + 80 + 82 + 77 + 78 + 80 + 79 + 42}{8} = \frac{602}{8} = 75.25 \approx 75.3$$
Calculating Measures of Center—
Mean, Median, and Mode (cont.)
• Median: We have an even number of values, so we will need the mean of the middle two values in the ordered array.
42, 77, 78, 79, 80, 80, 82, 84
$$\frac{79 + 80}{2} = 79.5$$
Calculating Measures of Center—
Mean, Median, and Mode (cont.)
• Mode: The number 80 occurs more often than any other number, so it is the mode.
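The three results for the retirement ages can be double-checked with Python's standard `statistics` module. This is only a verification sketch; `multimode` (Python 3.8+) is used because it returns every most-frequent value rather than assuming there is exactly one.

```python
from statistics import mean, median, multimode

ages = [84, 80, 82, 77, 78, 80, 79, 42]

print(mean(ages))        # 75.25
print(median(ages))      # 79.5
print(multimode(ages))   # [80]
```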
Choosing an Appropriate Measure
of Center
• Determining the Most Appropriate Measure of Center
1. For qualitative data, the mode should be used.
2. For quantitative data, the mean should be used, unless the data set contains outliers or is skewed.
3. For quantitative data sets that are skewed or contain outliers, the median should be used.
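To see why the median is preferred when outliers are present, compare both measures on the retirement ages with and without the value 42. Treating 42 as the outlier is an assumption made for this illustration; the slides do not label it as one.

```python
from statistics import mean, median

ages = [84, 80, 82, 77, 78, 80, 79, 42]
# Assumption for illustration: 42 is the outlier in this sample.
without_outlier = [a for a in ages if a != 42]

mean_shift = mean(without_outlier) - mean(ages)        # 80 - 75.25
median_shift = median(without_outlier) - median(ages)  # 80 - 79.5

print(f"mean moves by {mean_shift:.2f}")      # mean moves by 4.75
print(f"median moves by {median_shift:.2f}")  # median moves by 0.50
```

The single low value changes the mean by almost five years but the median by only half a year.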
Determining Mean, Median,
and Mode from a Graph
• Determine which letter represents the mean, the median, and the mode in the graph below.
[Graph not reproduced in this transcript]
Graphs and Measures of Center
1. The mode is the data value at which a distribution has its highest peak.
2. The median is the number that divides the area of the distribution in half.
3. The mean of a distribution will be pulled toward any outliers.
MEASURES OF
DISPERSION
Range
Range = Maximum Data Value − Minimum Data Value
Standard Deviation
The standard deviation is a measure of how much we might expect a typical member of the data set to differ from the mean.
The population standard deviation is given by
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$
Standard Deviation (cont.)
where $x_i$ is the $i$th value in the population, $\mu$ is the population mean, and $N$ is the number of values in the population.
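The population formula translates directly into code. The sketch below is a minimal illustration; the function name `population_std` and the eight-value list are made up for the example and are not taken from the slides.

```python
import math
from statistics import pstdev

def population_std(values):
    """Population standard deviation: sqrt of the mean squared deviation."""
    N = len(values)
    mu = sum(values) / N                              # population mean
    squared_devs = [(x - mu) ** 2 for x in values]    # (x_i - mu)^2
    return math.sqrt(sum(squared_devs) / N)           # divide by N, not N - 1

# Hypothetical population chosen so the arithmetic comes out evenly.
population = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(population))   # 2.0
print(pstdev(population))           # the standard library gives the same value
```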
Standard Deviation (cont.)
The sample standard deviation is given by
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$
where $x_i$ is the $i$th data value, $\bar{x}$ is the sample mean, and $n$ is the number of data values in the sample.
Calculate the sample standard deviation of the
following data collected regarding the numbers of
hours students studied for a physics exam.
5, 8, 7, 6, 9
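Solution
Let’s calculate the sample standard deviation by hand using the following formula.
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$
First, find the sample mean: $\bar{x} = \frac{5 + 8 + 7 + 6 + 9}{5} = \frac{35}{5} = 7$. Then list each deviation from the mean and its square.
x_i    x_i − x̄    (x_i − x̄)²
5      −2         4
8      1          1
7      0          0
6      −1         1
9      2          4
Calculating Standard Deviation (cont.)
Next, find the sum of the squared deviations by adding up the values in the last column.
$$\sum (x_i - \bar{x})^2 = 4 + 1 + 0 + 1 + 4 = 10$$
Calculating Standard Deviation
(cont.)
Finally, substitute the appropriate values into the sample standard deviation formula as follows.
$$s = \sqrt{\frac{10}{5 - 1}} = \sqrt{2.5} \approx 1.58$$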
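The hand calculation for the study-hours data can be checked with a short script that builds the same columns (value, deviation, squared deviation) and then divides by n − 1. This is a sketch for verification only; the variable names are made up for the example.

```python
import math
from statistics import stdev

hours = [5, 8, 7, 6, 9]
n = len(hours)
x_bar = sum(hours) / n                       # sample mean: 7.0

# One row per data value: (x_i, x_i - x_bar, (x_i - x_bar)^2)
rows = [(x, x - x_bar, (x - x_bar) ** 2) for x in hours]
for x, dev, sq in rows:
    print(f"{x:>3} {dev:>6.1f} {sq:>6.1f}")

sum_sq = sum(sq for _, _, sq in rows)        # sum of the last column: 10.0
s = math.sqrt(sum_sq / (n - 1))              # sample formula divides by n - 1

print(f"s = {s:.2f}")                        # s = 1.58
print(math.isclose(s, stdev(hours)))         # True: matches statistics.stdev
```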
Interpreting Standard
Deviations
Mark is looking into investing a portion of his recent bonus into the
stock market. While researching different companies, he discovers the
following standard deviations of one year of daily stock closing prices.
Profacto Corporation: Standard deviation of stock prices = $1.02
Yardsmoth Company: Standard deviation of stock prices = $9.67
What do these two standard deviations tell you about the stock prices
of these companies?
Interpreting Standard
Deviations (cont.)
Solution
A smaller standard deviation indicates that the data
values are closer together, while a larger standard
deviation indicates that the data values are more
spread out. In this example, the standard deviation
of stock prices for the Profacto Corporation is
considerably smaller than that of the Yardsmoth
Company. Hence, there is less variability in the daily
closing prices of the Profacto stock than in the
Yardsmoth stock prices.
Interpreting Standard
Deviations (cont.)
If Mark wants a stable long-term investment,
then Profacto appears to be the better
choice. If, however, Mark is looking to make
a quick profit and is willing to take the risk,
then the Yardsmoth stock would seem to
better suit his purposes. Note that looking at
the standard deviations is just one
component of evaluating market prices.
Empirical Rule
Empirical Rule for Bell-Shaped
Distributions
• Approximately 68% of the data values lie within one standard deviation of the mean.
• Approximately 95% of the data values lie within two standard deviations of the mean.
• Approximately 99.7% of the data values lie within three standard deviations of the mean.
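The three percentages can be compared with an exactly normal distribution. Modeling "bell-shaped" as exactly normal is an assumption made for this sketch; real data will only match approximately. It uses `statistics.NormalDist` from the standard library (Python 3.8+).

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, standard deviation 1

# Share of values within k standard deviations of the mean, for k = 1, 2, 3.
shares = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

The exact values are about 68.3%, 95.4%, and 99.7%, which the Empirical Rule rounds to 68%, 95%, and 99.7%.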
Applying the Empirical Rule for
Bell-Shaped Distributions
The distribution of weights of newborn babies is bell-shaped with a mean of 3000 grams and a standard deviation of 500 grams.
a. What percentage of newborn babies weigh between 2000 and 4000 grams?
b. What percentage of newborn babies weigh less than 3500 grams?
c. Calculate the range of birth weights that would contain the middle 68% of newborn babies’ weights.
Applying the Empirical Rule for Bell-Shaped Distributions (cont.)
a. To begin, find how many standard deviations each weight is from the mean.
$$\frac{2000 - 3000}{500} = -2 \qquad \frac{4000 - 3000}{500} = 2$$
Thus, these weights lie two standard deviations above and below the mean. According to the Empirical Rule, approximately 95% of values lie within two standard deviations of the mean. Therefore, we can say that approximately 95% of newborn babies weigh between 2000 and 4000 grams.
Applying the Empirical Rule for Bell-Shaped Distributions (cont.)
b. To begin, let’s find out how many standard deviations a weight of 3500 grams is away from the mean by performing the same calculation as before.
$$\frac{3500 - 3000}{500} = 1$$
Applying the Empirical Rule for Bell-Shaped Distributions (cont.)
Thus it is one standard deviation above the mean. The Empirical Rule says that approximately 68% of data values lie within one standard deviation of the mean. Because of the symmetry of the distribution, half of this 68% is above the mean and half is below. Putting the upper 34% together with the 50% of data that is below the mean, we have that approximately 50% + 34% = 84% of newborn babies weigh less than 3500 grams.
Applying the Empirical Rule for Bell-Shaped Distributions (cont.)
c. From the Empirical Rule, we know that approximately 68% of the data values lie within one standard deviation of the mean for bell-shaped distributions. The standard deviation of this distribution is 500; thus, by adding 500 to and subtracting 500 from the mean of the distribution, we will get the range of birth weights that contains the middle 68% of newborn babies’ weights.
Applying the Empirical Rule for Bell-Shaped Distributions (cont.)
Upper end: 3000 + 500 = 3500
Lower end: 3000 − 500 = 2500
Thus, approximately 68% of newborn babies weigh between 2500 and 3500 grams.
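The three answers for the birth-weight example can be compared with an exactly normal model. Treating the bell-shaped distribution as exactly normal is an assumption made for this sketch, so the results differ slightly from the rounded Empirical Rule figures; the helper `z_score` is made up for the example.

```python
from statistics import NormalDist

weights = NormalDist(mu=3000, sigma=500)   # assumed exactly normal

def z_score(x):
    """Number of standard deviations x lies from the mean."""
    return (x - weights.mean) / weights.stdev

# a. Share between 2000 and 4000 grams (z = -2 to z = 2).
share_a = weights.cdf(4000) - weights.cdf(2000)

# b. Share below 3500 grams (z = 1).
share_b = weights.cdf(3500)

# c. Interval of one standard deviation either side of the mean.
middle_68 = (weights.mean - weights.stdev, weights.mean + weights.stdev)

print(z_score(2000), z_score(4000), z_score(3500))   # -2.0 2.0 1.0
print(f"a. {share_a:.1%}")    # a. 95.4%
print(f"b. {share_b:.1%}")    # b. 84.1%
print(f"c. {middle_68}")      # c. (2500.0, 3500.0)
```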