Probability and Statistics
Copyright © Cengage Learning. All rights reserved.
14.5 Descriptive Statistics (Numerical)
Objectives
► Introduction to Statistical Measures
► Measures of Central Tendency: Mean, Median, Mode
► Measures of Spread: Variance and Standard Deviation
Descriptive Statistics (Numerical)
Data usually consist of thousands or even
millions of numbers. The first goal of
statistics is to describe such huge sets of
data in simpler terms.
One way to make sense of data is to find a
“typical” number or the “center” of the data.
Any such number is called a measure of
central tendency.
There are three measures of central tendency:
mean, median, and mode.
What does each of these tell us
about the data?
Example 3 – Effect of Outliers on the Mean and Median
The following table gives the selling prices of houses sold
in 2007 in a small coastal California town.
(a) Find the mean house price.
(b) Find the median house price.
(c) Which value is more “typical”? Why?
Measures of Central Tendency: Mean, Median, Mode
If a data set includes a number that is “far out” or far away
from the rest of the data, that data point is called an outlier.
In general, when a data set has outliers, the median is a
better indicator of central tendency than the mean.
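The effect of an outlier can be sketched in a few lines of Python. The house prices below are hypothetical values chosen for illustration (they are not the table from Example 3); the point is only that one very large value pulls the mean far more than the median.

```python
from statistics import mean, median

# Hypothetical house prices in dollars (illustrative only, not the Example 3 table).
prices = [210_000, 225_000, 240_000, 250_000, 265_000]

# The same data with one very expensive house added as an outlier.
prices_with_outlier = prices + [2_500_000]

print("without outlier:", mean(prices), median(prices))
print("with outlier:   ", mean(prices_with_outlier), median(prices_with_outlier))
```

In this sketch the mean more than doubles when the outlier is added, while the median barely moves.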
Measures of Central Tendency: Mean, Median, Mode
The mode of a data set is a summary statistic that is
usually less informative than the mean or median, but has
the advantage of not being limited to numerical data.
Measures of Central Tendency: Mean, Median, Mode
The mode of the data set 1, 1, 2, 2, 2, 3, 5, 8 is the number
2. The data set 1, 2, 2, 3, 5, 5, 8 has two modes: 2 and 5.
Data sets with two modes are called bimodal.
The data set 1, 2, 4, 5, 7, 8 has no mode.
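As a rough check of the three cases above, here is a minimal Python sketch. The helper name `modes` is hypothetical, and it assumes the convention used on this slide: a data set in which no value repeats has no mode.

```python
from collections import Counter

def modes(data):
    """Return the most frequent values, or [] if no value repeats."""
    if not data:
        return []
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs exactly once, so there is no mode
    return sorted(value for value, count in counts.items() if count == top)

print(modes([1, 1, 2, 2, 2, 3, 5, 8]))  # one mode
print(modes([1, 2, 2, 3, 5, 5, 8]))     # bimodal
print(modes([1, 2, 4, 5, 7, 8]))        # no mode
```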
Organizing Data: Frequency Tables
Sometimes listing the data in a special way can help us get
useful information about the data. One such method is the
frequency table.
A frequency table for a set of data is a table that includes
each different data point and the number of times that point
occurs in the data.
The mode is most easily determined from a frequency table.
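One way to build a frequency table in Python is with `collections.Counter`. The sketch below reuses the data set 1, 1, 2, 2, 2, 3, 5, 8 from the discussion of the mode; the variable names are illustrative.

```python
from collections import Counter

data = [1, 1, 2, 2, 2, 3, 5, 8]

# Counter maps each distinct data point to the number of times it occurs.
table = Counter(data)

print("value  frequency")
for value in sorted(table):
    print(f"{value:5d}  {table[value]:9d}")

# The mode is the value with the largest frequency.
mode_value, mode_count = table.most_common(1)[0]
print("mode:", mode_value, "occurring", mode_count, "times")
```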
Example 4 – Using a Frequency Table
The scores obtained by the students in an algebra class on
a five-question quiz are given in the following frequency
table. Find the mean, median, and mode of the scores.
Frequency Table
Example 4 – Solution
The mode is 5, because more students got this score than
any other score.
The total number of quizzes is 16 + 8 + 5 + 5 + 3 + 3 = 40.
To find the mean, we add all the scores and divide by 40.
Note that the score 5 occurs 16 times, the score 4 occurs 8
times, and so on.
So the mean score is
$$\frac{5(16) + 4(8) + 3(5) + 2(5) + 1(3) + 0(3)}{40} = \frac{140}{40} = 3.5$$
Example 4 – Solution
cont’d
There are 40 students in this class. If we rank the scores
from highest to lowest, the median score is the average of
the 20th and 21st scores. From the frequency column in the
table, we see that these scores are each 4. So the median
score is 4.
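The whole of Example 4 can be checked with a short Python sketch. The frequency table itself is not reproduced in this transcript, so the dictionary below is an assumption: it pairs the scores 5, 4, 3, 2, 1, 0 with the frequencies 16, 8, 5, 5, 3, 3 in the order they appear in the solution.

```python
# Assumed frequency table: score -> number of students with that score.
freq = {5: 16, 4: 8, 3: 5, 2: 5, 1: 3, 0: 3}

n = sum(freq.values())  # total number of quizzes

# Mean: weight each score by its frequency, then divide by n.
mean_score = sum(score * count for score, count in freq.items()) / n

# Mode: the score with the largest frequency.
mode_score = max(freq, key=freq.get)

# Median: rank the scores from highest to lowest and average
# the 20th and 21st (indices 19 and 20, since Python counts from 0).
ranked = sorted(
    (score for score, count in freq.items() for _ in range(count)),
    reverse=True,
)
median_score = (ranked[19] + ranked[20]) / 2

print(n, mean_score, median_score, mode_score)
```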
Measures of Spread: Standard Deviation
Measures of central tendency identify the “center” or
“typical value” of the data.
Measures of spread (also called measures of
dispersion) describe the spread or variability of the data
around a central value.
For example, find the mean of each of the following sets of
numbers.
50, 58, 78, 81, 93
72, 71, 72, 72, 73
Although the means are the same, we say that the first
data set shows more variability than the second.
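A quick Python sketch confirms that the two data sets share a mean but differ in spread. It uses `statistics.stdev` (the sample standard deviation) as the measure of spread; the variable names are illustrative.

```python
from statistics import mean, stdev

first = [50, 58, 78, 81, 93]
second = [72, 71, 72, 72, 73]

# Both means are the same...
print("means:", mean(first), mean(second))

# ...but the sample standard deviations are very different.
print("standard deviations:", stdev(first), stdev(second))
```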
Measures of Spread: Standard Deviation
The most important measure of
variability in statistics is the
standard deviation.
• Standard deviation measures roughly the
typical deviation (or difference) of the
data values from the mean.
Example: Finding the Sample Standard Deviation
The starting salaries below are for the Chicago branches
of a corporation. The corporation has several other
branches, and you plan to use the starting salaries
of the Chicago branches to estimate the starting
salaries for the larger population. Find the sample
standard deviation of the starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42
Larson/Farber 4th ed.
Copyright © 2010, 2007, 2004 Pearson Education, Inc.

STANDARD DEVIATION

The standard deviation, s, is
measured in the same units as the
original data using the formula:
$$s = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$$
Solution: Finding the Sample Standard Deviation
Sample Standard Deviation
$$s = \sqrt{s^2} = \sqrt{\frac{88.5}{9}} \approx 3.1$$
The sample standard deviation is about 3.1, or $3100.
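The hand calculation can be retraced step by step in Python. This is a minimal sketch using the ten starting salaries from the example; the variable names are illustrative, and the steps follow the sample formula with n − 1 in the denominator.

```python
from math import sqrt

# Starting salaries in thousands of dollars (from the example).
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

n = len(salaries)
y_bar = sum(salaries) / n                               # sample mean

# Sum of squared deviations from the mean.
sum_sq_dev = sum((y - y_bar) ** 2 for y in salaries)

sample_variance = sum_sq_dev / (n - 1)                  # divide by n - 1, not n
s = sqrt(sample_variance)                               # sample standard deviation

print(y_bar, sum_sq_dev, round(s, 1))
```

Dividing by n − 1 rather than n is what makes this the sample standard deviation, which is appropriate here because the Chicago branches are being used to estimate a larger population.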
Solution: Using Technology to Find the Standard Deviation
[Calculator screen showing the sample mean and the sample standard deviation]
Example: Using Technology to Find the Standard Deviation
Sample office rental rates (in
dollars per square foot per
year) for Miami’s central
business district are shown
in the table. Use a calculator
to find the mean rental rate
and the sample standard
deviation. (Adapted from:
Cushman & Wakefield Inc.)
Office Rental Rates
35.00  33.50  37.00
23.75  26.50  31.25
36.50  40.00  32.00
39.25  37.50  34.75
37.75  37.25  36.75
27.00  35.75  26.00
37.00  29.00  40.50
24.50  33.00  38.00
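In place of a calculator, Python's standard `statistics` module can serve as the "technology" here. The sketch below is one possible approach, not the method shown on the original slide; it uses the 24 rental rates listed above.

```python
from statistics import mean, stdev

# Office rental rates in dollars per square foot per year (from the table).
rates = [
    35.00, 33.50, 37.00, 23.75, 26.50, 31.25,
    36.50, 40.00, 32.00, 39.25, 37.50, 34.75,
    37.75, 37.25, 36.75, 27.00, 35.75, 26.00,
    37.00, 29.00, 40.50, 24.50, 33.00, 38.00,
]

sample_mean = mean(rates)
sample_sd = stdev(rates)  # sample standard deviation (n - 1 in the denominator)

print(f"mean = {sample_mean:.2f}, s = {sample_sd:.2f}")
```

Note that `statistics.stdev` computes the sample standard deviation; `statistics.pstdev` would divide by n instead and is meant for a full population.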
Interpreting Standard Deviation


Standard deviation is a measure of the typical
amount an entry deviates from the mean.
The more the entries are spread out, the greater
the standard deviation.
Thinking About Variation




Since Statistics is about variation, spread is an
important fundamental concept of Statistics.
Measures of spread help us talk about what we
don’t know.
When the data values are tightly clustered around
the center of the distribution, standard deviation
will be small.
When the data values are scattered far from the
center, standard deviation will be large.
Example 7 – Calculating Standard Deviation
Two machines are used in filling 16-ounce soda bottles. To
test how consistently each machine fills the bottles, a
sample of 20 bottles from the output of each machine is
selected. Find the standard deviation for each machine.
Which machine is more consistent in filling the bottles?
Example 7 – Solution
cont’d
The standard deviations σ_x and σ_y for Soda Machines I and
II, respectively, are
Soda Machine I is more consistent in filling the bottles
because the standard deviation of the data from Machine I
is much smaller than that of the data from Machine II.