Download 3-1 and 3-2

Chapter 3
Statistics for Describing,
Exploring, and Comparing Data
3-1 Overview
3-2 Measures of Center
3-3 Measures of Variation
3-4 Measures of Relative Standing
3-5 Exploratory Data Analysis (EDA)
Slide
1
Section 3-1
Overview
Slide
2
Overview
• Descriptive Statistics
summarize or describe the important characteristics of a known set of data
• Inferential Statistics
use sample data to make inferences (or generalizations) about a population
Slide
3
Section 3-2
Measures of Center
Slide
4
Key Concept
When describing, exploring, and comparing
data sets, these characteristics are usually
extremely important: center, variation,
distribution, outliers, and changes over time.
Slide
5
Definition
• Measure of Center
the value at the center or middle of a data set
Slide
6
Definition
Arithmetic Mean
(Mean)
the measure of center obtained by adding
the values and dividing the total by the
number of values
Slide
7
Notation

Σ denotes the sum of a set of values.
x is the variable usually used to represent the individual data values.
n represents the number of values in a sample.
N represents the number of values in a population.
Slide
8
Notation
$\bar{x}$ is pronounced 'x-bar' and denotes the mean of a set of sample values:
$\bar{x} = \frac{\sum x}{n}$
µ is pronounced 'mu' and denotes the mean of all values in a population:
$\mu = \frac{\sum x}{N}$
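The definition of the mean can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the variable names are assumptions, and the sample values are borrowed from the median and mode examples elsewhere in these slides.

```python
def mean(values):
    """Arithmetic mean: add the values, then divide by how many there are."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# Sample values taken from the examples used elsewhere in these slides.
sample = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10]
x_bar = mean(sample)

# The original values have two decimal places, so report three.
print(round(x_bar, 3))
```

The same function gives µ when it is applied to every value in a population, because the only change in the formula is writing N instead of n.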
Slide
9
Definitions
• Median
the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude
• often denoted by $\tilde{x}$ (pronounced 'x-tilde')
• is not affected by an extreme value
Slide
10
Finding the Median
• If the number of values is odd, the median is the number located in the exact middle of the list.
• If the number of values is even, the median is found by computing the mean of the two middle numbers.
Slide
11
Example 1 (even number of values)
Original: 5.40 1.10 0.42 0.73 0.48 1.10
In order: 0.42 0.48 0.73 1.10 1.10 5.40
There is no exact middle; the middle position is shared by two numbers, 0.73 and 1.10.
MEDIAN is $\frac{0.73 + 1.10}{2} = 0.915$

Example 2 (odd number of values)
Original: 5.40 1.10 0.42 0.73 0.48 1.10 0.66
In order: 0.42 0.48 0.66 0.73 1.10 1.10 5.40
The exact middle is the fourth value.
MEDIAN is 0.73
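The odd/even rule for finding the median can be sketched as follows. The function name `median` and the list names are assumptions made for illustration; the data are the two lists from this slide.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one value")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single value sits in the exact middle.
        return ordered[mid]
    # Even count: average the two values that share the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


even_case = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10]
odd_case = even_case + [0.66]

print(round(median(even_case), 3))  # matches the slide: 0.915
print(median(odd_case))             # matches the slide: 0.73
```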
Slide
12
Definitions
• Mode
the value that occurs most frequently
• Mode is not always unique
• A data set may be:
Bimodal
Multimodal
No Mode
Mode is the only measure of central tendency that can be used with nominal data
Slide
13
Mode - Examples
a. 5.40 1.10 0.42 0.73 0.48 1.10
Mode is 1.10
b. 27 27 27 55 55 55 88 88 99
Bimodal - 27 & 55
c. 1 2 3 6 7 8 9 10
No Mode
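The three cases above (one mode, two modes, no mode) can be reproduced with a short sketch. The function name `modes` is an assumption, and so is the convention of returning an empty list when no value repeats, which mirrors the "No Mode" case on this slide.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency, or [] if nothing repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # No value occurs more than once, so there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([5.40, 1.10, 0.42, 0.73, 0.48, 1.10]))      # [1.1]
print(modes([27, 27, 27, 55, 55, 55, 88, 88, 99]))      # [27, 55]
print(modes([1, 2, 3, 6, 7, 8, 9, 10]))                 # []
```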
Slide
14
Definition
• Midrange
the value midway between the maximum and minimum values in the original data set
$\text{Midrange} = \frac{\text{maximum value} + \text{minimum value}}{2}$
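As a minimal sketch of the midrange formula, assuming a function name `midrange` and reusing the sample values from the mode examples:

```python
def midrange(values):
    """Value midway between the largest and smallest data values."""
    if not values:
        raise ValueError("midrange requires at least one value")
    return (max(values) + min(values)) / 2


sample = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10]
print(round(midrange(sample), 3))  # (5.40 + 0.42) / 2 = 2.91
```

Because only the two extreme values enter the formula, a single outlier such as 5.40 moves the midrange much more than it moves the median.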
Slide
15
Round-off Rule for
Measures of Center
Carry one more decimal place than is
present in the original set of values.
Slide
16
You do!
Here are the volumes (in ounces) of randomly selected
cans of Coke:
12.3 12.1 12.2 12.3 12.2
Find the mean, median, mode, and midrange.
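One way to check hand-computed answers is with Python's standard `statistics` module. This is a sketch, not part of the original exercise: `statistics.multimode` (Python 3.8+) returns every value tied for the highest count, and the midrange is computed directly because the module has no midrange function. Expected outputs are deliberately left out so the exercise can still be worked by hand first.

```python
import statistics

volumes = [12.3, 12.1, 12.2, 12.3, 12.2]

mean_value = statistics.mean(volumes)
median_value = statistics.median(volumes)
mode_values = sorted(statistics.multimode(volumes))
midrange_value = (max(volumes) + min(volumes)) / 2

# Round-off rule: the data have one decimal place, so report two.
print("mean:", round(mean_value, 2))
print("median:", round(median_value, 2))
print("mode(s):", mode_values)
print("midrange:", round(midrange_value, 2))
```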
Slide
17
Mean from a Frequency
Distribution
Assume that in each class, all sample
values are equal to the class
midpoint.
Slide
18
Mean from a Frequency
Distribution
Use the midpoint of each class as the value of the variable x.
Slide
19
Find the Mean from a Frequency
Distribution
Age of Actress    Frequency (f)    Class Midpoint (x)    f*x
21-30             28               25.5                  714
31-40             30               35.5                  1065
41-50             12               45.5                  546
51-60             2                55.5                  111
61-70             2                65.5                  131
71-80             2                75.5                  151
Totals            76                                     2718
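The table's totals lead to the mean by dividing the sum of the f*x column by the sum of the frequencies. The sketch below repeats that calculation; the name `grouped_mean` and the way the classes are stored as (lower limit, upper limit, frequency) tuples are assumptions for illustration.

```python
# (lower class limit, upper class limit, frequency) from the table above
classes = [
    (21, 30, 28),
    (31, 40, 30),
    (41, 50, 12),
    (51, 60, 2),
    (61, 70, 2),
    (71, 80, 2),
]


def grouped_mean(classes):
    """Approximate the mean by treating every value as its class midpoint."""
    total_fx = 0.0
    total_f = 0
    for lower, upper, frequency in classes:
        midpoint = (lower + upper) / 2
        total_fx += frequency * midpoint
        total_f += frequency
    return total_fx / total_f


print(round(grouped_mean(classes), 1))  # 2718 / 76, about 35.8
```

The result is an approximation, since the actual ages inside each class are replaced by the class midpoint.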
Slide
20
Weighted Mean
In some cases, values vary in their degree of
importance, so they are weighted accordingly.
$\bar{x} = \frac{\sum (w \cdot x)}{\sum w}$
Slide
21
Example
• We have three test scores: 85, 90, 75
• The first test counts for 20%, the second for 30%,
and the third for 50% of the final grade.
• Find the weighted mean.
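The example can be worked through with a short sketch of the weighted mean formula. The function name `weighted_mean` is an assumption, and the percentages are written as the decimal weights 0.20, 0.30, and 0.50.

```python
def weighted_mean(values, weights):
    """Sum of (weight * value) divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


scores = [85, 90, 75]
weights = [0.20, 0.30, 0.50]

# 0.20*85 + 0.30*90 + 0.50*75 = 17 + 27 + 37.5 = 81.5
print(round(weighted_mean(scores, weights), 1))
```

For comparison, the unweighted mean of the three scores is 83.3, so the heavy weight on the lowest score pulls the final grade down.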
Slide
22
Best Measure of Center
Slide
23
Definitions
• Symmetric
distribution of data is symmetric if the left half of its histogram is roughly a mirror image of its right half
• Skewed
distribution of data is skewed if it is not symmetric and if it extends more to one side than the other
Slide
24
Skewness
Slide
25
Heart Rate Activity
Is there a difference between male and
female heart rates?
Male: 60 67 59 64 80 55 72 84 59 67 69
Female: 83 56 57 63 60 69 70 86 70 57 67 75
72 75 57 76 69 79 84 75 56
Compare:
1) The sample sizes
2) The means and the medians (centers)
3) The IQRs, the five-number summaries, and the standard deviations (spreads)
4) The shape: symmetric, skewed, or irregular?
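The numerical parts of this comparison can be sketched with Python's standard `statistics` module. The function name `summarize` is an assumption. Quartiles are computed with `statistics.quantiles` using its default method, which may differ slightly from the quartile rule used in a particular textbook, so the IQR and five-number summary should be read as one reasonable version rather than the only correct one.

```python
import statistics

male = [60, 67, 59, 64, 80, 55, 72, 84, 59, 67, 69]
female = [83, 56, 57, 63, 60, 69, 70, 86, 70, 57, 67, 75,
          72, 75, 57, 76, 69, 79, 84, 75, 56]


def summarize(values):
    """Collect sample size, center, and spread measures for one group."""
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "five_number": (min(values), q1, statistics.median(values), q3, max(values)),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


for label, data in (("Male", male), ("Female", female)):
    print(label)
    for name, value in summarize(data).items():
        print(f"  {name}: {value}")
```

Shape still has to be judged from a picture such as a histogram or boxplot; comparing the mean with the median in each group gives only a rough hint about skewness.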
Slide
26