Descriptive Statistics
(Organizing Data)
Chapter 3
Three characteristics of data.
1. Representative score, such as average
2. Measure of scattering or variation
3. Nature of the distribution,
such as bell-shaped.
Slide
1
Do women really talk more than men?
Slide
2
Measures of Center
1. Arithmetic Mean (Mean)
2. Median
3. Mode
4. Midrange
5. Weighted Mean
6. Symmetry
Slide
3
1. Arithmetic Mean (Mean)
$\bar{x} = \dfrac{\sum x}{n}$ : the mean of a set of sample values
$\mu = \dfrac{\sum x}{N}$ : the mean of all values in a population
Slide
4
2. Median
the middle value when the original data values
are arranged in order of increasing (or
decreasing) magnitude.
Slide
5
5.40  1.10  0.42  0.73  0.48  1.10  0.66   (original values)
0.42  0.48  0.66  0.73  1.10  1.10  5.40   (in order - odd number of values)
exact middle: the median is 0.73

5.40  1.10  0.42  0.73  0.48  1.10   (original values)
0.42  0.48  0.73  1.10  1.10  5.40   (in order - even number of values; no exact middle, so the middle is shared by two numbers)
median = (0.73 + 1.10) / 2 = 0.915
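The two cases above can be checked with a short Python sketch. It assumes the standard-library `statistics.median` function, which sorts the values and averages the two middle ones when the count is even; the variable names are illustrative only.

```python
from statistics import median

# Data sets from the slide: seven values (odd count) and six values (even count).
odd_values = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10, 0.66]
even_values = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10]

# Odd count: the median is the single middle value of the sorted list.
median_odd = median(odd_values)

# Even count: the median is the mean of the two middle sorted values.
median_even = median(even_values)

print(sorted(odd_values), median_odd)
print(sorted(even_values), median_even)
```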
Slide
6
3. Mode
the value that occurs most frequently
A data set may be:
Bimodal
Multimodal
No Mode
Slide
7
Mode - Examples
a. 5.40 1.10 0.42 0.73 0.48 1.10
b. 27 27 27 55 55 55 88 88 99
c. 1 2 3 6 7 8 9 10
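As a rough illustration, the three example data sets can be classified with a small helper. `find_modes` is a hypothetical name introduced here; it treats a data set in which every value occurs exactly once as having no mode.

```python
from collections import Counter

def find_modes(values):
    """Return the most frequent value(s), or an empty list if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: no mode
    return [value for value, count in counts.items() if count == top]

modes_a = find_modes([5.40, 1.10, 0.42, 0.73, 0.48, 1.10])    # one mode
modes_b = find_modes([27, 27, 27, 55, 55, 55, 88, 88, 99])    # bimodal
modes_c = find_modes([1, 2, 3, 6, 7, 8, 9, 10])               # no mode

print(modes_a, modes_b, modes_c)
```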
Slide
8
4. Midrange
the value midway between the maximum
and minimum values in the original data
set
$\text{Midrange} = \dfrac{\text{maximum value} + \text{minimum value}}{2}$
Slide
9
Weighted Mean
In some cases, values vary in their degree of
importance, so they are weighted accordingly.
$\bar{x} = \dfrac{\sum (w \cdot x)}{\sum w}$
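A minimal sketch of the weighted mean follows. The scores and weights are hypothetical values chosen only for illustration (they do not come from the slides), and `weighted_mean` is an assumed helper name.

```python
def weighted_mean(values, weights):
    """Sum of (weight * value) divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Hypothetical example: three test scores weighted 2, 3 and 5.
scores = [85, 90, 75]
weights = [2, 3, 5]

result = weighted_mean(scores, weights)
print(result)  # (170 + 270 + 375) / 10 = 81.5
```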
Slide
10
Best Measure of Center
Slide
11
Example
Find the (1)mean, (2)median, (3)mode, and (4)
midrange for the following values:
3 3 3 3 5 2 3 3 3 2 (3)
4 2 2 3 2 3 5 3 4 4
(1) mean:
(2) Median:
(3) Mode:
(4) Midrange:
Slide
12
• Symmetric: a distribution of data is symmetric if the left half of its histogram is roughly a mirror image of its right half.
• Skewed: a distribution of data is skewed if it is not symmetric and if it extends more to one side than the other.
Slide
13
Skewness
Slide
14
Measures of Variation
1. Range
2. Standard Deviation (SD)
3. Variance
4. Estimating SD: Range Rule of Thumb
5. Empirical Rule
6. Coefficient of Variation (CV)
Slide
15
Dispersion Statistics

Group A    Group B
65         42
66         54
67         58
68         62
71         67
73         77
74         77
77         85
77         93
77         100

            Group A    Group B
Mean        71.5       71.5
Median      72.0       72.0
Mode        77         77
Midrange    71.0       71.0
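The four measures of center for the two groups can be reproduced with the sketch below. It assumes Python's standard `statistics` module; `midrange` and `center_summary` are helper names introduced here for illustration.

```python
from statistics import mean, median, mode

group_a = [65, 66, 67, 68, 71, 73, 74, 77, 77, 77]
group_b = [42, 54, 58, 62, 67, 77, 77, 85, 93, 100]

def midrange(values):
    """Value midway between the maximum and the minimum."""
    return (max(values) + min(values)) / 2

def center_summary(values):
    """Collect the four measures of center used on the slide."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "midrange": midrange(values),
    }

summary_a = center_summary(group_a)
summary_b = center_summary(group_b)
print(summary_a)
print(summary_b)
```

Both groups produce the same summary, even though their values are spread very differently.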
Slide
16
Dispersion Statistics



The measures of center show no difference between the two groups.
But the values in group B are much more widely scattered than those in group A.
This variability among data is one characteristic to which averages are not sensitive.
Slide
17
Dispersion Statistics

Three basic measures of dispersion
(a) Range,
(b) Standard Deviation,
(c) Variance
Slide
18
1. Range
the difference between the maximum value
and the minimum value.
Range
= (maximum value) – (minimum value)
Slide
19
Example

The range is simply the difference between the highest value and the lowest value.

For group A, the range is 12 (77 - 65).
For group B, the range is 58 (100 - 42).

Don't confuse the midrange (a measure of center) with the range (a measure of dispersion).
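The two ranges can be checked with a short sketch; `value_range` is a hypothetical helper name, and the data are the Group A and Group B values from the earlier slide.

```python
group_a = [65, 66, 67, 68, 71, 73, 74, 77, 77, 77]
group_b = [42, 54, 58, 62, 67, 77, 77, 85, 93, 100]

def value_range(values):
    """Range = (maximum value) - (minimum value)."""
    return max(values) - min(values)

range_a = value_range(group_a)
range_b = value_range(group_b)
print(range_a, range_b)
```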
Slide
20
2. Standard Deviation and Variance
SD and Var measure the dispersion or
variation of values about the mean.
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Slide
21
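Population Standard Deviation
$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
For comparison, the sample standard deviation:
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$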
Slide
22
x        $x - \bar{x}$    $(x - \bar{x})^2$
2        -5               25
3        -4               16
5        -2               4
6        -1               1
9        2                4
17       10               100
Total: 42                 150

$\bar{x} = \dfrac{42}{6} = 7.0$
$s = \sqrt{\dfrac{150}{6 - 1}} = \sqrt{30} \approx 5.5$
Slide
23
Variance - Notation
The variance is the standard deviation squared.
Notation
$s^2$ : Sample variance
$\sigma^2$ : Population variance
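To make the difference between the two notations concrete, the sketch below applies both to the six values from the standard deviation example. It assumes the standard-library functions `statistics.variance` (divides by n - 1) and `statistics.pvariance` (divides by N); treating the same six values as a whole population is purely for illustration.

```python
from statistics import variance, pvariance, stdev

x = [2, 3, 5, 6, 9, 17]

sample_var = variance(x)       # s^2: sum of squared deviations / (n - 1)
population_var = pvariance(x)  # sigma^2: sum of squared deviations / N

print(sample_var, population_var)
print(stdev(x) ** 2)  # squaring the sample SD gives back the sample variance (up to rounding)
```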
Slide
24
Comparison of Word Counts of
Men & Women
Slide
25
Interpreting and understanding SD
1. Range Rule
2. Empirical Rule
Slide
26
1. Range Rule of Thumb
(Estimation of Standard Deviation)
For estimating a value of the standard
deviation s,
Use $s \approx \dfrac{\text{range}}{4}$
range = (maximum value) – (minimum value)
Slide
27
Rough Estimates of the Min. & Max.
“Usual” sample values
Minimum “usual” value
= (mean) – 2 ∙ (standard deviation)
Maximum “usual” value
= (mean) + 2 ∙ (standard deviation)
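Both rules of thumb are simple enough to sketch directly. The helper names `estimate_sd` and `usual_limits` are assumptions made for this example; the data are the Group B values from the earlier slide, and the mean of 100 with standard deviation 15 is taken from the IQ example later in the deck.

```python
def estimate_sd(values):
    """Range rule of thumb: s is roughly range / 4."""
    return (max(values) - min(values)) / 4

def usual_limits(mean, sd):
    """Minimum and maximum "usual" values: mean -/+ 2 standard deviations."""
    return mean - 2 * sd, mean + 2 * sd

group_b = [42, 54, 58, 62, 67, 77, 77, 85, 93, 100]
rough_sd = estimate_sd(group_b)    # range 58 divided by 4
low, high = usual_limits(100, 15)  # IQ scores: mean 100, SD 15

print(rough_sd, low, high)
```

The estimate is only a rough guide; it is not a substitute for computing s from the formula.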
Slide
28
2. The Empirical Rule
For data sets having a distribution that is
approximately bell shaped,
• About 68% of all values fall within 1 standard deviation of the mean.
• About 95% of all values fall within 2 standard deviations of the mean.
• About 99.7% of all values fall within 3 standard deviations of the mean.
Slide
29
FIGURE 2-13
Slide
30
Example: P.106 IQ Scores
Empirical (68-95-99.7) Rule with Bell-shaped
Distribution
Mean = 100
S.D. = 15
What percentage of adults have IQ scores between 70 and 130?
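One way to work this example is to express the interval endpoints as a number of standard deviations from the mean and look that number up in the empirical rule. The sketch below does exactly that; `empirical_percentage` is a hypothetical helper, and it only handles intervals that are symmetric about the mean at exactly 1, 2, or 3 standard deviations.

```python
# Approximate percentages from the empirical rule, keyed by number of SDs.
EMPIRICAL_RULE = {1.0: 68.0, 2.0: 95.0, 3.0: 99.7}

def empirical_percentage(mean, sd, low, high):
    """Approximate percentage of values between low and high (bell-shaped data)."""
    k_low = (mean - low) / sd
    k_high = (high - mean) / sd
    if k_low != k_high or k_low not in EMPIRICAL_RULE:
        raise ValueError("interval must be mean +/- 1, 2, or 3 standard deviations")
    return EMPIRICAL_RULE[k_low]

# IQ scores: 70 and 130 are each 2 standard deviations from the mean of 100.
pct = empirical_percentage(mean=100, sd=15, low=70, high=130)
print(pct)
```

Under the bell-shaped assumption, about 95% of adults have IQ scores between 70 and 130.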
Slide
31
Coefficient of Variation
The coefficient of variation (or CV) for a set of
sample or population data, expressed as a
percent, describes the standard deviation relative
to the mean.
Sample: $CV = \dfrac{s}{\bar{x}} \cdot 100\%$
Population: $CV = \dfrac{\sigma}{\mu} \cdot 100\%$
Slide
32
Note: Coefficient of Variation (CV)


The coefficient of variation, expressed in percent, is used to describe the standard deviation relative to the mean.
Population: $CV = \dfrac{\sigma}{\mu} \times 100$
Sample: $CV = \dfrac{s}{\bar{x}} \times 100$
Find the CV for the following sample scores:
2, 2, 2, 3, 5, 8, 12, 19, 22, 30
CV = 95%
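The 95% figure can be verified with a few lines of Python. This sketch assumes the standard-library `statistics.mean` and `statistics.stdev` (the sample standard deviation, dividing by n - 1).

```python
from statistics import mean, stdev

scores = [2, 2, 2, 3, 5, 8, 12, 19, 22, 30]

x_bar = mean(scores)         # sample mean
s = stdev(scores)            # sample standard deviation
cv_percent = s / x_bar * 100

print(x_bar, round(s, 2), round(cv_percent))
```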
Slide
33
Example: p. 109 Heights and Weights
          Mean         SD
Height    68.34 in     3.02 in
Weight    172.55 lb    26.33 lb

CV for Heights =
CV for Weights =
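The two blanks can be filled in by applying the sample CV formula to the means and standard deviations in the table. The function name `coefficient_of_variation` is an assumption made for this sketch.

```python
def coefficient_of_variation(sd, mean):
    """CV in percent: standard deviation relative to the mean."""
    return sd / mean * 100

cv_height = coefficient_of_variation(sd=3.02, mean=68.34)
cv_weight = coefficient_of_variation(sd=26.33, mean=172.55)

print(round(cv_height, 1), round(cv_weight, 1))
```

Because the CV has no units, it allows heights (in inches) and weights (in pounds) to be compared directly; here the weights vary considerably more relative to their mean than the heights do.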
Slide
34
Measures of Relative Standing
1. Z Score
2. Quartiles and Percentiles
Slide
35
1. Z Score (or standardized value)
the number of standard deviations that a given value x is above or below the mean
Sample: $z = \dfrac{x - \bar{x}}{s}$
Population: $z = \dfrac{x - \mu}{\sigma}$
Slide
36
Interpreting Z Scores
Whenever a value is less than the mean, its
corresponding z score is negative
Ordinary values: z score between –2 and 2
Unusual Values: z score < -2 or z score > 2
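A small sketch of the z score calculation and the ordinary/unusual rule follows. The helper names are hypothetical, the mean of 100 and standard deviation of 15 come from the IQ example earlier in the deck, and the individual scores 85 and 145 are illustrative values chosen for this example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

def classify(z):
    """Ordinary if z is between -2 and 2, otherwise unusual."""
    return "unusual" if abs(z) > 2 else "ordinary"

z_low = z_score(85, mean=100, sd=15)    # below the mean, so z is negative
z_high = z_score(145, mean=100, sd=15)  # three standard deviations above the mean

print(z_low, classify(z_low))
print(z_high, classify(z_high))
```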
Slide
37
Which measure of center is the only one that
can be used with data at the nominal level
of measurement?
A. Mean
B. Median
C. Mode
Slide
38
Which of the following measures of center is
not affected by outliers?
A. Mean
B. Median
C. Mode
Slide
39
Find the mode(s) for the given sample data.
79, 25, 79, 13, 25, 29, 56, 79
A. 79
B. 48.1
C. 42.5
D. 25
Slide
40
Which is not true about the variance?
A. It is the square of the standard deviation.
B. It is a measure of the spread of data.
C. The units of the variance are different
from the units of the original data set.
D. It is not affected by outliers.
Slide
41
Weekly sales for a company are $10,000 with
a standard deviation of $450. Sales for the
past week were $9050. This is
A. Unusually high.
B. Unusually low.
C. About right.
Slide
42