```Understandable Statistics
Seventh Edition
By Brase and Brase
Prepared by: Lynn Smith
Gloucester County College
Chapter Three
Averages and Variation
1
Measures of Central Tendency
• Mode
• Median
• Mean
2
The Mode
the value or property that occurs
most frequently in the data
3
Find the mode:
6, 7, 2, 3, 4, 6, 2, 6
The mode is 6.
4
Find the mode:
6, 7, 2, 3, 4, 5, 9, 8
There is no mode for this data.
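A minimal Python sketch of the two examples above. It assumes the convention used on these slides, that data in which no value repeats has no mode. The function name `find_modes` is an illustrative choice, not something from the text.

```python
from collections import Counter

def find_modes(data):
    """Return the most frequent value(s), or [] if no value repeats."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs once: treated here as "no mode".
        return []
    return sorted(value for value, count in counts.items() if count == highest)

print(find_modes([6, 7, 2, 3, 4, 6, 2, 6]))  # [6]
print(find_modes([6, 7, 2, 3, 4, 5, 9, 8]))  # []
```

The function returns a list because a data set can have more than one value tied for most frequent.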
5
The Median
the central value of an ordered
distribution
6
To find the median of raw data:
• Order the data from smallest to largest.
• For an odd number of data values, the
median is the middle value.
• For an even number of data values, the
median is found by dividing the sum of
the two middle values by two.
7
Find the median:
Data:
5, 2, 7, 1, 4, 3, 2
Rearrange: 1, 2, 2, 3, 4, 5, 7
The median is 3.
8
Find the median:
Data:
31, 57, 12, 22, 43, 50
Rearrange: 12, 22, 31, 43, 50, 57
The median is the average of the middle two values:
$\frac{31 + 43}{2} = 37$
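The odd and even cases can be sketched in Python as follows. The function name `median_of` is a hypothetical choice for illustration.

```python
def median_of(data):
    """Median of raw data: order it, then take the middle value(s)."""
    ordered = sorted(data)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[middle]
    # Even count: the sum of the two middle values divided by two.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median_of([5, 2, 7, 1, 4, 3, 2]))     # 3
print(median_of([31, 57, 12, 22, 43, 50]))  # 37.0
```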
9
The Mean
The mean of a collection of data is found by:
• summing all the entries
• dividing by the number of entries
$\text{mean} = \frac{\text{sum of all entries}}{\text{number of entries}}$
10
Find the mean:
6, 7, 2, 3, 4, 5, 2, 8
$\text{mean} = \frac{6 + 7 + 2 + 3 + 4 + 5 + 2 + 8}{8} = \frac{37}{8} = 4.625 \approx 4.6$
11
Sigma Notation
• The symbol $\Sigma$ means “sum the following.”
• $\Sigma$ is the Greek letter (capital) sigma.
12
Notations for mean
Sample mean: $\bar{x}$ (“x bar”)
Population mean: $\mu$ (Greek letter mu)
13
Number of entries
in a set of data
• If the data represents a sample, the
number of entries = n.
• If the data represents an entire
population, the number of entries = N.
14
Sample mean
$\bar{x} = \frac{\Sigma x}{n}$
15
Population mean
$\mu = \frac{\Sigma x}{N}$
16
Resistant Measure
a measure that is not influenced by
extremely high or low data values
17
Which is less resistant?
• Mean
• Median
The mean is less
resistant. It can be made
arbitrarily large by increasing
the size of one value.
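A short Python illustration of this point, using the standard library's `statistics` module. The second list is a made-up variation on the earlier mean example in which the value 8 is replaced by 800.

```python
from statistics import mean, median

data = [6, 7, 2, 3, 4, 5, 2, 8]
with_outlier = [6, 7, 2, 3, 4, 5, 2, 800]  # hypothetical: 8 replaced by 800

# The mean jumps with the outlier; the median does not move.
print(mean(data), median(data))                  # 4.625 4.5
print(mean(with_outlier), median(with_outlier))  # 103.625 4.5
```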
18
Trimmed Mean
a measure of center that is more
resistant than the mean but is still
sensitive to specific data values
19
To calculate a (5 or 10%)
trimmed mean
• Order the data from smallest to largest.
• Delete the bottom 5 or 10% of the data.
• Delete the same percent from the top of
the data.
• Compute the mean of the remaining 90 or
80% of the data.
20
Compute a 10% trimmed mean:
15, 17, 18, 20, 20, 25, 30, 32, 36, 60
• Delete the top and bottom 10%
• New data list:
17, 18, 20, 20, 25, 30, 32, 36
• 10% trimmed mean = $\frac{\Sigma x}{n} = \frac{198}{8} \approx 24.8$
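The trimming steps can be sketched in Python as below. The function name `trimmed_mean` is illustrative. The sketch assumes the percent is a whole number and rounds the count of deleted values down when the percent of n is not a whole number; the slides do not say how to handle that case.

```python
def trimmed_mean(data, percent):
    """Mean after deleting `percent` percent of the values from each end."""
    ordered = sorted(data)
    # Number of values to delete from each end (assumption: round down).
    k = (len(ordered) * percent) // 100
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

values = [15, 17, 18, 20, 20, 25, 30, 32, 36, 60]
print(trimmed_mean(values, 10))  # 24.75
```

The result 24.75 rounds to the 24.8 shown in the example.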
21
Measures of Variation
• Range
• Standard Deviation
• Variance
22
The Range
the difference between the largest
and smallest values of a
distribution
23
Find the range:
10, 13, 17, 17, 18
The range = largest minus smallest
= 18 minus 10 = 8
24
The standard deviation
a measure of the average variation
of the data entries from the mean
25
Standard deviation of a sample
$s = \sqrt{\frac{\Sigma (x - \bar{x})^2}{n - 1}}$
$\bar{x}$ = mean of the sample
n = sample size
26
To calculate standard
deviation of a sample
• Calculate the mean of the sample.
• Find the difference between each entry (x) and the
mean. These differences will add up to zero.
• Square the deviations from the mean.
• Sum the squares of the deviations from the
mean.
• Divide the sum by (n - 1) to get the variance.
• Take the square root of the variance to get
the standard deviation.
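The six steps above translate almost line for line into Python. This is a sketch with illustrative function names; the data 30, 26, 22 comes from the worked example in these slides.

```python
import math

def sample_variance(data):
    n = len(data)
    x_bar = sum(data) / n                   # mean of the sample
    deviations = [x - x_bar for x in data]  # these add up to zero
    squared = [d ** 2 for d in deviations]  # squared deviations
    return sum(squared) / (n - 1)           # divide the sum by n - 1

def sample_std(data):
    # The standard deviation is the square root of the variance.
    return math.sqrt(sample_variance(data))

print(sample_variance([30, 26, 22]))  # 16.0
print(sample_std([30, 26, 22]))       # 4.0
```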
27
The Variance
the square of the standard
deviation
28
Variance of a Sample
$s^2 = \frac{\Sigma (x - \bar{x})^2}{n - 1}$
29
Find the standard deviation and
variance
x       $x - \bar{x}$    $(x - \bar{x})^2$
30       4               16
26       0                0
22      -4               16
Sum:
78       0               32
mean = 26
30
The variance
$s^2 = \frac{\Sigma (x - \bar{x})^2}{n - 1} = \frac{32}{2} = 16$
31
The standard deviation
$s = \sqrt{16} = 4$
32
Find the mean, the
standard deviation and
variance
mean = 5
x       $x - \bar{x}$    $(x - \bar{x})^2$
4       -1                1
5        0                0
5        0                0
7        2                4
4       -1                1
Sum:
25                        6
33
The mean, the standard
deviation and variance
Mean = 5
Variance = $\frac{6}{4} = 1.5$
Standard deviation = $\sqrt{1.5} \approx 1.22$
34
Computation formula for
sample standard
deviation:
$s = \sqrt{\frac{SS_x}{n - 1}}$
where
$SS_x = \Sigma x^2 - \frac{(\Sigma x)^2}{n}$
35
To find $\Sigma x^2$:
Square the x values, then add.
36
To find $(\Sigma x)^2$:
Sum the x values, then square.
37
Use the computing formulas to
find s and $s^2$
x       $x^2$
4       16
5       25
5       25
7       49
4       16
Sum:
25      131
n = 5
$(\Sigma x)^2 = 25^2 = 625$
$\Sigma x^2 = 131$
$SS_x = 131 - 625/5 = 6$
$s^2 = 6/(5 - 1) = 1.5$
$s = \sqrt{1.5} \approx 1.22$
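The computing formula can be checked with a short Python sketch. The function name and the tuple it returns are illustrative choices, not part of the text.

```python
import math

def sample_std_computational(data):
    """Return (SS_x, variance, standard deviation) via the computing formula."""
    n = len(data)
    sum_x = sum(data)                         # sum the x values
    sum_x_squared = sum(x * x for x in data)  # square the x values, then add
    ss_x = sum_x_squared - sum_x ** 2 / n     # SS_x
    variance = ss_x / (n - 1)
    return ss_x, variance, math.sqrt(variance)

ss_x, variance, s = sample_std_computational([4, 5, 5, 7, 4])
print(ss_x, variance, round(s, 2))  # 6.0 1.5 1.22
```

Both formulas give the same variance; the computing form avoids calculating each deviation from the mean.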
38
Population Mean and Standard
Deviation
population mean $= \mu = \frac{\Sigma x}{N}$
population standard deviation $= \sigma = \sqrt{\frac{\Sigma (x - \mu)^2}{N}}$
where N = number of data values in the population
39
COEFFICIENT OF
VARIATION:
a measurement of the relative
variability (or consistency) of data
$CV = \frac{s}{\bar{x}} \cdot 100$ or $CV = \frac{\sigma}{\mu} \cdot 100$
40
CV is used to
compare variability or
consistency
A sample of newborn infants had a mean weight of
6.2 pounds with a standard deviation of 1 pound.
A sample of three-month-old children had a mean
weight of 10.5 pounds with a standard deviation of
1.5 pounds.
Which (newborns or 3-month-olds) are more
variable in weight?
41
To compare variability,
compare Coefficient of Variation
For newborns: CV = (1 / 6.2) · 100 ≈ 16%
Higher CV: more variable
For 3-month-olds: CV = (1.5 / 10.5) · 100 ≈ 14%
Lower CV: more consistent
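The comparison can be reproduced with a few lines of Python. The function name `coefficient_of_variation` is an illustrative choice.

```python
def coefficient_of_variation(std_dev, mean):
    """CV as a percentage: standard deviation relative to the mean."""
    return std_dev / mean * 100

newborns = coefficient_of_variation(1.0, 6.2)
three_month_olds = coefficient_of_variation(1.5, 10.5)

print(round(newborns), round(three_month_olds))  # 16 14
print(newborns > three_month_olds)               # True: newborns vary more
```

The three-month-olds have the larger standard deviation in pounds, but relative to their mean weight they are the more consistent group.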
42
Use the Coefficient of Variation
to compare two groups of data:
• Which is more consistent?
• Which is more variable?
43
CHEBYSHEV'S THEOREM
For any set of data and for any number k
greater than one, the proportion of the
data that lies within k standard
deviations of the mean is at least:
$1 - \frac{1}{k^2}$
44
CHEBYSHEV'S THEOREM for k = 2
According to Chebyshev’s Theorem, at
least what fraction of the data falls
within “k” (k = 2) standard deviations of
the mean?
At least $1 - \frac{1}{2^2} = \frac{3}{4} = 75\%$
of the data falls within 2 standard deviations of
the mean.
45
CHEBYSHEV'S THEOREM for k = 3
According to Chebyshev’s Theorem, at
least what fraction of the data falls
within “k” (k = 3) standard deviations of
the mean?
At least $1 - \frac{1}{3^2} = \frac{8}{9} \approx 88.9\%$
of the data falls within 3 standard deviations of
the mean.
46
CHEBYSHEV'S THEOREM for k =4
According to Chebyshev’s Theorem, at
least what fraction of the data falls
within “k” (k = 4) standard deviations of
the mean?
At least $1 - \frac{1}{4^2} = \frac{15}{16} \approx 93.8\%$
of the data falls within 4 standard deviations of
the mean.
47
Using Chebyshev’s Theorem
A mathematics class completes an examination
and it is found that the class mean is 77 and the
standard deviation is 6.
According to Chebyshev's Theorem, between
what two values would at least 75% of the
grades fall?
48
Mean = 77
Standard deviation = 6
At least 75% of the grades would be in the
interval:
$\bar{x} - 2s$ to $\bar{x} + 2s$
77 – 2(6) to 77 + 2(6)
77 – 12 to 77 + 12
65 to 89
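Both the guaranteed proportion and the interval can be computed with a small Python sketch. The function names are hypothetical and chosen for illustration.

```python
def chebyshev_proportion(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def chebyshev_interval(mean, std_dev, k):
    """Interval from mean - k*s to mean + k*s."""
    return mean - k * std_dev, mean + k * std_dev

print(chebyshev_proportion(2))       # 0.75
print(chebyshev_interval(77, 6, 2))  # (65, 89)
```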
49
Mean and Standard Deviation of
Grouped Data
• Make a frequency table
• Compute the midpoint (x) for each class.
• Count the number of entries in each class
(f).
• Sum the f values to find n, the total
number of entries in the distribution.
• Treat each entry of a class as if it falls at
the class midpoint.
50
Sample Mean for a Frequency
Distribution
$\bar{x} = \frac{\Sigma x f}{n}$
x = class midpoint
51
Sample Standard Deviation for
a Frequency Distribution
$s = \sqrt{\frac{\Sigma (x - \bar{x})^2 f}{n - 1}}$
52
Computation Formula for
Standard Deviation for a
Frequency Distribution
$s = \sqrt{\frac{SS_x}{n - 1}}$
where $SS_x = \Sigma x^2 f - \frac{(\Sigma x f)^2}{n}$
53
Calculation of the mean of
grouped data
Ages:      f     x     xf
30 - 34    4     32    128
35 - 39    5     37    185
40 - 44    2     42    84
45 - 49    9     47    423
$\Sigma f = 20$, $\Sigma xf = 820$
54
Mean of Grouped Data
$\bar{x} = \frac{\Sigma x f}{n} = \frac{\Sigma x f}{\Sigma f} = \frac{820}{20} = 41.0$
55
Calculation of the standard
deviation of grouped data
Mean = 41
Ages:      f     x     x – mean    (x – mean)2    (x – mean)2 f
30 - 34    4     32    –9          81             324
35 - 39    5     37    –4          16             80
40 - 44    2     42    1           1              2
45 - 49    9     47    6           36             324
$\Sigma f = 20$, $\Sigma (x - \text{mean})^2 f = 730$
56
Calculation of the standard
deviation of grouped data
$\Sigma (x - \bar{x})^2 f = 730$
$\Sigma f = n = 20$
$s = \sqrt{\frac{\Sigma (x - \bar{x})^2 f}{n - 1}} = \sqrt{\frac{730}{20 - 1}} = \sqrt{38.42} \approx 6.20$
57
Computation Formula for
Standard Deviation for a
Frequency Distribution
$s = \sqrt{\frac{SS_x}{n - 1}}$
where $SS_x = \Sigma x^2 f - \frac{(\Sigma x f)^2}{n}$
58
Computation Formula for
Standard Deviation
x      f     xf     x2f
32     4     128    4096
37     5     185    6845
42     2     84     3528
47     9     423    19881
$\Sigma f = 20$, $\Sigma xf = 820$, $\Sigma x^2 f = 34350$
59
Computation Formula for
Standard Deviation for a
Frequency Distribution
where
$SS_x = \Sigma x^2 f - \frac{(\Sigma x f)^2}{n} = 34350 - \frac{820^2}{20} = 730$
$s = \sqrt{\frac{SS_x}{n - 1}} = \sqrt{\frac{730}{20 - 1}} \approx 6.20$
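The grouped-data calculation can be sketched in Python using the class midpoints and frequencies from the ages example. The function name `grouped_mean_and_std` is illustrative; it treats every entry in a class as if it fell at the class midpoint, as the slides describe.

```python
import math

def grouped_mean_and_std(midpoints, frequencies):
    """Sample mean and standard deviation from a frequency table."""
    n = sum(frequencies)
    sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))
    sum_x2f = sum(x * x * f for x, f in zip(midpoints, frequencies))
    mean = sum_xf / n
    ss_x = sum_x2f - sum_xf ** 2 / n  # computation formula for SS_x
    return mean, math.sqrt(ss_x / (n - 1))

mean, s = grouped_mean_and_std([32, 37, 42, 47], [4, 5, 2, 9])
print(mean, round(s, 2))  # 41.0 6.2
```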
60
Weighted Average
Average calculated where some of
the numbers are assigned more
importance or weight
61
Weighted Average
$\text{Weighted Average} = \frac{\Sigma x w}{\Sigma w}$
where w = the weight of the data value x.
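A minimal Python sketch of this formula, using the grades and weights from the course-grade example in these slides. The function name `weighted_average` is an illustrative choice.

```python
def weighted_average(values, weights):
    """Sum of x*w divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

# Midterm 92 (25%), term paper 80 (25%), final exam 88 (50%)
grade = weighted_average([92, 80, 88], [0.25, 0.25, 0.50])
print(grade)  # 87.0
```

Dividing by the sum of the weights means the weights do not have to add up to 1.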
62
Compute the Weighted Average:
• Midterm weight = 25%
• Term paper weight = 25%
• Final exam weight = 50%
63
Compute the Weighted Average:
                x      w       xw
• Midterm       92     .25     23
• Term Paper    80     .25     20
• Final exam    88     .50     44
Sum:                   1.00    87
$\text{Weighted Average} = \frac{\Sigma x w}{\Sigma w} = \frac{87}{1.00} = 87$
64
Percentiles
For any whole number P (between 1 and
99), the Pth percentile of a distribution is
a value such that P% of the data fall at or
below it.
The percent falling above the Pth percentile
will be (100 – P)%.
65
Percentiles
[Figure: data ordered from lowest value to highest value, with P40 marked; 40% of the data fall at or below P40 and 60% fall above it.]
66
Quartiles
• Percentiles that divide the data into
fourths
• Q1 = 25th percentile
• Q2 = the median
• Q3 = 75th percentile
67
Quartiles
[Figure: data ordered from lowest value to highest value, divided into fourths by Q1, Median = Q2, and Q3.]
Interquartile range = IQR = Q3 – Q1
68
Computing Quartiles
• Order the data from smallest to largest.
• Find the median, the second quartile.
• Find the median of the data falling below
Q2. This is the first quartile.
• Find the median of the data falling above
Q2. This is the third quartile.
69
Find the quartiles:
12  15  16  16  17  18  22  22
23  24  25  30  32  33  33  34
41  45  51
The data has been ordered.
The median is 24.
70
Find the quartiles:
12  15  16  16  17  18  22  22
23  24  25  30  32  33  33  34
41  45  51
For the data below the median, the median is 17.
17 is the first quartile.
72
Find the quartiles:
12  15  16  16  17  18  22  22
23  24  25  30  32  33  33  34
41  45  51
For the data above the median, the median is 33.
33 is the third quartile.
73
Find the interquartile range:
12  15  16  16  17  18  22  22
23  24  25  30  32  33  33  34
41  45  51
IQR = Q3 – Q1 = 33 – 17 = 16
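The median-of-halves procedure from these slides can be sketched in Python as follows. The function names are illustrative. When the number of values is odd, the sketch leaves the median itself out of both halves, which matches "the data falling below Q2" and "above Q2"; statistical software often uses other quartile conventions and may give slightly different values.

```python
def median_of_sorted(ordered):
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so it is in neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median_of_sorted(lower), median_of_sorted(ordered), median_of_sorted(upper)

data = [12, 15, 16, 16, 17, 18, 22, 22, 23, 24,
        25, 30, 32, 33, 33, 34, 41, 45, 51]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 17 24 33
print(q3 - q1)     # 16
```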
74
Five-Number Summary of Data
• Lowest value
• First quartile
• Median
• Third quartile
• Highest value
75
Box-and-Whisker Plot
a graphical presentation of the five-number summary of data
76
Making a
Box-and-Whisker Plot
• Draw a vertical scale including the lowest
and highest values.
• To the right of the scale, draw a box from
Q1 to Q3.
• Draw a solid line through the box at the
median.
• Draw lines (whiskers) from Q1 to the
lowest and from Q3 to the highest values.
77
Construct a
Box-and-Whisker Plot:
12  15  16  16  17  18  22  22
23  24  25  30  32  33  33  34
41  45  51
Lowest = 12
Q1 = 17
median = 24
Q3 = 33
Highest = 51