Download mean

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Descriptive Statistics: Numerical Measures
Location and Variability
Chapter 3
BA 201
Slide 1
Sample Statistics, Population Parameters,
and Point Estimators
If the measures are computed
for data from a sample,
they are called sample statistics.
If the measures are computed
for data from a population,
they are called population parameters.
A sample statistic is referred to
as the point estimator of the
corresponding population parameter.
Slide 2
LOCATION
Slide 3
Measures of Location

Mean

Median
Mode



Percentiles
Quartiles
Slide 4
Mean

The mean provides a measure of central location.

The mean of a data set is the average of all the data
values.
The sample mean x is the point estimator of the
population mean m.

Slide 5
Mean
x
x
n
Sample
i
m
x
i
N
Population
Slide 6
Sample Mean
Apartment Rents
445
440
465
450
600
570
510
615
440
450
470
485
515
575
430
440
525
490
580
450
490
590
525
450
472
470
445
435
x

x
n
435
425
450
475
490
525
600
i

600
445
460
475
500
535
435
460
575
435
500
549
475
445
600
445
460
480
500
550
435
440
450
465
570
500
480
430
615
450
480
465
480
510
440
34, 356
 490.80
70
Slide 7
Trimmed Mean
 Another measure, sometimes used when extreme
values are present, is the trimmed mean.
 It is obtained by deleting a percentage of the
smallest and largest values from a data set and then
computing the mean of the remaining values.
 For example, the 5% trimmed mean is obtained by
removing the smallest 5% and the largest 5% of the
data values and then computing the mean of the
remaining values.
Slide 8
Median
 The median of a data set is the value in the middle
when the data items are arranged in ascending order.
 Whenever a data set has extreme values, the median
is the preferred measure of central location.
Slide 9
Median
 For an odd number of observations:
26 18 27 12 14 27 19
12
14
18
19
26
27 27
7 observations
in ascending order
the median is the middle value.
Median = 19
Slide 10
Median
 For an even number of observations:
26 18 27 12 14 27 30 19
12
14
18
19
26
27 27
30
8 observations
in ascending order
the median is the average of the middle two values.
Median = (19 + 26)/2 = 22.5
Slide 11
Median
 Example: Apartment Rents
425
440
450
465
480
510
575
430
440
450
470
485
515
575
430
440
450
470
490
525
580
435
445
450
472
490
525
590
435
445
450
475
490
525
600
435
445
460
475
500
535
600
435
445
460
475
500
549
600
435
445
460
480
500
550
600
440
450
465
480
500
570
615
440
450
465
480
510
570
615
Averaging the 35th and 36th data values:
Median = (475 + 475)/2 = 475
Slide 12
Mode
 The mode of a data set is the value that occurs with
greatest frequency.
 The greatest frequency can occur at two or more
different values.
 If the data have exactly two modes, the data are
bimodal.
 If the data have more than two modes, the data are
multimodal.
Slide 13
Mode
Apartment Rents
425
440
450
465
480
510
575
430
440
450
470
485
515
575
430
440
450
470
490
525
580
435
445
450
472
490
525
590
435
445
450
475
490
525
600
435
445
460
475
500
535
600
435
445
460
475
500
549
600
435
445
460
480
500
550
600
440
450
465
480
500
570
615
440
450
465
480
510
570
615
450 occurred most frequently (7 times)
Mode = 450
Slide 14
PRACTICE
Slide 15
Practice #1
1479
983
1133
1286
1409
1259
1579
1113
1265
1354
1255
374
1124
1265
• Compute the …
 Mean
 Median
 Mode
Slide 16
Practice #1 - Median
1479
983
1133
1286
1409
1259
1579
1113
1265
1354
1255
374
1124
1265
374
983
1113
1124
1133
1255
1259
1265
1265
1286
1354
1409
1479
1579
Slide 18
Practice #1 - Mode
374
983
1113
1124
1133
1255
1259
1265
1265
1286
1354
1409
1479
1579
Slide 19
Percentiles
 A percentile provides information about how the
data are spread over the interval from the smallest
value to the largest value.

The pth percentile of a data set is a value such that at
least p percent of the items take on this value or less
and at least (100 - p) percent of the items take on this
value or more.
Slide 20
Percentiles
Arrange the data in ascending order.
Compute index i, the position of the pth percentile.
i = (p/100)n
If i is not an integer, round up. The pth percentile
is the value in the ith position.
If i is an integer, the pth percentile is the average
of the values in positions i and i+1.
Slide 21
80th Percentile
Apartment Rents
425
440
450
465
480
510
575
430
440
450
470
485
515
575
430
440
450
470
490
525
580
435
445
450
472
490
525
590
435
445
450
475
490
525
600
435
445
460
475
500
535
600
435
445
460
475
500
549
600
435
445
460
480
500
550
600
440
450
465
480
500
570
615
440
450
465
480
510
570
615
i = (p/100)n = (80/100)70 = 56
Averaging the 56th and 57th data values:
80th Percentile = (535 + 549)/2 = 542
Slide 22
Quartiles
 Quartiles are specific percentiles.
 First Quartile = 25th Percentile
 Second Quartile = 50th Percentile = Median
 Third Quartile = 75th Percentile
Slide 23
Third Quartile
Apartment Rents
425
440
450
465
480
510
575
430
440
450
470
485
515
575
430
440
450
470
490
525
580
435
445
450
472
490
525
590
435
445
450
475
490
525
600
435
445
460
475
500
535
600
435
445
460
475
500
549
600
435
445
460
480
500
550
600
440
450
465
480
500
570
615
440
450
465
480
510
570
615
Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5 = 53
Third quartile = 525
Slide 24
PRACTICE
Slide 25
Practice #2 - Percentiles
374
983
1113
1124
1133
1255
1259
1265
1265
1286
1354
1409
1479
1579
80th Percentile
Slide 26
VARIABILITY
Slide 27
Measures of Variability
 Range
 Interquartile Range
 Variance
 Standard Deviation
 Coefficient of Variation
Slide 28
Range
 The range of a data set is the difference between the
largest and smallest data values.
Slide 29
Range
Apartment Rents
425
440
450
465
480
510
575
430
440
450
470
485
515
575
430
440
450
470
490
525
580
435
445
450
472
490
525
590
435
445
450
475
490
525
600
435
445
460
475
500
535
600
435
445
460
475
500
549
600
435
445
460
480
500
550
600
440
450
465
480
500
570
615
440
450
465
480
510
570
615
Range = largest value - smallest value
Range = 615 - 425 = 190
Slide 30
Interquartile Range
 The interquartile range of a data set is the difference
between the third quartile and the first quartile.
 It is the range for the middle 50% of the data.
 It overcomes the sensitivity to extreme data values.
Slide 31
Interquartile Range
Apartment Rents
425
440
450
465
480
510
575
430
440
450
470
485
515
575
430
440
450
470
490
525
580
435
445
450
472
490
525
590
435
445
450
475
490
525
600
435
445
460
475
500
535
600
435
445
460
475
500
549
600
435
445
460
480
500
550
600
440
450
465
480
500
570
615
440
450
465
480
510
570
615
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80
Slide 32
PRACTICE
RANGE
Slide 33
Practice #3 - Range
Range
3
11
16
23
18
7
Slide 34
VARIANCE
Slide 35
Variance
The variance is a measure of variability that utilizes
all the data.
It is based on the difference between the value of
each observation (xi) and the mean ( x for a sample,
m for a population).
The variance is useful in comparing the variability
of two or more variables.
Slide 36
Variance
The variance is the average of the squared
differences between each data value and the mean.
The variance is computed as follows:
2  ( xi  x )
s 
n 1
for a
sample
2
 ( xi  m ) 2
 
N
2
for a
population
Slide 37
Sample Variance
Apartment Rents
• Variance
2
(
x

x
)
 i
s2 

n1
2, 996.16
Slide 38
Detailed Example - Variance
xi
xi  x
( xi  x )
1
3
2
1
3
2a
1-2 b= -1
3-2c = 1
2-2 = 0
1-2 = -1
3-2 = 1
-12d= 1
12 e= 1
02 = 0
-12 = 1
12 = 1
2
s2 
2
(
x

x
)
 i
n 1
s2 = 4/(5-1)
g =1
4f
Slide 39
Standard Deviation
The standard deviation of a data set is the positive
square root of the variance.
It is measured in the same units as the data, making
it more easily interpreted than the variance.
Slide 40
Standard Deviation
The standard deviation is computed as follows:
s  s2
  2
for a
sample
for a
population
Slide 41
Standard Deviation
Apartment Rents
s2 
2
(
x

x
)
 i
n1

2, 996.16
• Standard Deviation
s  s 2  2996.16 
54.74
Slide 42
Detailed Example – Standard Deviation
s 
2
2
(
x

x
)
 i
n 1
s2 = 4/(5-1) = 1
s s
2
a 1
s 1
Slide 43
PRACTICE
VARIANCE AND STANDARD
DEVIATION
Slide 44
Practice #4 – Variance
xi
3
7
xi  x ( xi  x )
2
s2 
2
(
x

x
)
 i
n 1
11
16
18
23
Slide 45
Practice #4 – Standard Deviation
s 
2
2
(
x

x
)
 i
n 1
s s
2
Slide 46
Coefficient of Variation
The coefficient of variation indicates how large the
standard deviation is in relation to the mean.
The coefficient of variation is computed as follows:
s

 100  %
x



 100  %
m

for a
sample
for a
population
Slide 47
Sample Variance, Standard Deviation,
And Coefficient of Variation
Apartment Rents
• Variance
s2 
2
(
x

x
)
 i
n1

2, 996.16
• Standard Deviation
s  s  2996.16  54.74
2
• Coefficient of Variation
the standard
deviation is
about 11%
of the mean
 s

 54.74


100
%


100



 %  11.15%
x

 490.80

Slide 48
PRACTICE
COEFFICIENT OF
VARIATION
Slide 49
Practice #5 – Coefficient of Variation
x
s=
s

*
100

%
x

Slide 50
Slide 51
Related documents