(3.1, 3.2, 3.4) Measures of Location
Measures of Location:
Mean: average value
Sample mean: denoted by $\bar{x}$
$\bar{x} = \frac{\sum x_i}{n}$
Population mean: denoted by $\mu$
$\mu = \frac{\sum x_i}{N}$
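As a small illustration of the sample mean formula, the sketch below adds up the observations and divides by their count. It borrows the first practice set from the median exercise in these notes; the variable names are only illustrative.

```python
# Sample mean: add up the observations and divide by how many there are.
sample = [5, 6, 8, 10, 13, 15, 15]   # first practice set from these notes

n = len(sample)            # number of observations
x_bar = sum(sample) / n    # sum of the x_i divided by n

print(f"n = {n}, sum = {sum(sample)}, mean = {x_bar:.2f}")
# prints: n = 7, sum = 72, mean = 10.29
```

The population mean is computed the same way, except that the list would hold every value in the population and the count would be N.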
Median: value in the middle when the data are arranged in ascending order (smallest to
largest).
With an odd number of observations, the median is the middle value. With an even
number of observations there is no single middle value, so the median is the average
of the two middle values.
Practice:
Find the median for the set of numbers below
5, 6, 8, 10, 13, 15, 15
Median = _____
Now find the median for this set of numbers
1, 3, 4, 5, 7, 8, 9, 12
Median = _______
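One way to check your answers is a short function that follows the odd/even rule above. This is a minimal sketch, not a library routine; the name `median` and the variable names are chosen here for illustration.

```python
def median(values):
    """Median by the rule in these notes: the middle value,
    or the average of the two middle values."""
    ordered = sorted(values)              # arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:                        # odd count: one middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


first_set = [5, 6, 8, 10, 13, 15, 15]     # 7 observations (odd)
second_set = [1, 3, 4, 5, 7, 8, 9, 12]    # 8 observations (even)

print("First set median: ", median(first_set))
print("Second set median:", median(second_set))
```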
Mode: the value that occurs with greatest frequency (may be more than one mode in a set
of data).
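Because a data set can have more than one mode, a helper that returns every value tied for the highest frequency may be useful. The sketch below counts occurrences with `collections.Counter`; the function name `modes` and the second list are made up for illustration.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency, in ascending order."""
    counts = Counter(values)              # value -> number of occurrences
    if not counts:
        return []
    top = max(counts.values())            # the greatest frequency
    return sorted(v for v, c in counts.items() if c == top)


print(modes([5, 6, 8, 10, 13, 15, 15]))   # first practice set: [15]
print(modes([1, 1, 2, 2, 3]))             # made-up list with two modes: [1, 2]
```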
Percentile: the pth percentile is the value such that at least p percent of the observations
are less than or equal to this value and at least (100-p) percent of the observations
are greater than or equal to this value.
For example: If you scored in the 90th percentile on the verbal part of your SAT,
about 90% of all verbal scores on the SAT at that time were at or below yours.
To calculate the pth percentile:
1. Arrange the data in ascending order.
2. Compute an index i = (p/100)*n.
3. If i is not an integer, round up: the next integer greater than i denotes the
position of the pth percentile.
4. If i is an integer, the pth percentile is the average of the values in positions i
and i+1.
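The percentile steps can be sketched as a short function. This is one possible reading of the rule, assuming p is a whole number strictly between 0 and 100 and that positions are counted from 1; the function name `percentile` is chosen here and is not a library call. Integer arithmetic is used so that the "is i an integer?" test is not affected by floating-point rounding.

```python
def percentile(values, p):
    """pth percentile by the index rule i = (p/100)*n, positions counted from 1.

    Assumes p is a whole number strictly between 0 and 100.
    """
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)                 # step 1: ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("percentile of an empty data set is undefined")
    whole, remainder = divmod(p * n, 100)    # step 2: i = whole + remainder/100
    if remainder:                            # step 3: i is not an integer, round up
        position = whole + 1
        return ordered[position - 1]         # 1-based position -> 0-based index
    # step 4: i is an integer, average the values in positions i and i+1
    return (ordered[whole - 1] + ordered[whole]) / 2


data = [1, 3, 4, 5, 7, 8, 9, 12]             # second practice set, n = 8
print(percentile(data, 50))   # i = 4, so average positions 4 and 5
print(percentile(data, 85))   # i = 6.8, so round up to position 7
```

Note that statistical software often uses other interpolation rules, so its percentiles may differ slightly from this hand method.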
Practice Finding Percentile:
48 49 51 52 54 57 57 58 59 63
64 65 66 66 67 70 70 71 72 73
74 75 75 76 76 77 78 79 82 83
84 84 85 87 87 88 89 91 91 92
93 94 95 97 98 99 99 100 101 106
There are 50 elements in this data set. The median is:
To find the 85th percentile:
Quartiles: used to divide the data into 4 parts.
Q1: first quartile, _____ percentile
Q2: second quartile, _____ percentile, median
Q3: third quartile, _____ percentile
5 number summary:
1. Smallest value
2. First quartile (Q1)
3. Median (Q2)
4. Third quartile (Q3)
5. Largest value
So the 5 number summary for this data set is:
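To check your work on the 50-value data set, the sketch below builds the five-number summary, taking the quartiles as the 25th, 50th and 75th percentiles under the index rule i = (p/100)*n. The helper names `percentile` and `five_number_summary` are illustrative, and p is assumed to be a whole number strictly between 0 and 100.

```python
def percentile(values, p):
    """pth percentile by the index rule i = (p/100)*n (whole-number p, 0 < p < 100)."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder:                          # i not an integer: round up
        return ordered[whole]              # position whole+1, counted from 1
    return (ordered[whole - 1] + ordered[whole]) / 2   # average positions i and i+1


def five_number_summary(values):
    """Smallest value, Q1, median (Q2), Q3, largest value."""
    ordered = sorted(values)
    return (
        ordered[0],
        percentile(ordered, 25),
        percentile(ordered, 50),
        percentile(ordered, 75),
        ordered[-1],
    )


scores = [
    48, 49, 51, 52, 54, 57, 57, 58, 59, 63,
    64, 65, 66, 66, 67, 70, 70, 71, 72, 73,
    74, 75, 75, 76, 76, 77, 78, 79, 82, 83,
    84, 84, 85, 87, 87, 88, 89, 91, 91, 92,
    93, 94, 95, 97, 98, 99, 99, 100, 101, 106,
]

print("Five-number summary:", five_number_summary(scores))
print("85th percentile:", percentile(scores, 85))
```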
Measures of Variability
Range: largest value – smallest value
In this example:
Inter-quartile range (IQR): Q3 – Q1, the range of the middle 50% of the data.
IQR:
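Outliers: unusually extreme values in the data, either unusually large or unusually
small. Using the 1.5*IQR rule, a value is flagged as an outlier if it lies below
Q1 - 1.5*IQR or above Q3 + 1.5*IQR.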
Box Plots: a graphical summary of data that is based on a five-number summary.
[Figure: number-line axis from 40 to 110 for the box plot]
IQ scores example
91, 101, 106, 107, 110, 112, 114, 115, 132, 147
5 Number Summary:
IQR:
Outliers:
[Figure: number-line axis from 90 to 150 for the box plot]
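The IQ example can be checked with a short sketch. It assumes the usual reading of the 1.5*IQR rule (a value is an outlier if it lies below Q1 - 1.5*IQR or above Q3 + 1.5*IQR) and takes the quartiles as inputs. With n = 10, the index rule gives i = 2.5 for Q1 (round up to the 3rd value, 106) and i = 7.5 for Q3 (round up to the 8th value, 115). The function name `iqr_outliers` is illustrative.

```python
def iqr_outliers(values, q1, q3):
    """Return (iqr, lower_fence, upper_fence, outliers) under the 1.5*IQR rule."""
    iqr = q3 - q1                         # range of the middle 50% of the data
    lower_fence = q1 - 1.5 * iqr          # anything below this is flagged
    upper_fence = q3 + 1.5 * iqr          # anything above this is flagged
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return iqr, lower_fence, upper_fence, outliers


iq_scores = [91, 101, 106, 107, 110, 112, 114, 115, 132, 147]

data_range = max(iq_scores) - min(iq_scores)      # largest value - smallest value
iqr, low, high, flagged = iqr_outliers(iq_scores, q1=106, q3=115)

print("Range:", data_range)
print("IQR:", iqr)
print("Fences:", low, "to", high)
print("Outliers:", flagged)
```

In a box plot drawn with this rule, the whiskers would stop at the most extreme values still inside the fences, and the flagged values would be plotted as separate points.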
Shape of Distribution
Skewness
[Figure: three bar charts, each titled "Binomial Distribution", showing P(X=x) against Number of Successes in 10 trials, captioned as follows]
Symmetric: the mean and the median are the same.
Skewed to the right: the data is PULLED farther to the right.
Skewed to the left: the data is PULLED farther to the left.
Which is greater, mean or the median in our IQ example?
The median is largely unaffected by skewness or outliers, whereas the mean IS. The mean is pulled
in the direction of the outlier or "tail" (skew).
[Figure: two bar charts titled "Binomial Distribution", showing P(X=x) against Number of Successes in 10 trials. Below the first chart the labels read "Median Mean" from left to right; below the second they read "Mean Median".]
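To see the pull of extreme values in numbers, the sketch below compares the mean and median of the IQ scores using Python's standard `statistics` module, then repeats the comparison after a hypothetical change that makes the largest score far more extreme. The value 247 is invented purely for illustration and is not part of the notes' data.

```python
import statistics

iq_scores = [91, 101, 106, 107, 110, 112, 114, 115, 132, 147]

print("Mean:  ", statistics.mean(iq_scores))
print("Median:", statistics.median(iq_scores))

# Hypothetical change: replace the largest score (147) with a far larger one.
stretched = iq_scores[:-1] + [247]

print("Mean after change:  ", statistics.mean(stretched))
print("Median after change:", statistics.median(stretched))
```

Only the mean moves when the largest value is stretched; the two middle values, and so the median, stay where they were.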