Download File - phs ap statistics

Warm – Up
Find the Mean, Median, Mode, IQR, and Standard
Deviation of the ages of the 5 people in a room.
Ages: 16, 18, 17, 16, 19.
Calculate all values a second time. Describe what
happens to these values if someone’s 99-year-old
Grandma walks into the room.
                 5 people    With Grandma (99)
Mean             17.2        30.833
Mode             16          16
Median           17          17.5
Standard Dev.    1.304       33.415
IQR              2.5         3
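The warm-up values can be checked with a short Python sketch. It assumes the quartiles are found by the median-of-halves rule (the middle value is left out of both halves when n is odd), which reproduces the IQR values above; other quartile conventions can give different answers. The helper names `iqr` and `summarize` are made up for this illustration.

```python
from statistics import mean, median, mode, stdev

def iqr(values):
    """IQR by the median-of-halves rule (assumed convention)."""
    data = sorted(values)
    half = len(data) // 2          # size of each half; middle value skipped if n is odd
    lower = data[:half]
    upper = data[-half:]
    return median(upper) - median(lower)

def summarize(values):
    """Collect the five warm-up statistics, rounded to 3 decimals where needed."""
    return {
        "mean": round(mean(values), 3),
        "median": median(values),
        "mode": mode(values),
        "stdev": round(stdev(values), 3),   # sample standard deviation (divides by n - 1)
        "iqr": iqr(values),
    }

ages = [16, 18, 17, 16, 19]
before = summarize(ages)
after = summarize(ages + [99])      # Grandma walks in

for name in before:
    print(f"{name:>6}: {before[name]:>7} -> {after[name]:>7}")
```

The mean and standard deviation jump sharply once 99 is added, while the median, mode, and IQR barely move.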
CHAPTER 5 (continued)
The Mean and the Std. Dev. are considered
NONRESISTANT because they’re very
sensitive and influenced by extreme outliers.
The Median, IQR, and Mode are considered
RESISTANT or ROBUST, since outliers do
not ‘greatly’ (if at all) affect their value.
Unbiased – Statistics are unbiased when the
center of their sampling distribution is approximately
equal to the true population average.
Biased –
x̄ ≠ True population mean.
Unbiased with a small spread is best.
You can generalize only if the data were
RANDOMLY collected from the entire population.
The Mean in relation to the Median
• If the Mean is (roughly) equal to the Median then
the distribution is approximately symmetric.
• If the Mean is greater than the Median then the
distribution is skewed right.
• If the Mean is less than the Median then the
distribution is skewed left.
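A rough Python sketch of this rule of thumb follows. The function name `describe_shape` and the `tolerance` cutoff for "roughly equal" are assumptions made for illustration only; the slides do not define how close the Mean and Median must be. The last two data sets are invented toy values.

```python
from statistics import mean, median

def describe_shape(values, tolerance=0.05):
    """Guess the shape from mean vs. median.

    `tolerance` is a hypothetical cutoff: the mean and median count as
    roughly equal when they differ by at most 5% of the range.
    """
    m, med = mean(values), median(values)
    spread = max(values) - min(values)
    if spread == 0 or abs(m - med) <= tolerance * spread:
        return "approximately symmetric"
    return "skewed right" if m > med else "skewed left"

print(describe_shape([16, 18, 17, 16, 19, 99]))   # warm-up ages with Grandma
print(describe_shape([1, 2, 3, 4, 5]))            # toy data
print(describe_shape([1, 18, 19, 19, 20]))        # toy data
```

This comparison is only a quick guide; a graph of the data is still the better way to judge shape.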
The Five Number Summary:
Minimum, Q1, Median, Q3, Maximum
The Five Number Summary can be displayed in a
BOX PLOT (A Box and Whisker Plot)
[Box plot diagram labeled Min., Q1, Med., Q3, Max.]
Calculate the Mean, Standard Deviation, and 5-Number
Summary. Then construct a Box Plot
with the following data:
Babe Ruth’s # of Home Runs with
the New York Yankees 1920-1934
54 59 35 41 46 25 47 60
54 46 49 46 41 34 22
Mean = 43.933
Standard Dev. = 11.247
Min = 22, Q1 = 35
Median = 46
Q3 = 54, Max = 60
[Box plot of # of Ruth’s Home Runs: Min. = 22, Q1 = 35, Med. = 46, Q3 = 54, Max. = 60]
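The Ruth results can be reproduced with the sketch below. It assumes quartiles are the medians of the lower and upper halves of the sorted data (middle value excluded when n is odd), which matches Q1 = 35 and Q3 = 54 above. The function name `five_number_summary` is made up for this example.

```python
from statistics import mean, median, stdev

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2              # middle value is skipped when n is odd
    q1 = median(data[:half])
    q3 = median(data[-half:])
    return data[0], q1, median(data), q3, data[-1]

home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

summary = five_number_summary(home_runs)
ruth_mean = round(mean(home_runs), 3)
ruth_sd = round(stdev(home_runs), 3)   # sample standard deviation

print("Mean:", ruth_mean)
print("Standard Dev.:", ruth_sd)
print("Min, Q1, Median, Q3, Max:", summary)
```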
What is an OUTLIER ?
Formula to Determine Outliers:
An Observation, xi , is an Outlier if:
xi > Q3 + 1.5 · (IQR) or
xi < Q1 – 1.5 · (IQR)
Determine if any Outliers exist for
# H. Runs: 54 59 35 41 46 25 47 60 54 46 49 46 41 34 22
Q1 = 35
Q3 = 54
IQR = 54 – 35 = 19
xi > Q3 + 1.5 · (IQR) = 54 + 1.5(19) = 82.5
xi < Q1 – 1.5 · (IQR) = 35 – 1.5(19) = 6.5
No value is above 82.5 or below 6.5, so there are no Outliers.
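The same outlier check can be written as a small Python sketch. The name `outlier_fences` is hypothetical, and the quartiles are again assumed to come from the median-of-halves rule used for Q1 = 35 and Q3 = 54.

```python
from statistics import median

def outlier_fences(values, k=1.5):
    """Return (lower fence, upper fence) = (Q1 - k*IQR, Q3 + k*IQR)."""
    data = sorted(values)
    half = len(data) // 2              # middle value is skipped when n is odd
    q1 = median(data[:half])
    q3 = median(data[-half:])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

low, high = outlier_fences(home_runs)
outliers = [x for x in home_runs if x < low or x > high]

print("Lower fence:", low)
print("Upper fence:", high)
print("Outliers:", outliers)
```

An empty list means every season falls between the two fences.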
HW: Page 92:14, 16, 22
HW: Page 92:14-24 even
March/April
February
July
The Med. for June is higher than for Jan., and June’s temperatures are more consistent.
Summer months have more consistent temperatures.
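[Figure: three distributions, each with points A, B, and C marked]
First distribution: A = Mode, B = Median, C = Mean
Second distribution: A = Mode, A = Median, A = Mean
Third distribution: C = Mode, B = Median, A = Mean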
The Modified Box Plot shows Outliers as separate
points beyond the whiskers of the Box Plot.
Example of Box Plots for Multiple Data Sets
[Figure: side-by-side box plots; axis: Heights of Plant (mm)]
Example: 25 28 30 31 31 36
a.) Find the Mean, Median, and Mode.
b.) Add the value ‘90’ as the 7th value to the
data set. Is 90 considered an outlier?
Now find the Mean, Median, and
Mode.
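One way to check answers to this example is the Python sketch below. It assumes the 1.5 · (IQR) rule from earlier, with quartiles taken as medians of the lower and upper halves of the 7-value data set; the helper name `is_outlier` is made up for this illustration.

```python
from statistics import mean, median, mode

def is_outlier(x, values, k=1.5):
    """True if x lies beyond Q1 - k*IQR or Q3 + k*IQR for `values`."""
    data = sorted(values)
    half = len(data) // 2              # middle value is skipped when n is odd
    q1 = median(data[:half])
    q3 = median(data[-half:])
    iqr = q3 - q1
    return x < q1 - k * iqr or x > q3 + k * iqr

heights = [25, 28, 30, 31, 31, 36]
with_90 = heights + [90]

for label, data in (("original", heights), ("with 90", with_90)):
    print(label, "mean:", round(mean(data), 3),
          "median:", median(data), "mode:", mode(data))

print("90 is an outlier:", is_outlier(90, with_90))
```

Comparing the two printed rows shows which measures are resistant to the added value.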