Lecture 8
6A, 6B and 6C
Andrew L. Zaeske
Marian University
2/25/15
Outline: Review; 6A: Characterizing Data; 6B: Measures of Variation; 6C: The Normal Distribution
Test Review
Distribution and Measures of Central Tendency
Distribution of data: the way the values are spread out
Question: Where is the data centered?
Mean (average): sum of all values divided by the number of values
Median: middle value of the ordered data (the average of the two middle values if the count is even)
Mode: most frequent value
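The three measures of center can be computed with Python's standard `statistics` module. This is a minimal sketch, using the integers from Example #1 of this lecture; the variable names are illustrative only.

```python
from statistics import mean, median, multimode

# Integers taken from Example #1 of this lecture.
values = [10, 2, 2, 8, 2, 10, 3, 1, 8, 5]

center = {
    "mean": mean(values),        # sum of all values / number of values
    "median": median(values),    # middle of the sorted data (averages the two middle values for an even count)
    "modes": multimode(values),  # list of the most frequent value(s); a tie returns several
}
print(center)
```

`multimode` returns a list because a data set can have more than one most frequent value.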
Shapes of Distributions
Outlier: a value (or values) that is far away from the others
Outliers can greatly change the mean and the standard deviation, but have little effect on the median
Symmetry: the two sides are mirror images around the mean/median/mode
Skewness: values clustered to one side (left skew: spread out on the left side, right skew: spread out on the right side)
Variation: how widely the values are spread apart
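To see how differently the mean and the median react to an outlier, the sketch below adds one extreme value to the integers from Example #1. The outlier value 100 is made up for illustration and is not part of the lecture data.

```python
from statistics import mean, median

values = [10, 2, 2, 8, 2, 10, 3, 1, 8, 5]  # integers from Example #1
with_outlier = values + [100]               # 100 is a hypothetical outlier

# How far each measure of center moves once the outlier is included.
mean_shift = mean(with_outlier) - mean(values)
median_shift = median(with_outlier) - median(values)

print(f"mean moves by {mean_shift:.2f}, median moves by {median_shift:.2f}")
```

The mean is pulled strongly toward the outlier, while the median only moves to a neighboring value in the sorted data.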
Simplest Measures: Range and 5 Number Summary
Range = maximum value - minimum value
5 Number Summary: Minimum, Lower Quartile (Q1), Median, Upper Quartile (Q3), Maximum
Lower Quartile (Q1) is the median of the lower half of the ordered data (the values from the Minimum up to the Median)
Upper Quartile (Q3) is the median of the upper half of the ordered data (the values from the Median up to the Maximum)
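The range and the 5 number summary can be sketched in a few lines of Python. The function name `five_number_summary` is hypothetical, and the code assumes one particular quartile convention: when the count is odd, the median is counted in both halves (Tukey's hinges). That convention reproduces the summaries given in the examples of this lecture, but other textbooks and software may use slightly different rules.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a non-empty list.

    Assumption: for an odd count, the median belongs to both halves
    (Tukey's hinges). Other quartile conventions can give different Q1/Q3.
    """
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2                      # size of each half; middle value is shared if n is odd
    lower, upper = data[:half], data[n - half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

values = [10, 2, 2, 8, 2, 10, 3, 1, 8, 5]   # integers from Example #1
summary = five_number_summary(values)
value_range = summary[-1] - summary[0]       # range = maximum value - minimum value
print(summary, value_range)
```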
More complicated: Standard Deviation
Deviation from mean = value − mean
Squared deviation = (value − mean)^2
Standard deviation = square root of [sum of squared deviations from the mean ÷ (number of values − 1)]
$$\text{standard deviation} = \sqrt{\frac{\text{sum of } (\text{value} - \text{mean})^2}{\text{count of values} - 1}} \qquad (1)$$
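The calculation can be followed step by step in code. This is a minimal sketch with a hypothetical function name; it uses the "count of values minus 1" divisor described on this slide and is checked against the integers from Example #1.

```python
from math import sqrt

def sample_standard_deviation(values):
    """Standard deviation using the 'count minus 1' divisor (needs at least 2 values)."""
    n = len(values)
    mean = sum(values) / n
    # Step 1 and 2: deviation from the mean, then squared.
    squared_deviations = [(v - mean) ** 2 for v in values]
    # Step 3: sum, divide by n - 1, take the square root.
    return sqrt(sum(squared_deviations) / (n - 1))

s = sample_standard_deviation([10, 2, 2, 8, 2, 10, 3, 1, 8, 5])
print(round(s, 3))  # 3.573
```

Python's built-in `statistics.stdev` uses the same divisor, so it can serve as a cross-check.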
Example #1
Randomly generated integers: 10 2 2 8 2 10 3 1 8 5
Mean: 51/10=5.1, Median: (3+5)/2=4, Mode: 2
5 Number: 1, 2, 4, 8, 10
Standard deviation: 3.573
Example #2
Salaries of Brewer Starters: 550,000; 11,000,000; 11,250,000; 12,500,000; 3,325,000
Mean: 7,725,000 Median: 11,000,000
5 Number: 550,000; 3,325,000; 11,000,000; 11,250,000; 12,500,000
Standard deviation: 5,403,529
Example #3
Lecture 8
Andrew L.
Zaeske
Review
6A:
Characterizing
Data
6B: Measures
of Variation
6C: The
Normal
Distribution
Randomly Generated Numbers: 0.5188813 1.5681064
8.3100768 2.8875178 6.1874861 0.9655630 8.3006705
6.2502796 5.3124419 5.2274770
Mean: 4.55285 Median:
(5.3124419+5.2274770)/2=5.269959
5 Number: 0.5188813 1.5681064 5.2699594 6.2502796
8.3100768
Standard deviation: 2.897736
Normal Distribution
Normal: Symmetric, bell-shaped distribution with a
single peak
The peak is the mean, median and mode
The standard deviation determines the spread: a larger standard deviation gives a wider, lower peak
Rule: about 68% of values are within 1 standard deviation of the mean, 95% within 2, 99.7% within 3
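The share of values within 1, 2 and 3 standard deviations of the mean can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

Because the shares depend only on the number of standard deviations, the same percentages hold for any mean and standard deviation.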
Z-scores and Percentiles
z-score = standard score = $\dfrac{\text{value} - \text{mean}}{\text{standard deviation}}$
Percentile of a value: Percentage of all values that are
less than or equal to that value
For a normal distribution, percentiles and z-scores carry the same information: each z-score implies a specific percentile and each percentile has one z-score
General procedure: given a value, find its z-score, then find its percentile and/or the percentage of values above it
If given a z-score, rearrange to solve for the value:
value = mean + z-score × standard deviation
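The procedure can be sketched in Python, assuming the data follow a normal distribution. The function names are hypothetical, and the test scores (mean 70, standard deviation 10, value 85) are made-up numbers for illustration, not data from the lecture.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Number of standard deviations the value lies from the mean."""
    return (value - mean) / sd

def value_from_z(z, mean, sd):
    """Rearranged formula: value = mean + z-score x standard deviation."""
    return mean + z * sd

def percentile_from_z(z):
    """Percentage of a normal distribution at or below the given z-score."""
    return 100 * NormalDist().cdf(z)

def z_from_percentile(percentile):
    """z-score matching a percentile strictly between 0 and 100."""
    return NormalDist().inv_cdf(percentile / 100)

# Hypothetical test: mean 70, standard deviation 10, a score of 85.
z = z_score(85, mean=70, sd=10)   # 1.5
below = percentile_from_z(z)      # percentage scoring 85 or less
above = 100 - below               # percentage scoring above 85
print(f"z = {z}, percentile = {below:.1f}, above = {above:.1f}%")
```

In a course setting the percentile would typically be read from a z-score table; the `cdf` call plays the role of that table here.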