Section 2.1
Visualizing Distributions:
Shape, Center, and Spread
Exploration of univariate data should follow this sequence:
1. plot → 2. shape → 3. center → 4. spread
1. Choose an appropriate plot.
   4 kinds of graphs will be shown:
   dot plots, stem plots, bar charts, and histograms (or relative frequency plots)
2. Describe the shape.
   4 kinds of shapes will be demonstrated:
   uniform, normal, skewed, and bimodal
3. Find a measure of center appropriate to the shape.
   2 kinds of centers will be calculated (formulas to be taught):
   mean and median (sometimes mode)
4. Find a measure of spread that agrees with the measure of center.
   3 kinds of spread will be calculated (formulas to be taught):
   deviations, standard deviation, and variance
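As a minimal sketch of steps 3 and 4, the Python below computes the mean and median of a small data set, then the deviations from the mean, the variance, and the standard deviation. The data values are made up for illustration, and the n - 1 (sample) divisor is an assumption; the formulas taught in the course may use a different convention.

```python
import math
import statistics

# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 5, 7, 8]

mean = sum(data) / len(data)        # measure of center
median = statistics.median(data)    # alternative measure of center

# Deviations: how far each value lies from the mean.
deviations = [x - mean for x in data]

# Sample variance: sum of squared deviations divided by n - 1 (an assumption).
variance = sum(d ** 2 for d in deviations) / (len(data) - 1)

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(mean, median, deviations, variance, std_dev)
```

The deviations always sum to zero, which is why they are squared before being averaged.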
Data Distributions - “shapes”
We will be considering 4 types of data distribution shapes:
Uniform
Normal
Skewed
Bimodal
Uniform Distribution – Ideal Shape
For a uniform distribution of data, all the
values occur equally often (that is, the same
number of times).
Also known as a rectangular distribution because of
its shape.
Dot plots – Ideal Graph
Uniform Distribution – Ideal Shape
Dot plot: a graph of a set of data (group of numbers) where each datum
(single number) is shown as a single dot on an x-y graph.
Example: If you roll a 10-sided die (labeled 0 to 9) 50 times, you
may get a graph of 50 dots with a roughly uniform distribution.
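The die-rolling example can be simulated. The sketch below is an illustration only: the function names `roll_die` and `text_dot_plot` are hypothetical, the seed is arbitrary, and the plot is drawn sideways in text (one row per face) rather than as stacked dots on a graph.

```python
import random
from collections import Counter

def roll_die(n_rolls, seed=None):
    """Simulate n_rolls of a fair 10-sided die labeled 0 to 9."""
    rng = random.Random(seed)
    return [rng.randint(0, 9) for _ in range(n_rolls)]

def text_dot_plot(values):
    """Return a text dot plot: one row per face, one dot per occurrence."""
    counts = Counter(values)
    rows = [f"{face} | " + "." * counts[face] for face in range(10)]
    return "\n".join(rows)

# The seed is arbitrary; it only makes the run repeatable.
rolls = roll_die(50, seed=1)
print(text_dot_plot(rolls))
```

With only 50 rolls the rows will usually differ in length, so the result is roughly uniform rather than a perfect rectangle.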
Histogram – Not Ideal Graph
Uniform Distribution – Not Ideal Shape
Describing: The graph shows a roughly uniform
distribution of births across the months.
Dot Plot – Not Ideal Graph
Uniform Distribution – Not Ideal Shape
Activity 2.1a, page 28

Last Digit of 30 Phone Numbers
Does this graph look like a uniform distribution?
Is this a uniform distribution graph? Why or why not?
[Hint: Would only 2 phone number digits look uniform? How about 1000 phone
number digits?]
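The hint can be explored with a short simulation. This is a sketch under the assumption that each last digit is equally likely and independent of the others; the function names and the seed are hypothetical. It measures how far the most extreme digit's relative frequency lies from the ideal 0.10 for 2, 30, and 1000 digits.

```python
import random
from collections import Counter

def digit_proportions(n_digits, rng):
    """Relative frequency of each digit 0-9 among n_digits random digits."""
    counts = Counter(rng.randrange(10) for _ in range(n_digits))
    return [counts[d] / n_digits for d in range(10)]

def max_departure(proportions):
    """Largest distance of any digit's proportion from the ideal 0.10."""
    return max(abs(p - 0.10) for p in proportions)

rng = random.Random(2)  # arbitrary seed so the run is repeatable
for n in (2, 30, 1000):
    props = digit_proportions(n, rng)
    print(n, round(max_departure(props), 3))
```

Small samples from a uniform population can look far from uniform; the departure from 0.10 tends to shrink as the number of digits grows.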
Normal Distribution – Ideal Shape
Idealized normal shape is:
perfectly symmetric
single peak, or mode, at line of symmetry
curve drops off smoothly on both sides, never touching x-axis, and
stretches infinitely far in both directions.
Normal Distribution – Ideal Shape
Center & Spread
Center: Mean = Mode
Spread: Standard deviation (SD) is the
horizontal distance from the mean to an
inflection point.
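The claim about the inflection point can be checked numerically. The sketch below is an illustration, not part of the original slides: it evaluates the normal density for an illustrative mean of 47 and SD of 4, and estimates the second derivative with a central difference. The curve should bend downward just inside mean + SD and upward just outside it.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of f''(x)."""
    return (f(x - h) - 2 * f(x) + f(x + h)) / (h * h)

mean, sd = 47.0, 4.0  # illustrative values

def curve(x):
    return normal_pdf(x, mean, sd)

# Negative means concave down, positive means concave up.
inside = second_derivative(curve, mean + sd - 0.5)
outside = second_derivative(curve, mean + sd + 0.5)
print(inside < 0, outside > 0)
```

The sign change in curvature at one SD from the mean is what makes that point an inflection point; by symmetry the same happens at mean - SD.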
Normal Distribution – Ideal Shape
Center & Spread
Use the mean to describe the center and the
standard deviation to describe the spread of a
normal distribution.
Example: A typical random sample of five
workers has an average age of 47 years, give
or take about 4 years.
Statistically speaking: IF the population is a
normal distribution, the mean is 47 years and the
standard deviation is 4 years. (That is a “big if.”)
Skewed Distributions – Ideal Shape
Uniform and normal distributions are symmetric.
Many common distributions show bunching at
one end and a long tail stretching out in the
other direction.
These distributions are called skewed.
Skewed Distributions – Ideal Shape
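The direction of the tail tells whether a distribution is
skewed left or skewed right: a tail stretching toward larger
values means skewed right, and a tail stretching toward smaller
values means skewed left.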
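The tail direction can also be checked numerically. The sketch below uses the sign of the third central moment, a technique that is not introduced in this section and is swapped in here only for illustration; the function name `skew_direction` and both data sets are hypothetical.

```python
import statistics

def skew_direction(values):
    """Classify the tail direction from the sign of the third central moment."""
    mean = statistics.fmean(values)
    third_moment = sum((x - mean) ** 3 for x in values) / len(values)
    if third_moment > 0:
        return "skewed right"
    if third_moment < 0:
        return "skewed left"
    return "symmetric"

# Hypothetical data sets.
right_tail = [1, 1, 2, 2, 3, 4, 9]   # bunched low, long tail toward larger values
left_tail = [1, 6, 7, 8, 8, 9, 9]    # bunched high, long tail toward smaller values
print(skew_direction(right_tail), skew_direction(left_tail))
```

Real data rarely gives a third moment of exactly zero, so a plot remains the better way to judge whether a distribution is close to symmetric.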
Skewed Distributions – Ideal Shape
Often the bunching in a skewed distribution occurs
because values “bump up against a wall.”
Examples: Either a minimum that values cannot
go below, such as 0 for distances, or a
maximum that values cannot go above, such
as 100 for percentages.
Note: To have a “wall,” values must be bunched
up against it.
Skewed Distributions – Ideal Shape
Center & Spread
Use the median to describe the center, with the
lower and upper quartiles to describe the
spread.
Example: The middle 50% of the SAT math
scores were between 630 and 720, with half
above 680 and half below.
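A summary of this kind can be produced from raw data. The sketch below uses Python's standard `statistics` module on made-up scores (not the actual SAT data, which is not given here). Quartile conventions differ between textbooks and software, so the quartiles from `statistics.quantiles` may differ slightly from hand-calculated ones.

```python
import statistics

# Hypothetical test scores, not the SAT data described above.
scores = [520, 580, 610, 630, 650, 670, 680, 690, 700, 720, 740, 760, 790]

median = statistics.median(scores)

# quantiles(..., n=4) returns the three cut points: Q1, Q2 (the median), Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)

print(f"middle 50% between {q1} and {q3}, half above {median} and half below")
```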
Skewed Distributions – Not Ideal Shape
Center & Spread
The middle 50% of the polar
bear weights are between
about 115 and 250 lbs, with half
above about 155 lbs and half
below.
Bimodal Distributions – Ideal Shape
Many distributions have only one peak (unimodal).
Some have two peaks (bimodal) or even more.
Bimodal Distributions – Not Ideal Shape
Locate the two peaks. However, it is more useful if
you can find another variable that divides the data
into two groups centered at the two peaks.
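Splitting by a second variable might look like the sketch below. The records are hypothetical (heights in cm labeled "child" or "adult"); the point is only that each group, taken on its own, has a single center near one of the two peaks.

```python
import statistics
from collections import defaultdict

# Hypothetical (group, value) records: heights in cm labeled by a second variable.
records = [
    ("child", 118), ("child", 121), ("child", 123), ("child", 125), ("child", 127),
    ("adult", 165), ("adult", 168), ("adult", 170), ("adult", 173), ("adult", 176),
]

# Split the values by the second variable.
groups = defaultdict(list)
for label, value in records:
    groups[label].append(value)

# Summarize each group separately; each center sits near one peak.
centers = {label: statistics.median(values) for label, values in groups.items()}
print(centers)
```

Summarizing the combined data with one mean or median would land between the peaks, where few values actually lie.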
Other Possible Data Characteristics
Outlier: unusual value that stands apart from the
bulk of the data
Cluster: a group of data “clustering” close to the
same value, away from other groups
Gap: on a plot, the space that separates clusters
of data
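These features are usually spotted by eye on a plot, but a rough numerical sketch is possible. The code below treats any space of at least `min_gap` between neighboring sorted values as a gap and splits the data into clusters at those gaps. The threshold, the function names, and the data are all assumptions for illustration; the section does not give a numerical rule.

```python
def find_gaps(values, min_gap):
    """Return (low, high) pairs of neighboring sorted values at least min_gap apart."""
    ordered = sorted(values)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a >= min_gap]

def split_into_clusters(values, min_gap):
    """Group sorted values into clusters separated by gaps of at least min_gap.

    Assumes values is non-empty.
    """
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev >= min_gap:
            clusters.append([cur])       # a gap starts a new cluster
        else:
            clusters[-1].append(cur)
    return clusters

data = [3, 4, 4, 5, 6, 14, 15, 15, 16, 40]   # hypothetical values
print(find_gaps(data, min_gap=5))
print(split_into_clusters(data, min_gap=5))
```

A cluster containing a single value far from the rest, such as 40 here, is a candidate outlier.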