Statistics Chapter 2: Exploring Distributions

Section 2.1
Visualizing Distributions: Shape, Center and Spread

To learn the basic shapes of distributions of data:
 Uniform, normal and skewed

To describe characteristics of a shape of distribution:
 Symmetry, skewness, modes, outliers, gaps and clusters



To describe a uniform distribution using range and frequency.
To estimate graphically the mean and standard deviation of
a normal distribution and use them to describe the
distribution.
To estimate graphically the median and quartiles and use them to
describe a skewed distribution.

A distribution is a graph that shows:
 Spread of the data
 How many times a value in the data occurs

How have we used a distribution?
 To see where data from a simulation lies.
 To explore probabilities of a random selection

Uniform (Rectangular)
 All values occur equally often
 Selecting the last digit of the numbers in a phone
book
 Selecting the last digit of Social Security numbers or your
student ID numbers
 randInt(start,end,n), e.g. randInt(0,9,100)→L1
 Why?
▪ All digits 0-9 would be used and there would be no reason any
one of them would be used more than the others.
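
For anyone without the calculator, the randInt(0,9,100) experiment can be approximated in Python. This is only a sketch: the function name rand_int, the seed value and the text dot plot are assumptions made for illustration, not part of the original activity.

```python
import random
from collections import Counter


def rand_int(start, end, n, seed=None):
    """Return n random integers between start and end, inclusive.

    Hypothetical stand-in for the calculator's randInt(start,end,n).
    """
    rng = random.Random(seed)
    return [rng.randint(start, end) for _ in range(n)]


# 100 random digits, like randInt(0,9,100) stored in a list.
digits = rand_int(0, 9, 100, seed=1)
counts = Counter(digits)

# Text dot plot: one row per digit, one dot per occurrence.
for digit in range(10):
    print(digit, "." * counts[digit])
```

With only 100 digits the rows will not be exactly equal in length, but no digit should stand out consistently; raising n makes the plot look more rectangular.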

Normal Distributions (bell-shaped)
 Very common in our world and will be used
throughout the year.

Measure a ball.
 Measure the diameter to the nearest mm and record
your result.
As a class create a dot plot that shows the
distribution of our measurements.
 What do you notice?
 Why do you think that occurred?
 Normal Distribution Video #1
 Normal Distribution Video #2


Characteristics of a Normal Distribution:
 Symmetric: The mean (average value) of the data is the
center point. If the distribution is truly normal, the mode and
median of the data are also at the center.
 Mean, median and mode are called measures of center.
 Standard deviation (SD) is a measure of the spread of a
normal distribution. The SD is the distance
from the center out to the inflection point on the curve.
 The boundaries one SD out from the center in both directions
enclose about 68% of the total area under the curve.
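
The 68% figure can be checked numerically. The sketch below is an illustration under stated assumptions: the mean of 50, SD of 10, sample size and seed are made-up values, and the helper name fraction_within_one_sd is hypothetical. It compares the theoretical area within one SD of the center with the fraction observed in simulated normal data.

```python
import math
import random
import statistics


def fraction_within_one_sd(values):
    """Fraction of the values lying within one SD of their mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    inside = [v for v in values if mean - sd <= v <= mean + sd]
    return len(inside) / len(values)


# Simulated measurements from a normal distribution
# (assumed mean 50 and SD 10, chosen only for illustration).
rng = random.Random(2)
sample = [rng.gauss(50, 10) for _ in range(10000)]
simulated = fraction_within_one_sd(sample)

# Exact area under the normal curve between -1 SD and +1 SD.
theoretical = math.erf(1 / math.sqrt(2))

print(round(theoretical, 4))
print(round(simulated, 2))
```

The theoretical area is about 0.6827, and the simulated fraction should land close to it; the result does not depend on which mean and SD are chosen.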

Skewed distribution (a longer tail on one side)
 Skewed right: tail stretches to the right
 No line of symmetry
 The median is typically used as the measure of
center since there is no line of symmetry.
▪ The median divides the plot so that an equal number of data
points lies on each side.
 Quartiles are the measure of spread for a skewed distribution.
▪ Lower quartile divides the lower half of the data in half
▪ Upper quartile divides the upper half of the data in half
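
The median and quartiles described above can be computed as in this sketch. The data values are invented for illustration, and the function name quartiles is hypothetical. It also assumes one common convention: when the number of data points is odd, the median itself is left out of both halves. Other textbooks and software use slightly different rules.

```python
import statistics


def quartiles(values):
    """Return (lower quartile, median, upper quartile).

    Assumed convention: with an odd number of values, the middle
    value is excluded from both the lower and the upper half.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return (
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
    )


# Made-up, right-skewed data: most values are small, a few are large.
data = [1, 2, 2, 3, 4, 5, 7, 9, 14, 20]
q1, med, q3 = quartiles(data)
print(q1, med, q3)
```

For this data the upper quartile sits much farther from the median than the lower quartile does, which is what a tail stretching to the right looks like in numbers.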

Bimodal Distributions (two peaks)
 The cases often represent two groups when this
occurs: Male/Female, Majority/Minority…

Outlier: A data value that stands apart from
the bulk of the data.
 These deserve special attention
 Sometimes they are mistakes
 Sometimes there are unusual circumstances that
can be important to great discoveries.

Gaps: regions where no data values lie.

Clusters: the areas where the bulk of
the data lies.

When describing a distribution, you must
include the following:
 Shape (as we have just described)
 Measure of center (or centers if bimodal)
▪ mean, mode, median
 Measure of spread
 Locations of Gaps or Clusters

Discussion D4

Practice P1-3, 5

Page 39 E1, 3, 5, 8, 11

AP only: also E4 and E6