Math 167
Chapter 5
Chapter 5 - Exploring Data : Distributions
In this chapter, we will learn about different distributions to represent and analyze data.
Definitions
1. Individuals are objects described by a set of data. They may be people, animals, or things.
2. A variable is any characteristic of an individual. It can take different values for different individuals.
A variable can be categorized as a) Numeric or Quantitative : e.g. grades, height, age, or b)
Categorical or Qualitative : e.g. email, name, gender, color.
3. The distribution of a variable tells us what values the variable takes and how often it takes
these values.
4. Histogram: a graph of the distribution of outcomes (often divided into classes) for a single numeric
variable. The height of each bar is the number of observations in that class. All classes (bars) should
have the same width and each observation must fall into exactly one class.
5. Outlier: A deviation from the rest of the data. An individual that falls outside the overall pattern.
6. Symmetry of a Distribution: a) Left-skewed : longer tail of the distribution is on the left.
b) Right-skewed : longer tail of the distribution is on the right.
c) Symmetric : GUESS!!
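7. Mean of a Distribution : The average. It is computed as $\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n}$,
the sum of the observations divided by the number of observations $n$.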
8. Median: Arrange all observations in increasing order. If the number of observations is odd, the
median M is the center observation. If the number of observations is even, the median M is the
average of the two center observations.
9. Mode: Most frequently occurring value in the set of observations.
10. Range: largest observation – smallest observation.
11. Standard Deviation : a measure of how far the observed data values typically deviate from the mean.
12. Variance: Square of standard deviation.
13. Quartiles: Arrange the observations in increasing order.
a) The first quartile is the median of the lower half
b) The third quartile is the median of the upper half
c) The second quartile is the median.
d) The inter-quartile range is the difference between third and first quartile.
14. Normal Distribution is determined by the mean and standard deviation.
Fun Facts:
a) Mean = Median = Mode
b) The first quartile is located about 0.67 standard deviations below the mean; the third quartile is
located about 0.67 standard deviations above the mean.
c) The 68 – 95 – 99.7 rule states that
i) About 68% of the observations fall within 1 SD of the mean.
ii) About 95% of the observations fall within 2 SD of the mean.
iii) About 99.7% of the observations fall within 3 SD of the mean.
d) The curve is known as a bell curve and it never touches the x – axis.
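Facts b) and c) above can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module (Python 3.8 or later). The choice of a standard normal curve (mean 0, SD 1) and the helper name `within` are assumptions made for illustration and are not part of the handout.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1 (illustrative choice).
z = NormalDist(mu=0.0, sigma=1.0)

def within(k):
    """Fraction of observations within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

# Fact c): the 68 - 95 - 99.7 rule.
for k in (1, 2, 3):
    print(f"within {k} SD: {within(k):.1%}")

# Fact b): the quartiles sit about 0.67 SD on either side of the mean.
q1 = z.inv_cdf(0.25)
q3 = z.inv_cdf(0.75)
print(f"Q1 = {q1:.2f} SD, Q3 = {q3:.2f} SD")
```

The printed percentages are slightly more precise than the rounded values in the rule, which is why the rule says "about".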
Normal distribution is an approximation of Histogram
Comparison of Symmetry
How does a change in mean and/or standard deviation affect the normal curve?
Example 1. The following list gives the number of dogs owned by families in a particular neighborhood.
0, 1, 1, 2, 3, 5, 8
a) Draw a histogram representing the data.
b) Find the mean, mode, and median of the data.
c) What is the Range for this data?
d) What are the first, second, and third quartiles? What is the inter-quartile range?
e) Find the standard deviation.
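One way to check hand computations for parts b) through e) is the sketch below, which uses Python's standard `statistics` module. The helper `quartiles` is a hypothetical name; it follows the median-of-halves definition from this chapter and assumes that, when the number of observations is odd, the median itself is left out of both halves. The handout does not say whether the standard deviation divides by n or by n - 1, so both conventions are printed.

```python
from statistics import mean, median, multimode, pstdev, stdev

dogs = [0, 1, 1, 2, 3, 5, 8]

def quartiles(values):
    """Q1 and Q3 as the medians of the lower and upper halves.

    Assumption: with an odd number of observations, the overall
    median is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return median(lower), median(upper)

q1, q3 = quartiles(dogs)
print("mean   =", round(mean(dogs), 2))
print("median =", median(dogs))
print("mode   =", multimode(dogs))
print("range  =", max(dogs) - min(dogs))
print("Q1, Q3 =", q1, q3, " IQR =", q3 - q1)
# Two common conventions for the standard deviation:
print("SD (divide by n)     =", round(pstdev(dogs), 2))
print("SD (divide by n - 1) =", round(stdev(dogs), 2))
```

By hand, the mean is 20/7, the median is 2, the mode is 1, the range is 8, and the quartiles are 1 and 5, so the inter-quartile range is 4.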
Example 2. Given the set of observations below, find the missing observation so that the median is 5.
8, 5, 10, 3, ?
Example 3. Below are the ages of 15 students in a college class
27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28
a) What is the mean age?
b) What is the median age?
c) What is the mode?
d) What is the inter-quartile range?
e) Can you identify any possible outliers?
f) What is the range for this data?
Example 4. The following data are the grades of the students in a class. Divide the grades into reasonable
classes, and make a histogram with those classes.
86, 86, 85, 83, 83, 82, 82, 81, 81, 80, 79, 77, 77, 77, 76, 76, 75, 74, 74, 73, 72, 72, 72, 69, 69, 69, 67,
65, 61, 58, 51
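The sketch below shows one possible choice of classes for Example 4 and prints a text histogram. The class width of 10 (50-59, 60-69, and so on) is an assumption; other equal-width classes would be just as reasonable.

```python
from collections import Counter

grades = [86, 86, 85, 83, 83, 82, 82, 81, 81, 80, 79, 77, 77, 77, 76, 76,
          75, 74, 74, 73, 72, 72, 72, 69, 69, 69, 67, 65, 61, 58, 51]

WIDTH = 10  # assumed class width; every class has the same width

# Map each grade to the lower edge of its class, e.g. 86 -> 80,
# so each observation falls into exactly one class.
counts = Counter((g // WIDTH) * WIDTH for g in grades)

# Print one bar per class, including any empty classes in between.
for low in range(min(counts), max(counts) + WIDTH, WIDTH):
    high = low + WIDTH - 1
    print(f"{low}-{high} | {'#' * counts[low]} ({counts[low]})")
```

Each `#` stands for one observation, so the length of a bar plays the role of the bar height in a drawn histogram.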
Example 5. The scores on an honors exam from a class of 17 students are given below:
32, 71, 72, 77, 77, 83, 84, 85, 87, 89, 90, 92, 95, 96, 98, 99, 100
a) Find the mean, mode, and median of the scores.
b) What is the minimum score, the maximum score, and the range of the data?
c) What are the first and third quartiles? What is the inter-quartile range?
Example 6. The scores of students on a standardized test form a normal distribution with mean 300 and
standard deviation 40. If 2000 students took the test, find the number of students who scored above 380.
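The reasoning for Example 6 can be sketched as follows: express the cutoff as a number of standard deviations from the mean, then apply the 68 - 95 - 99.7 rule to the upper tail. The constant names are illustrative, and the comparison with `statistics.NormalDist` (Python 3.8 or later) is an addition that shows how close the rule is to the exact normal area.

```python
from statistics import NormalDist

MEAN, SD, STUDENTS = 300, 40, 2000
CUTOFF = 380

# How many standard deviations above the mean is the cutoff?
k = (CUTOFF - MEAN) / SD

# 68 - 95 - 99.7 rule: about 95% fall within 2 SD, so the remaining 5%
# is split equally between the two tails.
tail_by_rule = (1 - 0.95) / 2
print("SDs above the mean:", k)
print("by the rule:", round(STUDENTS * tail_by_rule), "students")

# For comparison, the exact normal tail area (not rounded to the rule).
tail_exact = 1 - NormalDist(MEAN, SD).cdf(CUTOFF)
print("exact tail :", round(STUDENTS * tail_exact, 1), "students")
```

The rule gives 2.5% of 2000, or about 50 students. The exact tail area is a little smaller because 95% is itself a rounded figure.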
Example 7. The length of tape on a roll of a certain type of tape is normally distributed with a mean of
25 meters and a standard deviation of 50 centimeters.
a) What is the range of lengths that includes 99.7% of the rolls?
b) What percentage of rolls are longer than 26 meters?
c) What lengths bracket the middle 50% of the rolls of tape?
Example 8. Heights of women are approximately normally distributed with a mean of 64.5 inches and a
standard deviation of 2.5 inches.
a) What percentage of women have heights between 59.5 and 69.5 inches?
b) What is the height range for the middle 50% of the women?
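A minimal sketch of the arithmetic behind Example 8, assuming the quartile fact from the definitions (Q1 and Q3 lie about 0.67 standard deviations from the mean). The variable names are illustrative only.

```python
MEAN, SD = 64.5, 2.5  # inches

# a) Express each endpoint as a number of SDs from the mean.
low_k = (59.5 - MEAN) / SD
high_k = (69.5 - MEAN) / SD
print("endpoints in SDs:", low_k, high_k)
# Both endpoints are 2 SD from the mean, so the rule gives about 95%.

# b) The middle 50% lies between the quartiles, about 0.67 SD either side.
q1 = MEAN - 0.67 * SD
q3 = MEAN + 0.67 * SD
print(f"middle 50%: {q1:.1f} to {q3:.1f} inches")
```

The same two steps apply to Example 7 once the standard deviation of 50 centimeters is converted to 0.5 meters.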
The Five-Number Summary and Boxplots
The five-number summary of a distribution consists of five numbers (SURPRISE!!!):
Minimum, Q1, M, Q3, Maximum
A boxplot is a graph of the five-number summary.
Example 9. Use the data from Example 3 and sketch the corresponding boxplot.
27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28
Example 10. Use the data from Example 5 and sketch the corresponding boxplot.
32, 71, 72, 77, 77, 83, 84, 85, 87, 89, 90, 92, 95, 96, 98, 99, 100
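Before sketching a boxplot, the five numbers themselves are needed. The sketch below computes them with a hypothetical helper, `five_number_summary`, that follows the median-of-halves definition of quartiles from this chapter; it assumes the median is left out of both halves when the number of observations is odd.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, M, Q3, maximum).

    Assumption: for an odd number of observations the median is left
    out of both halves when finding Q1 and Q3.
    """
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[len(data) - half:])
    return data[0], q1, median(data), q3, data[-1]

ages = [27, 50, 33, 25, 86, 25, 85, 31, 37, 44, 20, 36, 59, 34, 28]
scores = [32, 71, 72, 77, 77, 83, 84, 85, 87, 89, 90, 92, 95, 96, 98, 99, 100]

print("Example 9: ", five_number_summary(ages))
print("Example 10:", five_number_summary(scores))
```

In the boxplot, the box runs from Q1 to Q3 with a line at M, and the whiskers extend to the minimum and maximum.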
Example 11. Display the following data in a stemplot.
32, 71, 72, 77, 77, 83, 84, 85, 87, 89, 90, 92, 95, 96, 98, 99, 100
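A stemplot for these scores can be built by splitting each value into a stem and a leaf. The sketch below assumes the stem is every digit except the last and the leaf is the last digit, so 100 has stem 10 and leaf 0; stems with no leaves are still listed to keep the scale even.

```python
from collections import defaultdict

scores = [32, 71, 72, 77, 77, 83, 84, 85, 87, 89, 90, 92, 95, 96, 98, 99, 100]

# Stem = all digits except the last; leaf = the last digit.
leaves = defaultdict(list)
for s in sorted(scores):
    leaves[s // 10].append(s % 10)

# List every stem from smallest to largest, even those with no leaves.
lines = []
for stem in range(min(leaves), max(leaves) + 1):
    row = " ".join(str(leaf) for leaf in leaves[stem])
    lines.append(f"{stem:>2} | {row}".rstrip())

print("\n".join(lines))
```

The empty stems 4, 5, and 6 make the low score of 32 stand out from the rest of the data.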