Modeling Distributions of Data
Kylie Bugbee and Anya Bershader
Percentile
● The pth percentile of a distribution is the value with p percent of the observations less than it.
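The definition above can be sketched in a few lines of Python. The function name `percentile_of` and the scores are hypothetical, invented only to illustrate the counting.

```python
def percentile_of(value, data):
    """Percent of observations in data that are strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

# Hypothetical quiz scores, made up for illustration only.
scores = [62, 70, 71, 75, 78, 80, 84, 85, 90, 95]

# 7 of the 10 scores are below 85, so 85 sits at the 70th percentile.
print(percentile_of(85, scores))  # 70.0
```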
Cumulative Relative Frequency Graphs:
● Percentiles can be displayed graphically. One of the most common graphs starts with a frequency table for a quantitative variable, for instance a frequency table that summarizes the ages of the first 44 U.S. presidents when they were inaugurated.
Z-scores:
z = (value - mean) / standard deviation
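A minimal sketch of the z-score calculation, using the golfer's mean of 200 yards and standard deviation of 15 yards from the practice problems. The function name `z_score` is an assumption made for illustration. Note that the subtraction happens before the division.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# A 230-yard hit is two standard deviations above the mean of 200 yards.
print(z_score(230, 200, 15))  # 2.0
```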
Transforming data:
Adding or subtracting (by a constant):
● Adds the constant to (or subtracts it from) measures of center and location
● Does not change shape or spread
Multiplying or dividing (by a constant):
● Multiplies/divides measures of center and location by the constant, and measures of spread by the absolute value of the constant
● Does not change the shape of the distribution
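These rules can be checked on a small data set. The values below are hypothetical and the constants 10 and 3 are arbitrary choices for illustration.

```python
import statistics

data = [2, 4, 4, 6, 9]              # hypothetical observations
shifted = [x + 10 for x in data]    # add a constant
scaled = [x * 3 for x in data]      # multiply by a constant

def summary(values):
    """Return (mean, median, population standard deviation)."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.pstdev(values))

mean, median, sd = summary(data)
s_mean, s_median, s_sd = summary(shifted)
m_mean, m_median, m_sd = summary(scaled)

# Adding 10 moves center and location by 10 but leaves spread alone.
print(s_mean - mean, s_median - median, s_sd - sd)
# Multiplying by 3 multiplies center, location, and spread by 3.
print(m_mean / mean, m_median / median, m_sd / sd)
```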
Cumulative Relative Frequency Graphs
● A way to display cumulative information graphically
● Shows the percentage of observations that are less than or equal to particular values
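As a rough sketch, the points plotted on a cumulative relative frequency graph can be computed from a frequency table as follows. The class intervals and counts are hypothetical, invented for illustration; they are not the presidents' data.

```python
from itertools import accumulate

# Hypothetical frequency table: class intervals and their counts.
classes = ["0-9", "10-19", "20-29", "30-39"]
counts = [4, 6, 8, 2]

total = sum(counts)
cumulative_counts = list(accumulate(counts))
cumulative_percent = [100 * c / total for c in cumulative_counts]

# Each percent is plotted at the upper end of its class interval.
for label, pct in zip(classes, cumulative_percent):
    print(label, pct)
```

The last value is always 100, since every observation is less than or equal to the top of the highest class.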
Density Curves
● A curve that is always on or above the horizontal axis and has area exactly 1 underneath it
● Describes the overall pattern of a distribution
● The median is the point that divides the area under the curve in half
● The mean is the balance point of the curve
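To make these properties concrete, here is a numerical sketch for one hypothetical density curve, f(x) = 2(1 - x) on the interval from 0 to 1, which is skewed to the right. The curve and the midpoint-sum approach are assumptions chosen for illustration.

```python
def density(x):
    """Hypothetical right-skewed density on [0, 1]."""
    return 2 * (1 - x)

n = 100_000
dx = 1 / n
xs = [(i + 0.5) * dx for i in range(n)]   # midpoints of thin slices

area = sum(density(x) * dx for x in xs)        # total area under the curve
mean = sum(x * density(x) * dx for x in xs)    # balance point

# Median: the point where the accumulated area first reaches one half.
running = 0.0
for x in xs:
    running += density(x) * dx
    if running >= 0.5:
        median = x
        break

print(round(area, 3), round(mean, 3), round(median, 3))
```

For this curve the mean (about 0.333) lies to the right of the median (about 0.293): the long right tail pulls the balance point toward it.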
Normal Distribution
● Described by a normal density curve
● Completely specified by its mean and standard deviation
● The mean is the center of the symmetric normal curve
● The standard deviation is the distance from the center to the change-of-curvature (inflection) points on either side
Empirical Rule
● About 68% of observations fall within one standard deviation of the mean
● About 95% of observations fall within two standard deviations of the mean
● About 99.7% of observations fall within three standard deviations of the mean
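The three percentages can be checked with Python's standard-library `statistics.NormalDist`. This is only a verification sketch; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

percent_within = {}
for k in (1, 2, 3):
    # Area between -k and +k standard deviations from the mean.
    area = std_normal.cdf(k) - std_normal.cdf(-k)
    percent_within[k] = round(100 * area, 1)

print(percent_within)  # {1: 68.3, 2: 95.4, 3: 99.7}
```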
Standard Normal Distribution
● Mean 0 and standard deviation 1
● Can be used with the standard normal table to find the area under the curve to the left of a z-score
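In place of a printed table, the area to the left of a z-score can be looked up with `statistics.NormalDist`. The sketch below also shows that standardizing a value gives the same area as working with the original normal distribution; it uses the golfer's mean of 200 and standard deviation of 15 from the practice problems, and x = 215 is an arbitrary choice.

```python
from statistics import NormalDist

# Area to the left of z = 1.0 under the standard normal curve.
z = 1.0
area_left = NormalDist().cdf(z)
print(round(area_left, 4))  # 0.8413

# Standardizing first gives the same area as using the original curve.
golfer = NormalDist(mu=200, sigma=15)
x = 215
z_from_x = (x - golfer.mean) / golfer.stdev
area_direct = golfer.cdf(x)
area_via_z = NormalDist().cdf(z_from_x)
```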
Practice Problems
Suppose that the distance a golfer can hit the ball has an approximately normal
distribution with a mean of 200 yards and a standard deviation of 15 yards.
1. The middle 99.7% of his hits will travel between what two distances?
A. 0 and 400 yards
B. 105 and 295 yards
C. 155 and 245 yards
D. 170 and 230 yards
E. 185 and 215 yards
Answer: C. 155 and 245 yards
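A quick check of problem 1 with the empirical rule: the middle 99.7% lies within three standard deviations of the mean.

```python
mean, sd = 200, 15  # yards, from the problem statement

low = mean - 3 * sd
high = mean + 3 * sd
print(low, high)  # 155 245
```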
2. What proportion of his hits will travel between 198 and 206 yards?
A. .21
B. .44
C. .68
D. .79
E. .95
Answer: A. .21
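One way to check problem 2, sketched with `statistics.NormalDist` instead of a table: subtract the area to the left of 198 from the area to the left of 206.

```python
from statistics import NormalDist

golfer = NormalDist(mu=200, sigma=15)  # yards, from the problem statement

# Proportion of hits between 198 and 206 yards.
proportion = golfer.cdf(206) - golfer.cdf(198)
print(round(proportion, 2))  # 0.21
```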
3. The highest 20% of his hits will travel at least what distance?
A. 181 yards
B. 187 yards
C. 212 yards
D. 219 yards
E. 225 yards
Answer: C. 212 yards
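Problem 3 asks for the 80th percentile, since the highest 20% of hits lie above it. A sketch using the inverse of the cumulative area:

```python
from statistics import NormalDist

golfer = NormalDist(mu=200, sigma=15)  # yards, from the problem statement

# The distance with 80% of hits below it and 20% above it.
cutoff = golfer.inv_cdf(0.80)
print(round(cutoff, 1))  # 212.6
```

The exact value is a little above 212 yards, and 212 is the closest of the listed choices.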
4. The lifetime of a 2-volt non-rechargeable battery in constant use has a Normal
distribution with a mean of 516 hours and a standard deviation of 20 hours. Of all
batteries, 90% have a lifetime shorter than:
A. 541.6 hours
B. 517.28 hours
C. 490.4 hours
D. 502.4 hours
Answer: A. 541.6 hours
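Problem 4 is a 90th-percentile question. The same inverse lookup can be sketched for the battery lifetimes:

```python
from statistics import NormalDist

battery = NormalDist(mu=516, sigma=20)  # hours, from the problem statement

# Lifetime that 90% of batteries fall short of.
lifetime_90 = battery.inv_cdf(0.90)
print(round(lifetime_90, 1))  # 541.6
```

By hand, the table gives z of about 1.28 for an area of 0.90, so the lifetime is 516 + 1.28(20) = 541.6 hours.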
5. Suppose that a density curve is strongly skewed to the right. What is true about the relationship between the mean and the median of the distribution?
A. the mean will be less than the median
B. the mean will be about the same as the median
C. the mean will be greater than the median
D. the mean could be higher or lower than the median
E. it is impossible to determine without actual data
Answer: C. the mean will be greater than the median