ST 311 - Introduction to Statistics
Instructor: Judith Canner
Learning Objectives D
Spring 2010

THE NORMAL DISTRIBUTION

Probability Definitions and Notation

P(X > 10) = the probability that a value is greater than 10 (based on a given distribution)

Bell-shaped Histogram

Patterns are common in nature; one of the most common is the bell-shaped curve. Most individuals are clumped near the center, with fewer individuals as the distance from the center grows.

Column      n     Mean        Std. Dev.
textbooks   534   348.10675   143.71208

Empirical Rule

For any bell-shaped curve, approximately
• 68% of the values fall within 1 standard deviation of the mean in either direction
• 95% of the values fall within 2 standard deviations of the mean in either direction
• 99.7% of the values fall within 3 standard deviations of the mean in either direction

http://www.stat.tamu.edu/~west/applets/empiricalrule.html

Consider the textbook data:
What are the minimum and maximum?
How many standard deviations from the mean are the min and the max?
In general, how many standard deviations are the max and the min from the mean?
Approximation of the standard deviation for relatively large samples (>200):
Will this approximation work for skewed data? Why?

Characteristics of a Normal Distribution

1. Continuous random variable: P(X = k) = 0
2. Symmetric and bell-shaped: The normal distribution is a model for bell-shaped curves. All normal distributions are bell-shaped, but not all bell-shaped curves are normal.
3. Empirical Rule holds
4. Equal probability for a measurement to be less than the mean or greater than the mean: P(X ≤ μ) = P(X ≥ μ) = 0.5
5. The probability that a value is so many units (d) below the mean is equal to the probability that the value is the same number of units above the mean.
P(X ≤ μ − d) = P(X ≥ μ + d)

Question: Why are 4 and 5 true (what is the characteristic that determines this property)?

Example: The Empirical Rule and the Normal Distribution

Consider a population with mean 75 and standard deviation of 10. If each tick mark represents a standard deviation, note the mean and the values of 1, 2, and 3 standard deviations from the mean. Answer the following questions (it may help to shade/draw each problem on the normal curve):

1. P(X < 65) =
2. P(55 < X < 85) =
3. Find x: P(X > x) = 0.16

Standardized Score

________________________: the distance (number of standard deviations) between a specified value and the mean

z = (Observed Value − Mean) / Standard Deviation = (x − μ) / σ

How does it work? When we convert values for any normal random variable to z-scores, it is equivalent to converting the random variable of interest to a standard normal random variable, i.e.,

P(X ≤ x) = P((X − μ)/σ ≤ (x − μ)/σ) = P(Z ≤ z), where Z ~ Normal(μ = 0, σ = 1)

Example: Consider the shoe size for the female population. The population mean is 8 and the standard deviation is 1.5. What is the probability of selecting a woman with a shoe size less than 6 from the population?

P(X ≤ __) = P((__ − __)/__ ≤ (__ − __)/__) = P(Z ≤ __)

[Figure: normal curve over two axes, X (tick marks left blank) and Z (tick marks at -2, -1, 0, 1, 2)]

Now that we know the z-score, how do we find the probability associated with that score?

How to use the Z-Table

Notice that the table only gives probabilities associated with values less than the calculated z-score. Fortunately, there are several properties of the Normal distribution that allow us to use these values for different and more complicated calculations of probability.
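
The standardization step and the table lookup described above can also be sketched in code. This is only an illustrative sketch, not part of the original handout: the function names `z_score` and `standard_normal_cdf` are made up for this example, and the error function `math.erf` stands in for the printed z-table. The numbers are the shoe-size values from the example (mean 8, standard deviation 1.5, cutoff 6).

```python
import math

def z_score(x, mean, std_dev):
    """Number of standard deviations between x and the mean."""
    return (x - mean) / std_dev

def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ Normal(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Shoe-size example: mean 8, standard deviation 1.5, cutoff 6
z = z_score(6, mean=8, std_dev=1.5)
p = standard_normal_cdf(z)  # plays the role of the z-table lookup

print(f"z = {z:.2f}")
print(f"P(X <= 6) = P(Z <= z) = {p:.4f}")
```

Like the z-table, `standard_normal_cdf` only returns "less than" probabilities, so other kinds of probabilities still have to be built from it.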
Probability Relationships for Normal Random Variables

1. P(X > a) = 1 − P(X ≤ a)
   In words:
2. P(a < X < b) = P(X ≤ b) − P(X ≤ a)
   In words:
3. P(X > μ + d) = P(X < μ − d)
   In words:

Examples: Consider again the shoe size for the female population. The population mean is 8 and the standard deviation is 1.5. Answer the following:

*EXTRA CHALLENGE* Try using only the first page of the attached z-table to answer the questions.

1. What is the probability that a woman has a shoe size greater than size 11?
2. What is the probability that a woman’s shoe size is between size 7.5 and size 9?
3. What is the probability that a woman’s shoe size is less than 5 (can you figure this out without looking at the table? Hint: look at the 3rd relationship)?
4. What is the probability that a woman’s shoe size is less than 5 or greater than 11?

Percentiles and Probabilities

A percentile is another way of describing a value’s cumulative probability.
• That is, P(X ≤ x) = 0.75, where x is the value corresponding to the 75th percentile of a distribution.
• Essentially, instead of finding the cumulative probability for a given value, you are given the probability and must find the value.

Step 1: Find the z value that corresponds to the specified percentile rank (or cumulative probability).
Step 2: Compute x = μ + zσ. This is the percentile value.

Example: Recall the previous example. What would my shoe size have to be for me to be in the 80th percentile of women’s shoe sizes?

If you know a population is (approximately) normally distributed, you can describe that population completely using only the mean and standard deviation!

In Class Exercise: Consider the heights for NFL football players.
Assume the heights are normally distributed with a mean of 74 inches and a standard deviation of 2.5 inches. On each of the following subjects, spend a minute or two and write/draw as much as you possibly can about NFL players’ heights with regard to:

1. Location/Center (mean, median)
2. Spread (~Range for a sample, Q1, Q3, IQR)
3. Shape (symmetry, mode)
4. Empirical Rule
5. Draw a Boxplot or a Histogram that represents a sample from the population
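
The probability relationships and the two percentile steps from this handout can be checked numerically for the NFL heights. This is a rough sketch, not the handout's own method: it assumes Python 3.8 or later, where `statistics.NormalDist` is available, and the helper `percentile_value` and the variable names are hypothetical. `NormalDist.inv_cdf` plays the role of reading the z-table backwards.

```python
from statistics import NormalDist

# Standard normal Z ~ Normal(0, 1); stands in for the printed z-table
Z = NormalDist(mu=0, sigma=1)

def percentile_value(p, mean, std_dev):
    """Value x with P(X <= x) = p, following Step 1 and Step 2."""
    z = Z.inv_cdf(p)           # Step 1: z for the given cumulative probability
    return mean + z * std_dev  # Step 2: x = mean + z * std_dev

# NFL heights: mean 74 inches, standard deviation 2.5 inches
mean, sd = 74, 2.5
heights = NormalDist(mu=mean, sigma=sd)

# Spread: quartiles and interquartile range
q1 = percentile_value(0.25, mean, sd)
q3 = percentile_value(0.75, mean, sd)
iqr = q3 - q1

# Relationship 2: P(a < X < b) = P(X <= b) - P(X <= a),
# applied to intervals of 1, 2, and 3 standard deviations around the mean
within = {
    k: heights.cdf(mean + k * sd) - heights.cdf(mean - k * sd)
    for k in (1, 2, 3)
}

# Relationship 1: P(X > a) = 1 - P(X <= a), here for a = 79 inches
p_taller_than_79 = 1 - heights.cdf(79)

print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}, IQR = {iqr:.2f}")
for k, prob in within.items():
    print(f"P(within {k} SD of the mean) = {prob:.4f}")
print(f"P(X > 79) = {p_taller_than_79:.4f}")
```

The `within` values can be compared against the Empirical Rule's approximate 68%, 95%, and 99.7%.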