Math 123- Statistics Chapter 5 Notes Name_______________________________

5.1 Introduction to Normal Distributions and the Standard Normal Distribution

Def- A continuous probability distribution is the probability distribution of a continuous random variable.

Def- A normal distribution is a continuous probability distribution for a random variable x. The graph is called a normal curve and has the following properties:
1. The mean, median, and mode are equal.
2. The normal curve is bell-shaped and symmetric about the mean.
3. The total area under the normal curve is equal to one.
4. The normal curve approaches, but never touches, the x-axis.
5. Between $\mu - \sigma$ and $\mu + \sigma$ the graph curves downward. The graph curves upward to the left of $\mu - \sigma$ and to the right of $\mu + \sigma$. The points where it changes from upward to downward are called inflection points.

Extra Information- The equation for this curve is $y = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$. You do not need to know this equation.

Ex- Graph the normal curve.
a) $\mu = 4$ and $\sigma = 1$
b) $\mu = 5.3$ and $\sigma = .7$

Ex- Which curve has the greater mean and which has the greater std. dev.?

Def- The standard normal distribution is a normal distribution with mean 0 and std. dev. 1. The horizontal scale corresponds to z-scores. The z-score formula is $z = \frac{x - \mu}{\sigma}$.

Note: x represents a random variable in a non-standard normal distribution, whereas z represents values in a standard normal distribution.

Properties of the Standard Normal Distribution
1. The cumulative area is close to zero for z-scores close to z= -3.49.
2. The cumulative area increases as the z-scores increase.
3. The cumulative area for z=0 is .5.
4. The cumulative area is close to 1 for z-scores close to z= 3.49.

Ex- Use the table in Appendix A p.A16 – A17 or look at Table 4 in the cardstock pages at the back of the book to find the area under the standard normal curve.
a) Find the area to the left of z= -.84.
b) Find the area to the right of z= 1.68.
c) Find the area between z= -1.22 and z= -.43.
d) Find $P(-2.15 < z < 1.55)$.
e) Find $P(z < -2.5 \text{ or } z > 2.5)$.

For all probability problems in Chapter 5…
Step 1- Verify normality.
Step 2- Write the probability notation.
Step 3- Do the problem.

Ex- The green turtle migrates across the Southern Atlantic in the winter. A study found that green turtles migrate an average of 2200 km with a standard deviation of 625 km. Assuming the distances are normally distributed, find the following probabilities.
a) Find the probability that a green turtle migrates less than 1900 km.
b) Find the probability that a green turtle migrates between 2000 km and 2500 km.
c) Find the probability that a green turtle migrates farther than 2450 km.

5.2 Normal Distributions: Finding Probabilities

Note: 5.2 is the same as 5.1, just more practice with word problems.

Ex- The lengths of Atlantic croaker fish are normally distributed with a mean of 10 inches and a std. dev. of 2 inches. A fish is randomly selected.
a) Find the probability that the length of the fish is at most 7 inches.
b) Find the probability that the length of the fish is between 7 and 15 inches.
c) Find the probability that the length of the fish is more than 15 inches.
d) What percent of fish are longer than 11 inches?
e) If a fish has a z-score of 1.39, what is its corresponding length?
f) If 200 Atlantic croakers are randomly selected, how many of them would you expect to be shorter than 8 inches?

A z-score tells you how far the data value is away from the mean.
1. If z= 2.31, is x above or below the mean?
2. If z= -3.81, is x considered to be an outlier?
3. Explain why $P(z < -2.5)$ and $P(z > 2.5)$ are the same.

5.3 Normal Distributions: Finding Values

Note: 5.3 is backwards from 5.2. You are given the probability and you have to find the z-score.

Ex- Find the z-score corresponding to each cumulative area or percentile.
a) Area= .9945
b) Area= .0192
c) Probability= .45
d) $P_7$
e) $P_{40}$
f) $P_{99}$

Transforming a z-score to an x-value
Solve $z = \frac{x - \mu}{\sigma}$ for x to get $x = \mu + z\sigma$.
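The table lookups in 5.1 through 5.3 and the z-to-x conversion can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` stands in for the printed z-table; the helper name `x_from_z` is hypothetical, and the croaker numbers (mean 10 inches, std. dev. 2 inches, z = 1.39) are taken from the 5.2 example.

```python
from statistics import NormalDist


def x_from_z(z, mu, sigma):
    """Convert a z-score back to an x-value using x = mu + z * sigma."""
    return mu + z * sigma


# Standard normal distribution: mean 0, std. dev. 1.
std_normal = NormalDist(mu=0, sigma=1)

# Cumulative area to the left of a z-score (what the z-table reports).
area_left_of_zero = std_normal.cdf(0.0)

# Going backwards as in 5.3: the z-score for a cumulative area or percentile.
z_p40 = std_normal.inv_cdf(0.40)

# Croaker example: z = 1.39 with mu = 10 and sigma = 2 gives about 12.78 inches.
croaker_length = x_from_z(1.39, mu=10, sigma=2)

print(area_left_of_zero, z_p40, croaker_length)
```

Table values are rounded to four decimal places, so answers read from the chart may differ slightly from the computed ones.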
Ex- On a dry surface, the braking distance of a Pontiac Grand Am SE can be approximated by a normal curve where the average braking distance is 45.1 meters and the std. dev. is .5 meters.
a) Find the braking distance of a Pontiac Grand Am SE that has a z-score of -2.4.
b) What is the braking distance that represents the 95th percentile?
c) What braking distance corresponds to the third quartile?
d) What is the shortest braking distance that can be in the top 10% of braking distances?
e) What is the longest braking distance that can be in the bottom 5% of braking distances?

5.4 Sampling Distributions and the Central Limit Theorem

Intro Activity: Choose five words randomly from anywhere in the Ch 5 notes and write them down here.

Def- The sampling distribution is the probability distribution of a sample statistic that is formed when samples of size n are repeatedly taken from a population. If the sample statistic is the sample mean, then the distribution is the sampling distribution of sample means. Every sample statistic has a sampling distribution.

Properties of Sampling Distributions of Sample Means
1. The mean of the sample means, $\mu_{\bar{x}}$, is equal to the population mean: $\mu_{\bar{x}} = \mu$.
2. The std. dev. of the sample means, $\sigma_{\bar{x}}$, is equal to the population standard deviation divided by the square root of the sample size: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. The std. dev. of the sampling distribution of sample means is called the standard error of the mean.

Ex- If I asked everyone in class to choose 3 numbers between 1 and 10 and then find the mean, the mean might not be close to 5.5 (as it should be). What if I then asked everyone what their mean was and found the average of all of the means? This average would be very close to 5.5. This is the idea of a sampling distribution of sample means. Your data consists of means (instead of x-values).

The Central Limit Theorem
1. If samples of size n, where $n \ge 30$, are drawn from any population with mean $\mu$ and std. dev.
$\sigma$, then the sampling distribution of sample means approximates a normal distribution. The greater the sample size, the better the approximation.
2. If the population is normally distributed, then the sampling distribution of sample means is normal for any sample size n.

Formulas for either case above:
$\mu_{\bar{x}} = \mu$
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

Ex- The graph shows the relative frequency for the amount of time that people spend in the shower per day. Determine which of the following graphs most likely resembles the sampling distribution of sample means in a sample of 100 people.

Ex- People in the U.S. consume an average of 154.8 pounds of processed fruit per year with a std. dev. of 51.6 pounds.
a) Random samples of size 54 are drawn from the population and the mean of each sample is calculated. Find the mean of the sampling distribution of sample means and the std. dev. of the sampling distribution of sample means.
b) Assume that the population is normally distributed and samples of size 12 are selected. Find $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$.
c) Assume that the population is not normally distributed and samples of size 12 are selected. Find $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$.

Probability and the Central Limit Theorem

To find the probability that a sample mean, $\bar{x}$, will fall in a given interval of the $\bar{x}$ sampling distribution, you use the formula $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$ or, equivalently, $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.

Ex- The average annual salary for chauffeurs is $21,000 with a std. dev. of $1500.
a) What is the probability that in a sample of 45 chauffeurs, the average annual salary is less than $20,000?
b) Find the probability that in a sample of 45 chauffeurs, the average annual salary is more than $21,700.
c) Find the probability that a randomly selected chauffeur has an annual salary that is at most $21,900. Assume that the data is normally distributed.

Note: Probabilities less than 5% are unusual. Which of the above events are unusual?

5.5 Normal Approximations to Binomial Distributions

Note: In 4.2 we used the binomial formula to calculate probabilities.
In practice, when x is large, as in $P(x \le 150)$ for n=200, we want a more efficient method than using the binomial formula 151 times (x=0, 1, 2, …, 150). The more efficient method is discussed here in section 5.5.

Normal Approximation to a Binomial Distribution

If $np \ge 5$ and $nq \ge 5$, then the binomial random variable x is approximately normally distributed with mean $\mu = np$ and std. dev. $\sigma = \sqrt{npq}$, where n is the number of independent trials, p is the probability of success in a single trial, and q is the probability of failure in a single trial.

Ex- Determine if you can use a normal distribution to approximate the binomial situation. If you can, find the mean and std. dev. If you can’t, explain why.
a) A survey indicates that 46% of women pack too much when they go on vacation. Twelve women are randomly selected and asked if they pack too much on their vacation.
b) Out of 117 people surveyed, 23 hate statistics. Seventeen people are randomly selected and asked if they hate statistics.

Continuity Correction

P(x=c) cannot be found directly with the z-chart because a single value has no width under a continuous curve.

[Figure: exact binomial probability (bars) compared with the continuous normal approximation (curve)]

To allow for the curve to be continuous, you add and subtract .5 so that the bars meet. So P(x=c) is approximated by P(c – .5 < x < c + .5). This process is called the continuity correction.

Ex- Write the appropriate inequality to represent the normal probability for the given binomial probability.
a) Binomial P( x 25) Normal
b) Binomial P( x 45) Normal
c) Binomial P (5 x 8) Normal
d) Binomial P ( x 2) Normal
e) Binomial P ( x 8) Normal

Using a Normal Distribution to Approximate Binomial Probabilities
1. Make sure the distribution is approximately normal ($np \ge 5$ and $nq \ge 5$).
2. Add and subtract .5 where appropriate to allow for a continuity correction.
3. Use the z-score to find the probability.

Ex- 65% of children aged 12 to 17 keep at least part of their savings in a savings account.
You randomly select 45 children between 12 and 17 and ask each child if he/she keeps at least part of their savings in a savings account. Determine if a normal approximation to a binomial distribution is possible, then complete the problem using the appropriate method.
a) Find the probability that at most 20 children say yes.
b) Find the probability that more than 30 children say yes.

Ex- 33% of adults graded public schools as “excellent” for preparing students for college. You randomly select 12 adults and ask them if they think public schools are “excellent” for preparing students for college. Determine if a normal approximation to a binomial distribution is possible, then find the probability that more than 5 adults say yes using the appropriate method.

Note: In a discrete probability distribution, there is a difference between $P(x \ge c)$ and $P(x > c)$:
$P(x \ge 3) = P(3) + P(4) + P(5) + \dots$ and $P(x > 3) = P(4) + P(5) + P(6) + \dots$
In a continuous probability distribution, there is no difference between $P(x \ge c)$ and $P(x > c)$, so $P(x \ge 3) = P(x > 3)$.
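The three-step procedure for approximating a binomial probability can be sketched in code. This is a minimal sketch, assuming Python's standard library (`math.comb` and `statistics.NormalDist`) stands in for the z-table; the helper names `can_approximate`, `binomial_cdf`, and `normal_approx_at_most` are hypothetical, and the numbers come from the savings-account example (n = 45, p = .65) and the public-schools example (n = 12, p = .33).

```python
from math import comb, sqrt
from statistics import NormalDist


def can_approximate(n, p):
    """Step 1: the approximation applies only if np >= 5 and nq >= 5."""
    q = 1 - p
    return n * p >= 5 and n * q >= 5


def binomial_cdf(c, n, p):
    """Exact P(x <= c) by summing the binomial formula for x = 0, 1, ..., c."""
    q = 1 - p
    return sum(comb(n, k) * p**k * q ** (n - k) for k in range(c + 1))


def normal_approx_at_most(c, n, p):
    """Steps 2 and 3: P(x <= c) is approximated by the normal P(x < c + 0.5)."""
    mu = n * p                      # mean of the binomial
    sigma = sqrt(n * p * (1 - p))   # std. dev. of the binomial
    return NormalDist(mu, sigma).cdf(c + 0.5)


# Savings-account example: at most 20 of 45 children say yes.
n, p = 45, 0.65
usable = can_approximate(n, p)
approx = normal_approx_at_most(20, n, p)
exact = binomial_cdf(20, n, p)

# Public-schools example: n = 12, p = .33 fails the np >= 5 check.
small_sample_usable = can_approximate(12, 0.33)

print(usable, approx, exact, small_sample_usable)
```

When the check fails, as it does for n = 12 and p = .33, the exact binomial sum is the fallback; for example, the probability of more than 5 successes would be `1 - binomial_cdf(5, 12, 0.33)`.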