
Lecture 6

Instruction: Sampling Distributions

A population can be thought of as a set of measurements, either existing or conceptual. Recall that a sample is a subset of measurements from the population. Recall also that a population parameter is a numerical descriptive measure of the population. For example, the population mean $\mu$ is a parameter. A statistic, on the other hand, is a numerical descriptive measure of a sample, such as the sample mean $\bar{X}$.

Often we do not have access to all the measurements of an entire population, so we must use samples instead. In such cases, we use a statistic to make inferences about the corresponding population parameter. To evaluate the reliability of our inferences, we need to know the probability distribution of the statistic we are using. This probability distribution is called a sampling distribution. Note that a sampling distribution is specific to a particular statistic. For example, the sampling distribution of the mean is the distribution of all the possible sample means for samples of a given size. While the $\bar{X}$-distribution is not the only sampling distribution, it is a very important one.

Instruction: Sampling Distribution of the Mean

Consider a set of data that includes 100 samples of ten measurements each. Suppose the table below reflects the means of the 100 samples.

Classes of $\bar{X}$ for the 100 Samples

Class                     f     f/100 = Relative Frequency
6 < $\bar{X}$ < 7         3     0.03
7 < $\bar{X}$ < 8         5     0.05
8 < $\bar{X}$ < 9         9     0.09
9 < $\bar{X}$ < 10        12    0.12
10 < $\bar{X}$ < 11       14    0.14
11 < $\bar{X}$ < 12       17    0.17
12 < $\bar{X}$ < 13       16    0.16
13 < $\bar{X}$ < 14       14    0.14
14 < $\bar{X}$ < 15       6     0.06
15 < $\bar{X}$ < 16       4     0.04

Since the relative frequencies may be thought of as probabilities, the table effectively represents a probability distribution. Because $\bar{X}$ represents the sample mean, we can estimate the probability of $\bar{X}$ falling into each class by using the relative frequencies. Accordingly, the grouped relative frequency distribution given in Figure 1 represents a probability distribution of the $\bar{X}$ values.
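The relative frequencies in the table above can be reproduced directly from the class counts. The following is a minimal Python sketch; the use of class midpoints (6.5, 7.5, and so on) to estimate the mean of the sample means is an assumption made for illustration and is not part of the lecture.

```python
# Class counts f for the 100 sample means, taken from the table above.
lower_bounds = list(range(6, 16))            # classes 6-7, 7-8, ..., 15-16
counts = [3, 5, 9, 12, 14, 17, 16, 14, 6, 4]

total = sum(counts)                          # number of samples
rel_freq = [f / total for f in counts]       # estimated probabilities

# Assumption: approximate every sample mean in a class by the class midpoint.
midpoints = [lo + 0.5 for lo in lower_bounds]
approx_mean = sum(m * p for m, p in zip(midpoints, rel_freq))

for lo, p in zip(lower_bounds, rel_freq):
    print(f"{lo} to {lo + 1}: {p:.2f}")
print("sum of relative frequencies:", round(sum(rel_freq), 10))
print("approximate mean of the sample means:", round(approx_mean, 2))
```

The relative frequencies sum to 1, which is what allows the table to be read as a probability distribution.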
[Figure 1: bar chart of relative frequency, f/100 (vertical axis, 0 to 0.2), against measurements in classes 6.0-7.0 through 15.0-16.0 (horizontal axis)]

Each bar in Figure 1 represents the estimated probability that $\bar{X}$ falls in the corresponding class of the table; thus, the graph represents an estimated sampling distribution for the sample mean based on random samples of ten measurements. We can see that the distribution is mound-shaped and almost bell-shaped. Irregularities occur because of the small number of samples used (only 100 sample means) and the rather small sample size (ten measurements per sample). These irregularities would become less obvious, and even disappear, if the number of samples, the number of classes, and the number of measurements per sample all increased. In fact, the graph would eventually approach a perfect bell-shaped curve. This property of the sampling distribution for the sample mean is the main conclusion of the two theorems discussed in the next portion of this lecture.

Instruction: Central Limit Theorem

The sample mean is said to be unbiased because the mean of all the possible sample means of a given sample size equals the population mean. Indeed, we can rely on the Sample Mean Theorem stated below.

Let X be a random variable with a normal distribution whose mean is $\mu$ and standard deviation is $\sigma$. Let $\bar{X}$ be the sample mean corresponding to random samples of size n taken from the X-distribution. Then the Sample Mean Theorem asserts that the following three statements are true:

a) The $\bar{X}$-distribution is a normal distribution.
b) The mean of the $\bar{X}$-distribution is $\mu$.
c) The standard deviation of the $\bar{X}$-distribution, called the standard error of the mean, is $\sigma/\sqrt{n}$.

Using the Sample Mean Theorem, we conclude that the $\bar{X}$-distribution will be normal, regardless of the sample size, provided that the X-distribution is normal.
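Statements (b) and (c) of the Sample Mean Theorem can be checked by simulation. The sketch below is illustrative only: the population parameters, the sample size, the number of samples, and the random seed are all hypothetical values chosen for the example, not values from the lecture.

```python
import random
import statistics

random.seed(0)             # fixed seed so the run is repeatable

mu, sigma = 10.0, 2.0      # hypothetical population mean and standard deviation
n = 10                     # measurements per sample
num_samples = 20_000       # number of samples drawn

# Draw many samples from a normal population and record each sample mean.
sample_means = []
for _ in range(num_samples):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    sample_means.append(sum(sample) / n)

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
standard_error = sigma / n ** 0.5   # sigma / sqrt(n), as in statement (c)

print("mean of sample means:", round(mean_of_means, 3), "vs mu =", mu)
print("SD of sample means:", round(sd_of_means, 3),
      "vs sigma/sqrt(n) =", round(standard_error, 3))
```

With this many samples, the simulated mean and standard deviation of the sample means should land close to $\mu$ and $\sigma/\sqrt{n}$, although a simulation can only suggest the result, not prove it.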
Furthermore, we can convert the $\bar{X}$-distribution to the standard normal Z-distribution using the formula below.

The Z-score for the sampling distribution of the mean is given by

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}\,(\bar{X} - \mu)}{\sigma}$$

The Sample Mean Theorem gives complete information about the $\bar{X}$-distribution provided the original X-distribution is normal. It turns out, however, that the same conclusions hold approximately as long as the sample size is "large enough," regardless of whether or not the X-distribution is normal. This is the conclusion of the Central Limit Theorem stated below.

The Central Limit Theorem states that if X possesses any distribution with mean $\mu$ and standard deviation $\sigma$, then the sample mean $\bar{X}$ based on a random sample of size n will have a distribution that approaches the distribution of a normal random variable with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$ as n increases without limit.

According to the Central Limit Theorem, the $\bar{X}$-distribution approaches the normal distribution as the sample size n increases without limit. A widely used rule of thumb is that once n reaches at least thirty, the $\bar{X}$-distribution approximates a normal distribution closely enough to be treated as essentially normal.

Instruction: Normal Approximation to the Binomial Distribution

Sometimes the random variable of an experiment is categorical and can take only two "values" (that is, two categorical identities). For instance, consider a vaccine that protects 95% of adults from a disease. From a population of vaccinated adults, a sample of adults could be chosen and each adult observed to be protected or not protected. The random categorical variable, X, takes on the so-called values of protected and not protected. In cases like this, the probability that 480 adults from a sample of 500 are protected can be calculated using the binomial distribution, but doing so would require tedious calculations.
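To see what the exact binomial calculation involves, the following minimal Python sketch evaluates it for the vaccine example (n = 500, p = 0.95). The helper name `binomial_pmf` is hypothetical, and the sketch assumes the adults in the sample are protected independently of one another.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 500, 0.95           # sample size and protection rate from the text

# Probability that exactly 480 of the 500 adults are protected.
prob_exactly_480 = binomial_pmf(480, n, p)

# Probability that at least 480 are protected: one term per count 480..500.
prob_at_least_480 = sum(binomial_pmf(k, n, p) for k in range(480, n + 1))

print("P(exactly 480 protected) =", prob_exactly_480)
print("P(at least 480 protected) =", prob_at_least_480)
```

A computer handles the large binomial coefficients easily, but by hand each term requires a coefficient such as 500 choose 480, which is why an approximation is attractive.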
Luckily, the normal distribution can be used to approximate the binomial distribution given the conditions stated below.

Let p be the probability of success and let 1 - p be the probability of failure in a single binomial trial. Let n be the number of trials in the binomial experiment. If n, p, and 1 - p are such that both np > 5 and n(1 - p) > 5, then the normal probability distribution with $\mu = np$ and $\sigma = \sqrt{np(1 - p)}$ will be a good approximation to the binomial distribution, and as n gets larger the approximation gets better.

When the conditions above hold, the normal distribution approximates the binomial distribution. In practice, one must keep in mind that the binomial distribution is discrete while the normal distribution is continuous, so a continuity correction is needed. Assume that the normal distribution is being used to approximate the binomial for X successes. If a value of X is the left endpoint of an interval, we subtract 0.5 to get the corresponding endpoint for the normal variable. If a value of X is the right endpoint of an interval, we add 0.5 to get the corresponding endpoint for the normal variable. For instance, if we are interested in $P(X \le 6)$ where X is a binomial variable, we would approximate it with a normal variable X by calculating $P(X < 6.5)$. Similarly, if we are interested in $P(X \ge 9)$ where X is a binomial variable, we would approximate it with a normal variable X by calculating $P(X > 8.5)$.

Instruction: Sampling Distribution of the Proportion

Assume that 60% of a population has a particular characteristic (while 40% of the population does not). The parameter $\pi$ denotes the proportion of the population that has the characteristic, while the statistic p denotes the proportion of the members of a sample that have the characteristic.
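To make the distinction between $\pi$ and p concrete, the sketch below draws a few random samples from a population with $\pi = 0.60$ and computes p for each. The sample size, the number of samples, the seed, and the helper name `sample_proportion` are hypothetical, and the population is assumed large enough that each member drawn has the characteristic independently with probability $\pi$.

```python
import random

random.seed(1)     # fixed seed so the run is repeatable

pi = 0.60          # population proportion from the text
n = 50             # hypothetical sample size
num_samples = 5    # hypothetical number of samples to draw

def sample_proportion(n: int, pi: float) -> float:
    """Draw n members at random and return the share with the characteristic."""
    hits = sum(1 for _ in range(n) if random.random() < pi)
    return hits / n

proportions = [sample_proportion(n, pi) for _ in range(num_samples)]

print("population proportion pi:", pi)
print("sample proportions p:", proportions)
```

The parameter $\pi$ stays fixed, while p changes from sample to sample; that sample-to-sample variation is what a sampling distribution describes.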
If the sample proportion were calculated for every possible sample of size n, the distribution of the calculated values of p would be the sampling distribution of the proportion. The number of sample members with the characteristic follows the binomial distribution, and p is that count divided by n, so the sampling distribution of the proportion is approximated by the normal distribution under the conditions we have seen. The mean of the sampling distribution of the proportion equals $\pi$. The standard deviation of the sampling distribution of the proportion is called the standard error of the proportion and is given by the equation below.

The standard error of the proportion, denoted $\sigma_p$, is given by

$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$$

Using $\pi$ in place of the population mean, p in place of the sample mean, and $\sigma_p$ in place of the standard error of the mean, we obtain the Z-score for the sampling distribution of the proportion given below.

The Z-score for the sampling distribution of the proportion is given by

$$Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\pi(1 - \pi)/n}}$$

Assignment 6

Problems

#1 Suppose a team of marine biologists studying a population of trout has determined that trout length X has a normal distribution with mean $\mu = 10.2$ inches and standard deviation $\sigma = 1.4$ inches. What is the probability that the mean length of five trout taken at random is between 8 and 12 inches?

#2 A certain strain of bacteria occurs in all raw milk. Let X be the bacteria count per milliliter of milk. The health department has found that if the milk is not contaminated, then X has a distribution that is more or less mound-shaped but not symmetric. The mean of the X-distribution is $\mu = 2{,}500$ and the standard deviation is $\sigma = 300$. In a large commercial dairy, the health inspector takes 42 random samples of the milk produced each day. At the end of the day, the bacteria counts from the 42 samples are averaged to obtain the sample mean bacteria count $\bar{X}$.

A) Assuming the milk is not contaminated, what are the mean and standard deviation of the distribution of $\bar{X}$?
B) Assuming the milk is not contaminated, what is the probability that the average bacteria count $\bar{X}$ for one day is less than 2,650 bacteria per milliliter?

C) At the end of each day, the health inspector must write a report to accept or reject the milk produced that day. What should the health inspector do if the $\bar{X}$ for the day is greater than 2,650?

#3 The owner of a new apartment building must install twenty-five water heaters. From past experience in other apartment buildings, the owner knows that Sun-Temp is a good brand; indeed, Consumer Reports has determined that the probability that a Sun-Temp water heater will last ten years or more is 0.25. What is the probability that eight or more of the owner's twenty-five new Sun-Temp water heaters will last at least ten years?

#4 According to Horseman's Quarterly, two out of five qualifying racehorses win money. If fifty qualifying racehorses are selected at random, what is the probability that fewer than two-thirds of the fifty horses will win money?
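The normal approximation with continuity correction, which problems such as #3 rely on, can be sketched numerically. The example below deliberately uses the lecture's vaccine scenario (n = 500, p = 0.95) rather than any assignment data, and compares the approximation of the probability that at least 480 adults are protected with the exact binomial value. It assumes Python 3.8 or later for `statistics.NormalDist` and `math.comb`.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 500, 0.95                       # vaccine example from the lecture

# Conditions for the normal approximation: np > 5 and n(1 - p) > 5.
conditions_hold = n * p > 5 and n * (1 - p) > 5

mu = n * p                             # mean of the approximating normal
sigma = sqrt(n * p * (1 - p))          # its standard deviation
approx_dist = NormalDist(mu, sigma)

# P(X >= 480): 480 is a left endpoint, so subtract 0.5 before using the CDF.
normal_estimate = 1 - approx_dist.cdf(480 - 0.5)

# Exact binomial value for comparison.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(480, n + 1))

print("conditions hold:", conditions_hold)
print("normal approximation:", round(normal_estimate, 4))
print("exact binomial:", round(exact, 4))
```

The same pattern (check the conditions, compute $\mu$ and $\sigma$, shift the endpoint by 0.5, then use the normal CDF) applies to any binomial count that satisfies the conditions.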
