Ch. 10 Sampling Distributions

1. Parameter and statistic

Recall
• Population:
• Sample:
• Parameter: a number that describes the population
• Statistic: a number that can be computed from the sample data without making use of any unknown parameters

Example 10.1 We were interested in the weight of 2001 Gala apples. We measured the weights of 2,000 of these apples.
µ: average weight of all 2001 Gala apples
x̄: average weight of the 2,000 apples measured
Which one is the parameter? Which one is the statistic?

For Q,
• Parameter: µ, the population mean
• Statistic: x̄, the sample mean

In statistical practice, the value of a parameter is unknown; we use a statistic to estimate the unknown parameter.

Example 10.2 State whether each number in the following is a parameter or a statistic.
(a) The Bureau of Labor Statistics last month interviewed 60,000 members of the labor force, of whom 7.2% were unemployed.
(b) A telemarketing firm in Los Angeles uses a device that dials residential telephone numbers in the city at random. Of the first 100 numbers dialed, 48% are unlisted. This is not surprising because 52% of all Los Angeles residential phones are unlisted.
(c) A researcher carries out a randomized comparative experiment with young rats to investigate the effects of a toxic compound in food. She feeds the control group a normal diet. The experimental group receives a diet with 2500 parts per million of the toxic material. After 8 weeks, the mean weight gain is 335 grams for the control group and 289 grams for the experimental group.

2. Sampling Distributions

Recall:
• population, sample
• parameter, statistic

For Q,
• Parameter: µ, the population mean; and σ, the population s.d.
• Statistic: x̄, the sample mean; and s, the sample s.d.

Example 10.3 Fig. A shows the distribution of returns for all 1815 stocks listed on the New York Stock Exchange for the entire year 1987. This was a year of extreme swings in stock prices, including a record loss of over 20% in a single day.
The mean return for all 1815 stocks was −3.5% and the distribution shows a very wide spread. Fig. B shows the distribution of returns for all possible portfolios that invested equal amounts in each of 5 stocks. A portfolio is just a sample of 5 stocks, and its return is the average return for the 5 stocks chosen. The mean return for all portfolios is still −3.5%, but the variation among portfolios is much less than the variation among individual stocks. For example, 11% of all individual stocks had a loss of more than 40%, but only 1% of the portfolios had a loss that large. The two figures illustrate a basic principle of investment: diversification reduces risk. That is, buying several securities rather than just one reduces the variability of the return on an investment.

• A statistic varies from sample to sample (sampling variability).
• Nevertheless, there is a regular distribution of its values in a large number of repetitions.

Sampling distribution of a statistic: the distribution of values taken by the statistic in all possible samples of the same size from the same population.

For SRS
• The mean of x̄ is equal to µ.
• Averages are less variable than individual observations: as the sample size n increases, the s.d. of x̄ decreases; in fact, the s.d. of x̄ is σ/√n.
• If the population distribution is normal, x̄ is always normal.
• If the population distribution is not normal, averages are more normal than individual observations: when n is large, the sampling distribution of x̄ is close to normal (the central limit theorem). How large an n is needed depends on the shape of the population distribution.

A statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated. x̄ is an unbiased estimator of µ.

(x̄ − µ) / (σ/√n) ∼ N(0, 1)
if the population is normal;
(x̄ − µ) / (σ/√n) ∼ N(0, 1), approximately,
when n is large, no matter what shape the population distribution has.
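
The facts listed under "For SRS" can be checked by simulation. The sketch below is only an illustration: the population (10,000 right-skewed values drawn from an exponential distribution), the sample size n = 25, the number of repetitions, and the helper name simulate_sample_means are all assumptions made for this example and are not part of the notes. It draws many SRSs and compares the mean and s.d. of the resulting x̄ values with µ and σ/√n.

```python
import math
import random
import statistics


def simulate_sample_means(population, n, repetitions, seed=0):
    """Draw `repetitions` SRSs of size n (without replacement) and
    return the list of their sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n))
            for _ in range(repetitions)]


# Hypothetical right-skewed population (an assumption, not data from the notes).
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]

mu = statistics.fmean(population)      # parameter: population mean
sigma = statistics.pstdev(population)  # parameter: population s.d.

n = 25
means = simulate_sample_means(population, n, repetitions=5_000)

mean_of_xbar = statistics.fmean(means)  # should be close to mu
sd_of_xbar = statistics.stdev(means)    # should be close to sigma / sqrt(n)

print(f"mu            = {mu:.3f}   mean of x-bar = {mean_of_xbar:.3f}")
print(f"sigma/sqrt(n) = {sigma / math.sqrt(n):.3f}   s.d. of x-bar = {sd_of_xbar:.3f}")
```

Because the population here is finite and the samples are drawn without replacement, the agreement with σ/√n is approximate; with 10,000 values and n = 25 the difference is negligible. A histogram of `means` would also look much closer to normal than a histogram of `population`.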
Thus, probability problems about x̄ are the normal distribution calculation problems discussed in Chapter 3, with

z-score = (x̄ − µ) / (σ/√n)

Example 10.4 The height x of young American women varies approximately according to N(64.5, 2.5).
P(x > 67) =
For an SRS of 10,
P(x̄ > 67) =

Example 10.5 The GPAs of all students enrolled at a large university have an approximately normal distribution N(3.03, 0.28). Find the probability that the mean GPA of an SRS of 16 students selected from this university is
(a) 3.1 or higher
(b) 2.89 or lower
(c) 2.89 to 3.1
Solution:

Example 10.6 A life insurance company sells a term insurance policy to a 21-year-old male that pays $100,000 if the insured dies within the next 5 years. Suppose that the distribution of profit for each policy has mean $100 and standard deviation $9,000.
(a) If the insurance company sells 25 such policies, what is the probability that the average profit per policy is less than −$1,000?
(b) If the insurance company sells 4,000,000 such policies, what is the probability that the average profit on the 4,000,000 policies is
(i) less than $90
(ii) more than $110
(iii) $90 to $110
Solution:

Example 10.7 Children in kindergarten are sometimes given the Ravin Progressive Matrices Test (RPMT) to assess their readiness for learning. Experience at Southwark Elementary School suggests that the RPMT scores for its kindergarten pupils have mean 13.6 and s.d. 3.1. The distribution is close to normal. Mr. Lavin has 22 children in his kindergarten class this year. He suspects that their RPMT scores will be unusually low because the test was interrupted by a fire drill. To check this suspicion, he wants to find the level L such that there is probability only 0.05 that the mean score of 22 children falls below L when the usual Southwark distribution remains true. What is the value of L?
Solution:
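
Calculations of this kind can be checked numerically. The sketch below uses NormalDist from Python's standard statistics module and applies it to Examples 10.4 and 10.7. It assumes that the second argument in N(64.5, 2.5) is the standard deviation, and the helper name xbar_distribution is made up for this illustration.

```python
import math
from statistics import NormalDist


def xbar_distribution(mu, sigma, n):
    """Normal sampling distribution of the mean of an SRS of size n:
    mean mu, s.d. sigma / sqrt(n)."""
    return NormalDist(mu, sigma / math.sqrt(n))


# Example 10.4: heights ~ N(64.5, 2.5), reading 2.5 as the s.d. (assumption).
heights = NormalDist(64.5, 2.5)
p_single = 1 - heights.cdf(67)                          # P(x > 67), one woman
p_mean = 1 - xbar_distribution(64.5, 2.5, 10).cdf(67)   # P(x-bar > 67), SRS of 10

# Example 10.7: level L with P(x-bar < L) = 0.05 for a class of 22.
L = xbar_distribution(13.6, 3.1, 22).inv_cdf(0.05)

print(f"P(x > 67)     = {p_single:.4f}")
print(f"P(x-bar > 67) = {p_mean:.5f}")
print(f"L             = {L:.2f}")
```

The same helper applies to Examples 10.5 and 10.6 by changing µ, σ, and n; in Example 10.6 the normal shape of x̄ rests on the central limit theorem rather than on a normal population, so the result for n = 25 should be treated as a rough approximation.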