Download chapter1(3) - Portal UniMAP

Sampling Distributions

Introduction
Point Estimation
Sampling Distribution of $\bar{x}$
Sampling Distribution for the Difference between Two Means
Sampling Distribution of $\hat{p}$
Sampling Distribution for the Difference between Two Proportions

Introduction

A sampling distribution is the distribution of all possible values of a sample statistic for samples of a given size selected from a population.

For example, suppose you sample 50 students from your college and compute their mean GPA. If you drew many different samples of 50, you would generally compute a different mean for each sample. We are interested in the distribution of all the mean GPAs we might obtain from samples of 50 students.

We select a sample in order to collect data that answer a research question about a population. Because the sample contains only a portion of the population, the sample results provide only estimates of the population characteristics. With proper sampling methods, the sample results can provide “good” estimates of those characteristics.

Point Estimation

Point estimation is a form of statistical inference. In point estimation we use the sample data to compute the value of a sample statistic that serves as an estimate of a population parameter.

$\bar{x}$ is the point estimator of the population mean $\mu$.
$s$ is the point estimator of the population standard deviation $\sigma$.
$\hat{p}$ is the point estimator of the population proportion $p$.

Sampling Distribution of the Sample Mean

The probability distribution of $\bar{X}$ is called its sampling distribution. It lists the various values that $\bar{X}$ can assume and the probability of each value. In general, the probability distribution of a sample statistic is called its sampling distribution.
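The idea of a sampling distribution can be illustrated by simulation. The sketch below is not part of the original notes: the GPA population is made up (values drawn around 3.0 and clipped to the range 0 to 4), and the helper name simulate_sample_means is hypothetical. It draws many samples of 50 students and collects their means, so that the spread of those means can be compared with the spread of the population.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, num_samples, seed=0):
    """Draw repeated random samples (without replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical population of 5,000 GPAs (made-up values, for illustration only).
pop_rng = random.Random(42)
population = [min(4.0, max(0.0, pop_rng.gauss(3.0, 0.5))) for _ in range(5000)]

# 2,000 samples of 50 students each; every sample contributes one mean.
sample_means = simulate_sample_means(population, sample_size=50, num_samples=2000)

print("population mean:         ", round(statistics.fmean(population), 3))
print("mean of sample means:    ", round(statistics.fmean(sample_means), 3))
print("population std dev:      ", round(statistics.stdev(population), 3))
print("std dev of sample means: ", round(statistics.stdev(sample_means), 3))
```

Under these assumptions the sample means should cluster around the population mean and vary much less than individual GPAs do. A histogram of sample_means would give a picture of the sampling distribution itself.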
If a population is normal with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of $\bar{X}$ is also normal, with

$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

Z-value for the sampling distribution of $\bar{X}$:

$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Sample Mean Sampling Distribution: If the Population is not Normal

We can apply the Central Limit Theorem: even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough. As the sample size $n$ increases, the sampling distribution becomes almost normal regardless of the shape of the population.

Properties and Shape of the Sampling Distribution of the Sample Mean, $\bar{X}$

If $n \geq 30$, $\bar{X}$ is approximately normally distributed: $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$. Note: if $\sigma^2$ is unknown, it is estimated by $s^2$.

If $n < 30$, the population is normal, and the variance is known, $\bar{X}$ is normally distributed: $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.

If $n < 30$, the population is normal, and the variance is unknown, the t distribution with $n - 1$ degrees of freedom is used: $T = \frac{\bar{X} - \mu}{\sqrt{s^2 / n}} \sim t_{n-1}$.

Example

Suppose a population has mean $\mu = 8$ and standard deviation $\sigma = 3$. Suppose a random sample of size $n = 36$ is selected. What is the probability that the sample mean is between 7.8 and 8.2?

Solution:

Even if the population is not normally distributed, the central limit theorem can be used ($n > 30$), so the sampling distribution of $\bar{X}$ is approximately normal with mean $\mu_{\bar{X}} = 8$ and standard deviation $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$.

$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$

Example:

The amount of time required to change the oil and filter of a vehicle is normally distributed with a mean of 45 minutes and a standard deviation of 10 minutes. A random sample of 16 cars is selected.

a) What is the standard error of the sample mean?
b) What is the probability that the sample mean is between 45 and 52 minutes?
c) What is the probability that the sample mean is between 39 and 48 minutes?
d) Find the two values between which the middle 95% of all sample means lie.
Solution:

$X$: the amount of time required to change the oil and filter of a vehicle, $X \sim N(45, 10^2)$, $n = 16$
$\bar{X}$: the mean amount of time required to change the oil and filter of a vehicle, $\bar{X} \sim N\left(45, \frac{10^2}{16}\right)$

a) Standard error = standard deviation of $\bar{X}$: $\sigma_{\bar{X}} = \frac{10}{\sqrt{16}} = 2.5$

b) $P(45 < \bar{X} < 52) = P\left(\frac{45 - 45}{2.5} < Z < \frac{52 - 45}{2.5}\right) = P(0 < Z < 2.8) = 0.4974$

c) $P(39 < \bar{X} < 48) = P\left(\frac{39 - 45}{2.5} < Z < \frac{48 - 45}{2.5}\right) = P(-2.4 < Z < 1.2) = 0.4918 + 0.3849 = 0.8767$

d) $P(a < \bar{X} < b) = 0.95$
$P\left(\frac{a - 45}{2.5} < Z < \frac{b - 45}{2.5}\right) = 0.95$
$P(z_a < Z < z_b) = 0.95$
From the table: $z_a = -1.96$, $z_b = 1.96$
$\frac{a - 45}{2.5} = -1.96 \Rightarrow a = 40.1$
$\frac{b - 45}{2.5} = 1.96 \Rightarrow b = 49.9$

Sampling Distribution for the Difference between Two Means

Suppose we have two populations, $X_1$ and $X_2$, which are normally distributed. $X_1$ has mean $\mu_1$ and variance $\sigma_1^2$, while $X_2$ has mean $\mu_2$ and variance $\sigma_2^2$. These two distributions can be written as:

$X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$

Now we are interested in the sampling distribution of the difference between two sample means, that is, the distribution of $\bar{X}_1 - \bar{X}_2$. For independent samples of sizes $n_1$ and $n_2$:

$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$

Example:

A taxi company purchased two brands of tires, brand A and brand B. It is known that the mean distance travelled before the tires wear out is 36300 km for brand A, with a standard deviation of 200 km, while the mean distance travelled before the tires wear out is 36100 km for brand B, with a standard deviation of 300 km. A random sample of 36 tires of brand A and 49 tires of brand B is taken. What is the probability that

a) the difference between the mean distances travelled before the tires of brand A and brand B wear out is at most 300 km?
b) the mean distance travelled by tires of brand A is larger than the mean distance travelled by tires of brand B before the tires wear out?
Solution:

$\bar{X}_1$: the sample mean distance travelled before the tires of brand A wear out
$\bar{X}_2$: the sample mean distance travelled before the tires of brand B wear out

$\bar{X}_1 - \bar{X}_2 \sim N\left(36300 - 36100, \frac{200^2}{36} + \frac{300^2}{49}\right)$
$\bar{X}_1 - \bar{X}_2 \sim N(200, 2947.846)$

a) $P(|\bar{X}_1 - \bar{X}_2| \leq 300) = P(-300 \leq \bar{X}_1 - \bar{X}_2 \leq 300) = P\left(\frac{-300 - 200}{\sqrt{2947.846}} \leq Z \leq \frac{300 - 200}{\sqrt{2947.846}}\right) = P(-9.21 \leq Z \leq 1.84) = 0.9671$

b) $P(\bar{X}_1 > \bar{X}_2) = P(\bar{X}_1 - \bar{X}_2 > 0) = P\left(Z > \frac{0 - 200}{\sqrt{2947.846}}\right) = P(Z > -3.68) = 0.9999$

Sampling Distribution of the Sample Proportion

The population and sample proportions, denoted by $p$ and $\hat{p}$ respectively, are calculated as

$p = \frac{X}{N}$ and $\hat{p} = \frac{x}{n}$

where
$N$ = total number of elements in the population;
$X$ = number of elements in the population that possess a specific characteristic;
$n$ = total number of elements in the sample; and
$x$ = number of elements in the sample that possess a specific characteristic.

For large values of $n$ ($n \geq 30$), the sampling distribution of $\hat{p}$ is very closely normal:

$\hat{p} \sim N\left(p, \frac{pq}{n}\right)$, where $q = 1 - p$

Mean and Standard Deviation of Sample Proportion

$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$

Example

If the true proportion of voters who support Proposition A is $p = 0.40$, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45? That is, if $p = 0.40$ and $n = 200$, what is $P(0.40 \leq \hat{p} \leq 0.45)$?

$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.4(1 - 0.4)}{200}} = 0.03464$

$P(0.40 \leq \hat{p} \leq 0.45) = P\left(\frac{0.40 - 0.40}{0.03464} \leq Z \leq \frac{0.45 - 0.40}{0.03464}\right) = P(0 \leq Z \leq 1.44) = 0.4251$

Example:

The National Survey of Engagement shows that about 87% of freshmen and seniors rate their college experience as “good” or “excellent”. Assume this result is true for the current population of freshmen and seniors. Let $\hat{p}$ be the proportion of freshmen and seniors in a random sample of 900 who hold this view. Find the mean and standard deviation of $\hat{p}$.

Solution:

Let $p$ be the proportion of all freshmen and seniors who rate their college experience as “good” or “excellent”.
Then, $p = 0.87$ and $q = 1 - p = 1 - 0.87 = 0.13$.

The mean of the sampling distribution of $\hat{p}$ is:

$\mu_{\hat{p}} = p = 0.87$

The standard deviation of $\hat{p}$ is:

$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.87(0.13)}{900}} = 0.011$

Sampling Distribution for the Difference between Two Proportions

Now say we have two binomial populations with proportions of successes $p_1$ and $p_2$ respectively. Samples of size $n_1$ are taken from population 1 and samples of size $n_2$ are taken from population 2. Then $\hat{P}_1$ and $\hat{P}_2$ are the proportions from those samples.

$\hat{P}_1 \sim N\left(p_1, \frac{p_1(1 - p_1)}{n_1}\right)$

$\hat{P}_2 \sim N\left(p_2, \frac{p_2(1 - p_2)}{n_2}\right)$

$\hat{P}_1 - \hat{P}_2 \sim N\left(p_1 - p_2, \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}\right)$

Example:

A certain change in a process for the manufacture of component parts was considered. It was found that 75 out of 1500 items from the existing procedure were defective and 80 out of 2000 items from the new procedure were defective. If a random sample of 49 items is taken from the existing procedure and a random sample of 64 items is taken from the new procedure, what is the probability that

a) the proportion of defective items from the new procedure exceeds the proportion of defective items from the existing procedure?
b) the proportions differ by at most 0.015?
c) the proportion of defective items from the new procedure exceeds the proportion of defective items from the existing procedure by at least 0.02?
Solution:

$\hat{P}_N$: the proportion of defective items from the new procedure
$\hat{P}_E$: the proportion of defective items from the existing procedure

$p_N = \frac{80}{2000} = 0.04$, so $\hat{P}_N \sim N\left(0.04, \frac{0.04(0.96)}{64}\right)$

$p_E = \frac{75}{1500} = 0.05$, so $\hat{P}_E \sim N\left(0.05, \frac{0.05(0.95)}{49}\right)$

$\hat{P}_N - \hat{P}_E \sim N\left(0.04 - 0.05, \frac{0.05(0.95)}{49} + \frac{0.04(0.96)}{64}\right)$
$\hat{P}_N - \hat{P}_E \sim N(-0.01, 0.0016)$

a) $P(\hat{P}_N > \hat{P}_E) = P(\hat{P}_N - \hat{P}_E > 0) = P\left(Z > \frac{0 - (-0.01)}{\sqrt{0.0016}}\right) = P(Z > 0.25) = 0.4013$

b) $P(|\hat{P}_N - \hat{P}_E| \leq 0.015) = P(-0.015 \leq \hat{P}_N - \hat{P}_E \leq 0.015) = P\left(\frac{-0.015 - (-0.01)}{\sqrt{0.0016}} \leq Z \leq \frac{0.015 - (-0.01)}{\sqrt{0.0016}}\right) = P(-0.125 \leq Z \leq 0.625) = 0.2838$

c) $P(\hat{P}_N \geq \hat{P}_E + 0.02) = P(\hat{P}_N - \hat{P}_E \geq 0.02) = P\left(Z \geq \frac{0.02 - (-0.01)}{\sqrt{0.0016}}\right) = P(Z \geq 0.75) = 0.2266$
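The probabilities in the defective-items example above can be checked numerically. The following is a minimal Python sketch, not part of the original notes. It assumes the normal approximation for the difference between two sample proportions, and the helper name diff_of_proportions_dist is hypothetical. Because it keeps the unrounded variance (about 0.00157 rather than 0.0016), its results can differ from the table-based answers in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def diff_of_proportions_dist(p1, n1, p2, n2):
    """Normal approximation for the difference P1_hat - P2_hat."""
    mean = p1 - p2
    variance = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
    return NormalDist(mu=mean, sigma=sqrt(variance))


# Population proportions and sample sizes taken from the example.
p_new, n_new = 80 / 2000, 64   # new procedure
p_old, n_old = 75 / 1500, 49   # existing procedure

d = diff_of_proportions_dist(p_new, n_new, p_old, n_old)

prob_a = 1 - d.cdf(0)                   # a) new proportion exceeds existing
prob_b = d.cdf(0.015) - d.cdf(-0.015)   # b) proportions differ by at most 0.015
prob_c = 1 - d.cdf(0.02)                # c) new exceeds existing by at least 0.02

print(f"a) {prob_a:.4f}")
print(f"b) {prob_b:.4f}")
print(f"c) {prob_c:.4f}")
```

The same helper could be reused for other two-proportion problems by changing the four arguments. Using NormalDist.cdf avoids a z-table lookup, which is why no rounding of z-values is needed here.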
