Introduction
Point Estimation
Sampling Distribution of the Sample Mean ($\bar{X}$)
Sampling Distribution for the Difference between Two Means
Sampling Distribution of the Sample Proportion ($\hat{p}$)
Sampling Distribution for the Difference between Two Proportions

Introduction

A sampling distribution is the probability distribution of a sample statistic based on all possible simple random samples of the same size from the same population.

If we take several samples and find the mean of each one, the distribution of those sample means is called the sampling distribution of the sample mean, $\bar{X}$. For example, suppose you sample 50 students from your college and record their mean GPA. If you obtained many different samples of 50, you would compute a different mean for each sample. We are interested in the distribution of all the mean GPAs we might calculate for any given sample of 50 students.

If we take several samples and find the proportion with a specific characteristic in each one, the distribution of those sample proportions is called the sampling distribution of the sample proportion, $\hat{p}$.

Point Estimation

Point estimation is a form of statistical inference. In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.

$\bar{X}$ is the point estimator of the population mean $\mu$.
$s$ is the point estimator of the population standard deviation $\sigma$.
$\hat{p}$ is the point estimator of the population proportion $p$.

Sampling Distribution of the Sample Mean ($\bar{X}$)

The probability distribution of $\bar{X}$ is called its sampling distribution. It lists the various values that $\bar{X}$ can assume and the probability of each value of $\bar{X}$.

If the population is normally distributed with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of the sample mean $\bar{X}$ is also normally distributed with:
i) Mean: $\mu_{\bar{x}} = \mu$
ii) Standard deviation (standard error): $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$, where $n$ is the sample size.

If the population is not normally distributed, apply the central limit theorem.
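The two properties of $\bar{X}$ stated above (mean $\mu$, standard error $\sigma/\sqrt{n}$) can be checked by simulation. The following is a minimal sketch using only the Python standard library; the function name `simulate_sample_means` and the values of `mu`, `sigma`, `n` and `num_samples` are illustrative assumptions, not taken from the slides.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from N(mu, sigma^2) and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


# Illustrative population and sample size (assumed values, not from the slides)
mu, sigma, n = 50.0, 12.0, 16
means = simulate_sample_means(mu, sigma, n, num_samples=20000)

theoretical_se = sigma / n ** 0.5  # sigma / sqrt(n)

print("mean of the sample means:", statistics.fmean(means))
print("std. dev. of the sample means:", statistics.stdev(means))
print("theoretical standard error:", theoretical_se)
```

With enough repeated samples, the first printed value should sit close to `mu` and the second close to `theoretical_se`; the agreement is approximate because it rests on a finite number of simulated samples.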
Central Limit Theorem:
◦ Even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough ($n \ge 30$).
◦ As the sample size gets large enough, the sampling distribution becomes almost normal regardless of the shape of the population.

Z-value for the sampling distribution of the mean ($\bar{X}$):

$$Z = \frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

Example: Suppose a population has mean $\mu = 8$ and standard deviation $\sigma = 3$. A random sample of size $n = 36$ is selected. What is the probability that the sample mean is between 7.8 and 8.2?

Solution:
• Even if the population is not normally distributed, the central limit theorem can be used ($n > 30$).
• So the sampling distribution of the sample mean is approximately normal with mean $\mu_{\bar{x}} = 8$ and standard deviation $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{3}{\sqrt{36}} = 0.5$.
• Hence,

$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$$

Summary for $\bar{X}$:

If $n \ge 30$ (large), the sampling distribution of the sample mean is approximately normally distributed:

$$\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

Note: If $\sigma^2$ is unknown, it is estimated by $s^2$.

If $n < 30$ (small) and $\sigma^2$ is known, the sampling distribution of the sample mean is normally distributed provided the sample is from a normal population:

$$\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

If $n < 30$ and $\sigma^2$ is unknown, the t distribution with $n - 1$ degrees of freedom is used:

$$T = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim t_{n-1}$$

Example: The amount of time required to change the oil and filter of any vehicle is normally distributed with a mean of 45 minutes and a standard deviation of 10 minutes. A random sample of 16 cars is selected.
a) What is the standard error of the sample mean?
b) What is the probability that the sample mean is between 45 and 52 minutes?
c) What is the probability that the sample mean is between 39 and 48 minutes?
d) Find the two values between which the middle 95% of all sample means lie.
Solution:
$X$: the amount of time required to change the oil and filter of any vehicle, $X \sim N(45, 10^2)$, $n = 16$.
$\bar{X}$: the mean amount of time required to change the oil and filter of any vehicle, $\bar{X} \sim N\left(45, \dfrac{10^2}{16}\right)$.

a) The standard error: $\sigma_{\bar{x}} = \dfrac{10}{\sqrt{16}} = 2.5$

b) $P(45 < \bar{X} < 52) = P\left(\dfrac{45 - 45}{2.5} < Z < \dfrac{52 - 45}{2.5}\right) = P(0 < Z < 2.8) = 0.4974$

c) $P(39 < \bar{X} < 48) = P\left(\dfrac{39 - 45}{2.5} < Z < \dfrac{48 - 45}{2.5}\right) = P(-2.4 < Z < 1.2) = 0.4918 + 0.3849 = 0.8767$

d) $P(a < \bar{X} < b) = 0.95$, so

$$P\left(\frac{a - 45}{2.5} < Z < \frac{b - 45}{2.5}\right) = P(z_a < Z < z_b) = 0.95$$

From the table: $z_a = -1.96$ and $z_b = 1.96$.

$$\frac{a - 45}{2.5} = -1.96 \Rightarrow a = 40.1, \qquad \frac{b - 45}{2.5} = 1.96 \Rightarrow b = 49.9$$

Exercise: A certain type of thread is manufactured with a mean tensile strength of 78.3 kg and a standard deviation of 5.6 kg. Assuming that the strength of this type of thread is approximately normally distributed, find:
a) The probability that the mean strength of a random sample of 10 such threads falls between 77 kg and 78 kg.
b) The probability that the mean strength is greater than 79 kg.
c) The probability that the mean strength is less than 76 kg.
d) The value of $\bar{X}$ to the right of which 15% of the means computed from random samples of size 10 would fall.

Sampling Distribution for the Difference between Two Means

Suppose we have two populations which are normally distributed:

$$X_1 \sim N(\mu_1, \sigma_1^2) \quad \text{and} \quad X_2 \sim N(\mu_2, \sigma_2^2)$$

Sampling distributions for $\bar{X}_1$ and $\bar{X}_2$:

$$\bar{X}_1 \sim N\left(\mu_1, \frac{\sigma_1^2}{n_1}\right) \quad \text{and} \quad \bar{X}_2 \sim N\left(\mu_2, \frac{\sigma_2^2}{n_2}\right)$$

We are now interested in the sampling distribution of the difference between two sample means, $\bar{X}_1 - \bar{X}_2$:

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

Example: A taxi company purchased two brands of tires, brand A and brand B. It is known that the mean distance travelled before the tires wear out is 36300 km for brand A, with a standard deviation of 200 km, while the mean distance travelled before the tires wear out is 36100 km for brand B, with a standard deviation of 300 km. A random sample of 36 tires of brand A and 49 tires of brand B is taken. What is the probability that the
a) Difference between the mean distance travelled before the tires of brand A and brand B wear out is at most 300 km?
b) Mean distance travelled by tires of brand A is larger than the mean distance travelled by tires of brand B before the tires wear out?

Solution:
$\bar{X}_1$: the mean distance travelled before the tires of brand A wear out
$\bar{X}_2$: the mean distance travelled before the tires of brand B wear out

Applying the distribution of the difference between two sample means to the given values:

$$\bar{X}_1 - \bar{X}_2 \sim N\left(36300 - 36100, \frac{200^2}{36} + \frac{300^2}{49}\right)$$

Both probabilities then follow by standardizing with this mean and variance.

Exercise: The mean final examination score for students taking SM2703 is 30 marks (out of 50 marks) with a standard deviation of 6 marks. Assume that the final scores are approximately normal. Two random samples were taken, consisting of 32 and 50 students respectively. What is the probability that:
a) The mean final examination scores will differ by more than 3 marks?
b) The mean final examination score from group 1 is larger than that from group 2?

Sampling Distribution of the Sample Proportion ($\hat{p}$)

The probability distribution of the sample proportion $\hat{p}$ is called its sampling distribution. The population and sample proportions are denoted by $p$ and $\hat{p}$ respectively, and calculated as:

$$p = \frac{X}{N} \quad \text{and} \quad \hat{p} = \frac{x}{n}$$

where
$N$ = total number of elements in the population;
$X$ = number of elements in the population that possess a specific characteristic;
$n$ = total number of elements in the sample; and
$x$ = number of elements in the sample that possess a specific characteristic.

For large values of $n$ ($n \ge 30$), the sampling distribution of $\hat{p}$ is very closely normally distributed:

$$\hat{p} \sim N\left(p, \frac{pq}{n}\right), \quad \text{where } q = 1 - p$$

The mean of the sample proportion is denoted by $\mu_{\hat{p}}$ and is equal to the population proportion $p$:

$$\mu_{\hat{p}} = p$$

The standard deviation of the sample proportion is denoted by $\sigma_{\hat{p}}$:

$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$$

Example: If the true proportion of voters who support Proposition A is $p = 0.40$, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45?
Solution:

$$\hat{p} \sim N\left(0.4, \frac{(0.4)(0.6)}{200}\right), \quad \text{that is,} \quad \hat{p} \sim N(0.4, 0.0012)$$

$$P(0.40 < \hat{p} < 0.45) = P\left(\frac{0.40 - 0.4}{\sqrt{0.0012}} < Z < \frac{0.45 - 0.4}{\sqrt{0.0012}}\right) = P(0 < Z < 1.44) = 0.4251$$

Sampling Distribution for the Difference between Two Proportions

Now say we have two binomial populations with proportions of successes $p_1$ and $p_2$:

$$\hat{p}_1 \sim N\left(p_1, \frac{p_1 q_1}{n_1}\right) \quad \text{and} \quad \hat{p}_2 \sim N\left(p_2, \frac{p_2 q_2}{n_2}\right)$$

The sampling distribution of the difference between two sample proportions, $\hat{P}_1 - \hat{P}_2$:

$$\hat{P}_1 - \hat{P}_2 \sim N\left(p_1 - p_2, \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}\right)$$

Example: A certain change in a process for the manufacture of component parts was considered. It was found that 75 out of 1500 items from the existing procedure were defective and 80 out of 2000 items from the new procedure were defective. If a random sample of 49 items is taken from the existing procedure and a random sample of 64 items is taken from the new procedure, what is the probability that
a) the proportion of defective items from the new procedure exceeds the proportion of defective items from the existing procedure?
b) the proportions differ by at most 0.015?
c) the proportion of defective items from the new procedure exceeds the proportion of defective items from the existing procedure by at least 0.02?

Solution:
$\hat{P}_N$: the proportion of defective items from the new procedure
$\hat{P}_E$: the proportion of defective items from the existing procedure

$$p_N = \frac{80}{2000} = 0.04, \qquad p_E = \frac{75}{1500} = 0.05$$

$$\hat{P}_N \sim N\left(0.04, \frac{0.04(0.96)}{64}\right), \qquad \hat{P}_E \sim N\left(0.05, \frac{0.05(0.95)}{49}\right)$$

$$\hat{P}_N - \hat{P}_E \sim N\left(0.04 - 0.05, \frac{0.05(0.95)}{49} + \frac{0.04(0.96)}{64}\right), \quad \text{that is,} \quad \hat{P}_N - \hat{P}_E \sim N(-0.01, 0.0016)$$

a) $P(\hat{P}_N > \hat{P}_E) = P(\hat{P}_N - \hat{P}_E > 0) = P\left(Z > \dfrac{0 - (-0.01)}{\sqrt{0.0016}}\right) = P(Z > 0.25) = 0.4013$

b) $P(|\hat{P}_N - \hat{P}_E| \le 0.015) = P(-0.015 \le \hat{P}_N - \hat{P}_E \le 0.015) = P\left(\dfrac{-0.015 - (-0.01)}{\sqrt{0.0016}} \le Z \le \dfrac{0.015 - (-0.01)}{\sqrt{0.0016}}\right) = P(-0.125 \le Z \le 0.625) = 0.2838$

c) $P(\hat{P}_N \ge \hat{P}_E + 0.02) = P(\hat{P}_N - \hat{P}_E \ge 0.02) = P\left(Z \ge \dfrac{0.02 - (-0.01)}{\sqrt{0.0016}}\right) = P(Z \ge 0.75) = 0.2266$

Exercise: Usually 3% of the diskettes produced by machine A are defective, while 2% of the diskettes produced by machine B are defective.
If a random sample of 50 diskettes produced by machine A and a random sample of 50 diskettes produced by machine B are chosen, what is the probability that
a) the difference between the sample proportion of defective diskettes produced by machine A and the sample proportion of defective diskettes produced by machine B does not exceed 0.1?
b) the difference between the sample proportion of defective diskettes produced by machine A and the sample proportion of defective diskettes produced by machine B is at least 0.15?
c) the sample proportion of defective diskettes produced by machine A exceeds the sample proportion of defective diskettes produced by machine B?
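The two-sample results in this transcript all reduce to the same steps: build the normal distribution of the difference, then read off a probability. The sketch below shows one way to do that with Python's standard-library `statistics.NormalDist` instead of a Z table. The helper names `diff_of_means` and `diff_of_proportions` are hypothetical and introduced only for illustration. The code uses unrounded variances, so its results differ slightly from the table-based answers, which round the variance to 0.0016 first.

```python
from math import sqrt
from statistics import NormalDist


def diff_of_means(mu1, sigma1, n1, mu2, sigma2, n2):
    """Sampling distribution of Xbar1 - Xbar2: N(mu1 - mu2, sigma1^2/n1 + sigma2^2/n2)."""
    return NormalDist(mu1 - mu2, sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2))


def diff_of_proportions(p1, n1, p2, n2):
    """Normal approximation for Phat1 - Phat2: N(p1 - p2, p1*q1/n1 + p2*q2/n2)."""
    return NormalDist(p1 - p2, sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2))


# Defective-items example: new procedure (p = 0.04, n = 64)
# minus existing procedure (p = 0.05, n = 49).
d = diff_of_proportions(0.04, 64, 0.05, 49)
prob_new_exceeds = 1 - d.cdf(0.0)                   # part a)
prob_within_0015 = d.cdf(0.015) - d.cdf(-0.015)     # part b)
prob_exceeds_by_002 = 1 - d.cdf(0.02)               # part c)

# Tire example, part b): brand A mean larger than brand B mean.
t = diff_of_means(36300, 200, 36, 36100, 300, 49)
prob_a_larger = 1 - t.cdf(0.0)

print("a)", prob_new_exceeds)
print("b)", prob_within_0015)
print("c)", prob_exceeds_by_002)
print("tires, A > B:", prob_a_larger)
```

For the defective-items example the three values come out at roughly 0.40, 0.29 and 0.22, close to the table-based 0.4013, 0.2838 and 0.2266. The tire probability is computed here because the transcript does not preserve a worked answer for it; treat it as an illustration of the formula, not as the original solution.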