SMU - BOBBY B. LYLE SCHOOL OF ENGINEERING
EMIS - SYSTEMS ENGINEERING PROGRAM
Department of Engineering Management, Information and Systems
EMIS 7370 / STAT 5340: Probability and Statistics for Scientists and Engineers

Estimation: Basic Concepts & Estimation of Proportions
Dr. Jerrell T. Stracener

Estimation
• Types of estimates and methods of estimation
• Estimation - Binomial distribution
  - Estimation of a proportion
  - Estimation of the difference between two proportions
• Estimation - Normal distribution
  - Estimation of the mean
  - Estimation of the standard deviation
  - Estimation of the difference between two means
  - Estimation of the ratio of the two standard deviations
  - Tolerance intervals
• Estimation - Lognormal distribution
• Estimation - Weibull distribution
• Estimation - Unknown distribution types
  - Continuous populations
  - Finite populations

Types of Estimates & Methods of Estimation

Definition - Statistic
A statistic is a function of only the values of a random sample, $X_1, X_2, \ldots, X_n$.
For example,
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
is a statistic.

Properties of Estimates
A statistic $\hat{\theta}$ is said to be an unbiased estimator of the parameter $\theta$ if
$$E(\hat{\theta}) = \theta$$
If we consider all possible unbiased estimators of some parameter $\theta$, the one with the smallest variance is called the most efficient estimator of $\theta$.

Types of Estimates
• Point Estimate
  A function of the values of a random sample that yields a single value, i.e., a point
• Interval Estimate
  An interval, whose end points are functions of the values of a random sample, for which one can assert with a specified confidence that the interval contains the parameter being estimated

Types of Estimates & Methods of Estimation
If we use a sample mean to estimate the mean of a population, a sample proportion to estimate the probability of success on an individual trial, or a sample variance to estimate the variance of a population, we are in each case using a point estimate of the parameter in question. These estimates are called point estimates since they are single numbers, single points, used, respectively, to estimate $\mu$, $p$, and $\sigma^2$.

Since we can hardly expect point estimates based on samples to hit the parameters they are supposed to estimate exactly 'on the nose', it is often desirable to give an interval rather than a single number. We can then assert with a certain probability (or degree of confidence) that such an interval contains the parameter it is intended to estimate.

For instance, when estimating the average IQ of all college students in the US, we might arrive at a point estimate of 117, or we might arrive at an interval estimate to the effect that the interval from 113 to 121 contains the 'true' average IQ of all college students in the US.
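The meaning of "a specified confidence" can be illustrated with a small simulation. The sketch below is not from the slides: it assumes a normal population with mean 117 and a known standard deviation of 15 (both values are illustrative assumptions), builds the interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ for many independent samples, and counts how often the interval contains the true mean. The function name `coverage` and all of its defaults are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu=117.0, sigma=15.0, n=36, reps=2000, alpha=0.05, seed=1):
    """Fraction of simulated intervals xbar +/- z*sigma/sqrt(n) that contain mu.

    mu, sigma, n, reps and seed are illustrative assumptions, not course data.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / n ** 0.5         # same for every sample (sigma known)
    hits = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print(f"observed coverage: {coverage():.3f}")
```

Each simulated sample gives a different interval, but in the long run about (1 - α)·100% of them should contain the parameter, which is the sense in which an interval estimate carries a stated confidence.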
[Figure: Interval Estimates of $\mu$ for Different Samples]

Method of Maximum Likelihood
Given independent observations $x_1, x_2, \ldots, x_n$ from a probability density function (continuous case) $f(x; \theta)$ or probability mass function (discrete case) $p(x; \theta)$, the maximum likelihood estimator $\hat{\theta}$ is that which maximizes the likelihood function
$$L(X_1, X_2, \ldots, X_n; \theta) = \begin{cases} f(X_1; \theta)\, f(X_2; \theta) \cdots f(X_n; \theta), & \text{if } X \text{ is continuous} \\ p(X_1; \theta)\, p(X_2; \theta) \cdots p(X_n; \theta), & \text{if } X \text{ is discrete} \end{cases}$$

Method of Maximum Likelihood
Let $x_1, x_2, \ldots, x_n$ denote observed values in a sample. In the case of a discrete random variable the interpretation is very clear. The quantity $L(x_1, x_2, \ldots, x_n; \theta)$, the likelihood of the sample, is the following joint probability:
$$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)$$
This is the probability of obtaining the sample values $x_1, x_2, \ldots, x_n$. For the discrete case the maximum likelihood estimator is the one that results in a maximum value for this joint probability, that is, the one that maximizes the likelihood of the sample.

Estimation of Proportions
Estimation of the proportion, p, based on a random sample from the Binomial Distribution $B(n, p)$
• Point Estimation
• Interval Estimation
  - Approximate Method
  - Sample Size
  - Exact Method
Estimation of the difference in two proportions $p_1 - p_2$ based on random samples from $B(n_1, p_1)$ and $B(n_2, p_2)$.
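As a worked instance of the method of maximum likelihood applied to a proportion, the following derivation assumes $n$ independent success/failure trials with $x_i \in \{0, 1\}$ and writes $f_s$ for the number of successes (notation assumed here). It is a standard sketch rather than a reproduction of a slide, and it applies when $0 < f_s < n$.

```latex
\begin{align*}
L(x_1, \ldots, x_n; p) &= \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i}
  = p^{f_s}(1-p)^{\,n-f_s}, \qquad f_s = \sum_{i=1}^{n} x_i, \\
\ln L &= f_s \ln p + (n - f_s)\ln(1-p), \\
\frac{d \ln L}{dp} &= \frac{f_s}{p} - \frac{n - f_s}{1-p} = 0
  \quad\Longrightarrow\quad \hat{p} = \frac{f_s}{n}.
\end{align*}
```

Under these assumptions the maximum likelihood estimate of $p$ is the sample proportion of successes.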
Estimation of Proportions
• Point Estimation
• Interval Estimation

Estimation - Binomial Distribution
Estimation of a Proportion, p
• $X_1, X_2, \ldots, X_n$ is a random sample of $n$ independent trials, each with success probability $p$, so that the number of successes is $B(n, p)$, where
$$X_i = \begin{cases} 1 & \text{if success} \\ 0 & \text{if failure} \end{cases} \qquad \text{for } i = 1, \ldots, n$$
• Point estimate of p:
$$\hat{p} = \bar{X} = \frac{f_s}{n}$$
  where $f_s$ = number of successes

Estimation - Binomial Distribution
• Approximate (1 - α)·100% confidence interval for p: $(p'_L, p'_U)$, where
$$p'_L = \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \qquad \text{and} \qquad p'_U = \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
  with $\hat{q} = 1 - \hat{p}$, and $z_{\alpha/2}$ is the value of the standard normal random variable $Z$ such that $P(Z > z_{\alpha/2}) = \alpha/2$

Note
When n is small and the unknown proportion p is believed to be close to 0 or to 1, the approximate confidence interval procedure established here is unreliable and, therefore, should not be used. To be on the safe side, one should require both $n\hat{p} \ge 5$ and $n\hat{q} \ge 5$.

Error in Estimating p by $\hat{p}$
$$\text{error} = |p - \hat{p}|$$
With (1 - α)·100% confidence, p lies between $\hat{p} - z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ and $\hat{p} + z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$.

Error in Estimating p by $\hat{p}$
• If $\hat{p}$ is used as an estimate of p, we can be (1 - α)·100% confident that the error will not exceed
$$z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
• If $\hat{p}$ is used as an estimate of p, we can be (1 - α)·100% confident that the error will be less than a specified amount e when the sample size is
$$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{e^2}$$
• If no preliminary estimate $\hat{p}$ is available to use in this formula, we can be at least (1 - α)·100% confident that the error will not exceed a specified amount e when the sample size is
$$n = \frac{z_{\alpha/2}^2}{4e^2}$$
  This follows because $\hat{p}\hat{q} \le 1/4$ for every value of $\hat{p}$.

Example
In a random sample of n = 500 families owning television sets in the city of Hamilton, Canada, it is found that x = 340 subscribed to HBO.
a. Treating these 500 families as a preliminary sample, how large a sample is required if we want to be 95% confident that our estimate of p, $\hat{p}$, is within 0.02 of p?
b. How large a sample is required if we want to be at least 95% confident that our estimate of p is within 0.02 of p when no preliminary sample is available?

Example - Solution
a. Let us treat the 500 families as a preliminary sample providing an estimate $\hat{p} = 340/500 = 0.68$.
$$n = \frac{(1.96)^2 (0.68)(0.32)}{(0.02)^2} \approx 2090$$
Therefore, if we base our estimate of p on a random sample of size 2090, we can be 95% confident that our sample proportion will not differ from the true proportion by more than 0.02.

Example - Solution
b. We shall now assume that no preliminary sample has been taken to provide an estimate of p. Consequently, we can be at least 95% confident that our sample proportion will not differ from the true proportion by more than 0.02 if we choose a sample of size
$$n = \frac{(1.96)^2}{4(0.02)^2} = 2401$$

Example: Estimation of Binomial Parameter p
In a random sample of n = 500 families owning television sets in the city of Hamilton, Canada, it was found that $f_s = 340$ owned color sets. Estimate the population proportion of families with color TV sets and determine a 95% confidence interval for the actual proportion of families in this city with color sets.

Example: Solution
The point estimate of p is $\hat{p} = 340/500 = 0.68$. Then an approximate 95% confidence interval for p is $(p'_L, p'_U)$, where
$$p'_L = \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \qquad \text{and} \qquad p'_U = \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
and
$$z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.96\sqrt{\frac{(0.68)(0.32)}{500}} = 0.04089$$
so that
$$p'_L = 0.68 - 0.04089 = 0.63911 \qquad \text{and} \qquad p'_U = 0.68 + 0.04089 = 0.72089$$
An approximate 95% confidence interval for p is (0.63911, 0.72089). Therefore, our "best" (point) estimate of p is 0.68, and we are about 95% confident that p is between 0.64 and 0.72.

Estimation - Binomial Population
• Exact (1 - α)·100% confidence interval for p: $(p_L, p_U)$, where
$$p_L = \frac{f_s F_1}{n - f_s + 1 + f_s F_1}, \qquad F_1 = F_{1-\alpha/2;\; 2f_s,\; 2(n - f_s + 1)}$$
  and
$$p_U = \frac{(f_s + 1) F_2}{n - f_s + (f_s + 1) F_2}, \qquad F_2 = F_{\alpha/2;\; 2(f_s + 1),\; 2(n - f_s)}$$
  and $F_{\gamma, df_1, df_2}$ is the value of x for which $P(X > F_{\gamma, df_1, df_2}) = \gamma$
NOTE: Use the 'FINV' function in Excel to get the values of $F_1$ and $F_2$.

Example
A random sample of 25 vehicle records is selected for audit from a large number of county records. It is found that 5 have errors. Estimate the population proportion of vehicle records having errors in terms of a point estimate and a 95% confidence interval.
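The interval and sample-size formulas for a single proportion can be sketched in Python using only the standard library. This is an illustrative sketch, not course code: the function names (`z_upper`, `approx_ci`, `sample_size`, `binom_cdf`, `exact_ci`) are hypothetical. In place of looking up F values, `exact_ci` solves the two binomial tail equations $P(X \ge f_s \mid p_L) = \alpha/2$ and $P(X \le f_s \mid p_U) = \alpha/2$ by bisection; the F-based expressions are the closed form of the same limits (the Clopper-Pearson interval).

```python
import math
from statistics import NormalDist


def z_upper(alpha):
    """Upper alpha/2 point of the standard normal, z_{alpha/2}."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def approx_ci(successes, n, alpha=0.05):
    """Approximate interval p_hat -/+ z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    half_width = z_upper(alpha) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def sample_size(e, alpha=0.05, p_hat=None):
    """Smallest n for error at most e; uses p*q = 1/4 when no p_hat is given."""
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return math.ceil(z_upper(alpha) ** 2 * pq / e ** 2)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(successes, n, alpha=0.05, iters=60):
    """Exact interval from the binomial tail equations, solved by bisection."""

    def bisect(is_too_small):
        lo, hi = 0.0, 1.0
        for _ in range(iters):
            mid = (lo + hi) / 2
            if is_too_small(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= successes | p) = alpha/2; this tail grows with p.
    lower = 0.0 if successes == 0 else bisect(
        lambda p: 1 - binom_cdf(successes - 1, n, p) < alpha / 2)
    # Upper limit: P(X <= successes | p) = alpha/2; this tail shrinks with p.
    upper = 1.0 if successes == n else bisect(
        lambda p: binom_cdf(successes, n, p) > alpha / 2)
    return lower, upper


if __name__ == "__main__":
    print(approx_ci(340, 500))
    print(sample_size(0.02, p_hat=0.68), sample_size(0.02))
```

With the television data, `approx_ci(340, 500)` gives roughly (0.639, 0.721), and `sample_size(0.02, p_hat=0.68)` and `sample_size(0.02)` return 2090 and 2401, in line with the worked solutions. `exact_ci(successes, n)` can be used the same way whenever the note about small n or extreme p rules out the approximate method.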
Example - Solution
$$\hat{p} = \bar{x} = \frac{f_s}{n} = \frac{5}{25} = 0.20$$
An approximate 95% confidence interval for p is $(p'_L, p'_U)$, where α = 0.05 and $z_{\alpha/2} = z_{0.025} = 1.96$. Then
$$p'_L = \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.20 - 1.96\sqrt{\frac{(0.2)(0.8)}{25}} = 0.20 - 1.96(0.08) = 0.20 - 0.157 = 0.043$$
$$p'_U = \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.20 + 0.157 = 0.357$$

Example - Solution
An exact 95% confidence interval for p is $(p_L, p_U)$, where
$$p_L = \frac{f_s F_{1-\alpha/2;\, 2f_s,\, 2(n - f_s + 1)}}{n - f_s + 1 + f_s F_{1-\alpha/2;\, 2f_s,\, 2(n - f_s + 1)}} = \frac{5 F_{0.975,10,42}}{21 + 5 F_{0.975,10,42}} = \frac{5(0.308)}{21 + 5(0.308)} = 0.068$$
and
$$p_U = \frac{(f_s + 1) F_{\alpha/2;\, 2(f_s + 1),\, 2(n - f_s)}}{n - f_s + (f_s + 1) F_{\alpha/2;\, 2(f_s + 1),\, 2(n - f_s)}} = \frac{6 F_{0.025,12,40}}{20 + 6 F_{0.025,12,40}} = \frac{6(2.29)}{20 + 6(2.29)} = 0.407$$

Estimation - Binomial Populations
Estimation of the difference between two proportions
• Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ be random samples from $B(n_1, p_1)$ and $B(n_2, p_2)$, respectively
• Point estimation of $\Delta p = p_1 - p_2$:
$$\Delta\hat{p} = \hat{p}_1 - \hat{p}_2 = \bar{X}_1 - \bar{X}_2 = \frac{f_1}{n_1} - \frac{f_2}{n_2}$$

Estimation - Binomial Populations
• Approximate (1 - α)·100% confidence interval for $\Delta p = p_1 - p_2$: $(p'_L, p'_U)$, where
$$p'_L = \Delta\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} \qquad \text{and} \qquad p'_U = \Delta\hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

Example: Estimation of $p_1 - p_2$
A certain change in a manufacturing process for component parts is being considered. Samples are taken using both the existing and the new procedure in order to determine if the new procedure results in an improvement. If 75 of 1500 items from the existing procedure were found to be defective and 80 of 2000 items from the new procedure were found to be defective, find a confidence interval for the true difference in the fraction of defectives between the existing and the new process.

Example: Solution
Let $p_1$ and $p_2$ be the true proportions of defectives for the existing and new procedures, respectively.
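Hence
$$\hat{p}_1 = \frac{75}{1500} = 0.05 \qquad \text{and} \qquad \hat{p}_2 = \frac{80}{2000} = 0.04$$
and the point estimate of $\Delta p = p_1 - p_2$ is
$$\Delta\hat{p} = \hat{p}_1 - \hat{p}_2 = 0.05 - 0.04 = 0.01$$

Example: Solution
An approximate 90% confidence interval for $\Delta p = p_1 - p_2$ is $(p'_L, p'_U)$, where
$$p'_L = \Delta\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} \qquad \text{and} \qquad p'_U = \Delta\hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$
and
$$z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = 1.645\sqrt{\frac{(0.05)(0.95)}{1500} + \frac{(0.04)(0.96)}{2000}} = 0.011732$$

Example: Solution
Then
$$p'_L = 0.01 - 0.011732 = -0.001732$$
$$p'_U = 0.01 + 0.011732 = 0.021732$$
Therefore an approximate 90% confidence interval for $\Delta p = p_1 - p_2$ is (-0.0017, 0.0217).

Let $L_N$ be the length of the part of the interval above zero (where the new procedure has the lower defective rate) and $L_E$ the length of the part below zero. Then
$$r_{N|E} = \frac{L_N}{L} = \frac{L_N}{L_E + L_N} = \frac{0.0217}{0.0234} = 0.927$$
or about 93% of the length of the confidence interval favors the new procedure.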
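The arithmetic in this example can be checked with a short Python sketch that uses only the standard library. The function name `diff_ci` is hypothetical, and the last step simply reproduces the length ratio $L_N / (L_E + L_N)$ defined in the example; it assumes the interval straddles zero, as it does here.

```python
import math
from statistics import NormalDist


def diff_ci(f1, n1, f2, n2, alpha=0.10):
    """Approximate (1 - alpha) interval for p1 - p2 from two binomial samples."""
    p1, p2 = f1 / n1, f2 / n2
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    half_width = z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - half_width, diff + half_width


# Existing procedure: 75 defective of 1500; new procedure: 80 defective of 2000.
lower, upper = diff_ci(75, 1500, 80, 2000, alpha=0.10)

# Share of the interval's length lying above zero (valid when lower < 0 < upper).
share_favoring_new = upper / (upper - lower)

if __name__ == "__main__":
    print(f"90% interval: ({lower:.4f}, {upper:.4f})")
    print(f"share above zero: {share_favoring_new:.3f}")
```

Because the interval contains zero, the data do not rule out equal defective rates at the 90% level; the length ratio is only a descriptive summary of how the interval is positioned around zero.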