Quantitative Methods
Varsha Varde

Estimation

Contents
1. Introduction
2. Point Estimators and Their Properties
3. Single Quantitative Population
4. Single Binomial Population
5. Two Quantitative Populations
6. Two Binomial Populations
7. Choosing the Sample Size

• Statistical inference has two main branches:
• Estimation
• Hypothesis testing

Estimation
• The objective of statistical estimation is to estimate the value of an unknown population parameter on the basis of a sample drawn from that population.
• Estimation is of two types:
• (i) point estimation, which gives a single-valued estimate of the population parameter, and
• (ii) interval estimation, which gives a range of values, defined by two numbers, within which the parameter is expected to lie.

Point Estimate
• Sample statistics are used as estimates of population parameters.
• The sample mean ($\bar{x}$) is used as an estimate of the population mean ($\mu$).
• The sample standard deviation ($s$) is used as an estimate of the population standard deviation ($\sigma$), where $s$ is defined on the next slide.
• The proportion of items in a sample with a given characteristic ($\bar{p}$) is used as an estimate of the proportion of such items in the population ($p$).
• Such estimates are called point estimates because they provide a single-valued estimate of the parameter.

Sample Standard Deviation
$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Interval Estimate
• Interval estimate: an estimate that gives a range of values in which a population parameter is expected to lie.
• The population parameter lies within this interval not with certainty but with a specified probability.
• This probability is known as the level of confidence. Commonly used levels of confidence are 0.90, 0.95 and 0.99; the corresponding levels of significance are 0.10, 0.05 and 0.01.

Interval Estimate
• Confidence level: the probability with which the population parameter lies within the stated interval of values. The stated interval of values is known as the confidence interval.
• Confidence interval: an interval of values, centered on the sample statistic, in which the population parameter is expected to lie with a known level of confidence.
• When we say that a 95% confidence interval for the population parameter is (2.5, 3.5), we mean that the true value of the parameter lies between 2.5 and 3.5 with probability 0.95. In other words, we can expect an assertion of this kind to be right 95% of the time and wrong 5% of the time.

Example: Type of Estimate, Point or Interval
• The average age of a bank clerk is 27.
• We are 95% confident that the average age of a bank clerk is in the range of 22 to 30.
• The proportion of customers who experience a helpful attitude from bank employees is 60%.
• We are 90% confident that the proportion of customers who experience a helpful attitude from bank employees is in the range of 50% to 70%.

Desired Properties of Point Estimators
• (i) Unbiased: the mean of the sampling distribution is equal to the parameter.
• (ii) Efficient: minimum variance, that is, a small standard error of the point estimator.
• (iii) Consistent: the error of estimation (the distance between the parameter and its point estimate) decreases as the sample size increases.
• (iv) Sufficient: makes maximum use of the information in the sample.

Desired Properties of Interval Estimators
• The confidence level, $(1-\alpha)100\%$, should be as high as possible. $\alpha \cdot 100\%$ is the level of significance and $(1-\alpha)100\%$ is the level of confidence; $\alpha$ is commonly 0.10, 0.05 or 0.01.
• The margin of error or precision (the bound on the error of estimation) should be as small as possible.

Parameters of Interest
• Single population: $\mu$ (population mean)
• Single population: $p$ (population proportion)
• Two populations: $\mu_1, \mu_2$ (means of the two populations)
• Two populations: $p_1, p_2$ (proportions in the two populations)

Single Quantitative Population
• Parameter of interest: $\mu$
• Sample data: $n$, $\bar{x}$, $s$
• Other information: $(1-\alpha)100\%$ level of confidence
• Point estimator of $\mu$: $\bar{x}$
• Mean of $\bar{x}$: $E(\bar{x}) = \mu_{\bar{x}} = \mu$
• Standard error of $\bar{x}$: $SE(\bar{x}) = \sigma/\sqrt{n}$ (also denoted $\sigma_{\bar{x}}$)

Single Quantitative Population
• Confidence interval (C.I.) for $\mu$: $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ (point estimate $\pm$ bound)
• Confidence level: $(1-\alpha)100\%$, which is the probability that the interval estimator contains the parameter.
• $z_{\alpha/2} = 1.96$ for the 95% level of confidence
• $z_{\alpha/2} = 1.645$ for the 90% level of confidence
• $z_{\alpha/2} = 2.58$ for the 99% level of confidence
• Margin of error (or bound on the error of estimation): $B = z_{\alpha/2}\,\sigma/\sqrt{n}$
• Width of the confidence interval: $W = 2 z_{\alpha/2}\,\sigma/\sqrt{n} = 2B$
• Assumptions:
• 1. Large sample ($n \ge 30$)

Examples
• Example 1. We are interested in estimating the mean number of unoccupied seats per flight, $\mu$, for a major airline. A random sample of $n = 225$ flights shows that the sample mean is 11.6 and the standard deviation is 4.1.
• Data summary: $n = 225$; $\bar{x} = 11.6$; $s = 4.1$.
• Question 1. What is the point estimate of $\mu$? (Do not give the margin of error.)
• $\bar{x} = 11.6$

Example
• Question 2. Give a 95% bound on the error of estimation (also known as the margin of error).
• $B = z_{\alpha/2}\,\sigma/\sqrt{n}$, with $\sigma$ estimated by $s$
• $= 1.96 \times 4.1/\sqrt{225} = 0.5357$
• Question 3. Find a 90% confidence interval for $\mu$.
• $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
• $11.6 \pm 1.645 \times 4.1/\sqrt{225}$
• $11.6 \pm 0.45 = (11.15, 12.05)$

Example
• Question 4. Interpret the CI found in Question 3.
• The interval (11.15, 12.05) contains the true value of the population parameter $\mu$ with probability 0.90.
• Question 5. What is the width of the CI found in Question 3?
• The width of the CI is
• $W = 2 z_{\alpha/2}\,\sigma/\sqrt{n}$
• $W = 2(0.45) = 0.90$
• or
• $W = 12.05 - 11.15 = 0.90$

Example
• Question 6. If the sample size $n$ is increased, what happens to the width of the CI? What happens to the margin of error?
• The width of the CI decreases.
• The margin of error decreases.
• Sample size:
• $n \approx (z_{\alpha/2})^2 \sigma^2 / B^2$
• where $\sigma$ is estimated by $s$.
• Note: In the absence of data, $\sigma$ is sometimes approximated by $R/4$, where $R$ is the range.

Example
• Example 2. Suppose you want to construct a 99% CI for $\mu$ so that $W = 0.05$. You are told that preliminary data show a range from 13.3 to 13.7. What sample size should you choose?
• Data summary: $\alpha = 0.01$; $R = 13.7 - 13.3 = 0.4$, so $\sigma \approx 0.4/4 = 0.1$.
• $B = W/2 = 0.05/2 = 0.025$.
• Therefore $n = (z_{\alpha/2})^2 \sigma^2 / B^2 = 2.58^2 (0.1)^2 / 0.025^2 = 106.50$.
• So $n = 107$ (round up).
• Exercise 1. Find the sample size necessary to reduce $W$ in the flight example to 0.6. Use $\alpha = 0.05$.

Single Binomial Population
• Parameter of interest: $p$
• Sample data: $n$, $x$, $\bar{p} = x/n$ ($x$ here is the number of occurrences of a particular event in $n$ trials).
• Other information: $\alpha$, the level of significance
• Point estimator: $\bar{p}$
• Mean of $\bar{p}$: $\mu_{\bar{p}} = p$
• Standard error of $\bar{p}$: $\sigma_{\bar{p}} = \sqrt{pq/n}$, where $q = 1 - p$
• Confidence interval (C.I.) for $p$: $\bar{p} \pm z_{\alpha/2}\sqrt{\bar{p}\bar{q}/n}$
• Confidence level: $(1-\alpha)100\%$, which is the probability that the interval estimator contains the parameter.
• Margin of error: $B = z_{\alpha/2}\sqrt{\bar{p}\bar{q}/n}$
• Assumptions:
• 1. Large sample ($np \ge 5$; $nq \ge 5$)
• 2. Sample is randomly selected

Example
• Example 3. A random sample of $n = 484$ voters in a community produced $x = 257$ voters in favor of candidate A.
• Data summary: $n = 484$; $x = 257$;
• $\bar{p} = x/n = 257/484 = 0.531$.
• Question 1. Do we have a large sample size?
• $n\bar{p} = 484(0.531) = 257$, which is $\ge 5$.
• $n\bar{q} = 484(0.469) = 227$, which is $\ge 5$.
• Therefore we have a large sample size.

Example
• Question 2.
What is the point estimate of $p$ and its margin of error at the 95% level of confidence?
• $\bar{p} = x/n = 257/484 = 0.531$
• $B = z_{\alpha/2}\sqrt{\bar{p}\bar{q}/n}$
• $= 1.96\sqrt{(0.531)(0.469)/484}$
• $= 0.044$
• Question 3. Find a 90% confidence interval for $p$.
• $\bar{p} \pm z_{\alpha/2}\sqrt{\bar{p}\bar{q}/n}$
• $= 0.531 \pm 1.645\sqrt{(0.531)(0.469)/484}$
• $= 0.531 \pm 0.037 = (0.494, 0.568)$

Example
• Question 4. What is the width of the CI found in Question 3?
• The width of the CI is
• $W = 2 z_{\alpha/2}\sqrt{\bar{p}\bar{q}/n} = 2(0.037) = 0.074$
• Question 5. Interpret the CI found in Question 3.
• The interval contains $p$ with probability 0.90. Or: if repeated sampling is used, then 90% of the CIs constructed would contain $p$.

Example
• Question 6. If the sample size $n$ is increased,
• what happens to the width of the CI?
• what happens to the margin of error?
• The width of the CI decreases.
• The margin of error decreases.
• Sample size:
• $n \approx (z_{\alpha/2})^2 (\bar{p}\bar{q}) / B^2$
• Note: In the absence of data, choose $\bar{p} = \bar{q} = 0.5$, or simply $\bar{p}\bar{q} = 0.25$.

Example
• Example 4. Suppose you want to provide an accurate estimate of the proportion of customers preferring one brand of coffee over another. You need to construct a 95% CI for $p$ so that $B = 0.015$.
• You are told that preliminary data show $\bar{p} = 0.35$. What sample size should you choose? Use $\alpha = 0.05$.
• Data summary: $\alpha = 0.05$; $\bar{p} = 0.35$; $B = 0.015$
• $n = (z_{\alpha/2})^2 (\bar{p}\bar{q}) / B^2 = (1.96)^2 (0.35)(0.65) / (0.015)^2 = 3884.28$
• So $n = 3885$ (round up).

Example
• Exercise 2. Suppose that no preliminary estimate of $\bar{p}$ is available. Find the new sample size. Use $\alpha = 0.05$.
• Exercise 3. Suppose that no preliminary estimate of $\bar{p}$ is available. Find the sample size necessary so that $\alpha = 0.01$.
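
The interval formulas for a single mean and a single proportion can be sketched in Python as follows. This is only an illustrative sketch: the function names `mean_ci` and `proportion_ci` and the lookup table `Z` are hypothetical helpers, the critical values are the three quoted in the slides, and the sample standard deviation $s$ is assumed to stand in for $\sigma$ (large-sample case).

```python
import math

# Critical values z_{alpha/2} as quoted in the slides (assumed lookup table).
Z = {0.90: 1.645, 0.95: 1.96, 0.99: 2.58}


def mean_ci(xbar, s, n, level=0.95):
    """Large-sample CI for a mean: xbar +/- z * s / sqrt(n).

    Returns (lower, upper, bound), where bound is the margin of error B.
    Assumes n >= 30 and that s is an acceptable estimate of sigma.
    """
    bound = Z[level] * s / math.sqrt(n)
    return xbar - bound, xbar + bound, bound


def proportion_ci(x, n, level=0.95):
    """Large-sample CI for a proportion: pbar +/- z * sqrt(pbar * qbar / n).

    Returns (lower, upper, bound). Assumes n*pbar >= 5 and n*qbar >= 5.
    """
    pbar = x / n
    qbar = 1 - pbar
    bound = Z[level] * math.sqrt(pbar * qbar / n)
    return pbar - bound, pbar + bound, bound


if __name__ == "__main__":
    # Flight data from Example 1 and voter data from Example 3.
    print(mean_ci(11.6, 4.1, 225, level=0.90))
    print(proportion_ci(257, 484, level=0.90))
```

With the flight data, `mean_ci(11.6, 4.1, 225, 0.90)` gives approximately (11.15, 12.05), and with the voter data, `proportion_ci(257, 484, 0.90)` gives approximately (0.494, 0.568), in line with Examples 1 and 3.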

Two Quantitative Populations
• Parameter of interest: $\mu_1 - \mu_2$
• Sample data: Sample 1: $n_1$, $\bar{x}_1$, $s_1$; Sample 2: $n_2$, $\bar{x}_2$, $s_2$
• Point estimator: $\bar{X}_1 - \bar{X}_2$
• Estimator mean: $\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2$
• Standard error: $SE(\bar{X}_1 - \bar{X}_2) = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
• Confidence interval: $(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
• Assumptions:
• 1. Large samples ($n_1 \ge 30$; $n_2 \ge 30$)
• 2. Samples are randomly selected
• 3. Samples are independent
• Sample size (equal sample sizes, $n_1 = n_2 = n$): $n \approx (z_{\alpha/2})^2 (\sigma_1^2 + \sigma_2^2) / B^2$

Two Binomial Populations
• Parameter of interest: $p_1 - p_2$
• Sample 1: $n_1$, $x_1$, $\bar{p}_1 = x_1/n_1$
• Sample 2: $n_2$, $x_2$, $\bar{p}_2 = x_2/n_2$
• $p_1 - p_2$ (unknown parameter)
• $\alpha$ (significance level)
• Point estimator: $\bar{p}_1 - \bar{p}_2$
• Estimator mean: $\mu_{\bar{p}_1 - \bar{p}_2} = p_1 - p_2$
• Estimated standard error: $\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\bar{p}_1\bar{q}_1/n_1 + \bar{p}_2\bar{q}_2/n_2}$
• Confidence interval: $(\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2}\sqrt{\bar{p}_1\bar{q}_1/n_1 + \bar{p}_2\bar{q}_2/n_2}$
• Assumptions:
• 1. Large samples ($n_1 p_1 \ge 5$, $n_1 q_1 \ge 5$, $n_2 p_2 \ge 5$, $n_2 q_2 \ge 5$)
• 2. Samples are randomly and independently selected
• Sample size (equal sample sizes, $n_1 = n_2 = n$): $n \approx (z_{\alpha/2})^2 (\bar{p}_1\bar{q}_1 + \bar{p}_2\bar{q}_2) / B^2$
• When no estimates of the proportions are available: $n \approx (z_{\alpha/2})^2 (0.5) / B^2$

Sample Size
• Sample size for estimating a population mean:
• $n \approx (z_{\alpha/2})^2 \sigma^2 / B^2$
• where $\sigma$ is estimated by $s$ (the sample standard deviation).
• Note: In the absence of data, $\sigma$ is sometimes approximated by $R/4$, where $R$ is the range.
• $z_{\alpha/2} = 1.96$ for the 95% level of confidence
• $z_{\alpha/2} = 1.645$ for the 90% level of confidence
• $z_{\alpha/2} = 2.58$ for the 99% level of confidence
• $B$ = precision, bound, or margin of permissible error

Sample Size
• Sample size for estimating a population proportion:
• $n \approx (z_{\alpha/2})^2 (\bar{p}\bar{q}) / B^2$
• Note: In the absence of data, choose $\bar{p} = \bar{q} = 0.5$, or simply $\bar{p}\bar{q} = 0.25$.
• $\bar{p}$ is the sample proportion
• $\bar{q} = 1 - \bar{p}$
• $z_{\alpha/2} = 1.96$ for the 95% level of confidence
• $z_{\alpha/2} = 1.645$ for the 90% level of confidence
• $z_{\alpha/2} = 2.58$ for the 99% level of confidence

• We are doing a customer satisfaction study for a washing machine.
• We are measuring satisfaction on a scale of 1 to 10.
• Determine the sample size required for a 95% level of confidence and a level of precision (margin of error in estimation) of 0.3.
• We are estimating average customer satisfaction.
• So $n \approx (z_{\alpha/2})^2 \sigma^2 / B^2$
• $\sigma$ is not known. The range is $10 - 1 = 9$, so we estimate $\sigma$ as $R/4 = 9/4 = 2.25$.
• $z_{\alpha/2} = 1.96$; $B = 0.3$
• $n = (1.96 \times 2.25 / 0.3)^2 = 216.09$
• So the sample size required is 217 (round up).

• We are doing a study to estimate the proportion of the population who use the toothpaste brand Colgate.
• Determine the sample size required for a 95% level of confidence and a level of precision (margin of error in estimation) of 0.02.
• We are estimating a proportion of customers.
• So $n \approx (z_{\alpha/2})^2 (\bar{p}\bar{q}) / B^2$
• $\bar{p}$ is not known, so we use 0.5 as the estimate of $\bar{p}$. So $\bar{p}\bar{q} = 0.25$.
• $z_{\alpha/2} = 1.96$; $B = 0.02$
• $n = 0.25\,(1.96 / 0.02)^2 = 2401$
• So the sample size required is 2401.
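
The two sample-size formulas summarised above can be sketched in Python as shown here. This is a minimal sketch under stated assumptions: the function names `n_for_mean` and `n_for_proportion` and the helper `_round_up` are hypothetical, the default $z_{\alpha/2} = 1.96$ corresponds to the 95% level, and results are rounded up to the next whole unit as in Examples 2 and 4.

```python
import math


def _round_up(value):
    """Round up to the next integer.

    The inner round() is a guard (an implementation choice, not from the
    slides) against floating-point noise when the exact answer is a whole
    number.
    """
    return math.ceil(round(value, 9))


def n_for_mean(sigma, bound, z=1.96):
    """Sample size for estimating a mean: n = (z * sigma / B)^2, rounded up.

    sigma may be an estimate such as s, or R/4 when only the range is known.
    """
    return _round_up((z * sigma / bound) ** 2)


def n_for_proportion(bound, pbar=0.5, z=1.96):
    """Sample size for estimating a proportion: n = z^2 * pbar * qbar / B^2.

    pbar defaults to 0.5, the choice used when no preliminary estimate exists.
    """
    return _round_up(z ** 2 * pbar * (1 - pbar) / bound ** 2)


if __name__ == "__main__":
    # Example 2: 99% CI, range 13.3 to 13.7 (sigma ~ 0.1), B = 0.025.
    print(n_for_mean(sigma=0.1, bound=0.025, z=2.58))
    # Example 4: 95% CI, preliminary pbar = 0.35, B = 0.015.
    print(n_for_proportion(bound=0.015, pbar=0.35))
```

Here `n_for_mean(0.1, 0.025, z=2.58)` returns 107 and `n_for_proportion(0.015, pbar=0.35)` returns 3885, in line with Examples 2 and 4. Leaving `pbar` at its default of 0.5 gives the most conservative (largest) sample size for a given bound.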