Chapter 8: Estimation

Estimator and Estimate

An estimator of a population parameter is a random variable that depends on the sample information and whose value provides approximations to this unknown parameter. A specific value of that random variable is called an estimate.

Point Estimator and Point Estimate

Let $\theta$ represent a population parameter (such as the population mean $\mu$ or the population proportion $\pi$). A point estimator, $\hat{\theta}$, of a population parameter, $\theta$, is a function of the sample information that yields a single number called a point estimate. For example, the sample mean, $\bar{X}$, is a point estimator of the population mean $\mu$, and the value that $\bar{X}$ assumes for a given set of data is called the point estimate.

Unbiasedness

The point estimator $\hat{\theta}$ is said to be an unbiased estimator of the parameter $\theta$ if the expected value, or mean, of the sampling distribution of $\hat{\theta}$ is $\theta$; that is,

\[ E(\hat{\theta}) = \theta \]

Figure 8.1: Probability Density Functions for Unbiased and Biased Estimators (curves for $\hat{\theta}_1$ and $\hat{\theta}_2$, with $\theta$ marked on the $\hat{\theta}$ axis).

Bias

Let $\hat{\theta}$ be an estimator of $\theta$. The bias in $\hat{\theta}$ is defined as the difference between its mean and $\theta$; that is,

\[ \mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta \]

It follows that the bias of an unbiased estimator is 0.

Most Efficient Estimator and Relative Efficiency

Suppose there are several unbiased estimators of $\theta$.
Then the unbiased estimator with the smallest variance is said to be the most efficient estimator, or the minimum variance unbiased estimator, of $\theta$. Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of $\theta$, based on the same number of sample observations. Then,

a) $\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$ if $\mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2)$.

b) The relative efficiency of $\hat{\theta}_1$ with respect to $\hat{\theta}_2$ is the ratio of their variances; that is,

\[ \text{Relative Efficiency} = \frac{\mathrm{Var}(\hat{\theta}_2)}{\mathrm{Var}(\hat{\theta}_1)} \]

Point Estimators of Selected Population Parameters (Table 8.1)

Population Parameter | Point Estimator | Properties
Mean, $\mu$ | $\bar{X}$ | Unbiased, most efficient (assuming normality)
Mean, $\mu$ | $X_m$ | Unbiased (assuming normality), but not most efficient
Proportion, $\pi$ | $p$ | Unbiased, most efficient
Variance, $\sigma^2$ | $s^2$ | Unbiased, most efficient (assuming normality)

Confidence Interval Estimator

A confidence interval estimator for a population parameter $\theta$ is a rule for determining (based on sample information) a range, or interval, that is likely to include the parameter. The corresponding estimate is called a confidence interval estimate.

Confidence Interval and Confidence Level

Let $\theta$ be an unknown parameter. Suppose that on the basis of sample information, random variables $A$ and $B$ are found such that $P(A < \theta < B) = 1 - \alpha$, where $\alpha$ is any number between 0 and 1. If specific sample values of $A$ and $B$ are $a$ and $b$, then the interval from $a$ to $b$ is called a $100(1 - \alpha)\%$ confidence interval of $\theta$. The quantity $(1 - \alpha)$ is called the confidence level of the interval. If the population were repeatedly sampled a very large number of times, the true value of the parameter $\theta$ would be contained in $100(1 - \alpha)\%$ of intervals calculated this way.
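The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 50, standard deviation 10), the sample size, the seed, and the function name `coverage_rate` are all assumptions, and the interval used is the standard 95% interval for a normal mean with known standard deviation, $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$.

```python
import math
import random


def coverage_rate(mu=50.0, sigma=10.0, n=25, z=1.96, trials=10_000, seed=1):
    """Fraction of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)                # seeded so the run is repeatable
    half_width = z * sigma / math.sqrt(n)    # same half width for every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Count the interval as a hit when it contains the true mean.
        if x_bar - half_width < mu < x_bar + half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Observed coverage: {coverage_rate():.3f}")
```

With many trials the observed fraction should settle near 0.95; it will not equal 0.95 exactly because each run is itself a random experiment.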
The confidence interval calculated in this manner is written as $a < \theta < b$ with $100(1 - \alpha)\%$ confidence.

Figure 8.3: $P(-1.96 < Z < 1.96) = 0.95$, where $Z$ is a standard normal variable. The central area between $-1.96$ and $1.96$ is 0.95, and each tail has area 0.025.

Notation

Let $Z_{\alpha/2}$ be the number for which

\[ P(Z > Z_{\alpha/2}) = \frac{\alpha}{2} \]

where the random variable $Z$ follows a standard normal distribution.

Selected Values $Z_{\alpha/2}$ from the Standard Normal Distribution Table (Table 8.2)

$\alpha$ | $Z_{\alpha/2}$ | Confidence Level
0.01 | 2.58 | 99%
0.02 | 2.33 | 98%
0.05 | 1.96 | 95%
0.10 | 1.645 | 90%

Confidence Intervals for the Mean of a Population that is Normally Distributed: Population Variance Known

Consider a random sample of $n$ observations from a normal distribution with mean $\mu$ and variance $\sigma^2$.
If the sample mean is $\bar{X}$, then a $100(1 - \alpha)\%$ confidence interval for the population mean with known variance is given by

\[ \bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

or equivalently,

\[ \bar{X} \pm B \]

where the margin of error (also called the sampling error, the bound, or the interval half width) is given by

\[ B = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

Basic Terminology for Confidence Interval for a Population Mean with Known Population Variance (Table 8.3)

Term | Symbol | To Obtain
Standard Error of the Mean | $\sigma_{\bar{X}}$ | $\sigma/\sqrt{n}$
Z Value (also called the Reliability Factor) | $Z_{\alpha/2}$ | Use the Standard Normal Distribution Table
Margin of Error | $B$ | $B = Z_{\alpha/2}\,\sigma/\sqrt{n}$
Lower Confidence Limit | LCL | $\mathrm{LCL} = \bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n}$
Upper Confidence Limit | UCL | $\mathrm{UCL} = \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}$
Width (width is twice the bound) | $w$ | $w = 2B = 2Z_{\alpha/2}\,\sigma/\sqrt{n}$

Student's t Distribution

Given a random sample of $n$ observations, with mean $\bar{X}$ and standard deviation $s$, from a normally distributed population with mean $\mu$, the variable $t$ follows the Student's t distribution with $(n - 1)$ degrees of freedom and is given by

\[ t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \]

Notation

A random variable having the Student's t distribution with $v$ degrees of freedom will be denoted $t_v$. Then $t_{v,\alpha/2}$ is defined as the number for which

\[ P(t_v > t_{v,\alpha/2}) = \frac{\alpha}{2} \]

Confidence Intervals for the Mean of a Normal Population: Population Variance Unknown

Suppose there is a random sample of $n$ observations from a normal distribution with mean $\mu$ and unknown variance.
If the sample mean and standard deviation are, respectively, $\bar{X}$ and $s$, then a $100(1 - \alpha)\%$ confidence interval for the population mean, variance unknown, is given by

\[ \bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} \]

or equivalently,

\[ \bar{X} \pm B \]

where the margin of error, the sampling error, or bound, $B$, is given by

\[ B = t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} \]

and $t_{n-1,\alpha/2}$ is the number for which

\[ P(t_{n-1} > t_{n-1,\alpha/2}) = \frac{\alpha}{2} \]

The random variable $t_{n-1}$ has a Student's t distribution with $v = (n - 1)$ degrees of freedom.

Confidence Intervals for Population Proportion (Large Samples)

Let $p$ denote the observed proportion of "successes" in a random sample of $n$ observations from a population with a proportion $\pi$ of successes. Then, if $n$ is large enough that $n\pi(1 - \pi) > 9$, a $100(1 - \alpha)\%$ confidence interval for the population proportion is given by

\[ p - Z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}} < \pi < p + Z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}} \]

or equivalently,

\[ p \pm B \]

where the margin of error, the sampling error, or bound, $B$, is given by

\[ B = Z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}} \]

and $Z_{\alpha/2}$ is the number for which a standard normal variable $Z$ satisfies

\[ P(Z > Z_{\alpha/2}) = \frac{\alpha}{2} \]

Notation

A random variable having the chi-square distribution with $v = n - 1$ degrees of freedom will be denoted by $\chi^2_v$ or simply $\chi^2_{n-1}$. Define $\chi^2_{n-1,\alpha}$ as the number for which

\[ P(\chi^2_{n-1} > \chi^2_{n-1,\alpha}) = \alpha \]

Figure 8.17: The Chi-Square Distribution. The area to the left of $\chi^2_{n-1,\alpha}$ is $1 - \alpha$ and the area to the right is $\alpha$.

Figure 8.18: The Chi-Square Distribution with $n - 1$ degrees of freedom and a $100(1 - \alpha)\%$ confidence level. Each tail has area $\alpha/2$, and the central area between $\chi^2_{n-1,1-\alpha/2}$ and $\chi^2_{n-1,\alpha/2}$ is $1 - \alpha$.

Confidence Intervals for the Variance of a Normal Population

Suppose there is a random sample of $n$ observations from a normally distributed population with variance $\sigma^2$. If the observed variance is $s^2$, then a $100(1 - \alpha)\%$ confidence interval for the population variance is given by
\[ \frac{(n - 1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{n-1,1-\alpha/2}} \]

where $\chi^2_{n-1,\alpha/2}$ is the number for which

\[ P(\chi^2_{n-1} > \chi^2_{n-1,\alpha/2}) = \frac{\alpha}{2} \]

and $\chi^2_{n-1,1-\alpha/2}$ is the number for which

\[ P(\chi^2_{n-1} < \chi^2_{n-1,1-\alpha/2}) = \frac{\alpha}{2} \]

and the random variable $\chi^2_{n-1}$ follows a chi-square distribution with $(n - 1)$ degrees of freedom.

Confidence Intervals for Two Means: Matched Pairs

Suppose that there is a random sample of $n$ matched pairs of observations from normal distributions with means $\mu_X$ and $\mu_Y$. That is, $x_1, x_2, \ldots, x_n$ denote the values of the observations from the population with mean $\mu_X$, and $y_1, y_2, \ldots, y_n$ the matched sampled values from the population with mean $\mu_Y$. Let $\bar{d}$ and $s_d$ denote the observed sample mean and standard deviation for the $n$ differences $d_i = x_i - y_i$. If the population distribution of the differences is assumed to be normal, then a $100(1 - \alpha)\%$ confidence interval for the difference between means ($\mu_d = \mu_X - \mu_Y$) is given by

\[ \bar{d} - t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}} \]

or equivalently,

\[ \bar{d} \pm B \]

where the margin of error, the sampling error, or the bound, $B$, is given by

\[ B = t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}} \]

and $t_{n-1,\alpha/2}$ is the number for which

\[ P(t_{n-1} > t_{n-1,\alpha/2}) = \frac{\alpha}{2} \]

The random variable $t_{n-1}$ has a Student's t distribution with $(n - 1)$ degrees of freedom.

Confidence Intervals for Difference Between Means: Independent Samples (Normal Distributions and Known Population Variances)

Suppose that there are two independent random samples of $n_x$ and $n_y$ observations from normally distributed populations with means $\mu_X$ and $\mu_Y$ and variances $\sigma^2_X$ and $\sigma^2_Y$.
If the observed sample means are $\bar{X}$ and $\bar{Y}$, then a $100(1 - \alpha)\%$ confidence interval for $(\mu_X - \mu_Y)$ is given by

\[ (\bar{X} - \bar{Y}) - Z_{\alpha/2}\sqrt{\frac{\sigma^2_X}{n_x} + \frac{\sigma^2_Y}{n_y}} < \mu_X - \mu_Y < (\bar{X} - \bar{Y}) + Z_{\alpha/2}\sqrt{\frac{\sigma^2_X}{n_x} + \frac{\sigma^2_Y}{n_y}} \]

or equivalently,

\[ (\bar{X} - \bar{Y}) \pm B \]

where the margin of error is given by

\[ B = Z_{\alpha/2}\sqrt{\frac{\sigma^2_X}{n_x} + \frac{\sigma^2_Y}{n_y}} \]

Confidence Intervals for Two Means: Unknown Population Variances that are Assumed to be Equal

Suppose that there are two independent random samples with $n_x$ and $n_y$ observations from normally distributed populations with means $\mu_X$ and $\mu_Y$ and a common, but unknown, population variance. If the observed sample means are $\bar{X}$ and $\bar{Y}$, and the observed sample variances are $s^2_X$ and $s^2_Y$, then a $100(1 - \alpha)\%$ confidence interval for $(\mu_X - \mu_Y)$ is given by

\[ (\bar{X} - \bar{Y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s^2_p}{n_x} + \frac{s^2_p}{n_y}} < \mu_X - \mu_Y < (\bar{X} - \bar{Y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s^2_p}{n_x} + \frac{s^2_p}{n_y}} \]

or equivalently,

\[ (\bar{X} - \bar{Y}) \pm B \]

where the margin of error is given by

\[ B = t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s^2_p}{n_x} + \frac{s^2_p}{n_y}} \]

The pooled sample variance, $s^2_p$, is given by

\[ s^2_p = \frac{(n_x - 1)s^2_X + (n_y - 1)s^2_Y}{n_x + n_y - 2} \]

and $t_{n_x+n_y-2,\alpha/2}$ is the number for which

\[ P(t_{n_x+n_y-2} > t_{n_x+n_y-2,\alpha/2}) = \frac{\alpha}{2} \]

The random variable $T$ approximately follows a Student's t distribution with $n_x + n_y - 2$ degrees of freedom, and $T$ is given by

\[ T = \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{s_p\sqrt{\dfrac{1}{n_x} + \dfrac{1}{n_y}}} \]

Confidence Intervals for Two Means: Unknown Population Variances, Assumed Not Equal

Suppose that there are two independent random samples of $n_x$ and $n_y$ observations from normally distributed populations with means $\mu_X$ and $\mu_Y$, and it is assumed that the population variances are not equal.
If the observed sample means and variances are $\bar{X}$, $\bar{Y}$ and $s^2_X$, $s^2_Y$, then a $100(1 - \alpha)\%$ confidence interval for $(\mu_X - \mu_Y)$ is given by

\[ (\bar{X} - \bar{Y}) - t_{v,\alpha/2}\sqrt{\frac{s^2_X}{n_x} + \frac{s^2_Y}{n_y}} < \mu_X - \mu_Y < (\bar{X} - \bar{Y}) + t_{v,\alpha/2}\sqrt{\frac{s^2_X}{n_x} + \frac{s^2_Y}{n_y}} \]

where the margin of error is given by

\[ B = t_{v,\alpha/2}\sqrt{\frac{s^2_X}{n_x} + \frac{s^2_Y}{n_y}} \]

The degrees of freedom, $v$, is given by

\[ v = \frac{\left[\dfrac{s^2_X}{n_x} + \dfrac{s^2_Y}{n_y}\right]^2}{\left(\dfrac{s^2_X}{n_x}\right)^2 \Big/ (n_x - 1) + \left(\dfrac{s^2_Y}{n_y}\right)^2 \Big/ (n_y - 1)} \]

If the sample sizes are equal ($n_x = n_y = n$), then the degrees of freedom reduces to

\[ v = \left(1 + \frac{2}{\dfrac{s^2_X}{s^2_Y} + \dfrac{s^2_Y}{s^2_X}}\right)(n - 1) \]

Confidence Intervals for the Difference Between Two Population Proportions (Large Samples)

Let $p_X$ denote the observed proportion of successes in a random sample of $n_X$ observations from a population with proportion $\pi_X$ of successes, and let $p_Y$ denote the proportion of successes observed in an independent random sample of $n_Y$ observations from a population with proportion $\pi_Y$ of successes. Then, if the sample sizes are large (generally at least forty observations in each sample), a $100(1 - \alpha)\%$ confidence interval for the difference between population proportions $(\pi_X - \pi_Y)$ is given by

\[ (p_X - p_Y) \pm B \]

where the margin of error is

\[ B = Z_{\alpha/2}\sqrt{\frac{p_X(1 - p_X)}{n_X} + \frac{p_Y(1 - p_Y)}{n_Y}} \]

Sample Size for the Mean of a Normally Distributed Population with Known Population Variance

Suppose that a random sample from a normally distributed population with known variance $\sigma^2$ is selected. Then a $100(1 - \alpha)\%$ confidence interval for the population mean extends a distance $B$ (sometimes called the bound, sampling error, or the margin of error) on each side of the sample mean, if the sample size, $n$, is

\[ n = \frac{Z^2_{\alpha/2}\,\sigma^2}{B^2} \]

Sample Size for Population Proportion

Suppose that a random sample is selected from a population.
Then a $100(1 - \alpha)\%$ confidence interval for the population proportion, extending a distance of at most $B$ on each side of the sample proportion, can be guaranteed if the sample size, $n$, is

\[ n = \frac{0.25\,(Z_{\alpha/2})^2}{B^2} \]

Key Words

• Bias
• Bound
• Confidence interval:
    • For mean, known variance
    • For mean, unknown variance
    • For proportion
    • For two means, matched
    • For two means, variances equal
    • For two means, variances not equal
    • For variance
• Confidence Level
• Estimate
• Estimator
• Interval Half Width
• Lower Confidence Limit (LCL)
• Margin of Error
• Minimum Variance Unbiased Estimator
• Most Efficient Estimator
• Point Estimate
• Point Estimator
• Relative Efficiency
• Reliability Factor
• Sample Size for Mean, Known Variance
• Sample Size for Proportion
• Sampling Error
• Student's t
• Unbiased Estimator
• Upper Confidence Limit (UCL)
• Width
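As a worked illustration of the two sample-size formulas in this chapter, the following sketch evaluates $n = Z^2_{\alpha/2}\sigma^2/B^2$ and $n = 0.25(Z_{\alpha/2})^2/B^2$. The function names and the input values (a standard deviation of 15, bounds of 2 and 0.03) are hypothetical. Rounding up to the next whole observation is a common convention that the slides do not state, so treat it as an assumption.

```python
import math


def sample_size_mean(z, sigma, bound):
    """Sample size for a mean with known sigma: n = z^2 * sigma^2 / B^2.

    Rounded up (an assumed convention) so the margin of error does not exceed B.
    """
    return math.ceil((z * sigma / bound) ** 2)


def sample_size_proportion(z, bound):
    """Sample size for a proportion: n = 0.25 * z^2 / B^2.

    The factor 0.25 is the largest possible value of p(1 - p), reached at p = 0.5.
    """
    return math.ceil(0.25 * z ** 2 / bound ** 2)


if __name__ == "__main__":
    # 95% confidence uses z = 1.96 (Table 8.2); the other inputs are made up.
    print(sample_size_mean(z=1.96, sigma=15.0, bound=2.0))   # 217
    print(sample_size_proportion(z=1.96, bound=0.03))        # 1068
```

The proportion formula needs no estimate of $p$ because it uses the worst case $p(1 - p) = 0.25$, which is why the resulting sample size can be guaranteed in advance.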