Consolidation & Review

Inference

Sample Proportions
• Column of zeros and ones
• $\hat{p}$: the estimated proportion or fraction equals # of successes / # of observations $= k/n$
• Where $k \sim$ Binomial[mean $= np$, variance $= np(1-p)$]
• $E[\hat{p}] = E[k/n] = E[(1/n)k] = (1/n)E[k] = (1/n)np = p$
• $\mathrm{Var}[\hat{p}] = \mathrm{Var}[k/n] = (1/n)^2\,\mathrm{Var}[k] = (1/n)^2\,np(1-p)$
• $\mathrm{Var}[\hat{p}] = p(1-p)/n$
• $\hat{p}$ is a point estimate of $p$
• The estimate of the variance of $\hat{p}$ is $\hat{p}(1-\hat{p})/n$

Sample Proportions: example from Lab 3
• One of the ten columns of 50 observations of ones and zeros, the one with the smallest proportion, 0.32
• $\widehat{\mathrm{Var}}[\hat{p}] = 0.32 \times 0.68/50 = 0.004352$, with square root, i.e. standard deviation, of 0.066
• A 95% confidence interval for an estimate of $p$ from this sample is:
  Prob$[-1.96 \le (0.32 - p)/0.066 \le 1.96] = 0.95$
  Prob$[-1.96 \times 0.066 \le (0.32 - p) \le 1.96 \times 0.066] = 0.95$
  Prob$[-0.13 \le (0.32 - p) \le 0.13] = 0.95$
  Prob$[0.13 \ge (p - 0.32) \ge -0.13] = 0.95$
  Prob$[0.45 \ge p \ge 0.19] = 0.95$

Note:
• This 95% confidence interval does not include $p = 0.5$, the population parameter chosen for the simulation, illustrating that 5% of the time the 95% confidence interval will not include the true value!
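The interval arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `proportion_confidence_interval` and its parameters are illustrative and do not come from the lab, and it uses the same normal approximation with z = 1.96.

```python
import math


def proportion_confidence_interval(p_hat, n, z=1.96):
    """Return (std_error, lower, upper) for a normal-approximation interval.

    p_hat: sample proportion (k/n)
    n:     number of observations
    z:     standard normal cutoff (1.96 leaves 2.5% in each tail)
    """
    # Estimated variance of p_hat is p_hat * (1 - p_hat) / n
    variance = p_hat * (1.0 - p_hat) / n
    std_error = math.sqrt(variance)
    margin = z * std_error
    return std_error, p_hat - margin, p_hat + margin


# Lab 3 column with the smallest proportion: p_hat = 0.32, n = 50
std_error, lower, upper = proportion_confidence_interval(0.32, 50)
print(f"std error = {std_error:.3f}, interval = [{lower:.2f}, {upper:.2f}]")
# prints: std error = 0.066, interval = [0.19, 0.45]
```

Passing a different `z` (for example 2.576 for a 99% interval) widens the interval accordingly.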
• Prob$[-1.96 \le (0.32 - p)/0.066 \le 1.96] = 0.95$ is the same as Prob$[-1.96 \le z \le 1.96] = 0.95$, where $z = (\hat{p} - p)/\hat{\sigma}(\hat{p}) = (0.32 - p)/0.066$ in this example
• We can use the normal distribution approximation to the binomial in this example since $n\hat{p} = 50 \times 0.32 = 16 \ge 5$ and $n(1-\hat{p}) = 50 \times 0.68 = 34 \ge 5$

A z value of 1.96 leads to an area of 0.475, leaving 0.025 in the upper tail.

Interval Estimation
• The conventional approach is to choose a probability for the interval such as 95% or 99%

So z values of -1.96 and 1.96 leave 2.5% in each tail.

[Figure: Density Function for the Standardized Normal Variate, $f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}[(z-0)/1]^2}$. Horizontal axis: Standard Deviations; vertical axis: Density. 2.5% lies in each tail, below $z = -1.96$ and above $z = 1.96$.]

Application of Sample Proportions

Field Poll Margin of Error
• Variance of $\hat{p}$ $= \hat{p}(1-\hat{p})/n = 0.47 \times 0.53/599 = 0.000416$
• $\hat{\sigma}(\hat{p}) = \sqrt{0.000416} = 0.0204$
• So two standard deviations is about 0.041 or 4.1%, i.e. the margin of error is plus or minus 4.1 percentage points

Inferring the unknown population mean from a sample mean
• Example from Lab 3: simulate the population as uniform, with random variable $x$, $0 \le x \le 1$, and density $f(x) = 1$
• Note: $\int_0^1 f(x)\,dx = x\,\big|_0^1 = 1 = F(1)$, the CDF
• Note the expected value of $x$: $E[x] = \int_0^1 x f(x)\,dx = \int_0^1 x \cdot 1\,dx = x^2/2\,\big|_0^1 = 1/2$
• $\mathrm{Var}[x] = E\{(x - E[x])^2\} = E\{x^2 - 2xE[x] + E[x]^2\}$
• $\mathrm{Var}[x] = E[x^2] - 2E[x]E[x] + E[x]^2 = E[x^2] - E[x]^2$
• $E[x^2] = \int_0^1 x^2 f(x)\,dx = \int_0^1 x^2\,dx = x^3/3\,\big|_0^1 = 1/3$
• $\mathrm{Var}[x] = E[x^2] - E[x]^2 = 1/3 - (1/2)^2 = 1/3 - 1/4 = (4-3)/12 = 1/12$
• $x \sim$ Uniform(mean $= 1/2$, variance $= 1/12$)

In Lab 3 we drew a random sample of size 50 from this uniform distribution and calculated the sample mean: $\bar{x} = \left(\sum_{i=1}^{50} x_i\right)/n$
• From the central limit theorem we know, and we saw in Lab 3, that the sample mean is distributed approximately normally

Central tendency and dispersion of sample mean

$$E[\bar{x}] = E\left[\sum_{i=1}^{n} x_i/n\right] = (1/n)\sum_{i=1}^{n} E[x_i] = (1/n)\sum_{i=1}^{n} \mu = (1/n)\,n\mu = \mu$$

Where $\mu$ is the population mean. In the simulation from Lab 3 using the uniform distribution, we knew that $\mu = 0.5$.

$$\mathrm{Var}[\bar{x}] = \mathrm{Var}\left[\sum_{i=1}^{n} x_i/n\right] = (1/n)^2\,\mathrm{Var}\left[\sum_{i=1}^{n} x_i\right] = (1/n)^2 \sum_{i=1}^{n} \mathrm{Var}[x_i] = (1/n)^2 \sum_{i=1}^{n} \sigma^2 = (1/n)^2\,n\sigma^2 = \sigma^2/n$$

Where $\sigma^2$ is the variance of $x$, and the variance of the sum equals the sum of the variances because the draws are independent. In the simulation from Lab 3 using the uniform distribution, we knew that $\sigma^2 = 1/12$.

Hypothesis testing
• Example from Lab 3 for sample proportions
• Step one: formulate the hypotheses
  Null hypothesis, $H_0$: $p = 0.5$
  Alternative hypothesis, $H_A$: $p < 0.5$
• Step two: identify a test statistic, $z = (\hat{p} - p)/\hat{\sigma}_{\hat{p}}$
• Where the value for $p$ is from the null hypothesis, so $z = (0.32 - 0.5)/0.066 = -0.18/0.066 = -2.73$
• If the null hypothesis were true, what is the probability of getting a test statistic of this size?

Hypothesis Testing: 4 Steps
• Formulate all the hypotheses
• Identify a test statistic
• If the null hypothesis were true, what is the probability of getting a test statistic this large?
• Compare this probability to a chosen critical level of significance, e.g. 5%

A z value of 2.73 leads to an area of 0.4968, leaving 0.0032 in the upper tail, and hence, by symmetry, 0.0032 in the lower tail below $z = -2.73$. If you choose a risk level of 0.05, i.e. $\alpha = 0.05$ for the probability of a type I error, then reject $H_0$, since 0.0032 is less than 0.05.

[Figure: Density Function for the Standardized Normal Variate, $f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}[(z-0)/1]^2}$. Horizontal axis: Standard Deviations; vertical axis: Density. An area of 0.0032 lies in the lower tail below $z = -2.73$.]

[Figure: Density Function for the Standardized Normal Variate, same axes. An area of 0.050 lies in the lower tail below $z = -1.645$.]
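The four hypothesis-testing steps for the Lab 3 proportion can be sketched in Python as follows. This is an illustrative sketch, not the lab's own code: the function name `lower_tail_z_test` is hypothetical, and, following the slides, it uses the estimated standard deviation of the sample proportion (about 0.066) in the denominator. The standard library's `statistics.NormalDist` stands in for the z table.

```python
import math
from statistics import NormalDist


def lower_tail_z_test(p_hat, p_null, n, alpha=0.05):
    """One-sided (lower-tail) z test of H0: p = p_null against HA: p < p_null.

    Returns (z, p_value, critical_z, reject_null).
    """
    # Estimated standard deviation of p_hat, as on the slides
    std_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    # Test statistic, with p taken from the null hypothesis
    z = (p_hat - p_null) / std_error
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Probability of a statistic this far into the lower tail if H0 is true
    p_value = standard_normal.cdf(z)
    # z value that leaves alpha in the lower tail (about -1.645 for 5%)
    critical_z = standard_normal.inv_cdf(alpha)
    return z, p_value, critical_z, p_value < alpha


z, p_value, critical_z, reject = lower_tail_z_test(0.32, 0.5, 50)
print(f"z = {z:.2f}, lower-tail probability = {p_value:.4f}")
print(f"critical z = {critical_z:.3f}, reject H0: {reject}")
```

Comparing the lower-tail probability with α and comparing z with the critical value are equivalent ways of reaching the same decision.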
