Download Estimated standard error of the sample average

Estimation

Sampling Distributions
• Because estimators are based on random samples, they are random variates just like data!
• Estimators have distributions called sampling distributions.
Say we are interested in the mean Mn mass contained in bullets manufactured at a particular factory.
Let's use the average mass of Mn in a sample (of size n) to estimate the population mean mass: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
What might the distribution of $\bar{x}$ look like if we take 1000 samples of size 10 over the course of a week?

Sampling Distributions
Important features of an estimator's sampling distribution:
[Figure: (approximate) sampling distribution of $\bar{x}$; sample size n = 10 bullets, number of samples = 1000; annotated with the sampling distribution's mean and s.d.]

Handy Unbiased Estimators
• An unbiased estimator of the mean that we always use is: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (same as the MLE estimate)
• An unbiased estimator of the variance (which we will typically use as a variance estimator) is: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ (different from the MLE estimate)

Handy Unbiased Estimators
• An unbiased estimator for a proportion is: $\hat{p} = \frac{\text{number of heads, successes, etc.}}{n}$
• An unbiased estimator of the standard error of p is:

Sampling Distributions
• Uncertainty in the estimate can be represented as the standard deviation of the sampling distribution: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ is called the standard error of the estimator.
• The estimated standard error of the sample average is obtained by plugging in s for σ: $\widehat{se}(\bar{x}) = s/\sqrt{n}$

Interval Estimation
• We are interested in methods that produce an interval: $[\hat{\theta}_{lo}, \hat{\theta}_{hi}]$
• Given that the assumptions of the methods are satisfied, the interval covers the true value of the parameter with (approximate) probability at least 1 − α.
• Common interval methods:
  • Confidence intervals
  • Prediction intervals
  • Tolerance intervals
  • Credibility/Probability intervals (Bayesian)

Confidence Intervals
• θ is a parameter we are interested in and assume we don't know its true value.
  • e.g. a mean, an sd, a proportion, etc.
• Consider an experiment that will collect a sample of data.
• Then BEFORE we collect the data, we can devise a procedure such that: $\Pr(\hat{\theta}_{lo} \le \theta \le \hat{\theta}_{hi}) = 1 - \alpha$, where $\hat{\theta}_{lo}$ and $\hat{\theta}_{hi}$ are the estimates we will get from the sample we have yet to collect.

Confidence Intervals
• In order to get actual numerical values for $[\hat{\theta}_{lo}, \hat{\theta}_{hi}]$, we perform the experiment and plug in the data.
• The outcomes for this experiment are: the interval covers θ, or it does not.
• Under the frequentist definition, probabilities (other than 0 or 1) only exist for outcomes of experiments that haven't happened yet.
• After we collect data, $[\hat{\theta}_{lo}, \hat{\theta}_{hi}]$ is a set of plausible values for θ.

Confidence Intervals
• Given a sample of data, the (1 − α)×100% confidence interval for a parameter estimated on the sample is: $[\hat{\theta}_{lo}, \hat{\theta}_{hi}]$
• We are (1 − α)×100% confident that the true value of θ is covered by $[\hat{\theta}_{lo}, \hat{\theta}_{hi}]$.
• The CI's level of confidence, (1 − α)×100%, is the same "number" as the CI method's probability of producing an interval that covers θ, but… confidence is not probability.

Confidence Intervals
• So how do we compute a (1 − α)×100% confidence interval given a set of data?
• General case: (1 − α)×100% CIs for the mean μ:
  • Sample size n, sd σ_X unknown and estimated by s:
    Two sided: $[\bar{x} - t_{1-\alpha/2,\,n-1}\, s/\sqrt{n},\ \bar{x} + t_{1-\alpha/2,\,n-1}\, s/\sqrt{n}]$
    One sided, lower bound: $[\bar{x} - t_{1-\alpha,\,n-1}\, s/\sqrt{n},\ \infty)$
    One sided, upper bound: $(-\infty,\ \bar{x} + t_{1-\alpha,\,n-1}\, s/\sqrt{n}]$
    The t values are Student-t(n-1) quantiles: qt(1-alpha/2, df=n-1) or qt(1-alpha, df=n-1)

Compute the Confidence Intervals
The mass of an unknown powder was determined 30 times. The results are shown below (units: mg):
4.11, 3.70, 3.36, 3.68, 4.42, 3.23, 4.03, 4.03, 3.52, 4.75, 5.09, 3.47, 3.02, 4.24, 4.74, 4.51, 2.90, 4.15, 3.54, 3.81, 2.98, 3.82, 4.32, 3.06, 4.00, 4.05, 3.19, 3.17, 3.67, 4.37
Compute:
a. The sample mean
b. The sample sd
c. The estimated standard error of the mean
d. The number of estimated standard errors (±) that cover 95% of the sampling distribution symmetrically about the sample mean

Compute the Confidence Intervals
a. Sample mean = 3.83
b. Sample sd = 0.58
c. Est. se of mean = 0.11
d. For 95%, α = 0.05.
For a 95% spread symmetrically about the mean we want $t_{0.025,\,29}$ and $t_{0.975,\,29}$ = ±2.04523

    # Data from the question:
    x <- c(4.11, 3.70, 3.36, 3.68, 4.42, 3.23, 4.03, 4.03, 3.52, 4.75,
           5.09, 3.47, 3.02, 4.24, 4.74, 4.51, 2.90, 4.15, 3.54, 3.81,
           2.98, 3.82, 4.32, 3.06, 4.00, 4.05, 3.19, 3.17, 3.67, 4.37)

    n  <- length(x)     # Sample size
    mn <- mean(x)       # Sample average (estimated mean)
    s  <- sd(x)         # Sample standard deviation
    se <- s/sqrt(n)     # Estimated standard error of the mean

    alpha <- 0.05                     # Level of significance
    conf  <- 1 - alpha/2              # Upper quantile probability for a two-sided interval (0.975)
    tt    <- qt(p = conf, df = n-1)   # t-quantile: the number of estimated standard
                                      # errors that cover (1 - alpha)*100% of the
                                      # sampling distribution for the mean.

Compute the Confidence Intervals
e. Compute the two-sided 95% CI for the mean given this data:
[ 3.83 − 2.04*0.11, 3.83 + 2.04*0.11 ]

    lo <- mn - tt*se
    hi <- mn + tt*se
    c(lo,hi)            # Two-sided confidence interval: a set of
                        # plausible values for the mean given this sample.

[3.61, 4.05]

Confidence Intervals
• For us, we can approximate the CI for any parameter we have encountered as follows.
• (1 − α)×100% CIs for a general parameter θ:
    Two sided: $[\hat{\theta} - t_{1-\alpha/2,\,n-1}\,\widehat{se}(\hat{\theta}),\ \hat{\theta} + t_{1-\alpha/2,\,n-1}\,\widehat{se}(\hat{\theta})]$
    One sided, lower bound: $[\hat{\theta} - t_{1-\alpha,\,n-1}\,\widehat{se}(\hat{\theta}),\ \infty)$
    One sided, upper bound: $(-\infty,\ \hat{\theta} + t_{1-\alpha,\,n-1}\,\widehat{se}(\hat{\theta})]$
    The t values are Student-t(n-1) quantiles: qt(1-alpha/2, df=n-1) or qt(1-alpha, df=n-1)

Example
Over a several-month period the rate of attacks per day on a certain computer network was measured:
11.1, 12.3, 12.0, 11.3, 12.6, 12.9, 12.0, 13.2, 11.8, 13.2, 12.4, 10.3, 12.0, 12.1, 13.1
Compute the 90% lower confidence limit of the hack rate parameter.
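The example can be worked with the same steps as the powder-mass problem. The following is a minimal sketch, assuming the "hack rate parameter" is the mean daily attack rate and that the t-based one-sided lower bound applies; the variable names (`rate`, `lower`) are illustrative and do not come from the slides.

```r
# Daily attack rates from the example
rate <- c(11.1, 12.3, 12.0, 11.3, 12.6, 12.9, 12.0, 13.2,
          11.8, 13.2, 12.4, 10.3, 12.0, 12.1, 13.1)

n  <- length(rate)        # Sample size
mn <- mean(rate)          # Sample average (assumed estimator of the rate parameter)
se <- sd(rate)/sqrt(n)    # Estimated standard error of the sample average

alpha <- 0.10                          # 90% confidence
tt <- qt(p = 1 - alpha, df = n - 1)    # One-sided t-quantile: all of alpha in one tail

lower <- mn - tt*se       # 90% lower confidence limit (one-sided)
lower
```

Because the interval is one-sided, the quantile uses `1 - alpha` rather than `1 - alpha/2`; the corresponding interval runs from `lower` up to infinity.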
