Variability

• The differences between individuals in a population.
• Measured by calculations such as Standard Error, Confidence Interval and Sampling Error.
• Variability gives an indication of the effort required to estimate population parameters with confidence.

Standard Error (of the mean)

• The standard deviation of all possible sample means around the true population mean. It is estimated by dividing the sample standard deviation by the square root of the sample size.
• The notation for standard error is a lower case s with an x-bar subscript, or SE.

Standard Error for finite populations

• If the total population is small, or if the sample comprises more than 5 to 10 percent of the population, the sample mean is probably closer to the population mean than it would be when sampling from an infinite population. As a result, the standard error of the mean is also smaller.
• The finite population correction factor serves to reduce the standard error when relatively large samples are drawn from finite populations.
• If every unit in the population is measured, then the true mean is known and the standard error becomes zero, because no other sample mean is possible at that sample size.

Confidence Interval

• Used to specify the precision of the sample mean in relation to the population mean. It is derived by taking the sample mean and adding and subtracting the standard error multiplied by a tabulated t value for the desired level of confidence.
• The t value comes from Student's t distribution, which is really a family of distributions resembling the normal distribution, each defined by its degrees of freedom: the sample size minus one (represented as ν). For a chosen probability level, 95% for example, the t value is found in a lookup table using the degrees of freedom and the probability level. The t value is then the multiplier applied to the standard error to determine the confidence interval.
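The standard error, finite population correction, and confidence interval described above can be sketched in a few lines of Python. This is a minimal illustration, not the handbook's procedure: the plot volumes are made-up values, the function names are hypothetical, and the correction is assumed to take the common form sqrt(1 - n/N). The small t table holds standard two-tailed 95% values for a few degrees of freedom.

```python
import math
import statistics

# Two-tailed 95% Student's t values for selected degrees of freedom,
# as they would be read from a standard lookup table.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        10: 2.228, 20: 2.086, 30: 2.042}


def standard_error(values, population_size=None):
    """Standard error of the mean: s / sqrt(n).

    If population_size (N) is given, apply the finite population
    correction, assumed here to be sqrt(1 - n/N).
    """
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    se = s / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt(1 - n / population_size)
    return se


def confidence_interval(values, t_value, population_size=None):
    """Return (lower, upper) limits: mean -/+ t * SE."""
    mean = statistics.mean(values)
    half_width = t_value * standard_error(values, population_size)
    return mean - half_width, mean + half_width


# Hypothetical volumes measured on 5 plots (illustrative data only).
plot_volumes = [12.0, 15.0, 9.0, 14.0, 10.0]
degrees_of_freedom = len(plot_volumes) - 1

low, high = confidence_interval(plot_volumes, T_95[degrees_of_freedom])
print(f"SE (infinite population): {standard_error(plot_volumes):.3f}")
print(f"SE (N = 90 potential plots): {standard_error(plot_volumes, 90):.3f}")
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
```

Passing a population size equal to the sample size drives the standard error to zero, which matches the statement that measuring every unit leaves no uncertainty about the mean.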
• The t value increases as sample size decreases, leading to wider confidence intervals and decreased precision.

Student's t distribution

• For t = 2 there is roughly a 95% chance that the confidence interval from your sample includes the true population mean.

Confidence Interval: Effect of Standard Deviation

• Both distributions have a mean of 50. The red distribution has a standard deviation of 5; the blue distribution has a standard deviation of 10. For the red distribution, 68% of the distribution is between 45 and 55; for the blue distribution, 68% is between 40 and 60.

Sampling Error

• Rather than work with absolute confidence limits, convert them to a percent of the sample mean; this is called the sampling error. The notation in the handbook is an upper case E.
• Take the confidence interval quantity (the t value times the standard error) and scale it to the sample mean by dividing by the sample mean. Express this value as a percent by multiplying by 100.

Sample Error Example – Step 1 (Calculate Standard Error)

• Plots 1/5 acre in size were used in this example, and the stratum covers 18 acres. The total number of plots for the stratum is therefore 5 plots per acre times 18 acres = 90 potential plots.
• Notice the application of the Finite Correction Factor (FCF) for this method.

Sample Error – Step 2

• Recall that the Standard Error was calculated as 8.3 ft³.
• 36.2% is well above the level we set to begin with (10%) – Implications?

Determining Sample Size

• Since t is very close to 2 for 95% confidence at infinite sample size (1.96), we will use 2.
• E is the desired sampling error; we will use 10%.

Sample Coefficient of Variation refresher

• Because populations with large means tend to have larger standard deviations than those with small means, the coefficient of variation permits a comparison of relative variability about different-sized means. The sample coefficient of variation is an expression of the standard deviation as a percentage of the mean. It is usually represented by upper case CV.
• CV is calculated as the standard deviation divided by the mean, multiplied by 100 to convert to a percentage.

Sampling Intensity

• Once a reasonable number of sample units has been selected (i.e., over 20 to 30), adding another sampling unit probably won't change the estimates of the mean and standard deviation by much.
• The biggest gain to be had is in the estimation of standard error. Recall that the standard error is calculated by dividing the sample standard deviation by the square root of the sample size, so as sample size increases, the standard error decreases.

Effect of CV change

• At high coefficients of variation, very high sample numbers are needed to obtain the desired confidence.
• At lower desired errors, the number of samples needed approaches the population size.

Sampling Intensity – The USFS Way

Sample Selection – from Precruise data

1. Determine the sampling error for the sale as a whole (set to 10%).
2. Subdivide (or stratify) the sale population into sampling components as needed to reduce the variability within the sampling strata.
3. Calculate the coefficient of variation (CV) by stratum and a weighted CV over all strata.
4. Calculate the number of plots for the sale as a whole and then distribute them by stratum.

Number of Plots

• Value of t is assumed to be 2.
• Error is set at 10%.

Distribute Plots by Stratum

For each stratum, the calculation would look like this:
• n1 = (17.6 * 185) / 67.9 = 48 plots
• n2 = (7.7 * 185) / 67.9 = 21 plots
• n3 = (7.2 * 185) / 67.9 = 20 plots
• n4 = (35.4 * 185) / 67.9 = 96 plots
• Which totals to the 185 plots for the sale.
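The plot counts above can be reproduced with a short Python sketch. It assumes the usual sample size relationship n = (t * CV / E)^2, which follows from writing the sampling error as E = t * CV / sqrt(n) and solving for n. That formula is an assumption here, but with t = 2, E = 10% and a weighted CV of 67.9 it gives 184.4, which rounds up to the 185 plots quoted. The function names are hypothetical; the four stratum values are the weighted CV contributions from the example.

```python
import math


def sampling_error_percent(t_value, cv_percent, n):
    """Sampling error E (percent of the mean) for n plots: t * CV / sqrt(n)."""
    return t_value * cv_percent / math.sqrt(n)


def plots_needed(cv_percent, error_percent=10.0, t_value=2.0):
    """Plots needed for the desired sampling error: (t * CV / E)^2, rounded up."""
    return math.ceil((t_value * cv_percent / error_percent) ** 2)


def distribute_plots(weighted_cv_parts, total_plots):
    """Split total_plots across strata in proportion to each stratum's
    share of the weighted CV, rounding each share to the nearest plot."""
    total_cv = sum(weighted_cv_parts)
    return [round(part * total_plots / total_cv) for part in weighted_cv_parts]


# Weighted CV contribution of each stratum, taken from the example above.
weighted_cv_parts = [17.6, 7.7, 7.2, 35.4]
weighted_cv = sum(weighted_cv_parts)  # about 67.9

total_plots = plots_needed(weighted_cv, error_percent=10.0, t_value=2.0)
per_stratum = distribute_plots(weighted_cv_parts, total_plots)

print("Plots for the sale:", total_plots)
print("Plots by stratum:", per_stratum)
print("Expected sampling error: "
      f"{sampling_error_percent(2.0, weighted_cv, total_plots):.2f}%")
```

Rounding each stratum independently happens to sum to the sale total in this example. With other inputs the rounded shares can differ from the total by a plot or two, so the result is worth checking before use.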