Inference on the Mean of a Population, Variance Known

Chapter 4: Modeling and Analysis of Uncertainty
Statistical Inference
• Statistical inference is using the information contained in a sample to make decisions about the population. Statistical inference has two subareas:
1. Parameter estimation – estimating a parameter of a population probability distribution; point and interval estimates.
2. Hypothesis testing – testing a hypothesis made regarding a parameter of a probability distribution.
Point Estimation
• If X is a random variable with a probability distribution f(x) that depends on an unknown parameter θ, and X1, X2, ..., Xn is a random sample of size n from f(x), then

Θ̂ = h(X1, X2, ..., Xn)

is called a point estimator of θ.
1. h is a function of the observations (all random variables) in the sample. Therefore Θ̂ is a random variable.
2. After the random sample has been taken, the random variables have numerical values and Θ̂ has a numerical value that is referred to as θ̂.
3. A point estimate of some population parameter θ is a single numerical value θ̂ of a statistic Θ̂.
Parameters θ and Point Estimates θ̂
Population Mean μ ------------ Sample mean X̄
Population Variance σ² ---------- Sample variance S²
Difference in two means μ₁ − μ₂ ---- Difference in two sample means X̄₁ − X̄₂
All of these point estimates are statistics.
Estimator for Percentiles
A. Let x_p denote the p-quantile of X and z_p denote the p-quantile of the standard normal. Assume X ~ N(μ, σ²). Then

F_X(x_p) = p = Φ((x_p − μ)/σ)  ⟹  (x_p − μ)/σ = z_p  ⟹  x_p = μ + σ z_p.   (3)

Thus, we would estimate x_p by

x̂_p = μ̂ + σ̂ z_p = x̄ + s z_p.   (4)
Estimators for Percentiles
B. When sampling from X ~ exp(λ),

μ_X = 1/λ,  σ_X² = 1/λ² = μ_X²,  x_p = −(1/λ) ln(1 − p) = −μ_X ln(1 − p).

Thus,

μ̂ = x̄,  λ̂ = 1/x̄,  and  x̂_p = −x̄ ln(1 − p).

What if we wanted to estimate σ_X²? We wouldn't use s²; we would use σ̂_X² = x̄².
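The exponential-case estimators above are easy to sketch in code. The following is our own illustration (the function name is not from the slides):

```python
import math

def exp_estimates(sample, p):
    """Point estimates for an exponential population, as in part B:
    mu_hat = x_bar, lambda_hat = 1/x_bar, x_hat_p = -x_bar*ln(1-p),
    and var_hat = x_bar**2 (not the sample variance s**2)."""
    x_bar = sum(sample) / len(sample)
    mu_hat = x_bar                        # estimate of mu = 1/lambda
    lam_hat = 1.0 / x_bar                 # estimate of lambda
    xp_hat = -x_bar * math.log(1.0 - p)   # estimate of the p-quantile
    var_hat = x_bar ** 2                  # estimate of sigma^2 = 1/lambda^2
    return mu_hat, lam_hat, xp_hat, var_hat
```

For example, a sample with mean 2 gives λ̂ = 0.5 and median estimate x̂₀.₅ = 2 ln 2.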
Estimators for Percentiles
Example 1: The data (ordered) are thicknesses of
copper foil from a rolling operation (units in mils).
THICKNESS: 4.72, 4.84, 4.87, 4.94, 4.98, 5.00, 5.01, 5.05, 5.06, 5.07, 5.11, 5.12, 5.21, 5.24, 5.38

[Normal probability plot of THICKNESS]
Average: 5.04   StDev: 0.165745   N: 15
Anderson-Darling Normality Test: A-Squared: 0.159, P-Value: 0.936
Estimators for Percentiles
The normal distribution provides a good fit to the data.
Thus, assume the data come from a normal distribution,
with
ˆ  x  5.04, ˆ  s  0.1657.
The 90th percentile is estimated by
xˆ0.90  x  sz0.90  5.04  0.1657(1.2816)  5.25
(To obtain standard normal quantiles use normal tables or
MINITAB or find normal quantiles directly from
MINITAB –we’ll demonstrate).
The sample 90th percentile (see Chapter 2) is 5.30.
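The estimate of Example 1 can be reproduced with the Python standard library. This is our own sketch of equation (4), not part of the original slides:

```python
from statistics import NormalDist, mean, stdev

def normal_quantile_estimate(sample, p):
    """Equation (4): x_hat_p = x_bar + s * z_p."""
    z_p = NormalDist().inv_cdf(p)          # standard normal p-quantile
    return mean(sample) + stdev(sample) * z_p

# Copper-foil thickness data from Example 1 (mils)
thickness = [4.72, 4.84, 4.87, 4.94, 4.98, 5.00, 5.01, 5.05,
             5.06, 5.07, 5.11, 5.12, 5.21, 5.24, 5.38]
x90 = normal_quantile_estimate(thickness, 0.90)   # about 5.25
```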
Unbiased Estimators
• Bias measures the closeness of the estimator to the parameter. The point estimator θ̂ is an unbiased estimator for the parameter θ if:
E(θ̂) = θ
• If the estimator is not unbiased, then the difference E(θ̂) − θ is called the bias of the estimator θ̂.
In words: “An estimator is unbiased when its expected value
is equal to the parameter it estimates.”
Point Estimation
[Figure: pdf of an unbiased estimator θ̂₁, centered at θ, and pdf of a biased estimator θ̂₂, centered at θ + B.]
"On the average," when θ̂₂ is used, an error of size B is made.
Example 2: Recall that for a random sample from a
population with mean  and variance  2 ,
E  X    and E  S 2    2 .
Thus, the sample mean (variance) is always unbiased for the
population mean (variance).
But note – the sample standard deviation is never unbiased for σ:
E(S) = E(√S²) < √E(S²) = σ
(the inequality holds because the square root is strictly concave).
However, when sampling from the normal distribution,
for sample sizes ≥ 10 the bias is very small.
Remark: In the expression for sample variance,

S² = Σᵢ₌₁ⁿ (Xᵢ − X̄)² / (n − 1),

why not use n in the denominator? Because by using n − 1, we get an unbiased estimator for the population variance.
Whether or not an estimator is unbiased, the smaller the
variance the better.
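A small simulation makes the n − 1 point concrete: dividing by n systematically underestimates σ², while dividing by n − 1 does not. This sketch is our own (the sample size, σ, and replication count are arbitrary choices):

```python
import random

def variance_bias_demo(n=5, reps=40000, seed=2):
    """Compare the average of S^2 (divide by n-1) with the average of the
    divide-by-n estimator, for samples from N(0, sigma^2) with sigma^2 = 4."""
    rng = random.Random(seed)
    sum_s2 = sum_vn = 0.0
    for _ in range(reps):
        x = [rng.gauss(0.0, 2.0) for _ in range(n)]
        xbar = sum(x) / n
        ss = sum((xi - xbar) ** 2 for xi in x)
        sum_s2 += ss / (n - 1)   # unbiased: averages near sigma^2 = 4
        sum_vn += ss / n         # biased low by the factor (n-1)/n
    return sum_s2 / reps, sum_vn / reps
```

With n = 5 the divide-by-n average settles near 4(4/5) = 3.2 rather than 4.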
Point Estimation
Estimator Variance
• When there are two unbiased estimators for the
same parameter, a decision as to which is better
can be made by examining the variance of each.
• The estimator with the least variance is more
likely to result in an estimate that is closer to the
population parameter.
• The unbiased estimator with the least variance is
referred to as the minimum variance unbiased
estimator (MVUE).
Mean Square Error
• A measure that incorporates both the bias
and the variance of an estimator is the mean
square error.
• The mean square error of an estimator θ̂ of the parameter θ is defined as:
MSE(θ̂) = E(θ̂ − θ)²
• The mean square error can be rewritten as:
MSE(θ̂) = V(θ̂) + (bias)²
Relative Efficiency
• The mean square error can be used to
compare the ability of two estimators to
estimate a parameter.
• For two estimators θ̂₁, θ̂₂, the relative efficiency of θ̂₂ to θ̂₁ is:

MSE(θ̂₁) / MSE(θ̂₂)

A ratio less than 1 indicates that θ̂₂ is not as efficient as θ̂₁ in estimating the parameter value.
1. The MSE of an unbiased estimator is simply Var(θ̂).
2. An estimator with small variance and small
bias may be more efficient than one with a
larger variance and no bias, as depicted below.
[Figure: pdf of a biased estimator θ̂₂ with small variance, and pdf of an unbiased estimator θ̂₁ with larger variance, both centered near θ.]
Which estimator is better? Why?
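Relative efficiency is easy to explore by simulation. The sketch below is our own illustration (settings arbitrary): it compares θ̂₁ = sample mean with θ̂₂ = sample median for normal data. Both are unbiased here, and the ratio MSE(θ̂₁)/MSE(θ̂₂) comes out below 1, so the median is the less efficient estimator in this setting:

```python
import random
from statistics import mean, median

def relative_efficiency(n=25, reps=20000, seed=1):
    """Monte Carlo estimate of MSE(theta1)/MSE(theta2), where
    theta1 = sample mean and theta2 = sample median, sampling
    from N(0, 1) so the true parameter is theta = 0."""
    rng = random.Random(seed)
    se_mean = se_med = 0.0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        se_mean += mean(x) ** 2    # squared error of the mean
        se_med += median(x) ** 2   # squared error of the median
    return (se_mean / reps) / (se_med / reps)
```

For large n the ratio approaches 2/π ≈ 0.64, the classical efficiency of the median at the normal distribution.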
Standard Error
• The standard error of a statistic is the standard deviation of its sampling distribution: σ_θ̂ = √V(θ̂).
• If the expression for the standard error
contains its own parameters with unknown
values, but which have estimators, use the
estimators and the result is the estimated
standard error of a statistic.
Point Estimation
Hypothesis Testing
Statistical Hypotheses
We like to think of statistical hypothesis testing as the
data analysis stage of a comparative experiment,
in which the engineer is interested, for example, in
comparing the mean of a population to a specified
value (e.g. mean pull strength).
Hypothesis Testing
• A statistical hypothesis is a statement about the
parameters of one or more populations. They
are expressed as, for example,
H0: μ = 50 cm/s (the null hypothesis)
H1: μ ≠ 50 cm/s (the alternative hypothesis)
• Generally the interest is in whether or not a
parameter has changed, is in accordance with
theory, or is in conformance with a requirement.
Hypothesis Testing cont.
• Testing the hypothesis involves taking a random sample, computing a test statistic, and using the test statistic to make a decision about the null hypothesis.
• The value of the test statistic is compared
to the critical region, and if it is in the
critical region the null hypothesis is
rejected.
Type I Error
• The limits of the critical region are referred to as
the critical values.
• These critical values depend on the desired
probability for rejecting the null hypothesis when
in fact it is true.
• Rejecting the null hypothesis H0 when it is true is
defined as a type I error.
• The probability of a type I error is denoted as α.
α = P(reject H0 when H0 is true)
• The type I error probability is also referred to as
the significance level or size of the test.
• Generally the α value is decided prior to
conducting the test.
Type II Error
• Failing to reject the null hypothesis when it is false is
referred to as a Type II error.
• The probability of making a type II error is denoted as β.
P( fail to reject H0 when H0 is false).
• This probability is a function of two things: first, how close the hypothesized value of the parameter is to the actual value of the parameter, and second, the sample size.
• The closer the two values of the parameter and the
smaller the sample size, the larger the value of β.
• To calculate β, a true population parameter value is
required.
Power of a test
• The power of a statistical test is the
probability of rejecting the null hypothesis
when it is false.
• This probability is 1-β.
• If the power is deemed too low, increase α or the sample size.
Hypothesis Testing
Statistical Hypotheses
For example, suppose that we are interested in the
burning rate of a solid propellant used to power aircrew
escape systems.
• Now burning rate is a random variable that can be
described by a probability distribution.
• Suppose that our interest focuses on the mean burning
rate (a parameter of this distribution).
• Specifically, we are interested in deciding whether or
not the mean burning rate is 50 centimeters per second.
Hypothesis Testing
4-3.1 Statistical Hypotheses
Two-sided Alternative Hypothesis
One-sided Alternative Hypotheses
Hypothesis Testing
4-3.1 Statistical Hypotheses
Test of a Hypothesis
• A procedure leading to a decision about a particular
hypothesis
• Hypothesis-testing procedures rely on using the information
in a random sample from the population of interest.
• If this information is consistent with the hypothesis, then we
will conclude that the hypothesis is true; if this information is
inconsistent with the hypothesis, we will conclude that the
hypothesis is false.
Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
Sometimes the type I error probability is called the significance level, or the α-error, or the size of the test.
Hypothesis Testing
4-3.2 Testing Statistical Hypotheses
• The power is computed as 1 − β, and power can be interpreted as the probability of correctly rejecting a false null hypothesis. We often compare statistical tests by comparing their power properties.
• For example, consider the propellant burning rate problem when we are testing H0: μ = 50 centimeters per second against H1: μ ≠ 50 centimeters per second. Suppose that the true value of the mean is μ = 52. When n = 10, we found that β = 0.2643, so the power of this test is 1 − β = 1 − 0.2643 = 0.7357 when μ = 52.
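The β computation in this example can be sketched in Python. Only n = 10 and μ = 52 appear above; the acceptance region 48.5 ≤ x̄ ≤ 51.5 and σ = 2.5 are assumed values from the underlying textbook example:

```python
from math import sqrt
from statistics import NormalDist

def beta_two_sided(mu_true, sigma, n, accept_lo, accept_hi):
    """beta = P(accept_lo <= X_bar <= accept_hi | mu = mu_true),
    where X_bar ~ N(mu_true, sigma^2 / n)."""
    z = NormalDist()
    se = sigma / sqrt(n)
    return z.cdf((accept_hi - mu_true) / se) - z.cdf((accept_lo - mu_true) / se)

# Propellant example: true mean 52, n = 10 (sigma and region assumed)
beta = beta_two_sided(52, 2.5, 10, 48.5, 51.5)   # about 0.264
power = 1 - beta                                  # about 0.736
```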
Hypothesis Testing
4-3.3 One-Sided and Two-Sided Hypotheses
Two-Sided Test:
One-Sided Tests:
Hypothesis Testing
4-3.4 General Procedure for Hypothesis Testing
4 and 5 determine 6
Case 1: Inference on the Mean of a
Population, Variance Known
Assumptions
4-4 Inference on the Mean of a Population,
Variance Known
4-4.1 Hypothesis Testing on the Mean
We wish to test:
The test statistic is:
This statistic follows the
standard normal distribution
when H0 is true
4-4 Inference on the Mean of a Population,
Variance Known
4-4.1 Hypothesis Testing on the Mean
Reject H0 if the observed value of the test statistic z0 is
either:
z0 > z_{α/2} or z0 < −z_{α/2}
Fail to reject H0 if
−z_{α/2} ≤ z0 ≤ z_{α/2}
The values ±z_{α/2} are the critical values that define the borders of the critical (rejection) region of the test statistic.
4-4 Inference on the Mean of a Population,
Variance Known
4-4.1 Hypothesis Testing on the Mean
Inference on the Mean of a Population,
Variance Known
4-4.2 P-Values in Hypothesis Testing
Alternatively, it is the probability that the test statistic will assume a value at least as extreme as the observed value z0 when the null hypothesis is true.
Inference on the Mean of a Population,
Variance Known
4-4.3 Type II Error and Choice of Sample Size
Finding the Probability of Type II Error β
Inference on the Mean of a Population,
Variance Known
4-4.3 Type II Error and Choice of Sample Size
Sample Size Formulas
4-4 Inference on the Mean of a Population,
Variance Known
4-4.3 Type II Error and Choice of Sample Size
Example 3: A. Let X1, X2, ..., Xn be a random sample from N(μ, σ²). Since X̄ ~ N(μ, σ²/n), the standard error of X̄ as an estimator of μ is σ/√n. In most cases σ will be unknown. The estimated standard error is

σ̂_X̄ = σ̂/√n = s/√n.
B. For the data of Example 1 (thickness of copper foil),
μ̂ = x̄ = 5.04,  σ̂ = s = 0.1657.
Thus,
σ̂_X̄ = s/√n = 0.1657/√15 ≈ 0.0428.
Recall that MINITAB displays this (σ̂_X̄ is the SE Mean entry):

Descriptive Statistics
Variable    N    Mean    Median  TrMean  StDev   SE Mean
THICKNESS   15   5.0400  5.0500  5.0385  0.1657  0.0428

Variable    Minimum  Maximum  Q1      Q3
THICKNESS   4.7200   5.3800   4.9400  5.1200
Example 4: The mean force exerted at a particular point
in a newly designed mechanical assembly should be less
than 4000 psi. Based on previous designs, the engineer
“knows” that the force is normally distributed with
standard deviation 65 psi. He will obtain data on 20
prototype assemblies to test
H 0 :   4000
H1:   4000
at the 0.01 level of significance.
A. He obtained x  3982.8.
Thus,
x  4000
z0 
 1.18
65 20
 z0.01  2.326
z0   z0.01
Therefore z 0 is not in the critical region.
 Do not reject H 0 .
The engineer cannot (as he “hoped” to) conclude that the
mean force is less than 4000 psi.
B. Do the problem on MINITAB, but note that you must
have raw, not summarized, data.
The engineer collected the data on the next slide.
Stat > Basic Statistics > 1-Sample Z
P-value = 0.12 > 0.01.
Do not reject H0.
MINITAB Output:

Test of mu = 4000.0 vs mu < 4000.0
The assumed sigma = 65.0

Variable  N   Mean    StDev  SE Mean  Z      P
FORCE     20  3982.8  60.5   14.5     -1.18  0.12
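The same z test can be computed directly. This is our own sketch (the function name is not from the slides):

```python
from math import sqrt
from statistics import NormalDist

def one_sided_z_test(x_bar, mu0, sigma, n):
    """Lower-tailed z test of H0: mu = mu0 vs H1: mu < mu0.
    Returns the test statistic z0 and its P-value P(Z <= z0)."""
    z0 = (x_bar - mu0) / (sigma / sqrt(n))
    return z0, NormalDist().cdf(z0)

# Example 4: x_bar = 3982.8, mu0 = 4000, sigma = 65, n = 20
z0, p = one_sided_z_test(3982.8, 4000, 65, 20)
# z0 is about -1.18 and p is about 0.12 > alpha = 0.01: do not reject H0
```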
4-4 Inference on the Mean of a Population,
Variance Known
4-4.4 Large-Sample Test
In general, if n ≥ 30, the sample variance s² will be close to σ² for most samples, and so s can be substituted for σ in the test procedures with little harmful effect. That is, for large sample sizes:

(X̄ − μ)/(S/√n)  approx. ~  N(0, 1)

The results we have developed for the case of known variance are used for large-sample tests with unknown variance, with a few important differences.
1. Wherever you see a σ, substitute the sample standard deviation s.
2. The confidence intervals and hypothesis tests become approximate rather than exact.
3. If the underlying distribution is not too non-normal, the approximation should be adequate if n ≥ 30.
We’ll do an example.
Example 5: A civil engineer working for NYSDOT tests
50 standard-sized specimens of concrete for moisture
content after 5 hours of drying time. She collects the
following data (appropriate units). She also constructs
the histogram shown below.
26.4 31.0 20.6 12.4 19.6 16.9  5.2 15.8 17.8 21.1
19.6 41.7 39.3 22.7 33.1 23.4 23.0 19.3 19.3 17.2
 5.1 59.4  5.7 12.1 27.7  6.5 16.9 21.9 32.0 20.9
18.6 10.4  5.7 14.2 11.7 15.3 15.2 19.5 11.4 13.6
11.7 15.3 29.1 13.1 39.1  9.4 14.9 11.2 11.2 25.0

x̄ = 19.38,  s = 10.51

[Histogram of MOISTURE: frequency vs. moisture content, bins from 0 to 60 in steps of 12.]
The histogram indicates that the distribution is quite skewed.
She also does a normality test: in MINITAB do
Stat > Basic Statistics > Normality Test
[Normal probability plot of MOISTURE]
Average: 19.384   StDev: 10.5096   N: 50
Anderson-Darling Normality Test: A-Squared: 1.264, P-Value: 0.002 (very low P-value)
She just wants a 95% confidence interval on the mean
moisture content. Since the sample size is fairly large, an
approximate 95% confidence interval is:
x  z0.025
s
10.51
 19.38  1.96
n
50
 19.38  2.91  16.47, 22.29
You can do this on MINITAB by “making believe” that
 is known and   s.
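The same approximate interval is a one-liner in Python; this sketch is our own (the helper name is not from the slides):

```python
from math import sqrt
from statistics import NormalDist

def large_sample_ci(x_bar, s, n, conf=0.95):
    """Approximate interval x_bar +/- z_{alpha/2} * s/sqrt(n),
    valid for large n even when sigma is unknown."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)   # z_{alpha/2}
    half = z * s / sqrt(n)
    return x_bar - half, x_bar + half

# Example 5: x_bar = 19.38, s = 10.51, n = 50
lo, hi = large_sample_ci(19.38, 10.51, 50)   # about (16.47, 22.29)
```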
4-4 Inference on the Mean of a Population,
Variance Known
4-4.5 Some Practical Comments on Hypothesis
Testing
Statistical versus Practical Significance
Interval Estimation
The standard error provides a measure of precision of an
estimate. In fact, very often in the engineering literature a
point on a graph will have so-called “error bars” around it. The
plotted point is actually the mean of a sample and the error bar
is x  ˆ X . This is actually a special case of a confidence
interval, formally defined on the next slide.
4-4 Inference on the Mean of a Population,
Variance Known
4-4.6 Confidence Interval on the Mean
Two-sided confidence interval:
One-sided confidence intervals:
Confidence coefficient:
4-4 Inference on the Mean of a Population,
Variance Known
4-4.6 Confidence Interval on the Mean
α       z_α
0.005   2.576
0.01    2.326
0.025   1.96
0.05    1.645
Example 6: A. Let X1, X2, ..., Xn be a random sample from N(μ, σ²) and assume that σ is known. Since (X̄ − μ)/(σ/√n) ~ Z,

P(−z_{α/2} ≤ (X̄ − μ)/(σ/√n) ≤ z_{α/2}) = 1 − α   [by (10)]

= P(−z_{α/2} σ/√n ≤ X̄ − μ ≤ z_{α/2} σ/√n) = 1 − α

= P(X̄ − z_{α/2} σ/√n ≤ μ ≤ X̄ + z_{α/2} σ/√n) = 1 − α.   (11)

(The left endpoint is L and the right endpoint is U.)
Comparison of (9) and (11) implies that a 100(1 − α)% confidence interval for μ is

(x̄ − z_{α/2} σ/√n, x̄ + z_{α/2} σ/√n) = x̄ ± z_{α/2} σ/√n.   (12)
B. The output temperature of a gas turbine is N(625°C, 22²). An engineer has "tweaked" the design in order to increase it by at least 5%. The standard deviation will not be affected. A test of 16 turbines gives a sample mean of 678°. By (12), a 95% confidence interval for the (population) mean is

678 ± 1.96(22/√16) = 678 ± 10.78 = (667.2, 688.8).
4-4 Inference on the Mean of a Population,
Variance Known
4-4.6 Confidence Interval on the Mean
Relationship between Tests of Hypotheses and
Confidence Intervals
If [l, u] is a 100(1 − α) percent confidence interval for the parameter, then the test of significance level α of the hypothesis will lead to rejection of H0 if and only if the hypothesized value is not in the 100(1 − α) percent confidence interval [l, u].
4-4 Inference on the Mean of a Population,
Variance Known
4-4.6 Confidence Interval on the Mean
Confidence Level and Precision of Estimation
The length of the two-sided 95% confidence interval is
whereas the length of the two-sided 99% confidence
interval is
4-4 Inference on the Mean of a Population,
Variance Known
4-4.6 Confidence Interval on the Mean
Choice of Sample Size
Example 7: The tensile strength of a fiber used in the
manufacture of insulation material is normally distributed
with a standard deviation of 0.04gm. Having made some
process adjustments, however, the mean has not been
established. In a test of 40 fibers, the data shown on the
next slide was collected. Find a 99% confidence interval
for the population mean.
Tensile Strength (gms)
 9.99 10.05 10.02  9.99 10.05  9.96 10.00  9.99
10.03 10.05 10.02  9.98  9.96 10.01 10.00  9.90
10.04  9.99  9.93  9.99  9.97  9.96 10.00 10.07
 9.95  9.96  9.97  9.97 10.01  9.97  9.99 10.01
10.01 10.01  9.95  9.99  9.97  9.95  9.96 10.01

x̄ = 9.9908
The 99% confidence interval is:
x  z0.005

0.04
 9.9908  (2.5758)
n
40
 9.9908  (2.5758)0.00632
 9.9908  0.0163  9.974,10.007 
Let’s see how to do it using MINITAB.
Do Basic Statistics > 1-Sample Z
The result of Example 7 is: “With 99% confidence, the
mean tensile strength is between 9.974 and 10.007
grams.” However, for a variable like tensile strength, an
engineer would perhaps like to say: “With 99%
confidence, the mean tensile strength is at least
(whatever).”
This brings us to the notion of one-sided confidence
intervals, defined next.
Definition: Suppose θ is an unknown parameter. (i) Suppose L is a statistic based on a sample X1, X2, ..., Xn. If

P(L ≤ θ) = 1 − α,   (23a)

then the observed value of L, say l, is called a 100(1 − α)% one-sided lower confidence limit or bound for θ.
(ii) Suppose U is a statistic based on a sample X1, X2, ..., Xn. If

P(θ ≤ U) = 1 − α,   (23b)

then the observed value of U, say u, is called a 100(1 − α)% one-sided upper confidence limit for θ.
We said we would define one-sided confidence intervals, but in the definition, we refer to one-sided limits or bounds. Well – we really have defined intervals. Corresponding to the limits l and u in (23a,b) are the intervals (−∞, u] and [l, ∞).
For the present case of sampling from N(μ, σ²) with σ known, one-sided lower and upper 100(1 − α)% confidence limits are respectively

x̄ − z_α σ/√n   and   x̄ + z_α σ/√n.   (24)
Example 8: In Example 7, it would probably have been more appropriate to report a one-sided confidence limit. Suppose there is a customer-mandated spec that requires the mean strength to be at least 10. The lower 99% one-sided confidence limit for the mean tensile strength of this fiber is

9.9908 − z0.01(0.00632) = 9.9908 − 2.3263(0.00632) ≈ 9.976.
Remark: MINITAB Student doesn’t do one-sided limits
directly. Can you figure out how to do it indirectly?
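The limit of Example 8 can also be checked directly in code. This is our own sketch of equation (24) (the function name is not from the slides):

```python
from math import sqrt
from statistics import NormalDist

def lower_confidence_limit(x_bar, sigma, n, conf=0.99):
    """One-sided lower limit x_bar - z_alpha * sigma/sqrt(n),
    as in equation (24)."""
    z = NormalDist().inv_cdf(conf)   # z_alpha
    return x_bar - z * sigma / sqrt(n)

# Example 8: x_bar = 9.9908, sigma = 0.04, n = 40
l = lower_confidence_limit(9.9908, 0.04, 40)   # about 9.976
```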
4-4 Inference on the Mean of a Population,
Variance Known
4-4.6 Confidence Interval on the Mean
One-Sided Confidence Bounds
Problem 4-27
•
In the production of airbag inflators for
automotive safety systems, a company is
interested in ensuring that the mean
distance of the foil to the edge of the
inflator is at least 2 cm. Measurements
on 20 inflators yielded an average value
of 2.02 cm. Assume a standard deviation
of .05 on the distance measurements
and a significance level of .01.
a) Test for conformance to the company’s
requirement.
Airbag Inflator and Assembly
Problem 4-27
b) What is the P-value of this test?
c) What is the β of this test if the true mean is
1.97 cm.?
d) What sample size would be necessary to
detect a true mean of 1.97 with probability of at
least .90?
e) Find a 99% lower confidence bound on the
true mean.
f) Use the CI found in part e to test the
hypothesis in part a.
g) What sample size is required to be 95% confident that the error in estimating the mean is less than .01?
4.4 Case II: Inference on the Mean of a Population, Variance Unknown
Let X1, X2, ..., Xn be a random sample from N(μ, σ²), with σ² unknown.
This case differs from Case I in that now we shall
consider the more realistic situation in which the
variance is unknown. The primary reason for
considering Case I at all is that it is a convenient way
of illustrating the basic concepts of confidence
interval estimation and hypothesis testing.
When sampling from the normal distribution the exact confidence interval for μ is based solely on the fact that

(X̄ − μ)/(σ/√n) ~ N(0, 1)   (27a)

and for the exact hypothesis test, under H0,

(X̄ − μ0)/(σ/√n) ~ N(0, 1).   (27b)
Although (27a,b) are true whether or not the variance is known, the problem is that if σ is not known this knowledge does us no good with respect to confidence intervals and hypothesis tests for the mean.
Intuitively, what you don't know, you estimate. In particular, suppose in (27a) we estimate the population standard deviation by the sample standard deviation. This leads to the question: what is the distribution of

T = (X̄ − μ)/(S/√n)?   (28)
The pdf of the random variable T defined in (28) has been derived [see eqn. (4-40), pg. 163, text]. The distribution is called (Student's) t distribution. The parameter of a t distribution is its degrees of freedom, say k.
In particular, the random variable defined by (28) has n − 1 degrees of freedom (write T ~ t_{n−1}).
Thus, for each sample size, T has a different pdf.
Intuitively, it would seem that the pdfs of the t distributions
should look like standard normal pdfs. They do, but there is
a difference. Some representative t pdfs are shown on the
next slide.
[Figure: pdfs of several t distributions, where k denotes the degrees of freedom, together with the standard normal pdf.]
They look much like standard normal pdfs, but the tails are heavier (more variance); note that the t pdf approaches that of the standard normal as k → ∞.
Percentage points of t distributions are defined similarly
to those of the standard normal as shown in the figure
below.
Percentage points of the t distribution, where k denotes the
degrees of freedom. To obtain percentage points, use
MINITAB or Table II in Appendix A. We’ll demonstrate.
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.1 Hypothesis Testing on the Mean
Example 4-7
• A sample of size 15 was taken on the
coefficient of restitution of golf club drivers.
The variance of the coefficient is unknown
• Does the mean coefficient of restitution exceed .82?
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.2 P-value for a t-Test
β = P(Type II Error)
We need to know: what is the distribution of the test statistic

T0 = (X̄ − μ0)/(S/√n)

under H1? The answer is: T0 has a so-called non-central t distribution.
The probability of making a type II error, and the corresponding power of the test, can be derived for hypothesis tests with unknown variance in the same fashion as for tests with known variance. However, this is numerically cumbersome, and we will find β values and the power of a test using MINITAB only.
β - P( Type II Error)
Example 4-8
If the mean differs from .82 by as much as .02, is the sample size of 15 adequate to ensure the null hypothesis will be rejected with probability of at least .80?
• Stat>Power and Sample>1 sample t
Calculate power for each sample size
Specify a sample size, difference, sigma, click on
options and insert the type of alternative
hypothesis and level of significance.
Sample size
• The sample size required to detect a difference
from the hypothesized mean when the variance
of the characteristic of interest is unknown, with
a specified power will be determined using
Minitab.
Example 4-8
• Stat>Power and Sample>1 sample t
Calculate sample size for each power value
Specify a power, difference, sigma, click on
options and insert the type of alternative
hypothesis and level of significance.
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.3 Type II Error and Choice of Sample Size
Fortunately, this unpleasant task has already been done, and the results are summarized in a series of graphs in Appendix A (Charts Va, Vb, Vc, and Vd) that plot β for the t-test against a parameter d for various sample sizes n.
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.3 Type II Error and Choice of Sample Size
These graphics are called operating characteristic (or OC) curves. Curves are provided for two-sided alternatives on Charts Va and Vb. The abscissa scale factor d on these charts is defined as d = |μ − μ0|/σ.
Confidence Intervals for the Mean
We can derive a confidence interval for the t-distributed test statistic in a manner similar to when the test statistic follows the standard normal distribution. We can write:

P(−t_{α/2,n−1} ≤ (X̄ − μ)/(S/√n) ≤ t_{α/2,n−1}) = 1 − α

= P(−t_{α/2,n−1} S/√n ≤ X̄ − μ ≤ t_{α/2,n−1} S/√n)

= P(X̄ − t_{α/2,n−1} S/√n ≤ μ ≤ X̄ + t_{α/2,n−1} S/√n) = 1 − α,   (29)

where T = (X̄ − μ)/(S/√n) ~ t_{n−1}.
By the definition of confidence intervals (see (9)), we see
from (29) that
A 100(1 – )% confidence interval for the mean of a
normal distribution, variance unknown, is
s
s 
s

 x  t 2,n1 n , x  t 2,n1 n   x  t 2,n1 n .


(30)
To obtain one-sided lower (upper) 100(1 − α)% limits, in (30), change α/2 to α and take the lower (upper) limit.
Example 9: Suppose in Example 7 neither the mean nor the
standard deviation is known.
A. Use the data given in Example 7 to find a 99%
confidence interval for the mean.
In Example 7, x̄ = 9.9908 and s = 0.03540. The 99% confidence interval when the variance is unknown is thus

x̄ ± t0.005,39 s/√n = 9.9908 ± (2.7079)(0.03540/√40)
= 9.9908 ± (2.7079)(0.0055972)
= 9.9908 ± 0.015156 = (9.976, 10.006)
Recall that when the variance was known in Example 7, the CI was 9.9908 ± 0.0163 = (9.974, 10.007).
B. Now use MINITAB.
Go to: Stat>Basic Statistics > 1-Sample t
The results appear on the next slide.
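Equation (30) applied to Example 9 can also be sketched in Python, assuming SciPy is available (the t critical value is not in the standard library; the function name is ours):

```python
from math import sqrt
from scipy import stats

def t_ci(x_bar, s, n, conf=0.99):
    """Equation (30): x_bar +/- t_{alpha/2, n-1} * s/sqrt(n)."""
    t_crit = stats.t.ppf(0.5 + conf / 2, df=n - 1)   # t_{alpha/2, n-1}
    half = t_crit * s / sqrt(n)
    return x_bar - half, x_bar + half

# Example 9: x_bar = 9.9908, s = 0.03540, n = 40
lo, hi = t_ci(9.9908, 0.03540, 40)   # about (9.976, 10.006)
```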
Sample Size Determination for Confidence
Intervals (Variance Unknown)
Recall that in the case where the variance is known, the sample size for a 100(1 − α)% confidence interval is given by (26), repeated below:

n = (z_{α/2} σ / E)²   (26)
In the present case with the variance unknown, the best that can be done is to use (26) with your best guess of σ. You can also take a pilot sample and estimate σ.
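Formula (26) is easy to wrap in code. For instance, with the fiber data's σ = 0.04, holding the estimation error to E = 0.01 with 95% confidence requires n = 62. This sketch is our own (the function name is not from the slides):

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_ci(sigma, E, conf=0.95):
    """Equation (26): n = (z_{alpha/2} * sigma / E)^2, rounded up,
    so that the CI half-width is at most E."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)   # z_{alpha/2}
    return ceil((z * sigma / E) ** 2)
```

The same function answers Problem 4-27 part g (σ = 0.05, E = 0.01, 95% confidence), giving n = 97.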
4-5 Inference on the Mean of a Population,
Variance Unknown
4-5.4 Confidence Interval on the Mean