STAT 270
Inference for a Single Sample
Richard Lockhart
Simon Fraser University
Spring 2015 — Surrey
Richard Lockhart (Simon Fraser University) STAT 270 Inference for a Single Sample
Spring 2015 — Surrey
1 / 51
Purposes of These Notes
Describe point estimation, interval estimation, and hypothesis testing.
Describe a random sample.
Define a confidence interval and its level.
Derive some confidence intervals in 1 sample problems: means and
proportions.
Discuss difference between Fisher and Neyman-Pearson.
Purposes Continued
Describe ingredients of Neyman-Pearson hypothesis testing.
Define null and alternative hypotheses.
Define a test statistic, rejection region, level.
Define Type I and Type II errors.
Differentiate between one-tailed and two-tailed problems.
Specific formulas for hypotheses about means and proportions.
Define a P-value.
Understand technical meaning of statistically significant.
List of Statistical Problems
Name most likely value of parameters: point estimation.
Name range of likely values: confidence interval.
Assess evidence against hypothesis about parameters: hypothesis
testing.
Make forecasts, do interpolation.
And more.
Point Estimation
Estimate: number which is our best guess for parameter value.
Estimator: rule for computing estimate from data.
An estimator is a random variable which is a function of the data.
Example. Newcomb & Michelson measured the speed of light in the 1880s.
Made 66 measurements of the time taken by light to travel 7.44373 km.
Measured values are X1, X2, . . . , Xn with n = 66.
Use lower case letters for observed values.
First measurement was 24.828 millionths of a second.
Convert each measurement to a speed of light:
x1 = 10⁹ · 7.44373/24.828 = 2.998119 × 10⁸ m/s.
x2 = 2.998361 × 10⁸ m/s.
Point estimate of speed of light is 2.998336 × 10⁸ m/s.
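As a sketch of this estimation rule in code (the two timing values below are hypothetical stand-ins, since the full list of 66 measurements is not reproduced in these notes):

```python
# Convert timing measurements (microseconds over 7.44373 km) to speeds,
# then average.  Only two hypothetical times are shown here; Newcomb's
# full 66 values are not listed in these notes.
distance_m = 7443.73                # 7.44373 km in metres
times_us = [24.828, 24.826]         # times in microseconds (hypothetical)

speeds = [distance_m / (t * 1e-6) for t in times_us]  # metres per second
point_estimate = sum(speeds) / len(speeds)            # the sample mean
```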
Estimators
We were using the rule: average the data.
So our estimator was
X̄ = (X1 + · · · + Xn)/n.
Model for measurement error.
Several parts: X1 , . . . , Xn independent and identically distributed.
Let µ = E(Xi ) be the population mean.
Long run average measurement.
Population SD is σ.
Speed of light is c — standard notation.
Relate µ to c:
µ = c + bias
Often assume bias is 0.
Newcomb data
[Figure: plot of Newcomb's 66 measurements, not reproduced here.]
Point Estimation
Have data and model for population.
Model describes population in terms of some parameters.
Binomial(n, α) model: α is a parameter.
Sample from a N(µ, σ²) model. Parameters are µ and σ.
Sample from the Gamma density
f(x; α, β) = (1/(βΓ(α))) (x/β)^(α−1) exp(−x/β),  x > 0.
Parameters are α and β.
Generic notation: θ.
Standard Errors
Estimates should always be accompanied by some assessment of their
likely accuracy.
For unbiased estimators with approximately normal sampling
distributions we use the Standard Error.
The SE of an estimator θ̂ of θ is
SE = √Var(θ̂).
That is: Standard Error of an estimator is another name for its SD.
The standard error of α̂ in the Binomial(n, α) problem is
√(α(1 − α)/n).
The SE of X̄ is
σ/√n.
Estimated Standard Errors
What accompanies our point estimate is a number, not a formula.
The SE is usually a formula with unknown parameters in it.
We estimate the SE by plugging in estimates of the parameters.
The SE for α̂ is √(α(1 − α)/n), so the Estimated SE is
√(α̂(1 − α̂)/n).
And you plug in data to get a number to put in your report.
We use Standard Errors in Confidence Intervals.
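A minimal sketch of that plug-in step for the Binomial proportion, with made-up data:

```python
import math

# Plug-in (estimated) standard error for a sample proportion.
x, n = 40, 100                    # hypothetical: 40 successes in 100 trials
alpha_hat = x / n                 # point estimate of the proportion
est_se = math.sqrt(alpha_hat * (1 - alpha_hat) / n)  # number for the report
```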
Confidence Interval Definition
A level β confidence interval for a parameter θ is the interval [L, U]
between two statistics L and U such that
P(L ≤ θ ≤ U) ≥ β
for all possible parameter values.
We prefer to replace ≥ by = or ≈.
We use CIs by:
◮ Deciding how to do data analysis before gathering data (decide on formulas for L and U before getting data).
◮ Getting the data; computing the observed values of L and U, say l and u.
◮ Saying 'I am 100β% confident that θ is in the interval [l, u]'.
1 − β is the error rate or non-coverage rate.
Populations and Samples
Meaning of a sample from a population.
Population is group we want to find out about.
Can be real: all Canadian adults of working age.
Can be ‘conceptual’: all possible outcomes of some experiment.
Populations often thought of as populations of numbers.
Conceptual populations often described by probability density or pmf.
Examples: heights of adults. Think of population as being normally
distributed with mean µ and sd σ.
Example: repeatedly measure speed of light in a vacuum. Each
measurement is 'truth' plus 'measurement error'. Population of errors
described by a density: N(0, σ²) perhaps.
Populations and Samples
Sample is part of the group for which data is obtained.
Use n for number of items sampled.
Call it a “single sample” problem if we measure 1 number for each
item sampled.
Call measurements X1 , . . . , Xn .
Random sampling: fixed number of members of group selected by
random mechanism playing no favourites.
With replacement: pick one at a time. On the i-th selection each member
of the population has the same chance of being drawn, even if that member
has been picked before.
Usual model for conceptual populations.
Without replacement: pick one at a time. On the i-th selection each
member of the population who has not been drawn yet has the same chance
of being drawn.
Common model (sampling method) for real populations.
Neither is usual selection method in real surveys.
Simplest Derivation of Confidence Interval
Mathematical model for a single sample: X1 , . . . , Xn are independent
and identically distributed. Write ‘iid’.
Simplest populations to describe – approximately normal, like heights.
Suppose X1 , . . . , Xn are independent N(µ, σ 2 ).
Suppose (quite unrealistically) that σ is known.
I now show you a 95% confidence interval for µ, based on the data.
Consider the random variable
Z = (X̄ − µ)/(σ/√n).
Then, regardless of what µ is, Z has a standard normal distribution.
So
P(a ≤ Z ≤ b)
does not depend on µ.
No matter what µ is,
P(−1.96 ≤ Z ≤ 1.96) = ∫_{−1.96}^{1.96} φ(z) dz = 0.95.
The Confidence Interval
The event −1.96 ≤ Z ≤ 1.96 can be rewritten in a number of ways.
It is the event
−1.96 ≤ (X̄ − µ)/(σ/√n) ≤ 1.96.
Multiply by σ/√n (which is positive):
−1.96 σ/√n ≤ X̄ − µ ≤ 1.96 σ/√n.
Notice this is still the same event.
Rearrange the second inequality:
L ≡ X̄ − 1.96 σ/√n ≤ µ.
Rearrange the first inequality:
µ ≤ X̄ + 1.96 σ/√n ≡ U.
So no matter what µ is:
P(L ≤ µ ≤ U) = 0.95.
An example with data
Simon Newcomb made 66 measurements of time taken by light to
travel 7.44373 km.
I round off a bit from real data.
Convert to list of 66 speeds.
Sample mean is 299,833,553 m/s.
Temporarily assume σ = 130, 000 m/s is known.
95% confidence interval is
299,833,553 − 1.96 × 130,000/√66 to 299,833,553 + 1.96 × 130,000/√66 m/s.
We say we are 95% confident that the speed of light is between
299,802,189 and 299,864,917 m/s.
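The interval arithmetic above can be sketched as:

```python
import math

# 95% CI for the mean with sigma temporarily assumed known.
xbar = 299_833_553       # sample mean speed, m/s
sigma = 130_000          # assumed-known population SD, m/s
n = 66

half_width = 1.96 * sigma / math.sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
```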
Caveats and improvements
More digits than is wise, but the 6 leading digits are worth reporting.
The quantity
130,000/√66 m/s
is called the standard error of the sample mean.
Pretty well everything is an approximation so many data analysts
round 1.96 to 2.
We are only pretending we know σ.
Usually we have to use the data to tell us about σ as well as about µ.
Notation: define upper α critical point of normal by:
P(N(0, 1) > zα ) = α.
So z0.025 = 1.96.
The role of normality
We assumed initially that the population we are sampling is itself
normally distributed.
But our basic probability was:
P(−1.96 ≤ (X̄ − µ)/(σ/√n) ≤ 1.96) = ∫_{−1.96}^{1.96} φ(z) dz = 0.95.
Accuracy depends on sampling distribution of X̄ .
Central limit theorem says: if n large enough this is normal for
(nearly) any population distribution.
More skewness means larger n needed.
Heavy tails mean larger n needed.
We often use rule of thumb: n ≥ 30.
Message: use the same formula if n is large:
X̄ ± zα/2 σ/√n.
Unknown SD, lots of data
Actually Newcomb did not know σ at all.
He measured s, the SD of his 66 measurements.
In fact s = 130, 026 m/s.
When n is large s will be close to σ so
(X̄ − µ)/(σ/√n) ≈ (X̄ − µ)/(s/√n).
So just replace σ by s in confidence interval.
We are 90% confident that the speed of light is in the range
299,833,553 − 1.645 × 130,026/√66 to 299,833,553 + 1.645 × 130,026/√66.
The Estimated Standard Error is
130,026/√66.
It estimates the Standard Deviation of X̄.
Notice use of z0.05 = 1.645 not z0.025 = 1.96.
Small samples – Student’s t distribution
How good is the approximation?
We estimated the SD of X̄ using the same data from which we computed the
mean.
So we should use something a bit bigger than the normal critical point.
For 66 observations and a 95% interval, that "bit bigger" than 1.96 is 1.997.
Correct critical point comes from Student’s t distribution.
More probability – small samples
When sampling from a normally distributed population we have:
P((X̄ − µ)/(S/√n) ≤ x) = ∫_{−∞}^{x} fT,n−1(u) du
where fT,n−1 is Student's t-density "with n − 1 degrees of freedom".
To be precise (but this density is not part of this course):
fT,ν(u) = Γ((ν + 1)/2)/(√(πν) Γ(ν/2)) · (1 + u²/ν)^{−(ν+1)/2}.
As ν → ∞ this converges to the standard normal density.
The curve looks a lot like the normal but with heavier tails.
Specific scientific settings
Specific settings have specific formulas for Estimated SE.
Scenario 1: sample from normal population, σ (population SD)
known, CI for population mean, µ.
Interval (already done) is
X̄ ± zα/2 σ/√n.
Scenario 2: sample from general population, σ (population SD)
unknown, sample size n large, CI for population mean, µ.
Interval (already done) is
X̄ ± zα/2 s/√n.
Scenario 3: sample from normal population, σ (population SD)
unknown, sample size n anything, CI for population mean, µ.
Interval is
X̄ ± tα/2,n−1 s/√n.
Multipliers tα/2,n−1 from other table at back of text.
Statistical packages always do Scenario 3 arithmetic.
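A sketch of the Scenario 3 computation. Python's standard library has no Student-t quantile function, so the multiplier tα/2,n−1 is passed in by hand (here t0.025,4 = 2.776 from a t table); the sample is hypothetical.

```python
import math
import statistics

# Scenario 3 interval: X̄ ± t_{α/2, n−1} s/√n.  The t multiplier comes
# from a table since the standard library cannot compute t quantiles.
def t_interval(data, t_crit):
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)          # sample SD (divisor n − 1)
    half = t_crit * s / math.sqrt(n)
    return xbar - half, xbar + half

# hypothetical sample of n = 5; t_{0.025,4} = 2.776
lo, hi = t_interval([9.8, 10.1, 10.0, 9.9, 10.2], t_crit=2.776)
```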
Confidence intervals for proportions
Common scientific framework
Sequence of Bernoulli trials.
Number n fixed, p is “Success Probability” on each trial.
X is the number of successes.
Goal is a confidence interval for proportions.
Based on Central Limit Theorem.
Using the CLT
Recall p̂ = X /n and X = X1 + · · · + Xn ; each Xi is Bernoulli.
So p̂ is a sample mean of the Xi .
Population mean is µ = E(Xi ) = p.
Population variance is σ² = Var(Xi) = p(1 − p).
So the SE of p̂ is σ/√n = √(p(1 − p)/n).
Estimated SE is usually taken to be √(p̂(1 − p̂)/n).
CLT says
(p̂ − p)/√(p(1 − p)/n) ⇒ N(0, 1).
Using the CLT 2
Law of large numbers says:
lim_{n→∞} p̂ = p.
So it is also true that
(p̂ − p)/√(p̂(1 − p̂)/n) ⇒ N(0, 1).
Result is:
lim_{n→∞} P(−zα/2 ≤ (p̂ − p)/√(p̂(1 − p̂)/n) ≤ zα/2) = ∫_{−zα/2}^{zα/2} φ(z) dz = 1 − α.
Leads to approximate level 1 − α confidence interval.
Solving inequalities to get limits
Temporary notation: A is the event
−zα/2 ≤ (p̂ − p)/√(p̂(1 − p̂)/n) ≤ zα/2.
Solve the inequalities in A to isolate p: multiply through by the SE:
A = {−zα/2 √(p̂(1 − p̂)/n) ≤ p̂ − p ≤ zα/2 √(p̂(1 − p̂)/n)}.
Rearrange each individual inequality: the right hand one gives
p̂ − zα/2 √(p̂(1 − p̂)/n) ≤ p.
Similarly the left inequality gives
p ≤ p̂ + zα/2 √(p̂(1 − p̂)/n).
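Putting the two limits together, a sketch of the resulting approximate interval (the data here are hypothetical):

```python
import math
from statistics import NormalDist

# Approximate level 1 − α CI for p from the rearranged inequalities.
def proportion_ci(x, n, level=0.95):
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)      # z_{α/2}
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

lo, hi = proportion_ci(x=55, n=100)    # hypothetical: 55 successes in 100
```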
General points
Most essential: the meaning of confidence:
If we analyze 100 data sets and compute 100 (exact) confidence
intervals at the 95% level we expect that some of the 100 intervals
will contain the truth and some won’t.
The expected number which contain the truth is 95.
The number which contain the truth is random.
Rule of thumb: if np > 10 and n(1 − p) > 10 then normal approx is
fine.
You don’t know p but you use p̂ in the rule of thumb.
Text uses 5 instead of 10. That is ok, too.
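The meaning of confidence can be checked by simulation: repeatedly draw samples from a known normal population, form the known-σ interval each time, and count coverage. A sketch (all numbers hypothetical):

```python
import math
import random

# Simulate coverage of the 95% known-sigma interval.  The true mean is
# known here, so we can count how many intervals contain it.
random.seed(2015)
mu, sigma, n, reps = 10.0, 2.0, 30, 2000
covered = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    half = 1.96 * sigma / math.sqrt(n)
    if xbar - half <= mu <= xbar + half:
        covered += 1
coverage = covered / reps   # close to 0.95, but itself random
```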
A catalogue of confidence intervals
Intervals for population proportions; done earlier.
Intervals for population means.
◮ Samples from Normal populations with known σ.
◮ Samples from Normal populations with unknown σ.
◮ Large samples from more general populations.
Confidence statements, normal populations
Normal sample, σ known:
P(−z ≤ (X̄ − µ)/(σ/√n) ≤ z) = Φ(z) − Φ(−z)
so if we find z so that Φ(z) − Φ(−z) = 1 − α then
X̄ − zσ/√n to X̄ + zσ/√n
is an exact level 1 − α confidence interval for µ.
Value of z is denoted zα/2 because
P(N(0, 1) > z) = α/2 = P(N(0, 1) < −z)
in this case. We call zγ the upper tail γ critical point.
Confidence statements, normal populations
Normal sample, σ unknown:
P(−t ≤ (X̄ − µ)/(S/√n) ≤ t) = ∫_{−t}^{t} fT,n−1(u) du
so if we find t so that
∫_{−t}^{t} fT,n−1(u) du = 1 − α then
X̄ − tS/√n to X̄ + tS/√n
is an exact level 1 − α confidence interval for µ.
Value of t is denoted tα/2,n−1 because
P(T > t) = α/2 = P(T < −t)
Again tγ,ν is notation for the upper γ critical point of a Student's
t-distribution on ν degrees of freedom.
Confidence statements, large samples, general populations
Sample from a population with mean µ and unknown SD σ:
P(−t ≤ (X̄ − µ)/(S/√n) ≤ t) ≈ ∫_{−t}^{t} fT,n−1(u) du ≈ Φ(t) − Φ(−t)
so
X̄ − tα/2,n−1 S/√n to X̄ + tα/2,n−1 S/√n
and
X̄ − zα/2 S/√n to X̄ + zα/2 S/√n
are both approximate large sample level 1 − α confidence intervals for µ.
Very rarely: σ is known so replace S by σ and use zα/2 .
Confidence statements, large samples, general populations
Books traditionally recommend z for n ≥ 30 or n ≥ 40 or some such
rule of thumb.
BUT I say just use t; software always does and the t approximation is
generally better.
Rule of thumb comes from DARK AGES before computers when
people used the tables in the book.
Those are for statistics exams, nothing else.
Typical hypothesis testing science questions
New drug for blood pressure. Get 200 patients. Pick 100 at random
to get new drug; others get old.
Choose between two possibilities: drug reduces BP or doesn’t.
Speed of light in vacuum is known. Measure speed of neutrinos. Is
speed equal to speed of light or not?
Are far away galaxies moving away from earth faster than nearby ones
or not?
Is speed of light the same in north-south and east-west directions?
Does some intervention program in prison reduce recidivism or not?
Common feature: choose between two scientific alternatives.
Methodology
Conduct experiment in which response (BP, speed of neutrinos, two
light speeds, recidivism) is measured.
Formulate statistical models: data are like a sample from a normal
population; number of patients surviving has binomial distribution;
north south speeds and east west speeds like samples from 2
populations.
Phrase the scientific alternatives as alternatives about the parameter
values in the model: mean north south speed equals mean east west
speed OR not; probability of re-offense in treatment group equals
probability of re-offense in control group OR not . . .
Develop a rule to make a choice between two alternatives.
Understand error rates.
Apply rule to data.
Details follow.
Example 1: Measurement bias
Newcomb makes n = 66 measurements of time for light to travel
7.44373 km.
Modern value for that time is 24.82961 microseconds.
Is Newcomb biased?
Model: each measurement is like draw from a population of possible
measurements. Data is X1 , . . . , Xn sample from population with mean
µ and SD σ.
No bias translates to µ = 24.82961 microseconds.
We say our null hypothesis, H0 , is µ = 24.82961.
Our alternative hypothesis, Ha, becomes µ ≠ 24.82961.
H0 is pronounced “H nought” (“H not”).
The test statistic
To make the decision we find a test statistic, T , which is function of
data.
It will depend on the number 24.82961 as well.
It should tend to be big if the alternative hypothesis is right.
It should NOT tend to be big if the null hypothesis is right.
We will calculate T and choose alternative if it is “too big”.
First obvious suggestion: T = |X̄ − 24.82961|.
How big is too big? Compare T to the variability of X̄ − 24.82961.
Estimate that variability using the Estimated Standard Error s/√n of X̄.
So change to
T = |X̄ − µ0|/(s/√n).
How big is too big?
Two big approaches – assess evidence versus make firm decision.
Fisher: summarize size of T by a P-value and interpret this P value
as strength of evidence against null hypothesis.
Formal decision making: select rejection region. If T lands in
rejection region we reject the null hypothesis and behave as if
alternative hypothesis is true.
Two approaches very closely connected.
Neyman-Pearson approach first — formal decision making.
Recognize two kinds of errors.
Type I error: Newcomb has no bias but we say he did. Null
hypothesis is true but we say it is false.
Type II error: Newcomb was biased but we miss that fact. Null
hypothesis is false but we decide it is true.
Language used in book: reject null hypothesis or fail to reject null
hypothesis.
Other places: “fail to reject” null hypothesis is called “accept null
hypothesis”. You behave as if null is true.
Making a decision
For Newcomb our rejection region is
T = |X̄ − µ0|/(s/√n) > c.
c is the critical point.
How do we select c?
Neyman-Pearson method:
Choose c to control the Type I error rate.
Select a pre-specified tolerable error rate, usually 5%. Call this rate α.
Find c so that
PH0(T > c) = α.
PH0 is notation to show that we compute this chance assuming that
the null hypothesis is true.
Specific scientific settings
Scenario 1: sample from normal population, σ (population SD)
known, hypothesis tests for population mean, µ.
Two sided alternative: H0: µ = µ0, Ha: µ ≠ µ0.
T = |X̄ − µ0|/(σ/√n) and c = zα/2.
One sided alternative: H0: µ = µ0, Ha: µ > µ0, or H0: µ ≤ µ0,
Ha: µ > µ0.
T = (X̄ − µ0)/(σ/√n) and c = zα.
I expect you to know what to do if inequalities reversed.
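A sketch of the Scenario 1 decision rule (numbers hypothetical):

```python
import math
from statistics import NormalDist

# One-sample z test with σ known: compute T and compare to the
# critical point z_{α/2} (two-sided) or z_α (one-sided).
def z_test(xbar, mu0, sigma, n, alpha=0.05, two_sided=True):
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    t_stat = abs(z) if two_sided else z
    tail = alpha / 2 if two_sided else alpha
    c = NormalDist().inv_cdf(1 - tail)
    return t_stat, c, t_stat > c        # statistic, critical point, reject?

t_stat, c, reject = z_test(xbar=10.5, mu0=10.0, sigma=2.0, n=64)
```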
Scenario 2, σ unknown
Scenario 2: sample from general population, σ (population SD)
unknown, sample size n large, hypothesis tests for population mean,
µ.
Two sided alternative: H0: µ = µ0, Ha: µ ≠ µ0.
T = |X̄ − µ0|/(s/√n) and c = tα/2,n−1.
One sided alternative: H0: µ = µ0, Ha: µ > µ0, or H0: µ ≤ µ0,
Ha: µ > µ0.
T = (X̄ − µ0)/(s/√n) and c = tα,n−1.
Small samples
Scenario 3: sample from normal population, σ (population SD)
unknown, sample size n anything, CI for population mean, µ.
Use same method as Scenario 2.
But now the method is exact.
Without the normal population assumption we are relying on the CLT
and LLN and Slutsky’s theorem.
Hypothesis tests for proportions
Common scientific framework
Sequence of Bernoulli trials.
Number n fixed, p is “Success Probability” on each trial.
X is the number of successes.
Goal is a hypothesis test for proportions.
Method based on application of the Central Limit Theorem.
Same list of null / alternative choices: H0 :p = p0 or H0 :p ≤ p0
H0 :p = p0 allows either 1 or 2 sided alternatives.
Using the CLT (repeat from CI notes!)
Recall p̂ = X /n and X = X1 + · · · + Xn ; each Xi is Bernoulli.
So p̂ is a sample mean of the Xi .
Population mean is µ = E(Xi ) = p.
Population variance is σ² = Var(Xi) = p(1 − p).
So the SE of p̂ is σ/√n = √(p(1 − p)/n).
CLT says: if p = p0 then
(p̂ − p0)/√(p0(1 − p0)/n) ⇒ N(0, 1).
Using the CLT 2
Our test statistic is either
T = (p̂ − p0)/√(p0(1 − p0)/n)
for Ha: p > p0, or
T = |p̂ − p0|/√(p0(1 − p0)/n)
for Ha: p ≠ p0.
Critical value c is zα/2 for the two-sided alternative or zα for the
one-sided alternative.
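A sketch of the proportion test statistic, with hypothetical data; note the SE uses the null value p0, not p̂:

```python
import math

# Test statistic for H0: p = p0, standardized with the null SE.
def prop_test_stat(x, n, p0, two_sided=True):
    p_hat = x / n
    se0 = math.sqrt(p0 * (1 - p0) / n)   # SE computed under the null
    z = (p_hat - p0) / se0
    return abs(z) if two_sided else z

T = prop_test_stat(x=60, n=100, p0=0.5)  # compare to z_{0.025} = 1.96
```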
Some scientific examples
Cadmium in a lake example.
n = 17 measurements of cadmium concentration. x̄ = 211, s = 15,
units are parts per million or some such. (Important, but these
numbers are made up.)
Scientific question: decide between two possibilities – concentration
below 200 vs above 200.
Typical one-sided situation.
Need to connect data to scientific question of interest.
Introduce notation: X1 , . . . , Xn are the 17 measurements.
Must assume that they are gathered and measured in such a way that
they are a sample of size 17 from a population whose mean µ is
“concentration of cadmium in the lake”
Defining that last phrase is a scientific problem.
Issues to consider: is the whole lake sampled? are the measurements
biased? are the measurement errors independent?
Assume issues dealt with.
Cadmium
For first pass I consider BOTH possible H0 s.
For H0: µ ≤ 200 use
T = (X̄ − 200)/(s/√n)
and reject if T > t0.05,n−1 = 1.75. (Notice the rejection region.)
Notice use of the borderline value, 200, in T.
Plug in values and find
T = (211 − 200)/(15/√17) = 3.02.
Since 3.02 > 1.75 we reject the hypothesis that µ ≤ 200.
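The arithmetic of the cadmium test, as a sketch:

```python
import math

# Cadmium test statistic: T = (x̄ − 200)/(s/√n) with the slide's numbers.
xbar, mu0, s, n = 211, 200, 15, 17
T = (xbar - mu0) / (s / math.sqrt(n))
reject = T > 1.75          # t_{0.05,16} from a t table
```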
P-values
BUT: in fact we can say a bit more. This number 3.02 is quite a bit
bigger than 1.75.
If we had used α = 0.01 instead of 0.05 our rejection region would be
T > t0.01,16 = 2.58
and we would still have rejected.
In fact we would reject for any α for which
tα,16 < 3.02.
The smallest possible α is when
tα,16 = 3.02.
Or
P(T16 ≤ 3.02) = 1 − α = 1 − P(T16 ≥ 3.02).
This α is Fisher's P-value.
Compute P by finding the area to the right of the observed statistic
under the null density of the statistic.
P-values
Reject H0 at level α if P < α.
If H0 is right then P has a Uniform[0,1] distribution.
Interpret P as measure of evidence strength – smaller P, stronger
evidence against H0 .
Call evidence statistically significant if P < 0.05.
Highly statistically significant and very highly statistically significant
are often used for smaller thresholds like 0.01 or 0.001.
Some statistics packages label P-values with 1 star for P < 0.05, 2
stars for P < 0.01 and 3 stars for P < 0.001.
These are all simply conventions.
For two tailed problems: P is twice the area in the small tail.
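A sketch of the P-value computation. The slides use the Student-t null distribution; the standard library has only the normal CDF, so this uses the large-sample normal approximation instead:

```python
from statistics import NormalDist

# P-value from an observed test statistic, normal approximation to the
# null distribution (the t-based value would be slightly larger).
def p_value(t_obs, two_sided=True):
    upper_tail = 1 - NormalDist().cdf(abs(t_obs))
    return 2 * upper_tail if two_sided else upper_tail

p = p_value(3.02, two_sided=False)   # one-tailed, normal approximation
```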
Example from Devore, Page 342 Q 65
Sample of n = 50 lens thicknesses. Given x̄ = 3.05 and s = 0.34 (all
in mm).
Desired mean thickness 3.20 mm.
Do “the data strongly suggest that the true average thickness of such
lenses is something other than what is desired”?
Clear two sided alternative. Null must be H0 :µ = 3.20.
Test statistic is
T = |3.05 − 3.2|/(0.34/√50) = 3.12.
P-value? Twice the area to the right of 3.12 under t on 49 df.
P = 0.003 which is very significant. (Table A.8 gives P in the range
0.002 to 0.004.)
So we see very strong evidence against the assertion that the true
average thickness is 3.2mm.
We would reject null at α = 0.05 or even α = 0.01.
Error rates and sample size calculations
Type I error: incorrectly reject H0 .
Type II error: incorrectly fail to reject H0 .
Type I error rate is α; determined in advance.
Type II error rate is β – depends on what true parameter value is.
Can sometimes compute β = P(don’t reject) as a function.
Answer will depend on n.
Can then sometimes choose n to give a suitable Type II error rate.
But often n depends on unknown parameters like σ.
So we design for some hoped for value of σ.
Sample size, Z test, 1 sided
Imagine testing µ ≤ µ0 against µ > µ0 .
Assume that σ is known.
Fix some α like 0.05.
So reject if
Z = (X̄ − µ0)/(σ/√n) > zα.
Compute β:
β = P((X̄ − µ0)/(σ/√n) < zα).
For µ > µ0 we make a Type II error if Z < zα.
Centre on the correct µ:
β = P((X̄ − µ)/(σ/√n) + (µ − µ0)/(σ/√n) < zα).
This is the area to the left of zα − (µ − µ0)/(σ/√n):
β = Φ(zα − (µ − µ0)/(σ/√n)).
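The final formula can be sketched directly (values hypothetical):

```python
import math
from statistics import NormalDist

# β(µ) = Φ(z_α − (µ − µ0)/(σ/√n)) for the one-sided z test.
def type_two_rate(mu, mu0, sigma, n, alpha=0.05):
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    shift = (mu - mu0) / (sigma / math.sqrt(n))
    return NormalDist().cdf(z_alpha - shift)

beta = type_two_rate(mu=10.5, mu0=10.0, sigma=2.0, n=64)  # hypothetical
```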