Medical Statistics 5
Tao Yuchun
2014.3.13
http://cc.jlu.edu.cn/ms.html

Statistical inference
1. Estimation of population parameter

1.1 Sampling error and standard error of mean
• Sampling study
Sampling error:
Sample → sample mean (different from the population mean)
Different samples → different sample means (different from each other)

(1) Sampling error is related to the variation of the population
• No variation, no error, and so no sampling error! (No variation, no statistics, either!)
Example: the sample means of systolic blood pressure.
For an adult population (age 25~90) -- the sample means vary substantially
For a young population (age 18~25) -- the sample means do not vary much

(2) Sampling error is also related to sample size
If sample size = population size, there is no sampling error!
If sample size = 1, sampling error ≡ variation of the population!

• How does the sample mean change from sample to sample?
• See the simulation experiment below.
• Sampling from $N(4.6602, 0.5746^2)$ by computer (100 times for each sample size)

Frequency distribution of sample means
Value of mean:  0.75  1.25  1.75  2.25  2.75  3.25  3.75  4.25  4.75  5.25  5.75  6.25  6.75  7.25-7.75
Sample size=5:  1  1  4  2  12  15  12  10  17  8  6  7  4  1
Sample size=10: 1  2  5  8  16  26  16  15  8  3
Sample size=20: 2  9  24  31  22  10  2
Sample size=50: 1  5  22  45  24  3
For sample sizes 10, 20 and 50, the frequencies are listed in order of increasing value of mean, over the occupied intervals only. The larger the sample size, the fewer intervals the sample means occupy.

• Sampling from a skew distribution by computer
[Figure: frequency distributions in panels (a) to (e), horizontal axis 1 to 9, for sample sizes n=5, n=10, n=20 and n=30]

1.2 The distribution of sample mean
If the variable follows a normal distribution $N(\mu, \sigma^2)$, the sample means follow a normal distribution:
$\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$, where $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
If the distribution of the variable is skew:
For a small sample, the distribution of the sample mean is skew.
For a large sample, the distribution of the sample mean is close to a normal distribution, $\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$
-- From the Central Limit Theorem

1.3 Standard error
• Standard deviation of the population: $\sigma$
• Standard deviation of the sample mean, or standard error of the sample mean, or standard error: $\sigma_{\bar{X}}$
• In any case:
Standard error of sample mean = standard deviation of the population / $\sqrt{n}$
or $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
• For application:
$S_{\bar{X}} = \dfrac{S}{\sqrt{n}}$
$S$ is an estimate of $\sigma$, and $S_{\bar{X}}$ is an estimate of $\sigma_{\bar{X}}$.

1.4 Student's t distribution
• The t distribution was discovered by William S. Gosset (1876 - 1937) in 1908.
• "Student" is his pen name.
For a normal distribution $N(\mu, \sigma^2)$, $\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$.
If
$Z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
then $Z$ follows a standard normal distribution, $N(0, 1)$.
• When $\sigma$ is unknown,
$t = \dfrac{\bar{X} - \mu}{S_{\bar{X}}} = \dfrac{\bar{X} - \mu}{S / \sqrt{n}}$
and $t$ follows a t distribution.
[Figure: a t curve, centered at 0]

• The properties of the t distribution
I. Centrosymmetric
• The center is 0.
II. ν: shape parameter
• Also called the degrees of freedom, ν = n-1.
"In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary." -- From Wikipedia
• ν determines the shape of a t curve.
• Different ν, different t curve.
As ν increases, the t curve gets closer to the standard normal curve; as ν → ∞, the t curve becomes the standard normal curve.
See the animation.

• The different t curves
[Figure: f(t) plotted for ν = 1, ν = 4 and ν = ∞ (standard normal curve)]

III. The area under the t curve
• The table for the t distribution.
• A t value is denoted $t_{\alpha,\nu}$, where α is a probability and ν is the degrees of freedom, ν = n-1.
• The area under the t curve means:
One side: $P(t \le -t_{\alpha,\nu}) = \alpha$ or $P(t \ge t_{\alpha,\nu}) = \alpha$
Two sides: $P(t \le -t_{\alpha,\nu}) + P(t \ge t_{\alpha,\nu}) = \alpha$
• See next figure.

• The meaning of the area under the t curve for two sides
[Figure: t curve with degrees of freedom ν; each tail, below $-t_{\alpha,\nu}$ and above $t_{\alpha,\nu}$, has area α/2]

1.5 Confidence Interval of Population Mean
Statistical inference:
• Parameter estimation: point estimation and interval estimation
• Hypothesis testing
Point estimation of population mean -- sample mean
Interval estimation of population mean -- (1-α) confidence interval
Confidence level: 1-α, such as 95% or 99%.
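The interval estimate rests on the standard error formula from section 1.3, $\sigma_{\bar{X}} = \sigma / \sqrt{n}$. That formula can be checked with a small simulation. The sketch below is only an illustration, not the program used for the slides: it borrows the population $N(4.6602, 0.5746^2)$ from the simulation experiment, while the function name `simulate_standard_error`, the number of repeats and the seed are assumptions chosen for this example.

```python
import math
import random
import statistics

MU = 4.6602      # population mean, taken from the slide's simulation
SIGMA = 0.5746   # population standard deviation, taken from the slide's simulation


def simulate_standard_error(n, repeats=2000, seed=0):
    """Draw `repeats` samples of size n and return the SD of their means.

    `repeats` and `seed` are illustrative choices, not values from the slides.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
        means.append(statistics.fmean(sample))
    # The standard deviation of the sample means estimates the standard error.
    return statistics.stdev(means)


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        observed = simulate_standard_error(n)
        theoretical = SIGMA / math.sqrt(n)
        print(f"n={n:2d}  observed SE={observed:.4f}  sigma/sqrt(n)={theoretical:.4f}")
```

For each sample size the observed value should lie close to $\sigma / \sqrt{n}$, and both shrink as n grows.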
• From $P(t \le -t_{\alpha,\nu}) + P(t \ge t_{\alpha,\nu}) = \alpha$
• We can get
$P(-t_{\alpha,\nu} < t < t_{\alpha,\nu}) = 1 - \alpha$
$P\left(-t_{\alpha,\nu} < \dfrac{\bar{X} - \mu}{S_{\bar{X}}} < t_{\alpha,\nu}\right) = 1 - \alpha$
$P(\bar{X} - t_{\alpha,\nu} S_{\bar{X}} < \mu < \bar{X} + t_{\alpha,\nu} S_{\bar{X}}) = 1 - \alpha$
$\bar{X} \pm t_{\alpha,\nu} S_{\bar{X}}$
• This is the formula of the (1-α) confidence interval of the population mean for two sides.

• The (1-α) confidence interval of the population mean can be abbreviated to 95% CI or 99% CI.
Whenever we get a mean $\bar{X}$ and a standard deviation $S$ from a sample, put them into
$\bar{X} \pm t_{\alpha,\nu} S_{\bar{X}}$
then the interval is
$(\bar{X} - t_{\alpha,\nu} S_{\bar{X}},\ \bar{X} + t_{\alpha,\nu} S_{\bar{X}})$
or
$\bar{X} - t_{\alpha,\nu} S_{\bar{X}} \sim \bar{X} + t_{\alpha,\nu} S_{\bar{X}}$
• The two extreme values are called confidence limits.

Example
Systolic blood pressures of 20 healthy males were measured:
$\bar{X} = 118.4$ mmHg, $S = 10.8$ mmHg
What is the 95% confidence interval of the population mean?
$n = 20$, $S_{\bar{X}} = \dfrac{S}{\sqrt{n}} = \dfrac{10.8}{\sqrt{20}} = 2.415$
$\nu = n - 1 = 20 - 1 = 19$, $\alpha = 0.05$
$t_{0.05,19} = 2.093$ (from the table of the t distribution)
$\bar{X} - t_{0.05,19} S_{\bar{X}} = 118.4 - 2.093 \times 2.415 = 113.3$
$\bar{X} + t_{0.05,19} S_{\bar{X}} = 118.4 + 2.093 \times 2.415 = 123.5$
95% CI: (113.3, 123.5) mmHg

• What does "confidence interval" mean?
[Figure: many (1-α) CIs computed from repeated samples; some of them do not include μ]

You should know:
Once you have a 95% confidence interval of a certain population mean, the μ of this population may be in it or may not be in it, but intervals constructed in this way include μ with probability 95%!

In the figure (the animation of t curves), the red curve is the standard normal curve, the blue curve is the t curve, and df is ν (degrees of freedom).
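The worked example and the meaning of a 95% CI can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not part of the original slides: the critical value 2.093 is the table value quoted in the example, while the helper names `confidence_interval` and `coverage`, the number of repeats, the seed, and the choice of a normal population with μ = 118.4 and σ = 10.8 for the repeated sampling are all hypothetical.

```python
import math
import random
import statistics

T_CRIT = 2.093  # t_{0.05,19} for two sides, quoted from the table in the example


def confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) confidence limits: mean -/+ t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - t_crit * se, mean + t_crit * se


def coverage(mu, sigma, n, t_crit, repeats=5000, seed=1):
    """Fraction of simulated intervals that include the true mean mu.

    The normal population, `repeats` and `seed` are assumptions for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = confidence_interval(
            statistics.fmean(sample), statistics.stdev(sample), n, t_crit
        )
        if low < mu < high:
            hits += 1
    return hits / repeats


if __name__ == "__main__":
    low, high = confidence_interval(118.4, 10.8, 20, T_CRIT)
    print(f"95% CI: ({low:.1f}, {high:.1f}) mmHg")  # 95% CI: (113.3, 123.5) mmHg
    print(f"share of intervals including mu: {coverage(118.4, 10.8, 20, T_CRIT):.3f}")
```

The first line reproduces the confidence limits of the example. The second line should be close to 0.95: any single interval either includes μ or does not, and it is the procedure that includes μ in about 95% of repeated samples.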