```http://cc.jlu.edu.cn/ms.html
Medical Statistics
5
Tao Yuchun
1
2014.3.13
Statistical inference
1. Estimation of population parameter
1.1 Sampling error and standard error of mean
• Sampling study
Sampling error:
Sample → sample mean (different from the population mean)
Different samples → different sample means (different from each other)
(1) Sampling error is related to the variation of the population
• No variation, no error, and no sampling error either! (No variation, no statistics, too!)
Example: the sample means of systolic blood pressure.
For a population with large variation -- the sample means vary substantially
For a young population (age 18~25) -- the sample means do not vary too much
(2) Sampling error is also related to sample size
If sample size = population size, there is no sampling error!
If sample size = 1, sampling error ≡ variation of the population!
• How does the sample mean change from sample to sample?
• See the simulation experiment below
• Sampling from $N(4.6602, 0.5746^2)$ by computer (100 times)

Frequency distribution of sample means

Value of mean    Frequency (sample size = 5)
0.75             1
1.25             1
1.75             4
2.25             2
2.75             12
3.25             15
3.75             12
4.25             10
4.75             17
5.25             8
5.75             6
6.25             7
6.75             4
7.25-7.75        1

Frequencies for the larger sample sizes, listed in ascending order of the value of the mean (empty classes omitted):
Sample size = 10: 1, 2, 5, 8, 16, 26, 16, 15, 8, 3
Sample size = 20: 2, 9, 24, 31, 22, 10, 2
Sample size = 50: 1, 5, 22, 45, 24, 3
• Sampling from a skew distribution by computer
[Figure: panels (a) to (e); distributions of sample means for n = 5, 10, 20 and 30, horizontal axis from 1 to 9]
1.2 The distribution of sample mean
If the variable ~ a normal distribution $N(\mu, \sigma^2)$,
sample means ~ a normal distribution $\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$, where $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
If the distribution of the variable is skewed:
For a small sample, the distribution of the sample mean is also skewed
For a large sample, the sample mean is close to a normal distribution
$\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$
-- from the Central Limit Theorem
1.3 Standard error
• Standard deviation of the population: $\sigma$
• Standard deviation of the sample mean, or standard error of the sample mean, or simply standard error: $\sigma_{\bar{X}}$
• In any case:
Standard error of sample mean = standard deviation of the population / $\sqrt{n}$
or
$\sigma_{\bar{X}} = \sigma / \sqrt{n}$
• For application
$S_{\bar{X}} = S / \sqrt{n}$
$S$ is an estimate of $\sigma$, and $S_{\bar{X}}$ is an estimate of $\sigma_{\bar{X}}$.
1.4 Student's t distribution
• The t distribution was discovered by William S. Gosset (1876 - 1937) in 1908.
• "Student" is his pen name.
For a normal distribution $N(\mu, \sigma^2)$, $\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$.
If
$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
then Z follows a standard normal distribution, N(0,1).
• When σ is unknown,
$t = \frac{\bar{X} - \mu}{S_{\bar{X}}} = \frac{\bar{X} - \mu}{S / \sqrt{n}}$
t follows a t distribution.
[Figure: t curve, centered at 0]
• The properties of the t distribution
I. Symmetric
• Center is 0.
II. ν: shape parameter
• also called degrees of freedom, ν = n-1.
• determines the shape of a t curve.
• different ν, different t curve. As ν increases, the t curve gets closer to the standard normal curve; when ν →∞, the t curve becomes the standard normal curve.
Note: "In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary." -- from Wikipedia
• The different t curves
[Figure: f(t) for ν = ∞ (standard normal curve), ν = 4 and ν = 1]
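III. The area under the t curve
• The table for the t distribution.
• A t value is denoted $t_{\alpha,\nu}$, where α is a probability and ν is the degrees of freedom, ν = n-1.
• The area under the t curve means:
One side: $P(t \le -t_{\alpha,\nu}) = \alpha$ or $P(t \ge t_{\alpha,\nu}) = \alpha$
Two sides: $P(t \le -t_{\alpha,\nu}) + P(t \ge t_{\alpha,\nu}) = \alpha$
For the same α, the one-sided and two-sided table values differ: the two-sided value leaves α/2 in each tail.
• See the next figure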
• The meaning of the area under the t curve for two sides
[Figure: t curve with ν degrees of freedom; an area of α/2 in each tail, below $-t_{\alpha,\nu}$ and above $t_{\alpha,\nu}$]
1.5 Confidence Interval of Population Mean
Statistical inference
  • Estimation of parameter
      • point estimation
      • interval estimation
  • Hypothesis testing
• Point estimation of population mean -- sample mean
• Interval estimation of population mean -- (1-α) confidence interval
• Confidence level: 1-α, such as 95% or 99%.
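• From $P(t \le -t_{\alpha,\nu}) + P(t \ge t_{\alpha,\nu}) = \alpha$
• We can get
$P(-t_{\alpha,\nu} < t < t_{\alpha,\nu}) = 1 - \alpha$
$P\left(-t_{\alpha,\nu} < \frac{\bar{X} - \mu}{S_{\bar{X}}} < t_{\alpha,\nu}\right) = 1 - \alpha$
$P(\bar{X} - t_{\alpha,\nu} S_{\bar{X}} < \mu < \bar{X} + t_{\alpha,\nu} S_{\bar{X}}) = 1 - \alpha$
$\bar{X} \pm t_{\alpha,\nu} S_{\bar{X}}$
• This is the formula for the two-sided (1-α) confidence interval of the population mean.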
• The (1-α) confidence interval of the population mean is abbreviated as, for example, 95% CI or 99% CI.
• Whenever we get a mean $\bar{X}$ and a standard deviation $S$ from a sample, put them into
$\bar{X} \pm t_{\alpha,\nu} S_{\bar{X}}$
then the interval runs from $\bar{X} - t_{\alpha,\nu} S_{\bar{X}}$ to $\bar{X} + t_{\alpha,\nu} S_{\bar{X}}$.
• The two extreme values are called confidence limits.
Example
Systolic blood pressures of 20 healthy males were measured: $\bar{X} = 118.4$ mmHg, $S = 10.8$ mmHg.
What is the 95% confidence interval of the population mean?
$n = 20$, $S_{\bar{X}} = \frac{S}{\sqrt{n}} = \frac{10.8}{\sqrt{20}} = 2.415$
$\nu = n - 1 = 20 - 1 = 19$, $\alpha = 0.05$
$t_{0.05,19} = 2.093$ (from the table of the t distribution)
$\bar{X} - t_{0.05,19} S_{\bar{X}} = 118.4 - 2.093 \times 2.415 = 113.3$
$\bar{X} + t_{0.05,19} S_{\bar{X}} = 118.4 + 2.093 \times 2.415 = 123.5$
95% CI: (113.3, 123.5) mmHg
• What does "confidence interval" mean?
[Figure: (1-α) CIs, with the intervals that do not include μ marked]
• You should know:
Once you have a 95% confidence interval for a certain population mean, the μ of this population may be in it or may not be in it. Among intervals constructed this way from repeated samples, 95% contain μ!
(Guilin Pagodas http://en.wikipedia.org/wiki/Guilin)
In the figure, the red curve is the standard normal curve, the blue curve is the t curve, and df is ν (degrees of freedom).
```
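The standard error formula and the worked confidence interval example in the transcript can be checked with a few lines of Python. This is only a minimal sketch using the standard library: the function names (`standard_error`, `confidence_interval`, `sd_of_sample_means`) are hypothetical, the critical value 2.093 is taken from the slides' t table rather than computed, the simulation treats 0.5746 as the population standard deviation, and it uses 2000 repeats (an assumption; the slides use 100) so that the empirical values are more stable.

```python
import math
import random
import statistics


def standard_error(s, n):
    """Estimated standard error of the mean: S / sqrt(n)."""
    return s / math.sqrt(n)


def confidence_interval(mean, s, n, t_crit):
    """Two-sided CI: mean +/- t_crit * S / sqrt(n).

    t_crit must be looked up for nu = n - 1 (for example in a t table).
    """
    half_width = t_crit * standard_error(s, n)
    return mean - half_width, mean + half_width


def sd_of_sample_means(mu, sigma, n, repeats, rng):
    """Empirical SD of `repeats` sample means, each from n normal draws."""
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


if __name__ == "__main__":
    # Worked example from the slides: n = 20, mean 118.4, S = 10.8.
    low, high = confidence_interval(118.4, 10.8, 20, 2.093)
    print(f"95% CI: ({low:.1f}, {high:.1f}) mmHg")

    # Empirical SD of sample means next to sigma / sqrt(n).
    rng = random.Random(0)  # fixed seed, chosen arbitrarily
    for n in (5, 10, 20, 50):
        empirical = sd_of_sample_means(4.6602, 0.5746, n, 2000, rng)
        theoretical = 0.5746 / math.sqrt(n)
        print(n, round(empirical, 4), round(theoretical, 4))
```

The first print reproduces the interval (113.3, 123.5) mmHg from the example. The loop is meant to show the empirical spread of the sample means shrinking roughly in proportion to 1/√n, in line with section 1.3.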