“Victor Babes”
UNIVERSITY OF MEDICINE
AND PHARMACY
TIMISOARA
DEPARTMENT OF
MEDICAL INFORMATICS AND BIOPHYSICS
Medical Informatics Division
www.medinfo.umft.ro/dim
2007 / 2008
STATISTICAL ESTIMATION
STATISTICAL TESTS (I)
COURSE 4
STATISTICAL ESTIMATION
1.1. Numerical variables - example
• A STUDY ON CHILDREN'S SOMATIC
DEVELOPMENT
– N = 25 children, age 10, Timisoara, 1997
– mean X̄ = 137 cm
– standard deviation s = 5 cm
• Can we extend the conclusions to the entire
population?
• Different samples yield different means!
1.2. GRAPHICAL REPRESENTATIONS
Individual values – continuous line
Sample means – dotted line
1.3. Population characteristics
• Population mean
μ
• Standard error of the mean
$s_{\bar{x}} = s / \sqrt{N}$
EXAMPLE
• A STUDY ON CHILDREN'S SOMATIC
DEVELOPMENT
• N = 25 children, age 10, Timisoara, 1997
• mean X̄ = 137 cm
• standard deviation s = 5 cm
• standard error of the mean s_x̄ = 1 cm
1.4. LOCALIZATION OF POPULATION MEAN
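$\mu \in (\bar{x} - s_{\bar{x}},\ \bar{x} + s_{\bar{x}});\quad p \approx 68\%$
$\mu \in (\bar{x} - 2 s_{\bar{x}},\ \bar{x} + 2 s_{\bar{x}});\quad p \approx 95\%$
$\mu \in (\bar{x} - 3 s_{\bar{x}},\ \bar{x} + 3 s_{\bar{x}});\quad p \approx 99.7\%$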
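The three intervals can be checked on the children example with a short Python sketch. It assumes the standard error of the mean is computed as s / sqrt(N), which agrees with the slide values (s = 5 cm, N = 25, standard error 1 cm). The function names are illustrative only.

```python
import math

def standard_error(s, n):
    """Standard error of the mean, assumed here to be s / sqrt(N)."""
    return s / math.sqrt(n)

def mean_interval(mean, sem, k):
    """Interval (mean - k * sem, mean + k * sem) around the sample mean."""
    return (mean - k * sem, mean + k * sem)

# Children example from the slides: N = 25, mean 137 cm, s = 5 cm
n, mean, s = 25, 137.0, 5.0
sem = standard_error(s, n)

# k = 1, 2, 3 standard errors, with the approximate probabilities from the slide
for k, p in [(1, "68%"), (2, "95%"), (3, "99.7%")]:
    low, high = mean_interval(mean, sem, k)
    print(f"p ~ {p}: ({low:.0f}, {high:.0f}) cm")
```

For k = 2 this gives the interval (135, 139) cm.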
1.5. DEFINITIONS
– a) STANDARD DEVIATION =
• DISPERSION INDICATOR SHOWING THE
SPREAD OF INDIVIDUAL VALUES
AROUND THE SAMPLE MEAN
– b) STANDARD ERROR OF THE MEAN =
• DISPERSION INDICATOR SHOWING THE
SPREAD OF SAMPLE MEANS
AROUND THE POPULATION MEAN
EXERCISE
• For a group of N = 36 cardiac patients
we found a mean systolic blood
pressure of 150 mm Hg with a standard
deviation of 12 mm Hg.
– a) In which interval do 68%
of the patients' systolic pressure values lie?
– b) In which interval can we find the mean
systolic pressure with 95% probability?
– c) What percentage of patients have values
above 162 mm Hg?
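One possible way to work the exercise is sketched below in Python. It assumes the pressure values are approximately normally distributed and uses the rounded rules from section 1.4 (one standard deviation for 68%, two standard errors for 95%). The variable names are illustrative.

```python
import math

# Exercise data: N = 36 patients, mean 150 mm Hg, s = 12 mm Hg
n, mean, s = 36, 150.0, 12.0

# a) About 68% of individual values lie within one standard deviation of the mean
values_68 = (mean - s, mean + s)

# b) The mean is located with about 95% probability within two standard errors
sem = s / math.sqrt(n)  # 12 / 6 = 2 mm Hg
mean_95 = (mean - 2 * sem, mean + 2 * sem)

# c) 162 = mean + s; the roughly 32% of values outside (mean - s, mean + s)
#    are split evenly between the two tails of a symmetric distribution
percent_above_162 = (100 - 68) / 2

print("a)", values_68)
print("b)", mean_95)
print("c)", percent_above_162, "%")
```

Under these assumptions the answers are (138, 162) mm Hg, (146, 154) mm Hg and about 16%.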
1.6. Generalization
• LOCATION OF POPULATION
CHARACTERISTICS
• TYPES:
– MEANS
– PROPORTIONS
– DIFFERENCES (MEANS, PROPORTIONS)
• 1.6.a. MEAN ESTIMATION
– LARGE SAMPLES N > 30
– X̄ = NORMAL DISTRIBUTION
(REGARDLESS OF THE INDIVIDUAL DISTRIBUTION)
– CONFIDENCE LEVEL AND z VALUE:
    68%      z = 1
    90%      z = 1.65
    95%      z = 1.96
    95.4%    z = 2
    99%      z = 2.58
    99.7%    z = 3
$\hat{X} \in (\bar{X} \pm z_{\alpha} \cdot s_{\bar{x}})$
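The z values for each confidence level can be recomputed from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library. The helper names are illustrative, and the sample data (N = 36, mean 150, standard error 2) is borrowed from the exercise above only as an example of a large sample.

```python
from statistics import NormalDist

def z_value(confidence):
    """Two-sided critical value z with P(-z < Z < z) = confidence."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

def mean_confidence_interval(mean, sem, confidence):
    """Interval mean +/- z * sem for a large sample."""
    z = z_value(confidence)
    return (mean - z * sem, mean + z * sem)

# Confidence levels listed on the slide
for level in (0.68, 0.90, 0.95, 0.954, 0.99, 0.997):
    print(f"{level:.1%}: z = {z_value(level):.2f}")

# Example: N = 36, mean 150 mm Hg, standard error 2 mm Hg
low, high = mean_confidence_interval(150.0, 2.0, 0.95)
print(f"95% interval: ({low:.2f}, {high:.2f}) mm Hg")
```

The computed values differ slightly from the rounded ones on the slide (for instance, 68% corresponds to z of about 0.99 rather than exactly 1).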
• 1.6.b. SMALL SAMPLES N < 30
– X̄ FOLLOWS THE t DISTRIBUTION
– DEGREES OF FREEDOM: N − 1
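For small samples the z value is replaced by a critical value of the t distribution, which depends on the degrees of freedom. A minimal sketch follows. It assumes N − 1 degrees of freedom and takes a few two-sided 95% critical values from a standard t table. The dictionary and function names are illustrative.

```python
import math

# Two-sided 95% critical values of Student's t from a standard t table.
# Assumption: only a few degrees of freedom are listed, for illustration.
T_CRIT_95 = {9: 2.262, 14: 2.145, 19: 2.093, 24: 2.064, 29: 2.045}

def t_interval_95(mean, s, n):
    """95% interval for the mean of a small sample, using N - 1 degrees of freedom."""
    df = n - 1
    t = T_CRIT_95[df]  # raises KeyError if df is not in the small table above
    sem = s / math.sqrt(n)
    return (mean - t * sem, mean + t * sem)

# Children example: N = 25, mean 137 cm, s = 5 cm
low, high = t_interval_95(137.0, 5.0, 25)
print(f"({low:.2f}, {high:.2f}) cm")
```

The resulting interval is slightly wider than the (135, 139) obtained with the rounded rule of two standard errors.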
• 1.6.c. PROPORTIONS
$\hat{P} \in (P \pm z_{\alpha} \cdot s_{p})$
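The slide does not say how s_p is obtained. The sketch below assumes the usual large-sample standard error of a proportion, sqrt(P(1 − P) / N), and uses z = 1.96 for 95%. The data (40 cases out of 100) is hypothetical and serves only to illustrate the formula.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Interval P +/- z * s_p, assuming s_p = sqrt(P * (1 - P) / N)."""
    p = successes / n
    s_p = math.sqrt(p * (1 - p) / n)
    return (p - z * s_p, p + z * s_p)

# Hypothetical data: 40 of 100 subjects show the characteristic
low, high = proportion_interval(40, 100)
print(f"95% interval for the proportion: ({low:.3f}, {high:.3f})")
```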
STATISTICAL TESTS
2. STATISTICAL TESTS
• 2.1. SIGNIFICANT AND
NONSIGNIFICANT DIFFERENCES
• a) Example:
– BOYS
– n = 25
– X̄ = 137 cm
– s = 5 cm
– s_x̄ = 1 cm
– (135, 139) ...95%
– GIRLS
– n = 25
– s = 5 cm
– s_x̄ = 1 cm
– X̄ = 138.5 cm: inside (135, 139), nonsignificant
– X̄ = 139.5 cm: outside (135, 139), significant
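The reasoning in this example can be sketched as a simple interval check: the second mean is compared with the 95% interval of the first sample. This mirrors the simplified criterion of the slide, not a formal two-sample test. The function name is illustrative.

```python
def differs_significantly(reference_mean, reference_sem, other_mean, k=2.0):
    """True if other_mean falls outside reference_mean +/- k * reference_sem."""
    low = reference_mean - k * reference_sem
    high = reference_mean + k * reference_sem
    return not (low <= other_mean <= high)

# Boys: mean 137 cm, standard error 1 cm, so the 95% interval is (135, 139)
for girls_mean in (138.5, 139.5):
    verdict = "significant" if differs_significantly(137.0, 1.0, girls_mean) else "nonsignificant"
    print(f"girls mean {girls_mean} cm: {verdict}")
```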
• b) DEFINITIONS
• NON-SIGNIFICANT DIFFERENCES
– High probability to occur by chance
– Sampling variability
– The two samples belong to the same
population
• SIGNIFICANT DIFFERENCES
– Low probability to occur by chance
– Must have another cause
2.2. STATISTICAL
HYPOTHESES
• a) NULL HYPOTHESIS
– H0 : X̄1 = X̄2 (equal statistically, not mathematically!)
– There are no significant differences
between the two values (samples)
• b) ALTERNATIVE HYPOTHESES
– H1 : X̄1 ≠ X̄2 (bilateral)
– H1 : X̄1 > X̄2 or X̄1 < X̄2 (unilateral)
• 2.3. SIGNIFICANCE THRESHOLD
– a) DEFINITION:
• probability value below which differences
are considered significant
– b) VALUE:
• α = 0.05 = 5 %
– c) CONFIDENCE LEVEL
• 1 − α = 0.95 = 95 %
• 2.4. P COEFFICIENT
– p = probability that differences at least as
large as those observed occur by chance alone
(sampling variability) when H0 is true
2.5. DECISION
• If p > 0.05 => Non-significant
differences, (N) , H0 accepted
• If p < 0.05 => Significant differences,
(S), H0 rejected
– If p < 0.01 => Very significant differences,
(V), H0 rejected
– If p < 0.001 => Extremely significant
differences, (E)
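The decision rule can be written as a small function that maps a p value to the labels N, S, V and E. The slide does not say what happens when p equals a threshold exactly; this sketch assumes such a value falls into the less significant class. The function name is illustrative.

```python
def classify(p, alpha=0.05):
    """Map a p value to the slide labels: N, S, V or E.

    Assumption: a p value exactly equal to a threshold is assigned
    to the less significant class.
    """
    if p < 0.001:
        return "E"  # extremely significant, H0 rejected
    if p < 0.01:
        return "V"  # very significant, H0 rejected
    if p < alpha:
        return "S"  # significant, H0 rejected
    return "N"      # non-significant, H0 accepted

for p in (0.20, 0.03, 0.005, 0.0001):
    print(p, classify(p))
```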
3. TESTS
CHARACTERISTICS
• 3.1. ERRORS
– TYPE I: H0 = TRUE, BUT REJECTED (probability α)
– TYPE II: H0 = FALSE, BUT ACCEPTED (probability β)
• 3.2. TEST CONFIDENCE = 1 − α
• TEST POWER = 1 − β
• α and β vary inversely: for a fixed sample size,
lowering α raises β
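The trade-off between the two error types can be illustrated numerically. The sketch below assumes a one-sided z test with a known standard error, and the true difference (2.5 cm) and standard error (1 cm) are hypothetical. It shows only that a stricter α gives a larger β when the sample size is fixed.

```python
from statistics import NormalDist

def type_ii_error(alpha, true_difference, sem):
    """Beta for a one-sided z test of H0: difference = 0 against H1: difference > 0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection threshold in standard errors
    power = 1 - std_normal.cdf(z_crit - true_difference / sem)
    return 1 - power

# Hypothetical situation: true difference 2.5 cm, standard error 1 cm
for alpha in (0.05, 0.01, 0.001):
    beta = type_ii_error(alpha, 2.5, 1.0)
    print(f"alpha = {alpha}: beta = {beta:.3f}, power = {1 - beta:.3f}")
```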
• 3.3. Parametric and nonparametric tests
– Parametric - for normally distributed
variables
– Nonparametric - for other distributions
• 4. CLASSES OF TESTS
– SIGNIFICANCE TESTS
– HOMOGENEITY TESTS
– CONCORDANCE TESTS
– INDEPENDENCE TESTS
– CORRELATION COEFFICIENT TESTS
-end-