OBJECTIVES


Understand the reason for and use of statistics
Review descriptive statistics




Measures of central tendency
Measures of variability
Measures of relationship
Inferential Statistics


Parametric
Non-parametric




Average: 13085.77
Mode: 13226
Median: 14022
Standard deviation: 198.3
What are statistics for?



Painting a mathematical picture….or
simplifying large sets of data
Identifying if relationships exist
Establishing the probability of a cause and
effect relationship
How many tests are there?




There are well over 100 statistical
calculations and tests
However, many are rarely if ever used
Therefore, we will focus on those that are
used most often…
Descriptive (Part I) and Inferential
Statistics (Part II)
Which tests should you know?

The New England Journal of Medicine
from Vol. 298 through 301 (760 articles)
were reviewed to determine what
statistical tests are most important to
learn to understand the scientific
literature.
FINDINGS***

“A reader who is conversant with some
simple descriptive statistics
(percentages, means, and standard
deviations) has full statistical access to
58% of the articles.”
Findings ***

“Understanding t-test increases this access
to 67 percent.”
“Familiarity with each additional test
gradually increases the percentage of
accessible articles.”
Descriptive Statistics***


Purpose --- To simply describe the data,
the sample, or the population.
Descriptive Statistics DO NOT establish a
possible cause & effect relationship
Inferential Statistics*


Purpose --- Experimental research uses
a sample of the population; inferential
statistics permit the researcher to
generalize from the sample data to the
entire population.
Aids the researcher in determining if cause
and effect relationships exist.
Descriptive Statistics
Measures of Central
Tendency***
Mode --- most frequently occurring score
Median (Mdn) --- the middle score when all
scores are placed in order
Mean (M or X̄) --- arithmetic mean --- i.e.
sum of scores divided by the number of
scores
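The three measures above can be sketched in a few lines. This is only an illustration: the scores are invented, and Python's standard `statistics` module is one of several ways to compute them.

```python
from statistics import mean, median, mode

# Hypothetical scores, invented purely for illustration.
scores = [2, 3, 3, 5, 7, 8, 14]

central_mode = mode(scores)      # most frequently occurring score
central_median = median(scores)  # middle score once the scores are sorted
central_mean = mean(scores)      # sum of scores divided by the number of scores

print(central_mode, central_median, central_mean)
```

Note how the single high score (14) pulls the mean above the median.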
Mean***


Is generally the preferred measure of
central tendency
Is used frequently for other calculations
Measures of Variability
(dispersion)
Measures of Variability***




AKA spread.
If scores are similar….they have low
variability (homogeneous)
If scores are dissimilar…they have high
variability (heterogeneous)
Two sets of scores may have the exact
same mean but one set may have low
variability and the other very
high….therefore… measures of variability
help DESCRIBE these differences
Range**




Simplest measure of variability
The difference between the lowest and
highest score
Usually reported in the literature as
“range” (e.g. the range of ages was...)
The range is an unstable calculation
because it is only based upon 2 numbers
Standard Deviation &
Variance***


These measures are calculated based
upon ALL data scores and therefore
better represent the data set
They are used with other statistical
calculations
Variance**


Related to the amount that an individual
score VARIES from the mean.
It is the average of the squared deviations
of the scores from the mean, so, unlike the
range, it uses ALL the scores.
Standard Deviation***




Is the square root of the variance
How widely the values in a set are spread
apart.
A large SD tells you that the data are fairly
diverse, while a small SD tells you the
data are pretty tightly bunched together.
If a mean is presented in the data, SD
MUST be there as well.
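As a rough sketch of the idea that two sets of scores can share a mean yet differ in variability, the example below computes range, variance and SD for two invented data sets. It assumes the population formulas (`pvariance`, `pstdev`, which divide by n); the sample versions (`variance`, `stdev`) divide by n - 1 and are what research reports usually use.

```python
from statistics import mean, pstdev, pvariance

# Two hypothetical sets of scores with the same mean but different spread.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

def describe(scores):
    """Return the mean plus three measures of variability."""
    return {
        "mean": mean(scores),
        "range": max(scores) - min(scores),  # based on only 2 numbers
        "variance": pvariance(scores),       # mean squared deviation from the mean
        "sd": pstdev(scores),                # square root of the variance
    }

print(describe(tight))
print(describe(wide))
```

Both sets have a mean of 50, but the second has a far larger range, variance and SD.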
Standard Deviation (aka
S.D.)***
Standard Deviation example***




If a group of people in an experiment had
a mean age of 40, with a standard
deviation of 4, what does this mean?
Assuming the ages are normally distributed:
1. About 68% of this group were between ages
36 and 44 (the mean +/- 1 SD)
2. About 95% were between 32 and 48 (+/- 2 SD)
3. About 99.7% were between 28 and 52 (+/- 3 SD)
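The percentages in this example are rounded figures for a normal distribution. A minimal check, assuming the ages really are normally distributed with mean 40 and SD 4:

```python
from statistics import NormalDist

# Assumption: ages follow a normal distribution with mean 40 and SD 4.
ages = NormalDist(mu=40, sigma=4)

coverage = {}
for k in (1, 2, 3):
    low = ages.mean - k * ages.stdev
    high = ages.mean + k * ages.stdev
    # Share of the distribution lying between low and high.
    coverage[k] = ages.cdf(high) - ages.cdf(low)
    print(f"within {k} SD: ages {low:.0f} to {high:.0f}, {coverage[k]:.1%}")
```

The exact shares come out at roughly 68.3%, 95.4% and 99.7%.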
Practice question

Your treatment group (n = 50) has a mean
score on an IQ test of 100 with a standard
deviation of 15.



How many subjects fall within one standard
deviation of the mean?
If a subject has an IQ of 125 , how many SD
does he/she fall from the mean?
If an individual has an IQ of 135, what
percentage of people is he/she smarter than?
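One way to work through these questions is sketched below. It assumes the IQ scores are normally distributed, which the question implies but does not state.

```python
from statistics import NormalDist

n, mean_iq, sd_iq = 50, 100, 15
iq = NormalDist(mu=mean_iq, sigma=sd_iq)  # assumption: scores are normal

# Expected number of the 50 subjects within one SD of the mean (85 to 115).
within_one_sd = n * (iq.cdf(mean_iq + sd_iq) - iq.cdf(mean_iq - sd_iq))

# Distance of an IQ of 125 from the mean, in SD units (a z-score).
z_125 = (125 - mean_iq) / sd_iq

# Share of people scoring below an IQ of 135.
pct_below_135 = iq.cdf(135)

print(round(within_one_sd), round(z_125, 2), f"{pct_below_135:.1%}")
```

Under that assumption, about 34 subjects fall within one SD, an IQ of 125 sits about 1.67 SD above the mean, and an IQ of 135 exceeds roughly 99% of scores.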
Correlations***




Measures of central tendency and
variability describe only ONE variable
CORRELATIONS describe the relationship
between TWO variables
Correlations can be positive, negative or
zero
Correlations range from +1 through -1
Correlations**





+.95, +.87, High positive correlations
+.19, +.22, Low positive correlations
+.03, -.02, No relationship
-.23, -.19, Low negative correlations
-.94, -.88, High negative correlations
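The slides do not say which correlation coefficient is meant. The sketch below assumes the Pearson coefficient, the most common choice, and uses invented data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical paired measurements.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]     # tends to increase with hours
falling = [10, 8, 6, 4, 2]   # decreases in a perfectly straight line

r_rising = pearson_r(hours, rising)
r_falling = pearson_r(hours, falling)
print(round(r_rising, 2), round(r_falling, 2))
```

The first pair gives a fairly high positive correlation (about +.77) and the second a perfect negative correlation (-1).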
Summary



Descriptive statistics do just that
They describe the distribution of data toward the
center (mean, median, mode) and they describe
the variability away from the center (range,
variance, standard deviation)
They also determine if there is a relationship
between 2 (or more) variables
Summary




Of the 100+ statistical tests only a few are
frequently used
You can intelligently read and understand nearly
70% of the biomedical literature with an
understanding of descriptive statistics and the t-test
Correlations are an important and often used
type of descriptive statistics
Descriptive Statistics must be well understood in
order to understand inferential statistics
Statistical & Medical
Significance**


It is important to keep in mind that statisticians
use the word “significance” to represent the
results of testing a hypothesis
In everyday language and in the clinical setting,
a “significant” finding or treatment relates to
how “important” it is from a clinical and not a
mathematical perspective
4 Possibilities***





Medically & statistically significant
Medically but not statistically significant
Statistically but not medically significant
Neither statistically nor medically significant
Very large groups of subjects can reflect
statistically significant differences between two
groups …but they may not be medically
significant from the perspective of cost, risks,
policies etc.
Before the Experiment

Researcher must determine the:




Critical value
Directional or non-directional hypothesis
Power
In order to determine the sample size.
Critical Value***



The researcher must establish the threshold at
which they will consider the results
“significant”…this is referred to here as the CRITICAL
VALUE (more formally, the significance level or alpha)
There is a somewhat subjective and
arbitrary decision to be made in this regard by
the researcher
The customary thresholds are either
p<.05 or p<.01
Power

The probability that the study will detect a
real effect if one exists (i.e. will correctly
reject a false null hypothesis)
Directional / Non-directional
Hypothesis***


“Directional” means that the researcher
anticipates or expects a specific positive or
negative impact from the treatment (or
other independent variable)
“Non-directional” means that the
researcher does not know what to expect.
Perhaps the treatment will make the
patient better or worse
Directional Hypothesis/Non… &
Critical Value**


These two decisions need to be made
before the study is started and will
determine to some degree if your results
will be significant
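A small sketch of why these two decisions matter. It assumes a test statistic that follows the standard normal distribution (a z test) purely to keep the example short; the same logic applies to t tests with their own tables.

```python
from statistics import NormalDist

alpha = 0.05             # significance level chosen before the study
z = NormalDist()         # standard normal distribution (assumed test statistic)

# Directional hypothesis: all of alpha sits in one tail.
one_tailed_cutoff = z.inv_cdf(1 - alpha)

# Non-directional hypothesis: alpha is split between both tails.
two_tailed_cutoff = z.inv_cdf(1 - alpha / 2)

print(round(one_tailed_cutoff, 3), round(two_tailed_cutoff, 3))
```

With alpha at .05 the cutoff is about 1.645 for a directional test and about 1.960 for a non-directional one, so the same data can be significant under one choice and not under the other.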
…
Describing Results***




p values
Confidence interval (CI)
Odds ratios
Number needed to treat (NNT).
p Values***




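Most common form of evaluating results,
although not the best
Tells you the probability of getting a result at
least as extreme as yours if chance alone were
at work (i.e. if the null hypothesis were true)
Customary thresholds: .05, .01
Written as p<.05.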
Confidence Interval***



Also reflects the role that chance may have
played in the results
Gives the range in which the true value is
expected to fall, at the stated level of confidence
Example: Significant improvement in the
VAS, 95% CI (25-50).
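A minimal sketch of how a confidence interval for a mean can be computed. The VAS improvements are invented, and the normal approximation used here is a simplification: with small samples a t-based multiplier is the usual choice and gives a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical VAS improvements for 10 patients (not data from any study).
improvements = [22, 31, 35, 38, 40, 41, 44, 47, 52, 60]

def mean_ci(values, level=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    m = mean(values)
    margin = z * stdev(values) / sqrt(len(values))  # z times the standard error
    return m - margin, m + margin

low95, high95 = mean_ci(improvements)
low99, high99 = mean_ci(improvements, level=0.99)
print(f"95% CI ({low95:.1f}-{high95:.1f}), 99% CI ({low99:.1f}-{high99:.1f})")
```

Asking for more confidence (99% rather than 95%) widens the interval.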
Odds Ratios***





It is the odds of improvement in the treatment
group (number with improvement divided by the
number without improvement) divided by the odds
of improvement in the control group (number with
improvement divided by the number without improvement)
E.g. (20/5) divided by (5/20) = 4 / 0.25 = 16
When the outcome is improvement, the higher the number
the better the result; a value of 1 means no difference
CI and odds ratios used together
95% CI odds ratio (2-8).
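The arithmetic in the example can be written out as follows. The function and argument names are illustrative only.

```python
def odds_ratio(treated_improved, treated_not, control_improved, control_not):
    """Odds of improvement with treatment divided by odds of improvement in controls."""
    treated_odds = treated_improved / treated_not
    control_odds = control_improved / control_not
    return treated_odds / control_odds

# Counts from the slide: 20 improved and 5 did not in the treatment group,
# 5 improved and 20 did not in the control group.
or_value = odds_ratio(20, 5, 5, 20)
print(or_value)  # 16.0
```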
NNT**


Number (of patients) needed to treat
This gives you the number of patients you
need to treat for one additional patient to
improve compared with the control. The closer
to 1 you get the better the treatment
Post-operative pain 50%↓
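NNT is conventionally calculated as 1 divided by the absolute difference in improvement rates between the groups. The sketch below reuses the counts from the odds ratio example (20 of 25 treated patients improved versus 5 of 25 controls) as an assumed illustration; the slides do not give numbers for NNT.

```python
from math import ceil

def number_needed_to_treat(treated_rate, control_rate):
    """NNT = 1 / absolute difference in improvement rates."""
    benefit = treated_rate - control_rate
    if benefit <= 0:
        raise ValueError("treatment shows no benefit over control")
    return 1 / benefit

# Assumed rates: 20/25 improved with treatment, 5/25 improved without.
nnt = number_needed_to_treat(20 / 25, 5 / 25)
print(round(nnt, 2), ceil(nnt))  # NNT is usually rounded up to a whole patient
```

Here roughly 2 patients need to be treated for one additional patient to improve.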
Review of Descriptive
Statistics
***








Mean
Median
Mode
Variance
Standard Deviation
Odds ratio
NNT
CI
***


Prior to designing experiment, effect of
power, critical value, hypothesis and
change measured
Statistical and medical significance
Standard Deviation (aka S.D.)
Inferential Statistics
Parametric versus Nonparametric
Parametric Statistics
3 most commonly used tests
& some related concepts commonly
expressed in the literature
Assumptions---Parametric***


Parametric Inferential Statistics --- assume that
the sample comes from a population that is
NORMALLY DISTRIBUTED & that the variance is
similar (homogeneous) between sample and
population (or 2 populations)
The tests are very POWERFUL --- i.e. can
recognize if there is a significant change based
upon the experimental manipulation
Usage***

“Generally, it is agreed that unless there is
sufficient evidence to suggest that the
population is extremely non-normal and
that the variances are heterogeneous,
parametric tests should be used because
of their additional power”
Parametric Statistical Tests
t-Test aka Student t-Test
ANOVA
Analysis of covariance
t- Test***




Developed by Gosset under the pseudonym
Student
3 different versions of the t-tests that apply to 3
different research designs
All three forms of the t-Test are based upon the
MEANS of two groups
The larger the absolute value of the calculated t
score, the greater the chance that the null
hypothesis can be rejected
t-Test***



3 different general ways to use the
student t-test
2 variations of each…dependent upon the
type of hypothesis the researcher uses
The hypothesis can either be directional or
non-directional
t-Test #1 Single Sample***



Compares the sample to the mean value of the
population
Is not used often because the mean for the
population is usually not known
E.g. Stanford Binet I.Q. test….has a mean of 100
and 1 S.D. of 15 (although not everyone in the
U.S. has had the test, enough have been tested
to accept the data as representative)
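A sketch of the single sample calculation, using an invented sample of 8 IQ scores compared against the population mean of 100. The cutoff of 2.365 is the standard two-tailed table value for p<.05 with 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, population_mean):
    """t = (sample mean - population mean) / standard error of the mean."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - population_mean) / standard_error

# Hypothetical IQ scores for a sample of 8 subjects.
iq_sample = [112, 98, 105, 120, 101, 109, 95, 116]

t_single = one_sample_t(iq_sample, 100)
T_CUTOFF_DF7 = 2.365  # two-tailed, p<.05, df = n - 1 = 7 (from a t table)
print(round(t_single, 2), abs(t_single) > T_CUTOFF_DF7)
```

The sample mean is 107, but t is about 2.25, just short of the cutoff, so the null hypothesis is not rejected for this sample.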
t-Test #2 Correlated Groups***




Used when subjects serve as their own
controls (or when they are matched to very
similar subjects)
For each subject we could have a pre and a
post treatment score (e.g. pain, blood
pressure, algometer, cholesterol levels, range
of motion…)
The null hypothesis would be that the
difference between pre and post scores would
be 0 (treatment is not effective)
If the difference is sufficient, the null
hypothesis can be rejected
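A sketch of the correlated groups calculation on invented pre and post pain scores. The test works on each subject's difference score and asks whether the mean difference is far enough from 0.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """t statistic for the mean of the within-subject differences (post - pre)."""
    diffs = [after - before for before, after in zip(pre, post)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error

# Hypothetical pain scores for the same 8 subjects before and after treatment.
pre_scores = [7, 6, 8, 5, 7, 9, 6, 8]
post_scores = [5, 5, 6, 5, 4, 7, 6, 5]

t_paired = paired_t(pre_scores, post_scores)
T_CUTOFF_DF7 = 2.365  # two-tailed, p<.05, df = 8 - 1 = 7 (from a t table)
print(round(t_paired, 2), abs(t_paired) > T_CUTOFF_DF7)
```

Here t is about -3.87, beyond the cutoff, so the null hypothesis of no change would be rejected.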
T-Test #3 Independent t-Test***




AKA independent groups t-Test
Most commonly used
Used when you have 2 groups (2 samples)
out of an entire population
H0: mean of control group = mean of treatment group
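A sketch of the independent groups calculation on invented scores. It assumes the pooled-variance form of the test, which relies on the two groups having similar variances.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for the difference between two group means."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical pain scores for two separate groups of 5 subjects.
control = [5, 6, 7, 6, 6]
treatment = [3, 4, 5, 4, 4]

t_independent = independent_t(control, treatment)
T_CUTOFF_DF8 = 2.306  # two-tailed, p<.05, df = 5 + 5 - 2 = 8 (from a t table)
print(round(t_independent, 2), abs(t_independent) > T_CUTOFF_DF8)
```

Here t is about 4.47, beyond the cutoff, so the null hypothesis of equal means would be rejected.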
Analysis of Variance (ANOVA)***



The t-Test only allows us to compare 2
groups
What if we have a study comparing 2 or
more types of treatment with controls of
both no treatment and placebo?
ANOVA is designed to handle multiple
groups similar to what the t-Test does with
2 groups
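A sketch of the one-way ANOVA calculation for four invented groups (two controls and two treatments). It computes the F ratio: variability between the group means divided by variability within the groups.

```python
from statistics import mean

def one_way_f(*groups):
    """F ratio for a one-way analysis of variance."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    k, n_total = len(groups), len(all_scores)
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of individual scores around their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical outcome scores, 3 subjects per group.
no_treatment = [6, 7, 8]
placebo = [5, 6, 7]
treatment_a = [3, 4, 5]
treatment_b = [2, 3, 4]

f_ratio = one_way_f(no_treatment, placebo, treatment_a, treatment_b)
print(f_ratio)
```

A large F suggests that at least one group mean differs from the others; it does not say which one.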
Analysis of Covariance
(ANCOVA)***



Sometimes in studies, nuisance variables
impact the dependent variable (outcome
measures) but are not part of the independent
variable (e.g. treatment).
These unwanted variables can interfere
with our analysis of the data
Example…
Example***




We want to see if one of two treatment
protocols will have a positive effect upon
patients with low back pain
The patients are randomly assigned to the
treatment and control groups
We realize from the histories that there are
factors that impact recovery from low back
pain that we have not accounted for (e.g.
obesity, smoking, occupation, age etc. etc.)
These factors could impact rate of recovery
(dependent variables)
The Analysis of Covariance pulls those
possible confounding… nuisance factors out
Regression Analysis**


Regression analysis falls somewhere
between parametric and non-parametric
statistics.
When there are a number of variables,
and there appear to be some trends, but
nothing seems to be statistically
significant, this analysis may be used.
Regression, cont.



Regression measures how well data fits a
straight line.
The data is systematically analyzed,
variable by variable, until only those
variables that “fit” are left in the model.
The fitted line is also referred to as the least squares or best fit line.
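A sketch of fitting a straight line to one predictor by least squares, with invented data. The variable-by-variable model building described above is not shown; this only illustrates the line fit and how well the data follow it (R squared).

```python
def fit_line(xs, ys):
    """Least squares slope, intercept and R squared for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variability in y explained by the line.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_residual / ss_total

# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept, r_squared = fit_line(xs, ys)
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))
```

For these points the line is y = 2.2 + 0.6x and it explains about 60% of the variability in y.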
Non-parametric Statistics***



These tests are based on the MEDIAN
rather than the MEAN of the sample.
Parametric tests are generally performed
FIRST, then non-parametric tests.
Non-parametric tests are performed when
the assumption of normality is violated.
Nonparametric Inferential
Statistics --- no Assumptions**



Makes no assumptions about the
distribution of the data (distribution free)
So it does not assume that there is a
normal distribution of the data…..etc.
Is less powerful…meaning that a greater
difference (or change) needs to be
present in the data before a significant
difference can be detected
Nonparametric Tests***



Wilcoxon Signed Rank Test --- equivalent to
the correlated groups (paired) t-test; data is multidirectional
Wilcoxon Rank Sum Test --- equivalent to
the independent groups Student t-test
Kruskal-Wallis Test --- equivalent to the one-way analysis of variance
Kappa***




Commonly used in chiropractic & medical
literature to convey the degree of interexaminer
and/or intraexaminer reliability
Interexaminer– two or more examiners
checking/evaluating or testing for the same
finding
Intraexaminer– one examiner
checking/evaluating or testing for the same
finding on two different occasions
Kappa calculates the degree of agreement
between the first and second check/evaluation
or test
Kappa values ***






< 0 Poor agreement
0 — 0.20 Slight agreement
0.21 — 0.40 Fair agreement
0.41 — 0.60 Moderate agreement
0.61 — 0.80 Substantial agreement
0.81 — 1.00 Almost perfect agreement
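The slides do not name a specific formula. The sketch below assumes Cohen's kappa for two examiners, which compares the observed agreement with the agreement expected by chance, and labels the result using the scale above. The findings are invented.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from how often each rater uses each category.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)  # undefined if expected == 1

def interpret(kappa):
    """Label a kappa value using the scale listed on the slide."""
    if kappa < 0:
        return "Poor agreement"
    scale = [(0.20, "Slight"), (0.40, "Fair"), (0.60, "Moderate"),
             (0.80, "Substantial"), (1.00, "Almost perfect")]
    for upper, label in scale:
        if kappa <= upper:
            return f"{label} agreement"
    return "Almost perfect agreement"

# Hypothetical findings from two examiners on the same 10 patients.
examiner_1 = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg"]
examiner_2 = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg", "pos"]

kappa = cohen_kappa(examiner_1, examiner_2)
print(round(kappa, 2), interpret(kappa))
```

The examiners agree on 8 of 10 patients (80%), yet kappa is only about 0.58 (moderate agreement) because about half of that agreement would be expected by chance.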
Summary




“Significance” is used in different
ways…statistical & medical
The most commonly used inferential statistical
test is the Student t-Test (which has 3 versions
depending on what you are comparing); it only
compares 2 groups/samples
Hypothesis can be directional or nondirectional…
Critical value, power and direction of hypothesis
are established by the researcher BEFORE the
study is started.
Summary….contd.



Parametric tests assume several things
related to a relatively normal distribution
of data
Analysis of variance is used for comparing
more than two groups
Analysis of Covariance is used to account
for and remove the effects of nuisance
variables
Knowledge
Comprehension and
understanding of facts, truths or
principles.