Lecture 5
OBJECTIVES
Understand the reason for and use of statistics
Review descriptive statistics
  Measures of central tendency
  Measures of variability
  Measures of relationship
Inferential Statistics
  Parametric
  Non-parametric
Average: 13085.77
Mode: 13226
Median: 14022
Standard deviation: 198.3
What are statistics for?



Painting a mathematical picture….or simplifying large sets
of data
Identifying if relationships exist
Establishing the probability of a cause and effect
relationship
How many tests are there?




There are well over 100 statistical calculations and tests
However, many are rarely if ever used
Therefore, we will focus on those that are used most
often…
Descriptive (Part I) and Inferential Statistics (Part II)
Which tests should you know?

Volumes 298 through 301 of The New England Journal of
Medicine (760 articles) were reviewed to determine
which statistical tests are most important to learn in order to
understand the scientific literature.
FINDINGS***

“A reader who is conversant with some simple
descriptive statistics (percentages, means, and
standard deviations) has full statistical access to
58% of the articles.”
Findings ***


“Understanding t-tests increases this access to 67 percent.”
“Familiarity with each additional test gradually increases
the percentage of accessible articles.”
Descriptive Statistics***


Purpose --- To simply describe the data, the sample, or the
population.
Descriptive Statistics DO NOT establish a possible cause &
effect relationship
Inferential Statistics*


Purpose --- Experimental research uses a sample of the
population; inferential statistics permits the researcher to
generalize from the sample data to the entire population.
Aids the researcher in determining if cause and effect
relationships exist.
Descriptive Statistics
Measures of Central Tendency***
Mode --- most frequently occurring score
Median (Mdn) --- score in the middle once all scores are placed in order
Mean (M or X̄) --- arithmetic mean --- i.e. sum of scores
divided by the number of scores
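The three measures above can be sketched with Python's standard `statistics` module. The scores are hypothetical values invented for illustration, not data from the lecture.

```python
import statistics

# Hypothetical test scores, invented for illustration only.
scores = [70, 85, 85, 90, 100]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # middle score once the scores are sorted
mean = statistics.mean(scores)      # sum of scores divided by the number of scores

print("mode:", mode, "median:", median, "mean:", mean)
```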
Mean***


Is generally the preferred measure of central tendency
Is used frequently for other calculations
Measures of Variability (dispersion)
Measures of Variability***




AKA spread.
If scores are similar….they have low variability
(homogeneous)
If scores are dissimilar…they have high variability
(heterogeneous)
Two sets of scores may have the exact same mean but
one set may have low variability and the other very
high….therefore… measures of variability help DESCRIBE
these differences
Range**




Simplest measure of variability
The difference between the lowest and highest score
Usually reported in the literature as “range” (e.g. the
range of ages was...)
The range is an unstable calculation because it is only
based upon 2 numbers
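A minimal sketch of the two points above, using hypothetical scores: two groups can share a mean yet differ widely in range, and a single extreme score is enough to change the range.

```python
# Hypothetical scores, invented for illustration only.
group_a = [48, 49, 50, 51, 52]   # similar scores (homogeneous)
group_b = [10, 30, 50, 70, 90]   # dissimilar scores (heterogeneous)

def score_range(scores):
    """Difference between the highest and lowest score."""
    return max(scores) - min(scores)

mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)
range_a = score_range(group_a)
range_b = score_range(group_b)

# One extreme score changes the range completely, because the range
# depends on only two numbers (the lowest and the highest).
range_a_with_outlier = score_range(group_a + [95])

print(mean_a, mean_b, range_a, range_b, range_a_with_outlier)
```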
Standard Deviation & Variance***


These measures are calculated based upon ALL data
scores and therefore better represent the data set
They are used with other statistical calculations
Variance**


Related to the amount that an individual score VARIES
from the mean.
It is the average of the squared deviations of the scores from
the mean, so unlike the range it uses ALL the scores.
Standard Deviation***




Is the square root of the variance
How widely the values in a set are spread apart.
A large SD tells you that the data are fairly diverse, while a
small SD tells you the data are pretty tightly bunched
together.
If a mean is presented in the data, SD MUST be there as
well.
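The relationship between variance and standard deviation can be worked through step by step. This sketch assumes the sample formulas (dividing by n - 1), which the slides do not specify, and uses hypothetical scores.

```python
import math
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]

# Assumption: sample variance (divide by n - 1), not population variance.
sample_variance = sum(squared_deviations) / (len(scores) - 1)
sample_sd = math.sqrt(sample_variance)  # SD is the square root of the variance

print(round(sample_variance, 3), round(sample_sd, 3))
```

The hand calculation agrees with `statistics.variance` and `statistics.stdev`, which use the same n - 1 convention.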
Standard Deviation (aka S.D.)***
Standard Deviation example***




If a group of people in an experiment had a mean age of
40, with a standard deviation of 4, what does this mean?
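Assuming the ages are normally distributed:
1. About 68% of this group were between ages 36 and 44 (the
mean +/- 1 SD)
2. About 95% were between 32 and 48 (the mean +/- 2 SD)
3. About 99.7% were between 28 and 52 (the mean +/- 3 SD)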
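These percentages can be checked with a short sketch. It assumes the ages follow a normal distribution with mean 40 and SD 4, which is what the rule of thumb relies on.

```python
from statistics import NormalDist

# Assumption: ages are normally distributed with mean 40 and SD 4.
ages = NormalDist(mu=40, sigma=4)

def share_within(k):
    """Fraction of the group within k standard deviations of the mean."""
    return ages.cdf(40 + k * 4) - ages.cdf(40 - k * 4)

within_1_sd = share_within(1)  # ages 36 to 44
within_2_sd = share_within(2)  # ages 32 to 48
within_3_sd = share_within(3)  # ages 28 to 52

print(f"{within_1_sd:.1%} {within_2_sd:.1%} {within_3_sd:.1%}")
```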
Practice question

Your treatment group (n = 50) has a mean score on an IQ
test of 100 with a standard deviation of 15.
How many subjects fall within one standard deviation of the
mean?
If a subject has an IQ of 125, how many SDs does he/she fall
from the mean?
If an individual has an IQ of 135, what percentage of people is
he/she smarter than?
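One way to work through the practice question is sketched below. It assumes IQ scores are normally distributed; the variable names are illustrative only.

```python
from statistics import NormalDist

# Assumption: IQ scores are normally distributed, mean 100, SD 15.
iq = NormalDist(mu=100, sigma=15)
n_subjects = 50

# Expected number of the 50 subjects scoring between 85 and 115.
expected_within_1_sd = n_subjects * (iq.cdf(115) - iq.cdf(85))

# Number of SDs an IQ of 125 lies from the mean.
sd_from_mean_125 = (125 - 100) / 15

# Share of people scoring below an IQ of 135.
share_below_135 = iq.cdf(135)

print(round(expected_within_1_sd), round(sd_from_mean_125, 2),
      f"{share_below_135:.1%}")
```

Under these assumptions, roughly 34 of the 50 subjects fall within one SD, an IQ of 125 is about 1.67 SD above the mean, and an IQ of 135 is higher than about 99% of scores.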

Correlations***




Measures of central tendency and variability describe only
ONE variable
CORRELATIONS describe the relationship between TWO
variables
Correlations can be positive, negative or zero
Correlations range from +1 through -1
Correlations**





+.95, +.87, High positive correlations
+.19, +.22, Low positive correlations
+.03, -.02, No relationship
-.23, -.19, Low negative correlations
-.94, -.88, High negative correlations
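A correlation coefficient can be computed directly from paired scores. The sketch below assumes the Pearson correlation (the slides do not name a specific coefficient) and uses hypothetical data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

# Hypothetical paired scores, invented for illustration only.
x = [1, 2, 3, 4, 5]
r_perfect_positive = pearson_r(x, [2, 4, 6, 8, 10])   # rises exactly with x
r_perfect_negative = pearson_r(x, [10, 8, 6, 4, 2])   # falls exactly with x
r_high_positive = pearson_r(x, [2, 1, 4, 3, 5])       # rises, with scatter

print(r_perfect_positive, r_perfect_negative, round(r_high_positive, 2))
```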
Summary



Descriptive statistics do just that
They describe the distribution of data toward the center (mean,
median, mode) and they describe the variability away from the
center (range, variance, standard deviation)
They also determine if there is a relationship between 2 (or more)
variables
Summary




Of the 100+ statistical tests only a few are frequently used
You can intelligently read and understand nearly 70% of the
biomedical literature with an understanding of descriptive statistics
and the t-test
Correlations are an important and often used type of descriptive
statistics
Descriptive Statistics must be well understood in order to
understand inferential statistics
Statistical & Medical Significance**


It is important to keep in mind that statisticians use the word
“significance” to represent the results of testing a hypothesis
In everyday language and in the clinical setting, a “significant”
finding or treatment relates to how “important” it is from a clinical
and not a mathematical perspective
4 Possibilities***




Medically & statistically significant
Medically but not statistically significant
Statistically but not medically significant
Neither statistically nor medically significant

Very large groups of subjects can reflect statistically significant
differences between two groups …but they may not be medically
significant from the perspective of cost, risks, policies etc.
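The effect of group size can be illustrated with a simple two-group z-test. This is a sketch under stated assumptions: a known SD of 10 in both groups, equal group sizes, and a hypothetical difference of 0.5 units between group means.

```python
import math
from statistics import NormalDist

def two_sided_p(mean_difference, sd, n_per_group):
    """Two-sided p value for a difference between two group means,
    assuming a known, equal SD in both groups (z-test)."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = mean_difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The same small, hypothetical difference tested in a small and a huge study.
p_small_study = two_sided_p(mean_difference=0.5, sd=10, n_per_group=100)
p_large_study = two_sided_p(mean_difference=0.5, sd=10, n_per_group=100_000)

print(p_small_study > 0.05, p_large_study < 0.05)
```

The difference is identical in both cases; only the very large study labels it statistically significant, which says nothing about whether 0.5 units matters clinically.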
Before the Experiment

Researcher must determine the:
Critical value
Directional or non-directional hypothesis
Power
in order to determine the sample size.
Critical Value***



The researcher must establish the value at which they will consider
the results “significant”…this is referred to as the CRITICAL VALUE
(the chosen p threshold is also called the significance level, or alpha)
This is a subjective and somewhat arbitrary decision
made by the researcher
The customary critical values are either p<.05 or p<.01
Power

The probability that the study will detect a real effect if one
exists (i.e. correctly reject a false null hypothesis)
Directional / Non-directional Hypothesis***


“Directional” means that the researcher anticipates or
expects a specific positive or negative impact from the
treatment (or other independent variable)
“Non-directional” means that the researcher does not
know what to expect. Perhaps the treatment will make
the patient better or worse
Directional Hypothesis/Non… & Critical
Value**


These two decisions need to be made before the study is
started and will determine to some degree if your results
will be significant
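How these choices feed into sample size can be sketched with a common normal-approximation formula for comparing two group means. The formula, the function name `sample_size_per_group`, and the effect size of 0.5 SD are assumptions for illustration; the slides do not give a specific method.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80, directional=False):
    """Approximate subjects needed per group to compare two means.

    effect_size is the expected difference divided by the SD (hypothetical).
    A directional hypothesis puts all of alpha in one tail.
    """
    tail_area = alpha if directional else alpha / 2
    z_alpha = NormalDist().inv_cdf(1 - tail_area)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

n_non_directional = sample_size_per_group(0.5, directional=False)
n_directional = sample_size_per_group(0.5, directional=True)

print(n_non_directional, n_directional)
```

With the same critical value and power, the directional hypothesis needs fewer subjects, which is why the choice has to be made before the study starts.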
…
Describing Results***




p values
Confidence interval (CI)
Odds ratios
Number needed to treat (NNT).
p Values***




Most common form of evaluating results, although not the
best
Tells you the probability of getting a result at least this extreme by
chance alone if the treatment truly had no effect
Common thresholds: .05, .01
Written as p<.05.
Confidence Interval***



Also indicates whether the results could be due to chance
Also gives you the range in which the true value is likely to fall
Example: Significant improvement in the VAS, 95% CI
(25-50).
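A 95% confidence interval for a mean can be sketched as the mean plus or minus about 1.96 standard errors. The improvement scores are hypothetical, and the normal approximation is an assumption; with a sample this small a t-based interval would be somewhat wider.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical VAS improvement scores, invented for illustration only.
improvements = [20, 35, 40, 45, 30, 50, 25, 55, 40, 35]

mean = statistics.mean(improvements)
standard_error = statistics.stdev(improvements) / math.sqrt(len(improvements))

# Assumption: normal approximation (z of about 1.96 for a 95% interval).
z = NormalDist().inv_cdf(0.975)
ci_low = mean - z * standard_error
ci_high = mean + z * standard_error

print(f"mean {mean}, 95% CI ({ci_low:.1f}-{ci_high:.1f})")
```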
Odds Ratios***





It is the ratio of the number in the treatment group with
improvement divided by the number without improvement
compared to the number in the control group with improvement
divided by the number without improvement
(20/5) divided by (5/20) = 16
The higher the number the better the result
CI and odds ratios used together
95% CI odds ratio (2-8).
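The odds ratio arithmetic above, together with an approximate confidence interval, can be sketched as follows. The counts are the ones from the slide; the interval uses the standard log-odds approximation, which is an assumption since the slides do not say how their CI is obtained.

```python
import math

# Counts from the slide's example.
improved_treatment, not_improved_treatment = 20, 5
improved_control, not_improved_control = 5, 20

odds_treatment = improved_treatment / not_improved_treatment  # 20/5
odds_control = improved_control / not_improved_control        # 5/20
odds_ratio = odds_treatment / odds_control

# Assumption: 95% CI from the standard error of the log odds ratio.
se_log_or = math.sqrt(1 / improved_treatment + 1 / not_improved_treatment
                      + 1 / improved_control + 1 / not_improved_control)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(odds_ratio, round(ci_low, 1), round(ci_high, 1))
```

An interval that excludes 1 indicates the odds of improvement differ between the groups at the chosen level.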
NNT**


Number (of patients) needed to treat
This gives you the number of patients you need to treat
for one additional patient to improve compared with the control.
The closer to 1 you get the better the treatment
Post-operative pain 50%↓
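NNT is usually calculated as 1 divided by the difference in improvement rates between the treatment and control groups. The sketch below uses hypothetical rates (20 of 25 improving with treatment, 5 of 25 with control), not figures from the lecture.

```python
import math

# Hypothetical improvement rates, invented for illustration only.
rate_treatment = 20 / 25   # 80% improve with treatment
rate_control = 5 / 25      # 20% improve without it

absolute_difference = rate_treatment - rate_control
nnt = 1 / absolute_difference

# NNT is conventionally rounded up to a whole patient.
nnt_reported = math.ceil(nnt)

print(round(nnt, 2), nnt_reported)
```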
Review of Descriptive Statistics








Mean
Median
Mode
Variance
Standard Deviation
Odds ratio
NNT
CI


Prior to designing experiment, effect of power, critical
value, hypothesis and change measured
Statistical and medical significance
Standard Deviation (aka S.D.)
Inferential Statistics
Parametric versus Nonparametric
Parametric Statistics
3 most commonly used tests
& some related concepts commonly expressed in the
literature
Assumptions---Parametric***

Parametric Inferential Statistics --- assumes that the sample comes
from a population that is NORMALLY DISTRIBUTED & that the
variance is similar (homogeneous) between sample and population
(or 2 populations)

The tests are very POWERFUL --- i.e. can recognize if there is a
significant change based upon the experimental manipulation
Usage***

“Generally, it is agreed that unless there is sufficient
evidence to suggest that the population is extremely
non-normal and that the variances are heterogeneous,
parametric tests should be used because of their additional
power”
Parametric Statistical Tests
t-Test aka Student t-Test
ANOVA
Analysis of covariance
t- Test***




Developed by Gosset under the pseudonym Student
3 different versions of the t-test that apply to 3 different research
designs
All three forms of the t-Test are based upon the MEANS of two
groups
The larger the absolute value of the calculated t score, the greater the
chance that the null hypothesis can be rejected
t-Test***



3 different general ways to use the student t-test
2 variations of each…dependent upon the type of
hypothesis the researcher uses
The hypothesis can either be directional or non-directional
t-Test #1 Single Sample***


Compares the sample to the mean value of the population
Is not used often because the mean for the population is usually
not known

E.g. Stanford-Binet I.Q. test….has a mean of 100 and 1 S.D. of 15
(although not everyone in the U.S. has had the test, enough have
been tested to accept the data as representative)
t-Test #2 Correlated Groups***




Used when subjects serve as their own controls (or when they are
matched to very similar subjects)
For each subject we could have a pre and a post treatment score
(e.g. pain, blood pressure, algometer, cholesterol levels, range of
motion…)
The null hypothesis would be that the difference between pre and
post scores would be 0 (treatment is not effective)
If the difference is sufficient, the null hypothesis can be rejected
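The correlated groups calculation can be sketched from pre and post scores. The pain scores are hypothetical, and the critical value of 2.776 (two-tailed, p<.05, 4 degrees of freedom) is taken from a standard t table.

```python
import math
import statistics

# Hypothetical pre- and post-treatment pain scores for 5 subjects.
pre = [8, 7, 9, 6, 8]
post = [5, 6, 6, 4, 7]

# Each subject serves as their own control: analyze the differences.
differences = [before - after for before, after in zip(pre, post)]
mean_difference = statistics.mean(differences)
standard_error = statistics.stdev(differences) / math.sqrt(len(differences))

# Null hypothesis: the mean difference is 0 (treatment is not effective).
t = (mean_difference - 0) / standard_error
degrees_of_freedom = len(differences) - 1

critical_t = 2.776  # two-tailed, p<.05, df = 4 (from a t table)
reject_null = abs(t) > critical_t

print(round(t, 2), degrees_of_freedom, reject_null)
```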
t-Test #3 Independent t-Test***




AKA independent groups t-Test
Most commonly used
Used when you have 2 groups (2 samples) out of an entire
population
H0: X̄ control = X̄ treatment
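A sketch of the independent groups calculation follows. It assumes the pooled-variance form of the test (equal variances in both groups), uses hypothetical scores, and takes the critical value of 2.306 (two-tailed, p<.05, 8 degrees of freedom) from a standard t table.

```python
import math
import statistics

def independent_t(group_1, group_2):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n1, n2 = len(group_1), len(group_2)
    pooled_variance = ((n1 - 1) * statistics.variance(group_1)
                       + (n2 - 1) * statistics.variance(group_2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (statistics.mean(group_2) - statistics.mean(group_1)) / standard_error
    return t, n1 + n2 - 2

# Hypothetical outcome scores, invented for illustration only.
control = [4, 5, 6, 5, 5]
treatment = [7, 8, 6, 7, 7]

t, degrees_of_freedom = independent_t(control, treatment)
critical_t = 2.306  # two-tailed, p<.05, df = 8 (from a t table)
reject_null = abs(t) > critical_t

print(round(t, 2), degrees_of_freedom, reject_null)
```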
Analysis of Variance (ANOVA)***



The t-Test only allows us to compare 2 groups
What if we have a study comparing 2 or more types of
treatment with controls of both no treatment and
placebo?
ANOVA is designed to handle multiple groups similar to
what the t-Test does with 2 groups
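The idea behind a one-way ANOVA can be sketched as a ratio: variability between the group means divided by variability within the groups. The three groups and their scores are hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way analysis of variance."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variability of the scores around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical outcome scores, invented for illustration only.
no_treatment = [4, 5, 6]
placebo = [5, 6, 7]
treatment = [8, 9, 10]

f_statistic = one_way_anova_f([no_treatment, placebo, treatment])
print(round(f_statistic, 2))
```

A large F suggests that at least one group mean differs from the others; it does not say which one.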
Analysis of Covariance (ANCOVA)***



Sometimes in studies, nuisance variables impact the
dependent variable (outcome measures) but not the
independent variable (e.g. treatment).
These unwanted variables can interfere with our analysis
of the data
Example…
Example***




We want to see if one of two treatment protocols will have a
positive effect upon patients with low back pain
The patients are randomly assigned to the treatment and control
groups
We realize from the histories that there are factors that impact
recovery from low back pain that we have not accounted for (e.g.
obesity, smoking, occupation, age etc. etc.) These factors could
impact rate of recovery (dependent variables)
The Analysis of Covariance pulls those possible confounding…
nuisance factors out
Regression Analysis**


Regression analysis falls somewhere between parametric
and non-parametric statistics.
When there are a number of variables, and there appear
to be some trends, but nothing seems to be statistically
significant, this measurement will be used.
Regression, cont.



Regression measures how well data fit a straight line.
The data are systematically analyzed, variable by variable,
until only those variables that “fit” are left in the model.
Also referred to as least squares fit or best fit.
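Fitting a straight line and measuring how well the data fit it can be sketched for a single predictor. This is a simplified illustration with hypothetical data; it does not cover the variable-by-variable selection described above.

```python
def fit_line(xs, ys):
    """Least squares slope, intercept and R squared for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x

    # R squared: share of the variability in y explained by the line.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_residual / ss_total

# Hypothetical paired measurements, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept, r_squared = fit_line(x, y)
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))
```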
Non-parametric Statistics***



These tests are based on the MEDIAN rather than the
MEAN of the sample.
Parametric tests are generally performed FIRST, then non-parametric tests.
Non-parametric tests are performed when the assumption
of normality is violated.
Nonparametric Inferential Statistics --- no Assumptions**



Makes no assumptions about the distribution of the data
(distribution free)
So it does not assume that there is a normal distribution of
the data…..etc.
Is less powerful…meaning that a greater difference (or
change) needs to be present in the data before a
significant difference can be detected
Nonparametric Tests***



Wilcoxon Signed Rank Test --- equivalent to the correlated groups
(paired) t-test; data is multidirectional
Wilcoxon Rank Sum Test --- equivalent to the independent groups Student t-test
Kruskal-Wallis Test --- equivalent to the one-way analysis of
variance
Kappa***




Commonly used in chiropractic & medical literature to convey the
degree of interexaminer and/or intraexaminer reliability
Interexaminer– two or more examiners checking/evaluating or
testing for the same finding
Intraexaminer– one examiner checking/evaluating or testing for
the same finding on two different occasions
Kappa calculates the degree of agreement between the first and
second check/evaluation or test
Kappa values ***






< 0 Poor agreement
0 — 0.20 Slight agreement
0.21 — 0.40 Fair agreement
0.41 — 0.60 Moderate agreement
0.61 — 0.80 Substantial agreement
0.81 — 1.00 Almost perfect agreement
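A kappa calculation can be sketched as observed agreement corrected for the agreement expected by chance. This assumes Cohen's kappa for two examiners (the slides do not name the variant), and the findings for 50 patients are hypothetical.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Agreement between two sets of ratings, corrected for chance."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

def describe(kappa):
    """Label a kappa value using the scale listed above."""
    if kappa < 0:
        return "Poor"
    for upper, label in [(0.20, "Slight"), (0.40, "Fair"), (0.60, "Moderate"),
                         (0.80, "Substantial"), (1.00, "Almost perfect")]:
        if kappa <= upper:
            return label

# Hypothetical findings from two examiners on the same 50 patients.
examiner_a = ["pos"] * 20 + ["neg"] * 15 + ["pos"] * 5 + ["neg"] * 10
examiner_b = ["pos"] * 20 + ["neg"] * 15 + ["neg"] * 5 + ["pos"] * 10

kappa = cohen_kappa(examiner_a, examiner_b)
print(round(kappa, 2), describe(kappa))
```

Here the examiners agree on 70% of patients, but because 50% agreement is expected by chance alone, kappa is only about 0.40.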
Summary

“Significance” is used in different ways…statistical & medical



The most commonly used inferential statistical test is the Student t-Test
(which has 3 versions depending on what you are comparing); it
only compares 2 groups/samples
Hypothesis can be directional or non-directional…
Critical value, power and direction of hypothesis are established by
the researcher BEFORE the study is started.
Summary….contd.



Parametric tests assume several things related to a
relatively normal distribution of data
Analysis of variance is used for comparing more than two
groups
Analysis of Covariance is used to account for and remove
the effects of nuisance variables