Statistics for AKT - Northallerton VTS

Everything I knew for the AKT (must admit some of the more obscure equations, like pre- and post-test probability, were not well remembered!). Based mainly on Nick Collier's teaching and passmedicine answers.
Data
Data can be continuous, discrete, or categorical (which may in turn be ordinal or nominal).
Continuous data concerns a continuous variable (e.g. height or weight)
Discrete data must be a whole number (e.g. number of children)
Categorical data concerns categories (e.g. blue or brown eyes)
Categorical data may be ordinal data where it is possible to rank the categories (e.g. small, medium and
large) or nominal when ordering is not possible (e.g. eye colour)
Continuous data from a biological population will usually form a bell-shaped curve when plotted. If the curve is symmetrical it is called a Normal distribution (a type of parametric distribution, so parametric statistical techniques are required). If it is not symmetrical it is a skewed distribution (a type of non-parametric distribution, so non-parametric statistical techniques are required)
Normal Distribution (a type of parametric distribution)
Mean = mode (most frequently occurring number) = median (middle number when the numbers are ranked)
Distribution is described using the mean and the standard deviation or variance
Standard Deviation = the average distance each data point is from the mean (i.e. how spread out the data is). 68% of the data points will be within 1 SD of the mean, 95% within 1.96 SD, 99.7% within 3 SD
Variance = SD²
Standard error of the mean – this concept is quite confusing, so imagine the population of the UK. The
heights of everyone will be normally distributed and will have a mean. If you want to find out the mean
height of the UK population you could measure everyone but that would take too long. An alternative
would be to take a sample of the population e.g. the population of Northallerton, and assume that it is
representative of the whole of the UK. You can then measure everyone in Northallerton and calculate
the mean and standard deviation for Northallerton. However, how likely is it that the mean for
Northallerton matches the mean for the UK? The way to work this out is the Standard Error of the Mean
– basically this gives a range of figures measured from the sample mean (i.e. the Northallerton mean)
that the population mean is likely to be within. There is a 95% chance that the population mean will lie
within 1.96 SEM of the sample mean (i.e. take the Northallerton mean and add and subtract 1.96 SEM from it; this gives you a range within which there is a 95% chance that the UK mean will lie). This is how confidence intervals are arrived at. The bigger the sample size, the smaller the SEM becomes. This makes sense: if you measure the heights of half the population you are likely to be more accurate than if you measure the heights of only 10 people.
SEM = SD/√n (where n is the number of people in the sample)
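To make the arithmetic concrete, here is a minimal Python sketch of the SEM and the 95% confidence interval it gives (the height figures are invented):

    # SEM and 95% confidence interval for the population mean.
    # The sample heights below are invented for illustration.
    import math
    import statistics

    sample = [162.0, 175.5, 168.2, 180.1, 171.3, 166.8, 177.9, 169.4]  # cm

    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)       # sample standard deviation
    sem = sd / math.sqrt(len(sample))   # SEM = SD / sqrt(n)

    low, high = mean - 1.96 * sem, mean + 1.96 * sem
    print(f"mean = {mean:.1f} cm, SEM = {sem:.2f} cm")
    print(f"95% CI for the population mean: {low:.1f} to {high:.1f} cm")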
Skewed Distributions (a type of non-parametric distribution)
Described using median and range (distance between smallest and largest number)
Positive is skewed to the right (i.e. the longer tail sticks out to the right). Mean>median>mode.
Negative is skewed to the left (i.e. the longer tail sticks out to the left). Mean <median<mode.
(To remember the order of mean, median and mode write them out in that order, which is alphabetical,
then put in the arrows pointing to the direction of the skew)
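A quick Python sketch with invented numbers shows this ordering for a positively skewed sample:

    # In a positively skewed sample the mean is dragged towards the long
    # right tail, so mean > median. The data are invented.
    import statistics

    skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]   # 20 forms the right tail

    print(statistics.mean(skewed))     # 4.7 (pulled up by the tail)
    print(statistics.median(skewed))   # 3.0
    print(statistics.mode(skewed))     # 3
    print(max(skewed) - min(skewed))   # range = 19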
Statistical Tests
When doing a study, what is being looked at is whether there is a difference between two sets of data.
For example, does treatment with an ACE inhibitor change the blood pressure in the treatment group
compared to the non-treatment group?
Before starting a study several values have to be decided on:
The null hypothesis (i.e. the assumption which needs proving or disproving) is that there is no difference
between the two groups of data (i.e. that the two samples were drawn from the same population).
p value (level of statistical significance) = the probability of obtaining our result, or something more extreme, if the null hypothesis is true (loosely, the probability that the two samples have come from the same population, i.e. that there is no difference between the two groups)
p < 0.05 = if the two samples really were from the same population, a result this extreme would occur with a probability of less than 1/20
Once the level of statistical significance to be shown has been chosen then the power of the study is
used to calculate the sample size needed.
Power = probability of the study rejecting the null hypothesis when it is false (ideally should be >95%).
This is affected by the sample size, treatment effect size and the p value to be demonstrated. You want
to make sure you have a large enough sample to give a meaningful result that wouldn’t be achieved by
chance alone.
Even if the study has a power of 95% there is still a 1/20 chance that it will not reject the null hypothesis
when it is false (i.e. will say there is no difference between the two sample groups even if there is). If
this happens it is a Type 2 error (i.e. a false negative – the null hypothesis is accepted when it is actually
false). The rate of a type 2 error occurring is signified as β and 1-β = power. The other type of error that
can occur is a type 1 error where the null hypothesis is falsely rejected when it is true (i.e. a false
positive). If the significance level is set at 0.05 then this type of error will happen with a 1/20 chance. A type 1 error is signified as α.
Therefore, if there is no difference between the two samples, a study with a significance level of 0.05 has a 1/20 chance of showing a difference between them (a type 1 error). If there is actually a difference between the two samples, a study with a power of 95% has a 1/20 chance of showing no difference (a type 2 error).
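To see the type 1 error rate in action, here is a rough simulation sketch (it assumes numpy and scipy are installed) that repeatedly compares two samples drawn from the same population; about 1 comparison in 20 comes out 'significant' purely by chance:

    # Simulate many studies where the null hypothesis is TRUE (both groups
    # come from the same population) and count how often p < 0.05.
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    n_trials = 10_000
    false_positives = 0
    for _ in range(n_trials):
        a = rng.normal(loc=170, scale=10, size=30)   # same population...
        b = rng.normal(loc=170, scale=10, size=30)   # ...for both groups
        if ttest_ind(a, b).pvalue < 0.05:
            false_positives += 1             # a type 1 error

    print(false_positives / n_trials)        # close to 0.05, i.e. 1 in 20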
There are various ways of deciding if there is a difference between the two groups of data. The
significance of this difference is determined by the p value. Depending on the distribution of the data
different tests need to be used. We don’t really need to know how these tests are done so it is just a
case of learning which should be used when.
Numerical data:
Normal/parametric distribution – Student's t-test ('paired' if it is the same people in each sample, i.e. a before-and-after study; 'unpaired' if the two groups contain different people)
Skewed/non-parametric distribution – paired: Wilcoxon; unpaired: Mann-Whitney U (these rank the data before analysis)
Categorical data (binomial = two possible outcomes, expressed as a percentage or proportion) – Fisher's exact test, or Chi-squared for large samples
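For reference, a minimal sketch of what these tests look like using Python's scipy library (all the numbers are invented):

    # The scipy.stats equivalents of the tests listed above.
    from scipy import stats

    before  = [140, 152, 148, 160, 155]   # e.g. BP before treatment
    after   = [132, 145, 146, 150, 149]   # same people after treatment
    group_a = [140, 152, 148, 160, 155]   # two groups of
    group_b = [130, 144, 139, 151, 147]   # different people

    print(stats.ttest_rel(before, after))        # paired t-test
    print(stats.ttest_ind(group_a, group_b))     # unpaired t-test
    print(stats.wilcoxon(before, after))         # paired, non-parametric
    print(stats.mannwhitneyu(group_a, group_b))  # unpaired, non-parametric
    print(stats.fisher_exact([[8, 2], [1, 9]]))  # categorical, small sample
    print(stats.chi2_contingency([[30, 10], [15, 25]]))  # categorical, large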
Correlation and Regression
As well as looking for differences between two sets of data you might want to see if two sets of data
correlate (e.g. does blood pressure reduce as ACE inhibitor dose is increased). You again use different
tests depending on the data distribution.
Numerical data:
Normal/parametric – Pearson’s correlation coefficient (N.B. Parametric = Pearson’s)
Skewed/non-parametric – Spearman’s rank correlation coefficient (N.B. Spearman’s = Skewed)
Correlation exists when there is a linear relationship between two variables (N.B. there may be a non-linear relationship between two variables, but this would give a low correlation coefficient)
Correlation coefficient = r; this is the strength of the relationship, i.e. how closely the points lie to a line drawn through the plotted data. If the points are very scattered then the two variables are not well correlated; if all the points lie on the line of best fit then the two variables are well correlated.
r can vary from -1 (perfect negative correlation) to +1 (perfect positive correlation); 0 is no correlation
Correlation does not give any information on how much a variable will change based on the other
variable. It also does not show cause and effect.
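A short sketch of both coefficients using scipy (the dose and blood pressure figures are invented):

    # Pearson's r for parametric data, Spearman's rho for non-parametric
    # data (Spearman ranks the data first).
    from scipy import stats

    dose     = [2, 4, 6, 8, 10]            # ACE inhibitor dose (mg)
    systolic = [158, 150, 145, 138, 131]   # blood pressure (mmHg)

    r, p_r = stats.pearsonr(dose, systolic)
    rho, p_rho = stats.spearmanr(dose, systolic)
    print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")  # both near -1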
Linear regression is used to predict how one variable changes when a second variable is altered, i.e. it quantifies the relationship between two variables. It describes the line of best fit in mathematical terms. The regression coefficient gives the slope of the line (i.e. the change in one value per unit change in the other). The line of best fit is calculated using the method of least squares. Regression is useful if we want to use one measure as a proxy for the other, e.g. fasting glucose for HbA1c.
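A minimal least-squares sketch using scipy's linregress (the glucose/HbA1c pairs are invented):

    # The slope is the regression coefficient: the change in HbA1c per
    # unit change in fasting glucose. Data are invented.
    from scipy import stats

    fasting_glucose = [5.0, 5.6, 6.1, 7.0, 7.8, 8.4]   # mmol/L
    hba1c           = [34, 38, 41, 46, 52, 55]         # mmol/mol

    fit = stats.linregress(fasting_glucose, hba1c)
    print(f"slope = {fit.slope:.1f}, intercept = {fit.intercept:.1f}")
    predicted = fit.slope * 6.5 + fit.intercept        # glucose as a proxy
    print(f"predicted HbA1c at glucose 6.5: {predicted:.0f} mmol/mol")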
Survival Time
Cumulative survival over time in a cohort of patients is shown in a Kaplan-Meier survival curve, and it is possible to compare survival in different groups of patients using the curves. Regression models can be used to calculate the relative hazard ratio for events occurring in each group (i.e. you can find out the impact on survival that individual events have and give them a figure = the relative hazard ratio).
(I had never heard of a Kaplan-Meier survival curve when I did the AKT but I'm sure most of us can work out what this graph shows even if we do not know the name of it!)
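For the curious, here is a sketch using the third-party lifelines package (this assumes lifelines is installed; the follow-up times are invented):

    # Kaplan-Meier estimate of cumulative survival over time.
    from lifelines import KaplanMeierFitter

    durations = [5, 6, 6, 2, 4, 4, 9, 12, 3, 8]   # months of follow-up
    observed  = [1, 0, 1, 1, 1, 0, 1, 0, 1, 1]    # 1 = died, 0 = censored

    kmf = KaplanMeierFitter()
    kmf.fit(durations, event_observed=observed)
    print(kmf.survival_function_)     # cumulative survival at each time
    # kmf.plot_survival_function()    # draws the Kaplan-Meier curve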
Odds and Rates
Basic definition of odds and rates:
If you have 6 red balls and 4 blue balls in a bag, the odds (ratio) of picking out a red ball is 6:4 or 3:2 or
1.5. The probability (rate) of picking a red ball is 6/10 or 0.6. Probability can be expressed either as a
proportion as above (i.e. 0.6) or as a percentage by multiplying by 100 (i.e. 60%). Odds are never
expressed as a percentage. When doing calculations with rates make sure you know if it is a proportion
or a percentage that you are dealing with. Often it is best to only change to a percentage at the end of
the calculation to avoid confusion.
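A tiny Python sketch of the red ball example, including the conversions between odds and probability:

    # Odds vs probability for a bag of 6 red and 4 blue balls.
    red, blue = 6, 4

    odds_red = red / blue           # 1.5 (i.e. 6:4 or 3:2)
    prob_red = red / (red + blue)   # 0.6, i.e. 60% as a percentage

    # converting between the two:
    print(prob_red / (1 - prob_red))   # probability -> odds, gives 1.5
    print(odds_red / (1 + odds_red))   # odds -> probability, gives 0.6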
Consider a study where treatment or exposure is given to one group and placebo or no exposure to the
other group. Disease incidence is then measured in each group. It is possible to express the difference
between the two groups in various ways. Different ways of expressing the difference are appropriate
for different types of studies.
                                        Disease    No disease
Treatment or exposure (experimental)       a            b
Placebo or no exposure (control)           c            d
N.B. the control group may not be taking a placebo but an established drug as it would be unethical for
example, to compare a new chemotherapy agent to placebo if there is already an established treatment.
Basic expression of data:
Control event rate (CER) = rate of disease in the placebo group = c/(c+d) (x100 if you want to express it as a percentage) = those who have the disease who are in the control group / all those in the control group
Experimental event rate (EER) = rate of disease in the treated/exposed group = a/(a+b) (x100 if you want to express it as a percentage) = those who have the disease who are in the experimental group / all those in the experimental group
Ideally you would hope that if the factor being studied is a treatment then the EER should be lower than the CER. If exposure to a risk factor is being studied then the EER should be higher than the CER. (This may not always be the case if the treatment is actually worse than placebo or the possible risk factor is protective!)
Control event odds = c/d = those who have the disease who are in the control group/those who don’t
have the disease who are in the control group (never expressed as a percentage)
Experimental event odds = a/b = those who have the disease who are in the experimental group/those
who don’t have the disease who are in the experimental group (never expressed as a percentage)
These basic expressions of the data can be manipulated to show differences between the two groups:
Absolute risk reduction (attributable risk) is the most basic method: you simply subtract one rate from the other to show the proportion (or percentage) of cases that can be considered due to the exposure, or the reduction that can be considered due to the treatment. (When studying a treatment, where EER is likely to be less than CER, ARR = CER - EER; when studying an exposure, where EER is likely to be greater than CER, ARR = EER - CER.)
Relative risk reduction (relative risk difference) = ARR/CER (x100 to express as a percentage) (make sure ARR and CER are both expressed either as proportions or as percentages before calculating this)
Relative risk (risk ratio) = EER/CER (this is not expressed as a percentage as it is a ratio of the two rates!) (1 = no difference between the two groups)
Odds ratio = EEO/CEO (never expressed as a percentage) (1 = no difference between the two groups)
Another number that can be calculated from this basic data is the Number Needed to Treat. One way to make sense of the equation: the ARR is the proportion of patients who benefit from the treatment, so on average 1/ARR patients must be treated for one extra patient to benefit:
NNT = 1/ARR (x100 if ARR was expressed as a percentage)
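Here is a sketch computing all of the measures above from the four cells of the table (the trial numbers are invented):

    # Measures of difference from a 2x2 table. Invented numbers for a
    # treatment trial where the drug roughly halves the event rate.
    a, b = 10, 90    # treated:  10 got the disease, 90 did not
    c, d = 20, 80    # placebo:  20 got the disease, 80 did not

    cer = c / (c + d)         # control event rate       = 0.20
    eer = a / (a + b)         # experimental event rate  = 0.10
    arr = cer - eer           # absolute risk reduction  = 0.10
    rrr = arr / cer           # relative risk reduction  = 0.50
    rr  = eer / cer           # relative risk            = 0.50
    odds_ratio = (a / b) / (c / d)   # ~0.44
    nnt = 1 / arr             # number needed to treat   = 10

    print(cer, eer, arr, rrr, rr, round(odds_ratio, 2), nnt)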
Types of Study
Cohort – looks at outcomes in groups which are divided by their risk factors, used to assess harm or risk (e.g. the future risk of diabetes in breast-fed and non-breast-fed babies). Use relative risk.
Case control – looks back at risk factors in patients with and without a condition, used to assess
aetiology of rare conditions (e.g. were teenagers with leukaemia more likely to have lived near an
electricity pylon as a child). Use odds ratio.
Controlled trials – looks at outcomes in groups with different interventions, used to assess the benefit of
treatments (e.g. is perindopril better than placebo at lowering blood pressure). Use relative risk.
Cross-sectional survey – counts the number of people in the population with the disease, with or without certain risk factors; used to assess prevalence.
Meta-analysis – secondary research, compiling the results of other trials.
Pragmatic (real-life scenarios) vs. explanatory (controlled situation).
Depending on the situation, a trial of a treatment may need to show that it is no worse than, the same as, or better than another treatment. These are described as non-inferiority, equivalence and superiority trials. Non-inferiority trials are the cheapest as they need the fewest participants; showing superiority is expensive as a large number of participants is required. Non-inferiority or equivalence may be sufficient for the drug company to show if their drug is cheaper, has fewer side effects or is easier to take than the established alternative. If a drug is more expensive or has more side effects than the established alternative then, to get people to use it, it must be shown to be superior.
Graphs
Forest plot = blobbogram:
Used for meta-analysis. Each square shows one study's result, and the size of the square shows the size of the study; the length of the line through it shows the confidence interval. The diamond is the summary of all the data, with each end signifying the confidence interval. If the line crosses OR or RR = 1 then that result is not statistically significant.
Funnel plot:
N.B. a funnel plot can be drawn in any orientation.
Study size is plotted against effect size (RR or OR) for all studies looking at the same variables. If there is a gap where small studies with an unwanted outcome (e.g. treatment worse than placebo) should be, this suggests publication bias, i.e. studies with an unwanted outcome not being published because the results would be damaging to the pharmaceutical company. Large studies tend to be published irrespective of the result.
Tests
Almost a separate part of statistics from trials is the statistics of tests. A test can be an investigation, a question or an examination that gives a yes/no answer (e.g. Hb < 10, 'do you have diarrhoea?', presence of a breast lump). We all know that just because a test says 'yes' it does not mean you have the condition (e.g. having a breast lump does not equal cancer, and a positive mammogram also does not equal cancer). There are ways of showing what a test result means and how likely someone is to have a particular condition based on the answers to various tests.
              Test +            Test -
Disease +     True positive     False negative
Disease -     False positive    True negative
When doing questions you can either draw the chart, put the letters a, b, c and d into the grid and learn the equations, or (as I prefer) use the descriptions 'true positive' etc. I find that way, even if I draw the chart the wrong way round, I can still answer the questions and the equations start to make sense. You just need to consider whether the test is positive or negative and whether the result is correct (true) or incorrect (false).
Sensitivity = TP/(TP+FN) = the proportion of true positives correctly identified (basically look at all the
people who actually have the disease and work out what proportion of them are picked up by the test).
It is important that the sensitivity is high if you want to make sure everyone with the condition is picked
up by the test, e.g. for screening.
Specificity = TN/(TN+FP) = the proportion of true negatives correctly identified (basically look at all the
people who don’t have the disease and work out what proportion are correctly identified as negative by
the test). It is important to have a high specificity if you don’t want to diagnose people with a condition
if they don’t have it, e.g. for a diagnostic test after screening.
The sensitivity and specificity of a test are constant, so they do not vary with disease prevalence. A ROC (receiver operating characteristic) curve shows the relationship between the sensitivity and specificity of a test: as sensitivity increases, specificity will reduce, and vice versa. A ROC curve plots the true positive rate (sensitivity) against the false positive rate (1-specificity) for different cut-off values of a diagnostic test. It enables you to choose the diagnostic cut-off which gives the best compromise between sensitivity and specificity; this is the point at the top left-hand corner of the curve.
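A minimal sketch using the third-party scikit-learn package (assumed installed; the disease labels and test values are invented):

    # Each possible cut-off trades sensitivity against specificity.
    from sklearn.metrics import roc_curve

    y_true  = [0, 0, 0, 0, 1, 1, 1, 1]                  # 1 = has disease
    y_score = [0.1, 0.3, 0.4, 0.6, 0.5, 0.7, 0.8, 0.9]  # test results

    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    for f, t, cut in zip(fpr, tpr, thresholds):
        print(f"cut-off {cut}: sensitivity {t:.2f}, 1-specificity {f:.2f}")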
Positive predictive value = TP/(TP+FP) = the proportion of patients with a positive test result who are
correctly diagnosed (basically look at all the people who tested positive and work out what proportion
of them actually have the disease). This takes into account the prevalence of the disease. Therefore if a
disease is common and the test result is positive then it is likely that the person has the disease,
however if the disease is rare, even if the test is positive it is unlikely that they have the disease.
Negative predictive value = TN/(TN+FN) = the proportion of patients with a negative test result who are
correctly diagnosed (basically look at all the patients with a negative test result and work out what
proportion of them don’t have the disease). This is affected by prevalence.
To remember sensitivity, specificity, PPV and NPV: sensitivity looks at those who have the disease (TP and FN), specificity looks at those who don't have the disease (TN and FP), PPV looks at those who have a positive test (TP and FP) and NPV looks at those who have a negative test (TN and FN). Both figures go on the bottom of the equation; the 'true' one goes on top.
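A short sketch of all four measures computed from invented counts:

    # Sensitivity/specificity use the disease columns; PPV/NPV use the
    # test-result rows. Counts are invented.
    tp, fn = 90, 10     # the 100 people WITH the disease
    fp, tn = 30, 870    # the 900 people WITHOUT the disease

    sensitivity = tp / (tp + fn)   # 0.90 - disease correctly picked up
    specificity = tn / (tn + fp)   # 0.97 - healthy correctly cleared
    ppv = tp / (tp + fp)           # 0.75 - positive tests that are right
    npv = tn / (tn + fn)           # 0.99 - negative tests that are right

    print(sensitivity, round(specificity, 2), ppv, round(npv, 2))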
Pre and post test probability:
It is possible to assess an individual’s risk of having a condition based on the test result and background
information.
The first number that is needed is the likelihood ratio:
Likelihood ratio for a positive test result = sensitivity/(1-specificity)
Likelihood ratio for a negative test result = (1-sensitivity)/specificity
These are constant for a specific test (i.e. they do not depend on prevalence) and are used to convert pre-test probability to post-test probability. The way I remember the formulas is that sensitivity is always on top and specificity on the bottom; the '1 minus' term then goes on the top for LR- and on the bottom for LR+.
Pre-test probability – this is the chance someone has of having the condition before a test is done. For a
random person on the street with no symptoms their pre-test probability is the prevalence of that
condition. However as soon as someone develops symptoms and presents to the GP or A&E their
chance of having the condition increases and we often don’t know the precise number. In practice
therefore this concept is perhaps most useful when looking at screening as you have an unselected
population whose pre-test probability of having the condition is the prevalence of the condition. (e.g.
the pre-test probability of a baby having CF is 1/2500).
You then want to know, given that tests are rarely perfect, what the probability of having the condition
is after having a positive or negative test. If the test is positive it increases the probability of having the
condition, if the test is negative it reduces the probability.
To get from pre-test to post-test probability (or prevalence) there are two methods:
1. Use a nomogram, which is a graphical way of using the likelihood ratio to convert pre-test to post-test probability
2. Convert pre-test probability to pre-test odds, then calculate post-test odds using the likelihood ratio, then convert post-test odds to post-test probability
Here are the formulas for this second method:
Pre-test probability -> pre-test odds:
pre-test odds = pre-test probability/(1 - pre-test probability)
Pre-test odds -> post-test odds:
post-test odds = pre-test odds x LR
Post-test odds -> post-test probability:
post-test probability = post-test odds/(1 + post-test odds)
You can apply this calculation multiple times, i.e. the post-test probability after one test becomes the
pre-test probability before the next test.
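The second method is easy to wrap up as a small Python helper (the function name is my own; the likelihood ratio of 4.9 used below is the LR+ from the worked example that follows):

    # Probability -> odds, multiply by the likelihood ratio, odds -> probability.
    def post_test_probability(pre_test_prob, likelihood_ratio):
        pre_test_odds = pre_test_prob / (1 - pre_test_prob)
        post_test_odds = pre_test_odds * likelihood_ratio
        return post_test_odds / (1 + post_test_odds)

    # Applying the calculation twice: the post-test probability after the
    # first test becomes the pre-test probability for the second test.
    p1 = post_test_probability(0.1, 4.9)   # ~0.35 after one positive test
    p2 = post_test_probability(p1, 4.9)    # ~0.73 after a second positive test
    print(p1, p2)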
Worked example for Tests
(Using completely made up numbers)
Prevalence of HIV in a population = 10%
So, 1000 randomly selected people (100 HIV +ve and 900 HIV -ve) are tested for HIV:
           Test 1 +    Test 1 -
HIV +          98           2
HIV -         180         720
Sensitivity = TP/(TP+FN) = 98/(98+2) = 0.98
Specificity = TN/(TN+FP) = 720/(720+180) = 0.8
1. If the test result is positive, what is the probability of being HIV +ve?
LR+ = sensitivity/(1-specificity) = 0.98/(1-0.8) = 4.9
Pre-test probability = 0.1 (i.e. the prevalence)
Pre-test odds = pre-test prob/(1-pre-test prob) = 0.1/(1-0.1) = 0.111
Post-test odds = pre-test odds x LR+ = 0.111 x 4.9 = 0.544
Post-test probability = post-test odds/(1+post-test odds) = 0.544/(1+0.544) = 0.35 = 35%
Therefore only 35% of people who test positive will actually be HIV +ve (this is exactly the PPV from the table: 98/(98+180) = 35%)
2. If the test result is negative, what is the probability of being HIV +ve?
LR- = (1-sensitivity)/specificity = (1-0.98)/0.8 = 0.025
Pre-test probability = 0.1
Pre-test odds = 0.111 (same as above)
Post-test odds = pre-test odds x LR- = 0.111 x 0.025 = 0.00278
Post-test probability = post-test odds/(1+post-test odds) = 0.00278/(1+0.00278) = 0.00277 = 0.28%
Therefore only 0.28% of people who test negative will be HIV +ve
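Finally, a sketch that reproduces the whole worked example in code:

    # Reproduce the worked example from the 2x2 counts above.
    tp, fn = 98, 2       # the 100 HIV +ve people
    fp, tn = 180, 720    # the 900 HIV -ve people

    sens = tp / (tp + fn)          # 0.98
    spec = tn / (tn + fp)          # 0.80
    lr_pos = sens / (1 - spec)     # 4.9
    lr_neg = (1 - sens) / spec     # 0.025

    pre_odds = 0.1 / (1 - 0.1)     # pre-test probability 0.1 -> odds

    for result, lr in (("positive", lr_pos), ("negative", lr_neg)):
        post_odds = pre_odds * lr
        post_prob = post_odds / (1 + post_odds)
        print(f"probability of HIV after a {result} test: {post_prob:.2%}")
    # prints roughly 35.25% and 0.28%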