Statistics for AKT - Northallerton VTS
Everything I knew for the AKT (I must admit some of the more obscure equations, like pre- and post-test probability, were not well remembered!). Based mainly on Nick Collier's teaching and Passmedicine answers.

Data

Data can be continuous, discrete or categorical (and categorical data may in turn be ordinal or nominal).
- Continuous data concerns a continuous variable (e.g. height or weight).
- Discrete data must be a whole number (e.g. number of children).
- Categorical data concerns categories (e.g. blue or brown eyes). Categorical data is ordinal where the categories can be ranked (e.g. small, medium and large) and nominal where they cannot (e.g. eye colour).

Continuous data from a biological population will usually form a bell-shaped curve when plotted. If the curve is symmetrical it is called a normal distribution (a type of parametric distribution, so parametric statistical techniques are used). If it is not symmetrical it is a skewed distribution (a type of non-parametric distribution, so non-parametric statistical techniques are used).

Normal distribution (a type of parametric distribution)

Mean = mode (the most frequently occurring value) = median (the middle value when the values are ranked). The distribution is described using the mean and the standard deviation or variance.
- Standard deviation (SD) = the average distance of each data point from the mean (i.e. how spread out the data are). About 68% of the data points lie within 1 SD of the mean, 95% within 1.96 SD and 99.7% within 3 SD.
- Variance = SD².

Standard error of the mean (SEM): this concept is quite confusing, so imagine the population of the UK. The heights of everyone will be normally distributed and will have a mean. To find the mean height of the UK population you could measure everyone, but that would take too long. An alternative is to take a sample of the population, e.g. the population of Northallerton, and assume that it is representative of the whole of the UK. You can then measure everyone in Northallerton and calculate the mean and standard deviation for Northallerton. But how likely is it that the Northallerton mean matches the UK mean? The standard error of the mean answers this: it gives a range around the sample mean (the Northallerton mean) within which the population mean is likely to lie. There is a 95% chance that the population mean lies within 1.96 SEM of the sample mean (i.e. take the Northallerton mean, add and subtract 1.96 SEM, and there is a 95% chance that the UK mean lies within that range). This is how confidence intervals are arrived at. The bigger the sample, the smaller the SEM becomes, which makes sense: if you measure the heights of half the population you are likely to be more accurate than if you measure only 10 people.

SEM = SD/√n, where n is the sample size.
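To make the arithmetic concrete, here is a minimal Python sketch of the mean, SD, SEM and 95% confidence interval calculations described above (the height figures and the sample are made up purely for illustration):

import statistics
import math

# Made-up sample of heights (cm) standing in for the "Northallerton" sample
heights = [162, 175, 168, 181, 159, 170, 173, 166, 178, 164]

n = len(heights)
mean = statistics.mean(heights)
sd = statistics.stdev(heights)       # sample standard deviation
variance = sd ** 2                   # variance = SD squared
sem = sd / math.sqrt(n)              # standard error of the mean = SD / sqrt(n)

# 95% confidence interval for the population mean: sample mean +/- 1.96 x SEM
ci_low = mean - 1.96 * sem
ci_high = mean + 1.96 * sem

print(f"mean {mean:.1f}, SD {sd:.1f}, variance {variance:.1f}, SEM {sem:.2f}")
print(f"95% CI for the population mean: {ci_low:.1f} to {ci_high:.1f}")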
Skewed distributions (a type of non-parametric distribution)

Described using the median and the range (the distance between the smallest and largest values). A positive skew is skewed to the right (the longer tail sticks out to the right): mean > median > mode. A negative skew is skewed to the left (the longer tail sticks out to the left): mean < median < mode. (To remember the order, write out mean, median and mode in that order, which is alphabetical, then put in the arrows pointing in the direction of the skew.)

Statistical Tests

When doing a study, what is being looked at is whether there is a difference between two sets of data. For example, does treatment with an ACE inhibitor change the blood pressure in the treatment group compared with the non-treatment group? Before starting a study several values have to be decided on.

The null hypothesis (the assumption which needs proving or disproving) is that there is no difference between the two groups of data (i.e. that the two samples were drawn from the same population).

The p value (level of statistical significance) is the probability of obtaining our result, or something more extreme, if the null hypothesis is true (i.e. if the two samples really do come from the same population). p < 0.05 means the probability of getting such a result when the two samples are from the same population is less than 1 in 20.

Once the level of statistical significance to be shown has been chosen, the power of the study is used to calculate the sample size needed. Power = the probability of the study rejecting the null hypothesis when it is false (ideally it should be >95%). It is affected by the sample size, the treatment effect size and the p value to be demonstrated. You want a large enough sample to give a meaningful result that would not be achieved by chance alone.

Even if the study has a power of 95% there is still a 1 in 20 chance that it will not reject the null hypothesis when it is false (i.e. it will say there is no difference between the two sample groups even when there is). This is a type 2 error (a false negative: the null hypothesis is accepted when it is actually false). The rate of type 2 errors is written β, and power = 1 - β. The other type of error is a type 1 error, where the null hypothesis is falsely rejected when it is true (a false positive); the type 1 error rate is written α. With a significance level of p = 0.05 this happens with a 1 in 20 chance. So if there is really no difference between the two samples, a study using p = 0.05 has a 1 in 20 chance of showing a difference (a type 1 error); if there really is a difference, a study with a power of 95% has a 1 in 20 chance of showing no difference (a type 2 error).

There are various ways of deciding whether there is a difference between the two groups of data; the significance of the difference is given by the p value. Different tests are used depending on the distribution of the data. We don't really need to know how the tests are done, just which should be used when (a sketch of the corresponding scipy calls follows this list).

Numerical data:
- Normal/parametric distribution: Student's t test ('paired' if it is the same people in each sample, e.g. a before-and-after study; 'unpaired' if the two groups contain different people).
- Skewed/non-parametric distribution: Wilcoxon (paired) or Mann-Whitney U (unpaired); these rank the data before analysis.

Categorical data (binomial = two possible outcomes, expressed as a percentage or proportion): Fisher's exact test, or chi-squared for large samples.
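As a rough guide to which test goes with which kind of data, here is a small sketch using scipy's standard test functions; the blood pressure values and the 2x2 table are made up, and in practice you would run only the test that matches your data:

from scipy import stats

# Paired numerical data: the same people before and after treatment
bp_before = [152, 148, 160, 155, 149, 158]
bp_after  = [141, 140, 151, 147, 142, 150]

# Parametric (normally distributed) data: Student's t test
t_stat, p_paired = stats.ttest_rel(bp_before, bp_after)      # paired t test
t_stat2, p_unpaired = stats.ttest_ind(bp_before, bp_after)   # unpaired t test

# Non-parametric (skewed) data: rank-based tests
w_stat, p_wilcoxon = stats.wilcoxon(bp_before, bp_after)     # paired: Wilcoxon
u_stat, p_mwu = stats.mannwhitneyu(bp_before, bp_after)      # unpaired: Mann-Whitney U

# Categorical data: 2x2 table of disease vs treatment group
table = [[30, 70],   # treated: 30 with disease, 70 without
         [45, 55]]   # control: 45 with disease, 55 without
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)  # chi-squared (large samples)
odds_ratio, p_fisher = stats.fisher_exact(table)             # Fisher's exact (small samples)

print(p_paired, p_wilcoxon, p_chi2, p_fisher)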
Correlation and Regression

As well as looking for differences between two sets of data, you might want to see whether two sets of data correlate (e.g. does blood pressure fall as the ACE inhibitor dose is increased?). Again, different tests are used depending on the data distribution. For numerical data: normal/parametric distribution, Pearson's correlation coefficient (N.B. Parametric = Pearson's); skewed/non-parametric distribution, Spearman's rank correlation coefficient (N.B. Spearman's = Skewed).

Correlation exists when there is a linear relationship between two variables (N.B. there may be a non-linear relationship between two variables, but this would give a low correlation coefficient). The correlation coefficient, r, is the strength of the relationship, i.e. how closely the points lie to a line drawn through the plotted data. If the points are very scattered the two variables are not well correlated; if all the points lie on the line of best fit the two variables are well correlated. r varies from -1 (negative correlation) through 0 (no correlation) to +1 (positive correlation). Correlation does not tell you how much one variable will change in response to the other, and it does not show cause and effect.

Linear regression is used to predict how one variable changes when a second variable is altered, i.e. it quantifies the relationship between two variables by describing the line of best fit in mathematical terms. The regression coefficient gives the slope of the line (the change in one value per unit change in the other). The line is usually fitted using the method of least squares. Regression is useful when we want to use one measure as a proxy for another, e.g. fasting glucose for HbA1c.
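A short scipy sketch of the same ideas, with made-up dose and blood pressure figures; pearsonr, spearmanr and linregress are the standard scipy routines for Pearson's r, Spearman's rho and least-squares regression:

from scipy import stats

# Made-up data: ACE inhibitor dose (mg) and systolic blood pressure (mmHg)
dose = [2, 4, 6, 8, 10, 12]
bp   = [158, 154, 149, 147, 141, 138]

# Parametric = Pearson's; skewed/non-parametric = Spearman's (rank-based)
pearson_r, pearson_p = stats.pearsonr(dose, bp)
spearman_rho, spearman_p = stats.spearmanr(dose, bp)

# Least-squares linear regression: the slope is the regression coefficient,
# i.e. the change in blood pressure per unit change in dose
fit = stats.linregress(dose, bp)

print(f"Pearson r = {pearson_r:.2f}, Spearman rho = {spearman_rho:.2f}")
print(f"BP changes by {fit.slope:.1f} mmHg per mg increase in dose")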
Survival Time

Cumulative survival over time in a cohort of patients is shown in a Kaplan-Meier survival curve, and the curves can be used to compare survival in different groups of patients. Regression models can be used to calculate the relative hazard ratio for events occurring in each group (i.e. you can find out the impact on survival that individual factors have and give it a figure = the relative hazard ratio). (I had never heard of a Kaplan-Meier survival curve when I did the AKT, but I'm sure most of us can work out what this graph shows even if we do not know the name of it!)

Odds and Rates

Basic definition of odds and rates: if you have 6 red balls and 4 blue balls in a bag, the odds (a ratio) of picking out a red ball are 6:4, or 3:2, or 1.5. The probability (a rate) of picking a red ball is 6/10, or 0.6. Probability can be expressed either as a proportion, as above (0.6), or as a percentage by multiplying by 100 (60%). Odds are never expressed as a percentage. When doing calculations with rates make sure you know whether you are dealing with a proportion or a percentage; it is often best to convert to a percentage only at the end of the calculation to avoid confusion.

Consider a study where a treatment or exposure is given to one group and a placebo or no exposure to the other group, and disease incidence is then measured in each group. The difference between the two groups can be expressed in various ways, and different ways are appropriate for different types of study.

                                          Disease    No disease
Treatment or exposure (experimental)         a           b
Placebo or no exposure (control)             c           d

N.B. the control group may not be taking a placebo but an established drug: it would be unethical, for example, to compare a new chemotherapy agent with placebo if there is already an established treatment.

Basic expression of the data:
- Control event rate (CER) = rate of disease in the placebo group = c/(c+d) (x100 to express as a percentage) = those in the control group who have the disease / all those in the control group.
- Experimental event rate (EER) = rate of disease in the treated/exposed group = a/(a+b) (x100 to express as a percentage) = those in the experimental group who have the disease / all those in the experimental group.

Ideally, if the factor being studied is a treatment the EER should be lower than the CER; if exposure to a risk factor is being studied the EER should be higher than the CER. (This may not always be the case if the treatment is actually worse than placebo or the possible risk factor turns out to be protective!)

- Control event odds (CEO) = c/d = those in the control group who have the disease / those in the control group who do not (never expressed as a percentage).
- Experimental event odds (EEO) = a/b = those in the experimental group who have the disease / those in the experimental group who do not (never expressed as a percentage).

These basic expressions of the data can be manipulated to show differences between the two groups:
- Absolute risk reduction (ARR, attributable risk) is the most basic method: simply subtract one rate from the other to show the proportion (or percentage) of cases that can be attributed to the exposure, or the reduction that can be attributed to the treatment. (For a treatment, where the EER is likely to be less than the CER, ARR = CER - EER; for an exposure, where the EER is likely to be greater than the CER, it is EER - CER.)
- Relative risk reduction (relative risk difference) = ARR/CER (x100 to express as a percentage). Make sure the ARR and CER are both expressed as proportions, or both as percentages, before calculating this.
- Risk ratio (relative risk) = EER/CER. This is not expressed as a percentage because it is a ratio of the two rates (1 = no difference between the two groups).
- Odds ratio = EEO/CEO (never expressed as a percentage; 1 = no difference between the two groups).

Another number that can be calculated from this basic data is the Number Needed to Treat: NNT = 1/ARR (x100 if the ARR was expressed as a percentage). I haven't been able to come up with a neat explanation for this equation, but one way to see it is that if the ARR is 0.25, on average 1 extra patient in every 4 treated avoids the event, so 1/0.25 = 4 patients must be treated to prevent one event. A worked sketch of these calculations follows.
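Pulling these definitions together, a minimal Python sketch with a made-up a/b/c/d table for a treatment trial:

# Made-up 2x2 table for a treatment trial, counting disease events
a, b = 10, 90    # experimental group: 10 with disease, 90 without
c, d = 20, 80    # control group: 20 with disease, 80 without

cer = c / (c + d)          # control event rate         = 0.20
eer = a / (a + b)          # experimental event rate    = 0.10
ceo = c / d                # control event odds         = 0.25
eeo = a / b                # experimental event odds    = about 0.11

arr = cer - eer            # absolute risk reduction    = 0.10
rrr = arr / cer            # relative risk reduction    = 0.50 (50%)
rr  = eer / cer            # risk ratio (relative risk) = 0.50
odds_ratio = eeo / ceo     # odds ratio                 = about 0.44
nnt = 1 / arr              # number needed to treat     = 10

print(f"CER {cer:.2f}, EER {eer:.2f}, ARR {arr:.2f}, RRR {rrr:.2f}")
print(f"RR {rr:.2f}, OR {odds_ratio:.2f}, NNT {nnt:.0f}")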
Types of Study

- Cohort study: looks at outcomes in groups divided by their risk factors; used to assess harm or risk (e.g. the future risk of diabetes in breast-fed versus non-breast-fed babies). Use relative risk.
- Case-control study: looks back at risk factors in patients with and without a condition; used to assess the aetiology of rare conditions (e.g. were teenagers with leukaemia more likely to have lived near an electricity pylon as a child?). Use the odds ratio.
- Controlled trial: looks at outcomes in groups given different interventions; used to assess the benefit of treatments (e.g. is perindopril better than placebo at lowering blood pressure?). Use relative risk.
- Cross-sectional survey: counts the number of people in the population with the disease, with or without certain risk factors; used to assess prevalence.
- Meta-analysis: secondary research, compiling the results of other trials.
- Pragmatic trials use real-life scenarios; explanatory trials use a controlled situation.

Depending on the situation, a trial of a treatment may need to show that it is no worse than, the same as, or better than another treatment. These are described as non-inferiority, equivalence and superiority trials. Non-inferiority trials are the cheapest as they need the fewest participants; showing superiority is expensive because a large number of participants is required. Non-inferiority or equivalence may be sufficient for a drug company to demonstrate if their drug is cheaper, has fewer side effects or is easier to take than the established alternative. If a drug is more expensive or has more side effects than the established alternative then, to persuade people to use it, it must be shown to be superior.

Graphs

Forest plot (= blobbogram): used for meta-analysis. Each square shows the size of a study, and the length of the line through it shows the confidence interval. The diamond is the summary of all the data, with each end marking the confidence interval. If the line crosses OR or RR = 1 then that study's result is not significant.

Funnel plot: (N.B. a funnel plot can be drawn in any orientation.) Study size is plotted against effect size (RR or OR) for all studies looking at the same variables. If there is a gap where small studies with an unwanted outcome (e.g. treatment worse than placebo) should be, this suggests publication bias, i.e. studies with an unwanted outcome not being published because the results would be damaging to the pharmaceutical company. Large studies tend to be published irrespective of the result.
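As a rough illustration of the forest plot layout described above, here is a minimal matplotlib sketch using made-up odds ratios and confidence intervals (a real meta-analysis package would also scale the squares by study size and add the summary diamond):

import matplotlib.pyplot as plt

# Made-up meta-analysis data: odds ratio with 95% CI for three studies
studies = ["Study A", "Study B", "Study C"]
odds_ratios = [0.8, 0.6, 1.1]
ci_low = [0.5, 0.45, 0.7]
ci_high = [1.3, 0.8, 1.7]

fig, ax = plt.subplots()
for i, (or_value, lo, hi) in enumerate(zip(odds_ratios, ci_low, ci_high)):
    ax.plot([lo, hi], [i, i], color="black")                   # line = confidence interval
    ax.plot(or_value, i, marker="s", color="black")            # square = point estimate

ax.axvline(1, linestyle="--", color="grey")                    # OR = 1: no difference
ax.set_yticks(range(len(studies)))
ax.set_yticklabels(studies)
ax.set_xlabel("Odds ratio (95% CI)")
plt.show()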
Tests

Almost a separate part of statistics from trials is the statistics of tests. A test can be an investigation, a question or an examination finding that gives a yes/no answer (e.g. Hb < 10, 'do you have diarrhoea?', presence of a breast lump). We all know that just because a test says 'yes' it does not mean you have the condition (having a breast lump does not equal cancer, and a positive mammogram also does not equal cancer). There are ways of showing what a test result means and how likely someone is to have a particular condition based on the answers to various tests.

                Test +             Test -
Disease +       True positive      False negative
Disease -       False positive     True negative

When doing questions you can either draw the chart, put the letters a, b, c and d into the grid and learn the equations, or (as I prefer) use the descriptions 'true positive' etc. I find that, even if I draw the chart the wrong way round, I can still answer the questions and the equations start to make sense. You just need to consider whether the test is positive or negative and whether the result is correct (true) or incorrect (false).

Sensitivity = TP/(TP+FN) = the proportion of true positives correctly identified (look at all the people who actually have the disease and work out what proportion of them are picked up by the test). A high sensitivity is important if you want to make sure everyone with the condition is picked up by the test, e.g. for screening.

Specificity = TN/(TN+FP) = the proportion of true negatives correctly identified (look at all the people who don't have the disease and work out what proportion are correctly identified as negative by the test). A high specificity is important if you don't want to diagnose people with a condition they don't have, e.g. for a diagnostic test after screening.

The sensitivity and specificity of a test are constant, so they do not vary with disease prevalence.

A ROC (receiver operating characteristic, or relative operating characteristic) curve shows the relationship between the sensitivity and specificity of a test: as sensitivity increases, specificity falls, and vice versa. A ROC curve plots true positives (sensitivity) against false positives (1 - specificity) for different cut-off values of a diagnostic test. It lets you choose the diagnostic cut-off that gives the best compromise between sensitivity and specificity, which is the point closest to the top left-hand corner of the curve.

Positive predictive value (PPV) = TP/(TP+FP) = the proportion of patients with a positive test result who are correctly diagnosed (look at all the people who tested positive and work out what proportion of them actually have the disease). This takes the prevalence of the disease into account: if a disease is common and the test result is positive, it is likely that the person has the disease; if the disease is rare, even a positive test may leave it unlikely that they have the disease.

Negative predictive value (NPV) = TN/(TN+FN) = the proportion of patients with a negative test result who are correctly diagnosed (look at all the patients with a negative test result and work out what proportion of them don't have the disease). This is also affected by prevalence.

To remember sensitivity, specificity, PPV and NPV: sensitivity looks at those who have the disease (TP and FN), specificity looks at those who don't have the disease (TN and FP), PPV looks at those who have a positive test (TP and FP) and NPV looks at those who have a negative test (TN and FN). Both figures go on the bottom of the equation, then put the 'true' one on top.
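The four measures above as a small Python sketch, using a made-up 2x2 table of test result against true disease status:

# Made-up 2x2 table: test result vs true disease status
tp, fn = 90, 10      # disease present: 90 test positive, 10 test negative
fp, tn = 45, 855     # disease absent: 45 test positive, 855 test negative

sensitivity = tp / (tp + fn)   # proportion of those WITH the disease who test positive
specificity = tn / (tn + fp)   # proportion of those WITHOUT the disease who test negative
ppv = tp / (tp + fp)           # proportion of positive tests that are correct
npv = tn / (tn + fn)           # proportion of negative tests that are correct

print(f"sensitivity {sensitivity:.2f}, specificity {specificity:.2f}")
print(f"PPV {ppv:.2f}, NPV {npv:.2f}")
# Note: if the disease becomes rarer, sensitivity and specificity stay the same,
# but the PPV falls and the NPV rises, because they depend on prevalence.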
Pre- and post-test probability

It is possible to assess an individual's chance of having a condition based on a test result and background information. The first number needed is the likelihood ratio:
- Likelihood ratio for a positive test result (LR+) = sensitivity/(1 - specificity)
- Likelihood ratio for a negative test result (LR-) = (1 - sensitivity)/specificity

These are constant for a specific test (i.e. they do not depend on prevalence) and are used to convert the pre-test probability into the post-test probability. The way I remember the formulas is that sensitivity is always on top and specificity on the bottom, then the '1 minus' goes on the top for LR- and on the bottom for LR+.

Pre-test probability is the chance someone has of having the condition before a test is done. For a random person on the street with no symptoms, the pre-test probability is the prevalence of that condition. However, as soon as someone develops symptoms and presents to the GP or A&E, their chance of having the condition increases, and we often don't know the precise number. In practice this concept is therefore perhaps most useful for screening, where you have an unselected population whose pre-test probability of having the condition is the prevalence of the condition (e.g. the pre-test probability of a baby having CF is 1/2500). Given that tests are rarely perfect, you then want to know the probability of having the condition after a positive or negative test. A positive test increases the probability of having the condition; a negative test reduces it.

There are two methods for getting from pre-test to post-test probability:
1. Use a nomogram, which is a graphical way of using the likelihood ratio to convert pre-test to post-test probability.
2. Convert the pre-test probability to pre-test odds, calculate the post-test odds using the likelihood ratio, then convert the post-test odds back to a post-test probability.

The formulas for the second method:
- Pre-test probability -> pre-test odds: pre-test odds = pre-test probability/(1 - pre-test probability)
- Pre-test odds -> post-test odds: post-test odds = pre-test odds x LR
- Post-test odds -> post-test probability: post-test probability = post-test odds/(1 + post-test odds)

This calculation can be applied multiple times: the post-test probability after one test becomes the pre-test probability before the next test.

Worked example for Tests (using completely made-up numbers)

Prevalence of HIV in the population: 10%. The test has been evaluated in 100 people known to be HIV positive and 100 known to be HIV negative:

            Test +    Test -
HIV +         98         2
HIV -         20        80

Sensitivity = TP/(TP+FN) = 98/(98+2) = 0.98
Specificity = TN/(TN+FP) = 80/(80+20) = 0.8

1. If the test result is positive, what is the probability of being HIV positive?
LR+ = sensitivity/(1 - specificity) = 0.98/(1 - 0.8) = 4.9
Pre-test probability = 0.1 (i.e. the prevalence)
Pre-test odds = pre-test probability/(1 - pre-test probability) = 0.1/(1 - 0.1) = 0.111
Post-test odds = pre-test odds x LR+ = 0.111 x 4.9 = 0.544
Post-test probability = post-test odds/(1 + post-test odds) = 0.544/(1 + 0.544) = 0.35 = 35%
Therefore only 35% of people who test positive will actually be HIV positive.

2. If the test result is negative, what is the probability of being HIV positive?
LR- = (1 - sensitivity)/specificity = (1 - 0.98)/0.8 = 0.025
Pre-test probability = 0.1
Pre-test odds = 0.111 (as above)
Post-test odds = pre-test odds x LR- = 0.111 x 0.025 = 0.00278
Post-test probability = post-test odds/(1 + post-test odds) = 0.00278/(1 + 0.00278) = 0.0028 = 0.28%
Therefore only 0.28% of people who test negative will actually be HIV positive.
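The worked example can be reproduced directly in Python; this sketch just applies the likelihood ratio formulas above to the same made-up figures:

sensitivity = 0.98
specificity = 0.80
prevalence = 0.10          # pre-test probability for an unselected person

lr_pos = sensitivity / (1 - specificity)     # = 4.9
lr_neg = (1 - sensitivity) / specificity     # = 0.025

def post_test_probability(pre_test_prob, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_test_odds = pre_test_prob / (1 - pre_test_prob)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)

print(post_test_probability(prevalence, lr_pos))   # about 0.35, i.e. 35% after a positive test
print(post_test_probability(prevalence, lr_neg))   # about 0.0028, i.e. 0.28% after a negative test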