Surviving Pharmacy Residency Research: Tips and Tricks for Statistical Planning

© Fraser Health Authority, 2007

The Fraser Health Authority (“FH”) authorizes the use, reproduction and/or modification of this publication for purposes other than commercial redistribution. In consideration for this authorization, the user agrees that any unmodified reproduction of this publication shall retain all copyright and proprietary notices. If the user modifies the content of this publication, all FH copyright notices shall be removed, however FH shall be acknowledged as the author of the source publication. Reproduction or storage of this publication in any form by any means for the purpose of commercial redistribution is strictly prohibited.

This publication is intended to provide general information only, and should not be relied on as providing specific healthcare, legal or other professional advice. The Fraser Health Authority, and every person involved in the creation of this publication, disclaims any warranty, express or implied, as to its accuracy, completeness or currency, and disclaims all liability in respect of any actions, including the results of any actions, taken or not taken in reliance on the information contained herein.

FH Health Research Intelligence Unit: How can we help?

Grant Facilitator-Writer
Conducting a search for funding opportunities.
Automatic notification of new funding sources and deadlines.
Identifying a research team.
Preparing letters of intent.
Identifying resources required for conducting research.
Formulating the research budget.
Writing the grant application in collaboration with researchers.
Understanding FH and funding agency requirements regarding preparation of specific documents.

FH Health Research Intelligence Unit: How can we help?

Epidemiologist
Specifying the research goal, objectives and hypothesis.
Identifying measurable outcomes.
Specifying the variables for analysis.
Identifying sources of data.
Developing data collection tools for quantitative or qualitative studies.
Developing the statistical analysis plan.
Understanding how to use statistical software, such as SPSS.

Workshop Outline
Research 101- Basic Research Steps
Research Question Refinement
Common Study Designs- Resource
Levels of Data
Power and Sample Size
Statistical Test Selection- Exercise
Data Reporting- Resource
Simple Stats with Excel- Resource

Pharmacy Residency Project
1) Develop a research question
2) Conduct thorough literature review
3) Re-define research question or hypothesis
4) Design research methodology/study
5) Create research proposal
6) Apply for funding
7) Apply for ethics approval
8) Collect and analyze data
9) Draw conclusions and relate findings

Research Question Refinement
The research question describes, in operational terms, what you think will happen in the study.

Good Versus Bad Research Question
Bad: Are patients who take drug X more likely to experience episodes of delirium?
Good: Do patients who received medication X between September 2008 and November 2008 experience more episodes of delirium as compared to patients who received drug Y during the same time period?

Classification of Research Studies
[Diagram: Research Studies divide into Observational and Experimental; Observational studies divide into Descriptive and Analytic.]
Observational Studies:
Descriptive Studies: Focus on describing populations and describing the relationship between variables.
Analytic Studies: Make inferences about the population based on a random sample.
Experimental Studies: Test relationships between exposures and outcomes. Investigator has direct control over study condition and exposure status.

Hierarchy of Studies
Type of study is selected according to the purpose of research.
Experimental Studies
Analytic Studies
Descriptive Studies
Levels of Evidence Handout- Research Design Hierarchy

Probability Sampling Methods: Random
There are several methods to choose from:
Simple random sampling.
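As a minimal sketch of simple random sampling, assuming the sampling frame is nothing more than a numbered list of units: every unit gets the same chance of being drawn, and no unit is drawn twice. The function name `simple_random_sample`, the frame of 200 units and the sample size of 40 are illustrative assumptions, not part of the workshop material.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units from the frame without replacement, each equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical sampling frame: units numbered 1 to 200.
frame = list(range(1, 201))
sample = simple_random_sample(frame, 40, seed=1)
print(sorted(sample))
```

Recording the seed alongside the sample lets someone else reproduce exactly which units were selected.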
Probability Sampling Methods: Stratified
Stratified sampling (divide the population into non-overlapping strata and sample from within each stratum independently).
Guarantees representation of all important groups.

Probability Sampling Methods: Systematic
Selecting the sample using an interval “k”, so that every kth unit in the frame is selected, is called systematic random sampling.

Probability Sampling Methods: Systematic
Steps to achieve a systematic random sample:
1. Number the units in the population from 1 to N.
2. Decide on the n (sample size) that you want or need.
• k = N/n = the interval size.
3. Randomly select an integer between 1 and k.
4. Then take every kth unit.
Example:
1. N=200
2. n=40, take N/n, 200/40=5 (interval size).
3. Randomly select a number between 1 and 5 (let’s pick 4).
4. Begin with 4, and take every 5th unit.

Probability Sampling Methods: Cluster
Cluster sampling. Divide population into clusters and randomly sample clusters. Measure all units within sampled clusters.
Example: See blue areas on map. Not just geographic areas, could select hospitals, schools etc.

Non-Probability Sampling Methods
There are different types of non-probability sampling methods as well:
Convenience (not representative of population).
Purposive (certain group in mind).
Expert sampling (seek out specific expertise).
Snowball sampling (ask people to participate, they ask more people).
If you select non-probability sampling methods, the conclusions drawn from the study results apply only to that specific population.

Measurement: Levels of Data
The level of data will dictate which statistical test you should use.
Categorical = Data that is classified into categories and cannot be arranged in any particular order (e.g. apples and pears, gender, eye colour, ethnicity).
Ordinal = Data that is ordered, but the distance between intervals is not always equal (e.g. low, middle and high income).
Continuous = Data with equal distance between each interval (e.g. 1, 2, 3; age).
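The systematic random sampling steps given earlier (number the units 1 to N, compute the interval k = N/n, pick a random start between 1 and k, then take every kth unit) can be sketched in Python. This is only an illustration: the function name `systematic_sample` and its parameters are assumptions, and it uses integer division for k, so any leftover units at the end of the frame are not sampled when N is not a multiple of n.

```python
import random


def systematic_sample(N, n, start=None, seed=None):
    """Return the unit numbers selected by systematic random sampling.

    Units are assumed to be numbered 1..N. The interval is k = N // n.
    If no start is given, one is drawn at random between 1 and k.
    """
    if n <= 0 or n > N:
        raise ValueError("n must be between 1 and N")
    k = N // n  # interval size
    if start is None:
        start = random.Random(seed).randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k")
    # Begin at the start unit and take every kth unit, keeping n units.
    return list(range(start, N + 1, k))[:n]


# Worked example from the slides: N=200, n=40, so k=5; start at unit 4.
sample = systematic_sample(200, 40, start=4)
print(sample[:5])   # [4, 9, 14, 19, 24]
print(len(sample))  # 40
```

Passing `start` explicitly reproduces the slide example; leaving it out draws the random start required by step 3.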
Statistics and Statistical Test Selection

Types of Statistics
Descriptive Statistics: Describe research findings (e.g. frequencies, averages).
Inferential Statistics: Make inferences about the population, based on a random sample.
In a random sample, each person/unit has an equal chance of being selected.
Allows generalizability to population.

Types of Variables
Variables can be classified as independent or dependent.
An independent variable is the variable you believe will influence your outcome measure.
A dependent variable is the variable that is dependent on or influenced by the independent variable(s). The dependent variable can also be the variable you are trying to predict.

Statistical Test Selection
Selecting the appropriate statistical test requires several steps. Test selection should be based on:
1) What is your goal? Description? Comparison? Prediction? Quantify association? Prove effectiveness? Prove causality?
2) What kind of data have you collected? What are the levels of data (nominal, ordinal, continuous)? Was your sample randomly selected?
3) Is your data normally distributed? Should you use a parametric or nonparametric test?
4) What are the assumptions of the statistical test you would like to use? Does the data meet these assumptions?

Parametric Tests
Parametric tests assume that the variable in question is from a normal distribution.
Non-parametric tests do not require the assumption of normality.
Most non-parametric tests do not require an interval level of measurement; they can be used with nominal/ordinal level data.

Assumptions
There are various assumptions for each test. Before you select a test, be sure to check the assumptions of each test. You will need to contact a consultant, or review statistical/research methods resources to find this information.
Some examples of common assumptions are:
The dependent variable will need to be measured on a certain level (e.g. interval level).
The independent variable(s) will need to be measured on a certain level (e.g. ordinal level).
The population is normally distributed (not skewed).
If your data do not meet the assumptions for a specific test, you may be able to use a non-parametric test instead.

Statistical tests by goal (rows) and type of data (columns):
Goal | Measurement (Normal Population) | Ordinal, or Non-Normal Population | Binomial (Two Possible Outcomes) | Survival Time
Describe one group | Mean, SD | Median, interquartile range | Proportion | Kaplan Meier survival curve
Compare one group to a hypothetical value | One-sample t test | Wilcoxon test | Chi-square or Binomial test ** |
Compare two unpaired groups | Unpaired t test | Mann-Whitney test | Fisher's test (chi-square for large samples) | Log-rank test or Mantel-Haenszel*
Compare two paired groups | Paired t test | Wilcoxon test | McNemar's test | Conditional proportional hazards regression*
Compare three or more unmatched groups | One-way ANOVA | Kruskal-Wallis test | Chi-square test | Cox proportional hazard regression**

Statistical Test Selection Group Exercise
Using your tables, select the appropriate statistical tests for 10 research scenarios.

During the group exercise…
Steps to choose the appropriate statistical method for the data analysis:
1. Identify whether the research problem raises the question of describe, relate (association), or compare (difference).
2. Identify the levels of measurement in the research question (Nominal/Categorical, Ordinal/Rank, Continuous/Evenly spaced).
3. Identify the number of variables, or samples being described, related, or compared.
4. Identify whether comparison samples are related (analyze same group before and after) or independent (not at all related, looking at different groups).
5. Choose the appropriate statistical tool for the data and situation using the decision tree in the handout.

1.
A pilot experiment designed to test the effectiveness of a new approach to electrode placement for Electroconvulsive Therapy (ECT) has been conducted over a one-year time period in the Fraser Health Authority. Patients from two different mood disorder clinics participated in this study. Patients from Clinic X received ECT according to current practice guidelines. Patients from Clinic Y received a new exploratory ECT treatment. Patients in each clinic were matched for age, gender, and type of disorder. A random sample of 30 matched pairs of patients was selected for inclusion in the study. At the end of one year, patients were administered a memory test yielding a total score out of 100. Dr. Vasdil would like to know what statistical procedure needs to be selected to test for differences among groups of patients on the memory test.
What is the question: Compare
How many samples: 2
Related or independent: Independent
What is the level of measurement: Continuous
How many dependent variables: 1
Test: T-test

Sample Size
There are several rules of thumb for determining sample size.
1) It’s a good idea to have a minimum of 30 cases (as a total group, or if comparing groups, 30 for each group). If you have fewer you can use a non-parametric test, but it is still better to have close to 30 cases.
2) If using regression, it is best to have between 10 and 50 cases per independent variable.
3) If you are validating a survey, it is never good to have more questions than cases.
4) If the total population that you are examining is less than 30, use all of them.
5) For pilot studies, the recommendation is a sample size of 12 per group.
6) For surveys, a sample size of 400 per group can do just about anything.
7) For surveys, a 30% response rate is the bare minimum.
Note: For a precise sample size estimate you will need to conduct a power analysis.

Statistical Power
Power is the capability of a statistical test to correctly detect a significant effect if it exists.
Takes a value between 0 and 1 (often expressed as a percentage).
Power = 1 - β (β = probability of a Type II error).
Type II error: the error of failing to reject a false null hypothesis (missing an effect that exists).
Type I error: the error of rejecting a true null hypothesis (finding an effect that does not exist).

Types of Power
A Priori: Conducted before the study commences (at proposal stage).
Post Hoc: Conducted after the study has been completed.
Easy way to increase power?
Increase sample size
Increase effect size

Components Involved in Power Calculation
Sample Size: Number of cases.
Effect Size: Magnitude of the trend and variation.
Alpha Level: Probability of concluding that an effect exists when it is due to chance alone (.05 or .01). Also known as the Type I error rate, or the probability of rejecting a true null hypothesis.
Power level: 80-90% is common.
One or two-tailed test: two-tailed is common.

Components Involved in Power Calculation
Sample Size: What we want to find out.
Effect Size: Magnitude of the trend…but what if you don’t know? Look to pilot data or literature. Keep in mind, the smaller the effect size, the larger the sample size required.
Alpha Level: .05
Power level: 80-90%

Important Consultation Information
What is your research question?
Components of power calculation
Levels of data (nominal, ordinal, continuous)
Sampling plan

Data Organization: Codebook
What is a codebook?
A codebook is a log of your variables (and levels of data) and how you will code them.
A codebook will help everyone understand the coding schemes to ensure that they are on the same page!
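A codebook can also be kept in machine-readable form so that entered data can be checked against it. The sketch below is one possible layout, not a prescribed one: it assumes the three variables and codes used in the workshop's codebook example (age, sex, happiness, with 97/98/99 as missing codes), while the dictionary structure and the helper name `check_value` are hypothetical.

```python
# Missing-value codes shared by every variable in the codebook.
MISSING = {97: "Incorrect response", 98: "No response", 99: "Not Applicable"}

# Each variable records its label, level of data, and valid codes.
CODEBOOK = {
    "age": {
        "label": "age",
        "type": "ordinal",
        "codes": {1: "10-20 years", 2: "21-30 years", 3: "31-40 years",
                  4: "41-50 years", 5: "51+ years"},
    },
    "sex": {
        "label": "sex",
        "type": "nominal",
        "codes": {1: "male", 2: "female"},
    },
    "happiness": {
        "label": "happiness at work",
        "type": "ordinal",
        "codes": {1: "not happy", 2: "somewhat happy", 3: "very happy"},
    },
}


def check_value(variable, value):
    """Classify a coded value as 'valid', 'missing', or 'invalid'."""
    if value in MISSING:
        return "missing"
    if value in CODEBOOK[variable]["codes"]:
        return "valid"
    return "invalid"


# A hypothetical data row in which age was entered as a raw age, not a code.
row = {"age": 57, "sex": 2, "happiness": 2}
problems = [name for name, value in row.items()
            if check_value(name, value) == "invalid"]
print(problems)  # ['age']
```

Running a check like this before analysis catches entries that do not follow the agreed coding scheme, such as a raw age typed where an age-group code was expected.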
Data Processing and Analyses: Codebook Example
Variable Name | Variable Label | Values | Coding | Missing | Variable Type
age | age | 1,2,3,4,5 | 1=10-20 years, 2=21-30 years, 3=31-40 years, 4=41-50 years, 5=51+ years | 97=Incorrect response, 98=No response, 99=Not Applicable | Ordinal
sex | sex | 1,2 | 1=male, 2=female | 97=Incorrect response, 98=No response, 99=Not Applicable | Nominal
happiness | happiness at work | 1,2,3 | 1=not happy, 2=somewhat happy, 3=very happy | 97=Incorrect response, 98=No response, 99=Not Applicable | Ordinal

Spreadsheet Example
ID# | Age | Sex | Happiness
1 | 1 | 1 | 2
2 | 2 | 2 | 2
3 | 3 | 1 | 2
4 | 57 | 2 | 2
5 | 45 | 2 | 3
6 | 66 | 2 | 3
7 | 2 | 2 | 3
8 | 88 | 2 | 3

Data Analysis with Excel
Most simple analyses can be done using Excel, including correlation, regression and even random number generation.
Install the data analysis pack. Go to Tools, Add-ins, and add the ‘Analysis ToolPak’.
Create worksheet and codebook.
Choose statistical test.
Follow commands in help menu.

Data Analysis with Excel
http://home.ubalt.edu/ntsbarsh/excel/excel.htm

Data Reporting and Presentation of Data
Graphical summaries are a great way to present your data.
Excel is great for creating tables and graphs.
The type of data you have will determine the type of graphical summary you should use.

Data Reporting and Presentation of Descriptive Data
Categorical data: Frequency Tables and Bar Charts.
Example:
Fruit | Count | Percent | Valid Percent
Pineapples | 4 | 20% | 21%
Apples | 5 | 25% | 26%
Oranges | 10 | 50% | 53%
Unknown | 1 | 5% |
Total | 20 | 100% | 100%

Data Reporting and Presentation of Descriptive Data
[Bar chart: Fruit Study; categories Unknown, Oranges, Apples, Pineapples; count axis from 0 to 10]

Data Reporting and Presentation of Descriptive Data
Continuous Data: Tables and Histograms
Age | Count | Percent
20-30 | 4 | 20%
31-40 | 5 | 25%
41-50 | 10 | 50%
51-60 | 1 | 5%
Total | 20 | 100%

Data Reporting and Presentation of Descriptive Data
[Histogram: age groups 20-30, 31-40, 41-50, 51-60; count axis from 0 to 10]

What is the difference between a Histogram and a Bar Chart?
Histogram: For continuous data where data are divided into contiguous class intervals (in other words, connected in an unbroken sequence).
Bar Chart: For categorical data where categories are not contiguous.

Measures of Central Tendency
Reporting averages:
Categorical data = Mode
Ordinal data = Median
Continuous data = Mean
If there are outliers (or extreme values), report the median instead of the mean.

Reporting Inferential Stats
Handout Resource- APA Guidelines
http://www.ilstu.edu/~jhkahn/apastats.html

Reporting Inferential Stats
It’s important to include means, standard deviations and sample size in your results section.
Example: Correlation
Variable X was strongly correlated with Variable Y, r=.59, p<.01.

Important to Keep your Audience in Mind
Residency Project
Publication
Departmental Report

Aaron: TCPS certification for residents reminder…

Questions?