A Developer's Guide to Undertaking Rapid Evidence Assessments (REAs)
Version 2.0

© Commonwealth of Australia 2014
This work is copyright. Apart from any use as permitted under the Copyright Act 1968, no part may be reproduced by any process without prior written permission from the Commonwealth. Requests and inquiries concerning reproduction and rights should be addressed to the publications section, Department of Veterans' Affairs, or emailed to [email protected].

Acknowledgements

This guide has been prepared by the Australian Centre for Posttraumatic Mental Health with assistance from Optum, Adelaide University, and the Department of Veterans' Affairs.

For citation: Varker, T., Forbes, D., Dell, L., Weston, A., Merlin, T., Hodson, S. & O'Donnell, M. (2014). A Developer's Guide to Undertaking Rapid Evidence Assessments (REAs). Guide prepared for the Department of Veterans' Affairs. Australian Centre for Posttraumatic Mental Health.

Contents

Acknowledgements
Contents
Overview - What is a Rapid Evidence Assessment?
Putting together the team
The REA methodology
Development Phase
  1. Question development
  2. Methods development
  3. Information retrieval/management
Processing Phase
  4. Screening step 1: titles/abstracts
  5. Screening step 2: full paper
  6. Assessing the quality of the evidence
Reporting Phase
Recommended reading
References
Appendix 1
Appendix 2
Appendix 3

Overview - What is a Rapid Evidence Assessment?
A Rapid Evidence Assessment (REA) is a research methodology which uses the same methods and principles as a systematic review, but makes concessions to the breadth or depth of the process in order to suit a shorter timeframe. The advantage of REAs is that rigorous methods for locating, appraising and synthesising evidence from previous studies can be upheld while results are produced in a shorter time than that required for a full systematic review.

The purpose of an REA is to provide a balanced assessment of what is already known about a specific problem or issue. REAs are particularly useful for examining the strength of the evidence in a particular area, so that conclusions consider not only the outcomes of the studies, but the quality, quantity and design of the studies reviewed. The shorter timeframe, lower cost (relative to full systematic reviews), and evaluation of the strength of the evidence make REAs particularly helpful in informing policy and decision makers, program managers and researchers.

REAs use a number of strategies to make them rapid. These include asking a narrow question, limiting the timeframe in which the reviewed studies were published, and making concessions on how the published studies are synthesised. REAs often draw on existing high-quality guidelines or systematic reviews/meta-analyses; that is, they try to maximise the use of literature that has already been synthesised, in order to minimise time and cost.

To identify whether an REA is the appropriate methodology for a specific question, it is important to be aware of the limitations of REAs. The very features that make REAs rapid can also be limitations. For example, restricting the time period from which studies are reviewed may mean that important studies are not included, and the narrowness of the question may exclude other relevant studies. REAs are often limited to studies published in peer-reviewed journals, which omits literature such as unpublished pilot studies, difficult-to-obtain material and/or foreign-language studies. REAs are therefore subject to any reporting biases present in the peer-reviewed literature. While it is important to acknowledge these limitations, there are situations where the advantages of an REA outweigh its limitations, and an REA is the appropriate review methodology.

It is important to acknowledge, however, that while the use of REAs is growing, there is a lack of published studies that have specifically scrutinised and reported the methodologies used in REAs in any detail [1]. If REAs are not conducted and reported in a transparent way, it is impossible to determine the validity, appropriateness and, ultimately, the utility of the resulting review [2].

The purpose of this guide is to provide a framework for undertaking an REA. It provides a transparent description of the phases that may be considered when undertaking an REA, from question development through to evaluating the body of literature.

Putting together the team

A significant part of the success of an REA depends on whether the question the review is able to answer is the same as the question the end user expected to have answered. REAs by their very nature require a well-operationalised and specific question, and it is critical that the end user is clear about what will, and importantly what will not, be answered by the question.
It is essential that the end user is aware that the wider the question, the more time (and therefore cost) the REA will take. The REA team that manages and conducts the review should have a range of skills, ideally including expertise in systematic and/or rapid review methods, information retrieval, and the relevant clinical/topic area. The end user should be consulted when operationalising the question, in the development of inclusion and exclusion criteria for the literature, and in determining the quantity and type of final reports to be produced. Communication between the REA team and the end user should be dynamic throughout the entire review.

The REA methodology

The REA methodology balances the level of confidence in the findings of the review against the time the review will take to conduct and the cost that will be incurred. The methodology consists of a number of phases (identified in Table 1 below) and is based on the steps used to conduct a systematic review [3]. Developing an REA in the manner outlined below ensures that the REA is conducted in a rigorous, replicable and, most importantly, transparent way. An REA protocol should be developed for each question, to document the specific processes for each phase below.

Table 1: The process of a Rapid Evidence Assessment

Development Phase
  1. Question development
  2. Methods development
     a) Develop inclusion criteria
     b) Search strategy
  3. Information retrieval/management

Processing Phase
  4. Screening step 1 of 2: titles/abstracts (retrieval of papers)
  5. Screening step 2 of 2: full paper (data abstraction)
  6. Assess the quality of the evidence

Reporting Phase
  7. Report results

Development Phase

1. Question development

The first step within the development phase of the REA is to deconstruct the question into specific components, an example of which is outlined below. This process of question development is designed to clearly define the scope of the question to be addressed.

Population Intervention Comparison Outcome (PICO) formulation

Developing a Population Intervention Comparison Outcome (PICO) framework in the initial stages can help to structure, contain, and set the scope for the research question. Inclusion of a comparison component depends on the question asked, and may not be appropriate for all question types. The key points to consider when developing the PICO for a research question, and a worked example, are provided in the tables below. A sketch of how a PICO question might be recorded in structured form follows Table 2.

Table 2: The PICO framework for question development

P - Patient, Problem, Population: What are the important characteristics of the patient? What is a description of the problem? What is a description of the population? Consider disease or health status, age, race, sex, previous ailments, current medications.

I - Intervention: What is the specific diagnostic test, treatment, adjunctive therapy, or medication of interest?

C - Comparison (optional): What alternative diagnostic test, treatment, or medication is being considered? NOTE: The Comparison is the only optional component. An Intervention can be examined without alternatives, and in some cases there may not be an alternative.

O - Outcome: What are the result(s) of what can be accomplished, improved, or affected? These should be measurable and may consist of relieving or eliminating specific symptoms, or improving or maintaining function. When defining the outcome, "more effective" is not acceptable unless it describes how the intervention is more effective.
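The PICO components lend themselves to a structured record that the REA team can reuse when drafting search strategies and screening criteria. The following is a minimal sketch in Python (the guide itself prescribes no software); the class and field names are illustrative assumptions, not part of the REA methodology.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PICOQuestion:
    """A structured record of a PICO-framed review question."""
    population: List[str]                    # key population characteristics
    intervention: str                        # test/treatment/medication of interest
    comparison: Optional[List[str]] = None   # optional alternatives (may be None)
    outcome: List[str] = field(default_factory=list)  # measurable outcomes

    def as_question(self) -> str:
        """Render the components as a single review question."""
        comp = f" compared with {'; '.join(self.comparison)}" if self.comparison else ""
        return (f"In {'; '.join(self.population)}, is {self.intervention}{comp} "
                f"effective in achieving {'; '.join(self.outcome)}?")

# Worked example mirroring Table 3
q = PICOQuestion(
    population=["adults (>= 18 years) with diagnosed major depressive episode"],
    intervention="cognitive behavioural therapy",
    comparison=["waitlist/no treatment", "treatment as usual", "pharmacotherapy alone"],
    outcome=["reduction in depression symptoms on validated measures"],
)
print(q.as_question())
```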
Table 3: Example of a research question and the PICO framework

RESEARCH QUESTION: What are the effective psychological interventions for adults with a diagnosis of depression?

P - Patient, Problem, Population:
  Age ≥ 18; Gender - no specification; Diagnosis - major depressive episode; Not undergoing any other psychological treatment for depression (treatment naive)

I - Intervention:
  Cognitive behavioural therapy

C - Comparison (optional):
  Waitlist/no-treatment/minimal attention; Treatment as usual; Pharmacotherapy alone; Attention/placebo control; Psychological intervention which is not cognitive behavioural therapy

O - Outcome:
  Reduction in depression symptoms on validated measures; included in RCTs or pseudo-RCTs only

Research question in "PICO" format: In adults with diagnosed major depressive episode, has cognitive behavioural therapy been shown to be effective in RCT or pseudo-RCT studies in reducing the symptoms of depression?

2. Methods development

a) Develop inclusion criteria

REAs are carried out more speedily than systematic reviews, but the inclusion and exclusion criteria need to be no less rigorous when it comes to determining conceptual boundaries. Examples of core inclusion and exclusion criteria for both empirical papers and high-quality guidelines are presented below. Appropriate inclusion and exclusion criteria need to be determined by the REA team and tailored to the specific question.

Table 4: Core inclusion and exclusion criteria for empirical papers

Inclusion criteria
1. Published, peer-reviewed research studies
2. Research papers that were published within the last 10 years*
3. Quantitative studies with outcome data that assess the key dependent variable
4. Human adults (i.e. ≥ 18 years of age)
5. English language

Exclusion criteria
1. Non-English papers
2. Papers more than 10 years old*
3. Papers where a full-text version is not readily available
4. Animal studies
5. Qualitative studies

* The timeframe of 10 years may be expanded/collapsed depending on the question.

Table 5: Core inclusion and exclusion criteria for high-quality guidelines

Inclusion criteria
1. Underpinned by a systematic review
2. Ratings of the strength of the evidence
3. Recommendations generated by a group of content experts

Exclusion criteria
1. Not underpinned by a systematic review
2. No ratings of the strength of the evidence
3. Recommendations not generated by a group of content experts

b) Search strategy

The search strategy is devised using relevant subject headings for each database, plus additional free-text words identified by an expert on the phenomenon of interest. The specific search terms used to conduct the search should be recorded and documented; a sketch of one way to assemble such a search string follows this section. To identify relevant literature for the REA, a systematic bibliographic search should be undertaken. Examples of databases to search include:

  National Guideline Clearinghouse (USA)
  Clinical Guidelines Portal (Australia)
  The Cochrane Library
  EMBASE
  MEDLINE (PubMed)
  PsycINFO

The REA methodology prioritises the use of guidelines, and systematic reviews with meta-analyses, in order to utilise pre-existing high-quality rigorous research, to limit unnecessary duplication, and to increase the rapidness of the review. Individual empirical papers may then be searched for from the data cut-off point in the guideline or systematic review/meta-analysis (where applicable).
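As a rough illustration of how documented search terms can be combined, the sketch below (in Python, with hypothetical term lists) ORs synonyms within each PICO concept and ANDs the concepts together. Real searches would additionally use database-specific subject headings (e.g. MeSH terms) and field tags, which are not modelled here.

```python
from typing import List

def build_search_string(concept_blocks: List[List[str]]) -> str:
    """Combine synonym lists into a boolean query: terms within a
    block are OR'd together; the blocks themselves are AND'd."""
    blocks = []
    for terms in concept_blocks:
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)

# Illustrative term blocks for the Table 3 example question
print(build_search_string([
    ["depression", "major depressive episode", "depressive disorder"],
    ["cognitive behavioural therapy", "cognitive behavioral therapy", "CBT"],
    ["randomised controlled trial", "randomized controlled trial", "RCT"],
]))
```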
Presence of guidelines or systematic reviews/meta-analyses

If guidelines and/or systematic reviews/meta-analyses are available, the following procedure may be applied:

I. Order of precedence: (1) guidelines; (2) systematic reviews/meta-analyses.
II. The most recent guideline or systematic review/meta-analysis should be subject to an assessment of quality. If the guideline or systematic review/meta-analysis does not satisfy the quality assessment, then the next most recent source should be assessed, in reverse sequential order (i.e. most recent to oldest), until the quality assessment criteria are met (NOTE: information about quality assessment is provided in the processing phase, item 6, below).
III. The guideline or systematic review/meta-analysis that satisfies the quality assessment determines the cut-off year for the primary research articles (e.g. if a meta-analysis had a data cut-off of January 2009, then primary research studies from 2008 and earlier would be excluded).

3. Information retrieval/management

Tools to support information retrieval and management include general word-processing packages, spreadsheets and databases. There are also a number of licence-based applications that may be used; their advantages can include the ability to develop quality control mechanisms that help minimise data entry errors.

Processing Phase

4. Screening step 1: titles/abstracts

Similar to a systematic review, the REA methodology employs a two-step screening process [4]. In the first step, records are screened for relevance against the inclusion criteria using the information available in the title and the abstract. Full-text versions are obtained for all studies which satisfy the screening criteria, and for studies whose inclusion cannot be definitively determined from the title and abstract. Only papers with readily available full-text versions should be included.

5. Screening step 2: full paper

In step two of processing, the full-text version of the paper is screened. At this stage a decision is made on whether the paper should be included or excluded, based on the pre-defined criteria.

Inter-rater reliability

A random selection of 20% of the articles processed at step two (full paper) may be checked by an independent reviewer. If an inter-rater agreement rate of less than 95% is found regarding inclusion/exclusion, the second reviewer should conduct an independent review of all full-text papers. Where there are discrepancies, discussions should be held between the reviewers to reconcile them. A code sketch of this check appears after the data extraction notes below.

Data extraction

Information about study characteristics and findings from the included studies is recorded during this stage. This information will be used to appraise the evidence. It is important to note that an REA is not designed to drill into the detail of individual studies to the same extent as a systematic review. Information such as study characteristics, participant characteristics, results and main findings may be extracted. Information can be recorded in a data extraction form, to ensure that it is collected in a standardised way.
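The 20%/95% inter-rater check described above is mechanical enough to express in code. Below is a minimal sketch, assuming each reviewer's include/exclude decisions are held in a dictionary keyed by article ID; the function names and data layout are illustrative, not prescribed by this guide.

```python
import random
from typing import Dict, Hashable, List

def sample_for_check(decisions: Dict[Hashable, str],
                     fraction: float = 0.20, seed: int = 1) -> List[Hashable]:
    """Randomly select a fraction (default 20%) of screened articles
    for independent re-review."""
    ids = sorted(decisions)
    k = max(1, round(len(ids) * fraction))
    return random.Random(seed).sample(ids, k)

def agreement_rate(r1: Dict[Hashable, str], r2: Dict[Hashable, str],
                   sample_ids: List[Hashable]) -> float:
    """Proportion of sampled articles on which both reviewers agree."""
    return sum(r1[a] == r2[a] for a in sample_ids) / len(sample_ids)

# If agreement falls below 95%, the guide asks the second reviewer
# to independently re-screen all full-text papers.
r1 = {1: "include", 2: "exclude", 3: "include", 4: "include", 5: "exclude"}
r2 = {1: "include", 2: "exclude", 3: "exclude", 4: "include", 5: "exclude"}
sample = sample_for_check(r1, fraction=1.0)  # tiny demo: check everything
full_rescreen_needed = agreement_rate(r1, r2, sample) < 0.95
print(full_rescreen_needed)  # True here: 4/5 = 80% agreement
```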
6. Assessing the quality of the evidence

The choice of process to evaluate the evidence is determined by the type of data used to address the question of interest. Two different processes are presented below. The first is appropriate for intervention-type questions, where the data are aimed at identifying whether a particular intervention improves outcomes. The second is appropriate for questions that aim to identify rates of disorders (such as prevalence or incidence rates). Both processes use a similar structure, adapted to suit the type of data used by the different questions.

(i) Evaluation of the evidence for intervention questions

Studies identified for inclusion in an intervention-type question can be subjected to a more refined quality assessment using the quality-of-evidence process outlined below. This process encompasses five components, adapted from the FORM framework [5]:

  - The strength of the evidence base, in terms of the quality (risk of bias associated with how the research was conducted), quantity, and level of evidence (study design)
  - The direction of the study results, in terms of positive, negative or null findings
  - The consistency of the study results across the included studies (including across a range of study populations and study designs)
  - The generalisability of the body of evidence to the target population of the intervention being assessed
  - The applicability of the body of evidence to the Australian context and health system

The first three components gauge the internal validity of the study data in support of efficacy (for an intervention). The last two consider the external factors that may influence effectiveness, in terms of the generalisability of study results to the intended target population, and applicability to the Australian context.

Strength of the evidence base

The strength of the evidence base can be assessed in terms of the (a) quality and risk of bias, (b) quantity of evidence, and (c) level of evidence.

a) Quality and risk of bias reflects how well the studies have been conducted, including how the participants were selected, allocated to groups, managed and followed up, and how the study outcomes were defined, measured, analysed and reported. Quality and bias in meta-analyses/systematic reviews and individual studies can be assessed in the following ways:

  Meta-analyses and systematic reviews - where a meta-analysis or systematic review is included in the review, it can be rated according to an adapted version of the NHMRC quality criteria [6], presented in Appendix 1. A consensus agreement on an overall rating of 'Good', 'Fair', or 'Poor' may be sought from three independent raters.

  Individual studies - an assessment is conducted for each individual study against the quality and risk of bias criteria, using a modified version of the Chalmers Checklist for appraising the quality of studies of interventions (NHMRC 2000 guidebook [7]), presented in Appendix 2. A consensus agreement on an overall rating of 'Good', 'Fair', or 'Poor' may be sought from three independent raters.

b) Quantity of evidence reflects the number of studies included as the evidence base for each ranking. The quantity assessment also takes into account the number of participants in relation to the frequency of the outcome measures (i.e. the statistical power of the studies). Small, underpowered studies that are otherwise sound may be included in the evidence base if their findings are generally similar, but at least some of the studies cited as evidence must be large enough to detect the size and direction of any effect.

c) Level of evidence reflects the study design.
Each study is classified according to a hierarchy of evidence commonly used in Australia [8]:

  Level I: A systematic review of RCTs
  Level II: An RCT
  Level III-1: A pseudo-randomised controlled trial (i.e. a trial where a pseudo-random method of allocation is used, such as alternate allocation)
  Level III-2: A comparative study with concurrent controls. This can be any one of the following:
    - Non-randomised experimental trial (this includes controlled before-and-after (pre-test/post-test) studies, as well as adjusted indirect comparisons, i.e. using A vs B and B vs C to determine A vs C with statistical adjustment for B)
    - Cohort study
    - Case-control study
    - Interrupted time series with a control group
  Level III-3: A comparative study without concurrent controls. This can be any one of the following:
    - Historical control study
    - Two or more single-arm studies (case series from two studies; this includes indirect comparisons, i.e. using A vs B and B vs C to determine A vs C where there is no statistical adjustment for B)
    - Interrupted time series without a parallel control group
  Level IV: Case series with either post-test or pre-test/post-test outcomes

Procedure for judging the strength of the evidence base

A judgement can be made about the strength of the evidence base, taking into account the quality and risk of bias, the quantity of evidence, and the level of evidence. Agreement may be sought between three independent raters, and consensus about the strength of the evidence base can be categorised as outlined below. In situations where there are many studies, the studies of the highest level (i.e. Level I or Level II) should be used to determine the category which best reflects the strength of the evidence, and lower-level studies should be disregarded. A code sketch of this categorisation is given after the Direction of evidence component below.

  High strength: One or more Level I studies with a low risk of bias OR three or more Level II studies with a low risk of bias
  Moderate strength: One or two Level II studies with a low risk of bias OR two or more Level III studies with a low risk of bias
  Low strength: One or more Level I through to Level IV studies with a high risk of bias

Direction of evidence

A judgement can be made about the direction of the findings in the evidence base, in terms of whether positive or negative results have been found. Agreement is sought between three independent raters, and consensus about the direction of the evidence base can then be categorised as outlined below. Where studies show findings in different directions, preference should be given to the category which corresponds to the findings of the study of highest level and best quality. Where studies of the same level have findings in different directions, the middle category (Unclear direction) should be selected.

  Positive direction: The weight of the evidence indicates positive results
  Unclear direction: The evidence does not show significant effects OR the results are mixed
  Negative direction: The weight of the evidence indicates negative results
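For illustration only, the strength-of-evidence categories above can be encoded as a decision rule. The sketch below assumes each study is summarised as a (level, risk-of-bias) pair; the consensus judgement of three raters, and the instruction to disregard lower-level studies when Level I/II studies exist, are deliberately not modelled.

```python
from typing import List, Tuple

def strength_of_evidence(studies: List[Tuple[str, str]]) -> str:
    """Categorise an evidence base as High/Moderate/Low strength.
    Each study is (level, risk_of_bias) with level in
    {'I', 'II', 'III-1', 'III-2', 'III-3', 'IV'} and
    risk_of_bias in {'low', 'high'}."""
    low = [lvl for lvl, rob in studies if rob == "low"]
    n_I = low.count("I")
    n_II = low.count("II")
    n_III = sum(1 for lvl in low if lvl.startswith("III"))
    if n_I >= 1 or n_II >= 3:
        return "High"       # >=1 Level I, or >=3 Level II, with low risk of bias
    if n_II >= 1 or n_III >= 2:
        return "Moderate"   # 1-2 Level II, or >=2 Level III, with low risk of bias
    return "Low"            # only high-risk-of-bias studies remain

print(strength_of_evidence([("II", "low"), ("II", "low"), ("III-1", "high")]))  # Moderate
```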
Consistency

The consistency component assesses whether the findings were consistent across the included studies (including across a range of study populations and study designs). It is important to determine whether study results are consistent in order to ascertain whether the results are likely to be replicable, or only likely to occur under certain conditions. Should results differ for certain subpopulations, this can be reflected in the discussion.

  - All studies are consistent, reflecting that results are highly likely to be replicable
  - Most studies are consistent and inconsistency may be explained, reflecting that results are moderately-highly likely to be replicable
  - Some inconsistency, reflecting that results are somewhat unlikely to be replicable
  - All studies are inconsistent, reflecting that results are highly unlikely to be replicable

Generalisability

This component covers how well the participants and settings of the included studies match the target population. Population issues that might influence this component include gender, age or ethnicity, or level of care (e.g. community or hospital). Issues such as the prevalence of the disease in the study population as compared to the target population, and the stage of disease (e.g. early versus advanced), can also be considered.

  - The population/s examined in the evidence are the same as the target population
  - The population/s examined in the evidence are similar to the target population
  - The population/s examined in the evidence are different to the target population, but it is clinically sensible to apply this evidence to the target population
  - The population/s examined in the evidence are not the same as the target population, and it is hard to judge whether it is sensible to generalise to the target population

Applicability

This component addresses whether the evidence base is relevant to the Australian context, or to more local settings (such as rural areas or cities). Factors that may reduce the direct application of study findings to the Australian context or specific local settings include organisational factors (e.g. availability of trained staff) and cultural factors (e.g. attitudes to health issues, including those that may affect compliance).

  - Directly applicable to the Australian context
  - Applicable to the Australian context with few caveats
  - Applicable to the Australian context with some caveats
  - Not applicable to the Australian context

Ranking the evidence (optional)

Ranking the evidence provides a simple, overall picture of the state of the literature for the end user. However, not all questions (and bodies of evidence) lend themselves to being ranked, especially if the body of literature for a specific question is very difficult to synthesise in any way. The decision whether or not to rank should be discussed with the REA team and end user. If ranking is considered appropriate, then, taking into account the strength of the evidence (quality, quantity and level of evidence), direction, consistency, generalisability and applicability, the total body of evidence for intervention questions can be ranked (through consensus agreement) into one of four categories: Supported, Promising, Unknown and Not Supported (see Figure 1). It is important to ensure that this ranking takes into account the findings of the guidelines or systematic reviews/meta-analyses used in the review. A sketch of the ranking logic follows Figure 1.

NOTE: If the strength of the evidence is considered to be low, the next steps of rating direction, consistency, generalisability and applicability need not be conducted, and the evidence can be rated as 'Unknown'.

  SUPPORTED: Clear, consistent evidence of beneficial effect
  PROMISING: Evidence suggestive of beneficial effect but further research required
  UNKNOWN: Insufficient evidence of beneficial effect - further research required
  NOT SUPPORTED: Clear, consistent evidence of no effect or negative/harmful effect

Figure 1. Categories within the intervention ranking system
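The ranking itself is a consensus judgement, but the explicit short-circuit in the NOTE above (low strength of evidence means 'Unknown') and one plausible reading of Figure 1 can be sketched as follows. This is an assumption-laden illustration, not a rule from the guide.

```python
def rank_evidence(strength: str, direction: str, consistent: bool) -> str:
    """Illustrative mapping onto the four Figure 1 categories.
    Generalisability and applicability would temper the final
    consensus judgement but are omitted from this sketch."""
    if strength == "Low":
        return "Unknown"            # per the NOTE: no further rating needed
    if direction == "Positive":
        if strength == "High" and consistent:
            return "Supported"      # clear, consistent evidence of benefit
        return "Promising"          # suggestive, further research required
    if direction == "Negative" and consistent:
        return "Not Supported"      # clear, consistent evidence of no/harmful effect
    return "Unknown"                # insufficient or mixed evidence

print(rank_evidence("High", "Positive", consistent=True))      # Supported
print(rank_evidence("Moderate", "Unclear", consistent=False))  # Unknown
```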
(ii) Evaluation of the evidence for prevalence questions

Similar to intervention-type questions, a process for conducting a quality assessment for prevalence-type questions is outlined below. This process encompasses four components:

  - Quality and risk of bias
  - Data source (primary or secondary)
  - Quantity of evidence
  - The generalisability of the body of evidence to the target population

Quality and risk of bias reflects the scientific benchmarks for prevalence studies, where randomly selected samples, clear definitions of the population and the disorder/topic of interest, the use of validated tools, and the reporting of information on non-responders constitute a 'gold standard' quality of evidence (see Appendix 3 for a modified version of a checklist for prevalence studies) [9]. Bias can also be assessed in terms of the data source, which reflects whether the data collected in each study were primary (e.g. clinical interview) or secondary (e.g. medical chart review). Primary data sources are collected with purposeful intention by researchers to measure a particular phenomenon of interest, meaning the researcher can control or manipulate relevant variables to increase the likelihood of obtaining the true prevalence rate. In comparison, secondary data sources are collected at a time point after the diagnosis was made, where at the time of diagnosis neither the patient nor the clinician was aware that the diagnosis would be used for research purposes [10]. Secondary data sources are therefore, by nature, opportunistic, which may increase or decrease the risk of bias depending on the phenomenon of interest.

Quantity of evidence reflects the number of studies included as the evidence base for each ranking. Importantly for prevalence studies, the quantity assessment also takes into account the number of participants included in each study.

Generalisability covers how well the participants and settings of the included studies can be generalised to the target population. Population issues that might influence this component include gender, age or ethnicity, or level of care (e.g. community or hospital).

Ranking the evidence (not applicable)

Prevalence questions do not generally lend themselves to being ranked. Normally, if there is a sufficient quantity of good-quality evidence that is generalisable to the population of interest, then it is possible to extrapolate with a high degree of certainty what the prevalence of a particular condition or event is likely to be. In lieu of ranking prevalence questions, it is suggested that summary comments are made instead. A worked example of the kind of prevalence statistic an REA would extract and report appears below.
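In place of a ranking, prevalence REAs typically report extracted rates with an indication of precision. As a worked illustration (not something the guide prescribes), the sketch below computes a prevalence estimate with a Wilson score 95% confidence interval, one common choice for proportions, from hypothetical counts.

```python
from math import sqrt

def prevalence_with_ci(cases: int, n: int, z: float = 1.96):
    """Point estimate and Wilson score 95% CI for a prevalence proportion."""
    p = cases / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return p, centre - half, centre + half

# Hypothetical extracted data: 120 cases observed in a sample of 800
p, low, high = prevalence_with_ci(120, 800)
print(f"prevalence = {p:.1%} (95% CI {low:.1%} to {high:.1%})")
```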
Reporting Phase

The final phase in the REA process is to report on all studies identified as eligible for inclusion. The content and design of this report will vary with the nature of the audience and the intent of the review.

Recommended reading

NHMRC. How to review the evidence: Systematic identification and review of the scientific literature. Canberra: National Health and Medical Research Council; 2000. Available from: http://www.nhmrc.gov.au/guidelines/publications/cp65

University of York, Centre for Reviews and Dissemination. Systematic reviews: CRD's guidance for undertaking reviews in health care. York: CRD, University of York; 2009. Available from: http://www.york.ac.uk/inst/crd/pdf/Systematic_Reviews.pdf

References

1. Harker J, Kleijnen J. What is a rapid review? A methodological exploration of rapid reviews in Health Technology Assessments. International Journal of Evidence-Based Healthcare. 2012;10:397-410.
2. Khangura S, Konnyu K, Cushman R, Grimshaw J, Moher D. Evidence summaries: the evolution of a rapid review approach. Systematic Reviews. 2012;1(1):10.
3. Victorian Government Department of Human Services. A better place: Victorian homelessness 2020 strategy. Melbourne; 2010.
4. Goodman LA, Saxe L, Harvey M. Homelessness as psychological trauma: broadening perspectives. American Psychologist. 1991;46(11):1219-1225.
5. Hillier S, Grimmer-Somers K, Merlin T, et al. FORM: An Australian method for formulating and grading recommendations in evidence-based clinical guidelines. BMC Medical Research Methodology. 2011;11(23).
6. NHMRC. How to use the evidence: Assessment and application of scientific evidence. Canberra: National Health and Medical Research Council; 2000.
7. NHMRC. How to review the evidence: Systematic identification and review of the scientific literature. Canberra: National Health and Medical Research Council; 2000.
8. Merlin T, Weston A, Tooher R. Extending an evidence hierarchy to include topics other than treatment: revising the Australian 'levels of evidence'. BMC Medical Research Methodology. 2009;9(34).
9. Giannakopoulos NN, Rammelsberg P, Eberhard L, Schmitter M. A new instrument for assessing the quality of studies on prevalence. Clinical Oral Investigations. 2012;16(3):781-788.
10. VicHealth. The health costs of violence: measuring the burden of disease caused by intimate partner violence. Melbourne; 2004.

Appendix 1

Quality and Bias Assessment for Meta-Analyses and Systematic Reviews

Study type: Systematic review
Citation:
(Rate each criterion Y / N / NR / NA; error category shown in brackets)

A. Was an adequate search strategy used?
  - Was a systematic search strategy reported? (I)
  - Were the databases searched reported? (III)
  - Was more than one database searched? (III)
  - Were search terms reported? (IV)
  - Did the literature search include hand searching? (IV)

B. Were the inclusion criteria appropriate and applied in an unbiased way?
  - Were inclusion/exclusion criteria reported? (II)
  - Were the inclusion criteria applied in an unbiased way? (III)
  - Was only Level II evidence included? (I-IV)

C. Was a quality assessment of included studies undertaken?
  - Was the quality of the studies reported? (III)
  - Was a clear, pre-determined strategy used to assess study quality? (IV)

D. Were the characteristics and results of the individual studies appropriately summarised?
  - Were the characteristics of the individual studies reported? (III)
  - Were baseline demographic and clinical characteristics reported for patients in the individual studies? (IV)
  - Were the results of the individual studies reported? (III)

E. Were the methods for pooling the data appropriate?
  - If appropriate, was a meta-analysis conducted? (III-IV)

F. Were the sources of heterogeneity explored?
  - Was a test for heterogeneity applied? (III-IV)
  - If there was heterogeneity, was this discussed or were the reasons explored? (III-IV)

Comments:
Quality rating: [Good/Fair/Poor]
Systematic review:
Included studies:

Note: Quality criteria adapted from NHMRC (2000) How to use the evidence: assessment and application of scientific evidence. NHMRC, Canberra.

Appendix 2

Chalmers Checklist for appraising the quality of intervention studies
(Mark each item as completed: Yes/No)
1. Method of treatment assignment
  - Correct, blinded randomisation method described OR randomised, double-blind method stated AND group similarity documented
  - Blinding and randomisation stated but method not described OR suspect technique (e.g. allocation by drawing from an envelope)
  - Randomisation claimed but not described and investigator not blinded
  - Randomisation not mentioned

2. Control of selection bias after treatment assignment
  - Intention-to-treat analysis OR full follow-up
  - Intention-to-treat analysis AND <25% loss to follow-up
  - Analysis by treatment received only OR no mention of withdrawals
  - Analysis by treatment received AND no mention of withdrawals OR more than 25% withdrawals/loss-to-follow-up/post-randomisation exclusions

3. Blinding
  - Blinding of outcome assessor AND patient and care giver (where relevant)
  - Blinding of outcome assessor OR patient and care giver (where relevant)
  - Blinding not done
  - Blinding not applicable

4. Outcome assessment (if blinding was not possible)
  - All patients had standardised assessment
  - No standardised assessment OR not mentioned

5. Additional notes
  - Any factors that may impact upon study quality or generalisability

Appendix 3

Checklist for Considering the Quality of Descriptive, Observational Prevalence Studies: Modified from Giannakopoulos, Rammelsberg, Eberhard, Schmitter (2012)
(Mark each item as completed: Yes/No)

1. Target population
  - Target population clearly defined, including age, sex, employment, ethnicity, religion AND relevant data from health questionnaires of sampled persons, if appropriate
  - Target population not clearly defined: limited data available on age, sex, employment, ethnicity, religion AND relevant data from health questionnaires of sampled persons, if appropriate
  - Target population poorly defined: little or no information on age, sex, employment, ethnicity, religion OR little or no relevant data from health questionnaires of sampled persons, if appropriate

2. Sampling method (representativeness)
  - Sophisticated probability sampling used (e.g. stratified sampling; cluster sampling; multistage sampling; multiphase sampling) [i]
  - Simple probability sampling used (e.g. simple random sampling) [1]
  - No probability sampling used

3. Measurement (reliability)
  - Standardised data-collection methods (e.g. validated clinical interview or diagnostic instrument/criteria) OR reliable survey instruments (e.g. validated self-report measure/validated screening instrument)
  - Non-standardised data collection OR non-validated interview or non-validated self-report measure

4. Information about non-responders
  - Analysis of differences conducted on non-responders
  - No analysis of differences conducted on non-responders, OR only the proportion (e.g. %) of non-responders supplied without any other information

5. Additional information
  - Information that may affect the overall rating (e.g. were special features accounted for? Were there satisfactory/appropriate statistical analyses, confidence intervals, etc.?)

[1] Simple sampling methods (from Boyle, 1998): a predetermined number of units (individuals, families, households) is selected from the sampling frame so that each unit has an equal chance of being chosen.

[i] Complex sampling methods (from Boyle, 1998): Stratified sampling: a population is divided into relatively homogeneous subgroups (strata) and samples are selected independently and with known probability from each stratum; Cluster sampling: the population is divided into affiliated units or clusters, e.g.
neighbourhoods or households, and a sample of clusters is selected with known probability; Multistage sampling: samples are selected with known probability in hierarchical order, e.g. a sample of neighbourhoods, then a sample of households, then a sample of individuals; Multiphase sampling: sampled individuals are screened and subsets selected with known probability for more intensive assessment.