NZHTA REPORT
November 2006, Volume 9, Number 4

Screening Strategies for Antenatal Down Syndrome Screening
A systematic review of the literature

Rebecca O’Connell
Meagan Stephenson
Robert Weir

New Zealand Health Technology Assessment
Department of Public Health and General Practice
Christchurch School of Medicine and Health Sciences
Christchurch, New Zealand

This report should be referenced as follows:
O’Connell, R. Stephenson, M. Weir, R. Screening strategies for antenatal Down syndrome screening. NZHTA Report 2006; 9(4).

2006 New Zealand Health Technology Assessment (NZHTA)
ISBN 1-877235-93-8 (Print)
ISBN 1-877235-94-6 (Web)
ISSN 1174-5142

CONTRIBUTION BY AUTHORS
This report was authored by Rebecca O’Connell (Research Fellow), and Meagan Stephenson (Research Fellow) who both conducted the critical appraisals and prepared the report. Dr Rob Weir (Director) also contributed to the writing of the methods section.

ACKNOWLEDGEMENTS
Dr Rob Weir (Director) peer reviewed the final draft. Susan Bidwell (Information Specialist) developed and undertook the search strategy and coordinated retrieval of documents. Catherine Turnbull (Administrator) provided document formatting. Kay Hodgson assisted with retrieval of documents.

Acknowledgment is made of the contribution of Professor Nicholas Wald, Director, Wolfson Institute of Preventive Medicine, who undertook an external peer review of a late draft and provided valuable comments on the report.

The Canterbury Medical Library assisted with the retrieval of articles.
NZHTA is a Research Unit of the University of Otago funded under contract to the New Zealand Ministry of Health.

This report was commissioned by Karen Mitchell, Group Manager, National Screening Unit, of New Zealand’s Ministry of Health. We thank the National Screening Unit staff at the Ministry of Health for partially funding the review, assisting in developing the scope and providing background material for the review.

DISCLAIMER
New Zealand Health Technology Assessment (NZHTA) takes great care to ensure the information supplied within the project timeframe is accurate, but neither NZHTA, the University of Otago, nor the contributors involved can accept responsibility for any errors or omissions. The reader should always consult the original database from which each abstract is derived, along with the original articles, before making decisions based on a document or abstract. All responsibility for action based on any information in this report rests with the reader.

NZHTA and the University of Otago accept no liability for any loss of whatever kind, or damage, arising from reliance in whole or part, by any person, corporate or natural, on the contents of this report.

This document is not intended as personal health advice. People seeking individual medical advice are referred to their physician. The views expressed in this report are those of NZHTA and do not necessarily represent those of the University of Otago or the New Zealand Ministry of Health.

This review was commissioned by Karen Mitchell, on behalf of the New Zealand Ministry of Health. NZHTA is a Research Unit of the University of Otago and is funded under contract by the New Zealand Ministry of Health.

COPYRIGHT
This work is copyright. Apart from any use as permitted under the Copyright Act 1994 no part may be reproduced by any process without written permission from New Zealand Health Technology Assessment.
Requests and inquiries concerning reproduction and rights should be directed to the Director, New Zealand Health Technology Assessment, Christchurch School of Medicine and Health Sciences, P O Box 4345, Christchurch, New Zealand.

CONTACT DETAILS
New Zealand Health Technology Assessment (NZHTA)
Department of Public Health and General Practice
Christchurch School of Medicine and Health Sciences
PO Box 4345
Christchurch
New Zealand
Tel: +64 3 364 3696
Fax: +64 3 364 3697
Email: [email protected]
Web Site: http://nzhta.chmeds.ac.nz/

EXECUTIVE SUMMARY

Background
This review was requested by the Ministry of Health to inform the policy process in assessing options for antenatal Down syndrome screening and in determining whether a national antenatal screening programme for Down syndrome should be established in New Zealand.

Aim
This review aimed to systematically appraise the international evidence for the use of technologies and screening strategies for antenatal Down syndrome, including:
• the validity of Down syndrome screening methods
• any difficulties implementing screening strategies
• the impact of screening on amniocentesis and chorionic villus sampling
• the safety of these procedures.

Data sources
The following databases were searched (using the search strategy outlined in Appendix 1):

Bibliographic databases
• Cinahl
• Cochrane Register of Controlled Trials
• Current Contents
• Embase
• Medline
• PubMed (last 60 days)
• Science Citation Index
• Social Science Citation Index

Review databases
• ACP Journal Club
• Cochrane Database of Systematic Reviews
• Database of Abstracts of Reviews of Effectiveness
• Health Technology Assessment database
• NHS Economic Evaluation database

Other
• Clinicaltrials.gov
• Current Controlled Trials
• TRIP database
• UK National Screening Committee Down’s Syndrome Screening Programme

References of retrieved papers were scanned for relevant publications. Hand searching of journals, contacting of manufacturers, or contacting of authors for unpublished research was not undertaken in this review. A complete list of the sources searched for this review is given in Appendix 2.

Searches were limited to English language material published from January 2000 onwards. The searches were completed on 10 August 2006.

Result of the search strategy
The search strategy for the validity of screening strategies and implementation difficulties yielded 1138 articles. From 219 articles identified as potentially eligible for inclusion, a final group of 66 papers were selected for appraisal, all of which were primary research (including statistical modelling).

The search strategy for the impact of screening on invasive testing rates and the safety of those invasive tests yielded 1116 articles. From 192 articles identified as potentially eligible for inclusion, a final group of 35 papers were selected for appraisal, all but one of which were primary research.

Key results

Accuracy of screening methods
• Maternal age alone is not an appropriate screening test for DS.
• Screening methods that combine tests in an independent manner are not recommended.
• The quad test provided the best screening performance characteristics in the 2nd trimester.
• The combined test performed better than other 1st trimester tests and all 2nd trimester strategies, and serum integrated screening had a similar performance to the combined test.
• All other integrated and sequential screening strategies have improved performance compared to either 1st trimester or 2nd trimester screening strategies.
• Integrated screening performed better than both stepwise and contingent screening, with a lower FPR for a fixed DR. The performance of stepwise and contingent screening is similar.
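
The accuracy results above compare strategies by detection rate (DR) and false positive rate (FPR). As a minimal sketch of how these two figures come out of screening counts, the Python below computes both for two hypothetical strategies that share the same DR. The function name `screening_rates` and every count are illustrative assumptions, not data from this review.

```python
def screening_rates(true_pos, false_neg, false_pos, true_neg):
    """Return (detection rate, false positive rate) from 2x2 screening counts."""
    affected = true_pos + false_neg      # pregnancies with Down syndrome
    unaffected = false_pos + true_neg    # pregnancies without Down syndrome
    detection_rate = true_pos / affected          # DR, i.e. sensitivity
    false_positive_rate = false_pos / unaffected  # FPR, i.e. 1 - specificity
    return detection_rate, false_positive_rate


# Hypothetical counts (assumption): both strategies detect 85 of 100
# affected pregnancies, but strategy B flags fewer unaffected pregnancies.
dr_a, fpr_a = screening_rates(true_pos=85, false_neg=15, false_pos=500, true_neg=9500)
dr_b, fpr_b = screening_rates(true_pos=85, false_neg=15, false_pos=200, true_neg=9800)

print(f"Strategy A: DR {dr_a:.0%}, FPR {fpr_a:.1%}")
print(f"Strategy B: DR {dr_b:.0%}, FPR {fpr_b:.1%}")
```

Holding the DR fixed and comparing FPRs, as here, is the kind of comparison meant by "a lower FPR for a fixed DR".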
Difficulties implementing any screening strategies
The following implementation difficulties were reported:
• NT measurement not successful or taking more time than expected
• NT requiring trained staff, high quality equipment and quality control
• women defaulting from 2nd trimester maternal serum screening
• need to adjust for maternal weight, and for false positive results in previous pregnancies
• need for USS dating
• issues with serum marker reliability (assay drift, and inaccurate marker MoM)
• inappropriate model parameters giving inaccurate risk estimates
• issues with screening in twin pregnancies.

Uptake of invasive testing following receipt of screening results
Maternal age (<35 years versus ≥ 35 years) appears to be a factor in the uptake of invasive testing following screening.

Women’s individual estimates of calculated risk appear to influence the uptake of invasive testing, with an increase in the uptake of invasive testing as the likelihood of carrying a fetus with DS increases.

Perceived accuracy of the screening test and social, ethnic and cultural factors may influence uptake and it would be wise to conduct a local study of the acceptability of both screening and invasive testing.

Changes in the rate of invasive testing with the introduction of a screening programme
Overall, rates of invasive testing increased slightly with the introduction of screening programmes. This was against the backdrop of a steeper rise in maternal age and the increase in invasive testing was less than what would be expected if invasive tests were offered based on maternal age alone.

Changes in rates of invasive testing varied as a function of maternal age. Rates of invasive testing decreased among older mothers (≥ 35 years) and increased among younger mothers (< 35 years) with the introduction of a screening programme.

Rates of fetal loss associated with invasive testing procedures
A high quality systematic review confirmed that a 1% increased risk of fetal loss (against a background risk of 2%) following amniocentesis in a low risk population is the best estimate of the rate of procedure-related loss.

The recommendations of the review were that prenatal diagnosis is safest performed in the second trimester by amniocentesis, which is safer than both transcervical CVS and first trimester amniocentesis. Transabdominal CVS is the safest method for first trimester prenatal diagnosis, followed by transcervical CVS.

Conclusions
Amongst the papers identified to assess the validity of DS screening methods the evidence indicated that integrated or sequential screening had superior screening performance compared with screening confined to either the 1st or 2nd trimester, and these are the screening methods of choice, whether it be the integrated test, or sequential methods (contingent and stepwise screening).

Studies considered in this review had different strengths and weaknesses. The best designs for determining the validity of a screening method are observational studies (cohort and nested case-control studies) in which clinical intervention does not follow a positive screening result (for example, where NT screening results are not used clinically until the results of second trimester screening are available). Studies based on intervention are subject to the bias that some screen positive affected pregnancies would abort spontaneously in the absence of screening and selective ToP. Such studies will tend to overestimate the screening performance of the tests considered. Some of the evidence comparing the validity of different screening methods was also obtained from case-control studies. In these studies the sample will include cases (Down syndrome) and unaffected controls, and may not be representative of the groups who would be tested in a screening programme. Some modelling papers in this review used the same primary data, similar modelling techniques, and similar assumptions which may mean the same or similar results are replicated.

In relation to the uptake of invasive testing procedures following screening for Down syndrome, it appears that both maternal age and individual risk estimates are factors in women’s decisions to opt for invasive testing or not. Other factors may also play a part in the uptake of both screening and invasive testing, such as perceived confidence in the accuracy of the screening test, and the social acceptability of screening and invasive testing.

The overall rate of invasive testing may increase slightly with the introduction of a screening programme but when examined by maternal age group, it appears that the rate of invasive testing increases in younger mothers and decreases in older mothers. Long-term evaluations of population-based screening programmes using integrated and sequential methods have not been completed as yet, so the effects on invasive testing rates are unclear at this stage. Relative to a screening programme based on maternal age alone, the introduction of first or second trimester screening methods decreases the rate of unnecessary invasive procedures. Again, social, ethnic and cultural variability in the acceptability of screening and invasive testing is a factor in the impact of screening policy changes. Because of this variability, it would be wise to conduct a local study of the acceptability of screening and invasive testing in older and younger mothers of different ethnic groups before implementing a large-scale screening programme in New Zealand.
There are a number of considerations apart from the validity of a screening strategy which are important when deciding between the different screening strategies including: acceptability of the strategies to women and clinicians, compatibility with neural tube defect screening, and the availability of accurate alternative screening methods if women do not present at the appropriate gestational age. The implications for resources for training, monitoring, and quality control particularly for NT testing would need to be carefully considered when determining whether a screening programme should proceed. It would also be important to ensure software is available which will accurately determine an individual’s risk of DS, including calculation of correct MoMs, adjustments (including for maternal weight, and false positive results from previous pregnancies) and use of appropriate population parameters for Gaussian distributions.

Mesh headings
Down syndrome, chromosomes-human-pair 21, pregnancy trimester-first, pregnancy trimester-second, ultrasonography-prenatal, nuchal translucency measurement, prenatal diagnosis, mass screening, amniocentesis, chorionic villi sampling

TABLE OF CONTENTS
CONTRIBUTION BY AUTHORS
ACKNOWLEDGEMENTS
DISCLAIMER
COPYRIGHT
CONTACT DETAILS
EXECUTIVE SUMMARY
TABLE OF CONTENTS
LIST OF TABLES
LIST OF ABBREVIATIONS AND ACRONYMS
GLOSSARY
CHAPTER 1: BACKGROUND
    NEED FOR SYSTEMATIC REVIEW
    SCREENING - GENERAL OVERVIEW
    AIM
    REVIEW SCOPE
    REVIEW QUESTIONS
    DOWN SYNDROME SCREENING METHOD TERMINOLOGY
    STRUCTURE OF REPORT
CHAPTER 2: REVIEW METHODOLOGY
    SELECTION CRITERIA
    SEARCH STRATEGY
    STUDY SELECTION
    APPRAISAL OF STUDIES
    KEY OUTCOME MEASURES FOR PRIMARY STUDIES
PART A
CHAPTER 3: ACCURACY OF FIRST TRIMESTER SCREENING STRATEGIES
    PRIMARY RESEARCH: STUDY DESIGNS AND QUALITY
    PRIMARY RESEARCH: STUDY RESULTS
CHAPTER 4: COMPARISON OF SECOND TRIMESTER SCREENING STRATEGIES
    PRIMARY RESEARCH: STUDY DESIGNS AND QUALITY
    PRIMARY RESEARCH: STUDY RESULTS
CHAPTER 5: COMPARISON OF 1ST TRIMESTER STRATEGIES, 2ND TRIMESTER STRATEGIES, INTEGRATED AND SEQUENTIAL METHODS
    PRIMARY RESEARCH: STUDY DESIGNS AND QUALITY
    PRIMARY RESEARCH: STUDY RESULTS
PART B
CHAPTER 6: CHANGES IN THE RATE OF INVASIVE TESTING FOLLOWING THE INTRODUCTION OF SCREENING
    PRIMARY RESEARCH: STUDY DESIGNS AND QUALITY
    PRIMARY RESEARCH: STUDY RESULTS
CHAPTER 7: UPTAKE OF TESTING FOLLOWING SCREENING RESULTS
CHAPTER 8: INVASIVE TESTING AND PROCEDURE-RELATED LOSS
    SECONDARY RESEARCH
    PRIMARY RESEARCH: STUDY DESIGNS AND QUALITY
    PRIMARY RESEARCH: STUDY RESULTS
CHAPTER 9: DISCUSSION
    SUMMARY OF EVIDENCE
    CONCLUSIONS
REFERENCES
APPENDIX 1: SEARCH STRATEGIES
    SEARCH STRATEGIES
    SEARCHES FROM OTHER SOURCES
APPENDIX 2: SOURCES SEARCHED
    SOURCES SEARCHED
APPENDIX 3: RETRIEVED STUDIES EXCLUDED FOR REVIEW: PART A
APPENDIX 4: RETRIEVED STUDIES EXCLUDED FOR REVIEW: PART B
APPENDIX 5:
    DESIGNATIONS OF LEVELS OF EVIDENCE

LIST OF TABLES
Table 1. Criteria for assessing screening programmes
Table 2. Down syndrome screening methods that use both first and second trimester screening tests
Table 3. Assessment of validity of a diagnostic test
Table 4. Comparison of DRs and FPR, maternal age versus other screening strategies
Table 5. Comparison of DRs for a 5% FPR, combined test, nuchal translucency and fβhCG + PAPP-A
Table 6. Comparison of DRs and FPR for fixed cut offs (same for all tests unless indicated), combined test, nuchal translucency and fβhCG + PAPP-A
Table 7. Comparison of FPRs for an 85% DR, combined test, nuchal translucency and fβhCG + PAPP-A
Table 8. Comparison of DRs and FPR, fβhCG and total hCG
Table 9. Comparison of DRs and FPR, fβhCG and PAPP-A
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components
Table 11. Comparison of DRs and FPRs, maternal age versus other screening strategies
Table 12. Comparison of performance of the quad test, triple test, and double test
Table 13. Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components
Table 14. Non-interventional studies comparing DRs and FPRs of 1st trimester NT (or combined test) versus 2nd trimester strategies (and integrated, sequential, or independent strategies)
Table 15. Comparison of the validity of fully (or serum) integrated screening, stepwise screening, contingent screening, and screening combining tests (combined test and quad test) in an independent manner
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters
Table 17. Evidence table of primary research studies appraised investigating the rate of invasive testing with the introduction of a screening programme
Table 18. Uptake of invasive testing following 1st trimester screening
Table 19. Uptake of invasive testing following 2nd trimester screening
Table 20. Uptake of invasive testing following 1st and 2nd trimester screening
Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening
Table 22. Evidence table of secondary research studies appraised investigating the rate of fetal loss following invasive prenatal diagnostic procedures
Table 23. Evidence table of primary research studies investigating the rate of fetal loss following invasive prenatal diagnostic procedures

LIST OF ABBREVIATIONS AND ACRONYMS
X/12: Months, e.g. 6/12 = 6 months
X/40: Weeks of gestation, e.g. 10/40 = 10 weeks of gestation
+ve: Positive
-ve: Negative
1st T: First trimester
2nd T: Second trimester
95% CI: 95 percent confidence interval
95th centile: 95th percentile
AFP: Alphafetoprotein
AC: Amniocentesis
ADAM: A Disintegrin and Metalloprotease
AMA: Advanced maternal age
ART: Assisted reproductive technology
BPD: Biparietal Diameter
CI: Confidence interval
Cinahl: Cumulative Index to Nursing and Allied Health Literature
CRL: Crown-rump length
CVS: Chorionic Villus Sampling
DM: Diabetes Mellitus
DNA: Deoxyribonucleic acid
DR: Detection rate (sensitivity)
DS: Down syndrome
EDD: Expected Date of Delivery
ELISA: Enzyme-Linked ImmunoSorbent Assay
fβhCG: Free beta hCG (human chorionic gonadotrophin)
FASTER: First and Second Trimester Evaluation of Risk
FMF: Fetal Medicine Foundation
FmHx: Family History
FNR: false negative rate
FPR: false positive rate
GA: Gestational age
GP: general practitioner
hCG: Human chorionic gonadotrophin
HhCG: Hyperglycosylated hCG (also called ITA)
HTA: Health technology assessment
Hx: History
INAHTA: International Network of Agencies for Health Technology Assessment
IT: Invasive test/testing
ITA: Invasive trophoblast antigen (also called HhCG)
LR: Likelihood ratio
LMP: Last menstrual period
MA: Maternal age
MeSH: Medical Subject Headings
MSS: Maternal Serum Screen
MoH: Ministry of Health (NZ)
MoM: Multiple of the Median
NHS: National Health Service (UK)
NHMRC: National Health and Medical Research Council
NT: nuchal translucency
NZ: New Zealand
NZHTA: New Zealand Health Technology Assessment
ONTD: Open neural tube defect
OSCAR: One stop clinic for assessment of risk
OR: odds ratio
Paeds: paediatricians
PAPP-A: pregnancy associated plasma protein A
PPV: positive predictive value
QA: quality assurance
QC: quality control
RCT: randomised controlled trial
RNZCGP: Royal New Zealand College of General Practitioners
RR: Relative risk
SA: South Australia
SD: standard deviation
Se: sensitivity
Sig diff: Significantly different
Sp: specificity
SPR: screen positive rate
SURUSS: Serum Urine and Ultrasound Screening Study
T18: Trisomy 18
TA: Transabdominal
TC: Transcervical
ToP: Termination of pregnancy
ThCG: Total hCG
TT: Triple test
TV: Transvaginal
uE3: unconjugated estriol
UK: United Kingdom
USA: United States of America
USS: Ultrasound scan
WA: Western Australia

GLOSSARY
A priori: Pertaining to something which is designated in advance.
Age standardisation: A procedure for adjusting rates designed to minimise the effects of differences in age composition when comparing rates for different populations.
Amniocentesis: An invasive prenatal diagnostic test performed in pregnant women to identify chromosomal abnormalities in the fetus by karyotyping. A needle is inserted into the uterus and used to remove a small amount of amniotic fluid from the sac surrounding the fetus.
Analyte: A substance or chemical constituent that is undergoing analysis. In this report analyte is used to refer to the serum constituent undergoing analysis, e.g. hCG.
Aneuploidy: Having an abnormal number of chromosomes.
Ascertainment bias (detection bias): Systematic differences between groups in how outcomes were ascertained, diagnosed or verified.
Bias: Deviation of results or inferences from the truth, or processes leading to such deviation. Any trend in the collection, analysis, interpretation, publication, or review of data that can lead to conclusions that are systematically different from the truth.
Blinded study: A study in which observers and/or subjects are kept ignorant of the group to which they are assigned. When both observers and subjects are kept ignorant, the study is referred to as double blind.
Case-control study: An epidemiological study involving the observation of cases (persons with the disease, such as cervical cancer) and a suitable control (comparison, reference) group of persons without the disease. The relationship of an attribute to the disease is examined by comparing retrospectively the past history of the people in the two groups with regard to how frequently the attribute is present. See also nested case control.
Chorionic villus sampling: An invasive prenatal diagnostic test performed in pregnant women to identify any chromosomal abnormalities in the fetus by karyotyping. A needle is used to remove a small amount of chorionic villus tissue from the placenta. The needle may be inserted via the cervix (transcervical) or abdomen (transabdominal).
Cohort study: The analytic method of epidemiologic study in which subsets of a defined population can be identified who are, have been, or in the future may be exposed or not exposed in different degrees, to a factor or factors hypothesised to influence the probability of occurrence of a given disease or other outcome. Studies usually involve the observation of a large population, for a prolonged period (years), or both.
Combined test: First trimester DS screening test based on combining 1st T MSS (PAPP-A and fβhCG) and NT with maternal age.
Confidence interval: The computed interval with a given probability, e.g. 95%, that the true value of a variable such as a mean, proportion, or rate is contained within the interval. The 95% CI is the range of values in which it is 95% certain that the true value lies for the whole population.
Confounder: A third variable that indirectly distorts the relationship between two other variables, because it is independently associated with each of the variables.
Contingent screening: All women have first trimester screening and are divided into three groups based on results: high, intermediate and low risk. High risk women have a diagnostic test; low risk women are reassured and have no further screening. All others have second trimester screening tests. For these women the results of first and second trimester testing are combined to produce one integrated result.
Cross-sectional study: A study that examines the relationship between diseases (or other health related characteristics), and other variables of interest as they exist in a defined population at one particular time.
Double test: Second trimester serum screening using AFP and hCG (fβhCG or total hCG) and incorporating maternal age.
Effectiveness: A measure of the extent to which a specific intervention, procedure, regimen, or service, when deployed in the field in routine circumstances, does what it is intended to do for a specified population.
Evidence table: A summary display of selected characteristics (e.g., methodological design, results) of studies of a particular intervention or health problem.
False negative result: A negative test result in a person who does have the condition being tested for.
False positive result: A positive test result in a person who does not have the condition being tested for.
Fetal Medicine Foundation (FMF): A registered charity (London, UK) which promotes research and training in fetal medicine.
Integrated screening: Any method which integrates measurements performed during the first and second trimester of pregnancy into a single test result.
Integrated test: Unless otherwise qualified, ‘integrated test’ refers to the integration of nuchal translucency and PAPP-A measurements in the first trimester with the quadruple test markers in the second trimester. It is also known as the fully integrated test.
Invasive testing: Used in this case to describe prenatal diagnostic tests, e.g. amniocentesis or chorionic villus sampling, which involve removal of tissue or fluid from the placenta or uterus.
Generalisability (applicability, external validity): Applicability of the results to other populations.
High risk groups
Usually refers to groups of women that have been identified as having a higher than expected, or higher than average for the population as a whole, incidence of the disease in question.

Incidence
The number of new events (cases; e.g. of disease) occurring during a certain period, in a specified population.

Independent screening
The practice of offering women who have had first trimester Down syndrome screening a second trimester screening test without consideration of the first trimester results.

Internal validity
The extent to which the design and conduct of a study are likely to have prevented bias.

Matching
The process of making a study group and a comparison group comparable with respect to extraneous factors.

Mean
Calculated by adding all the individual values in the group and dividing by the number of values in the group.

Median
Any value that divides the probability distribution of a random variable in half. For a finite population or sample the median is the middle value of an odd number of values (arranged in ascending order) or any value between the two middle values of an even number of values.

Meta-analysis
The process of using statistical methods to combine the results of different studies. The systematic and organised evaluation of a problem, using information from a number of independent studies of the problem.

Misclassification
The erroneous classification of an individual, a value, or an attribute into a category other than that to which it should be assigned.

Monte Carlo simulation
A method of generating values from a known distribution using a computer simulation for the purposes of experimentation.

Multiple regression
Any analysis of data that takes into account a number of variables simultaneously.

Negative predictive value (NPV)
The probability a person does not have the disease when the screening test is negative.
Nested case control study
A case control study in which cases and controls are drawn from the population in a cohort study. That is, the case control study is “nested” within the cohort study design so that the effects of some potential confounding variables are reduced or eliminated. A case control study can also be nested into a case series study. See also case control study, cohort study, and case series study.

Nuchal translucency
Subcutaneous fluid-filled space at the back of the neck of a fetus.

Observational study
A study in which the investigators do not seek to intervene, and simply observe the course of events.

Population-based screening programme
A population-based screening programme is one in which screening is systematically offered by invitation to a defined, identifiable population.

Positive predictive value (PPV)
The probability a person actually has the disease when the screening test is positive.

Prevalence
The number of events in a given population at a designated time (point prevalence) or during a specified period (period prevalence).

Primary care
First contact, continuous, comprehensive and coordinated care provided to individuals and populations undifferentiated by age, gender, disease or organ system.

Primary research/study
‘Original research’ in which data are collected. The term primary research/study is sometimes used to distinguish it from a secondary study (re-analysis of previously collected data), meta-analysis, and other ways of combining studies (such as economic analysis and decision analysis). (Also called original study.)

Quadruple test
Second trimester serum screening using AFP, hCG (fβhCG or total hCG), uE3 and inhibin and incorporating maternal age. Also referred to as the quad test.

Randomised controlled trial
An epidemiologic experiment in which subjects in a population are randomly allocated into groups to receive or not receive an experimental preventive or therapeutic procedure, manoeuvre, or intervention.
Randomised controlled trials are generally regarded as the most scientifically rigorous method of hypothesis testing available in epidemiology.

Recall bias
Systematic bias due to differences in accuracy or completeness of recall or memory of past events or experiences.

Reference standard
An independently applied test that is compared to a screening or diagnostic test being evaluated in order to verify the latter’s accuracy. A reference standard, therefore, provides an accurate or “truth” diagnosis for verification of positive and negative diagnoses.

Relative risk (RR)
The ratio of the risk of disease or death among the exposed to the risk among the unexposed. It is a measure of the strength or degree of association applicable to cohort studies and RCTs.

Reliability
The degree to which results obtained by a measurement procedure can be replicated. Lack of reliability can arise from divergences between observers or measurement instruments, measurement error, or instability in the attribute being measured.

Risk factor
An exposure or aspect of personal behaviour or lifestyle, which on the basis of epidemiologic evidence is associated with a health-related condition.

Screening
The examination of asymptomatic people in order to classify them as likely or unlikely to have the condition that is the object of screening.

Secondary care
Surgical and medical services that are generally provided in a hospital setting. In many cases, access to these services is by referral from a primary care health professional such as a general practitioner.
Secondary research/study
Re-analysis of previously collected data, meta-analysis, and other ways of combining studies (such as economic analysis and decision analysis).

Sequential screening
Screening where results of screening in the 1st trimester are combined with 2nd trimester screening in either an independent, contingent, stepwise or integrated manner (see separate glossary entries).

Selection bias
Any error in selecting the study population such that the people who are selected to participate in a study are not representative of the reference population or, in analytic studies, the comparison groups are not comparable.

Sensitivity analysis
A method to determine the robustness of an assessment by examining the extent to which results are affected by changes in methods, values of variables, or assumptions.

Sensitivity (Se)
The proportion of truly diseased persons in a screened population who are identified as diseased by a screening test. Sensitivity is a measure of the probability of correctly diagnosing a case, or the probability that any given case will be identified by the test.

Specificity (Sp)
The proportion of truly non-diseased persons who are so identified by a screening test. It is a measure of the probability of correctly identifying a non-diseased person with a screening test.

Stepwise screening
All women have first trimester screening and are divided into two groups based on results: high and low risk. Those with a high risk result are offered diagnostic testing. All others have second trimester screening tests. For these women the results of first and second trimester testing are combined to produce one integrated result.

Statistical difference
A result that is unlikely to have happened by chance. The usual threshold for this judgement is that the results, or more extreme results, would occur by chance with a probability of less than 0.05 if the null hypothesis were true. Statistical tests produce a p-value used to assess this.
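As an illustration of the Sensitivity (Se), Specificity (Sp), PPV and NPV entries in this glossary, the following minimal sketch computes each measure from the four cells of a two by two table. The class name, function name and the counts are hypothetical and chosen only for the example; they are not data from any appraised study. The false positive rate is taken as 1 minus specificity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-classifying screening result against true status."""
    true_pos: int   # affected pregnancy, screen positive
    false_pos: int  # unaffected pregnancy, screen positive
    false_neg: int  # affected pregnancy, screen negative
    true_neg: int   # unaffected pregnancy, screen negative


def screening_metrics(t: TwoByTwo) -> dict:
    """Return the glossary measures as proportions between 0 and 1."""
    affected = t.true_pos + t.false_neg
    unaffected = t.false_pos + t.true_neg
    sensitivity = t.true_pos / affected
    specificity = t.true_neg / unaffected
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,
        "ppv": t.true_pos / (t.true_pos + t.false_pos),
        "npv": t.true_neg / (t.true_neg + t.false_neg),
    }


# Hypothetical counts for 1000 screened women (illustration only).
example = TwoByTwo(true_pos=8, false_pos=50, false_neg=2, true_neg=940)
metrics = screening_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a rare condition, the example shows how a test can have high sensitivity and specificity and still a low PPV, because unaffected pregnancies greatly outnumber affected ones.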
Statistical modelling
See Monte Carlo simulation.

Systematic review
Literature review reporting a systematic method to search for, identify and appraise a number of independent studies.

Triple test
Second trimester serum screening using AFP, hCG (fβhCG or total hCG), and uE3 and incorporating maternal age.

Trisomy
Presence of an extra chromosome in each cell.

True negative
A test correctly identifies a person without the condition.

True positive
A test correctly identifies a person with the condition.

Variance
A measure of the variation shown by a set of observations, defined by the sum of the squares of deviation from the mean, divided by the number of degrees of freedom in the set of observations.

Chapter 1: Background

Down syndrome (DS) or Trisomy 21 is a chromosomal abnormality resulting from an extra copy of chromosome 21. It is the most common aneuploidy in live born infants, with a population incidence of about 1 in 1000 births. The severity of the disorder can vary but outcomes may include severe developmental delay, congenital heart defects and physical growth impairment. There are no national data on the prevalence of DS in New Zealand, but a yearly prevalence of 1.17/1000 births has been reported for the years 1997-2003 (Chang, 2006). Since the 1930s, the risk of carrying a fetus with DS has been recognised to increase with maternal age. A maternal age of 35 years or more at the expected date of delivery has been used as the basis for DS screening since the 1970s. Women older than this age cut-off are offered prenatal diagnosis of DS by either amniocentesis, usually from 16 weeks gestation, or chorionic villus sampling from approximately 11 weeks gestation (Nicolaides, 2004).
Each of these tests carries a risk of fetal loss and other complications, and this method of screening has been criticised as resulting in a high number of unnecessary invasive procedures and procedure-related fetal losses. Additionally, screening based on maternal age alone identifies only approximately 30% of affected fetuses (Chang, 2006), as most children with DS are born to women under 35 years of age. Efforts to develop screening methods to identify mothers at high risk of carrying an affected fetus have focussed on nuchal translucency (NT) thickness and maternal serum biochemistry as well as the presence of other sonographic markers. Abnormal maternal levels of alpha-fetoprotein (AFP), unconjugated estriol (uE3) and human chorionic gonadotropin (hCG) were initially noted to be associated with the presence of DS, and the importance of two other biomarkers, Inhibin-A and pregnancy-associated plasma protein A (PAPP-A), was later identified. There has been considerable debate, however, regarding the best combination of screening tests and whether they should be offered in the first or second trimester, or both.

NEED FOR SYSTEMATIC REVIEW

Overseas screening

In Australia, second trimester serum screening and first trimester serum screening in combination with nuchal translucency measurements are the established standards of care, and have been utilised with increasing rates of uptake since the 1980s (Chang, 2006; O’Leary et al. 2006). The American College of Obstetricians and Gynecologists (ACOG) recommends offering invasive testing to women older than 35 years and to women with a positive maternal serum screen result (Slack et al. 2006). In the U.K., the National Institute for Clinical Excellence (NICE) suggests all pregnant women should be offered screening for Down syndrome based on methods which provide at least a 60% DR and no more than a 5% FPR.
They recommend nuchal translucency, first or second trimester maternal serum screening, or a combination of NT and serum screening (NICE, 2003).

Current New Zealand screening

Currently New Zealand has no formalised antenatal screening programme for DS. The Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) recommends offering screening or prenatal diagnosis based on maternal age or having a previous affected pregnancy (RANZCOG 2004). It is suggested that women who are not considered to be at increased risk should be made aware of the availability of screening. The screening methods currently recommended by RANZCOG are nuchal translucency, first trimester combined, or second trimester screening. These tests, however, are not available in all localities and serum screening is not available free of charge; consequently they have had a low uptake in New Zealand in the past (Chang, 2006). Screening is conducted on an ad hoc basis, mostly based on ultrasound scanning. In a recent report to the National Screening Unit (Ministry of Health) entitled “Assessment of Antenatal Screening for DS in New Zealand” it was recommended that invasive diagnostic testing based on age and nuchal thickness screening alone should be urgently reviewed, and that a national screening programme should be implemented based on “best practice” (Stone and Austin, 2006). The National Screening Unit has commissioned New Zealand Health Technology Assessment (NZHTA) to undertake a systematic review to assess options for DS screening in order to inform policy on a national screening programme.

SCREENING - GENERAL OVERVIEW

The focus of this review is secondary prevention, specifically screening.
The National Health Committee (NHC) of New Zealand defines screening as “…a health service in which members of a defined population, who do not necessarily perceive they are at risk of, or are already affected by, a disease or its complications, are asked a question or offered a test to identify those individuals who are more likely to be helped than harmed by further tests or treatments to reduce the risk of disease or its complications” (NHC, 2003, p29). The effectiveness of a screening programme depends upon high levels of coverage of the population. Criteria to inform the assessment of screening programmes in New Zealand have been developed by the NHC (2003). These criteria are listed in Table 1 below.

Table 1: Criteria for assessing screening programmes

1. The condition is a suitable candidate for screening.
2. There is a suitable test.
3. There is an effective and accessible treatment or intervention for the condition identified through early detection.
4. There is high quality evidence, ideally from randomised controlled trials, that a screening programme is effective in reducing mortality or morbidity.
5. The potential benefit from the screening programme should outweigh the potential physical and psychological harm (caused by the test, diagnostic procedures and treatment).
6. The health care system will be capable of supporting all necessary elements of the screening pathway, including diagnosis, follow-up and programme evaluation.
7. There is consideration of social and ethical issues.
8. There is consideration of cost-benefit issues.

The scope of this review is narrow and limited to partially addressing criterion two, and criteria five and six.

AIM

To systematically identify and appraise international evidence for the antenatal use of technologies and screening strategies for DS.

REVIEW SCOPE

The review scope was developed in consultation with the Director of the NZHTA and the National Screening Unit. There are two parts to the review.
Part A is concerned with the validity of DS screening methods and with any difficulties implementing screening strategies. Part B is concerned with the impact of screening on amniocentesis and chorionic villus sampling, and the safety of these procedures.

For Part A, studies were included for appraisal if they reported comparisons of the DR and FPR of screening using any of the following screening methods: maternal age, first trimester serum screening, first trimester nuchal translucency screening, second trimester serum screening, or screening combining the results of 1st and 2nd trimester screening (Cuckle et al. 2005). The scope also includes any difficulties implementing any of the screening strategies. These difficulties include: successful measurement, limitations of access, quality control issues, reliability of different strategies, and incomplete application of screening strategies. For Part A, the search was limited to full reports published in English between January 2000 and 29 June 2006. Full details of inclusion and exclusion criteria are provided in the next chapter.

For Part B, studies were included for review if they reported the proportion of women tested by amniocentesis and chorionic villus sampling after different DS screening results, how the introduction of DS screening programmes impacted on testing rates, what proportion of fetal loss is associated with testing, and what interventions are available that reduce the risk of fetal loss from these procedures. The search was limited to full reports published in English between January 2000 and 7 August 2006. Full details of inclusion and exclusion criteria are provided in the next chapter.

REVIEW QUESTIONS

Part A

1. What is the case DR for a given FPR resulting from antenatal screening for DS for the following screening strategies/combinations of tests:
- maternal age (MA)
- first trimester serum screening
- second trimester serum screening
- first trimester nuchal translucency screening
- integrated first and second trimester screening
- sequential first and second trimester testing: independent screening
- sequential first and second trimester testing: stepwise and contingent screening.

2. What difficulties have been experienced in the implementation of any of the following screening strategies/combinations of tests:
- maternal age
- first trimester serum screening
- second trimester serum screening
- first trimester nuchal translucency screening
- integrated first and second trimester screening
- sequential first and second trimester testing: independent screening
- sequential first and second trimester testing: stepwise and contingent screening.

Part B

3. In relation to amniocentesis and chorionic villus sampling (CVS) for the detection of DS:
- what proportion of women are tested by amniocentesis or CVS following receipt of a high risk result on screening
- what proportion of women are tested by amniocentesis or CVS following receipt of a low risk result on screening
- what proportion of women are tested by amniocentesis or CVS without having had a screening test
- what impact does the introduction of a screening programme have on testing rates
- what is the proportion of fetal loss associated with testing
- what interventions are available that reduce the risk of fetal loss from testing.

DOWN SYNDROME SCREENING METHOD TERMINOLOGY

The terms used in the literature to describe Down syndrome screening methods that use both first and second trimester screening tests vary. Some refer to all these methods as sequential methods (Cuckle et al. 2005). Others refer to only the stepwise and the contingent screening methods as sequential methods, i.e. excluding the integrated test (Malone et al. 2005).
In some papers the “stepwise” strategy is known as the “sequential” strategy (Palomaki et al. 2006). We have chosen to adopt the terminology of the FASTER study (Malone et al. 2005) and SURUSS (Wald et al. 2003b). Any method combining the results of tests taken in the first and second trimesters into one test (with first trimester results held until completion of second trimester testing) is an integrated screening method (fully integrated or serum integrated). A screening method which discloses and acts upon the results of first trimester screening, and then offers a proportion of women (based on risk cut-offs) a second trimester screening test is termed sequential screening. Sequential screening strategies describe both stepwise and contingent screening methods.

The terminology does have some drawbacks. The peer reviewer of this document has highlighted the fact that sequential screening methods offer an integrated test to women who have second trimester screening. However, it may be confusing to use the overarching term “integrated screening”, as well as having an “integrated test”. Table 2 describes the four conventional screening methods that combine information from both first and second trimester screening tests.

Table 2: Down syndrome screening methods that use both first and second trimester screening tests

Screening method (overarching term): Serum integrated (integrated)
1st trimester tests: PAPP-A
2nd trimester tests: Quad test
Description of screening strategy: All women have first and second trimester screening tests. Results are held until completion of second trimester screening when all results are combined into one result.

Screening method (overarching term): Fully integrated or integrated test (integrated)
1st trimester tests: NT and PAPP-A
2nd trimester tests: Quad test
Description of screening strategy: All women have first and second trimester screening tests. Results are held until completion of second trimester screening when all results are combined into one result.

Screening method (overarching term): Stepwise (sequential)
1st trimester tests: Combined test (NT, PAPP-A, and either fβhCG or total hCG)
2nd trimester tests: Quad test
Description of screening strategy: All women have first trimester screening and are divided into two groups based on results: high and low risk. Those with a high risk result are offered diagnostic testing. All others have second trimester screening tests. For these women the results of first and second trimester testing are combined to produce one integrated result.

Screening method (overarching term): Contingent (sequential)
1st trimester tests: Combined test (NT, PAPP-A, and either fβhCG or total hCG)
2nd trimester tests: Quad test
Description of screening strategy: All women have first trimester screening and are divided into three groups based on results: high, intermediate and low risk. High risk women have a diagnostic test; low risk women are reassured and have no further screening. All others have second trimester screening tests. For these women the results of first and second trimester testing are combined to produce one integrated result.

In addition, sometimes a screening method employs the same strategy as a conventional integrated or sequential screening method but different tests are used. An example is measuring NT in the first trimester and then offering those with a high risk result a diagnostic test, and those with a low risk result a 2nd trimester double test. This method uses the same strategy as the stepwise screening method and in this review this is described as combining tests in a “stepwise manner”. In some appraised studies women were offered a first trimester screening test, then women who were screen negative were offered a second trimester screen without consideration of the first trimester results. This practice of combining tests in an independent manner leads to erroneous risk estimates (Wald 2006c) and is not recommended. For definitions of all other Down syndrome screening methods please refer to the glossary.

STRUCTURE OF REPORT

This report is divided into two Parts (A and B) and chapters.
Chapter 2 (Review Methodology) includes the selection criteria (study inclusion and exclusion criteria), the search strategy, and outcomes considered for Part A and Part B of the report. The results section of the review includes primary and secondary research considered for Part A (Chapters 3, 4 and 5) and Part B (Chapters 6, 7, and 8) of the review. This section also provides an overview of the appraised papers and evidence, as well as detailed evidence tables, which present each appraised study’s methods, results, limitations, and authors’ conclusions. Chapter 9 (Discussion) summarises results, briefly discusses methodological limitations in the area, and presents key conclusions.

Chapter 2: Review Methodology

SELECTION CRITERIA

PART A: Review questions 1 and 2

Study inclusion criteria

Publication type
Studies published between January 2000 and June 2006 inclusive, in the English language, including primary (original) research (published as full original reports) and secondary research (systematic reviews and meta-analyses) appearing in the published literature.

Context
Studies reporting on the comparison of the diagnostic accuracy of two screening strategies. Screening strategies included maternal age, first trimester serum screening, first trimester nuchal translucency screening, second trimester serum screening, and sequential or integrated screening.

Outcomes
Measures of case DRs for a given FPR are presented in the results as well as any implementation difficulties. Implementation difficulties include unsuccessful measurement, limitations of access to the screening test, quality control issues in conducting the test, reliability of different screening strategies and incomplete application.

Study design
Systematic reviews or primary research comparing different screening strategies.
Sample size
Studies with samples of at least 100 participants.

Study exclusion criteria

Research papers were excluded if they:
- were not published in English
- were “correspondence”, editorials, expert opinion articles, non-systematic reviews, conference proceedings, or abstracts
- reported animal studies
- did not clearly describe their methods and results, or had significant discrepancies
- had been superseded by a later publication with longer follow-up data and overlap in the patient population.

PART B: Review question 3

Study inclusion criteria

Publication type
Studies published between January 2000 and 7 August 2006 inclusive, in the English language, including primary (original) research (published as full original reports) and secondary research (systematic reviews and meta-analyses) appearing in the published literature.

Context
Studies reporting the proportion of women tested by amniocentesis and chorionic villus sampling following receipt of either a low risk or high risk result, and the proportion of women having these procedures without antenatal DS screening. Also, studies reporting the impact of the introduction of a screening programme on testing rates, studies reporting the proportion of fetal loss associated with testing, and studies describing interventions that reduce the risk of fetal loss from these procedures.

Outcomes
The proportion of women opting for invasive testing as a function of their screening result, the change in overall proportion of tests with the introduction of a screening programme, proportion of fetal loss in women who have an invasive test and those who do not, changes in the rate of fetal loss as a result of risk factors.

Study design
Systematic reviews, randomised controlled trials, cohort studies, case control studies, cross sectional studies and before and after designs.

Sample size
Studies with samples of at least 100 participants.
Study exclusion criteria

Research papers were excluded if they:
- were not published in English
- were “correspondence”, editorials, expert opinion articles, non-systematic reviews, conference proceedings, or abstracts
- reported animal studies
- did not clearly describe their methods and results, or had significant discrepancies
- had been superseded by a later publication with longer follow-up data and overlap in the patient population.

SEARCH STRATEGY

A systematic method of literature searching and selection was employed in the preparation of this review. Searches were limited to English language material published from January 2000 onwards. The searches were completed on 29 June 2006 for Part A and 7 August 2006 for Part B.

Principal sources of information

The following databases were searched (using the search strategy outlined in Appendix 1):

Bibliographic databases
- Cinahl
- Cochrane Register of Controlled Trials
- Current Contents
- Embase
- Medline
- PubMed (last 60 days)
- Science Citation Index
- Social Science Citation Index

Review databases
- ACP Journal Club
- Cochrane Database of Systematic Reviews
- Database of Abstracts of Reviews of Effectiveness
- Health Technology Assessment database
- NHS Economic Evaluation database

Other
- Clinicaltrials.gov
- Current Controlled Trials
- TRIP database
- UK National Screening Committee Down’s Syndrome Screening Programme

References of retrieved papers were scanned for relevant publications. Hand searching of journals, contacting of manufacturers, or contacting of authors for unpublished research was not undertaken in this review. A complete list of the sources searched for this review is given in Appendix 2.
Search terms used: Review questions 1 and 2

Index terms from Medline (MeSH terms): Down syndrome, pregnancy trimester-first, nuchal translucency measurement, pregnancy trimester-second, exp prenatal diagnosis, pregnancy-associated plasma protein-A, chorionic gonadotropin-beta subunit-human, alpha-fetoproteins, estriol-blood, inhibins-blood, ultrasonography-prenatal, maternal age, mass screening, false positive reactions, false negative reactions.

Index terms from Embase: Down syndrome, fetus echography, first trimester pregnancy, second trimester pregnancy, maternal serum, alpha fetoprotein, inhibin A, maternal age, exp prenatal diagnosis, pregnancy associated plasma protein A, chorionic gonadotropin beta subunit, screening test, mass screening, prenatal screening, screening.

The above index terms were used as keywords in databases where they were not available and in those databases without controlled vocabulary.

Additional keywords (not standard index terms) were used in all databases: trisomy 21, papp-a, beta hcg, uE3, unconjugated estriol, unconjugated oestriol, inhibin a, afp, ((integrated or sequential or contingent or step-wise) adj (screen$ or test$)), screen$, test$.

Non-English language references, letters and news items were excluded using database limits where available. Filters for study design were not used in the search in order to retrieve a wider range of references from which the final selection of studies could be made.

Search terms used: Review question 3

Index terms from Medline (MeSH terms): Down syndrome, chromosomes-human-pair-21, trisomy, amniocentesis, chorionic villi sampling, karyotyping.

Index terms from Embase: Down syndrome, trisomy 21, amniocentesis, chorion villus sampling, karyotyping, fetus karyotyping.

The above index terms were used as keywords in databases where they were not available and in those databases without controlled vocabulary.
Additional keywords (not standard index terms) were used in all databases: trisomy 21, chorionic vill$.

Non-English language references, letters and news items were excluded using database limits where available. Filters for study design were not used in the search in order to retrieve a wider range of references from which the final selection of studies could be made.

STUDY SELECTION

Studies were selected for appraisal using a two-stage process. Initially, the titles and abstracts (where available) identified from the search strategy were scanned and excluded as appropriate. The full text articles were retrieved for the remaining studies and these were appraised if they fulfilled the study selection criteria outlined above.

PART A: Review questions 1 and 2

There were 1138 studies identified by the search strategy. After studies were excluded on the basis of titles and abstracts, 219 full text articles were obtained. A further 153 of these full text articles did not fulfil the inclusion criteria and are presented in Appendix 3. Reasons for rejecting these papers for appraisal are:
- outcomes did not include a comparison of two or more screening methods (74)
- methods were not fully described (2)
- papers were either a letter (1) or comment (12)
- non-systematic review (3)
- data in the study were superseded (1)
- sample size less than 100 (3)
- did not fit the scope of the review for other reasons (could not extract DR and FPR from paper, screening methods were outside the scope of the review) (57).

Therefore, 66 articles were fully appraised and are included in this report. Along with other cited publications (e.g. those providing background material), these are presented in the References.

PART B: Review question 3

There were 1116 studies identified by the search strategy. After studies were excluded on the basis of titles and abstracts, 191 full text articles were obtained.
A further 156 of these full text articles did not fulfil the inclusion criteria and are presented in Appendix 4. Reasons for rejecting these papers for appraisal are:
- outcomes did not include the proportion of invasive tests requested or performed (41)
- methods were not fully described (5)
- papers were either a letter (1), or comment (24)
- non-systematic review (3)
- data in the study was superseded (6)
- sample size less than 100 (8)
- uncontrolled study (9)
- did not fit the scope of the review for other reasons (screening methods were outside the scope of the review, cost-effectiveness analysis of effect of invasive testing) (59).

Therefore, 35 articles were fully appraised and are included in this report, and presented in the References.

APPRAISAL OF STUDIES

The evaluation initially classified studies according to National Health and Medical Research Council (NHMRC, 2000) levels of evidence criteria, so as to rank them in terms of quality according to a predetermined “evidence hierarchy” (see Appendix 5). These evidence levels are only a broad indicator of the quality of the research. The levels describe groups of research which are broadly associated with particular methodological limitations. However, these levels are only a general guide to quality because each study may be designed and/or conducted with particular strengths and weaknesses. High level evidence is provided by a well conducted randomised controlled trial. NHMRC checklists of quality issues relevant to each study design were also used in appraising the studies.
Summaries of appraisal results are shown in tabular form as Evidence Tables and include:
- source reference (authors, publication date) and country where study was principally conducted
- study setting
- design
- evidence level (applying NHMRC criteria)
- description of screening strategies
- sample (patient characteristics including number of women and patient inclusion and exclusion criteria)
- eligible outcome measures used and verification of outcomes
- results of analyses on eligible outcomes, including statistically tested comparisons and reporting relevant statistical data
- comments on the study’s limitations relevant to its internal and external validity
- authors’ conclusions
- reviewer’s conclusions.

Conclusions are drawn based on the study design and the specific problems associated with individual studies.

Systematic reviews are described and critiqued in terms of their search strategy, inclusion/exclusion criteria, data synthesis and interpretation. Note that such papers were considered principally as background information as they invariably do not use the same selection criteria as this review and do not consider subsequently published research.

KEY OUTCOME MEASURES FOR PRIMARY STUDIES

Assessment of validity of a diagnostic or screening test

The diagnostic or screening test performance includes consideration of validity of the test. In the context of screening for DS, the outcome measures of choice for the assessment of predictive performance are the case DR and the FPR. The case DR is the proportion of women with Down’s syndrome pregnancies who have positive results and is therefore equivalent to the sensitivity. The FPR is 1-specificity. These measures are calculated based on presentation of results as shown in Table 3.

Table 3. Assessment of validity of a diagnostic test

| Diagnostic test | Reference test: Positive | Reference test: Negative |
| Positive | a | b |
| Negative | c | d |
| Total sample size | n1 | n2 |

Based on Table 3, measures of validity, and 95 percent confidence intervals, were calculated using the following formulae:

Sensitivity: $a/(a+c) = a/n_1$

Confidence interval for sensitivity: $p \pm 1.96\sqrt{pq/n_1}$, where $p = a/(a+c)$ and $q = c/(a+c)$

FPR: $b/(b+d) = b/n_2$

Confidence interval for FPR: $p \pm 1.96\sqrt{pq/n_2}$, where $p = b/(b+d)$ and $q = d/(b+d)$

If either $np$ or $n(1-p)$ was less than five, confidence intervals based on the normal approximation to the binomial distribution, using the formulae above, were considered unreliable and exact methods based on the binomial distribution were used to calculate the confidence interval. Stata version 7.0 was used for these calculations.

Assessment of quality

Assessment of quality was based on internationally accepted criteria (Irwig and Glaziou 1996; Irwig et al. 1994; Jaeschke et al. 1994). These were:
- there was an independent blind comparison with a reference gold standard;
- the results of the test and reference standard were assessed independently of each other;
- the sample of patients in the study included an appropriate spectrum of patients similar to those in whom the test is likely to be used in New Zealand;
- the results of the test being evaluated did not influence the decision to perform the standard reference test (verification bias); and
- there was sufficient detail in the study report to permit replication of the test.

For most studies of the performance of DS screening the choice of reference standard depends on the results of the screening test. Those with a positive screen will be offered an invasive diagnostic test with either CVS or amniocentesis, and where the screen is negative infants will normally undergo a paediatric examination to exclude DS.
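Returning to the validity measures defined with Table 3, the calculation can be sketched in a few lines of Python. This is a minimal illustration only: the report's own calculations were done in Stata, the function names are hypothetical, the counts at the end are made up, and the exact interval is assumed here to be the Clopper-Pearson interval, which the report does not name.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _bisect_decreasing(f, lo=0.0, hi=1.0, steps=60):
    """Root of a function that falls from positive to negative on [lo, hi]."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x events in n trials (assumed exact method)."""
    lower = 0.0 if x == 0 else _bisect_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p)))
    upper = 1.0 if x == n else _bisect_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper

def proportion_ci(x, n, z=1.96):
    """Return (estimate, lower, upper, method) for the proportion x/n."""
    p = x / n
    # n*p < 5 or n*(1-p) < 5 is equivalent to x < 5 or n - x < 5.
    if x < 5 or n - x < 5:
        lower, upper = exact_ci(x, n)
        return p, lower, upper, "exact"
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width, "normal"

def validity(a, b, c, d):
    """Sensitivity (case DR) and FPR with 95% CIs from the Table 3 cells."""
    return {
        "sensitivity": proportion_ci(a, a + c),  # a / n1
        "fpr": proportion_ci(b, b + d),          # b / n2
    }

# Made-up counts: 30 of 40 affected pregnancies screen positive,
# 500 of 10,000 unaffected pregnancies screen positive.
result = validity(a=30, b=500, c=10, d=9500)
```

Switching on the raw counts rather than on the products avoids floating point rounding at the threshold of five.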
Therefore, the comparison with a reference standard will not be an independent blinded comparison and there may be verification bias. As this is the case for the majority of studies it has not been included as a limitation unless there are few details given of the efforts to determine outcomes in live infants.

It should be noted however that the difference in timing of the application of the reference standard makes it difficult to accurately determine the validity of antenatal DS screening methods. DS cases detected by screening are detected earlier (usually in 1st or 2nd trimester) than those missed by screening (at term), and DS fetuses are more likely to miscarry than unaffected fetuses. Many DS pregnancies detected by screening and having a termination of pregnancy would have miscarried if they had not been screened, and some DS pregnancies not detected by screening will not be known to investigators as they will have spontaneous fetal loss (Wald et al. 2003b). Also, screening markers may preferentially detect DS cases which are likely to miscarry. Results of studies that determine the performance of screening at 1st trimester or 2nd trimester will have higher DRs than those based on detecting all DS cases at term (Wald et al. 2003b). In reality this latter situation will not occur as it would require an observational study which would be unethical (Wald et al. 2003b).

Limitations of the review

This study has used a structured approach to review the literature. However, there were some inherent limitations with this approach. Namely, systematic reviews are limited by the quality of the studies included in the review and the review’s methodology.

This review has been limited by the restriction to English language studies. Restriction by language may result in study bias, but the direction of this bias cannot be determined. In addition, the review has been limited to the published academic literature, and has not appraised unpublished work.
Restriction to the published literature is likely to lead to bias since the unpublished literature tends to consist of studies not identifying a significant result. Papers published pre-2000 were not considered as these were thought to predate current DS screening strategies.

The studies were initially selected by examining the abstracts of these articles. Therefore, it is possible that some studies were inappropriately excluded prior to examination of the full text article. However, where detail was lacking or ambiguous, papers were retrieved as full text to minimise this possibility.

This review was confined to an examination of the technical aspects of screening and diagnostic testing and did not consider the acceptability, or any ethical, economic or legal considerations associated with these interventions. Interventions were not assessed in terms of their impact on general quality of life.

No studies included in this review were conducted in New Zealand, and therefore, their applicability to the New Zealand population and context may be limited and needs to be considered. However, as this review was concerned with evidence around diagnostic accuracy and technical aspects of screening and diagnostic testing, rather than wider issues which may be influenced by national characteristics, the evidence should be mostly applicable to the New Zealand situation.

Although two researchers appraised the articles included in this review they did not cross validate the data extraction and appraisal process.

The review scope was developed with the assistance of National Screening Unit staff at the Ministry of Health. It had the goal of providing information that would inform the policy process in determining whether a national antenatal screening programme for DS should be implemented in New Zealand, and if so which screening strategy should be used.
This review was conducted over a limited timeframe (June, 2006 to December, 2006).

This review has greatly benefited from the advice provided by the consultant peer reviewer. However, it has not been exposed to wider peer review.

For a detailed description of interventions and evaluation methods, and results used in the studies appraised, the reader is referred to the original papers cited.

Chapter 3: Accuracy of First Trimester Screening Strategies

PRIMARY RESEARCH: STUDY DESIGNS AND QUALITY

The search identified 27 eligible papers comparing the accuracy of first trimester screening strategies, including single and combined test options. Below is an overview of study designs and aspects of quality represented by these studies. Full details of the papers appraised, including methods, key results, limitations and conclusions, are provided in evidence Table 10 (pages 24-61). Studies with directly observed DRs and FPRs are presented first, followed by studies where the results are estimations of performance using statistical modelling. For each of these two groups of papers, studies are presented in chronological order of publication within the table.

Study design, and grading

Of the 27 papers comparing first trimester screening all were based on primary research; there were no papers based on secondary research that fitted the inclusion and exclusion criteria of the review protocol. Twenty papers reported directly observed (or age standardised) performance, and seven reported DRs and FPRs estimated by statistical modelling using Monte Carlo simulation.

Most DS screening strategies use a degree of statistical modelling to determine an individual’s risk.
Each marker level is converted to a multiple of the gestational age specific median (MoM) of unaffected pregnancies (either from the study population or meta-analysis), and the likelihood ratio (LR) is calculated from the overlapping multivariate Gaussian distributions of affected and unaffected pregnancies (Cuckle 2003; Nicolaides 2004). The LR for each screening test is combined with the a priori risk (maternal age specific risk of DS) to produce a risk estimate for each individual, which becomes the result of the screening method (Morris and Wald 2005). A risk level cut-off is then used to determine whether the screening result is positive or negative and hence the DR and FPR of the particular screening method (Morris and Wald 2005). The accuracy of the risk estimate depends on the parameters used in the model: the means, standard deviations, and correlation coefficients (Cuckle 2003). It is important that these parameters are obtained from meta-analysis of published results with tailoring of variances and covariances to the local population (Cuckle 2003). This is especially true for the parameters of unaffected pregnancies as individual studies will have a relatively small number of cases.

Another use of statistical modelling is to estimate screening performance of different screening strategies (Cuckle 2003). The papers described in this review as statistical modelling papers use Monte Carlo simulation to sample from the Gaussian distributions to estimate screening performance.

Of the 20 papers reporting directly observed performance, eighteen were cohort studies and two of these were retrospective cohort studies (O'Leary et al. 2006; Spencer et al. 2003b). The sample sizes in the cohort studies of routinely screened women ranged from 1836 to 30,564. A study of twin pregnancies had a sample size of 200 (Gonce et al. 2005). There were two case-control studies. One was a nested case-control study with a sample size of 139 (Marsk et al. 2006).
The other case-control study had a sample size of 463 (Hallahan et al. 2000). While all these studies were graded III-2 as per NHMRC, the ideal design for determining the DR and FPR of a screening strategy is a large prospective cohort (or nested case-control) study where all women receive all screening methods being considered (Deeks 2001). Case-control studies (other than nested case-control studies) that recruit healthy participants without the disease, separately from cases, will overestimate validity when compared to cohort studies with populations representative of a routinely screened population (Deeks 2001).

Study setting and samples

The proportion of DS in the cohort studies of routinely screened women ranged from 0.14% to 0.95%. While none of these studies were restricted to women of advanced maternal age or younger women, there were variations in the maternal age distribution between studies. One paper screened women attending a prenatal diagnosis centre and about half were having amniocentesis for AMA or other risk factors (De Biasio et al. 2001). The paper had a relatively high mean maternal age (31 yrs + 8 months) and a high proportion of DS in the sample (0.87%). As maternal age is associated with DS, studies in populations with a larger proportion of older women may produce higher DR than studies in the general population. For the results to be applicable to a given population the maternal age distribution of the population in the study should be similar to that of the population in question. Note also that the proportion of DS relates to DS detected in the 1st trimester and will be larger than that detected in 2nd trimester or at term (because of the increase in spontaneous fetal loss in DS pregnancies).

Both of the case-control studies excluded cases of chromosomal disorders other than DS.
The proportion of DS in the nested case-control study was 22%, and in the other case-control study was 13.6%. The majority of studies excluded multiple births. One study specifically looked at twin pregnancies (Gonce et al. 2005), and one included twins in the analysis (Soergel et al. 2006). The issue of DS screening in twin pregnancies (or multiples) is particularly important as twin pregnancies increase with maternal age and assisted reproductive technology (ART) pregnancies are also more common with advanced maternal age (Gonce et al. 2005). Screening for DS in twins is difficult as serum screening does not provide an individual risk for each twin, and the serum results of an unaffected twin may “normalise” the results of an affected twin (Gonce et al. 2005). It is especially important for screening methods to be accurate as there is an increased chance of miscarriage with invasive diagnostic testing (Gonce et al. 2005). Only one study included screening in primary care (Niemimaa et al. 2001). Three of the single centre studies were in university hospitals (Gonce et al. 2005; Soergel et al. 2006; Wojdemann et al. 2005). These studies may produce results which are difficult to reproduce in regional hospital settings, particularly where the screening strategy includes measurement of NT. As seen in the results section (Difficulties implementing any of the screening strategies) it is important for NT to be measured in a setting with high quality equipment, well conducted quality control programmes and staff who are highly trained and who undergo regular audit. 
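The risk calculation described under "Study design, and grading" (marker level to MoM, MoM to likelihood ratio, likelihood ratio combined with the maternal age specific prior, then a risk cut-off) can be sketched as follows. This is a simplified illustration, not the method of any appraised study: the markers are treated as uncorrelated so that univariate Gaussian densities on log10 MoM can be multiplied, whereas the studies use multivariate Gaussian distributions with correlation coefficients. All means and standard deviations, and all function names, are hypothetical.

```python
import math

# Hypothetical (mean, sd) of log10 MoM in affected and unaffected pregnancies.
# Illustrative values only; they are not taken from the report.
PARAMS = {
    "papp_a": {"affected": (-0.35, 0.28), "unaffected": (0.0, 0.25)},
    "free_bhcg": {"affected": (0.30, 0.27), "unaffected": (0.0, 0.26)},
}

def to_mom(level, gestation_median):
    """Express a marker level as a multiple of the unaffected median."""
    return level / gestation_median

def gaussian_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def likelihood_ratio(marker, mom):
    """Ratio of the affected to the unaffected density at this MoM."""
    x = math.log10(mom)
    affected = gaussian_pdf(x, *PARAMS[marker]["affected"])
    unaffected = gaussian_pdf(x, *PARAMS[marker]["unaffected"])
    return affected / unaffected

def posterior_risk(prior_risk, moms):
    """Combine the age specific prior risk with each marker's LR.

    prior_risk is a probability (e.g. 1/350); moms maps marker name to MoM.
    Markers are assumed independent here, which is a simplification.
    """
    odds = prior_risk / (1 - prior_risk)
    for marker, mom in moms.items():
        odds *= likelihood_ratio(marker, mom)
    return odds / (1 + odds)

def screen_positive(risk, cutoff=1 / 300):
    """Apply a risk cut-off; 1 in 300 is used purely as an example."""
    return risk >= cutoff

prior = 1 / 350  # hypothetical maternal age specific risk
high = posterior_risk(prior, {"papp_a": 0.4, "free_bhcg": 2.0})
low = posterior_risk(prior, {"papp_a": 1.5, "free_bhcg": 0.7})
```

With these made-up parameters a low PAPP-A MoM together with a high fβhCG MoM raises the risk above the prior, and the opposite pattern lowers it.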
Comparison screening methods

The screening methods compared in this chapter are:
- MA as a screening test (cut-off ≥35 years unless specified)
- PAPP-A
- free βhCG (fβhCG)
- total hCG (where papers include “hCG” this is assumed to be total hCG)
- combination of PAPP-A + fβhCG (or total hCG)
- NT measurement
- combined test (NT, and PAPP-A + fβhCG)
- ADAM12 (A Disintegrin and Metalloprotease 12)
- Hyperglycosylated hCG (HhCG) or Invasive Trophoblast Antigen (ITA)
- AFP
- repeat measures of PAPP-A and fβhCG at 10/40 and at 12/40.

Unless otherwise stated all screening strategies incorporate the individual’s maternal age.

All papers in this chapter used conventional methods to combine maternal age specific risks and marker levels (LRs) to determine an individual’s risk and therefore the DR and FPR (i.e. using multivariate Gaussian distributions as described above).

Outcomes

Where a large number of results for different risk cut-offs, fixed FPRs or DRs have been reported (e.g. for modelling papers), a fixed FPR of 5% and/or a fixed DR of 85% have been selected for the evidence table as this is the convention for reporting DS screening test performance. While studies report DR for fixed FPR, in reality the cut-off chosen for screening programmes is an individual risk (Benn and Donnenfeld 2005).

The majority of cohort studies include other chromosomal disorders as unaffected cases. In this way their screening results will either be false positives or true negatives. While these cases are not strictly false positives (as a positive screen is clinically relevant), for the purpose of this review and where it was possible, figures were recalculated to include this subpopulation in the “unaffected” group (von Kaisenberg et al. 2002).

PRIMARY RESEARCH: STUDY RESULTS

Accuracy of screening methods

Maternal age

Nine studies had results for maternal age alone as a screening test (see Table 4).
In most studies, maternal age was clearly inferior to the other strategies used. The exceptions were one study where βhCG had a lower DR than maternal age for a fixed 5% FPR (Montalvo et al. 2005) and one other study where maternal age had a higher DR but also a higher FPR than fβhCG alone and PAPP-A alone (von Kaisenberg et al. 2002). Therefore, the comparative performance of maternal age and free βhCG alone and PAPP-A alone was not clear in the latter study. Overall, the literature supported the low predictive performance of maternal age compared with other screening strategies in the first trimester.

Table 4. Comparison of DRs and FPR, maternal age versus other screening strategies

| Reference | Test | DR (%) (95% CI) | FPR (%) (95% CI) |
| (Gasiorek-Wiens et al. 2001) | MA | 66.7 (60-73) | 35.7 (35.0-36.3) |
| | NT | 79.5 (74-85) | 6.0 (5.7-6.3) |
| (Schuchter et al. 2002) | MA | 64.3 (39.1-89.4) | 12.8 |
| | NT MoM >95th centile | 57.1 (31.3-83) | 4.8 (4.2-5.4) |
| | NT + MA (MoM) | 71.4 | 10 |
| | Combined | 85.7 | 5 (4.5-5.7) |
| (von Kaisenberg et al. 2002) | MA | 52.6 (30.2-75.1) | 35.7 (34.1-37.2) |
| | NT | 73.7 (53.9-93.5) | 4.8 (4.1-5.5) |
| | fβhCG | 15.8 | 4.7 (4.0-5.3) |
| | PAPP-A | 26.3 (6.5-46.1) | 4.6 (3.9-4.3) |
| | Combined | 84.2 | 6.6 (5.8-7.5) |
| (Wapner et al. 2003) | MA | 32.8 | 5 |
| | fβhCG + PAPP-A | 67.2 | 5 |
| | NT | 68.8 | 5 |
| | Combined | 78.7 (66.3-88.1) | 5 |
| (Scott et al. 2004) | MA | 80 | 29 |
| | fβhCG + PAPP-A | 80 | 19 |
| | NT | 100 | 5.7 |
| | Combined | 100 | 7.2 |
| (Avgidou et al. 2005) | MA | 31.5 (24.6-37.6) | 5 |
| | NT | 81.6 (76.2-87.0) | 5 |
| | Combined | 90.3 (86.2-94.5) | 5 |
| (Montalvo et al. 2005) | MA | 21.1 (6.1-45.6) | 5 |
| | PAPP-A | 47.4 (24.9-69.8) | 5 |
| | βhCG | 10.5 (1.3-33.1) | 5 |
| | NT | 63.2 (41.5-84.9) | 5 |
| | Combined | 78.9 (60.6-97.3) | 5 |
| (Soergel et al. 2006) | MA | 63.6 (30.8-89.1) | 28.3 (27-30) |
| | NT | 81.8 (48.2-97.7) | 5.1 (4.3-6.0) |
| | fβhCG + PAPP-A | 87.5 (47.3-99.7) | 15.4 (13.8-16.9) |
| | Combined | 87.5 (47.3-99.7) | 4.0 (3.2-4.8) |
| (Marsk et al. 2006) | MA | 67 (51-84) | 37 (28-46) |
| | NT | 94 (79-99) | 20 (13-28) |
| | Combined | 94 (79-99) | 7 (2-12) |

Combined versus NT and combined versus fβhCG + PAPP-A

Twenty-three studies were identified that included the combined test (fβhCG, PAPP-A and NT) and had comparisons with NT and/or fβhCG + PAPP-A. These comparisons took different forms. In nine studies, the DR for each screening strategy was presented for a 5% FPR. These results are summarised in Table 5. The combined strategy consistently had a higher DR for the fixed FPR compared with the two alternative strategies. It should be noted that in some of these studies there were a wide range of other tests included. For studies in this table no conclusion could be made for the comparison of 1st trimester MSS with NT.

Table 5. Comparison of DRs for a 5% FPR, combined test, nuchal translucency and fβhCG + PAPP-A

| Reference | Test | DR (%) for 5% FPR (95% CI) |
| (Krantz et al. 2000) | Combined | 91 |
| | fβhCG + PAPP-A | 63 |
| | NT | 74 |
| (Crossley et al. 2002) | Combined | 82 (65-93) |
| | fβhCG + PAPP-A | 55 (39-70) |
| | NT | 54 (37-71) |
| (Spencer et al. 2003b) | Combined | 92 (74-99)^1 |
| | fβhCG + PAPP-A | 68 (50-86) |
| | NT | 76 (59-93) |
| (Wapner et al. 2003) | Combined | 78.7 (66.3-88.1) |
| | fβhCG + PAPP-A | 67.2 |
| | NT | 68.8 |
| (Avgidou et al. 2005) | Combined | 90.3 (86.2-94.5) |
| | fβhCG + PAPP-A | Not given |
| | NT | 81.6 (76.2-87.0) |
| (Montalvo et al. 2005) | Combined | 78.9 (60.6-97.3) |
| | fβhCG + PAPP-A | Not given |
| | NT | 63.2 (41.5-84.9) |
| (Gyselaers et al. 2005) | Combined | 73.6 (56.0-90.1) |
| | fβhCG + PAPP-A | 61.5 (42.8-80.2) |
| | NT | 42.3 (23.3-61.3) |
| (Spencer et al. 2002), modelled | Combined | 88.9 |
| | fβhCG + PAPP-A | 69.5 |
| | NT | 73.5 |
| (Palomaki et al. 2005), modelled | Combined | 84 |
| | fβhCG + PAPP-A | 67 |
| | NT | Not given |

In 15 studies a fixed cut off was used for the different tests, resulting in variation in both the DR and the FPR. These studies are summarised in Table 6.
The results presented in this table were also consistent with improved performance of the combined strategy. For example, the DR of combined screening was higher than or the same as that for fβhCG + PAPP-A while at the same time the FPR was lower in the combined screening strategy in all 10 studies where this information was available. When comparing the combined strategy with NT alone the DR of combined screening was higher than or the same as that for NT testing while at the same time the FPR was lower in the combined screening strategy in nine studies where this information was available. In four studies with estimates for combined screening and NT the results were more difficult to interpret given a higher DR but also a higher FPR from combined screening. In one study the combined and NT strategies had the same DR (100%) but the combined strategy had a higher FPR (7.2% versus 5.7%) (Scott et al. 2004). In this study there were only 5 cases of DS so the estimated DR may not be accurate. Also the median PAPP-A was 0.88 MoM rather than 1, which will have increased the FPR for the combined test. Once again it was not possible to determine whether 1st trimester MSS or NT has a better performance.

Table 6. Comparison of DRs and FPR for fixed cut offs (same for all tests unless indicated), combined test, nuchal translucency and fβhCG + PAPP-A

| Reference | Test | DR (%) (95% CI) | FPR (%) (95% CI) |
| (De Biasio et al. 2001) | Combined | 87 (62-98) | 3.3 (2.5-4.1) |
| | fβhCG + PAPP-A | 69 | 4.5 |
| | NT | 62 | 6.7 |
| (Niemimaa et al. 2001) | Combined | 80 (28-99) | 8.3 (6.9-9.6) |
| | fβhCG + PAPP-A | 75 (35-97) | 9.7 (8.5-11.0) |
| | NT | 60 (15-95) | 11.6 (10.0-13.2) |
| (Schuchter et al. 2002) | Combined | 85.7 | 5 (4.5-5.7) |
| | fβhCG + PAPP-A | Not given | Not given |
| | NT | 71.4 | 10 |
| (von Kaisenberg et al. 2002) | Combined (1:300) | 84.2 | 6.6 (5.8-7.5) |
| | fβhCG + PAPP-A | Not given | Not given |
| | NT ≥ 95th centile | 73.7 (53.9-93.5) | 4.8 (4.1-5.5) |
| (Muller et al. 2003a) | Combined | 73 (56-90) | 4.7 (4.1-5.3) |
| | fβhCG + PAPP-A | 69 (51-87) | 8 (7.3-8.7) |
| | NT | 62 (43-80) | 5 (4.4-5.6) |
| (Wapner et al. 2003) | Combined | 85.2 (73.8-93.0) | 9.4 (8.8-10.1) |
| | fβhCG + PAPP-A | 85.2 | 23.2 |
| | NT | 82 | 11.9 |
| (Scott et al. 2004) | Combined | 100 | 7.2 |
| | fβhCG + PAPP-A | 80 | 19 |
| | NT | 100 | 5.7 |
| (Wojdemann et al. 2005) | Combined | 91 (58.7-99.8) | 2.1 (1.8-2.5) |
| | fβhCG + PAPP-A | 73 (39-93.4) | 8.8 (8.1-9.5) |
| | NT | 75 (42.8-93.4) | 1.8 (1.5-2.1) |
| (Gonce et al. 2005) | Combined | 100 (29-100) | 5.1 (0.7-9.5) |
| | fβhCG + PAPP-A | Not given | Not given |
| | NT | 100 (16-100) | 14.3 (7.4-21.2) |
| (Gyselaers et al. 2005) | Combined | 80.8 (66-96) | 8.6 (...) |
| | fβhCG + PAPP-A | 80.8 (66-96) | 16.8 (16.1-17.4) |
| | NT | 42.3 (23-61) | 4.8 (4.5-5.2) |
| (Soergel et al. 2006) | Combined | 87.5 (47.3-99.7) | 4.0 (3.2-4.8) |
| | fβhCG + PAPP-A | 87.5 (47.3-99.7) | 15.4 (13.8-16.9) |
| | NT | 81.8 (48.2-97.7) | 5.1 (4.3-6.0) |
| (Marsk et al. 2006) | Combined | 94 (79-99) | 7 (2-12) |
| | fβhCG + PAPP-A | Not given | Not given |
| | NT | 94 (79-99) | 20 (13-28) |
| (O'Leary et al. 2006) | Combined | 83 (74-93) | 3.7 (3.5-4.0) |
| | fβhCG + PAPP-A | 85 (76-94) | 11.5 (11.1-11.9) |
| | NT | 73 (62-85) | 8.6 (8.2-8.9) |
| (Christiansen and Olesen Larsen 2002), modelled | Combined | 85.5 | 4.4 |
| | fβhCG + PAPP-A | Not given | Not given |
| | NT | 74.0 | 5.7 |
| (Christiansen and Jaliashvili 2003), modelled | Combined | 85.9 | 2.7 |
| | fβhCG + PAPP-A | 75.5 | 5.3 |
| | NT | Not given | Not given |

Other studies included results on the FPR given an 85% DR (see Table 7). These provide similar predictive performance information to that provided by the DR for a given FPR. Specifically, if combined has a higher DR than fβhCG + PAPP-A when both have a FPR of 5% then we would expect the combined strategy to have a lower FPR than fβhCG + PAPP-A when the DR is fixed at 85% for both.
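The expectation stated above, that a strategy with the higher DR at a 5% FPR should also have the lower FPR at an 85% DR, can be illustrated with a small Monte Carlo sketch. Everything here is assumed for illustration: each strategy is reduced to a single Gaussian score with unit standard deviation, and the two separations between affected and unaffected means (2.5 and 1.5) are invented rather than estimated from any appraised study.

```python
import random

def simulate(separation, n=20000, seed=0):
    """Draw scores for affected and unaffected pregnancies.

    Unaffected scores are N(0, 1); affected scores are N(separation, 1).
    """
    rng = random.Random(seed)
    unaffected = [rng.gauss(0.0, 1.0) for _ in range(n)]
    affected = [rng.gauss(separation, 1.0) for _ in range(n)]
    return affected, unaffected

def dr_at_fpr(affected, unaffected, fpr=0.05):
    """DR when the cut-off is set so that about `fpr` of unaffected are positive."""
    cut = sorted(unaffected)[int((1 - fpr) * len(unaffected))]
    return sum(score > cut for score in affected) / len(affected)

def fpr_at_dr(affected, unaffected, dr=0.85):
    """FPR when the cut-off is set so that about `dr` of affected are positive."""
    cut = sorted(affected)[int((1 - dr) * len(affected))]
    return sum(score >= cut for score in unaffected) / len(unaffected)

# Hypothetical "stronger" and "weaker" strategies.
strong = simulate(separation=2.5, seed=1)
weak = simulate(separation=1.5, seed=2)

dr_strong, dr_weak = dr_at_fpr(*strong), dr_at_fpr(*weak)
fpr_strong, fpr_weak = fpr_at_dr(*strong), fpr_at_dr(*weak)
```

Because the two simulated score distributions differ only in separation, their ROC curves do not cross, which is why the ordering is the same under either convention. Where ROC curves do cross, the two conventions can rank strategies differently.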
Combined testing showed the best performance characteristics (compared with NT alone and fβhCG + PAPP-A) in the four studies where this information was provided. Note, however, that all four of these studies were included in Table 5, so Table 7 adds no further information about predictive performance beyond the data in Table 5.

Table 7. Comparison of FPRs for an 85% DR, combined test, nuchal translucency and fβhCG + PAPP-A

| Reference | Test | FPR (%) for an 85% DR (95% CI) |
| (Krantz et al. 2000) | Combined | 1.4 |
| | fβhCG + PAPP-A | 6.8 |
| | NT | 3.4 |
| (Avgidou et al. 2005) | Combined | 2.9 (2.7-3.0) |
| | fβhCG + PAPP-A | Not given |
| | NT | 8.2 (7.8-8.4) |
| (Montalvo et al. 2005) | Combined | 8.6 (7.8-9.4) |
| | fβhCG + PAPP-A | Not given |
| | NT | 65 (64.0-66.4) |
| (Palomaki et al. 2005), modelled | Combined | 5.6 |
| | fβhCG + PAPP-A | 16 |
| | NT | Not given |

fβhCG and total hCG

Four papers compared the performance of fβhCG and total hCG either alone or combined with PAPP-A (see Table 8). fβhCG had a higher estimated DR than total hCG when both tests had a fixed FPR of 5%.

Table 8. Comparison of DRs and FPR, fβhCG and total hCG

References, in the order given: (Hallahan et al. 2000); (Spencer et al. 2000a), modelled; (Spencer and Cuckle 2002), modelled; (Palomaki et al. 2005), modelled.

| Test | DR (%) | FPR (%) |
| fβhCG | 45 | 5 |
| Total hCG | 35 | 5 |
| PAPP-A + fβhCG | 67 | 5 |
| PAPP-A + hCG | 52 | 5 |
| fβhCG | 49.5 | 5 |
| Total hCG | 50.8 | 5 |
| PAPP-A + fβhCG | 88.9 | 5 |
| PAPP-A + hCG | 78.7 | 5 |
| PAPP-A + fβhCG | 64 | 5 |
| PAPP-A + hCG | 67 | 5 |

PAPP-A and fβhCG

Five studies were identified that presented estimates for the predictive performance of fβhCG and PAPP-A (see Table 9). In all but one study PAPP-A was a better discriminator of DS than fβhCG.

Table 9. Comparison of DRs and FPR, fβhCG and PAPP-A

| Reference | Test | DR (%) (95% CI) | FPR (%) (95% CI) |
| (Krantz et al. 2000) | fβhCG | 46 | 5 |
| | PAPP-A | 38 | 5 |
| (von Kaisenberg et al. 2002) | Free βhCG | 15.8 | 4.7 |
| | PAPP-A | 26.3 | 4.6 |
| (Montalvo et al. 2005) | Free βhCG | 10.5 (1.3-33.1) | 5 |
| | PAPP-A | 47.4 (24.9-69.8) | 5 |
| (Spencer and Cuckle 2002) | Free βhCG | 49.5 | 5 |
| | PAPP-A | 50.8 | 5 |
| (Laigaard et al. 2003) | Free βhCG | 42.4 | 5.1 |
| | PAPP-A | 52.3 | 5.1 |

Evidence regarding first trimester screening included in Chapter 5 evidence tables

Five studies included in Chapter 5 also have evidence relevant to this chapter (Cuckle et al. 2005; Cuckle 2003; Malone et al. 2005; Rode et al. 2003; Wald et al. 2003b). None of this evidence contradicted the conclusions of this chapter, i.e. that the combined test is better than both NT and 1st T MSS (PAPP-A and fβhCG), that fβhCG performs better than total hCG, and that it is not possible to say whether 1st T MSS or NT performs better.

Other tests

Muller et al. (2003a) added 1st trimester AFP to both fβhCG + PAPP-A and the combined tests. The addition of AFP did not enhance test performance in this study.

The DR and screen positive rate (SPR) of ADAM12 was evaluated by Laigaard et al. (2003). ADAM12 had improved performance compared with βhCG alone (DR 81.5%, SPR 3.2% versus βhCG alone: DR 59.9%, SPR 12.9%), and PAPP-A alone (DR of 66.2%, and FPR of 11.2%).

Palomaki et al. (2005) evaluated the use of serum ITA. In a key comparison they compared the DR of ITA + PAPP-A and NT to the conventional combined test. With a fixed FPR of 5%, the DR of both these screening methods was 84%. Spencer et al. (2002) also estimated the performance of ITA (HhCG) in the 1st trimester and found ITA was unlikely to be of any additional value.

In another study a strategy involving repeat measures of PAPP-A and fβhCG at 10/40 and 12/40 showed a slight improvement in performance over the combined test where either the serum was taken at 10/40 or it was taken at 12/40 (Spencer and Cuckle 2002). At a fixed 5% FPR the DR for the repeat measures was 88.6% compared to 87.2% when serum was taken at 10/40, and 87.3% when serum was taken at 12/40.
The authors concluded that when women present twice in the 1st trimester this strategy could be considered. However, there is no justification for taking two samples routinely (Spencer and Cuckle 2002).

Christiansen and Olesen Larsen (2002) evaluated a contingent screening strategy confined to the 1st trimester. This involved all women having 1st trimester PAPP-A and βhCG prior to 11/40. Then only those with an intermediate risk would have NT, and those with very high risk would have an invasive screen. The reasoning was to explore the possibility of rationing NT. The 1st trimester contingent screening strategy had a DR of 78.9% and a FPR of 4% compared to combined screening, which had a DR of 85.5% and a FPR of 4.4%. This performance would be achieved with only about 20% of women requiring NT.

Difficulties implementing any of the screening strategies

Eleven studies gave details of any difficulties implementing screening strategies. NT was successfully measured in all participants in a multicentre study but the investigators noted that 3-5% required TV USS for the measurement (Gasiorek-Wiens et al. 2001). In another study, NT could not be measured in 2% (Schuchter et al. 2002). In a further study at least one image was obtained for NT measurement in 72.9% of participants although only 51.9% successfully had all three images measured, and staff felt more time was needed to measure NT (Crossley et al. 2002). In another study, NT was too small to measure precisely so was assumed to be under 5mm (Muller et al. 2003a). In another study a learning period was required for NT measurement (Wapner et al. 2003). This required careful training, evaluation of competence and external quality control. In one study, 10 of 25 units approached declined to participate due to lack of time/staff for USS, having USS equipment with inadequate resolution, or having a large number of women booking in the community (Crossley et al. 2002).
One group investigated twin pregnancies (Gonce et al. 2005). In monochorionic twins they were unsure whether the largest, smallest or average of the two NT measurements should be used.

There were also documented difficulties in obtaining both NT measurements and serum tests in one study. This study measured NT in 7,580 but only 3,551 had serum samples taken as well (von Kaisenberg et al. 2002).

In some studies, the marker MoMs were different from expected. For example, in Gonce et al. (2005), the authors noted that twin serum levels should be double that for singleton pregnancies but the MoM for fβhCG was only 1.57, and in Scott et al. (2004) the MoM for PAPP-A was 0.88 rather than 1. In three studies the distribution of NT measurements differed from the Fetal Medicine Foundation (FMF) distributions (Gyselaers et al. 2005; Muller et al. 2003a; Wapner et al. 2003). The importance of using correct parameters was noted in one study (Spencer et al. 2000a) which found that when PAPP-A + total hCG samples were taken at 84-97 days and analysed using appropriate parameters (for gestational age), this method achieved performance comparable to screening with PAPP-A and fβhCG.

Various types of assay methodology could be used for the serum tests. In one study an in-house polymonoclonal assay was used rather than an automated double monoclonal assay for the measurement of PAPP-A (Christiansen and Jaliashvili 2003). The authors noted that the simplicity of the polymonoclonal approach was obtained at the expense of automation, which potentially increases the risk of human error.

Summary of results

There was a high level of consistency supporting:
1. poor performance of maternal age alone as a screening test compared with other screening methods
2. improved performance of the combined test compared with either NT or fβhCG + PAPP-A (either on the basis of a higher DR for a fixed FPR or a higher DR with a lower FPR).
There was also some support for fβhCG having better performance in the 1st trimester than total hCG, and PAPP-A being a better discriminator of DS than fβhCG. There was no clear evidence to suggest that either 1st trimester MSS or NT had superior screening performance. Other tests were included in the studies appraised. There appeared to be some support for ADAM as a marker for DS, and serum ITA showed promise as a screen, but further research is required.

There were some difficulties noted with the screening tests. Most of the cited difficulties were with the measurement of NT: either it could not be done in everyone, or the measurement was inaccurate when assessed against standards such as the FMF standards. This may be because not all operators were trained, or there was not enough time for measurement. There were also issues with serum marker MoMs being different from expected, or problems when using inappropriate data for distribution parameters.

Conclusion

The evidence appraised in this chapter showed that maternal age alone is not an appropriate screening test for DS, that fβhCG performs better than hCG in the first trimester, and that in the first trimester the combined test is the best DS screening method based on DR and FPR.

There were some limitations to this evidence, including some evidence based on case-control studies or modelling papers based on data from case-control studies, and some studies having a retrospective design. Most of the cohort studies were based in hospital settings: results (especially for NT) may not be replicated in regional settings. Some samples were biased as not all women had all tests, and in some studies this was because women with large NT measurements had invasive diagnostic testing instead of 1st trimester MSS.

For screening confined to the first trimester the combined test is the screening method of choice.
The implications of resources for training, monitoring and quality control, particularly for NT testing, would need to be carefully considered when determining whether a screening programme should proceed. There are also quality control issues associated with determining risk. Correct medians are needed for calculating MoMs and appropriate parameters are required for accurate risk calculations.

Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components

Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Krantz et al. 2000) The study compared screening for DS and T18 using fβhCG, PAPP-A, fβhCG + PAPP-A, NT, fβhCG & NT, PAPP-A & NT, and the combined test. Between September 1995 and June 1998 blood samples were collected from 10,251 women and dried as spots on filter paper. Outcomes DR and FPR of screening for DS using fβhCG, PAPP-A, fβhCG + PAPP-A, NT, fβhCG & NT, PAPP-A & NT, and the combined test were extracted for this evidence table. These results are calculated based on the age distribution of the USA population (14-49 years). Accuracy of screening methods Limitations Fetal loss not accounted for which could overestimate the DR. Using a fetal loss adjustment (31% loss rate from late 1st T to term) the 3 not detected by screening represent about 69% of those alive at 1st T. Therefore about 4 would have been alive at 1st T (3/0.69 ≈ 4) and the adjusted DR of combined test = 30/34 = 88%. Prospective cohort study Grade III-2 Analysed fβhCG and PAPP-A using ELISA procedure. 7,801 samples analysed at NTD labs (USA) and 2,450 at Centro Di Diagnosi Prenatale (Italy) using identical reagents and procedures. NT done according to FMF guidelines and measured at 10+4/40-13+6/40 by FMF trained ultrasonographers. Marker values converted to MoM.
Labs developed separate MoMs to account for interlab assay variation. LR calculated from multivariate loggaussian distributions of MoM in DS and “unaffected”. Risks determined by multiplying the LR by MA associated risk of DS and GA (Snijders et al. as referenced in this paper), and the MA distribution in USA. GA = 9/40-13+6/40 based on USS or LMP (if no USS). All pregnancies apparently healthy singleton with no DM. Of 10,251 in study, 5,809 met the criteria of the dates for NT and so had successful NT measured as well as MSS. Spectrum of disease: 50 DS/10251 = 0.49%. T18 = 20. “Adverse fetal outcomes” (includes other chromosomal disorders) = 75. Of the unaffected (excludes DS, T18, and other) having MSS n= 10,106 Mean MA 31.6 ± 5.4 (SD) Mean GA 11.65/40 ± 1.06 (SD) Having NT and MSS N= 5718 Mean MA 32.1 ± 5.7 (SD) Mean GA 12.07/40 ± 0.85 (SD) Of those with DS having MSS n= 50 Mean MA 37.2 ± 4.9 (SD) Mean GA 11.73/40 ± 1.16 (SD) Having NT and MSS n= 33 Mean MA 37.5 ± 4.4 (SD) Mean GA 12.07/40 ± 0.88 (SD) If MSS and NT not on same day, GA taken as the later of the 2 GA’s. For these figures it is presumed T18 and other chromosomal abnormalities included as unaffected, but unsure as DR and FPR only given as %. CI’s were therefore not calculated for the evidence table. The risk cut-off was varied until it reached a fixed 5% FPR. Also determined performance at a fixed DR of 70%. Verification No details given. Expected number of DS from the MA distribution was 48.9. 
Age standardized, fixed 5% FPR, (risk cut-off) DR: fβhCG (1:145) 46%; PAPP-A (1:105) 38%; fβhCG + PAPP-A (1:140) 63%; NT (1:195) 74%; fβhCG & NT (1:240) 80%; PAPP-A & NT (1:185) 81%; Combined test (1:270) 91%. Age standardized, fixed 70% DR, (risk cut-off) FPR: fβhCG (1:355) 15.8%; PAPP-A (1:395) 19.0%; fβhCG + PAPP-A (1:195) 6.8%; NT (1:90) 3.4%; fβhCG & NT (1:55) 2.3%; PAPP-A & NT (1:35) 2.3%; Combined test (1:15) 1.4%. Difficulties implementing any of the screening strategies None noted. There is also a possibility of not ascertaining all DS cases born but not identified by screening. However, in this study the DS ascertained is similar to that expected based on MA and GA. No details of how verifications obtained. Unclear where parameters obtained for the Gaussian distribution for LR. The paper presents parameters for MSS markers and NT from 108 DS (including an additional 58 analysed previously). Two of the authors are employees of NTD labs and one is the owner of NTD labs as well as the owner of patents related to the use of fβhCG in DS screening. Author’s conclusion 1st T screening for DS and T18 is effective and offers substantial benefits to clinicians and patients. Reviewers’ conclusions Despite limitations, this is a large well designed study.

Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Hallahan et al. 2000) Comparison of screening with fβhCG compared to intact hCG (with and without MA) in 1st T screening for DS. fβhCG and intact hCG were analysed in a total of 63 DS (35 liquid serum and 28 dried blood) and 400 unaffected specimens (200 liquid serum and 200 dried blood).
Outcomes DR and FPR given are for screening for DS using the observed data (fixed 5% FPR, or a fixed 60% DR), and using Gaussian distribution parameters from previous studies and current study combined with the MA distribution of live births in the USA. Accuracy of screening methods Limitations A case-control design is not the best design for determining the validity of a screening test. The design (DS and unaffected controls) means the sample will not include fetuses with other chromosomal disorders which are also likely to test positive during DS screening. Case-control study Grade III-2 FßhCG was determined using inhouse ELISAs. Parameters for fβhCG were obtained from previous studies. For liquid samples these were Brambati et al. 1997 and Morssink et al. 1998 (as referenced in this paper), and for dried samples-ongoing prospective analysis (Orlandi et al. 1997, Krantz et al. in press, as reference in this paper). Intact hCG analysed using commercial kit (Immulite system, DPC). Gestational day specific medians calculated for liquid and dried samples separately, and then MoM calculated for the overall samples. The MoM and the MA specific risk used to determine the LR and the individual risk of DS. A meta-analysis (7 case-control studies including this one) was done for the distribution parameters of intact hCG versus fβhCG. Used studies where both measured in same analyte set. GA = 10-13/40. 100 (50 liquid, 50 dried) unaffected pregnancy specimens selected for each gestational week from USS dated, white, non-diabetic, singletons. For DS, mean MA = 38 ± 5.09 (SD) For unaffected controls, mean MA = 35.8 ± 5.71 (SD) Spectrum of disease: 63 DS/463 = 13.6% Presumably no other chromosomal abnormalities in study. Raw data not given, so unable to calculate CI for the DR and FPR. Verifications No details given. 
DR for fixed FPR 5% using observed parameters Intact hCG alone 19% fβhCG alone 27% Intact hCG & MA 35% fβhCG & MA 45% FPR for fixed DR of 60% using observed parameters Intact hCG alone 30% fβhCG alone 14.9% Intact hCG & MA 17.7% fβhCG & MA 10.5% (No other details of sample given) DR for fixed FPR 5% using meta-analysis parameters and USA age distribution Intact hCG & MA 34% fβhCG & MA 45% Difficulties implementing any of the screening strategies None noted. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING There was little detail about the source of the cases or controls-for example whether the cases had been detected by a positive 1st T screen using fβhCG. The use of an in-house ELISA for fβhCG may limit the reproducibility of the results unless the full description of methods is available. It is unclear whether those conducting the serum analysis were blinded to the outcome. Author’s conclusion FβhCG is a better marker than intact hCG for DS in 1st T. Reviewers’ conclusions The design of the study and the possible biased sample means that a further prospective study is needed to verify the results. However, results do seem to indicate that fβhCG is better at discriminating between DS and unaffected controls than intact hCG in 1st T DS screening. 26 Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (De Biasio et al. 2001) The combined test was compared to fβHCG + PAPP-A, and to NT. Women attending the centre between July 1998 and Jan 2000 were recruited. Outcomes All tests done on everyone. Accuracy of screening methods DR (CI), FPR (CI) Limitations Only described the participants (1836). The eligible and source populations were not described. 
Each screening method was well described and could be replicated. Women were having AC or went on to have TT. DR and FPR for combined test were compared to fβhCG + PAPP-A, and to NT. NT performed by TA USS at 10-13/40. 3 images obtained using standard methods (Nicolaides et al. 1992, as referenced in this paper). The GA (from CRL) ranged from 10-13+6/40 (median 12/40). Cut-off of 1:350: Combined 87% (62-98), 3.3% (2.5-4.1); fβhCG + PAPP-A 69.2%, 4.5%; NT 61.5%, 6.7%. Prenatal diagnosis centre Genoa, Italy Prospective cohort study Grade III-2 Immunoradiometric assays for MSS were performed (Ortho-clinical Diagnostics). The risk of DS pregnancy estimated from multivariate Gaussian distribution (Wald and Hackshaw, 1997, as referenced in this paper), using commercially available software. A regression analysis was performed for NT and serum medians for gestational age. Median maternal age was 31yrs 8 months. Spectrum of disease: 16/1836 DS = 0.87% Participant Sample size = 1836. The DR was for those with DS and FPR was for those unaffected by DS (i.e. presumably all other chromosomal disorders considered to be unaffected). Difficulties implementing any of the screening strategies None noted. Didn’t say how parameters obtained for the Gaussian distribution, apart from medians for NT and serum markers. Unclear how many lost to follow-up or spontaneous fetal loss before karyotyping or if these were excluded from analysis. Of the 992 participants who had TT doesn’t say how outcome determined. Figures in the paper were not consistent so CI was not calculated for this evidence table. No raw data to check anything but combined screening. Also say 14/16 detected then in discussion say 13 detected and that the screened population was 1467 (not 1836 reported elsewhere). Verification Unclear if all tests were compared with a valid reference standard. The DR is of DS found in 1st T. Does not account for the fact that many of these (about 1 in 2) would be lost anyway by miscarriage.
About half had invasive diagnostic testing (IT) mostly for AMA, and the rest had 2nd T screen (TT) without a description of outcome verification. The paper states that about 2 cases of DS would be expected in 1,467 pregnancies and that screening detected 13 DS. It is not clear why these figures differ from those reported elsewhere. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Possibly the population is high risk as they were attending a prenatal diagnosis centre and had AC for AMA and other high risk situations. This limits the applicability of the results. High percent DS in population. Author’s conclusions “Our results confirm the value of the combined test in first trimester screening for DS” Reviewers conclusions There are limitations which will affect the internal and external validity of the study. However it appears that combined screening performed better than fβHCG + PAPP-A which in turn performed better than NT. 27 Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Niemimaa et al. 2001) Comparison of fβHCG + PAPP-A, NT, and combined screening. All pregnant women in Eastern and Northern Finland in 1999 were invited to participate during 1st visit to midwives at 6/40-13/40. Outcomes DR and FPR determined for cut-off of 1:250. Accuracy of screening methods Limitations Small number of DS cases (8 in total screened population). Antenatal clinics at health centres-primary and secondary care Eastern and Northern Finland Prospective cohort study Grade III-2 NT by ultrasonographers trained in NT measurement. MSS analysis by commercial kits (Wallac). Labs given clinical data including NT thickness. Quality control done regularly. For each serum level the MoM was calculated. 
Risk determined by Wallac 1T risk calculation programme (research version). Cut-off of ≥ 3mm NT used for intervention, but risk also calculated by Wallac using MA, CRL and NT. For serum screening the risk was calculated using Wallac programme. Blood drawn in primary care and maternity clinics of hospitals (university and central). MSS at 10– 13+6/40. GA mostly (82%) based on USS, or based on LMP. Presumed singleton. 2,515 had serum 17.5% >35 yrs which is = to the stats for Finnish pregnant women. Only 1,602 had NT as well as serum Spectrum of disease: 8 DS/2525 = 0.32% Also calculated DR for combined screening with fixed 5% FPR. Verification Contacted maternity clinics and National Register of Congenital Malformation, and National Research and Development Centre for Welfare and Health to get information on newborns with DS. Expected number DS (according to national Register of Congenital Malformation) was 7.8. DR (95%CI), FPR (95% CI) Cut-off of 1:250 NT 60% (15-95), 11.6% (10.0-13.20) Double 75% (35-97), 9.7% (8.5-11.0) Combined 80% (28-99), 8.3% (6.9-9.6), Fixed FPR 5% (Cut-off of 1:200) Combined 80% Difficulties implementing any of the screening strategies None noted. 2 cases of T18 (presumed included in analysis as unaffected). Corrections for weight but not smoking or DM. Not all tests were done on everyone. All had MSS but not everyone had NT as well. This could be a problem if this was based on risk (e.g. if those having NT as well had higher or lower risk for DS). Did not determine karyotype of fetal losses. Could have missed DS and therefore true DR could be lower. Does not mention those lost to follow-up. Lab was not blinded to the results of the NT. However the serum marker results are not subject to subjective interpretation. Author’s conclusions The study suggests that the combined test promises better sensitivity than the current testing methods (2nd T MSS) used in DS screening. 
Reviewers conclusions Some limitations which may decrease the accuracy of the reported DR and FPR and the ability to compare screening methods. However, it appears that combined is better than NT alone, and that fβhCG + PAPP-A is better than NT. However not all had NT. Serum results not given to women (were used to determine medians).

Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Gasiorek-Wiens et al. 2001) Comparison of screening by using NT or MA alone. Screening (June 1995-May 2000) of those with EDD before August 2001. Accuracy of screening methods DR (95%CI), FPR (95% CI) (German speaking DS screening group) Centres included some specialist perinatal diagnostic centres. Limitations Limited details of the study setting. No details of the efforts to determine outcome for fetal loss, or ToP or those lost to follow-up. Multicentre Tests could be replicated. All those involved in the study had FMF Certificate of Competence in the 10-14/40 scan. Outcomes Outcomes were the performance of screening using NT (risk at term cut-offs of 1:300 or 1:100), or MA (≥ 35 years) alone. Germany, Austria, Switzerland NT measured according to FMF guidelines. Prospective cohort study Risk for DS estimated using MA and NT for CRL with the use of FMF software. Outcome available in 21,959 (92.2%). Median MA 33yrs (range 15-49) & 36.1% ≥ 35 yrs (higher than general population = 30 yrs and 18.1%) Grade III-2 Screened 23,805 singletons. The median GA 12/40 (range 10-14) & median CRL = 61mm (range 38-84) Spectrum of disease: 210 DS/21959 = 0.95% DS.
The DR was for DS and the FPR was for “normal” pregnancies and did not include any of the 274 pregnancies with other chromosomal abnormalities. Verification CVS, AC, or birth of phenotypically normal infant or postnatal karyotyping. NT (1:300) 87.6% (83-92), 13% (12.6-13.5) (1:100) 79.5% (74-85), 6.0% (5.7-6.3) Biased sample as many women would have had a NT by own gynecologist and referred for further investigation. This could over estimate the DR. High NT in screened population. MA (≥ 35 years) 66.7%(60-73), 35.7% (35.0-36.3) Paper stated there was some self selection bias of anxious women. Difficulties implementing any of the screening strategies The paper did not include other chromosomal disorders as unaffected which is the convention. For comparison this would mean for NT (1:300) the FPR = 14.0% and for MA alone = 35.8% NT successfully measured in all, but 3-5% needed TV USS for NT measurement. Also 274 cases of other chromosomal disorders. Of those where no outcome-258 spontaneous fetal loss, 125 ToP, 1,463 no antenatal follow-up. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Author’s conclusions In Germany, Austria and Switzerland results of screening for chromosomal defects by NT, in centres with appropriately qualified sonographers and using the FMF software are similar to those reported in the UK using the same methods. Reviewer’s conclusions Despite a possibly biased sample which may limit the applicability of the estimated DR and FPR, screening using MA and NT appears better than MA alone. 29 Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Schuchter et al. 2002) The study compared performance of screening using MA, NT alone, MA and NT, and combined test. 
Between 1st Dec 1997 and 31st April 2000, all women with singleton pregnancies, and GA 10-13+6/40 were invited for combined test. Outcomes The outcomes were DR and FPR for screening for DS based on MA alone, NT (cut-offs of ≥ 2.5mm, and MoM >95%), MA and NT, and the combined test (risk cut-off of 1:250). Accuracy of screening methods DR (95%CI), FPR (95% CI) Limitations The denominator for the different screening methods is unclear. Appears they have used the number with outcome known for combined test, but for MA alone have used the total who had MSS and NT (4939), and for NT have used the total having NT (5012) despite some of these having no outcome documented. These were classified as unaffected in the analysis. Some of the DR in text disagreed with tables. Hospital Vienna, Austria Prospective cohort study Grade III-2 Women invited for NT & CRL measurement at 10/40-12+6/40. If CRL <35mm another appointment given. If >70mm offered Triple test (only a few cases). NT by staff experienced in NT measurement. NT measured 3x by TA USS, using standard methods. PAPP-A and fβhCG analysed using commercial kits (Ortho Clinical Diagnostics). Medians for NT, PAPP-A and fβhCG (for GA) from study. Markers converted to MoM. Risk calculated based on MA and LR for markers using commercial software (Logical Medical systems). 5,012 had NT done. 4,939 had MSS of whom: 13 had spontaneous abortion before 16/40, 4 had positive combined test and no karyotyping (2 of whom had NT >2.5mm). Another 18 had no follow-up, 1 with positive combined test result. 14 spontaneous abortions at 16-24/40, 3 with normal karyotype. 92 delivered at another hospital and outcome unavailable. In some cases the CI was not calculated for this evidence table as the screened population number was not clear. Verification 311 of the whole population had invasive screen, 254 of whom had a positive screen. Rest for AMA (51) and for malformations on sonogram (6).
MA ≥ 35yrs 64.3% (39.1-89.4), 12.8%; NT ≥ 2.5mm 50% (24-76), 1.8% (1.4-2.2); NT MoM >95% 57.1% (31.3-83), 4.8% (4.2-5.4); NT MoM (with MA) 71.4%, 10%; Combined (1:250) 85.7%, 5% (4.5-5.7). Difficulties implementing any of the screening strategies About 2% could not measure NT. 4802 known outcome. Risk >1:250 used for counselling during study. All those ≥ 35yrs or with abnormal sonogram also offered invasive diagnostic test. Spectrum of disease: 14 DS/4802 = 0.29% For the population age distribution without intervention the expected number of DS at term was 8. However, screening at 12 weeks when more DS viable. In discussion do use a correction for the DR to take account of fetal loss of DS pregnancies between 10/40 and birth (48% of DS pregnancies). This means 12 × 52% = 6.2 would have survived. DR is therefore 6.2/(6.2 + 2) = 76%. Used local medians for NT and MSS markers when these numbers are small, especially for DS medians. Wide CI for DR as small number of DS cases. Used the mean of 3 NT images not the largest figure, and also mean CRL was only 48mm which may have decreased performance of NT. Author’s conclusions The results make the combined test by far the best test for the detection of DS in a low-risk population. Reviewers conclusions Some limitations which limit the internal validity especially the problem of those with no outcomes being included, but authors did adjust for fetal loss. Combined test does appear better than NT which in turn appears better than MA alone.

Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Crossley et al. 2002) The study compared NT, fβhCG + PAPP-A, and combined test.
Invited all pregnant women (10-14/40) attending 15 routine antenatal clinics over a two year period to participate (n = 17,229). Outcomes included DR for a fixed FPR for DS screening by NT, fβhCG + PAPP-A, and combined test. Accuracy of screening methods DR (95%CI), for fixed 5% FPR Limitations Not clear how outcome determined e.g. for fetal losses. Not clear if any lost to follow-up, and if so whether they were presumed to be unaffected or removed from analysis. MA distribution about = to routinely screened population. Median MA = 29.9 yrs (versus 29.1yrs) and 15% >35 years (versus 12.8%). Verification Of the DS cases, nine had CVS (AMA or large NT), 23 AC (2nd T MSS or AMA) and 13 detected at birth (low risk, or unscreened, or declined diagnostic test). Multicentre study at 15 maternity units. Scotland, UK Prospective cohort study Grade III-2 GA assessed by CRL or BPD. Where CRL 31-94, or BPD <30mm, 3 x TA NT measurements attempted. (TV USS for NT was not part of the protocol). Staff all trained by FMF in NT and there was quality control. fβhCG and PAPP-A assayed using Kryptor analyser (Brahms). Subset of 2000 (including all affected) was analysed using Delfia immunoassay (Perkin-Elmer) for comparison. (Results were strongly correlated). Unaffected medians and other distribution parameters for NT and serum markers were derived from the study population. Marker levels converted to MoM for GA and risk obtained from Gaussian distribution. No results were given to participating women; all were offered routine 2nd T screen. NT measurements were obtained in 72.9% of women and blood samples in 98.4%. Spectrum of disease: 45 DS/17,229 = 0.26% No details of lost to follow-up, or miscarriage before outcome ascertained. However says all outcomes followed up. States DS prevalence was about that expected for this population. NT 54% (37-71); Double 55% (39-70); Combined (1:250) 82% (65-93). Difficulties implementing any of the screening strategies 15/25 units accepted study.
Declining units gave reasons as: lack of time/staff for USS, USS with inadequate resolution, or large number booking in community. Success for NT = 72.9% (≥ 1 image) & 51.9% (3 images). Best success during 11-13/40. Most common reason for failure= fetal position. “Not enough time in appointment for repeat attempts” And “Need more than 10-15 mins.” Used study data for parameters which may be inappropriate, especially for DS as small sample. MoM for DS NT (1.65) and fβhCG (1.58) which is lower than expected. This may have decreased the performance of screening. Wide CI for DR as number of DS cases low There was industry support as CIS and Perkin-Elmer UK provided instrumentation and reagents. Author’s conclusions The authors felt they should include all DS in the series when determining DR (and not just those with NT and serum measured). Therefore DR = 62%. NT in combination with appropriate serum markers has the potential to detect over 80% of DS fetuses in early pregnancy. However, NT measurement is highly operator-dependent. It requires training, external quality control and adequate time to allow accurate measurement, otherwise suboptimal performance will result. Reviewers conclusions Using own parameters for risk estimate may have reduced the accuracy of the estimated DR and FPR. Also some details of the study missing such as how outcomes obtained and whether any loss to follow-up. Does appear that combined is better than NT and fβHCG + PAPP-A but the actual figures for DR and FPR may not be precise. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING 31 Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (von Kaisenberg et al. 2002) Compared combined test to MA, NT, PAPP-A alone, and βhCG alone. 
The study began September 1998 & included pregnancies with EDD before November 2001. Accuracy of screening methods DR (95%CI), FPR (95% CI) fβhCG + PAPP-A 11-14/40. All ultrasonographers had certificate of competence from FMF in the 11-14/40 scan. Maternal serum (fβhCG and PAPP-A) analysed at 11-14/40 using the Kryptor analyser (Brahms) and corrected for maternal weight. The distributions for NT for CRL using the 95th centile and 99th centiles from FMF. Distributions for serum markers using the 95th centile and 5th centile for fβhCG and PAPP-A respectively from the data of this study. Risk of DS calculated from MA related risk combined with LR from NT and serum results using FMF software. 3,864 singleton pregnancies 11-14/40 with live fetuses screened. All had MSS and NT. Those with chromosomal disorders other than DS (27) analysed separately. The total unaffected was 3505 and total screened after removal of other chromosomal disorders = 3524. For the purpose of this evidence table the DR and FPR for the combined test (1:300) was also calculated with the chromosomal disorders included as unaffected (“other” included). Limitations Where there was a very high NT often had invasive diagnostic test without MSS as biochemical testing was a new policy. This means the sample was biased as they were not included in the study. The performance of the combined test and the serum markers could therefore be underestimated. German speaking DS screening group Eight centres. Germany Prospective cohort study Grade III-2 Completed follow-up was available in 3551 (91.8%). Median MA was 33 yrs (range 15-46 yrs) & 35.8% ≥ 35 years. Median GA at screening was 12/40 (range 11-14/40), and median CRL was 64mm (range 45-84mm). Spectrum of disease: 19 DS/3551 = 0.54% of the population screened and followed up.
Verification The choice of who got which reference standard was not blinded to test results (those who had a positive screen had CVS or amniocentesis, verification bias). However, the reference standard was not discussed. Just says that centre forwarded outcome details with test results. MA 52.6% (30.2-75.1), 35.7% (34.1-37.2); NT ≥ 95th centile 73.7% (53.9-93.5), 4.8% (4.1-5.5); fβhCG ≥ 95th centile 15.8%, 4.7% (4.0-5.3); PAPP-A ≤ 5th centile 26.3% (6.5-46.1), 4.6% (3.9-4.3); Combined (1:300) 84.2%, 6.6% (5.8-7.5); Combined (1:100) 73.7% (53.9-93.5), 2.4% (1.9-2.9). Difficulties implementing any of the screening strategies 7580 had NT but only 3551 of these had serum taken as well. Only those with both were included in the analysis. Used local data for serum marker parameters which could be biased for DS as sample size of DS small. Small number of DS means CI of the DR is very wide. When other chromosomal disorders included as unaffected the combined test (1:300) had a FPR of 7.3% (6.4-8.1) versus 6.6%. Authors’ conclusions In Germany the rates of screening for chromosomal disorders by NT and MSS in centres with qualified ultrasonographers are similar to those reported in the UK using the same methodology. Reviewer’s conclusions Small number of DS cases means the DR may not be accurate. The accuracy of the DR is also affected by the biased sample where many had high NT and had invasive diagnostic testing and were not included in the study. However, it appears that combined screening performs better than each of the serum markers and NT. In this study NT appears better than the serum markers individually. All screening methods are better than MA alone.

Table 10.
Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Spencer et al. 2003b) The study compared screening for DS by NT, fβhCG + PAPP-A, and the combined test. During a 3 yr period (1st June 1998-31 May 2001) 12,339 women with singleton pregnancies were offered 1st T screening at the OSCAR clinic. Outcomes The outcomes extracted for this paper were the DR and FPR for screening for DS using NT, fβhCG + PAPP-A or the combined test at a risk cut-off of 1:300. Accuracy of screening methods Limitations It appears that the 98 lost to follow-up (but not on the DS register) were included in the analysis as “unaffected” pregnancies. This may overestimate the DR. It is assumed that the other chromosomal abnormalities were included as “unaffected” pregnancies for the analysis. DR (95% CI), FPR (95% CI) Combined (1:300) 92% (74-99), 5.2% (4.8-5.6) OSCAR (One stop clinic for assessment of risk), maternity unit, District General Hospital Essex, U.K. Retrospective cohort study Grade III-2 In the 1st year the qualifying GA for OSCAR = 10+3/40-13+6/40. In the next two years the minimum GA was increased to 11/40. If less than 11/40 by CRL (45 mm), women were given another appointment. Those more than 13+6/40 (CRL = 85 mm) were offered AFP and fβhCG (not in analysis). NT carried out by staff with FMF certificate of competence. NT measured by TA USS as per FMF guidelines. Completed in about 20 mins (99%). Less than 1% had TV examination. PAPP-A and fβhCG analysed by Kryptor analyser (Brahms). Patient specific risks calculated by multivariate approach using population parameters from Spencer et al. 1999 (as referenced in this paper) and MA specific risks (Snijders et al. 1995 and 1999, as referenced in this paper). Cut-off for offer of invasive screen was 1:300. 
All women booking at Harold Wood Hospital were offered screening and given an appointment to attend at about 12/40 for OSCAR. 97.5% accepted the offer = 12,030. Median MA = 30 yrs (range 14.4-46 yrs). Median weight = 65.8 kg, and smoking in 18.1% (self reported, 1.1% unknown, remaining 80.8% non smokers). Ethnicity: mostly “white Caucasian” (93.9%). Median GA = 12+2/40 (10+4/40-13+6/40). Median CRL = 60 mm (range 38-84 mm). Excluded those > 13+6/40 on USS (n=702), fetal death on USS (n=233). Total of 11,105. Spectrum of disease: 25 DS/11105 = 0.23%. Verification Pregnancy outcomes were obtained from delivery room records, hospital database, and child health records and were cross checked with the fetal database. DR (95% CI) for fixed FPR 5%: NT 76% (59-93); Double 68% (50-86). Difficulties implementing any of the screening strategies No difficulties noted. Paper states that 98 were lost to follow-up but were not on the DS register. Based on MA distribution and risk of DS at 12 weeks would expect to see 26 cases. Not clear how well fetal loss was followed up, and no adjustment for fetal loss between screening and term. Failure to ascertain all the DS pregnancies will overestimate DR. The lead author is an advisor to Brahms Diagnostica on matters of prenatal screening. CIS UK (Brahms Diagnostica was formerly CIS) funded some of the lab aspects of the OSCAR clinic. Authors’ conclusions “The findings of the study confirm our prediction that the combined test at 11-14/40 would identify about 90% of DS for a 5% FPR which is far superior to the average sensitivity of 65% achievable by 2nd T MSS.” Reviewer’s conclusions Possible biases may overestimate the DR for this 1st T screening. However, this is a large, well designed study which continues to show the trend that the combined test is superior to NT and fβhCG + PAPP-A. Table 10. 
Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Muller et al. 2003a) The study compared NT, fβhCG + PAPP-A (with and without AFP), and combined test (with and without AFP). All women booking January 1998-June 2001 were invited to provide a blood sample and have NT USS at 11-13/40. Observed and modelled DR and FPR using NT, fβhCG + PAPP-A, fβhCG + PAPP-A plus AFP, combined test, and combined test plus AFP. The cut-off risk was 1 in 250 at term. Accuracy of screening methods Limitations Study population not well described. The paper states that a total of 5694 women were included in the study but not how many of those originally invited accepted. Unclear if sample biased. Exclusion and inclusion criteria not stated. No description of MA or GA of those screened. Not everyone had all tests, as evidenced by different denominators. If this was based on risk it may bias the results. Unclear how many were screened by each method, or how the denominator of each DR and FPR calculation was derived. (French Collaborative Group) 9 centres, 12 maternity units. Prospective cohort study. Grade III-2 Some statistical modelling NT performed by 60 staff: 2 FMF trained, 30 trained by FMF trained staff, 8 specific NT training, and 20 self taught. NT measured at 11-14/40. Serum samples taken and stored at -20°C. Retrospectively tested for PAPP-A, fβhCG, and AFP using fluorescent assay (Perkin Elmer). Marker levels expressed as MoM for GA (CRL). 87% had weight available and had adjustment for weight (Neveux et al. 1996 as referenced in this paper). Distribution parameters for unaffected pregnancies from study data (removal of outliers). 
DS mean from study, but for serum markers SD and correlation coefficients from meta-analysis (Cuckle and Van Lith, 1999 as referenced in this paper). The parameters were used to obtain observed and modelled DR and FPR (as per Royston and Thompson, 1992, as referenced in this paper). 211 did not return after presenting too early for NT, or presented too late, and were removed from analyses. A total of 5,694 singletons were screened (MA or GA of those screened not given). Spectrum of disease: 26 DS/5694 = 0.46%. The 1st T serum was not used clinically but NT results were. When the NT result was high (usually >3 mm) women were offered invasive prenatal diagnosis. Those not scanned or with NT considered to be normal were offered T2 screening. 24 other chromosomal disorders presumed included as “unaffected” in analysis. Verification As part of the National Screening Programme outcome sought for every pregnancy. Of 26 DS identified: 9 detected with high NT, 1 high risk and so had AC, 1 AMA and had invasive test, 14 2nd T screening, and 1 live birth. On the basis of MA 11 DS births expected (as per Cuckle, 1987). Assuming 45% fetal loss this was consistent with 26 found by screening. Directly observed DR (95% CI), FPR (95% CI): NT 62% (43-80), 5% (4.4-5.6); PAPP-A, fβhCG 69% (51-87), 8% (7.3-8.7); PAPP-A, fβhCG, and AFP 69% (51-87), 8% (7.3-8.8); Combined test 73% (56-90), 4.7% (4.1-5.3); Combined plus AFP 73% (56-90), 4.7% (4.1-5.3). Modelled (DR, FPR): NT 64%, 6%; PAPP-A, fβhCG 72%, 6.9%; PAPP-A, fβhCG, and AFP 73%, 6.9%; Combined test 81%, 4.5%; Combined plus AFP 81%, 4.5%. Difficulties implementing any of the screening strategies In 82 women the NT measurement was too small to measure precisely, and so assumed to be under 0.5 mm. The study produced NT with wider SD (0.16) than FMF (0.12). Unclear if karyotyping of spontaneous fetal losses, or if anyone was lost to follow-up without the outcome being obtained. 
Stored samples were analysed, and NT was not fully described, which may limit the external validity. DS mean for risk from observed data (only 26 DS). Some industry support (Perkin-Elmer Life Sciences). Authors’ conclusions In France 1st T screening with NT and MSS is likely to achieve a high efficiency. This has important implications for national screening policy. Reviewer’s conclusions Some design and quality issues limit the applicability of the results. The small number of DS (26) means the parameters for risk determination may have been inaccurate and also produces wide CI for the DR. However, combined performed better than MSS, which in turn was better than NT. Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Wapner et al. 2003) Comparison of MA as a screen, fβhCG + PAPP-A, NT, and combined test. Women with singleton pregnancies with GA 74-97 days (CRL) offered screening for DS and T18 by combined test. Accuracy of screening methods DR (95% CI), FPR (95% CI) NT measured as per the FMF. Sonographers FMF trained. Quality monitoring. 3 NT measurements taken and the largest used to calculate the MoM (using GA specific FMF standards) and then LR for DS and T18 based on FMF algorithm. Exclusion criteria: recent significant vaginal bleed, pregestational DM, donor oocyte, indications for prenatal diagnosis other than a risk of trisomy. Outcomes The outcomes presented are DR and FPR for MA, fβhCG + PAPP-A, NT, and combined test. Limitations During the study two minor modifications were made to the risk-assessment algorithm, corresponding to FMF changes in the algorithm and to updates of the median serum levels based on the study’s own data. Most (74%) screened based on updated data. 
Results reported here are based on risks calculated for the patients during the study. There was no change in the rate of detection and only a 0.1% reduction in the FPR for DS when all risks were recalculated using the recent version. BUN study 12 prenatal diagnostic centres USA, Canada Prospective cohort study Grade III-2 Blood was taken and applied to filter paper. Sent by mail to NTD labs, New York. ELISA for PAPP-A and fβhCG performed. Serum marker values were converted to MoM for GA (using medians from study data), and these were converted into LR calculated from Gaussian distributions of unaffected and affected populations (from previous studies (Hallahan et al. 2000)). Patient’s risk calculated from gestational specific risks according to maternal age, multiplied by the likelihood ratios for serum and NT (combined using commercial software). Women were given risks but recommended continuation of pregnancy until 2nd T screening was done. 8816 women were eligible and consented. 302 did not complete screening: 11 blood spot analyses failed, 4 NT not visualised, 287 for scheduling or other reasons. 102 were removed because of a previous DS or T18 pregnancy, and 196 were removed because of unknown outcomes (lost to follow-up). 8216 left for analysis. Of these the mean MA at date of delivery = 34.5 ± 4.6 yrs; GA at screening was 85.7 ± 5.7 days. Ethnicity = Black 4%, White 83%, Hispanic 6%, Asian 5%, Other 2%. Spectrum of disease: 61 DS/8216 = 0.74%. 11 T18 included in the analysis as unaffected. Risk cut-off for DS was 1:270 at 12/40. Age standardised (using USA population 1997). Unclear whether T18 included in unaffected for analysis of screening performance for DS; therefore CI cannot be calculated for this evidence table (where it was not provided). Age standardised (2nd T 1:270): MA 80.3%, 48%; fβhCG + PAPP-A 85.2%, 23.2%; NT 82%, 11.9%; Combined 85.2% (73.8-93.0), 9.4% (8.8-10.1). Verification Reference standard was either invasive test with karyotyping or evaluation of the phenotype at birth. 
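The conversion of a serum marker to a multiple of the median (MoM) and then to a likelihood ratio from Gaussian distributions, as described for this study, can be sketched for a single marker as follows. This is a minimal illustration, not the study's software: the distribution parameters, the median value and the function names are hypothetical, and screening software typically uses a multivariate Gaussian model that also accounts for correlations between markers.

```python
import math
from statistics import NormalDist

# Hypothetical log10(MoM) distributions for one marker that tends to be
# lower in affected pregnancies (illustrative values, not from the report).
UNAFFECTED = NormalDist(mu=0.0, sigma=0.25)
AFFECTED = NormalDist(mu=-0.35, sigma=0.27)

def to_mom(value, median_for_ga):
    """Express a marker level as a multiple of the gestation-specific median."""
    return value / median_for_ga

def likelihood_ratio(mom):
    """Ratio of affected to unaffected density at the observed log10(MoM)."""
    x = math.log10(mom)
    return AFFECTED.pdf(x) / UNAFFECTED.pdf(x)

mom = to_mom(30.0, 60.0)  # hypothetical level and median, giving 0.5 MoM
print(f"MoM {mom:.2f}, LR {likelihood_ratio(mom):.2f}")
```

A likelihood ratio above 1 raises the maternal age related risk and one below 1 lowers it, which is why the choice of medians and distribution parameters (study data versus published values) matters for the FPR.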
Age standardised DR for fixed FPR of 5%: MA 32.8%; fβhCG + PAPP-A 67.2%; NT 68.8%; Combined 78.7% (66.3-88.1). Outcome of pregnancy was determined by direct follow-up with the patient, and through delivery records. Combined better than serum alone (p=0.006) but not significantly better than NT alone (p=0.28). An effort was made to determine the karyotype of every fetus in pregnancies that ended in spontaneous fetal loss. Modelled Combined: 1:337 78.8%, 5%; 1:270 77.5%, 4.1%. All tests were done on everyone as this was the basis for inclusion. Medians were from study data. Older population: mean MA at date of delivery = 34.5 ± 4.6. Author’s conclusions First-trimester screening for trisomy 21 and 18 on the basis of MA, fβhCG and PAPP-A, and measurement of NT has good DR at an acceptable FPR. Reviewer’s conclusions Well designed high quality study. Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies (Wapner et al. 2003). BUN study 12 prenatal diagnostic centres USA, Canada Prospective cohort study Grade III-2 Continued Sample Outcomes and verification Results Of the 8216 in the analysis 90 had spontaneous fetal losses. For 60 the karyotype was obtained (cytogenetic analysis). For the others, phenotype was derived from pathological analysis. Difficulties implementing any of the screening strategies 49 cases would be expected based on the maternal age. For NT a learning period was required for staff, with measurements becoming more consistent over time. This required stringent training, evaluation of competence and external quality control. Women who were referred because of increased NT were not included, so “ascertainment bias is unlikely”. Overall values for the MoM for NT values were 9% lower than the expected values from the FMF. 
As the study progressed the measurements converged toward those of FMF. 11 blood spot analyses failed, 4 NT not visualised. Comments Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Scott et al. 2004) Comparison of screening by MA, NT, fβhCG + PAPP-A and combined test. All women with singleton pregnancies referred to SUSW 1st July 2000-3rd May 2002 for 1st T screening were offered NT and MSS. GA = 11-14/40 (CRL 45-84 mm). Outcomes The outcomes extracted for this evidence table are the DR and FPR for screening for DS using MA alone (≥ 35 yrs), NT, fβhCG + PAPP-A, and combined test. It is not clear whether the 12 other chromosomal disorders are included in the unaffected group for the analysis or removed, therefore CI were not calculated for this evidence table. Accuracy of screening methods DR, FPR Limitations The patients were self selected to be an older population. Sydney Ultrasound for Women (SUSW), a private practice specialising in obstetric ultrasound Sydney Prospective cohort study Grade III-2 Serum markers measured using Kryptor analyser (Brahms Diagnostica). 500 local samples with USS dating used for Gaussian distributions. The MoM was corrected for maternal weight. NT (and CRL) was measured according to FMF standards. FMF software used for GA specific risk. Risk ≥ 1:300 = high risk. Patients could opt for invasive testing even if not screen positive. Fetal NT and serum screen successfully completed in all cases. Only included where MSS was taken before NT, in order to remove the chance that a large NT would be acted on or a small NT would lead to no further screen. 2121 had both NT and MSS completed. 
68 had no outcome (including 44 lost to follow-up, 23 spontaneous fetal loss and 1 ToP), so 2,053 included in the analysis. Median MA = 32 yrs (range 15-44 yrs). 29% ≥ 35 yrs. Median CRL = 61 mm (45-84) = GA 11-14/40. Spectrum of disease: 5 DS/2053 = 0.24%. 12 other chromosomal disorders. Verification Data from pregnancies were obtained from referral doctors or patients via letter, phone or a feedback form. MA (≥ 35 yrs) 80%, 29%; NT 100%, 5.7%; fβhCG + PAPP-A 80%, 19%; combined 100%, 7.2%. The combined test FPR was significantly lower than for MA (p<0.01), and for fβhCG + PAPP-A (p<0.01), but similar for NT (p=0.07). The differences in DR were not significant. Difficulties implementing any of the screening strategies Median PAPP-A was 0.88 MoM rather than 1, which would have increased the FPR slightly (the PAPP-A in this population would have been more similar to DS results, increasing FPR). When the program started it used the data from FMF. New curves were calculated using local data. By the end of the study, PAPP-A MoM was low and new medians were calculated using 15,000 patients. After the study the PAPP-A and fβhCG were very close to 1. This reduced the FPR for MSS from 19 to 14%. Used local parameters for Gaussian distribution of the serum markers. Difficult to work out the denominator for the FPR as only the screen positive number is given and it is not clear whether this includes other chromosomal disorders. Author’s conclusions A combination of MA, NT and MSS gives a high DR for both DS and other chromosomal abnormalities. Reviewer’s conclusions Very small numbers, especially for DS, limit external validity. Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Wojdemann et al. 
2005) Compared NT to fβhCG + PAPP-A, and to combined test. Only NT used as intervention marker. March 1998-June 2001 13621 invited to participate. Outcomes In the paper the FPR has been calculated using all screened as the denominator (as opposed to the unaffected). This means the CI calculated for this evidence table will not equal those calculated from the paper. Accuracy of screening methods Limitations Small number of cases of DS means wide CI. DR (95% CI), FPR (95% CI) at risk 1:250: NT 75% (42.8-93.4), 1.8% (1.5-2.1); Double 73% (39-93.4), 8.8% (8.1-9.5); Combined 91% (58.7-99.8), 2.1% (1.8-2.5). Not all had all tests. Copenhagen First Trimester Study 3 obstetric departments of University Hospital Copenhagen, Denmark Prospective cohort study Grade III-2 NT and MSS at 11-14/40 (most had both; possible to choose one or both). NT measured as per FMF. 13 staff measuring NT certified by FMF. TA or TV USS (81% had both, 18% TA, 1% TV alone). GA using CRL. Internal and external quality control. Risk of DS based on NT, GA, and MA assessed by ViewPoint Fetal Database using delta-NT method. Women with a risk of 1:250 or more offered karyotyping. Bloods taken immediately after USS and blinded to the results of the other test. Sent to Statens Serum Institut and stored at -20°C. PAPP-A and fβhCG analysed using respectively an in-house ELISA and a commercial kit (EG&G Life Sciences). Study data (n=2702) used for serum marker medians, and literature (Cuckle and van Lith, 1999 as referenced in this paper) for correlations and DS medians. LR for fβhCG + PAPP-A calculated after weight adjustment and multiplied by the MA specific risk of DS at term (as per Cuckle et al. 1987 as referenced in this paper). An adjustment was made to determine risk at screening = MA related risk/0.7 (to account for 30% fetal loss from 1st T to birth). The risk in the combined test was calculated by LR NT x LR serum. 3680 not examined (2118 not interested, 456 “want to but can’t”, 890 miscarriage, 216 other reasons). 
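The risk calculation described for this study (term risk divided by 0.7 to give risk at screening, then modified by the NT and serum likelihood ratios and compared with the 1:250 cut-off) can be sketched as below. The input values and function names are hypothetical. The sketch applies the likelihood ratios to odds rather than directly to risk, which is an assumption on our part; for risks this small the two give almost the same answer.

```python
def risk_at_screening(term_risk, survival_to_term=0.7):
    """Convert a term risk to a first-trimester risk, assuming 30% fetal loss."""
    return term_risk / survival_to_term

def combined_risk(prior_risk, lr_nt, lr_serum):
    """Apply both likelihood ratios to the prior odds and return a risk."""
    prior_odds = prior_risk / (1 - prior_risk)
    posterior_odds = prior_odds * lr_nt * lr_serum
    return posterior_odds / (1 + posterior_odds)

# Hypothetical woman: term risk 1 in 900, LR 2.0 from NT, LR 1.5 from serum.
prior = risk_at_screening(1 / 900)
risk = combined_risk(prior, lr_nt=2.0, lr_serum=1.5)
screen_positive = risk >= 1 / 250  # cut-off used in this study
print(f"Risk 1 in {1 / risk:.0f}, screen positive: {screen_positive}")
```

In this example the term risk of 1 in 900 becomes 1 in 630 at screening and about 1 in 211 after both likelihood ratios are applied, so the woman would be offered karyotyping.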
Removed those without living fetuses, or CRL not 38-82 mm (after 8 months changed to 45-82 mm = 11-13+6/40). Removed those without NT measured. Left 8995, 8622 of which were singleton and included in the analysis. Mean MA = 29.3 yrs, 10.8% > 35 yrs versus 14% in general population (i.e. this group was younger). 6441 singletons at right GA and had MSS after NT. Spectrum of disease: 12/8622 = 0.14%. 15 other chromosomal abnormalities included as unaffected in DS analysis. Verification Chromosomal analysis for all second trimester miscarriages and cross check with chromosomal labs. Infants all examined by paediatricians. 96.2% followed up through patient records. Difficulties implementing any of the screening strategies None noted. Fetal loss between 1st T and 20/40 = 47. Estimated DS at birth in this population is 1.3-1.7%. Of 9 diagnosed prenatally, 6 would have survived to term (30% fetal loss rate) plus the 3 live births = 9 cases (birth prevalence of 1%). The age distribution in the study was low; this may have been because women with AMA were entitled to an invasive screen so opted for no screening. No accounting for fetal loss between screening and term, although 2nd T fetal loss had karyotyping. Therefore DR may have been overestimated. May not have ascertained all the DS when compared with the expected birth prevalence. Author’s conclusions Results confirm that combined screening in a nonselected population is better than other screening methods. The authors believe their focus on quality meant that they obtained a low FPR. Reviewer’s conclusions Small numbers of DS mean limited applicability of DR, especially as there was no adjustment for fetal loss in 1st T. However, results suggest combined is better than NT, which is better than fβhCG + PAPP-A. Table 10. 
Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Avgidou et al. 2005) Compared MA as a screen to NT, and combined test. Those attending an OSCAR clinic at FMF centre in London July 1999-December 2003 were offered screening by combined test at 11-13+6/40. Outcome The DR and FPR for DS screening were calculated with the other chromosomal disorders removed from analysis, leaving 30430. Accuracy of screening methods Limitations Older population limits the external validity and may have increased the DR as the prevalence of DS was high. (supersedes Bindra (2002)) OSCAR clinic (provides pretest counselling, MSS, USS for NT, and post test counselling in a 1 hour visit to a multidisciplinary clinic) Essex, UK Prospective cohort study Grade III-2 The serum markers were analysed using the Kryptor analyser (Brahms). TA USS for NT and CRL by staff with FMF certificate of competence in the 11-14/40 scan. Risks calculated by multivariate approach using published parameters (Spencer et al. 1999 as referenced in this paper) and the MA and GA specific risk of DS at screening (Snijders et al. 1999 as referenced in this paper). MA specific risk multiplied by each LR from NT and weight adjusted MSS. Truncated LR. Combined risk of 1:300 offered invasive diagnostic test. OSCAR was carried out in 31,904 singleton pregnancies at 11-13+6/40. No outcome in 53 ToP, 120 miscarriages, and 1167 were lost to follow-up. Outcome in 30564. Median maternal age was 34 yrs (15-49 yrs), 48.5% ≥ 35 yrs. Median CRL 63 mm (45-84 mm). Spectrum of disease: 196 DS/30564 = 0.64%. Other chromosomal abnormalities in 134. Analysis for DS screening = 30430 MA ≥ 35 yrs Verification Cytogenetics lab, letters and phone calls to women, or GPs or maternity units where delivered. 
Estimated DS for MA and GA distribution at screening would have been 192. DR (95% CI) for 5% FPR: MA 31.5% (24.6-37.6); NT 81.6% (76.2-87.0); Combined 90.3% (86.2-94.5). FPR (95% CI) for DR of 85%: MA 52.9% (52.3-53.4); NT 8.2% (7.8-8.4); Combined 2.9% (2.7-3.0). Risk cut-off 1:300: Combined 93.4% (89.0-96.1), 7.5% (7.2-7.8). Difficulties implementing any of the screening strategies None noted. No details of how fetal losses were followed up. No adjustment for fetal loss ascertainment. Authors’ conclusions The most effective method of screening for chromosomal defects is by first-trimester fetal NT and maternal serum biochemistry. Reviewer’s conclusions Large sample; all had all tests. Good quality and well designed. Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Montalvo et al. 2005) Compares MA, PAPP-A, βhCG, NT, and combined test. Routinely screened using combined test in all women presenting before 14/40. Accuracy of screening methods Tertiary Hospital NT measured using methods described by Nicolaides. TA USS unless NT could not be measured, then TV USS. GA confirmed by CRL (10-13/40) and BPD (13-14/40). All results corrected for MA, race, weight, height, smoking status, and presence of DM. PAPP-A and fβhCG measured using Kryptor analyser (CIS Bio International). Population parameters for PAPP-A and fβhCG from those published by Wald and Hackshaw. Marker levels converted to MoM for GA. Outcomes DR and FPR for screening for DS by MA, PAPP-A, βhCG, NT, and combined test were determined after removal of 24 other chromosomal disorders. Limitations Does not describe the source population, so hard to determine if study population biased. 
Reasons for declining screening were not stipulated. Some may have had large NT and decided not to have serum screen. Madrid, Spain Prospective cohort study Grade III-2 Risk due to MA was calculated as per Cuckle et al. The “factor of division” applied was 0.4554. From these determined the MoM for CRL. Individual risks calculated by multivariate approach. MA specific risk multiplied by the LR derived for the NT and weight adjusted serum marker levels. Invasive diagnostic test recommended at risk cut-off of 1:270. Patients could also elect to have an invasive screen. All singleton pregnancies where outcome known were included. 4538 pregnancies that completed screening July 1999-October 2004 were included in the analysis. Mean age was 30 yrs (14-49 yrs), 25.9% > 35 yrs. Mean GA when screened (including for NT) = 11+5/40. 19 DS/4538 = 0.42%. 24 other chromosomal disorders. Verification Unclear number lost to follow-up or efforts to determine outcome of all pregnancies. No details of how outcome determined for screen negative pregnancies. DR (95% CI) for 5% FPR: MA 21.1% (6.1-45.6); PAPP-A 47.4% (24.9-69.8); fβhCG 10.5% (1.3-33.1); NT 63.2% (41.5-84.9); Combined 78.9% (60.6-97.3). FPR for 85% DR: MA 58.4% (57.0-59.8); PAPP-A 29.4% (28.1-30.7); βhCG 67.3% (65.9-68.7); NT 65% (64.0-66.4); Combined 8.6% (7.8-9.4). Combined (1:270) 78.9% (54-93.4), 3.6% (3.0-4.1). Difficulties implementing any of the screening strategies No difficulties noted. Unclear how outcomes determined or how many lost to follow-up. Removal of other chromosomal disorders will decrease the FPR. No mention of quality control for NT. Authors’ conclusions 1st T screening is efficient and its use may be appropriate in a tertiary hospital. Reviewer’s conclusions Possibly biased sample. Some quality issues such as reporting of attempts to determine outcome. Details lacking or not clear in paper. Table 10. 
Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions Sample Outcomes and verification Results Comments (Gonce et al. 2005) Compared NT to combined test Prenatal unit at University Hospital First 22/12 of study NT used clinically and combined test results analysed retrospectively after follow-up was complete. Last 8/12 results of combined used clinically (interventionally). Also offered diagnostic test if ≥ 35 yrs. July 2001-December 2003, 103 pregnant women with TWIN pregnancies attending the department for prenatal care or referred for 1st T aneuploidy screening enrolled in the study. Outcomes The outcomes extracted for this evidence table were the DR and FPR for screening for DS by either NT or the combined test. Accuracy of screening methods DR Limitations Population may be high risk (referred for aneuploidy screening) therefore external validity may be threatened Inclusion: twin pregnancies with both alive at 11-14 weeks. Verification 25 women had an invasive procedure: 10 for positive screen (firstly NT then combined in 2nd period), 10 AMA, 3 parental anxiety, and 2 had other high risk factors. Barcelona, Spain Prospective cohort study Grade III-2 Serum taken at 8-12/40 and assayed fresh samples (Delfia, PerkinElmer). USS at 11-14/40. CRL and NT measured as per FMF guidelines. Chorionicity determined as per Sepulveda et al. 1996 (as referenced in this paper). Serum and NT converted to MoM for GA after correction for presence of twins. Risk determined using Delfia software (Wallac, PerkinElmer) by both the NT and combined test. In dichorionic (taken as dizygotic) considered as having an individual risk. Risk calculated using individual NT and serum marker LR (adjusted for twins as per Spencer 2000 as referenced in this paper) 2 pregnancies lost to follow-up. 
1 pregnancy diagnostic test cancelled on the death of fetus at risk (observed to have exomphalos). Of the 100 in analysis: Mean MA 33.3 yrs (23-42), 36% > 34 yrs. 56% were the result of ART and 12% were monochorionic. No other chromosomal abnormalities in the sample. Median GA at serum screen = 11/40 (7.3-13.5/40), and NT screen = 12.5/40 (10.3-14.2/40). Other outcomes from delivery room records or phoning those who did not deliver in their hospital. DR (95% CI), FPR (95% CI): NT (detection of pregnancies) 100% (16-100), 14.3% (7.4-21.2); NT (detection of fetuses) 100% (29-100), 8.6% (4.7-12.6); Combined (detection of pregnancies) 100% (16-100), 5.1% (0.7-9.5); Combined (detection of fetuses) 100% (29-100), 8.6% (1.0-6.1). (Difference between NT and combined was not significant.) Difficulties implementing any of the screening strategies Deciding which result to use for NT in monochorionic twins (the largest, smallest or average of 2 results). Should both have same risk. Spectrum of disease: 3 DS/200; 1 monochorionic pregnancy with both affected and 1 dichorionic with 1 DS fetus. Sample was well described but not the source population or how recruited. Unclear if consecutive patients were enrolled. Very small sample size: only 3 DS and 200 fetuses in 100 women. Very wide CIs. Author’s conclusions The 1st T combined test appears to be more accurate than screening with NT alone in twin pregnancies as it maintains the DR, reduces the FPR and allows identification of the fetus at risk. However, when correction for twins was made using the distribution of own data this reduction was less apparent. The results should be confirmed in a larger study. Reviewer’s conclusions Small sample. It appears that the performance of the combined test in twins is better than NT, but further large prospective cohort studies are needed. Table 10. 
Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions (Gonce et al. 2005) For monochorionic pregnancies, both fetuses considered to have same risk. Risk calculated using the largest NT and serum marker levels after twin correction. Prenatal unit at University Hospital Barcelona, Spain Prospective cohort study Grade III-2 Sample Outcomes and verification Results Twin serum levels should be about double those of singletons, but in this study PAPP-A was 1.96 MoM while fβhCG was 1.57 MoM. Risk also recalculated using the distribution parameters for unaffected pregnancies in this study. Continued Comments Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Gyselaers et al. 2005) Compared screening by NT, fβhCG + PAPP-A and by combined test. Population had a documented underestimation of NT measurement compared to the FMF. Between 1st January 2004-30th April 2004 13267 1st T maternal serum samples were analysed by AML (a lab in Antwerp). Outcomes The outcomes extracted for this evidence table were the DR and FPR for screening for DS by NT, fβhCG + PAPP-A, and by combined test. This was calculated for a 12/40 risk cut-off of 1:200 and 1:300. Accuracy of screening methods DR (95% CI), FPR (95% CI) Limitations 37 fetal losses with no karyotype; some may have been DS. Similar methods to many other studies, so comparison with other 1st T studies may be reasonable. 264 obstetric practices (mostly private) in 35 centres. 
Flanders, Belgium Prospective cohort study NT was performed according to FMF by those accredited by FMF, but only 6/264 were FMF accredited. They measured 11.4% of the NT values. 2700 local NT/CRL used to determine MoM relative to the FMF. FMF medians used to calculate MoM. Analysis of PAPP-A using ELISA (DRG International) and fβhCG using radioimmunoassay (Biosource). Each result adjusted for maternal weight, multiple gestation, smoking, and ethnicity, then converted to MoM. Used parameters reported by de Graaf et al. 1999 as referenced in this paper. Risk calculation using the algorithm reported by Wald et al. 1988 and Reynolds et al. 1989 (as referenced in this paper). Samples were obtained from women attending 264 obstetricians in 35 centres in Flanders. In this region 95% of prenatal care is provided by private obstetricians. Information sent to AML included CRL and NT. Exclusion: those with chromosomal disorders other than DS (n=23), fetal losses where karyotype unknown (n=37). 13207 in screening test analysis. Spectrum of disease: 26 DS/13230 = 0.20%. Verification At least once per year obstetrician reported outcome of all screened pregnancies. Non-responding obstetricians were contacted by phone. The estimated birth prevalence of DS from the MA distribution and British DS register = 19.4. Considering 43% spontaneous fetal loss from 1st T to birth and no intervention for the 17 ToP, then 18.7 expected at term in study population (9 live births plus 17 x 0.57).
NT
1:200 38.5% (20.0-57), 3.2% (2.9-3.5)
1:300 42.3% (23.3-61.3), 4.8% (4.5-5.2)
DR for 5% FPR 42.3% (23.3-61.3)
fβHCG + PAPP-A
1:200 80.8% (66-96), 11.6% (11.1-12.1)
1:300 80.8% (66-96), 16.8% (16.1-17.4)
DR for 5% FPR 61.5% (42.8-80.2)
Combined
1:200 76.9% (60.7-93.1), 5.5% (5.1-5.9)
1:300 80.8% (65.6-95.9), 8.6% (8.08
DR for 5% FPR 73.6% (56.0-90.1)
Difficulties implementing any of the screening strategies Recently had reported a systematic underestimation of the NT MoM in this database compared to the FMF.
Majority of those performing the NT measurement did fewer than 50 over 3 yrs. Results of age and NT poor. Did not use regional centre or operator specific NT values. MoM had poor correlation with FMF values. Those with other chromosomal disorders removed from analysis, which may slightly decrease the estimate of FPR compared to other studies as most include these in the unaffected cohort. Not blinded. When bloods sent, NT values sent as well. Only 11.4% had access to NT measurements as per FMF criteria. Authors’ conclusions The introduction of 1st T combined screening with unspecified USS methodologies was very easy. The performance was less than in single centres using FMF scanning criteria but the easy access to screening and the contribution from serum markers were responsible for the majority of DS detected in this population. Reviewer’s conclusions The results of the NT screening may not be applicable as most staff were not FMF trained.
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(Soergel et al. 2006) Comparison of 1st T screening by MA alone, fβHCG + PAPP-A, NT, and combined test. Those attending 1 of 3 centres (maternity unit of university hospital, or 2 private prenatal diagnosis centres) June 1998-October 2002 at GA 11-14/40 were offered screening by NT and serum markers (PAPP-A and fβhCG). Outcomes Outcomes were DR and FPR for screening for DS using MA, NT, or the combined test. The paper gave FPR as FP/total screened; this was recalculated using FP/unaffected.
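The distinction between FP/total screened and FP/unaffected, and the DR and FPR intervals reported throughout this table, can be illustrated with a minimal sketch. It assumes a normal-approximation (Wald) interval, which reproduces intervals of the kind shown for the larger studies but is not stated by any of the appraised papers to be their method. The function names and the counts in the usage note are hypothetical and chosen only for illustration.

```python
import math


def rate_with_wald_ci(events, total, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% interval.

    The interval is clipped to [0, 1]. The Wald interval is an assumption
    of this sketch; exact (binomial) intervals suit very small samples better.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = events / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def detection_rate(true_positives, affected):
    """DR = screen-positive affected pregnancies / all affected pregnancies."""
    return rate_with_wald_ci(true_positives, affected)


def false_positive_rate(false_positives, total_screened, affected):
    """FPR recalculated as FP / unaffected rather than FP / total screened."""
    unaffected = total_screened - affected
    return rate_with_wald_ci(false_positives, unaffected)
```

For example, `detection_rate(10, 26)` gives a point estimate of about 38.5% with an interval of roughly 20% to 57%; the counts here are illustrative and are not taken from any paper in this table.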
Accuracy of screening methods Limitations 301 had NT then did not have MSS as either the NT was very low and decided not to have serum screen or NT was very high and decided to have invasive testing. This would bias the sample and lead to a decreased performance of the combined test (decreased DR). University hospital and 2 regional private prenatal diagnosis centres Hanover, Germany Prospective cohort study Grade III-2 NT measured using FMF criteria when CRL 45-84mm. All sonographers were FMF trained. Serum taken and sent to lab on same day as NT. PAPP-A and fβhCG levels analysed using Kryptor analyser (Brahms). Parameters in twin pregnancies adjusted using approach of Wald et al. 1991 as referenced in this paper. Risks calculated using FMF computer algorithm with the PIA Fetal Database programme (ViewPoint). Uses MA specific risk for GA as per Snijders et al. (as referenced in this paper). Given partial results based on NT results and a priori risk before serum screen. Risk cut-off of 1:300 used to recommend invasive screen. 4394 had screening and 409 refused to participate. Analysis included those with CRL 45-84 mm (11-14/40) and with known pregnancy outcome. Total = 2497 includes 2423 singleton and 37 twins. 1450 did not have outcome documented as: incomplete data, refused to give details, wrong DOB for fetus, or other reason. All of the 2497 had MSS, 301 had just NT, and 2196 had both NT and MSS. Median MA = 32.5yrs (16-44 yrs)slightly older than general German obstetric population (29.7yrs). 26.4% ≥ 35. Median GA at NT =12+4/40 and median CRL = 62mm (45-84mm). Spectrum of disease: 11 DS/2497 = 0.44% 13 other chromosomal abnormalities. Twins included so analysis is per fetus rather than per pregnancy. Other chromosomal abnormalities = “unaffected.” Verification Invasive test or questionnaires about birth outcomes given to patient or sent to gynaecologist. Delivery room records cross checked. 
Risk 1:300
MA 63.6% (30.8-89.1), 28.3% (27-30)
NT 81.8% (48.2-97.7), 5.1% (4.3-6.0)
fβHCG + PAPP-A 87.5% (47.3-99.7), 15.4% (13.8-16.9)
Combined 87.5% (47.3-99.7), 4.0% (3.2-4.8)
Difficulties implementing any of the screening strategies None noted. Expected DS at birth = 6 (algorithm published by Hecht and Hook, 1994 as referenced in this paper). Fetal loss 12 weeks-birth = 30%, leading to an adjusted prevalence of 9 at 12/40 (6 x 1/0.7 = 6 x 1.43 = 8.6 at 12 weeks). Ascertainment is probably OK. Small number of DS (8 in the group having both screening methods). Does not state whether any attempt was made to karyotype fetuses lost between screening and term, nor any adjustment. Author’s conclusions “This provides further evidence that 1st T screening using a combined test is effective for the detection of DS at 11-14/40 with DR of about 90% for FPR of about 5%”. Reviewers’ conclusions Some bias in the sample may mean the performance of the combined test is underestimated.
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(Marsk et al. 2006) Comparisons were made between screening by MA alone, NT, and the combined test. The cases and controls were identified from the Swedish NT trial (n=39,572). This trial randomised women to either a 1st T or 2nd T USS. 47 pregnancies with DS were identified from the early scan group and 3 controls chosen for each case. Outcomes DRs and FPRs for screening for DS (risk 1:250) using MA alone, NT, and the combined test.
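The fetal loss adjustment used above for Soergel et al. (6 expected at birth, 30% loss from 12 weeks to birth, about 8.6 expected at 12 weeks) can be written as a short sketch. The function names are hypothetical; the only assumption is that a fixed fraction of affected pregnancies is lost between screening and term, so cases at screening equal cases at birth divided by the surviving fraction.

```python
def cases_at_screening(expected_at_birth, fetal_loss_fraction):
    """Back-calculate affected pregnancies at screening from the number
    expected at birth, given the fraction lost between screening and birth."""
    if not 0.0 <= fetal_loss_fraction < 1.0:
        raise ValueError("fetal_loss_fraction must be in [0, 1)")
    return expected_at_birth / (1.0 - fetal_loss_fraction)


def cases_at_term(cases_at_screening_time, fetal_loss_fraction):
    """Forward calculation: affected pregnancies expected to reach term."""
    if not 0.0 <= fetal_loss_fraction <= 1.0:
        raise ValueError("fetal_loss_fraction must be in [0, 1]")
    return cases_at_screening_time * (1.0 - fetal_loss_fraction)
```

Dividing by the surviving fraction moves from birth back to the time of screening; multiplying by it moves from screening forward to term. Mixing up the two directions is an easy way to get the wrong adjusted prevalence.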
Accuracy of screening methods DR (95%CI), FPR (95%CI) Limitations The aim of the study was not to determine an accurate DR and FPR rather to determine what extent adding MSS to NT in 1st T would change the DR and FPR in individuals and reduce the need for invasive testing. Stockholm, Sweden Nested case-control study Grade III-2 Stored (-70C) Serum which had been taken for Rubella screening was reanalysed. fβhCG and PAPP-A analysed using commercial kits (DelfiaPerkinElmer). Risk calculated based on MoM and MA using Lifecycle software (Wallac). The controls were randomised as close as possible to the case in time and geography and from within 5 yrs of MA. Only those living in the Stockholm area were eligible. Verification No details given MA 67%(51-84), 37%(28-46) NT 94%(79-99), 20%(13-28) Combined 94%(79-99), 7%(2-12) Difficulties implementing any of the screening strategies None noted Excluded those refusing to participate, with too little serum or serum from before 8/40 or after 14/40. 33 controls (24%) excluded and 16 DS cases (34%) excluded. MA of cases was 38.5 yrs ± 4 (SD) versus 35.5 yrs ± 4 (SD) which was statistically significant (p<0.01). Storage time did not differ between the two groups. The study design (case control) limits the ability to generalise the DR and FPR calculated in the study. The spectrum of disease may not reflect that in the general population. 29% of the pregnancies in the sample were pregnancies with DS, and the sample did not contain other chromosomal disorders. The sample may be biased if those refusing to participate had a different DS risk compared to those in the analysis (e.g. if NT measurements differed). Unclear how DS cases had been identified. Also, no details of verification were given. This is important if cases had been identified through large NT. Stored sera used for this retrospective analysis. This may affect the performance of the MSS. Sample size of 139 The small number in the sample is reflected by the wide CI’s. 
Spectrum of disease: 31 DS/139 = 22%. Equipment provided by PerkinElmer. Author’s conclusions The study confirms that when 1st T MSS is added to NT and MA the number of women screen positive decreases without a decrease in the DR.
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(Marsk et al. 2006) Stockholm, Sweden Nested case-control study Grade III-2 Continued Reviewers’ conclusions The study did not aim to accurately determine the DR and FPR for screening for DS. The design and possible biases mean these results should be regarded with caution.
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(O'Leary et al. 2006) Compared 1st T screening for DS using NT, fβHCG + PAPP-A, and combined test in WA. Between August 2001 and October 2003, 26,641 women had 1st T combined screen. Outcomes Outcomes were the DR and FPR for NT, fβHCG + PAPP-A and combined test. Accuracy of screening methods Limitations Retrospective nature of the study means there may have been fetal loss bias and overestimate of performance. Also sample may have been biased by the proportion not completing screening (removed from analysis; no further details).
These may have only had NT and declined further screening because either NT was very high (and had invasive screen) or very low (and had no further testing). 13 independent and government funded ultrasound clinics Western Australia (WA). Retrospective cohort study. Grade III-2 Screening NT data were obtained from all 13 ultrasound clinics undertaking screening across WA between August 2001 and October 2003. NT was measured as per FMF guidelines. Two labs provided MSS data, using either Kryptor analyzer (Brahms) (92% of data) or Wallac (PerkinElmer). Serum markers converted to MoM (lab specific) and adjusted for maternal weight. Removed twins, and “incomplete” screening (n = 3,946) and 415 no outcome. Those with no outcome may have been ToPs, spontaneous fetal losses, or relocations out of state. Analysis was for 22,280 singleton pregnancies between 11-13/40. Median MA = 31 yrs (14-47), compared with 29 yrs for all WA women giving birth in same period. GA for NT = 12+4/40, and for serum = 12+3/40. Spectrum of disease: 60 DS/22,280 = 0.27%; 78 other chromosomal disorders. The FPR given in the paper was FP/“all screened”. FP/unaffected was calculated for this table. Outcomes obtained by linking screening data to outcome data using probabilistic record linkage. Statewide data collections (registries for midwives notifications, birth defects, and hospital separations) combined with providers' screening data.
DR (95% CI), FPR (95% CI) (1:300)
NT 73% (62-85), 8.6% (8.2-8.9)
fβHCG + PAPP-A 85% (76-94), 11.5% (11.1-11.9)
Combined test 83% (74-93), 3.7% (3.5-4.0)
Difficulties implementing any of the screening strategies None noted. 78 other chromosomal disorders included in analysis as unaffected. Verification As above, review of Midwives Notification System and the Birth Defects Registry between August 2001-October 2003 (children were aged 3/12-30/12). No adjustment for fetal loss between screening and term.
Few details available for how risks calculated (which parameters were used etc). Authors’ conclusions This study provides evidence that 1st T combined screening performs well compared to that predicted by clinical trials. Reviewer’s conclusions While the study appears well planned and the sample was large and representative of the general population, the design (retrospective) makes it difficult to avoid bias such as verification bias.
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(O'Leary et al. 2006) 13 independent and government funded ultrasound clinics Western Australia (WA). Retrospective cohort study. Grade III-2 Continued Birth Defects Registry includes ToP for fetal abnormalities and has been validated. All true positives for DS had invasive screen. Estimated DS for MA distribution = 43.5. With fetal loss from 12/40-birth adjustment of 25% = 58 (95% CI 57-63) (divide 43.5 by 0.75 = 58).
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions Sample Outcomes and verification Results Comments
(Spencer et al. 2000a) The aim of the study was to clarify whether total hCG is a marker of DS in 1st T of pregnancy. Three groups of women comprised the study group. Outcomes DR for fixed 5% FPR for 1st trimester screening with AFP, total hCG, AFP + total hCG, PAPP-A + total hCG, and PAPP-A + fβhCG. Accuracy of screening methods (from text only % DR given) Limitations Biased sample. Of the 130 DS some were already screen positive by NT and 1st T MSS.
DR for fixed FPR 5% 1st T AFP 31% 1st T total hCG 31% 1st T AFP, total hCG 40% 1st T PAPP-A, total hCG 52% 1st T PAPP-A, free βhCG 67% Some of the markers analysed on stored samples. Case samples stored considerably longer than control samples. Statistical Modelling Analysed AFP and total hCG in large series of DS comparing these with other markers in 1st T. Regression analysis to determine relationship between marker level and gestational age. Converted each level to MoM for the gestational age. Performance modelled using standard modelling techniques as per Royston and Thompson. Observed population parameters for AFP and total hCG. Observed parameters for total hCG and AFP and Spencer et al. 1999 for others. 15,000 MoM generated for each marker from the Gaussian distributions of the log MoM affected and unaffected pregnancies. Values then used to calculate likelihood ratios with MA specific risk of DS in1st T for a population with a MA distribution of England and Wales. The first was women with singleton pregnancies who were referred to the Harris Birthright Centre for fetal karyotyping because of MA and NT screening at 10-14 weeks high risk for DS. 2nd group women self referred for assessment of risk. Blood collected from women at time of NT and serum stored -20C (before blinded retrospective analysis). Gestational age from CRL. Outcome ascertained in all women. 90 cases of DS part of previous study. To supplement these, third group of 40 samples from cases of DS from a Glasgow centre part of the Combined Ultrasound and Biochemistry Study in Scotland. In this series total hCG AFP PAPP-A and fβhCG all done at same time. Results also given for PAPPA + total hCG using GA specific parameters. Verification Outcome obtained in all pregnancies. 
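The modelling step described above (15,000 MoM values drawn from Gaussian distributions of log MoM for affected and unaffected pregnancies, then a DR read off at a fixed 5% FPR) can be sketched as follows. This is a simplified illustration for a single marker that thresholds the marker value directly; the appraised studies instead combined likelihood ratios with MA specific risk. The function name and all distribution parameters passed to it are hypothetical, not the published parameters.

```python
import random


def simulate_dr_at_fixed_fpr(mean_unaffected, sd_unaffected,
                             mean_affected, sd_affected,
                             fpr=0.05, n=15000,
                             high_is_positive=True, seed=0):
    """Estimate the detection rate at a fixed false positive rate.

    Values are log10 MoM drawn from Gaussian distributions. The cut-off is
    the marker level exceeded by a fraction `fpr` of simulated unaffected
    pregnancies. Set high_is_positive=False for markers that are lowered
    in affected pregnancies.
    """
    rng = random.Random(seed)
    unaffected = [rng.gauss(mean_unaffected, sd_unaffected) for _ in range(n)]
    affected = [rng.gauss(mean_affected, sd_affected) for _ in range(n)]
    if not high_is_positive:
        # Flip the sign so that "more extreme" always means "larger".
        unaffected = [-x for x in unaffected]
        affected = [-x for x in affected]
    unaffected.sort()
    cutoff_index = min(int(round(n * (1.0 - fpr))), n - 1)
    cutoff = unaffected[cutoff_index]
    detected = sum(1 for x in affected if x > cutoff)
    return detected / n
```

With identical distributions for affected and unaffected pregnancies the estimated DR equals the FPR (about 5%), which is a useful sanity check before trying marker-specific parameters.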
Controls resulted in the birth of unaffected babies DR using population parameters specific to the gestational age Fixed 5% FPR total hCG & PAPP-A 70-83 days 47% total hCG & PAPP-A 84-97 days 60% SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Observed parameters for total hCG and AFP and those from Spencer et al. 1999 (as referenced in this paper) were adopted for other variables Acknowledge the support of CIS in providing reagents and instrumentation. Author’s conclusion “This median shift has significant implications for interpreting previous studies and even more significant implications for DRs. This observation explains much of the confusion around total hCG in the first trimester and shows the importance of selecting analyte pairs and population parameters appropriate to the time in gestation when screening is performed.” 49 Table 10. Source Country Setting Study design Evidence Grading Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Comparison Interventions Sample (Spencer et al. 2000a) (total 130 DS cases) Statistical Modelling Median MA cases 37 (1745) Gestational age at storage time 86 days (43-97) CRL 61mm (42-84) Sample storage time 937 days (150-2317). Continued Outcomes and verification Results Comments Difficulties implementing any of the screening strategies Performance of PAPP-A + total hCG worse using parameter before 84 days. Reviewers’ conclusions When PAPP-A + total hCG analysed using appropriate parameters screening performance similar to PAPP-A, free βhCG at 84-97 days. Controls 10-14/40 obtained from samples taken as part of the routine 1st T screening in OSCAR clinic (Harold Wood). PAPP-A and free βhCG taken then samples frozen -20C. Further 90 specimens obtained from the Glasgow centre. Total of 959 controls used to establish medians and for reference data. 
Median MA controls 28.8 (15-45) Gestational age storage time 83 (42-97) CRL 56mm (36-84) Sample storage time 435 days (350-502) SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING 50 Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions (Christiansen and Olesen Larsen 2002) The contingent screening was compared to 1st T MSS, NT, and 1st T combined test. Monte Carlo simulation The aim of the study was to assess the discriminatory efficiency and costeffectiveness of a 1st T contingent screening method compared to 1st T combined screening. The protocol involved all women having 1st T MSS (PAPP-A and β hCG) prior to 11/40 and only those with an intermediate risk (including MA associated risk) having NT and those with very high risk having an invasive screen (CVS). The reasoning was to explore the possibility of rationing NT which requires training, expensive equipment and extensive quality control. Sample Outcomes and verification Results Comments Outcomes DR and FPR of 1st T Contingent Screening, 1st T MSS, NT, and combined test for a fixed cut-off. Accuracy of screening methods Limitations Distribution of serological markers from three different sources with no description of two of these populations. DR, FPR (risk cut-off for double screen of 1:1000, and final risk 1:400) 1st T Contingent 78.9%, 4% PAPP-A + β hCG 74.2%, 9.6% NT 74.0%, 5.7% Combined test 85.5%, 4.4% (Only 19.4% of women offered NT testing.) Distribution of 1st T serological markers from literature (Cuckle) and 2nd T from Wald (1994) NT distribution from a large UK multicentre study (Nicolaides). MA distribution was standardised (Van de Veen). MA related risk of DS based on formula from Cuckle and refers to risk at term. Monte Carlo simulation used to evaluate performance of different screening methods. 
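The contingent protocol described above (serum screening for all, NT only for intermediate risk, invasive testing offered directly at very high risk, final risk cut-off 1:400) can be sketched as a simple triage rule. This is an illustration only: the function names are hypothetical, the reading of 1:1000 as the serum risk above which NT is offered is an interpretation of the table entry, and the very-high-risk cut-off of 1:50 is a placeholder because the table does not give that value.

```python
def contingent_triage(serum_risk, nt_referral_cutoff=1 / 1000,
                      very_high_cutoff=1 / 50):
    """Decide the next step after 1st T serum screening (PAPP-A and beta hCG).

    serum_risk is the risk after serum markers and maternal age, expressed
    as a probability (1:400 is 1/400). very_high_cutoff is a hypothetical
    placeholder value.
    """
    if serum_risk >= very_high_cutoff:
        return "offer invasive test"
    if serum_risk >= nt_referral_cutoff:
        return "offer NT"
    return "screen negative"


def final_screen_positive(final_risk, final_cutoff=1 / 400):
    """Screen positive if the risk after NT is at or above the final cut-off."""
    return final_risk >= final_cutoff
```

Under this rule only women in the intermediate band are referred for NT, which is how the strategy rations ultrasound capacity.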
Author’s conclusion Contingent testing can reduce costs with a small decrease in performance. Contingent testing is attractive in areas where there is restricted access to NT screening preventing the introduction of 1st T screening. We have shown here that NT screening need not be applied to the whole population in order to reach a satisfactory 1st T screening performance. Reviewers’ conclusions As per authors’ conclusions this strategy shows promise. Needs prospective cohort study.
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions Sample Outcomes and verification Results Comments
(Christiansen and Olesen Larsen 2002) Monte Carlo simulation Continued For estimates of the DR and FPR for contingent screening, the distribution of risk values after serological screen was determined first. The published NT distributions were then used to determine the cut-off of the serological marker risk above which no NT measurement could reduce the final risk to <1:400.
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions Sample Outcomes and verification Results Comments
(Spencer et al. 2002) The aim of the study was to examine the values of HhCG (hyperglycosylated) with a sialic acid-specific lectin immunoassay in 1st T of pregnancy. Maternal serum samples from unaffected and DS pregnancies from women attending Harold Wood hospital.
Collected as part of research studies into 1st T screening since 1998 as part of routine screening OSCAR incorporating PAPP-A, fβhCG, USS for CRL and NT by FMF trained ultrasonographers. Outcomes DR for fixed 5% FPR for HhCG alone and in combination with conventional markers in a population with the MA distribution of pregnancies in England and Wales 1997-1999. Accuracy of screening methods Limitations Means from small sample for HhCG.
DR at fixed 5% FPR
ThCG 37.0%
HhCG 48.6%
ThCG & PAPP-A 49.0%
fβhCG 49.5%
PAPP-A 50.8%
HhCG & PAPP-A 61.2%
fβhCG & PAPP-A 69.5%
NT 73.5%
NT & ThCG 75.5%
NT, ThCG & PAPP-A 78.7%
NT & HhCG 79.9%
NT and fβhCG 80.1%
NT & PAPP-A 81.2%
NT, HhCG and PAPP-A 83.0%
NT, fβhCG and PAPP-A 88.9%
Controls and cases had different storage time. DS stored longer and although not thawed since collection may have influenced the results. Statistical modelling Results of a case-control study used for model. Maternal serum fβhCG, PAPP-A and ThCG were measured using commercial kits (Brahms). Maternal serum HhCG measured at two dilutions in singleton pregnancies using the lectin immunoassay. Used the means of the two results corrected for dilution in analysis. Analysis of samples blinded to outcomes. All marker levels converted to MoM for unaffected pregnancies at the same gestational age using previously described relationships (Ong et al. 2000 and Spencer et al. 2000a as referenced in this paper) and for HhCG from this study. Correction for maternal weight as per Neveux et al. 1996 as referenced in this paper. Statistical modelling was performed (as described by Royston and Thompson, 1992 as referenced in this paper) in order to generate a series of 15000 random MoM values for each marker from the Gaussian distribution of the log10 MoM for DS and unaffected pregnancies. Maternal serum had been stored at -20C. 224 unaffected and 54 DS were retrieved from the archive.
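The step from a marker MoM to an individual risk, as used in this kind of modelling, can be sketched as a likelihood ratio from two Gaussian distributions of log10 MoM multiplied into the MA specific prior odds. This is a minimal single-marker sketch; the function names are hypothetical, the distribution parameters in the usage note are made up, and the appraised studies used multivariate distributions and published parameters.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(mom, mean_affected, sd_affected,
                     mean_unaffected, sd_unaffected):
    """Likelihood ratio for one marker, with parameters on the log10 MoM scale."""
    x = math.log10(mom)
    return (gaussian_pdf(x, mean_affected, sd_affected)
            / gaussian_pdf(x, mean_unaffected, sd_unaffected))


def posterior_risk(prior_risk, lr):
    """Combine a prior (e.g. MA specific) risk with a likelihood ratio.

    Risks are probabilities; the update is done on the odds scale.
    """
    prior_odds = prior_risk / (1.0 - prior_risk)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1.0 + posterior_odds)
```

For instance, with made-up parameters where the affected mean is 0.3 log10 MoM above the unaffected mean and both SDs are 0.25, a result of 2.0 MoM gives a likelihood ratio above 1 and so raises the risk above the age-related prior.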
DS MA 36.1 years (20-44) Gestational age 87 days (73-97) CRL 62mm (38-85) Sample storage time 1187 days (147-2354) Controls MA 30.4 (16-41) Gestational age 84 days (70-96) CRL 55mm (33-80) Sample storage time 127 (116-1460) Difficulties implementing any of the screening strategies None noted. Authors’ conclusion “Maternal serum HhCG is unlikely to be of additional value when screening for DS in 1st T” Reviewers’ conclusions As per authors’ conclusions.
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions Sample Outcomes and verification Results Comments
(Spencer et al. 2002) Statistical modelling Continued The parameters for HhCG were derived from this study, those for ThCG from Spencer et al. 2000a, those for fβhCG and PAPP-A and NT from Spencer et al. 1999 (as referenced in this paper). These were used to calculate LR and combined with MA specific risk of DS (using Snijders et al. 1999) to calculate the DR in a population with the MA distribution of pregnancies in England and Wales 1997-1999.
Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions Sample Outcomes and verification Results Comments
(Spencer and Cuckle 2002) The aim of the study was to assess the within person variability of 1st T MSS markers and to see whether repeat measures of PAPP-A and fβhCG during 1st T would increase DR. Harold Wood Hospital Essex. Outcomes Compared screening using one sample to using two samples at 10/40 and 12/40, with and without NT.
Accuracy of screening methods Limitations Estimates only for the DS correlation (no DS cases in the sample). Statistical Modelling All women who book for maternity care are offered an appointment at the OSCAR clinic of 11/40. At this appointment MSS is done and then USS. Those having USS who are noted to have CRL less than 45mm are given a further appointment to attend for NT when GA is between 11/40 and 13+6/40 (as the FMF algorithm is based on data 11-14/40). A further MSS is therefore taken at this repeat appointment. fβhCG and PAPP-A analysed using Kryptor analyzer (Brahms). Samples analysed straight away. Converted to MoM using previously established GA and maternal weight adjustments. 261 pairs of data available (unaffected fetuses) 10-13+6/40. In routine practice about 8% initially attend when GA too early for NT and so need a repeat serum sample. 261 pairs of data were available for analysis over a three year period. All were unaffected pregnancies. Median MA = 28.97. 95% Caucasian. Median GA 1st sample: 10.7/40 Median 2nd sample: 12.6/40 DR For a fixed 5% FPR. PAPP-A + β hCG at 10/40 67% PAPP-A + β hCG at 10/40 & NT 87.2% PAPP-A + β hCG at 12/40 67% PAPP-A + β hCG at 12/40 & NT 87.3% PAPP-A + β hCG at 10/40 & at 12/40 70.5% PAPP-A + β hCG at 10/40 & 12/40 & NT 88.6% Difficulties implementing any of the screening strategies None noted. The standard deviation of fβhCG and PAPPA in each sample was estimated, and the correlation between the markers, within and between the two samples, was determined excluding outliers. Statistical modeling was used to predict DR for a fixed FPR of 5%. Modelling by numerical integration method (as per Royston and Thompson) used to model risks over a multivariate Gaussian distribution of log marker levels and the observed distribution of MA. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Small sample. Authors’ conclusion When women present twice in 1st T it might be OK to include both measures. 
However, no justification for taking two samples routinely. Reviewer’s conclusions As per authors conclusions. 55 Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions (Spencer and Cuckle 2002) NT parameters from FMF study (Nicolaides et al. 1998 as reference in this paper). Parameters for unaffected MSS markers were from their own series and means taken to be 1 MoM. For DS the SD and correlation coefficients from a meta-analysis of the difference between the variancecovariance matrices of affected and unaffected pregnancies (as described by Cuckle and van Lith 1999 as referenced in this paper). Statistical Modelling Continued Sample Outcomes and verification Results The means for DS for 1st and 2nd sample taken as those for 10 and 12/40 respectively. Assumed NT independent of MSS markers in MoM. MA specific risk derived from published meta-analysis (Cuckle et al. 1987 as referenced in this paper) and the MA distribution was that of England and Wales 1994-1998. Truncation limits applied to the MoM outside a certain range. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Comments 56 Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions Sample Outcomes and verification Results Comments (Christiansen and Jaliashvili 2003) The aim of the study was to determine the performance of PAPP-A using a polymonoclonal assay to see whether it is as efficient as double monoclonal assay in discriminating between DS and unaffected pregnancies. Samples obtained through 1st T MSS screening programme for syphilis and other diseases from Statens Serum Institute. 
Outcomes For this evidence table the DR and FPR of PAPP-A in combination with age, βhCG and NT were extracted from the paper. Accuracy of screening methods (DR,FPR) Limitations Data susceptible to biases typical of case control studies: particularly selection bias. Normal controls-i.e. no other chromosomal disorders. Monte Carlo simulation An in house poly-monoclonal assay for PAPP-A was developed and performed as detailed in the paper. All PAPP-A measurements were converted to the MoM of unaffected pregnancies for the gestational age. Distribution data for the other 1st T serum markers (βhCG and NT) were obtained from a recent meta-analysis (Cuckle and van Lith 1999 as referenced in this paper). DR in a population were estimated with a Monte Carlo simulation using MA specific risk (Cuckle et al. 1987 as referenced in this paper) and a standardized age distribution of women giving birth (van der Veen et al. 1997). All simulations were done with 100,000 cases. Truncation limits for PAPP-A were applied as detailed in this paper. 39 DS and 167 controls with normal pregnancy outcome. Samples collected as part of another study. All samples stored under identical conditions. Gestational age from LMP and in most cases confirmed by USS. All DS diagnosed in 2nd T as a result of screening or at birth (i.e. would not be biased by being screen positive or negative cases already by another serum test). Verification All DS verified by karyotyping. No other information. PAPP-A Risk 1:100 50.1%, 2.3% Risk 1:250 66.7%,6.4% Risk 1:400 74.5%,10.2% DR at 5% FPR 62% PAPP-A, βhCG Risk 1:100 61.8%, 2.1% Risk 1:250 75.5%, 5.3% Risk 1:400 81.3%, 8.1% DR at 5% FPR 74% PAPP-A, βhCG, NT Risk 1:100 79.4%, 1.1% Risk 1:250 85.9%, 2.7% Risk 1:400 88.8%, 4.0% DR at 5% FPR 90% Difficulties implementing any of the screening strategies The simplicity of the detection technology of the assay is obtained at the expense of the possibility of automation. Automation reduces the risk of human error. 
However felt this could be suitable in smallmedian lab if results not required within less than 5 hours. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Small sample for parameters of PAPP-A especially for DS pregnancies. Unclear if there was blinding to outcome. Author’s conclusion In conclusion we have described a high performance PAPP-A assay. Reviewers’ conclusions Some limitations in design and information about the population. 57 Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions Sample Outcomes and verification Results Comments (Laigaard et al. 2003) The aim of the study was to assess ADAM12 as a 1st T and 2nd T marker for DS. Unaffected samples from 1st T pregnant women 154 obtained as part of routine screening for DS in a university hospital Denmark. Screening at 8-13/40 and also have USS. Outcomes The outcomes extracted were the DR and SPR for ADAM12, PAPP-A, fβhCG and NT alone and combined at risk cut-offs of 1:200 and 1:400. Accuracy of screening methods Limitations Very small samples e.g. for DS 1st T =18 All with age DR, SPR In-house ELISA. ADAM12 1:200 77.7%,1.5% 1:400 81.5%, 3.2% No other details of length of time in storage or details of MA, ethnicity gestational age. Verification All DS by karyotyping. PAPP-A 1:200 52.3%, 5.1% 1:400 66.2%, 11.2% Monte Carlo simulation An ELISA (enzyme-linked immunosorbent assay) for the quantification of ADAM12 (A disintegrin and metalloprotease) was developed. Median ADAM12 concentrations were estimated for the gestational age. Marker levels were converted to the MoM of unaffected pregnancies. Compatibility with the normal distribution assessed using normal plots. Monte Carlo simulation done using a standardised age distribution (van der Veen et al. 1997 as referenced in this paper). 
The MA specific risk was taken from Cuckle et al. 1987. The distribution parameters for PAPP-A, βhCG and NT were taken from published meta-analysis (Cuckle and van Lith, 1999). Risks for cut-offs were 1:200 and 1:400 for term. ADAM12 in DS did not differ from unaffected in 2nd T so no further analysis was done. 2nd T unaffected screening sample obtained from 91 women in another screening programme (prenatal screen for severe malformations and DS at Statens Serum Institute). Women 14-20/40-no further analysis of 2nd T done. All these samples 1st T and 2nd T were kept cool (4C) before postage. DS samples 1st T samples n=18. Some were from the same University hospital programme (n=3) and were identified as part of the programme (i.e. already screen positive) and 15 samples from the quality control programme at Statens Serum Institut diagnosed (2nd T, n= 10) or at birth ( n=5). βhCG 1:200 42.4%, 5.1% 1:400 59.9%, 12.9% NT 1:200 67.4%, 2.8% 1:400 74.3%, 5.9% ADAM12 and βhCG 1:200 82.8%, 1.5% 1:400 86.3%, 3.1% Only the 3 from University hospital were taken at the same time as the controls and were recent whereas the other 1st T DS samples (n=15) were stored at -20C for many years. Only 3 DS (and 154 unaffected) used for relation between ADAM12 and PAPP-A and βhCG. Used 15 DS from the Staten Serum Institut where the same analysis had been done for βhCG and PAPP-A. No other details of length of time in storage or details of MA, ethnicity gestational age. Author’s conclusion ADAM12 and βhCG and PAPPA 1:200 85.4%, 1.6% 1:400 88.7%, 3.0% While further prospective studies are clearly needed the data here suggest that ADAM12 is a potentially valuable marker for use in prenatal screening ADAM12 and βhCG and PAPPA and NT 1:200 92.4%, 0.8% 1:400 94.1%, 1.5% Reviewers’ conclusions While results suggest the ADAM12 may be discriminatory the study was potentially biased and had a small sample (especially for DS). Needs further large prospective studies. 
Difficulties implementing any of the screening strategies: None noted.

Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued)

(Laigaard et al. 2003), Monte Carlo simulation, continued: Only the 3 from University hospital were taken at the same time as the controls and were recent, whereas the other 1st T DS samples (n=15) were stored at -20C for many years. 2nd T DS samples (n=12) were all from the quality control programme at Statens Serum Institut, diagnosed in 2nd T (n=8) or at birth (n=4) (biased as already positive by 2nd T screening). Gestational age from LMP with most confirmed by USS.

Source: (Palomaki et al. 2005)
Comparison interventions: Compared screening using ITA (HhCG) in the 1st T to using PAPP-A + fβhCG or PAPP-A + hCG with and without NT. Also compares screening in 1st T using DIA.
Sample: The maternal samples were collected between 1994 and 1996 as part of a previous trial. 16 centres in California and elsewhere in the USA recruited women already scheduled to have amniocentesis or CVS at 9-15/40.
Outcomes: The paper reports the performance of ITA and other 1st T serum markers at various fixed FPR and fixed DR and for various risk cut-offs. The performance is also stratified by week of gestation. For this evidence table the data for DR and FPR for 5% FPR, 85% DR and cut-off of 1:250 have been extracted. These are given for weeks 11-13/40 combined.
Accuracy of screening methods Limitations Model used data with a small sample and casecontrol design but appeared to be nested in another study. Statistical modelling As part of an earlier observational study, serum samples from 54 DS and 276 matched unaffected controls were collected between 9 and 15/40 Samples had been aliquoted and stored at 20 degrees C for 8 years. ITA was measured and converted to weight-adjusted multiples of the median (MoM). The distributions of other 1st T markers are from a single published study. Serum collected before amniocentesis and assayed for AFP, uE3, hCG, fβhCG and PAPP-A then aliquoted and stored at -20C. Case control series was constructed: 5 controls matched for each case of DS. Matched for time in storage, MA, gestational age, “race”, site where sample obtained. Mostly was for AMA. Collected demographic info including how dated, weight, “race”, diabetic status, gestational age based on USS BPD or CRL, and LMP None had had serum screen or NT in this pregnancy. 54 DS and 276 matched controls 9-15/40 gestation For this study a “never thawed” aliquot was available for 54 DS and 276 controls. These 330 samples sent (blinded) to Quest for ITA measurement using automated immunochemiluminometric assay (minimal cross reactivity with hCG). No further details of this sample. Verification Amniocentesis DR For a fixed 5% FPR (all with MA and for 11-13/40 combined). PAPP-A & fβhCG 67% PAPP-A & hCG 64% PAPP-A & ITA 67% PAPP-A, fβhCG & NT 84% PAPP-A, hCG and NT 84% PAPP-A, ITA & NT 84% PAPP-A, fβhCG, DIA & NT 88% PAPP-A, hCG, DIA and NT 87% PAPP-A, ITA, DIA & NT 87% PAPP-A, fβhCG, & ITA 72% PAPP-A, fβhCG, ITA, and NT 86% PAPP-A, fβhCG, ITA, DIA and NT 88% SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Mostly AMA. Some support from screening industry for the study. Partially funded by Quest Diagnostics, and Quest Diagnostics also assisted in the assaying of the samples for various biomarkers. One author worked for Quest. 
Author’s conclusion Serum ITA appears to be a useful first-trimester DS marker that could replace fβhCG measurements while maintaining performance. Reviewers’ conclusions. The use of ITA as a 1st T marker shows promise but needs further investigation. 60 Table 10. Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions (Palomaki et al. 2005) All results converted to MoM and corrected for maternal weight (Neveux 1996). Adjustments to the other variables and population variables published elsewhere (Wald et al. 1988 as referenced in this paper). Statistical modelling Continued SD of ITA in DS adjusted to take into consideration the varying concentrations by gestational week. Correlation coefficients between ITA and serum markers in 2nd T DS and unaffected were derived after exclusion of outliers. Correlation between NT and ITA in DS and unaffected assumed to be zero. Screening performance modelled for combinations of serum ITA and other 1st T serum markers with and without NT. Modelling method based on overlapping Gaussian distributions as described elsewhere. Used MA age distribution of the USA for 2000. Sample Outcomes and verification Results FPR For a fixed 85% DR (all with MA and for 11-13/40 combined). PAPP-A & fβhCG 16% PAPP-A & hCG 20% PAPP-A & ITA 16% PAPP-A, fβhCG & NT 5.6% PAPP-A, hCG and NT 6.8% PAPP-A, ITA & NT 5.6% PAPP-A, fβhCG, DIA & NT 3.7% PAPP-A, hCG, DIA and NT 3.7% PAPP-A, ITA, DIA & NT 3.8% PAPP-A, fβhCG, & ITA 12% PAPP-A, fβhCG, ITA, and NT 4.3% PAPP-A, fβhCG, ITA, DIA and NT 3.2% SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Comments 61 Table 10. Source Country Setting Study design Evidence Grading (Palomaki et al. 
2005) Statistical modelling (continued). Evidence table of primary research studies appraised investigating the accuracy of first trimester combined screening compared to components (continued)

Results: DR, FPR at a risk cut-off of 1:250 (all with MA and for 11-13/40 combined)
PAPP-A & fβhCG: 78%, 11%
PAPP-A & hCG: 76%, 11%
PAPP-A & ITA: 81%, 12%
PAPP-A, fβhCG & NT: 85%, 5.8%
PAPP-A, hCG and NT: 84%, 5.8%
PAPP-A, ITA & NT: 86%, 6.3%
PAPP-A, fβhCG, DIA & NT: 87%, 4.9%
PAPP-A, hCG, DIA and NT: 87%, 5.1%
PAPP-A, ITA, DIA & NT: 88%, 5.4%
PAPP-A, fβhCG, & ITA: 82%, 9.9%
PAPP-A, fβhCG, ITA, and NT: 87%, 5.6%
PAPP-A, fβhCG, ITA, DIA and NT: 88%, 5.1%
Difficulties implementing any of the screening strategies: None noted.

Chapter 4: Comparison of Second Trimester Screening Strategies

PRIMARY RESEARCH: STUDY DESIGNS AND QUALITY

The search identified 13 eligible papers comparing the accuracy of second trimester screening strategies. Below is an overview of study designs and aspects of quality represented by these studies. Full details of the papers appraised, including methods, key results, limitations and conclusions, are provided in evidence Table 13 (pages 69-84). Studies with directly observed DRs and FPRs are presented first, followed by studies where the results are estimations of performance using statistical modelling. For each of these two groups of papers, studies are presented in chronological order of publication.

Study design and grading

All 13 papers appraised in this chapter are based on primary research; there were no papers based on secondary research that fitted the inclusion and exclusion criteria of the review protocol.
Of the 13 papers comparing second trimester screening tests, ten reported directly observed comparisons of performance, two of which also presented some results estimated by statistical modelling (Monte Carlo simulation). The other three papers reported only comparisons of DR and FPR estimated by modelling.

Of the papers reporting directly observed comparisons, five used a cohort study design, with sample sizes ranging from 2,833 to 854,902. Two of these studies were retrospective analyses of screening (Benn et al. 2003; Muller et al. 2002a). There were five case-control studies with sample sizes ranging from 100 to 1,128. While all these studies were graded III-2 as per NHMRC, as discussed in Chapter 2, the ideal design for determining the DR and FPR of a screening strategy is a large prospective cohort study (or nested case-control study) where all participants receive the screening tests/methods being compared (Deeks 2001).

Study setting and samples

In one of the cohort studies the population consisted of women with twin pregnancies (Muller et al. 2003b). In this paper the proportion of DS in the analysis was 0.25%. In another cohort study the population was high risk: all were having amniocentesis for advanced maternal age or because they were considered high risk for other reasons (Huderer-Duric et al. 2000). Seventy-three percent of women in the population were aged 35 years and over, and the proportion of DS pregnancies in this analysis was 0.42%. For the other three cohort studies the population was a routinely screened population and the proportion of DS cases ranged from 0.11% to 0.19%. One of the cohort studies removed other chromosomal disorders from the analysis, while the remaining four papers appear to have included them in the analysis as unaffected cases.

The proportion of DS in the case-control studies ranged from 6% to 17%. The samples in the case-control studies consisted of cases (DS) and unaffected controls.
The papers either expressly excluded other chromosomal disorders or it can be assumed there were no chromosomal disorders apart from DS in the analysis.

Comparison screening methods

The screening tests/methods compared in this chapter are:
MA: either ≥ 38 years or ≥ 35 years.
Double test: AFP and hCG (fβhCG or total hCG).
Triple test: AFP, hCG (fβhCG or total hCG) and uE3.
Quad test: AFP, hCG (fβhCG or total hCG), uE3 and inhibin.
Lens culinaris agglutinin-reactive AFP (expressed as a percentage of total AFP).
Human Placental Growth Hormone (hPGH).
Invasive trophoblast antigen (ITA), formerly called hyperglycosylated hCG (HhCG).
hCG glycoforms (GlyhCG).
Unless otherwise stated, all screening strategies incorporate the individual’s maternal age.

Three case-control studies analysed some of the markers using stored samples. This may bias results if storage times differed between cases and controls, or if some markers were analysed using stored samples while others were analysed using fresh samples. In one study (Baviera et al. 2004) hPGH was analysed using stored specimens whereas TT markers were not (storage times were the same for cases and controls), and in another study inhibin and uE3 were analysed using stored specimens and AFP and hCG using fresh serum samples (case specimens were stored for a longer period) (Harrison and Goldie 2006).

Two of the modelling papers used data from populations where stored samples had been used either for one of the markers (Palomaki et al. 2004), or for all marker levels, but where population parameters for one of the markers had been obtained from these samples, while the other parameters were obtained from the literature (Talbot et al. 2003). In both these papers either cases and controls were matched for storage time, or there was no difference in storage time between cases and controls.
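The conventional risk calculation used by most of the papers appraised in this chapter combines a maternal age specific prior risk with a likelihood ratio derived from Gaussian distributions of marker levels (as log MoM) in affected and unaffected pregnancies. The following is a minimal sketch of that idea for a two-marker (double test) case. It is not the method of any particular paper or software package: the distribution parameters, the prior risk of 1 in 900 and the function names are hypothetical placeholders chosen for illustration only.

```python
import math


def bivariate_normal_pdf(x, y, mean, sd, rho):
    """Density of a bivariate Gaussian at (x, y)."""
    zx = (x - mean[0]) / sd[0]
    zy = (y - mean[1]) / sd[1]
    one_minus_r2 = 1.0 - rho * rho
    norm = 2.0 * math.pi * sd[0] * sd[1] * math.sqrt(one_minus_r2)
    q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / one_minus_r2
    return math.exp(-0.5 * q) / norm


def likelihood_ratio(afp_mom, hcg_mom, affected, unaffected):
    """Ratio of affected to unaffected density at the observed log10 MoMs."""
    x, y = math.log10(afp_mom), math.log10(hcg_mom)
    return (bivariate_normal_pdf(x, y, **affected)
            / bivariate_normal_pdf(x, y, **unaffected))


def posterior_risk(prior_risk, lr):
    """Combine the age-specific prior with the marker likelihood ratio."""
    prior_odds = prior_risk / (1.0 - prior_risk)
    post_odds = prior_odds * lr
    return post_odds / (1.0 + post_odds)


# HYPOTHETICAL parameters on the log10 MoM scale (AFP, hCG); not taken
# from any published meta-analysis.
AFFECTED = {"mean": (-0.14, 0.30), "sd": (0.20, 0.25), "rho": 0.10}
UNAFFECTED = {"mean": (0.0, 0.0), "sd": (0.18, 0.23), "rho": 0.10}

prior = 1 / 900  # hypothetical maternal age specific risk
lr = likelihood_ratio(0.7, 2.0, AFFECTED, UNAFFECTED)
risk = posterior_risk(prior, lr)
screen_positive = risk >= 1 / 250  # risk cut-off of 1:250
print(f"LR = {lr:.2f}, risk = 1 in {1 / risk:.0f}, positive = {screen_positive}")
```

Adding a third or fourth marker extends the same calculation to a higher-dimensional Gaussian; the sketch keeps two markers so that the density can be written without a linear algebra library.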
All papers in this chapter except one used conventional methods to combine maternal age specific risks and marker levels to determine an individual’s risk and therefore the DR and FPR (i.e. using multivariate Gaussian distributions). The paper in question established a unique scoring system, based on the marker profile in relation to the median level of markers in unaffected pregnancies (Azuma et al. 2002). The scoring system did not incorporate the maternal age specific risk of DS. Where details were given in the paper, all serum markers were analysed using commercial kits and individual risks were established using commercially available software.

Outcomes

Where a large number of results for different risk cut-offs, fixed FPRs or DRs have been reported (e.g. for modelling papers), a fixed FPR of 5% and/or a fixed DR of 85% have been selected for the evidence table. While studies report DR for a fixed FPR, in practice the cut-off chosen for screening programmes is an individual’s risk (Benn and Donnenfeld 2005).

In four of the five cohort studies it appears that any pregnancies affected by other chromosomal disorders (e.g. trisomy 18) have been included as “unaffected” pregnancies for analysis of screening accuracy, and so will contribute to false positive and true negative results. As discussed previously, these cases are not strictly false positives (as a positive screen is clinically relevant), but it is appropriate when calculating the performance of screening for DS to include these cases in the unaffected group.

PRIMARY RESEARCH: STUDY RESULTS

Accuracy of screening methods

Maternal age alone as a screening test. Four studies had results for maternal age alone as a screening test. These results are presented in Table 11. In two studies, maternal age was clearly inferior to any of the other strategies used: specifically the DR was lower and the FPR was higher than for the other screening methods (Benn et al. 2003; Wald et al. 2003a).
In one national study of the performance of maternal serum screening in the 2nd trimester (using triple test or double test), maternal age (≥ 38 years) had a lower DR (9.5%) compared to 2nd trimester screening (73.5%) but a lower FPR (1.6% compared to 6.85%) (Muller et al. 2002a). However, while the results for maternal age as a screening test could be extracted from the data, this was not the aim of the paper. As per the national policy those aged 38 years or older were offered amniocentesis, and so only 1.7% of the population included in the study were over 38 years old (Muller et al. 2002a).

Another paper, a study of screening for DS in twin pregnancies, found that maternal age as a screen (adjusted for twins) had a lower DR but a superior (lower) FPR compared to various double test strategies (Muller et al. 2003b). Maternal age with a correction for the risk of DS in those with dichorionic pregnancies had a DR equal to the double test but a higher FPR.

Overall, the literature supported the low predictive performance of maternal age compared with other screening strategies in the second trimester.

Table 11. Comparison of DRs and FPRs, maternal age versus other screening strategies

Reference | Test | DR (%), (95% CI) | FPR (%), (95% CI)
(Muller et al. 2002a) | TT & double (1:250) | 73.5 (70.1-76.2) | 6.85 (6.80-6.90)
(Muller et al. 2002a) | MA ≥ 38 years | 9.5 (8-11) | 1.6 (1.59-1.65)
(Wald et al. 2003a) | Quad (1:300) | 81 (72-89) | 7 (6.7-7.2)
(Wald et al. 2003a) | MA ≥ 35 years | 51 (41-62) | 14.3 (14.0-14.7)
(Wald et al. 2003a), DR for fixed FPR of 5% | MA | 26 (17-35) | 5
(Wald et al. 2003a), DR for fixed FPR of 5% | Double | 61 (51-72) | 5
(Wald et al. 2003a), DR for fixed FPR of 5% | TT | 66 (56-76) | 5
(Wald et al. 2003a), DR for fixed FPR of 5% | Quad | 75 (66-84) | 5
(Wald et al. 2003a), DR for fixed FPR of 5% & fetal loss adjustment | MA | 24 | 5
(Wald et al. 2003a), DR for fixed FPR of 5% & fetal loss adjustment | Double | 57 | 5
(Wald et al. 2003a), DR for fixed FPR of 5% & fetal loss adjustment | TT | 62 | 5
(Wald et al. 2003a), DR for fixed FPR of 5% & fetal loss adjustment | Quad | 70 | 5
(Benn et al. 2003), modelled (1:270) | MA ≥ 35 years | 53.3 | 17.1
(Benn et al. 2003), modelled (1:270) | TT | 79.3 | 10.4
(Benn et al. 2003), modelled (1:270) | Quad | 83.8 | 9.9
(Muller et al. 2003b) (twins), risk cut-off 1:250 | MA ≥ 37 years | 27.3 (6-61) | 6.5 (5.6-7.4)
(Muller et al. 2003b) (twins), risk cut-off 1:250 | MA corrected, dichorionic | 54.5 (25-84) | 24.4 (23-26)
(Muller et al. 2003b) (twins), risk cut-off 1:250 | Double, observed MoM | 54.5 (25-84) | 7.6 (6.6-8.5)
(Muller et al. 2003b) (twins), risk cut-off 1:250 | Double, twin median | 54.5 (25-84) | 7.9 (6.9-8.8)
(Muller et al. 2003b) (twins), risk cut-off 1:250 | Double, mono- or dichorionic twin median | 54.5 (25-84) | 7.6 (6.6-8.5)

Quad versus triple test and double test

Five studies were identified that included results for the validity of the quad test and had comparisons with the triple test. Three papers also had results for the double test. The results for papers comparing the quad test with the triple test and the double test are summarised in Table 12. These comparisons took different forms. One study presented the results for a fixed cut-off of 1:250 (observed and modelled), for a fixed 3% FPR, and for a fixed 75% DR (Harrison and Goldie 2006). Two papers gave results only for a fixed 5% FPR (Palomaki et al. 2004; Wald et al. 2003a) and two papers provided results for a 2nd trimester cut-off of 1:270 (Benn et al. 2003; Benn et al. 2001). It should be noted that in some of these studies there was a wide range of other tests included.

The quad test consistently performed better than TT, which in turn performed better than the double test. The one exception was a study where the quad test performed better than both TT and the double test except for the observed results at a cut-off of 1:250, where the DR was higher but the FPR was also slightly higher (6.7% versus 6.6%) for the quad test compared to the TT (Harrison and Goldie 2006).

A further study had results for the TT and the double test but not the quad test (and so was not included in Table 12) (Huderer-Duric et al. 2000). In this study at a risk cut-off of 1:100 TT had a clearly better DR but higher FPR compared to the double test.
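The comparisons above report performance in two ways: DR and FPR at a fixed risk cut-off (e.g. 1:250), and DR at a fixed FPR (e.g. 5%). The sketch below shows how both summaries can be derived from the same set of individual risks. The risk values are made-up toy data, the function names are hypothetical, and ties between risk values are assumed not to occur.

```python
def rates_at_cutoff(affected, unaffected, cutoff):
    """DR and FPR when risks at or above the cut-off are screen positive."""
    dr = sum(r >= cutoff for r in affected) / len(affected)
    fpr = sum(r >= cutoff for r in unaffected) / len(unaffected)
    return dr, fpr


def dr_at_fixed_fpr(affected, unaffected, fpr):
    """DR at the cut-off that flags the top `fpr` fraction of unaffected."""
    ranked = sorted(unaffected, reverse=True)
    k = int(fpr * len(ranked) + 1e-9)  # number of false positives allowed
    if k == 0:
        return 0.0, float("inf")
    cutoff = ranked[k - 1]
    dr, _ = rates_at_cutoff(affected, unaffected, cutoff)
    return dr, cutoff


# Toy data only: individual risks for 100 unaffected and 5 affected pregnancies.
unaffected_risks = [1 / (200 + 40 * i) for i in range(100)]
affected_risks = [1 / 50, 1 / 120, 1 / 300, 1 / 1000, 1 / 220]

dr_250, fpr_250 = rates_at_cutoff(affected_risks, unaffected_risks, 1 / 250)
dr_5pct, cutoff_5pct = dr_at_fixed_fpr(affected_risks, unaffected_risks, 0.05)
print(f"1:250 cut-off: DR {dr_250:.0%}, FPR {fpr_250:.0%}")
print(f"fixed 5% FPR: DR {dr_5pct:.0%} at cut-off 1 in {1 / cutoff_5pct:.0f}")
```

Because the two summaries fix different quantities, results reported at a risk cut-off and results reported at a fixed FPR are not directly comparable between studies, even when the underlying markers are the same.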
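Table 12. Comparison of performance of the quad test, triple test, and double test.

Reference | Test | DR (%), (95% CI) | FPR (%), (95% CI)
(Wald et al. 2003a), DR for 5% FPR | Double | 61 (51-72) | 5
(Wald et al. 2003a), DR for 5% FPR | TT | 66 (56-76) | 5
(Wald et al. 2003a), DR for 5% FPR | Quad | 75 (66-84) | 5
(Benn et al. 2003), modelled (1:270 at 2nd T) | TT | 79.3 | 10.4
(Benn et al. 2003), modelled (1:270 at 2nd T) | Quad | 83.8 | 9.9
(Harrison and Goldie 2006), risk 1:250 | Double | 63 (54-71) | 7.0 (5.4-8.6)
(Harrison and Goldie 2006), risk 1:250 | TT | 70 (62-78) | 6.6 (5.1-8.2)
(Harrison and Goldie 2006), risk 1:250 | Quad | 72 (64-80) | 6.7 (5.2-8.2)
(Harrison and Goldie 2006), 3% FPR | Double | 52 (44-61) | 3
(Harrison and Goldie 2006), 3% FPR | TT | 55 (47-64) | 3
(Harrison and Goldie 2006), 3% FPR | Quad | 56 (48-65) | 3
(Harrison and Goldie 2006), 75% DR | Double | 75 | 13.2 (11.1-15.2)
(Harrison and Goldie 2006), 75% DR | TT | 75 | 10.3 (8.4-12.1)
(Harrison and Goldie 2006), 75% DR | Quad | 75 | 9.9 (8.0-11.7)
(Harrison and Goldie 2006), modelled (1:250) | Double | 64 | 7.6
(Harrison and Goldie 2006), modelled (1:250) | TT | 67 | 7.2
(Harrison and Goldie 2006), modelled (1:250) | Quad | 70 | 6.7
(Benn et al. 2001), modelled 2nd T risk of 1:270 | TT | 74.6 | 8.4
(Benn et al. 2001), modelled 2nd T risk of 1:270 | TT, LMP dating | 70.4 | 9.05
(Benn et al. 2001), modelled 2nd T risk of 1:270 | TT, USS dating | 77.4 | 7.96
(Benn et al. 2001), modelled 2nd T risk of 1:270 | Quad | 79.3 | 7.38
(Benn et al. 2001), modelled 2nd T risk of 1:270 | Quad, LMP dating | 76.4 | 8.15
(Benn et al. 2001), modelled 2nd T risk of 1:270 | Quad, USS dating | 81.3 | 6.86
(Palomaki et al. 2004), modelled DR for 5% FPR | Double, LMP dating | 65 | 5
(Palomaki et al. 2004), modelled DR for 5% FPR | Double, BPD dating | 67 | 5
(Palomaki et al. 2004), modelled DR for 5% FPR | TT, LMP dating | 67 | 5
(Palomaki et al. 2004), modelled DR for 5% FPR | TT, BPD dating | 72 | 5
(Palomaki et al. 2004), modelled DR for 5% FPR | Quad, LMP dating | 77 | 5
(Palomaki et al. 2004), modelled DR for 5% FPR | Quad, BPD dating | 79 | 5

Evidence regarding second trimester screening included in Chapter 5 evidence tables

Eight studies included in Chapter 5 also have evidence relevant to this chapter (Benn et al. 2005a; Cuckle et al. 2005; Cuckle 2003; Knight et al. 2005; Malone et al. 2005; Rode et al. 2003; Wald et al. 2006a; Wald et al. 2003b). All eight of these papers confirmed that the quad test has a higher DR and lower FPR than the TT, and three papers (Benn et al. 2005a; Cuckle et al. 2005; Wald et al. 2003b) confirmed the conclusion that the TT performs better than the double test.

Other comparisons

In one paper, a case-control study, the performance of screening using TT with the addition of Lens culinaris agglutinin-reactive AFP (AFP-L3) was compared to the conventional TT (using total hCG). It also compared the performance of the conventional TT to screening with the TT using Lens culinaris agglutinin-reactive AFP instead of the standard AFP (Azuma et al. 2002).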
The TT plus AFP-L3 had a higher DR for all cut-offs than TT alone but a higher FPR. The TT with AFP-L3 instead of AFP had a higher DR and lower FPR than the conventional TT. These results show AFP-L3 is a possible additional 2nd trimester marker (Azuma et al. 2002).

A case-control study compared the performance of the triple test using free βhCG with the triple test using intact hCG (Sancken and Bahner 2003). Neither showed superior performance: fβhCG had a slightly higher DR at a fixed 5% FPR (64% versus 55%), but a lower DR and lower FPR (70%, 10%) than total hCG (79%, 12%) at a risk cut-off of 1:380.

Another paper estimated the validity (with statistical modelling) of the double test using fβhCG compared to using total hCG (Talbot et al. 2003). For a fixed 5% FPR, fβhCG plus AFP had a higher DR (66.4%) than total hCG plus AFP (59.4%). The same paper compared these variations of the double test to screening with GlyhCG plus AFP (DR was 53.1%) (Talbot et al. 2003). The case-control design and small sample size mean there are some limitations to the study, but GlyhCG does not appear to be any better than other forms of hCG already in use.

Another case-control study compared the predictive performance of hPGH to the individual components of the TT and compared the TT to screening with TT plus hPGH (Baviera et al. 2004). For a fixed 5% FPR, hPGH performed better than AFP (DR of 34.4% versus 25.8%), about the same as uE3 (35.5%), but worse than hCG (41.9%). When the conventional TT was compared to TT with the addition of hPGH, the DR improved from 65.6% (95% CI 49-82) to 71.9% (95% CI 56-87) at a fixed 5% FPR (Baviera et al. 2004).

A case-control study compared the performance of screening with ITA to the components of the conventional quad test.
For a 5% FPR, ITA had a DR of 81% (95% CI 54-96), which was better than the DR of AFP of 29% (95% CI 8-58), uE3 of 36% (95% CI 13-65), DIA of 36% (95% CI 13-65) and hCG of 75% (95% CI 48-93). hCG performance (which correlates with ITA) was higher than expected. These results could have been subject to bias or chance, and in this study ITA performance may have been overestimated (Pandian et al. 2004).

A further study estimated the performance of screening with ITA using modelling (Palomaki et al. 2004). When ITA replaced hCG in the triple test the performance was similar: for a 5% FPR, DR was 71% using ITA and 72% using hCG. When ITA replaced DIA in the quad test the DR was 76% compared to 79% with the conventional quad test. The quad test plus ITA had a DR of 83% (Palomaki et al. 2004).

Difficulties implementing any of the screening strategies

Five papers appraised in this chapter noted difficulties implementing any of the screening strategies. Mostly these were quality control issues or issues with the reliability of serum marker measurements. In one study (Harrison and Goldie 2006) inhibin-A was associated with considerable assay drift and marked within-batch imprecision, which led the authors to state they had concerns with its use in its present form.

In another study serum level results for two individuals had to be removed as they were outliers: a DS pregnancy with a fβhCG of 29.7 MoM and a total hCG of 6.5 MoM, and an affected pregnancy with fβhCG of 0.06 MoM and total hCG of 0.05 MoM (Sancken and Bahner 2003). The authors stated that dimeric hCG is thermally unstable during storage and transport and dissociates into fβhCG subunits (Sancken and Bahner 2003). Analysis of stored specimens in this study could have distorted marker levels compared to freshly analysed samples (Sancken and Bahner 2003).
A paper based on a twin population noted an apparent problem where the results of serum screening in a twin with DS could be “normalised” by the results of an unaffected twin (Muller et al. 2003b). The paper did not provide any data to support this statement.

Two papers noted problems determining an individual’s risk when dating was not accurate. One paper compared various 2nd trimester screening methods using LMP or BPD dating (Palomaki et al. 2004). Performance was worse if LMP dating was used compared to using BPD dating. Another found that the FPR of the quad test improved (reduced to 8.2% from 9.0%) after correction for dating errors of more than 10 days (Benn et al. 2003).

Another paper noted problems with screening accuracy (higher FPRs) when using the medians from the software (as opposed to study population medians) (Huderer-Duric et al. 2000).

Summary of results

There was a high level of consistency supporting:
poor performance of maternal age as a screening method compared with other screening tests/methods
improved performance of the quad test compared with triple test and double test.

It was not possible to determine whether screening methods using fβhCG or total hCG had better performance in the 2nd trimester. fβhCG performed marginally better than hCG but only two of the papers appraised compared screening using these markers.

Markers other than standard 2nd trimester screening (hPGH, ITA, and AFP-L3) showed promising results. However, further research is required using prospective cohort studies and using fresh serum samples for analysis.

The difficulties implementing screening strategies included issues with the reliability of serum marker measurements, problems with assay reliability, issues determining an individual’s risk when dating was not accurate, and for one study difficulties with accuracy when using software medians to calculate risk (as opposed to study population medians).
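The sensitivity of risk estimates to dating noted above follows from how marker levels are normalised: each concentration is divided by the unaffected median for the recorded gestational age to give a MoM. The sketch below illustrates the mechanism for a marker whose median rises with gestation. The median curve (a value of 30 in arbitrary units at 16 weeks, rising 15% per week) and the function names are illustrative assumptions, not parameters from any of the papers appraised.

```python
def marker_median(ga_weeks, median_at_16=30.0, weekly_growth=1.15):
    """HYPOTHETICAL unaffected median for a marker that rises with gestation."""
    return median_at_16 * weekly_growth ** (ga_weeks - 16.0)


def mom(concentration, ga_weeks):
    """Multiple of the median for the recorded gestational age."""
    return concentration / marker_median(ga_weeks)


true_ga = 16.0                         # actual gestation in weeks
concentration = marker_median(true_ga) # a sample exactly at the true median

mom_correct = mom(concentration, true_ga)
# Same sample, but gestation overestimated by 10 days (e.g. uncertain LMP).
mom_misdated = mom(concentration, true_ga + 10 / 7)

print(f"MoM with correct dating: {mom_correct:.2f}")
print(f"MoM with gestation overestimated by 10 days: {mom_misdated:.2f}")
```

Under these assumptions an overestimate of gestation makes the sample look low relative to the median even though nothing about the pregnancy has changed, which is one way a dating error can move an individual across a risk cut-off.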
Conclusion

Some markers other than those used in conventional 2nd trimester screening showed promising screening performance. These were hPGH, ITA, and AFP-L3. However, further research is required using prospective cohort studies and using fresh serum samples for analysis.

The quad test has the highest performance of all conventional screening strategies confined to the 2nd trimester, followed by the triple test then the double test.

Maternal age alone has a lower screening performance than other screening methods and is not recommended as a screening test.

While some of the studies estimated DR and FPR using statistical modelling, there were also directly observed data supporting this evidence. Other limitations included using case-control design (especially for those considering new markers) and retrospective designs. However, the prospective cohort studies included in this chapter also contributed to this evidence.

A population-based screening programme will need to include quality control measures to ensure serum measurements are accurate. It will also need to consider the use of USS to correct for uncertain dates, and to ensure software is able to accurately determine an individual’s risk of DS.

Table 13. Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components

Columns: Source, country, setting, study design, evidence grading; Comparison screening strategies; Sample; Outcomes and verification; Results; Comments

Source: (Huderer-Duric et al. 2000)
Comparison screening strategies: The comparison interventions were the TT, compared to screening using AFP and hCG (double test).
Sample: Between 1996-1998 a total of 2833 pregnant women (15-22/40) who were enrolled in an antenatal care programme had AC.
Outcomes: The outcomes extracted for this table were the DR and FPR for the TT and the “double test” (AFP and hCG) at risk cut-offs of 1:100, 1:200, 1:300.
The paper gives the DR of DS and the specificity with a CI. It appears that other chromosomal disorders have been removed from this analysis (i.e. were not included in the unaffected group).
Verification: All had amniocentesis. Unclear if any fetal loss between blood test and AC.

Country, setting, study design, evidence grading: Zagreb, Croatia. Prospective cohort study. Grade III-2.

Comparison screening strategies (methods): MSS using Ortho-Clinical Diagnostics assays. Markers corrected for weight. Risk determined using Ortho-Clinical Diagnostics software to analyse MoM. First 986 used software medians; next 1847 used the study population medians. Software corrects for MA, prior age-based risk for previous DS, and gives risk at term.

Sample: In 2071 (73%) AC was indicated because of AMA (35 years and over). 762 were indicated for other reasons or requested. 2.8% of total had previous DS, 0.8% IDDM, 1.7% twins. 73% over the age of 34 years. For most (98.2%) GA determined from USS. Mean GA = 17.7/40. Spectrum of disease: 12 DS/2833 = 0.42%. 12 cases of other chromosomal disorders.

Results: Accuracy of screening methods, DR% (95% CI), FPR% (95% CI)
Triple test
1:100: 75% (43-95), 20% (19-22)
1:200: 83% (52-98), 30% (28.5-31.9)
1:300: 92% (62-100), 37% (34.9-38.5)
Double test
1:100: 42% (14-70), 16% (14-18)
1:200: ?DR, 26% (24-27)
1:300: ?DR, 32% (31-34)
The paper says that introduction of their own population specific means reduced the overall FPR from 27% to 18.6% (1:100), from 38% to 26.7% (1:200) and from 45% to 33% (1:300).
Difficulties implementing any of the screening strategies: Higher FPR when the software medians were used in risk calculations.

Comments
Limitations: The population was an older high risk population as all were having AC. This may reduce the external validity of the results. DR may be overestimated. Setting unclear (although authors were from University School of Medicine, Zagreb, Croatia). Small number of DS cases (n=12). Unclear how risks calculated for twins.
Few details about sample handling and storage and methods for determining risk (parameters used etc). Unclear how some FPR figures calculated. DR not available for comparison for all risk cut-offs for double test.
Author’s conclusions UE3 significantly contributes to DS detection with cost of slightly reduced specificity. The 1:300 risk cut-off caused an unfavorable compromise between sensitivity and specificity.
Reviewers’ conclusions May not be generalisable as high risk population. Difficult to compare 2 screening methods as unclear how some figures calculated and not all DR are available. However in this study TT had a clearly better DR but higher FPR compared to the double test.

Table 13. Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Azuma et al. 2002) Compared screening using the TT (AFP, UE3, and hCG) with the addition of Lens culinaris agglutinin-reactive AFP (AFP-L3), to the TT alone. Also compared the performance of conventional TT to the TT with AFP-L3 instead of AFP. No strategies included MA related risk.
Retrospectively found 530 women without DS and 31 women with fetuses affected by DS. Seen at the hospital from 1989-1998.
Outcomes The outcomes extracted for this evidence table were the DR and FPR using different cut-offs (scores of 5, 4, 3, 2, respectively) for TT with the addition of AFP-L3 compared to TT.
Accuracy of screening methods DR% (95%CI), FPR% (95%CI)
Limitations Excluded those with other chromosomal abnormalities. These are likely to have a positive screen for DS. This will have reduced the FPR compared to papers where these cases are included.
University hospital Japan Case-control study Grade III-2
The authors established a scoring index for each marker and evaluated the screening performance of screening using this scoring method. Gestation (dates from LMP and corrected by USS). Exclusions: DM, hepatic disease, multiple pregnancies, major fetal anomalies.
MA for cases 38 ± 3.9, MA for controls 35.9 ± 4.8 years (mean and SD) (p<0.05). Samples taken at 14-20/40. AFP-L3 measured by electrophoresis and expressed as a percentage of total AFP. GA cases 17.1 ± 1.8/40, GA controls 16.3 ± 1/40 (p<0.0001).
AFP, hCG, and UE3 measured by commercial kits (Abbott Labs, Wallac Oy, Diagnostic Products respectively). Marker levels then expressed as MoM median based on 4256 unaffected Japanese pregnancies. A score of 1 or 2 was assigned to an individual’s marker profile based on the relationship between the individual marker MoM and the median value of serum markers in unaffected pregnancies.
Spectrum of disease: 31/561 = 5.5% Total = 561
Also compared TT using hCG, uE3 and AFP-L3 to conventional TT (hCG, uE3 and AFP).
Verification All fetal karyotypes were diagnosed by AC because of either AMA, abnormal screening results, or other reasons.
TT & AFP-L3-score cut-off =5 35.5% (19-52), 1.3% (0.3-2.3)
TT & AFP-L3-score cut-off =4 83.9% (71-97), 5.1% (3.2-7.0)
TT & AFP-L3-score cut-off =3 90.3% (74-98), 20.9% (17.5-24.4)
TT & AFP-L3-score cut-off =2 100% (89-100), 57.5% (53.3-61.8)
TT -score cut-off =5 16.1% (3-29), 0.2% (0.9-3.30
TT -score cut-off =4 35.5% (19-52), 1.3% (0.3-2.3)
TT -score cut-off =3 61.3% (44-78), 7.0% (4.8-9.2)
TT -score cut-off =2 90.3% (74-98), 35.3% (31.2-39.4)
No other chromosomal disorders.
hCG, uE3 and AFP-L3 80.6%, 6.6% (TT) hCG, uE3 and AFP 61.3%, 7%
Difficulties implementing any of the screening strategies No objective problems noted
As all had been diagnosed by AC (for AMA, abnormal screening results, or various other reasons), the sample would be biased, which could increase the DR estimate.
Author’s conclusions We confirmed that AFP-L3 is a useful marker for DS. The method was not dependent on MA related risk factor. Small database therefore it needs to be confirmed by larger studies.
Reviewers’ conclusions This scoring system needs validating in another population. However these results show AFP-L3 is a possible additional 2nd T marker.

Table 13. Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Muller et al. 2002a) All labs used approved tests (software and assay kit from the same supplier). The sample consisted of all pregnant women who had a 2nd T serum screen in France over 2 yrs (1997-1998).
Outcomes The aim of paper was to compare screening performance for those under 38 to those 38 years and older using 2nd T MSS (various TT or double test methods).
Accuracy of screening methods DR% (95%CI), FPR% (95%CI)
Limitations The study did not analyse different screening methods separately.
MSS (TT, double) 73.5% (70.1-76.2), 6.85% (6.80-6.90)
Only 1.7% women in the analysis ≥ 38 yrs. Many of those ≥ 38 yrs would not have had 2nd T screening as 38 years is the cut-off for offering AC in France.
MA ≥ 38 yrs DR 9.5% (8-11), FPR 1.6% (1.59-1.65)
Fetal loss possibly not fully ascertained. No national register of DS.
Nation-wide review of medical records. France Retrospective cohort study. 15% of labs used triple test (AFP, UE3, hCG) (25% women).
85% of labs used the “double test” (AFP & total hCG, or AFP and fβhCG, or uE3 & hCG). Grade III-2
The software all combined maternal age risk with risks due to the MSS markers. Weight adjustment in 38 labs in 1997 and all labs in 1998. Risk of 1:250 used in all labs. Coincided with national decree in 1997 governing MSS including lab practices, obligations of practitioner and patient, and reimbursement of the cost of screening and karyotyping.
1997 = 378,941 (52% of the total pregnant women that year), 1998 = 475,961 (65% of total that yr). Total for two yrs = 854,902. Less twins = 851,656. 60 labs across France were sent a standard questionnaire. GA = 14-17+6/40 in 98%. In 98% GA confirmed by USS.
Spectrum of disease: 977 DS/854,902 = 0.11%
For this evidence table calculated performance of MSS (all different methods) compared to MA, by removing twins and adding together the two figures (under 38 to those 38 years and older).
Difficulties implementing any of the screening strategies None noted
Verification For live births a questionnaire was sent to maternity units as all newborns have pediatric examination at 1/12, 3/12 and 6/12 after birth. Questionnaire also sent to all 80 cytogenetics labs which perform pre and post natal karyotyping.
Would have expected 1,027 DS out of those aged < 38 yrs rather than 884 in this population. However, some centres offered 1st T NT screening first; those with abnormal NT were excluded from 2nd T screening. This will decrease the estimated DR of screening in 2nd T.
Author’s conclusions Strict rules covering DS screening are of benefit to patients, practitioners, and labs and ensure good quality control and a high DR and low AC rate.
Reviewers’ conclusions The sample is biased as many (especially over 38 yrs) will have had either 1st trimester NT screening or AC. The results may have limited utility as they are for serum screening as a whole rather than one method.
However, screening using MA alone performs poorly compared to 2nd T serum screening.

Table 13. Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Wald et al. 2003a) Compared the performance of the quad test (using total hCG, or more usually fβhCG), to TT, double test, and MA alone.
Between August 1996-September 2001 prospectively assessed the quad test offered as routine screening in 14 NHS hospitals.
Accuracy of screening methods
Limitations MA distribution of population not given but presumed to be that of routinely screened general population.
Serum markers taken 14-22/40 and analysed at one centre. All tests done in all pregnancies.
Outcomes Outcomes were the performance of the quad test (risk cut-off of 1:300) compared to MA alone ≥ 35 yrs.
UK Prospective cohort study Grade III-2
Risk determined by alpha software (multivariate Gaussian model). 46,193 pregnancies had quad screen done and MA recorded. 149 twins (no DS). Risk of 1:300 considered screen positive and offered invasive testing. In 79% GA determined by USS.
Spectrum of disease: 88 DS/46,193 = 0.19%
The DR for quad test compared to MA, the double test and the TT at a fixed 5% FPR. This was repeated with an adjustment for fetal loss (23%) after 16/40.
Verification Hospital records and cytogenetic laboratories. Expected DS at 16/40 = expected at birth with adjustment for fetal loss 16/40-term. 72 less 20 alive at birth = 52. 52/(1-0.23) = 68, plus 20 = 88.
DR% (95%CI), FPR% (95%CI)
Quad test: 1:300 81% (72-89), 7% (6.7-7.2)
MA alone (≥ 35 years) 51% (41-62), 14.3% (14.0-14.7)
DR for fixed FPR of 5%: MA alone 26% (17-35), Double test 61% (51-72), Triple test 66% (56-76), Quad test 75% (66-84)
DR for fixed FPR of 5% & fetal loss adjustment: MA alone 24%, Double test 57%, Triple test 62%, Quad test 70%
Difficulties implementing any of the screening strategies None noted
No details of other chromosomal disorders in sample but presumably left in analysis as “unaffected”. Unclear if all DS fetal losses ascertained; however, fetal loss adjustment made. Lead author is a director of Logical Medical Systems (software for antenatal screening for DS) and of Intema which holds rights to the Integrated test.
Author’s conclusions “The results from a routine screening service confirm the value of an early 2nd T MSS over screening using MA alone …and lends support to the decision of the UK government to offer serum screening to all pregnant women. Also confirms that 2nd T quad is better than double or triple and should be regarded as test of choice in this period”
Reviewers’ conclusions Small amount of detail in some sections but a large well designed study which shows quad test performs better than TT, double test, or MA alone.

Table 13. Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components (continued)
Source Country Setting Study design Evidence Grading Comparison Interventions Sample Outcomes and verification Results Comments

(Benn et al. 2003) Compared the quad test (observed and modelled) to MA, double and TT. All MSS quad test performed at the health centre from November 1999 for 32 months were included.
Accuracy of screening methods
AFP and hCG measured using commercial kits (Bayer Corp), uE3 (Diagnostic Systems), and inhibin-A (DSL).
Inclusion criteria: singletons, GA 14/40-21.9/40 at time of sample. Limitations Retrospective nature of the design means outcomes may not have been determined for all cases. However, adjustments made for fetal loss and for expected DS for the MA distribution. This highlights the problem of validating the DR in DS. The prevalence changes through the pregnancy (spontaneous fetal losses and ToPs) and difficult to decide on denominator. Adjustments for maternal weight, race, and ethnicity have been described elsewhere (see Benn et al. 1995 and Benn et al. 1997 as reference in this paper). No adjustments for smoking. Median values for all four analytes based on data from this study and lab data. 74% dated by USS, 25%- LMP, 1% clinical exam. Outcomes As well as the observed results for the quad test (1:270 risk 2nd T) DR was given for the quad test with fetal adjustment (see below) and with the expected DS as denominator (see verification section). A 2nd T risk of 1:270 used for screen positive. MA specific risk for DS at birth adjusted to 2nd T risk using a factor of 0.76 for the survival of affected and 0.985 for unaffected. Spectrum of disease: 45 DS /23749 = 0.19%. University of Connecticut Health Center, USA Retrospective cohort study Grade III-2 Some statistical modelling During the study period revised parameters used for screening programme. However, for this paper the same parameters (means, SD, correlation coefficients, and truncation limits) used for all screened and revised risks not used clinically. The modelled DR and FPR calculated by simulation using mathsoft Inc software. 23749 had quad screen. “Race” or ethnicity: 65% white, 19% Hispanic, 12% black, 5% other (mostly Asian). The median MA at EDD was 27.8 yrs and 17.1% ≥ 35 years. 12 DS live births. 12/0.76 (adjustment for fetal loss from 2nd T to term) = 16. Plus the 33 ToP= 49. 
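The fetal loss arithmetic quoted for (Wald et al. 2003a) and (Benn et al. 2003) has the same shape: a count that is expressed at term is divided by the probability of surviving to term, and a second count is added without adjustment. The sketch below only reproduces that arithmetic. It reads the Wald entry as 52/(1 - 0.23), and the function and parameter names are illustrative assumptions, not taken from either paper.

```python
def cases_at_screening(term_count, survival_to_term, unadjusted_count):
    """Scale a term-based count back to the time of screening,
    then add the cases that are counted without adjustment."""
    return term_count / survival_to_term + unadjusted_count


# Wald et al. 2003a as quoted: 72 less 20 = 52; 52/(1 - 0.23), plus 20.
wald_total = cases_at_screening(52, 1 - 0.23, 20)

# Benn et al. 2003 as quoted: 12 live births / 0.76, plus 33 ToP.
benn_total = cases_at_screening(12, 0.76, 33)

print(round(wald_total), round(benn_total))
```

The evidence table rounds the intermediate values (68 and 16) before adding. Rounding only at the end, as here, gives the same totals of 88 and 49.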
Also compared modelled DR and FPR of MA alone (>35 years), TT, and quad test for a population with the same attributes as the study population. For the quad test they adjusted for fetal loss in both screen positives and screen negative women and using the denominator of 54 (expected DS in this population). Verification Cytogenetic lab reports, genetic consultation records, USS reports, and follow-up from referring physicians. (previously validated methods). Observed DR (95% CI) , FPR (95% CI) Risk of 1:270 at 2nd T. Quad test No adjustment fetal loss 86.7% (73.2-95), 9.0% (8.7-9.4) DR Quad test with adjustment 85.8% (72.8-94) DR Quad test with expected denominator 77.7% (64.4-88) Modelled DR, FPR MA ≥ 35 years 53.3%, 17.1% Triple 79.3%, 10.4% Quad test 83.8%, 9.9% Difficulties implementing any of the screening strategies After correction for major dating errors (> 10 days) FPR of quad = 8.2% (7.9-8.6). SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING FPR different between observed and modelled results which may reflect differences in parameters for the analytes relative to published. This may be due to differences in accuracy of dating, population characteristics and assay conditions. Author’s conclusion The quad test meets or exceeds performance expectation and appears to represent an improvement over the triple test. Inhibin-A should be widely available for 2nd T screening. Reviewers’ conclusions The study is limited by the retrospective design of the study. However in this well conducted study, quad test performed well in observed screening and modelled results showed an improved performance over TT and MA. 74 Table 13. Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison Interventions (Benn et al. 
2003) These were determined for those dated by LMP and those dated by USS then combined by calculating averages weighted by the number of women at each age and method used for dating.
University of Connecticut Health Center, USA Retrospective cohort study Grade III-2 Some statistical modelling (continued)
Expected DS based on MA distribution = 54 cases. Difference between 49 detected and 54 expected not significant. The modelling was based on a population with MA distribution, method of dating, and “race”/ethnicity of those screened.

Table 13. Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Sancken and Bahner 2003) Compared the performance of TT using total hCG compared to fβhCG.
34 serum samples from pregnancies with DS and 189 controls matched for GA.
Accuracy of screening strategies:
fβhCG analysed by immunoradiometric assay (CIS). Others by a radioimmunoassay (Ortho Clinical Diagnostics). Removed two outliers. Left 33 DS and 188 controls = 221 in study.
Outcomes DR and FPR for TT (without a priori MA risk) using AFP, uE3, and either total hCG, or fβhCG.
Limitations Case control study design means the population may not be representative of a routinely screened population. Other commonly confused disorders removed (other chromosomal disorders).
Germany Case-control study Grade III-2
MoM from regression equations of the controls with gestational days as the regressing variable. Multivariate discriminant analysis was performed for the combination of AFP, uE3, and hCG and for the combination of AFP, uE3 and fβhCG.
Mean MA was 34 yrs (19-43 yrs) for cases, and 29 yrs (17-44) for controls. For both groups mean GA = 17/40 (15-22).
Spectrum of disease: 34 DS/221 = 15% No other chromosomal disorders. Presumed singletons.
DR and FPR for TT (with a priori MA risk) using AFP, uE3, and either total hCG, or fβhCG for a fixed 5% FPR and with risk cut-off of 1:380.
Verification Reference standard not discussed nor whether the cases were already screen positive using total hCG or fβhCG.
DR (95% CI), FPR (95% CI) without MA (?risk cut-off)
TT-fβhCG 79% (65-93), 24% (18-30)
TT-total hCG 76% (61-90), 22% (16-28)
DR for fixed 5% FPR
TT-fβhCG 64% (47-80)
TT-total hCG 55% (38-72)
Risk cut-off of 1:380 DR (95% CI), FPR (95% CI)
TT-fβhCG 1:380 70% (54-85), 10% (5-14)
TT-total hCG 1:380 79% (65-93), 12% (8-17)
Difficulties implementing any of the screening strategies There were 2 outliers: one DS (29.7 MoM fβhCG, 6.5 MoM hCG), and one unaffected (0.06 MoM fβhCG, 0.05 MoM hCG). Dimeric hCG is thermally unstable during transport and storage and dissociates into fβhCG subunits.
The removal of outliers. In clinical practice would still need to have a risk estimate for these individuals. Unclear if serum specimens were retrospectively analysed and if so if tests were performed blinded to outcomes.
Author’s conclusions For the observed cases none of the markers, hCG or fβhCG, was superior in DS screening. Because fβhCG is unstable, an individual’s risk may be misinterpreted, so the small increase in sensitivity may not justify using it.
Reviewers’ conclusions In this study fβhCG and total hCG performed equally well. A case control study is not the best design for diagnostic accuracy and a prospective cohort study is needed in order to determine if one marker is a better discriminator for DS.

Table 13.
Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Muller et al. 2003b) Compared five strategies for screening for TWINS: MA, MA corrected for risk of at least one DS fetus in dichorionic pregnancies, MSS (fβhCG and AFP) with observed MoM ÷ 2, using median values in twin population, and using median values specific to mono- or dichorionic twins. Between Jan 1997 and June 2000, 3292 women with TWIN pregnancies had 2nd T MSS at 14-22/40. Outcomes Outcomes were DR and FPR for screening for DS using MA alone, MA with twin correction in dichorionic pregnancies, double test (fβhCG and AFP) with observed MoM ÷ 2, double test using median values in twin population, and double test using median values specific to mono- or dichorionic twins. Accuracy of screening methods DR (95% CI) , FPR (95% CI) Limitations The setting of the study or what interventions where based on in clinical practice was unclear. MA alone 27.3% (6-61), 6.5% (5.6-7.4) Small number of DS cases (n=15) All methods had risk cut-off of 1:250. For MA this meant using cut-off of 37 yrs (1:250 risk in singletons) and then where chorionicity known correction as per Meyers et al (as referenced in this paper). For dichorionic twins this leads to an offer of invasive screen for those with dichorionic twins for women ≥ 34 years. Double: median values-total twin population 54.5% (25-84), 7.9% (6.9-8.8) No details of setting. France Prospective cohort study Grade III-2 MSS analysed using PerkinElmer kit. Risk using PerkinElmer software. As the database increased, the factor used to normalise serum marker levels was corrected. In this way the median concentration observed in twins expressed in MoM derived from singletons could be used. Retrospective analysis of serum level data. 
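The three serum-based twin strategies in (Muller et al. 2003b) differ only in how a measured marker level is normalised before risk calculation. The following is a minimal sketch, assuming MoM is the measured concentration divided by the median for unaffected pregnancies at the same gestation. All function names and numeric values are hypothetical and are not taken from the paper or from any screening software.

```python
def mom(concentration, reference_median):
    """Multiple of the median: measured level over the reference median."""
    return concentration / reference_median


def twin_mom_halved(concentration, singleton_median):
    """Strategy 1: singleton-derived MoM divided by 2."""
    return mom(concentration, singleton_median) / 2


def twin_mom_twin_median(concentration, twin_median):
    """Strategy 2: MoM against a median from the whole twin population."""
    return mom(concentration, twin_median)


def twin_mom_by_chorionicity(concentration, medians, chorionicity):
    """Strategy 3: MoM against a median specific to mono- or dichorionic twins."""
    return mom(concentration, medians[chorionicity])


# Hypothetical AFP level and medians, in arbitrary units, for illustration only.
afp = 80.0
chorionicity_medians = {"monochorionic": 100.0, "dichorionic": 80.0}

print(twin_mom_halved(afp, singleton_median=40.0))
print(twin_mom_twin_median(afp, twin_median=80.0))
print(twin_mom_by_chorionicity(afp, chorionicity_medians, "monochorionic"))
```

The third strategy needs chorionicity to be known, which the evidence table reports for only part of the sample.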
2nd T MSS marker (AFP and fβhCG) MoM calculated. This was then ÷ 2, or medians adjusted for twins, or medians adjusted for chorionicity. When risk of DS greater than 1:250 amniocentesis offered. Median GA =15/40. Mostly (95%) dates from 1st T USS. Outcomes known in 3043 = 92.4% Chorionicity in 1562 from 1st T USS and the T or Lambda sign, or from post natal placental examination. Routine screening Median MA = 30yrs (16-44). Spectrum of disease: 15 DS /6086 fetuses = 0.25% Four pregnancies with 2 fetuses affected with DS, and 7 with one fetus with DS. Verification No details. Based on MA would expect 15-16 cases of DS. MA corrected for dichorionic pregnancies 54.5% (25-84), 24.4% (23-26) Double: observed MoM ÷ 2 54.5% (25-84), 7.6% (6.6-8.5) Double: median values for mono- or dichorionic twins 54.5% (25-84), 7.6% (6.6-8.5) Difficulties implementing any of the screening strategies Authors noted that a problem with MSS for detection of DS in twins is that an abnormal result in an affected twin may be normalised by an unaffected twin’s marker levels. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING No details of verification methods. May not be full ascertainment of DS (fetal losses not accounted for). Author’s conclusions Trisomy 21 second-trimester MSS is feasible in twins, and is better than a policy based on maternal age alone. Reviewers conclusions 2nd T MSS using AFP and fβhCG performs better than MA alone with no correction but not a lot better than when MA corrected for risk of twins. This method cannot indicate which is the affected twin. 77 Table 13. Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Baviera et al. 
2004) Compared the TT (AFP, UE3, and hCG) to the TT with the addition of human placental growth hormone (hPGH), and the performance of the separate components of the TT to hPGH.
32 stored samples (-70C) from 32 singletons with DS were retrieved.
Outcomes The outcomes were the DR for 5% FPR for hPGH, AFP, uE3, hCG, TT, and TT plus hPGH.
Accuracy of screening strategies: DR, FPR
Limitations The case control design is not the ideal design to determine the performance of a screening test. The spectrum of disease will not be that of a routinely screened population. There will be no other chromosomal disorders which will decrease the FPR compared to studies which include these as unaffected in their analysis.
Italy. Case-control study Grade III-2
All maternal samples had previously had routine triple test (Immulite) from 1999-2003 at 15-20/40. hPGH measured using a commercial assay (Biocode). hPGH expressed as MoM. Gaussian distributions for cases and controls produced. Means and SD of the hPGH marker and correlation coefficients between markers determined.
These were cases diagnosed with DS by AC after positive TT (20), or AMA (≥ 35 years) with negative triple test (12). Of the 20 with positive TT, 5 were over 35 yrs. Five matched singleton controls per case (160). MA, weight, specimen storage, GA, and ethnicity not different between groups.
Verification All cases had AC. No details of how control outcomes verified.
DR for fixed FPR of 5%: hPGH 34.4% (18-51), AFP 25.8% (11-41), UE3 35.5% (19-52), hCG 41.9% (25-59), Triple (AFP, UE3, and hCG) 65.6% (49-82), AFP, UE3, hCG, and hPGH 71.9% (56-87)
Spectrum of disease: 32/192 = 17%
20/32 DS already positive by TT screening which may bias the sample. Unclear if controls had previously had positive or negative screen. hPGH was analysed on stored sera while other markers had already been analysed. Parameters for hPGH were obtained from this study, i.e. a small sample.
Author’s conclusions Although the results are less significant than those obtained with inhibin this marker could be considered in screening for DS. hPGH concentrations only weakly correlated with other triple test markers and these correlations were taken into account when determining the LR. Reviewers conclusions The design and sample size limits the ability to accurately determine the performance of screening with hPGH. It shows potential and this should be confirmed in a prospective cohort study. Parameters for the other markers from published data (Cuckle et al. 1995 as referenced in this paper). SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING 78 Table 13. Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Pandian et al. 2004) Comparison of components of the quad test i.e. AFP, uE3, hCG, and DIA (dimeric inhibin-A) were compared to serum ITA (previously called hyperglycosylated hCG, and usually measured in urine) Blood collected from pregnant women between 14-22/40. All were singletons. Outcomes Compared the DRs for a fixed 5% FPR for screening using AFP, uE3, hCG, DIA, and ITA Accuracy of screening strategies Limitations Case control design is not ideal design for accuracy of screening. Small number of DS cases (n=16). Verification All DS cases had karyotyping after AC AFP 29% (8-58) UE3 36% (13-65) HCG 75% (48-93) DIA 36% (13-65) ITA 81% (54-96%) USA Case-control study Grade III-2 Samples analysed blind to clinical information. ITA results were determined by automated immunochemiluminometric assay. Intra and interassay imprecision determined (<3.5% and 7.4%). DIA (diagnostic system laboratories) and AFP, HCG, and uE3 (immunlite 2000) were analysed on the same day. 
Estimates also modelled using Gaussian distributions and maternal age distribution for the USA in 2000.
Samples from the 16 with DS were collected at the time of counseling (Yale University). The diagnosis had been made by AC. All 16 DS had ITA and hCG done. For two samples AFP, uE3, and DIA results were not available. Samples from 84 controls collected from women undergoing 2nd T screening (University of Connecticut).
Spectrum of disease: 16 DS/100 = 16%
DR (95% CI) for a fixed FPR of 5%
The pairwise correlation coefficients were calculated between ITA and other markers. The correlation between ITA and hCG was much higher than for others (0.794 in controls). Both markers may not be needed in screening.
Difficulties implementing any of the screening strategies None noted.
No details of how outcomes determined for controls. Cases and controls at different sites, so there may have been different handling of specimens. Cases determined after AC, which may affect marker levels. Average GA for cases later than controls. hCG performance (which correlates with ITA) higher than expected. Could have been due to bias or chance. Screening history reviewed for referral bias: 8/16 DS from women ≥ 35 yrs. No evidence these were screen +ve by MSS prior to AC. Cannot exclude that ITA performance may be overestimated. Quest Diagnostics and Nichols Institute Diagnostics assisted in assaying the samples. One of the authors works for Quest Diagnostics.
Author’s conclusions The current study found 2nd T serum ITA was a good marker for DS when gestational dating was based on USS. Findings should be confirmed in a larger study.
Reviewers’ conclusions Some limitations due to study design and numbers. Need to have a large prospective cohort study.

Table 13.
Evidence table of primary research studies appraised investigating the accuracy of second trimester screening compared to components (continued)

Source, country, setting, study design, evidence grading:
(Harrison and Goldie 2006). Bristol, UK. Case-control study. Grade III-2.

Comparison screening strategies:
Comparison of screening using either double test, TT, or quad test (all with total hCG). For DS samples, total hCG and AFP analysed then stored at -40C until completion of an annual audit, then stored at -70C long-term. For controls total hCG and AFP determined then frozen (-40C) until all outcomes known. hCG and AFP not reanalysed. Testing done with commercial kits (Bayer Plc). Inhibin and uE3 analysed using Diagnostic Systems Laboratories testing kits. All serum markers corrected for maternal weight. Results expressed as MoM. MA specific risks as per Cuckle et al. 1987 (as referenced in this paper). Risk estimates calculated using NSC Prenatal Screening Decision Support QAtools software (uses multivariate risk algorithm). Marker distribution for AFP and hCG from literature (Wald et al. 1988 - as referenced in this paper). Distribution parameters for uE3 and inhibin-A derived for this study. Assay drift was assessed by running 80 control samples across an assay plate in forward then reverse sequence.

Sample:
128 case samples collected January 1993-December 2001. The control population comprised 1000 samples received consecutively for DS screening (using total hCG and AFP) between 1st January and 20th February 2003. GA 15/40-17.6/40. GA from BPD in 55.3% and remainder from reliable dates. Median MA of cases at delivery = 33 yrs (19-42 yrs). Median MA of controls at delivery = 29 yrs (16-42 yrs). Spectrum of disease: 128/1128 = 11%. Presume no other chromosome disorders. Presume singleton.

Outcomes and verification:
Outcomes: The outcomes (DR and FPR) were screening using the double test, TT, and quad test. These are presented for a risk cut-off of 1:250, DR at a fixed FPR of 3%, and FPR at a fixed DR of 75%. The 3% fixed FPR was chosen as the National Screening Committee in the UK has set a target performance for DS screening of DR 75% for less than 3% FPR by April 2007. The performance was also estimated using statistical modelling.
Verifications: 80 detected by the double test (1:250 at term) and 48 missed (either live births or miscarriages). None identified by AC alone (i.e. AMA) as these did not have serum taken.

Results:
Accuracy of screening strategies
Risk 1:250, DR % (95% CI), FPR % (95% CI):
  Double: 63 (54-71), 7.0 (5.4-8.6)
  TT: 70 (62-78), 6.6 (5.1-8.2)
  Quad: 72 (64-80), 6.7 (5.2-8.2)
DR % (95% CI) at fixed 3% FPR:
  Double: 52 (44-61)
  TT: 55 (47-64)
  Quad: 56 (48-65)
FPR % (95% CI) at fixed 75% DR:
  Double: 13.2 (11.1-15.2)
  TT: 10.3 (8.4-12.1)
  Quad: 9.9 (8.0-11.7)
Modelled (1:250), DR %, FPR %:
  Double: 64, 7.6
  TT: 67, 7.2
  Quad: 70, 6.7
Difficulties implementing any of the screening strategies: Inhibin-A was associated with considerable assay drift and marked within-batch imprecision (intra-batch % coefficient of variation = 17%).

Comments:
Limitations: Having only 55.3% confirmed by USS may have decreased the performance of the tests. Hard to compare to others where mostly or all GA confirmed by USS. For controls and cases different tests done at different times: AFP and hCG, then frozen samples used for the rest at a later date. Case blood collected some years earlier so frozen for longer. Authors say this is not a problem as stored under optimum conditions and subject to only one thaw-freeze cycle. Problems with inhibin-A assay. Authors state they have concerns about the use of this marker in its present form. (No other assay available in UK commercially.)
Authors' conclusions: The NSC target of a DR of at least 60% with FPR < 5% may be achievable using TT and USS dating in 2nd T. The results suggest that the April 2007 target of DR 75% and FPR < 3% is unachievable using current 2nd T MSS.
Reviewers' conclusions: Well designed and conducted study with clear description of methods. Needs further analysis in large prospective cohort study.

Source, country, setting, study design, evidence grading:
(Benn et al. 2001). Statistical modelling.

Comparison screening strategies:
The aim of the study was to determine the extent to which additional DS pregnancies may be identified through T18 screening, and the extent to which T18 may be screen-positive for DS (cross-identification). Randomly generated 100,000 sets of analyte values corresponding to normal, DS and T18 pregnancies using the multivariate Gaussian distribution generator in the statistical computer package, S-Plus (MathSoft). This program has been previously validated (Larsen et al. 1998 - as referenced in this paper). For both the DS and unaffected populations the means, SD and correlation coefficients between the markers were determined by Wald et al. 1994, 1996 and 1997 (as referenced in this paper). Prior to the computation of the LR from the randomly generated analyte values, truncation limits were applied to the analyte values as specified for DS calculations (Wald et al. 1996 as referenced in this paper).
There were separate simulations for pregnancies dated by LMP and those dated by USS, as well as a simulation for a population in which 60% of pregnancies were dated by USS and 40% on LMP. The risk cut-off was a 2nd T risk of 1:270. The MA distribution for the model was that of the USA population in 1998. MA specific risk of DS was based on Bray et al. 1998 (as referenced in this paper). Adjustment to correct for prevalence of DS in 2nd T was performed using a 0.8554 factor obtained by comparing AC prevalence of DS to GA specific prevalence curve (Cuckle 1999 and Benn and Egan 2000 as referenced in this paper).

Outcomes and verification:
Outcomes: For the purpose of this evidence table the data for the FPR and DR of triple test (AFP, uE3, and hCG) and quad test (addition of inhibin) was extracted from the paper.

Results:
Accuracy of screening methods
DR, FPR (risk 1:270, 2nd T):
  Triple test: 74.6%, 8.4%
  Triple test (LMP dated): 70.4%, 9.05%
  Triple test (USS dated): 77.4%, 7.96%
  Quad test: 79.3%, 7.38%
  Quad test (LMP dated): 76.4%, 8.15%
  Quad test (USS dated): 81.3%, 6.86%

Comments:
Limitations: Performance of DS screening not the aim of the study.
Authors' conclusion: Quad test appears to have a somewhat higher FPR and DR than others have noted. This may be due to the difference in MA distribution of the population analysed and other minor differences in the variables used in the model. (Other conclusions were about cross-identification, the aim of this paper.)
Reviewers' conclusions: This model predicts that quad test with USS dating of all pregnancies will perform better than TT.

Source, country, setting, study design, evidence grading:
(Talbot et al. 2003). Statistical modelling.

Comparison screening strategies:
The study compared screening with different forms of the double test (AFP and either hCG glycoforms (GlyhCG), fβhCG, or total hCG). HCG AFP and total hCG were analysed using Kryptor (Brahms) and automated immunofluorescent assays (Brahms). The serum GlyhCG measured at two dilutions in singletons using lectin immunoassay described by Abushoufa et al. 2000 (as referenced in this paper). The mean of the two (after correction for dilution) used in the analysis. Analysis for GlyhCG was blinded to outcomes. Each value converted to the MoM for the GA using published parameters or for GlyhCG from this study. Used observed parameters for GlyhCG and those for total hCG and fβhCG and AFP from Spencer et al. 2002 (as referenced in this paper). Used regression analysis to determine the relationship between marker levels and GA. Corrected for maternal weight as per Neveux et al. 1996 (as referenced in this paper). Performance of marker combinations determined using standard modelling techniques (Royston and Thompson 1992 as referenced in this paper). Monte Carlo simulation used to generate 15,000 random MoM for DS and unaffected pregnancies for each marker based on the MA distribution of England and Wales 1997-1999. From these calculated the LR, and MA related risks used to determine risk of DS at term. MA related risks from Cuckle et al. 1987 (as referenced in this paper).

Sample:
Maternal serum samples collected in 2nd T from DS and unaffected women attending Harold Wood hospital (Spencer 1999 - as referenced in this paper). Blood had been stored at 20C. From this archive retrieved 50 DS and 279 unaffected samples. All dated by CRL or BPD (after 14/40). Median MA 35 yrs DS and 30.4 yrs for controls. Model based on MA distribution of England and Wales 1997-1999. GA cases = 109 days. GA controls = 115 days. Sample storage cases = 840 days. Sample storage controls = 851.

Outcomes and verification:
Outcomes: Outcomes extracted for this evidence table are the DR for a 5% fixed FPR using GlyhCG plus AFP, compared to total hCG plus AFP, and to fβhCG plus AFP.

Results:
Accuracy of screening methods
DR for a fixed 5% FPR:
  GlyhCG and AFP: 53.1%
  Total hCG and AFP: 59.4%
  fβhCG and AFP: 66.4%
Difficulties implementing any of the screening strategies: None noted.

Comments:
Limitations: Analysis of all samples using stored specimens. Obtained population parameters for GlyhCG from these results (small number and stored samples). Parameters for other markers would have been obtained from freshly analysed samples.
Authors' conclusion: Maternal serum GlyhCG, as measured by the sialic acid-binding lectin immunoassay, is unlikely to be of additional value when screening for DS in the second trimester. Whether a more specific immunoassay could improve clinical discrimination needs further consideration.
Reviewers' conclusions: The case-control design and small sample mean there are some limitations, but GlyhCG does not appear to be any better than other forms already in use.

Source, country, setting, study design, evidence grading:
(Palomaki et al. 2004). Statistical modelling.

Comparison screening strategies:
The study compared the performance of using serum ITA (or hyperglycosylated hCG) in 2nd T multiple marker strategies with the performance of using hCG, or dimeric inhibin-A (DIA). ITA is a carbohydrate variant of hCG. Serum collected before AC and assayed for AFP, uE3, hCG then stored at -20C. Never-thawed aliquot used for this series. Samples sent (blinded) to Quest Diagnostics for ITA measurement using automated immunochemiluminometric assay (minimal cross-reactivity with hCG). All results converted to MoM and corrected for maternal weight (Neveux et al. 1996 as referenced in this paper). Adjustments to the other variables and population variables published elsewhere (as referenced in this paper). Correlation coefficients between ITA and serum markers in 2nd T DS and unaffected were derived after exclusion of outliers. Used MA distribution of the USA for 2000. Modelling method based on overlapping Gaussian distributions of DS and unaffected pregnancy markers.

Sample:
Used stored serum from an "unbiased group" of 5,345 women. Samples collected 1990-1992 during a previous observational study (Haddow et al. 1994 as referenced in this paper). 14 prenatal centres in California and elsewhere in USA recruited women at 14-21/40, about to undergo AC for reasons other than a positive screening test (mostly for AMA). None had had serum screen in this pregnancy. Aliquots from same series used for two previous studies. Three cases and 7 controls had been used up. Left 45 cases and 238 controls. Matched for time in storage, MA, GA, "race", and site where sample obtained.

Outcomes and verification:
Outcomes: The outcomes were observed and modelled DR at 5% FPR for ITA alone, either with LMP dating or BPD dating (without MA specific risk in the calculation of risk). Results are also presented for modelled DR for 5% FPR replacing hCG, or DIA, with ITA in the TT or quad test, and comparing the quad test to quad plus ITA.
Verification: Amniocentesis.

Results:
Accuracy of screening methods
Observed (modelled) DR for a fixed 5% FPR:
  ITA alone, LMP dating: 38% (45%)
  ITA alone, BPD dating: 40% (48%)
Modelled DR for a fixed 5% FPR for different methods of dating:
  ITA: LMP dating 56%, BPD dating 58%
  ITA and AFP: LMP dating 62%, BPD dating 65%
  Double test (AFP & hCG): LMP dating 65%, BPD dating 67%
  TT (AFP, uE3, & hCG): LMP dating 67%, BPD dating 72%
  ITA, AFP, & uE3: LMP dating 66%, BPD dating 71%
  ITA, AFP, uE3, & hCG: LMP dating 70%, BPD dating 76%
  Quad test: LMP dating 77%, BPD dating 79%
  ITA, AFP, uE3, & DIA: LMP dating 77%, BPD dating 79%
  ITA, AFP, uE3, DIA, & hCG: LMP dating 80%, BPD dating 83%

Comments:
Limitations: Small sample, and data from a case-control study used for the model. No details of MA but states that most of the women were having AC for AMA. Used stored samples for ITA (and possibly DIA; no details given). Some support from screening industry for the study. Partially funded by Quest Diagnostics, and Quest Diagnostics also assisted in the assaying of the samples for various markers. One author works for Quest. No further details of the sample.
Authors' conclusion: This study indicates that serum ITA is an effective marker for DS. It is highly correlated with both hCG and free βhCG and could replace either of them in a multiple marker method. A lab may find it is more economical or easier to use ITA than hCG and this should not change performance. However, need to first look at ITA screening for twins and T18 etc.
Reviewers' conclusions: Using data for the model from a case-control study with a small sample limits the ability to calculate an accurate DR and FPR. ITA does appear to perform as well as hCG or DIA, but this should be confirmed with a large prospective cohort study with an unbiased sample.
Difficulties implementing any of the screening strategies: Performance of screening worse if used LMP dating rather than BPD dating.

Chapter 5: Comparison of 1st trimester strategies, 2nd trimester strategies, integrated and sequential methods

PRIMARY RESEARCH: STUDY DESIGNS AND QUALITY

The search identified 26 eligible papers comparing the accuracy of first trimester with second trimester screening and/or comparison with screening where the results of screening in two trimesters are combined (integrated or sequential screening). Below is an overview of study designs and aspects of quality represented by these studies. Full details of the papers appraised, including methods, key results, limitations and conclusions, are provided in evidence Table 16 (pages 97-134). Studies with directly observed (or age standardised) DRs and FPRs are presented first, followed by studies where the results only include estimations of performance using statistical modelling. For each of these two groups of papers, studies are presented in chronological order of publication.

Study design, and grading

All 26 papers appraised in this chapter are based on primary research; there were no papers based on secondary research that fitted the eligibility of the review. Of the 26 papers appraised, twelve papers reported directly observed (or age standardised) comparisons of performance, and fourteen papers only reported comparisons of DR and FPR estimated by modelling. For further explanation of modelling see Chapter Three Study design and grading.

Of the papers reporting directly observed (or age standardised) comparisons, eleven were cohort studies. Sample sizes for these studies ranged from 359 to 47,053. Four of these studies were retrospective cohort studies (Gyselaers et al. 2004a; Gyselaers et al. 2004b; Michailidis et al. 2001; Schuchter et al. 2002) and one (Wald et al. 2003a) included analyses using data from a nested case-control study (sample size of 588). There was one case-control study with a sample size of 531 (Herman et al. 2002). While all these studies were graded III-2 as per NHMRC, as discussed in chapter 2, the ideal design for determining the DR and FPR of a screening strategy is a large prospective cohort (or nested case-control) study where all women have all screening strategies being compared (Deeks 2001).

Study setting and samples

Of twelve studies reporting directly observed (or age standardised) data, six were set in single centres and six were multicentre studies. All studies except one (Knight et al. 2005) were in hospital settings, and four single centre studies were set in university hospitals (Audibert et al. 2001; Dommergues et al. 2001; Herman et al. 2002; Michailidis et al. 2001).

The case-control study compared DS cases to "normal" controls, i.e. the study did not include cases of other chromosomal disorders which will also be detected by DS screening (Herman et al. 2002). The proportion of DS in this study was 4.3% (Herman et al. 2002). In five of the cohort studies the population studied was a routinely screened population with the proportion of DS in the sample ranging from 0.18% to 0.31%.

Two of the papers screened women of AMA, one by design as the sample consisted of women 38 years and over (proportion of DS = 2%) (Dommergues et al. 2001), and the other through referral bias: the mean maternal age was 37 years and the proportion of DS in the study population was 0.78% (Babbur et al. 2005). In four studies the population contained fewer older women, two by design as they excluded either women over 38 years (Audibert et al. 2001) or those over 37 years (Rozenberg et al. 2002), and two because many women over 35 years old had invasive diagnostic testing without DS screening (Gyselaers et al. 2004a; Gyselaers et al. 2004b).
The cohort studies all included other chromosomal disorders as unaffected (for some studies this is presumed from the details in the paper).

Comparison screening methods

The screening tests/methods compared in this chapter were (see glossary for further details):

1. Screening restricted to one trimester:
   - MA ≥ 35 years
   - first trimester MSS (PAPP-A and either fβhCG or hCG)
   - first trimester combined test
   - second trimester double test
   - second trimester triple test (TT)
   - second trimester quad test
   - proform of eosinophil major basic protein (ProMBP) used in the TT (second trimester) instead of uE3, and in the quad test instead of inhibin.
2. Screening using repeat measures of the same serum analytes in first and second trimesters.
3. Screening combining information from markers across two trimesters (sequential or integrated screening). The conventional integrated and sequential screening methods were:
   - serum integrated screening (1st trimester PAPP-A and second trimester quad test)
   - (fully) integrated screening (1st trimester PAPP-A and NT and second trimester quad test)
   - stepwise screening (1st trimester combined test and second trimester quad test)
   - contingent screening (1st trimester combined test and second trimester quad test).

Integrated screening (serum integrated or fully integrated screening) involves all women having first and second trimester screening tests. Results are held until completion of second trimester screening, when all results are combined into one result.

For stepwise screening all women have first trimester screening. Those with a high risk result are offered diagnostic testing. All others have second trimester screening. For these women the results of first and second trimester testing are combined to produce one integrated result.

For contingent screening all women have first trimester screening. High risk women have a diagnostic test; low risk women are reassured and have no further screening. All others have second trimester screening. For these women the results of first and second trimester testing are combined to produce one integrated result.

In addition, sometimes a screening method employs the same strategy as a conventional integrated or sequential screening method but different tests are used. An example is measuring NT in the first trimester and offering those with a high risk result a diagnostic test, and those with a low risk a 2nd trimester double test. This method uses the same strategy as the stepwise screening method and in this review it is described as combining tests in a "stepwise manner".

In some appraised studies women were offered a first trimester screening test, then women who were screen negative were offered a second trimester screen without consideration of the first trimester results. This practice of combining tests in an "independent manner" leads to erroneous risk estimates (Wald 2006c) and is not recommended.

Unless otherwise stated all screening strategies incorporate the maternal age for each individual.

Three studies used stored serum specimens for the analysis. This may bias results if there was a difference in storage times between cases and controls, or if some markers were analysed using stored samples while others were analysed using fresh samples. For one paper all markers were analysed using stored specimens and cases and controls were matched for duration of storage (Wald et al. 2003b). Another paper determined the performance of the serum integrated test using 1st trimester PAPP-A results from stored serum (Knight et al. 2005). A further paper analysed ProMBP and inhibin-A using stored specimens while the analyses of other markers (fβhCG, hCG, uE3, AFP) were determined on fresh specimens (Rode et al. 2003).

Where details were given in the papers, most studies analysed serum markers using commercial/automated kits and risk was established using commercially available software.
One study analysed PAPP-A using a manual ELISA, and ProMBP using an in-house (Statens Serum Institut) ELISA (Rode et al. 2003). All papers in this chapter used conventional methods to combine maternal age specific risks and marker levels to determine an individual's risk and therefore the DR and FPR (i.e. using multivariate Gaussian distributions).

Outcomes

Where a large number of results for different risk cut-offs, fixed FPRs or DRs have been reported (e.g. for modelling papers), a fixed FPR of 5% and/or a fixed DR of 85% have been selected for the evidence table as this is the convention for reporting DS screening test performance (reference). While studies report DR for fixed FPR, in reality the cut-off chosen for screening programmes is an individual risk (Benn and Donnenfeld 2005). For practical reasons, in some cases where large numbers of results have been reported for serum or NT measurements at different GAs, the results for timings which produced the best performance are presented in the evidence tables.

PRIMARY RESEARCH: STUDY RESULTS

Accuracy of screening methods

Maternal age alone as a screening test

Two papers compared maternal age alone as a screen to other screening methods (Michailidis et al. 2001; Schuchter et al. 2002). In one study maternal age as a screen was clearly inferior to NT, triple test and a screening method where NT and TT were combined in an independent manner (a positive screen in either NT or TT constituted a positive screen) (Schuchter et al. 2002). In the other study maternal age was inferior to NT and to screening combining NT and double test in an independent manner, but maternal age had a slightly higher DR than the double test (56.5% versus 50%) with a higher FPR (21% versus 8.7%) (Michailidis et al. 2001). It should be noted, however, that the double test performance was determined after screening with NT in the first trimester that was interventional and left only a few cases (n=4) to be detected by the double test (Michailidis et al. 2001). Overall, the literature supported the low predictive performance of maternal age alone compared with other screening strategies in the second trimester.

First trimester screening versus second trimester serum screening

One paper compared 1st trimester maternal serum screening alone (i.e. with no NT screening) to 2nd trimester maternal serum screening (Gyselaers et al. 2004a). Results indicated that TT had a better performance than 1st trimester MSS (fβhCG and PAPP-A). The paper had some limitations which may reduce the accuracy of these findings. The design was a retrospective cohort study that compared screening with two screening methods in two different populations at different times. There may have been concurrent screening with other methods (NT during the period of 1st trimester MSS) which could bias the samples.

Sixteen papers compared 1st trimester NT (or the combined test) to 2nd trimester serum screening (double test, TT, or quad test). Five of these papers (Audibert et al. 2001; Babbur et al. 2005; Michailidis et al. 2001; Rode et al. 2003; Schuchter et al. 2001) were interventional, i.e. the NT results in the first trimester were acted upon before the 2nd trimester, which biases the results (Wald et al. 2003b). Women with high NT results were offered diagnostic tests and did not have 2nd trimester serum screening. This makes a comparison between NT and the 2nd trimester MSS invalid.

There were eleven non-interventional studies (or modelling based on non-interventional studies) that compared screening with NT (or the combined test) to 2nd trimester MSS.
The results of the 1st trimester screening test were not acted upon and women proceeded to 2nd trimester screening. The results of these studies are presented in Table 14. The results consistently showed that 1st trimester screening strategies had similar performance to 2nd trimester screening, apart from the combined test which in general performed better than all 2nd trimester screening.

Table 14. Non-interventional studies comparing DRs and FPRs of 1st trimester NT (or combined test) versus 2nd trimester strategies (and integrated, sequential, or independent strategies).

| Reference | Test | DR (%) (95% CI) | FPR (%) (95% CI) |
|---|---|---|---|
| (Dommergues et al. 2001) | NT ≥ 3mm | 100 (59-100) | 3.3 (1.5-5.1) |
| | Double (1:250) | 86 (42-100) | 33 (28-38) |
| | NT and double test (independently) | 100 (59-100) | 35 (30-40) |
| (Rozenberg et al. 2002) ?some intervention | Modelled (1:250) | | |
| | NT | 53.4 | 4.6 |
| | Double | 68.8 | 7.8 |
| | NT & double (stepwise manner) | 80.6 | 5.3 |
| | Modelled 5% FPR (1:250) | | |
| | NT | 54.6 | 5 |
| | Double | 59.7 | 5 |
| | NT & double (stepwise manner) | 79.8 | 5 |
| (Wald et al. 2003b) (SURUSS) | Fixed 85% DR | | |
| | NT | 85 | 20.0 (18.6-21.4) |
| | Combined | 85 | 6.1 (5.6-6.5) |
| | Double | 85 | 13.1 (12.5-13.7) |
| | TT | 85 | 9.3 (8.8-9.8) |
| | Quad | 85 | 6.2 (5.8-6.6) |
| | Serum integrated | 85 | 2.7 (2.4-3.0) |
| | Integrated | 85 | 1.2 (1.0-1.4) |
| | Fixed 5% FPR | | |
| | NT | 60 | 5 |
| | Combined | 83 | 5 |
| | Double | 71 | 5 |
| | TT | 77 | 5 |
| | Quad | 83 | 5 |
| | Serum integrated | 90 | 5 |
| | Integrated | 93 | 5 |
| (Malone et al. 2005) (FASTER trial) | Observed | | |
| | Combined (1:150) | 77 (69-86) | 3.2 (3.0-3.4) |
| | Combined (1:300) | 82 (74-89) | 5.6 (5.4-5.9) |
| | Quad (1:300) | 85 (76-93) | 8.5 (8.2-8.8) |
| | Combined and quad (independent manner) (1st T 1:150, 2nd T 1:300) | 94 (89-99) | 11 (10.7-11.3) |
| | Age standardised FPR for 85% DR | | |
| | NT | 85 | 20 (10-26) |
| | 1st MSS | 85 | 16 (9.8-22) |
| | Combined | 85 | 3.8 (1.8-7) |
| | TT | 85 | 14 (10-21) |
| | Quad | 85 | 7.3 (4.6-16) |
| | Serum integrated | 85 | 3.6 (2.0-7.7) |
| | Integrated | 85 | 0.6 (0.4-1.6) |
| | Age standardised DR for 5% FPR | | |
| | NT | 70 (65-79) | 5 |
| | 1st MSS | 70 (64-78) | 5 |
| | Combined | 87 (82-92) | 5 |
| | TT | 69 (63-74) | 5 |
| | Quad | 81 (70-86) | 5 |
| | Serum integrated | 88 (81-92) | 5 |
| | Integrated | 96 (92-97) | 5 |
| | Stepwise (fixed 2.5% FPR for each screening component) | 95 (91-97) | 4.9 |
| (Lam et al. 2002) | Modelled 5% FPR (1:320) | | |
| | NT | 69.3 (56-76.1) | 5 |
| | Double | 73.2 (63.4-82.9) | 5 |
| | NT and double (integrated manner) | 85.7 (76.2-92.1) | 5 |
| (Cuckle 2003) | Modelled (1:250) | | |
| | NT (11-13/40) | 70.6 | 2.4 |
| | Combined | 81.6 | 2.0 |
| | TT | 65.1 | 4.7 |
| | Quad | 69.2 | 4.1 |
| | PAPP-A combined with quad (?integrated or stepwise) | 76.1 | 3.2 |
| | PAPP-A and NT combined with quad (?integrated or stepwise) | 88.1 | 1.5 |
| (Wright and Bradbury 2005) | Modelled, 85% DR | | |
| | Combined | 85 | 6.1 |
| | Quad | 85 | 6.2 |
| | Serum integrated | 85 | 2.7 |
| | Fully integrated | 85 | 1.2 |
| (Benn et al. 2005a) | Modelled, UK policies | | |
| | Combined (1:250) | 85.8 | 3.8 |
| | Double (1:250) | 77.8 | 7.8 |
| | TT (1:250) | 79.8 | 7.2 |
| | Quad (1:250) | 83.8 | 5.4 |
| | Contingent screening (overall risk 1:250) | 91.4 | 2.1 |
| | Modelled, USA policies | | |
| | Combined (1:270) | 83.6 | 5.3 |
| | TT (1:270) | 81.7 | 8.3 |
| | Quad (1:270) | 84.6 | 6.7 |
| | Contingent screening (overall risk 1:270) | 89.1 | 3.1 |
| (Cuckle et al. 2005) | Modelled for 5% FPR (using fβhCG) | | |
| | NT | 78 | 5 |
| | Combined | 87 | 5 |
| | Double | 61 | 5 |
| | TT | 65 | 5 |
| | Quad | 71 | 5 |
| | Serum integrated | 78 | 5 |
| | Fully integrated | 93 | 5 |
| (Benn and Donnenfeld 2005) | Modelled | | |
| | Combined | 83.7 | 5.1 |
| | Quad | 84.4 | 6.6 |
| | Stepwise | 90.8 | 3.1 |
| (Wald et al. 2006a) | Modelled 5% FPR (adjusted for previous pregnancy) | | |
| | Combined | 87 | 5 |
| | TT | 80 | 5 |
| | Quad | 85 | 5 |
| | Serum integrated | 89 | 5 |
| | Integrated | 95 | 5 |

1st trimester screening and/or 2nd trimester screening compared with screening strategies using first and second trimester tests

All eleven studies in Table 14 (non-interventional studies comparing screening with NT/combined test with 2nd trimester MSS) also had results for a screening strategy using first and second trimester tests (integrated, sequential or independent screening). One of the studies was in a population where women were of advanced maternal age and the proportion of DS was 2%, the sample size was only 359, and there were only 7 DS cases (Dommergues et al. 2001). In this study, NT had a lower FPR and a DR (100%) equivalent to screening in an independent manner. In another study screening combining results in an independent manner had a higher DR and a higher FPR than screening confined to one trimester (Malone et al. 2005). Most studies showed that screening in a stepwise, contingent or integrated manner has an increased performance compared to screening in either trimester alone, and that serum integrated screening has a similar performance to the 1st trimester combined test.

The five interventional papers not included in Table 14 confirmed these results. Two of these papers included results for an independent approach (Audibert et al. 2001; Schuchter et al. 2001).
In both these studies independent screening had higher DRs but the FPR was higher than screening in either trimester. Two papers gave results for screening in a stepwise manner (Audibert et al. 2001; Babbur et al. 2005). In both studies this strategy had higher DRs than 1st or 2nd trimester screening but in one paper the FPR was higher than for NT screening (4.1% versus 1.8%) (Babbur et al. 2005). In this study TT had a particularly poor performance which may have been due to the samples being taken at 14/40 rather than 16/40 (Babbur et al. 2005). Also, comparisons were difficult in this paper as results were presented for different cut-offs, fixed FPRs, or DRs. In another study the integrated test was compared to the combined test and the quad test and the performance was better (higher DR for fixed 5% FPR) than either method alone (Rode et al. 2003). In the last of these papers it was not clear how the tests were combined for the results provided (Michailidis et al. 2001). However, the screening method combining results for NT and the double test had a better DR and lower FPR than either screening method alone (Michailidis et al. 2001).

Two other papers not included in Table 14 (as they did not also include results for 1st trimester screening) compared screening in the 2nd trimester with a screening strategy using first and second trimester tests. One paper compared screening with 2nd trimester serum screening to a method combining NT and 2nd trimester MSS (Gyselaers et al. 2004b). It was not clear how the strategy combined the results (Gyselaers et al. 2004b). Screening performance decreased after the introduction of NT (Gyselaers et al. 2004b). However, there were major limitations to this study. The study was based on two different populations having two different screening strategies over different time periods (2nd trimester MSS from 1992, NT from 1999) (Gyselaers et al. 2004b).
The group having NT and 2nd trimester MSS included a very small number of DS cases (n=9). The proportion of older women in the sample was small as many of these women would have had an invasive test rather than screening (Gyselaers et al. 2004b). Women who had abnormal NT were probably removed as they would have had an invasive test and not MSS. These limitations could explain the finding (which contradicts other studies) that NT has a better performance than a sequential strategy combining NT and 2nd trimester MSS.

A prospective cohort study compared serum integrated screening to TT and quad test (Knight et al. 2005). When cut-offs were chosen so the DR was close to 70%, serum integrated screening had a lower FPR (2%) than both quad test (3%) and TT (5%).

In general, studies found that stepwise, contingent and fully integrated screening performed better (higher DR and lower FPR) than screening in a single trimester, and that screening combining tests in an independent manner had an improved DR compared to screening in either trimester but a higher FPR. Serum integrated screening had a similar performance to the 1st trimester combined test and may be useful when staff trained in NT are not available (Wald et al. 2003b).

Comparison of performance of integrated and sequential screening strategies

Ten papers were identified that compared integrated or sequential screening strategies. Seven of these papers compared conventional integrated or sequential screening strategies: fully integrated (or serum integrated), stepwise, or contingent screening. Some also presented results for screening where tests were combined in an independent manner. The results of these papers are presented in Table 15.

Five papers compared stepwise screening to fully integrated screening (Cuckle et al. 2005; Hackshaw and Wald 2001; Malone et al. 2005; Palomaki et al. 2006; Wald et al. 2003b). In most studies integrated screening performed marginally better than stepwise screening.
Where the DR was fixed, integrated screening had a lower FPR. In one paper where the FPR was fixed at 5%, integrated screening had a lower DR (93% versus 95% for stepwise), (Cuckle et al. 2005). Three papers compared the fully integrated test to a conventional contingent strategy (Benn et al. 2005a; Cuckle et al. 2005; Palomaki et al. 2006). In one paper the results (FPR) varied depending on the policy, but in general integrated screening had a slightly better DR for the same risk cut-off (Benn et al. 2005a). In another paper, at a fixed 5% FPR, integrated screening had a DR of 93% and contingent screening had a DR of 94% (Cuckle et al. 2005). In the third paper, for a fixed DR, integrated screening had a lower FPR (Palomaki et al. 2006). Two papers directly compared contingent screening to stepwise screening. In one, at a fixed 5% FPR, contingent screening had a DR of 94% and stepwise screening had a DR of 95% (Cuckle et al. 2005). In the other, at a fixed risk cut-off, stepwise screening had a DR of 90% and an FPR of 1.7% whereas contingent screening had a DR of 88% and an FPR of 1.4% (Maymon et al. 2005). In the one paper that compared a screening method combining tests (combined test and quad test) in an independent manner with sequential and integrated strategies, independent screening had a lower DR (86%) compared to stepwise (95% with 1st trimester fβhCG), contingent (94% with 1st trimester fβhCG) and fully integrated (93%) screening for a fixed FPR of 5% (Cuckle et al. 2005). Three of the appraised papers only had results comparing non-conventional integrated or sequential screening strategies (and so were not included in Table 15). One of these was a case-control study that compared screening combining NT and TT in an independent manner (disclosure) versus an integrated manner (non-disclosure), (Herman et al. 2002).
The results showed that the non-disclosure strategy had an improved FPR but a decreased DR compared to the disclosure strategy (Herman et al. 2002). These results contradict other results where the DR is improved and the FPR decreased by integrated screening compared to independent screening. As discussed earlier, a case-control design is not ideal for determining the validity of a screening method. The study also had limited details on how risk was determined, and the sample was biased by the fact that cases and controls were from different hospitals and most of the cases had already been detected through NT screening (Herman et al. 2002). Another paper aimed to compare the performance of screening (either with fβhCG and AFP, or with fβhCG, AFP, and uE3) when different gestational dating policies were used (Rahim et al. 2002). It is not clear when the different samples were taken, but the GA for sampling was given as 13-21/40. Screening with fβhCG, AFP, and uE3 had a slightly better DR (67.9% with a policy of scanning all women for dating) compared to screening with fβhCG and AFP (63.2%), in agreement with the evidence that combining first and second trimester tests improves performance of DS screening. The third paper with a comparison of unconventional integrated or sequential methods compared a three-stage contingent screening strategy with screening combining the same markers in an integrated manner (Wright et al. 2006). The three-stage screening strategy involves all women having 1st trimester fβhCG and PAPP-A, and those with a very low risk would have no further screening (Wright et al. 2006). All other women would have NT (Wright et al. 2006). Those with a high risk when the NT result is combined with the 1st trimester MSS results would have an invasive diagnostic test, those with a low risk would have no further testing, and those with an intermediate risk would have the 2nd trimester quad test (Wright et al. 2006). This strategy achieved an 89.5% DR for a 1.9% FPR at the best risk cut-off.
While the integrated method had a higher DR (92.2%) at an FPR of 2.1%, most women having the contingent strategy would only require 1st trimester screening, while for the integrated approach all women would need testing in both trimesters (Wright et al. 2006). Overall these papers suggest that fully integrated screening is marginally better than both stepwise and contingent screening (with a lower FPR for a fixed DR) and that the validity of stepwise and contingent screening is similar.

Table 15. Comparison of the validity of fully (or serum) integrated screening, stepwise screening, contingent screening, and screening combining tests (combined test and quad test) in an independent manner

Reference | Test | DR (%), (95% CI) | FPR (%), (95% CI)
(Wald et al. 2003b) (SURUSS) | Stepwise screening (1:250 2nd T) | 93 | 9.8
(Wald et al. 2003b) (SURUSS) | Integrated (risk cut-off set so DR = stepwise) | 93 | 4.5
(Malone et al. 2005) (FASTER), age standardised, 5% FPR | Serum integrated | 88 (81-92) | 5
(Malone et al. 2005) (FASTER), age standardised, 5% FPR | Fully integrated | 96 (92-97) | 5
(Malone et al. 2005) (FASTER) | Stepwise (2.5% FPR for each screening component) | 95 (91-97) | 4.9
(Hackshaw and Wald 2001), modelling: stepwise (5% FPR 2nd step), integrated (same DR) | Stepwise; integrated | 95; 95 | 5 9.2 5 4.4
(Benn et al. 2005a), modelled, UK policies | Contingent screening (overall risk 1:250) | 91.4 | 2.1
(Benn et al. 2005a), modelled, UK policies | Contingent screening (overall risk 1:100) | 88.2 | 1.1
(Benn et al. 2005a), modelled, UK policies | Integrated test, contingent manner (overall risk 1:250) | 91.9 | 2.2
(Benn et al. 2005a), modelled, UK policies | Integrated test, contingent manner (overall risk 1:100) | 88 | 1.0
(Benn et al. 2005a), modelled, UK policies | Integrated test, non-disclosure (overall risk 1:250) | 92.1 | 2.2
(Benn et al. 2005a), modelled, UK policies | Integrated test, non-disclosure (overall risk 1:100) | 88.3 | 1.0
(Benn et al. 2005a), modelled, USA policies | Contingent screening (overall risk 1:270) | 89.1 | 3.1
(Benn et al. 2005a), modelled, USA policies | Contingent screening (overall risk 1:130) | 86.2 | 1.9
(Benn et al. 2005a), modelled, USA policies | Integrated test, contingent manner (overall risk 1:270) | 90.5 | 3.4
(Benn et al. 2005a), modelled, USA policies | Integrated test, contingent manner (overall risk 1:130) | 86.8 | 1.8
(Benn et al. 2005a), modelled, USA policies | Integrated test, non-disclosure (overall risk 1:270) | 90.7 | 3.4
(Benn et al. 2005a), modelled, USA policies | Integrated test, non-disclosure (overall risk 1:130) | 87.0 | 1.8
(Cuckle et al. 2005), modelled, 5% FPR | Serum integrated | 78 | 5
(Cuckle et al. 2005), modelled, 5% FPR | Integrated | 93 | 5
(Cuckle et al. 2005), modelled, 5% FPR | Stepwise, fβhCG | 95 | 5
(Cuckle et al. 2005), modelled, 5% FPR | Independent manner, fβhCG | 86 | 5
(Cuckle et al. 2005), modelled, 5% FPR | Contingent, fβhCG | 94 | 5
(Maymon et al. 2005), modelled | Stepwise | 90 | 1.7
(Maymon et al. 2005), modelled | Contingent | 88 | 1.4
(Palomaki et al. 2006), modelled DR for 5% FPR, integrated set so same DR | Stepwise (1st T 1:168, 2nd T 1:165) | 89.9 | 5
(Palomaki et al. 2006) | versus Integrated (1:275) | 89.9 | 3.1
(Palomaki et al. 2006) | Contingent (lower risk 1:3250, 1st T 1:168, 2nd T 1:170) | 89.8 | 5
(Palomaki et al. 2006) | versus Integrated | 89.8 | 3

Other comparisons

Two papers looked at the performance of screening when repeat measures of the same analytes were taken in both trimesters. The first paper found that repeat measures of some markers (e.g. PAPP-A and uE3 or total hCG) with and without NT performed better than combined screening, serum integrated and fully integrated screening (lower FPR for a fixed 85% DR), (Wright and Bradbury 2005).
The authors of the second paper felt that the methods of the above study yielded “implausible risk estimates” when using repeat measures in several highly correlated markers (Wald et al. 2006b). To overcome this issue they used ratios of the marker levels in the 1st and 2nd trimesters, which they called cross trimester (CT) ratios (Wald et al. 2006b). The modelled estimates of the DR for a 5% FPR were higher for the integrated test when using CT ratios compared to the conventional integrated test (using hCG: 97.4% versus 94.1%), (Wald et al. 2006b). The authors commented that while it seems inconsistent that discrimination is improved if sampling of highly correlated analytes is repeated across two trimesters, this is achieved because what is being measured is not two markers, but one marker and the change in its concentration throughout a pregnancy (Wald et al. 2006b). This paper also had results for the conventional integrated test using fβhCG compared to total hCG. For a fixed 5% FPR the DR of the integrated test using fβhCG (94.2%) was similar to that using hCG (94.1%), and for a fixed DR of 85% the FPR of the integrated test using fβhCG (0.88%) was slightly lower than for hCG (0.95%), (Wald et al. 2006b). No CIs were provided and these differences are too small to allow a conclusion on which is the more valid method. One paper aimed to compare different combinations of 1st and 2nd trimester markers including a new marker, the proform of eosinophil major basic protein (proMBP) (Rode et al. 2003). A combination of AFP, hCG and proMBP had a higher DR (79%) for a fixed 5% FPR than the conventional TT (AFP, hCG and uE3; 61%), and a combination of AFP, hCG, uE3 and proMBP had a higher DR (83%) for a fixed 5% FPR than the conventional quad test (69%). The tests using proMBP all had a higher DR than the combined test (76%), (Rode et al. 2003).
As previously discussed, the data were from a study where NT was used clinically in the 1st trimester and nearly all women included in the analysis had a negative NT screen, which will bias the results in favour of 2nd trimester screening. The conventional integrated test had a DR of 86% compared to a method combining NT, 1st trimester fβhCG, 2nd trimester AFP and proMBP which had a DR of 90%. While these results suggest that proMBP may be an important new marker for DS, a larger prospective study using fresh samples is needed to confirm these results.

Difficulties implementing any of the screening strategies

Six papers noted problems with NT measurement. In one paper screening with NT was not achieved in some women because either the CRL corresponded to a gestational age outside the screening range, or because NT was not able to be successfully measured (Audibert et al. 2001). In another paper transvaginal USS (rather than transabdominal USS) had to be performed for 14% of NT measurements (Michailidis et al. 2001). A number of papers recorded the proportion of women in whom NT screening was unsuccessful: NT screening failed or the measurements were suboptimal in 7% (Malone et al. 2005); NT could not be measured successfully in 0.22% (Lam et al. 2002); 98.6% of women had a successful NT measurement, with a mean scan duration of 12 min (range 5-26 min), (Rozenberg et al. 2002); and in one study no NT was obtained within 20 min in 9% of pregnancies (Wald et al. 2003b). The failure rate was highest before 10/40 and after 14/40, and lowest at 12/40 (Wald et al. 2003b). This last study also noted that the failure rate decreased significantly over the study period, and that the make and model of the USS machine influenced the ability to get an image, as did the experience of the ultrasonographer (Wald et al. 2003b). In this study, using only NT images that were judged satisfactory improved screening performance (Wald et al. 2003b).
Four papers noted problems with serum screening; most commonly these were problems with defaulting from 2nd trimester maternal serum screening (Audibert et al. 2001; Babbur et al. 2005; Knight et al. 2005; Lam et al. 2002). In one study, of the 11,159 women who agreed to participate, only about 80% submitted both samples in the right gestational range (some had 1st trimester serum screening too early or late, and others had no 2nd trimester serum screening due to fetal loss, declining to have screening, changing residence/provider, having amniocentesis or ToP between trimesters, or having 2nd trimester serum screening too late), (Knight et al. 2005). This study also had problems because it used a lower gestational age cut-off of 8/40 for 1st trimester screening. As two in three women did not have dates confirmed at this time and PAPP-A had not been validated for use before 8/40, these women had to have 2nd trimester serum screening (Knight et al. 2005). Two papers noted problems with combining results of screening in two trimesters. One noted issues with matching samples from the two trimesters, which required extra work for laboratory staff (Knight et al. 2005). Another paper demonstrated through modelling how assuming tests were independent when they were not could decrease the performance of stepwise screening: risks will be underestimated in 27% of DS cases and overestimated in 20% of unaffected cases (Benn and Donnenfeld 2005). In 10% of DS cases the risk will be underestimated more than two-fold, and for 10% of unaffected cases the risk will be overestimated more than two-fold (Benn and Donnenfeld 2005). Two papers commented on the importance of adjusting for certain factors in order to correctly determine risk. One study showed that not adjusting for maternal weight meant a small increase in FPR, and that not using USS for dating also increased FPR (Wald et al. 2003b).
Another study demonstrated that women who had false positive results in previous pregnancies had high FP rates in subsequent pregnancies, and that these rates could be greatly reduced by adjustment for markers in previous pregnancies (this adjustment slightly reduced DRs), (Wald et al. 2006a).

Summary of results

There was a high level of consistency supporting:
1. Low performance of maternal age alone as a screening test compared to other screening strategies.
2. NT and 1st trimester MSS having similar performance (FPR and DR) to 2nd trimester serum screening.
3. Combined test being a more valid screening test than all screening strategies in the second trimester (and the first trimester; see Chapter 3).
4. Serum integrated screening having a similar performance to combined screening.
5. Screening combining tests in an independent manner increasing DR compared to screening in either trimester alone, but having a higher FPR. This strategy is not recommended (Cuckle et al. 2005).
6. Screening performance being improved if integrated or sequential methods (fully integrated, stepwise, and contingent) are used rather than screening confined to one trimester.
7. Fully integrated screening having a better performance than both stepwise and contingent screening. Fully integrated screening has a lower FPR for a fixed DR.
8. The validity of stepwise and contingent screening being similar.

There appears to be some support for improved performance using strategies which repeat measures of the same analyte in the 1st and 2nd trimester. Another marker, proMBP, showed some promise but further research is needed. There were some difficulties noted with the screening tests. Mostly these were issues with NT: either NT measurement was not successful in all women or it took more time than expected. There were also a number of papers that noted the problem of women defaulting from 2nd trimester maternal serum screening.
Adjusting for maternal weight, dating using USS, and adjusting for false positive results in previous pregnancies were noted to improve screening performance.

Conclusion

Quality control of NT screening will be a major consideration if it is used in a population based screening programme. The problem of women defaulting from 2nd trimester maternal serum screening has implications for screening methods which depend on women having screening in both trimesters (especially for integrated screening, which only involves PAPP-A and NT in the 1st trimester, rather than the combined test used by contingent and stepwise screening). Combined screening is better than any other 1st trimester or 2nd trimester screening strategy. Screening which combines tests in an independent manner may increase DR compared to screening confined to either the 1st or 2nd trimester, but it will increase FPRs. Fully integrated screening is marginally better than both stepwise and contingent screening, with a lower FPR for a fixed DR, and the validity of stepwise and contingent screening is similar. However, most of this evidence is based on papers using statistical modelling, as it is not possible to both disclose and withhold results for the same individuals. These models make assumptions, such as the number of women who will return for 2nd trimester screening, which may not be accurate in the real world. Some modelling papers use the same primary data, similar modelling techniques, and similar assumptions, which may mean the same or similar results are repeated. Confirmation of the performance of these strategies is required in large prospective cohort studies. Overall, the evidence indicates that integrated or sequential screening has superior screening performance for DS compared with screening confined to either the 1st or 2nd trimester, and any choice of screening should be selected from these strategies (fully integrated, stepwise, or contingent screening).
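As an illustration only, the decision flow of the three-stage contingent strategy described above for Wright et al. (2006) can be sketched as two triage rules. The cut-off values, the function names, and the `Action` labels below are hypothetical and are not taken from that paper; risks are assumed to be expressed as probabilities (a risk of 1:N is written as 1/N).

```python
from enum import Enum


class Action(Enum):
    STOP = "no further screening"
    NT_SCAN = "offer NT measurement"
    QUAD_TEST = "offer 2nd trimester quad test"
    DIAGNOSTIC = "offer invasive diagnostic test"


# Hypothetical cut-offs for illustration only (1:N written as 1/N).
VERY_LOW = 1 / 5000  # assumed: below this, stop after 1st trimester serum
LOW = 1 / 1500       # assumed: below this after NT, stop
HIGH = 1 / 50        # assumed: at or above this after NT, offer diagnosis


def after_serum(serum_risk: float) -> Action:
    """Stage 1: every woman has 1st trimester fβhCG and PAPP-A."""
    return Action.STOP if serum_risk < VERY_LOW else Action.NT_SCAN


def after_nt(risk_with_nt: float) -> Action:
    """Stage 2: risk from NT combined with the 1st trimester serum result."""
    if risk_with_nt >= HIGH:
        return Action.DIAGNOSTIC
    if risk_with_nt < LOW:
        return Action.STOP
    # Intermediate risk: stage 3 is the 2nd trimester quad test.
    return Action.QUAD_TEST
```

Under these assumed cut-offs, only women whose risk after NT falls between `LOW` and `HIGH` proceed to the 2nd trimester, which is why a contingent strategy leaves most women needing 1st trimester screening only.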
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(Audibert et al. 2001) Compared 1st T NT to screening with 2nd T double test (total hCG), or a strategy which combined the 2 screening methods (1. having either test positive, or 2. combined risk > 1:250). May 1994 - December 1997 DS screening (1st T NT USS and 2nd T double test) routinely offered to women who booked at the hospital. Outcomes The outcomes extracted for this table are the DR and FPR for DS screening for NT and double test at different cut-offs (NT ≥ 3mm, NT risk ≥ 1:250, MSS risk ≥ 1:250) and for combinations of the two screening strategies: either by a combined risk ≥ 1:250, or where NT ≥ 3mm or MSS risk ≥ 1:250. Accuracy of screening methods DR (95%CI), FPR (95%CI) Limitations Women > 38 yrs were excluded; this is therefore a low risk population. NT ≥ 3mm 58% (30-86), 1.8% (1.4-2.3) NT (1:250) 67% (35-90), 4.3% (3.7-4.9) Double (1:250) 60% (26-88), 3.3% (2.7-3.8) NT ≥ 3mm or MSS risk ≥ 1:250 (Independent manner) 90% (55-100), 4.8% (4.1-5.5) Combined risk ≥ 1:250 (stepwise manner) 90% (55-100), 2.6% (2.1-3.1) This was an interventional study. Those positive by 1st T NT sometimes had IT and ToP before 2nd T. This will decrease the DR of 2nd T MSS compared to 1st T NT. Difficult to compare results with a non-disclosure strategy. University Hospital (Hôpital Antoine Béclère) France Prospective cohort study Grade III-2 NT measured using standardised methods by specially trained staff. Cystic hygroma considered increased NT. TA USS where possible or TV USS. NT at 12-13/40. 5245 consecutive women scanned.
Exclusions (1080): twins (161), CRL outside the range 38-84mm (303), NT not measured or recorded (219), and MA over 38 yrs (397) as these women were routinely offered AC. 35 lost to follow-up. AFP and total hCG were measured (Ortho Clinical Diagnostics) between 14-17+6/40. Cut-off risk of 1:250 (Prenata software, Ortho Clinical Diagnostics). NT derived LR for each patient was calculated using parameters from Nicolaides et al. 1998 (as referenced in this paper) and NT distribution in this population. To combine NT and MSS risks, the MA-related risk was multiplied by the MSS LR and the NT LR. Karyotyping suggested when NT ≥ 3mm, MSS positive or abnormalities on 2nd T USS. 4130 analysed. Other chromosomal disorders included as unaffected. Mean age 30.1 yrs (16-37). 3,790 had MSS (of whom 65 had positive NT). Verification 7.6% had prenatal karyotyping. Most women who had abnormal NT also had 2nd T MSS (before AC). Not possible when women had CVS and requested ToP before 14/40. Fetal postmortem following every ToP, spontaneous abortion, or intrauterine demise. 340 did not have MSS as declined test or had fetal loss or ToP. All newborns examined by paeds and karyotyping in those suspected of chromosome abnormalities. Spectrum of disease: 12 DS/4130 = 0.29% Difficulties implementing any of the screening strategies Successful screening was not achieved because CRL outside range (n=303) or because NT was not measured (n=219). 340 did not have MSS as they declined the test or had fetal loss or ToP before 2nd trimester. 20 cases of chromosomal disorders other than DS. All those with an ongoing pregnancy also had USS at 20-24 weeks. A better way to combine 2 tests is with an algorithm based on MA, NT (related to CRL), and MSS; none was available at time of study. The method of combining risk as was done in this study (MA and MSS-derived risk X NT-derived LR) assumes NT and serum are independent.
This study found no correlation between tests but some with large NT not included. Needs further studies. Authors’ conclusions The results suggest that NT screening compares favorably with MSS (double test) although there was overlap in CIs which does not allow a firm conclusion. NT also has the advantage of confirming viability, dating, detecting multiple pregnancies and structural abnormalities. Reviewers’ conclusions The study had excellent follow-up of pregnancy outcomes. The intervention after 1st T NT meant the sample is biased. The results of this study suggest that combining the results of 1st T NT and 2nd T double test is better than either method on its own.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(Schuchter et al. 2002) Compared screening with 1st T NT to TT (AFP, uE3, and hCG) and a combination of the two (where positive results if screen positive by either method). NT done in 9,789 pregnancies and 9,642 also had TT over 5 year period (January 1994-December 1998). Outcomes The performance of screening using MA (≥ 35yrs), NT, TT or combined screening (either NT and/or TT positive) is presented. Accuracy of screening methods DR (95%CI), FPR (95%CI) Limitations Retrospective design; may mean less ascertainment of DS cases. MA 47.4% (24.9-70), 10.6% (10.0-11.2) NT 57.9% (35.7-80.1), 2.3% (2.0-2.6) TT 91.7% (62-100), 5.2% (4.8-5.7) Combined NT and TT, independent manner 94.7% (74-100), 7.0% (6.5-7.5) Intervention study; some who were screen positive in 1st T had IT and ToP and therefore did not have 2nd T MSS. Sample will be biased. Danube Hospital Vienna, Austria Retrospective cohort study Grade III-2 The policy at the hospital was to offer women NT (& CRL) at 11-13/40.
NT by TA USS or TV if not feasible, using standardised methods and trained staff. Rarely needed to reschedule a visit for repeat measure. GA using USS to correct discrepancies between LMP and CRL. NT measured and for those with NT ≥ 3.5mm CVS performed. All others asked to return for TT at 16/40. TT using Ortho Clinical Diagnostics kits. Risk factors using alpha software (logical Medical Systems) which takes MA into account. Only the 9,342 women delivered at this hospital used in the analysis as outcome not followed-up if delivered elsewhere. 9315 of these also had 2nd T MSS. Median age at delivery was 28 yrs (range 15-46 years), and 10.7% ≥ 35 years. Median CRL 48mm (range 35-65mm). Spectrum of disease: 19 DS/9342 = 0.20% 12 with other chromosomal abnormalities included as unaffected. Verification Newborns all examined by paeds. If suspected chromosomal abnormality karyotyping was done. Difficulties implementing any of the screening strategies No objective problems noted. Rarely needed to reschedule a visit for repeat measure (doesn’t say how many). AC offered to those in whom NT 2.5-3.4 mm (had been told at the time of NT), TT risk ≥ 1:250 and all women ≥ 35 years. Small number of DS (n=19). Author’s conclusion The data suggest that the combination of NT measurement in 1st T and the TT in the 2nd T is associated with a very high DR of DS at a relatively low screen positive rate (SPR). Reviewers’ conclusions The results show screening combining NT and 2nd T TT has a better DR but higher FPR than TT or NT alone. However this is an interventional study: some with high NT in 1st T did not have 2nd T TT.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(Michailidis et al.
2001) Compared screening using NT to screening with double test and to screening combining NT and double test. Some DS also detected by USS soft markers. Women routinely screened in a single hospital setting. From data base identified 9548 sequential women who had 1st T USS after January 1995 with EDD before 1st January 2000. Outcomes Outcomes were the DR and FPR for MA ≥ 35yrs, NT, double test, or screening combining NT and MSS. Accuracy of screening methods DR (95%CI), FPR (95%CI) Limitations Retrospective design. Maternity unit of a university hospital. London, UK Retrospective cohort study Grade III-2 1st T USS for NT, viability, CRL, and number of fetuses and chorionicity. GA corrected using CRL. NT measured using standardised methods. When NT could not be measured by TA USS, TV scan performed. If NT above 99th centile (or structural abnormalities) offered IT. Women were offered the option of 2nd T double test (AFP, free β hCG). AFP and free β hCG were measured by radio-immunoassay and immunoradiometric assays. The risk for DS determined by screening software (Screenlab, QC technology) using LR method. When risk at term >1:250 offered invasive screen. Retrospectively applied a formula (Cuckle and Sehmi, 1999 as referenced in this paper) and LR of Nicolaides et al. 1998 (as referenced in this paper) in order to calculate combined risk based on 1st T and 2nd T screening and MA. Excluded 309 with fetal demise, and those presenting outside GA range of 10-14/40 and those without NT measurement. Left 8536 eligible. 7447 had outcome data. Mean MA = 30.1 year (range-13-50 years). 21.3% ≥ 35 years. The ethnicity was 75.6% Caucasian, 11% African and West Indian, 10.7% Asian, and 2.7% Oriental. Mean GA at NT was 12+5/40. Spectrum of disease: 23 DS/7447 = 0.31% 23 other chromosomal disorders. 4864 also had 2nd T MSS (double test). MA distribution not sig diff between two groups. (21.97% > 35 yrs.) DS 4/4864 =0.08%. 
Unclear how calculation done for combining NT and double test. Unclear the denominator for this calculation; CI therefore not calculated for FPR. Also used Cuckle and Sehmi mathematical formula to interpret double test results when already had 1st T screen. Verification Invasive testing offered to those with positive NT screen, USS showing structural abnormality, positive double test, MA >37 yrs and requesting invasive testing, Fmhx of chromosomal abnormalities, or maternal anxiety. Overall 8.5% had invasive testing. MA ≥ 35 years 56.5% (34.5-76.8), 21% (20.1-21.2) NT fixed 5% FPR 82.6% (61.2-95.1) Double test (1:250) 50% (6.8-93.2), 8.7% (7.9-9.5) Combining NT and double (?stepwise) 90.5% (69.6-98.8), 4.2% Using correction for having 1st T screening, double DR increased and FPR decreased (75%, 5%). Difficulties implementing any of the screening strategies TV USS had to be performed for 14% of NT measurements. Pregnancy outcome from hospital maternity database, labour ward records, or from patients (outcome request letter). Interventional study; results of 1st T NT were acted on leaving only 4 cases of DS in those who had double test. Difficult to compare the two methods. Raw data was not documented and text confusing; difficult to check how figures calculated. According to Hackshaw and Wald (Hackshaw and Wald 2001) should only use the Cuckle and Sehmi formula if all who had 1st T then have 2nd T screening regardless of the result. Author’s conclusion 1st T NT measurement is an effective screening test for the prenatal detection of fetuses with DS. Although the MSS in the 2nd T can detect additional DS cases this may be outweighed by the delay in diagnosis, the extra visits and cost so that the right time for MSS is most likely to be in the 1st T. Reviewers’ conclusions Some biases may affect the accuracy of the DR and FPR (especially for 2nd T MSS) and the ability to compare with non-interventional studies.
Difficult to understand how the figures were calculated for double test and combining NT and double test. Results indicate that 1st T NT performs better than double test but that combining the two increases performance.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(Dommergues et al. 2001) Compared screening by NT, 2nd T double test, and combining results of these two screening methods (screen positive where either test was positive). Also compared 2nd T sonography, but these data were not extracted for this review. AMA-38 years and over. Outcomes The outcomes were the DR and FPR for NT ≥ 3mm, double test (1:250) and a screen where either NT or double test positive constituted a positive screen. Accuracy of screening methods DR (95% CI), FPR (95% CI) Limitations Small number of DS cases (n=7). University Hospital (Hôpital Antoine Béclère) France Prospective cohort study The study was non-interventional. Women had MSS even where NT was ≥ 3mm indicating need for AC and in 3 women who had CVS. Grade III-2 NT obtained at 10-14/40 using standardised methods (as per Nicolaides et al.) by staff trained in NT measurement. AFP and hCG analysed at 15-17/40 and risk calculated using Prenata software, Ortho Clinical Diagnostics. If NT ≥ 3mm or MSS >1:250 or if structural anomalies on 2nd USS recommended karyotyping. The sample consisted of 359 consecutive women aged 38-47 seeking antenatal care before 14/40 in a maternity hospital between 1994 and 1997 who consented to have NT and 2nd T MSS. Follow-up in all 359 patients. Disease spectrum = 7/359 = 2% 2 cases of other chromosomal disorders-included in analysis as unaffected Verification 227 had karyotyping.
Follow-up included clinical examination of all neonates, cytogenetic records when available, and pathological examination of all cases of fetal loss or ToP. NT ≥ 3mm 100% (59-100), 3.3% (1.5-5.1) Double test (1:250) 86% (42-100), 33% (28-38) Combining NT and double test (independent manner) 100% (59-100), 35% (30-40) Difficulties implementing any of the screening strategies No objective difficulties noted. 2 cases of fetal loss ≤ 22 weeks considered normal on the basis of having no significant structural anomaly at post-mortem. AMA of the sample limits the generalisability of the estimates of performance. Few details of how risk calculated for double test (where parameters obtained from). Author’s conclusions “AC may be offered on a selective rather than routine basis in women over 38yrs, based upon the results of noninvasive screening tests” Reviewers’ conclusions The study has a small sample size and is limited to women over 38 yrs. However the results indicate that NT has a higher DR and lower FPR compared to double test. A screening strategy where the screen was positive if either NT or double test was positive had the same DR as NT and a higher DR than double test, but a higher FPR than NT or double test alone.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(Rozenberg et al. 2002) Comparing NT to double test (fβhCG, AFP) and to screening where either NT or double test positive = a positive screen. Also used modelling to compare NT and double test to screening combining NT and double test results (risk 1:250). Six centers participated in the study which was conducted between March 1994 and December 1997.
All women having prenatal care at the hospitals and requesting DS screening were invited to participate. Outcomes The observed DR and FPR for NT ≥ 3mm, double test (1:250), and screening combining the two (considered a positive test if either NT or double test positive). Accuracy of screening methods DR (95%CI), FPR (95%CI) Limitations While it was intended that all would have 2nd T MSS some had only NT (possibly because of a positive NT screen). Also some only had MSS. Multicentre study: 2 tertiary referral centres and 4 primary referral centres. Prospective cohort study Grade III-2 NT carried out 12-14/40. According to the paper TV measured unless position meant had to measure by TA USS. Confirmed GA with CRL. NT measured as per Nicolaides. MSS at 14+1/40-17/40. Uniform handling and storage of specimens. fβhCG and AFP analysed using ELISAs and commercial kits (CIS Bio). Marker levels expressed as the gestational specific MoM and risks estimated from MA and MoM using commercial software (CIS Bio) that uses normal medians from a French population at 14-17/40. AC advised when either NT ≥ 3mm (procedure delayed until after MSS), or when risk based on MSS was ≥ 1:250 at term. Retrospective analysis was done on data collected in the study. NT and MSS marker expressed as MoM using own data set and risks calculated based on published parameters. Eligible if 18-37 years, no Fmhx DS, singleton, and intention to deliver at participating hospital. 9444 screened over period. 326 outcome not known and excluded from analysis (including 113 who miscarried before 24/40 without karyotyping). 9118 in analysis, of whom: 5506 had both USS and 2nd T MSS, 821 had only USS, and 2791 had only MSS. The results were also modelled for a population with a mean maternal age of 28 yrs and a coefficient of variation of 16%. Compared NT to double test and to combined risk of NT and MSS ≥ 1:250. Verification Amniocentesis when NT more than 3mm or risk of MSS more than 1:250. About 8.6% invasive screen.
6327 had USS but only 6234 had successful measurement of NT. 8297 had MSS. Median MA 30.5 yrs. Spectrum of disease: 21 DS/9118 = 0.23%. No details of how pregnancy outcomes verified or if fetal losses were karyotyped. NT ≥ 3mm 62% (41-83), 2.8% (2.4-3.2) Double (1:250) 55% (33-77), 5.7% (5.2-6.2) Combined screening (independent manner) 81% (58-95), 8.4% (7.9-9.0) Modelled DR, FPR (1:250) NT 53.4%, 4.6% Double 68.8%, 7.8% NT & double (1:250), stepwise manner 80.6%, 5.3% DR for 5% FPR (1:250) NT (1:250) 54.6% Double (1:250) 59.7% NT & double (1:250), stepwise manner 79.8% Difficulties implementing any of the screening strategies 98.6% of women had successful NT measurement. Mean duration of scan 12 mins (5-26 mins). Those over 37 yrs old excluded: low risk population selected. Authors felt that because the intervention was NT ≥ 3mm this may have meant that when NT < 3mm did not try as hard to be accurate; therefore results not so good when converted to MoM. Author’s conclusion The study results suggest a 25% increase in the DR of DS using a combination of NT measurement at 12-14/40 and MSS at 14-17/40 for a 5% false-positive rate. However this delays risk assessment and invasive testing by some weeks. Reviewers’ conclusions Some intervention. However, the results show that double test was better than NT and that a combined screening where risk from both NT and double test combined was ≥ 1:250 performed the best.

Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies

(Rozenberg et al. 2002) Adjusted for maternal weight.
Log Gaussian modelling was carried out using standard methods (Royston and Thompson, 1992 as referenced in this paper) to estimate the DR and FPR using the authors’ own parameters applied to a population with a mean maternal age of 28 years and a coefficient of variation of 16%. Multicentre study: 2 tertiary referral centres and 4 primary referral centres. Prospective cohort study Sample Outcomes and verification Results Grade III-2 Continued Comments

Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Herman et al. 2002) Compared disclosure and non-disclosure approaches of combining 1st T NT and TT. 508 consecutive “normal” pregnancies (previously described method of data collection: Herman et al. 2000 as referenced in this paper). Data collected prospectively. Outcomes The DR and FPR of a strategy combining NT and TT results in an independent manner and DR and FPR of a strategy combining results of these two tests in an integrated or non-disclosure manner. Applied an age “adjustment risk technique”. Accuracy of screening methods DR (95%CI), FPR (95%CI) Limitations Case control study limits the spectrum of disease and is not the ideal design to determine the accuracy of a screening test. No cases of other chromosomal disorders. Proportion of DS high: 4.3%. No details of MA. Medical centre, University Hospital. Israel NT done at 10-14/40 according to Nicolaides et al. Risk based on FMF software. Case-control study Grade III-2 2nd T MSS at 16-19 weeks. LR for each test used to calculate MA adjusted FPR for non-disclosure and disclosure methods. MA a priori risk from literature.
Estimated the FPR and DR by a methodology of population adjusted calculations previously described in the literature (Pandya et al. 1995 as referenced in this paper), and using the MA distribution of Israel in 2000. Disclosure method: either USS or triple test risk ≥ 1:250. Tests performed sequentially. 23 DS cases collected from various sources: 8 from local 1st T NT screening, 11 had NT elsewhere and were referred to centre for ToP, 2 from neighbouring hospitals, 2 identified as DS after birth. Blood taken before amniocentesis. Only those with good NT images included. Verification No details other than 21/23 DS detected before birth, mainly by increased NT. NT and TT, disclosure method (independent manner) 88.7%, 9.5% NT and TT, non-disclosure (integrated manner) 75.3%, 2.4% Difficulties implementing any of the screening strategies No objective difficulties noted. All had 1st T NT and 2nd T MSS. Method of identifying control pregnancy outcomes was unclear. Total 531 Spectrum of disease: 23/531 = 4.3% Possible bias as cases came from different hospitals: tests may not have been done identically to those for controls, and tests were done at different times. Most DS (91%) had been identified before birth, mostly from NT screening. No details of the testing kits. Exact method of determining DR and FPR unclear (used ‘a methodology of population adjusted calculations’). Author’s conclusion The disclosure approach resulted in considerably higher DR. The non-disclosure approach however had a lower FPR. No other details of population. Reviewers’ conclusions Biased sample. The result showed the non-disclosure strategy had a decreased DR but improved FPR compared to the disclosure strategy. These results are contradictory to other comparisons where DR is usually improved by integrated screening. Non-disclosure if integrated figure ≥ 1:250. Risk derived from LR NT × LR TT.

Table 16.
Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Wald et al. 2003b) The aim of the study was to determine the most effective, safe and cost-effective method of screening for DS using NT, MSS, and urine markers in the 1st and 2nd T. Urine screening is outside the scope of this review and therefore methods and results are not reported here. 25 maternity units in the UK and one in Austria that were screening for DS in 2nd T and that agreed to collect observational data in 1st T. Most started recruiting in September 1996 and ended April 2000. Outcomes The outcomes extracted for this evidence table are the FPR for a fixed 85% DR for fully integrated, serum integrated, combined test, quad test, TT, double test, and NT. The results are for 1st T NT and serum at 10/40 except for NT screening results (NT at 12-13/40). Also extracted results for fixed FPR of 5%. Accuracy of screening methods Limitations High median inhibin compared to other studies so had to use published data. Serum urine ultrasound scan (SURUSS) study UK. Prospective cohort study and nested case-control study. Grade III-2 Had USS at booking visit (CRL, viability, and if possible at least 3 NT measurements). Quality control of images, and all ultrasonographers trained in NT measurement. If NT ≥ 3mm flagged to ensure 2nd T serum screen done. Allowed up to 20 mins to get image. NT measurement standardised across centres and ultrasonographers. NT MoM based on ultrasonographer specific medians and upper truncation limit of 2.5 MoM used. Two sets of serum samples (from booking and from 2nd T) were stored at -40°C. DS samples and those of matched controls retrieved and analysed. If serum collected before 14/40 = 1st T.
AFP, fβhCG, total hCG, uE3 and PAPP-A measured using fluoroimmunoassay (Perkin Elmer) and Inhibin-A using ELISA (Oxford Bioinnovation and Diagnostic Systems). Analysis was blinded to outcomes. 47053 women with singletons presenting at 8-14/40 were recruited. Data set used for performance of screening was 43,712 singletons recruited at 10-13/40. 39983 had NT, 40387 had 1st T serum, 37362 had 2nd T serum. Median MA was 29 years; 16% were ≥ 35 years. Spectrum of disease: 101 DS/47053 = 0.21% (or 0.22% if only include the 96% where outcomes documented). For biochemical analysis 1090 controls and 101 DS were retrieved. 3 DS had no serum or urine screen available, so serum analysis was based on 98 DS samples. Each DS was matched with 5 singleton controls (n=490) according to centre, CRL (or BPD), MA, and duration of storage of sample. When estimated using MA, the MA specific risk of DS was for a standard population of England and Wales 1996-1998. The confidence intervals were calculated using Monte Carlo simulation. Results for a number of other marker combinations, different GA, and for different fixed FPR, and fixed DR are presented in the report. FPR (95% CI) for 85% DR Integrated 1.2% (1.0-1.4) Serum integrated 2.7% (2.4-3.0) Combined 6.1% (5.6-6.5) Quad 6.2% (5.8-6.6) TT 9.3% (8.8-9.8) Double 13.1% (12.5-13.7) NT 20.0% (18.6-21.4) DR for 5% FPR Integrated 93% Serum integrated 90% Combined 83% Quad 83% TT 77% Double 71% NT 60% Not everyone had all tests done. Analysis on stored samples. One of the authors belongs to the institution which holds the patent for the use of uE3 as DS marker. Same author is a director of a company producing a commercial software package for DS screening interpretation (USS and serum markers). Also director of company holding a patent application for integrated test.
Author’s conclusion Overall the integrated tests have the best performance; if NT is not available should use serum integrated, for women attending only in 2nd T quad is best, for those only wanting 1st T screening combined is best. NT, double and TT without combination with other screening methods are not worthwhile.

Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies

(Wald et al. 2003b) After observation that urine did not add to the performance an additional 600 controls that had 1st T and 2nd T samples taken were selected and samples analysed. This was in order to estimate screening performance where GA based on dates compared to screening based on USS (CRL). Together with 490 matched controls this meant over 1,000 samples could be used for distribution parameters for both methods of dating. Serum urine ultrasound scan (SURUSS) study UK. Prospective cohort study and nested case-control study. Grade III-2 Continued Despite 1st T screening being observational, 5 DS pregnancies had ToP before 14/40 weeks. About 3 of these would have continued to 2nd T without miscarriage. Therefore two samples were selected at random to remove from analysis to avoid overestimation of median marker levels in DS pregnancies (intervention bias). There were 26 live births of which 10 were screen negatives. This means that 13 would have been expected in 2nd T (10 × 1/0.77), and there would have been about another 3 screen negatives who would have miscarried. Need to include these as there would have also been screen positives who had ToP, but would have miscarried if left, or will overestimate the DR. Therefore 3 of these pregnancies sampled and levels added to those already included.
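The adjustment for screen-negative pregnancies described above can be checked with a short calculation. The sketch below is illustrative only: the 0.77 survival proportion and the 10 screen-negative live births come from the text, while the function and variable names are hypothetical and not taken from the SURUSS report.

```python
# Proportion of DS pregnancies alive in the 2nd T that survive to term,
# as quoted in the evidence table (1/0.77 correction).
SURVIVAL_TO_TERM = 0.77


def expected_second_trimester_cases(live_births, survival=SURVIVAL_TO_TERM):
    """Back-project live births to the number of pregnancies expected in 2nd T."""
    return live_births / survival


screen_negative_live_births = 10
expected = expected_second_trimester_cases(screen_negative_live_births)

# Screen negatives that would have miscarried before term and so were never ascertained.
additional = round(expected) - screen_negative_live_births

print(round(expected), additional)  # 13 3
```

The result matches the figures in the table: about 13 screen-negative DS pregnancies expected in the 2nd T, hence about 3 added to the analysis.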
Sample Outcomes and verification Results Comments Verification Verification of outcomes in 6 ways: Staff at local hospitals filled out a SURUSS outcomes form at delivery. Cytogenetic lab info linked to SURUSS records. National DS register linked to SURUSS records. Info from local obstetric outcome records. Form sent to women with request to return details of outcome. Individual searches of those outcomes not otherwise obtained. DR, FPR Stepwise screening (1:250) 93%, 9.8% Integrated (risk cut-off set so DR = stepwise) 93%, 4.5% Reviewers’ conclusions Large well designed study with unbiased sample (adjustments for intervention). Excellent verification of outcomes. Results show similar results for 1st T and 2nd T but integrating markers was better. Integrated had better performance than stepwise screening. Follow-up continued until 31st May 2001. Completeness of ascertainment. Would expect 81 live births if no intervention (55 of 71 ToP would have gone to term = 77%, plus 26 live births = 81). From MA distribution would expect 87. Difficulties implementing any of the screening strategies Overall 9% of pregnancies had no NT obtained within 20 mins (worst before 10/40 weeks and after 14 weeks). Best at 12/40. The failure rate decreased significantly over the study period. DS with unacceptable images were close to the MoM for the unaffected pregnancies. The make and model of the USS machine influenced the ability to get an image, as did the experience of the ultrasonographer. Using only NT images that were judged satisfactory improved screening performance.

Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies

(Wald et al.
2003b) The number of DS pregnancies used in the analysis was 101 minus 2 ToP plus 3 as above = 102 (adjustment for intervention bias). Serum urine ultrasound scan (SURUSS) study UK. Prospective cohort study and nested case-control study. For each marker the MoM was calculated for each GA (using CRL or BPD to estimate CRL). The medians were calculated using unaffected pregnancy serum markers adjusted for maternal weight. There was no correction for ethnicity. Sample Outcomes and verification Results Not adjusting for maternal weight meant a small increase in FPR. Not using USS for dating increased FPR. Grade III-2 Continued To estimate risk the parameters (SD, means and correlation coefficients) were derived from the study population. Used a published MoM for inhibin-A as it appeared unexpectedly high in the study population. The risk of having a DS pregnancy at 17/40 was calculated using the maternal age specific risk of a live birth (corrected by multiplication of 1/0.77 to allow for DS fetal loss from mid term to term) and LR for the markers obtained by the overlapping of the Gaussian distributions of DS and unaffected pregnancies. No planned intervention in 1st T. Antenatal diagnostic tests were based on routine 2nd T results (double, TT, or quad). Comments

Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Gyselaers et al. 2004b) Compares screening with 2nd T MSS to screening combining the results of NT and 2nd T MSS. Since 1992 2nd T MSS samples from all geographic areas in Flanders have been analysed by General Medical Laboratory in Antwerp (AML). Outcomes Performance (DR for fixed FPR) of screening with 2nd T MSS (?quad test) compared to a strategy combining MSS and NT.
Compared screening accuracy for those who had 2nd T MSS but did not have NT (36,382) to those who had NT and 2nd T MSS (3274), i.e. totally separate populations. Unclear how the two tests were combined (presume stepwise manner). Accuracy of screening methods Limitations Comparing two populations having two different screening strategies over different time periods (2nd T MSS from 1992, NT from 1999). The group having NT and 2nd T MSS had very small number of DS (n=9). Lab samples from all geographic areas. Flanders, Belgium Retrospective cohort study. Grade III-2 2nd T serum: AFP (Diagnostic Products Corporation), fβhCG (BioSource), uE3 (Diagnostic Systems Laboratories), PAPP-A (BioSource). The cut-off for a positive screen was 1:300. A total of 2700 NT measurements sent to FMF for audit and used to calculate MoM. The total number of MSS samples (1st and 2nd T) analysed by AML over the 10 years of the study (1992-2002) was 78,365. 51.7% had complete follow-up = 40,490. Since 1999 AML have also registered details of NT and CRL. Those who had 2nd T MSS but did not have NT = 36,382. Those who had NT and 2nd T MSS = 3274. Those who had 1st T MSS and NT = 834 (no further analysis). Spectrum of disease: 108 DS/39656 = 0.27% Chromosomal disorders other than DS considered unaffected. 5.5% study population ≥ 35 years compared with 8.9% of the general population (p<0.001). Verification Outcomes of pregnancies and newborns from obstetricians after birth. DR (95%CI), FPR (95%CI) 2nd T MSS 69.7% (61-79), 5.5% (5.3-5.7) NT and 2nd T MSS (?stepwise manner) 55.6% (21-86), 7.6% (6.7-8.5) DR for 5% FPR 2nd T MSS 66.7% (57-76) NT and 2nd T MSS 44.4% (14-79) Difficulties implementing any of the screening strategies None noted. Clinicians mailed every year for collection of missing data and contacted in person if did not respond. Compared prevalence in screened population to prevalence in Belgium. No other details of who got AC, presumably all who were screen positive.
No details if karyotyping used in case of fetal loss. Across the two groups the proportion of women over 35 yrs was less than in the general population. Some over 35 yrs would have had IT without screening. No details of why some women had NT from 1999. Possibly some women had only NT and not MSS over this period as high NT so referred for IT. Only those with normal NT would have 2nd T MSS. This will decrease the DR of screening in the group having NT and 2nd T MSS. Unclear how pregnancies dated or whether staff trained in NT. Unclear what the 2nd T MSS was. Author’s conclusion In Flanders the uptake of 2nd T MSS in women over 35 yrs is low. The performance of screening decreased after the introduction of NT. Reviewers’ conclusions Poor design (comparing two separate non-randomised populations) and biased samples. These limitations could explain the finding (which contradicts other studies) that NT is better than a strategy combining NT and quad test.

Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Gyselaers et al. 2004a) Compared screening with 1st T MSS (fβhCG) to screening with the TT (total hCG). Since 1992 2nd T MSS samples from all geographic areas in Flanders have been analysed by General Medical Laboratory in Antwerp (AML). Outcomes The outcomes extracted for this evidence table are the DR and FPR for 1st T MSS (1:85) compared to the DR and FPR of 2nd T MSS, the TT (total hCG) with a risk cut-off of 1:300. Accuracy of screening methods Limitations The study populations are completely separate i.e. different tests done on different populations. Also screened at different time periods and details of other concurrent screening (e.g.
NT) or invasive testing (women over 35 yrs) in these time periods which could also bias sample. Lab samples from all geographic areas. Flanders, Belgium Retrospective cohort study Grade III-2 Immunoradiometric assay used for AFP (Diagnostic Products Corporation) and fβhCG (BioSource), and uE3 assayed with radioimmunoassay (Diagnostic Systems Laboratories). PAPP-A measured by ELISA (BioSource). The cut-off for a positive screen was 1:300 for 2nd T MSS and 1:85 for 1st T MSS. 2nd T MSS was obtained between 1992 and 1998, which was before the introduction of 1st T screening for DS, and 1st T MSS were taken 1999-2003. A total of 40,419 2nd T MSS samples performed over study period. Spectrum of disease in this group = 60 DS/40419 = 0.15% A total of 7079 1st T MSS samples performed over study period. Spectrum of disease in this group = 13 DS/7079 = 0.18% 5.1% of those having 2nd T MSS were ≥ 35 years, and 8.6% of those having 1st T MSS were ≥ 35 years. Chromosomal disorders other than DS presumably considered unaffected. Verification Outcomes of pregnancies and newborns from obstetricians after birth. DR (95%CI), FPR (95%CI) 1st T MSS (1:85) 61.5% (52-86), 5% (4.5-5.5) 2nd T MSS (1:300) 73.3% (62.1-84.5), 5.6% (5.4-5.8) Difficulties implementing any of the screening strategies No objective difficulties noted. Clinicians mailed every year for collection of missing data and contacted in person if did not respond. Compared prevalence in screened population to prevalence in Belgium. Those with risks as described offered invasive screening. In the two time periods studied, the prevalence of DS in Belgium was 0.14% and 0.17% which correlated well. Few details about the study population (GA at screening) and how risks calculated (parameters used in Gaussian distributions). No details of lost to follow-up (presume kept in the analysis as unaffected).
No adjustment for viability bias: 1st T screening detects cases that may not be viable in 2nd T. Small number of DS cases in 1st T group (n=13). Author’s conclusion The performance of both 1st T MSS and 2nd T MSS at MA ≥ 35 yrs (these results not extracted) in Flanders is excellent, even without the combination with NT or integration of 1st and 2nd T screening. The simplicity of methods makes them good options for aneuploidy screening at AMA, until high quality combined or integrated screening is accessible to all women in Belgium. Reviewers’ conclusions Poor design for screening performance and biased samples. Results indicate TT had a better DR than, and a similar FPR to, 1st T MSS (fβhCG and PAPP-A).

Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Babbur et al. 2005) Compared screening with NT to a screening method combining LR of NT and the LR of TT. Sample consisted of 3,188 women with singleton pregnancies requesting screening for DS (August 2001-March 2004) who had a combined NT and triple test. Outcomes For this evidence table the performance of NT was compared to the performance of combining TT with NT. This was calculated firstly with only those who had both tests (2725) then for the overall performance, including all those who were screened with either NT or MSS (in a stepwise manner). Accuracy of screening methods DR (95%CI), FPR (95%CI) NT ≥ 3 mm 64% (38.8-78.9), 1.8% (1.3-2.3) Combining NT plus TT stepwise manner (1:250) 67% (27.9-92.5), 2.7% (2.1-3.3) Overall screening performance stepwise manner (1:250) 88% (68.8-97.5), 4.1% (3.4-4.8) TT FPR for fixed 88% DR 20% TT DR for fixed 4.8% FPR 60% Limitations High-risk group: median MA = 37 yrs so DR likely to be higher than in a routinely screened population.
Women’s hospital Cambridge, UK. Prospective cohort study Grade III-2 NT measured according to FMF criteria when CRL 45-84 mm. 3 images obtained and largest NT used in analysis. Offered invasive testing if NT > 3mm (which was the current policy in the department). If less than 3mm or declined CVS offered the combination of the NT results and TT at 14 weeks (had screened at 14/40 not 16/40 since 2000). For those having both 1st T and 2nd T screen the risk was calculated by multiplying the MSS risk by the likelihood ratio for the reported NT (as per Cuckle and Sehmi, 1999 as referenced in this paper). Offered invasive testing if combined risk of DS at term of >1:250. All TT markers assayed at same lab with commercial kits (PerkinElmer, Wallac Oy). Three markers used in combination with MA to determine the risk of DS. All results expressed as MoM for medians of unaffected pregnancies of same GA based on local population. Adjusted for maternal weight. Median MA was 37 years (19-46 yrs). Attending for NT offered as a self-paying service at a maternity hospital and an outreach clinic, and those attending for NHS funded NT for previous fetal abnormality. Excluded those who had non-viable pregnancy at USS. 3188 total. 2725 had MSS after NT (85% of 3188). 463 did not have MSS. Most of those with normal NT scan proceeded to 2nd T triple. Mid trimester (12-16/40) prevalence for DS was 7.8/1000. The FPRs given in the paper are the SPRs (as per raw data in the paper). For this reason CI not calculated. Verification Cytogenetic data (invasive testing) and neonatal tests for all women who delivered within East Anglia. Authors were not notified of any missed diagnoses in women delivering outside the area (did not have access to these records). Difficulties implementing any of the screening strategies 12.4% did not attend for triple test. Unclear if fetal losses were karyotyped. 14/40 for TT instead of usual 16/40.
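The combination step described for this study (multiplying the MSS risk by the NT likelihood ratio and comparing the result with 1:250) can be sketched as follows. This is a minimal illustration, not the method of Cuckle and Sehmi: it assumes that a risk quoted as 1:n means a probability of 1/n and that the likelihood ratio is applied on the odds scale. The function names and the example values (a prior risk of 1:400, likelihood ratios of 2.0 and 0.5) are hypothetical.

```python
def combine_risk(prior_one_in_n, likelihood_ratio):
    """Update a '1 in n' risk with a likelihood ratio; returns the new n.

    Assumption: '1 in n' is a probability of 1/n, i.e. odds of 1:(n - 1).
    """
    prior_odds = 1.0 / (prior_one_in_n - 1.0)
    posterior_odds = prior_odds * likelihood_ratio
    return 1.0 / posterior_odds + 1.0


def is_screen_positive(one_in_n, cutoff=250.0):
    """Screen positive when the risk is greater than 1:cutoff (smaller n)."""
    return one_in_n < cutoff


# Hypothetical woman with an MSS risk of 1:400.
raised_nt = combine_risk(400, 2.0)    # NT likelihood ratio of 2.0 -> about 1:200.5
normal_nt = combine_risk(400, 0.5)    # NT likelihood ratio of 0.5 -> about 1:799

print(is_screen_positive(raised_nt), is_screen_positive(normal_nt))  # True False
```

In this sketch the same serum result falls either side of the 1:250 cut-off depending on the NT likelihood ratio, which is the behaviour the stepwise combination relies on.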
Interventional so hard to compare screening with NT (or the combination) to studies where NT (or combination) was not interventional. No details of how TT risk determined or risk cut-offs. Author’s conclusion In a high-risk group the combination of NT with TT offers DR at least as good as either test while allowing disclosure of an abnormal NT at scan and reducing the FPR. Importantly FPR is less than 5%, which is lower than TT alone. Reviewers’ conclusions High risk group but for this population screening combining NT and TT performed better than NT alone.

Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments

(Knight et al. 2005) Compared serum integrated screening to quad and TT. Women in Maine receiving prenatal care from one of 229 primary care prenatal practitioners were recruited for the study between August 2001 and August 2003. Recruited women presenting 8-13/40. Told about delay in interpretation until 2nd T. Outcomes Outcomes given are the DR and FPR at different cut-offs for TT, quad and serum integrated screening. Also included the serum integrated FPR adjusted after USS dating. Accuracy of screening methods DR (95%CI), FPR (95%CI) Limitations Not all received all tests and these were removed from analysis; may have biased sample if they were more or less likely to have DS. Those removed included those with fetal loss. However did use the more conservative figure of the expected DS for denominator (DR). Primary care Maine, USA Prospective cohort study Grade III-2 First serum sample taken then stored at -20°C. Second sample taken 15/40. Designed a lab form to facilitate sample matching. Modified the lab’s database to accommodate the additional sample and added a matching algorithm for linking info from both samples.
2nd T MSS evaluated by the algorithm to see if 1st T MSS had been taken. Staff usually verified all matches. Occasionally had to contact primary care to obtain further information. Sent reminder faxes that 2nd T MSS was due. In 2nd T combined the results of all five tests with MA associated risk to produce a single DS risk using a published algorithm and parameters (Haddow et al. 1998; Knight et al. 1998; Wald et al. 1999 as referenced in this paper). 11159 agreed to participate. Less 950 who had 1st T too early or too late, less 1436 where no 2nd T MSS (575 fetal loss, 459 declined, 236 elected AC, 133 changed residence/provider, 29 ToP, 4 second sample too late). Left 8773. Spectrum of disease: 16/8773 = 0.18% Mean MA 27.8 (SD 5.5) and MA ≥ 35 years 11.3%. GA at 1st T screening = 10/40, and 2nd T screening 16.9/40. Maternal ethnicity: non Hispanic Caucasian = 98%. The DR has been calculated using the expected DS cases (17.8) rather than the observed 16 cases. Paper also presents “expected” results where selected 2nd T risk cut-offs to yield DR of about 70%; unclear how these calculated (?statistical modelling). Verification Outcome information from Bureau of Vital Records and 2 diagnostic labs responsible for most karyotyping in the region (AC, 1st yr of life and products of conception). Triple test (1:270) 67% (43-84), 6.4% (5.9-6.9) Triple (1:190) 62% (39-81), 4.6% (4.2-5.1) Quad (1:150) 56% (33-76), 3.3% (2.9-3.7) Serum integrated (1:100) 79% (55-92), 3.2% (2.8-3.6) When FPR was revised using only unaffected pregnancies that remained screen positive after confirming dates with USS the FPR decreased from 3.2% to 3.0%. Expected Triple (1:190) 69%, 5% Quad (1:150) 69%, 3% Serum integrated (1:100) 73%, 2% Difficulties implementing any of the screening strategies Those with 2nd T risk ≥ 1:100 considered screen positive and managed accordingly.
Calculated DS risk for each woman using either quad or TT, correcting the total DS cases for “trimester of ascertainment bias” due to some affected pregnancies with negative screen having fetal loss and not being diagnosed. PAPP-A analysed on stored serum. No details of commercial kits or software used for analysis of data. There was some support from the screening industry as PAPP-A reagents were provided by Diagnostic Systems Laboratories. Author’s conclusion “Integrated serum screening for DS was successfully implemented in primary care settings; screening performance was consistent with predictions”. It is an accessible and acceptable alternative to screening methods that require NT measurement.

Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies

(Knight et al. 2005) To do this calculated the expected number of cases on the basis of MA distribution in the screened population and published age specific risks of DS (Hecht and Hook 1996, as referenced in this paper) and used this as the denominator for the DR (17.8 cases rather than the 16 ascertained in the study). Primary care Maine, USA Prospective cohort study Grade III-2 Continued Sample Outcomes and verification Results Comments Also contacted genetic counselors to confirm known cases and identify missed cases. Primary care also contacted occasionally. Samples from 2 trimesters were correctly matched in 99.9% of women (98% by the computer and 139 with input of lab staff, requiring 15-30 mins per day). In 6 cases no match could be made (multiple providers, name changes, and sometimes multiple 1st T samples). The matching was the most demanding aspect of implementation. The process was successful but required extra work.
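The effect of using the expected rather than the ascertained number of DS cases as the denominator can be illustrated with a short calculation. The sketch below is an assumption-laden example: the denominators 16 and 17.8 are from the text, but the detected count of 14, the toy maternal age distribution, and the age specific risks are hypothetical values chosen for illustration, not data from Knight et al. or Hecht and Hook.

```python
def expected_cases(age_counts, age_risk):
    """Sum of (women at each age) x (age specific DS risk at that age)."""
    return sum(n * age_risk[age] for age, n in age_counts.items())


def detection_rate(detected, denominator):
    """Detected cases as a proportion of the chosen denominator."""
    return detected / denominator


# Toy age distribution and risks (hypothetical, for illustration only).
toy_counts = {25: 1000, 35: 500}
toy_risk = {25: 1 / 1000, 35: 1 / 250}
toy_expected = expected_cases(toy_counts, toy_risk)  # 1 + 2 = 3 expected cases

# Hypothetical detected count of 14, with the two denominators from the text.
dr_observed = detection_rate(14, 16)     # 0.875
dr_expected = detection_rate(14, 17.8)   # about 0.787

print(f"{dr_observed:.1%} vs {dr_expected:.1%}")  # 87.5% vs 78.7%
```

The larger, expected denominator gives the lower and more conservative DR, which is why the evidence table describes it as the more conservative figure.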
Reviewers’ conclusions Some issues with the implementation of the integrated approach: having to match specimens taken in both trimesters. However, serum integrated performed better than quad and TT, having a higher DR and slightly lower FPR. Those with positive screen advised to have invasive screen. Of 11,159 who agreed to participate 78.6% submitted both samples in the right gestational range. 950 had 1st T MSS too early or too late, 1436 had no 2nd T MSS (575 fetal loss, 459 declined, 236 elected AC (mostly as over 35 years), 133 changed residence/provider, 29 ToP, 4 second sample too late). Also, using 8 weeks as the lower cut-off for having a 1st T MSS was a problem as most (2/3) didn’t have dates confirmed at this time. PAPP-A had not been validated before 8/40 so had to just use 2nd T screen. Later more emphasis was placed on dating but still had 9% of samples taken too early. Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Malone et al. 2005) Comparing stepwise screening to serum integrated and fully integrated screening. Also evaluated the performance of the following screening methods: NT, 1st T MSS (PAPP-A and fβhCG), combined screening, quad test, and a screening method combining tests (combined and quad) in an independent manner. MA taken into account in all the risk calculations. Recruited from 15 centres in the USA. Outcomes The outcomes extracted for this evidence table are the directly observed DR and FPR for combined screening (1:150, and 1:300), quad test (1:300) and screening combining tests in an independent manner (1st T 1:50 and 2nd T 1:300). These are given with and without the inclusion of cystic hygroma.
Accuracy of screening methods Limitations Women with cystic hygroma were removed from the analyses so the results apply to populations without cystic hygroma. (Including cystic hygroma increased the DR.) 1st T risks determined by NT, PAPP-A, free β hCG, and MA at 10+3/40-13+6/40. Returned at 15-18/40 for 2nd T screening. 2nd T risk calculated from AFP, uE3, total hCG, and inhibin A and MA. The 1st T results were not released until 2nd T to allow an unbiased comparison of the two approaches. Patients with cystic hygroma followed separately (included for some comparisons). 134 had cystic hygromas (with 25 DS). Age standardised performance is also presented based on the MA specific risk of DS at term, corrected to early mid trimester to allow for fetal loss from this time until birth, and applied to the 1999 US distribution of MA. These results are given for NT, 1st T MSS, combined test, serum integrated, fully integrated, TT, and quad with all 1st T screening at 11/40. Age standardised FPR (95% CI) for 85% DR: NT 20% (10-26); 1st T MSS 16% (9.8-22); Combined 3.8% (1.8-7); Triple 14% (10-21); Quad 7.3% (4.6-16); Serum integrated 3.6% (2.0-7.7); Fully integrated 0.6% (0.4-1.6). First- and second-trimester evaluation of risk (FASTER) trial. USA Prospective cohort study Grade III-2 NT measurement performed according to standard methods by trained staff. A minimum of 20 minutes reserved for NT measurement, and could return for further attempt. TV USS measurement used if required. Quality control conducted. MSS markers converted into MoM for GA (CRL). Adjusted for maternal weight and ethnicity. NT MoM centre specific and distribution of NT measurements based on all NT measurements in the population including those with cystic hygroma. Inclusion criteria: MA ≥ 16 years, singleton live fetus, CRL 36-79 (10+3/40-13+6/40) at recruitment. 42,367 approached. 4,178 ineligible or refused. Exclusions: Prior measurement of NT, or anencephaly (n=22). 38,033 enrolled.
Maternal characteristics: mean (± SD) MA at EDD was 30.1 ± 5.8 yrs, ethnicity 66.9% “white”, 22.6% Hispanic, 5.3% Black, 4.1% Asian and 1% other. GA at 1st T screening: 44.8% were 12+0/40-12+6/40, 22.6% were 11+0/40-11+6/40, 29.1% were 13+0/40-13+6/40, and 3.5% were 10+3/40-10+6/40. Outcomes were available for 36,873. Spectrum of disease: 92/36,873 = 0.25%. Removed 7% of NT measurements as failed or inadequate. Complete 1st T data available in 36,120. Complete data for 1st T and 2nd T available in 33,546 (i.e. all tests done). Directly observed DR (95% CI), FPR (95% CI): Combined (1:150) 77% (69-86), 3.2% (3-3.4); Combined (1:300) 82% (74-89), 5.6% (5.4-5.9); Quad (1:300) 85% (76-93), 8.5% (8.2-8.8); independent screening 94% (89-99), 11% (10.7-11.3). The differences between screening were less apparent when FPR fixed at 5% rather than 1% because the DR of all screening is relatively high. Although the study was not directly supported by the screening industry, some of the authors hold patents for screening tests and directorships/ownership of companies involved in screening. Author’s conclusion Where there is appropriate quality control for NT, combined screening is a powerful tool for detection of DS. Stepwise screening and fully integrated screening are both associated with high DR and acceptable FPR; the advantage of earlier diagnosis with stepwise screening must be weighed against the lower FPR with integrated screening. Consideration of the costs associated with different strategies and patient preferences will help guide the choice between these approaches. Reviewers’ conclusions Well designed and high quality paper. Direct comparison of 1st and 2nd trimester screening is possible. The results showed 1st T screening for DS is highly effective and that combined screening at 11/40 is better than quad. Table 16.
Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies (Malone et al. 2005) Risk estimated by multiplying the MA specific risk (Morris et al. 2002, as referenced in the paper) by the LR obtained from the Gaussian distributions of affected and unaffected pregnancies (as per Wald and Hackshaw 2000, as referenced in this paper). Distribution using published parameters (Wald et al. 1999 and Wald et al. 2000 as referenced in this paper). First- and second-trimester evaluation of risk (FASTER) trial. USA Prospective cohort study Grade III-2 Continued After all screening complete women given two risks; with cut-off of 1:150 risk at term for 1st T and 1:300 risk at term for 2nd T screening. Offered counselling and invasive test if either was positive. 2nd T screening (standard of care) cut-off chosen so that the estimated rate of positive screens would be similar to the current screening practice (5%), and the 1st T cut-off so the overall rate of screen positives would not be excessive. 95% CI for screening estimates were determined by “bootstrapping” with 1000 DS dataset replications. Sample Outcomes and verification Results Comments Verification Medical records were reviewed by one paediatric geneticist in situations where DS suspected, or in those with positive screen but no karyotyping, and in a random sample of all others. Age standardised DR (95% CI) for 5% FPR: NT 70% (65-79); 1st T MSS 70% (64-78); Combined 87% (82-92); Triple 69% (63-74); Quad 81% (70-86); Serum integrated 88% (81-92); Fully integrated 96% (92-97). Combinations of measurements of markers from both 1st T and 2nd T (fully integrated and stepwise) yield higher DR and lower FPR than markers from a single trimester. Serum integrated screening was similar to 1st T combined and may be useful when staff trained in NT are not available.
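The risk calculation summarised for Malone et al. (2005), an MA specific risk multiplied by a Gaussian LR, can be sketched as follows. This is a single-marker illustration under stated assumptions: the marker parameters are hypothetical log MoM values, not those published by Wald et al., and the multiplication is applied on the odds scale, which is the usual convention but is not stated in the evidence table.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def likelihood_ratio(log_mom, affected, unaffected):
    """LR = density in affected / density in unaffected pregnancies.

    `affected` and `unaffected` are (mean, sd) tuples on the log MoM scale.
    """
    return gaussian_pdf(log_mom, *affected) / gaussian_pdf(log_mom, *unaffected)


def posterior_risk(prior_risk, lr):
    """Convert the MA specific (prior) risk to odds, multiply by LR, convert back."""
    prior_odds = prior_risk / (1 - prior_risk)
    post_odds = prior_odds * lr
    return post_odds / (1 + post_odds)


# Hypothetical marker distributions (mean, sd) on the log10 MoM scale.
AFFECTED = (0.30, 0.20)
UNAFFECTED = (0.00, 0.20)

lr = likelihood_ratio(0.30, AFFECTED, UNAFFECTED)
risk = posterior_risk(1 / 900, lr)  # hypothetical prior risk of 1 in 900
print(f"LR = {lr:.2f}, risk = 1 in {1 / risk:.0f}")
```

With several markers the same structure applies, with the univariate densities replaced by multivariate Gaussian densities that include the marker correlations.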
DS status ascertained by AC, fetal cord sample in those with positive screen but declined AC, and/or tissue sampling in spontaneous fetal loss, ToP or stillbirth. Completeness of ascertainment estimated by the expected number of DS at 2nd T from MA distribution and recent age specific birth prevalence. 112 would be expected; 117 were found (25 in the cystic hygroma group plus 92). The differences between combined screening and either NT or 1st T MSS are significant. Stepwise setting a 2.5% FPR for each screening component: DR 95% (91-97), FPR 4.9%. Difficulties implementing any of the screening strategies 7% failed or suboptimal NT measurements. To compare different screening methods the differences between pairs of tests were determined for each data set replication and CI determined for this measure. Stepwise had a similar DR to integrated but with higher FPR. Independent screening had a high FPR and should not be used. Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued) Source Country Setting Study design Evidence Grading Comparison Interventions (Hackshaw and Wald 2001) Compared integrated screening and stepwise screening to illustrate the effect of reporting a risk in women who have already had 1st T screening without taking into account that the distributions are based on populations where everyone had both 1st T and 2nd T screen. Statistical modelling Sample The authors were concerned about the screening practice where women who have a positive screening in 1st T are offered a CVS and those with a negative screen are then offered a screen in 2nd T. Two problems occur: Firstly if the 1st T results are not available or are ignored the standard calculation of risk for 2nd T screening will not be accurate.
Second if the results are combined with 1st T result to produce a risk estimate, this still does not allow for the fact that these women were screen negative based on another test and this will also give inaccurate risk estimates. The paper examines this issue. The parameters for the Gaussian distribution of markers in DS and unaffected pregnancies were obtained from published prospective data. Those relating to NT were from a large cohort (Nicolaides et al. 1998 as referenced in this paper). 1st T PAPP-A and fβhCG were from SURUSS (77 DS and 383 unaffected). 2nd T AFP, uE3, total hCG, fβhCG and dimeric inhibin-A based on 77 DS and 980 unaffected pregnancies (Wald et al. 1994, 1996b, 2000 as referenced in this paper). Correlation coefficients between NT and 1st T markers taken to be zero (Spencer et al. 1999). Correlation coefficients between 1st T markers and 2nd T AFP, uE3, and hCG also taken to be zero (Lam et al. 1998, de biasio et al. 2000). No published data for the correlation between 1st T markers and 2nd T inhibin-A. As the correlations between the others are very small it is likely to also be the case for inhibin, and even if there were a small correlation unlikely to have a big effect on the modelling. Outcomes and verification Results Comments Outcomes The overall DR and FPR were estimated using the 1st T and 2nd T combined results with revised MA specific risk curves and revised marker distributions. Results of screening across trimesters combined in stepwise manner-anyone screen negative has 2nd trimester screen and risk at 2nd T takes 1st trimester into account. Accuracy of screening methods Limitations Like all models makes assumptions that may not be applicable in the real-world e.g. that all who are screen positive are offered and accept IT. These were compared to integrated test with a cut-off chosen to yield the same DR as the stepwise approach. The modelled integrated test uses hCG in both trimesters so it is the same as in the stepwise method. 
The maternal age specific risk curves showed that the age specific risks in those who have had a previous negative 1st T screen were lower than those who have not been screened, as most DS will have been detected in 1st T. Stepwise DR, FPR (cut-off set at 5% FPR 2nd step) and integrated FPR for same DR. NT (risk 1:350), Double: stepwise 88%, 9.9%; integrated 88%, 6.4%. NT (risk 1:350), TT: stepwise 91%, 9.9%; integrated 91%, 6.7%. NT (risk 1:350), Quad: stepwise 93%, 9.9%; integrated 93%, 6.0%. Combined (risk 1:400), Double: stepwise 92%, 9.2%; integrated 92%, 6.1%. Combined (risk 1:400), TT: stepwise 94%, 9.2%; integrated 94%, 5.8%. Author’s conclusion If women who are screen positive in the 1st T are offered IT while those who are negative have 2nd T screening, the MA specific risk and marker distributions used to estimate risk in those who are screen negative would need to be revised. If the 1st T test result is not available or not combined with the 2nd T result (independent screening) there is a greater chance of FPR. If the markers are combined but the revised MA specific risks and the revised distributions are not taken into account the risk will be too low and may reduce the DR. This is because the distribution of markers in screen negative women who have DS fetuses will be closer to that of affected pregnancies. Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued) Source Country Setting Study design Evidence Grading Comparison Interventions (Hackshaw and Wald 2001) Estimated the correlation between two fβhCG samples (unaffected and affected) from the correlation when samples taken a few weeks apart in unaffected women, and distribution of fβhCG across 2 trimesters. Same done for correlation between 1st T fβhCG and 2nd T total βhCG.
Statistical modelling Sample Continued Using the age distribution of England and Wales 1996-1998 and applying the MA related risk of DS at term (Cuckle et al. 1987-as referenced in this paper), a hypothetical group of 100,000 affected and 100,000 unaffected pregnancies was produced using a Monte Carlo simulation. From this was generated MoM for marker levels in unaffected and affected pregnancies. Outcomes and verification Results Comments The distribution of some of the markers changed for women who are screen negative compared to women who are yet to be screened. 1st T combined (risk 1:400), Quad Conventional Stepwise screening 95%, 9.2% integrated 88%, 4.4% Similar problems when those having a positive 1st T risk are then offered a 2nd T screen to “double check” if they need an invasive test. This means that the risk estimate will be wrong if these are not taken into account. To determine the effect of 2nd T screening in women who were previously screen negative had to firstly estimate: The MA specific risk of having a DS birth in those who were screen negative after 1st T screening. The distribution of the markers (median, SD and correlation coefficients of 1st T and 2nd T markers in affected and unaffected women who were screen negative after 1st T, and compared to those who had not been screened before. The risk of having a DS pregnancy in those who had previously been screen negative in 1st T was illustrated using women having the 2nd T triple test. Risks were estimated based on: the assumption that women had not been screened before in the same pregnancy; by combining the 1st T and 2nd T markers with MA using the inappropriate age specific risk curve; and also combining the 1st T and 2nd T and MA and the revised (appropriate) MA specific risk curve and revised marker distributions. 
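The Monte Carlo approach described for Hackshaw and Wald (2001) can be illustrated with a minimal single-marker sketch: simulate marker values for affected and unaffected pregnancies, place the cut-off so that a fixed proportion of unaffected pregnancies are screen positive, and read off the DR. The marker means and SDs are hypothetical, the function name is an assumption introduced here, and the sketch omits the MA specific risk and the multi-marker correlations that the paper's model includes.

```python
import random


def simulate_dr_at_fixed_fpr(n, affected, unaffected, fpr=0.05, seed=0):
    """Estimate the DR at a fixed FPR from simulated log MoM values.

    `affected` and `unaffected` are (mean, sd) tuples (hypothetical values).
    Returns (dr, realised_fpr, cutoff).
    """
    rng = random.Random(seed)
    unaff = sorted(rng.gauss(*unaffected) for _ in range(n))
    aff = [rng.gauss(*affected) for _ in range(n)]
    # Cut-off at the (1 - fpr) quantile of the simulated unaffected values.
    cutoff = unaff[round((1 - fpr) * n)]
    dr = sum(x > cutoff for x in aff) / n
    realised_fpr = sum(x > cutoff for x in unaff) / n
    return dr, realised_fpr, cutoff


# 100,000 affected and 100,000 unaffected pregnancies, as in the paper's
# hypothetical cohort; the marker parameters here are illustrative only.
dr, realised_fpr, cutoff = simulate_dr_at_fixed_fpr(
    100_000, affected=(0.4, 0.2), unaffected=(0.0, 0.2)
)
print(f"DR = {dr:.1%} at FPR = {realised_fpr:.1%}")
```

Simulating equal numbers of affected and unaffected pregnancies keeps the DR and FPR estimates equally precise; prevalence enters later, when the rates are weighted by the MA distribution.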
Reviewers’ conclusions Even when the correct MA distribution and correct marker distribution are used, stepwise has a higher FPR than integrated. Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Lam et al. 2002) Compared screening with NT to 2nd T double test and integrated screening. Modelling was based on data from obstetric clinics of 5 hospitals in Hong Kong, China. Outcomes Compared estimates of DR and FPR obtained from modelling for NT alone (without MA), NT and MA, double test, and a screening method which integrated the results of NT and double test and only disclosed them after all results were available. Accuracy of screening methods DR (95% CI), FPR (95% CI) Limitations Removed those with other chromosomal disorders; this changes the spectrum of disease and may reduce FPR compared to studies where these are considered “unaffected”. Verification Pregnancy outcome from hospital records or direct contact with women after delivery. Difficulties implementing any of the screening strategies For 0.22% (39/17590) could not measure NT successfully. Statistical modelling Non-intervention trial. All in the analysis had both 1st T and 2nd T screening. NT measured at 10-14/40 with TA USS (98.6%) or TV USS. Measured using methods as per FMF by staff who had undergone NT training. Regular audit carried out. The NT measurements from the 2 best images were averaged. GA by CRL at or before 13/40 or BPD at 13-14/40. NT converted to MoM for the gestational day. NT not acted upon unless USS showed gross features of hydrops fetalis. Neither women nor obstetricians told of the results. All had 2nd T MSS (hCG and AFP = double test) between 15-20/40.
DS risk from MSS disclosed and women with risk 1:250 or more offered invasive diagnostic test. Women >35 yrs old or with other risk factors given the option of CVS at 10-12/40 or AC at 15-20/40 but MSS still taken a few weeks after CVS or just before AC. All markers expressed as MoM at given GA. AFP and hCG adjusted for maternal weight using commercial software. The DR and FPR of the screening methods were obtained by the model based LR approach described by Royston and Thompson, 1992 (as referenced in this paper). Specifically all markers were logarithmically transformed and multivariate Gaussian distribution fitted for affected and unaffected pregnancies. Used MA distribution of Hong Kong in 1994 and the age specific risks of DS to estimate DR and FPR at particular risk cut-offs. 95% CI obtained by parametric bootstrapping (n = 300). Those attending clinics at or before 14/40 who agreed to DS screening were recruited. NT was measured for research only and results were not disclosed to women or clinicians. 17,590 recruited January 1997-August 2000. Removed 208 with other chromosomal disorders or major abnormalities. Removed 39 where NT measurement was unsuccessful. Also removed 1015 who defaulted from 2nd T MSS, and 91 who miscarried between NT and 2nd T MSS. 16,237 left for analysis: mean GA at USS 87 days, and at 2nd T screening 16/40. Mean MA unaffected = 30.5 yrs (19% were ≥ 35 years). Pregnancy outcome ascertained in 15,253. Review of cytogenetic labs performing karyotyping for all 5 hospitals. Of the 16,237 in the analysis 117 had CVS (none had DS), and 1913 had AC. Looked at estimated number of DS at 2nd T based on MA distribution: about 38, similar to the 35 found. Spectrum of disease: 35 DS/15,253 = 0.23%. DR (95% CI) for fixed FPR 5%: NT alone 60.8% (41.7-69.4); NT and MA 69.3% (56-76.1); Double test 73.2% (63.4-82.9); Integration of NT and double 85.7% (76.2-92.1). 5.8% defaulted from 2nd T screen (1015/17590).
Removed those not having 2nd T screening. However, NT results weren’t disclosed so should not have removed women with large NT (and therefore cases of DS) except hydrops fetalis, and possibly any women who had CVS without MSS. DS were ascertained at or beyond 2nd T so hard to compare NT performance to other studies that ascertained DS with NT screening in 1st T. (Some of these would have miscarried anyway.) As NT was non-interventional, staff may not have cared about precision as much as when relying on it clinically. Unclear which population parameters were used for statistical modelling; presumably from the prospective cohort study. Author’s conclusion Despite the use of the double test, 2nd T MSS had a higher DR than NT. The integration of NT and 2nd T MSS yielded the best screening efficacy for DS. Reviewers’ conclusions Non-interventional study so reasonably valid comparison between 1st T and 2nd T screening. Double test performed better than NT but the integration of the two had the best performance. Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Rahim et al. 2002) The study aimed to determine the performance of a compromise policy where women who were most likely to have errors received a dating scan rather than all women having a dating scan. Screening with two markers (fβhCG, AFP) compared to 3 markers (uE3) for policies where all had USS for dating, or restricted to those with uncertain dates or dates unrecorded. Study based on 14,274 women with unaffected singleton pregnancies screened by LASS January 1997-July 2001. Outcomes The outcomes were for 2 markers compared to 3 markers for different policies for dating; either using dates only, using scan unless dates reliable, or scanning all women.
Accuracy of screening methods Limitations Unclear when serum markers taken so difficult to reproduce results or to classify the screening methods as 2nd T or 1st T screening. The aim of the study was not to determine accurate DR and FPR for different screening combinations but to determine the best policy for dating. Statistical modelling All had MSS at 13-21 weeks: fβhCG, AFP, and uE3 (PerkinElmer). Some also had inhibin-A, NT and PAPP-A but these markers were not used in the model. For each woman marker MoM were calculated based on USS and LMP. Serum markers adjusted for maternal weight where available. Normal medians were those used routinely by Leeds Antenatal Screening Service (LASS). Standard statistical modelling used to predict DR for a fixed 5% FPR for different screening methods. Parameters for unaffected from study data (correlation coefficients after removal of outliers). DS parameters from unaffected parameters with addition of meta-analysis values (Cuckle, 1995 as referenced in this paper). Both LMP and USS dating available for 12711, no LMP on form for 1404, and no USS information in 162. Of those with LMP available, 1693 listed as uncertain, 547 pill withdrawal periods, 1565 irregular cycles, and 296 regular cycles but abnormal length (these were all classified as unreliable). Presumably markers were across trimesters. MA distribution was that for England and Wales 1994-1998. Verification No details. Dates only, DR for 5% FPR: fβhCG and AFP 57.6%; fβhCG, AFP, uE3 59.3%. Scan unless dates reliable, DR for 5% FPR: fβhCG and AFP 61.1%; fβhCG, AFP, uE3 64.7%. Scan all women, DR for 5% FPR: fβhCG and AFP 63.2%; fβhCG, AFP, uE3 67.9%. Author’s conclusion The study confirms that a policy of universal scanning for dating increases the DR and reduces the FPR compared with using LMP. In this study over half of these benefits could be achieved by restricting USS dating to the 1/3 of women with unreliable dates.
Reviewers’ conclusions A combination of 3 markers (presumably across trimesters) was better than using only 2 markers. Unclear which markers taken when or if some markers taken across two trimesters (e.g. fβhCG). 118 Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued) Source Country Setting Study design Evidence Grading Comparison Interventions (Cuckle 2003) The aim of the paper was to update the parameters for DS screening markers, specifically inhibin in 2nd T, combining 1st T PAPP-A with 2nd T markers, and 1st T NT and 2nd T nuchal skinfold (NF) thickness (outside the scope of this review). Statistical modelling. Sample All markers were expressed as MoM (GA specific) of unaffected pregnancies and log transformed. Used meta-analysis of published studies to derive the DS mean and the differences in variance and covariance between affected and unaffected pregnancies. Except for NT only data from non-interventional studies were used. Used 11 studies for NT, including 5 interventional studies. Adjusted for this bias. Outcomes and verification Results Comments Outcomes The outcomes extracted are the DR for 5% FPR and DR and FPR (1:250) for NT, combined, combined plus 1st T uE3 and AFP, TT, quad, and what appears to be serum integrated and fully integrated methods. Accuracy of screening methods Limitations Little information on correlation between 1st T PAPP-A and 2nd T markers. More data needed to calculate covariances for inhibin and PAPP-A. Paper presented results for various fixed FPR, fixed DR, and cut-offs for a number of marker combinations. Other parameters from existing meta-analyses except for mean for fβhCG across the 1st T when GA specific values were regressed. The unaffected variance and covariance in unaffected pregnancies from 29,516 women screened in Leeds. 
The SD estimated from the Gaussian distribution and covariance estimated after exclusion of outliers. Results for β hCG better than with total hCG for both quad and triple test; the fβhCG results have been extracted. The existing and updated parameters were then used to predict performance of screening policies using the numerical integration method as described by Royston and Thompson, 1992, as referenced in this paper. The MA specific risk was from Cuckle et al. 1987, as referenced in this paper, and the MA distribution was taken to be Gaussian with mean MA 27 yrs, and SD 5.5 yrs. For all the combinations, the results were better when the 1st T markers were measured at 10/40 compared to 13/40; the 10/40 results have been extracted. DR, FPR (1:250 cut-off): NT (11-13/40) 70.6%, 2.4%; Combined 81.6%, 2.0%; Combined plus uE3 & AFP 84.0%, 1.8%; TT 65.1%, 4.7%; Quad 69.2%, 4.1%; 1st T PAPP-A combined with quad (?integrated or stepwise) 76.1%, 3.2%; 1st T PAPP-A and NT combined with quad (?integrated or stepwise) 88.1%, 1.5%. DR for 5% FPR: NT (11-13/40) 76.1%; Combined 87.8%; Combined plus uE3 & AFP 90.5%; TT 66.1%; Quad 71.7%; 1st T PAPP-A combined with quad (?integrated or stepwise) 80.6%; 1st T PAPP-A and NT combined with quad (?integrated or stepwise) 93.5%. Unclear whether results combining 1st and 2nd T screening relate to the disclosure or non-disclosure method. Presume they were combined in an integrated manner. Author’s conclusion The existing and updated meta-analysis parameters are used to predict screening performance for the related policies yielding a DR for a 5% false-positive rate as high as 90-93%. Since multi marker serum screening for DS was first introduced there has been a steady increase in DR in relatively small increments as new markers have been added. The incorporation of USS markers has continued and accelerated the process.
Reviewer’s conclusions The results indicated that adding inhibin to 2nd T TT increased DR, that NT at 11-13/40 had better DR than any 2nd T serum screening, and that combined test was better than NT. Adding 1st T uE3 and AFP to combined test did not improve screening by much. Combining results of screening across trimesters was better than any method alone; presume this was the serum integrated and fully integrated screening method. Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued) Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments (Rode et al. 2003) Compared screening using different combinations of 1st T and 2nd T markers, including a new marker, the proform of eosinophil major basic protein (proMBP). Used 6,741 women from the Copenhagen First Trimester Study (Wojdemann et al. 2001 as referenced in this paper) who gave birth prior to 1st April 2001 who all had NT 10-13/40. Outcomes The DR for a fixed 5% SPR was extracted for 1st T MSS, combined test, conventional TT, TT with ProMBP instead of uE3, conventional quad, quad using ProMBP instead of inhibin, integrated method, and a method using 1st T βhCG & NT and 2nd T AFP & ProMBP. Accuracy of screening methods DR (95% CI), FPR (95% CI) Limitations Data from interventional study. Sample of 195 women biased. Only 1/195 had positive NT screen i.e. these were women who had negative NT screen then 2nd T serum screen; those with large NT would not have continued screening. Statistical modelling Monte Carlo simulation The aim of the study was to investigate the distribution of and correlation between 1st and 2nd T serum markers and NT in normal pregnancies and use these findings in combination with published information on the distribution of markers in DS pregnancies to assess the performance of different screening strategies.
NT measured according to FMF. Those with risk > 1:250 based on NT and MA were offered invasive screen. One of the 195 had positive screen. PAPP-A determined using a manual ELISA. fβhCG determined using commercial kit (AutoDelfia ™, EG&G, Life Sciences). 2nd T serum (AFP, hCG and uE3) analysed using commercial kit (AutoDelfia ™, EG&G, Life Sciences). These all done prospectively. Of the 195 blood samples only 166 could be retrieved for proMBP and inhibin analysis. Total proMBP using ELISA developed by Statens Serum Institut Copenhagen. Inhibin using an Inhibin-A Dimer Assay Kit (Oxford Bio-Innovation). 195 had both 1st T (10+5/40-13+6/40) and 2nd T (14/40-20+2/40) samples taken in same pregnancy. Singleton, live birth outcomes, with no malformations. Only SPR was available. This will be a good proxy for the FPR in this modelled population. Median MA 32.2 yrs (range 20.3-40). Markers converted to GA (CRL) MoM. The correlation between markers and regression analysis performed using MoM. The distributions of MoM values determined for GA intervals. Monte Carlo simulation using S-Plus based software (validated by Larsen et al. 1998, as referenced in this paper), a standard distribution of MA (Van der Veen et al. 1997 as referenced in this paper) and published information on marker distribution in DS pregnancies (Cuckle and van Lith, 1999, and Wald et al. 1994; 1996, as referenced in this paper). ProMBP distributions established in 16 DS samples obtained from routine serum samples from Statens Serum Institute. DR for a fixed 5% screen positive rate: 1st T PAPP-A and fβhCG 56%; Combined test (β hCG) 76%; TT (hCG) 61%; 2nd T AFP, hCG, and ProMBP (instead of uE3) 79%; Quad (hCG) 69%; 2nd T AFP, hCG, uE3, and ProMBP (instead of inhibin) 83%; Integrated 86%; 1st T βhCG, NT & 2nd T AFP & ProMBP 90%. PAPP-A using a manual ELISA, and ProMBP using an in-house ELISA. ProMBP and inhibin-A on retrieved samples which had been stored at -20°C.
Very small sample used for the parameters in unaffected pregnancies: 195, and only 166 had all tests. ProMBP distributions established in 16 DS obtained from routine serum samples from Statens Serum Institute. Used the correlation coefficients from normal pregnancies in the DS matrices in calculations. This was done as an increase in correlation coefficients (up to 4 fold trialled) did not change the estimated performance of a given marker combination, and because data on correlation coefficient between 1st T and 2nd T markers in DS pregnancies are not yet available. Authors’ conclusion These results suggest that proMBP may be an important new marker in DS screening and, in particular, a good substitute for inhibin A. Reviewers’ conclusions Needs further analysis with larger samples for the parameters especially of proMBP. Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued) Source Country Setting Study design Evidence Grading Comparison Interventions (Wright and Bradbury 2005) Compared combined test, quad test, serum integrated and fully integrated screening to screening using highly correlated repeated measures of serum markers taken in the 1st T and 2nd T of pregnancy. Statistical Modelling Sample Estimates of means, SD, correlations of log MoM from SURUSS (47,053 pregnancies and 101 cases DS). MoM are estimated from USS dating and are corrected for maternal weight. MA related risk of DS from published data (Wright and Bray, 2000, as referenced in this paper). DR and FPR estimated for the maternal age distribution of England and Wales 1996-1998 using Monte Carlo simulation to sample 500,000 observations from the distribution of MoM in DS and unaffected. LR calculated for each observation and used to estimate the DR and FPR for each MA. Overall DR obtained by combining MA specific rates with MA distribution.
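The final step described for Wright and Bradbury (2005), combining MA specific rates with the MA distribution, can be sketched as a weighted average. The sketch assumes that the DR is weighted by the age distribution of affected pregnancies and the FPR by that of unaffected pregnancies; the evidence table does not spell out the weighting, and all numbers and names below are hypothetical.

```python
def overall_rates(age_dist, risk_by_age, dr_by_age, fpr_by_age):
    """Combine MA specific DR and FPR into population rates.

    age_dist:    proportion of all pregnancies at each maternal age
    risk_by_age: MA specific risk of DS at each age
    """
    affected = {a: p * risk_by_age[a] for a, p in age_dist.items()}
    unaffected = {a: p * (1 - risk_by_age[a]) for a, p in age_dist.items()}
    dr = sum(w * dr_by_age[a] for a, w in affected.items()) / sum(affected.values())
    fpr = sum(w * fpr_by_age[a] for a, w in unaffected.items()) / sum(
        unaffected.values()
    )
    return dr, fpr


# Hypothetical two-age population (illustration only).
dr, fpr = overall_rates(
    age_dist={25: 0.8, 40: 0.2},
    risk_by_age={25: 0.001, 40: 0.01},
    dr_by_age={25: 0.7, 40: 0.9},
    fpr_by_age={25: 0.02, 40: 0.2},
)
print(f"overall DR = {dr:.1%}, overall FPR = {fpr:.1%}")
```

Because older women contribute a disproportionate share of affected pregnancies, the overall DR sits closer to the older-age DR than the raw age distribution alone would suggest.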
Apart from the model for MA related risk of DS the assumptions are the same as the MA risk model used in SURUSS. The DR and FPR are identical to those in SURUSS (to the number of decimal places presented).
Outcomes and verification Results Comments
Outcomes: Outcomes were the FPR for 85% DR for combined test, quad test, serum integrated and fully integrated screening, and screening using highly correlated repeated measures of serum markers taken in the 1st T and 2nd T.
Accuracy of screening methods. Limitations: The modelling makes assumptions based on a small sample of DS cases. The notation ^2 means a measure was taken in 1st T then again in 2nd T. The results presented are for repeat measures of PAPP-A, repeat measures of PAPP-A and uE3, repeat measures of PAPP-A and total hCG, repeat measures of PAPP-A, uE3, and inhibin. Results are also given with the inclusion of 1st T NT for some of these screening methods. Repeat measures of other analytes are available from the paper, but those presented had the best screening performance. In the repeat measures it is assumed that screening takes place at 10/40 for 1st T samples.
FPR for 85% DR:
Combined (βhCG): 6.1%
Quad (hCG): 6.2%
Serum integrated test: 2.7%
Integrated test: 1.2%
PAPP-A^2: 2.3%
PAPP-A^2, uE3^2: 0.5%
PAPP-A^2, uE3^2 & NT: 0.3%
PAPP-A^2 & total hCG^2: 1.1%
PAPP-A^2, total hCG^2 & NT: 0.6%
PAPP-A^2, uE3^2 & inhibin^2: 0.3%
Authors also state that SURUSS parameters are affected by viability bias because of intervention in T2 (will optimistically bias performance). However, no studies are observational, as this would be unethical; need to compare performance at 2nd T. The model generated “implausible risk estimates” when using repeat measures in several highly correlated markers (Wald 2006). To overcome this issue Wald et al.
used the ratio of the marker levels in 1st T and 2nd T, which they called the cross trimester (CT) ratio.
Author’s conclusion: Certain combinations of highly correlated markers (some of which have poor discriminatory power individually) have benefits over the established integrated test. Would be good in situations where NT is not available. Drawback (as for the integrated test) is that all need repeat testing in 2nd T. The performance of repeated measures screening tests, and acceptability, should be assessed in further prospective studies.
Reviewers’ conclusions: While there are some issues with the methods used, repeat measures could be an additional marker which could potentially improve screening performance.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies
(Benn et al. 2005a) Compared contingent and integrated methods of screening with different timings of markers, using fβhCG or total hCG, and different risk cut-offs according to the practice in the UK and USA. Statistical Modelling
Sample The aim of the study was to assess performance of different protocols for contingent screening for DS. Modelled to predict performance for programmes most likely to be adopted in the UK and USA. Statistical modelling applied to the parameters in the SURUSS study. Multivariate Gaussian model used for distributions of marker profiles for DS and unaffected pregnancies. All markers expressed as the GA specific MoM of unaffected pregnancies. The markers’ means, SD, and correlation coefficients were obtained from SURUSS (based on pregnancies expected to be viable in 2nd T). MA related chance of DS at term from the literature (Wright and Bray, 2000, as referenced in this paper). Risk at 2nd T calculated by allowing for 23% fetal loss between 2nd T and term (Cuckle 1999, as referenced in this paper).
MA distribution from national statistics of England and Wales 2001 or USA 2000. DR and FPR estimated using Monte Carlo simulation to sample 500,000 observations from the modelled distributions. LRs were calculated for each observation to derive MA specific rates. Also calculated the early DR and FPR for those with high risk in T1.
Outcomes Results Comments
Outcomes: Outcomes were the DR and FPR for UK and USA policies, with risk cut-offs at term for the UK and at 2nd T for the USA.
Accuracy of screening methods DR, FPR. Limitations: Modelling as opposed to directly observed DR and FPR, although the method used has been validated.
For the UK screening 1st T MSS (fβhCG) taken at 10/40, NT 11/40, and 2nd T serum 14-20/40. For the USA contingent screening 1st T serum (total hCG) and NT 12/40, and 2nd T MSS 14-20/40. For integrated test, PAPP-A 10/40 & NT at 11/40. (These timings produced the best results.) For the contingent method the cut-offs for high and low risk groups were: for the UK, very high cut-off 1:20, very low cut-off 1:2000 (early DR, FPR = 62.5%, 0.3%); and for the USA, very high cut-off 1:30, very low cut-off 1:1300 (early DR, FPR = 61.7%, 0.6%).
UK policies (DR, FPR):
Combined (1:250): 85.8%, 3.8%
Double (1:250): 77.8%, 7.8%
TT (1:250): 79.8%, 7.2%
Quad (1:250): 83.8%, 5.4%
Contingent screening (overall risk 1:250): 91.4%, 2.1%
Contingent screening (overall risk 1:100): 88.2%, 1.1%
Integrated test combined in contingent manner (overall risk 1:250): 91.9%, 2.2%
Integrated test combined in contingent manner (overall risk 1:100): 88%, 1.0%
Integrated test, non-disclosure (overall risk 1:250): 92.1%, 2.2%
Integrated test, non-disclosure (overall risk 1:100): 88.3%, 1.0%
Author’s conclusion: With appropriate patient counselling it should be possible to provide highly effective DS screening using contingent protocols.
Reviewers’ conclusions: Contingent screening can provide most patients with early reassurance or diagnosis while providing additional testing for those who will most benefit. Our model based on the proposed protocols indicates this strategy can be highly effective. For both UK and USA policies over 60% of DS are detected in 1st T and less than 20% require 2nd T screening.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies
(Benn et al. 2005a) Compared contingent and integrated methods of screening. The contingent screening used 2 serum markers as opposed to just PAPP-A for integrated. This was to ensure that women who decided to just have 1st T screening had the best available marker combinations. A four marker combination was chosen for 2nd T screening for those who attended too late for 1st T screening. Statistical Modelling Continued
Sample Outcomes Results: Results for other timings and combinations of markers are presented in the paper.
USA policies (DR, FPR):
Combined (1:270): 83.6%, 5.3%
TT (1:270): 81.7%, 8.3%
Quad (1:270): 84.6%, 6.7%
Contingent screening (overall risk 1:270): 89.1%, 3.1%
Contingent screening (overall risk 1:130): 86.2%, 1.9%
Integrated test combined in contingent manner (overall risk 1:270): 90.5%, 3.4%
Integrated test combined in contingent manner (overall risk 1:130): 86.8%, 1.8%
Integrated test, non-disclosure (overall risk 1:270): 90.7%, 3.4%
Integrated test, non-disclosure (overall risk 1:130): 87.0%, 1.8%
Comments
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison Interventions
(Cuckle et al.
2005) The study compared the estimated performance of 17 DS screening strategies: 3 in 1st T, 6 in 2nd T, and 8 integrated or sequential policies, of which 6 are not in use but are proposed or have trials underway and 2 are used by default (combining results in an independent manner). Statistical modelling
Sample 1st T: NT at 11-13/40, combined test with PAPP-A and fβhCG at 10-13/40, or hCG at 12-13/40. 2nd T: double test (with either βhCG or total hCG), TT (with βhCG or total hCG), quad test (with βhCG or with hCG). The eight policies combining 1st T and 2nd T markers were: serum integrated screening; fully integrated; step-wise (either fβhCG or total hCG in 1st T and either fβhCG or total hCG in 2nd T); screening combining tests in an independent manner (as for step-wise); and contingent (using either fβhCG or hCG in 1st T and 2nd T).
All marker levels converted to MoM for GA. The risk was estimated by combining the MA specific risk with LR for marker profiles derived from the Gaussian distribution of DS and unaffected pregnancies. The distribution of risk was estimated by statistical modelling as per Royston and Thompson, 1992 (as referenced in this paper). The maternal age specific risk was from Cuckle et al. 1987 (as referenced in this paper) and the maternal age distribution was considered Gaussian with a mean of 27 yrs and a SD of 5.5 years. The risks for extreme marker values were calculated after truncation as per Wald et al. (Wald et al. 2003b). Parameters for 2nd T obtained from two meta-analyses of non-interventional studies.
For AFP, uE3, hCG and βhCG this was Cuckle, 1995 (as referenced in this paper) and for inhibin this was Cuckle (Cuckle 2003).
Outcomes and verification Results Comments
Outcomes: For this table the comparisons were NT, combined screening with fβhCG or hCG, double test with fβhCG or hCG, TT with fβhCG or hCG, quad test with fβhCG or hCG, serum integrated, fully integrated, stepwise with fβhCG or hCG for 1st T, independent with fβhCG or hCG for 1st T, and contingent screening with fβhCG or hCG for 1st T. The results were the DR for a fixed FPR.
Accuracy of screening methods DR for 5% FPR. Limitations: Not a lot of information on between trimester correlations in DS pregnancies; these parameters will need to be updated.
The performance is reported with 1st T serum and NT at 10/40 and 11/40 unless hCG used when serum and NT at 12 and 13/40. For the stepwise policies (where applicable) the proportion of tests considered positive after 1st T screen set at 70%, and for contingent screen the proportion having 2nd T screening set at 15%.
DR for 5% FPR:
NT 11/40: 78%
Combined (fβhCG), MSS 10/40, NT 11/40: 87%
Combined (hCG), MSS & NT 12/40: 83%
Double (fβhCG): 61%
Double (hCG): 56%
TT (fβhCG): 65%
TT (hCG): 60%
Quad (fβhCG): 71%
Quad (hCG): 67%
Serum integrated: 78%
Fully integrated: 93%
Stepwise (1st T fβhCG): 95%
Stepwise (1st T hCG): 90%
Independent (1st T fβhCG): 86%
Independent (1st T hCG): 86%
Contingent (1st T fβhCG): 94%
Author’s conclusion: Modelling with meta-analysis derived parameters provides a reliable guide for policy and favours contingent screening policy. The widespread practice of calculating 1st T and 2nd T risks separately should be abandoned.
Reviewers’ conclusions: High quality validated modelling using parameters from large studies. Meta-analysis of non-intervention studies, or adjustment for viability bias. Lots of detail about the methods used. The results were better for strategies using fβhCG compared to total hCG in both trimesters.
Combined screening was better than NT, and both were better than all 2nd T screening methods. All integrated and sequential screening was clearly better than screening in 1 trimester, except serum integrated which has a similar performance to NT and was inferior to combined, and independent screening.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison Interventions
(Cuckle et al. 2005) GA mostly based on USS. Statistical modelling. 1st T PAPP-A and fβhCG parameters from a published meta-analysis of non-interventional studies (Cuckle and van Lith, 1999, as referenced in this paper), extended to include the gestational specific DS means from a more recent meta-analysis (Spencer et al. 2002, as referenced in this paper) and in a large single study (Wald et al. 2003b). Continued
Sample Outcomes and verification The means at 10, 11, 12, and 13/40 derived from the weighted average from these 3 sources subjected to regression. About half of the data in the recent meta-analysis are from interventional studies using NT, PAPP-A, and fβhCG, so subject to viability bias. Possibly this would mean a 1.5% reduction in the mean PAPP-A level and a 1% increase in the mean fβhCG, and so to allow for bias the authors adjusted the means by these proportions. Mean serum hCG at 12 and 13/40 from the weighted average of data in the two recent studies (Spencer et al. 2002, as referenced in this paper, and SURUSS) as were other hCG parameters. NT means for DS at 11, 12 and 13 weeks from a new meta-analysis of nine studies as referenced in this paper. Four were intervention studies so subject to intervention bias and one biased referral of women with increased NT. FASTER biased as removed cystic hygroma. The results of the 3 interventional studies adjusted for viability bias. NT SD from 4 large prospective studies combined (Spencer et al.
2003 as referenced in this paper). Between trimester correlation coefficients from a new meta-analysis from 5 studies as referenced in this paper.
Results Comments Contingent (1st T hCG): 88%. Stepwise screening using fβhCG in combined test has best performance (95%) based on DR and FPR, followed by contingent (94%) using fβhCG in combined test and then fully integrated (93%). Contingent has high DR with only 15% having 2nd T screening.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(Maymon et al. 2005) The aim of the study was to describe a method for deciding whether an individual's 1st T DS screening test result justifies further testing in the second trimester. The illustrative sample: Outcome: The modelled DRs and FPRs for stepwise and contingent screening are presented. Accuracy of screening methods. Limitations: The illustrative sample size was small and biased but this was only used to confirm the probability of a positive 2nd T test given the 1st T test results.
Statistical Modelling: For each 1st T test result the model estimated the probability that the final results would be positive, and it also generated 2 cut-off probabilities: above one, women would be offered invasive testing, and below the other, women would be counselled that no further screening would be needed. The markers compared were 1st T fβhCG and PAPP-A and 2nd T quad (intact hCG). Statistical modelling was used to estimate the distribution of second-trimester marker profiles for a given first-trimester profile and hence the probability of a final positive result, using a 1:250 risk at term cut-off.
The Gaussian distributions were constructed for unaffected and DS pregnancies for 1st T and 2nd T markers. For a given profile of one marker the associated profile of 2nd T markers was determined using numerical integration (Royston and Thompson, 1992). The probability of a positive final result was determined given the MA and the first trimester profile.
24 DS samples from women referred to a Medical Centre in Israel for late ToP. All had been screened with NT at 11-13/40 and 2nd T TT and all had AC. Some also had 1st T MSS (PAPP-A and fβhCG) and others were tested retrospectively from stored serum. 367 unaffected singletons. Samples collected from those having sequential screening at same centre. All had been screened with NT at 11-13/40, 1st T MSS (PAPP-A and fβhCG) and TT and all had AC.
For stepwise screening all those with a ≥ 50% chance of a final positive have a diagnostic test. For contingent screening all those with a ≥ 50% chance of a final positive have a diagnostic test, and those below 3% have no further screen.
Verification: From phoning parents and delivery medical records.
The parameters used for DS and unaffected were from SURUSS. GA from USS (CRL) and markers were weight corrected. 1st T serum taken at 10/40, 2nd T serum at 14-22/40 and NT measured at 11/40. MA risk based on Cuckle et al. 1987 (as referenced in this paper) and MA distribution taken as Gaussian with mean 27 yrs and SD of 5.5 yrs.
DR, FPR:
Step-wise: 90%, 1.7%
Contingent: 88%, 1.4%
Difficulties implementing any of the screening strategies: None noted.
Author’s conclusion: Predicting the probability of a positive final result from the first-trimester marker profile has potential utility, either as a decision aid for individual women or as a formal part of screening policy in selecting a subset of women for second-trimester testing.
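The triage rule reported for this study (a diagnostic test at a ≥ 50% chance of a final positive result, and for contingent screening no further screening below 3%) can be written as a small decision function. This is only an illustrative sketch: the function name, the returned labels, and the assumption that the probability has already been computed by the model are not from the paper.

```python
def contingent_triage(prob_final_positive, high=0.50, low=0.03):
    """Classify a woman after 1st T screening under a contingent policy.

    prob_final_positive: modelled probability (0 to 1) that the final
    result would be screen positive, assumed to be supplied by the model.
    The thresholds default to the 50% and 3% values reported for the study.
    """
    if not 0.0 <= prob_final_positive <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if prob_final_positive >= high:
        return "offer diagnostic test"
    if prob_final_positive < low:
        return "no further screening"
    return "proceed to 2nd T screening"


def stepwise_triage(prob_final_positive, high=0.50):
    """Stepwise policy: same upper threshold, but everyone else continues to 2nd T."""
    if not 0.0 <= prob_final_positive <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if prob_final_positive >= high:
        return "offer diagnostic test"
    return "proceed to 2nd T screening"
```

The only difference between the two policies in this sketch is the lower threshold, which is what allows contingent screening to finish early for most women.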
Reviewers’ conclusions: While this was not the aim of the study, the paper did have results for stepwise screening versus contingent screening: stepwise screening had a marginally higher DR and a slightly higher FPR compared to contingent screening.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies
(Maymon et al. 2005) A multivariate log Gaussian model was used with published parameters. Sample Outcomes and verification Statistical Modelling Continued. To illustrate the method, the model was applied to a published series of 24 DS and 367 unaffected pregnancies to confirm the probability of a positive screen given first trimester markers. Results Comments
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies
(Benn and Donnenfeld 2005) The study compared combined test, quad test, and stepwise screening using a strategy that takes into consideration correlation between tests and one that doesn’t. Statistical modelling
Sample If both trimester results are considered independent then the a priori risk at 2nd T is the LR from the 1st T, and the LRs are multiplied together. Not all tests are fully independent so it is important to take the correlation between markers into account when calculating the LR. Few labs have flexible software with the capacity to combine 1st T and 2nd T screening results into a multivariable screening algorithm, so it has been suggested (Malone 2005, as referenced in this paper) that risks could be provided based on the approximation that tests are independent of each other.
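The difference between multiplying separate likelihood ratios and using a joint likelihood ratio can be shown with a toy two-marker Gaussian model. This is a sketch under stated assumptions: one marker per trimester on a log10 MoM scale, and entirely hypothetical means, SDs and correlations (not SURUSS parameters). The paper's multivariate algorithm covers seven markers.

```python
import math


def log_pdf_1d(x, mean, sd):
    """Log density of a univariate Gaussian."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))


def log_pdf_2d(x1, x2, mean1, mean2, sd1, sd2, rho):
    """Log density of a bivariate Gaussian with correlation rho."""
    z1 = (x1 - mean1) / sd1
    z2 = (x2 - mean2) / sd2
    one_minus_r2 = 1.0 - rho * rho
    quad = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / one_minus_r2
    return -0.5 * quad - math.log(2.0 * math.pi * sd1 * sd2 * math.sqrt(one_minus_r2))


# Hypothetical parameters on a log10 MoM scale; NOT taken from any study.
DS = {"mean1": -0.3, "mean2": 0.3, "sd1": 0.25, "sd2": 0.25}
UN = {"mean1": 0.0, "mean2": 0.0, "sd1": 0.25, "sd2": 0.25}


def lr_product(x1, x2):
    """LR(1st T) x LR(2nd T), i.e. the independence approximation."""
    log_lr1 = log_pdf_1d(x1, DS["mean1"], DS["sd1"]) - log_pdf_1d(x1, UN["mean1"], UN["sd1"])
    log_lr2 = log_pdf_1d(x2, DS["mean2"], DS["sd2"]) - log_pdf_1d(x2, UN["mean2"], UN["sd2"])
    return math.exp(log_lr1 + log_lr2)


def lr_joint(x1, x2, rho_ds, rho_un):
    """LR from the joint (correlated) distributions of both markers."""
    return math.exp(log_pdf_2d(x1, x2, rho=rho_ds, **DS) - log_pdf_2d(x1, x2, rho=rho_un, **UN))
```

With both correlations set to zero the two functions agree. With a non-zero correlation (for example `lr_joint(-0.2, 0.2, 0.5, 0.5)` against `lr_product(-0.2, 0.2)`) they diverge, which is the kind of discrepancy the study quantifies.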
Outcomes and verification Results Comments
Outcomes: The DR and FPR have been modelled for 1st T combined screening, 2nd T quad test (with total βhCG), and stepwise strategy with and without this multivariate algorithm (LRM).
Accuracy of screening methods DR, FPR. Limitations: The paper states that more robust estimates of DR and FPR based on the simulation of larger numbers of cases with additional test combinations are available elsewhere (Benn and Donnenfeld 2005).
DR, FPR:
1st T combined: 83.7%, 5.1%
2nd T quad (with total βhCG): 84.4%, 6.6%
Stepwise (LRM): 90.8%, 3.1%
Stepwise (LR1st T × LR2nd T): 90.2%, 3.9%
Risk cut-off was 1:270. Model used the MA distribution of the USA for 2000.
The aim of this study is to compare this approach (LR1st T × LR2nd T) with the approach that uses an algorithm that takes the correlations into consideration (LRM). Statistical parameters from SURUSS. Screening at 12/40 and then 14-22/40. Dating was by USS, with maternal weight correction. Monte Carlo simulation (software programme S-Plus) to generate 50,000 DS results and 50,000 unaffected results. For each set of results the LR for 1st T and 2nd T were calculated, as well as for all seven markers combined (LRM). Combined the LR with MA specific risk of DS and the MA distribution of the USA for 2000.
Difficulties implementing any of the screening strategies: Errors in determining risk if assuming that 1st T and 2nd T tests are independent when they are not. Risk will be underestimated in 27% of DS cases and overestimated in 20% of unaffected. In 10% of DS the risk will be underestimated ≥ 2 fold, and for 10% of unaffected the risk will be overestimated ≥ 2 fold.
Author’s conclusion: We conclude that the correlations that exist between first and second trimester screening tests preclude the use of second trimester risks derived from the direct product of separate first and second trimester screening.
Should not be offered screening across 2 trimesters unless the results can be combined using this multivariate algorithm.
Reviewers’ conclusions: The study has highlighted the importance of taking the correlation between markers into account when calculating the LR.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies
(Palomaki et al. 2006) The aim of the study was to describe the choices and tradeoffs inherent in 3 strategies that combine 1st T and 2nd T markers. Monte Carlo simulation
Sample Compares three policies: stepwise (1st T combined then 2nd T quad), contingent, and integrated screening. Parameters (means, SD, and correlation coefficients) and truncation limits for modelling taken from SURUSS. Data from pregnancies viable at 2nd T. Assumes that if a case is detected in 1st T, there is no screen in 2nd T. The MA distribution is that of the USA in 2000 (median 27 years and 13% ≥ 35 years). The MA specific risk is based on a published equation (Hecht and Hook 1996, as referenced in this paper). Adjusted for 43% loss in DS between late 1st T and term, and 23% loss between early 2nd T and term. Monte Carlo simulation to generate MA and the associated 7 markers for a million hypothetical cases of DS and a million unaffected. For each of the “pregnancies” 1st T risk then 2nd T risk is assigned. Modelling programme has been used elsewhere and validated by comparing the results to other independent models in the literature. Assumes complete adherence, i.e. all positive cases get a diagnostic test and no one who has a positive 1st T screen then requests 2nd T. No one drops out. Compared their result to that of SURUSS at same cut-offs. Minor differences may be due to hCG rather than fβhCG and slight differences in the MA distribution and a priori risks. Still they are nearly identical to those of SURUSS.
Use same type of hCG in 1st T as in 2nd T as feel these had to be kept the same when comparing strategies.
Outcomes and verification Results Comments
Outcomes: Compares firstly stepwise screening to integrated screening. DRs are given for stepwise screening for a fixed FPR (2% and 5%) and different cut-offs. The FPR and cut-offs are set so that the DR for the fully integrated screen is the same as for sequential screening. Cut-offs provided in brackets for each test.
Accuracy of screening methods. Limitations: Makes assumptions that may not occur in “real-world screening”, e.g. that there is complete adherence to the policy, i.e. all positive cases get IT and no one who has a positive 1st T screen then requests 2nd T.
The cut-offs are the level at or above which women would be offered IT. Secondly compares contingent screening to integrated screening in the same manner. The DR for fixed FPR (2% and 5%) are given for contingent screening, then compared to the FPR for integrated screening when the DR are the same as for the contingent screening.
Stepwise versus integrated:
Stepwise DR for 2% FPR (1st T 1:63, 2nd T 1:65): 84.3%; Integrated FPR (1:100): 1.2%
Stepwise DR for 5% FPR (1st T 1:168, 2nd T 1:165): 89.9%; Integrated FPR (1:275): 3.1%
Stepwise DR for 5% FPR (1st T 1:41, 2nd T 1:450): 92.3%; Integrated FPR (1:470): 4.8%
Contingent (those with risk 1:1500 or lower have no further screen) versus integrated:
Contingent DR for 2% FPR (1st T 1:63, 2nd T 1:70): 84.1%; Integrated FPR (1:95): 1.2%
Contingent DR for 5% FPR (1st T 1:168, 2nd T 1:195): 89.5%; Integrated FPR (1:250): 2.8%
For contingent screening the cut-off under which women have no further screen is set at either 1:1500 or 1:3250.
Author’s conclusion: “Integrated screening is the most efficient of the 3 screening strategies but possible to select risk cut-offs for both stepwise and contingent that minimize losses in efficiency while maintaining early detection and early completion.
For all of these strategies well designed intervention trials are needed to determine acceptability to women and providers in primary care settings and to assess real-world performance.”
Reviewers’ conclusions: Slightly different DR and fixed FPR so difficult to compare all three at same time. However results do show that integrated performs better than either stepwise screening or contingent screening.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies Sample Outcomes and verification Results Comments
(Palomaki et al. 2006) Monte Carlo simulation. Continued
Contingent (those with risk 1:3250 or lower have no further testing) versus integrated:
Contingent DR for 2% FPR (1st T 1:63, 2nd T 1:65): 84.2%; Integrated FPR (1:95): 1.2%
Contingent DR for 5% FPR (1st T 1:168, 2nd T 1:170): 89.8%; Integrated FPR (1:270): 3%
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies
(Wald et al. 2006b) Compared screening with Integrated to serum Integrated screening methods with and without the use of the cross trimester (CT) ratios. Monte Carlo simulation
Sample Used data from SURUSS in which the following analytes were measured in both 1st T and 2nd T: AFP, uE3, fβhCG and total hCG, PAPP-A and inhibin-A. Some are markers in one trimester but not the other (e.g. PAPP-A). Results are for 74 DS pregnancies with pairs of 1st T and 2nd T measures and 492 unaffected pregnancies (3 were removed as they had imprecise GA). NT performed in 1st T. Analytes measured in 1st T between 10-13/40 and those in 2nd T 14-22/40.
Median interval between two measures was 31 days when first measure done at 11/40, 27 when first taken at 12/40 and 18.5 when first taken at 13/40. Analytes expressed as MoM for unaffected pregnancies at same GA. The CT ratio expressed as MoM in 2nd T divided by MoM 1st T. Medians of CT ratios in DS pregnancies estimated for 10/40, 11/40, 12/40, and 13/40. SD and correlation coefficients estimated from SURUSS (based on GA using CRL and corrected for maternal weight). Parameter estimates for individual markers are from SURUSS. Truncation limits given in appendix for the serum markers, and for NT in the text. NT lower truncation limit was 0.65 which is higher than for SURUSS (as per Morris and Wald 2005, as referenced in this paper). Outcomes and verification Results Comments Outcomes The performance of screening using the CT ratios was better when the 1st T measurement of NT and MSS was at 11/40 compared to 12 or 13/40. The results are therefore given for 1st trimester screening at 11/40. Accuracy of screening methods Limitations Small sample used to determine the parameters (74 DS cases). The outcomes extracted for this evidence table were those for each Cross trimester (CT) ratio when first of each pair taken at 11/40. All were good discriminators except AFP. The results for integrated screening (with fβhCG and total hCG) are presented with and without the use of the CT ratios (for all markers except AFP). 
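The CT ratio defined above (MoM in 2nd T divided by MoM in 1st T) amounts to a one-line calculation. The sketch below uses a hypothetical function name and made-up example values. It does not reproduce the gestation-specific medians or truncation limits used in the paper.

```python
import math


def cross_trimester_ratio(mom_first_trimester, mom_second_trimester):
    """CT ratio: 2nd T MoM divided by 1st T MoM for the same analyte."""
    if mom_first_trimester <= 0 or mom_second_trimester <= 0:
        raise ValueError("MoM values must be positive")
    return mom_second_trimester / mom_first_trimester


def log10_cross_trimester_ratio(mom_first_trimester, mom_second_trimester):
    """Same quantity on the log10 scale, where it is a simple difference of log MoMs."""
    return math.log10(cross_trimester_ratio(mom_first_trimester, mom_second_trimester))
```

For example, a made-up PAPP-A value of 0.5 MoM in 1st T and 1.0 MoM in 2nd T gives a CT ratio of 2.0. On the log scale the ratio is the difference between the two log MoM values, which fits the Gaussian modelling of log MoM used elsewhere in these studies.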
DR for 5% FPR:
CT ratio PAPP-A: 60%
CT ratio total hCG: 42%
CT ratio uE3: 31%
CT ratio inhibin-A: 30%
CT ratio fβhCG: 19%
CT ratio AFP: 10%
Without CT ratios, DR for 5% FPR: Integrated test (fβhCG) 94.2%; Integrated test (total hCG) 94.1%
Without CT ratios, FPR for 85% DR: Integrated test (fβhCG) 0.88%; Integrated test (total hCG) 0.95%
With CT ratios, DR for 5% FPR: Integrated test (fβhCG) 96.9%; Integrated test (total hCG) 97.4%
The first author has a patent interest in the integrated test and with others holds a patent for the use of uE3 as a 2nd T screening marker for DS. He is also the director of Logical Medical Systems (software for interpretation of DS screening test results).
Author’s conclusion: The addition of CT ratios to an Integrated test substantially improves the efficacy and safety of prenatal screening for DS. It is cost effective and could be usefully introduced into screening programmes.
Reviewers’ conclusions: More data may be needed to refine parameters. However it appears that CT ratios have the potential to improve integrated screening.
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies
(Wald et al. 2006b) Multivariate Gaussian distributions were specified for different combinations of screening markers. Monte Carlo simulation used these distributions to generate random samples (500,000 DS and 500,000 unaffected). The risk of DS pregnancy at mid-trimester determined by MA specific rates of DS at term adjusted by multiplying by 1/0.77 (fetal loss between mid-trimester and term). This MA specific rate was multiplied by the LR for marker values obtained by the overlapping Gaussian distribution of affected and unaffected pregnancies.
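The risk calculation described above (term risk adjusted by 1/0.77 for fetal loss, then multiplied by the marker LR) can be sketched as follows. Two points are assumptions of this sketch rather than statements from the paper: the multiplication is done on the odds scale, which is nearly identical to multiplying the rate when risks are small, and the 1:250 cut-off is only an illustrative default. The function names and example values are hypothetical.

```python
def mid_trimester_risk(term_risk, likelihood_ratio, survival_to_term=0.77):
    """Risk of a DS pregnancy at mid-trimester given term risk and marker LR.

    term_risk: maternal-age-specific risk at term, as a probability (e.g. 1/900).
    survival_to_term: assumed proportion of DS pregnancies surviving from
    mid-trimester to term (0.77, i.e. 23% fetal loss).
    """
    if not 0.0 < term_risk < 1.0:
        raise ValueError("term_risk must be a probability between 0 and 1")
    if likelihood_ratio <= 0:
        raise ValueError("likelihood_ratio must be positive")
    term_odds = term_risk / (1.0 - term_risk)
    mid_trimester_odds = term_odds / survival_to_term  # multiply by 1/0.77
    posterior_odds = mid_trimester_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def is_screen_positive(risk, cutoff=1.0 / 250.0):
    """Screen positive when the risk is at or above the cut-off (illustrative 1:250)."""
    return risk >= cutoff
```

In a simulation, applying `is_screen_positive` to every simulated DS and unaffected profile and taking the proportions flagged gives the DR and FPR for that cut-off.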
Continued
Sample Outcomes and verification DR and FPR calculated for a specific risk cut-off. The MA distribution based on England & Wales from 1996 to 1998.
Results: With CT ratios, FPR for 85% DR: Integrated test (fβhCG) 0.32%; Integrated test (total hCG) 0.28%
Comments
Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)
Source Country Setting Study design Evidence Grading Comparison screening strategies
(Wright et al. 2006) Compared three stage contingent screening to integrated screening, and a strategy of using TT rather than quad in the 2nd T stage of contingent screening. Statistical modelling
Sample For the 3 stage contingent strategy PAPP-A and fβhCG would be measured at 10/40. The risk would be combined with MA associated risk and those with very low risk would be considered screen negative and would have no further screen. All remaining women would then get NT screen at 11/40 and risk would be determined using 1st T screen and MA. Those with very low risk would have no further screening and those with very high risk would have invasive testing. Those with intermediate risk would continue to 2nd T screening. The quad test (AFP, uE3, fβhCG, and inhibin) would be measured at 2nd T. MA risk, and results from 1st T and 2nd T screens would be combined to assess risk. This would be used to determine whether screen positive or screen negative at this stage. The integrated strategy would involve all women receiving all tests and being told the result after all screening completed.
Outcomes and verification Results Comments
Outcomes: Results for 3 stage contingent screening, integrated screening, and a strategy of using TT rather than quad for the contingent strategy were extracted for this evidence table.
Accuracy of screening methods: DR, FPR

Limitations
Assumes all with high risk will receive the additional testing. However, there is a relatively narrow GA window and some may not present until 2nd T.

The paper used modelling to determine the best cut-offs. These were: stage 1, 1:2000; stage 2, low risk 1:2000 and high risk 1:20; stage 3, 1:250. The results are presented in this table compared to integrated screening (final risk cut-off of 1:250).

Means, SDs, and correlation coefficients were obtained from SURUSS (Wald et al. 2004). These parameters are based on pregnancies expected to be viable in 2nd T and assume no correlation between marker values and viability. SURUSS truncation limits were used. Modelling was based on data where GA was confirmed by USS and weight corrected. Various rates were estimated using a Monte Carlo simulation. 500,000 observations were drawn. The LR was calculated for each set of markers at each stage and combined with the MA distribution of DS or unaffected pregnancies to derive DR and FPR. The MA associated risk of DS was taken from published data (Wright and Bray, 2000 as referenced in this paper). The MA distribution was that of England and Wales 2002.

Results (DR, FPR)
3 stage contingent with quad: 89.5%, 1.9%
3 stage contingent with TT: 89.2%, 2.2%
PAPP-A and βhCG combined with quad, integrated manner (non-disclosure): 92.2%, 2.1%

Author’s conclusion
3 stage contingent screening achieved similar results to integrated screening while only a fraction needed 2nd T results. About 2/3 of pregnancies are screened with 1st T MSS alone, 5/6 women complete screening in 1st T, and 1st T DR is over 60%. This is a logical approach for allocation of NT where this resource is limited. The acceptability of this protocol and its performance in practice should be tested in prospective studies.

Reviewers’ conclusions
This strategy appears promising, but a prospective study is needed so that conclusions rest on observed data.

Table 16.
Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)

Source, Country, Setting, Study design, Evidence Grading; Comparison Interventions

(Wald et al. 2006a)
Monte Carlo simulation

The aim of the paper was to determine the effect of adjusting serum markers for values from a previous pregnancy. Women with a FP MSS in one pregnancy have an increased chance of a FP result in a subsequent pregnancy.

Sample
The study compared performance of screening with TT, compared to quad, combined, serum integrated and integrated screening, adjusting and not adjusting for DS in previous pregnancy. For each screening marker the values of the MoM were adjusted to take account of the values of a previous pregnancy using the regression coefficient of the weight adjusted MoM values in the current pregnancy regressed on the value of the previous pregnancy.

Outcomes
For the purpose of this evidence table the DR and FPR for Triple, quad, combined, serum integrated and integrated screen for a 5% FPR and an 85% DR were extracted. These are presented as unadjusted figures, and figures adjusted for previous pregnancy results.

Accuracy of screening methods

Limitations
MoM adjustment does not take into account that the SD of adjusted MoM values tends to be smaller and correlation coefficients slightly different. However, there was virtually no gain from using adjusted distribution parameters. All tests used fβhCG.

The regression coefficient for 1st T markers was estimated using data from 401 women who had MSS in two pregnancies screened at the Wolfson Institute of Preventive Medicine. For 2nd T markers, previously reported estimates of the regression coefficients were used (Wald et al. 2004a, as referenced in this paper). These were based on 6,448 women screened at the Wolfson Institute of Preventive Medicine. Adjusted all screening markers.
Did not adjust NT as the regression coefficient was so small.

Results
Women with previous pregnancy.

DR (adjusted DR) for a fixed 5% FPR
Triple 75% (adj 80%)
Quad 81% (adj 85%)
Combined 85% (adj 87%)
Serum integrated 86% (adj 89%)
Integrated 94% (adj 95%)

FPR (adjusted FPR) for a fixed 85% DR
Triple 10% (adj 7.9%)
Quad 7.1% (adj 4.9%)
Combined 4.9% (adj 3.7%)
Serum integrated 4.7% (adj 2.9%)
Integrated 1.1% (adj 0.7%)

Assumed that the relationship between markers across two pregnancies is the same whether the second has DS where first does not, or both were unaffected (no data but authors presume this is a safe assumption).

The first author has a patent interest in the integrated test and with others holds a patent for the use of uE3 as a 2nd T screening marker for DS. He is also the director of Logical Medical Systems (software for interpretation of DS screening test results).

Author’s conclusion
The results show that adjusting the MoM for those in a previous pregnancy improves screening performance and avoids recurrent FP. Should be used on all women with previous pregnancy screen (as long as not DS in first). Both false positives and false negatives will be reduced.

Table 16. Evidence table of primary research studies appraised investigating the accuracy of screening carried out in first and second trimesters (continued)

Source, Country, Setting, Study design, Evidence Grading; Comparison Interventions

(Wald et al. 2006a), continued
Monte Carlo simulation

DR and FPR estimated by Monte Carlo simulation. Weight adjusted MoM and MA were simulated for a previous pregnancy and current pregnancy for a sample of 100,000 affected (for DR) and 100,000 unaffected (for FPR) pregnancies.

Sample, Outcomes and verification
Distribution of ages among women who had two pregnancies from Wald et al. 2004a (as referenced in this paper). MA specific risk of DS from Morris et al. 2002 (as referenced in this paper).
Means, SDs and correlation coefficients for markers in the same pregnancy were from SURUSS. Based on GA by USS, NT at 11/40 and weight adjusted. The correlation coefficients for markers in different pregnancies were calculated from 401 women as above. Truncated limits for all markers in appendix (from Wald et al. 2003 as referenced in this paper and Morris (2005)). MoMs were adjusted for serum marker results in previous pregnancy. These results together with MA were used to calculate risk of DS. Also examined the effect of adjusting MoM for those who had a previous FP result.

Comments
Should also not adjust where previous pregnancy was a twin or where previous pregnancy had markers adjusted for smoking and not smoker now. Also probably no need to adjust where women screened in last 10 months as likely to have ended in miscarriage with risk of extreme levels. Simplify by using most recent pregnancy. Should also truncate to avoid adjusting using very extreme levels which may have been associated with T18.

Reviewers’ conclusions
While this was not the aim of the study, the results show that the integrated test is superior to the serum integrated test, which is superior to the combined test, which in turn is superior to the quad test, and that the TT has the poorest performance of these 5 screening methods.

Chapter 6: Changes in the Rate of Invasive Testing Following the Introduction of Screening

PRIMARY RESEARCH: STUDY DESIGNS AND QUALITY

The search identified 11 eligible primary research studies. Below is an overview of study designs and aspects of quality represented by these studies. Full details of the 11 papers appraised, including methods, key results, limitations and conclusions, are provided in evidence Table 17 (pages 142-155). Studies are presented in reverse chronological order of publication within each table.
Study designs and quality assessments

All 11 eligible studies investigating changes in the rate of invasive testing following the introduction of population-based MSS programmes were graded evidence level III-2. All were retrospective cohort studies: eight compared rates of testing before and after the introduction of screening programmes (Shohat et al. 2003; Zoppi et al. 2001; Benn et al. 2005b; Jou et al. 2005; Benn et al. 2004; Chasen et al. 2004; Muggli and Halliday 2004; Cheffins et al. 2000), and three compared the invasive testing rates in hospitals or areas with differing screening policies (Smith-Bindman et al. 2003; Dixon et al. 2004; Wellesley et al. 2002).

The eight studies comparing the rate of invasive testing before and after the introduction of population-based MSS programmes were of similar design but varied considerably in terms of the type of screening programme introduced, the quality of the data sources and the duration of follow-up.

Jou et al. (2005) reported the rates of amniocentesis in mothers aged 35 years and over in Taiwan between 1993 and 2001. The authors utilised large national databases of birth defects, amniocentesis records, and demographic descriptions of the population in Taiwan to report the numbers of births, the proportion of infants born to older mothers, the live birth prevalence of DS and rates of amniocentesis. The quality of these databases is mixed and there is no information reported regarding the uptake of the screening programme. The registration of live births was not compulsory in Taiwan until 1997 (Chen et al. 2005), which brings into question the quality of the birth data and, thereby, the accuracy of calculating amniocentesis rates as a proportion of total births.

Benn et al.
(2004) aimed to examine the overall effectiveness of a MSS programme in clinical practice by investigating changes in the numbers of samples processed by the authors’ cytogenetics laboratory over an 11-year period (1991-2002). A triple-analyte MSS programme was introduced in Connecticut in 1991, expanded to include a fourth analyte in 1999 and utilised USS findings to modify risk from 1996 onwards. While the records used were complete, the authors could not be certain that the proportion of the population screened by their laboratory had remained constant over the period of the study and that any changes in invasive testing were not the result of people being referred to other laboratories. Indications for invasive testing were reclassified by the authors into one category when multiple indications were present. There is no information as to whether the reclassification was conducted by someone blind to the screening policy in place when the test was requested. A second study by Benn et al. (2005b) utilised a similar sample but over a 12-year period (1991-2003) and focussed on the uptake of invasive testing in true-positive versus false-positive cases. While the quality of data was generally good in both of these studies, the samples consisted of a proportion of the pregnant population of Connecticut and lack generalisability.

Chasen et al. (2004) compared the rates of invasive testing after the introduction of NT measurement in a medical centre between 2000 and 2002. The quality of records was high; however, the sample was small and based in one medical centre, and therefore may not be generalisable to other populations. In addition, the authors chose to exclude terminated pregnancies and stillbirths from the sample and so the study reports only on the invasive testing rate in women aged 35 years or older who delivered viable
As a number of invasive tests will result in terminated pregnancies and miscarriages, this design does not give an accurate indication of changes in the overall rate of invasive testing. Two studies retrospectively examined the rate of invasive testing in large state-wide MSS programmes in Australia. Muggli and Halliday (2004) examined changes in the rate of prenatal diagnostic tests performed in Victoria between 1992 and 2002. Prior to 1997, risk assessment for DS was made based primarily on maternal age ≥ 37 years of age, previous family history or increased risk of aneuploidy. In 1997 second trimester MSS based on 4 analytes and maternal age was introduced, and a combined serum and NT risk assessment was available from 2000. Similarly, Cheffins et al. (2000) reported rates of invasive testing over an 11-year period (1986-1996) following the introduction of a 4-analyte MSS programme in South Australia in 1991. Both of these studies utilised high quality databases, reported rates of invasive testing for older and younger mothers, and included all invasive tests performed over the period of study, rather than just those of mothers who had delivered viable infants. The inclusion of all pregnancies in the state and all complete records of all invasive tests make these results highly valid and more generalisable to other settings. Shohat et al. (2003) investigated the rate of invasive testing relative to screening policies in the Israeli Jewish population between 1990 and 2000. National birth and cytogenetic databases were utilised. In Israel the MSS triple screen was introduced to the public between 1990 and 1992 (free of cost), USS screen at 14-16 weeks were available privately at a cost from 1992, and NT measurements privately from 1996. Unfortunately, the availability of some tests privately and some publicly makes it difficult to determine the effect of each change in policy on invasive testing rates. 
Furthermore, the authors used a maternal survey of all women who gave birth on a single day in 2000 to estimate the uptake of various screening options over the whole period of the study. Lastly, cultural differences in the acceptability and uptake of screening and invasive testing in the Israeli population make the findings of this study less generalisable to other settings.

Zoppi et al. (2001) compared the rates of prenatal diagnosis in two cohorts of women, one group referred prior to nuchal translucency measurement (1995) and one after (1999). The study was set in Italy and the two cohorts of women were both ≥ 35 years of age. In the 1995 cohort, 982 women were given non-directive counselling and offered invasive testing on the basis of AMA. In the 1999 cohort, 1386 women were given non-directive counselling and offered IT on the basis of AMA and NT results. The authors focused on the proportion of women who declined invasive testing after receiving non-directive counselling where NT results were or were not discussed. Women who presented too late for 1st trimester NT measurement were included in analyses of rates of invasive testing in the NT group. Removing these women from analyses altered the findings. Potential cultural differences in the acceptability and uptake of screening and invasive testing in Italy make the findings of this study less generalisable to other settings.

Three studies compared the rates of invasive tests performed in areas or hospital districts with differing screening policies. Dixon et al. (2004) compared two district hospitals, one with a maternal serum screening programme (2 analytes) in place and one with MSS available by request. In the second (Wellesley et al. 2002), eight district maternity units were divided into three groups based on their screening policies. Group 1 (two hospitals) assessed risk for DS based on two maternal serum analytes and an anomaly scan.
Group 2 (three hospitals) assessed risk based on maternal age and a second trimester anomaly scan. Group 3 (three hospitals) included nuchal translucency measurement in their risk calculations, although the application of NT was not consistent across the three districts: it was offered to all mothers, offered on the basis of age, or mothers were given information about private nuchal translucency measurement. The age distribution of the cohorts varied significantly in both studies and, given that maternal age was used in some serum screening protocols to calculate risk, this may have confounded the results. In Wellesley et al. (2002), the rates of invasive testing are difficult to compare because there were variations between hospitals in the same group regarding the age at which testing was offered, the availability of services through the public health system, and the age of the sample.

Smith-Bindman et al. (2003) examined the proportion of invasive testing in England and Wales over an 11 year period (1989-1999) and in relation to the dominant screening policies in different areas. Data describing almost 6,000,000 births and 335,000 prenatal diagnostic referrals were included in the study. Data sources for births and cytogenetic results were of high quality. The screening method which led to a prenatal diagnosis in DS cases was used as a proxy measure for screening methods employed in different areas for different years (area-years). Using this inferred method, each area-year was classified on the basis of the dominant screening method at the time and rates of invasive testing compared.

Study settings and samples

Jou et al. (2005) examined the rate of invasive testing using a Taiwanese national database of all deliveries to mothers aged 35 years and over between 1993 and 2001. Shohat et al. (2003) utilised large national datasets in Israel to examine rates of invasive testing, with a focus on the Jewish-Israeli population.

Two studies, both based in Australia, examined rates of invasive testing following the implementation of a state-wide maternal screening programme. Muggli and Halliday (2004) utilised two units which collect information on indications for every invasive procedure and every birth at or after 20 gestational weeks in the state of Victoria. Rates of invasive procedures between 1992 and 2002 were compared. Cheffins et al. (2000) evaluated the South Australian Maternal Serum Antenatal Screening Programme (SAMSAS) between 1986 and 1996. Both of these studies included all invasive tests performed in the sample over the period of study and reported their results separately for older and younger mothers.

Four studies, three based in the United States and one in Italy, were based on records collected in specific hospitals or cytogenetic laboratories. Benn et al. (2004; 2005b) reported the numbers and indications of all AC and CVS samples processed by one cytogenetics laboratory in the state of Connecticut between 1991 and 2002, and so included both older and younger mothers in the sample. Chasen et al. (2004) reviewed the records of all women aged 35 years and over who delivered viable infants at one hospital in New York State. Women who underwent a termination were excluded from the study. Zoppi et al. (2001) compared the proportion of invasive testing in two groups of women, both aged ≥ 35 years, who were screened for DS risk based on maternal age alone or nuchal translucency and maternal age.

Two studies (Dixon et al. 2004; Wellesley et al. 2002) utilised the records of maternity hospitals based in neighbouring districts and both of these were based in England. Dixon et al.
included all registered pregnancies in two district hospitals in Gloucestershire, England between 1993 and 1999, and reported rates of invasive testing separately for older and younger mothers. Wellesley et al. (2002) included all women who completed their pregnancies in one of eight hospitals in Wessex, England between 1994 and 1999. While mothers of all ages were included in the sample, the results were reported as an average rate of invasive testing across the period of study and not reported separately for each age group. Smith-Bindman et al. (2003) reviewed the dominant screening policies and rate of invasive testing in England and Wales from 1989-1999.

Study interventions, comparators and outcomes

All of the studies examined the rates of invasive testing in district, state, or national community screening programmes. In eight studies the rates of invasive testing were compared before and after the implementation of a screening programme. In two cases a maternal screening programme utilising two analytes had been introduced, in two cases nuchal translucency measurement had been introduced, and in the remaining four studies a triple screen or quadruple screen had been introduced. The policy prior to the implementation of these programmes was not specified in all cases, but in the studies which did describe the previous policy it was generally based on maternal age (either ≥ 35 or ≥ 37 years), and previous family history or risk of aneuploidy. Uptake of screening varied between samples and was not reported in some cases. Five of the studies reported the rates of testing for individual years of study. One study reported the rate of testing for specific years during the period of the study because they corresponded to particular changes in policy or uptake of screening. One study reported the average rate of testing across the 6-year period of the study instead of including data for individual years.
Two of the studies compared the rates of invasive testing between maternity units with different screening policies. In one study two district hospitals were compared, one with a maternal serum screening programme (2 analytes) in place and one without. In the second, 8 district maternity units were divided into three groups of districts with similar screening policies in place. Group 1 assessed risk for DS based on two maternal serum analytes and an anomaly scan. Group 2 assessed risk based on maternal age and a second trimester anomaly scan. Group 3 included nuchal translucency measurement in their risk calculations, although the application of NT was not consistent across the three districts: it was offered to all mothers, offered on the basis of age, or mothers were given information about private nuchal translucency measurement.

One final study compared the rates of invasive testing for different areas in each year of the study (area-years) (Smith-Bindman et al. 2003). The dominant screening method in each area for each year of study was approximated using a proxy measure of the method of screening in identified DS cases.

There were variations in the way data were reported. In five studies the rates of CVS and AC were reported separately, in five studies a single rate of invasive testing (AC + CVS) was reported, and in one study the numbers of invasive tests were reported rather than the rate. Four studies reported rates of testing separately for older and younger mothers, three studies only included mothers aged 35 or older, and four studies compared the overall proportion of invasive testing across age groups. The rate of invasive testing was calculated as the number of tests performed (or serum samples analysed) divided by the total number of pregnancies, or by the total pregnancies for that age group.
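The rate calculation described above can be written out as a short sketch. The counts below are hypothetical and chosen only to show how an overall rate and age-specific rates are derived from the same data; they are not taken from any of the appraised studies, and the function name is illustrative.

```python
def invasive_testing_rate(tests, pregnancies):
    """Rate of invasive testing as a percentage of pregnancies."""
    if pregnancies <= 0:
        raise ValueError("pregnancies must be positive")
    return 100.0 * tests / pregnancies


# Hypothetical counts for one year in one region (illustration only).
counts = {
    "<35 years": {"tests": 300, "pregnancies": 9000},
    ">=35 years": {"tests": 500, "pregnancies": 1000},
}

# Age-specific rates: tests in the age group / pregnancies in the age group.
rates = {
    group: invasive_testing_rate(c["tests"], c["pregnancies"])
    for group, c in counts.items()
}

# Overall rate: all tests / all pregnancies.
overall = invasive_testing_rate(
    sum(c["tests"] for c in counts.values()),
    sum(c["pregnancies"] for c in counts.values()),
)

for group, rate in rates.items():
    print(f"{group}: {rate:.1f}%")
print(f"overall: {overall:.1f}%")
```

Because older mothers are tested far more often than younger mothers, the overall rate depends heavily on the share of pregnancies in each age group, which is why studies reporting only an overall rate are harder to compare.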
PRIMARY RESEARCH: STUDY RESULTS

Studies comparing the rate of invasive testing before and after implementing a MSS programme

Eight studies compared the rate of invasive testing before and after the implementation of population-based MSS programmes. One of the studies (Jou et al. 2005) examined the introduction of a two analyte screen, one study (Benn et al. 2005b) compared rates before and after the introduction of a three and four analyte screen, and two (Muggli and Halliday 2004; Cheffins et al. 2000) the effectiveness of a four analyte screen. One further study (Chasen et al. 2004) examined the effectiveness of using nuchal translucency measurements to detect DS.

Two studies utilised high quality sources of data, included representative samples and examined the rates of invasive testing over an acceptable period of time. Cheffins et al.’s (2000) evaluation of SAMSAS included information about the rate of invasive testing, births to women aged 35 years and older, and the uptake of a four analyte maternal serum screen in South Australia between 1986 and 1996. Overall, births to women of AMA (≥ 35 years) increased from 5.2% of total births in 1982 to 13.5% in 1996. There was an overall increase in the proportion of pregnant women undergoing invasive testing from 4.9% in 1986 to 11.4% in 1996. When these rates were examined relative to maternal age, the authors reported that the rate of invasive testing in older women had not changed significantly since the introduction of maternal screening (51.0% in 1986 to 53.8% in 1996) whereas in younger women it had increased from 1.7% to 4.8% over the same period. A comparison of indications for referrals across time showed that indications based on maternal age alone had decreased from 60.7% to 51.0% as a proportion of all tests. The proportion of tests referred on the basis of MSS results increased from 9.5% to 35.7%.
Muggli and Halliday (2004), in their evaluation of a four analyte serum screening programme introduced in Victoria in 1996, also noted a steady increase over time in the proportion of births to women of AMA (≥ 37 years), from 6.1% in 1992 to 11.2% in 2002. Prior to the introduction of MSS in 1996, invasive tests were recommended based primarily on maternal age ≥ 37 years. During this period the rates of invasive testing increased from 5.9% in 1992 to 8.2% in 1996. MSS was introduced for all women in the public health system in 1996. Overall rates of invasive testing were 8.8% in 1998 and 7.9% in 2002. Among older women, the rate of invasive testing decreased from 63.0% in 1992 to 42.7% in 2002, while the proportion of younger women undergoing an invasive procedure increased from 2.2% to 3.6% over the same period. A comparison of indications for referrals between 1992 and 2002 showed that indications based on maternal age alone decreased from 62.2% to 48.3% as a proportion of all tests. The proportion of tests recommended on the basis of abnormal screening results (MSS or USS or both) increased from 6.9% to 35.4% of all tests.

Benn et al. (2004) reported an overall decrease in the number of amniotic or chorionic villus samples processed at a cytogenetics laboratory between 1991 and 2002 when, based on the age demographics of the population, the number would have been expected to increase. In 1991 a three analyte (AFP, uE3, hCG + MA) MSS programme was introduced to screen for DS and this was expanded to include inhibin-A in 1999. There were, however, a number of limitations in this study which make generalisation of the findings to other populations difficult.
The authors acknowledged that they assumed the proportion of the population serviced by their laboratory had remained constant over the period of study, and that it was possible that changes in the number of samples processed by the lab were the result of changes in referrals to their lab, rather than due to screening policy change. They noted that the numbers of samples processed remained within 19-22% of all Connecticut births during this time. AMA as an indication for invasive testing decreased from 66% of all tests in 1991 to 45% in 2002, while tests referred on the basis of MSS results increased from 23% to 35% of all tests over the same period of time. Given that indications for referral were reclassified if there were multiple indications, it is difficult to determine the effect of individual changes in screening policy, and the authors acknowledge this limitation. Furthermore, there is no information as to whether the person who reclassified the indications was blind to the screening policy in place for each case.

A second study by Benn et al. (2005b) utilised a similar sample but over a slightly different timeframe (1991-2003) and examined both the proportion of invasive tests performed in DS screen-positive pregnancies and the proportion performed in false-positive pregnancies, over the period of the study. There was a reduction in the utilisation of amniocentesis by screen-positive women from 70% in 1991 to 27.7% in 2003. When DS affected pregnancies and false-positive pregnancies were examined separately, the authors found there was no significant change over time in the rate of AC in DS affected pregnancies (average = 73%), but in false-positives, the decline from 70% to 27% was significant. There was no information about the change in rate of AC over time either overall or for older and younger women.

Chasen et al.
(2004) examined the rates of invasive testing in women aged 35 years or older who gave birth at a New York based medical centre between 2000 and 2002. NT screening was introduced in 2000 and the authors compared the rate of AC and CVS between mothers who were screened and not screened. Overall rates of IT decreased from 70.0% to 64.7% over the three year period of the study (p<.001). A significantly greater proportion of women who underwent NT screening opted for CVS (7.1%) than of those who did not take up NT screening (1.9%, p<.001). There was no difference in rates of amniocentesis between the screened (64.1%) and unscreened groups (62.1%), or in overall rates of invasive testing between screened (66.0%) and unscreened (69.3%) groups. However, the follow-up was not long enough to provide a real indication of the change in invasive testing following the introduction of screening.

Shohat et al. (2003) investigated the rate of invasive testing following the introduction of second trimester maternal serum screening in 1990-1992, targeted ultrasound screening in 1992, and nuchal translucency measurements in 1996. In older women, the rate of invasive testing decreased from 66.5% to 48.3% between 1990 and 2000. In women <35 years of age, the rate of invasive testing increased from 5.7% to 13.8% over the same period. Overall there appears to have been an initial increase in invasive testing rates between 1990 and 1993, followed by a levelling out and slight decrease between 1994 and 2000.

Zoppi et al. (2001) compared the rates of prenatal diagnosis in two cohorts of women, both ≥ 35 years of age. In the 1995 cohort, 982 women were given non-directive counselling and offered invasive testing on the basis of AMA. In the 1999 cohort, 1386 women were given non-directive counselling and offered IT on the basis of AMA and NT results.
A higher proportion of women accepted invasive testing after screening based on maternal age alone compared to screening which included nuchal translucency measurements (70.2% vs. 66.1%). However, this difference was not significant and 297 women who presented too late for NT screening were included in the calculation of invasive testing uptake. It was possible to analyse the data excluding these women and when this was done, the rate of invasive testing in the NT group was 77.9%, significantly higher than the AMA alone group. The design of this study gives an indication of changes in the rate of testing following the implementation of a new screening policy, but a much longer follow-up period is needed to get a real sense of the impact of any change.

Jou et al. (2005) reported a substantial increase in AC rates between 1991 and 2002. During this period there were, however, two important policy changes in Taiwan that may have biased their results. The registration of live births was not compulsory in Taiwan until 1997 (Jou et al. 2005), which brings into question the quality of the birth data and, thereby, the accuracy of calculating amniocentesis rates as a proportion of total births. Secondly, Chen et al. (2005) commented on the implementation of a programme to promote prenatal diagnosis in older women in 1990 and a change in family values to one of quality rather than quantity, which may have made the use of more definitive methods of prenatal diagnosis, such as AC or CVS, more popular.

Studies comparing the rate of invasive testing in health districts with differing DS screening policies

Three studies compared the rate of invasive testing in areas or health districts with differing screening policies for DS. Smith-Bindman et al.
(2003) examined the proportion of invasive testing in relation to the dominant screening policies in different areas in England and Wales over an 11 year period (1989-1999). High quality data describing almost 6,000,000 births and 335,000 prenatal diagnostic referrals were included in the study. The screening method which led to a prenatal diagnosis in DS cases was used as a proxy measure for screening methods employed in different areas for different years (area-years). Using this inferred method, each area-year was classified on the basis of the dominant screening method at the time and the rates of invasive testing compared. The number of invasive tests per DS case detected was reported but the proportion of women undergoing invasive testing for each area-year was not. The overall rate of invasive testing increased gradually from 4.4% in 1989 to 5.8% in 1999.

Dixon et al. (2004) compared two district hospitals, one with a maternal serum screening programme (2 analytes) in place and one with MSS available by request (1.6% of women requested MSS). The overall rate of invasive testing in the district with a MSS programme was significantly higher (7.6%) than in the district with MSS available only by request (4.4%, p < 0.001). When these rates were examined as a function of age they showed that, among older mothers, the rate of invasive testing was lower in the district with routine MSS (29%) than in the MSS by request district (38%). Among mothers aged 25-34 years of age, the rate of invasive testing was higher in the district with MSS (4.8%) than the district with MSS by request (0.61%). Unfortunately, the age distribution of the two cohorts was different, with women in the district with MSS being significantly older than those in the district with no MSS.
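The age imbalance between the two districts can be explored with direct age standardisation. The sketch below is illustrative only: it is not an analysis reported by Dixon et al. or by this review. It uses the age-specific counts listed for this study in Table 17 and assumes, as an arbitrary choice, that the pooled population of both districts serves as the standard population. The function names are hypothetical.

```python
# Age-specific invasive testing (IT) counts from the Dixon et al. (2004)
# entry in Table 17: (invasive tests, pregnancies) per maternal age band.
east = {"<25": (32, 3806), "25-34": (425, 8713), ">34": (671, 2344)}   # routine MSS
west = {"<25": (24, 6875), "25-34": (87, 14305), ">34": (931, 2432)}   # MSS by request


def crude_rate(district):
    """Overall IT rate: all tests divided by all pregnancies."""
    tests = sum(t for t, _ in district.values())
    pregnancies = sum(n for _, n in district.values())
    return tests / pregnancies


def standardised_rate(district, standard):
    """Directly standardised rate: the district's age-specific rates
    applied to a shared (standard) age structure."""
    total = sum(standard.values())
    return sum((t / n) * standard[band] for band, (t, n) in district.items()) / total


# Assumption: the standard population is both districts pooled.
standard = {band: east[band][1] + west[band][1] for band in east}

for name, district in (("East", east), ("West", west)):
    print(f"{name}: crude {crude_rate(district):.1%}, "
          f"age-standardised {standardised_rate(district, standard):.1%}")
```

Under these assumptions the crude rates reproduce the reported 7.6% and 4.4%, while the standardised rates are roughly 6.7% and 5.2%. The narrower gap is consistent with the concern that part of the crude difference reflects the older maternal population in the district with routine MSS.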
Because maternal age was used in conjunction with MSS, gestational age and maternal weight to calculate DS risk, it is possible that this population had a higher uptake of invasive testing as a result of the age of the sample, rather than because of the introduction of the screening programme. Rates of testing were presented as an average over the six-year period of the study so it is difficult to determine how rates of testing changed over time as a result of the introduction of MSS.

Wellesley et al. (2002) utilised a similar design where eight district maternity units were divided into three groups based on their screening policies. Group 1 (two hospitals) assessed risk for DS based on 2 maternal serum analytes and an anomaly scan. Group 2 (three hospitals) assessed risk based on maternal age and a second trimester anomaly scan. Group 3 (three hospitals) included nuchal translucency measurement in their risk calculations, although the application of NT was not consistent across the 3 districts: it was offered to all mothers, offered on the basis of age, or mothers were given information about private nuchal translucency measurement. In the two districts offering serum screening (Group 1) the average rate of invasive testing was 6.3%. In districts offering invasive testing on the basis of MA (35+ or 37+) and a USS anomaly scan at 20 weeks (Group 2), the average rate was 5.2%. In districts offering invasive testing on the basis of MA and NT (Group 3) the average rate was 5.3%. However, there was some variation between hospitals in the same group; for instance, rates of testing in Group 3 ranged from 2.8% to 7.7%. Interestingly, the district with the highest rate of invasive testing had the oldest maternal population and the district with the lowest rate of invasive testing had the youngest maternal population.
The authors suggested that the proportion of women who accept an invasive diagnostic test increases with age and risk even in populations with serum screening in place. The rates of invasive testing are difficult to compare because there were variations between hospitals in the same group in the age at which testing was offered, the availability of services through the public health system, and the age distribution of the samples.

Conclusion

In general, the quality of appraised studies in this section of the report was not high. There were limitations regarding sample selection and the setting of the study which may have biased the findings in several of the studies or at least made them less applicable to other settings. The uptake of screening varied and many older women still opted for direct invasive testing. Factors such as the perceived accuracy of the test may affect women’s decisions regarding screening and subsequent invasive testing. Screening uptake seemed to increase the longer a test had been used, perhaps reflecting women’s confidence in the results. Unfortunately, none of the studies included in this chapter evaluated the change in rates of invasive testing following the introduction of integrated or sequential screening methods. While the performance of these methods in detecting Down syndrome cases has been examined in short-term audits of community screening programmes, they have not yet been implemented long enough to provide an indication of the long-term effects on invasive testing rates.

Of the 11 studies included in this chapter, two were designed well enough to provide a reliable indication of the impact of a screening programme on rates of invasive testing. In one study a four analyte screen (AFP, uE3, α and β subunits of hCG) was introduced (Cheffins et al. 2000) and in the other the quadruple test (Muggli and Halliday, 2005).
Both of these studies were based in Australia and used reliable, complete sources of data, followed the cohort for an appropriate length of time, and reported their findings separately for older and younger mothers. It appears that overall rates of invasive testing increased slightly after the introduction of screening against the backdrop of a steeper rise in maternal age. If referrals for prenatal diagnosis had been made on the basis of maternal age alone, the rates of invasive tests would have been expected to increase more than they did after screening was introduced. When the uptake of invasive testing was examined as a function of age, the pattern was one of a decrease in older mothers and increase in younger mothers after the introduction of screening. In areas with screening in place there was generally a decrease in rates of testing in older mothers and fewer tests were recommended on the basis of maternal age alone, thereby decreasing the number of unnecessary invasive procedures in that age group. If the proportion of older mothers accepting invasive testing declines with the introduction of a screening programme, changing patterns in maternal age are likely to be an important determinant of invasive testing rates if such a programme was to be introduced.

A limitation of this chapter was that none of the studies evaluated the effect of introducing integrated or sequential screening strategies on the rate of invasive testing. The results from long-term evaluations of population-based screening programmes utilising these strategies are not yet available.

Table 17. Evidence table of primary research studies appraised investigating the rate of invasive testing with the introduction of a screening programme

Source Sample Screening Strategies Outcomes Results Comments

Benn et al.
(2005b)
N = 109,469 maternal serum tests performed on singleton pregnancies that were referred to the authors’ cytogenetics laboratory between September 1991 and December 2003.
September 1991: Triple test introduced – maternal serum AFP, hCG, uE3 + MA
Utilisation of AC
Cytogenetic results
MSS provided to 109,469 women with singleton pregnancies between 1991 and 2003
The presence of DS or other aneuploidy identified at birth
Age distribution: ≥ 35 years at delivery = 12.9% of screened population
The proportion of births to mothers ≥ 35 years
Changes in utilization of AC: Screen positive women (all ages): 1991 = 70% AC; 2003 = 27.7% AC

Authors’ conclusions: Between 1991 and 2003 there was a marked reduction in the utilization of AC by screen positive women. This was true of women aged 35 or older and those aged < 35 years. The overall reduction in AC rates was confined to false-positive cases. In affected pregnancies, there was no significant change in the rate of AC (average 73%), but in false-positives, the decline from 70% to 27% was significant.

University of Connecticut Health Center Human Genetics Laboratories
Retrospective cohort
Level of evidence III-2
Data were collected from laboratory records regarding utilization of AC, cytogenetic results, presence of DS or other aneuploidy.
Proportion of births to AMA women collected from birth certificate data
April 1999: Inhibin-A added to serum screening
2nd T cut-off of 1:270 used to identify high risk pregnancies
January 1996: 2nd T USS used to modify age-specific or post serum screening risk
False positives (all ages): 1991 = 70%; 2003 = 27%; significant difference (p<.001)
Screen positive affected pregnancies (all ages): No significant difference in AC rate between 1991 and 2003 (average 73%)
Proportion of AMA women who received screening: 1991 = 58% of all AMA women; 2003 = 83% of all AMA women

Reviewers’ conclusions: Rates of AC uptake for screen positive affected pregnancies fluctuated between 1991 and 2003.
The authors do not present the actual uptake for this group but report that it was not a significant difference. They include a figure comparing the rate of uptake for affected and false positive pregnancies and it appears that the AC rate for unaffected pregnancies initially increased from about 50% in 1991 to about 90% in 1996, followed by an overall decline to about 70% uptake in 2003. The authors present rates of AC averaged over the period of the study and focus instead on the difference between AC rates in screen positive affected pregnancies and screened false positive pregnancies. The study does not provide information about the change in overall rate of AC over time either overall or for older and younger women. The sample demographics were not reported in enough detail to ascertain whether the participants were representative of the Connecticut pregnant population or whether the results are generalisable to other populations.

(Jou et al. 2005)
N = 1 331 616 deliveries between 1993 and 2001.
2nd T MSS for DS introduced in 1994 using 2 analytes:
AFP and hCG
AFP and free beta-hCG
+ MA
AC rates per year in women aged 35 years and older
The proportion of pregnant women aged 35 and older was 4.8% in 1993 and rose steadily to 8.3% in 2001.
Live birth prevalence of DS in the Taiwanese population
AC rate in pregnant women ≥35 years of age: 1993=25.3%; 1994=35.0%; 1995=41.3%; 1996=46.1%; 1997=53.3%; 1998=56.7%; 1999=69.2%; 2000=75.8%; 2001=70.7%

Authors’ conclusions: The live birth rate of DS fell markedly from 0.63 per 1000 live births in 1994 to 0.21 per 1000 live births in 2001.
This sudden fall in live birth rates of infants with DS is mainly due to the implementation and liberal use of the two-marker MSS test for DS during the 2nd T. We do not know how many women underwent the test per year in Taiwan but it is estimated that probably about 65-85% of pregnancies were screened. AC rates may have increased because the present policy in Taiwan still indicates routine AC in women aged 35 years or older and serum testing in other women. However, the efficacy of AC has been re-evaluated and the rate of AC decreased in 2001.

Taiwan
Retrospective cohort
Level of evidence III-2
Taiwan has an average of approximately 300,000 live births per year.
Data sources: Birth defects registration (1993-2001); Amniocentesis database (1987-2001); Demographic databases (1991-2001)
Live birth rate of isolated cleft palate (ICP) used as an internal control

Reviewers’ conclusions: There is not enough information provided in the study to ascertain whether the screening programme was responsible for the decrease in the live birth rate of DS, or how the introduction of the screening programme affected rates of AC. There are no reliable data reported about the uptake of screening. It is possible that screening was not as widely accepted as the authors postulated and that the introduction of screening merely raised awareness of DS. Mothers may have been more likely to opt for direct AC to obtain a definitive diagnosis about their child as a result of this increased awareness. We have no way of knowing whether mothers who opted for AC had previously undergone MSS. Policy changes in Taiwan during the period of the study may have affected the rates of AC. Compulsory birth registration was introduced in 1997 suggesting that the birth data may not have been accurate prior to this time, and if AC rates were calculated as a proportion of all births, this may have affected their accuracy as well.

(Benn et al. 2004)
N = 18057 records regarding all amniotic or chorionic villus samples processed by the laboratory between 1991 and 2002
1991 – high risk women identified mostly on the basis of MA (≥35 years), low maternal serum AFP, or family hx
The numbers of invasive tests ordered, indications for testing, and results of tests were compared for each year of study.
Births: 1991: Live births = 48566; AMA births = 6 082 (12.5%)
September 1991: Triple test introduced – maternal serum AFP, hCG, uE3 + MA
Trends in the use of prenatal testing, referral indications, and numbers of abnormalities were identified.

Authors’ conclusions: Against the background of an increase in the number of women aged 35 years or older at delivery, the number of invasive tests performed decreased by 50% between 1991 and 2002. The reduction of invasive tests is largely attributable to the introduction of serum screening and 2nd T ultrasonography to determine risk, instead of MA alone. However, because indications for testing were reduced to one indication per test, it is difficult to determine the relative contributions of each change in screening policy. No new cytogenetics laboratories opened in the area and the proportion of women receiving MSS DS tests by our laboratory remained within 19.1-21.3% of the total Connecticut pregnant population, suggesting that any changes in referral numbers were not because patients were referred to competing laboratories.

University of Connecticut Health Centre cytogenetics laboratory
USA
Retrospective cohort
Level of Evidence: III-2
Numbers of referrals were reported separately for DS cases and other abnormal karyotypes.
Indications for prenatal dx:
1) fetal demise
2) abnormal USS
3) abnormal serum screening results
4) family history of aneuploidy
5) maternal age of 35 years or older
6) other
Multiple indications were reduced to a single indication in the above order of priority.
January 1996: 2nd T USS used to modify age-specific or post serum screening risk
April 1999: Inhibin-A added to serum screening
2002: Live births = 41690; AMA births = 9 040 (21.7%)
Numbers of prenatal diagnoses performed each year:
1991: AC fluid samples = 1870; CVS samples = 118; Total = 1988/48566 = 4.1% (95% CI 3.9 – 4.3%) of total births
1996: AC fluid samples = 1581; CVS samples = 60
2002: AC fluid samples = 878; CVS samples = 55; Total = 933/41690 = 2.2% (95% CI 2.1 – 2.4%) of total births
Indications for referral:
AMA number (%age of referrals): 1991=1314 (66%, 95% CI 64 – 68%); 2002=423 (45%, 95% CI 42 – 49%)
MSS number (%age of referrals): 1991=455 (23%, 95% CI 21 – 25%); 2002=350 (38%, 95% CI 34 – 41%)
Abnormal USS number (%age referrals): 1991=72 (4%, 95% CI 3 - 4%); 2002=127 (14%, 95% CI 11 – 16%)

Reviewers’ conclusions: There is evidence of a decrease in the overall number of invasive tests performed between 1991 and 2002 when, based on the age demographics of the population, the expectation would have been that the number of ITs would have increased had AMA alone been used to identify high risk mothers. During this period, MSS to screen prenatally for DS risk was introduced but this study cannot be used to demonstrate a causal link between the introduction of screening and the decrease in invasive tests performed. Given that indications for referral were reclassified if there were multiple indications, it is difficult to determine the effect of individual changes in screening policy, and the authors acknowledge this limitation.
Furthermore, there is no information as to whether the person who reclassified the indications was blind to the screening policy in place for each case. It is also impossible to determine the individual screening methods used by individual practitioners in the state, which may have varied widely and may or may not have reflected the state policy on DS screening. The sample demographics were not reported in enough detail to ascertain whether the participants were representative of the Connecticut pregnant population or whether the results are generalisable to other populations. This study gives an indication of the effects on the rate of invasive testing of introducing MSS and 2nd T USS screening for DS in a population. However, the results cannot reliably be used to predict change in invasive testing rates with the introduction of similar policies in other populations.

(Chasen et al. 2004)
N = 4029 women aged 35 years or older who delivered viable infants between January 2000 and 31 December 2002
Main objective: assessing the impact of NT screening on the rate of CVS and AC in women ≥35 years old
Rates of screening and IT reported for six 6-monthly periods between January 2000 and December 2002.
Median age at delivery = 37 years
NT screening introduced April 2000: NT 11-14 weeks GA (as per FMF protocol)
High risk cut off > 1:300
Measures: NT screening rates; 2nd T multiple marker screening rates; CVS and AC rates
Rates of testing with and without NT screening:
NT group: CVS=1.9%; AC=64.1%; Overall IT=66.0%

Authors’ conclusions: There was a decline in invasive testing that coincided with the increasing use of NT screening. Women who underwent NT screening were less likely to have CVS than women who did not undergo screening. Rates of AC stayed the same. Different trends were noted in 35-39 year olds and ≥40 year olds. In 35-39 year olds, the declining rates of both CVS and AC contributed to an overall decline in invasive testing. In women aged 40 years and older, the rate of CVS declined although the rate of AC increased, with a rate of IT that did not change significantly over time.

USA
New York Weill Cornell Medical Center Hospital
Retrospective review of hospital records – USS database and antepartum records
Level of evidence III-2
Exclusion criteria:
- women who underwent abortion
- women who were cared for by physicians who did not deliver their patients at the authors’ hospital
2nd T USS 18-20 weeks GA
No NT group: CVS=7.1%; AC=62.1%; Overall IT=69.3%
Rates of testing by age group:
<40 years old: NT: AC=60.6%, CVS=1.8%; No NT: AC=61.8%, CVS=4.2%
≥40 years old: NT: AC=75.7%, CVS=2.4%; No NT: AC=63.7%, CVS=19.3%
Uptake of screening: Rates of NT screening increased from 0% to 41.2% (p<.001) over the two year period. Overall rates of CVS decreased from 7.9% to 4.4% (p<.001) over the two year period. Rates of CVS among those who did not have NT screening also decreased over the period of the study from 7.9% to 6.5% (p=.02), suggesting there was an overall trend of a decrease in CVS regardless of screening. There was no change in AC rates but there was an overall decrease in IT from 70.0% to 64.7% (p<.001).
Reviewers’ conclusions: Only women who delivered viable infants were included in the sample so the results of this study do not report the rates of IT in all women who underwent screening. The authors acknowledge this and explain that, because one of the concerns about IT rates is exposure of unaffected fetuses to IT, they felt that excluding aborted fetuses was appropriate. Only age-related demographic information was reported for the sample so it is difficult to determine how representative the sample was of the wider population. Because the sample was restricted to women ≥ 35 years of age, the findings of the study have limited generalisability. The authors acknowledged that other factors may have contributed to changes in IT rates, including better education of physicians and patients about the use of 2nd T screening, and pointed out that they could not be certain that there was a causal relationship between NT screening and changes in IT rates. In addition, the authors reported a decrease in CVS rates over the period of the study for those who did not undergo NT screening, suggesting an external factor may have been at least partially responsible for the decrease.

(Dixon et al. 2004)
N = 38,475 registered pregnancies in two district hospitals between 1993 and 1999
Comparison of the impact of different DS antenatal screening policies on detection and AC rates.
IT rates for each district and maternal age group
Women who registered their pregnancies in West Gloucestershire were significantly younger than women in East Gloucestershire.
Authors’ conclusions: Our findings show that in Gloucestershire between 1993 and 1999, the double test identified a greater proportion of pregnancies affected by DS than a programme based on age-based AC and 20 week USS. The acceptance rate of the double test suffered a steady decline from 67% in 1993-4 to 47% in 1998-1999. The highest acceptance rate was 66% in the 25-34 year age group.

Gloucestershire, England
Retrospective cohort – review of records (1993-1999)
Two district hospitals in Gloucestershire with different prenatal DS screening policies.
Level of evidence III-2
District 1 – East Gloucestershire N=14,863: <25 years = 25%; 25-34 years = 59%; >34 years = 16%
District 2 – West Gloucestershire N=23,612: <25 years = 29%; 25-34 years = 61%; >34 years = 10%
Data sources: Regional cytogenetic laboratory records, Gloucestershire community pediatric service register, medical records
East Gloucestershire: May 1993-1999 offered an opt-in MSS programme to women ≥ 25 years. No NT available but small minority able to request NT privately. MSS between 15-19/40 based on two analytes (double test): AFP, fβhCG. MSS + MA + GA + maternal weight used to assign risk. MSS risk cut-off May 1993 to Jan 1994 was 1:250. Jan 1994 increased to 1:200. Women with a +ve MSS result, history of aneuploidy or >34 years MA were offered AC or CVS.
West Gloucestershire: No MSS unless requested by woman or her doctor. Of the pregnant population in West Gloucestershire, 1.6% had a double test.
Both districts offered AC if DS-related anomalies were found by a mid-gestation scan.
DS cases for each district and maternal age group
Fetal losses after IT
Uptake of IT following a positive MSS result
Rate of invasive testing: The overall rate of IT in West Gloucestershire (no MSS) was significantly lower than East Gloucestershire (MSS) (p<.001).
East Gloucestershire = 7.6% (95% CI 6.6-8.6%)
West Gloucestershire = 4.4% (95% CI 3.0-5.9%)
Rates of IT for each age group:
East Gloucestershire: <25 = 32/3806 = 0.8% (95% CI 0.6 – 1.1%); 25-34 = 425/8713 = 4.9% (95% CI 4.4 – 5.3%); >34 = 671/2344 = 28.6% (95% CI 26.8 – 30.5%)
West Gloucestershire: <25 = 24/6875 = 0.3% (95% CI 0.2 – 0.5%); 25-34 = 87/14305 = 0.61% (95% CI 0.5 – 0.7%); >34 = 931/2432 = 38.3% (95% CI 36.3 – 40.2%)
Rate of acceptance of screening:
East Gloucestershire (opt in MSS): overall = 51%; <25 years = 16%; 25 – 34 = 66%; >34 = 53%
West Gloucestershire (by request): overall = 1.6%
Invasive testing acceptance rate following a positive MSS result: <25 = 87%; 25-34 = 93%; >34 = 86%

Maternal age was the principal indication for IT in West G. whereas a positive MSS result in women aged 25-34 was the main indication and cause of the significantly higher IT rate in East G. The rate of IT in East G. was significantly higher than that in West G.

Reviewers’ conclusions: The higher rate of IT in East G. (MSS) seems to be driven by the 25-34 year age group, who had the highest uptake of MSS and among whom positive MSS results were the main indication for IT. In women aged > 34, positive MSS results and MA were equally likely indications for IT. It appeared that AMA women may be using the screen programme to avoid AC. Retrospective so relying on accurate records.

(Muggli and Halliday 2004)
Victoria, Australia
All women who received prenatal dx before 25 weeks gestation in Victoria, Australia between 1992 and 2002.
Retrospective analysis of statewide datasets
Data were collected from the following sources:
- Public Health Genetics Unit, which collects data on indications for every CVS and AC procedure in Victoria.
- Perinatal Data Collection Unit, which collects data on every birth on or after 20 weeks gestation from all maternity hospitals and home births.
Aim: To describe patterns of uptake of prenatal diagnostic testing and prenatal DRs for DS in Victoria with regard to maternal age and prenatal screening practices.
Total number of prenatal diagnostic tests: Steady increase from 1992 – 1998 followed by an 8% decline from 2000 to 2002. The proportion of CVS was about 40% over this period. Total confinements were about 60000 per year from 1992 – 2002.
Level of evidence III-2
2nd T MSS screening became available for all women in 1997 using a four analyte screen: AFP, hCG, inhibin-A, uE3 + MA
1st T combined serum and NT was introduced in the private sector only in 2000.

Authors’ conclusions: Although the proportion of older mothers (≥ 37 years) has doubled in the past 10 years and in 2002 represented 11% of all women giving birth, there has been a steady decrease in the utilization of IT by older women. At the same time an increasing number of younger women are being referred for invasive testing following an increased risk result of a screening test.
Comparisons focused on the following years:
1992 – prenatal dx primarily on basis of MA
1996 – introduction of 2nd T MSS for all women in the public sector
1998 – highest uptake of prenatal dx
2002 – most current data
Two age groups compared:
37 years and over at expected date of delivery
36 years and younger at expected date of delivery
Proportion of older women (≥ 37 years) giving birth: 1992=6.1%; 1996=8.0%; 1998=9.1%; 2002=11.2%
Overall rate of IT (CVS or AC) (%, 95% CI): 1992=5.9% (5.8 – 6.1%); 1996=8.2% (7.9 – 8.4%); 1998=8.8% (8.6 – 9.1%); 2002=7.9% (7.7 – 8.1%)
Rate of IT for women 37 years or older: 1992=63.0% (61.5 – 64.5%); 1996=65.1% (63.7 – 66.4%); 1998=58.5% (57.2 – 59.8%); 2002=42.7% (41.5 – 43.8%)
Rate of IT for women 36 years or younger: 1992=2.2% (2.1 – 2.3%); 1996=3.3% (3.1 – 3.4%); 1998=3.8% (3.7 – 4.0%); 2002=3.6% (3.4 – 3.7%)
Indications for invasive testing: AMA alone decreased from 95.7% to 80.4% in older women, and from 62.2% to 48.3% as a proportion of all tests. The proportion of diagnostic tests carried out in younger women with no increased screening risk decreased from 62.2% to 25.6%.

Reviewers’ conclusions: High percentage of recruitment to the study. Information about all births, IT rates in the state. Applicable to a NZ setting. Exact uptake of MSS was not reported but was mentioned in the discussion as being less than 50% over the period of the study, and approximately 46% in 2002.

Results (continued): As a proportion of all diagnostic tests it decreased from 21.6% to 10.2%.
Overall increase from 6.9% to 35.4% of all tests being prompted by an abnormal screening result (either 2nd T MSS or USS or both) between 1992 and 2002.

(Wellesley et al. 2002)
N = 155 501 women who completed their pregnancies between January 1994 and December 1999.
Seven different screening policies were used across the eight districts. These could be divided into 3 principal groups:
Wessex antenatally detected anomalies register
Invasive procedure rates for each district ranged from 2.8% to 7.7% from 1994 to 1999.
Group 1: Serum screening offered to all mothers: AFP, fβhCG; 1:250/1:300 cut-off = IT offered; routine scan at 16 or 12 weeks; anomaly scan at 20 weeks in one district
For 1998 only, the postcodes of women who delivered in each district were checked to make sure they were allocated to the correct district.

Authors’ conclusions: The authors found no evidence that serum and NT screening improves the DRs or reduces rates of invasive procedures. Across the region 15% of pregnant women were aged ≥ 35 years compared with the 5-7% assumed by modelling studies of serum screening. Despite this high proportion the rate of IT was only 5-7%, even in districts that relied on MA screening. The authors speculate that the proportion of women who accept an offer of IT rises with increasing age and risk. In the districts offering routine MSS the uptake of screening among older women was 40%, and this has decreased progressively since it was introduced in 1993. In these districts 40% of women opted for direct AC and 20% declined any test.
Wessex, England
Retrospective six year survey (1994-1999)
Comparative audit of screening in adjacent health districts to determine whether serum screening is justified by an increase in DR or a reduction in the rate of invasive procedures
Maternity units in eight districts
7 different screening policies in the 8 districts.
Group 2: MA main indication for IT; 19-20 week anomaly scan; 2 districts – 35 years and over cut-off; 1 district – 37 years and over cut-off
Group 3: IT offered to all women 35 years and over with additional screening offered. NT was offered in one district to all women and in another to women aged ≥ 34 years or by request in younger women. A third district provided a leaflet with information about private clinics offering NT measurement and serum screening. An anomaly scan at 20 weeks was performed in all 3 districts.
One cytogenetics laboratory for the region.
The district with an average of 2.8% IT had the youngest maternal population and the district with a 7.7% rate of IT had the oldest maternal population.
IT rate per group:
Group 1 (MSS, anomaly scan in 1 district): Average = 6.3% (range 5.9-6.7%)
Group 2 (MA + anomaly scan): Average = 5.3% (range 4.2-6.8%)
Group 3 (MA + NT either to all women, those ≥ 34 years or offered privately + anomaly scan): Average = 5.2% (range 2.8-7.7%)
Rates of IT are reported as an average rate over the period of the study (1994-1999), not separately for each year.
Uptake of screening: Group 1: 1993 = 85%; 1999 = 55% in hospital A and 65% in hospital B

Reviewers’ conclusions: The rates of IT for each district are difficult to compare because even where districts had similar screening policies, there were differences in the age at which different services were offered, the availability of services through the public system, the age of the population, and the uptake of screening options.
For instance, in the two districts offering MSS, the uptake of screening was about 85% in 1993 but had dropped to 55% in district A and 65% in district B by 1999. Because the rates of IT are averaged over the 6 year period of the study, it is difficult to determine how the rates of IT changed with decreases in the uptake of screening. Age distribution of the population in each district varied, which would have affected the offer of IT (MA) and possibly uptake as well. The district with the highest proportion of AMA women, had the highest reported rate of IT and the district with the lowest proportion of AMA women had the lowest rate of IT. Authors conclude that there was no difference in rate of invasive testing between districts with serum screening and districts without. 151 Table 17. Evidence table of primary research studies appraised investigating the rate of invasive testing with the introduction of a screening programme (continued) Source Sample Screening Strategies Outcomes Results Comments (Cheffins et al. 2000) Women who had MSS between 1991 and 1996 AC and CVS offered to women age 35 and older since 1970’s. Number of births Overall proportion of pregnant women undergoing IT (%age, 95% CI): Total (AC + CVS) 1986 = 4.9% (4.6 – 5.2%) 1991 = 7.3% (6.9 – 7.6%) 1996 = 11.4% (10.9 – 11.8%) Authors Conclusions Against a background of increasing births to women aged 35 years or older there was a significant decrease in the birth rate of DS following the introduction of MSS. There was an increase from 7% to 84% of all women using a form of prenatal testing (MSS/direct AC/CVS), with approximately 7% using direct AC or CVS and 76% using MSS. MSS availability has resulted in a large proportion of women (approximately 83% of younger women and 93% of older women in 1996) having a prenatal test for DS, where previously testing was rare. 
South Australia (popn 1.48 million people, 20000 births per year) Retrospective cohort South Australian Maternal Serum Antenatal Screening Programme (SAMSAS) Level of evidence III-2 Women who had AC or CVS between 1986 and 1996 Women who had births or terminations of pregnancies with DS between 1982 and 1996 Births to women ≥ 35 years: 1982 = 5.2% 1996 = 13.5% SA prenatal screening policy introduced in 1991: AFP uE3 alpha and beta subunits of hCG + MA Offered on a state-wide basis in September 1992. MSS well-established in private sector by 1994 using Amerlex-M, a three analyte screen. Risk cut-off for both tests was 1:405 (equal to the risk of a woman aged 35 years at delivery in South Australia). At-risk women counseled and offered AC for fetal karyotyping. Uptake of screening: 1991 = 17% (pilot programme introduced) 1994 = 71% (Medicare rebate introduced) 1996 = 76% Higher in younger women (< 35) for all years. Maternal sociodemographic and obstetric characteristics Terminations performed for DS Numbers of screening tests performed and results of screening tests AC rates by age (provided by Department of Chemical Pathology, Women’s and Children’s Hospital who perform 80% of tests). Total AC and CVS rates and indications. Proportion of older women undergoing IT: 1986 = 51.0% (48.3 – 53.8%) 1991 = 52.8% (50.5 – 55.0%) 1996 = 53.8% (51.8 – 55.7%) Proportion of younger women undergoing IT: 1986 = 1.7% (1.5 – 1.9%) 1991 = 2.6% (2.3 – 2.8%) 1996 = 4.8% (4.5 – 5.1%) Indications for AC 1991 vs. 
1996: MSS increased from 9.5% to 35.7% USS increased from 4.0% to 6.5% MA decreased from 60.7% to 51.0% Anxiety (< 35 yrs) decreased from 8.9% to 2.3% Family history or past birth of birth defect decreased from 7.1% to 2.3% Proportion of women using direct AC (all reasons except high risk MSS result): 1990: 5.3% 1996: 6.5% Older women: 1990 = 42.1% 1996 = 40.1% Younger women: 1990 = 1.8% 1996 = 1.2% AC rates in screen +ve (high risk): 1491/1966=75.8% <35 years = 80.3% ≥35 years = 65.0% SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING With the introduction of MSS the proportion of women utilizing AC increased, particularly among younger women, paralleling experience at other centres. For younger women MSS offered a chance for screening where none had been offered before. For older women it provided an opportunity to use a non-invasive test which might indicate that AC is unnecessary. Counseling very important prior to MSS, especially the informed consent process. Counselling also required when results of the screening are reported. Reviewers Conclusions High percentage of recruitment to the study. Information about all births, IT rates in the state. Applicable to a NZ setting. Women who screened positive were told they had a 1 in 50 chance of carrying a fetus with DS. One clinic performed 80% of testing so high control over way testing carried out but more information about how the results were reported to mothers needed. Still a high proportion of older mothers opting for direct AC (all other indications except MSS), but it is providing an option for older mothers who would like to avoid AC. Prior to the introduction of testing 53% of older mothers had some form of prenatal screening (AC or CVS) compared to 93% after (IT or MSS). 152 Table 17. 
Evidence table of primary research studies appraised investigating the rate of invasive testing with the introduction of a screening programme (continued) Source Sample Screening Strategies Outcomes Results Comments (Zoppi et al. 2001) Two groups of women referred for prenatal diagnosis on the basis of maternal age. Group 1: AMA Acceptance of screening for the NT group Group 2: NT at 10+3 – 13+6 weeks Maternal choice of invasive testing among the two groups Group 1 (AMA alone) 690/982 = 70.2% (67.4 – 73.1) underwent IT CVS = 31% (27.6 – 34.5) AC = 69% (65.5 – 72.4) Authors conclusions Knowledge of NT could lead to a decrease in the demand for invasive diagnosis. Because abnormalities are detected in the 1st trimester, this might lead to a more frequent diagnosis by 1st T transabdominal CVS for those undergoing IT. One group presented to the clinic before NT screening had been introduced and so were referred for testing based on AMA alone. The second group presented when NT screening was being offered between 10.3 and 13.6 weeks gestation. Rate of diagnosis by CVS and AC among the two groups Prenatal diagnosis clinic, Italy Group 1 (AMA alone) N = 982 Retrospective cohort Level of evidence III-2 Group 2 (AMA and NT) N = 1386 1089 at appropriate gestational age for NT screening 1088 accepted NT screen NT offered at 10+3 – 13 +6 weeks The age range of each group was not reported. However the age range of women accepting IT in each group was: Group 1 Median age = 38 (35 -46) Group 2 Median age = 37 (35 – 48) The two groups were both given non-directive genetic counseling, which included information about maternal and fetal risks associated with AC and CVS (no PRL ratios provided in the paper but Tabor cited); techniques and results; local experiences and resources by the same group of geneticists. 
221/982 = 22.5% (19.9 – 25.1) refused IT 68 miscarried before diagnosis and 3 terminated the pregnancies Group 2 (AMA + NT) 1386 in total 1089 eligible for NT screen 1088/1089 accepted NT screen Rates of uptake were reported in the paper inclusive of the 297 women who did not have NT measured because they were outside the gestational age range. 916/1386 = 66.1% (63.6 – 68.6) underwent IT CVS = 29% (26.1 – 32.0) AC = 71% (68.0 – 73.9) 421/1386 = 30.4% (28.0 – 32.8) refused IT Rates of IT in women where NT was measured: 848/1088 = 77.9% (75.5 – 80.4%) Not possible to extract CVS and AC data separately for this group. 47 women had a spontaneous miscarriage and two terminated the pregnancies prior to diagnosis SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Reviewers conclusions Not all of group 2 had NT measured as 297 women presented too late for the screen. Data were available to calculate the rate of IT excluding those women who presented after 13+6 weeks, and this showed a higher rate of IT (77.9%) in the NT group compared to the AMA group (70.2%). The maternal and gestational ages of the two groups were not reported, although the age range of women in each group who opted for IT was provided. However, the two original groups of women may have presented at different stages in pregnancy and, as uptake of IT can vary as a function of gestational age, this may have confounded the results (as may any difference in age). The lack of information about the age of the participants was a serious limitation. Potential cultural differences in the acceptability of prenatal screening and diagnosis make the results of this study less generalisable to other settings. The design of this study gives an indication of changes in the rate of testing following the implementation of a new screening policy, but a much longer follow-up period is needed to get a real sense of the impact of any change. 153 Table 17. 
Evidence table of primary research studies appraised investigating the rate of invasive testing with the introduction of a screening programme (continued) Source Sample Screening Strategies Outcomes Results Comments Shohat et al. (2003) Prenatal screening policy between 1990 and 2000: Israel N= 831,505 live births and 169,927 invasive tests performed between 1990 and 2000 in Israel. Retrospective cohort review of nationally collated databases Births, numbers of invasive tests and indications for IT in the Jewish population in Israel from 1990-2000. Data collected for whole Israeli population but findings reported separately for Jewish and non-Jewish women. Level of evidence III-2 Data sources: Demographic variables (maternal age, ethnic origin, number of births in Israel) obtained from the Central Population Registry. %age (95% CI) Total IT rate in Israeli Jewish population: 1990 = 11.3% (11.1 – 11.5) 1992 = 19.4% (19.1 – 19.6) 1994 = 23.8% (23.5 – 24.1) 1996 = 20.7% (20.4 – 20.9) 1998 = 20.3% (20.0 – 20.6) 2000 = 19.8% (19.6 – 20.1) Authors conclusions Between 1990 and 1995 there was an increase in the number of invasive tests carried out and a significant improvement in the percentage of DS cases detected. This change in the use of invasive testing seems to be the result of the introduction of calculated DS risk based on MA and MSS. Between 1995 and 2000, despite the introduction of new methods for calculating DS risk (USS and NT), there was no significant change in the rate of invasive testing, which remains high compared to other countries. Of interest is the difference between the two age groups. While the uptake of IT among women age less than 35 years was almost the same between 1995 and 2000, the uptake in women age ≥ 35 years declined from 61.6 – 48.3%. It may be that the introduction of newer screening tests has increased anxiety in younger women and helped older women decide whether they require IT or not based on their calculated risk of DS. 
Live births data obtained from National Registry for DS livebirths Invasive testing data collected from all 14 cytogenetic laboratories in Israel and divided into 3 groups based on indications: • AMA (≥ 35 years) • < 35 years with ≥ 1:386 risk of DS as calculated by free 1st or 2nd T MSS • low risk of DS who elected to have IT. 1990-1992 2nd T triple screen introduced (free of cost): AFP hCG uE3 1992 14-16 weeks targeted USS introduced (private, cost) 1996 NT with or without 1st T MSS (PAPP-A + fßhCG (private, cost) Genetic counseling available free of cost for women with detected abnormality or high risk DS estimate Invasive testing available free of charge with following indications: ≥ 35 years ≥1:386 risk of DS (at delivery) based on age and MSS Abnormal NT result (normally >3mm) IT available at cost for women who request it Total number of prenatal tests Utilization of screening methods in 2000 DS cases detected DS live births Birth rate IT rate per age group: ≥ 35 years 1990 = 66.5% (65.4 – 67.6) 1992 = 67.6% (66.5 – 68.6) 1994 = 62.2% (61.4 – 63.0) 1996 = 57.7% (56.9 – 58.5) 1998 = 51.9% (51.1 – 52.7) 2000 = 48.3% (47.6 – 49.1) <35 years 1990 = 5.7% (5.5 – 5.8) 1992 = 14.0% (13.8 – 14.3) 1994 = 15.8% (15.5 – 16.1) 1996 = 13.4% (13.1 – 13.7) 1998 = 13.5% (13.3 – 13.8) 2000 = 13.8% (13.6 – 14.0) Utilization of screening methods (2000): 2nd T MSS = 60.9% NT = 20.3% USS (14-16 weeks) = 34.9% Uptake of screening methods in 2000 measured using a maternal survey of women in all maternity departments in Israel on 1 day. Women also questioned about IT, 2nd T marker test, 1st T MSS, USS (14-16 weeks), and NT (10-13 weeks). 540 women interviewed 70% Jewish Proportion of women aged ≥ 35 years old did not change significantly during the study period (16.2 – 17.4%) SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Reviewers conclusions Estimations of the uptake of prenatal screens were calculated by interviewing women who gave birth on a particular day in 2000. 
This does not include women who underwent screening but miscarried or terminated their pregnancies. It is likely that the uptake of different screens would have changed over the years. Without reliable data regarding the use and uptake of screening methods, it is difficult to determine how they might have affected IT rates. Without indications for younger and older women, it is difficult to determine why rates of IT have changed in each group. There is a very high number of non-recommended invasive tests performed in Israel (<1:386 risk) “most likely due to the social emphasis placed on general health, and especially on the health of children”. This suggests that cultural differences in the acceptability of invasive testing and disability may have influenced the rate of invasive testing, making the results less generalisable to other settings.

Shohat et al. (2003), Israel (continued). Retrospective cohort review of nationally collated databases. Level of evidence III-2.
Results: Utilization of IT in the total Israeli population was lower than that of the Jewish population but showed the same overall pattern.

Table 17.
Evidence table of primary research studies appraised investigating the rate of invasive testing with the introduction of a screening programme (continued) Source Sample Screening Strategies Outcomes Results Comments Smith-Bindman (2003) N= 5,980,519 pregnancies in England and Wales between 1989 and 1999 England and Wales divided into 26 geographical areas and the combination of geographic area and year (area-years) was the unit of analysis for all comparisons Total number of invasive tests Percentage (95% CI) of pregnancies undergoing IT: 1989 = 4.4% (4.3 – 4.4) 1991 = 5.0% (4.9 – 5.0) 1993 = 5.7% (5.6 – 5.7) 1995 = 6.1% (6.1 – 6.2) 1997 = 6.4% (6.4 – 6.5) 1999 = 5.8% (5.8 – 5.9) Authors conclusions The current analysis has several limitations. NEQAS estimates of the number of IT included some duplicates and some tests for other conditions but these are likely to be a very small percentage of the total. Our proxy measure assumes that the patterns of screening for diagnosed cases are a reliable reflection of referral patterns; the analysis in 2 geographic areas showed a close correlation between the 2 measures (0.7). However the categorization of area-years by their dominant screening method will tend to underestimate the differences between the screening methods because our characterization of screening patterns based only on DS cases almost certainly overestimated the use of serum and US screening. Additionally each area-year includes a mixture of screening methods which accounts for some of the variability in outcomes within each screening category and might have limited our ability to differentiate between the performance of USS and serum screening. NT and 2nd T anomaly screens combined. Lower number of ITs per case detected when serum or USS used versus AMA. 
England and Wales
Retrospective cohort review of nationally collated databases
Level of evidence III-2

Data sources: National Office of Statistics annual births, terminations and stillbirths; annual maternal age-specific population births for each health authority; National DS cytogenetic register (NDSCR); UK national external quality assessment scheme (UKNEQAS) annual audit of all cytogenetics laboratories.

Information on the dominant method of screening for each area-year and invasive testing information was linked. Each area-year was classified on the basis of the dominant screening method used for prenatal diagnosis of DS cases: 1) serum; 2) ultrasound (1st or 2nd T); 3) AMA; 4) mixed (no dominant method).

Outcomes: Rate of invasive testing per year; utilization of screening methods per area-year.

Results: Total = 5.6% (5.6 – 5.6). IT per DS case detected: Serum = 60.7 (56.6 – 64.8); USS = 52.0 (43.8 – 64.8); AMA = 88.0 (80.1 – 95.9).

A general policy of offering invasive testing on AMA alone is not justified.

Reviewers conclusions
The design of the study measures the number of ITs per case detected for area-years. The authors acknowledge that within each area-year there is variation in the offering and uptake of each method of screening. The design is therefore unable to indicate how invasive testing rates change with the introduction of screening, because the screening policy for each year was not consistent across England and Wales and the percentage of invasive tests for each area-year is not reported, just the number of ITs per case detected.

Chapter 7: Uptake of testing following screening results

The search identified 18 eligible papers examining the uptake of invasive testing following positive or negative screening results. Below is an overview of study designs and aspects of quality represented by these studies.
Full details of the 18 papers appraised, including methods, key results, limitations and conclusions, are provided in evidence Table 21 (pages 166-184).

Study settings and samples

Of the 18 papers appraised, seven were reviews of multi-centre community-based screening programmes and 11 were single centre studies set in hospitals or clinics. Of the 11 single centre studies, eight were prospective cohorts and three were retrospective reviews of medical records (Marini et al. 2002; de la Vega et al. 2002; Michailidis et al. 2001). Of the seven reviews of community-based screening programmes, three were retrospective analyses of data collected prospectively as part of the screening programme (Hadlow et al. 2005; Mueller et al. 2005; Chen et al. 2000), one was a retrospective analysis of a large perinatal survey (Khoshnood et al. 2003) and three were prospective studies which collected information from contributing hospitals or diagnostic centres (Platt et al. 2004; Wald et al. 2003a; Saltvedt et al. 2005).

All seven of the multi-centre reviews of screening programmes included women of all maternal ages, and sample sizes ranged from 8216 to 308,000 screened women, of whom 403 to 16,792 had screen positive results. One of these studies (Chen et al. 2000) did not report the number of women who were screened as the study reviewed the records of screen positive women only. Of the 11 single centre studies, four included women of all maternal ages, six examined the rates of invasive testing among women of AMA only (≥ 35 years: Lam et al. 2000; Ilgin-Ruhi et al. 2005; Vergani et al. 2002; ≥ 38 years: Dommergues et al. 2001; Marini et al. 2002) or younger women (< 38 years: Audibert et al. 2001) and one did not report the age distribution of the sample.

Study design and quality

Reviews of community-based screening programmes

Khoshnood et al.
(2003) used data from the National Perinatal Survey, which collected information on all births in a one week period in France (National Perinatal Survey, 1988). The sample was large (n=13,478); however, a number of women had missing data on either amniocentesis or serum screening (n=1152) and were excluded from analyses. Women were divided into four age groups (<30, 30-34, 35-37, ≥ 38 years) and approximately 10% were aged ≥ 35 years. Because the data were collected via a retrospective interview, they could be subject to recall bias, and this, coupled with the proportion of missing data, makes it difficult to interpret the findings.

Chen et al. (2000) utilised the databases of a cytogenetics laboratory in Connecticut, USA and retrospectively reviewed the records of almost 50,000 women who were referred to the laboratory for second trimester MSS screening (AFP, hCG, uE3). Screening was offered to all women regardless of age; the median age of the screened women was 27.5 years and 10.3% of the population were 35 years or older at delivery. Women with a high risk result (1:270 or more) were offered invasive testing (n=2879) and the uptake rates were reported as a function of risk estimate, maternal age, gestational age, ethnicity and year of testing. Unfortunately, amniocentesis information was not available for 10% of the screened sample, but these women were included in analyses and assumed not to have had invasive testing. While this study was carried out retrospectively, the data sources were robust and missing data were dealt with appropriately. Results of screening were reported back to women via their practitioners and it is likely there was variation in the information they were provided regarding the implications of their risk estimate and the risks involved with invasive testing. However, this provides a more accurate reflection of the realities of community testing programmes.
What is of more concern, and was acknowledged by the authors, is that they cannot be certain that all screen positive women were offered invasive testing. The population source, age distribution and size make the study results applicable to other settings.

Platt et al. (2004) utilised a multi-centre prospective cohort of women undergoing sequential first and second trimester screening for DS. First trimester serum markers (PAPP-A, fβhCG), NT and MA were collected, and all screen positive women (≥ 1:270 cut-off) and women aged 35 years or older were offered invasive testing. All women were offered second trimester serum screening (AFP, total hCG, uE3 + MA) and high risk women (≥ 1:270 cut-off) were offered invasive testing. Rates of invasive testing were reported for three groups of women: screen positive after first trimester screening; screen positive after second trimester screening; low risk for both screens. The age distribution of the entire sample was not reported in this paper; however, approximately 39% of women who had both first and second trimester screening were aged ≥ 35 years.

Saltvedt et al. (2005) randomly assigned women participating in the NUPP trial to receive either a scan at 12-14 weeks gestation (12 week group), which included a measure of NT, or a scan at 15-20 weeks (18 week group). Women in the 12 week group were offered prenatal diagnosis on the basis of NT and women in the 18 week group were offered prenatal diagnosis primarily on the basis of maternal age (≥ 35 years). The trial also evaluated the performance of the screening test in the detection of DS cases. Almost 40,000 women from 8 maternity hospitals were included in the trial, which was conducted over a 3 year period in Sweden. The authors’ main concern was that the screening policies in each group had not been adhered to strictly.
Approximately 9% of women in the 12 week scan group did not have NT measurements recorded and a small proportion of women in both groups requested and received second trimester maternal serum screening. However, the authors also noted that this reflected the realities of community-based screening.

Hadlow et al. (2005) conducted a retrospective audit of a first trimester community screening programme in Western Australia, including analyses of the uptake of invasive testing in screen positive women. Screening and cytogenetic records for 10,480 women who completed first trimester screening based on the combined test were examined. High risk women (≥ 1:300 risk) were offered prenatal diagnosis in consultation with their doctor. The proportion of acceptance of invasive testing was reported for screen positive and screen negative women and some results were available for different levels of estimated risk. The mean age of the sample was 30.7 years with 21% of women aged 35 years or older. There was minimal loss to follow-up because Western Australia has a stable population base and well-established state-wide data collection methods. All FMF accredited sonologists and cytogenetic laboratories in WA agreed to provide information, suggesting the data were complete. The authors acknowledged, however, that they did not have information about the number of women who were not screened because of technical difficulty, multiple pregnancy, or gestational age.

Mueller et al. (2005) reported the rates of amniocentesis in women who were screened as part of the Ontario MSS programme between 1993 and 1998. The Ontario MSS programme was a large community-based programme which screened more than 300,000 women for DS risk using the triple test (AFP, uE3, hCG) and maternal age. About half the pregnant population in Ontario opted for screening and over 21,000 women screened positive (≥ 1:385 risk at term). Mueller et al. (2005) reported that amniocentesis information was not available for 23% of screen positive women, who were excluded from analyses, potentially producing biased estimates. AC uptake in relation to age, risk estimate and the interaction between age and risk was reported for the 77% of screen positive women who did have follow-up information about amniocentesis. The study provides an indication of the uptake of amniocentesis following screening in a community-based programme, but the high proportion of women lost to follow-up is of concern and decreases the validity of the findings.

Wald et al. (2003a) prospectively collected information from 46,000 women screened for DS risk using the quadruple test (AFP, uE3, hCG, inhibin-A) and maternal age at term. This study was based in 14 hospitals in the UK between 1996 and 2001. There were 3271 women who screened positive (≥ 1:300) and uptake of amniocentesis relative to risk estimate was reported.

Single centre cohort studies

Nicolaides et al. (2005) collected information from women who attended a private prenatal assessment clinic between 1999 and 2003. More than 30,000 women were screened in the first trimester (11-13+6 weeks) for DS risk using the combined test (NT, free βhCG, PAPP-A + maternal age). The median age was 34 years with a high proportion of women (48.5%) aged 35 years or greater. Uptake of invasive testing was reported for screen-positive (risk of ≥ 1:300) and screen-negative women, as well as for each risk estimate group. While the design and description of the study are of good quality, the sample may not be representative in that it consisted of women who presented early in their pregnancy to a private clinic and their socioeconomic status may have been higher than that of the general population. High quality information was available regarding risk estimates, maternal age, and the uptake of invasive testing.

Michailidis et al.
(2001) examined the uptake of invasive testing in 8536 women who screened positive by either first trimester NT screening or second trimester maternal serum screening based on two analytes. Complete outcome data were not available for 13% of the original sample so 7447 women were included in the final analyses. NT measurements were completed at 10-14 weeks gestation and women were informed of their results and offered prenatal diagnosis if they were above the 99th centile or the screen showed structural abnormalities. There was not enough information presented in the paper to ascertain whether the women who did not have second trimester screening refused screening, were lost to follow-up, or did not continue their pregnancy into the second trimester because of spontaneous miscarriage or termination. Ilgin-Ruhi et al. (2005) conducted a prospective study of the uptake of screening and invasive testing following screening results in women who presented at the authors’ clinic for prenatal diagnosis on the basis of maternal age (35 years or older). Women were initially offered second trimester MSS (AFP, hCG, uE3) with a high risk cut-off of 1:250. Uptake of invasive testing was reported for those who declined screening, screen positive and screen negative women. The sample was relatively small and the study was based in Turkey, making it more difficult to generalise these results to other settings. A study by de la Vega et al. (2002) presents the same difficulties in sample size (n=555 screen positive women), and a lack of generalisability because of potential cultural differences in the acceptability of both screening and invasive testing. In this study, the medical records of Hispanic women who presented to a high risk prenatal clinic in Puerto Rico were retrospectively reviewed. Women who had tested screen positive for DS (> 1:250) were included in the study and were grouped based on their indication for prenatal diagnosis. 
Women with more than one risk factor were excluded from the study and the uptake of amniocentesis was reported for each group. The description of the methodology was not detailed enough to ascertain what options women were offered following screening, what invasive testing risks they were informed of, and what they were told about their individual risk for DS. The age distribution of the sample and how many women were excluded because of multiple indications for prenatal diagnosis was not reported. It was also not indicated whether grouping of women based on their indications was blind to their invasive testing decision. Spencer et al. (2000b) reported on the uptake of invasive testing in a prospective cohort of women who presented for maternity care between 10 and 13+6 weeks GA at a maternity unit in Essex, England. Women (n=4190) were given the option of first trimester screening using a combination of MSS (free ß hCG + PAPP-A) and NT. The rate of invasive testing in screen positive women (n=253) was reported overall and relative to risk estimate. Exact figures for uptake relative to risk estimate were not reported and confidence intervals were unable to be calculated. The methods section was detailed and thorough with the median age of the sample being 29 years and 6.1% of the sample aged ≥ 35 years. Vergani et al. (2002) reported the rate of amniocentesis in 1486 women who underwent genetic counselling on the basis of maternal age (≥ 35 years) at a maternity clinic in Italy. Data were collected prospectively between 1990 and 1998. Women were initially seen at 10 - 14 weeks GA and were counselled regarding their risk for DS and screening options. Women were offered an USS including NT measurement and their attitudes to amniocentesis (in favour/against) were measured. Rates of uptake were reported relative to ultrasound findings and women’s a priori preferences for/against invasive testing. 
Potential cultural differences with regard to the acceptance of both screening and invasive testing make the results of this study less generalisable to other settings.

O’Connell et al. (2000) reviewed the clinic and cytogenetic records of almost 19,000 women who booked for first trimester assessment between May 1992 and April 1997 at Hull Maternity Hospital, England. Women were screened for DS risk using the triple test (AFP, uE3, total hCG or free βhCG). Amniocentesis was available on request prior to serum testing but was not routinely offered on the basis of maternal age. Women with a high risk result (1:250 or more) were informed of their options, including invasive testing, by a consultant. Rates of invasive testing in screen positive, screen negative and unscreened women were reported. While the rest of the study was adequately reported, no information was provided about the age distribution of the sample, making it difficult to determine how representative the sample was, and how generalisable the findings might be. Standardised information was provided to women prior to testing but there was not enough detail in the study to know how women were informed of their results after testing. Free βhCG replaced total hCG in the screen from May 1994 to April 1997. There was subsequently an increase in the proportion of screen positive women and there may have been a change in the uptake of invasive testing; however, rates of acceptance were not reported separately for these two variations of the serum test.

Lam et al. (2000) conducted a prospective cohort study to examine the uptake of screening and the rate of invasive testing in women of AMA (≥ 35 years). This study was based in Hong Kong between 1997 and 1999 and followed 3,419 women who reported for prenatal diagnosis on the basis of maternal age.
The authors standardised the information presented to women regarding screening and invasive testing by including a video presentation and written information in the consultation. At the consultation women were offered serum screening (AFP + total hCG) as an alternative to direct IT and screen positive women (1:250 or greater risk) were offered amniocentesis. Amniocentesis was offered to screen negative women if requested. Women who refused any screening or prenatal diagnosis were excluded from the study. The sample was predominantly of Chinese ethnic origin (91%) and the authors found that uptake varied as a function of ethnicity, making it difficult to generalise the findings to other settings. The standardised and well-controlled nature of the screening and invasive testing information presented to women may not reflect what would normally happen in clinical practice, although it does represent an ideal situation. The number of women who refused any screening or prenatal diagnosis was not reported, making it difficult to determine what proportion of the population initially approached agreed to take part in the study.

Dommergues et al. (2001) conducted a prospective cohort study of the uptake of invasive testing in women of AMA (38-47 years) following first and second trimester screening using a combination of NT and MSS (AFP + hCG). An NT screen was performed at 10-14 weeks, MSS at 15-17 weeks and an USS at 21-23 weeks. Women who were designated to be high risk on the basis of either their NT (≥ 3 mm) or MSS (≥ 1:250) or an abnormal USS were recommended to have an amniocentesis while low risk women were given the option of no amniocentesis. MSS was performed in all women regardless of their NT result. The rates of uptake of invasive testing were reported for screen positive and screen negative women, as well as for each combination of screening results (both NT and MSS +ve; NT -ve/MSS +ve; NT +ve/MSS -ve; both NT and MSS -ve).
The sample was small (n=359) and was restricted to women of AMA in France, where the DS screening policy offers direct AC to women aged 38 years and over, a little older than the age 35 years cut-off in many other countries. However, Audibert et al. (2001) reported the results for women less than 38 years old in the same population. The mean age of this sample was 30 years with 14% aged 35 - 37 years of age. The uptake of invasive testing was reported for screen positive and screen negative women and for each combination of risk group.

Marini et al. (2002) reviewed the records of a cytogenetics laboratory to examine the screening results and subsequent decision to accept or decline invasive testing in women aged 35 years or older. Women were included in the study if they had been screened for DS risk using the 2nd trimester triple test (AFP, uE3, βhCG). A calculated risk of 1:270 or greater was considered screen positive. The rate of invasive testing in screen positive and screen negative women was reported. As this study was a retrospective review of laboratory records it relied on an assumption that the laboratory records were accurate.

Study results

Of the eighteen studies included in this part of the review, 11 were prospective cohort studies and seven were retrospective reviews of previously collected records. The studies examined the uptake of invasive testing following screening by first trimester methods (MSS + NT/NT alone, n = 4), second trimester methods (triple = 5; quad = 1; NT = 2; double = 1), or a combination of first and second trimester methods (n = 4). One study reviewed the records of women who had been screened for DS risk using a variety of methods (de la Vega et al. 2002).
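The uptake tables in this section give each proportion with a 95% confidence interval, but the report does not say how the intervals were derived. As a minimal sketch, assuming a normal-approximation (Wald) interval with z = 1.96, the Python below recomputes three figures whose counts appear in the evidence tables: 250/403 screen positive and 271/10033 screen negative women in Hadlow et al. (2005), and 11033/16792 screen positive women in Mueller et al. (2005). The function names `wald_ci` and `as_percentages` and the choice of method are assumptions made for illustration, not something stated by the authors.

```python
import math


def wald_ci(events, total, z=1.96):
    """Return (proportion, lower, upper) using the normal approximation.

    Assumption: a Wald interval, p +/- z * sqrt(p * (1 - p) / n). The report
    does not state which interval method was actually used.
    """
    if total <= 0 or not 0 <= events <= total:
        raise ValueError("need total > 0 and 0 <= events <= total")
    p = events / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to [0, 1] because a Wald interval can overshoot for extreme p.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def as_percentages(ci, digits=1):
    """Convert a (proportion, lower, upper) tuple to rounded percentages."""
    return tuple(round(100 * value, digits) for value in ci)


# Counts taken from the evidence tables in this report.
hadlow_positive = as_percentages(wald_ci(250, 403))
hadlow_negative = as_percentages(wald_ci(271, 10033))
mueller_positive = as_percentages(wald_ci(11033, 16792))

print("Hadlow, screen positive:", hadlow_positive)
print("Hadlow, screen negative:", hadlow_negative)
print("Mueller, screen positive:", mueller_positive)
```

Under this assumption the results agree with the reported 62% (57-67%), 2.7% (2.4 – 3.0%) and 65.7% (65.0 – 66.4%). The calculation needs the group size n, which is why no interval can be derived for risk subgroups where a study reported only a percentage.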
Uptake of invasive testing following 1st trimester screening Four studies examined the rates of invasive testing following results of 1st trimester screening (see Table 18). Three of these studies utilised NT measurements in combination with maternal serum markers, one utilized NT alone and all four included women of all maternal ages. One of these studies was a review of a community-based screening programme. Hadlow et al. (2005) reported a 62% uptake of invasive testing in screen positive women and a 2.7% uptake in screen negative women. The rate of invasive testing increased with an increase in estimated risk of DS from 75.8% with a risk of ≥1:100 to 88.3% with a risk of ≥ 1:50. Three further studies reported on the rates of invasive testing in hospital or clinic settings. Two studies followed singleton pregnancies and reported 78 - 82% rate of uptake in screen positive women (Nicolaides et al. 2005; Spencer et al. 2000b). Nicolaides et al. also reported that 4.6% of screen negative women opted for invasive testing and that uptake rates varied as a function of age, with a higher proportion of older women (7.4%) than younger women (2.2%) opting for invasive testing following negative screening results. Saltvedt et al. (2005) compared the uptake of invasive testing in women screened as high risk using either a 12 week scan including NT or an 18 week anomaly scan. The results indicated that overall fewer women in the 12 week scan group (8%) underwent invasive testing compared to women in the 18 week scan group (11%, p < 0.001). When the uptake of invasive testing was examined in screen positive women, the findings indicated that 73% of women with high risk NT measurements opted for invasive testing. In the 18 week scan group 52% of women ≥ 35 years opted for invasive testing. Table 18. Uptake of invasive testing following 1st trimester screening Reference Maternal Age of Sample Uptake following positive screening result (%, 95% CI’s) Hadlow et al. 
(2005) Mean = 30.7 years overall = 62 <25 - ≥ 35 years Spencer et al. (2000b) Mean = 29 years Saltvedt et al. (2005) Nicolaides et al. (2005) (57-67) Uptake following a negative screening result <1:300 = 2.7 (2.4-3.0) Overall = 4.6% (4.3-4.8) ≥ 1:100 = 75.8 ≥ 1:50 = 88.3 ≥ 1:300= 81.8 (77.1 – 86.6) Median = 30.1 years 12 week group = 73.4 (69.9 – 77.0) 18 week group = 51.7 (50.0 – 53.4) Median = 34 years 1:1 - 1:150=~85 Range = 15-49 years 1:300-1:500~20 ≥ 35 = 12.7% 1:151 to 1:300=~65 ≥ 35 = 76.2 (74.4 – 78.1) ≥ 35 = 7.4 (6.9 – 7.8) < 35 = 82.5 (79.4 – 85.6) < 35 = 2.2 (2.0 – 2.5)

Uptake of invasive testing following 2nd trimester screening

Nine studies (Table 19) reported the rates of uptake of invasive testing in women who were screened using 2nd trimester screening methods (triple = 5; quad = 1; NT = 2; double = 1). In five studies women were screened with the triple test (AFP, hCG, uE3) and rates of uptake ranged from 36 - 85% following a screen positive result, and 0.05 - 42% following screen negative results. Two of these were large reviews of screening programmes (Mueller et al. 2005; Chen et al. 2000) and found an increase in uptake of invasive testing with an increase in estimated DS risk. Both of these studies also found a higher uptake of invasive testing in women < 35 years of age compared to women of AMA.

Wald et al. (2003a) found a 60% overall uptake of invasive testing in women screened with the quadruple test (AFP, hCG, uE3, inhibin-A) and that uptake increased with increasing risk estimate.

Lam et al. (2000) reported the rates of invasive testing in an AMA population screened with the double test (AFP, total hCG), finding that 60.9% of screen positive women, 5.3% of screen negative women, and 47.1% of women who opted for no screening chose invasive prenatal diagnosis. Marini et al.
(2002) examined the records of AMA women (≥ 35 years) who underwent second trimester maternal serum screening at the authors’ cytogenetics laboratory over a 6 year period. Of screen positive women, 48% accepted the offer of invasive testing, while in the screen-negative group, 13% opted for invasive testing.

Khoshnood et al. (2003) used data from the National Perinatal Survey, which collected information on all births in a one week period in France in 1998, and found an increase in uptake of invasive testing with maternal age and education. Vergani et al. (2002) found that 36% of screen positive and 31% of screen negative women over the age of 35 years opted for invasive testing following the results of NT screening.

Of the studies which examined 2nd trimester screening methods, three (Ilgin-Ruhi et al. 2005; Lam et al. 2000; Vergani et al. 2002) were subject to possible bias in the cultural acceptability of both screening and invasive testing. These studies were set in Turkey, Hong Kong and Italy and the public acceptability of screening and prenatal diagnosis in these societies might have affected the uptake of screening and/or invasive testing following screening results, making the findings less generalisable to other settings. One study (Khoshnood et al. 2003) was subject to recall bias because the data were collected via a retrospective maternal interview and screening or invasive testing data were missing for 8% of the sample.

All three of the studies reviewing screening programmes (Mueller et al. 2005; Wald et al. 2003a; Chen et al. 2000) noted an increase in uptake of invasive testing with an increase in estimated DS risk. Two of these also examined uptake relative to maternal age and found that among screen positive women, a higher proportion of younger women than older women opted for invasive testing, although the difference between the two groups was less than 10%.

Table 19.
Uptake of invasive testing following 2nd trimester screening Reference Maternal Age of Sample Uptake following positive screening result (%, 95% CI’s) Uptake following a negative screening result (%, 95% CI’s) O’Connell et al. 2000 Mueller et al. 2005 Not included 85 0.05 (0.01 – 0.09) Mean = 34 years Range = 15-50 years Overall = 65.7 Chen et al. 2000 Wald et al. 2003a Mean = 27.5 years Range = 13-48 years All maternal ages Distribution not reported. >1:100 = 73.3 1:100-1:149 = 69.1 1:150-1:199 = 67.7 1:200-1:249 = 63.4 1:250-1:299 = 62.8 1:300-1:385 = 58.6 Overall = 52 1:1 to 1:10 = 72.2 1:11 to 1:30 = 57.6 1:31 to 1:60 = 61.4 1:61 to 1:134 = 55.1 1:135 to 1:270 = 47.7 Overall = 60 (82-88) (71.9 – 74.7) (67.1 – 71.0) (65.8 – 69.6) (61.4 – 65.4) (60.8 – 64.7) (57.1 – 60.1) (60.3 – 84.2) (49.3 – 65.8) (55.5 – 67.2) (51.7 – 58.4) (45.3 – 50.2) Relative to DS risk: >1:50 = 74.5 1:250-1:300 = 43.1 (67.3 – 81.7) (38.8 – 47.5) 60.9 (55.4 – 66.4) 5.3 (4.2– .4) 48% (44.7 – 51.4) 12.9 (11.2-14.5) Khoshnood et al. 2003 Range = 35-47 years Maternal age range: Less than 30 – 38 and older Education level: >12 years = 13.6 </=12 years = 9.5 (12.6-14.5) (8.9-10.2) Vergani et al. 2002 ≥ 35 years only Mean = 38.9 36 Ilgin-Ruhi et al. 2005 ≥ 35 years only Mean=38.2 75 Lam et al. 2000 Marini et al. 2002 Uptake following no screening (%, 95% CI’s) 1.8 (1.4-2.3) ≥ 35 years only 47.1 (45.5 – 48.8) 35 – 37 year olds with >12 years education = 51% ≥ 38 years = 81.7% 31 32.9 – 50.8 42 (68.9 – 81.6) 73.3 (60.4 – 86.3) Uptake of invasive testing following 1st and 2nd trimester screening Four studies reported the rates of uptake of invasive testing in women who were screened using a combination of 1st and 2nd trimester screening methods (Table 20). Dommergues et al. (2001) and Audibert et al. (2001) reviewed the results of women from the same screening cohort but reported on different age groups. 
Women were screened using NT in the 1st trimester followed by MSS in the 2nd trimester, with results available to women after MSS. MSS was performed in all women regardless of their NT result. The uptake of invasive testing was 59% following a positive NT or MSS result in women <38 and 81% in women ≥ 38 years of age. In both age groups the uptake of invasive testing varied depending on the source of the positive result. Younger women were more likely to opt for invasive testing after a positive NT result (71%) than after a positive MSS result (48%), while older women showed a 79% uptake with a positive MSS result and 71% with a positive NT result. In both age groups there was 100% uptake of invasive testing when both NT and MSS results were above the cut-off. Among younger screen negative women, 4.7% opted for invasive testing, whereas 53% of older screen negative women opted for invasive testing.

Michailidis et al. (2001) examined the uptake of invasive testing in 8536 women who screened positive by either first trimester NT screening or second trimester maternal serum screening based on two analytes. NT measurements were completed at 10-14 weeks gestation and women were informed of their results and offered prenatal diagnosis if they were above the 99th centile or the screen showed structural abnormalities. Of the 122 women who screened positive in the first trimester, 105 (86.1%) opted for invasive testing. Of the remaining 7325 women, 4864 (66%) were screened in the 2nd trimester for AFP and free ßhCG levels. Of the women who did undergo 2nd trimester screening, 44.5% of screen positive women opted for invasive testing, with an overall invasive testing rate in screen positive women of 53.7%. The mean maternal age in this population was 30.1 years with 21% of the sample aged 35 years and over, suggesting that it may not reflect the age distribution in the general population.
The prevalence of DS in the 2nd trimester screening sample was 0.08%, suggesting that it may have consisted mostly of women at low risk of DS, which could explain the lower uptake of invasive testing.

Platt et al. (2004) investigated the uptake of invasive testing in women who were screened in the 1st trimester using NT and maternal serum levels and in the second trimester using the triple serum test. First trimester results were disclosed to women prior to offering a choice of invasive testing, second trimester testing or no further screening or diagnosis. A small number of women were screen positive in the first and second trimesters, of whom 40.5% accepted invasive testing. Invasive testing was accepted by 57.1% of women who were screen positive in the first trimester and 44.4% of women who screened positive in the second trimester, and 5.1% of women who were negative for both screens opted for invasive testing. A higher proportion of older women (8.3%) than younger women (3.5%) in this group opted for invasive testing.

Table 20. Uptake of invasive testing following 1st and 2nd trimester screening Reference Audibert et al. (2001) Dommergues et al. (2001) Platt et al. (2004) Michailidis et al.
(2001) Maternal Age of Sample Mean = 30.1 years < 38 years Range = 38-47 years All ages included Distribution not reported Mean maternal age = 30.1 years (range 13 – 50) Uptake following positive screening result (%, 95% CI) 59 (52.1 – 65.3) Uptake following a negative screening result (%, 95% CI) NT –ve, MSS –ve= 4.7 (4.0-5.3) 81 (74 - 88) NT –ve, MSS –ve = 53 (46.8– 59.7) 1st T screen: +ve: = 57.1 -ve = 15.3 (53.7 – 60.5) (14.5 – 16.2) Negative both screens = 5.1 (4.4 – 5.8) ≥ 35 years = 8.3 < 35 years = 3.5 (6.8 – 9.8) (2.8 – 4.3) 2nd T screen: +ve both screens = 40.5 1st T –ve/2nd T +ve= 44.4 1st T screen= 86.1 (29.4 – 51.7) 2nd T screen= 44.5 (39.7 – 49.2) (39.3 – 49.4) (79.9 – 92.2) Summary of results The variability in screening methods, age of the participants, setting of the studies, method of analyses, and reporting of risk estimates utilised by the 18 appraised studies makes a comparison of the rates of uptake difficult. While it is not possible to report an accurate overall proportion of uptake for screen positive and screen negative women, there were two key trends in levels of uptake. Maternal age appears to be an influential factor in the uptake of invasive testing. Four studies included maternal age as a factor in their analyses, however, two studies reported a higher uptake of invasive testing in older women (≥ 35 years) and two a higher uptake in younger women. Some of the variability in these findings may be due to their setting as one of these studies was set in a private prenatal testing clinic (Nicolaides et al. 2005), two were large reviews of community screening programmes (Mueller et al. 2005; Chen et al. 2000) and one was based on findings from a maternal survey (Khoshnood et al. 2003). 
In the smaller single-centre studies women were provided with intensive and standardised counselling regarding the implications of their individual risk estimates and the risks involved with prenatal testing; however, this is not reflective of the realities of community-based screening.

The individual level of risk estimate appears to influence women’s decisions regarding the uptake of invasive testing. Five studies included exact risk estimates as a factor in their analyses with all five reporting that as the calculated risk estimate and, therefore, the likelihood of carrying a DS fetus increased, the proportion of women opting for invasive testing increased. Four of these studies were a combination of large multi-centre evaluations of community screening programmes with a fifth set in a single centre. In three of the studies women were screened using 2nd trimester methods and in two studies 1st trimester methods. The effect of risk estimate appears to be fairly consistent and robust. Three of these studies were able to examine the effect of risk estimate for each age group. All three of these found a higher uptake of invasive testing among younger versus older screen positive women.

Conclusions

The decision whether to opt for a potentially risky invasive diagnosis following screening results is a very complex one. It appears that both maternal age and the individual calculated risk estimate are factors in women’s decision-making but it is probable that many other factors also play a role. The perceived accuracy of the test may be influential as uptake did vary, although not predictably, depending on the method of screening used. This is probably in part related to the gestational age of the fetus. Younger women and older women may be using screening for different purposes.
Younger women may not perceive themselves to be at risk so a negative screening result is enough to reassure them and deter them from opting for an invasive test. In older mothers, who are aware of the age-related risk of Down syndrome, the screening may be used as a way of avoiding invasive testing. A higher proportion of these women decline invasive testing after receiving a positive screening result, particularly one close to the cut-off.

Social, cultural, religious and ethnic factors may also influence the acceptability of both screening and invasive testing but this again was not consistent across studies. In some cultures, dominant religious beliefs may make termination of the pregnancy unacceptable, and lessen the demand for invasive testing. In other cultures where population growth is high and families are limited in the number of children they produce, the uptake of invasive testing and prenatal diagnosis may be higher.

Because of this cultural variability it would be wise to conduct a local study of the acceptability of screening and invasive testing before implementing a large-scale screening programme. Both older and younger women and women from different ethnic backgrounds should be included in the study, and consistency in the quality of counselling regarding individual risk estimates and the implications of a positive or negative diagnosis monitored. It would be very difficult to maintain consistency across all practitioners; however, standardised information packs for both practitioners and patients might increase the quality of pre-screen and post-screen counselling.

Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening Source Sample Screening Strategies Outcomes Results Comments (Hadlow et al.
2005) N= 10480 women who completed FTS in Western Australia during the two year period of the audit 1st T screening: NT 11-14 weeks GA (FMF protocol) plus MSS based on free ßhCG and PAPP-A No risk reported on MA + MSS alone Numbers of screen +ve women (≥ 1:300 risk) Numbers of screen -ve women (<1:300 risk) %age (95% CI) Proportion of screen +ve women: 403/10436 = 3.9% (3.5-4.3%) Authors conclusions Authors comment on the stable population base so minimal loss to follow-up, small number of laboratories who all cooperated, and well-established state-wide data collection methods. Western Australia (25,000 annual births) Community-based first trimester screening programme Retrospective cohort (audit of a screening programme) Level of evidence III-2 Age distribution: Mean age = 30.7 years <25 years, 9.5% 25-29 years, 29.4% 30-34 years, 39.9% 35+ years, 21.2% Different from general population demographics Risk reported to woman and referring doctor. Prenatal dx offered in liaison with doctor if risk of 1:300 or more. Percentage in each group accepting or declining diagnostic testing Uptake of IT: Screen positive (≥ 1:300 risk) 250/403 = 62% (57-67%) Screen negative (< 1:300 risk) 271/10033 = 2.7% (2.4 – 3.0%) By risk group ≥ 1:100 = 75.8% ≥ 1:50 = 88.3% Median GA=12 weeks 3 days Full pathology records available for 10 436 women (99.6%) and full birth or pregnancy termination records were available for 10 274 women (98.4%). USS performed in multiple community sites because 1st T screening coordinated over a very wide geographical area. However the USS examinations were performed or supervised by 4 experienced and accredited sonologists. The rate of diagnostic testing (CVS/AC) in the high risk group (risk 1:300 or higher) was 62%. The rate of diagnostic testing in the low risk group was 2.7%. The likelihood of diagnostic testing increased with level of risk, with 75.8% of women with a risk of ≥ 1:100 and 88.3% of women with a risk ≥ 1:50 proceeding to testing. 
Reviewers conclusions

Risks were reported to women and their doctors and a decision regarding diagnostic testing made in liaison with the doctor. It is possible that there was variation between doctors regarding the information women were provided about the risks associated with invasive testing and what their calculated DS risk meant, and that this may have influenced the uptake of IT. The quality of the birth and invasive testing records and the community-based setting make this study a good source of information about uptake rates in clinical practice.

Study included data from all 9 FMF accredited USS practices and both the laboratories offering 1st T screening analysis in the state.

Exclusion criteria: Gestation outside 11 weeks 1 day to 13 weeks 6 days. Cases with incomplete screens or missing demographic information were excluded.

The proportion of uptake relative to risk was reported but not n for each group, so confidence intervals could not be calculated for these groups. The authors acknowledged that, because this was a retrospective audit, they could not determine the proportion of women not screened because of technical difficulty, multiple pregnancy, or wrong gestational dates. Consequently, the rates of screening uptake are not reported, although the authors did report that Western Australia has approximately 25,000 births each year.

Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Screening Strategies Outcomes Results Comments (Ilgin-Ruhi et al.
2005) N=340 women aged 35 and older, referred from other centres for IT based on AMA 2nd T MSS (3 analytes) AFP hCG UE3 1:250 risk cut-off Uptake of screening Uptake of screening: 295/340 86.8% accepted screening 45/340 (13.2%) declined MSS (these women were considered high risk for some analyses) Authors conclusions Following screening high risk women had a higher rate of AC than low risk women. In a population of women who all could have had AC based on maternal age, only 63.5% of them opted to do so. The rate of AC in women who did not have any serum screening was higher than the low risk screened women. This may mean they decided to accept AC before genetic counseling. The mean age of women who had Ac was not significantly different from those who declined AC, but the women who declined AC were further along in their pregnancies. Medical Genetics Department, Ankara University Medical School, Turkey Prospective cohort Level of evidence III-2 Mean age = 38.2 years +/-2.4 (median = 38) Women were screened at 16-20 weeks GA Mean GA = 19.2+/- 1.7 (median=19) USS – not sure which markers used in this study Sometimes only hCG and uE3. 16-20 gestational weeks Genetic counseling offered to referred women which included information on: Advice regarding AMA and DS risks How MSS and USS are used to evaluate DS risk DRs, false-positive and false-negative rates Risk cut-off levels for MSS AC risks In some cases counseling required 1-2 additional sessions for couples to come to a decision Couples asked to choose initially whether to take up USS and MSS screening. Couples then asked to decide whether to take up AC. Uptake of AC within the no MSS and MSS groups. Further analyses of AC uptake based on results of MSS and USS screen (high risk or low risk). For some analyses, no MSS and MSS groups were reported separately. 
Some couples declined MSS and these cases (n=45) were considered High Risk based on MA for analyses comparing invasive testing in high risk versus low risk groups.

Overall rate of AC (%age, 95% CI): 63.5% (58.4 – 68.6%)

Uptake of AC with screen or no screen:
No screen: Accept = 73.3% (60.4 – 86.3%); Decline = 26.7% (13.7 – 39.6%)
Screen: Accept = 62.0% (56.5 – 67.6%); Decline = 38.0% (32.4 – 43.5%)

Uptake of AC by MSS risk group:
Low risk: N=123, 44.7% (35.9 – 53.5) accept AC
High risk: N=172, 74.4% (67.9 – 80.9) accept AC
Screen negative either MSS or USS: N=117, 42% (32.9 – 50.8) accept AC
Screen positive either MSS or USS: N=178, 75% (68.9 – 81.6) accept AC

The data suggested that screening by non-invasive methods would decrease the requirement of AC over time.

Reviewers conclusions

The method section was a little unclear about the content of the MSS and USS markers. The sample was relatively small and as a result the confidence intervals are relatively wide for analyses with smaller group sizes. The study was based in Turkey and focused on women age 35 and above, restricting the generalization of these results to other settings. Women were referred from many doctors but received counseling from one source, so the information they received was consistent but possibly not a realistic reflection of what might happen in a large scale community-based screening programme.

Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Screening Strategies Outcomes Results Comments (Mueller et al. 2005) N= 16792 women who screened +ve in the Ontario MSS programme and for whom AC uptake data were available.
Total screened = 308,298 2nd T MSS based on 3 biochemical markers: AFP uE3 hCG + MA AC uptake relative to risk, ethnicity and MA 6 risk groups: >1:100 1:100-1:149 1:150-1:199 1:200-1:249 1:250-1:299 1:300-1:385 Median risk = 1:202 Risk range 1:8 to 1:385 %age, (95% CI) Overall AC uptake in screen +ve: 11033/16792 = 65.7% (65.0 – 66.4%) Caucasian = 67% Asian = 66.6% Black = 48.6% Authors conclusions The overall uptake of AC was 66%. Uptake in women <35 was higher than that for older women. The risk estimate also seemed to influence women’s decisions regarding AC. As expected, the rate of uptake declined with a decrease in risk estimate. There was a greater change in uptake relative to risk among women ≥ 35 years of age. The uptake of AC in ≥ 35 year old women was less than that of women < 35 years old at every risk level except > 1:100. Older women and younger women have different motivations for using screening. Older women seem to use screening to avoid AC if possible. Younger women do not perceive themselves at risk so use screening for reassurance. When not reassured they take the step of AC to get a definitive answer. Canada Ontario MSS programme between October 1993 and September 1998 Retrospective cohort (review of MSS programme database) Level of evidence III-2 <35 years, n=9779 ≥ 35 years, n=7013 Median MA = 34 years (range 15-50 years) Total screen +ve = 21823 AC data not available for 5031 (23%) MSS database set up to audit the MSS programme. Collected demographic information, screening results, utilization of genetic services and pregnancy outcomes. 
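The Mueller et al. (2005) entry records that AC data were unavailable for 5031 of the 21823 screen positive women, and the reviewers note that counting those women as not having had AC would lower the overall uptake from 66% to 51%. The sketch below, with a hypothetical helper name `uptake_under_missing_data`, reproduces that arithmetic from the counts in the table. The upper bound (every missing woman accepted AC) is an added illustration and is not reported in the source.

```python
def uptake_under_missing_data(accepted, with_data, missing):
    """Return (observed, lower_bound, upper_bound) uptake as proportions.

    observed    : complete cases only, as reported by the study authors
    lower_bound : assumes every woman with missing data declined AC
    upper_bound : assumes every woman with missing data accepted AC
                  (an illustrative extreme, not taken from the report)
    """
    if with_data <= 0 or missing < 0 or not 0 <= accepted <= with_data:
        raise ValueError("counts must be non-negative and consistent")
    total = with_data + missing
    observed = accepted / with_data
    lower_bound = accepted / total
    upper_bound = (accepted + missing) / total
    return observed, lower_bound, upper_bound


# Counts from the Mueller et al. (2005) evidence table entry.
observed, lower_bound, upper_bound = uptake_under_missing_data(
    accepted=11033, with_data=16792, missing=5031
)
print(f"observed {observed:.1%}, "
      f"all missing declined {lower_bound:.1%}, "
      f"all missing accepted {upper_bound:.1%}")
```

The gap between the two bounds shows how much a 23% loss to follow-up can move the headline figure when nothing is known about the women who were lost.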
Ontario prenatal care policy:
All pregnant women eligible for MSS
Women who are screen +ve for DS are offered AC
Women who are aged 35 years or older can opt for AC without screening
Screen +ve women receive genetic counseling including information about the risks of IT

Screening between 15 and 20 weeks GA
Risk cut off: 1:385 at term
All women received a specific risk figure regardless of their age and screening result

2 age groups: <35, ≥35
3 ethnic groups: Caucasian, 71.3%; Asian, 14.5%; Black, 6.8%

AC uptake relative to magnitude of change between age-specific risk and MSS risk calculated as: (MSS risk-age risk)/age risk

AC uptake relative to risk estimate:
>1:100 = 73.3% (71.9 – 74.7%)
1:100-1:149 = 69.1% (67.1 – 71.0)
1:150-1:199 = 67.7% (65.8 – 69.6%)
1:200-1:249 = 63.4% (61.4 – 65.4%)
1:250-1:299 = 62.8% (60.8 – 64.7%)
1:300-1:385 = 58.6% (57.1 – 60.1%)

Screen +ve AC uptake relative to age:
<35 = 69.8% (68.9 – 70.7%)
≥35 = 60.1% (58.9 – 61.2%)

Similar patterns of uptake for all ethnic groups with women <35 having higher uptake rates than ≥35 year olds.

AC uptake relative to age and risk:
< 35 year olds
1: 100 = 73.6% (71.6 – 75.7%)
1:101–1:200 = 73.0% (71.3 – 74.7%)
1:201–1:300 = 69.6% (67.9 – 71.3%)
1:300–1:385 = 64.1% (62.3 – 66.0%)
≥ 35 year olds
1: 100 = 73.0% (71.0 – 74.9%)
1:101–1:200 = 62.3% (60.1 – 64.4%)
1:201–1:300 = 52.4% (50.1 – 54.7%)
1:300–1:385 = 47.8% (45.1 – 50.5%)

Separating highest risk into smaller groups: >1:20, 1:20-1:49, 1:50-1:99 showed similar uptake across all risk groups and age groups.

About half the pregnant women in Ontario opted for screening during the period of the study. There were ethnic group differences in uptake of screening and AC. This screening programme is state-funded so these differences should not be related to SES.

Reviewers conclusions

Overall, women were more likely to opt for AC with a higher risk estimate.
In younger women, the difference was not significant between all risk levels. In older women there was a significant difference in uptake between all risk levels and the change in uptake was greater than that of younger women. The proportion of >35 year olds (16%) who were screened is similar to that of the general pregnant population, so the authors feel many women of AMA are opting for screening.

Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Screening Strategies Outcomes Results Comments (Mueller et al. 2005) Canada Ontario MSS programme between October 1993 and September 1998 Retrospective cohort (review of MSS programme database) Level of evidence III-2 Continued

Difference between age-risk and MSS risk: These comparisons were only included for women ≥ 35 years of age. The uptake of AC increased gradually as the MSS risk increased over the age-specific risk. When MSS risk was 20 times higher than age-specific risk, there was a slight decrease in uptake.

AC data were missing for 23% of screen +ve women who were lost to follow-up. Follow-up forms were not received for these women and they may have miscarried, terminated their pregnancy, moved or used another genetic centre. The authors excluded these women from analyses but had they been included and assumed not to have had AC, the overall uptake of AC in screen +ve women would have been 51% rather than the 66% that was reported. Without age, ethnicity or risk estimate information for the women lost to follow-up, it is difficult to determine how their exclusion may have affected the findings.

Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Interventions Outcomes Results Comments Saltvedt et al.
(2005) N=39572 pregnancies randomized to either 1st T NT or 2nd T USS by maternal age Women who agreed to participate were randomized to either: Pregnancy outcome Comparison between groups: The two groups did not differ in demographic characteristics Recruited women from an unselected population of pregnant women between March 1999 and October 2002. Eight Swedish maternity hospitals included. 12-14 weeks scan including NT. Fetal karyotyping was offered according to NT screening; an anomaly detected during any scan or increased risk based on history. Authors conclusions The 12 week scan policy was associated with many fewer invasive tests per antenatally detected DS case than the 18 weeks scan policy. The two policies were not strictly adhered to but this reflects the situation in real life. Neither pregnant women nor health personnel adhere blindly to recommended policies. There is no way of preventing people from seeking a second opinion. Almost 60% of the amniocenteses in the 12 week scan group were performed because of parental worry in the absence of medical risk factors. This probably reflects the fact that women did not trust the new NT method. Some women requested and got 2nd T MSS but it is impossible to determine how many retrospectively. Multicentre RCT NUPP trial, Sweden Level of evidence II Eligibility criteria: Ability to understand the information about the trial Gestational age at booking ≤13+2 weeks 12 week scan group: N= 19796 Mean = 30.1 years 19.3% ≥ 35 years 18 week scan group: N=19776 Mean = 30.2 years 19.4% ≥ 35 years Declined to participate: N= 10061 Mean = 29.7 19.0% ≥ 35 years Sample size calculated based on assumed prevalence of DS and number of cases needed to detect a difference between groups. 
Cut-off ≥ 1:250
Performance of each of the screening policies
Uptake of IT following a positive result for each of the screening groups
Uptake of IT following a negative result for each of the screening groups
Indication for karyotyping
15-20 weeks anomaly scan. Fetal karyotyping offered to women ≥ 35 years old, women with an anomaly detected during any scan, or increased risk based on history. 2nd T MSS was not a routine offer but was infrequently performed by request.
Randomization was performed at the USS units using internet-based software. Randomization was done block-wise and stratified for maternal age (<35 or ≥ 35 years).
%age (95% CI)
Uptake of IT:
12 week scan group (NT) Overall = 8.0% (7.7 – 8.4) Screen positive = 73.4% (69.9 – 77.0) Number of invasive tests per detected DS case = 38
18 week scan group (AMA) Overall = 10.7% (10.3 – 11.1) Screen positive = 51.7% (50.0 – 53.4) Number of invasive tests per detected DS case = 85
Indication for fetal karyotyping:
12 week scan group (NT) NT high risk = 26.9% Age ≥ 35 years = 7.3% MSS = 1.1% Anxiety = 57.8%
18 week scan group (AMA) NT high risk = 0% Age ≥ 35 years = 79.7% MSS = 3.1% Anxiety = 7.7%
Scans performed by 46 midwives with certification by the FMF. Confirmation of anomalies and counseling provided by 26 obstetricians.
Reviewers conclusions The authors acknowledged that there may have been variability in the adherence to screening policies in the two groups. This is a potential source of bias, but they are also correct in saying it reflects the realities of community-based screening where people can and do seek second opinions. It would seem from the indications for fetal karyotyping that only a small proportion (1-3%) of women had MSS but there is no way of knowing exact numbers. The two groups did not differ in any demographic characteristics and were representative of the general population.
Table 21.
Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Interventions Outcomes Results Comments (Nicolaides et al. 2005) N=30564 singleton pregnancies. All women attended the Fetal Medicine Centre between July 1999 and December 2003. 1st T combined screen NT Free ßhCG PAPP-A MA Rate of invasive testing with varying levels of risk Results of screening: 30564 women screened 2565/30564 = 8.4% screen positive Composite risk calculated based on NT + MSS results Uptake of invasive testing within older and younger women with varying degrees of risk Authors conclusions There was an exponential relationship between estimated risk for DS and uptake of IT with the rate of IT increasing as risk estimate increased. There was a marked increase in the rate of invasive testing once the risk was 1:300 or more, and this may be because women are made aware during the consultation that this is the risk cut-off point for offering IT. There was also an effect of maternal age whereby more older women than younger women opted for invasive testing when their risk level was less than 1:300. However, more than 90% of older women chose to avoid IT when their risk level was less than 1:300. Pregnant women are able to use risk information to make informed decisions about IT for DS and should be given the opportunity to do so. Private prenatal care clinic, London, England Prospective cohort Level of evidence III-2 MA range 15-49 years Median MA = 34 years ≥ 35 years = 48.5% Screened at 11 – 13+6 weeks Median GA = 12 weeks 1. Information leaflet informing women of the PRL rate of invasive testing (1%) and that a DS risk of 1:300 or more was considered high. 2. Screening carried out via USS – NT and CRL and screening of serum for levels of fßhCG and PAPPA. 3. Women were counseled with regards to their individual risk for DS and asked to decide whether to accept or decline IT. 
Proportion of women in each risk category %age (95% CI) Uptake of invasive testing: Overall 3277/30564 = 10.7% (10.4 – 11.1%) Uptake relative to age: < 35 = 5.1% (4.8 - 5.5%) ≥ 35 = 16.7% (16.1 – 17.3) Screen positive 77.6% (76.0 – 79.2%) Screen negative 4.6% (4.3-4.8%) Uptake relative to risk: 1:50 or more = 95% 1:51 to 1:100=~83% 1:101-1:150=~75% 1:151 to 1:200=~70% 1:201 to 1:250=~63% 1:251 to 1:300=~60% 1:300-1:500~20% Uptake relative to risk and age: ≥35 years 1:300 or more = 76.2% (74.4-78.1%) Less than 1:300 = 7.4% (6.9-7.8%) <35 years 1:300 or more = 82.5% (79.4-85.6%) Less than 1:300 = 2.2% (2.0-2.5%) Reviewers conclusions There was a relationship between risk estimate and AC uptake in that, as the likelihood of carrying a fetus with DS increased, the proportion of women opting for invasive testing increased. Unfortunately these proportions are estimates because the uptake and number in each risk group were not presented in tables and the figure is difficult to read. Consequently, confidence intervals could not be calculated for the uptake relative to risk estimate. All the women in this study attended one fetal medicine clinic in London. Age distribution was reported but no other demographic information, making it difficult to determine how representative these women are of the general population and how generalisable the results would be to other settings. In addition, the sample was older than that included in other similar studies with a median age of 34 years, and a high proportion of the women (48.5%) age 35 and older. It is also plausible that they might be of a higher SES than the general population given that they were visiting a private clinic for their prenatal care. In this study all women received standardized information regarding the PRL risk for invasive testing and the interpretation of their estimated DS risk. The results therefore are possibly not reflective of the reality of a community-based screening programme. 
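The 95% confidence intervals quoted alongside the uptake percentages in these tables appear consistent with a simple normal-approximation (Wald) interval computed from the raw counts. The sketch below rests on that assumption; the report does not state which method was used, and the function name wald_ci is purely illustrative. It also shows why no interval could be derived for the uptake-by-risk bands above: the calculation needs the number of women in each band, which the paper did not tabulate.

```python
import math


def wald_ci(events, total, z=1.96):
    """Return (proportion, lower, upper) as percentages rounded to 1 d.p.

    Uses the normal-approximation (Wald) interval. This is an assumption
    about how the tabulated intervals were obtained, not a documented method.
    """
    if total <= 0 or not 0 <= events <= total:
        raise ValueError("need 0 <= events <= total and total > 0")
    p = events / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    lower = max(0.0, p - half_width)
    upper = min(1.0, p + half_width)
    return tuple(round(100 * v, 1) for v in (p, lower, upper))


# Overall uptake of invasive testing reported for Nicolaides et al. (2005)
print(wald_ci(3277, 30564))
```

With the counts 3277/30564 this returns 10.7% (10.4 – 11.1%), matching the figure in the table. For small groups or proportions near 0% or 100%, the Wald interval is known to behave poorly and a Wilson or exact interval would be a more cautious choice.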
Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued)
Source Sample Screening Strategies Outcomes Results Comments
(Platt et al. 2004) N= 8216 women undergoing sequential first and second trimester screening for DS and trisomy 18
1st T screening: MA, PAPP-A, free ßhCG, NT; ≥1:270 = screen +ve
Numbers of screen positive women after 1st T screening
1st T screening: N= 8205; 813 screen +ve; 7392 screen –ve
All screen +ve and all women 35 years or older offered IT
Proportion of women screened in 2nd T using multiple marker MSS.
All women who continued their pregnancy into 2nd T offered 2nd T MSS: AFP, total hCG, uE3
Numbers of 2nd T screen positive women
Authors conclusions With the disclosure of their 1st T screen results, many women still request 2nd T MSS. Sequential screening following 1st T screening provides patients the maximal number of options to make informed decisions. Test results were disclosed with decisions regarding 2nd T screening and prenatal diagnosis left to the patient and physician. Invasive testing at <15 weeks was performed in 7.5% of screened patients, with 28% of screen +ve and 5% of screen –ve opting for IT at <15 weeks. Women aged 35 years and older seem to have undertaken screening for various reasons (information to help them decide between CVS and AC, thorough evaluation of their risk before IT). To make screening most effective it is equally important to understand patients’ preferences and to consider their perceptions of risks based on GA and willingness to use various methods.
USA 12 participating prenatal diagnostic centres Prospective cohort multicentre screening programme Level of evidence III-2 11 cases of T18 removed leaving n=8205 women in sample Age distribution of women who underwent both 1st and 2nd T screening: N=4325 38.9% aged ≥ 35 years Uptake of IT in 1st T screen +ve women Maternal age-related 2nd T risk used in MSS calculation ≥1:270 = screen +ve 1st T screening results not incorporated into 2nd T calculation Uptake of IT in women who screened +ve in 2nd T 1st T and 2nd T screening: N=4325 2098 1st T screen –ve women declined or results unavailable for 2nd T screening +ve both screens = 74 -ve 1st +ve 2nd = 374 %age (95% CI) Uptake of IT 1st T +ve and no 2nd T screen +ve: 464/813 = 57.1% (53.7 – 60.5) -ve: 1134/7392 = 15.3% (14.5 – 16.2) Overall IT at ≤ 15 weeks GA 612/8205 = 7.5% (6.9 – 8.0) +ve both screens 30/74 = 40.5% (29.4 – 51.7) 1st T –ve/ 2nd T +ve 166/374 = 44.4% (39.3 – 49.4) Screen –ve both screens 194/3771 = 5.1% (4.4 – 5.8) ≥ 35 years = 8.3% (6.8 – 9.8) < 35 years = 3.5% (2.8 – 4.3) Reviewers conclusions The uptake rates of IT after 1st T screen actually reflect all women except those who had 2nd T screening. Some of the women had missing data or did not have information available for all 2nd T analytes so were included in this group. May have had AC for other reasons – family hx, anxiety, other abnormal scan. The authors acknowledged that they could not control how results were presented to women by their practitioner or what advice they were given regarding their individual risk estimate, uptake of 2nd trimester screening or potential risks of IT. Age distribution of the original sample not reported in this paper. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING 173 Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Screening Strategies Outcomes Results Comments (Khoshnood et al. 
2003) N= 12326 women who gave birth in France in a 1-week period in 1998 (National Perinatal Survey, 1998)
French prenatal testing policy: 2nd T USS with NT; 2nd T serum screening normal care; high risk women and all women aged 38 years and older offered reimbursed AC
Reason for AC: Provided retrospectively; each woman could provide 2 reasons from serum screening, USS finding, MA, other, unknown.
Uptake of screening 8569/12326 = 69.5% screened
Authors conclusions Altogether 11.1% of women had an AC. The proportion of women who had an AC was greater for women with higher levels of education. Irrespective of screening or level of education, the proportion of women who had an AC increased with advancing maternal age. For both maternal education groups the effect of screening on AC was approximately 2 to 3 fold higher for women younger than 35 years compared with those 35 years and older. In general the odds of an AC were substantially higher for women who had screening compared with those who refused screening.
Birth records from a sample of all births in France in 1998 Retrospective cohort Level of evidence III-2
Excluded: N=1152 (8.6%) women with missing data regarding either AC or serum screening
Data collected from: 1. face-to-face interview with women after childbirth 2. medical records
Missing data: 5% interview, 1% medical records
Tested the hypotheses that: 1. as MA increases, women will be more likely to opt for IT without screening 2. as level of education increases, women will be more likely to opt for IT without screening first
MA groups (at 14 weeks GA): Less than 30; 30-34; 35-37; 38 and older
Level of education: <12 years; ≥12 years
ORs comparing proportion of AC with screening vs. no screening for each age group and level of education group. Logistic regression used to adjust ORs for MA, marital status, parity, ethnicity. Logistic regression used to examine the effects of accepting versus declining serum screening on the odds of AC.
%age (95% CI) Overall uptake of AC 1452/12326 = 11.8% (11.2 – 12.3%) Age distribution: Education level: >12 years = 13.6% (12.6-14.5) ≤12 years = 9.5% (8.9-10.2) Proportion of AC increased with MA irrespective of screening or level of education. The proportions of women who opted for AC with no screening were 51% for 35-37 year old women with >12 years education, and 81.7% for ≥38 year old women. OR of AC by maternal age and education: Maternal education ≤ 12 years <30 years = 2.7 (1.9 – 3.9) 30-34 years = 3.4 (2.0 – 5.9) 35 – 37 years = 1.2 (0.8 – 2.0) ≥38 years = 1.4 (0.7 – 2.4) Maternal education>12 years <30 years = 1.5 (1.0 – 2.4) 30-34 years = 1.5 (1.0 – 2.2) 35 – 37 years = 0.7 (0.4 – 1.1) ≥38 years = 0.7 (0.3 – 1.6) Reason for AC: 35-38+ year olds, biggest indication was MA followed by serum screening <30-35 year olds, biggest indication was serum screening followed by other or ultrasound results. Reviewers conclusions Data were collected via a retrospective interview and could be subject to recall bias. This, coupled with the proportion of missing data (8.6% of women were missing information about their serum screening or amniocentesis), makes it difficult to interpret the findings. The paper does not specify what type of information was missing about serum screening but in some it was whether serum screening was performed or not (4.1%). These women were excluded from analyses. The authors did not measure the use of CVS but this is available under same conditions as AC in France. The study assumed that all women were offered the same screening as per French prenatal testing policy. There was a slightly lower proportion of <30 year olds in >12 years education group (57.8% vs. 65.4%) and a higher proportion of 30-34 year olds in >12 years education group (30.4% vs. 24.4%). It is possible that maternal age may have been confounded with level of education and more educated women had their children later. 
The difference in AC uptake between the < 12 years and >12 years education groups may have been due to differences in maternal age. The confidence intervals for the odds of having an AC relative to accepting or refusing serum screening were wide suggesting small size for some of the groups. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING 174 Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Interventions Outcomes Results Comments (Wald et al. 2003a) Screening results 3271/46193 = 7.1% screen positive using MSS AC or CVS uptake relative to risk of DS calculated by quadruple test 6659/46193 = 14.4% screen positive based on MA alone Data collected from 14 UK hospitals as part of the routine care offered by the UK National Health Service 2nd T MSS (4 analytes): AFP uE3 hCG inhibin-A + MA at term DR False-positive rate UK 46193 pregnancies screened for DS risk at 14 UK hospitals between August 1996 and September 2001. Compared performance of quadruple test (AFP, uE3, hCG, inhibin-A), triple test (inhibin-A omitted) and double test (inhibin-A and uE3 omitted) in the same sample. Screening between 14 and 22 weeks. Performance of each screening method was compared for each woman, so each woman acted as her own control. Authors Conclusions Uptake of amniocentesis increased with increasing risk Only 43% of women with risks of 1 in 250-300 had an amniocentesis, whereas 74% did so if they had risks higher than 1 in 50. Among women who tested positive and had affected pregnancies, 87% had an amniocentesis. The influence of an anomaly scan in women with screen-positive results might have led to a lower overall uptake of amniocentesis than anticipated (60%). The uptake of serum screening could not be established because the number of women who were offered such testing was not recorded. 
Prospective cohort Included 149 twin pregnancies Risk cut-off: 1:300 or greater Uptake of AC in screen +ve women: Overall = 60% Relative to DS risk: >1:50 = 74.5% (95% CI 67.3 – 81.7%) 1:250-1:300 = 43.1% (95% CI 38.8 – 47.5%) Increase in IT with increasing risk (trend test p< 0.0001). Level of evidence III-2 SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Reviewers Conclusions The overall study design and quality of the data sources was high. The findings regarding an increased rate of AC with higher estimated risk seem robust. However, the report did not contain enough information to calculate confidence intervals for the overall rate of AC in screen positive women. There was also no information about the uptake of AC in screen negative women or women who declined screening. The age distribution of the sample was not reported in this paper. 175 Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Interventions Outcomes Results Comments (de la Vega et al. 2002) N=555 women referred for estimation of DS risk between January 1999 and December 2000. Sample divided into groups based on the indications for prenatal diagnosis: Uptake of AC was calculated for each group. Uptake of AC for the total sample = 336/555=60.5% Authors conclusions The results suggest women are influenced by the method of risk assessment more than their calculated risk estimate. Patients in the sonographic marker group chose AC more frequently than patients who were offered AC based on maternal age or serum screening results. Sonographic markers may be a better method in Hispanic populations. University Hospital high risk prenatal clinic, San Juan, Puerto Rico Hispanic population Women who had been screened and identified as having a calculated DS risk of ≥ 1:250 were included in the sample. 
Retrospective cohort review of records Level of evidence III-2
The sample was divided into four groups based on the risk factor detected. Women with more than one risk factor were excluded from analyses.
MA (35 years or older)
Abnormal serum screening (either low AFP or triple marker screening)
2nd T USS marker
Previous child with chromosomal anomaly.
Uptake of AC per risk factor: % (95% CI)
MA = 61.4% (55.8 – 67.0%)
Abnormal serum = 54.0% (47.1 – 61.0%)
USS markers = 72.9% (60.3 – 85.5%)
Family hx = 84.2% (67.8 – 100%)
Records of referred women reviewed; risk calculated; if 1:250 or more, included in study; risk factor identified.
Reviewers conclusions This study presents the same difficulties in sample size (n=555 screen positive women), and a lack of generalisability because of potential cultural differences in the acceptability of both screening and invasive testing. The confidence intervals for uptake of invasive testing for each risk factor were wide and overlapping, indicating that there was no significant difference in uptake of IT between any of the groups. In this study, the medical records of Hispanic women who presented to a high risk prenatal clinic in Puerto Rico were retrospectively reviewed. Women who had tested screen positive for DS (> 1:250) were included in the study and were grouped based on their indication for prenatal diagnosis. The description of the methodology was not detailed enough to ascertain what options women were offered following screening, what invasive testing risks they were informed of, and what they were told about their individual risk for DS. The age distribution of the sample and how many women were excluded because of multiple indications for prenatal diagnosis were not reported. It is also not indicated whether grouping of women based on their indications was blind to their invasive testing decision.
Table 21.
Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Screening Strategies Outcomes Results Comments (Vergani et al. 2002) N= 1486 women who underwent genetic counselling on the basis of MA (≥ 35 years) between January 1990 and December 1998. Women’s views regarding IT measured before USS screening was undertaken. Women’s preference for AC a priori Women’s preferences regarding AC following genetic counselling and prior to ultrasound: 103/1471 (7%) undecided 501/1368 in favour of AC 867/1368 against AC Authors conclusions Most studies looking at women’s decision-making regarding IT, involve populations who have already agreed to screening so are only relevant to these populations. Secondly, many are retrospective questionnaires or surveys and are more likely to exclude those women who are not interested in prenatal screening or diagnosis. They are also subject to recall bias. Hospital San Gerardo, Monza, Italy National Health Service funded prenatal evaluation and diagnosis Prospective cohort Level of evidence III-2 Mean MA=38.9+/-2.1 years 1471/1486 (99%) underwent genetic ultrasound 4645 deliveries for women aged 35 years and older Genetic counselling at 11.4+/5.6 weeks Ultrasound at 17.2+/-1.5 weeks 98% Caucasian Are you inclined to have an amniocentesis? Yes/No/Undecided Initial session occurred at 10-14 weeks. Age-related risks of DS discussed MA, family hx, medical hx, SES recorded. Accuracy and risks of available diagnostic tests discussed. Tests offered: 2nd T USS for structural anomaly and markers of aneuploidy (including NT) 2nd T genetic sonogram for DS AC Women’s attitudes to AC measured at end of session. 
Results of genetic sonogram SES Anamnestic Decision to have AC or not following ultrasonographic findings % (95% CI) Uptake in IT following USS screening: Normal findings: N=966 31% (28 – 34) accept 69% (66 – 72) decline Abnormal Findings: N=402 36% (32 – 41) accept 64% (59 – 68) decline AC uptake relative to a priori preference: In favour: N=501 83% (80 – 86%) accept 17% (14 – 20) decline Against: N=867 3% (2 – 5) accept 97% (95 - 98) decline Women’s preference for/against AC prior to USS was the strongest influence on their subsequent decision whether to take up AC. Generally women did not change their mind after receiving their USS results. 7% of women who were against AC prior to their USS, opted to have an AC after receiving results. Women who had an a priori preference for AC tended to be older than women who were against AC. They conclude that ‘intensely personal values’ determine uptake of AC Reviewer’s conclusions There were low uptake rates of AC in general. This could be partially due to cultural preferences against invasive testing as this was an Italian sample. It could also be due to a lack of trust in the screen which is less reliable than other methods. The authors acknowledged that the sample included only women who were referred to one genetic counselling service, about 1/3 of women who delivered at the hospital. The others would have received genetic counselling at other clinics. Representativeness of the sample is therefore an issue. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING 177 Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Screening Strategies Outcomes Results Comments (Dommergues et al. 
2001) N=359 consecutive pregnancies in women presenting for antenatal care before 14 weeks NT 10-14 weeks Numbers of women screening positive for each screening method Screening results Either positive MSS or positive NT = 130 Authors conclusions A complete obstetrical and/or neonatal follow-up was obtained in all 359 women. Results indicate that a significant number of women of AMA chose not to have an AC when noninvasive tests were reassuring. However, nearly half of the women considered low risk by NT, MSS and 2nd T sonography chose to have an AC. Conversely, not all women considered high risk by MSS elected to have AC. These discrepancies may be in part related to the way different practitioners gave the information to patients over the study period, or to differences in the perception of the patients. Although standardized written information was given, it is impossible to control the way the information sheet was commented on and to determine the influence of the counselor on the decision. AC may be offered on a selective rather than routine basis in women over 38 based upon the results of screening. 
Prospective cohort Maternity Hospital, France Level of evidence III-2 Age distribution 38 – 47 years Excluded women who did not consent to NT or MSS 2nd T MSS (2 analytes) 15 – 17 weeks: AFP hCG 2nd T USS 21-23 weeks Low risk = NT < 3mm + MSS derived risk < 1:250 + normal USS High risk = either positive NT or MSS result or abnormal USS MSS was performed in all women regardless of their NT result Negative for both MSS and NT = 229 Uptake of IT following a positive result Uptake of IT following a negative result Uptake of IT Overall uptake 227/359 = 63% (95% CI 58 – 68%) Screen positive 105/130 = 81% (95%CI 74-88%) Screen negative 122/229 = 53% (95% CI 47-60%) Uptake of IT for each screening measure NT +ve, MSS +ve = 12/12 = 100% NT+ve, MSS –ve = 5/7 = 71% NT –ve, MSS +ve = 88/111 = 79% NT –ve, MSS –ve = 122/229 = 53% Low risk women were given the option of not having an AC. High risk women were recommended to have AC A standardized information sheet was provided to all women and included information regarding the age-related risk of DS, descriptions of MSS, NT and USS methods of detecting DS, and risks associated with IT. Reviewers conclusions This study was a comparison of AC uptake and pregnancy outcomes in women who screened high or low risk for DS using MSS, 1st T NT and 2nd T USS. Standardised written information was provided to women about age-related risk of DS and risks associated with IT. The authors acknowledge that there may have been practitioner differences in the way that information was communicated and it is not clear whether women were informed of their calculated MSS risk or just whether a positive or negative result was obtained. While this means there may have been variation in the information provided to women, this is probably an accurate reflection of what would occur in a community screening programme and should not be seen as a limitation of the study. 
Information regarding the uptake of IT was obtained for all women from cytogenetic records. The sample is small and restricted to women over 38 years, making the results less generalisable to other settings.
Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued)
Source Sample Screening Strategies Outcomes Results Comments
(Michailidis et al. 2001) N= 8536 pregnant women who presented at 10-14 gestational weeks or having had an NT measurement
NT screening at 10-14 weeks (n=7447)
Proportion of women who opted for IT following an abnormal NT result
Total rate of invasive testing: 632/7447 = 8.5%
Authors conclusions The use of 1st T sonography had two effects on biochemical screening: 1. a lower than usual uptake of the test (66%), 2. taking into account the NT measurements in our counseling and providing women with a combined risk resulted in low uptake of invasive tests, especially at the risk bands 1:250 – 1:100 (30 - 42%). A limitation of the study is that we evaluated a combination of 1st and 2nd trimester screening tests, although we acted upon the abnormal results of the first test.
London, England Maternity unit of a university hospital Full outcome data available for 7447 (87%) Retrospective cohort review of records Mean maternal age = 30.1 years (range 13 – 50) ≥ 35 years = 21.1% Level of evidence III-2 Mean gestational age at NT screen = 12 + 5 weeks Screening and invasive testing data collected from hospital database Pregnancy outcome data from maternity hospital records or patients themselves Positive screen results disclosed to women (above 99th centile or structural abnormalities) All women offered 2nd T screening by midwife – counseled, standardized written information 2nd T MSS (2 analytes) AFP F ßhCG N= 4864 women who went on to 2nd T screening Proportion of women who opted for IT following abnormal 2nd T screen result NT screening N=7447 Screen +ve = 122 Rate of invasive testing 105/122 = 86.1% (95 % CI 79.9 – 92.2%) MSS screening N= 4864 Screen positive = 425 Rate of invasive testing 189/425 = 44.5% (95% CI 39.7 – 49.2) No significant difference in age distribution of this population Prevalence of DS in this group was 0.08% Risk cut-off >1:250 Ultrasound scan performed if over risk cut-off Counseling regarding results included a combined risk estimate based on 1st T NT, MA, 2nd T serum, and sonographic markers. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING Reviewers conclusions There was incomplete data for 13% of the original sample. Women who screened positive with MSS were given an USS to detect soft sonographic markers and their counseling included their combined risk (NT, MSS, USS). So the uptake in screen positive women actually reflects women who were given 2nd T MSS and USS rather than those who screened over 1:250 for MSS risk. The uptake of 2nd T screening was low, although the rates of uptake were calculated by assuming all women who screened negative using NT continued their pregnancy into the 2nd T. 
It would be useful to report the number of terminations, spontaneous miscarriages, or women lost to follow-up in the NT screen negative group to get a better sense of the uptake of 2nd T screening and the rates of invasive testing in all groups (1st T negative/ 2nd T positive and negative both screens). 179 Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued) Source Sample Interventions Outcomes Results Comments (Marini et al. 2002) N= 2456 maternal serum samples from women of AMA referred for analysis at the authors laboratory. 2nd T triple test: AFP uE3 ßhCG Result of screening: Positive screen result (indicating increased risk for either DS, ONTD or trisomy 18) Result of screening 841/2456 (34%) = positive 1615/2456 (66%) = negative Authors conclusions The majority of patients receiving negative screening results did not request AC. More than half of all patients receiving positive results declined AC. Inclusion criteria AMA cases selected from all samples analysed 1994-1999 (n= 22 532) Women screened for DS, ONTD, T18 at the same time Cytogenetics laboratory, Texas, USA Retrospective review of laboratory records over a 6 year period Level of evidence III-2 MA range 35-47.3 years GA range 15-22.9 weeks Positive screen = 1:270 or greater. Rate of invasive testing in screen positive women Rate of invasive testing in screen negative women Uptake of invasive testing % (95% CI) Screen positive AC = 404/841 = 48.0% (44.7 – 51.4%) No AC 437/841 = 52.0% (48.6 – 55.3) Screen negative AC 208/1615 = 12.9% (11.2 – 14.5) No AC = 87.1% (85.5 – 88.8) Exclusion criteria: AMA women undergoing screening for Open Neural Tube Defects using AFP only Decisions made by AMA patients regarding AC may not always correlate clinically with maternal serum screening results. Understanding the reasons for these decisions may improve service delivery to all pregnant patients. 
An understanding of the decision-making process is also necessary in order to provide women with full informed consent.

Reviewers' conclusions: This study was a retrospective review of laboratory records and thus was subject to potential bias in the quality of data regarding screening and AC results. As the authors acknowledge, clinicians vary in the way they explain the results of screening and the potential risks of IT to patients. As this was a review of records, no information was available about the contents of consultations or whether the laboratory or individual obstetricians or midwives were responsible for reporting results to women. There is also no information provided regarding whether women were told their exact risk estimate or just that they were above the cut-off.

Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued)

Source: (Audibert et al. 2001). Prospective cohort. Maternity Hospital, France. Level of evidence III-2.

Sample: N = 4165 women undergoing screening for DS by NT measurement (10-14 weeks GA) and MSS (14-18 weeks). 35 (0.8%) lost to follow-up. Final NT group = 4130. Final NT + MSS group = 3790. Inclusion criteria: MA less than 38 years (the results for MA ≥ 38 years are reported in Dommergues et al. 2001). Exclusion criteria: MA ≥ 38 years; twin pregnancy; CRL < 38 mm or > 84 mm; NT not measured or not recorded. Mean MA = 30.1 years (range 16-37 years); 14% aged 35 – 37 years.

Screening strategies: NT at 10-14 weeks. 2nd T MSS (2 analytes) at 15 – 17 weeks: AFP, hCG. 2nd T USS at 21-23 weeks. Low risk = NT < 3 mm + MSS derived risk < 1:250 + normal USS. High risk = either positive NT or MSS result or abnormal USS. MSS was performed in all women regardless of their NT result. Low risk women were given the option of not having an AC. High risk women were recommended to have AC. A standardized information sheet was provided to all women and included information regarding the age-related risk of DS, descriptions of MSS, NT and USS methods of detecting DS, and risks associated with IT.

Outcomes: Numbers of women screening positive for each screening method. Uptake of IT following a positive result. Uptake of IT following a negative result.

Results: Screening results: either positive MSS or positive NT = 213; negative for both MSS and NT = 3599. Uptake of IT: overall uptake 315/4130 = 7.6% (95% CI 6.8 – 8.4%); screen positive 125/213 = 58.6% (95% CI 52.1 – 65.3%); screen negative 169/3599 = 4.7% (95% CI 4.0 – 5.4%). Uptake of IT for each screening measure: NT +ve, MSS +ve = 4/4 = 100%; NT +ve, MSS –ve = 59/83 = 71%; NT –ve, MSS +ve = 62/130 = 48%; NT –ve, MSS –ve = 169/3599 = 4.7%.

Comments: Authors' conclusions: 7.6% of screened women had antenatal karyotyping. Surprisingly, in women at low risk for maternal age, NT measurement and MSS, 4.7% still opted for AC. Almost half of these tests were indicated by an abnormal 2nd T scan.

Reviewers' conclusions: This study was a comparison of AC uptake and pregnancy outcomes in women who screened high or low risk for DS using MSS, 1st T NT and 2nd T USS. This study included only women aged <38 years; however, the proportion of women over 35 years (14%) was not unusually small. Standardised written information was provided to women about age-related risk of DS and risks associated with IT. The authors acknowledge that there may have been practitioner differences in the way that information was communicated and it is not clear whether women were informed of their calculated MSS risk or just whether a positive or negative result was obtained.
While this means there may have been variation in the information provided to women, this is probably an accurate reflection of what would occur in a community screening programme and should not be seen as a limitation of the study.

Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued)

Source: (Chen et al. 2000). University of Connecticut Health Center maternal serum screening laboratory. Retrospective cohort review of records. Level of evidence III-2.

Sample: n = 2879 women with screen positive results who were screened between April 1993 and December 1998. Mean MA = 33.5 years. Gestational age at screening = 14.0-21.9 weeks. Mean GA = 16.6 weeks. Mean DS risk = 1:73. Screened population: n = 49772; mean age = 27.5 years (range 13-48 years); 10.3% aged 35 or more; 70.7% White, 10.3% Black, 16.2% Hispanic, 2.8% Other. Excluded: AFP serum testing only; multiple pregnancies; women with insulin-dependent diabetes. 10.3% of screen +ve women were tested through other laboratories or had no follow-up information and were assumed not to have had AC. Data collected from cytogenetic laboratory records and hospital records.

Interventions: 2nd T triple test offered to all women regardless of MA: AFP, hCG, uE3. High risk cut-off: 2nd T DS risk of 1:270 or more. Women aged 35 or older were also able to opt for direct AC. Laboratory findings reported to women via referring physicians, who received an indication of whether the test was positive or negative and also a patient-specific risk. Assumed that all screen +ve women were offered AC.

Outcomes: AC uptake (lab records). Pregnancy follow-up (from referring physician). Logistic regression model applied to identify the influence of various factors on the uptake of AC: maternal age at delivery; risk identified by screening; gestational age; maternal weight; method of determining GA; ethnicity.

Results: 2879/49772 (5.78%) women were screen +ve. Uptake of AC in screen +ve women, % (95% CI): Overall = 52.0%.
By MA: 14-19 = 41.8% (30.9 – 52.6); 20-24 = 48.0% (40.5 – 55.4); 25-29 = 52.6% (47.6 – 57.7); 30-34 = 58.9% (55.8 – 61.9); 35-39 = 48.7% (45.5 – 51.9); 40-48 = 42.5% (36.6 – 48.4); < 35 = 55.5% (53.1 – 57.9); ≥ 35 = 47.3% (44.5 – 50.1).
By risk: 1:1 to 1:10 = 72.2% (60.3 – 84.2); 1:11 to 1:30 = 57.6% (49.3 – 65.8); 1:31 to 1:60 = 61.4% (55.5 – 67.2); 1:61 to 1:134 = 55.1% (51.7 – 58.4); 1:135 to 1:270 = 47.7% (45.3 – 50.2).
By GA: 15/40 = 53.8%; 17/40 = 50.3%; 19/40 = 38.5%; 21/40 = 6.5%.
By ethnicity: White = 54.6%; Black = 41.6%; Hispanic = 41.0%.
By year: 1993 = 70.3%; 1994 = 53.4%; 1995 = 57.6%; 1996 = 53.0%; 1997 = 44.0%; 1998 = 36.4%.
Regression results: Odds ratios presented for each factor in the model.

Comments: Authors' conclusions: The greater the reported patient-specific risk, the greater the utilization of AC. Women selected AC when screening was performed earlier. Different ethnic groups showed different rates of uptake of AC. Year of testing also influenced uptake, with a trend towards declining AC between 1993 and 1998. The influence of maternal age on AC was complex. The screening test may be used by older women to avoid AC. Those who are going to have AC anyway may proceed directly to it.

Reviewers' conclusions: The authors assumed that every screen positive woman was offered AC but, as the consultation was done by referring physicians, they could not control this (acknowledged by them). Physicians were not asked whether they offered AC. There was also no control over what information the patients received regarding their risk or AC risk (acknowledged by authors). The authors also suggest that the decrease in AC over time may be because of an increase in the use of genetic sonogram to check for markers following MSS. This may have resulted in a reduction in the need for AC.

Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued)

Source: (O'Connell et al. 2000). Hull Maternity Hospital, England. Retrospective review of a screening cohort. Level of evidence III-2.

Sample: N = 18890 women who booked for first trimester assessment between May 1992 – April 1997 (total = 18890) and were subsequently screened in the 2nd T. Data collected from antenatal clinic records. IT data collected from the regional cytogenetics laboratory records.

Screening strategies: 2nd T triple test: AFP, uE3, and total hCG (May 1992 – March 1994) or fβhCG (May 1994 – April 1997). 1:250 risk cut-off. Individual counseling with a midwife prior to testing. A standardised information sheet, which included a description of the triple test and options following testing, was discussed with patients and then provided for further consideration. AC not routinely offered on the basis of MA. AC performed prior to triple test if requested. Women who tested positive via MSS were informed in person by a community midwife and then met with a consultant to discuss options.

Outcomes: Rates of IT uptake in screen positive and screen negative women. Rates of IT uptake in women who were not screened.

Results: Uptake of screening: screened = 14827/18890 = 78%; no screen = 4063/18890 = 22%. Results of screening: screen +ve = 586/14827 = 4%; screen –ve = 14241/14827 = 96%. Uptake of IT for screen +ve group: IT = 498/586 = 85% (95% CI 82 – 88%); no IT = 88/586 = 15% (95% CI 12 – 18%). Uptake of IT in screen –ve group: IT = 7/14241 = 0.05% (95% CI 0.01 – 0.09%). Uptake of IT with no screening: IT = 75/4063 = 1.8% (95% CI 1.4 – 2.3%).

Comments: Authors' conclusions: The information women are given may influence how they respond to prenatal diagnostic testing. 15% of women who tested positive did not want IT and a very small proportion of women with a negative screening result opted for AC.
Reviewers' conclusions: There is no information provided about the age distribution of the sample, which makes it difficult to determine how well they represent the general population. There was also no information about whether any screen negative women were lost to follow-up. It is possible that some of these screen negative women were lost to follow-up and sought prenatal diagnosis at another cytogenetics lab. Standardised information was presented to women prior to MSS but there is no information about the information presented to women who screened positive and negative. There may have been variation in the information they were provided regarding IT risks and the implications of their calculated DS risk, which may have influenced their decision whether to opt for IT. This information in combination with the maternal age distribution might help to explain the very low IT rate in the screen negative group. There was an increase in screen positives from 2.9% (May 1992 – April 1994) to 5.7% (May 1994 – April 1997) when free βhCG replaced total hCG in the serum screen. The reliability of the screening test may affect women’s decisions to opt for AC or not following screening, and so it would have been useful to present the rate of IT as a function of the two variations of the serum screen or as a function of year.

Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued)

Source: (Lam et al. 2000). Hong Kong. Prospective cohort. Level of evidence III-2.

Sample: N = 3419 women ≥ 35 years of age who reported for prenatal diagnosis of DS on the basis of AMA between Jan 1997 – October 1999. Women were recruited to the study and so women who declined any form of screening and diagnosis were excluded, as well as women with other high-risk factors (e.g. previous chromosomally abnormal pregnancies). IT data were collected from cytogenetic laboratory records in the participating hospitals. Age distribution: 35-36 years = 51%; 37-38 years = 27%; 39-40 years = 14%; >40 years = 7%.

Screening strategies: Women referred for prenatal diagnosis on the basis of AMA. Detailed counseling, leaflets and video presentation including information about age-related DS risk, the use and effectiveness of serum screening, and the use of and risks related to IT. Women offered serum screening (total hCG and AFP) as an alternative to direct IT. Risk cut-off ≥ 1:250. Screen positive women offered AC. Screen negative women given detailed scan at 18-20 weeks and AC performed if requested.

Outcomes: Uptake of screening by year. Uptake of IT in screen positive women. Uptake of IT in screen negative women.

Results: Uptake of screening: 1807/3419 = 52.9% (95% CI 51.2 – 54.5%) opted for screening; 1516 (AC) + 96 (CVS) = 1612/3419 = 47.1% (95% CI 45.5 – 48.8%) opted for direct IT. Uptake of screening by year: 1997 = 38.8%; 1998 = 54.6%; 1999 = 63.4%. Uptake of IT: screen positives 184/302 = 60.9% (95% CI 55.4 – 66.4%); screen negatives 80/1505 = 5.3% (95% CI 4.2 – 6.4%).

Comments: Authors' conclusions: Overall, half of women aged ≥ 35 years chose serum screening and this proportion increased with time. The data show that the screening would become more popular with time when people became more confident and familiar with the procedure. Among the screen positive cases, AC uptake was about 60%. Another study (Lam et al. 1998a) by the same authors reported on uptake in <35 year olds in the same population and found 90% of screen positive women opted for AC. There was a difference in screening uptake depending on ethnic origin. This probably represents the difference in values assigned to each pregnancy outcome by people of different ethnic origin, cultural background and religious belief.

Reviewers' conclusions: This study reports the uptake rates of IT following MSS in pregnant women aged ≥ 35 years. The sample was predominantly Chinese (91%) with a small proportion of Filipino (5%) and Caucasian (1.3%) women. Uptake rates of MSS varied according to ethnic origin with a lower rate of uptake in women of Chinese origin. This makes it difficult to generalise the findings to other populations. The information provided to women regarding the implications of serum screening and invasive testing was standardized and thorough. However, this may not reflect the realities of clinical practice and may overestimate the amount of information women would actually be provided within a community screening programme. It is difficult to determine from the methods section whether all women who booked for prenatal diagnosis in this period were recruited to the study and offered serum screening.

Table 21. Evidence table of primary research studies examining the uptake of invasive testing following DS screening (continued)

Source: (Spencer et al. 2000b). District General Hospital maternity unit, England. Prospective cohort. Level of evidence III-2.

Sample: N = 4088 women accepted screening after booking for routine antenatal care over a 1 year period (1998-1999). Included singleton pregnancies only. All women were screened between 10+3 and 13+6 weeks GA. Median age = 29 years; ≥ 35 = 12.7%; ≥ 37 = 6.1%. Screening was predominantly 1st T but some women presented too late for NT measurement and so only had 2nd T MSS (8%).

Screening strategies: 1st T combined NT + MSS screen.
1. Consultation with midwife and standardised written information about available tests and the clinic service.
2. If prenatal screening is opted for: NT and MSS (fβhCG + PAPP-A if prior to 14 weeks GA; if ≥ 14 weeks GA, AFP measured instead of PAPP-A).
3. MSS results entered into database and composite risk score produced.
4. Counselor/midwife discusses risk report with patient. Women with increased risk for DS referred for CVS or AC.
High risk cut-off: 1:300 or more.

Outcomes: Uptake of 1st T screening. Uptake of IT following screening.

Results: Uptake of screening: 4088/4190 (97.6%) opted for 1st T screening.
8% (n=348) had 2nd T screening (AFP + fβhCG). After exclusions for GA or fetal death, n = 3762 women who had 1st T screening. Screening results: ≥ 1:300 = 253/3762 = 6.7%. Uptake of invasive testing: 207/253 = 81.8% (95% CI 77.1 – 86.6); CVS = 200; AC = 7.

Comments: Authors' conclusions: The rate of acceptance of CVS was related to the DS risk reported. Women with risks of ≥ 1:100 were 3 times more likely to accept CVS than those in the risk group 1:200 to 1:300.

Reviewers' conclusions: The methods section was detailed and thorough, with the median age of the sample being 29 years, 12.7% of the sample aged ≥ 35 years and 6.1% aged ≥ 37 years. In a subsequent paper focusing on twin pregnancies from the same sample (Spencer and Nicolaides, 2003) the screening population was described as being predominantly white Caucasian and from a relatively affluent London borough. The authors reported that the rate of uptake varied depending on the level of risk estimate.
However, exact figures for uptake relative to risk estimate and the numbers in each risk group were not reported and confidence intervals could not be calculated. In this study all women received standardized information regarding the PRL risk for invasive testing and the interpretation of their estimated DS risk. The results therefore are possibly not reflective of the reality of a community-based screening programme.

Chapter 8: Invasive Testing and Procedure-Related Loss

SECONDARY RESEARCH

The search strategy identified one relevant systematic review. The methods and conclusions are described in Table 22 (page 186).

The report by the Cochrane Collaboration (Alfirevic et al. 2006) aimed to compare the safety and accuracy of second trimester amniocentesis (after 15 completed weeks of gestation), early amniocentesis (before 15 completed weeks of gestation), and transcervical and transabdominal chorionic villus sampling (CVS). The review included all 14 randomised comparisons of these methods of invasive testing, including the only RCT comparing amniocentesis with no testing (Tabor, 1986). Quasi-randomised studies were excluded and the review was graded as being of evidence level II. Outcome measures included the accuracy of the method, technical difficulties related to sampling, pregnancy complications, pregnancy outcomes, and neonatal complications.

In relation to fetal loss, the authors found that the only RCT (Tabor, 1986∗) reported a 1% increase in total pregnancy loss in the 2nd trimester AC group compared to no testing (3.2% versus 2.2%, not significant), and a 0.8% greater risk of spontaneous miscarriage (2.1% versus 1.3%, relative risk (RR) 1.6, 95% CI 1.02 to 2.52). This was in a low risk population. One study compared early versus 2nd trimester amniocentesis∗ (CEMAT, 1998) and found a 1.7% higher risk in total pregnancy loss with early amniocentesis (7.6% versus 5.9%).
Four trials compared transcervical (TC) or transabdominal (TA) CVS with 2nd trimester AC (MRC, 1991; Canada, 1992; Denmark, 1992; Borrell, 1999)∗, finding total pregnancy loss to be 3.5% higher following TC CVS than 2nd trimester amniocentesis (14.5% vs. 11%) but no difference between TA CVS and AC.

Conclusions
The quality of this review was high in terms of both the studies included and the evaluation of included studies. The reviewers concluded that the finding of a 1% increased risk of fetal loss (Tabor, 1986) is still the best estimate of procedure-related loss in low-risk women. The recommendations of the review were that prenatal diagnosis is safest performed in the second trimester by amniocentesis, which is safer than both transcervical CVS and first trimester amniocentesis. Transabdominal CVS is the safest method for first trimester prenatal diagnosis, followed by transcervical CVS.

∗ As cited by Alfirevic et al. (2006)

Table 22. Evidence table of secondary research studies appraised investigating the rate of fetal loss following invasive prenatal diagnostic procedures

Source: (Alfirevic et al. 2006). Cochrane Collaboration. Cochrane Database of Systematic Reviews. Level of evidence II.

Search method: Search of the Cochrane Pregnancy and Childbirth Group trials register (March 2003), which contains trials identified from: the Cochrane Central Register of Controlled Trials (CENTRAL); monthly searches of Medline; handsearches of 30 other journals and conference proceedings; weekly current awareness search of 37 journals. Cochrane Central Register of Controlled Trials (The Cochrane Library, Issue 1, 2002) searched using the terms ‘amniocentesis*ME’, ‘amniocentes*’, ‘chorionic-villisampling*ME’, and ‘chorion*vill*’. Databases searched: Cochrane library, Embase, Medline, PsycLIT. Also searched reference lists of identified studies, and contacted experts in the field to identify further references.

Selection criteria: All randomised comparisons of late amniocentesis, early amniocentesis and chorionic villus sampling (either transabdominally or transvaginally) with each other or with no testing have been included. Quasi-randomised studies were excluded. All trials were assessed for methodological quality using the criteria in the Cochrane Handbook (Clarke 2000) but there were no other planned exclusions based on quality.
Types of intervention: 2nd T amniocentesis (after 15 completed weeks of gestation); early amniocentesis (before 15 completed weeks of gestation, i.e.
≤ 14 weeks 6 days); transabdominal, transcervical, or transvaginal CVS.
Outcome measures included the accuracy of the method, technical difficulties related to sampling, pregnancy complications, pregnancy outcomes, and neonatal complications.
Data extracted and analysed on an ‘intention to treat’ basis. Weighted estimate of relative risk calculated for each outcome. Chi-square test of heterogeneity used to determine whether a fixed effects (no heterogeneity) or random effects (unexplained heterogeneity) model was used to pool results.

Results: 14 studies appraised:
Second trimester amniocentesis versus no testing (Tabor, 1986)
Early versus 2nd trimester amniocentesis (CEMAT, 1998)
CVS versus 2nd trimester amniocentesis (MRC, 1991; Canada, 1992; Denmark, 1992; Borrell, 1999)
Transabdominal versus transcervical CVS (USNICHD, 1992; Brambati, 1991; Bovicelli, 1986; Tomassini, 1988)
Early amniocentesis versus CVS (Copenhagen, 1997; Leiden, 1998; Kings, 1996)
Ultrasound assisted amniocentesis (Nolan, 1981)
Fetal loss: The only RCT (Tabor, 1986) reported a 1% increase in total pregnancy loss in the 2nd T AC group compared to no testing (3.2% versus 2.2%, not significant), and a 0.8% greater risk of spontaneous miscarriage (2.1% versus 1.3%, RR 1.6, 95% CI 1.02 to 2.52). This was in a low risk population. One study compared early versus 2nd trimester amniocentesis (CEMAT, 1998) and found a 1.7% higher risk in total pregnancy loss with early amniocentesis (7.6% versus 5.9%). Four trials compared transcervical (TC) or transabdominal (TA) CVS with 2nd T AC (MRC, 1991; Canada, 1992; Denmark, 1992; Borrell, 1999), finding total pregnancy loss to be higher following TC CVS than 2nd trimester amniocentesis but no difference between TA CVS and AC.

Comments: Authors' conclusions: In a low risk population with a background pregnancy loss of around 2%, a 2nd T AC will increase this risk by another 1%. The increase in spontaneous miscarriages following 2nd T AC compared with controls was statistically significant (2.1% vs. 1.3%). Early AC is not a safe alternative to 2nd T AC because of increased pregnancy loss (7.6% vs. 5.9%). Transcervical CVS carries a significantly higher risk of pregnancy loss (14.5% vs. 11%) and spontaneous miscarriage (12.9% vs. 9.4%) than 2nd T AC. One study compared transabdominal CVS with 2nd T AC and found no significant difference in total pregnancy loss (6.3% vs. 7%).

Reviewers' conclusions: Methodologically sound review process: two reviewers assessed eligibility and trial quality and performed data extraction. Studies were graded based on their method of randomization (A – adequate, B – unclear, and quasi-randomised studies, graded C, were excluded). There were no other planned exclusions based on methodological quality. Because of this, some of the studies included in the review have methodological flaws which make the interpretation of their findings more difficult.
However, several studies of very high quality, including 1 RCT (Tabor, 1986), were included and their findings should be considered good estimates of the risk associated with each of the invasive tests they examined.

PRIMARY RESEARCH: STUDY DESIGNS AND QUALITY

The search identified five eligible primary research studies. Below is an overview of study designs and aspects of quality represented by these studies. Full details of the five papers appraised, including methods, key results, limitations and conclusions, are provided in Table 23 (pages 191-195).

Study design
Of the five studies included in this part of the review, four were retrospective cohort studies (Scott et al. 2002; Muller et al. 2002b; Antsaklis et al. 2000; Antsaklis et al. 2002), and one was a prospective case-control study (Mungen et al. 2006).

Study setting
Of the four retrospective cohort studies, three were single centre studies. Two were set in maternity hospitals in Greece (Antsaklis et al. 2000; Antsaklis et al. 2002) and one in a prenatal diagnosis clinic in Australia (Scott et al. 2002). One multi-centre study (Muller et al. 2002b) utilised a large community-based maternal serum screening programme in France which collected data from 6 cytogenetic laboratories. The prospective case-control study was conducted in the perinatology unit of a university hospital in Turkey (Mungen et al. 2006).

Samples
The sample sizes of the three single-centre cohort studies ranged from 365 to 9,200, and the case-control study included 2068 participants and 2068 matched controls. Inclusion and exclusion criteria varied depending on the sample and interventions. Both younger (<35 years) and older (≥ 35 years) women (range 13 – 52 years) were included in three studies, while two studies examined fetal loss in either older (Scott et al. 2002) or younger (Antsaklis et al. 2000) women.

Scott et al. (2002) examined rates of fetal loss in women referred for counselling about prenatal diagnosis mostly on the basis of AMA.
These women chose whether to have an invasive procedure, and which invasive procedure to have, following counselling. Groups were not randomised and all women were high risk for DS, in most cases on the basis of maternal age. No other exclusion criteria were mentioned.

Antsaklis et al. (2002) reviewed the records of women with multiple pregnancies who underwent AC or CVS at their clinic between 1977 and 2000. No exclusion criteria were applied.

Antsaklis et al. (2000) examined the procedure-related loss in 20-34 year old women who were referred for a variety of reasons. Women in the study or control groups with multiple pregnancies, uterine abnormalities, intrauterine contraceptive devices in situ or a serious history of second trimester miscarriage were excluded from the study. The control group consisted of women who were screened but were considered low risk for DS or NTD.

Muller et al. (2002b) utilised a large community screening programme which screened almost 55,000 women over a 3 year period. Of those women, 3472 opted for amniocentesis. The study group consisted of women who screened positive for DS risk, mostly on the basis of elevated maternal serum levels, and opted for prenatal diagnosis. The control group consisted of the 47,000 women who were screened but opted not to have amniocentesis, some of whom had calculated DS risks above the cut-off (n=2418). No exclusion criteria were applied, 7.35% were lost to follow-up, and the age and serum levels of the study and control groups differed significantly.

Mungen et al. (2006) applied strict inclusion and exclusion criteria to both cases and controls. Controls and cases were matched one-to-one for maternal age, parity, and number of prior spontaneous abortions. Matched pairs were excluded if either woman was lost to follow-up, had a fetus with an abnormality or underwent repeat AC.
Women with medical complications or elevated maternal serum levels were excluded from the study.

Interventions
Four of the five studies in this section of the review included a comparison of second trimester (15-24 gestational weeks) amniocentesis with no invasive testing. One of these also compared 2nd trimester amniocentesis with 1st trimester CVS, and CVS with no invasive testing (Scott et al. 2002). One study compared fetal loss in women with multiple pregnancies who underwent 2nd trimester amniocentesis versus CVS (Antsaklis et al. 2002). This study reviewed data collected between 1977 and 2000, during which time it is likely that technology and procedure-related counselling changed. In most studies amniocentesis was performed under concurrent ultrasound guidance at 15-22 weeks gestation using a 20-22 gauge needle.

Outcomes
Most studies collected pregnancy outcome data from obstetric units and information regarding fetal loss from clinic databases. In Muller et al. (2002b), data were collected from six cytogenetic laboratories and contributing maternity units. Antsaklis et al. (2000) collected information regarding risk factors for spontaneous miscarriage via a maternal interview. The interview covered retrospective and potentially sensitive topics (previous abortions, miscarriages and pregnancy complications) and could have been subject to recall or reporting bias. Two of the authors reviewed the maternal interviews and the presence or absence of risk factors was recorded. No information was provided as to whether the interviewers or reviewers were blind to whether the woman had undergone amniocentesis.

The measure of fetal loss varied between studies. Two of the studies measured fetal loss within 28 or 30 days of the procedure or, for the control group, of inclusion in the study (Antsaklis et al. 2002; Mungen et al. 2006).
Four of the studies measured fetal loss up to a particular gestational week, varying between 22 and 28 weeks (Scott et al. 2002; Antsaklis et al. 2000; Muller et al. 2002b; Mungen et al. 2006). One study also compared losses occurring any time during the pregnancy (Mungen et al. 2006).

PRIMARY RESEARCH: STUDY RESULTS

Studies comparing 2nd trimester amniocentesis with no invasive testing
Four studies compared the rates of fetal loss in women who had a 2nd trimester amniocentesis procedure and women who had no invasive testing.

Mungen et al. (2006) found no difference between women who underwent AC and those who did not in rates of fetal loss, either within 30 days of the procedure or inclusion in the study, or before 28 weeks gestation. Cases and controls were matched one-to-one for maternal age, parity, and number of prior spontaneous abortions. Matched pairs were excluded from analysis if either woman was lost to follow-up, had a fetus with an abnormal karyotype or a major malformation, or underwent repeated AC because of culture failure. Women with medical complications or elevated maternal serum levels were excluded from the study. The sample size was sufficient to detect differences of 1% or greater (80% power at a significance level of 0.05). However, the magnitude of increased fetal loss related to invasive testing is typically estimated to be between 0.5% and 1%. This study was well designed, but the sample size was too small to examine differences of this size. The authors concluded that their study could show only that any increase in fetal loss in women who had an amniocentesis was less than 1%.

Antsaklis et al. (2000) compared the rates of fetal loss before 28 weeks gestation in 20-34 year old women who underwent amniocentesis versus controls who did not.
There were no significant differences between the study and control group in mean maternal age, mean number of previous pregnancies, history of abortion, or previous bleeding in current pregnancy. There was no information regarding gestational age in each of the groups. There was a significant difference (p<.01) in the fetal loss rate before 28 weeks between the control group (1.5%) and the AC group (2.1%).

Antsaklis et al. (2000) also investigated the effects of predisposing risk factors on rates of fetal loss. When patients had a previous history of abortion or bleeding in the current pregnancy, there was a significant difference between the rate of fetal loss in the amniocentesis group (5.9% and 6.9% respectively) and the no amniocentesis group (3.8% and 3.5% respectively). When risk factors were not present, there was no difference in fetal loss between the two groups (AC = 0.96%, no AC = 0.93%). Data regarding the presence or absence of risk factors were collected by maternal interview and included retrospective information of a sensitive nature (e.g. previous abortions) and so may have been subject to recall or reporting bias. There was no mention of whether interviewers were blind to the group allocation of each patient.

Muller et al. (2002b) utilised a large (n=54,902) community-based maternal serum screening programme to compare the fetal loss rates in screened women who had a 2nd trimester amniocentesis with those who did not. The authors found a 0.7% increased risk of fetal loss in women who underwent 2nd trimester amniocentesis (1.12%) compared to those who did not (0.42%). However, there were significant differences between the two groups in mean maternal age (AC = 33 years, no AC = 29 years) and maternal serum levels, with the AC group being selected based on a higher than 1:250 risk for DS. Both maternal age and elevated serum levels are associated with a higher rate of spontaneous miscarriage.
The rate of fetal loss in the AC group included eight fetal losses which occurred prior to the procedure, thus overestimating the rate of loss following the procedure. Excluding these eight cases would have produced a fetal loss rate of 0.89% (95% CI 0.58 – 1.21), 0.47% higher than the control group. There was also a 7% loss to follow-up in this study.

Scott et al. (2002) retrospectively reviewed the records of 2366 women referred for prenatal diagnosis at their clinic between 1995 and 1997. Women were mostly referred on the basis of age ≥ 35 years and, following genetic counselling, were offered the options of 2nd trimester AC, 1st trimester CVS or no invasive testing. Other complications or risk factors were not described. It is likely that women with lower overall risk of DS were more likely to opt for no invasive testing, and that women with a high risk for DS, based on maternal age and other factors, were more likely to opt for early diagnosis with CVS. Alternatively, women at high risk of miscarriage might have been more likely to opt for a 2nd trimester AC procedure, with relatively lower risk, rather than CVS. Total fetal losses after the initial consultation (9-10 weeks) and before 22 weeks were compared and the authors reported a fetal loss rate of 8.4% in the control group and 3.1% in the AC group. Women were not randomly allocated to groups.

Studies comparing 2nd trimester amniocentesis with CVS

Two studies compared 2nd trimester amniocentesis with CVS.

Scott et al. (2002) examined spontaneous miscarriages after AC or CVS and before 22 weeks gestational age and found a lower rate of fetal loss in the AC group (0.36%) compared to the CVS group (1.85%). However, self-selection of groups means that lower risk women may have opted for AC. Transabdominal and transcervical CVS were also compared, with a slightly higher rate of fetal loss in the transcervical CVS group (2.16% vs. 1.65%).
However, the method of entry in CVS procedures was non-randomised and decided based on operator preference.

Antsaklis et al. (2002) compared the rate of fetal loss in multiple pregnancies for women who opted for AC or CVS. There was no difference between the two groups in the rate of miscarriages either within four weeks of the procedure or after four weeks. However, the size of the sample (n=365) was most likely too small to detect differences of less than 1%, so the results are inconclusive.

Conclusion

The findings of the five studies included in this section were inconclusive. There were several limitations in the studies including non-randomised allocation to groups, failure to exclude or control for existing conditions and significant differences between the study and comparison groups in age and maternal serum levels. In the best designed study, a matched case-control study of a low risk population, the sample size was too small to detect differences of the magnitude typically found in studies of procedure-related loss. The authors concluded that the difference, if any, between the amniocentesis and no amniocentesis group was less than 1%.

The ideal study would randomly allocate women at risk for aneuploidy to undergo amniocentesis, chorionic villus sampling or no invasive testing. This would be unacceptable nowadays for ethical reasons, and the only fully randomised study of this kind was conducted twenty years ago (Tabor 1986, as cited by Alfirevic et al. 2006). The 1% increased risk of fetal loss following amniocentesis found by Tabor still stands as the best indication of procedure-related loss in a low risk population. The findings of the systematic review (Alfirevic et al. 2006) were that prenatal diagnosis is safest performed in the second trimester by amniocentesis, which is safer than both transcervical CVS and first trimester amniocentesis.
Transabdominal CVS is the safest method for first trimester prenatal diagnosis, followed by transcervical CVS.

No studies were identified in the search which investigated interventions to reduce rates of fetal loss. However, best practice guidelines recommend the use of concurrent ultrasound guidance and well-trained and experienced operators for both amniocentesis and CVS (RANZCOG, 2004).

Table 23. Evidence table of primary research studies investigating the rate of fetal loss following invasive prenatal diagnostic procedures

Source: Mungen et al. 2006. Case-control study. Level of evidence III-2.
Setting: Perinatology unit of a university hospital, Istanbul, Turkey. February 1998 – June 2002.
Sample:
• Cases were 2068 women with singleton pregnancies who underwent 2nd T AC.
• Controls were 2068 women selected from pregnant women with singleton pregnancies who either were not candidates for AC or declined the procedure for various reasons.
• Controls and cases matched one-to-one for maternal age, parity, number of prior spontaneous abortions.
• Matched pairs were excluded from analysis if either woman was lost to follow-up, had a fetus with an abnormal karyotype or a major malformation, or underwent repeated AC because of culture failure.
• Exclusions applied to both study and control groups: medical complications (hypertension, renal disease, pre-gestational diabetes etc.); elevated maternal serum levels.
Interventions:
• AC at 15-22 weeks GA
• 22 gauge needle
• Continuous ultrasound guidance
Outcomes (compared for study and control groups):
• Fetal loss
• Premature delivery
• Small for gestational age infants
• Preeclampsia/eclampsia
• Placental abruptions
• Caesarean deliveries
Results:
• AC group: Indications for AC: MSS = 53%, AMA = 38%. Mean gestational age at AC: 17.5 +/- 1.9 weeks. Majority of procedures performed between 15 and 18 weeks. Mean maternal age 31.1 +/- 5.9 years.
• Control group: Mean maternal age 30.9 +/- 5.7 years.
• Fetal loss within 30 days of AC or inclusion in the study (%, 95% CI): AC group = 0.4% (0.1 – 0.6); Control = 0.3% (0.07 – 0.5); Difference = 0.1% (not sig.)
• Fetal loss before 28 weeks: AC group = 1.5%; Control = 1.3%; Difference = 0.2% (not sig.)
• Total fetal loss rates including spontaneous abortions and intrauterine fetal deaths/stillbirths: AC group = 2.3% (1.6 – 2.9); Control = 2.0% (1.4 – 2.6); Difference = 0.3% (not sig.)
Comments:
• Authors' conclusions: In our study there were no significant differences in fetal loss rates and major pregnancy complications between the study and control groups. Our study had sufficient power to detect a 1% increase in fetal loss rate, therefore, we can conclude that increased risk of fetal loss resulting from 2nd T AC, if any, is less than 1%. There was no significant difference in total fetal loss rates between the women who underwent one needle insertion and those requiring two punctures.
• Reviewers' comments: The authors acknowledged that the sample size was calculated on the assumption that AC related fetal loss risk was 1% and the background loss rate after 15 weeks gestation was 2%. The sample size did not have sufficient power to detect differences less than 1%.

Source: Antsaklis et al. 2002. Retrospective cohort. Level of evidence III-2.
Setting: Single centre maternity hospital fetal medicine unit, 1977-2000. University hospital, Athens, Greece. Data obtained from maternity units, obstetricians, or patients themselves.
Sample:
• n=365
• Women with multiple gestations who attended the unit for prenatal diagnosis between 1977 and 2000
• 1 case lost to follow-up
• 17 ongoing pregnancies
• 12 cases have incomplete pregnancy outcome data
• Mean maternal age = 36.2 years (19-52)
Interventions:
• 1977-1990: women offered AC or fetal blood sampling based on gestational age
• 1986-2000: patients presenting in the 1st trimester also offered option of CVS
• All patients were counseled regarding the risks of miscarriage/preterm delivery and options if their results were abnormal
• Detailed ultrasound performed prior to procedure (number of fetuses and placental and cord placement)
• All procedures carried out by 5 operators
• AC: ultrasound guided for all procedures; 22 gauge needle; transplacental entry in some cases where it could not be avoided
• CVS: transabdominal in all cases; 21 gauge needle and chorionic villi aspirated
Outcomes:
• Total fetal loss
• Miscarriages within 4 weeks of procedure
• Miscarriages more than 4 weeks after procedure
Results:
• 1977-2000: 365 AC in twin pregnancies; 347 included in analyses; 7 cases prior to 14 weeks (early AC); 340 between 16-21 weeks GA (2nd T AC)
• Indications for AC: AMA; previous hx of aneuploidy; abnormal serum biochemistry; increased NT; abnormal sonographic findings
• Miscarriages within 4 weeks of procedure: AC 10/335 = 2.98%; CVS 1/44 = 2.27%; no significant difference
• Miscarriages more than 4 weeks after procedure: AC 4/335 = 1.19%; CVS 1/44 = 2.27%; no significant difference
Comments:
• Authors' conclusions: Our data suggest that both AC and transabdominal CVS are equally safe, however, it is well recognised that the latter requires a more advanced level of expertise. The procedure-related risk was similar for both groups of AC and CVS. CVS appears to be as safe as AC and offers the advantage of earlier detection of defects.
• Reviewers' comments: Sample size is not big enough to detect differences of the magnitude typically found in studies of procedure-related loss. Technology and/or advice regarding procedure-related complications may have changed between 1977-2000 but the authors make no mention of any changes in uptake or complications in relation to the year of procedure. Early AC is associated with a higher rate of fetal loss than 2nd T AC; better to exclude early AC from analyses.

Source: Muller et al. 2002b. Retrospective cohort. Level of evidence III-2.
Setting: Maternal serum screening programme. 6 accredited laboratories between 1997-1999, France. Data sources: data sheet completed for each patient by obstetrical units.
Sample:
• 54,902 women with singleton pregnancies
• Median maternal age = 29 years (range 13-44 years); 97.8% < 35 years
• 4039 (7.35%) lost to follow-up
• 387 excluded because of severe fetal abnormalities
• Study group: N=3472 women who had an AC following screening for DS risk. Mean maternal age = 33 years.
• Control group: N=47,004 women who elected not to have an AC following screening for DS. Mean maternal age = 29 years. 2418 of these women had a DS risk of ≥ 1:250 but opted for no AC.
Interventions:
• MSS: AFP, fβhCG/total hCG
• DS risk calculated by MA + MSS
• 1:250 risk cut-off; screen positive women were offered an amniocentesis
• AC performed under ultrasound guidance using a 20 gauge needle
• AC between 15-24 gestational weeks (median 18 weeks)
Outcomes:
• Fetal loss before 24 gestational weeks
• Premature delivery between 24 and 28 weeks
Results:
• AC group: N=3472; MSS high risk = 3151; maternal anxiety or minor sonographic markers = 321
• Control group: N=47,004; MSS high risk = 2418; MSS low risk = 44,586
• Fetal loss prior to 24 weeks, % (95% CI): AC group = 1.12% (1.08 – 1.15), but 8 fetal deaths were ascertained by USS prior to AC and included in these rates; Control = 0.42% (0.41 – 0.43); Difference = 0.8%
• Premature delivery between 24 and 28 weeks: AC group = 0.40% (0.39 – 0.41); Control = 0.24% (0.23 – 0.25)
• Total adverse outcome: AC group = 1.52%; Control = 0.66%
Comments:
• Authors' conclusions: The total rate of adverse outcome in the AC group was 1.52% and 0.66% in controls, suggesting that AC carries an additional risk of 0.86%. Indications for AC may have selected patients at higher risk of adverse outcome because high maternal serum markers are associated with adverse outcomes.
• Reviewers' comments: The authors acknowledged that the two groups differed significantly in maternal age (33 vs. 29 years), and that higher rates of adverse outcomes are observed in older patients. Elevated maternal marker serum levels are also associated with higher rates of spontaneous miscarriage. The higher rate of fetal loss in the AC group could have been due to serum levels rather than the AC procedure. A comparison of the fetal loss rate in women who had high risk serum levels but declined AC and those who accepted AC would have been a better comparison. However it is possible that those women who declined AC were closer to the cut-off (i.e. lower risk) than those who accepted AC. Included losses diagnosed by USS prior to AC in the calculation of fetal losses so 1.12% likely to be an overestimation (acknowledged by authors). Karyotype of fetal death cases not reported. Cannot ascertain loss of normal fetuses as a result of AC. Initial sample 54,902 women of which 4039 (7.35%) were lost to follow-up. The authors acknowledged that this could be a source of bias but felt that it was not likely to have affected results in this study.

Source: Scott et al. 2002. Retrospective cohort. Level of evidence III-2.
Setting: Specialised prenatal diagnosis practice, Sydney, Australia. Data sources: referring doctors and hospital records.
Sample:
• N=2366 women referred for counseling about prenatal diagnosis
• Indication for prenatal diagnosis: AMA (≥ 35 years) in most cases
• 53 women (2.2%) were lost to follow-up
• Three groups:
  No invasive testing (n=346): Mean MA = 35.6 years; Mean GA (at time of consultation) = 9.53 weeks
  CVS (n=1128): Mean MA = 38.5 years; Mean GA (at time of consultation) = 9.44 weeks
  AC (n=839): Mean MA = 37.9 years; Mean GA (at time of consultation) = 10 weeks
Interventions:
• Following genetic counseling, women chose between no invasive testing, undergoing AC or undergoing CVS.
• CVS: performed under ultrasound guidance; 10-12 weeks GA; transabdominally (19 gauge needle) or transcervically depending on operator preference.
• AC: performed under ultrasound guidance; from 14+6 weeks GA; 22 gauge needle.
Outcomes:
• Total fetal loss (includes miscarriages after consultation but prior to procedure and terminations but not fetal losses after 22 weeks gestation)
• Spontaneous miscarriage rates for each group
• Miscarriages defined as fetal loss up to 22 weeks GA (allowing 6 weeks or more from any procedure)
Results:
• There were no significant differences in maternal age or gestational age at consultation between any of the three groups.
• Total fetal losses, % (95% CI): No invasive test 29/346 = 8.4% (5.5 – 11.3); CVS 71/1128 = 6.3% (4.9 – 7.7); AC 26/839 = 3.1% (1.9 – 4.3)
• Spontaneous miscarriages after procedure before 22 weeks GA: AC 3/829 = 0.36% (-0.04 – 0.77) (excludes 10 miscarriages prior to procedure); Transabdominal CVS 11/665 = 1.65% (0.68 – 2.62); Transcervical CVS 9/416 = 2.16% (0.77 – 3.56); Total CVS 20/1081 = 1.85% (1.05 – 2.65)
Comments:
• Authors' conclusions: Total loss rate for the AC group was 0.36% (procedure-related loss and spontaneous loss). PRL is less than 0.36% but we cannot establish how much less. The spontaneous miscarriage rate was high (8.4%) in this group of women, who were largely of advanced maternal age. The procedure-related loss rate for AC was less than 1/280. The miscarriage rate after transabdominal CVS appeared less than for transcervical CVS but this is not statistically significant.
• Reviewers' comments: Greater risk of spontaneous miscarriage earlier in gestation so authors compared total loss from the time of the initial consultation. Because CVS and AC procedures were carried out at different gestational ages and fetal loss was measured up to 22 weeks, there was a longer period of time after a CVS procedure for losses to occur. Not a randomized study: sample opted for no testing, AC or CVS. The known PRL was 1/200 for AC and higher for CVS at the time of the study, and it is possible that lower risk patients were attracted to AC rather than CVS (acknowledged by authors). Those with previous miscarriages or bleeding may have decided against prenatal testing. Transabdominal and transcervical CVS was carried out based on 'operator preference'. Higher risk women possibly allocated to TA CVS. Confidence intervals were wide due to a small sample size and small differences between groups.

Source: Antsaklis et al. 2000. Retrospective cohort. Level of evidence III-2.
Setting: Maternal fetal medicine unit, Athens, Greece. 1992-1997.
Sample:
• Study group: N=3910 women referred for prenatal diagnosis between 1992-1997. Women aged 20-34 years at EDD.
• Indications for prenatal diagnosis in study group: anxiety; abnormal MSS result; abnormal sonographic findings; maternal infection with toxoplasma or cytomegalovirus; fetal karyotyping; DNA analysis; known metabolic disease
• Exclusions: women with multiple pregnancies; women with known uterine abnormalities; women with intrauterine contraceptive devices in situ; serious maternal hx of 2nd T miscarriage
• Control group: n=5324 women who were screened for DS between 16 and 20 weeks GA between 1992 and 1997. Women aged 20-34 years at EDD. Only pregnancies considered at low risk for DS or NTD were included. Other exclusion criteria were the same as those applied to the study group.
Interventions:
• All procedures carried out by 1 of 4 people
• AC: continuous ultrasound guidance; complete anatomic fetal survey prior to procedure; 21 gauge needle
• Control group: MSS at 16-20 weeks GA (AFP, βhCG, uE3)
• Risk factors: Data regarding risk factors were collected by a maternal interview. Medical secretary and midwives carried out interviews which were reviewed by 2 of the authors. The interview collected information regarding: gravidity; parity; history of 1st T spontaneous and induced abortions; 2nd T complications; any sign of threatened abortion in current pregnancy.
Outcomes:
• 3 risk factors were identified by the authors as being potentially responsible for increased perinatal loss: hx of 3 or more 1st T abortions or 1 2nd T abortion; bleeding or spotting before AC; AC
• Presence/absence of risk factors in control and study groups was recorded
• Fetal losses before 28 weeks GA in study and control groups
• Fetal loss in study and control groups with regard to the presence or absence of risk factors
Results:
• No significant differences between the study and control group in mean maternal age, mean number of previous pregnancies, at least 1 induced abortion, at least 1 spontaneous abortion, ≥ 3 previous abortions, previous bleeding in current pregnancy
• Fetal losses before 28th week: AC 79/3696 = 2.1% (95% CI 1.7-2.6); No AC 80/5324 = 1.5% (95% CI 1.2-1.8); significant (p<.01)
• Previous bleeding: AC 31/527 = 5.9%; No AC 28/723 = 3.8%
• Previous abortions: AC 18/258 = 6.9%; No AC 12/334 = 3.5%
• No predisposing factors: AC 30/3125 = 0.96%; No AC 40/4267 = 0.93%
Comments:
• Authors' conclusions: Both study patients and control women were in the same age group and thus equally prone to chromosomally abnormal fetuses. Both groups were screened during the 2nd T and had the same exclusion criteria. The background risk in our population after the 16th week of pregnancy was 1.5%. Our study and control groups were comparable in terms of age, background factors and time of entry into the study. There was a 2.1% loss rate among women who had 2nd T AC, leaving a procedure-related loss rate of 0.6%. This risk is directly related to risk factors such as previous hx of abortions or bleeding during current pregnancy. There was a statistically significant difference between the study group and controls when these factors were present. When these predisposing factors were not present, there was no significant difference in fetal losses between the study and control groups.
• Reviewers' comments: Data regarding the presence or absence of risk factors were collected by maternal interview and included retrospective information of a sensitive nature (e.g. previous abortions). May have been subject to recall bias. There was no mention of whether interviewers and reviewers were blind to the group allocation of each patient. Indications for AC group included risk factors associated with higher spontaneous miscarriage (maternal serum levels). However, 35% of women in this group had an AC procedure due to anxiety.

Chapter 9: Discussion

SUMMARY OF EVIDENCE

Aim of the review

This review aimed to systematically appraise the international evidence for the antenatal use of technologies and screening strategies for DS. Specific aims included assessing the validity of DS screening methods, any difficulties implementing screening strategies, the impact of screening on amniocentesis and chorionic villus sampling, and the safety of these procedures. Part A considered the validity of screening methods and difficulties implementing screening strategies. Part B considered the impact of screening on invasive testing and the safety of these procedures.

Result of the search strategy

The search strategy for Part A yielded 1138 articles. From 219 articles identified as potentially eligible for inclusion, a final group of 66 papers were selected for appraisal, all of which were primary research studies. The search strategy for Part B yielded 1116 articles. From 192 articles identified as potentially eligible for inclusion, a final group of 35 papers were selected for appraisal, all but one of which were primary research studies.

Key results

Accuracy of screening methods

Maternal age alone is not an appropriate screening test for DS. Screening methods that combine tests in an independent manner are not recommended. The quad test provided the best screening performance characteristics in the 2nd trimester. The combined test performed better than other 1st trimester and all 2nd trimester strategies, and serum integrated screening had a similar performance to the combined test. All other integrated and sequential screening strategies have improved performance compared to either 1st trimester or 2nd trimester screening strategies.
Fully integrated screening performed marginally better than both stepwise and contingent screening, with a lower FPR for a fixed DR. The performance of stepwise and contingent screening is similar.

Difficulties implementing any screening strategies

• NT measurement not being successful or taking more time than expected.
• NT requiring trained staff, high quality equipment, and quality control.
• Women defaulting from 2nd trimester maternal serum screening.
• Need to adjust for maternal weight, and for false positive results in previous pregnancies.
• Need for USS dating.
• Issues with serum marker reliability (assay drift, and inaccurate marker MoM).
• Inappropriate model parameters giving inaccurate risk estimates.
• Unique issues with screening in twin pregnancies.

Uptake of invasive testing following receipt of screening results

Maternal age (<35 years versus ≥ 35 years) appears to be a factor in the uptake of invasive testing following screening. Younger and older women may perceive their risk differently and utilise DS screening for different reasons. Women’s individual estimates of calculated risk appear to influence the uptake of invasive testing, with an increase in the uptake of invasive testing as the likelihood of carrying a fetus with DS increases. Perceived accuracy of the screening test and social, ethnic and cultural factors may influence uptake and a local study of the acceptability of both screening and invasive testing is needed.

Changes in the rate of invasive testing with the introduction of a screening programme

Overall, rates of invasive testing increased slightly with the introduction of screening programmes. This was against the backdrop of a steeper rise in maternal age and the increase in invasive testing was less than what would be expected if invasive tests were offered based on maternal age alone. Changes in rates of invasive testing varied as a function of maternal age.
Rates of invasive testing decreased among older mothers (≥ 35 years) and increased among younger mothers (< 35 years) with the introduction of a screening programme.

Safety of amniocentesis and chorionic villus sampling

A high quality systematic review confirmed that a 1% increased risk of fetal loss (against a background risk of 2%) following amniocentesis in a low risk population is the best estimate of the rate of procedure-related loss. The recommendations of the review were that prenatal diagnosis is safest performed in the second trimester by amniocentesis, which is safer than both transcervical CVS and first trimester amniocentesis. Transabdominal CVS is the safest method for first trimester prenatal diagnosis, followed by transcervical CVS.

CONCLUSIONS

Amongst the papers identified to assess the validity of DS screening methods, the evidence indicated that integrated and sequential screening had superior screening performance compared to screening confined to either the 1st or 2nd trimester. The screening method should be selected from integrated or sequential strategies (fully integrated, contingent or stepwise screening).

The gestational age at which the serum sample is taken or NT is measured has an effect on the validity of the screening test. For example, in the SURUSS study (Wald et al. 2003b) the combined test performed better at 10 weeks gestation (10/40) than at 13 weeks (13/40). The evidence for the optimum timing of screening tests was not considered as part of this review, but this would need careful consideration by policy makers when implementing a Down syndrome screening programme.

The optimum risk cut-off, for determining whether a screening result was screen positive or screen negative, is also outside the scope of this review. Choosing cut-offs for contingent screening is particularly complicated, as a first trimester high and low risk cut-off and a 2nd trimester cut-off all need to be determined, and each will affect screening performance.
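The way the three contingent screening cut-offs interact can be pictured with a short sketch. It is illustrative only: the cut-off values, the function name `contingent_screen` and the `Outcome` labels are hypothetical and are not taken from any study in this review, and the optimum cut-offs themselves remain outside its scope. The sketch assumes that a 1st trimester risk at or above the high cut-off leads to an offer of diagnostic testing, a risk at or below the low cut-off ends screening, and only the intermediate group proceeds to 2nd trimester serum screening and the final cut-off.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    OFFER_DIAGNOSTIC_TEST = "screen positive: offer invasive diagnostic testing"
    SCREEN_NEGATIVE = "screen negative: no further screening"
    NEEDS_SECOND_TRIMESTER = "intermediate: proceed to 2nd trimester serum screening"


# Hypothetical cut-offs for illustration only (risk of 1 in N written as 1 / N).
FIRST_TRIMESTER_HIGH = 1 / 30     # assumed "very high" 1st trimester cut-off
FIRST_TRIMESTER_LOW = 1 / 1500    # assumed "very low" 1st trimester cut-off
FINAL_CUTOFF = 1 / 270            # assumed final cut-off using all markers


def contingent_screen(first_trimester_risk: float,
                      final_risk: Optional[float] = None) -> Outcome:
    """Classify one woman's result under a contingent screening design.

    first_trimester_risk: DS risk estimated from 1st trimester markers.
    final_risk: DS risk estimated from all markers after 2nd trimester
        serum screening, or None if that stage has not been done.
    """
    if first_trimester_risk >= FIRST_TRIMESTER_HIGH:
        return Outcome.OFFER_DIAGNOSTIC_TEST
    if first_trimester_risk <= FIRST_TRIMESTER_LOW:
        return Outcome.SCREEN_NEGATIVE
    # Intermediate 1st trimester risk: the decision waits for the final risk.
    if final_risk is None:
        return Outcome.NEEDS_SECOND_TRIMESTER
    if final_risk >= FINAL_CUTOFF:
        return Outcome.OFFER_DIAGNOSTIC_TEST
    return Outcome.SCREEN_NEGATIVE


if __name__ == "__main__":
    for risk in (1 / 10, 1 / 500, 1 / 5000):
        print(f"1st trimester risk 1 in {round(1 / risk)}: "
              f"{contingent_screen(risk).value}")
```

Moving any one of the three values changes how many women fall into each branch, which is why each cut-off affects the overall detection rate and false positive rate.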
Papers that compared the same screening method before and after an adjustment (e.g. for maternal weight) were not included in the review unless they also included a comparison with another screening method fitting the review protocol criteria. Possible adjustments in Down syndrome screening include: maternal weight, which is now a standard component of screening strategies (Spencer et al. 2003a); smoking status (Spencer et al. 2004); and ethnicity (Spencer et al. 2005).

The evidence considered for the assessment of validity of DS screening tests exhibited methodological limitations including:
• case-control studies used to determine validity of screening methods (therefore sample not necessarily representative of the potential screening population)
• studies with small sample sizes (and therefore small number of DS cases) which will limit the accuracy of the estimates of DR and FPR
• no details of parameters used or inappropriate parameters used for Gaussian distribution
• no blinding of investigators to pregnancy outcomes (case-control studies and retrospective analysis of stored samples)
• use of intervention studies particularly in comparing 1st trimester and 2nd trimester screening (for example where NT used clinically before 2nd trimester)
• population already screened for DS by another method
• not accounting for all fetal loss and/or few details of how live birth outcomes were obtained
• some modelling papers using the same primary data, similar modelling techniques, and similar assumptions which may mean the same or similar results are replicated.

The omission of studies published prior to 2000 may cause publication bias, as some of these papers will contain evidence relevant to this review. However, integrated screening was not described until
2006c), and the majority of papers comparing integrated and sequential screening with each other and/or with first and second trimester methods have been published during or after the year 2000. There are a number of considerations apart from the validity of a screening strategy which are important when deciding between integrated or sequential screening strategies. While integrated screening may have a better performance it does have some practical disadvantages when compared to sequential screening strategies. One potential drawback is the need to withhold results of 1st trimester screening until the 2nd trimester, and the resulting delay in diagnosis of DS until the 2nd trimester. This may not be acceptable to women or clinicians (Lam et al. 2002). Another consideration is that all women will need both 1st and 2nd trimester screening for integrated screening, while for stepwise and contingent screening most DS cases will be detected in the 1st trimester and fewer women will need to have 2nd trimester screening (Palomaki et al. 2006). Another consideration is that if women default from 2nd trimester screening and the intention has been for them to have integrated screening, they will only have had a PAPP-A and NT assayed in the 1st trimester, as opposed to contingent or stepwise screening when they would have had the combined test. However, fβhCG could be measured in any left over serum sample in this instance, allowing a combined test to be completed (Wald et al. 2006c). Stepwise and contingent strategies also have some disadvantages. For both stepwise and contingent strategies women will be given more than one risk estimate which may be confusing and distressing (Lam et al. 2002). While integrated screening avoids detecting DS cases that would have miscarried between 1st trimester and 2nd trimester (Lam et al. 2002), stepwise or contingent strategies will detect these cases and will involve ToPs that may not have been necessary. 
The design of contingent screening is complicated and screening planners need to choose three risk cut-off levels for this screening: a very low and very high cut-off based on 1st trimester results, and a final cut-off based on all markers after 2nd trimester serum screening (Benn et al. 2005a; Wright et al. 2006). Also, integrated screening has the advantage of allowing all women to have 2nd trimester screening for neural tube defects as all women will have AFP in the 2nd trimester (Lam et al. 2002).

Sequential screening and integrated screening have not yet been implemented in large population-based screening programmes (Wald et al. 2006c) and so some issues with their implementation may not yet be apparent. Policy makers may need to await the results of any evaluation of the implementation of such screening programmes.

The resource implications of training, monitoring and quality control, particularly for NT testing, would need to be carefully considered when determining whether a screening programme should proceed. It has been suggested that in the UK, implementation of a screening programme including NT measurement may be difficult due to a shortage of well trained ultrasonographers and because of issues with quality assurance (Neilson and Alfirevic 2006). It will also be important to ensure software is able to accurately determine an individual’s risk of DS, including calculation of correct MoMs, adjustments (including for maternal weight and false positive results from previous pregnancies) and use of appropriate population parameters for Gaussian distributions. Also, USS should be used to estimate gestational age.

Amongst the papers identified to assess the uptake of invasive testing following a positive or negative screening result for DS risk, it appears that maternal age and individual risk estimates are both factors in women’s decisions to opt for invasive testing or not.
Younger and older mothers may perceive their level of risk differently and so utilise screening for different reasons. These perceptions and motivations may then affect the decisions they make based on their screening results. Other factors may also play a part, such as perceived confidence in the accuracy of the screening test, and there seems to be culturally-based variation in the uptake of both screening and invasive testing. The overall rate of invasive testing seems to increase slightly with the introduction of a screening programme but, when examined by maternal age group, it appears that the rate of invasive testing increases in younger mothers and decreases in older mothers. If the proportion of older mothers accepting invasive testing declines with the introduction of a screening programme, changing patterns in maternal age are likely to be an important determinant of invasive testing rates if such a programme were to be introduced. Long-term evaluations of population-based screening programmes using integrated and sequential methods have not been completed as yet, so the effects of these strategies on invasive testing rates are unclear at this stage. Relative to a screening programme based on maternal age alone, the introduction of first or second trimester screening methods decreases the rate of unnecessary invasive procedures. Again, social, ethnic and cultural variability in the acceptability of screening and invasive testing is a factor in the impact of screening policy changes. Because of this variability, a local study investigating the acceptability of screening and invasive testing in older and younger mothers of different ethnic backgrounds is needed before implementing a large-scale screening programme in New Zealand. The findings of the papers identified to assess the rate of fetal loss associated with invasive prenatal diagnosis were inconclusive.
There were several limitations in the studies including non-randomised allocation to groups, failure to exclude or control for existing conditions and significant differences between the study and comparison groups in age and maternal serum levels. The only completely randomised study found a 1% increased risk of fetal loss following second trimester amniocentesis and this estimate still stands as the best indication of procedure-related loss in a low risk population. Other complications have been investigated in relation to amniocentesis and CVS, such as infection, amniotic fluid leakage, vaginal bleeding, needle injury to the fetus and orthopaedic problems in infancy and childhood; however these were beyond the scope of this review. No studies were identified in the search which investigated interventions to reduce rates of fetal loss. However, best practice guidelines recommend the use of concurrent ultrasound guidance and well-trained and experienced operators for both amniocentesis and CVS (RANZCOG, 2004). The recommendations of a systematic review (Alfirevic et al. 2006) included information about the most appropriate procedure for each gestational age and the relative safety of each method of needle insertion (transcervical or transabdominal). While this review is concerned with the validity of screening strategies and difficulties implementing screening, there are a number of other considerations when deciding on which screening strategy to use for DS screening. These include cost, availability of different screening tests, and the acceptability of screening for clinicians and women, including the acceptability for women of different ethnicities. A woman’s decision to undergo screening or invasive testing could be influenced on a societal level by cultural, ethnic, religious, and social factors and on a personal level by previous pregnancy history, personal values and beliefs, and the advice of family, friends and her health provider. 
Another consideration is the gestational age at which women need to attend for screening. In the USA, only 75% to 80% of women present in the first trimester for antenatal care (Fuchs and Peipert 2005). In New Zealand women choose a lead maternity carer (usually a midwife rather than a GP) and if this choice is delayed beyond the appropriate gestational window there will be major implications for DS screening. There are also other promising screening tests/diagnostic tests which policy makers should be aware of. Other ultrasound markers such as absence of nasal bone, abnormal ductus venosus flow, and tricuspid regurgitation are being assessed (Neilson and Alfirevic 2006) and researchers are continuing to investigate the use of fetal cells in maternal circulation as a “non-invasive” diagnostic test for antenatal DS. Further research should address limitations in study design demonstrated in this review. Where possible, research on the accuracy of screening for DS should be based on large prospective cohort studies (or nested case-control studies). However, the current evidence based on the accuracy of screening supports the use of fully integrated, contingent or stepwise screening. If women present in the 2nd trimester the quad test is best and if women request screening only in the 1st trimester, the combined test has the best performance (Wald et al. 2003b).

References

Alfirevic, Z., Sundberg, K., & Brigham, S. (2006). Amniocentesis and chorionic villus sampling for prenatal diagnosis. Cochrane Database of Systematic Reviews, Issue 3.
Antsaklis, A., Papantoniou, N., Xygakis, A., Mesogitis, S., Tzortzis, E., & Michalas, S. (2000). Genetic amniocentesis in women 20-34 years old: associated risks. Prenatal Diagnosis, 20, 247-250.
Antsaklis, A., Souka, A. P., Daskalakis, G., Kavalakis, Y., & Michalas, S. (2002). Second-trimester amniocentesis vs.
chorionic villus sampling for prenatal diagnosis in multiple gestations. Ultrasound in Obstetrics & Gynecology, 20, 476-481.
Audibert, F., Dommergues, M., Benattar, C., Taieb, J., Thalabard, J. C., & Frydman, R. (2001). Screening for Down syndrome using first-trimester ultrasound and second-trimester maternal serum markers in a low-risk population: a prospective longitudinal study. Ultrasound in Obstetrics & Gynecology, 18, 26-31.
Avgidou, K., Papageorghiou, A., Bindra, R., Spencer, K., & Nicolaides, K. H. (2005). Prospective first-trimester screening for trisomy 21 in 30,564 pregnancies. American Journal of Obstetrics & Gynecology, 192, 1761-1767.
Azuma, M., Yamamoto, R., Wakui, Y., Minobe, S., Satomura, S., & Fujimoto, S. (2002). A novel method for the detection of Down syndrome with the use of four serum markers. American Journal of Obstetrics & Gynecology, 187, 197-201.
Babbur, V., Lees, C. C., Goodburn, S. F., Morris, N., Breeze, A. C., & Hackett, G. A. (2005). Prospective audit of a one-centre combined nuchal translucency and triple test programme for the detection of trisomy 21. Prenatal Diagnosis, 25, 465-469.
Baviera, G., Carbone, C., Corrado, F., & Mastrantonio, P. (2004). Placental growth hormone in Down's syndrome screening. Journal of Maternal-Fetal & Neonatal Medicine, 16, 241-243.
Benn, P., & Donnenfeld, A. E. (2005). Sequential Down syndrome screening: the importance of first and second trimester test correlations with calculating risk. Journal of Genetic Counseling, 14, 409-413.
Benn, P., Wright, D., & Cuckle, H. (2005a). Practical strategies in contingent sequential screening for Down syndrome. Prenatal Diagnosis, 25, 645-652.
Benn, P. A., Egan, J. F., Fang, M., & Smith-Bindman, R. (2004). Changes in the utilization of prenatal diagnosis. Obstetrics & Gynecology, 103, 1255-1260.
Benn, P. A., Fang, M., Egan, J. F., Horne, D., & Collins, R. (2003). Incorporation of inhibin-A in second-trimester screening for Down syndrome.
Obstetrics & Gynecology, 101, 451-454.
Benn, P. A., Fang, M., & Egan, J. F. X. (2005b). Trends in the use of second trimester maternal serum screening from 1991 to 2003. Genetics in Medicine, 7, 328-331.
Benn, P. A., Ying, J., Beazoglou, T., & Egan, J. F. (2001). Estimates for the sensitivity and false-positive rates for second trimester serum screening for Down syndrome and trisomy 18 with adjustment for cross-identification and double-positive results. Prenatal Diagnosis, 21, 46-51.
Bindra, R., Heath, V., Liao, A., Spencer, K., & Nicolaides, K. H. (2002). One-stop clinic for assessment of risk for trisomy 21 at 11-14 weeks: a prospective study of 15 030 pregnancies. Ultrasound in Obstetrics & Gynecology, 20, 219-225.
Chang, T. C. (2006). Antenatal screening for Down syndrome in New Zealand: time for a national screening policy? Australian & New Zealand Journal of Obstetrics & Gynaecology, 46, 92-96.
Chasen, S. T., McCullough, L. B., & Chervenak, F. A. (2004). Is nuchal translucency screening associated with different rates of invasive testing in an older obstetric population? American Journal of Obstetrics and Gynecology, 190, 769-774.
Cheffins, T., Chan, A., Haan, E. A., Ranieri, E., Ryall, R. G., Keane, R. J., Byron-Scott, R., et al. (2000). The impact of maternal serum screening on the birth prevalence of Down's syndrome and the use of amniocentesis and chorionic villus sampling in South Australia. BJOG: An International Journal of Obstetrics & Gynaecology, 107, 1453-1459.
Chen, C. P., Lin, C. J., & Wang, W. (2005). Impact of second-trimester maternal serum screening on prenatal diagnosis of Down syndrome and the use of amniocentesis in the Taiwanese population. Taiwanese Journal of Obstetrics & Gynecology, 44, 31-35.
Chen, J., Heffley, D., Beazoglou, T., & Benn, P. (2000). Utilization of amniocentesis by women screening positive for Down syndrome on the second-trimester triple test. Community Genetics, 3.
24-30
Christiansen, M., & Jaliashvili, I. (2003). Total pregnancy-associated plasma protein A--a first trimester maternal serum marker for Down's syndrome: clinical and technical assessment of a poly-monoclonal enzyme immunoassay. Scandinavian Journal of Clinical & Laboratory Investigation, 63, 407-415.
Christiansen, M., & Olesen Larsen, S. (2002). An increase in cost-effectiveness of first trimester maternal screening programmes for fetal chromosome anomalies is obtained by contingent testing. Prenatal Diagnosis, 22, 482-486.
Crossley, J. A., Aitken, D. A., Cameron, A. D., McBride, E., & Connor, J. M. (2002). Combined ultrasound and biochemical screening for Down's syndrome in the first trimester: a Scottish multicentre study. BJOG: An International Journal of Obstetrics & Gynaecology, 109, 667-676.
Cuckle, H., Benn, P., & Wright, D. (2005). Down syndrome screening in the first and/or second trimester: model predicted performance using meta-analysis parameters. Seminars in Perinatology, 29, 252-257.
Cuckle, H. S. (2003). Updated modelling parameters for Down's syndrome screening. Balkan Journal of Medical Genetics, 6, 101-107.
De Biasio, P., Ferrero, S., Prefumo, F., Canini, S., Marchini, P., Bruzzone, I., Ginocchio, G., et al. (2001). Down's syndrome: first trimester approach. Italian Journal of Gynaecology & Obstetrics, 13, 22-26.
de la Vega, A., Verdiales, M., & Leavitt, G. (2002). Method of risk assessment affects acceptance rate of amniocentesis. Puerto Rico Health Sciences Journal, 21, 233-235.
Deeks, J. J. (2001). Systematic reviews in health care: Systematic reviews of evaluations of diagnostic and screening tests. BMJ, 323, 157-162.
Dixon, J., Pillai, M., Mahendran, D., & Brooks, M. (2004). An assessment of the Down syndrome antenatal screening policies of East and West Gloucestershire between 1993 and 1999. Journal of Obstetrics & Gynaecology, 24, 760-764.
Dommergues, M., Audibert, F., Benattar, C., Champagne, C., Gomel, V., & Frydman, R. (2001). Is routine amniocentesis for advanced maternal age still indicated? Fetal Diagnosis & Therapy, 16, 372-377.
Fuchs, K. M., & Peipert, J. F. (2005). First trimester Down syndrome screening: public health implications. Seminars in Perinatology, 29, 267-271.
Gasiorek-Wiens, A., Tercanli, S., Kozlowski, P., Kossakiewicz, A., Minderer, S., Meyberg, H., Kamin, G., et al. (2001). Screening for trisomy 21 by fetal nuchal translucency and maternal age: a multicenter project in Germany, Austria and Switzerland. Ultrasound in Obstetrics & Gynecology, 18, 645-648.
Gonce, A., Borrell, A., Fortuny, A., Casals, E., Martinez, M. A., Mercade, I., Cararach, V., & Vanrell, J. A. (2005). First-trimester screening for trisomy 21 in twin pregnancy: does the addition of biochemistry make an improvement? Prenatal Diagnosis, 25, 1156-1161.
Gyselaers, W. J., Vereecken, A. J., van Herck, E., Straetmans, D. P., de Jonge, E. T., Ombelet, W. U., & Nijhuis, J. G. (2004a). Single-step maternal serum screening for trisomy 21 in the era of combined or integrated screening. Gynecologic & Obstetric Investigation, 58, 221-224.
Gyselaers, W. J., Vereecken, A. J., Van Herck, E. J., Straetmans, D. P., de Jonge, E. T., Ombelet, W. U., & Nijhuis, J. G. (2005). Population screening for fetal trisomy 21: easy access to screening should be balanced against a uniform ultrasound protocol. Prenatal Diagnosis, 25, 984-990.
Gyselaers, W. J., Vereecken, A. J., Van Herck, E. J., Straetmans, D. P., Martens, G. E., de Jonge, E. T., Ombelet, W. U., et al. (2004b). Screening for trisomy 21 in Flanders: a 10 years review of 40.490 pregnancies screened by maternal serum. European Journal of Obstetrics, Gynecology, & Reproductive Biology, 115, 185-189.
Hackshaw, A. K., & Wald, N. J. (2001). Inaccurate estimation of risk in second trimester serum screening for Down syndrome among women who have already had first trimester screening.
Prenatal Diagnosis, 21, 741-746.
Hadlow, N. C., Hewitt, B. G., Dickinson, J. E., Jacoby, P., & Bower, C. (2005). Community-based screening for Down's Syndrome in the first trimester using ultrasound and maternal serum biochemistry. BJOG: An International Journal of Obstetrics & Gynaecology, 112, 1561-1564.
Hallahan, T., Krantz, D., Orlandi, F., Rossi, C., Curcio, P., Macri, S., Larsen, J., Buchanan, P., & Macri, J. (2000). First trimester biochemical screening for Down syndrome: free beta hCG versus intact hCG. Prenatal Diagnosis, 20, 785-789; discussion 790-791.
Harrison, G., & Goldie, D. (2006). Second-trimester Down's syndrome serum screening: double, triple or quadruple marker testing? Annals of Clinical Biochemistry, 43, 67-72.
Herman, A., Dreazen, E., Tovbin, J., Weinraub, Z., Bukovsky, Y., & Maymon, R. (2002). Comparison between disclosure and non-disclosure approaches for trisomy 21 screening tests. Human Reproduction, 17, 1358-1362.
Huderer-Duric, K., Skrablin, S., Kuvacic, I., Sonicki, Z., Rubala, D., & Suchanek, E. (2000). The triple-marker test in predicting fetal aneuploidy: a compromise between sensitivity and specificity. European Journal of Obstetrics, Gynecology, & Reproductive Biology, 88, 49-55.
Ilgin-Ruhi, H., Yurur-Kutlay, N., Tukun, A., & Bokesoy, I. (2005). The role of genetic counseling on decisions of pregnant women aged 35 years or over regarding amniocentesis in Turkey. European Journal of Medical Genetics, 48, 13-19.
Irwig, L., & Glaziou, P. (1996). Screening and diagnostic tests: the Cochrane Methods Working Group on Systematic Review of Screening and Diagnostic Tests: recommended methods: Cochrane Collaboration. Available from: http://www.cochrane.org/docs/sadtdoc1.htm.
Irwig, L., Tosteson, A. N., Gatsonis, C., Lau, J., Colditz, G., Chalmers, T. C., & Mosteller, F. (1994). Guidelines for meta-analyses evaluating diagnostic tests. Annals of Internal Medicine, 120, 667-676.
Jaeschke, R., Guyatt, G., & Sackett, D. L. (1994). Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA, 271, 389-391.
Jou, H. J., Kuo, Y. S., Hsu, J. J., Shyu, M. K., Hsieh, T. T., & Hsieh, F. J. (2005). The evolving national birth prevalence of Down syndrome in Taiwan. A study on the impact of second-trimester maternal serum screening. Prenatal Diagnosis, 25, 665-670.
Khoshnood, B., Blondel, B., De Vigan, C., & Breart, G. (2003). Effects of maternal age and education on the pattern of prenatal testing: implications for the use of antenatal screening as a solution to the growing number of amniocenteses. American Journal of Obstetrics & Gynecology, 189, 1336-1342.
Knight, G. J., Palomaki, G. E., Neveux, L. M., Smith, D. E., Kloza, E. M., Pulkkinen, A. J., Williams, J., et al. (2005). Integrated serum screening for Down syndrome in primary obstetric practice. Prenatal Diagnosis, 25, 1162-1167.
Krantz, D. A., Hallahan, T. W., Orlandi, F., Buchanan, P., Larsen, J. W., Jr., & Macri, J. N. (2000). First-trimester Down syndrome screening using dried blood biochemistry and nuchal translucency. Obstetrics & Gynecology, 96, 207-213.
Laigaard, J., Sorensen, T., Frohlich, C., Pedersen, B. N., Christiansen, M., Schiott, K., Uldbjerg, N., et al. (2003). ADAM12: a novel first-trimester maternal serum marker for Down syndrome. Prenatal Diagnosis, 23, 1086-1091.
Lam, Y. H., Lee, C. P., Sin, S. Y., Tang, R., Wong, H. S., Wong, S. F., Fong, D. Y. T., et al. (2002). Comparison and integration of first trimester fetal nuchal translucency and second trimester maternal serum screening for fetal Down syndrome. Prenatal Diagnosis, 22, 730-735.
Lam, Y. H., Tang, M. H., Lee, C. P., Sin, S. Y., Tang, R., Wong, H. S., & Wong, S. F. (2000).
Acceptability of serum screening as an alternative to cytogenetic diagnosis of Down syndrome among women 35 years or older in Hong Kong. Prenatal Diagnosis, 20, 487-490.
Malone, F. D., Canick, J. A., Ball, R. H., Nyberg, D. A., Comstock, C. H., Bukowski, R., Berkowitz, R., et al. (2005). First-trimester or second-trimester screening, or both, for Down's syndrome. New England Journal of Medicine, 353, 2001-2011.
Marini, T., Sullivan, J., & Naeem, R. (2002). Decisions about amniocentesis by advanced maternal age patients following maternal serum screening may not always correlate clinically with screening results: Need for improvement in informed consent process. American Journal of Medical Genetics, 109, 171-175.
Marsk, A., Grunewald, C., Saltvedt, S., Valentin, L., & Almstrom, H. (2006). If nuchal translucency screening is combined with first-trimester serum screening the need for fetal karyotyping decreases. Acta Obstetricia et Gynecologica Scandinavica, 85, 534-538.
Maymon, R., Cuckle, H., Jones, R., Reish, O., Sharony, R., & Herman, A. (2005). Predicting the result of additional second-trimester markers from a woman's first-trimester marker profile: A new concept in Down syndrome screening. Prenatal Diagnosis, 25, 1102-1106.
Michailidis, G. D., Spencer, K., & Economides, D. L. (2001). The use of nuchal translucency measurement and second trimester biochemical markers in screening for Down's syndrome. BJOG: An International Journal of Obstetrics & Gynaecology, 108, 1047-1052.
Montalvo, J., Gomez, M. L., Ortega, M. D., Soler, P., Herraiz, I., & Herraiz, M. A. (2005). First trimester combined screening for chromosomal defects: our results in a population with a high percent of women aged 35 or older. Ultrasound Review of Obstetrics & Gynecology, 5, 178-185.
Morris, J. K., & Wald, N. J. (2005). Graphical presentation of distributions of risk in screening. Journal of Medical Screening, 12, 155-160.
Mueller, V. M., Huang, T., Summers, A. M., & Winsor, S. H. M. (2005). The influence of risk estimates obtained from maternal serum screening on amniocentesis rates. Prenatal Diagnosis, 25, 1253-1257.
Muggli, E. E., & Halliday, J. L. (2004). Prenatal diagnostic testing and Down Syndrome in Victoria 1992-2002. Australian & New Zealand Journal of Public Health, 28, 465-470.
Muller, F., Benattar, C., Audibert, F., Roussel, N., Dreux, S., & Cuckle, H. (2003a). First-trimester screening for Down syndrome in France combining fetal nuchal translucency measurement and biochemical markers. Prenatal Diagnosis, 23, 833-836.
Muller, F., Dreux, S., Dupoizat, H., Uzan, S., Dubin, M. F., Oury, J. F., Dingeon, B., & Dommergues, M. (2003b). Second-trimester Down syndrome maternal serum screening in twin pregnancies: impact of chorionicity. Prenatal Diagnosis, 23, 331-335.
Muller, F., Forestier, F., & Dingeon, B. (2002a). Second trimester trisomy 21 maternal serum marker screening. Results of a countrywide study of 854 902 patients. Prenatal Diagnosis, 22, 925-929.
Muller, F., Thibaud, D., Poloce, F., Gelineau, M. C., Bernard, M., Brochet, C., Millet, C., et al. (2002b). Risk of amniocentesis in women screened positive for Down syndrome with second trimester maternal serum markers. Prenatal Diagnosis, 22, 1036-1039.
Mungen, E., Tutuncu, L., Muhcu, M., & Yergok, Y. Z. (2006). Pregnancy outcome following second-trimester amniocentesis: a case-control study. American Journal of Perinatology, 23, 25-30.
National Health Committee (2003). Screening to improve health in New Zealand: criteria to assess screening programmes. Wellington: National Health Committee.
National Health and Medical Research Council (NHMRC) (2000). How to use the evidence: assessment and application of scientific evidence. Canberra: NHMRC.
National Institute of Clinical Excellence (NICE) (2003). Antenatal care: routine care for the healthy pregnant woman. Clinical Guideline 6. London: NICE.
Available from: http://www.nice.org.uk/guidance/CG6/niceguidance/pdf/English
Neilson, J. P., & Alfirevic, Z. (2006). Optimising prenatal diagnosis of Down's syndrome. British Medical Journal, 332, 433-434.
Nicolaides, K. H. (2004). Nuchal translucency and other first-trimester sonographic markers of chromosomal abnormalities. American Journal of Obstetrics & Gynecology, 191, 45-67.
Nicolaides, K. H., Chervenak, F. A., McCullough, L. B., Avgidou, K., & Papageorghio, A. (2005). Evidence-based obstetric ethics and informed decision-making by pregnant women about invasive diagnosis after first-trimester assessment of risk for trisomy 21. American Journal of Obstetrics and Gynecology, 193, 322-326.
Niemimaa, M., Suonpaa, M., Perheentupa, A., Seppala, M., Heinonen, S., Laitinen, P., Ruokonen, A., et al. (2001). Evaluation of first trimester maternal serum and ultrasound screening for Down's syndrome in Eastern and Northern Finland. European Journal of Human Genetics, 9, 404-408.
O'Connell, M. P., Holding, S., Morgan, R. J., & Lindow, S. W. (2000). Biochemical screening for Down syndrome: patients' perception of risk. International Journal of Gynaecology & Obstetrics, 68, 215-218.
O'Leary, P., Breheny, N., Dickinson, J. E., Bower, C., Goldblatt, J., Hewitt, B., Murch, A., et al. (2006). First-trimester combined screening for Down syndrome and other fetal anomalies. Obstetrics & Gynecology, 107, 869-876.
Palomaki, G. E., Knight, G. J., Neveux, L. M., Pandian, R., & Haddow, J. E. (2005). Maternal serum invasive trophoblast antigen and first-trimester Down syndrome screening. Clinical Chemistry, 51, 1499-1504.
Palomaki, G. E., Neveux, L. M., Knight, G. J., Haddow, J. E., & Pandian, R. (2004). Maternal serum invasive trophoblast antigen (hyperglycosylated hCG) as a screening marker for Down syndrome during the second trimester. Clinical Chemistry, 50, 1804-1808.
Palomaki, G. E., Steinort, K., Knight, G. J., & Haddow, J. E. (2006).
Comparing three screening strategies for combining first- and second-trimester Down syndrome markers. Obstetrics & Gynecology, 107, 367-375.
Pandian, R., Cole, L. A., & Palomaki, G. E. (2004). Second-trimester maternal serum invasive trophoblast antigen: a marker for Down syndrome screening. Clinical Chemistry, 50, 1433-1435.
Platt, L. D., Greene, N., Johnson, A., Zachary, J., Thom, E., Krantz, D., Simpson, J. L., et al. (2004). Sequential pathways of testing after first-trimester screening for trisomy 21. Obstetrics & Gynecology, 104, 661-666.
Rahim, R. R., Cuckle, H. S., Sehmi, I. K., & Jones, R. G. (2002). Compromise ultrasound dating policy in maternal serum screening for Down syndrome. Prenatal Diagnosis, 22, 1181-1184.
Rode, L., Wojdemann, K. R., Shalmi, A. C., Larsen, S. O., Sundberg, K., Norgaard-Pedersen, B., Christiansen, M., et al. (2003). Combined first- and second-trimester screening for Down syndrome: an evaluation of proMBP as a marker. Prenatal Diagnosis, 23, 593-598.
Royal Australian and New Zealand College of Obstetricians and Gynaecologists (RANZCOG) (2004). Joint HGSA/RANZCOG recommended best practice guidelines on antenatal screening for Down syndrome and other fetal aneuploidy. C-Obs4. The College. Available from: http://www.ranzcog.edu.au/publications/statements/C-obs4.pdf (accessed on 24.01.07).
Rozenberg, P., Malagrida, L., Cuckle, H., Durand-Zaleski, I., Nisand, I., Audibert, F., Benattar, C., et al. (2002). Down's syndrome screening with nuchal translucency at 12(+0)-14(+0) weeks and maternal serum markers at 14(+1)-17(+0) weeks: a prospective study. Human Reproduction, 17, 1093-1098.
Saltvedt, S., Almstrom, H., Kublickas, M., Valentin, L., Bottinga, R., Bui, T. H., Cederholm, M., et al. (2005). Screening for Down syndrome based on maternal age or fetal nuchal translucency: a randomized controlled trial in 39,572 pregnancies. Ultrasound in Obstetrics & Gynecology, 25, 537-545.
Sancken, U., & Bahner, D. (2003). Comparison of triple-risk assessment of fetal trisomy 21 including total human choriogonadotropin (hCG) or its free beta-subunit (free beta hCG). Fetal Diagnosis & Therapy, 18, 122-127.
Schuchter, K., Hafner, E., Stangl, G., Metzenbauer, M., Hofinger, D., & Philipp, K. (2002). The first trimester 'combined test' for the detection of Down syndrome pregnancies in 4939 unselected pregnancies. Prenatal Diagnosis, 22, 211-215.
Schuchter, K., Hafner, E., Stangl, G., Ogris, E., & Philipp, K. (2001). Sequential screening for trisomy 21 by nuchal translucency measurement in the first trimester and maternal serum biochemistry in the second trimester in a low-risk population. Ultrasound in Obstetrics & Gynecology, 18, 23-25.
Scott, F., Peters, H., Bonifacio, M., McLennan, A., Boogert, A., Kesby, G., & Anderson, J. (2004). Prospective evaluation of a first trimester screening program for Down syndrome and other chromosomal abnormalities using maternal age, nuchal translucency and biochemistry in an Australian population. Australian & New Zealand Journal of Obstetrics & Gynaecology, 44, 205-209.
Scott, F., Peters, H., Boogert, T., Robertson, R., Anderson, J., McLennan, A., Kesby, G., et al. (2002). The loss rates for invasive prenatal testing in a specialised obstetric ultrasound practice. Australian & New Zealand Journal of Obstetrics & Gynaecology, 42, 55-58.
Shohat, M., Frimer, H., Shohat-Levy, V., Esmailzadeh, H., Appelman, Z., Ben-Neriah, Z., Dar, H., et al. (2003). Prenatal diagnosis of Down syndrome: Ten year experience in the Israeli population. American Journal of Medical Genetics, 122 A, 215-222.
Slack, C., Lurix, K., Lewis, S., & Lichten, S. (2006). Prenatal genetics: the evolution and future directions of screening and diagnosis. Journal of Perinatal and Neonatal Nursing, 20, 93-97.
Smith-Bindman, R., Chu, P., Bacchetti, P., Waters, J. J., Mutton, D., & Alberman, E. (2003).
Prenatal screening for Down syndrome in England and Wales and population-based birth outcomes. American Journal of Obstetrics & Gynecology, 189, 980-985.
Soergel, P., Pruggmayer, M., Schwerdtfeger, R., Muhlhaus, K., & Scharf, A. (2006). Screening for trisomy 21 with maternal age, fetal nuchal translucency and maternal serum biochemistry at 11-14 weeks: A regional experience from Germany. Fetal Diagnosis & Therapy, 21, 264-268.
Spencer, K., Berry, E., Crossley, J. A., Aitken, D. A., & Nicolaides, K. H. (2000a). Is maternal serum total hCG a marker of trisomy 21 in the first trimester of pregnancy? Prenatal Diagnosis, 20, 311-317.
Spencer, K., Bindra, R., Cacho, A. M., & Nicolaides, K. H. (2004). The impact of correcting for smoking status when screening for chromosomal anomalies using maternal serum biochemistry and fetal nuchal translucency thickness in the first trimester of pregnancy. Prenatal Diagnosis, 24, 169-173.
Spencer, K., Bindra, R., & Nicolaides, K. H. (2003a). Maternal weight correction of maternal serum PAPP-A and free beta-hCG MoM when screening for trisomy 21 in the first trimester of pregnancy. Prenatal Diagnosis, 23, 851-855.
Spencer, K., & Cuckle, H. S. (2002). Screening for chromosomal anomalies in the first trimester: does repeat maternal serum screening improve detection rates? Prenatal Diagnosis, 22, 903-906.
Spencer, K., Heath, V., El-Sheikhah, A., Ong, C. Y., & Nicolaides, K. H. (2005). Ethnicity and the need for correction of biochemical and ultrasound markers of chromosomal anomalies in the first trimester: a study of Oriental, Asian and Afro-Caribbean populations. Prenatal Diagnosis, 25, 365-369.
Spencer, K., Spencer, C. E., Power, M., Dawson, C., & Nicolaides, K. H. (2003b). Screening for chromosomal abnormalities in the first trimester using ultrasound and maternal serum biochemistry in a one-stop clinic: a review of three years prospective experience.
BJOG: An International Journal of Obstetrics & Gynaecology, 110, 281-286.
Spencer, K., Spencer, C. E., Power, M., Moakes, A., & Nicolaides, K. H. (2000b). One stop clinic for assessment of risk for fetal anomalies: a report of the first year of prospective screening for chromosomal anomalies in the first trimester. BJOG: An International Journal of Obstetrics & Gynaecology, 107, 1271-1275.
Spencer, K., Talbot, J. A., & Abushoufa, R. A. (2002). Maternal serum hyperglycosylated human chorionic gonadotrophin (HhCG) in the first trimester of pregnancies affected by Down syndrome, using a sialic acid-specific lectin immunoassay. Prenatal Diagnosis, 22, 656-662.
Stone, P., & Austin, D. (2006). Assessment of antenatal screening for Down syndrome in New Zealand. Report to the National Screening Unit. Auckland: Auckland UniServices.
Talbot, J. A., Spencer, K., & Abushoufa, R. A. (2003). Detection of maternal serum hCG glycoform variants in the second trimester of pregnancies affected by Down syndrome using a lectin immunoassay. Prenatal Diagnosis, 23, 1-5.
Vergani, P., Locatelli, A., Biffi, A., Ciriello, E., Zagarella, A., Pezzullo, J. C., & Ghidini, A. (2002). Factors affecting the decision regarding amniocentesis in women at genetic risk because of age 35 years or older. Prenatal Diagnosis, 22, 769-774.
von Kaisenberg, C. S., Gasiorek-Wiens, A., Bielicki, M., Bahlmann, F., Meyberg, H., Kossakiewicz, A., Pruggmayer, M., et al. (2002). Screening for trisomy 21 by maternal age, fetal nuchal translucency and maternal serum biochemistry at 11-14 weeks: a German multicenter study. Journal of Maternal-Fetal & Neonatal Medicine, 12, 89-94.
Wald, N. J., Barnes, I. M., Birger, R., & Huttly, W. (2006a). Effect on Down syndrome screening performance of adjusting for marker levels in a previous pregnancy. Prenatal Diagnosis, 26, 539-544.
Wald, N. J., Bestwick, J. P., & Morris, J. K. (2006b). Cross-trimester marker ratios in prenatal screening for Down syndrome.
Prenatal Diagnosis, 26, 514-523.
Wald, N. J., Huttly, W. J., & Hackshaw, A. K. (2003a). Antenatal screening for Down's syndrome with the quadruple test. Lancet, 361, 835-836.
Wald, N. J., Rodeck, C., Hackshaw, A. K., Walters, J., Chitty, L., Mackinson, A. M., & Suruss Research Group (2003b). First and second trimester antenatal screening for Down's syndrome: the results of the serum, urine and ultrasound screening study (SURUSS). Health Technology Assessment (Winchester, England), 7, 1-77.
Wald, N. J., Rudnicka, A. R., & Bestwick, J. P. (2006c). Sequential and contingent prenatal screening for Down syndrome. Prenatal Diagnosis, 26, 769-777.
Wapner, R., Thom, E., Simpson, J. L., Pergament, E., Silver, R., Filkins, K., Platt, L., et al. (2003). First-trimester screening for trisomies 21 and 18. New England Journal of Medicine, 349, 1405-1413.
Wellesley, D., Boyle, T., Barber, J., & Howe, D. T. (2002). Retrospective audit of different antenatal screening policies for Down's syndrome in eight district general hospitals in one health region. British Medical Journal, 325, 15-17.
Wojdemann, K. R., Shalmi, A. C., Christiansen, M., Larsen, S. O., Sundberg, K., Brocks, V., Bang, J., et al. (2005). Improved first-trimester Down syndrome screening performance by lowering the false-positive rate: a prospective study of 9941 low-risk women. Ultrasound in Obstetrics & Gynecology, 25, 227-233.
Wright, D., Bradbury, I., Cuckle, H., Gardosi, J., Tonks, A., Standing, S., & Benn, P. (2006). Three-stage contingent screening for Down syndrome. Prenatal Diagnosis, 26, 528-534.
Wright, D. E., & Bradbury, I. (2005). Repeated measures screening for Down's Syndrome. BJOG: An International Journal of Obstetrics & Gynaecology, 112, 80-83.
Zoppi, M. A., Ibba, R. M., Putzolu, M., Floris, M., & Monni, G. (2001). Nuchal translucency and the acceptance of invasive prenatal chromosomal diagnosis in women aged 35 and older. Obstetrics and Gynecology, 97, 916-920.

Appendix 1: Search Strategies

SEARCH STRATEGIES

Medline
1 Downs syndrome/ (5152)
2 trisomy 21.mp. (1407)
3 1 or 2 (5529)
4 Pregnancy Trimester, First/ (3620)
5 Nuchal Translucency Measurement/ (138)
6 Pregnancy Trimester, Second/ (3504)
7 exp prenatal diagnosis/ (18671)
8 Pregnancy-Associated Plasma Protein-A/ (362)
9 Chorionic Gonadotropin, beta Subunit, Human/ (1289)
10 alpha-Fetoproteins/ (2914)
11 estriol/bl (293)
12 inhibins/bl (753)
13 papp-a.tw. (290)
14 beta hcg.tw. (903)
15 uE3.tw. (97)
16 (unconjugated oestriol or unconjugated estriol).tw. (201)
17 inhibin a.tw. (599)
18 afp.tw. (2476)
19 ((integrated or sequential or contingent or step-wise) adj (screen$ or test$)).mp. (270)
20 ultrasonography, prenatal/ (10659)
21 Maternal Age/ (4280)
22 or/4-21 (31726)
23 mass screening/ (25674)
24 False Positive Reactions/ (6544)
25 false negative reactions/ (4337)
26 (screen$ or test$).mp. (846205)
27 or/23-26 (850494)
28 3 and 22 and 27 (1439)
29 limit 28 to english (1314)
30 limit 29 to yr=2000-2006 (784)
31 (letter or news).pt. (318032)
32 30 not 31 (712)

Medline 2
1 Downs syndrome/ (5207)
2 trisomy 21.mp. (1424)
3 Chromosomes, Human, Pair 21/ (1693)
4 trisomy/ (2631)
5 3 and 4 (153)
6 1 or 2 or 5 (5640)
7 amniocentesis/ (1581)
8 Chorionic Villi Sampling/ (720)
9 chorionic villus.mp. (101)
10 exp Karyotyping/ (11568)
11 or/7-10 (13229)
12 6 and 11 (649)
13 limit 12 to yr=2000-2006 (367)

Embase
1 Downs syndrome/ (5796)
2 trisomy 21.mp. (2002)
3 1 or 2 (6917)
4 fetus echography/ (5445)
5 First Trimester Pregnancy/ (5389)
6 Second Trimester Pregnancy/ (3712)
7 Maternal Serum/ (1089)
8 Alpha Fetoprotein/ (4868)
9 Inhibin A/ (691)
10 papp-a.tw. (550)
11 beta hcg.mp. (873)
12 uE3.tw. (110)
13 (unconjugated oestriol or unconjugated estriol).tw. (200)
14 Maternal Age/ (4165)
15 (nuchal translucency or nuchal fold).tw. (877)
16 exp Prenatal Diagnosis/ (20616)
17 Pregnancy Associated Plasma Protein A/ (410)
18 Chorionic Gonadotropin Beta Subunit/ (2095)
19 ((integrated or sequential or contingent or stepwise) adj (screen$ or test$)).tw. (316)
20 inhibin a.tw. (1921)
21 or/4-20 (37859)
22 Screening Test/ (17624)
23 Mass Screening/ (5299)
24 (screen$ or test$).mp. (795108)
25 Prenatal Screening/ (2369)
26 screening/ (18303)
27 or/22-26 (795108)
28 3 and 21 and 27 (1612)
29 limit 28 to yr=2000-2006 (1039)
30 limit 29 to english (929)
31 letter.pt. (203923)
32 30 not 31 (848)

Embase 2
1 Downs syndrome/ (5916)
2 Trisomy 21/ (1684)
3 1 or 2 (6902)
4 exp amniocentesis/ (3374)
5 chorion villus sampling/ (983)
6 chorionic vill$.mp. (1306)
7 karyotyping/ (2675)
8 fetus karyotyping/ (251)
9 or/4-8 (7120)
10 3 and 9 (814)
11 limit 10 to yr=2000-2006 (551)

Cochrane Central Register of Controlled Trials
1 Downs syndrome/ (121)
2 trisomy 21.mp. (15)
3 1 or 2 (123)
4 Pregnancy Trimester, First/ (339)
5 Nuchal Translucency Measurement/ (2)
6 Pregnancy Trimester, Second/ (334)
7 exp prenatal diagnosis/ (383)
8 Pregnancy-Associated Plasma Protein-A/ (3)
9 Chorionic Gonadotropin, beta Subunit, Human/ (40)
10 alpha-Fetoproteins/ (73)
11 estriol/bl (22)
12 inhibins/bl (33)
13 papp-a.tw. (7)
14 beta hcg.tw. (61)
15 uE3.tw. (1)
16 (unconjugated oestriol or unconjugated estriol).tw. (10)
17 inhibin a.tw. (68)
18 afp.tw. (76)
19 ((integrated or sequential or contingent or step-wise) adj (screen$ or test$)).mp. (31)
20 ultrasonography, prenatal/ (222)
21 Maternal Age/ (143)
22 or/4-21 (1355)
23 down$ syndrome.mp. (179)
24 3 or 23 (180)
25 22 and 24 (28)
26 limit 25 to yr=2000-2006 (11)

Cochrane Central Register of Controlled Trials 2
1 Downs syndrome/ (124)
2 trisomy 21.mp. (15)
3 Chromosomes, Human, Pair 21/ (14)
4 trisomy/ (12)
5 3 and 4 (2)
6 1 or 2 or 5 (128)
7 amniocentesis/ (88)
8 Chorionic Villi Sampling/ (50)
9 chorionic villus.mp. (0)
10 exp Karyotyping/ (88)
11 or/7-10 (172)
12 6 and 11 (7)
13 limit 12 to yr=2000-2006 (3)

Cinahl
1 Downs syndrome/ (1171)
2 trisomy 21.mp. (59)
3 down$ syndrome.tw. (863)
4 or/1-3 (1280)
5 (screen$ or test$).mp. (182249)
6 false positive$.mp. or false negative$.tw. [mp=title, subject heading word, abstract, instrumentation] (1714)
7 (false positive$ or false negative$).mp. (1881)
8 or/5-7 (182762)
9 Pregnancy Trimester, First/ (410)
10 Pregnancy Trimester, Second/ (381)
11 exp Prenatal Diagnosis/ (2412)
12 exp Ultrasonography, Prenatal/ (1119)
13 (nuchal translucency or nuchal fold).mp. (51)
14 pregnancy associated plasma protein a.mp. (12)
15 papp-a.mp. (7)
16 Gonadotropins, Chorionic/ (201)
17 beta hcg.tw. (29)
18 Alpha Fetoproteins/ (131)
19 ESTRIOL/bl [Blood] (21)
20 inhibin a.mp. (9)
21 (uE3 or unconjugated oestriol or unconjugated estriol).tw. (16)
22 afp.tw. (74)
23 Maternal Age/ (591)
24 (integrated or sequential or contingent or step wise).tw. (8005)
25 or/9-24 (11580)
26 4 and 8 and 25 (252)
27 limit 26 to yr=2000-2006 (184)
28 limit 27 to english (183)
29 letter.pt. (33692)
30 28 not 29 (179)

Cinahl 2
1 Downs syndrome/ (1187)
2 trisomy 21.mp. (60)
3 Chromosomes, Human, Pair 21/ (0)
4 trisomy/ (0)
5 3 and 4 (0)
6 1 or 2 or 5 (1205)
7 amniocentesis/ (314)
8 Chorionic Villi Sampling/ (95)
9 chorionic villus.mp. (1)
10 exp Karyotyping/ (60)
11 or/7-10 (391)
12 6 and 11 (80)
13 limit 12 to yr=2000-2006 (58)

Psychinfo
1 exp Downs syndrome/ (933)
2 trisomy 21.mp. (44)
3 1 or 2 (938)
4 amniocentesis.mp. (45)
5 karyotyping.mp. (10)
6 chorion$ vill$.mp. (3)
7 or/4-6 (56)
8 3 and 7 (8)

Current Contents
1. Downs syndrome
2. Down’s syndrome
3. Trisomy 21
4. #1 OR #2 OR #3
5. Screen*
6. Testing
7. Tests
8. #4 AND (#5 OR #6 OR #7)
9. First trimester OR second trimester
10. Nuchal translucency OR nuchal fold
11. Pregnancy associated plasma protein a OR papp-a
12. (Human chorionic gonadotropn SAME beta) OR beta hcg
13. Alpha fetoproteins OR alha foetoproteins OR afp
14. Inhibin a
15. (integrated OR sequential OR continugent OR stepwise) SAME (screen* OR test*)
16. Prenatal SAME (ultrasonography OR ultrasound)
17. Unconjugated estriol OR uncongugated oestriol or ue3
18. Maternal age
19. #9 OR #10 OR #11 OR #12 OR #13 OR #14 OR #15 OR #16 OR #17 OR #18
20. #8 AND #19
21. false positive OR false negative
22. #8 AND #21
23. #20 OR #22

Current Contents 2
1. Downs syndrome OR trisomy 21
2. Amniocentesis
3. Karyotyp*
4. chorionic villus OR chorionic villi OR chorionic villus
5. #2 OR #3 OR #4
6. #1 AND #5

SEARCHES FROM OTHER SOURCES

In databases and all other sources without controlled vocabulary combinations of the index terms and additional keywords from the above strategies were used in the search.

Appendix 2: Sources searched

SOURCES SEARCHED

Bibliographic databases
Medline
Embase
Cinahl
Psychinfo
Current Contents
Cochrane Central Register of Controlled Trials
Index New Zealand
PubMed (last 60 days)

Review databases
ACP Journal Club – via Ovid
Cochrane Database of Systematic Reviews – (Wiley Interscience version)
Database of Abstracts of Reviews of Effectiveness - http://www.crd.york.ac.uk/crdweb/
NHS Economic Evaluation database http://www.crd.york.ac.uk/crdweb/
Health Technology Assessment database http://www.crd.york.ac.uk/crdweb/

Evidence-based collections
TRIP Database http://www.tripdatabase.com
ATTRACT http://www.attract.wales.nhs.uk
NHS Technology Assessment Programme http://www.hta.nhsweb.nhs.uk/ (UK)
National Institute for Health & Clinical Excellence http://www.nice.org.uk

Other
UK National Screening Committee Down’s Syndrome Screening Programme http://www.screening.nhs.uk/downs/home.htm
Clinical Trials.gov http://www.clinicaltrials.gov
Current Controlled Trials http://www.controlled-trials.com

References of retrieved papers were scanned for relevant publications

Appendix 3: Retrieved studies excluded for review: Part A

Anonymous. (2003). First trimester ultrasound identifies more cases of Down syndrome than second trimester maternal serum screening and is more cost effective. Research Activities, 273, 8-9.
Acacio, G. L., Barini, R., Pinto Junior, W., Ximenes, R. L., Pettersen, H., & Faria, M. (2001). Nuchal translucency: an ultrasound marker for fetal chromosomal abnormalities. Sao Paulo Medical Journal = Revista Paulista de Medicina, 119, 19-23.
Akbas, S. H., Ozben, T., Alper, O., Ugur, A., Yucel, G., & Luleci, G. (2001). Maternal serum screening for Down's syndrome, open neural tube defects and trisomy 18. Clinical Chemistry & Laboratory Medicine, 39, 487-490.
Bahado-Singh, R., Shahabi, S., Karaca, M., Mahoney, M. J., Cole, L., & Oz, U. A. (2002). The comprehensive midtrimester test: high-sensitivity Down syndrome test. American Journal of Obstetrics & Gynecology, 186, 803-808.
Bahado-Singh, R. O., & Cheng, C. S. (2004). First trimester prenatal diagnosis. Current Opinion in Obstetrics & Gynecology, 16, 177-181.
Bahado-Singh, R. O., Mendilcioglu, I., Rowther, M., Choi, S. J., Oz, U., Yousefi, N. F., & Mahoney, M. J. (2002). Early genetic sonogram for Down syndrome detection. American Journal of Obstetrics & Gynecology, 187, 1235-1238.
Bahado-Singh, R. O., Oz, U., Shahabi, S., Mahoney, M. J., Baumgarten, A., & Cole, L. A. (2000). Comparison of urinary hyperglycosylated human chorionic gonadotropin concentration with the serum triple screen for Down syndrome detection in high-risk pregnancies. American Journal of Obstetrics & Gynecology, 183, 1114-1118.
Ball, R. H. (2004). Invasive fetal testing. Current Opinion in Obstetrics & Gynecology, 16, 159-162.
Beaman, J. M., & Goldie, D. J. (2001). Second trimester screening for Down's syndrome: 7 years experience. Journal of Medical Screening, 8, 128-131.
Benn, P. A., Egan, J. F. X., & Ingardia, C. J. (2002). Extreme second-trimester serum analyte values in Down syndrome pregnancies with hydrops fetalis. Journal of Maternal-Fetal & Neonatal Medicine, 11, 262-265.
Bersinger, N. A., Vanderlick, F., Birkhauser, M. H., Janecek, P., & Wunder, D. (2005). First trimester serum concentrations of placental proteins in singleton and multiple IVF pregnancies: Implications for Down syndrome screening. Immuno-Analyse et Biologie Specialisee, 20, 21-27.
Bersinger, N. A., Wunder, D., Vanderlick, F., Chanson, A., Pescia, G., Janecek, P., Boillat, E., et al. (2004). Maternal serum levels of placental proteins after in vitro fertilisation and their implications for prenatal screening. Prenatal Diagnosis, 24, 471-477.
Bianchi, D. W. (2004). Circulating fetal DNA: its origin and diagnostic potential-a review. Placenta, 25 Suppl A, S93-S101.
Biggio, J. R., Jr., Morris, T. C., Owen, J., & Stringer, J. S. (2004). An outcomes analysis of five prenatal screening strategies for trisomy 21 in women younger than 35 years. American Journal of Obstetrics & Gynecology, 190, 721-729.
Bindra, R., Heath, V., Liao, A., Spencer, K., & Nicolaides, K. H. (2002). One-stop clinic for assessment of risk for trisomy 21 at 11-14 weeks: a prospective study of 15 030 pregnancies. Ultrasound in Obstetrics & Gynecology, 20, 219-225.
Borrell, A., Casals, E., Fortuny, A., Farre, M. T., Gonce, A., Sanchez, A., Soler, A., et al. (2004). First-trimester screening for trisomy 21 combining biochemistry and ultrasound at individually optimal gestational ages. An interventional study. Prenatal Diagnosis, 24, 541-545.
Borruto, F., Comparetto, C., Acanfora, L., Bertini, G., & Rubaltelli, F. F. (2002). Role of ultrasound evaluation of nuchal translucency in prenatal diagnosis. Clinical & Experimental Obstetrics & Gynecology, 29, 235-241.
Brigatti, K. W., & Malone, F. D. (2004). First-trimester screening for aneuploidy. Obstetrics & Gynecology Clinics of North America, 31, 1-20.
Brizot, M. L., Carvalho, M. H., Liao, A. W., Reis, N. S., Armbruster-Moraes, E., & Zugaib, M. (2001). First-trimester screening for chromosomal abnormalities by fetal nuchal translucency in a Brazilian population. Ultrasound in Obstetrics & Gynecology, 18, 652-655.
Canini, S., Prefumo, F., Famularo, L., Venturini, P. L., Palazzese, V., & De Biasio, P. (2002). Comparison of first trimester, second trimester and integrated Down's syndrome screening results in unaffected pregnancies. Clinical Chemistry & Laboratory Medicine, 40, 600-603.
Celentano, C., Guanciali-Franchi, P. E., Liberati, M., Palka, C., Fantasia, D., Morizio, E., Calabrese, et al. (2005). Lack of correlation between elevated maternal serum hCG during second-trimester biochemical screening and fetal congenital anomaly. Prenatal Diagnosis, 25, 220-224.
Centini, G., Rosignoli, L., Kenanidis, A., Talluri, B., Pasqui, L., Scarinci, R., De Simone, S., et al. (2004). Can a selective use of amniocentesis replace the routine procedure for advanced maternal age? Italian Journal of Gynaecology & Obstetrics, 16, 27-31.
Centini, G., Rosignoli, L., Scarinci, R., Faldini, E., Morra, C., Centini, G., & Petraglia, F. (2005). Reevaluation of risk for Down syndrome by means of the combined test in pregnant women of 35 years or more. Prenatal Diagnosis, 25, 133-136.
Cha, D. H., Khosrotehrani, K., Bianchi, D. W., & Johnson, K. L. (2005). The utility of an erythroblast scoring system and gender-independent short tandem repeat (STR) analysis for the detection of aneuploid fetal cells in maternal blood. Prenatal Diagnosis, 25, 586-591.
Chasen, S. T., Sharma, G., Kalish, R. B., & Chervenak, F. A. (2003). First-trimester screening for aneuploidy with fetal nuchal translucency in a United States population. Ultrasound in Obstetrics & Gynecology, 22, 149-151.
Chen, C. P., Lin, C. J., & Wang, W. (2005). Impact of second-trimester maternal serum screening on prenatal diagnosis of Down syndrome and the use of amniocentesis in the Taiwanese population. Taiwanese Journal of Obstetrics & Gynecology, 44, 31-35.
Chen, M., Lam, Y. H., Lee, C. P., & Tang, M. H. Y. (2004). Ultrasound screening of fetal structural abnormalities at 12 to 14 weeks in Hong Kong. Prenatal Diagnosis, 24, 92-97.
Cheng, C. C., Bahado-Singh, R. O., Chen, S. C., & Tsai, M. S. (2004). Pregnancy outcomes with increased nuchal translucency after routine Down syndrome screening. International Journal of Gynaecology & Obstetrics, 84, 5-9.
Cheng, P. J., Chu, D. C., Chueh, H. Y., See, L. C., Chang, H. C., & Weng, D.-H. (2004). Elevated maternal midtrimester serum free beta-human chorionic gonadotropin levels in vegetarian pregnancies that cause increased false-positive Down syndrome screening results. American Journal of Obstetrics & Gynecology, 190, 442-447.
Christiansen, M., Hogdall, E. V., Larsen, S. O., & Hogdall, C. (2002). The variation of risk estimates through pregnancy in second trimester maternal serum screening for Down syndrome. Prenatal Diagnosis, 22, 385-387.
Christiansen, M., Larsen, S. O., Oxvig, C., Qin, Q. P., Wagner, J. M., Overgaard, M. T., Gleich, G. J., et al. (2004). Screening for Down's syndrome in early and late first and second trimester using six maternal serum markers. Clinical Genetics, 65, 11-16.
Christiansen, M., & Norgaard-Pedersen, B. (2005). Inhibin A is a maternal serum marker for Down's syndrome early in the first trimester. Clinical Genetics, 68, 35-39.
Cicero, S., Bindra, R., Rembouskos, G., Spencer, K., & Nicolaides, K. H. (2003). Integrated ultrasound and biochemical screening for trisomy 21 using fetal nuchal translucency, absent fetal nasal bone, free beta-hCG and PAPP-A at 11 to 14 weeks. Prenatal Diagnosis, 23, 306-310.
Cole, L. A., Sutton, J. M., & Stephens, N. D. (2003). Invasive trophoblast antigen (ITA): a new high-sensitivity test for detecting gestational Down syndrome. Journal of Clinical Ligand Assay, 26, 121-128.
Comas, C., Antolin, E., Torrents, M., Muñoz, A., Figueras, F., Echevarria, M., Gomez, O., et al. (2001). Early screening for chromosomal abnormalities: new strategies combining biochemical, sonographic and doppler parameters. Prenatal & Neonatal Medicine, 6, 95-102.
Comas, C., Torrents, M., Munoz, A., Antolin, E., Figueras, F., & Echevarria, M. (2002). Measurement of nuchal translucency as a single strategy in trisomy 21 screening: should we use any other marker? Obstetrics & Gynecology, 100, 648-654.
Crossley, J. A., Aitken, D. A., Waugh, S. M., Kelly, T., & Connor, J. M. (2002). Maternal smoking: age distribution, levels of alpha-fetoprotein and human chorionic gonadotrophin, and effect on detection of Down syndrome pregnancies in second-trimester screening. Prenatal Diagnosis, 22, 247-255.
Cuckle, H., Aitken, D., Goodburn, S., Senior, B., Spencer, K., Standing, S., & UK National Down's Syndrome Screening Programme, L. A. G. (2004). Age-standardisation when target setting and auditing performance of Down syndrome screening programmes. Prenatal Diagnosis, 24, 851-856.
Cusick, W., Buchanan, P., Hallahan, T. W., Krantz, D. A., Larsen, J. W., Jr., & Macri, J. N. (2003). Combined first-trimester versus second-trimester serum screening for Down syndrome: a cost analysis. American Journal of Obstetrics & Gynecology, 188, 745-751.
DeVore, G. R. (2001). The genetic sonogram: its use in the detection of chromosomal abnormalities in fetuses of women of advanced maternal age. Prenatal Diagnosis, 21, 40-45.
DeVore, G. R., & Romero, R. (2002). Genetic sonography: a cost-effective method for evaluating women 35 years and older who decline genetic amniocentesis. Journal of Ultrasound in Medicine, 21, 5-13.
DeVore, G. R., & Romero, R. (2003). Genetic sonography: an option for women of advanced maternal age with negative triple-marker maternal serum screening results. Journal of Ultrasound in Medicine, 22, 1191-1199.
Dixon, J., Pillai, M., Mahendran, D., & Brooks, M. (2004). An assessment of the Down syndrome antenatal screening policies of East and West Gloucestershire between 1993 and 1999. Journal of Obstetrics & Gynaecology, 24, 760-764.
Dormandy, E., Hooper, R., Michie, S., & Marteau, T. M. (2002). Informed choice to undergo prenatal screening: a comparison of two hospitals conducting testing either as part of a routine visit or requiring a separate visit. Journal of Medical Screening, 9, 109-114.
Drysdale, K., Ridley, D., Walker, K., Higgins, B., & Dean, T. (2002). First-trimester pregnancy scanning as a screening tool for high-risk and abnormal pregnancies in a district general hospital setting. Journal of Obstetrics & Gynaecology, 22, 159-165.
Egan, J. F., Benn, P., Borgida, A. F., Rodis, J. F., Campbell, W. A., & Vintzileos, A. M. (2000). Efficacy of screening for fetal Down syndrome in the United States from 1974 to 1997. Obstetrics & Gynecology, 96, 979-985.
Egan, J. F., Malakh, L., Turner, G. W., Markenson, G., Wax, J. R., & Benn, P. A. (2001). Role of ultrasound for Down syndrome screening in advanced maternal age. American Journal of Obstetrics & Gynecology, 185, 1028-1031.
Falcon, O., Auer, M., Gerovassili, A., Spencer, K., & Nicolaides, K. H. (2006). Screening for trisomy 21 by fetal tricuspid regurgitation, nuchal translucency and maternal serum free beta-hCG and PAPP-A at 11 + 0 to 13 + 6 weeks. Ultrasound in Obstetrics & Gynecology, 27, 151-155.
Fortuny, A., Borell, A., Casals, E., Seres, A., Sanchez, A., & Soler, A. (2005). First trimester aneuploidy screening combining biochemical and ultrasound markers. Ultrasound Review of Obstetrics & Gynecology, 5, 9-17.
Ghisoni, L., Ferrazzi, E., Castagna, C., Levi Setti, P. E., Masini, A. C., & Pigni, A. (2003). Prenatal diagnosis after ART success: the role of early combined screening tests in counselling pregnant patients. Placenta, 24, S99-S103.
Goldberg, J. D. (2004). Routine screening for fetal anomalies: expectations. Obstetrics & Gynecology Clinics of North America, 31, 35-50.
Green, J. M., Hewison, J., Bekker, H. L., Bryant, L. D., & Cuckle, H. S. (2004). Psychosocial aspects of genetic screening of pregnant women and newborns: A systematic review. Health Technology Assessment (Winchester, England), 8, iii-87.
Hadlow, N. C., Hewitt, B. G., Dickinson, J. E., Jacoby, P., & Bower, C. (2005). Community-based screening for Down's Syndrome in the first trimester using ultrasound and maternal serum biochemistry. BJOG: An International Journal of Obstetrics & Gynaecology, 112, 1561-1564.
Harris, A. H. (2004). The cost effectiveness of prenatal ultrasound screening for trisomy 21. International Journal of Technology Assessment in Health Care, 20, 464-468.
Has, R., Kalelioglu, I., Ermis, H., Ibrahimoglu, L., Yuksel, A., Yildirim, A., & Basaran, S. (2006). Screening for fetal chromosomal abnormalities with nuchal translucency measurement in the first trimester. Fetal Diagnosis and Therapy, 21, 355-359.
Herman, A., Dreazen, E., Herman, A. M., Batukan, C. E., Holzgreve, W., & Tercanli, S. (2002). Bedside estimation of Down syndrome risk during first-trimester ultrasound screening. Ultrasound in Obstetrics & Gynecology, 20, 468-475.
Hsu, J. J., Chiang, C. H., Hsieh, C. C., & Hsieh, T. T. (2004). The influence of image magnification in first-trimester screening for Down syndrome by fetal nuchal translucency in Asians. Prenatal Diagnosis, 24, 1007-1012.
Hulten, M. (2004). Combined serum and nuchal translucency screening in the first trimester achieves 85% to 90% detection rate for Down and Edward syndromes. Evidence-Based Healthcare, 8, 82-84.
Hung, J. H., Fu, C. Y., Yuan, C. C., Chen, C. L., Yang, M. L., Shu, L. P., & Wu, C. C. (2003). Nuchal translucence incorporated into a one-stage multifactorial screening model for Down syndrome prediction at second-trimester pregnancy. Ultrasound in Medicine & Biology, 29, 1667-1674.
Hwa, H. L., Yen, M. F., Hsieh, F. J., Ko, T. M., & Chen, T. H. (2004). Evaluation of second trimester maternal serum screening for Down's Syndrome using the Spiegelhalter-Knill-Jones (S-KJ) approach. Journal of Perinatal Medicine, 32, 407-412.
Jaques, A. M., Collins, V. R., Haynes, K., Sheffield, L. J., Francis, I., Forbes, R., & Halliday, J. L. (2006). Using record linkage and manual follow-up to evaluate the Victorian maternal serum screening quadruple test for Down's syndrome, trisomy 18 and neural tube defects. Journal of Medical Screening, 13, 8-13.
Jou, H. J., Shyu, M. K., Chen, S. M., Shih, J. C., Hsu, J. J., & Hsieh, F. J. (2000). Maternal serum screening for Down syndrome by using alpha-fetoprotein and human chorionic gonadotropin in an Asian population. A prospective study. Fetal Diagnosis & Therapy, 15, 108-111.
Kennelly, M., Carroll, S., & Parland, P. M. (2004). Nuchal translucency audit: low uptake of invasive testing in screen positive cases. Irish Medical Journal, 97, 304-305.
Kim, M. H., Park, S. H., Cho, H. J., Choi, J. S., Kim, J. O., Ahn, H. K., Shin, J. S., et al. (2006). Threshold of nuchal translucency for the detection of chromosomal aberration: comparison of different cut-offs. Journal of Korean Medical Science, 21, 11-14.
Kim, S. K., Bai, S. W., Chung, J. E., Jung, Y. N., Park, K. H., Cho, D. J., Kim, J. W., Yang, Y. H., et al. (2001). Triple marker screening for fetal chromosomal abnormalities in Korean women of advanced maternal age. Yonsei Medical Journal, 42, 199-203.
Kishida, T., Hoshi, N., Hattori, R., Negishi, H., Yamada, H., Okuyama, K., Hanatani, K., et al. (2000). Efficacy of maternal serum screening in the prenatal detection of fetal chromosome abnormalities in Japanese women. Fetal Diagnosis & Therapy, 15, 112-117.
Krantz, D., Goetzl, L., Simpson, J. L., Thom, E., Zachary, J., Hallahan, T. W., Silver, R., et al. (2004). Association of extreme first-trimester free human chorionic gonadotropin-beta, pregnancy-associated plasma protein A, and nuchal translucency with intrauterine growth restriction and other adverse pregnancy outcomes. American Journal of Obstetrics & Gynecology, 191, 1452-1458.
Krantz, D. A., Hallahan, T. W., Macri, V. J., & Macri, J. N. (2005). Maternal weight and ethnic adjustment within a first-trimester Down syndrome and trisomy 18 screening program. Prenatal Diagnosis, 25, 635-640.
Kremensky, I., Jordanova, A., Michaylova, E., Todorova, A., Ivanova, M., Petkova, R., Andonova, S., et al. (2000). Laboratory diagnosis of inherited disorders and congenital anomalies in Bulgaria. Balkan Journal of Medical Genetics, 3, 13-21.
Lai, T. H., Chen, S. C., Tsai, M. S., Lee, F. K., & Wei, C. F. (2003). First-trimester screening for Down syndrome in singleton pregnancies achieved by intrauterine insemination. Journal of Assisted Reproduction & Genetics, 20, 327-331.
Lambert-Messerlian, G., Dugoff, L., Vidaver, J., Canick, J. A., Malone, F. D., Ball, R. H., Comstock, C. H., et al. (2006). First- and second-trimester Down syndrome screening markers in pregnancies achieved through assisted reproductive technologies (ART): a FASTER trial study. Prenatal Diagnosis, 26, 672-8. Epub ahead of print.
Lambert-Messerlian, G. M., & Canick, J. A. (2004). Clinical application of inhibin a measurement: prenatal serum screening for Down syndrome. Seminars in Reproductive Medicine, 22, 235-242.
Leung, T. Y., Spencer, K., Leung, T. N., Fung, T. Y., & Lau, T. K. (2006). Higher median levels of free beta-hCG and PAPP-A in the first trimester of pregnancy in a Chinese ethnic group. Implication for first trimester combined screening for Down's syndrome in the Chinese population. Fetal Diagnosis & Therapy, 21, 140-143.
Lewis, S. M., Cullinane, F. M., Carlin, J. B., & Halliday, J. L. (2006). Women's and health professionals' preferences for prenatal testing for Down syndrome in Australia. Australian and New Zealand Journal of Obstetrics and Gynaecology, 46, 205-211.
Lewis, S. M., Cullinane, F. N., Bishop, A. J., Chitty, L. S., Marteau, T. M., & Halliday, J. L. (2006). A comparison of Australian and UK obstetricians' and midwives' preferences for screening tests for Down syndrome. Prenatal Diagnosis, 26, 60-66.
Lim, K. I., Pugash, D., Dansereau, J., & Wilson, R. D. (2002). Nuchal index: a gestational age independent ultrasound marker for the detection of Down syndrome. Prenatal Diagnosis, 22, 1233-1237.
Liu, S. S., Lee, F. K., Lee, J. L., Tsai, M. S., Cheong, M. L., She, B. Q., & Chen, S. C. (2004). Pregnancy outcomes in unselected singleton pregnant women with an increased risk of first-trimester Down's syndrome. Acta Obstetricia et Gynecologica Scandinavica, 83, 1130-1134.
Locatelli, A., Piccoli, M. G., Vergani, P., Mariani, E., Ghidini, A., Mariani, S., & Pezzullo, J. C. (2000). Critical appraisal of the use of nuchal fold thickness measurements for the prediction of Down syndrome. American Journal of Obstetrics & Gynecology, 182, 192-197.
MacRae, A. R., Gardner, H. A., Allen, L. C., Tokmakejian, S., & Lepage, N. (2003). Outcome validation of the Beckman Coulter access analyzer in a second-trimester Down syndrome serum screening application. Clinical Chemistry, 49, 69-76.
Malone, F. D. (2005). Nuchal translucency-based Down syndrome screening: barriers to implementation. Seminars in Perinatology, 29, 272-276.
Malone, F. D., Ball, R. H., Nyberg, D. A., Comstock, C. H., Saade, G. R., Berkowitz, R. L., Gross, S., et al. (2005). First-trimester septated cystic hygroma: prevalence, natural history, and pediatric outcome. Obstetrics & Gynecology, 106, 288-294.
Malone, F. D., D'Alton, M. E., & Society for Maternal-Fetal, M. (2003). First-trimester sonographic screening for Down syndrome. Obstetrics & Gynecology, 102, 1066-1079.
Marical, H., Douet-Guilbert, N., Bages, K., Collet, M., Le Bris, M. J., Morel, F., & De Braekeleer, M. (2006). Second-trimester prenatal screening for trisomy 21 using biochemical markers: A 7-year experience in one cytogenetic laboratory. Prenatal Diagnosis, 26, 308-312.
Marsis, I. O. (2004). Screening for Down syndrome using nuchal translucency thickness and nasal bone examination at advanced maternal age in Jakarta: A preliminary report. Journal of Medical Ultrasound, 12, 1-6.
Matias, A., Montenegro, N., & Blickstein, I. (2005). Down syndrome screening in multiple pregnancies. Obstetrics & Gynecology Clinics of North America, 32, 81-96.
Maymon, R., Betser, M., Dreazen, E., Padoa, A., & Herman, A. (2004). A model for disclosing the first trimester part of an integrated Down's syndrome screening test. Clinical Genetics, 65, 113-119.
Maymon, R., Jauniaux, E., Holmes, A., Wiener, Y. M., Dreazen, E., & Herman, A. (2001). Nuchal translucency measurement and pregnancy outcome after assisted conception versus spontaneously conceived twins. Human Reproduction, 16, 1999-2004.
Maymon, R., Sharony, R., Grinshpun-Cohen, J., Itzhaky, D., Herman, A., & Reish, O. (2005). The best marker combination using the integrated screening test approach for detecting various chromosomal aneuploidies. Journal of Perinatal Medicine, 33, 392-398.
Maymon, R., & Shulman, A. (2002). Serial first- and second-trimester Down's syndrome screening tests among IVF-versus naturally-conceived singletons. Human Reproduction, 17, 1081-1085.
Maymon, R., & Shulman, A. (2004). Integrated first- and second-trimester Down syndrome screening test among unaffected IVF pregnancies. Prenatal Diagnosis, 24, 125-129.
Meier, C., Huang, T., Wyatt, P. R., & Summers, A. M. (2002). Accuracy of expected risk of Down syndrome using the second-trimester triple test. Clinical Chemistry, 48, 653-655.
Monni, G., Zoppi, M. A., Ibba, R. M., Floris, M., Manca, F., & Axiana, C. (2005). Nuchal translucency and nasal bone for trisomy 21 screening: single center experience. Croatian Medical Journal, 46, 786-791.
Mueller, V. M., Huang, T., Summers, A. M., & Winsor, S. H. M. (2005). The effect of fetal gender on the false-positive rate of Down syndrome by maternal serum screening. Prenatal Diagnosis, 25, 1258-1261.
Muller, F., Dreux, S., Lemeur, A., Sault, C., Desgres, J., Bernard, M. A., Giorgetti, C., et al. (2003). Medically assisted reproduction and second-trimester maternal serum marker screening for Down syndrome. Prenatal Diagnosis, 23, 1073-1076.
Muller, F., Dreux, S., Oury, J. F., Luton, D., Uzan, S., Uzan, M., Levardon, M., & Dommergues, M. (2002). Down syndrome maternal serum marker screening after 18 weeks' gestation. Prenatal Diagnosis, 22, 1001-1004.
Nicolaides, K. H. (2004). Nuchal translucency and other first-trimester sonographic markers of chromosomal abnormalities. American Journal of Obstetrics & Gynecology, 191, 45-67.
Nicolaides, K. H., Bindra, R., Heath, V., & Cicero, S. (2002). One-stop clinic for assessment of risk of chromosomal defects at 12 weeks of gestation. Journal of Maternal-Fetal & Neonatal Medicine, 12, 9-18.
Nicolaides, K. H., Spencer, K., Avgidou, K., Faiola, S., & Falcon, O. (2005). Multicenter study of first-trimester screening for trisomy 21 in 75 821 pregnancies: Results and estimation of the potential impact of individual risk-oriented two-stage first-trimester screening. Ultrasound in Obstetrics & Gynecology, 25, 221-226.
O'Callaghan, S. P., Giles, W. B., Raymond, S. P., McDougall, V., Morris, K., & Boyd, J. (2000). First trimester ultrasound with nuchal translucency measurement for Down syndrome risk estimation using software developed by the Fetal Medicine Foundation, United Kingdom--the first 2000 examinations in Newcastle, New South Wales, Australia. Australian & New Zealand Journal of Obstetrics & Gynaecology, 40, 292-295.
O'Connell, M. P., Holding, S., Morgan, R. J., & Lindow, S. W. (2000). Biochemical screening for Down syndrome: patients' perception of risk. International Journal of Gynaecology & Obstetrics, 68, 215-218.
Odibo, A. O., Stamilio, D. M., Nelson, D. B., Sehdev, H. M., & Macones, G. A. (2005). A cost-effectiveness analysis of prenatal screening strategies for Down syndrome. Obstetrics & Gynecology, 106, 562-568.
Onda, T., Tanaka, T., Yoshida, K., Nakamura, Y., Kudo, R., Yamamoto, H., Sato, A., et al. (2000). Triple marker screening for trisomy 21, trisomy 18 and open neural tube defects in singleton pregnancies of native Japanese pregnant women. Journal of Obstetrics & Gynaecology Research, 26, 441-447. Orlandi, F., Rossi, C., Allegra, A., Krantz, D., Hallahan, T., Orlandi, E., & Macri, J. (2002). First trimester screening with free beta-hCG, PAPP-A and nuchal translucency in pregnancies conceived with assisted reproduction. Prenatal Diagnosis, 22, 718-721. Palomaki, G. E., Knight, G. J., Roberson, M. M., Cunningham, G. C., Lee, J. E., Strom, C. M., & Pandian, R. (2004). Invasive trophoblast antigen (hyperglycosylated human chorionic gonadotropin) in second-trimester maternal urine as a marker for down syndrome: preliminary results of an observational study on fresh samples. Clinical Chemistry, 50, 182-189. Panburana, P., Ajjimakorn, S., & Tungkajiwangoon, P. (2001). First trimester Down Syndrome screening by nuchal translucency in a Thai population. International Journal of Gynaecology & Obstetrics, 75, 311-312. Parano, E., Falcidia, E., Grillo, A., Takabayashi, H., Trifiletti, R. R., & Pavone, P. (2001). Fetal nucleated red blood cell counts in peripheral blood of mothers bearing Down syndrome fetus. Neuropediatrics, 32, 147-149. Perenc, M., Dudarewicz, L., & Kaluzewski, B. (2000). Utility of the triple test in the detection of abnormalities of the feto-placental unit. Medical Science Monitor, 6, 994-999. Perni, S. C., Predanic, M., Kalish, R. B., Chervenak, F. A., & Chasen, S. T. (2006). Clinical use of first-trimester aneuploidy screening in a United States population can replicate data from clinical trials. American Journal of Obstetrics & Gynecology, 194, 127-130. Pertl, B., & Bianchi, D. W. (2001). Fetal DNA in maternal plasma: emerging clinical applications. Obstetrics & Gynecology, 98, 483-490. Platt, L. D. (2005). First-trimester risk assessment: twin gestations. 
Seminars in Perinatology, 29, 258-262. Platt, L. D., Greene, N., Johnson, A., Zachary, J., Thom, E., Krantz, D., Simpson, J. L., et al. (2004). Sequential pathways of testing after first-trimester screening for trisomy 21. Obstetrics & Gynecology, 104, 661-666. Prefumo, F., & Thilaganathan, B. (2002). Agreement between predicted risk and prevalence of Down syndrome in first trimester nuchal translucency screening. Prenatal Diagnosis, 22, 917-918. Raty, R., Virtanen, A., Koskinen, P., Anttila, L., Forsstrom, J., Laitinen, P., Morsky, P., et al. (2002). Serum free beta-HCG and alpha-fetoprotein levels in IVF, ICSI and frozen embryo transfer pregnancies in maternal mid-trimester serum screening for Down's syndrome. Human Reproduction, 17, 481-484. Rice, J. D., McIntosh, S. F., & Halstead, A. C. (2005). Second-trimester maternal serum screening for Down syndrome in in vitro fertilization pregnancies. Prenatal Diagnosis, 25, 234-238. Roberts, D., Walkinshaw, S. A., McCormack, M. J., & Ellis, J. (2000). Prenatal detection of trisomy 21: combined experience of two British hospitals. Prenatal Diagnosis, 20, 17-22. Rosen, D. J., Kedar, I., Amiel, A., Ben-Tovim, T., Petel, Y., Kaneti, H., Tohar, M., et al. (2002). A negative second trimester triple test and absence of specific ultrasonographic markers may decrease the need for genetic amniocentesis in advanced maternal age by 60%. Prenatal Diagnosis, 22, 59-63. Rosen, T., & D'Alton, M. E. (2005). Down syndrome screening in the first and second trimesters: what do the data show? Seminars in Perinatology, 29, 367-375. Rozenberg, P., Bussieres, L., Chevret, S., Bernard, J. P., Malagrida, L., Cuckle, H., Chabry, C., et al. (2006). Screening for Down syndrome using first-trimester combined screening followed by second-trimester ultrasound examination in an unselected population. American Journal of Obstetrics and Gynecology, 199, 1379-1387. Epub ahead of print. Rudnicka, A. R., Wald, N. J., Huttly, W., & Hackshaw, A. K. (2002). 
Influence of maternal smoking on the birth prevalence of Down syndrome and on second trimester screening performance. Prenatal Diagnosis, 22, 893-897. Sabria, J., Cabrero, D., & Bach, C. (2002). Aneuploidy screening: ultrasound versus biochemistry. Ultrasound Review of Obstetrics & Gynecology, 2, 221-228. Saltvedt, S., Almstrom, H., Kublickas, M., Valentin, L., Bottinga, R., Bui, T. H., Cederholm, M., et al. (2005). Screening for Down syndrome based on maternal age or fetal nuchal translucency: a randomized controlled trial in 39,572 pregnancies. Ultrasound in Obstetrics & Gynecology, 25, 537-545. Sau, A., Langford, K., Auld, B., & Maxwell, D. (2001). Screening for trisomy 21: the significance of a positive second trimester serum screen in women screen negative after a nuchal translucency scan. Journal of Obstetrics & Gynaecology, 21, 145-148. Schielen, P. C., van Leeuwen-Spruijt, M., Belmouden, I., Elvers, L. H., Jonker, M., & Loeber, J. G. (2006). Multi-centre first-trimester screening for Down syndrome in the Netherlands in routine clinical practice. Prenatal Diagnosis, 26, 711-718. Epub ahead of print. Skotko, B. (2006). Comparing three screening strategies for combining first- and second-trimester Down syndrome markers. Obstetrics and Gynecology, 107, 1170. Smith-Bindman, R., Hosmer, W., Feldstein, V. A., Deeks, J. J., & Goldberg, J. D. (2001). Second-trimester ultrasound to detect fetuses with Down syndrome: a meta-analysis. JAMA, 285, 1044-1055. Snijders, R. (2001). First-trimester ultrasound. Clinics in Perinatology, 28, 333-352. Sorensen, T., Larsen, S. O., & Christiansen, M. (2005). Weight adjustment of serum markers in early first-trimester prenatal screening for Down syndrome. Prenatal Diagnosis, 25, 484-488. Souter, V. L., & Nyberg, D. A. (2001). Sonographic screening for fetal aneuploidy: first trimester. Journal of Ultrasound in Medicine, 20, 775-790. Spencer, K. (2001). 
Age related detection and false positive rates when screening for Down's syndrome in the first trimester using fetal nuchal translucency and maternal serum free beta-hCG and PAPP-A. BJOG: An International Journal of Obstetrics & Gynaecology, 108, 1043-1046. Spencer, K. (2005). First trimester maternal serum screening for Down's syndrome: an evaluation of the DPC Immulite 2000 free beta-hCG and pregnancy-associated plasma protein-A assays. Annals of Clinical Biochemistry, 42, 30-40. Spencer, K., Bindra, R., Cacho, A. M., & Nicolaides, K. H. (2004). The impact of correcting for smoking status when screening for chromosomal anomalies using maternal serum biochemistry and fetal nuchal translucency thickness in the first trimester of pregnancy. Prenatal Diagnosis, 24, 169-173. Spencer, K., Bindra, R., & Nicolaides, K. H. (2003). Maternal weight correction of maternal serum PAPP-A and free beta-hCG MoM when screening for trisomy 21 in the first trimester of pregnancy. Prenatal Diagnosis, 23, 851-855. Spencer, K., Bindra, R., Nix, A. B., Heath, V., & Nicolaides, K. H. (2003). Delta-NT or NT MoM: which is the most appropriate method for calculating accurate patient-specific risks for trisomy 21 in the first trimester? Ultrasound in Obstetrics & Gynecology, 22, 142-148. Spencer, K., Crossley, J. A., Aitken, D. A., Nix, A. B., Dunstan, F. D., & Williams, K. (2003). The effect of temporal variation in biochemical markers of trisomy 21 across the first and second trimesters of pregnancy on the estimation of individual patient-specific risks and detection rates for Down's syndrome. Annals of Clinical Biochemistry, 40, 219-231. Spencer, K., Heath, V., El-Sheikhah, A., Ong, C. Y., & Nicolaides, K. H. (2005). Ethnicity and the need for correction of biochemical and ultrasound markers of chromosomal anomalies in the first trimester: a study of Oriental, Asian and Afro-Caribbean populations. Prenatal Diagnosis, 25, 365-369. Spencer, K., Liao, A. W., Ong, C. 
Y., Geerts, L., & Nicolaides, K. H. (2001). First trimester maternal serum placenta growth factor (PIGF) concentrations in pregnancies with fetal trisomy 21 or trisomy 18. Prenatal Diagnosis, 21, 718-722. Spencer, K., & Nicolaides, K. H. (2003). Screening for trisomy 21 in twins using first trimester ultrasound and maternal serum biochemistry in a one-stop clinic: a review of three years experience. BJOG: An International Journal of Obstetrics & Gynaecology, 110, 276-280. Spencer, K., Spencer, C. E., Power, M., Moakes, A., & Nicolaides, K. H. (2000). One stop clinic for assessment of risk for fetal anomalies: a report of the first year of prospective screening for chromosomal anomalies in the first trimester. BJOG: An International Journal of Obstetrics & Gynaecology, 107, 1271-1275. Stenhouse, E. J., Crossley, J. A., Aitken, D. A., Brogan, K., Cameron, A. D., & Connor, J. M. (2004). First-trimester combined ultrasound and biochemical screening for Down syndrome in routine clinical practice. Prenatal Diagnosis, 24, 774-780. Summers, A. M., Farrell, S. A., Huang, T., Meier, C., & Wyatt, P. R. (2003). Maternal serum screening in Ontario using the triple marker test. Journal of Medical Screening, 10, 107-111. Toyama, J. M., Brizot, M. L., Liao, A. W., Lopes, L. M., Nomura, R. M. Y., Saldanha, F. A. T., & Zugaib, M. (2004). Ductus venosus blood flow assessment at 11 to 14 weeks of gestation and fetal outcome. Ultrasound in Obstetrics & Gynecology, 23, 341-345. Tsai, M. S., Huang, Y. Y., Hwa, K. Y., Cheng, C. C., & Lee, F. K. (2001). Combined measurement of fetal nuchal translucency, maternal serum free beta-hCG, and pregnancy-associated plasma protein A for first-trimester Down's syndrome screening. Journal of the Formosan Medical Association, 100, 319-325. Van Den Berg, M., Timmermans, D. R. M., Kleinveld, J. H., Garcia, E., Van Vugt, J. M. G., & Van Der Wal, G. (2005). 
Accepting or declining the offer of prenatal screening for congenital defects: Test uptake and women's reasons. Prenatal Diagnosis, 25, 84-90. van den Berg, M., Timmermans, D. R. M., ten Kate, L. P., van Vugt, J. M. G., & van der Wal, G. (2005). Are pregnant women making informed choices about prenatal screening? Genetics in Medicine, 7, 332-338. Vandecruys, H., Faiola, S., Auer, M., Sebire, N., & Nicolaides, K. H. (2005). Screening for trisomy 21 in monochorionic twins by measurement of fetal nuchal translucency thickness. Ultrasound in Obstetrics & Gynecology, 25, 551-553. Viora, E., Masturzo, B., Bastonero, S., Errante, G., Sciarrone, A., Grassi Pirrone, P., & Campogrande, M. (2003). Efficiency and intra-operator's variability of nuchal translucency measurement. Importance of operator's experience. Italian Journal of Gynaecology & Obstetrics, 15, 69-73. Wapner, R. J. (2005). First trimester screening: the BUN study. Seminars in Perinatology, 29, 236-239. Wasant, P., & Liammongkolkul, S. (2003). Prenatal genetic screening for Down syndrome and open neural tube defects using maternal serum markers in Thai pregnant women. Southeast Asian Journal of Tropical Medicine & Public Health, 34 Suppl 3, 244-248. Wayda, K., Kereszturi, A., Orvos, H., Horvath, E., A, P. A., Kovacs, L., & Szabo, J. (2001). Four years experience of first-trimester nuchal translucency screening for fetal aneuploidies with increasing regional availability. Acta Obstetricia et Gynecologica Scandinavica, 80, 1104-1109. Wilson, R. D., & Genetics Committee of the Society of Obstetricians and Gynaecologists of Canada (2005). Cell-free fetal DNA in the maternal circulation and its future uses in obstetrics. Journal of Obstetrics & Gynaecology Canada: JOGC, 27, 54-62. Wright, D., Bradbury, I., Benn, P., Cuckle, H., & Ritchie, K. (2004). Contingent screening for Down syndrome is an efficient alternative to non-disclosure sequential screening. Prenatal Diagnosis, 24, 762-766. 
Yamamoto, R., Azuma, M., Hoshi, N., Kishida, T., Satomura, S., & Fujimoto, S. (2001). Lens culinaris agglutinin-reactive alpha-fetoprotein, an alternative variant to alpha-fetoprotein in prenatal screening for Down's syndrome. Human Reproduction, 16, 2438-2444. Zoppi, M. A., Ibba, R. M., Floris, M., Manca, F., Axiana, C., & Monni, G. (2005). Nuchal translucency measurement at different crown-rump lengths along the 10- to 14-week period for Down syndrome screening. Prenatal Diagnosis, 25, 411-416. Zoppi, M. A., Ibba, R. M., Floris, M., & Monni, G. (2001). Fetal nuchal translucency screening in 12495 pregnancies in Sardinia. Ultrasound in Obstetrics & Gynecology, 18, 649-651. Zoppi, M. A., Ibba, R. M., Putzolu, M., Floris, M., & Monni, G. (2000). Assessment of risk for chromosomal abnormalities at 10-14 weeks of gestation by nuchal translucency and maternal age in 5,210 fetuses at a single centre. Fetal Diagnosis & Therapy, 15, 170-173.

Appendix 4: Retrieved studies excluded for review: Part B

Abbott, M. A., & Benn, P. (2002). Prenatal genetic diagnosis of Down's syndrome. Expert Review of Molecular Diagnostics, 2, 605-615. Alfirevic, Z., & Neilson, J. P. (2004). Antenatal screening for Down's syndrome. British Medical Journal, 329, 811-812. American College of Obstetricians and Gynecologists (2001). ACOG Practice Bulletin. Clinical management guidelines for obstetrician-gynecologists. Prenatal diagnosis of fetal chromosomal abnormalities. Obstetrics & Gynecology, 97, Suppl 1-12. Anonymous. (2003). Risk and Down's screening. Bandolier, 10, 6. Anonymous. (2005). Sequential pregnancy screening. ACOG Clinical Review, 10, 4-5. Antsaklis, A. (2003). Invasive genetic studies in multiple pregnancy. Balkan Journal of Medical Genetics, 6, 41-47. Audibert, F., Mairovitz, V., & Frydman, R. (2002). Alternatives to amniocentesis for advanced maternal age. [French]. Gynecologie, Obstetrique & Fertilite, 30, 562-566. 
Avgidou, K., Papageorghiou, A., Bindra, R., Spencer, K., & Nicolaides, K. H. (2005). Prospective first-trimester screening for trisomy 21 in 30,564 pregnancies. American Journal of Obstetrics & Gynecology, 192, 1761-1767. Babbur, V., Lees, C. C., Goodburn, S. F., Morris, N., Breeze, A. C., & Hackett, G. A. (2005). Prospective audit of a one-centre combined nuchal translucency and triple test programme for the detection of trisomy 21. Prenatal Diagnosis, 25, 465-469. Bahado-Singh, R. O., Oz, A. U., Gomez, K., Hunter, D., Copel, J., Baumgarten, A., & Mahoney, M. J. (2000). Combined ultrasound biometry, serum markers and age for Down syndrome risk estimation. Ultrasound in Obstetrics & Gynecology, 15, 199-204. Ball, R. H. (2004). Invasive fetal testing. Current Opinion in Obstetrics & Gynecology, 16, 159-162. Beaman, J. M., & Goldie, D. J. (2001). Second trimester screening for Down's syndrome: 7 years experience. Journal of Medical Screening, 8, 128-131. Benacerraf, B. R. (2000). Should sonographic screening for fetal Down syndrome be applied to low risk women? Ultrasound in Obstetrics & Gynecology, 15, 451-455. Benn, P., & Donnenfeld, A. E. (2005). Sequential Down syndrome screening: the importance of first and second trimester test correlations with calculating risk. Journal of Genetic Counseling, 14, 409-413. Benn, P. A. (2002). Advances in prenatal screening for Down syndrome: II first trimester testing, integrated testing, and future directions. Clinica Chimica Acta, 324, 1-11. Benn, P. A., Fang, M., Egan, J. F., Horne, D., & Collins, R. (2003). Incorporation of inhibin-A in second-trimester screening for Down syndrome. Obstetrics & Gynecology, 101, 451-454. Benn, P. A., Kaminsky, L. M., Ying, J., Borgida, A. F., & Egan, J. F. X. (2002). Combined second-trimester biochemical and ultrasound screening for Down syndrome. Obstetrics and Gynecology, 100, 1168-1176. Biggio, J. R., Jr., Morris, T. C., Owen, J., & Stringer, J. S. (2004). 
An outcomes analysis of five prenatal screening strategies for trisomy 21 in women younger than 35 years. American Journal of Obstetrics & Gynecology, 190, 721-729. Bindra, R., Heath, V., Liao, A., Spencer, K., & Nicolaides, K. H. (2002). One-stop clinic for assessment of risk for trisomy 21 at 11-14 weeks: a prospective study of 15 030 pregnancies. Ultrasound in Obstetrics & Gynecology, 20, 219-225. Blackwell, S. C., Abundis, M. G., & Nehra, P. C. (2002). Five-year experience with midtrimester amniocentesis performed by a single group of obstetricians-gynecologists at a community hospital. American Journal of Obstetrics & Gynecology, 186, 1130-1132. Borrell, A., Casals, E., Fortuny, A., Farre, M. T., Gonce, A., Sanchez, A., Soler, A., Cararach, V., & Vanrell, J. A. (2004). First-trimester screening for trisomy 21 combining biochemistry and ultrasound at individually optimal gestational ages. An interventional study. Prenatal Diagnosis, 24, 541-545. Brigatti, K. W., & Malone, F. D. (2004). First-trimester screening for aneuploidy. Obstetrics & Gynecology Clinics of North America, 31, 1-20. Canick, J. A., Saller, D. N., Jr., & Lambert-Messerlian, G. M. (2003). Prenatal screening for Down syndrome: current and future methods. Clinics in Laboratory Medicine, 23, 395-411. Caughey, A. B. (2005). Cost-effectiveness analysis of prenatal diagnosis: methodological issues and concerns. Gynecologic & Obstetric Investigation, 60, 11-18. Caughey, A. B., Lyell, D. J., Filly, R. A., Washington, A. E., & Norton, M. E. (2001). The impact of the use of the isolated echogenic intracardiac focus as a screen for Down syndrome in women under the age of 35 years. American Journal of Obstetrics & Gynecology, 185, 1021-1027. Caughey, A. B., Lyell, D. J., Washington, A. E., Filly, R. A., & Norton, M. E. (2006). Ultrasound screening of fetuses at increased risk for Down syndrome: how many missed diagnoses? 
Prenatal Diagnosis, 26, 22-27. Centini, G., Rosignoli, L., Kenanidis, A., Talluri, B., Pasqui, L., Scarinci, R., De Simone, S., & Petraglia, F. (2004). Can a selective use of amniocentesis replace the routine procedure for advanced maternal age? Italian Journal of Gynaecology & Obstetrics, 16, 27-31. Centini, G., Rosignoli, L., Scarinci, R., Faldini, E., Morra, C., Centini, G., & Petraglia, F. (2005). Reevaluation of risk for Down syndrome by means of the combined test in pregnant women of 35 years or more. Prenatal Diagnosis, 25, 133-136. Cheng, C. C., Bahado-Singh, R. O., Chen, S. C., & Tsai, M. S. (2004). Pregnancy outcomes with increased nuchal translucency after routine Down syndrome screening. International Journal of Gynaecology & Obstetrics, 84, 5-9. Chiang, H. H., Chao, Y. M., & Yuh, Y. S. (2006). Informed choice of pregnant women in prenatal screening tests for Down's syndrome. Journal of Medical Ethics, 32, 273-277. Chiu, R. W. K., & Lo, Y. M. D. (2003). Non-invasive prenatal diagnosis: on the horizon? Pharmacogenomics, 4, 191-200. Comas, C., Torrents, M., Munoz, A., Antolin, E., Figueras, F., & Echevarria, M. (2002). Measurement of nuchal translucency as a single strategy in trisomy 21 screening: should we use any other marker? Obstetrics & Gynecology, 100, 648-654. Crossley, J. A., Aitken, D. A., Cameron, A. D., McBride, E., & Connor, J. M. (2002). Combined ultrasound and biochemical screening for Down's syndrome in the first trimester: a Scottish multicentre study. BJOG: An International Journal of Obstetrics & Gynaecology, 109, 667-676. Cuckle, H. (2001). Time for total shift to first-trimester screening for Down's syndrome. Lancet, 358, 1658-1659. D'Alton, M., & Cleary-Goldman, J. (2006). Ultrasound clinics: putting the FASTER results into clinical practice. Contemporary OB/GYN, 51, 54, 56, 58 passim. Dallapiccola, B., & Novelli, G. (2000). Genetic testing and prenatal diagnosis. Minerva Biotecnologica, 12, 5-14. DeVore, G. R. (2003). 
Is genetic ultrasound cost-effective? Seminars in Perinatology, 27, 173-182. DeVore, G. R., & Romero, R. (2001). Combined use of genetic sonography and maternal serum triple-marker screening: An effective method for increasing the detection of trisomy 21 in women younger than 35 years. Journal of Ultrasound in Medicine, 20, 645-654. Dugoff, L., Hobbins, J. C., Malone, F. D., Vidaver, J., Sullivan, L., Canick, J. A., Lambert-Messerlian, G. M., et al. (2005). Quad screen as a predictor of adverse pregnancy outcome. Obstetrics and Gynecology, 106, 260-267. Egan, J. F., Benn, P., Borgida, A. F., Rodis, J. F., Campbell, W. A., & Vintzileos, A. M. (2000). Efficacy of screening for fetal Down syndrome in the United States from 1974 to 1997. Obstetrics & Gynecology, 96, 979-985. Egan, J. F. X., Kaminsky, L. M., DeRoche, M. E., Barsoom, M. J., Borgida, A. F., & Benn, P. A. (2002). Antenatal Down syndrome screening in the United States in 2001: a survey of maternal-fetal medicine specialists. American Journal of Obstetrics & Gynecology, 187, 1230-1234. Eiben, B., & Glaubitz, R. (2005). First-trimester screening: an overview. Journal of Histochemistry & Cytochemistry, 53, 281-283. Fortuny, A., Borell, A., Casals, E., Seres, A., Sanchez, A., & Soler, A. (2005). First trimester aneuploidy screening combining biochemical and ultrasound markers. Ultrasound Review of Obstetrics & Gynecology, 5, 9-17. Framarin, A. (2000). Economic and organizational issues in prenatal screening and diagnosis of Down syndrome. Community Genetics, 3, 116-118. Gilbert, R. E., Augood, C., Gupta, R., Ades, A. E., Logan, S., Sculpher, M., & van Der Meulen, J. H. (2001). Screening for Down's syndrome: effects, safety, and cost effectiveness of first and second trimester strategies. BMJ, 323, 423-425. Gorincour, G., Tassy, S., & D'Ercole, C. (2005). Prenatal screening for Down syndrome: didn't we forget something? 
Fetal Diagnosis & Therapy, 20, 239-240. Grant, S. S. (2000). Prenatal genetic screening. Online Journal of Issues in Nursing, 5 (Sep 30), 1-19. Grant, S. S. (2005). Options for Down syndrome screening: what will women choose? Journal of Midwifery & Women's Health, 50, 211-218. Grimshaw, G. M., Szczepura, A., Hulten, M., MacDonald, F., Nevin, N. C., Sutton, F., & Dhanjal, S. (2003). Evaluation of molecular tests for prenatal diagnosis of chromosome abnormalities. Health Technology Assessment (Winchester, England), 7(10), 1-146. Gyselaers, W. J., Vereecken, A. J., Van Herck, E. J., Straetmans, D. P., Martens, G. E., de Jonge, E. T., Ombelet, W. U., & Nijhuis, J. G. (2004). Screening for trisomy 21 in Flanders: a 10 years review of 40.490 pregnancies screened by maternal serum. European Journal of Obstetrics, Gynecology, & Reproductive Biology, 115, 185-189. Harris, R. A., Washington, A. E., Nease Jr, R. F., & Kuppermann, M. (2004). Cost utility of prenatal diagnosis and the risk-based threshold. Lancet, 363, 276-282. Hartnett, J., Borgida, A. F., Benn, P. A., Feldman, D. M., DeRoche, M. E., & Egan, J. F. (2003). Cost analysis of Down syndrome screening in advanced maternal age. Journal of Maternal-Fetal & Neonatal Medicine, 13, 80-84. Herman, A., Dreazen, E., Tovbin, J., Weinraub, Z., Bukovsky, Y., & Maymon, R. (2002). Comparison between disclosure and non-disclosure approaches for trisomy 21 screening tests. Human Reproduction, 17, 1358-1362. Herman, A., Weinraub, Z., Dreazen, E., Arieli, S., Rozansky, S., Bukovsky, I., & Maymon, R. (2000). Combined first trimester nuchal translucency and second trimester biochemical screening tests among normal pregnancies. Prenatal Diagnosis, 20, 781-784. Hodges, R. J., & Wallace, E. M. (2005). Testing for Down syndrome in the older woman: a risky business? Australian & New Zealand Journal of Obstetrics & Gynaecology, 45, 486-488. Horger, E. O., 3rd, Finch, H., & Vincent, V. A. (2001). 
A single physician's experience with four thousand six hundred genetic amniocenteses. American Journal of Obstetrics & Gynecology, 185, 279-288. Howe, D. T., Gornall, R., Wellesley, D., Boyle, T., & Barber, J. (2000). Six year survey of screening for Down's syndrome by maternal age and mid-trimester ultrasound scans. British Medical Journal, 320, 606-610. Huang, T., Owolabi, T., Summers, A. M., Meier, C., & Wyatt, P. R. (2005). The identification of risk of spontaneous fetal loss through second-trimester maternal serum screening. American Journal of Obstetrics & Gynecology, 193, 395-403. Jorgensen, F. S. (2001). Screening and diagnosis of fetal neural tube defects, abdominal wall defects and Down's syndrome. With special reference to biochemical and ultrasound screening in the second trimester of pregnancy and to early amniocentesis. Danish Medical Bulletin, 48, 127-145. Jou, H. J., Shyu, M. K., Chen, S. M., Shih, J. C., Hsu, J. J., & Hsieh, F. J. (2000). Maternal serum screening for Down syndrome by using alpha-fetoprotein and human chorionic gonadotropin in an Asian population. A prospective study. Fetal Diagnosis & Therapy, 15, 108-111. Khoshnood, B., De Vigan, C., Vodovar, V., Goujard, J., & Goffinet, F. O. (2004a). A population-based evaluation of the impact of antenatal screening for Down's syndrome in France, 1981-2000. BJOG: An International Journal of Obstetrics & Gynaecology, 111, 485-490. Khoshnood, B., Pryde, P., Blondel, B., & Lee, K. S. (2003). Socioeconomic and state-level differences in prenatal diagnosis and live birth prevalence of Down's syndrome in the United States. Revue d'Epidemiologie et de Sante Publique, 51, 617-627. Khoshnood, B., Pryde, P., Wall, S., Singh, J., Mittendorf, R., & Lee, K. S. (2000). Ethnic differences in the impact of advanced maternal age on birth prevalence of Down syndrome. American Journal of Public Health, 90, 1778-1781. Khoshnood, B., Wall, S., Pryde, P., & Lee, K. S. (2004b). 
Maternal education modifies the age-related increase in the birth prevalence of Down syndrome. Prenatal Diagnosis, 24, 79-82. Kocun, C. C., Harrigan, J. T., Canterino, J. C., Feld, S. M., & Fernandez, C. O. (2000). Changing trends in patient decisions concerning genetic amniocentesis. American Journal of Obstetrics & Gynecology, 182, 1018-1020. Koos, B. J. (2006). First-trimester screening: lessons from clinical trials and implementation. Current Opinion in Obstetrics & Gynecology, 18, 152-155. Kott, B., & Dubinsky, T. J. (2004). Cost-effectiveness model for first-trimester versus second-trimester ultrasound screening for Down syndrome. Journal of the American College of Radiology, 1, 415-421. Kuppermann, M., & Norton, M. E. (2005). Prenatal testing guidelines: time for a new approach? Gynecologic & Obstetric Investigation, 60, 6-10. Lam, Y. H., Lee, C. P., Sin, S. Y., Tang, R., Wong, H. S., Wong, S. F., Fong, D. Y. T., Tang, M. H. Y., & Woo, H. H. N. (2002). Comparison and integration of first trimester fetal nuchal translucency and second trimester maternal serum screening for fetal Down syndrome. Prenatal Diagnosis, 22, 730-735. Lewis, P. H., & Pasalio, R. (2004). First-trimester tests for trisomies 21 and 18 as sensitive as triple screen. Journal of Family Practice, 53, 184-186. Liu, S. S., Lee, F. K., Lee, J. L., Tsai, M. S., Cheong, M. L., She, B. Q., & Chen, S. C. (2004). Pregnancy outcomes in unselected singleton pregnant women with an increased risk of first-trimester Down's syndrome. Acta Obstetricia et Gynecologica Scandinavica, 83, 1130-1134. Maclachlan, N., Iskaros, J., & Chitty, L. (2000). Ultrasound markers of fetal chromosomal abnormality: a survey of policies and practices in UK maternity ultrasound departments. Ultrasound in Obstetrics & Gynecology, 15, 387-390. Mahieu-Caputo, D., Senat, M. V., Romana, S., Houfflin-Debarge, V., Gosset, P., Audibert, F., Bessis, R., et al. (2002). 
Recent progress in fetal medicine. Archives De Pediatrie, 9, 172-186. Malone, F. D., Ball, R. H., Nyberg, D. A., Comstock, C. H., Saade, G. R., Berkowitz, R. L., Gross, S. J., et al. (2005a). First-trimester septated cystic hygroma: prevalence, natural history, and pediatric outcome. Obstetrics & Gynecology, 106, 288-294. Malone, F. D., Canick, J. A., Ball, R. H., Nyberg, D. A., Comstock, C. H., Bukowski, R., Berkowitz, R. L., et al. (2005b). First-trimester or second-trimester screening, or both, for Down's syndrome. New England Journal of Medicine, 353, 2001-2011. Malone, F. D., D'Alton, M. E., & Society for Maternal-Fetal Medicine (2003). First-trimester sonographic screening for Down syndrome. Obstetrics & Gynecology, 102, 1066-1079. Marsis, I. O. (2004). Screening for Down syndrome using nuchal translucency thickness and nasal bone examination at advanced maternal age in Jakarta: A preliminary report. Journal of Medical Ultrasound, 12, 1-6. Marsk, A., Grunewald, C., Saltvedt, S., Valentin, L., & Almstrom, H. (2006). If nuchal translucency screening is combined with first-trimester serum screening the need for fetal karyotyping decreases. Acta Obstetricia et Gynecologica Scandinavica, 85, 534-538. Matsuda, I., & Suzumori, K. (2000). Prenatal genetic testing in Japan. Community Genetics, 3, 12-16. Maymon, R., Betser, M., Dreazen, E., Padoa, A., & Herman, A. (2004). A model for disclosing the first trimester part of an integrated Down's syndrome screening test. Clinical Genetics, 65, 113-119. Maymon, R., & Shulman, A. (2002). Serial first- and second-trimester Down's syndrome screening tests among IVF-versus naturally-conceived singletons. Human Reproduction, 17, 1081-1085. Monni, G., Zoppi, M. A., Ibba, R. M., Floris, M., Manca, F., & Axiana, C. (2005). Nuchal translucency and nasal bone for trisomy 21 screening: single center experience. Croatian Medical Journal, 46, 786-791. Moore, L. L., Bradlee, M. L., Singer, M. R., Rothman, K. J., & Milunsky, A. (2002). 
Chromosomal anomalies among the offspring of women with gestational diabetes. American Journal of Epidemiology, 155, 719-724. Moran, C. J., Tay, J. B., & Morrison, J. J. (2002). Ultrasound detection and perinatal outcome of fetal trisomies 21, 18 and 13 in the absence of a routine fetal anomaly scan or biochemical screening. Ultrasound in Obstetrics & Gynecology, 20, 482-485. Morris, J. K., De Vigan, C., Mutton, D. E., & Alberman, E. (2005). Risk of a Down syndrome live birth in women 45 years of age and older. Prenatal Diagnosis, 25, 275-278. Morris, J. K., Wald, N. J., & Watt, H. C. (1999). Fetal loss in Down syndrome pregnancies. Prenatal Diagnosis, 19, 142-145. Muller, F., Dreux, S., Oury, J. F., Luton, D., Uzan, S., Uzan, M., Levardon, M., & Dommergues, M. (2002a). Down syndrome maternal serum marker screening after 18 weeks' gestation. Prenatal Diagnosis, 22, 1001-1004. Muller, F., Forestier, F., & Dingeon, B. (2002b). Second trimester trisomy 21 maternal serum marker screening. Results of a countrywide study of 854 902 patients. Prenatal Diagnosis, 22, 925-929. Mulvey, S., & Wallace, E. M. (2000). Comparison of miscarriage rates between early and late amniocentesis. Prenatal Diagnosis, 20, 265-266. Naylor, C. S., Porto, M., Cohen, B., & Garite, T. J. (2001). Pregnancy outcome in Hispanic patients with unexplained positive triple marker screening for Down syndrome. Journal of Maternal-Fetal Medicine, 10, 20-22. Nicolaides, K. H. (2004). Nuchal translucency and other first-trimester sonographic markers of chromosomal abnormalities. American Journal of Obstetrics & Gynecology, 191, 45-67. Nicolaides, K. H. (2005). First-trimester screening for chromosomal abnormalities. Seminars in Perinatology, 29, 190-194. Nicolaides, K. H., Bindra, R., Heath, V., & Cicero, S. (2002). One-stop clinic for assessment of risk of chromosomal defects at 12 weeks of gestation. 
Journal of Maternal-Fetal & Neonatal Medicine, 12, 9-18. Nicolaides, K. H., Spencer, K., Avgidou, K., Faiola, S., & Falcon, O. (2005). Multicenter study of first-trimester screening for trisomy 21 in 75 821 pregnancies: Results and estimation of the potential impact of individual risk-oriented two-stage first-trimester screening. Ultrasound in Obstetrics & Gynecology, 25, 221-226. O'Callaghan, S. P., Giles, W. B., Raymond, S. P., McDougall, V., Morris, K., & Boyd, J. (2000). First trimester ultrasound with nuchal translucency measurement for Down syndrome risk estimation using software developed by the Fetal Medicine Foundation, United Kingdom--the first 2000 examinations in Newcastle, New South Wales, Australia. Australian & New Zealand Journal of Obstetrics & Gynaecology, 40, 292-295. O'Leary, P., Breheny, N., Dickinson, J. E., Bower, C., Goldblatt, J., Hewitt, B., Murch, A., & Stock, R. (2006). First-trimester combined screening for Down syndrome and other fetal anomalies. Obstetrics & Gynecology, 107, 869-876. Odibo, A. O., Elkousy, M. H., Ural, S. H., Driscoll, D. A., Mennuti, M. T., & Macones, G. A. (2003). Screening for aneuploidy in twin pregnancies: maternal age- and race-specific risk assessment between 9-14 weeks. Twin Research, 6, 251-256. Odibo, A. O., Stamilio, D. M., Nelson, D. B., Sehdev, H. M., & Macones, G. A. (2005). A cost-effectiveness analysis of prenatal screening strategies for Down syndrome. Obstetrics & Gynecology, 106, 562-568. Ogilvie, C. M., Lashwood, A., Chitty, L., Waters, J. J., Scriven, P. N., & Flinter, F. (2005). The future of prenatal diagnosis: rapid testing or full karyotype? an audit of chromosome abnormalities and pregnancy outcomes for women referred for Down's Syndrome testing. BJOG: An International Journal of Obstetrics & Gynaecology, 112, 1369-1375. Onda, T., Tanaka, T., Yoshida, K., Nakamura, Y., Kudo, R., Yamamoto, H., Sato, A., et al. (2000). 
Triple marker screening for trisomy 21, trisomy 18 and open neural tube defects in singleton pregnancies of native Japanese pregnant women. Journal of Obstetrics & Gynaecology Research, 26, 441-447. Palomaki, G. E., Knight, G. J., Roberson, M. M., Cunningham, G. C., Lee, J. E., Strom, C. M., & Pandian, R. (2004). Invasive trophoblast antigen (hyperglycosylated human chorionic gonadotropin) in second-trimester maternal urine as a marker for down syndrome: preliminary results of an observational study on fresh samples. Clinical Chemistry, 50, 182-189. Palomaki, G. E., Steinort, K., Knight, G. J., & Haddow, J. E. (2006). Comparing three screening strategies for combining first- and second-trimester Down syndrome markers. Obstetrics & Gynecology, 107, 367-375. Park, S. Y., Kim, J. W., Kim, Y. M., Kim, J. M., Lee, M. H., Lee, B. Y., Han, J. Y., Kim, M. Y.,et al. (2001). Frequencies of fetal chromosomal abnormalities at prenatal diagnosis: 10 years experiences in a single institution. Journal of Korean Medical Science, 16, 290-293. Peller, A. J., Westgate, M. N., & Holmes, L. B. (2004). Trends in congenital malformations, 19741999: Effect of prenatal diagnosis and elective termination. Obstetrics & Gynecology, 104, 957-964. Perni, S. C., Predanic, M., Kalish, R. B., Chervenak, F. A., & Chasen, S. T. (2006). Clinical use of first-trimester aneuploidy screening in a United States population can replicate data from clinical trials. American Journal of Obstetrics & Gynecology, 194, 127-130. Pescia, G., & Addor, M. C. (2000). Trisomy 21 and its prenatal detection in the Canton of Vaud (19801996). Schweizerische Medizinische Wochenschrift, 130, 1332-1338. Pettker, C. M., & Copel, J. A. (2005). Ultrasound clinics: amniocentesis: technique and complications. Contemporary OB/GYN, October 1. Available from: http://www.contemporaryobgyn.net/obgyn/issue/issueDetail.jsp?id=7149 Accessed 19.9.06. Pinette, M. G., Garrett, J., Salvo, A., Blackstone, J., Pinette, S. 
G., Boutin, N., & Cartin, A. (2001). Normal midtrimester (17-20 weeks) genetic sonogram decreases amniocentesis rate in a highrisk population. Journal of Ultrasound in Medicine, 20, 639-644. Prefumo, F., Sethna, F., Sairam, S., Bhide, A., & Thilaganathan, B. (2005). First-trimester ductus venosus, nasal bones, and Down syndrome in a high-risk population. Obstetrics & Gynecology, 105, 1348-1354. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING 228 Prefumo, F., & Thilaganathan, B. (2002). Agreement between predicted risk and prevalence of Down syndrome in first trimester nuchal translucency screening. Prenatal Diagnosis, 22, 917-918. Ramos, D., Santiago, C., Gallo, M., Zaragoza, E., & Montoya, F. (2004). How far does first-trimester screening for trisomies 13 and 18 increase the need for invasive techniques? Ultrasound Review of Obstetrics & Gynecology, 4, 160-164. Reddy, U. M., & Mennuti, M. T. (2006). Incorporating first-trimester Down syndrome studies into prenatal screening: executive summary of the National Institute of Child Health and Human Development workshop. Obstetrics & Gynecology, 107, 167-173. Resta, R. G. (2005). Changing demographics of Advanced Maternal Age (AMA) and the impact on the predicted incidence of down syndrome in the United States: Implications for prenatal screening and genetic counseling. American Journal of Medical Genetics, 133 A, 31-36. Roberts, D., Walkinshaw, S. A., McCormack, M. J., & Ellis, J. (2000). Prenatal detection of trisomy 21: combined experience of two British hospitals. Prenatal Diagnosis, 20, 17-22. Rosen, D. J., Kedar, I., Amiel, A., Ben-Tovim, T., Petel, Y., Kaneti, H., Tohar, M., & Fejgin, M. D. (2002). A negative second trimester triple test and absence of specific ultrasonographic markers may decrease the need for genetic amniocentesis in advanced maternal age by 60%. Prenatal Diagnosis, 22, 59-63. Rowe, R. E., Garcia, J., & Davidson, L. L. (2004). 
Social and ethnic inequalities in the offer and uptake of prenatal screening and diagnosis in the UK: a systematic review. Public Health, 118, 177189. Rozenberg, P., Malagrida, L., Cuckle, H., Durand-Zaleski, I., Nisand, I., Audibert, F., Benattar, C., et al. (2002). Down's syndrome screening with nuchal translucency at 12(+0)-14(+0) weeks and maternal serum markers at 14(+1)-17(+0) weeks: a prospective study. Human Reproduction, 17, 1093-1098. Ryall, R. G., Callen, D., Cocciolone, R., Duvnjak, A., Esca, R., Frantzis, N., Gjerde, E. M., Haan, E. A., et al. (2001). Karyotypes found in the population declared at increased risk of Down syndrome following maternal serum screening. Prenatal Diagnosis, 21, 553-557. Sangalli, M., Langdana, F., & Thurlow, C. (2004). Pregnancy loss rate following routine genetic amniocentesis at Wellington Hospital. New Zealand Medical Journal, 117, U818. Sau, A., Langford, K., Auld, B., & Maxwell, D. (2001). Screening for trisomy 21: the significance of a positive second trimester serum screen in women screen negative after a nuchal translucency scan. Journal of Obstetrics & Gynaecology, 21, 145-148. Savva, G. M., Morris, J. K., Mutton, D. E., & Alberman, E. (2006). Maternal age-specific fetal loss rates in Down syndrome pregnancies. Prenatal Diagnosis, 26, 499-504. Schluter, P. J., & Pritchard, G. (2005). Mid trimester sonographic findings for the prediction of Down syndrome in a sonographically screened population. American Journal of Obstetrics and Gynecology, 192, 10-16. Schuchter, K., Hafner, E., Stangl, G., Metzenbauer, M., Hofinger, D., & Philipp, K. (2002). The first trimester 'combined test' for the detection of Down syndrome pregnancies in 4939 unselected pregnancies. Prenatal Diagnosis, 22, 211-215. Schuchter, K., Hafner, E., Stangl, G., Ogris, E., & Philipp, K. (2001). 
Sequential screening for trisomy 21 by nuchal translucency measurement in the first trimester and maternal serum biochemistry in the second trimester in a low-risk population. Ultrasound in Obstetrics & Gynecology, 18, 23-25. Seeds, J. W. (2004). Diagnostic mid trimester amniocentesis: how safe? American Journal of Obstetrics & Gynecology, 191, 607-615. Seror, V., Costet, N., & Ayme, S. (2001). Participation in maternal marker screening for Down syndrome: contribution of the information delivered to the decision-making process. Community Genetics, 4, 158-172. Simpson, J. L., Bombard, A., D'Alton, M., & Platt, L. D. (2000). Noninvasive screening for aneuploidy: who, when, and why? Contemporary OB/GYN, 45, 76-78, 81-82, 87-78 passim. Smith-Bindman, R., Hosmer, W., Feldstein, V. A., Deeks, J. J., & Goldberg, J. D. (2001). Secondtrimester ultrasound to detect fetuses with Down syndrome: a meta-analysis. JAMA, 285, 1044-1055. Snijders, R., & Smith, E. (2002). The role of fetal nuchal translucency in prenatal screening. Current Opinion in Obstetrics & Gynecology, 14, 577-585. Spencer, K. (2001). What is the true fetal loss rate in pregnancies affected by trisomy 21 and how does this influence whether first trimester detection rates are superior to those in the second trimester? Prenatal Diagnosis, 21, 788-789. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING 229 Spencer, K., & Nicolaides, K. H. (2003). Screening for trisomy 21 in twins using first trimester ultrasound and maternal serum biochemistry in a one-stop clinic: a review of three years experience. BJOG: An International Journal of Obstetrics & Gynaecology, 110, 276-280. Stojilkovic-Mikic, T., & Rodeck, C. H. (2003). Screening for chromosomal anomalies: first or second trimester, biochemical or ultrasound? Annals of the Academy of Medicine, Singapore, 32, 583589. Summers, A. M., Farrell, S. A., Huang, T., Meier, C., & Wyatt, P. R. (2003a). 
Maternal serum screening in Ontario using the triple marker test. Journal of Medical Screening, 10, 107-111. Summers, A. M., Huang, T., Meier, C., & Wyatt, P. R. (2003b). The implications of a false positive second-trimester serum screen for Down syndrome. Obstetrics & Gynecology, 101, 13011306. Tsai, M. S., Huang, Y. Y., Hwa, K. Y., Cheng, C. C., & Lee, F. K. (2001). Combined measurement of fetal nuchal translucency, maternal serum free beta-hCG, and pregnancy-associated plasma protein A for first-trimester Down's syndrome screening. Journal of the Formosan Medical Association, 100, 319-325. Tseng, J. J., Chou, M. M., Lo, F. C., Lai, H. Y., Chen, M. H., & Ho, E.-C. (2006). Detection of chromosome aberrations in the second trimester using genetic amniocentesis: Experience during 1995-2004. Taiwanese Journal of Obstetrics & Gynecology, 45, 39-41. Turhan, N. O., Eren, U., & Seckin, N. C. (2005). Second-trimester genetic amniocentesis: 5-Yar experience. Archives of Gynecology & Obstetrics, 271, 19-21. Verdin, S. M., Whitlow, B. J., Lazanakis, M., Kadir, R. A., Chatzipapas, I., & Economides, D. L. (2000). Ultrasonographic markers for chromosomal abnormalities in women with negative nuchal translucency and second trimester maternal serum biochemistry. Ultrasound in Obstetrics & Gynecology, 16, 402-406. Vintzileos, A. M., Guzman, E. R., Smulian, J. C., Yeo, L., Scorza, W. E., & Knuppel, R. A. (2002a). Down syndrome risk estimation after normal genetic sonography. American Journal of Obstetrics & Gynecology, 187, 1226-1229. Vintzileos, A. M., Guzman, E. R., Smulian, J. C., Yeo, L., Scorza, W. E., & Knuppel, R. A. (2002b). Second-trimester genetic sonography in patients with advanced maternal age and normal triple screen. Obstetrics & Gynecology, 99, 993-995. Wald, N. J., Rodeck, C., Hackshaw, A. K., & Rudnicka, A. (2004). SURUSS in perspective. BJOG: An International Journal of Obstetrics & Gynaecology, 111, 521-531. Wald, N. J., Rodeck, C., Hackshaw, A. 
K., & Rudnicka, A. (2005). SURUSS in perspective. Seminars in Perinatology, 29, 225-235. Wald, N. J., Rodeck, C., Hackshaw, A. K., Walters, J., Chitty, L., Mackinson, A. M., et al. (2003). First and second trimester antenatal screening for Down's syndrome: the results of the serum, urine and ultrasound screening study (SURUSS). Health Technology Assessment (Winchester, England), 7, 1-77. Wax, J. R., Guilbert, J., Mather, J., Chen, C., Royer, D., Steinfeld, J. D., & Ingardia, C. J. (2000). Efficacy of community-based second trimester genetic ultrasonography in detecting the chromosomally abnormal fetus. Journal of Ultrasound in Medicine, 19, 689-694. Wayda, K., Kereszturi, A., Orvos, H., Horvath, E., Pal,. A., Kovacs, L., & Szabo, J. (2001). Four years experience of first-trimester nuchal translucency screening for fetal aneuploidies with increasing regional availability. Acta Obstetricia et Gynecologica Scandinavica, 80, 11041109. Wellesley, D., De Vigan, C., Baena, N., Cariati, E., Stoll, C., Boyd, P. A., & Clementi, M. (2004). Contribution of ultrasonographic examination to the prenatal detection of trisomy 21: Experience from 19 European registers. Annales de Genetique, 47, 373-380. Witters, I., Legius, E., Devriendt, K., Moerman, P., Van Schoubroeck, D., Van Assche, A., & Fryns, J. P. (2001). Pregnancy outcome and long term prognosis in 868 children born after second trimester amniocentesis for maternal serum positive triple test screening and normal prenatal karyotype. Journal of Medical Genetics, 38, 336-338. Witters, I., Legius, E., Matthijs, G., & Fryns, J. P. (2002). Prenatal diagnosis of trisomy 21 between 1991 and 1999 in the Leuven Centre for Human Genetics: Effect of triple test screening. Genetic Counseling, 13, 199-202. Wojdemann, K. R., Larsen, S. O., Shalmi, A., Sundberg, K., Christiansen, M., & Tabor, A. (2001). First trimester screening for Down syndrome and assisted reproduction: No basis for concern. Prenatal Diagnosis, 21, 563-565. Won, R. 
H., Currier, R. J., Lorey, F., & Towner, D. R. (2005). The timing of demise in fetuses with trisomy 21 and trisomy 18. Prenatal Diagnosis, 25, 608-611. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING 230 Wray, A. M., Ghidini, A., Alvis, C., Hodor, J., Landy, H. J., & Poggi, S. H. (2005). The impact of firsttrimester screening on AMA patients' uptake of invasive testing. Prenatal Diagnosis, 25, 350353. Wright, D. E., & Bradbury, I. (2005). Repeated measures screening for Down's Syndrome. BJOG: An International Journal of Obstetrics & Gynaecology, 112, 80-83. Wyatt, P. R., Owolabi, T., Meier, C., & Huang, T. (2005). Age-specific risk of fetal loss observed in a second trimester serum screening population. American Journal of Obstetrics and Gynecology, 192, 240-246. Yeo, L., & Vintzileos, A. M. (2003). The use of genetic sonography to reduce the need for amniocentesis in women at high-risk for Down syndrome. Seminars in Perinatology, 27, 152159. SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING 231 Appendix 5: DESIGNATIONS OF LEVELS OF EVIDENCE Level I Evidence obtained from a systematic review (or meta-analysis) of relevant randomised controlled trials. Level II Evidence obtained from at least one randomised controlled trial. Level III. 1 Evidence obtained from pseudorandomised controlled trials (alternate allocation or some other method). 2 Evidence obtained from comparative studies (including a systematic reviews of such studies) with concurrent controls and allocation not randomised, cohort studies, case control studies or interrupted time series with a control group). 3 Evidence obtained from comparative studies with historical control, two or more single-arm studies or interrupted time series without a parallel control group. Level IV Evidence obtained from case series, either post-test or pretest/post-test. *Modified from NHMRC (2000). SCREENING STRATEGIES FOR ANTENATAL DOWN SYNDROME SCREENING