The Role of Local Specificity in the Interpretation of Small Area Estimation

Benmei Liu, Scott Gilkeson, Gordon Willis, Rocky Feuer
2012 FCSM Statistical Policy Seminar
December 4, 2012

Outline
I. Overview of small area estimation
II. The importance of local specificity and how it could affect data use
III. An example from a recent project to estimate cancer risk factors and screening behavior
IV. Discussion

I. Overview of Small Area Estimation (SAE)
- The demand for survey estimates for small areas (small geographic areas or domains) has increased over the past several decades in many areas of application (e.g., income and poverty, education, health, substance use).
- Standard direct estimation methods for survey data cannot provide reliable estimates for such areas because the sample sizes are small.
- Model-based methods that combine information from multiple related sources have been developed to increase precision.

Basic SAE Model and Estimates
- The Fay-Herriot model (1979) has been regarded as the fundamental approach.
- The final estimate for area $i$ derived from the Fay-Herriot class of models is
  $$\theta_i = (1 - p_i) D_i + p_i M_i, \quad i = 1, \dots, m$$
  where:
  - $D_i$ is the direct estimate;
  - $M_i$ is a regression-based synthetic estimate;
  - $p_i$ is the proportion of the final estimate due to the regression-based synthetic estimate, i.e., a measure of the borrowed strength, with $0 \le p_i \le 1$.

II. The Importance of Local Specificity
- We label the information about the use of local versus borrowed data in SAE techniques as local specificity.
- We propose that local specificity be used as a generalizable and intuitively understandable term for the degree to which local data contribute to the small area estimate for a specified area.

The Importance of Local Specificity (Cont'd)
- Local specificity can be an important indicator of fitness for use.
- We argue that local specificity provides unique information that is not otherwise available.
- For local data users, a measure of local specificity could be useful.
- No measure of local specificity was provided on any of the government websites that release small area estimates (e.g., SAIPE, NAAL, NSDUH).

III. Communicating Local Specificity to End Users: An Example
- Project: combining information from two health surveys to enhance small-area estimation (Raghunathan et al. 2007; Davis et al. 2010).
- Led by the National Cancer Institute, in collaboration with:
  - National Center for Health Statistics
  - National Center for Chronic Disease Prevention and Health Promotion
  - University of Michigan
  - University of Pennsylvania
  - Information Management Services

Motivation for the Project
- Cancer screening and risk factor data are of great interest to cancer control planners at the state and sub-state level, but accurate local statistics have been difficult to obtain.
- Different surveys have different strengths.
- Combining information from surveys and borrowing strength from other sources (e.g., Census or administrative records) through a small area modeling approach could improve small-area estimates.

Surveys Used
- Behavioral Risk Factor Surveillance System (BRFSS): the largest U.S. survey tracking health conditions and risk behaviors at the state and sub-state level, conducted since 1984.
  - Limitations: potential nonresponse bias; undercoverage of households without landline phones.
- National Health Interview Survey (NHIS): the principal source of information on the health of the civilian noninstitutionalized population of the U.S., conducted since 1957.
  - Limitations: smaller sample size; includes data on only about ¼ of U.S. counties.

Project Description
- Bayesian methods were developed to combine information from the two surveys; telephone coverage rates from the Census were also incorporated.
- The National Cancer Institute released estimates for two time periods, 1997-99 and 2000-03 (http://sae.cancer.gov/):
  - Smoking, mammography, and Pap smear
  - Counties, health service areas, and states
- Current work involves adding a component for cellphone-only households and producing estimates for more recent periods.

Focus Group Suggestions
- Conducted two focus groups with cancer control planners and public health professionals at the Comprehensive Cancer Control Leadership Institute in June 2010.
- Recommendations:
  - Include these estimates within NCI's State Cancer Profiles website (http://statecancerprofiles.cancer.gov/), a comprehensive system of interactive maps and graphs that enables the investigation of cancer trends at the national, state, and county level.
  - Provide a way to describe the differences between the bias-adjusted model-based estimates and existing direct estimates.
  - Data users would appreciate an indicator like local specificity to validate the estimate against local evidence.

Issues on Communicating Local Specificity
1) How should it be measured?
2) What should it be labeled?
3) What thresholds should be set in assigning values to it?
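To make the measurement question concrete, here is a minimal sketch of the composite estimate from the Basic SAE Model slide. It assumes the standard area-level Fay-Herriot shrinkage, $p_i = \psi_i / (\psi_i + \sigma_v^2)$, where $\psi_i$ is the sampling variance of the direct estimate and $\sigma_v^2$ is the model variance; the slides do not spell this out. All function, class, and parameter names are hypothetical, and reading $1 - p_i$ as a numeric local share is an illustrative assumption rather than the project's method. The bias-adjusted model used in this project has no explicit shrinkage factor, so the sketch applies to the basic model only.

```python
from dataclasses import dataclass


@dataclass
class CompositeEstimate:
    estimate: float        # theta_i = (1 - p_i) * D_i + p_i * M_i
    borrowed_share: float  # p_i, weight on the synthetic estimate
    local_share: float     # 1 - p_i, weight on the direct (local) estimate


def fay_herriot_composite(direct, synthetic, sampling_var, model_var):
    """Combine a direct and a synthetic estimate for one small area.

    Assumption (not stated in the slides): p_i takes the standard
    Fay-Herriot form sampling_var / (sampling_var + model_var).
    """
    if sampling_var < 0 or model_var < 0:
        raise ValueError("variances must be non-negative")
    total = sampling_var + model_var
    if total == 0:
        raise ValueError("at least one variance must be positive")
    p = sampling_var / total
    estimate = (1 - p) * direct + p * synthetic
    return CompositeEstimate(estimate, p, 1 - p)


# Illustrative numbers only (not taken from the project):
# a noisy direct estimate leans heavily on the synthetic estimate.
small_sample = fay_herriot_composite(0.70, 0.60, sampling_var=0.003, model_var=0.001)
# A precise direct estimate keeps most of its own weight.
large_sample = fay_herriot_composite(0.70, 0.60, sampling_var=0.0001, model_var=0.001)

print(small_sample)
print(large_sample)
```

In this sketch, an area with a small sample has a large sampling variance, so more of its estimate comes from the regression-based synthetic component and its local share is lower.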
1) Measuring Local Specificity
- The bias-adjusted SAE model is complex and lacks an explicit shrinkage factor.
- The concept of borrowed strength still applies; it depends primarily on the combined BRFSS and NHIS sample size within the area.
- The NHIS sample size is confidential, but the combined sample size is close to the BRFSS sample size.
- The BRFSS sample size is published and, on its own, was the best practical measure of the amount of local data.

2) Labeling Local Specificity
- Presenting the BRFSS sample size as a number alongside the estimates did not convey the message of local specificity.
- We developed the term local specificity and selected qualitative descriptors (high, medium, and low) rather than quantitative ones.

3) Assigning Thresholds
- Selected a BRFSS sample size of 50 as the threshold for low local specificity.
- Determining break points for the categories of local specificity deserves further study.

[Figure: Ratios of the model-based county-level current mammography screening rate to the bias-corrected BRFSS direct estimate]

[Figure: Small area estimates of mammography screening by county in Pennsylvania, with a mini-map showing local specificity]
- Warren County, 2000-2003: percentage = 65.9 (56.6-75.2)
- Westmoreland County, 2000-2003: percentage = 64.8 (57.5-72.2)

IV. Discussion
- Our experience has convinced us that a measure of local specificity is critical for end users in their use and interpretation of results.
- The potential importance of local specificity should not be underemphasized, given that users demand more from SAEs than from the results of most other statistical models.
- There is no single computational formula for calculating levels of local specificity that applies generally across models; further research is needed.
- Whenever estimates are based on non-ignorable levels of borrowed strength, it is vitally important to disseminate analyses in such a way that local specificity, as an important index of fitness for use, is conveyed to data users in a clear and unbiased manner.

Thank you!

Contact information:
Benmei Liu, Ph.D.
Survey Statistician
National Cancer Institute
[email protected]