Advancing the Frontiers of Social Science: The Rocky Mountain Research Data Center
The Rocky Mountain Federal Statistical Research Data Center (RMRDC): Opportunities and Challenges
Jani Little, Executive Director
Katie Genadek, Expected Administrator

What is a Federal Statistical Research Data Center (FSRDC)?
--A secure computing lab where restricted data, collected by federal agencies, can be accessed FOR STATISTICAL PURPOSES ONLY
--Made possible by a contractual agreement between a leading research institution and the U.S. Census Bureau
--The Census Bureau’s Center for Economic Studies (CES) directs all FSRDCs and the FSRDC Program
--FSRDCs are managed by an on-site Census employee, the administrator, who guides researchers on proposal development, enforces security guidelines, and serves as liaison with the research community.

Katie Genadek, PhD
RMRDC Administrator
University of Colorado
IBS Room 423

The RMRDC Consortium
Partner Members:
Supporting Members:
UC Colorado Springs
Colorado State Government
Colorado School of Mines
National Center for Atmospheric Research
National Renewable Energy Laboratory

Partner Consortium Members
Faculty, Grad Students, and Affiliated Researchers:
--Free access to RMRDC services and secure laboratory
--Researchers with continued use are expected to write grant proposals and include lab fees

Advantages to Researchers and Institutions:
--Greatly expands the policy and basic questions that can be addressed
--Builds on past research findings with richer data
--Improves competitive edge for grants and publications
--Improves graduate education (big data/statistical techniques) and placement
--Attracts and retains data-intensive faculty

Advantages Provided to Research:
--Microdata not available publicly
    --firms and establishments
    --individuals and households (especially longitudinal studies)
    --children
--Variables not available in public versions of data sets (e.g., low-level geography)
--Full population counts or larger samples (Decennial Census, ACS, CPS)
--Full range of response items (e.g., industry codes, occupational codes, detailed race answers, income that is not top-coded, etc.)
--Ability to make linkages
    --with external data (e.g., via geocodes, establishment ID, etc.)
    --between multiple internal data sets via non-public link keys

FSRDCs Used to Address Many Research Topics
• Business, Trade, Finance, and Management
• Crime and Crime Victimization
• Demography, Population Distributions and Trends, Migration, and Immigration
• Economics, Labor Markets, Entrepreneurship, Employment and Industry
• Education and Education Policy
• Hazard Mitigation, Environmental Impact Assessment, Pollution Abatement
• Health and Well-Being, Health Insurance, Health Policy
• Housing, Housing Markets, and Residential Patterns
• Poverty, Social Welfare Policy, and Social Mobility
• Transportation Analysis and Planning
• Urban and Regional Economics and Planning
• Energy Efficiency and Greenhouse Gas Emissions in Manufacturing

Requirements for Any FSRDC Project:
--Research projects must undergo a formal approval process with the agency that owns the data, e.g., Census, NCHS, AHRQ, BLS
--Researchers must go through a background investigation that qualifies them for “Special Sworn Status (SSS)”, which makes them unpaid Census Bureau employees.
--Results must be formally reviewed for disclosure violations before they leave the secure facility.
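Illustration: linking external data via geocodes
The linkage with external data listed under Advantages Provided to Research can be pictured with a minimal sketch. Everything in it is hypothetical: the record layout, the field names (person_id, tract, has_asthma, pm25), and all values are invented for illustration and do not come from any FSRDC data set or software. It only shows the general idea of a left join on a shared geographic code, plus a simple match-rate check.

```python
# Hypothetical toy records; none of these values come from a real data set.
survey_records = [
    {"person_id": 1, "tract": "08013012100", "has_asthma": True},
    {"person_id": 2, "tract": "08013012200", "has_asthma": False},
    {"person_id": 3, "tract": "08013012100", "has_asthma": False},
    {"person_id": 4, "tract": "08031000100", "has_asthma": True},
]

# Hypothetical external data keyed by the same geocode (a tract identifier).
external_by_tract = {
    "08013012100": {"pm25": 7.5},
    "08013012200": {"pm25": 9.0},
}


def link_by_geocode(records, external, key="tract"):
    """Left join: attach external attributes to each record sharing a geocode.

    Records without a match are kept and flagged with linked=False.
    """
    linked = []
    for rec in records:
        match = external.get(rec[key])
        merged = dict(rec)  # copy so the input records stay unchanged
        merged["linked"] = match is not None
        if match is not None:
            merged.update(match)
        linked.append(merged)
    return linked


linked_records = link_by_geocode(survey_records, external_by_tract)

# Share of survey records that found a matching external record.
match_rate = sum(r["linked"] for r in linked_records) / len(linked_records)
print(f"match rate: {match_rate:.2f}")
```

Keeping unmatched records and reporting the match rate is a design choice of this sketch: it makes visible how much of the sample would be lost if the analysis were limited to linked cases.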
RMRDC: The Physical Facility
Projected Opening: May 2017
Location: IBS Building on CU Boulder Campus
--10 thin client workstations to access FSRDC servers
--Secure communications that tunnel over campus internet
--Contains the Administrator’s office
--Badge Reader at Entrance
--24/7 Security System with camera
--No electronic devices allowed
--NOTHING leaves the secure lab without approval

FSRDC Server Software
Gauss, Stata, Matlab & toolboxes, PBS Pro, Intel Composer XE, NX Enterprise, R, SAS, SAS (Dataflux), SUDAAN, GeoDa, Tomlab, Knitro, Madd, QGIS, StatTransfer, Python - Anaconda, Fortran, Perl, TeX/LaTeX

Components of Proposals:
--Personnel and Time frame
--Project Description (scientific merit, methods, feasibility, why requires restricted data)
--Dataset(s), Variables, Geography
--Results Expected and Disclosure Avoidance Strategies

Proposal Differences by Agency:
                      | Census                  | NCHS and AHRQ
Time to Approval      | 3 months on average     | 1-3 months on average
Benefit to Agency PPS | Required                | Not Required
Fee                   | None                    | $1200 min extract fee NCHS; $300 AHRQ*
Scope                 | Broad (max of 30 pages) | Precise

Major Partners in the FSRDC System
• U.S. Census Bureau
    • Economic Data
    • Demographic Data
    • Longitudinal Employer-Household Dynamics (LEHD) Data
• Bureau of Labor Statistics (BLS)
• National Center for Health Statistics (NCHS)
• Agency for Healthcare Research and Quality (AHRQ)
• Other Federal Partners

Economic data available in RDCs
• Microdata not available elsewhere
• Detailed geographies and industries
• Data linked over time
• Employee and employer linked data
• Full business register for the US
• Can link own data to individual businesses

Examples of Economic Microdata
Data Set | Frequency | Unit of Enumeration | Availability
Standard Statistical Establishment List/Business Register (SSEL) | Annually | Establishment | 1974–2014
Longitudinal Business Database (LBD) | Annually | Establishment | 1976–2014
Census of Auxiliary Establishments (AUX) | Every 5 Years | Establishment | 1977–2012
Census of Construction Industries (CCN) | Every 5 Years | Establishment | 1972–2012
Census of Finance, Insurance, and Real Estate (CFI) | Every 5 Years | Establishment | 1992–2012
Census of Manufactures (CMF) | Every 5 Years | Establishment | 1963, 1967–2012
Census of Mining (CMI) | Every 5 Years | Establishment | 1987–2012
Census of Retail Trade (CRT) | Every 5 Years | Establishment | 1977–2012
Census of Services (CSR) | Every 5 Years | Establishment | 1977–2012
Census of Transportation, Communications, and Utilities (CUT) | Every 5 Years | Establishment | 1987–2012
Census of Wholesale Trade (CWH) | Every 5 Years | Establishment | 1977–2012

Census of Services
• Includes Health Care and Social Assistance Enterprises
• NAICS code 62
• 2012 Number of Establishments in U.S.: 831,303
• Receipts/Revenues ($1,000): 2,040,441,203
• Summary table: https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?src=bkmk

Longitudinal Employer-Household Dynamics (LEHD)
LEHD data combine administrative data from states’ Unemployment Insurance systems with Census Bureau data.
Workers: Employer history and quarterly wages; individual characteristics (sex, age, race); point-in-time residence and place of birth
Employers: Industry, employment, total payroll, location
Linkages between workers and employers
Links to other Census data

Census Data: Demographic data available in RDCs
• More geographic detail, usually block group or tract
• Additional variables
• More observations
• Variables not censored (income)
• Additional detail within variables

Data Available
• Decennial Censuses
• Yearly ACS (American Community Survey)
• Current Population Survey Supplements
• American Housing Survey
• Survey of Income and Program Participation
• National Crime Victimization Survey
• National Longitudinal Mortality Study
• National Longitudinal Surveys (NLS)

• Decennial Censuses
    • 1950-2000 full count short form and 17% long form
    • Long form: household and individual level demographic, socio-economic, program participation, education, household characteristics, etc.
    • 2010 short form only
• Yearly ACS (American Community Survey)
    • Annual full samples: 1.5% of US population
    • Replaced long form from 2000 decennial + a few extra questions
• Current Population Survey Supplements
    • ASEC (Annual Social and Economic Supplement), or March supplement, 1967-2015
    • Fertility Supplement (1998-2012), Food Security (2001-2012), School Enrollment (2004-2014), Tobacco Use (1998-2011), Unbanked (2009-2013), Volunteer (2002-2015), Voter Reg (1998-2012)
• American Housing Survey
    • Some years from 1984-2015; ~50,000 households per year
    • Core questions: home condition, occupant characteristics, home improvements, housing costs, home values, characteristics of recent movers, etc.
    • Topical questions vary by year
• Survey of Income and Program Participation
    • 2-4 year household panels; interviews ~every 4 months; 1984-2014; 14,000 to 52,000 households each wave
    • Core: labor force, income dynamics, government transfers
    • Topical modules vary
• National Crime Victimization Survey
    • Yearly 2006-2014; ~90,000 households
    • Non-fatal and property crimes, reported and unreported; demographic information for respondent; demographic information of perpetrator
• National Longitudinal Mortality Study
    • CPS-ASEC data linked to National Death Index
    • CPS cohorts 1973-1998
• National Longitudinal Survey (NLS)
    • Original cohorts (1966, 1968)
    • Labor market, demographic, and other data collected over 35 years
    • ~5,000 respondents per cohort

Health Restricted Data:
• More geographic detail
• Additional variables
• Child data (under 18 years)
• Additional detail within variables

Restricted Health Data and Variables
• Geographic Codes for all NCHS Surveys
• National Health and Nutrition Examination Survey (NHANES)
• National Health Care Surveys
    • National Ambulatory Medical Care Survey (NAMCS) and National Hospital Ambulatory Medical Care Survey (NHAMCS)
    • National Hospital Discharge Survey (NHDS)
    • National Nursing Home Survey (NNHS) and National Nursing Assistant Survey (NNAS)
    • National Home and Hospice Care Survey (NHHCS) and National Home Health Aide Survey (NHHAS)
    • National Survey of Residential Care Facilities (NSRCF)
    • National Study of Long-Term Care Providers (NSLTCP)
    • National Hospital Care Survey (NHCS)
• National Health Interview Survey (NHIS)
• National Survey of Family Growth (NSFG)
• State and Local Area Integrated Telephone Survey (SLAITS)
    • National Survey of Children's Health (NSCH)
    • National Survey of Children with Special Health Care Needs (CSHCN)
• NCHS Data Linkage Activities
    • Linked Mortality Data Products
    • Linked Medicare Enrollment and Claims Files Data
    • Linked Medicaid Enrollment and Claims Data
    • Linked Social Security Benefit History Data
• National Vital Statistics System (NVSS) Data Release and Access Policy
• National Maternal and Infant Health Survey

Some major health data sources:
• Survey data
    • NHANES
    • NHIS
    • NSCH
• AHRQ Survey data
    • MEPS-HC
    • MEPS-IC
• Health Care Survey data
    • NAMCS
    • NHDS
• Administrative data
    • Vital Records
• Linked Data
    • Mortality Data Products
    • Medicare Enrollment and Claims Data
    • Medicaid Enrollment and Claims Data
    • Social Security Benefit History Data

National Health and Nutrition Examination Survey (NHANES)
• Provides prevalence data on selected diseases and risk factors of the U.S. population
• Monitors trends in diseases, behaviors, and environmental exposures
• Identifies emerging public health concerns
• Provides national baseline information on health and nutrition

National Health and Nutrition Examination Survey (NHANES), 1999-2014
• National probability sample, approx. 10,000
• Data collection from mobile unit
• Interview: acculturation, air quality, allergies, demographics, diet, cognitive functioning, physical activity, sleep disorder, smoking, social support, weight history, family background, food security, alcohol use, bowel health, overall health, depression screening, pesticide exposure, reproductive health, exposure to chemicals, drug use, sexual behavior, etc.
• Physical exam: hearing, body measurements, balance, blood pressure, vision, heart, etc.
• Lab testing: blood, urine, oral rinse, etc.

National Health and Nutrition Examination Survey (NHANES) Restricted Data
• Identifies geography below national level down to Census block
• Youth: Alcohol and Drug Use, ADHD, STDs, Mental Health Disorders, Depression, Sexual Behavior

National Health Interview Survey, 1993-2015
• Annual Sample that is Nationally and Regionally Representative
• Family, Household and Person Self-Report Data
• Extensive Health and Social Psychological Measures including
    • Depression, anxiety
    • Other Mental Health Conditions
    • Other Emotional or Behavioral Problems

National Health Interview Survey, Restricted Data
• Country of Birth and Related Immigration Variables (Person File)
• State and Year of Birth (Person File)
• Industry and Occupation Codes
• Detailed Race and Hispanic Origin (Person File)
• Exact Dates (e.g., date of birth in Person File)
• Low levels of geography from state down to tract

Exposures to Fine Particulate Air Pollution and Respiratory Outcomes in Adults Using Two National Datasets: A Cross-sectional Study
Researchers: Keeve Nachman and Jennifer Parker
Datasets: NHIS, EPA Air Data System (external, linked using geocode)
--Evaluates the relationship between air pollution and asthma across race/ethnicity…
--Revealed significant associations for non-Hispanic blacks but not for Hispanics and non-Hispanic whites

National Survey of Children’s Health
• National telephone survey of households with at least 1 child
• N = 91,642
• Demographics, Health and Functioning, Home Environment
• Early Childhood Care, Developmental Screening
• Adolescent School, Exercise, Emotional Difficulties
• Family Functioning and Parental Health
• Neighborhood and Community
• All variables restricted
• County and zip code geography available

Medical Expenditure Panel Survey--Insurance Component (AHRQ and Census)
• 1996-2006, 2008-2015
• Public (govt) and private sector employers, ~40,000 each year
• Asks about insurance plans offered
• Asks about contributions provided by employers and employees
• Can be linked with Census business data
• Used to document changes in employer-provided insurance before and after ACA

Medical Expenditure Panel Survey--Household Component (AHRQ)
• Annual sample of households from prior year NHIS
• 30,000 persons, 14,000 households
• Health services used, frequency, charges and source of payments
• Access to care and quality of care
• Panel design over 2 years
• Medical Provider Component supplements Household Component
    • Detailed charge and payment data
    • Hospitals, physicians, home health care providers, and pharmacies

National Ambulatory Medical Care Surveys
• Sample of physicians, 1 week of visits, randomly sampled
• Patient demographics, symptoms, diagnoses and medications ordered, number of visits in past year
• Physician demographics, type and size of practice, specialty
• Zip code

Useful Websites
• Restricted NCHS Data: https://www.cdc.gov/rdc/b1datatype/dt122.htm
• Restricted AHRQ Data: https://meps.ahrq.gov/mepsweb/data_stats/onsite_datacenter.jsp