Big Data in the National Accounts
Experience in the United States
Brent Moulton
Advisory Expert Group on National Accounts
Washington, DC
9 September 2014
www.bea.gov

What are big data?
▪ Wikipedia: “Any collection of data sets so large and complex that it becomes difficult to process using… traditional data processing applications.”
▪ IBM: “Every day we create 2.5 quintillion bytes of data… This data comes from everywhere… This is big data.”
▪ Forbes: “12 big data definitions: what’s yours?”
    #11: “The belief that the more data you have, the more insights and answers will arise automatically from the pool”
    #12: “A new attitude… that combining data from multiple sources could lead to better decisions.”

Big data and official statistics
▪ Statistical agencies as producers of big data
    Consistency in format and presentation
    Catalogued in common, machine-readable format
    Accessible in bulk
    Desirable to make government data available on a single platform
▪ Big data as source data for national accounts
    Administrative data, especially micro-data
    Data from private sources
    Web scraping

Concerns about using big data
▪ Do the concepts match those needed for national accounts?
▪ How representative are the data?
    Selection biases
▪ Is it possible to fill the gaps in coverage?
▪ Do the data provide consistent time series and classifications?
▪ How timely are the data?
▪ How cost effective?

Defined-benefit pension funds
▪ For the SNA’s new treatment of defined-benefit pensions, BEA found it useful to work with administrative micro-data filed by pension funds
    “Form 5500” data from Pension Benefit Guaranty Corporation
    ~45,000 records per year covering 98% of private pension funds
    BEA had to edit the data to remove errors and anomalies

Private source data for early estimates
▪ For the “advance” GDP estimate (released about 30 days after the end of the quarter), official monthly/quarterly indicators are not always available
▪ Examples of private source data used by BEA:
    Ward’s/J.D. Power/Polk (auto sales/price/registrations)
    American Petroleum Institute (oil drilling)
    Air Transport Association of America (airlines)
    Variety magazine (motion picture admissions)
    Smith Travel Research (hotels and motels)
    Investment Company Institute (mutual fund sales)

Health care satellite account
▪ Schultze Commission (At What Price?, 2002) recommended that health care price indexes be based on the cost of treating a specific diagnosis
▪ BEA is preparing a health care satellite account (http://www.bea.gov/national/health_care_satellite_account.htm)
    One approach uses insurance claims data for several million insured individuals
    Claims grouped in disease episodes
    Allows comparison of change in cost for treating particular diseases

Local area tracking system
▪ Used by BEA’s regional accounts staff for independent data on regional economies
▪ Used to vet official statistics before publishing
▪ Types of data
    Employment data: largest employers, principal industries, recent layoffs
    Natural events affecting the economy
    Local real estate and financial trends
▪ Automated using web scraping methods
    Identifying key word searches
    Archiving relevant articles

BEA research on depreciation
▪ Identifying depreciation in the presence of obsolescence is a long-standing issue
▪ BEA research on motor vehicle depreciation proposes to address this problem using data on “build dates,” which can differ from model years
▪ Data scraping: VIN-level data from decodethis.com combined with auction data from NADA and data from other auto websites
▪ Goal is improved estimates of depreciation

Conclusions
▪ Big data will become increasingly important
▪ Priority goes to improving data quality, filling gaps, and keeping up with a changing economy
▪ Big data are especially useful for research projects
▪ Big data may allow for more timely or higher-frequency estimates
▪ Attention must continue to be paid to traditional data quality issues
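
Illustrative sketch (not BEA's method): the health care satellite account slide describes grouping claims into disease episodes and comparing the change in the cost of treating particular diseases. The Python below shows one simple way such a comparison might be computed. Everything in it is an assumption made for illustration: the claim records are invented, an "episode" is crudely defined as all claims for one patient, one disease, and one year, and the names `episode_costs`, `mean_cost_per_episode`, and `cost_index` are hypothetical.

```python
from collections import defaultdict

# Invented claim records: (patient_id, disease, year, cost in dollars).
CLAIMS = [
    ("p1", "diabetes", 2010, 400.0),
    ("p1", "diabetes", 2010, 600.0),
    ("p2", "diabetes", 2010, 1000.0),
    ("p1", "diabetes", 2011, 1100.0),
    ("p2", "diabetes", 2011, 500.0),
    ("p2", "diabetes", 2011, 600.0),
    ("p3", "asthma", 2010, 200.0),
    ("p3", "asthma", 2011, 250.0),
]


def episode_costs(claims):
    """Sum claim costs into episodes keyed by (patient, disease, year).

    Assumption: one episode per patient, disease, and year.
    """
    totals = defaultdict(float)
    for patient, disease, year, cost in claims:
        totals[(patient, disease, year)] += cost
    return dict(totals)


def mean_cost_per_episode(claims):
    """Average episode cost for each (disease, year) pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (_patient, disease, year), cost in episode_costs(claims).items():
        sums[(disease, year)] += cost
        counts[(disease, year)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def cost_index(claims, disease, base_year, year):
    """Mean episode cost in `year` relative to `base_year` (base = 100)."""
    means = mean_cost_per_episode(claims)
    return 100.0 * means[(disease, year)] / means[(disease, base_year)]


if __name__ == "__main__":
    for disease in ("diabetes", "asthma"):
        print(disease, cost_index(CLAIMS, disease, 2010, 2011))
```

Summing claims to the episode level before averaging keeps the index tied to the cost of treating a condition rather than to the price of any single service, which is the distinction the slide draws.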