Automated Data Analysis
Nishan Ahmed
Data Management Training
Cairo, Egypt
April 21 - 25, 2013
National Center for Immunization & Respiratory Diseases
Influenza Division

Objectives
• Why Automated Data Analysis (ADA)
• What does the ADA process involve
• Preparation Steps
• Which Applications can be used
• Software Examples to set up ADA
• Software Considerations
• Basic requirements to develop ADA
• Benefits of a well-developed ADA system

Why Automate Data Analysis
• To simplify and ease routine data analysis
• To standardize analysis output
    • Uniform report
    • Replicable
• To make reporting efficient
• To make sharing of information easier
• To make data management processes easier and more effective
• Enable frequent runs for data checks
• Catch errors quickly
    • Up-to-date editing
    • Quick intervention (error patterns)

What does the ADA process involve
• Plan analysis needs in advance of data collection
• Identify data needed for each process
• Plan and create export or import table templates (e.g. merged tables)
• Verify desired data outputs
• Create standardized routine reports
• Create queries in advance
• Write code in advance

Preparation Steps
• Understand the study objectives
• Identify routine data outputs
• Identify required graphs and summary tables
• Understand the purpose of each report
• Identify core information & variables required
• Outline the data management objectives
    • Evaluation of data quality
    • Enhancement of data quality
    • Tracking data input activities
    • Identification of emerging odd patterns

Software Examples to set up ADA
• Routine and Basic ADA
    • Database Software:
        • Excel
        • Access
        • Epi Info
• Basic to Advanced ADA
    • Statistical Analysis Software:
        • Stata
        • SAS
        • R

Which Applications can be used
• Database software capabilities:
    • Capture information
    • Queries
    • Advanced code
    • Reporting
        • Summary tables
        • Graphs
• Statistical software focuses on
    • Data acquisition and sharing
    • Basic and advanced statistical analysis
    • Output and reporting

Software Considerations - Excel
• Spreadsheet format has limited application
• Simple tables can be created easily
• Has built-in formulas that make it easy to perform simple calculations
• Time-consuming to set up table links
• Links have limitations
• Macros are available
• Limited number of columns/rows allowed
• Better for smaller databases

Software Considerations - Access
• Database application
• Includes an easy-to-use guided setup for queries and reports
• Advanced query capability available
• Built-in formulas make it easy to perform simple calculations
• VBA coding: created behind the scenes for tables, queries, reports
• User can write custom macros to perform specific tasks
• Limited scope for data manipulation
• Limited number of columns within table format

Software Considerations - Statistical
• Can perform basic functions
• Good table merging applications
• Import/export functions compatible with most applications
• Code available to create various report outputs
• Able to analyse large data sets
• Automated functions may be set for reporting as well as sending an automated email
• Simple and advanced macros can be written
• Training is essential, but support websites are available
• Unlimited scope for advancement

Basic Requirements to Develop ADA
• Problem solver and constant learner
    • Search websites for formulas and guidelines
    • Willingness to ask for help when it's needed
    • Practice
• Logical insight and attention to detail
    • Creating queries
    • Creating & understanding desired output
• Additional helpful skills
    • Statistical background
        • Descriptive stats
        • Summary statistics
    • Coding skills (e.g. SQL)
    • Coding translation ability and adaptation

Benefits
• Less time spent
    • Recreating queries to find relevant data
    • Recreating reports
• Reduced effort in reporting data to collaborators and stakeholders
• Detect data inconsistencies more quickly
• Perform more consistent analysis
• Better understanding of your data
• Increase data re-use
• Opportunity to standardize code for advanced analysis

Thank you
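
Supplementary example (not part of the original slides)

The slides recommend writing code in advance so that data checks can be run frequently and always produce a uniform, replicable report. The Python sketch below is one possible illustration of that idea, not the presenter's method. The field names (case_id, site, onset_date, age), the validity rules, and the sample records are all invented for illustration; a real study would substitute its own variables and checks.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical required fields; a real study would define its own.
REQUIRED_FIELDS = ("case_id", "site", "onset_date", "age")


def check_record(row):
    """Return a list of error labels for one record (empty if clean)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not (row.get(field) or "").strip():
            errors.append(f"missing_{field}")
    onset = (row.get("onset_date") or "").strip()
    if onset:
        try:
            # Assumed date convention: YYYY-MM-DD
            datetime.strptime(onset, "%Y-%m-%d")
        except ValueError:
            errors.append("bad_onset_date")
    age = (row.get("age") or "").strip()
    # Assumed plausibility rule: whole years between 0 and 120
    if age and not (age.isdigit() and 0 <= int(age) <= 120):
        errors.append("bad_age")
    return errors


def run_checks(csv_text):
    """Check every record; return (error_counts, flagged_case_ids)."""
    counts = Counter()
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        errors = check_record(row)
        if errors:
            flagged.append(row.get("case_id") or "?")
            counts.update(errors)
    return counts, flagged


def format_report(counts, flagged):
    """Render the same plain-text layout on every run (uniform report)."""
    lines = ["Data check report", f"Records flagged: {len(flagged)}"]
    for label, n in sorted(counts.items()):
        lines.append(f"{label}: {n}")
    return "\n".join(lines)


# Invented sample records, for illustration only.
SAMPLE = """case_id,site,onset_date,age
1,A,2013-04-21,34
2,,2013-04-22,51
3,B,22/04/2013,200
"""

if __name__ == "__main__":
    counts, flagged = run_checks(SAMPLE)
    print(format_report(counts, flagged))
```

Because the checks and the report layout are fixed in code, the same script can be rerun on each new data export, and a sudden rise in one error label points to an emerging error pattern.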