Automated Data Analysis
Nishan Ahmed
Data Management Training
Cairo, Egypt
April 21 - 25, 2013
National Center for Immunization & Respiratory Diseases
Influenza Division
Objectives
• Why Automated Data Analysis (ADA)
• What does the ADA process involve
• Preparation Steps
• Which Applications can be used
• Software Examples to set up ADA
• Software Considerations
• Basic requirements to develop ADA
• Benefits of a well developed ADA system
Why Automate Data Analysis
• To simplify and ease routine data analysis
• To standardize analysis output
• Uniform report
• Replicable
• To make reporting efficient
• To make sharing of information easier
• To make data management processes easier and more effective
• Enable frequent runs for data checks
• Catch errors quickly
• Up-to-date editing
• Quick intervention – error pattern
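
A frequent data check of the kind listed above could be scripted once and re-run on every new batch. The following is a minimal sketch in Python, not a prescribed method: the records, field names (`case_id`, `age`, `onset`, `result`) and validity rules are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical surveillance records; field names and values are illustrative only.
records = [
    {"case_id": "A001", "age": 34, "onset": "2013-03-02", "result": "positive"},
    {"case_id": "A002", "age": 212, "onset": "2013-03-05", "result": "negative"},
    {"case_id": "A003", "age": 7, "onset": "", "result": "positive"},
    {"case_id": "A003", "age": 7, "onset": "2013-03-09", "result": "pos"},
]

# Assumed set of allowed result codes.
VALID_RESULTS = {"positive", "negative"}


def check_records(rows):
    """Return a list of (case_id, problem) tuples for rows that fail a check."""
    problems = []
    seen_ids = set()
    for row in rows:
        cid = row["case_id"]
        # Duplicate identifiers usually point to double entry.
        if cid in seen_ids:
            problems.append((cid, "duplicate case_id"))
        seen_ids.add(cid)
        # Assumed plausible age range of 0 to 120 years.
        if not 0 <= row["age"] <= 120:
            problems.append((cid, "age out of range"))
        # An empty or malformed date raises ValueError.
        try:
            date.fromisoformat(row["onset"])
        except ValueError:
            problems.append((cid, "missing or invalid onset date"))
        if row["result"] not in VALID_RESULTS:
            problems.append((cid, "unexpected result code"))
    return problems


for case_id, problem in check_records(records):
    print(f"{case_id}: {problem}")
```

Because the checks live in one function, the same rules are applied on every run, which is what makes the error report comparable from one week to the next.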
What does the ADA process involve
• Plan analysis needs in advance of data collection
• Identify data needed for each process
• Plan and create export or import table templates
– e.g., merged tables
• Verify desired data outputs
• Create standardized routine reports
• Create queries in advance
• Write code in advance
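
As one possible illustration of a merged table feeding a standardized routine report, the sketch below joins two exports on a shared key and prints a fixed-layout summary. It is written in Python with made-up data; the table names, field names and the helper functions `merge_tables` and `summary_report` are hypothetical and not taken from any particular package.

```python
from collections import Counter

# Hypothetical exports from two tables; field names are illustrative only.
patients = [
    {"case_id": "A001", "site": "Site 1"},
    {"case_id": "A002", "site": "Site 1"},
    {"case_id": "A003", "site": "Site 2"},
]
lab_results = [
    {"case_id": "A001", "result": "positive"},
    {"case_id": "A002", "result": "negative"},
    {"case_id": "A003", "result": "positive"},
]


def merge_tables(left, right, key):
    """Join two lists of dicts on a shared key, keeping only matched rows."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


def summary_report(rows):
    """Build the same fixed-layout text table on every run."""
    counts = Counter((row["site"], row["result"]) for row in rows)
    lines = [f"{'Site':<10}{'Result':<10}{'Count':>5}"]
    for (site, result), n in sorted(counts.items()):
        lines.append(f"{site:<10}{result:<10}{n:>5}")
    return "\n".join(lines)


merged = merge_tables(patients, lab_results, "case_id")
print(summary_report(merged))
```

Writing the merge and the report layout before data collection starts means the first real export can be dropped in without further design work.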
Preparation Steps
• Understand the study objectives
• Identify routine data outputs
• Identify required graphs and summary tables
• Understand the purpose of each report
• Identify core information & variables required
• Outline the data management objectives
• Evaluation of data quality
• Enhancement of data quality
• Tracking data input activities
• Identification of emerging odd patterns
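
Tracking data input and spotting an odd pattern can be as simple as comparing each week's record count with the other weeks. The Python sketch below is only one assumed approach: the weekly counts are invented, and the two-standard-deviation threshold is an arbitrary illustrative choice, not a recommended rule.

```python
from statistics import mean, stdev

# Hypothetical number of records entered per week; values are illustrative only.
weekly_entries = {
    "2013-W10": 48,
    "2013-W11": 52,
    "2013-W12": 50,
    "2013-W13": 47,
    "2013-W14": 12,
}


def flag_odd_weeks(counts, threshold=2.0):
    """Flag weeks whose count is far from the mean of all other weeks.

    A week is flagged when it lies more than `threshold` standard
    deviations away from the mean of the remaining weeks.
    """
    flagged = []
    for week, n in counts.items():
        others = [v for w, v in counts.items() if w != week]
        centre, spread = mean(others), stdev(others)
        if spread > 0 and abs(n - centre) > threshold * spread:
            flagged.append(week)
    return flagged


print(flag_odd_weeks(weekly_entries))
```

With these made-up counts, only the sharp drop in the last week is flagged, which would prompt a check on whether data entry stalled or an export failed.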
Software Examples to set up ADA
• Routine and Basic ADA
• Database Software:
• Excel
• Access
• Epi Info
• Basic to Advanced ADA
• Statistical Analysis Software:
• Stata
• SAS
• R
Which Applications can be used
• Database software capabilities:
• Capture information
• Queries
• Advanced Code
• Reporting
• Summary Tables
• Graphs
• Statistical software focuses on
• Data acquisition and sharing
• Basic and advanced statistical analysis
• Output and reporting
Software Considerations - Excel
• Spreadsheet format has limited application
• Simple tables can be created easily
• Has built-in formulas that make it easy to perform simple calculations
• Time-consuming to set up table links
• Links have limitations
• Macros are available
• Limited number of columns/rows allowed
• Better for smaller databases
Software Considerations - ACCESS
• Database application
• Includes an easy-to-use guided setup for queries and reports
• Advanced query capability available
• Built-in formulas make it easy to perform simple calculations
• VBA coding – created behind the scenes for tables, queries, reports
• User can write custom macros to perform specific tasks
• Limited scope for data manipulation
• Limited number of columns within table format
Software Considerations - Statistical
• Can perform basic functions
• Good table merging applications
• Import/Export functions compatible with most applications
• Code available to create various report outputs
• Able to analyse large data sets
• Automated functions may be set for reporting as well as sending an automated email
• Simple and advanced macros can be written
• Training is essential but support websites are available
• Unlimited scope for advancements
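
The slide names Stata, SAS and R; purely as an illustration of the automated email idea, here is a minimal sketch in Python using only its standard library. The addresses, report text and the helper `build_report_email` are hypothetical, and the message is only built and printed here, not sent.

```python
from email.message import EmailMessage


def build_report_email(report_text, sender, recipients, report_date):
    """Assemble a plain-text email that carries the routine report."""
    msg = EmailMessage()
    msg["Subject"] = f"Routine data report for {report_date}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(report_text)
    return msg


# Hypothetical report text and addresses, for illustration only.
msg = build_report_email(
    "Records received: 3\nRecords with errors: 1\n",
    sender="data.manager@example.org",
    recipients=["team@example.org"],
    report_date="2013-04-21",
)
print(msg)

# Actually sending the message would need a mail server, for example via
# smtplib.SMTP(host).send_message(msg); the host is site-specific and is
# deliberately left out of this sketch.
```

A scheduler (such as cron or Windows Task Scheduler) could run a script like this after each analysis run so that collaborators receive the same report without manual steps.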
Basic Requirements to Develop ADA
• Problem solver and constant learner
• Search websites for formulas and guidelines
• Willingness to ask for help when it’s needed
• Practice
• Logical insight and attention to detail
• Creating queries
• Creating & understanding desired output
• Additional helpful skills
• Statistical background
• Descriptive Stats
• Summary Statistics
• Coding skills (e.g., SQL)
• Coding translation ability and adaptation
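
To show how basic SQL skills and summary statistics combine, the sketch below stores a query written in advance and runs it against a small in-memory SQLite database from Python. The `cases` table, its columns and its rows are invented for illustration; a real project would point the same query at its own database.

```python
import sqlite3

# In-memory database with a hypothetical table; names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (case_id TEXT, site TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?)",
    [("A001", "Site 1", 34), ("A002", "Site 1", 58), ("A003", "Site 2", 7)],
)

# A query written once, in advance, and reused on every run.
SUMMARY_QUERY = """
SELECT site,
       COUNT(*) AS n,
       MIN(age) AS min_age,
       MAX(age) AS max_age,
       AVG(age) AS mean_age
FROM cases
GROUP BY site
ORDER BY site
"""

rows = conn.execute(SUMMARY_QUERY).fetchall()
for row in rows:
    print(row)

conn.close()
```

The same `SELECT ... GROUP BY` pattern carries over, with small syntax changes, to Access queries and to SQL procedures in statistical packages, which is where the ability to translate code between tools becomes useful.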
Benefits
• Less time spent
• Recreating queries to find relevant data
• Recreating reports
• Reduced effort in reporting data to collaborators and stakeholders
• Detect data inconsistencies more quickly
• Perform more consistent analysis
• Better understanding of your data
• Increase data re-use
• Opportunity to standardize codes for advanced analysis
Thank you