Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Shawn McClure Software Engineer CIRA, Colorado State University [email protected] 970-491-8455 Projects: Visibility Information Exchange Web System (VIEWS) Interagency Monitoring of Protected Visual Environments (IMPROVE) Air Toxics Data Archive (ATDA) WRAP Technical Support System (TSS) Background: Systems Analyst Database Architect Programmer “VIEWS-based” web sites... VIEWS ATDA IMPROVE VIEWS Data Presentation and Analysis Why am I here? To help ask a question: Can we improve upon the data management system* used by the Air Quality Group at CNL? Some possible goals: Understand the current system Identify issues, goals, and requirements Design a new system Implement the new system Maintain and evolve the system * System: consists of the collection of manual and automatic processes by which information is collected, managed, and disseminated, and by which work is done. Some common problems with data management… Data Redundancy and Confusion Results when different people independently collect or copy the same piece of information Can lead to conflicting naming and coding conventions May result in confusion about which version of the data is most recent or most correct Program – Data Dependence Tight relationship between data and the programs required to update and maintain that data A change in data requires a change in all the program that work with the data Changes in data type, field length, etc. cause change in programs Some common problems with data management: (cont’d) Lack of Flexibility Difficulty delivering ad hoc reports and/or responding to unanticipated information requirements in a timely fashion Handling ad hoc requests: Data is in the system, but is very expensive (in time and effort) to assemble and organize Poor Security There is no mechanism for knowing who is accessing the data and how they're modifying it Access to the data can be unsystematic and uncontrolled Some common problems with data management: (cont’d) Lack of Data Sharing and Availability Finding data can be difficult Retrieving data can be difficult Because pieces of information in different files and different parts of the organization cannot be related to one another, it is virtually impossible for information to be shared or accessed in a timely manner Information cannot flow freely across different functional areas or different parts of the organization Functions of a Data Management System Getting data into the system Working with data while it is in the system Getting data out of the system 1. 2. 3. 1 3 Getting Data In Getting Data Out Data Management System 2 Working With Data A data management system in more detail… Import Validation Source Data Database Rules Storage Source Data Source Data Transformation Retrieval Analysis Presentation Interpretation Program Logic Back End Import: Getting data into the system Validation: Ensuring data accuracy Storage: Managing data, backup, and archival Transformation: Sorting, joining, aggregating Front End Retrieval: Getting the data out Presentation: Displaying the data Analysis: Making the data understandable Interpretation: Making the data usable Two Possible Approaches: Transactional Database and a Data Warehouse Transactional Database Transactional Database · · · Fully normalized, third normal form Best for data import, validation, tracking, and management More difficult for end user interaction Data Warehouse Data Warehouse · · · De-normalized “star” schema Best for end user interaction, querying, and front-end applications Harder to update and automatically maintain data integrity A Hybrid Approach: Two Interrelated Relational Database Systems OLTP: Data Warehouse Generation System: Data Warehouse: • Functions as the “back-end” database • Fully relational and in 3rd normal form • Used for data import, validation, and management • Technologies: Microsoft SQL Server • • • • • • • • • • • Extracts data from the OLTP De-normalizes and transforms data Loads data into the Data Warehouse Builds table indexes Archives “snapshots” of the database Technologies: VB, stored procedures Functions as the “front-end” database Uses a de-normalized “star schema” Used for querying and archiving data Automatically generated from the OLTP Technologies: Microsoft SQL Server General Tasks and Associated Questions Determine our goals: What are we trying to achieve? Design a strategy: How are we going to achieve it? Identify problem areas: What do we have to watch out for? Prioritize tasks: What is most important for success? Adjust scope: What set of goals is most realistic? Recommend alternatives: How else could we do things? Allocate resources: How do we support our efforts? Realign expectations: How do we communicate any adjustments? Define milestones: How do we know when we're done? Some Issues and Concerns… What components/aspects of the current CNL-AQG data management system(s) are we interested in examining? How do we determine our priorities? (i.e. What should come first, next, and later?) How do we make any necessary “transitions” with minimal impact to current operations? What new “learning curves” will be involved? How will we need to “shift our paradigms”? How do we maintain a new system? How do we know when we’ve achieved what we want? Thanks! [email protected]