* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Chapter 1: Database Overview
Survey
Document related concepts
Transcript
Introduction to databases © 2007 by Prentice Hall 1 Objectives • • • • • • • • • • 2 Define terms Show limitations of conventional file processing Show advantages of databases Identify costs and risks of databases List components of database environment Describe evolution of database systems Describe database system development life cycle Explain prototyping and agile development approaches Explain roles of individuals Explain the three-schema architecture for databases Definitions • Database: organized collection of logically related data • Data: stored representations of meaningful objects and events – Structured: numbers, text, dates – Unstructured: images, video, documents • Information: data processed to increase knowledge in the person using the data 3 Data in context Context helps users understand data 4 Summarized data Graphical displays turn data into useful information that managers can use for decision making and interpretation 5 Metadata: Descriptions of the properties or characteristics of the data, including data types, field sizes, allowable values, and data context 6 Old file processing systems at Pine Valley Furniture Company Duplicate Data 7 Disadvantages of File Processing • Program-Data Dependence – All programs maintain metadata for each file they use • Duplication of Data (data redundancy) – Different systems/programs have separate copies of the same data • Limited Data Sharing – No centralized control of data • Lengthy Development Times – Programmers must design their own file formats • Excessive Program Maintenance – 80% of information systems budget 8 Problems with Data Dependency • Each application program typically maintains its own data. – – • Each application program includes code for the metadata of each file (e.g., “record layouts” appear in each cobol program). – 9 One application’s data might be duplicated in another application. (Then the users must do “double data entry” or there needs to be an interface program that copies data from one application to the other.) So, if you want to add a new field to a table (e.g., birth-date added to each employee), every single program must be modified and retested. Problems with Data Dependency • Each application program has its own processing routines for reading, inserting, updating, and deleting data. – Too many lines of code makes it hard to maintain (modify while keeping out the bugs) • Lack of coordination and central control • Non-standard file formats 10 Problems with Data Redundancy • Data inconsistency: Data changes in one file could cause inconsistencies (if not changed in other copies of that data – which one is the latest/correct version?) • Compromises in data integrity: As example of “compromised data integrity” would be allowing records (that are referenced by other data) to be deleted. 11 SOLUTION: The DATABASE Approach • Central repository of data -- accessed by all applications. (so, no data redundancy) • The data is managed (protected) by a software called a Database Management System (no programs can access the database except thru the DBMS) – so, no data inconsistencies. 12 Database Management System A software system that is used to create, maintain, and provide controlled access to user databases Order Filing System Invoicing System Payroll System DBMS Central database Contains employee, order, inventory, pricing, and customer data DBMS manages data resources like an operating system manages hardware resources 13 Advantages of the Database Approach • • • • • • • • • • 14 Program-data independence Less data redundancy Improved data consistency Improved data sharing Increased application development productivity Enforcement of standards Improved data quality Improved data accessibility and responsiveness Reduced program maintenance Improved decision support Costs and Risks of the Database Approach • Installation and conversion costs • management cost and complexity - specialized personnel are needed to support the Database and DBMS. • Organizational conflict – (user groups may not agree about the details of the shared data, or who should be allowed to access/update it) 15 Definitions • Relational Database: a database that contains related tables and little/no redundancy. Typically accessed by business applications. • Data model: Graphical representation (picture) of a database, showing it’s tables and how they are related. • Database Management System (DBMS) : software for managing the database. No program can see or modify the database unless it “passes through” the DBMS which acts guards to ensure – Everyone sees only what they are allowed to see and modifies only what they are allowed to modify • • – Data integrity is maintained (e.g., you can’t delete data that is being referenced by other data. Repository: A database management system can work with ANY data model. How? It stores the details about the data model (table names, field names and types etc). These details aer called “METADATA”. Database Application: – A business application program that supports business functions and accesses a database, e.g., add new data, update data, delete data, read data, summarize/report data) – Usually called a Web application (if used in browser, run over the internet) or a windows app (if installed directly on a PC and run over a local area network). 16 One customer may place many orders, but each order is placed by a single customer One-to-many relationship 17 One order has many order lines; each order line is associated with a single order One-to-many relationship 18 One product can be in many order lines, each order line refers to a single product One-to-many relationship 19 Therefore, one order involves many products and one product is involved in many orders Many-to-many relationship 20 Enterprise Database Applications • Enterprise Resource Planning (ERP) – Integrate all enterprise functions (manufacturing, finance, sales, marketing, inventory, accounting, human resources) • Data Warehouse – Integrated decision support system derived from various operational databases 21 Enterprise Data Warehouse – A data warehouse is an Integrated decision support system derived from various operational databases. The data warehouse contains historical copies of transactions of those operational systems. • By getting all that data into one place, users can then ask questions of the warehouse , such as which customers buy the most, when, how often. 22 Enterprise Data Warehouse • Each day data is cleaned/reconciled (one system may call something one way and another system may call it another way) so that it can be loaded into the warehouse. • While the data is being loaded, it is also being summarized. This is to make it more efficient to access. Otherwise there is too much detailed data that needs to be summarized on the fly and performance suffers. • During the load process, the warehouse is not available for users (because the totals wont be correct until the load is complete). 23 Enterprise Data Warehouse • It is interesting to note that good design principles for relational databases (no data redundancy, many constraints to maintain data integrity) are BAD design for data warehouse • Data Warehouses (because they load lots of data from different sources) use lots of data redundancy (e.g., precalculated totals) to make queries run faster. • Data Warehouses have few or no constraints so that the load process goes faster (so that the warehouse becomes available for users as soon as possible). • We’ll learn more about warehouses later… 24 An enterprise data warehouse • Data is extracted from multiple systems within a corporation. • This data is transformed (as necessary) and merged possibly adding data from external sources) into the data warehouse. Evolution of DB Systems 26 Summary • Define terms • Show limitations of conventional file processing • Show advantages of databases • Identify costs and risks of databases • Data Models • Define Enterprise Data Warehouse • Describe evolution of database systems