Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Chapter 4 Database Management Data Elements Employee database Payroll file Employee record1 Employee record2 Remunerations Employee record3 Employee record4 Name SSN Salary Name SSN Salary Name SSN Salary Name SSN Salary Text: O’Brien : p. 144 Logical Data Elements Field represents an attribute of an entity Record Set of attributes that describe an entity File set of records of the same type transaction file, archive file, ... Database integrated set of logically related files and the rules they have to satisfy Text: O’Brien : p. 145 Database Employee database Payroll data Person data Training data Database Management User Inquiry Applications Request Database Management System Corporate database Text: O’Brien : p. 146 Example CARS (model, #cylinders, origin, tax, fee) Model #cyl origin tax fee -----------------------------------------------------Rabit 4 Germany 15 30 Mustang 6 USA 0 45 Mirafiori 4 Italy 18 30 Accord 4 Japan 20 30 Cutlass 8 USA 0 60 Laguna 4 France 20 30 BMW 6 Germany 15 45 Velsatis 6 France 20 45 Velsatis 4 France 20 30 Origin tax #cyl fee Example Drivers having a driver-no , a name , a home address and a birthdate take out vehicles to make deliveries. A vehicle ( with a unique id-number ) may be taken out of a depot whenever available and kept out for any length of time . It is possible for a vehicle to be taken out more than once on a given day by any driver . Each time a driver takes out a vehicle , he or she takes out a load made up of any quantity of any number of item types , identified by an item-number. Every time a vehicle is taken out , the driver can incur expenses of allowed type (e.g. fuel cost ). Each expense type has a code-no. The amount and code-no are recorded each time an expense is incurred . Any number of stops can be made during the trip . An address of the stop is recorded for each stop , together with the quantity left on that stop for each item type . A driver stops at an address only one during a trip . However, stops can be made at the same address on different trips. Assignment The university that has to keep track of student transcripts could use a database structured as follows. Student(Student Number - Last name - First name - SSSN - Street - Nr - Zip Code - City - State - Country - Phone Number - Birth Date - Gender Class - Major - Minor) Department(Depart. Code - Name - Office Nr - Office Code) Course(Course Code - Course Name - Description - Credits - Department) Section(Course Code - Section Nr - Instructor - Semester - Year) Transcript(Student Number - Name - Course Code - Section Number Grade) Work with DBMS-software Operating system Databases DBMS Application programs Database development Database inquiry Database maintenance Application development Text: O’Brien : p. 147 Data Dictionary Dictionary Access Text: O’Brien : p. 149 Database Inquiry Query Languages SQL Select Name, department, salary From Employee, Payroll Where Employee.number = Payroll.number and Function = “Analyst”. QBE Employee Payroll Name number num number num function “ Analyst” department salary DBMS User interface Query language Report generator DBMS utilities Application generators DBMS kernel program Data-dictionary programs Host-language Program Interface Problems with files Data redundancy Non-integrated files Data dependency Other problems inconsistency integrity Possible solutions : database management system • separation of files and applications • specific database languages • central dictionary Characteristics of databases Redundancy . storage cost . multiple update needed . inconsistency possible Data independence The degree on which data structures can be modified without affecting the related programs Solution : Database architecture with three levels ( conceptual, internal , external ) Types of Databases Operational databases (subject area databases SADB) transaction databases, production databases employee database, stock database, ... Analytical databases ( management databases ) data retrieved from external and operational databases Accessible for OLAP, DSS, management information systems Data Warehouses contain historical data from various operational and external databases central source for standardized and integrated data for management and users data mart is a limited version for a specific subject used for data mining Text: O’Brien : p. 150 Types of Databases (2) Distributed databases mostly for local needs, can be copies consistency problems distributed over servers related, eventually via WWW or intranet Personal databases for end users various data gathered by users on their workstations eventual with spreadsheet or DBMS External databases often access charge statistical , bibliographical Text: O’Brien : p. 151-153 Types of Databases Client-PC or NC End-user databases Text: O’Brien : p. 151 External databases Network server Distributed databases (3) Datawarehouse Operational databases Management databases Hypermedia DB on the WEB WEBbrowser Internet intranet extranet Web-server software Client-PC or NC Network server parts of a Web-information system Text: O’Brien : p. 153 Hypermedia database Data Resource Management Data administration •Policy governing data ownership and access control •Data Resource Requirement Planning •Data model and data architecture Data Planning Database Administration Text: O’Brien : p. 154 •Strategic and technical database planning •Define applications •Procedures for data retention •Operational procedures •Physical database •Logical database •Data dictionary •HW and SW Benefits and Limitations Benefits of a DBMS reduces data redundancy and integrates data data accessible from all programs programs not dependent on data format query/answer and reporting facilities easier application development and programming better integrity and security Limitations of a DBMS increased complexity of the technology development of large databases is difficult and expensive slower than file management systems due to additional layer more sensible for fraud, errors and failures Technical Principles of database management Files Physical Logical A D B C C B E A D E Aspects of file usage File activity input 2000 orders processing 20% revenue state 1 instruction inquiry 1 inquiry Accessibility Evolution Client file 100% 10000 records 0.01% What is a database ? 1. A database is a set of data which are modifiable and accessible by a computer program. text, image, voice, structured information. 2. A database is a set of data which are . structured . integrated . non-redundant . shared It is a source of data for a large number of different applications and for a variety of users. Database type Employee Number Name City STUDIES Diploma CHILDREN year occurrence 12 Duval Rome 16 Krols Dublin 38 Smets Berlin First name Birthday college 87 master math 94 econ 90 Accounting 94 college 91 Bache elec 94 john 120694 inge 241198 lidy 070296 Carl 040894 Supplier file SNUMBER SNAME S1 S2 S3 S4 S5 De Smet Janssens Blanchart Clark Adams STATUS 20 10 30 20 30 CITY London Paris Paris London Athens Supplier file with index on city City-index Supplier file Athens . SNUM LNAME London . S1 De Smet 20 London London . S2 Janssens 10 Paris Paris . S3 Blanchart 30 Paris Paris . S4 Clark 20 London S5 Adams 30 Athens STATUS City Index on city and index on status Supplier file City-index LNUM LNAME Athens . London . London . Paris Paris . . STATUS CITY S1 De Smet 20 London S2 Janssens 10 Paris S3 Blanchart 30 Paris S4 Clark 20 London S5 Adams 30 Athens Status-index . 10 . 20 . 20 . 30 . 30 Non-dense index Supplier file SNUM SNAME S1 De Smet S4 . S2 Janssens 10 Paris S5 . S3 Blanchart 30 Paris S2 . STATUS 20 CITY London S4 Clark 20 London S5 Adams 30 Athens block 1 block 2 block 3 File Organization: Indexed-sequential Bens Dooms Fagin parameters - index block size - data block size multi-level index blocks Adams Albert Bens data blocks Ace Adams Ademar Behr Bens Aerts Albert Bodoo Claes Codd Dooms Bodoo Ernest Fagin Fully Inverted file SNAME-index De Smet Janssens Blanchart Clark Adams S1-> S2-> S3-> S4-> S5-> STATUS-index 10 S1-> 20 S1->,S4-> 30 S3->,S5-> City-index Supplierfile Athens S5-> S1 London S1->,S4-> S2 Paris S2->,S3-> S3 S4 S5 Sample file Supplier file SNUMBER SNAME S1 S2 S3 S4 S5 De Smet Janssens Blanchart Clark Adams STATUS 20 10 30 20 30 CITY London Paris Paris London Athens Organization with key transformation 0 1 S300 Blanchart 30 Paris 2 3 4 5 S200 Janssens 10 Paris 6 7 S500 Adams 30 Athens 8 9 S100 De Smet 10 S400 12 11 Clark 20 London 20 London Hierarchical Database Structure Department data element Project A data element Employee 1 Text: O’Brien : p. 158 Project B data element Employee 2 Network Database Structure Department A Employee 1 Employee 2 Project A Text: O’Brien : p. 158 Department B Employee 3 Project B Relational Structure Department D# D1 D2 D3 D4 D5 Employee Text: O’Brien : p. 158 DName location Manager E# EName FUNCTION SALARY E1 E2 E3 E4 E5 D# D1 D1 D2 D3 D3 Object Oriented Structure Attributes Object class •client Bank account •status •interest Operations •payment •withdrawal Object class Checking account Attributes •credit limit •monthly statement Operations •calculate interest •print status Text: O’Brien : p. 161 Inheritance Object class Saving account Attributes •number of withdrawals •Quarterly statement Operations •calculate interest •print quarterly statement Database Development 1.Data Planning 2. Specification gathering 3. Conceptual design 4. Physical design 5. Logical design Text: O’Brien : p. 164 Data Modeling Entity-relation diagram EMPLOYEE SUPERVISED by Allocated MANAGER STAFFER FREELANCER Text: O’Brien : p. 165 COMPANY CAR Authorizes submits INVOICE