* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Software Engineering Syllabus
Survey
Document related concepts
Microsoft Access wikipedia , lookup
Lustre (file system) wikipedia , lookup
Asynchronous I/O wikipedia , lookup
File Allocation Table wikipedia , lookup
File system wikipedia , lookup
Information privacy law wikipedia , lookup
Serializability wikipedia , lookup
Expense and cost recovery system (ECRS) wikipedia , lookup
Operational transformation wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Business intelligence wikipedia , lookup
Data vault modeling wikipedia , lookup
Computer file wikipedia , lookup
Concurrency control wikipedia , lookup
File locking wikipedia , lookup
Versant Object Database wikipedia , lookup
Clusterpoint wikipedia , lookup
Transcript
Chapter 9 Database Systems Introduction to CS 1st Semester, 2012 Sanghyun Park Outline Database Fundamentals Data Models Relational Model Object-Oriented Databases Maintaining Database Integrity Traditional File Structures Data Mining Social Impact of Database Technology (skip) (skip) (skip) Database Fundamentals A database is a collection of data that is ________, not necessarily physically, _________ A database management system (_______) defines, creates, and maintains a database The DBMS also allows users _________ access to data in the database relying on schemas and subschemas A ________ is a description of the entire database structure used by the DBMS A __________ is a description of only that portion of the database ________ to a particular user’s needs Databases have evolved as a means of _________ data storage systems File vs. Database Organization (1/2) _________ data ___________ problem File vs. Database Organization (2/2) Layered Approach to Database Implementation A typical database system consists of ____ layers – an _________ layer and a database management layer Advantages of this dichotomy Allows for the construction and use of _________ tools Provides a means for controlling access to the databases Achieves data _____________ Database Models A database model defines the _______ design of data The model also describes the ____________ between different parts of data In the history of database design, three models have been in use: Hierarchical model Network model Relational model Hierarchical Model Data are organized as a ____ Each entity has only ___ parent but can have several children Network Model Entities are organized in a _____, where some entities can be accessed through _______ paths There is __ hierarchy Relational Model (1/2) Data are organized in two-dimensional ______ called _________ There is no hierarchical or network structure imposed on the data Relational Model (2/2) Each column in a relation is called an _________ The total number of attributes for a relation is called the ______ of the relation Each row in a relation is called a _____ The total number of rows in a relation is called the __________ of the relation The cardinality of a relation changes when tuples are added or deleted; this makes the database _________ Issues of Relational Design (1/4) Lack of efficiency due to __________ Possibility of information ____ when deleting a tuple The source of the problems is that we have combined more than one concept into a ______ relation Issues of Relational Design (2/4) Issues of Relational Design (3/4) ASSIGNMENT relation Job Id Start Date Term Date S25X F5 S25Z 3-1-1999 10-1-2001 5-1-2001 4-30-2001 * * This database contains information about employees in EMPLOYEE relation, about available jobs in JOB relation, and about job history in ASSIGNMENT relation Additional information is implicitly available by ________ the information from _________ relations Issues of Relational Design (4/4) Finding the department in which employee 23Y34 has worked Relational Operations: SELECT Relational Operations: PROJECT Relational Operations: JOIN (1/3) Relational Operations: JOIN (2/3) Relational Operations: JOIN (3/3) Issues of Implementation The data in a database are actually stored in a ____ storage system A DBMS allows the application software to be written in terms of a ______________ It is the duty of the ______ to accept commands in terms of the relational _____ and convert them into actions relative to the actual storage structure The simplest way for a DBMS to implement a relation is to store it as a __________ file To provide rapid access to entries, the DBMS would store the relation as an _______ file or utilize ________ techniques Structured Query Language (1/2) The language called ____ (Structured Query Language) is used extensively by application software programmers for manipulating databases A query involving a combination of SELECT, PROJECT, and JOIN operations can be expressed as a _____ SQL statement We should read a SQL statement as a description of the information ______ rather than a _________ of activities to be performed not procedural but __________ Structured Query Language (2/2) SELECT statement select from where INSERT statement insert values into EMPLOYEE (‘42Z12’, ‘Sue A. Burt’, ‘33 Fair St.’, ‘444661111’) DELETE statement delete where EMPLOYEE.Name, ASSIGNMENT.StartDate EMPLOYEE, ASSIGNMENT EMPLOYEE.EmplId = ASSIGNMENT.EmplId from EMPLOYEE Name = ‘G. Jerry Smith’ UPDATE statement update set where EMPLOYEE Address = ‘1812 Napoleon Ave.’ Name = ‘Joe E. Baker’ Maintaining Database Integrity (1/2) ___________ is a sequence of operations that must all happen together (e.g., money transfer between accounts) Before a transaction is allowed to alter the database, the alteration to be performed is first recorded in the ___ The point at which all the steps in a transaction have been recorded in the log is called the ____________ At this point DBMS becomes ________ to the transaction If problems arise before a transaction has reached its commit point, the log can be used to _______ (undo) the activities actually performed by the transaction Maintaining Database Integrity (2/2) Simultaneous access problems Incorrect summary problem Lost update problem ______ is used for preventing others from accessing data being used by a transaction Shared lock is used when ________ data Exclusive lock is used when _________ data Traditional File Structures A ___ is an external collection of related data treated as a unit The primary purpose of a file is to _____ data Files are stored in what are known as ________ or _________ storage devices such as disk and tape For our purposes, a file is a collection of data _______ with each record consisting of one or more _____ The _____ method determines how records can be retrieved: sequentially or randomly Taxonomy of File Structures Sequential Files While Not EOF { Read the next record Process the record } Applications of Sequential Files The sequential file is used in applications that need to access ___ records from beginning to end For example, if personal information about each employee is stored in a file, we can use sequential access to retrieve each record at the end of the month to print the paychecks The sequential file is not efficient for _______ access Indexed Files To access a record in a file randomly, we need to know the _______ of the record An indexed file is made of a data file, which is a sequential file, and an _____ The index itself is a very ____ file with only two fields: the ___ of the sequential file and the ________ of the corresponding record on the disk Mapping in an Index File Logical View of an Indexed File Hashed Files In an indexed file, the index _____ the key to the address; A hashed file uses a ________ to accomplish this mapping The hashed file eliminates the need for an ____ file (index) However, we will see that hashed files have their own problems Modulo Division Hashing Method Collision (1/2) The population of keys is greater than the ______ of records in the data file Because there are many keys for each address in the file, there is a possibility that more than one key will hash to the _____ address in the file We call the set of keys that hash to the same address ________ A _______ is the event that occurs when a hashing algorithm produces an address for an insertion key, and that address is already occupied Collision (2/2) Collision Resolution: Open Addressing Collision Resolution: Linked List Resolution Collision Resolution: Bucket Hashing