* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Database concepts
Survey
Document related concepts
Serializability wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Microsoft Access wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Oracle Database wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Ingres (database) wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Relational model wikipedia , lookup
Concurrency control wikipedia , lookup
Database model wikipedia , lookup
Transcript
Database concepts A database is a collection of related data. By data, we mean known facts that can be recorded and that have implicit meaning. The preceding definition of database is quite general; for example, we may consider the collection of words that make up this page of text to be related data and hence to constitute a database. However, the common use of the term database is usually more restricted. A database has the following implicit properties: i. ii. iii. A database represents some aspect of the real world, sometimes called the miniworld or the Universe of Discourse (UoD). Changes to the miniworld are reflected in the database. A database is a logically coherent collection of data with some inherent meaning. A random assortment of data cannot correctly be referred to as a database. A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested. Thus a database is a collection of related data that are related in a meaningful way arranged and presented to serve an assigned purpose. For example, consider the names, telephone numbers, and addresses of the people you know. This is a collection of related data with an implicit meaning and hence is a database. Mini-world: Some part of the real world about which data is stored in a database. For example; student grades and transcripts at a university Field: The smallest piece of meaningful information in a file is called a data item or Field. A data item is generally used for a group of alphanumeric characters. Example, Name, Locality, City, State, Pin_Code are all known as Data Items or Fields. Record: Is the collection of related fields File: Is the collection of all related records. A database management system (DBMS) is a collection of programs that enables users to create and maintain a database. The DBMS is a software system that facilitates the processes of defining, constructing, and manipulating databases for various applications. Examples: Oracle, DB2 (IBM), MS SQL Server, MS Access, Ingres, MySQL etc. Typical DBMS Functionality Defining a database involves specifying the data types, structures, and constraints for the data to be stored in the database. 1|Page CST_04204 Database principles Database concepts Constructing the database is the process of storing the data itself on some storage medium that is controlled by the DBMS. Manipulating a database includes such functions as querying the database to retrieve specific data, updating the database to reflect changes in the miniworld, and generating reports from the data. Concurrent processing and sharing by a set of users and programs – yet, keeping all data valid and consistent Database system is simply The DBMS software together with the database itself. Examples: payroll system and enrollment system. Traditional file processing Prior to DBMS, file system provided by OS was used to store information in a file-based system. There was a collection of application programs that perform services for the end user where each program defines and manages its own data. Each user defines and implements the files needed for a specific application as part of programming the application. For example, one user, the grade reporting office, may keep a file on students and their grades. Programs to print a student’s transcript and to enter new grades into the file are implemented. A second user, the accounting office, may keep track of students’ fees and their payments. Although both users are interested in data about students, each user maintains separate files—and programs to manipulate these files—because each requires some data not available from the other user’s files. This redundancy in storing the same data multiple times leads to several problems. First, there is the need to perform a single logical update—such as entering data on a new student—multiple times: once for each file where student data is recorded. This leads to duplication of effort. Second, storage space is wasted when the same data is stored repeatedly, and this problem may be serious for large databases. Third, files that represent the same data may become inconsistent. This may happen because an update is applied to some of the files but not to others. Even if an update—such as adding a new student—is applied to all the appropriate files, the data concerning the student may still be inconsistent since the updates are applied independently by each user group. Other draw backs of filesystem are: Data dependency Does not support multiple user views No Sharing of data among multiple transactions. 2|Page CST_04204 Database principles Database concepts Excessive program maintenance No provision for security, recovery. In the database approach, a single repository of data is maintained that is defined once and then is accessed by various users. The main characteristics of the database approach versus the file-processing approach are the following: i. Self-Describing Nature of a Database System A fundamental characteristic of the database approach is that the database system contains complete definition or description of the database structure and constraints. This definition is stored in the system catalog, which contains information such as the structure of each file, the type and storage format of each data item, and various constraints on the data. The information stored in the catalog is called meta-data, and it describes the structure of the primary database. Therefore the catalog is used by the DBMS software and also by database users who need information about the database structure. ii. Insulation between Programs and Data, and Data Abstraction In traditional file processing, the structure of data files is embedded in the access programs, so any changes to the structure of a file may require changing all programs that access this file. By contrast, DBMS access programs do not require such changes in most cases. The structure of data files is stored in the DBMS catalog separately from the access programs. We call this property program-data independence. iii. Support of Multiple Views of the Data A database typically has many users, each of whom may require a different perspective or view of the database. Some users may not need to be aware of whether the data they refer to is stored or derived. A multiuser DBMS whose users have a variety of applications must provide facilities for defining multiple views. iv. Sharing of Data and Multiuser Transaction Processing A multiuser DBMS, as its name implies, must allow multiple users to access the database at the same time. This is essential if data for multiple applications is to be integrated and maintained in a single database. The DBMS must include concurrency control software to ensure that several users trying to update the same data do so in a controlled manner so that the result of the updates is correct. For example, when several reservation clerks try to assign a seat on an airline flight, the DBMS should ensure that each seat can be accessed by only one clerk at a time for assignment to a passenger. 3|Page CST_04204 Database principles Database concepts Advantages of Using a DBMS i. Controlling Redundancy In traditional software development utilizing file processing, every user group maintains its own files for handling its data-processing applications. In the database approach, the views of different user groups are integrated during database design. For consistency, we should have a database design that stores each logical data item—such as a student’s name or birth date—in only one place in the database. This does not permit inconsistency, and it saves storage space. ii. Restricting Unauthorized Access When multiple users share a database, it is likely that some users will not be authorized to access all information in the database. For example, financial data is often considered confidential, and hence only authorized persons are allowed to access such data. In addition, some users may be permitted only to retrieve data, whereas others are allowed both to retrieve and to update. Typically, users or user groups are given account numbers protected by passwords, which they can use to gain access to the database. A DBMS should provide a security and authorization subsystem, which the DBA uses to create accounts and to specify account restrictions. iii. Providing Multiple User Interfaces Because many types of users with varying levels of technical knowledge use a database, a DBMS should provide a variety of user interfaces. These include query languages for casual users; programming language interfaces for application programmers; forms and command codes for parametric users; and menu-driven interfaces and natural language interfaces for stand-alone users. Both forms-style interfaces and menu-driven interfaces are commonly known as graphical user interfaces (GUIs). iv. Enforcing Integrity Constraints Most database applications have certain integrity constraints that must hold for the data. A DBMS should provide capabilities for defining and enforcing these constraints. The simplest type of integrity constraint involves specifying a data type for each data item.. It is the database designers’ responsibility to identify integrity constraints during database design. Some constraints can be specified to the DBMS and automatically enforced. Other constraints may have to be checked by update programs or at the time of data entry. v. Providing Backup and Recovery 4|Page CST_04204 Database principles Database concepts A DBMS must provide facilities for recovering from hardware or software failures. The backup and recovery subsystem of the DBMS is responsible for recovery. For example, if the computer system fails in the middle of a complex update program, the recovery subsystem is responsible for making sure that the database is restored to the state it was in before the program started executing. Alternatively, the recovery subsystem could ensure that the program is resumed from the point at which it was interrupted so that its full effect is recorded in the database. vi. vii. Greater data independence from applications programs. Sharing of data among multiple users. Disadvantages of Database systems vs file systems i. ii. iii. iv. v. Database systems are complex, difficult, and time-consuming to design Substantial hardware and software start-up costs Damage to database affects virtually all applications programs Extensive conversion costs in moving form a file-based system to a database system Initial training required for all programmers and users When not to use a DBMS i. Main inhibitors (costs) of using a DBMS: High initial investment and possible need for additional hardware. Overhead for providing generality, security, concurrency control, recovery, and integrity functions. ii. When a DBMS may be unnecessary: If the database and applications are simple, well defined, and not expected to change. If there are stringent real-time requirements that may not be met because of DBMS overhead. If access to data by multiple users is not required. Database Users i. Database administrators: responsible for authorizing access to the database, for coordinating and monitoring its use, acquiring software and hardware resources and ensuring Backup and recovery. ii. Database Designers: responsible to define the content, the structure, the constraints, and functions or transactions against the database. They must communicate with the end-users and understand their needs. 5|Page CST_04204 Database principles Database concepts iii. End-users: they use the data for queries, reports and some of them actually update the database content. Categories of End-users i. ii. iii. iv. Casual: access database occasionally when needed. Naïve or Parametric: they make up a large section of the end-user population. They use previously well-defined functions against the database. Examples are banktellers or reservation clerks who do this activity for an entire shift of operations Sophisticated: these include business analysts, scientists, engineers, others thoroughly familiar with the system capabilities. Many use tools in the form of software packages that work closely with the stored database. Stand-alone: mostly maintain personal databases using ready-to-use packaged applications. An example is a tax program user that creates their internal database. Database vendors and their products i. ii. iii. iv. v. IBM: DB2/MVS, DB2/UDB, DB2/400, Informix Dynamic Server (IDS) Microsoft: Access, SQL Server, Desktop Edition (MSDE) Open Source: MySQL, PostgreSQL Oracle: Oracle DBMS, RDB Sybase –Adaptive Server Enterprise (ASE), Adaptive Server Anywhere (ASA), Watcom 6|Page CST_04204 Database principles