Download Database concepts

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Serializability wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

Microsoft Access wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Oracle Database wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

IMDb wikipedia , lookup

Ingres (database) wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Relational model wikipedia , lookup

Concurrency control wikipedia , lookup

Database wikipedia , lookup

Database model wikipedia , lookup

Clusterpoint wikipedia , lookup

ContactPoint wikipedia , lookup

Transcript
Database concepts
A database is a collection of related data. By data, we mean known facts that can be
recorded and that have implicit meaning. The preceding definition of database is quite
general; for example, we may consider the collection of words that make up this page of
text to be related data and hence to constitute a database. However, the common use of the
term database is usually more restricted.
A database has the following implicit properties:
i.
ii.
iii.
A database represents some aspect of the real world, sometimes called the
miniworld or the Universe of Discourse (UoD). Changes to the miniworld are
reflected in the database.
A database is a logically coherent collection of data with some inherent meaning. A
random assortment of data cannot correctly be referred to as a database.
A database is designed, built, and populated with data for a specific purpose. It has
an intended group of users and some preconceived applications in which these
users are interested.
Thus a database is a collection of related data that are related in a meaningful way
arranged and presented to serve an assigned purpose. For example, consider the names,
telephone numbers, and addresses of the people you know. This is a collection of related
data with an implicit meaning and hence is a database.
Mini-world: Some part of the real world about which data is stored in a database. For
example; student grades and transcripts at a university
Field: The smallest piece of meaningful information in a file is called a data item or Field. A
data item is generally used for a group of alphanumeric characters. Example, Name,
Locality, City, State, Pin_Code are all known as Data Items or Fields.
Record: Is the collection of related fields
File: Is the collection of all related records.
A database management system (DBMS) is a collection of programs that enables users to
create and maintain a database. The DBMS is a software system that facilitates the
processes of defining, constructing, and manipulating databases for various applications.
Examples: Oracle, DB2 (IBM), MS SQL Server, MS Access, Ingres, MySQL etc.
Typical DBMS Functionality
Defining a database involves specifying the data types, structures, and constraints for the
data to be stored in the database.
1|Page
CST_04204 Database principles
Database concepts
Constructing the database is the process of storing the data itself on some storage medium
that is controlled by the DBMS.
Manipulating a database includes such functions as querying the database to retrieve
specific data, updating the database to reflect changes in the miniworld, and generating
reports from the data.
Concurrent processing and sharing by a set of users and programs – yet, keeping all data
valid and consistent
Database system is simply The DBMS software together with the database itself.
Examples: payroll system and enrollment system.
Traditional file processing
Prior to DBMS, file system provided by OS was used to store information in a file-based
system. There was a collection of application programs that perform services for the end
user where each program defines and manages its own data. Each user defines and
implements the files needed for a specific application as part of programming the
application.
For example, one user, the grade reporting office, may keep a file on students and their
grades. Programs to print a student’s transcript and to enter new grades into the file are
implemented. A second user, the accounting office, may keep track of students’ fees and
their payments. Although both users are interested in data about students, each user
maintains separate files—and programs to manipulate these files—because each requires
some data not available from the other user’s files.
This redundancy in storing the same data multiple times leads to several problems. First,
there is the need to perform a single logical update—such as entering data on a new
student—multiple times: once for each file where student data is recorded. This leads to
duplication of effort. Second, storage space is wasted when the same data is stored
repeatedly, and this problem may be serious for large databases. Third, files that represent
the same data may become inconsistent. This may happen because an update is applied to
some of the files but not to others. Even if an update—such as adding a new student—is
applied to all the appropriate files, the data concerning the student may still be inconsistent
since the updates are applied independently by each user group.
Other draw backs of filesystem are:



Data dependency
Does not support multiple user views
No Sharing of data among multiple transactions.
2|Page
CST_04204 Database principles
Database concepts


Excessive program maintenance
No provision for security, recovery.
In the database approach, a single repository of data is maintained that is defined once and
then is accessed by various users.
The main characteristics of the database approach versus the file-processing approach are
the following:
i.
Self-Describing Nature of a Database System
A fundamental characteristic of the database approach is that the database system contains
complete definition or description of the database structure and constraints. This definition
is stored in the system catalog, which contains information such as the structure of each
file, the type and storage format of each data item, and various constraints on the data. The
information stored in the catalog is called meta-data, and it describes the structure of the
primary database. Therefore the catalog is used by the DBMS software and also by
database users who need information about the database structure.
ii.
Insulation between Programs and Data, and Data Abstraction
In traditional file processing, the structure of data files is embedded in the access programs,
so any changes to the structure of a file may require changing all programs that access this
file. By contrast, DBMS access programs do not require such changes in most cases. The
structure of data files is stored in the DBMS catalog separately from the access programs.
We call this property program-data independence.
iii.
Support of Multiple Views of the Data
A database typically has many users, each of whom may require a different perspective or
view of the database. Some users may not need to be aware of whether the data they refer
to is stored or derived. A multiuser DBMS whose users have a variety of applications must
provide facilities for defining multiple views.
iv.
Sharing of Data and Multiuser Transaction Processing
A multiuser DBMS, as its name implies, must allow multiple users to access the database at
the same time. This is essential if data for multiple applications is to be integrated and
maintained in a single database. The DBMS must include concurrency control software to
ensure that several users trying to update the same data do so in a controlled manner so
that the result of the updates is correct. For example, when several reservation clerks try to
assign a seat on an airline flight, the DBMS should ensure that each seat can be accessed by
only one clerk at a time for assignment to a passenger.
3|Page
CST_04204 Database principles
Database concepts
Advantages of Using a DBMS
i.
Controlling Redundancy
In traditional software development utilizing file processing, every user group maintains
its own files for handling its data-processing applications. In the database approach, the
views of different user groups are integrated during database design. For consistency, we
should have a database design that stores each logical data item—such as a student’s name
or birth date—in only one place in the database. This does not permit inconsistency, and it
saves storage space.
ii.
Restricting Unauthorized Access
When multiple users share a database, it is likely that some users will not be authorized to
access all information in the database. For example, financial data is often considered
confidential, and hence only authorized persons are allowed to access such data. In
addition, some users may be permitted only to retrieve data, whereas others are allowed
both to retrieve and to update. Typically, users or user groups are given account numbers
protected by passwords, which they can use to gain access to the database. A DBMS should
provide a security and authorization subsystem, which the DBA uses to create accounts and
to specify account restrictions.
iii.
Providing Multiple User Interfaces
Because many types of users with varying levels of technical knowledge use a database, a
DBMS should provide a variety of user interfaces. These include query languages for casual
users; programming language interfaces for application programmers; forms and
command codes for parametric users; and menu-driven interfaces and natural language
interfaces for stand-alone users. Both forms-style interfaces and menu-driven interfaces
are commonly known as graphical user interfaces (GUIs).
iv.
Enforcing Integrity Constraints
Most database applications have certain integrity constraints that must hold for the data. A
DBMS should provide capabilities for defining and enforcing these constraints. The
simplest type of integrity constraint involves specifying a data type for each data item.. It is
the database designers’ responsibility to identify integrity constraints during database
design. Some constraints can be specified to the DBMS and automatically enforced. Other
constraints may have to be checked by update programs or at the time of data entry.
v.
Providing Backup and Recovery
4|Page
CST_04204 Database principles
Database concepts
A DBMS must provide facilities for recovering from hardware or software failures. The
backup and recovery subsystem of the DBMS is responsible for recovery. For example, if
the computer system fails in the middle of a complex update program, the recovery
subsystem is responsible for making sure that the database is restored to the state it was in
before the program started executing. Alternatively, the recovery subsystem could ensure
that the program is resumed from the point at which it was interrupted so that its full effect
is recorded in the database.
vi.
vii.
Greater data independence from applications programs.
Sharing of data among multiple users.
Disadvantages of Database systems vs file systems
i.
ii.
iii.
iv.
v.
Database systems are complex, difficult, and time-consuming to design
Substantial hardware and software start-up costs
Damage to database affects virtually all applications programs
Extensive conversion costs in moving form a file-based system to a database system
Initial training required for all programmers and users
When not to use a DBMS
i.
Main inhibitors (costs) of using a DBMS:
 High initial investment and possible need for additional hardware.
 Overhead for providing generality, security, concurrency control, recovery, and
integrity functions.
ii.
When a DBMS may be unnecessary:
 If the database and applications are simple, well defined, and not expected to
change.
 If there are stringent real-time requirements that may not be met because of DBMS
overhead.
 If access to data by multiple users is not required.
Database Users
i.
Database administrators: responsible for authorizing access to the database, for
coordinating and monitoring its use, acquiring software and hardware resources
and ensuring Backup and recovery.
ii.
Database Designers: responsible to define the content, the structure, the
constraints, and functions or transactions against the database. They must
communicate with the end-users and understand their needs.
5|Page
CST_04204 Database principles
Database concepts
iii.
End-users: they use the data for queries, reports and some of them actually update
the database content.
Categories of End-users
i.
ii.
iii.
iv.
Casual: access database occasionally when needed.
Naïve or Parametric: they make up a large section of the end-user population. They
use previously well-defined functions against the database. Examples are banktellers or reservation clerks who do this activity for an entire shift of operations
Sophisticated: these include business analysts, scientists, engineers, others
thoroughly familiar with the system capabilities. Many use tools in the form of
software packages that work closely with the stored database.
Stand-alone: mostly maintain personal databases using ready-to-use packaged
applications. An example is a tax program user that creates their internal database.
Database vendors and their products
i.
ii.
iii.
iv.
v.
IBM: DB2/MVS, DB2/UDB, DB2/400, Informix Dynamic Server (IDS)
Microsoft: Access, SQL Server, Desktop Edition (MSDE)
Open Source: MySQL, PostgreSQL
Oracle: Oracle DBMS, RDB
Sybase –Adaptive Server Enterprise (ASE), Adaptive Server Anywhere (ASA),
Watcom
6|Page
CST_04204 Database principles