Chapter One
Definition and Basic Concepts
What is Data?
• Facts that can be recorded and that have implicit meaning, for example a name or a telephone number.
• Many facts can be recorded, but not everything is recorded:
  1. Only used or useful data need to be recorded, e.g., we need to record students' names and addresses, but we do not need to record their parents' jobs.
  2. Only known facts need to be recorded.
What is a Database?
• A database is a collection of interrelated data.
• A database has the following implicit properties:
  • It represents some aspect of the real world, called the Universe of Discourse (UoD) or Mini World.
  • It is a logically coherent collection of data with some inherent meaning. A random collection of data cannot correctly be referred to as a database.
  • It is designed, built, and populated with data for a specific purpose and for specific users.
  • It can be of any size, from a database of books in a library to world statistical information about water resources.
  • It is made independent of applications and organized to provide a foundation for future application development.
Types of Databases and Database Applications
• Traditional Applications:
  • Numeric and Textual Databases
• More Recent Applications:
  • Multimedia Databases
  • Geographic Information Systems (GIS)
  • Data Warehouses
  • Real-time and Active Databases
  • Many other applications
A Simple Example of a Database
• Consider student records in a university environment where student grades are also recorded. We may have the following files to maintain student-grade information:
  • Student file: where basic information about each student is recorded.
  • Course file: where information about courses is recorded.
  • Section file: where information about each section is stored.
  • Grade-report file: where the grade of each student in each course is recorded.
  • Prerequisite file: where information about prerequisite courses is recorded.
Database that stores student records and their grades

Student
Name   St-Id#   Class   Major
Abc    980001   1       CS
Xyz    980002   2       CS

Course
Course-Name         Course-code   CrHr   Department
Intro to Comp. Sc   CSC100        3      CS
Data structure      CSC215        3      CS
Logic Design        CSC202        4      EE
Database            CSC385        4      CS

Section
Section number   Course-code   Semester   Year   Instructor
1                CSC100        II         2002   Dr. X
2                CSC100        II         2002   Dr. Y
1                CSC202        II         2002   Dr. A
1                CSC385        II         2002   Dr. B
1                CS215         II         2002   Dr. X

Grade-report
St-Id#   Course-code   Grade
98001    CS100         A
98001    CS215         B
98002    CS100         B
98002    CS215         C+
98002    CS385         A

Prerequisite
Course-code   Prerequisite-code
CS215         CS100
CS202         CS100
CS385         CS215
Database Management System (DBMS)
• A database management system is a collection of programs that enables users to create and maintain a database.
• A DBMS is general-purpose software used to:
  • Define: defining a database involves specifying the data types, structures, and constraints for the data to be stored.
    a. A special language such as SQL is used for defining or declaring the database.
    b. SQL has all the necessary commands to specify tables, fields, and data types, as well as some constraints.
    c. One database can have many tables; a table can have one or more fields; each field can have only one data type, e.g., character, numeric, or any other type allowed by the DBMS.
  • Construct: constructing a database is the process of storing the data itself on some storage medium that is controlled by the DBMS.
    a. How and where data is stored on disk is decided by the DBMS.
    b. Data can be stored (inserted) directly through SQL or through a host language program written with a tool such as Developer 2000 or VB.
    c. How the DBMS interacts with the OS and hardware does not concern the developers or users.
  • Manipulate: manipulating a database includes such functions as querying the database to retrieve specific data, updating the database, and generating reports from the data.
    a. All access to the database is done through the host DBMS and through SQL.
    b. The efficiency of storage and retrieval can be enhanced by the DBA by using certain commands and facilities, such as indexing, provided by the DBMS.
    c. Reports can be generated directly using the SELECT statement in SQL or using a special report-writer package provided by the DBMS vendor. There is no limit to the variety of reports that SQL or the report writer can generate from the information in the database.
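The three functions above (define, construct, manipulate) can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: SQLite stands in for whatever DBMS is actually used, and the table name `student`, the column names, and the index name `idx_student_major` are hypothetical choices based on the sample Student file.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Define: declare the table, its fields, data types, and constraints.
conn.execute(
    """
    CREATE TABLE student (
        st_id  INTEGER PRIMARY KEY,
        name   TEXT    NOT NULL,
        class  INTEGER NOT NULL,
        major  TEXT    NOT NULL
    )
    """
)

# Construct: store the data; the DBMS decides how it is laid out on disk.
conn.executemany(
    "INSERT INTO student (st_id, name, class, major) VALUES (?, ?, ?, ?)",
    [(980001, "Abc", 1, "CS"), (980002, "Xyz", 2, "CS")],
)

# An index is one facility the DBMS offers to speed up retrieval by major.
conn.execute("CREATE INDEX idx_student_major ON student (major)")

# Manipulate: update a row, query specific data, and build a small report.
conn.execute("UPDATE student SET class = 2 WHERE st_id = 980001")
rows = conn.execute(
    "SELECT name, class FROM student WHERE major = ? ORDER BY st_id", ("CS",)
).fetchall()
report = conn.execute(
    "SELECT major, COUNT(*) FROM student GROUP BY major"
).fetchall()

for name, student_class in rows:
    print(name, student_class)
print(report)
```

Note that the program never says where or how the rows are stored; it only states what to define, store, and retrieve, and leaves the rest to the DBMS.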
Also note that:
• A DBMS is normally bought as a ready-made package.
• DBMSs vary in cost depending on a number of factors, such as the level of security provided, the level of support, supporting tools for application development, Internet functions, speed of storage and retrieval, etc.
• A DBMS may be stand-alone (single-user) or multi-user.
• The DBMS is like a black box to users and developers: they know how to use it for a variety of purposes but cannot modify its programs.
• A DBMS must provide the basic facilities of retrieve, insert, delete, and update. These four operations can be carried out on any data in a database managed by the DBMS.
A Simplified Database System Environment
[Figure: layers of a simplified database system environment, from top to bottom]
• Users/Programmers: users who use the application system, and programmers who develop the application programs.
• Application Programs/Queries: application programs are developed using RAD tools such as VB; queries are written in SQL.
• Database System:
  • DBMS Software:
    • Software to process queries/programs (this plays the role of a compiler).
    • Software to access stored data (DBMS specific).
  • Stored database definition (meta-data) and stored database.
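The stored database definition (meta-data) in the environment above is itself data that the DBMS keeps and can be queried. A minimal sketch, assuming SQLite as the DBMS: `sqlite_master` and `PRAGMA table_info` are SQLite-specific catalog facilities, other DBMSs expose their catalogs differently, and the `course` table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course ("
    "course_code TEXT PRIMARY KEY, course_name TEXT, cr_hr INTEGER)"
)

# SQLite keeps the stored database definition in its sqlite_master catalog.
definitions = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]

for table_name, create_sql in definitions:
    print(table_name, "->", create_sql)
print(columns)
```

The application asked for no data rows here, only for the description of the data, which is what the "Stored Database Definition" box in the figure holds.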
What is a Database System?
Database System = Database + DBMS + Application Programs
Brief History of Database Systems
• 1940's & 50's: Initial use of computers as calculators. Limited data, focus on algorithms. Science, military applications.
• 1960's: Business uses. Organizational data, customer data, sales, inventory, accounting, etc. File system based, high emphasis on application programs to extract and assimilate data. Larger amounts of data, relatively simple calculations. Hierarchical or Network Database Systems.
• 1970's & 80's: The relational model. Data separated into individual tables, related by keys. Initially required heavy system resources. Examples: Oracle, Sybase, Informix, Digital RDB, IBM DB2.
• Late 1980's: Local area networks. Workgroups sharing resources such as files, printers, e-mail. Client/Server: the database resides on a central server, and application programs run on client PCs attached to the server over a LAN.
• 1990's: Internet and World Wide Web make databases of all kinds available from a single type of client, the Web browser. Object-Oriented Database Systems? Distributed Database Systems? Knowledge-Base Systems.
Users or Actors on the Scene
• There are mainly three kinds of users associated with a database:
1. Database Administrators (DBA)
  • The DBA is responsible for the overall control of the system at the technical level, and is mainly responsible for implementing and maintaining the database.
    1. The DBA handles security issues such as passwords and levels of authority.
    2. The DBA tunes the efficiency of the DBMS and the associated database.
    3. The DBA can be responsible for backup and recovery procedures.
2. Database designers
  • Database designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data.
    1. Database designers may be called by other titles, such as system analysts or design analysts. Normally they are responsible for designing both the data part and the functional part of the system.
    2. Database designers follow a methodology together with techniques to arrive at a good database design. The Entity Relationship model (ERM) is used to design the static part of the system, which is the data design. They then transform the ERM into relations (tables) and optimize these relations using a technique known as Normalisation.
  • Certain tasks are undertaken before the database is actually implemented and populated with data:
    1. The users must be satisfied with the structure of the database. The database must be able to contain all their required data and produce the necessary reports.
    2. The DBA implements the database on the chosen DBMS.
    3. Normally the application system would have been developed, or would be under development.
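The normalisation step mentioned above can be illustrated with a toy example in plain Python. This is only a sketch of one idea behind normalisation (storing a fact that depends on the student alone in one place), not the full procedure; the function name `normalise` and the sample rows are hypothetical.

```python
# Unnormalised rows: the student's name is repeated in every grade row.
flat_rows = [
    {"st_id": 980001, "name": "Abc", "course_code": "CS100", "grade": "A"},
    {"st_id": 980001, "name": "Abc", "course_code": "CS215", "grade": "B"},
    {"st_id": 980002, "name": "Xyz", "course_code": "CS100", "grade": "B"},
]


def normalise(rows):
    """Split flat rows into a student relation and a grade relation."""
    students = {}  # st_id -> name, each name stored exactly once
    grades = []    # (st_id, course_code, grade), linked to students by st_id
    for row in rows:
        students[row["st_id"]] = row["name"]
        grades.append((row["st_id"], row["course_code"], row["grade"]))
    return students, grades


students, grades = normalise(flat_rows)

# A name change is now a single update, and every grade row sees it.
students[980001] = "Abc D."
names_in_report = [students[st_id] for st_id, _, _ in grades]
print(names_in_report)
```

In the flat form the same change would have to be made in two rows, and missing one of them would leave the data inconsistent.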
3. End users
  • End users are the people whose jobs require them to access the database for querying, updating, and generating reports; the database primarily exists for their use.
  • Types of end users:
    • Casual end users:
      • Occasionally access the database.
      • They may need different information each time they access the database.
    • Naïve or parametric end users:
      • Bank tellers: check account balances, and make withdrawals and deposits.
      • Reservation clerks for airlines, hotels, and car rental companies: check availability for a given request and make reservations.
    • Sophisticated end users:
      • Include engineers, scientists, business analysts, and others who are thoroughly familiar with the DBMS and use it to implement their complex queries.
    • Stand-alone users:
      • Mostly maintain personal databases using ready-to-use packaged applications.
      • An example is the user of a tax program that creates its own internal database.
      • Another example is a user who maintains an address book.
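The bank-teller case above shows what "parametric" means: the user runs the same predefined operations again and again and supplies only the parameters. A minimal sketch, assuming SQLite and a hypothetical `account` table; the function names `check_balance`, `deposit`, and `withdraw` are illustrative and not taken from any real banking system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO account VALUES (1, 100)")


def check_balance(conn, acc_no):
    """Canned query: the teller supplies only the account number."""
    row = conn.execute(
        "SELECT balance FROM account WHERE acc_no = ?", (acc_no,)
    ).fetchone()
    if row is None:
        raise KeyError(acc_no)
    return row[0]


def deposit(conn, acc_no, amount):
    """Canned update: add a positive amount to the balance."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
            (amount, acc_no),
        )


def withdraw(conn, acc_no, amount):
    """Canned update: refuse withdrawals larger than the balance."""
    if amount <= 0 or amount > check_balance(conn, acc_no):
        raise ValueError("invalid withdrawal amount")
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
            (amount, acc_no),
        )


deposit(conn, 1, 50)
withdraw(conn, 1, 30)
final_balance = check_balance(conn, 1)
print(final_balance)  # 120
```

The teller never writes SQL; a casual or sophisticated user, by contrast, would compose a different query each time.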
Workers behind the scene
• DBMS system designers and implementers
• Tool developers
• Operators and maintenance personnel
Viewpoints and issues concerned
• DBMS designer/implementer: develop a DBMS.
• Database designer: capture information structures in the real world and design the organization of a database (logical structure and physical structure).
• Database administrator (DBA): monitor operations on a database and keep the database system efficient.
• Application programmer: write programs accessing a database.
• End user (casual, naïve/parametric, sophisticated): enter data and manipulate data through a query language.
Benefits of the Database Approach
1. Redundancy can be reduced
  • In non-database systems each application has its own private files. This can often lead to considerable redundancy in stored data, with a resulting waste of storage space.
  • In the database approach, the files for all applications can be stored at a single location, so redundancy can be carefully controlled. However, sometimes there are sound reasons to maintain multiple copies of the same file.
2. Inconsistency can be avoided
  • If some data (say, the name of a student) is represented in two places in the database, it may happen that at some stage we update the name in one place but forget to update it in the other. At such times the database is said to be inconsistent. Inconsistency can be avoided in a well-designed database by keeping all the data at a single location.
3. The data can be shared
  • Different users and different applications can share data. This means not only that existing applications can share the data in the database, but also that new applications can be developed to operate against that same stored data.
4. Standards can be enforced
  • With central control of the database, the DBA can ensure that all applicable standards are observed in the representation of the data.
  • This is crucial for the success of database applications in large organizations. Standards refer to data item names, display formats, screens, report structures, meta-data (description of data), Web page layouts, etc.
5. Security restrictions can be applied
  • Since the data is controlled and stored at a single location, the DBA can ensure that the only means of access to the database is through the proper channels.
  • Different security checks can be established for each type of access.
6. Integrity can be maintained
  • The problem of integrity is the problem of ensuring that the data in the database is accurate.
  • Data integrity is even more important in a multi-user database system. Without appropriate controls it would be possible for one user to update the database incorrectly, thereby generating bad data and so "infecting" other innocent users of that data.
  • Centralized control of the database can help in avoiding such problems. The DBA can define certain checks to be carried out whenever data is updated or deleted.
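The integrity checks described in point 6 can be sketched as declared constraints that the DBMS enforces on every insert, update, or delete. The example below assumes SQLite; the table names, the list of allowed grades, and the helper `try_insert` are hypothetical, and other DBMSs offer similar but not identical constraint syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE student (st_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE grade_report (
        st_id       INTEGER NOT NULL REFERENCES student (st_id),
        course_code TEXT    NOT NULL,
        grade       TEXT    NOT NULL
                    CHECK (grade IN ('A', 'B', 'C+', 'C', 'D', 'F')),
        PRIMARY KEY (st_id, course_code)
    )
    """
)
conn.execute("INSERT INTO student VALUES (980001, 'Abc')")
conn.execute("INSERT INTO grade_report VALUES (980001, 'CS100', 'A')")


def try_insert(conn, row):
    """Return True if the DBMS accepts the grade row, False if it rejects it."""
    try:
        conn.execute("INSERT INTO grade_report VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


bad_grade = try_insert(conn, (980001, "CS215", "Z"))        # violates CHECK
unknown_student = try_insert(conn, (999999, "CS100", "B"))  # violates foreign key
valid_row = try_insert(conn, (980001, "CS215", "B"))        # accepted

# Deleting a student who still has grades is also rejected.
try:
    conn.execute("DELETE FROM student WHERE st_id = 980001")
    delete_allowed = True
except sqlite3.IntegrityError:
    delete_allowed = False

print(bad_grade, unknown_student, valid_row, delete_allowed)
```

Because the checks live in the database definition rather than in each application, every user and program is held to the same rules.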
Other advantages
• Conflicting requirements can be balanced
  1. Data can be designed so as to accommodate users with conflicting requirements, and satisfactory reports and processing can be generated for all of them.
  2. Since one designer designs one centralized database, all the different requirements can be modeled and reflected in the same database.
• Backup and recovery
  1. The DBMS provides built-in backup and recovery procedures, which can be customized by the DBA to suit the requirements of the organization.
  2. Backups can be taken every minute, every hour, every day, every week, or even every month.
  3. Special recovery procedures can be established to recover from a fault and continue database processing without loss of data or transactions.
  4. For very important databases, disk mirroring can be established, where data is backed up immediately, i.e., as if two parallel systems were operating.
• Data independence
  1. The structure and contents of the database are separate from the application programs, and a change in one should not necessarily necessitate a change in the other.
  2. Unlike in file processing systems, rewriting and recompiling a program does not mean you have to change or recreate the data structures.
• Flexibility: the database structure may evolve as new requirements are defined.
• Speedy development: the incremental time to add each new application is reduced.
• Up-to-date information: extremely important for online transaction systems such as airline, hotel, and car reservations.
• Economies of scale: wasteful overlap of resources and personnel can be avoided by consolidating data and applications across departments.
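Data independence and flexibility, as listed above, can be illustrated by changing the database structure while an existing program keeps working unchanged. A minimal sketch, assuming SQLite; the `student` table and the function `list_names` are hypothetical stand-ins for an existing application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (st_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO student VALUES (980001, 'Abc')")


def list_names(conn):
    """An existing application query that names only the columns it needs."""
    return [
        row[0]
        for row in conn.execute("SELECT name FROM student ORDER BY st_id")
    ]


before = list_names(conn)

# New requirement: record each student's major. The structure evolves...
conn.execute("ALTER TABLE student ADD COLUMN major TEXT")

# ...while the existing program runs unchanged and returns the same result.
after = list_names(conn)

# A new application can use the new column straight away.
majors = conn.execute("SELECT name, major FROM student").fetchall()

print(before, after, majors)
```

In a file-processing system the record layout is typically compiled into each program, so the same change would usually mean editing and recompiling every program that reads the file.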
When Not to Use a DBMS?
• In spite of the advantages of using a DBMS, there are a few situations in which such a system may involve unnecessary overhead costs that would not be incurred in traditional file processing. The overhead costs are due to:
  • High initial investment in hardware, software, and training.
  • The generality that a DBMS provides for defining and processing data.
  • Overhead for providing security, concurrency control, recovery, and integrity functions.
• A DBMS may therefore be unnecessary when:
  • Multiple-user access to data is not required.
  • The database and applications are simple, well defined, and not expected to change.
  • There are stringent (strict) real-time requirements for some programs that may not be met because of DBMS overhead.
Two Major Topics
1. Database Design (Viewpoint of a database designer)
2. Algorithmic Issues on a DBMS (Viewpoint of a DBMS
designer)