
A database is any collection of data.

A DBMS is a software system
designed to maintain a database.

We use a DBMS when
 there is a large amount of data
 security and integrity of the data are important
 many users access the data concurrently
Characteristics of the database approach:
Concurrent Use
Structured Data
Separation of Data and Applications
Data Integrity
Transactions
Data Persistence
Data Views
Concurrent Use:
A database system allows several users to access
the database concurrently. Answering different
questions from different users with the same (base)
data is a central aspect of an information system.
Such concurrent use of data makes the system more
economical: data is captured and stored only once
rather than redundantly, the system can be operated
from a central point of control, and the data can be
updated more efficiently.
Structured Data:
A fundamental feature of the database approach is
that the database system contains not only the
data but also the complete definition and
description of these data. These descriptions are
essentially details about the structure, the type, and
the format of all data and, additionally, the
relationships between the data. This kind of stored
data is called metadata ("data about data").
Data is called structured if it can
be subdivided systematically and linked.
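The idea that a database stores its own description can be sketched with Python's built-in sqlite3 module. SQLite is used here only as a convenient stand-in for a DBMS, and the student table is a hypothetical example, not part of the original material.

```python
import sqlite3

# In-memory database; the "student" table is a made-up example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, year INTEGER)"
)

# sqlite_master is SQLite's catalog table: it holds the definition of
# every table, stored in the database next to the data itself.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(catalog)
print(columns)
conn.close()
```

The application never declares the column types itself; it asks the DBMS for them, which is what "data about data" means in practice.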
Separation of Data and Applications:
A software application does not need any knowledge
about the physical data storage, such as encoding,
format, or storage location. It communicates only
with the database management system (DBMS)
via a standardised interface, using a
standardised language such as SQL. Access to the
data and the metadata is handled entirely by the
DBMS.
In this way all applications can be completely
separated from the data. Internal reorganisations of
the database or efficiency improvements therefore
do not affect the application software.
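A minimal sketch of this separation, assuming SQLite as the DBMS and a hypothetical city table: the application only knows the SQL text, so adding an index (an internal reorganisation) leaves both the query and its result unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO city VALUES (?, ?)", [("A", 100), ("B", 250), ("C", 50)]
)

# The application's only contact with the data is this SQL statement.
QUERY = "SELECT name FROM city WHERE population > 75 ORDER BY name"


def large_cities(connection):
    """Run the fixed application query and return the city names."""
    return [row[0] for row in connection.execute(QUERY)]


before = large_cities(conn)

# Internal reorganisation: add an access path. The application is not touched.
conn.execute("CREATE INDEX idx_city_population ON city (population)")

after = large_cities(conn)
print(before, after)
conn.close()
```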
Data Integrity:
Data integrity ensures the quality and the reliability
of the data in a database system. It also
includes the protection of the database from
unauthorized access (confidentiality) and
unauthorized changes.
A DBMS should admit only correct and consistent
data into the database. Additionally,
correct transactions ensure that consistency is
maintained during the operation of the system.
An example of inconsistency would be
contradictory statements saved in the same
database.
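One way a DBMS keeps incorrect data out is a declared constraint. The sketch below assumes SQLite and a hypothetical account table whose balance may never be negative; the rule itself is an illustration, not something stated in the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause is the integrity rule: balances must not be negative.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account (owner, balance) VALUES ('alice', 100)")

rejected = False
try:
    # This row contradicts the rule, so the DBMS refuses to store it.
    conn.execute("INSERT INTO account (owner, balance) VALUES ('bob', -5)")
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(rejected, row_count)
conn.close()
```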
Transactions:
A transaction is a bundle of actions which are done
within a database to bring it from one consistent
state to a new consistent state. In between, the
data are inevitably inconsistent.
A transaction is atomic, which means it cannot be
divided up any further. Within a transaction, either
all or none of the actions must be carried out. Doing
only a part of the actions would lead to an
inconsistent database state.
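The all-or-nothing behaviour described above can be sketched as a money transfer. SQLite, the account table, and the transfer helper are assumptions made for illustration. The second update in the failing call violates a constraint, so the first update is undone as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 20)])
conn.commit()


def transfer(connection, source, target, amount):
    """Apply both updates or neither (hypothetical helper)."""
    try:
        # As a context manager, the connection commits on success and
        # rolls back if an exception is raised inside the block.
        with connection:
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # both updates applied
failed = transfer(conn, "alice", "bob", 500)  # debit fails, credit is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
conn.close()
```

Crediting before debiting is deliberate here: it shows that a partially executed transaction leaves no trace once the rollback has happened.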
Data Persistence:
Data persistence means that in a DBMS all data is
maintained as long as it is not deleted explicitly. The
life span of data needs to be determined directly or
indirectly by the user and must not depend
on system features. Additionally, data once stored in
a database must not be lost.
Changes to a database that are made by
a transaction are persistent. Once a transaction has
finished, even a system crash cannot put the data in
danger.
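A small sketch of persistence across sessions, assuming SQLite and a hypothetical file name. It does not simulate a crash; it only shows that committed data outlives the connection that wrote it and disappears only when deleted explicitly.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.db")  # hypothetical file name

    # First session: store a row and finish the transaction.
    first = sqlite3.connect(path)
    first.execute("CREATE TABLE note (body TEXT)")
    first.execute("INSERT INTO note VALUES ('kept until deleted')")
    first.commit()
    first.close()

    # Second session: the row is still there.
    second = sqlite3.connect(path)
    notes = [row[0] for row in second.execute("SELECT body FROM note")]

    # Only an explicit delete removes it.
    second.execute("DELETE FROM note")
    second.commit()
    remaining = second.execute("SELECT COUNT(*) FROM note").fetchone()[0]
    second.close()

print(notes, remaining)
```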
Data Views:
Typically, a database has several users, and each of
them, depending on access rights and needs, requires
an individual view of the data (content and form).
Such a data view can consist of a subset of the
stored data or of data derived from the stored data
(not explicitly stored).
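Both kinds of view mentioned above, a subset and derived data, can be sketched with SQL views. SQLite and the employee table with its columns are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (name TEXT, department TEXT, monthly_salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ann", "sales", 3000), ("Bob", "it", 4000), ("Cy", "sales", 3500)],
)

# View 1: a subset of the stored data (rows and columns are restricted).
conn.execute(
    "CREATE VIEW sales_staff AS "
    "SELECT name FROM employee WHERE department = 'sales'"
)

# View 2: derived data; the yearly salary is computed, never stored.
conn.execute(
    "CREATE VIEW yearly_pay AS "
    "SELECT name, monthly_salary * 12 AS yearly_salary FROM employee"
)

sales = [row[0] for row in conn.execute("SELECT name FROM sales_staff ORDER BY name")]
yearly = dict(conn.execute("SELECT name, yearly_salary FROM yearly_pay"))
print(sales, yearly)
conn.close()
```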
Advantages of using a DBMS:
More information from given data
Ad hoc queries can be performed
Redundancy can be reduced
Inconsistency can be avoided
Security restrictions can be applied
Data independence
More cost-effective: reduced development time, flexibility, economies of scale
Providing backup and recovery services
Providing multiple interfaces to different classes of users
Representing complex relationships among data
Enforcing integrity constraints on the database
Drawing inferences and actions using rules
Disadvantages of using a DBMS:
Expensive: hardware, software, personnel, processing overhead, operating cost, etc.
DBMS generality and overhead => performance issues
Increased vulnerability to failure
Recovery is more complex
Three-schema architecture:
• Proposed to support the DBMS characteristics of:
  • Program-data independence.
  • Support of multiple views of the data.
• Defines DBMS schemas at three levels:
  • Internal schema at the internal level to describe physical storage
    structures and access paths. Typically uses a physical data model.
  • Conceptual schema at the conceptual level to describe the
    structure and constraints for the whole database for a
    community of users. Uses a conceptual or an implementation
    data model.
  • External schemas at the external level to describe the various
    user views. Usually uses the same data model as the conceptual
    level.
Mappings among schema levels are needed to
transform requests and data. Programs refer to an
external schema, and are mapped by the DBMS to
the internal schema for execution.
Data independence is defined as the capacity to change the schema at one level of
a database system without having to change the schema at the next higher level.
Types of data independence:
Logical Data Independence:
The capacity to change the conceptual schema without having to
change the external schemas and their associated application
programs.
Physical Data Independence:
The capacity to change the internal schema without having to
change the conceptual schema.
For example, the internal schema may be changed when certain file
structures are reorganized to improve database performance.
When a schema at a lower level is changed, only
the mappings between this schema and higher-level
schemas need to be changed in a DBMS that
fully supports data independence. The higher-level
schemas themselves are unchanged. Hence, the
application programs need not be changed since
they refer to the external schemas.
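Logical data independence can be sketched by treating a SQL view as the external schema and its definition as the mapping. SQLite and all table, view, and column names below are hypothetical. The base table is restructured, only the view definition is rewritten, and the application query stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema, version 1, plus the external schema (a view) on top of it.
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Ada Lovelace')")
conn.execute("CREATE VIEW directory AS SELECT id, full_name FROM person")

# The application refers only to the external schema.
APP_QUERY = "SELECT full_name FROM directory WHERE id = 1"
before = conn.execute(APP_QUERY).fetchone()[0]

# Conceptual schema, version 2: the name is split into two columns.
conn.execute(
    "CREATE TABLE person_v2 (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO person_v2 VALUES (1, 'Ada', 'Lovelace')")
conn.execute("DROP VIEW directory")
conn.execute("DROP TABLE person")

# Only the mapping changes: the view now rebuilds full_name from the new columns.
conn.execute(
    "CREATE VIEW directory AS "
    "SELECT id, first_name || ' ' || last_name AS full_name FROM person_v2"
)

after = conn.execute(APP_QUERY).fetchone()[0]
print(before, after)
conn.close()
```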
A database model is a type of data model that
determines the logical structure of a database and
fundamentally determines in which manner data can be
stored, organized, and manipulated.
Common data models for databases include:
Hierarchical database model
Network model
Relational model
Entity–relationship model
 Enhanced entity–relationship model
Object model
SALIENT FEATURES OF THE HIERARCHICAL MODEL
Logically represented by an upside-down tree
Each parent can have many children
Each child has only one parent
The top layer is perceived as the parent of the
segment directly beneath it.
The segments below other segments are the children
of the segment above them.
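The parent-child rules listed above can be sketched as a small tree structure. The Segment class and the sample names are hypothetical; the point is only that each segment keeps exactly one parent reference while a parent may hold many children.

```python
class Segment:
    """A node in a hierarchical database: one parent, many children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent          # a single parent (None for the root)
        self.children = []            # any number of children
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Walk upward through the single parent links to the root."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


root = Segment("Company")
sales = Segment("Sales", parent=root)
it = Segment("IT", parent=root)
ann = Segment("Ann", parent=sales)

print(ann.path())
print([child.name for child in root.children])
```

Because every segment has one parent, there is exactly one access path from the root to any record, which is also why many-to-many relationships do not fit this model directly.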
Advantages of the hierarchical model:
Conceptual simplicity
Data independence
Efficiency in dealing with a large database
Disadvantages of the hierarchical model:
Complex implementation
Difficult to manage and lack of standards
Lacks structural independence
Applications programming and use complexity
Implementation limitations (no M:N relationship)
Developed in the mid-1960s as part of the work of CODASYL
(Conference on Data Systems Languages).
The network model has greater flexibility than the hierarchical
model for handling complex relationships.
The objective of the network model is to separate data structure from
physical storage and to eliminate unnecessary duplication of data, with
its associated errors and costs.
The network database model was created for three main
purposes:
- representing complex data relationships more effectively
- improving database performance
- imposing a database standard
A major characteristic of this database model is that it
comprises at least two record types: the owner and the
member.
An owner is a record type equivalent to the parent type
in the hierarchical database model, and the member
record type resembles the child type in the hierarchical
model.
The network database model uses a data management
language that defines data characteristics and the data
structure in order to manipulate the data.
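The owner and member record types described above can be sketched as follows. The Record and SetType classes and the customer, agent, and order records are hypothetical, and this is a simplified illustration rather than actual CODASYL syntax. Unlike a child in the hierarchical model, a member record can take part in several owner-member relationships at once.

```python
class Record:
    """A record occurrence with a type name and a key."""

    def __init__(self, record_type, key):
        self.record_type = record_type
        self.key = key
        self.owners = {}  # set name -> owner record


class SetType:
    """A named relationship between an owner type and a member type."""

    def __init__(self, name, owner_type, member_type):
        self.name = name
        self.owner_type = owner_type
        self.member_type = member_type
        self._members = {}  # owner key -> list of member records

    def connect(self, owner, member):
        if owner.record_type != self.owner_type:
            raise TypeError("wrong owner record type")
        if member.record_type != self.member_type:
            raise TypeError("wrong member record type")
        self._members.setdefault(owner.key, []).append(member)
        member.owners[self.name] = owner

    def members(self, owner):
        return list(self._members.get(owner.key, []))


customer = Record("CUSTOMER", "C1")
agent = Record("AGENT", "A7")
order1 = Record("ORDER", "O1")
order2 = Record("ORDER", "O2")

places = SetType("PLACES", "CUSTOMER", "ORDER")
handles = SetType("HANDLES", "AGENT", "ORDER")
places.connect(customer, order1)
places.connect(customer, order2)
handles.connect(agent, order1)

# order1 is a member of two sets, so it has two different owners.
owner_keys = sorted(owner.key for owner in order1.owners.values())
placed = [member.key for member in places.members(customer)]
print(owner_keys, placed)
```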
The network model contains logical information such as
connectivity relationships among nodes and links, directions of
links, and costs of nodes and links.
Advantages of the network model:
Simplicity
Ability to handle more relationship types
Ease of data access
Data integrity
Data independence

Disadvantages of the network model:
System complexity: the structure of the network
model is very difficult to change, and this type of
system is very complex.
Lack of structural independence: any changes
made to the database structure require the application
programs to be modified before they can access data.

The relational model uses a collection of tables to
represent both data and the relationships among the data.
Table, a set of rows and columns. Each column
has a unique name.
Row, a set of columns from a table reflecting a
record.
Primary key, often designated pk, is one or more
columns in a table that make each record unique.
In the relational model, a row is called a tuple, a
column header is called an attribute, and the table is
called a relation.
Foreign key, often designated fk, is a column
common to two tables that defines the
relationship between those two tables.
Foreign keys are either mandatory or optional.
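Primary and foreign keys can be sketched with two related tables. SQLite and the department and employee tables are assumptions for illustration. Note that SQLite enforces foreign keys only after the foreign_keys pragma has been switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# dept_id is nullable here, which makes this foreign key optional.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER REFERENCES department (dept_id))"
)

conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 10)")
conn.execute("INSERT INTO employee VALUES (2, 'Bob', NULL)")

errors = []
bad_statements = (
    "INSERT INTO employee VALUES (1, 'Dup', 10)",  # primary key value already used
    "INSERT INTO employee VALUES (3, 'Cy', 99)",   # no department with dept_id 99
)
for statement in bad_statements:
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError as exc:
        errors.append(str(exc))

# The shared column is what relates the two tables.
joined = conn.execute(
    "SELECT e.name, d.name FROM employee e "
    "JOIN department d ON e.dept_id = d.dept_id"
).fetchall()
print(errors)
print(joined)
conn.close()
```

Declaring dept_id as NOT NULL would turn the optional foreign key into a mandatory one.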
Hardware overhead: an RDBMS needs more powerful
computing hardware and data storage devices to
perform its tasks.
The Entity-Relationship (ER) model is based on a
perception of the real world as a collection
of basic objects, called entities, and relationships among
these objects.
The overall structure of a database can be represented
graphically by an E-R diagram.
An E-R diagram is built up from the
following components:
• Rectangles: represent entity sets
• Ellipses: represent attributes
• Diamonds: represent relationships among entity sets
• Lines: link attributes to entity sets and entity sets to relationships
• Double ellipses: represent multivalued attributes
• Dashed ellipses: denote derived attributes
• Double lines: represent total participation of an entity in a relationship set
• Double rectangles: represent weak entity sets
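As a rough illustration of two of the attribute kinds in the list above, an entity can be sketched as a Python class. The Employee entity and its fields are hypothetical. A multivalued attribute becomes a list, and a derived attribute is computed on demand instead of being stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Employee:
    emp_id: int                      # simple attribute (plain ellipse)
    name: str
    birth_date: date
    # Multivalued attribute (double ellipse): several values per entity.
    phone_numbers: List[str] = field(default_factory=list)

    def age_on(self, today: date) -> int:
        """Derived attribute (dashed ellipse): computed from birth_date."""
        years = today.year - self.birth_date.year
        if (today.month, today.day) < (self.birth_date.month, self.birth_date.day):
            years -= 1  # birthday not yet reached this year
        return years


ann = Employee(1, "Ann", date(1990, 6, 15), ["555-0100", "555-0101"])
age_before = ann.age_on(date(2020, 6, 14))
age_after = ann.age_on(date(2020, 6, 15))
print(len(ann.phone_numbers), age_before, age_after)
```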
The object model is based on the object-oriented
programming language paradigm.
The object-oriented paradigm is based on the
encapsulation of data and code related to an object
into a single unit, on inheritance, and on object identity.
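The three ideas named above can be sketched in a few lines of Python. The Person and Student classes are hypothetical examples, not part of any particular object database.

```python
class Person:
    """Encapsulation: the data and the code that uses it form one unit."""

    def __init__(self, name):
        self._name = name  # state kept inside the object

    def describe(self):
        return f"Person {self._name}"


class Student(Person):
    """Inheritance: Student reuses Person and specialises it."""

    def __init__(self, name, student_no):
        super().__init__(name)
        self._student_no = student_no

    def describe(self):
        return f"Student {self._name} ({self._student_no})"


first = Student("Ann", 1)
second = Student("Ann", 1)

# Object identity: two objects with equal values are still distinct objects.
same_object = first is second
descriptions = [Person("Bob").describe(), first.describe()]
print(same_object, descriptions)
```

In the relational model, two rows with identical values are indistinguishable, whereas here each object keeps its own identity regardless of its contents.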