A database is any collection of data. A DBMS is a software system designed to maintain a database.

We use a DBMS when:
• there is a large amount of data
• security and integrity of the data are important
• many users access the data concurrently

Characteristics of the database approach:
• Concurrent Use
• Structured Data
• Separation of Data and Applications
• Data Integrity
• Transactions
• Data Persistence
• Data Views

Concurrent Use
A database system allows several users to access the database concurrently. Answering different questions from different users with the same (base) data is a central aspect of an information system. Such concurrent use of data makes the system more economical: data is captured and stored only once rather than redundantly, the system can be operated from a central control, and the data can be updated more efficiently.

Structured Data
A fundamental feature of the database approach is that the database system contains not only the data but also the complete definition and description of these data. These descriptions are essentially details about the structure, the type and the format of all data and, additionally, the relationships between the data. This kind of stored data is called metadata ("data about data"). Data is called structured if it can be subdivided systematically and linked.

Separation of Data and Applications
A software application does not need any knowledge about the physical data storage, such as encoding, format or storage place. It only communicates with the database management system (DBMS) via a standardised interface with the help of a standardised language like SQL. Access to the data and the metadata is handled entirely by the DBMS. In this way all the applications can be totally separated from the data. Therefore, internal reorganisations of the database or improvements in efficiency do not have any influence on the application software.
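The two ideas above, stored metadata and access through a standardised language, can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `student` table and its rows are made-up names, and the catalog queries (`sqlite_master`, `PRAGMA table_info`) are specific to SQLite; other DBMSs expose their metadata differently.

```python
import sqlite3

# In-memory SQLite database. The table name, column names and rows
# are illustrative assumptions, not part of the original material.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO student (name) VALUES (?)", [("Asha",), ("Ravi",)])
conn.commit()

# The application asks for data through SQL only. It never deals with
# file formats, encodings or storage locations.
names = [row[0] for row in conn.execute("SELECT name FROM student ORDER BY id")]

# The DBMS also stores the description of the data (metadata).
# sqlite_master is SQLite's schema catalog; PRAGMA table_info lists the
# columns of a table as (cid, name, type, notnull, dflt_value, pk).
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(names)
print(tables)
print(columns)
```

Because the application only issues SQL, the DBMS is free to change how the table is stored internally without the program having to change.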
Data Integrity
Data integrity ensures the quality and the reliability of the data in a database system. It also includes protecting the database from unauthorized access (confidentiality) and unauthorized changes. A DBMS should admit only correct and consistent data into the database. Additionally, correct transactions ensure that consistency is maintained during the operation of the system. An example of inconsistency would be contradictory statements saved in the same database.

Transactions
A transaction is a bundle of actions carried out within a database to bring it from one consistent state to a new consistent state. In between, the data are inevitably inconsistent. A transaction is atomic, which means it cannot be divided up any further: either all of its actions are carried out or none of them are. Carrying out only a part of the actions would lead to an inconsistent database state.

Data Persistence
Data persistence means that in a DBMS all data is maintained as long as it is not deleted explicitly. The life span of data needs to be determined directly or indirectly by the user and must not depend on system features. Additionally, data once stored in a database must not be lost. Changes made to a database by a transaction are persistent: once a transaction is finished, even a system crash cannot put the data in danger.

Data Views
Typically, a database has several users, and each of them, depending on access rights and needs, requires an individual view of the data (content and form). Such a data view can consist of a subset of the stored data or of data derived from the stored data (not explicitly stored).

Advantages of a DBMS:
• More information can be obtained from the given data
• Ad hoc queries can be performed
• Redundancy can be reduced
• Inconsistency can be avoided
• Security restrictions can be applied
• Data independence
• More cost-effective: reduced development time, flexibility, economies of scale
• Providing backup and recovery services
• Providing multiple interfaces to different classes of users
• Representing complex relationships among data
• Enforcing integrity constraints on the database
• Drawing inferences and actions using rules

Disadvantages of a DBMS:
• Expensive hardware, software, personnel, processing overhead, operating cost, etc.
• DBMS generality and overhead can lead to performance issues
• Increased vulnerability to failure
• Recovery is more complex

Three-schema architecture
• Proposed to support the DBMS characteristics of:
  • Program-data independence.
  • Support of multiple views of the data.
• Defines DBMS schemas at three levels:
  • Internal schema at the internal level, to describe physical storage structures and access paths. Typically uses a physical data model.
  • Conceptual schema at the conceptual level, to describe the structure and constraints for the whole database for a community of users. Uses a conceptual or an implementation data model.
  • External schemas at the external level, to describe the various user views. Usually uses the same data model as the conceptual level.

Mappings among schema levels are needed to transform requests and data. Programs refer to an external schema, and are mapped by the DBMS to the internal schema for execution.

Data independence is defined as the capacity to change the schema at one level of a database system without having to change the schema at the next higher level.

Types of data independence (DI):
• Logical Data Independence: the capacity to change the conceptual schema without having to change the external schemas and their associated application programs.
• Physical Data Independence: the capacity to change the internal schema without having to change the conceptual schema. For example, the internal schema may be changed when certain file structures are reorganized to improve database performance.

When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged.
Hence, the application programs need not be changed, since they refer to the external schemas.

A database model is a type of data model that determines the logical structure of a database and fundamentally determines in which manner data can be stored, organized, and manipulated. Common data models for databases include:
• Hierarchical database model
• Network model
• Relational model
• Entity-relationship model
• Enhanced entity-relationship model
• Object model

Hierarchical database model

SALIENT FEATURES
• Logically represented by an upside-down TREE
• Each parent can have many children
• Each child has only one parent
• The top layer is perceived as the parent of the segment directly beneath it.
• The segments below other segments are the children of the segment above them.

Advantages:
• Conceptual simplicity
• Data independence
• Efficiency in dealing with a large database

Disadvantages:
• Complex implementation
• Difficult to manage and lack of standards
• Lacks structural independence
• Complexity of application programming and use
• Implementation limitations (no M:N relationship)

Network model
• Developed in the mid-1960s as part of the work of CODASYL (Conference on Data Systems Languages)
• The network model has greater flexibility than the hierarchical model for handling complex relationships
• The objective of the network model is to separate data structure from physical storage and to eliminate unnecessary duplication of data, with its associated errors and costs

The network database model was created for three main purposes:
• representing complex data relationships more effectively
• improving database performance
• imposing a database standard

A major characteristic of this database model is that it comprises at least two record types: the owner and the member. An owner is a record type equivalent to the parent type in the hierarchical database model, and the member record type resembles the child type in the hierarchical model. The network database model uses a data management language that defines data characteristics and the data structure in order to manipulate the data.
The network model contains logical information such as connectivity relationships among nodes and links, directions of links, and costs of nodes and links.

Advantages:
• Simplicity
• Ability to handle more relationship types
• Ease of data access
• Data integrity
• Data independence

Disadvantages:
• System complexity: the structure of the network model is very difficult to change, and this type of system is very complex.
• Lack of structural independence: any changes made to the database structure require the application programs to be modified before they can access data.

Relational model
The relational model uses a collection of tables to represent both data and the relationships among the data.
• Table: a set of rows and columns. Each column has a unique name.
• Row: a set of column values from a table, representing one record.
• Primary key (often designated pk): one or more columns in a table whose values make each record unique.
• Foreign key (often designated fk): a column common to two tables that defines the relationship between those two tables. Foreign keys are either mandatory or optional.

In the relational model, a row is called a tuple, a column header is called an attribute, and the table is called a relation.
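The primary key and foreign key definitions above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The `department` and `employee` tables, their columns, and the helper `try_insert` are hypothetical names chosen for illustration. Declaring `dept_id` as `NOT NULL` makes the foreign key mandatory; dropping `NOT NULL` would make it optional. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: each employee row points at one department row.
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,                              -- primary key (pk)
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES department(dept_id)   -- mandatory foreign key (fk)
    )
""")
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Meera', 10)")


def try_insert(sql):
    """Run one INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Same emp_id as an existing row: violates the primary key.
duplicate_pk = try_insert("INSERT INTO employee VALUES (1, 'Kiran', 10)")
# dept_id 99 does not exist in department: violates the foreign key.
dangling_fk = try_insert("INSERT INTO employee VALUES (2, 'Kiran', 99)")
# Unique emp_id and an existing dept_id: accepted.
valid_row = try_insert("INSERT INTO employee VALUES (2, 'Kiran', 10)")

# The shared dept_id column is what relates the two tables in a join.
joined = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee e JOIN department d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

print(duplicate_pk, dangling_fk, valid_row)
print(joined)
```

The two rejected inserts show the DBMS, not the application, keeping records unique and relationships valid.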
Hardware overhead: an RDBMS needs more powerful computing hardware and data storage devices to perform its tasks.

Entity-relationship model
The Entity-Relationship (ER) model is based on a perception of the real world as a collection of basic objects, called entities, and relationships among these objects. The overall structure of a database can be represented graphically by an E-R diagram, which is built up from the following components:
• Rectangles: represent entity sets
• Ellipses: represent attributes
• Diamonds: represent relationships among entity sets
• Lines: link attributes to entity sets and entity sets to relationships
• Double ellipses: represent multivalued attributes
• Dashed ellipses: denote derived attributes
• Double lines: represent total participation of an entity in a relationship set
• Double rectangles: represent weak entity sets

Object model
The object model is based on the object-oriented programming language paradigm. The object-oriented paradigm is based on encapsulation of the data and code related to an object into a single unit, on inheritance, and on object identity.
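The three ideas named for the object-oriented paradigm can be illustrated in a few lines of plain Python. This is a language-level sketch only, not an object database; the classes `Person` and `Student` and their fields are hypothetical.

```python
class Person:
    """Encapsulation: the data (_name) and the code that uses it live in one unit."""

    def __init__(self, name):
        self._name = name

    def describe(self):
        return f"Person {self._name}"


class Student(Person):
    """Inheritance: Student reuses Person's data and refines its behaviour."""

    def __init__(self, name, roll_no):
        super().__init__(name)
        self._roll_no = roll_no

    def describe(self):
        return f"Student {self._name} (roll {self._roll_no})"


# Object identity: two objects with identical values are still distinct objects.
a = Student("Asha", 1)
b = Student("Asha", 1)
same_state = a.describe() == b.describe()
same_object = a is b

print(a.describe())
print(same_state, same_object)
```

In contrast to the relational model, where two rows are told apart by their key values, the two `Student` objects here hold the same values yet remain separate objects.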