Data Mart - KV Institute of Management and Information Studies

Introduction: Database Management System
• A database management system (DBMS) is software designed to assist in maintaining and using large collections of data.
• The first general-purpose DBMS, the Integrated Data Store (IDS), was designed by Charles Bachman in the early 1960s.
• Later in the 1960s, IBM introduced IMS (Information Management System).
• In 1970, Edgar F. Codd at IBM proposed the relational model, the basis of the relational DBMS (RDBMS).
• In the 1980s, SQL (Structured Query Language) became the standard query language for relational systems.
• From the 1980s to the 1990s there were further advances in DBMS products, e.g. DB2 and Oracle.

Database Management System
• A database management system (DBMS) is a collection of programs that enables users to create and maintain a database.
• The DBMS is therefore a general-purpose software system that facilitates the process of defining, constructing, and manipulating databases for various applications.
• A DBMS is efficient to use because it offers a wide variety of sophisticated techniques for storing and retrieving data.

Organisation of DBMS
The data may be logically organized into:
• Characters
• Fields
• Records
• Files
• Database

Characteristics of DBMS
• Non-redundant data: It avoids unnecessary duplication of data and effectively reduces the total amount of data storage required.
• Sharing data: A database allows the data under its control to be shared by any number of application programs or users.
• Data integrity: The data contained in the database is both accurate and consistent.
• Data security: Data is of vital importance to an organization and may be confidential. The database administrator (DBA), who has ultimate responsibility for the data in the DBMS, can ensure that proper access procedures are followed, including proper authentication schemes for access to the DBMS and additional checks before permitting access to sensitive data.
• Conflict resolution: Since the database is under the control of the DBA, she or he should resolve the conflicting requirements of various users and applications.

Disadvantages of DBMS
• Unauthorized access.
• Threat of failure.
• Need to control data quality.
• Threat to data integrity.
• Enterprise vulnerability.
• Cost of using DBMS.

Components of DBMS
• DML precompiler
• DDL compiler
• File manager
• Database manager
• Query processor
• Database administrator
• Data dictionary
• Storage manager
• Database users

Architecture of DBMS
Three-Level Database Architecture
• Data are actually stored as bits, or as numbers and strings, but it is difficult to work with data at this level.
• It is necessary to view data at different levels of abstraction.
• There are three levels, or layers, of DBMS architecture:
  • External level
  • Conceptual level
  • Internal level

[Figure: Architecture of a DBMS]

External Level
• The external level is the view that an individual user has of the database.
• This view is often a restricted view of the database, and the same database may provide a number of different views for different classes of users.

Conceptual Level
• The conceptual view is the overall community view of the database, and it includes all the information that is going to be represented in the database.
• The conceptual view is defined by the conceptual schema, which includes definitions of each of the various types of data.

Internal Level
• The internal view is the view of the actual physical storage of data.
• It describes what data is stored in the database and how it is stored.
• The following aspects are considered at this level:
  • Storage allocation
  • Access paths
  • Miscellaneous

Categories of Data Model
• Record-based models: relational, network, hierarchical.
• Object-based models: entity-relationship model, object-oriented model.

Relational Database Management System (RDBMS) Model
• This model represents data and relationships among data by a collection of tables known as relations, each of which has a number of columns with unique names.
• For example, consider the following wage table.
Name   Hours   Rate   Total
Raju   40      10     400
Sabi   38      8.75   332.50
Ram    42      9.25   388.50

Concepts of RDBMS Model
• E. F. Codd of IBM proposed the relational model in 1970.
• Some of the basic concepts of the relational model are:
  • The relational database is a collection of two-dimensional tables.
  • Each table represents some real-world person, place, thing, or event about which information is collected.
  • The organization of data into relational tables is known as the logical view of the database.

Advantages of RDBMS
• Ease of use
• Flexibility
• Precision
• Security
• Data independence
• Data manipulation language

Disadvantages of RDBMS
• A major constraint, and therefore disadvantage, in the use of a relational database system is machine performance.
• If the number of tables between which relationships must be established is large and the tables themselves are voluminous, the performance in responding to queries degrades.

Network Database Management System (NDBMS)
• This model represents data by collections of records, and relationships among data by links, which can be viewed as pointers.
• It has three basic components:
  • Record type: represents a finite number of entities of a similar type.
  • Data elements: entities are distinguished by the values of the data elements associated with the corresponding record type.
  • Links: all relationships between the same or different record types are restricted to binary, many-one relationships. These many-one relationships are called links.

[Figure: Type-level view in the network model. Many-one: Teachers, Courses. Many-many: Employees, Work_in, Project.]

Advantages of NDBMS Model
• Conceptual simplicity.
• Capability to handle more relationship types.
• Ease of data access.
• Data integrity.
• Data independence.

Disadvantages of NDBMS Model
• System complexity.
• Operational anomalies.
• Absence of structural independence.
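Returning to the wage table from the relational model section, the following is a minimal sketch of how such a relation could be defined and queried. It assumes SQLite through Python's built-in sqlite3 module, which the source does not mention; the table name `wage`, the column name `name`, and the helper functions are illustrative choices, not part of the original slides.

```python
import sqlite3


def build_wage_table():
    """Create an in-memory relation holding the wage rows from the example."""
    conn = sqlite3.connect(":memory:")
    # Each column has a unique name, as the relational model requires.
    # The "name" column is an assumed label for the first column of the table.
    conn.execute(
        "CREATE TABLE wage (name TEXT PRIMARY KEY, hours REAL, rate REAL)"
    )
    conn.executemany(
        "INSERT INTO wage (name, hours, rate) VALUES (?, ?, ?)",
        [("Raju", 40, 10), ("Sabi", 38, 8.75), ("Ram", 42, 9.25)],
    )
    return conn


def wage_totals(conn):
    """Return {name: hours * rate}, computing the total instead of storing it."""
    rows = conn.execute("SELECT name, hours * rate FROM wage").fetchall()
    return dict(rows)


if __name__ == "__main__":
    for name, total in wage_totals(build_wage_table()).items():
        print(name, total)
```

In this sketch the total is derived in the query rather than stored as a column, which is one way to avoid keeping redundant data in the table.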
Hierarchical Database Management System (HDBMS) Model
• This model is similar to the network model in the sense that data and relationships among data are represented by records and links respectively.
• The hierarchical data model uses tree structures to represent relationships among records.
• A parent record can have many child records, but a child record can have only one parent.
• There are no many-to-many relationships between records. No dependent record within a hierarchical data structure can exist without its parent record.
• A hierarchical database therefore consists of a collection of records that are connected with each other through links.
• Each record is a collection of fields (attributes), each of which contains one data value.
• A link is an association between precisely two records.

Example: consider the employee hierarchy.
[Figure: Employee hierarchy. Root: Employee. First-level children: Compensation, Job Assignment, Benefits. Second-level children: Rating, Salary, Pension, Insurance, Health.]

Advantages of HDBMS Model
• It is a simple, straightforward, and natural method of implementing record relationships.

Disadvantages of HDBMS Model
• It cannot represent all the relationships that occur in the real world.
• It is suitable only when the data in the database is hierarchical in character.

Concurrency Management
• In computer science, concurrency is a property of systems in which several computations execute simultaneously and potentially interact with each other.
• Concurrency control methods are required to ensure that transaction updates do not result in incorrect execution, e.g. the update of one transaction overwriting another's update.
• Concurrency control ensures both consistency and isolation.

Reasons for Concurrency Management
Concurrency management is used for the following reasons:
• To improve throughput and resource utilisation.
• To reduce waiting time.

Methods to Manage Concurrent Access
The problem of concurrent access can be solved in a number of ways.
Some of them are as follows:
• Locking the file
• Locking the record
• Locking the data field
• Versioning

Data Warehouse
• A data warehouse is a place where data is stored so that applications can access and share it easily.
• A data warehouse is of course a database, but it contains summarized information.

[Figure: A data warehousing system. Labels: various company databases, data warehouse software, data warehouse database, information discovery.]

Features of Data Warehousing
A common way of introducing data warehousing is to refer to the characteristics of a data warehouse:
• Subject oriented
• Integrated
• Time variant
• Nonvolatile

Warehouse Data Modeling Levels
• There are three levels of data modeling:
  • Physical
  • Logical
  • Data mart
• Each level of data modeling has its own purpose in data warehouse design.

Data Mart
• A data mart is a simple form of a data warehouse that is focused on a single subject (or functional area), such as Sales, Finance, or Marketing.
• Data marts are often built and controlled by a single department within an organization.

What Are the Steps in Implementing a Data Mart?
The major steps in implementing a data mart are:
• Designing
• Constructing
• Populating
• Accessing
• Managing
That is, design the schema, construct the physical storage, populate the data mart with data from source systems, access it to make informed decisions, and manage it over time.

Types of Data Mart
• Multidimensional database: it supports the management capability of analytically looking at the same data in different ways.
• Relational OLAP: it contains both numeric and textual data, and it serves a much wider purpose than the multidimensional database.

Query Processing
• Query processing is the procedure of transforming a high-level query (such as SQL) into a correct and efficient execution plan, expressed in a low-level language, that performs the required retrieval and manipulation in the database.
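To make the idea of an execution plan concrete, here is a small sketch that asks a database engine for the plan it would use for a high-level query. It assumes SQLite through Python's sqlite3 module, which is not the system discussed in the source; the `sales` table, the index name, and the helper functions are hypothetical. The wording of the plan text varies between SQLite versions, so the sketch only collects it rather than relying on a particular format.

```python
import sqlite3


def make_sales_mart():
    """Build a tiny, hypothetical Sales data mart table with one index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
    return conn


def explain(conn, sql, params=()):
    """Return the engine's plan for `sql` as a list of description strings."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each plan row.
    return [row[-1] for row in rows]


if __name__ == "__main__":
    conn = make_sales_mart()
    query = "SELECT amount FROM sales WHERE region = ?"
    for step in explain(conn, query, ("South",)):
        print(step)
```

The same SQL text can map to different plans depending on which indexes exist, which is why the choice of plan is left to the query processor rather than to the user.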
Steps in Query Processing
[Figure: Steps in query processing. A high-level query (SQL) passes through the syntax analyzer (scanning and parsing), the query decomposer (producing an algebraic expression), the query optimizer (producing an execution plan), the query code generator or query interpreter (producing code to execute the query), and the runtime database processor.]

1. Syntax analysis
• An SQL query is analyzed, and the server produces either a parse tree for the syntax or a syntax error.
• When a statement is parsed, the information necessary for its execution is loaded into the statement cache.

2. Query decomposition
• This phase of query processing aims to transform a high-level query into a relational algebra query.
• It also checks whether the query is syntactically and semantically correct.
• The query decomposer works in five stages:
  • Query analysis
  • Query normalization
  • Semantic analysis
  • Query simplifier
  • Query restructuring

Steps in Query Decomposition
[Figure: Steps in query decomposition. The SQL query passes through query analysis, query normalization, semantic analysis, the query simplifier, and query restructuring to yield an algebraic expression. Supporting inputs shown: equivalence rules, data dictionary, idempotency rules, transformation rules.]

(i) Query analysis
• During the query analysis phase, the query is lexically and syntactically analyzed in order to find any syntax errors.
• A syntactically legal query is then validated to ensure that all the database objects (relations and attributes) referred to by the query are defined in the database.

(ii) Query normalization
• The primary goal of the normalization phase is to avoid redundancy (duplication).
• Normalization converts the query into a normalized form that can be more easily manipulated.

(iii) Semantic analysis
• The main objective is to reduce the number of predicates that must be evaluated by refuting incorrect or contradictory queries.
• The semantic analyzer rejects normalized queries that are incorrectly formulated or contradictory.
(iv) Query simplifier
• The objectives of a query simplifier are to detect redundant qualifications, eliminate common sub-expressions, and transform sub-graphs (queries) into semantically equivalent forms that are computed more easily and efficiently.

(v) Query restructuring
• The query can be restructured to give a more efficient implementation.
• Transformation rules are used to convert one relational algebra expression into an equivalent form that is more efficient.

3. Query optimizer
• Performs optimization by substituting equivalent expressions for those in the query.

4. Query code generator
• Generates the code for the query.

5. Runtime database processor
• Estimates the cost of each candidate plan, selects the optimal plan, and executes it.
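The idempotency rules used by the query simplifier can be illustrated with a small sketch. It assumes a hypothetical representation in which a predicate is either a string or a tuple such as `("and", p, q)`, `("or", p, q)`, or `("not", p)`; this encoding and the `simplify` function are illustrative only and do not reflect how any particular DBMS implements simplification.

```python
def simplify(expr):
    """Simplify a predicate tree with idempotency and contradiction rules.

    Leaves are strings; inner nodes are ("and", p, q), ("or", p, q), ("not", p).
    """
    if isinstance(expr, str):
        return expr
    op = expr[0]
    if op == "not":
        return ("not", simplify(expr[1]))
    left, right = simplify(expr[1]), simplify(expr[2])
    # Idempotency: p AND p = p, p OR p = p.
    if left == right:
        return left
    # Contradiction and tautology: p AND NOT p = FALSE, p OR NOT p = TRUE.
    if left == ("not", right) or right == ("not", left):
        return "FALSE" if op == "and" else "TRUE"
    # Constants: FALSE absorbs AND, TRUE absorbs OR; the other is the identity.
    absorbing, identity = ("FALSE", "TRUE") if op == "and" else ("TRUE", "FALSE")
    if absorbing in (left, right):
        return absorbing
    if left == identity:
        return right
    if right == identity:
        return left
    return (op, left, right)


if __name__ == "__main__":
    redundant = ("and", "dept = 'Sales'", "dept = 'Sales'")
    print(simplify(redundant))
```

A predicate that simplifies to the constant FALSE is contradictory, which is the kind of query the semantic analysis stage is described as rejecting.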
