An Introduction to Database and Database Designing R C Goyal Principal Scientist IASRI, New Delhi Databases Before the Use of Computers Data kept in books, ledgers, card files, folders, and file cabinets Long response time Labor-intensive Often incomplete or inaccurate Data & Information Data known facts that can be recorded and that has implicit meaning Information processed data Meaningless data becomes information when it is processed and presented to the decision maker in a meaningful way Database A set of data that is required for a specific purpose or is fundamental to a system, project, enterprise, or business. A formally structured collection of data. A database may consist of one or more data banks and be geographically distributed among several repositories. In automated information systems, the database is manipulated using a database management system. File Mgmt vs. Database Mgmt File Management Each data entity is in a separate file system for creating, retrieving and manipulating files. Database Management Same file but data elements are integrated and shared among different files Program controls the structure of a database and access to data Database Management for Strategic Advantage We live in the Information Age Information used to make organizations more productive and competitive Databases used to support business operations Databases used across a range of applictions Personal, department, enterprise, web Technical Advantages of Database Management Reduced data redundancy Improved data integrity More program independence Increased user productivity Increased security Disadvantages of DBMS Cost issues Implementation & maintenance issues Security issues Privacy issues Features of DBMS Database Engine heart of DBMS stores, retrieves and updates data enforces business rules Data dictionary holds definitions of all of the data tables Describes the data type, allows DBMS to keep track of data, helps user find data they need Features of DBMS…Cont. Query processor Enables users to store and retrieve data Use English command that use such words as SELECT, DELETE, MODIFY Report generator Formats and prints reports after user uses query processor Features of DBMS…Cont. Forms generator Application generator Access security Setup access privileges to protect data from unauthorized access and sabotage System recovery Database Development Cycle Database planning System Definition Requirements collection and analysis Database design DBMS selection Application design Prototyping Implementation Data Conversion and loading Testing Operational Maintenance Database Life Cycle Database Planning Systems Definition Requirements Collection and analysis DBMS Selection Database Design Application Design Implementation Data Conversion and loading Prototyping Testing Evaluation & Maintenance Source: http://www.cs/ucf.edu/courses/cgs2545/CH02/index.htm Database Planning Current systems evaluation Development of standards Technological feasibility Operational feasibility Economical feasibility Systems definition Data dictionary Metadata Requirements Collection and Analysis identifying management information requirements, determining information requirements by functional area, and establishing hardware and software requirements Database Design Conceptual design Logical design Physical design Database design methodology A structured approach that uses procedures, techniques, tools, and documentation aids to support and facilitate the process of design. * Conceptual database design * Logical database design * Physical database design Critical Success Factors in Database Design Work interactively with the users as much as possible. Follow a structured methodology throughout the data modelling process. Incorporate structural and integrity considerations into the data models. Combine conceptualisation, normalisation, and transaction validation techniques into the data modelling methodology. Design Considerations - Tips Planning Tables Queries Forms Reports Macros Modules General Tips - Planning Before you start up the computer to build a database, PLAN ON PAPER!!! Gather all of the paper forms you use to collect your data Interview the people who will be using the database, from those who conduct the field work to those who analyze the data Think about what you want to get back out of your database, not just about how to get data in Steal - see if it’s been done before Tips - Tables Normalize, normalize, normalize Use field properties to help maintain data integrity When possible limit the possible entries in a field to a lookup list (domain) Choose your key fields carefully Index fields that you will search often Many-to-many relationships in Access Steal - link to existing tables Normalization The process of breaking down large tables into smaller ones by removing all unnecessary or duplicate fields, eliminating redundant data, and making sure that each table represents only one thing. Why Normalize? If you don’t normalize, you may run into anomalies (unexpected results) Deletion anomaly - deleting a record unexpectedly removes a value we wanted to keep Insertion anomaly - we can’t add a record because we don’t know the value of all of the required fields Change anomaly - one change must be applied to many records in a single table Non-normalized Table Typical flat-file table Student# StName 1022 Rajat 4123 Patel AdvName AdvRoom Class1 Class2 Class3 Goyal 412 101-07 143-01 159-02 Gupta 216 202-01 211-02 214-01 Contains all data in one record. Contains repeating fields. Contains information on more than one thing. First Normal Form Eliminate repeating fields within tables. Create a separate table for each set of related data. Identify each set of related data with a primary key. First Normal Form Student# 1022 1022 1022 4123 4123 StName Rajat Rajat Rajat Patel Patel AdvName AdvRoom Class# Goyal 412 101-07 Goyal 412 143-01 Goyal 412 159-02 Gupta 216 210-01 Gupta 216 211-02 No Repeating Fields: Class# is now a single field instead of repeating for each class a student takes. Now multiple records for each student because >1 record is needed to accommodate all classes for each student Non-normalized Table Student# StName 1022 Rajat 4123 Patel AdvName AdvRoom Class1 Class2 Class3 Goyal 412 101-07 143-01 159-02 Gupta 216 202-01 211-02 214-01 Second Normal Form Create separate tables for values that apply to multiple records. Relate these tables with a foreign key. Second Normal Form Students: Student# StName 1022 Rajat 4123 Patel Student# Classes: 1022 1022 1022 4123 4123 4123 AdvName AdvRoom Goyal 412 Gupta 216 Class# 101-07 143-01 159-02 210-01 211-02 214-01 Eliminate Redundant Data: no longer repeating student name, advisor, and adv-room for each class Third Normal Form Eliminate fields that do not depend on the key. Student# StName Students: 1022 Rajat 4123 Patel AdvName AdvRoom Goyal 412 Gupta 216 AdvRoom is independent of the key field (Student#). It does not depend on the student, it depends on the advisor. Third Normal Form Students: Student# StName 1022 Rajat 4123 Patel AdvID 1 2 Advisors: AdvID AdvName AdvRoom 1 Goyal 412 2 Gupta 216 Eliminate Data Not Dependent on Key: remove AdvName from the Students table and create a separate table describing the advisors with its own primary key (AdvID). Relate Advisors to Students using AdvID. Deletion Anomalies Student# StName 1022 Rajat 1022 Rajat 1022 Rajat 4123 Patel 4123 Patel AdvName AdvRoom Class# Goyal 412 101-07 Goyal 412 143-01 Goyal 412 159-02 Gupta 216 210-01 Gupta 216 211-02 If we delete student Rajat, we lose the only records for advisor Goyal Insertion Anomalies Student# StName 1022 Rajat 1022 Rajat 1022 Rajat 4123 Patel 4123 Patel AdvName AdvRoom Class# Goyal 412 101-07 Goyal 412 143-01 Goyal 412 159-02 Gupta 216 210-01 Gupta 216 211-02 If we want to add a new student we have to know what class they are in (it’s part of the key) Change Anomalies Student# StName 1022 Rajat 1022 Rajat 1022 Rajat 4123 Patel 4123 Patel AdvName AdvRoom Class# Goyal 412 101-07 Goyal 412 143-01 Goyal 412 159-02 Gupta 216 210-01 Gupta 216 211-02 If student Rajat switches to advisor Gupta, we have to change three records instead of just one Field Properties Data Type - text, integer, double, boolean, etc. Field Size - for text fields Input Mask - a pattern for entering data Default Values - auto-entered for new records Validation Rule - limits values entered Required? - force entry of data Relationships One to One One to Many Many to One Many to Many Many-to-Many Relationship Exist between two tables when: for one record in the first table, there can be many corresponding records in the second table and… for one record in the second table, there can be many corresponding records in the first table Many-to-Many Relationship One student can take many classes, and one class can be taken by many students. Students StudentID Name Address City State Zip WorkPhone HomePhone Classes ClassNumber StudentID Subject InstructorID Days Time Comments Tips - Queries Take advantage of action queries to handle batch record operations Use queries to present calculated values rather than storing the calculated values in your tables Remember that null never equals another null Joining two tables on a field that may contain a null value may not give you the results you expect Searching for duplicate values will not return two records that have a null Realize that you can link a table to itself DBMS Selection * Costs * Features and Tools * Underlying model * Portability * DBMS hardware requirements * Organisational requirements Implementation The physical realisation of the database and application designs the detailed model is converted to the appropriate implementation model, the data dictionary is built, the database is populated, application programs are developed and users are trained Data Conversion and Loading & Testing Transferring any existing or new data into the new database and converting any existing applications to run on the new database Finding errors Database Evaluation Interviewing and polling users to determine whether any data needs are unmet. Operational maintenance preventive maintenance (backup) corrective maintenance (recovery)1 adaptive maintenance regular monitoring & periodical check up Recent Developments Affecting Database Design and Use Data Mining (On-Line Analytical Processing) Drill down from summary data to detailed data Data Warehouses/Data Marts Integrates many large databases into one repository Linking Web Site Applications to Organizational Databases Users have Web view to organizational database Improves customer contact and service Adds security as a concern Data Warehouses Data warehouses are used to organize information for quick and effective queries Data Warehouses and Database In the data warehouse, data are organized around major subjects Data in the warehouse are stored as summarized rather than detailed raw data Data in the data warehouse cover a much longer time frame than in a traditional transaction-oriented database Data warehouses are organized for fast queries Data warehouses are usually optimized for answering complex queries, known as OLAP Data Warehouses and Database Data warehouses allow for easy access via data-mining software called software Data warehouses include multiple databases that have been processed so that data are uniformly defined, containing what is referred to as “clean” data Data warehouses usually contain data from outside sources Data Mining Statistical analysis Decision trees Neural networks Fuzzy logic Clustering Data Mining Patterns Data mining patterns that decision makers try to identify include Associations, patterns that occur together Sequences, patterns of actions that take place over a period of time Clustering, patterns that develop among groups of people Trends, the patterns that are noticed over a period of time Web Based Databases and XML Web-based databases are used for sharing data Extensible markup language (XML) is used to define data used primarily for business data exchange over the Web An XML document contains only data and the nature of the data Continue………………………………..