Unit 2: Database

Concept of database

People need data, so we create all kinds of lists to store and organize it. A grocery list, a phone book, a library’s card catalog, and an instructor’s list of students are all organized lists of data. Likewise, a computer can be used to store and manage lists of data, and this is the reason for the computerized database. In fact, many early attempts to build and program computers grew out of a need to manage large lists of data. That is why you have probably heard the phrase "data processing": computers manipulate data, and that data is usually stored in a database.

Today’s computers are much more sophisticated than early computers, but they still need an organized data source. This need applies to just about every type of computer program. For example, a word processor maintains several databases, such as a dictionary of words for the spell checker and thesaurus, a list of available fonts, and other types of data.

Data, Information, Database and Database Management Software (DBMS)

Data
Data is a collection of facts in raw form that becomes information after proper organization or processing. Data can be anything: the bio-data of applicants when the computer is used for recruiting personnel, or the marks obtained by students in various subjects when the computer is used to prepare results.

Information
When raw data is processed, we obtain information. The activity of processing data using a computer is called data processing. Information is the processed form of data; it carries complete meaning and is used in decision making.

Database
A database is a collection of data stored in a standardized format and designed to be shared by multiple users. Databases are designed to manage large bodies of information. The management of data involves both the definition of structures for storing information and the provision of mechanisms for manipulating data. Examples of databases are a telephone diary and a library’s card catalog.
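The difference between data and information described above can be illustrated with a minimal Python sketch. The marks are borrowed from the normalization example later in these notes; the function name `process`, the pass mark of 40 and the "each subject is out of 100" rule are assumptions made only for this illustration.

```python
# Raw data: marks obtained by two students in three subjects.
raw_marks = {
    "Sangita": [78, 90, 78],
    "Laxmi": [90, 89, 67],
}

PASS_MARK = 40  # assumed pass mark, not taken from the notes


def process(marks_by_student):
    """Turn raw marks (data) into a result sheet (information)."""
    report = {}
    for name, marks in marks_by_student.items():
        total = sum(marks)
        report[name] = {
            "total": total,
            # assumes every subject is marked out of 100
            "percentage": round(total / len(marks), 2),
            "result": "pass" if min(marks) >= PASS_MARK else "fail",
        }
    return report


information = process(raw_marks)
for name, row in information.items():
    print(name, row)
```

The raw lists of numbers say little on their own; the totals, percentages and results are what would be used for a decision such as promoting a student.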
DBMS
A database management system (DBMS) is software that defines a database, stores the data, supports a query language, produces reports and creates data entry screens. The set of programs provided to help users organize, create, delete and manipulate their data in a database is also known as a DBMS. Examples of database management software are MS Access, MySQL and SQL Server.

Objectives of DBMS (Database Management System)
In the database-oriented approach to organizing data, data from multiple related fields is integrated in the form of a database, which has the following objectives:
a. Provides greater query flexibility.
b. Provides for mass storage of relevant data.
c. Reduces data redundancy.
d. Solves data integrity (inconsistency) problems.
e. Provides data security features at database level, record level, etc.
f. Allows multiple users to be active at a time for accessing data.
g. Makes data independent of the application programs.
h. Provides prompt response to user requests for data.
i. Allows updating a large volume of data at a time.

Database model
There are different types of database management systems, each characterized by the way in which its data is defined and structured. The database model defines the manner in which the various files of a database are linked together. Three database models are commonly in use:
a) Relational database model
b) Hierarchical database model
c) Network database model

Relational database model
A database model in which the data elements are organized in the form of multiple tables, and the data in one table is related to the data in another table through a common field (column).

Table 1 (student info): StudentName, RollNo, DOB, Gender, Address
Table 2 (fee): RollNo, PaymentDate, FeeAmount, Remark
Fig 2.1: Relational database model

In fig 2.1 the student info table is related to the fee table through RollNo.

Advantages
a. It has less redundancy because of the primary key.
b. Normalization of the database is possible.
c. Database processing is rapid.
d. Searching for data is easy.
e. Referential integrity can be applied.

Disadvantages
a. It is more complex to maintain than the other models.
b. Many rules have to be applied.
c. It is not user friendly.

Hierarchical database model
In a hierarchical database, the data elements are linked in the form of an inverted tree structure, with the root at the top and the branches formed below. Below the single root data element are subordinate elements, each of which, in turn, has one or more other elements. There is a parent-child relationship among the data elements of a hierarchical database. A parent data element has one or more subordinate data elements. The data elements below a parent are its children.

ORGANIZATION
    Personal dept
        Staff
        Manager
    Technical dept
        Staff
        Manager
    Finance dept
        Staff
        Manager
Fig 2.2: Hierarchical database model

Advantages
a. Searching is fast and easy if the parent is known.
b. It is the easiest of the database models.
c. It is very efficient in handling ‘one-to-many’ relationships.

Disadvantages
a. It is an old database model.
b. Modifying or adding a child is difficult.
c. It can’t handle ‘many-to-many’ relationships.

Network database model
A network database structure is an extension of the hierarchical database structure. In this model the data elements of a database are organized in the form of parent-child relationships, and all types of relationships among the data elements must be determined when the database is first designed. In the network database model a child data element can have more than one parent, or no parent at all. In this model the database management system permits the extraction of the needed information by beginning from any data element in the database structure, instead of starting from the root data element.
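The contrast between the hierarchical and network models described above can be sketched with plain Python dictionaries. This is only an illustration, not how a real DBMS stores its data: the hierarchy follows Fig 2.2, while the path-style names (such as "Personal dept/Staff"), the helper functions and the links in the network example are hypothetical.

```python
# Hierarchical model: each element has exactly one parent (None for the root).
parent_of = {
    "ORGANIZATION": None,
    "Personal dept": "ORGANIZATION",
    "Technical dept": "ORGANIZATION",
    "Finance dept": "ORGANIZATION",
    "Personal dept/Staff": "Personal dept",
    "Personal dept/Manager": "Personal dept",
}


def path_to_root(element):
    """Follow the single parent link upward until the root is reached."""
    path = [element]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path


# Network model: an element may have several parents, or none at all.
# These links are made up for the example.
parents_of = {
    "college": [],
    "English": ["college"],
    "Computer": ["college"],
    "Sabita": ["English", "Computer"],  # two parents: a many-to-many link
    "ram": [],                          # no parent at all
}


def children_of(element):
    """Start from any element, not only the root, and list its children."""
    return sorted(child for child, parents in parents_of.items() if element in parents)


print(path_to_root("Personal dept/Staff"))
print(children_of("English"))
print(parents_of["Sabita"])
```

In the first structure a second parent simply cannot be recorded, which is why the hierarchical model cannot express many-to-many relationships; the second structure allows it at the cost of more links to maintain.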
[Figure: nodes college, English, Computer, Sabita, Sunita, Maths, Rabi, Account, Alin]
Fig 2.3: Network database model

In fig 2.3 a data element can have more than one parent, while the data element ram has no parent.

Advantages
a. It has more flexibility.
b. It reduces data redundancy.
c. Searching is fast because of multi-directional pointers.
d. It accepts many-to-many relationships.

Disadvantages
a. It is difficult to sort data.
b. It is a very complex type of database model.
c. It needs long programs to handle relationships.

Relational Data Model
A database is a collection of related items of fact arranged in a specific structure. To arrange data in the proper structure we define the following terms.

1. Field: The smallest unit of data in a database, used to group each piece or item of data into a specific category. Fields are arranged in columns. All values in a field have the same data type.
2. Record: A database row composed of related fields. A collection of records makes up the database. A record gives the complete meaning.
3. Table: A complete collection of records is called a table. A table contains rows and columns. Each column of a table represents a field and each row represents a record.
4. Entity: An entity is a class of people, objects, events or concepts in the real world that is distinct from other objects. An entity is something about which the business needs to store data. An entity is also called an entity class or entity type.
   Person: doctor, teacher, student etc.
   Object: tool, machine, building, pen etc.
   Place: zone, region, country etc.
5. Attribute: A descriptive property possessed by each member of an entity. Attributes are also called elements, properties or fields.

Stid  Name          Address  Class  Sec
S001  Sabita Regmi  Ngh      12     A
S002  Ram           Ktm      11     B
S003  Anjana        Pok      12     A
S004  Raju          Ngh      12     A

In the above table stid, name, address, class and sec are attributes.

[Figure: entity Student with attributes Stid, Name, Address, Sex, Class]
Fig 2.5: E-R diagram

The figure is an E-R (entity relationship) diagram, which shows the relation between different entities.
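The terms field, record, table and attribute can be seen in a small runnable sketch. It uses Python's standard `sqlite3` module with an in-memory database; the column types and the use of SQLite are assumptions for this illustration, while the rows come from the student table above.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# The table: one column per field (attribute) of the student entity.
conn.execute(
    "CREATE TABLE student (stid TEXT, name TEXT, address TEXT, class INTEGER, sec TEXT)"
)

# Each tuple is one record (row) made of related fields.
rows = [
    ("S001", "Sabita Regmi", "Ngh", 12, "A"),
    ("S002", "Ram", "Ktm", 11, "B"),
    ("S003", "Anjana", "Pok", 12, "A"),
    ("S004", "Raju", "Ngh", 12, "A"),
]
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?)", rows)

cursor = conn.execute("SELECT * FROM student ORDER BY stid")
fields = [column[0] for column in cursor.description]  # field (column) names
records = cursor.fetchall()                            # list of records

# Pair field names with one record's values to read it as a whole.
first_record = dict(zip(fields, records[0]))
print(fields)
print(first_record)
```

A single field value such as "Ngh" means little alone; the record, read across all of its fields, gives the complete meaning.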
The entity relationship shows the actual relation of an instance to its different attributes. In an E-R diagram we also assign keys to establish the relation between the tables of different entities. The keys are as follows.

Primary key: A column or set of columns that identifies a particular row in a table. The primary key makes each row unique in that table. It also helps to reduce data redundancy and is used to set the relationship between tables. In an E-R diagram the primary key is shown underlined. In fig 2.5 stid is the primary key.

Foreign key: A foreign key is a field in a child table that refers to the primary key of a master table. It is required for setting the relationship between tables.

Unique key: A unique key is a key in which data can’t be repeated. Values must be unique, and a null value is allowed once.

Candidate key: A table may have a number of unique columns. Each of them is a candidate key; one is chosen as the primary key and the others are then called alternate keys.

Concept of Normalization
The essence of normalization is to split your data into several tables that are connected to each other based on the data within them. Relational databases operate on tables of data. These tables must be carefully defined to obtain the advantages of the database approach. The process of determining the proper tables for the database is called normalization. Normalization is the process of organizing data in a database to reduce redundancy. It also includes creating tables and establishing the relationships between those tables, using rules designed to protect the data and to make the database flexible.

Unnormalized table
    Student info: Studentid, Firstname, Lastname, Class, Sec, Roll, Subject, Marks
Normalized tables
    Personal info: StudentId, Firstname, Lastname, Class, Sec, Roll
    Subject table: SubjectId, Subject
    Marks table: SubjectId, Studentid, Marks
Fig 2.6: Normalization of data

In fig 2.6 above, the first table is not in normalized form.
Whenever we want to enter marks in it, we have to enter all the other information again, so there is a chance of data redundancy and the table is ineffective. The table is therefore split into three other tables to make the data independent, so that the tables depend on one another only through keys. This normalization helps us to make sure of the following:
a) Dependence between the data is identified.
b) Redundancy in the database is minimized.
c) The data model is made more flexible and easier to maintain.

Types of normalization

a) 1NF (first normal form)
When a table has no repeating groups of data, it is said to be in first normal form. That means that each cell in a table (one row and one column) can hold only one value. This value should be atomic, in the sense that it can’t be decomposed into smaller pieces.

Name     Roll  Class  Sec  Sub1     Marks1  Sub2   Marks2  Sub3      Marks3
Sangita  2     12     A    English  78      Maths  90      Computer  78
Laxmi    1     11     B    English  90      Maths  89      Computer  67
Sabita   1     12     A    English  67      Maths  98      Computer  90

The above table is not in first normal form because the subject and marks attributes repeat. To bring it into first normal form we break the table up in the following way.

Name     Roll  Class  Sec  Subject   Marks
Sangita  2     12     A    Maths     90
Laxmi    1     11     B    English   90
Sabita   1     12     A    Computer  90
Sangita  2     12     A    English   78
Laxmi    1     11     B    Computer  67
Sabita   1     12     A    Maths     98
Sangita  2     12     A    Computer  78
Laxmi    1     11     B    Maths     89
Sabita   1     12     A    English   67

b) 2NF (second normal form)
A table is in second normal form if it is in 1NF and every non-key column depends on the entire key. To achieve this, split the table: pull out the columns that depend on only part of the key, and remember to include that part of the key in the new table. The new table must have a key or id that is present in both tables. Each attribute in a table must depend on the whole key.
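The first normal form step shown above, which turns the repeating Sub1/Marks1 to Sub3/Marks3 columns into one row per subject, can be sketched in Python as follows. The data is the table from these notes; the function name `to_first_normal_form` and the dictionary representation of rows are assumptions for illustration.

```python
# Unnormalized rows: Sub1/Marks1 ... Sub3/Marks3 form a repeating group.
unnormalized = [
    {"Name": "Sangita", "Roll": 2, "Class": 12, "Sec": "A",
     "Sub1": "English", "Marks1": 78, "Sub2": "Maths", "Marks2": 90,
     "Sub3": "Computer", "Marks3": 78},
    {"Name": "Laxmi", "Roll": 1, "Class": 11, "Sec": "B",
     "Sub1": "English", "Marks1": 90, "Sub2": "Maths", "Marks2": 89,
     "Sub3": "Computer", "Marks3": 67},
    {"Name": "Sabita", "Roll": 1, "Class": 12, "Sec": "A",
     "Sub1": "English", "Marks1": 67, "Sub2": "Maths", "Marks2": 98,
     "Sub3": "Computer", "Marks3": 90},
]


def to_first_normal_form(rows, group_count=3):
    """Emit one row per (student, subject) so every cell holds a single value."""
    result = []
    for row in rows:
        for i in range(1, group_count + 1):
            result.append({
                "Name": row["Name"],
                "Roll": row["Roll"],
                "Class": row["Class"],
                "Sec": row["Sec"],
                "Subject": row[f"Sub{i}"],
                "Marks": row[f"Marks{i}"],
            })
    return result


first_nf = to_first_normal_form(unnormalized)
for row in first_nf:
    print(row)
```

The three wide rows become nine narrow ones. The student details are now repeated in every row, which is exactly the redundancy that the second and third normal forms go on to remove.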
Splitting the 1NF table for second normal form gives the following three tables.

Table 1: student
Name     Roll  Class  Sec
Sangita  2     12     A
Laxmi    1     11     B
Sabita   1     12     A

Table 2: subject
Subject   Class
English   11
English   12
Computer  11
Computer  12
Maths     11
Maths     12

Table 3: marks
Name     Subject   Marks
Sangita  Maths     90
Laxmi    English   90
Sabita   Computer  90
Sangita  English   78
Laxmi    Computer  67
Sabita   Maths     98
Sangita  Computer  78
Laxmi    Maths     89
Sabita   English   67

Here the whole table is split into three tables: student, subject and marks. Interrelated data is placed together in one table. Name depends on roll + class + sec; the subject depends on class, not on roll; and name, subject and marks are interrelated.

c) 3NF (third normal form)
The logic, analysis and design elements for third normal form (3NF) are similar to those used in deriving 2NF. In particular, you still concentrate on the issue of dependence. To be in 3NF a table must be in 2NF, and every non-key column must depend on nothing but the key.

Table 1: class
classid  Classname
1        11
2        12

Table 2: subject
subid  Subject
1      English
2      Computer
3      Maths

Table 3: student
stid  Name     Roll  Classid  Sec
1     Sangita  2     2        A
2     Laxmi    1     1        B
3     Sabita   1     2        A

Table 4: marks
stdid  subid  Marks
1      1      78
2      1      90
3      1      67
1      2      78
2      2      67
3      2      90
1      3      90
2      3      89
3      3      98

In these tables all attributes depend on the key: in the class, subject and student tables every attribute depends on the primary key, while in the marks table the data depends on stdid and subid together. These four tables are therefore the normalized form of the given unnormalized table.

Structured Query Language
Structured Query Language (SQL) is, in computer science, a database sublanguage used in querying, updating and managing relational databases. Derived from an IBM research project that created Structured English Query Language (SEQUEL) in the 1970s, SQL is an accepted standard in database products.
Although it is not a programming language in the same sense as C or Pascal, SQL can either be used in formulating interactive queries or be embedded in an application as instructions for handling data. The SQL standard also contains components for defining, altering, controlling and securing data. SQL is designed for both technical and nontechnical users. Some database systems provide a special window or form for creating queries. Because almost all databases are similar, a common query language was developed, called Structured English Query Language.

The most commonly used command in SQL is the SELECT statement, which is used to retrieve data from a table. The basic structure of an SQL query is as follows.

    SELECT field1, field2, ...
    FROM table_name
    WHERE condition

Query from one table

    SELECT name, class, roll, sec
    FROM student
    WHERE sec = 'A'

This query shows the name, class, roll and sec of those students in the student table whose section is A.

Query from two tables

    SELECT name, subid, marks
    FROM student, marks
    WHERE student.stid = marks.stdid

This query shows each student’s name with the subject id and marks, matching the rows of the student and marks tables of the 3NF example on the student id (stid in student, stdid in marks).

Centralized vs Distributed Database
A centralized database works on a client-server basis. In this system a powerful machine with a multiuser operating system functions as the server. Smaller computers, usually personal computers, operate as the clients. The server holds the software and data that will be shared by the users. Individual client computers hold the data used by the individual working on that machine. The control mechanism and the data are kept in a central location. A DBA is appointed as the controller of the whole database. This approach is suitable mainly for small organizations and small-scale operations.

Distributed database systems consist of multiple independent databases that operate on two or more computers, which are connected and share data over a network. The databases are generally in different physical locations.
Each database is controlled by an independent DBMS, which is responsible for maintaining the integrity of its own database. The advantages of a distributed database are as follows:
a. It provides high performance, since most updates and queries are handled locally.
b. The database is easy to expand.
c. It also supports transaction processing and decision support applications.

Data Security
A database collects a large amount of data in one location and makes it easy for people to retrieve and change data. Databases are therefore a critical resource that must be protected. Yet the same factors that make a database so useful also make it more difficult to secure. Data may be lost through crashes during transaction processing, unauthorized reading of data, destruction of data, etc. To protect the database, we must take security measures at several levels:

Physical: The site or sites containing the computers must be physically secured against armed or surreptitious entry by intruders. Physical security also consists of regular maintenance, insurance, protection from theft, etc.
Human: Database users must be authorized carefully. Poor handling of users’ data privacy is also a factor that may damage the database.
Operating system: The database can also be made secure through operating system policy. Today’s operating systems provide many security options.
Network: Security must also be applied at the network level, since the databases are connected through the network.

Data Integrity
Data integrity is a set of rules that govern a database for the purpose of correctness and consistency. Data integrity guards against accidental damage to the database by ensuring that changes made by authorized users do not result in a loss of data consistency. The three types of data integrity are:
1. Domain Integrity
2. Entity Integrity
3. Referential Integrity

Domain Integrity
Domain integrity is a set of rules applied to cell-level data. It defines the set of values that may be associated with an attribute (property/column). Domain constraints (rules) are the most elementary form of data integrity. They are tested by the system whenever a new data item is entered into the database. E.g.

Student
RollNo  Name  Class  Age
1       Ram   XI     20
2       Syam  XI     19
3       Hari  XI     18
aa      Amit  XI     55

Here ‘aa’ is not a roll no. It violates domain integrity, because the allowable values are numeric only.

Entity Integrity
Entity integrity is a set of rules applied to row-level (record-level) data. It ensures that each row in a table is uniquely identifiable by some key. E.g.

Student
RollNo (Primary Key)  Name  Class  Age
1                     Ram   XI     20
2                     Syam  XI     19
3                     Hari  XI     18
3                     Amit  XI     55

Here the 4th row violates the entity integrity rule, because each key value used to identify a row (record) must be unique in that column.

Referential Integrity
Referential integrity is a set of rules applied at table level. Every value in the referencing column of the reference table must be present in the primary key column of the main table. E.g.

Student (Main Table)
RollNo (Primary Key)  Name  Class  Age
1                     Ram   XI     20
2                     Syam  XI     19
3                     Hari  XI     18
4                     Amit  XI     55

Marks (Reference Table)
RollNo   Subject  Class  Age
1        Math     XI     20
2        Math     XI     19
3        Math     XI     18
(blank)  Math     XI     55

Here every row of the Marks table must carry a Roll No. that is present in the Student table. The fourth Marks row has no Roll No., therefore the Marks table violates the referential integrity rule.

Difference Between DBMS and RDBMS

DBMS
a. DBMS stands for Database Management System.
b. It is a flat-file approach to storing data.
c. A DBMS is any system that manages databases.
d. It supports a small number of users.
e. There is a chance of massive duplication of data records.

RDBMS
a. RDBMS stands for Relational Database Management System.
b. It stores data in multiple tables which are related to each other by some relationship.
c. An RDBMS is a subtype of DBMS that is limited to what are called relational databases.
d. It supports a large number of users.
e. There is less data duplication.
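The three integrity rules from the Data Integrity section can be tried out with a short sketch that again uses Python's standard `sqlite3` module. The choice of SQLite, the table definitions, the helper `try_insert`, and the roll number 9 and marks value 50 are assumptions for illustration; other database systems express the same rules with their own syntax and error messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE student (
    rollno INTEGER PRIMARY KEY,   -- entity integrity: unique per row
    name   TEXT NOT NULL,
    class  TEXT NOT NULL,
    age    INTEGER
);
CREATE TABLE marks (
    rollno  INTEGER NOT NULL REFERENCES student(rollno),  -- referential integrity
    subject TEXT NOT NULL,
    marks   INTEGER
);
""")


def try_insert(sql, params):
    """Return 'ok' if the row is accepted, otherwise the database's error text."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "ok"
    except sqlite3.Error as exc:
        return f"rejected: {exc}"


new_student = "INSERT INTO student VALUES (?, ?, ?, ?)"
new_marks = "INSERT INTO marks VALUES (?, ?, ?)"

results = {
    "valid 1": try_insert(new_student, (1, "Ram", "XI", 20)),
    "valid 2": try_insert(new_student, (2, "Syam", "XI", 19)),
    "valid 3": try_insert(new_student, (3, "Hari", "XI", 18)),
    # Domain integrity: 'aa' is not a numeric roll number.
    "domain": try_insert(new_student, ("aa", "Amit", "XI", 55)),
    # Entity integrity: roll number 3 is already in use.
    "entity": try_insert(new_student, (3, "Amit", "XI", 55)),
    # Referential integrity: no student has roll number 9.
    "referential": try_insert(new_marks, (9, "Math", 50)),
    "valid marks": try_insert(new_marks, (1, "Math", 50)),
}

for rule, outcome in results.items():
    print(f"{rule}: {outcome}")
```

Declaring the rules in the table definitions means the database itself refuses the bad rows, so every application that uses the tables gets the same protection without repeating the checks.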