Understanding Database Models

Before covering relational databases in more detail, we'll briefly cover hierarchical and network database management systems (DBMSs). Understanding their limitations will help you understand the relational approach and how it attempts to address those limitations.

Hierarchical Databases

One of the hierarchical DBMSs still in use today is an IBM product called IMS, which stands for Information Management System. Its paradigm is to use a tree structure and a series of links to navigate from record type (a table) to record type. Records (single rows) include one or more fields (columns). Each tree must have a single root, or parent record type. The relationship between the record types is the same as the directory structure on your computer: parent to child, continuing on to lower and lower levels. The relationship is maintained as a DBMS pointer structure from one record to another. That pointer is valid for only one level of connectivity, and maintaining the order of the rows is required for the pointer to work. As an example of this type of system, consider Figure 2-1, which illustrates a class-scheduling system at a college.

Figure 2-1. Hierarchical class scheduling: record types connected in a tree structure

This type of data management system has several challenges. One is the direct result of the restriction of being able to link only one parent to any child record type (such as Class in Figure 2-1). Look at the necessary duplication of Class because of its need to relate to both Campus and Course. A hierarchical DBMS (such as IMS) assumes the world can be viewed as a series of unrelated, strictly applied hierarchies, with one parent having many children within a tree structure. When this doesn't work (and there are many cases when you don't have exclusive classification), you end up with duplication of data across different tree structures.
Workarounds for this usually require creating duplicate records or tables to satisfy each different use of the same data, which can lead to data synchronization problems, since the same records appear in numerous places within the database.

Another challenge with this type of design is the unnecessary duplication of records within a single record type and hierarchy. Look at the relationship of Student to Course. A student will often enroll in many courses, and the details of that student will be stored as a child of every course in which the student is enrolled.

Information support for control systems Lesson 2 / Student Page 1/5

One of the major limitations of this technology is that, because of the technical difficulty inherent in setting up and navigating complex hierarchies within the physical database, the physical design is often implemented as a series of single-level parent-child relationships (root-child, root-child2, and so on), regardless of whether this actually represents the business rules of the data. The technological limitations with which the database administrators (DBAs) and programmers must work bias the implementation to the point where the hierarchies aren't implemented in the way the designers intended.

Unfortunately, because of the linkage structures connecting the tables, you can't skip a level of relationship to find data. So, to get from Course to Teacher, you must perform additional input/output (I/O) operations and walk the tree down through Class. You can imagine that for a large database with complex hierarchies this would be a very long path. Furthermore, because these relationship paths aren't easy to change and new ones aren't easy to add, these databases become rather inflexible once the database is created.

However, in spite of these types of challenges, hierarchical DBMSs are still regularly used.
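
The two problems described above (duplicated records and level-by-level navigation) can be illustrated with a small sketch. This is not how IMS is implemented; it is a minimal in-memory model that assumes the tree shape the text describes for Figure 2-1 (Class stored under both Course and Campus, Teacher stored under Class). The names Record and path_to are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A record that is reachable only through its single parent."""
    record_type: str
    name: str
    children: list["Record"] = field(default_factory=list)

    def add_child(self, record_type: str, name: str) -> "Record":
        child = Record(record_type, name)
        self.children.append(child)
        return child


def path_to(root: Record, record_type: str) -> list["Record"]:
    """Walk down from root one level at a time; no level can be skipped."""
    if root.record_type == record_type:
        return [root]
    for child in root.children:
        rest = path_to(child, record_type)
        if rest:
            return [root] + rest
    return []


# Tree 1: Course -> Class -> Teacher
course = Record("Course", "Creative Writing 101")
course_class = course.add_child("Class", "Creative Writing 101, 12/01/2004")
course_class.add_child("Teacher", "Isaac Asimov")

# Tree 2: Campus -> Class. The same class has to be stored a second time,
# because a record can hang under only one parent.
campus = Record("Campus", "Anaheim Satellite")
campus_class = campus.add_child("Class", "Creative Writing 101, 12/01/2004")

hops = [record.record_type for record in path_to(course, "Teacher")]
print(" -> ".join(hops))             # Course -> Class -> Teacher
print(course_class is campus_class)  # False: two copies to keep in sync
```

Reaching the teacher takes one hop per level, and the two Class copies must be kept in sync by hand, which is the rigidity the text attributes to hierarchical DBMSs.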
These systems were popular in the 1960s and 1970s, can still deliver high performance, and are prevalent in legacy systems still in use today.

Network Databases

One of the big network databases still fairly common in data management is IDMS, which stands for Integrated Database Management System. Network databases were a logical extension of hierarchical databases and resolved the problem hierarchical databases had with child records having multiple parents. The Conference on Data Systems Languages (CODASYL) formally introduced the network model in 1971. Under this system, data management is based on mathematical "set" theory. A data set consists of an owner record type, a set name, and a member record type. A member record type basically corresponds to a data element in a row. The member record types can belong to various owner record types, thereby allowing for more than one parent relationship for a record. The owner record type can also be a member or an owner in another set.

To address data sizing issues, data elements making up a record of a network design can be "redefined" to mean different things. This means that if a record has been defined as containing four data elements, such as A, B, C, and D, under some circumstances the fields within the record may be redefined to actually store the elements A, E, F, and G instead. This is a flexible construct, which reduces the number of database structures and the amount of disk space required, but it can make locating and identifying specific data elements challenging. Also, one data element can be defined to "occur" more than one time on a record, allowing a specific (and usually restrictive) number of multiples of a value or group of values to exist within a single record. The record type can be a complex structure even when looked at as a single construct and may hide some of the complexity of the data through the flexibility of its definition.
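
The owner/member idea can be sketched in a few lines. This is a simplified in-memory illustration, not IDMS or CODASYL syntax; the class NetworkDB and the set names COURSE-CLASS and CAMPUS-CLASS are hypothetical. It shows one Class record belonging to two different owners through two named sets, so no copy of the record is needed.

```python
class NetworkDB:
    """Toy model of named sets linking an owner record to member records."""

    def __init__(self):
        # (set name, owner) -> members, kept in insertion order
        self._members = {}
        # (set name, member) -> owner, standing in for the pointer to the owner
        self._owner = {}

    def connect(self, set_name, owner, member):
        self._members.setdefault((set_name, owner), []).append(member)
        self._owner[(set_name, member)] = owner

    def members(self, set_name, owner):
        return list(self._members.get((set_name, owner), []))

    def owner(self, set_name, member):
        return self._owner[(set_name, member)]


db = NetworkDB()
class_record = "Creative Writing 101, 12/01/2004"

# The same Class record is a member of two sets with different owners.
db.connect("COURSE-CLASS", "Creative Writing 101", class_record)
db.connect("CAMPUS-CLASS", "Anaheim Satellite", class_record)

print(db.owner("COURSE-CLASS", class_record))  # Creative Writing 101
print(db.owner("CAMPUS-CLASS", class_record))  # Anaheim Satellite
```

Each lookup still follows a named link from one record to the next, so reaching a record several sets away means stepping through every record in between.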
From a high level, the database design is a simple network, with link and intersection record types (called junction records by IDMS). The flexibility of this design provides a network of relationships represented by several parent-to-child pairings. With each parent-child pair, one record type is recognized as the owner record type, and one or more record types are recognized as member record types.

Revisiting the college database example from Figure 2-1, you could alter the previous hierarchical design into a network database design that allows Campus to be directly linked to Class through a second owner/member link, as shown in Figure 2-2.

Figure 2-2. Network class scheduling: record types connected in a modified tree structure and allowing more than one parent

Network databases still have the limitations of pointer-type connections between the tables. You still need to step through each node of the network to connect data records. In this example, you still need to navigate from Campus to Teacher by way of Class. And the rows still have to be maintained in the order the pointers expect so that the pointers function properly.

Relational Databases

Relational databases such as Oracle, Microsoft SQL Server, and IBM DB2 are different from both network and hierarchical database management systems in several ways. In a relational database, data is organized into structures called tables, and the relations between data elements are organized into structures called constraints. A table is a collection of records, and each record in a table contains the same data elements, or fields. Relational databases don't generally support multiple definitions of the fields or multiple occurrences within a single record, which is in contrast to network and hierarchical databases.
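
As a rough illustration of tables and constraints, the sketch below uses Python's built-in sqlite3 module rather than any of the products named above. The table and column names are assumptions, the sample values are borrowed from the data tables later in this lesson, and "Unknown Campus" is an invented value used only to trigger the constraint. The key constraints that appear here are explained later in the lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE campus (
    campus_name  TEXT PRIMARY KEY,
    phone_number TEXT
);
CREATE TABLE class (
    course_name TEXT NOT NULL,
    campus_name TEXT NOT NULL REFERENCES campus (campus_name),
    begin_date  TEXT NOT NULL,
    PRIMARY KEY (course_name, campus_name, begin_date)
);
""")

conn.execute("INSERT INTO campus VALUES (?, ?)",
             ("Anaheim Satellite", "714-663-7853"))
conn.execute("INSERT INTO class VALUES (?, ?, ?)",
             ("Creative Writing 101", "Anaheim Satellite", "12/01/2004"))

# The constraint, not a pointer, ties Class to Campus: a class that names
# a campus with no matching row is rejected.
try:
    conn.execute("INSERT INTO class VALUES (?, ?, ?)",
                 ("Creative Writing 101", "Unknown Campus", "12/01/2004"))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Rows are connected by matching data values.
phone = conn.execute(
    """SELECT campus.phone_number
       FROM class JOIN campus ON campus.campus_name = class.campus_name
       WHERE class.course_name = ?""",
    ("Creative Writing 101",),
).fetchone()[0]

print(orphan_rejected)  # True
print(phone)            # 714-663-7853
```

Neither table stores the physical position of a row in the other; the relationship exists only as a declared rule over shared values.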
The high-level properties of RDBMS tables are as follows:

- The value in a data element is single and atomic (no data replicates within a field, and data contained in a field doesn't require any interpretation).
- Each row is unique (no wholly duplicated records should exist within a set).
- Column values are of the same kind (a field's data doesn't have multiple definitions or permit "redefines").
- The ordinal sequence of columns in a table isn't significant.
- The ordinal sequence of rows in a table isn't significant, eliminating the problem of maintaining record pointers.
- Each column has a unique name within its owning table.

By connecting records through matching data contained in database fields rather than through pointer constructs (unlike network databases), and by allowing child records to have multiple parents (unlike hierarchical databases), this type of design builds upon the strengths of prior database systems. You'll see in more detail how this is achieved in a moment, but first let's consider how the example would be represented in a relational database (see Figure 2-3).

Figure 2-3. A relational class scheduling model, if all the business rules are the same

It looks the same as the network database model, doesn't it? Network database management systems allow almost the same flexibility as a relational system in terms of allowable relationships. The power of the relational paradigm lies in a different area. If you follow the rules of normalization, you'd probably end up with a model that looks more like Figure 2-4.

Figure 2-4. A relational class scheduling model in Third Normal Form

So, why is this model so different from Figure 2-3? This model ensures that only one set named Course is necessary, since it can be related to other courses in the event you need to set up a Prerequisite.
In a similar fashion, teachers and students are recognized as being part of the single set Person, allowing a person, such as Isaac Asimov, to be both a Student and a Teacher of a Class. Relational design emphasizes storing information in one and only one place. Here, a single teacher record is reused for as many classes as necessary, rather than duplicating the teacher information for each class. We also created two new sets, Course Subject and School Calendar, to group courses. These two small sets manage certain business rules in the database itself, replacing the code domain structure for subjects used in the network design and the data format restriction that limited Begin Date values in Class to dates.

The other big gain is that accessing records can also be simpler in a relational database. Instead of using record pointers to navigate between data sets, you can reuse a portion of one record to link it to another. That portion is usually called the primary key on its owning table, because of its identifying nature. It becomes a foreign key on the child table. The process of propagating the key to the child table is called migrating. You can use a foreign key to navigate back into the design, allowing you to skip over tables. You can see some examples of possible data values in Table 2-1, Table 2-2, Table 2-3, and Table 2-4. (PK means it's the primary key, and FK means it's the foreign key.)

Table 2-1. Teacher

Teacher Name (PK)    Rating
Isaac Asimov         Most Excellent

Table 2-2. Campus

Campus Name (PK)     Phone Number
Anaheim Satellite    714-663-7853

Table 2-3. Class

Course Name (PK)        Campus Name (PK and FK)    Begin Date (PK and FK)
Creative Writing 101    Anaheim Satellite          12/01/2004

Table 2-4. Class Teacher

Teacher Name (PK and FK)    Course Name (PK and FK)    Campus Name (PK and FK)    Begin Date (PK and FK)
Isaac Asimov                Creative Writing 101       Anaheim Satellite          12/01/2004

If you want to know the phone number of the campus where Isaac Asimov is attending the Creative Writing 101 class, you can use the foreign key of Campus Name to skip over Class and link directly with Campus. We've shortened the access path to the data by avoiding the pointer design used in network database designs.

The design characteristic of using natural data values to relate records between data sets, and the philosophy of building simple, understandable List of Values (LOV) sets for reuse as foreign keys, may be the most important characteristics contributing to the success of relational databases. Their potential for simplicity and understandability can make them nonthreatening and an easy technology to learn.

Notice the evolutionary nature of the database systems. Over time, each database management system built on the strengths of existing systems while attempting to address their restrictions.

Tasks

1. Which of the described database types can be used to store data from gauges?
2. Draw a database schema for a control system that will store collected data. The database should contain gauge descriptions (name, type, position, ranges) and the data collected in different time periods.