APPENDIX G When Is a DBMS Relational?

Objectives

In this appendix you will learn:

• Criteria for the evaluation of relational database management systems.

As we mentioned in Section 4.1, there are now several hundred relational DBMSs for both mainframe and PC environments. Unfortunately, some do not strictly follow the definition of the relational model. In particular, some traditional vendors of DBMS products based upon network and hierarchical data models have implemented a few relational features to claim they are in some way relational. Concerned that the full power and implications of the relational approach were being distorted, Codd specified 12 rules (13 with Rule 0, the foundational rule) for a relational DBMS (Codd, 1985a,b). These rules form a yardstick against which the “real” relational DBMS products can be identified.

Over the years, Codd’s rules have caused a great deal of controversy. Some argue that these rules are nothing more than an academic exercise. Some claim that their products already satisfy most, if not all, rules. This discussion generated an increasing awareness within the user and vendor communities of the essential properties for a true relational DBMS. To emphasize the implications of the rules, we have reorganized the rules into the following five functional areas:

(1) Foundational rules
(2) Structural rules
(3) Integrity rules
(4) Data manipulation rules
(5) Data independence rules

Foundational rules (Rule 0 and Rule 12)

Rules 0 and 12 provide a litmus test to assess whether a system is a relational DBMS. If these rules are not complied with, the product should not be considered relational.

Rule 0—Foundational rule

For any system that is advertised as or claimed to be a relational database management system, that system must be able to manage databases entirely through its relational capabilities.
This rule means that the DBMS should not have to resort to any nonrelational operations to achieve any of its data management capabilities such as data definition and data manipulation.

Rule 12—Nonsubversion rule

If a relational system has a low-level (single-record-at-a-time) language, that low level cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level relational language (multiple-records-at-a-time).

This rule requires that all database access is controlled by the DBMS so that the integrity of the database cannot be compromised without the knowledge of the user or the DBA. However, this does not prohibit the use of a language with a record-at-a-time interface.

Structural rules (Rule 1 and Rule 6)

The fundamental structural concept of the relational model is the relation. Codd states that an RDBMS must support several structural features, including relations, domains, primary keys, and foreign keys. There should be a primary key for each relation in the database.

Rule 1—Information representation

All information in a relational database is represented explicitly at the logical level and in exactly one way: by values in tables.

This rule requires that all information, even the metadata held in the system catalog, must be stored as relations, and managed by the same operational functions as would be used to maintain data. The reference to “logical level” means that physical constructs, such as indexes, are not represented and need not be explicitly referenced by a user in a retrieval operation, even if they exist.

Rule 6—View updating

All views that are theoretically updatable are also updatable by the system.

This rule deals explicitly with views. In Section 7.4.5, we discussed the conditions for view updatability in SQL. This rule states that if a view is theoretically updatable, then the DBMS should be able to perform the update.
No system truly supports this feature, because conditions have not been found yet to identify all theoretically updatable views.

Integrity rules (Rule 3 and Rule 10)

Codd specifies two data integrity rules. The support of data integrity is an important criterion when assessing the suitability of a product. The more integrity constraints that can be maintained by the DBMS product, rather than in each application program, the better the guarantee of data quality.

Rule 3—Systematic treatment of null values

Nulls (distinct from the empty character string or a string of blank characters and distinct from zero or any other number) are supported for representing missing information and inapplicable information in a systematic way, independent of data type.

Rule 10—Integrity independence

Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage† and storable in the catalog, not in the application programs.

Codd makes a specific point that integrity constraints must be stored in the system catalog, rather than encapsulated in application programs or user interfaces. Storing the constraints in the system catalog has the advantage of centralized control and enforcement.

Data manipulation rules (Rule 2, Rule 4, Rule 5, and Rule 7)

There are 18 manipulation features that an ideal relational DBMS should support. These features define the completeness of the query language (where, in this sense, “query” includes insert, update, and delete operations). The data manipulation rules guide the application of the 18 manipulation features. Adherence to these rules insulates the user and application programs from the physical and logical mechanisms that implement the data management capabilities.

Rule 2—Guaranteed access

Each and every datum (atomic value) in a relational database is guaranteed to be logically accessible by resorting to a combination of table name, primary key value, and column name.
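
Rule 2 can be made concrete with a small sketch. The Python snippet below uses the standard-library sqlite3 module with a hypothetical staff table; the table, its columns, the sample rows, and the helper get_datum are all assumptions made for illustration, not part of Codd's rules or of any particular product. It retrieves a single value from nothing more than a table name, a primary key value, and a column name, and it also shows a null being kept distinct from an empty string, in the spirit of Rule 3.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    " staff_no TEXT PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " phone TEXT)"  # phone may be null (missing or inapplicable)
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("S1", "Ann", "555-0101"), ("S2", "Bob", None), ("S3", "Cy", "")],
)


def get_datum(table, column, key_column, key_value):
    """Fetch one atomic value by table name, primary key value, and column name."""
    # Identifiers cannot be bound as SQL parameters, so they are interpolated;
    # a real application would validate them against the catalog first.
    sql = f"SELECT {column} FROM {table} WHERE {key_column} = ?"
    row = conn.execute(sql, (key_value,)).fetchone()
    if row is None:
        raise LookupError(f"no row in {table} with {key_column} = {key_value!r}")
    return row[0]


ann_phone = get_datum("staff", "phone", "staff_no", "S1")
bob_phone = get_datum("staff", "phone", "staff_no", "S2")  # stored as null
cy_phone = get_datum("staff", "phone", "staff_no", "S3")   # stored as empty string

# The null and the empty string are counted separately by the DBMS.
null_count = conn.execute(
    "SELECT COUNT(*) FROM staff WHERE phone IS NULL"
).fetchone()[0]
empty_count = conn.execute(
    "SELECT COUNT(*) FROM staff WHERE phone = ''"
).fetchone()[0]
```

In this sketch bob_phone comes back as Python's None, while cy_phone is the empty string, and each of the two counting queries matches exactly one row. No pointer, record position, or access path is needed to reach any of the values.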
Rule 4—Dynamic online catalog based on the relational model

The database description is represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to the regular data.

This rule specifies that there is only one language for manipulating metadata as well as data, and moreover that there is only one logical structure (relations) used to store system information.

† A sublanguage is one that does not attempt to include constructs for all computing needs. The relational algebra and relational calculus are database sublanguages.

Rule 5—Comprehensive data sublanguage

A relational system may support several languages and various modes of terminal use (for example, the fill-in-the-blanks mode). However, there must be at least one language whose statements can express all of the following items:

(1) data definition;
(2) view definition;
(3) data manipulation (interactive and by program);
(4) integrity constraints;
(5) authorization;
(6) transaction boundaries (begin, commit, and rollback).

Note that the ISO standard for SQL provides all these functions, so any language complying with this standard will automatically satisfy this rule (see Chapters 6, 7, and 23).

Rule 7—High-level insert, update, delete

The capability of handling a base relation or a derived relation (that is, a view) as a single operand applies not only to the retrieval of data but also to the insertion, update, and deletion of data.

Data independence rules (Rule 8, Rule 9, and Rule 11)

Codd defines three rules to specify the independence of data from the applications that use the data. Adherence to these rules ensures that both users and developers are protected from having to change the applications following low-level reorganizations of the database.
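
One way to picture a low-level reorganization that leaves applications untouched is the addition of an index. The sketch below, again using Python's standard-library sqlite3 module, runs the same query text before and after an index is created; the property table, its rows, and the index name are hypothetical and serve only as an illustration.

```python
import sqlite3

# Hypothetical table and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE property ("
    " property_no TEXT PRIMARY KEY,"
    " city TEXT,"
    " rent INTEGER)"
)
conn.executemany(
    "INSERT INTO property VALUES (?, ?, ?)",
    [("P1", "Glasgow", 350), ("P2", "London", 400), ("P3", "Glasgow", 450)],
)

# The application's query: it names only tables and columns, never indexes.
APP_QUERY = "SELECT property_no FROM property WHERE city = ? ORDER BY property_no"

before = conn.execute(APP_QUERY, ("Glasgow",)).fetchall()

# Change the access method: add an index on the searched column.
conn.execute("CREATE INDEX idx_property_city ON property (city)")

# The query text is unchanged and so is its result.
after = conn.execute(APP_QUERY, ("Glasgow",)).fetchall()
unchanged = before == after
```

Whether the DBMS actually chooses to use the new index is its own decision; the application neither knows nor needs to know.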
Rule 8—Physical data independence

Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representations or access methods.

Rule 9—Logical data independence

Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables.

Rule 11—Distribution independence

The data manipulation sublanguage of a relational DBMS must enable application programs and inquiries to remain logically the same whether and whenever data is physically centralized or distributed.

Distribution independence means that an application program that accesses the DBMS on a single computer should also work in a network environment without modification, even if the data is moved about from computer to computer. In other words, the end-user should be given the illusion that the data is centralized on a single machine, and the responsibility of locating the data from (possibly) multiple sites and recomposing it should always reside with the system. Note that this rule does not say that to be fully relational the DBMS must support a distributed database, but it does say that the query language would remain the same if, and when, this capability is introduced and the data is distributed. Distributed databases were discussed in Chapters 24 and 25.
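
Of the three data independence rules, Rule 9 is perhaps the easiest to sketch in a few lines. In the hypothetical example below (Python's standard-library sqlite3 module; all table, column, and view names are assumptions for illustration), a base table is split into two tables without losing information, and a view carrying the original name restores the shape that the application expects. Only retrieval is shown.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("S1", "Ann", 30000), ("S2", "Bob", 24000)],
)

# The application's query, written against the original base table.
APP_QUERY = "SELECT name, salary FROM staff WHERE staff_no = ?"
before = conn.execute(APP_QUERY, ("S2",)).fetchone()

# Information-preserving change: split staff into two base tables and
# recreate the original shape as a view with the same name.
conn.executescript(
    """
    CREATE TABLE staff_name (staff_no TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE staff_pay (staff_no TEXT PRIMARY KEY, salary INTEGER);
    INSERT INTO staff_name SELECT staff_no, name FROM staff;
    INSERT INTO staff_pay SELECT staff_no, salary FROM staff;
    DROP TABLE staff;
    CREATE VIEW staff AS
        SELECT n.staff_no AS staff_no, n.name AS name, p.salary AS salary
        FROM staff_name AS n
        JOIN staff_pay AS p ON p.staff_no = n.staff_no;
    """
)

# The same query text still works and returns the same row.
after = conn.execute(APP_QUERY, ("S2",)).fetchone()
```

The sketch covers reads only. In SQLite a view cannot be updated directly, so an application that also writes to staff would be impaired by this change, which shows how logical data independence depends in practice on the view updating required by Rule 6.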
