* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Techniques for multiple database integration
Survey
Document related concepts
Serializability wikipedia , lookup
Oracle Database wikipedia , lookup
Relational algebra wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Navitaire Inc v Easyjet Airline Co. and BulletProof Technologies, Inc. wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Ingres (database) wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Functional Database Model wikipedia , lookup
Concurrency control wikipedia , lookup
Versant Object Database wikipedia , lookup
Clusterpoint wikipedia , lookup
ContactPoint wikipedia , lookup
Transcript
Calhoun: The NPS Institutional Archive Theses and Dissertations Thesis and Dissertation Collection 1997-03 Techniques for multiple database integration Whitaker, Barron D Monterey, California. Naval Postgraduate School http://hdl.handle.net/10945/9058 N PS ARCHIVE 1997. OS WHITAKER, B. NAVAL POSTGRADUATE SCHOOL Monterey, California THESIS TECHNIQUES FOR MULTIPLE DATABASE INTEGRATION by Barron D. Whitaker March 1997 Thesis W55045 Thesis Advisor: Approved C. for public release; distribution is Thomas Wu unlimited. dudley knox library ostgraduate school nava*. MONTEREY CA 93943^401 REPORT DOCUMENTATION PAGE Public reporting burden for this collection of information estimated to average Form Approved OMB No. 0704-0188 hour per response, including the time for reviewing instruction, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to Washington is 1 headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington, Management and Budget, Paperwork Reduction Project (0704-0188) Washington DC 20503. VA 22202-4302, and to the Office of 1 AGENCY USE ONLY (Leave blank) . REPORT DATE 2. 3. March 1997 TITLE 4. REPORT TYPE AND DATES COVERED Master's Thesis AND SUBTITLE 5. FUNDING NUMBERS TECHNIQUES FOR MULTIPLE DATABASE INTEGRATION AUTHOR(S) 6. Whitaker, Barron D. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) 7. 8. Naval Postgraduate School Monterey, CA 93943-5000 SPONSORING / MONITORING AGENCY NAME(S) AND ADDRESS(ES) 9. PERFORMING ORGANIZATION REPORT NUMBER 10. SPONSORING MONITORING / AGENCY REPORT NUMBER 11. SUPPLEMENTARY NOTES The views expressed in this thesis are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government. 12a. DISTRIBUTION / AVAILABILITY STATEMENT Approved 1 3. for public release; distribution ABSTRACT (maximum 12b. is DISTRIBUTION CODE unlimited. 200 words) There are several graphic client/server application development tools which can be used to easily develop powerful relational database applications. However, they do not provide a direct means of performing queries which require relational joins across multiple database boundaries. This thesis studies ways to access multiple databases. Specifically, it examines how a "cross-database join" can be performed. A case study of techniques used to perform joins between academic department financial management system and course management system databases was done using PowerBuilder 5.0. Although we were able to perform joins across database boundaries, we found that PowerBuilder is not conducive to cross-database join access because no relational database engine is available to execute cross-database queries. 14. SUBJECT TERMS 15. NUMBER OF PAGES Relational Database, Object Based Database Systems Design, 22 16. Multiple Database Integration SECURITY CLASSIFICATION SECURITY CLASSIFICATION PRICE CODE LIMITAITONOF SECURITY CLASSIFICATION OF REPORT OF THIS PAGE OF ABSTRACT ABSTRACT Unclassified Unclassified Unclassified UL 17. NSN 7540-01-280-5500 18. 19. 20. Standard Form 298 (Rev. 2-89) Prescribed by ANSI Std. 239-18-298-102 11 Approved for public release; distribution is unlimited. TECHNIQUES FOR MULTIPLE DATABASE INTEGRATION Barron D.Whitaker // Major, United States Marine Corps B.S., University of Texas A & M, 1984 Submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE IN COMPUTER SCIENCE from the NAVAL POSTGRADUATE SCHOOL March 1997 ca ABSTRACT bBS*** There are several graphic client/server application development tools which can be applications. used to easily develop However, they do not provide powerful a direct relational database means of performing queries which require relational joins across multiple database boundaries. This thesis studies ways to access multiple databases. it examines how a "cross-database join" can be performed. A Specifically, case study of techniques used to perform joins between academic department financial management system and course management system databases was done using PowerBuilder 5.0. Although we were able we found that PowerBuilder to is perform joins across database boundaries, not conducive to cross-database join access because no relational database engine queries. is available to execute cross-database VI 1 TABLE OF CONTENTS I. E. INTRODUCTION 1 A. BACKGROUND 1 B. THESIS OBJECTIVES 3 C. METHODOLOGY 3 D. ORGANIZATION OF THE THESIS 4 RELATIONAL DATABASE MODEL 5 A. RELATIONAL MODEL OVERVIEW 5 B. STRENGTHS 6 C. LIMITATIONS 7 m. INTEGRATION METHODS 9 A. MULTIPLE DATABASE INTEGRATION B. PHYSICAL INTEGRATION 10 C. LOGICAL INTEGRATION 1 D. 9 1. Database Level 11 2. Application Level 15 INTEGRATION STRATEGY CHALLENGES 1. Data Type Conflicts 17 17 2. Duplicate Data 18 3. Concurrency and Consistency 18 vn IV. CASE STUDY: INTEGRATION OF COURSE MANAGEMENT AND FINANCIAL MANAGEMENT DATABASES A. B. V. 21 BACKGROUND 21 Management System 1. Financial 2. Course Management System 21 3. InterDatabase Relationships 21 INTEGRATION TECHNIQUE 22 CMS 22 1. Create 2. Multiple Database Connections 23 3. Cross-database Join Query 24 4. Query Subcomponents 25 5. Data 6. Putting the Pieces Together with 7. Performance Drag 29 8. A More Efficient Technique 29 Window 26 Objects PERFORMANCE ISSUES A. B. VI. 21 Windows and Scripts 27 33 QUERY OPTIMIZATION 33 1. Relational Join Minimization 33 2. Table Size 33 PARALLELISM 35 CONCLUSIONS 37 Vlll APPENDIX A. ENTITY RELATIONSHIP DIAGRAM FOR THE FMS DATABASE.... 39 APPENDIX B. FMS DATABASE SCHEMA 41 APPENDIX C. CMS ENTITY-RELATIONSHIP DIAGRAM 45 APPENDIX D. CMS DATABASE SCHEMA 47 APPENDIX E. CMS SCRIPTS AND SAMPLE SCREEN CAPTURES 53 LIST OF REFERENCES 59 BIBLIOGRAPHY 61 INITIAL DISTRIBUTION LIST 63 IX LIST OF FIGURES view of tables from disparate databases 1. Integrated 2. Physical Database Integration 3. Logical Integration 4. Application Level Integration 16 5. Application script for concurrent database connections 24 6. Cross-database query between 7. First 8. Second subcomponent query 26 9. Data source query for "d_teach_less_than_2" data window 26 Data source query for "d_pi_less_than_2" data window 27 10. at the 10 Database Level 13 CMS andFMS subcomponent query 25 25 11. Cross-database join illustration using data windows 12. Nested for loop 13. Improved data source query 14. Single for loop script to perform cross-database join script to 2 perform cross-database join for data window XI 28 28 30 31 Xll INTRODUCTION I. BACKGROUND A. Organizations use and applications systems database information needed to conduct daily business and operations. to process Historically, these database systems were individually developed to meet specific needs. As an organization's additional applications requirements information were written maintain, to changed over process, and extract time, information from the individual databases. Typically however, each of these databases provided only a part of the total picture needed by an An example organization. of this illustrated is example integration of the two databases makes query to retrieve the answer books checked out?". of either the to the question, However, preexisting Additionally, as a result of the evolve, which of is it how often becomes necessary 1. In this possible to perform a is databases not available from in this example. that databases in organizations tend to to maintain databases that contain data duplicated in other databases. This leads to the additional problem to deal with data inconsistencies between the differing systems. This situation a underlying Figure "How many sophomores have information this way it in is inefficient both from a data processing perspective since associated with maintaining the it same management perspective and from requires the additional overhead information in more than one database. also, It, makes it difficult, if not to maintain Manpower intensive impossible, consistency and concurrency of the duplicate data. solutions for eliminating inconsistencies and managing concurrency are costly and when compared inefficient solutions to the problem. efficient If the to potential of automated an organization could identify effective and automated methods for integrating existing databases significantly enhance the usefulness of the information it it could already maintains by eliminating database organizational inefficiencies which cause data duplication and inconsistencies. Integrated book checkout book I View student info student id I due date I I student id I classification I student info student id Figure 1 . I classification Integrated view of tables from disparate databases I THESIS OBJECTIVES B. The purpose of this thesis determine to is if possible is it to concurrently access multiple independent databases within an organization in a way that provides an integrated Furthermore, assuming that this and efficiently effectively management system applications? model and to What is view of the information they contain. possible, can multiple databases be integrated provide a into focal a single cohesive of access point for database are the limitations in the context of the relational development in the context of the current database application technology? database In the process of addressing these questions this thesis also provides some analysis of the different methods which might be used to integrate multiple databases and lists the relative advantages and disadvantages of these methods. C. METHODOLOGY In order to test the actual ability to integrate multiple databases a relatively simple subset of the overall larger problem model was built around a problem was selected. university academic The department organization with only two specific database applications, each supported by a dedicated database. PowerBuilder 5.0, management system front-end development which tool, is a relational database was used to develop a course scheduling and management application and supporting database. This application was designed to provide a university academic department 3 with a tool to manage and plan course scheduling and course assignments to teachers. A method was then developed management scheduling and department financial database management developed using PowerBuilder 5.0. with system to a integrate pre-existing which database course the academic was also This method was then evaluated and conclusions were developed. D. ORGANIZATION OF THE THESIS This thesis is discussion of the relational database model, its Chapter organized as follows: its II presents a brief impact on database design, strengths for managing and accessing data, as well as The primary reason limitations Chapter III theoretical its for this discussion is to set the frame which may occur in implementing limitations. work for any a multiple database solution. provides an overview and some discussion of the various approaches which might be taken to develop an integration solution as well as potential advantages and disadvantages. Chapter IV presents a case study of one of the approaches discussed in Chapter III and evaluates this solution against the desired goals of database integration. Chapter V looks at the various database integration performance issues in the context of the relational in model and considers the impact of these issues terms of the model's strengths as well as its limitations. Chapter VI will present conclusions drawn from this research. Finally, RELATIONAL DATABASE MODEL II. RELATIONAL MODEL OVERVIEW A. The since its replace is relational model has been the subject of introduction by many Codd in the early 70's. a great deal of research In the 80's began Although, are looking for management may this ways to be changing as more and more organizations improve their information management and access capabilities through the use of object oriented solutions such as the database model. [Ref. 1] In the relational model, data which are frequently referred remember to as tables. is However, stored as relations it is relation is best represented as a mathematical set of tuples ordering on its members. imposed on its members A table since it however usually has some consists of rows or tuples. is relationship which members a in important to which has no sort of order represents an actual file or physical implementation of an instance of a relation. row ODMG two terms are not completely interchangeable since a that these tuples, each to preexisting networking and hierarchical database systems and currently the basis of most commercially available database systems. it Because by definition, A some other relation or table a relation is a set of distinct and corresponds to an instance of an entity or a exists row relationship instance. between two or more entity instances. or tuple The are the attributes set The of attributes of the entity or which correspond to the columns of a table are the properties of the entity or relationship instance. basic schemas or definitions and a set of integrity constraints placed upon those There are three fundamental update operations for a relation and They they are limited by the integrity constraints on the relation schemas. are MODIFY, UPDATE, and, DELETE. Although Codd eight relational algebra operations in the relational of this database schemas can be defined by a set of relation structure, schemas. Using relation instances, These primitive. only essentially operations five model of SELECT, are DIFFERENCE, and CARTESIAN PRODUCT. originally defined [Refs. for manipulation operations these UNION, PROJECT, 2, 3] are A more detailed discussion of relation schemas, integrity constraints, update operations, and relational algebra operations can be found in [Ref. framework which has formed the basis of relatively simple 4]. It a large is this number of relational database management systems in recent years. B. STRENGTHS • The relational structure and model is is based on a simple and uniform data therefore easier to understand than other database models such as the hierarchical and network data models. It is strongly founded on the mathematical concepts of predicate logic and set theory allowing the representation and manipulation of data to SQL, which is basis of be formalized. [Ref. 3] This the is one of the most widely used query languages in database management systems. • Allows a more abstract representation of a database than previous models such as the hierarchical and network data models. [Ref. 3] Therefore is it easier to design a database model which more closely resembles real world instances and relationships. C. LIMITATIONS • Insufficient semantic completeness or expressiveness. [Ref. This has been a very popular topic for discussion and have proposed semantic modeling extensions model to improve it. [Refs. 5, semantic modeling extension model. because an • it its is to the relational good example of a Entity-Relational (ER) simple and easy to understand. is was intended Mathematical However, when converted into a database schema resemblance relational the is many one of the most well known semantic data models ER model lose it It is A 7] 6, 5] to the real it may often world entities and relationships to represent. [Refs. 3, 4] aggregate algebra. In functions order to cannot be expressed specify operations in such as SUM, the • or AVERAGE, DBMS. The additional operators must be defined by [Ref. 4] recursive closure relational algebra. cannot operation is specified in In order to specify a query to retrieve all some form of looping instances in a recursive relationship, mechanism be needed as well as a means to specify the number of recursive levels required for the specific query based on some base condition. OUTER JOIN and [Ref. 4] OUTER UNION extend the relational JOIN and operations are required to UNION operations to handle joins between relations or tables which either have which contain null values or records have tuples or records which are some tuples in the join attributes or not union compatible. [Ref. 4] The relational JOIN can be and is is a very time consuming operation frequently the cause of performance degradation. because it This requires the cross-referencing of a tuple at a time between two relations and there are many possible ways execute this operation. As a result, to complex queries are usually a performance bottleneck for large database systems. [Refs. 4, 7] INTEGRATION METHODS III. MULTIPLE DATABASE INTEGRATION A. What meant by the term multiple database integration? is integration itself or to make one means make to into a whole by bringing thing a part of something else. parts together This definition context of database design in this thesis to the all mean is represented relationships databases. by data between exist In fact, databases multiple in entities within the of a context is the or by represented duplication of data in disparate databases create logical relationships when applied in is the logical physical combination of multiple databases into one database. of multiple databases can be particularly useful The term Integration same information where data meaning for in database model, relational real data spread over integration, data in Given the disparate It is then new and more expanded information views derived from multiple procedures for eliminating taxonomy of some world relationships not databases can then be accessed from a single vantage point. possible to present disparate actually necessary in order to which model the multiple world real yet realized within the organization's information infrastructure. above or strategies databases and to automate inconsistencies between them. more efficient Following are a which might provide database integration solutions as well as a brief discussion of probable advantages and disadvantages. B. PHYSICAL INTEGRATION The physical integration of a database forward process and consists of designing the union of all is conceptually a straight a single database which contains information contained in the preexisting databases which are to be integrated (Figure 2). The actual process of designing this new r— dbl ij dbSum db3 Figure database redesign. 2. Physical Database Integration can be very difficult because It will applications and, in always almost some most obvious benefits to it requires a require changes to complete system any preexisting cases, these changes will be substantial. this The approach over other integration solutions involving a more loosely coupled arrangement are better performance and easier and less expensive application development for data access and retrieval after a physical integration solution 10 is achieved. In a large organization with very large information management, storage, and access requirements, the physical integration will result in what is frequently referred to as a Very Large Database (VLDB). [Ref. now most In this 8] case redesign will require special attention to ensure that a number of new One very to this strategy challenges are handled. is that it may large possible not be as extensible as other more modular strategies which maintain the existence of separate databases. to achieve a solution which expansion to downside Special care scaleable in order to is is also required accommodate future handle new information requirements and avoid the need for costly changes and/or complete system redesign. Therefore, in terms of system maintenance and updates as a result of changes in the organization's business and management practices, this solution could easily become cost prohibitive. information In other words, handling and soon as the next new requirement for as access identified, is a strategy integration could result in the organization being back in the it was C. in when it started out same situation to integration. LOGICAL INTEGRATION Database Level 1. The strategy behind use on the path of physical logical integration at the database level of a middleware layer which connects presents a single interface to to layer so that databases appear to applications as one large database (Figure 11 the each database separately, yet application the is multiple 3). A significant advantage to this approach same that is it effectively maintains the level of simplicity in query formulation as the physical integration strategy while at the same time allowing the use of existing databases with fewer or no changes to existing structure. This strategy has effect on existing applications and only requires changes to accommodate changes inconsistencies between in their little to no where necessary underlying databases due to field type duplicate The data. between connections preexisting databases and their applications are not affected. This strategy could be implemented through a kind of universal database engine which establishes a single connection to each database. Its cross-database queries from an application and parse subquery components for each database involved. example of the university library given would take the query based on books checked out?" and divide the library database A, who is a the student are in Chapter the question it into would be based on it job is to receive into the appropriate For example, using the the universal engine 1, "How many sophomores have two subqueries. The subquery the smaller question for "Does student sophomore, have any books checked out?" and the subquery for admin database would be "Retrieve sophomores." a list of all students who The universal engine then performs the join of the two intermediate result sets and applies an aggregate function to return the final answer perform to the original question. query optimization based 12 At the same time the engine should on some acquired knowledge Course Management System db .^^ far.i ilty 1 ^ namp fimn id code y v^ V s Figure 3. Logical Integration 13 at the Database Level last namp; of the underlying databases structures. to get the result For example, it number of students university has a very large number of books checked its probably faster above by checking the library database for the existence of a particular "student_id" after first retrieving a list of largely from is comparison if the to the total This logical integration strategy benefits out. more modular approach and information infrastructure. in sophomores Its its ability to leverage existing modularity allows the use of databases which may be distributed and performance can benefit from a greater degree of parallelism since different parts of a cross-database query can be Additionally, this more modular and loosely processed simultaneously. coupled approach allows easier modification without necessarily affecting other parts of the organization's information infrastructure. an organization to upgrade to accommodate changes over time. itself. to It The key as the organization's information needs to the success in this strategy Fortunately the database connectivity of systems gradually, and has more flexibility sounds nice theoretically, but achieve. kind its common This can allow first (ODBC) interface it is the universal engine requires features that are difficult of these is already here. The open standard established by Microsoft necessary change for an application to is connect the to multiple relational database management systems. The more difficult part of the universal engine concept is the capability to establish a base from the connected databases making 14 it knowledge possible to perform intelligent and effective query optimization. Because this solution carries additional overhead of the middleware layer perform as well may architecture situations more as a it might not be expected the to tightly coupled approach, yet its distributed actually yield better overall performance particularly in where there is heavy user load or when handling large queries which span multiple databases. cross-database it with queries Resolving data type mismatches where involved are is more of integration strategy as well as the question of a how challenge with duplicate to deal data which leads to concurrency and consistency problems. with this Maintenance of concurrency and consistency between duplicate and related data could be handled by the implementation of business rules in the middleware layer for data which is related across database boundaries. implemented and how well it would work is How this would be not clear. Application Level 2. Integration at the application level could be accomplished by defining connections within an application to preexisting databases as illustrated in Figure This approach 4. solution in that coupled solution Integration databases it is is a is similar to the previous logical integration more modular approach, but since the boundaries become results in a a little more tightly more blurred. of the information contained in the disparate organizational handled completely at the application level. Therefore, all cross-database query optimization must be handled by the application 15 faculty ' namp emc Academic Dept Course id code las! name Academic Dept Mgmt Financial Mgmt Application Application Figure 4. Application Level Integration developer. This makes the application development process complex. The effectiveness of extent this approach will also depend much more to a great on the robustness of the application development environment. 16 Because this is a loosely coupled approach, redesign of the existing system, however it it does not require significant will require a more deliberate application design effort to resolve data conflicts, minimize duplicate data, and to maintain data concurrency and consistency between the different databases. INTEGRATION STRATEGY CHALLENGES D. Data Type Conflicts 1. A is how challenge that to is common to all of the above integration strategies resolve differences in data types between cross-database join The differences can range from differences fields. different types to different field masks. might be defined another. Using in For example, dates in "dd/mm/yy" format and a physical integration in field in length to one database "dd/mmm/yyyy" format strategy, these differences in are resolved in the design of the integrated database and resulting changes in data types and field lengths will most likely require changes to existing applications. data However special care must be taken to avoid corruption of during the process of migrating data from preexisting databases. Resolution of data conflicts in the logical integration strategies will most likely require changes to both existing databases and applications. However, data conflict resolution should only be necessary where actual joins across database boundaries are required. These cases can then be resolved as they are encountered during the integration design process. 17 Duplicate Data 2. Another common problem existence of duplicate data. database integration resolved integration strategy, As we through but proper requires is the stated earlier, one of the goals of minimize duplication between databases. is to automatically related to that of data conflicts normalization special in is physical the with attention This both logical In both application level and database level logical integration approaches. integration approaches, minimizing duplicate data will require changes to existing databases and applications. will require a Integration at the application level more substantial redesign effort but can be done as part of the actual integration implementation while at the database level fewer changes 3. to applications should be required. Concurrency and Consistency Concurrency and consistency issues arise primarily from the existence of duplicate data and are therefore not expected to present any special problems for the physical integration strategy. over wasted storage and performance Apart from concerns inefficiencies, and concurrency consistency issues are the primary reason for minimizing duplicate data and they present significant challenges for the logical integration strategies. handled by integrity constraints on the a single database, consistency database schema. The problem with way to is In the logical strategies is there impose actual integrity constraints on related 18 fields is no in different databases. Instead, business rules must be used to maintain consistency and must be defined in the application for application level integration this is possible with the application development tool in use. possible, then the benefits of integration reports for decision support. may be it is not limited to read only Business rules to maintain consistency should be defined in the universal engine for logical integration level. If if Concurrency controls will most likely have procedural constraints in the application layer and the application development tool in use. 19 to at the database be implemented using may also be limited by 20 CASE STUDY: INTEGRATION OF COURSE MANAGEMENT AND FINANCIAL MANAGEMENT DATABASES IV. A. BACKGROUND Financial 1. Management System The Financial Management System (FMS) application developed using PowerBuilder 5.0. is Its database a preexisting purpose is to provide a university academic department with an automated decision support tool for managing responsibilities. of this database the department's resources financial The Entity-Relationship diagram is illustrated in contained in Appendix B. Appendix See [Ref. 9] A and fiscal for the relevant entities and the associated schema is for full details of this application and supporting database. Course Management System 2. The Course Management (CMS) System database manages course scheduling and course assignments professors. It was developed using PowerBuilder Relationship diagram for the and the associated schema database is instructors 5.0. illustrated in and The EntityAppendix C contained in Appendix D. InterDatabase Relationships 3. In is CMS to application the case of the FMS and CMS databases, there is a need to accomplish some level of integration since many of the professors which 21 are assigned to teach classes are also principle investigators (PI) for one or more research of the projects. This results in a natural The academic department information contained in the two databases. would like to be able to track the relationship of courses taught since there the is a number of there intersection may by Pis business rule which links the research dollars for each PI to classes he or she is required to teach. On further inspection be other relationships between the two databases which could be taken advantage of, but the situation described above was sufficient to answer the primary questions of whether or not it was possible this thesis. We wanted to first establish client/server multiple databases simultaneously and, if so, to demonstrate connections to perform inner joins on tables from each of those databases. B. INTEGRATION TECHNIQUE 1. Create CMS Design and create the CMS application and supporting database based on the need for an academic department to schedule courses and make course assignments to professors and instructors. The CMS application also serves as the integration application in order to test cross- database join techniques which can be used in logical integration application level. 22 at the 2. Multiple Database Connections Initially it was not how clear if and multiple simultaneous database connections could be created within an integrated database application. discovered that this provides for the is a creation relatively We simple process since PowerBuilder of multiple "transaction" objects within an application which can then be used to represent a client connection to a The code database on the server. in Figure 5 is an excerpt from the "application open" event script contained in Appendix E shows was done. The "cms_trCursor" SQLCA" script statement "fms_trCursor "Transaction" object and assigns it The script assigns the "cms_trCursor" to point to the default database connection profile The this and 'fms_trCursor" are declared as gobal "Transaction" type variables in the PowerBuilder script painter. statement "cms_trCursor = how "SQLCA" CREATE = to the for the CMS application. transaction" creates a "fms_trCursor". new The necessary database connection profile parameters are retrieved from the database profile sections of the "PB.ini" file using the "ProfileStringO" function. The two Transaction databases using the objects "CONNECT" are then statement. 23 connected to their respective //Initial default //In this database transaction application this transaction connect to the CMS database = ProfileString("PB.INr,"Database","DBMS" " ") SQLCA.DbParm = ProfileString( PB.INI Database","DbParm", " ") //is used to SQLCA.DBMS , ,, ,, ,, I //Transaction cms_trCursor declared cms_trCursor = in menu "declare:global" SQLCA CONNECT using cms_trCursor; cmsJrCursor.SQLCODE <> THEN MessageBoxfConnect Error", "CMS connection IF failed") HALT END IF //Transaction fms_trCursor declared fms_trCursor = in "declare:global" menu CREATE transaction fmsJrCursor.DBMS = ProfileString("PB.INI","PROFILE Fms'V'DBMS" ," ") fmsJrCursor.DbParm = ProfileString("PB.INI","PROFILE Fms","DbParm"," ") CONNECT using fmsJrCursor; IF fmsJrCursor.SQLCODE <> THEN MessageBoxfConnect Error","FMS connection failed!") HALT END IF Figure 5. Application script for concurrent database connections. Cross-database Join Query 3. The first step to cross-database join developing a method or technique for performing is to create a standard question you wish to answer that databases. The tables database name. The query relationship between the faculty are members who is CMS query which represents the based on the tables in the relevant identified in SQL a using dot notation following the Figure 6 reflects the previous example and FMS databases and returns a are Pis for only one research project and teaching less than two courses. 24 list who of are SELECT faculty_id, l_name, f_name, mi FROM CMS. faculty WHERE f . f, FMS. employee e faculty_id = e emp_id_code . and EXISTS (SELECT f.faculty_id FROM CMS. teach t WHERE t.teach_fac_id = f.faculty_id GROUPBY f.faculty_id HAVING COUNT f.faculty_id < and EXISTS (SELECT 2) e.emp_id_code FROM FMS. pi p WHERE p emp_id_code = e emp_id_code . . GROUPBY e.emp_id_code HAVING COUNT e emp_id_code = . Figure 6. Cross-database query between CMS and 1) FMS Query Subcomponents 4. Once the proper cross database query subcomponent queries identified, the form will the PowerBuilder "data window" objects which will have The created. the CMS faculty first subcomponent identified database and members who is is are teaching less than 2 courses. f.faculty_id FROM CMS. teach t, CMS. faculty WHERE t.teach_fac_id = f.faculty_id f GROUP BY f.faculty_id HAVING COUNT f.faculty_id < Figure 7. 2 First subcomponent query 25 to is broken into the basis separate for the be defined and based on data contained in the part of the above query SELECT it from data retrieve to subcomponents These databases. necessary is which retrieves (Figure 7) all ) The second and only other subcomponent contained in the FMS in this database and retrieves all example is based on data employees who are Pis for only one research project. (Figure 8) SELECT e emp_id_code FROM FMS. pi p, WHERE p emp_id_code = e emp_id_code . FMS employee e . . . GROUP BY e emp_id_code . HAVING COUNT e emp_id_code < . Figure 5. 8. 2 Second subcomponent query Data Window Objects Using Powerbuilder's SQL select painter, the actual queries representing the subcomponents are created and used as the data source for separate "data window" data source queries for data window objects. Figure 9 and Figure 10 show the resulting the "d_teach_less_than_2" and "d_pi_less_than_2" objects respectively. SELECT "faculty". "faculty_id", "faculty" "f_name", . FROM WHERE "faculty", "faculty" "mi" . "teach" . "faculty". "faculty_id", "faculty" "f_name", . ORDER BY . ("teach". "teach_fac_id" = "faculty" "faculty_id" GROUP BY HAVING "faculty" "l_name", ( "faculty" "l_name", . "faculty" "mi" . count ("teach". "teach_fac_id") < '2* ) "faculty". "faculty_id" ASC Figure 9. Data source query for "d_teach_less_than_2" data 26 window. SELECT "employee". "emp_id_code", "employee" "mi", . FROM "employee", "employee" "f irst_name" . "employee" "last_name" "pi" ("pi"."emp_id_code" = "employee" "emp_id_code" WHERE , . . GROUP BY "employee". "emp_id_code" "employee" "mi", . HAVING , "employee" "first_name", "employee" "last_name" . (count ("employee". "emp_id_code") = ORDER BY ) . 1 ) "employee". "emp_id_code" ASC Figure 10. Data source query for "d_pi_less_than_2" data window. Putting the Pieces Together with 6. Once the data windows objects Windows and Scripts are completed, they are placed in a PowerBuilder "window" using the "window painter" and code the window "open" between the condition FMS is added to event script which implements the cross-database join and CMS (f.faculty_id graphically in Figure 11. databases as reflected in the = e.emp_id_code). The window WHERE This approach script to implement is this clause illustrated approach uses nested "for loops" in order to accomplish the join cross-product of the "dw_teach_less_than_2" and "dw_pi_less_than_2" result 27 sets. (Figure 12) ) dw teach less than 2 dw_pi_less_than_2 SELECT f. faculty id CMS. teach t, CMS .faculty f FROM WHERE t. teach fac id = f faculty id GROUPBY f ,faculty_id HAVING COUNT f.faculty_id < 2 SELECT e.emp id code FROM FMS.pi p, FMS employee e WHERE p emp_id_code = e emp id code GROUPBY e.emp id code HAVING COUNT e.emp_id_ code < 2 . f. . faculty id . e . emp id code Figure 11. Cross-database join illustration using data windows. long pi_number_of_rows long teach_number_of_rows long pijoopcount long teachjoopcount string string piNum teachNum pi_number_of_rows = dw_pi_less_than_2_list.RowCount() teach_number_of_rows = dw_fac_teach_less_than_2_list.RowCount() for pijoopcount = 1 to pi_number_of_rows piNum = dw_pi_less_than_2_list.GetltemString(piJoopcount,1) teachjoopcount = for 1 to teach_number_of_rows teachNum = dwJacJeachJessJhan_2Jist.GetltemString(teachJoopcount,1 IF dw_employeeJinkJacultyJist.Retrieve(piNum,teachNum) >= THEN COMMIT using SQLCA; ELSE ROLLBACK using SQLCA; MessageBox("Retrieve","Retrieve error END - employee join faculty detail") IF next next Figure 12. Nested for loop script to perform cross-database join 28 Performance Drag 7. At this point the original goal of has been accomplished with little performing a cross-database join difficulty; however, the use of nested "for loops" has a significant impact on performance. result in the "teach less than 2" data window result in the "pi less than 2" contains 13. against the CMS. faculty table only one name. contains 7 records and the This results in 91 retrieval calls which contains 74 records. 6734 join comparisons are required yields In this example, the to get the final result This means that which in this case For queries over large tables and with large intermediate result sets, this approach quickly becomes unusable. task. 8. A More Efficient Technique What is If the work done needed more efficient method of performing the same in the script can be reduced to a single loop vice the is a nested dual loop, the performance will be significantly improved. This can be done by combining the supporting subquery for the "teach less than 2" data window with data window and making using the supporting query for the "faculty_link_employee" the resulting query the the "faculty_link_employee" data is basically the same subquery window. The result new shown The passed window to the data string in Figure 13 for the condition "teach less than 2", but have added the join condition "faculty_id = :employee_id" "where" clause. data source for ":employee_id" is the inside a single loop in the 29 to the retrieval "window" we query's argument script and " is as shown against the to 13 in Figure CMS. faculty from 91 and the loop in the different window windows The number of 14. now required table for the cross-database join has been reduced total number of join comparisons required script is are retrieval calls reduced actually run to 962 from 6734. the in application, inside the When there is the two a very noticeable improvement in response speed of the second technique over that of the first. "faculty". "faculty_id", SELECT " FROM faculty" . "faculty", WHERE " f _name " , " "faculty" "l_name", . faculty" "mi . "teach" ("teach". "teach_fac_id" = "faculty" "faculty_id" . and "faculty" "faculty_id" = . ( : employee_id ) GROUP BY "faculty" "faculty_id*\ "faculty" "l_name", "faculty" "f_name", "faculty" "mi " HAVING ( count ("teach". "teach_fac_id") < Figure 13. Improved data source query for 30 2 data ) window ) long number_of_rows long loopcount string piNum number_of_rows = dw_pi_less_than_2_list.RowCount() FOR loopcount = 1 to number_of_rows piNum = dwjDi_less_than_2Jist.GetltemString(loopcount,1) IF dw_employee_link_faculty_list.Retrieve(piNum) >= THEN COMMIT using SQLCA; ELSE ROLLBACK using SQLCA; MessageBoxfRetrieve", "Retrieve error ^ END - employee join faculty detail") IF NEXT Figure 14. Single for loop script to perform cross-database join. 31 32 PERFORMANCE ISSUES V. QUERY OPTIMIZATION A. Relational Join Minimization 1. As in any relational database the join operator will have a significant impact on the performance of any integration solution. physical integration strategy this is handled way strategy of a However, some attention required by the application developer to ensure that queries are still formulated in a In by the query to a large extent optimization rules built into the database engine. is In the case of the that minimizes the number of intermediate result logical integration at implementation of query optimization rules means of minimizing application level, this as a as result, large queries are level, in the universal engine performance bottleneck. we demonstrated attention to query design database the in sets. the is a Integration at the Chapter IV, requires special when performing cross-database joins. As a probably not practical as a rule with this approach, and probably limit the usefulness of application level integration to smaller organizations and applications base on smaller databases. Table Size 2. Although there are many ways common brute is force to perform a join, one of the most the nested (inner-outer) loop approach, often referred to as the approach. This is the 33 method used in both techniques demonstrated in study because it are forced to use this method in the case deals with intermediate result sets or tables represented by windows and data We Chapter IV. the join achieved through the use of a looping is construct where the retrieval arguments are passed to the data record by record comparison and retrieval. When window for a using the nested loop approach, query optimization also becomes a function of which table chosen for the outer loop and which for the inner loop. A factor is known as the "join selection factor" affects the performance of a join and depends on the The equijoin equijoin condition of the two tables and their size. condition affects the percentage of records in a table which will be joined with records in the other illustrated 13 PI file. example given In the by the "faculty" window screen capture employee records with table of 88 records in the in in the case study, as Appendix E, there are less than 2 research projects retrieved FMS from a database while there only 7 faculty members teaching fewer than 2 courses retrieved from a table of 22 teach records in the CMS database. comparisons if we Therefore, result set as the outer loop. small, (7 x 88) join use the "PI less than 2" intermediate result as the outer loop versus 276 (13 x 22) relatively we must perform 616 it is if we use the "teach less than 2" intermediate Even though sufficient significantly affected if one table is to the show numbers in this example are how performance can be significantly larger than the other and only a small percentage of the records in that table are selected for join. 34 B. PARALLELISM Logical integration strategies offer the ability to take advantage of parallelism connections which can be achieved from multiple concurrent database and can realize an overall performance advantage over physical integration strategy depending on the overall configuration and performance. logical a system network The inherently distributed nature of the approach avoids the problem of server overload which is more likely to occur with a single physical integrated database on a centralized server. 35 36 CONCLUSIONS VI. Since the introduction tremendous growth of relational the data model and the use of computers in recent years, the use of in the database applications as a tool for managing information has grown in popularity. Organizations have discovered that they now have many databases and applications which manage access to them, but they find increasingly difficult to their disposal because which now is is through the large amounts of information many of to integrate existing databases so that among numerous considered three data systems strategies for at these databases contain overlapping data not coordinated in any useful or meaningful way. how retrieval sift it is The challenge information access and We efficient and effective. have accomplishing database integration and discussed some of the probable challenges, benefits, and limitations of each in the context of the relational data model. As a result, reminded that one of the primary sources of difficulty database retrieval is the join operation. This is cross-database join access. Using are relational even more significant when seeking an integration solution since the relational model to in we PowerBuilder is not conducive 5.0 we showed techniques for performing a join across multiple database boundaries as a method of integrating existing databases thesis demonstrates that it is at the application level. possible to perform 37 some This level of integration at However, the application level. this approach not practical for use is with large or complex queries for performance reasons making for use in large systems. database integration information It to unsuitable can be a useful means of performing limited between smaller applications resources it provide greater administrative functionality. 38 to decision leverage support existing utility or APPENDIX A. ENTITY RELATIONSHIP DIAGRAM FOR THE FMS DATABASE This entity relationship diagram 5.0's graphic representation of the Only tables which are relevant . ': Database FMS to this screen capture of PowerBuilder database described in Chapter IV. work are included. HEE Fins - ^PJ- depT code — 'jftj depLname chair is a ernp_td_code jon ^ code 1011 budgetjw©e_date fur»d_type Vp laborjon title ©mp_id_code dv_gtade & step mpl - emp_id_code dept_code d*te_ceceived e>$w_date spon_id_code in^_fac_labor_$ -sHU emp_code <&> «n Ht.travet_$ in»t_optar_$ fir*t_name mi IteijnatM base_saLary effjsaLdate eccel ate — ^fel tri\jxx*jript emp_id_code init_conUpa mil_grade WLconLoth service nd*ee^_cost remarks ba(_r*cJabor bal_sptJabor baLtravef i bWo.8 room_# work_phone home_pbone ffe streei_address emp_id_code ov_giade dfy step bal_opter ba<_cor^_mipr bat.cont_ipa bai_cont_oth state zipcode spousejname category term_date >fi 39 Entity relationship diagram of FMS created using S-designor 5.0 AppModeler Desktop. FACULTY EMP ID CODE CIV_GRADE STEP MILITARY char(4) EMP char(5) MIL_GRADE SERVICE char(2) EMP ID ID CODE STAFF EMP char(5) CIV_GRADE STEP char(4) CODE = EMP CODE char(4) ID ID char(4) char(5) char(2) CODE V EMP_ID_CODE = EMP_ID_CODE DEPT CODE = DEPT CODE £ DEPARTMENT DEPT CODE char(2) DEPT_NAME char(40) CHAIR_CODE char(4) EMP ID CODE = EMP ID CODE EMPLOYEE char(4) EMP ID CODE DEPT_CODE char(2) EMP_CODE char(2) SSN FIRST_NAME char(11) Ml char(1) LAST_NAME BASE_SALARY EFF_SAL_DATE ACCEL_RATE BLDG_# char(15) char(15) numeric(10,2) date numeric(3,2) char(3) ROOM_# WORK_PHONE HOME_PHONE char(5) STREET_ADDRESS char(20) CITY char(15) STATE ZIPCODE char(2) SPOUSE_FNAME CATEGORY TERM DATE char(15) char(13) char(13) char(10) char(1) date ACCOUNT PI EMP JON ID CODE char(4) char(5) JON _ JON JON char(5) BUDGET PAGE DATE FUND TYPE LABOR JON date char(2) char(5) TITLE char(30) DATE RECEIVED EXPIR DATE INIT FAC LABOR INIT SPT LABOR INIT TRAVEL $ date INIT_OPTAR_$ INIT CONT MIPR INIT_CONT_IPA CONT OTH INDIRECT COST REMARKS INIT date $ numeric(12,2) $ numeric(12,2) numeric(12,2) numeric(12,2) numeric(12,2) numeric(12,2) numeric(12,2) numeric(12,2) char(50) BAL FAC LABOR BAL SPT LABOR BAL TRAVEL BAL OPTAR BAL_CONT_MIPR BAL CONT PA numeric(12,2) BAL_CONT_OTH numeric(12,2) I 40 numeric(12,2) numeric(12,2) numeric(12,2) numeric(12,2) numeric(12,2) ;MP ID CODE = EMP ID CODE APPENDIX B. FMS DATABASE SCHEMA EMPLOYEE Column List Name ACCEL RATE BASE SALARY BLDG # Code Type CATEGORY ACCEL RATE BASE SALARY BLDG # CATEGORY CITY CITY char(15) DEPT CODE EFF SAL DATE DEPT CODE EFF SAL DATE date EMP CODE EMP ID CODE FIRST NAME HOME PHONE LAST NAME EMP CODE EMP ID CODE FIRST NAME HOME PHONE LAST NAME char(2) Ml Ml char(1) ROOM # ROOM # numeric(3,2) numeric(10,2) char(3) char(1) char(2) char(4) char(15) char(13) char(15) char(5) SPOUSE FNAME SPOUSE FNAME char(15) SSN STATE STREET ADDRESS TERM DATE SSN STATE STREET ADDRESS TERM DATE char(11) WORK PHONE WORK PHONE char(13) ZIPCODE ZIPCODE char(10) char(2) char(20) date P No No No No No No No No Yes No No No No No No No No No No No No Reference to List DEPARTMENT Foreign Key Primary Key Reference to DEPT_CODE DEPT CODE Reference by List PI FACULTY STAFF MILITARY Foreign Key Primary Key Referenced by EMP EMP EMP EMP ID ID ID ID CODE CODE CODE CODE 41 EMP EMP EMP EMP ID ID ID ID CODE CODE CODE CODE M No No No Yes No No No Yes Yes No No Yes No No No No No No No No No FACULTY Column List Name Code Type CIV GRADE EMP ID CODE CIV GRADE EMP ID CODE char(5) STEP STEP char(2) char(4) P M No No Yes Yes No No Reference to List EMPLOYEE Foreign Key Primary Key Reference to EMP_ID_CODE EMP_ID_CODE MILITARY Column List Name EMP MIL CODE GRADE ID SERVICE Code EMP MIL Type CODE GRADE char(4) ID char(5) SERVICE char(4) P M Yes Yes No No No No Reference to List EMP EMPLOYEE Foreign Key Primary Key Reference to EMP CODE ID ID CODE STAFF Column List Name CIV EMP GRADE ID CODE Type Code CIV EMP GRADE ID CODE char(5) char(4) STEP STEP char(2) P No No Yes Yes No No Reference to List EMPLOYEE Foreign Key Primary Key Reference to EMP ID CODE 42 EMP ID CODE M DEPARTMENT Column List Name Code Type CHAIR CODE CHAIR CODE char(4) DEPT CODE DEPT_NAME DEPT CODE DEPT NAME char(2) char(40) _ Reference by List Primary Key Referenced by DEPT CODE EMPLOYEE P M No No Yes Yes No No Foreign Key DEPT CODE PI Column List Name EMP ID CODE JON Code EMP ID Type CODE char(4) JON Reference to List Foreign Key Primary Key Reference to EMPLOYEE ACCOUNT char(5) P M Yes Yes Yes Yes EMP ID CODE EMP JON JON 43 ID CODE ACCOUNT Column List Name BAL BAL BAL BAL BAL BAL BAL Code CONT IPA CONT MIPR CONT OTH FAC LABOR OPTAR SPT LABOR TRAVEL BUDGET PAGE DATE BAL BAL BAL BAL BAL BAL BAL Type CONT IPA CONT MIPR CONT OTH FAC LABOR OPTAR SPT LABOR numeric(12,2) numeric(12,2) numeric(12,2) numeric(12,2) numeric(12,2) numerical 2,2) DATE RECEIVED EXPIR DATE FUND TYPE TRAVEL BUDGET PAGE DATE DATE RECEIVED EXPIR DATE FUND TYPE INDIRECT COST INDIRECT COST INIT CONT IPA CONT MIPR CONT OTH INIT FAC LABOR INIT INIT OPTAR $ SPT LABOR INIT TRAVEL INIT INIT INIT CONT IPA CONT MIPR CONT OTH FAC LABOR OPTAR $ SPT LABOR INIT TRAVEL INIT INIT INIT $ INIT INIT $ $ numeric(12,2) date date date char(2) numeric(12,2) numeric(12,2) numerical 2,2) numeric(12,2) $ numeric(12,2) numeric(12,2) $ $ numeric(12,2) numeric(12,2) JON LABOR JON JON LABOR JON char(5) REMARKS REMARKS char(50) TITLE TITLE char(30) char(5) P No No No No No No No No No No No No No No No No No No No Yes No No No Reference by List PI Foreign Key Primary Key Referenced by JON JON 44 M No No No No No No No No No No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No No No APPENDIX C. CMS ENTITY-RELATIONSHIP DIAGRAM This entity relationship diagram 5.0's graphic representation of the Database - is CMS a screen capture of PowerBuilder database described in Chapter IV. EM3 Cms bU 45 CMS Entity relationship diagram of created using S-designor 5.0 AppModeler Desktop. section teach char(5) section nbr teach sec nbr char(2) s course nbr char(6) teach course nbr char(6) s ay char(2) teach ay char(2) teach qtr char(1) teach fac id section_nbr = teach_sec_nbr s_course_nbr = teach_course_nbr char(2) char(1) s qtr s_ay = teach_ay = teach_ qtr s_qtr I offered »urse = s course nbr ay_offered = s_ay qtr_offered facultyjd = teach_fac_td = s_qtr faculty "faculty qtr offering char(5) id l_name f_name char(20) offered course char(6) char(20) ay offered char(2) mi chart 1) qtr offered aterQ) rank char(5) ay qtr ay = ay_offered qtr_nbr = qtr_offered char(2) nbr char(l) start_date date end date date course nbr = offered course faculty_id = ct_faculty_id can teach ct course nbr ct faculty id char(6) char(5) course nbr = ct course nbr char(6) coursejitle char(40) course nbr instr hours c_description 46 char(3) char(2000) prerequisite p_course_nbr cocourse_nbr = p_course_nbnbr prerequisite_c_nbr char(6) char(6) APPENDIX D. CMS DATABASE SCHEMA CAN TEACH Column list Name Code P M Yes Yes Yes Yes Type ct_course_nbr ct_course_nbr char(6) ct_faculty_id ct_faculty_id char(5) Reference to List Primary Key Reference to Foreign Key course course_nbr ct_course_nbr faculty faculty_id ct_faculty_id COURSE Column List Name Code Type c_description c_description char(2000) course nbr course title instr hours course nbr course title char(6) instrjiours char(3) char(40) P No Yes No No M No Yes Yes No Index List Index Code idx_course_nbr F P Yes No U C Yes No Column Code Sort ASC course_nbr Reference by List Referenced by can teach offering prerequisite prerequisite Foreign Key Primary Key course nbr course nbr course_nbr course_nbr 47 course nbr offered_course ct p_course_nbr prerequisite_c_nbr FACULTY Column List Name Code Type f_name fjiame char(20) facultyjd facultyjd char(5) l_name Ijiame char(20) mi mi char(1) rank rank char(5) P No Yes No No No M No Yes No No No Reference by List Primary Key Referenced by Foreign Key can_teach facultyjd ct_faculty_id teach facultyjd teach fac id OFFERING Column List Name Code Type ay_offered ay_offered char(2) offered_course offeredjx>urse char(6) qtijoffered qtr_offered char(1) P M Yes Yes Yes Yes Yes Yes Reference to List qtr Foreign Key Primary Key Reference to course coursejibr ay offeredjx>urse qtr_nbr qtr_offered ay_offered Reference by List Referenced by section Primary Key offered jx>urse Foreign Key ay_offered s_course_nbr s_ay qtr_offered s_qtr 48 PREREQUISITE Column List Name Code Type P M No No Yes Yes p_course_nbr p_course_nbr char(6) prerequ isite_c_n br prerequisite_c_nbr char(6) Table prerequisite Index List Index Code P No pk_prereq F U C Yes Yes No Column Code prerequisite_c_nbr Reference to Sort ASC ASC p_course_nbr List Reference to course course Primary Key course_nbr course_nbr Foreign Key p_course_nbr prerequisite_c_nbr QTR Column List Name Code Type ay end date ay end date date qtr_nbr qtr_nbr char(1) start_date start_date date Reference by M P Yes Yes No Yes Yes Yes No Yes List Primary Key Referenced by offering char(2) Foreign Key ay ay_offered qtr_nbr qtr_offered 49 SECTION Column List Name Code Type s_ay s course nbr s_ay char(2) s course nbr char(6) s_qtr s_qtr char(1) section_nbr section nbr char(2) Reference to P Yes Yes Yes Yes List Primary Key ay_offered Foreign Key s_course_nbr s_ay qtr_offered s_qtr Reference to offering offered_course Reference by List Primary Key Referenced by teach Foreign Key section nbr teach sec nbr s course nbr teach course nbr s_ay teach_ay s_qtr teach_qtr 50 M Yes Yes Yes Yes TEACH Column List Name Code Type teach_ay teach course nbr teach fac id teach_ay teach course nbr char(2) teach fac char(5) teach_qtr teach_qtr char(1) teach sec nbr teach_sec_nbr char(2) Reference to id char(6) P Yes Yes Yes Yes Yes List Primary Key Reference to Foreign Key faculty faculty_id teach_fac_id section section nbr s course nbr teach sec nbr teach course nbr s_ay teach_ay s_qtr teach_qtr 51 M Yes Yes Yes Yes Yes 52 APPENDIX E. CMS SCRIPTS AND SAMPLE SCREEN CAPTURES Script for Application Open Event //Initial default database transaction //In this application this transaction //is used to connect to the CMS database = Prof ileString "PB. INI", "Database", "DBMS" SQLCA.DBMS SQLCA.DbParm = ProfileString "PB. INI", "Database", "DbParm", ( , ( " " //Transaction cms_trCursor declared in "declare: global" menu cms_trCursor = SQLCA CONNECT using cms_trCursor; IF cms_trCursor.SQLCODE <> THEN MessageBox ("Connect Error", "CMS connection failed") HALT END IF //Transaction fms_trCursor declared in "declare global" menu fms_trCursor = CREATE transaction fms_trCursor.DBMS = Prof ileString "PB. INI", "PROFILE Fms", "DBMS", " ") fms_trCurs or. DbParm = Prof ileString "PB. INI", "PROFILE Fms", "DbParm", " " : ( ( CONNECT using fms_trCursor; IF fms_trCursor.SQLCODE <> THEN MessageBox ("Connect Error", "FMS connection failed!") HALT END IF /* let user control the toolbar*/ toolbarusercontrol = TRUE toolbartext = TRUE open (w main) 53 Script for Application Close Event transaction cms_trCursor cms_trCursor = SQLCA DISCONNECT using cms_trCursor; THEN IF cms_trCursor SQLCODE <> ROLLBACK using cms_trCursor; MessageBox( "Disconnect", cms_trCursor .SQLERRTEXT) END IF . 54 ) Script for w_facuity "window" Open Event with nested loop // associate each data window with a database connection // transaction object dw_fac_teach_less_than_2_list settransobject sqlca dw_pi_less_than_2_list settransobject (fms_trcursor dw_employee_link_faculty_list settransobject sqlca . ( ) . ( ) . // disable edit control for each data window dw_pi_less_than_2_list .Modify "datawindow. readonly = yes") dw_fac_teach_less_than_2_list .Modify "datawindow. readonly = yes") dw_employee_link_faculty_list .Modify "datawindow. readonly = yes") ( ( ( // retrieve list of faculty from CMS for which teach instances // are less than two dw_f ac_teach_less_than_2_list data window IF dw_f ac_teach_less_than_2_list. Retrieve () = -1 THEN ROLLBACK using SQLCA; MessageBox "Retrieve", "Faculty Teach Retrieval Error") ( ELSE COMMIT using SQLCA; END IF // retrieve list of employees from FMS for which PI instances // are less than 2 in dw_pi_less_than_2_list data window = -1 then IF dw_pi_less_than_2_list .Retrieve ( ) rollback using fms_trCursor; messagebox "Error", "PI Employee List Retrieval Error") ( ELSE commit using fms_trCursor; END IF 55 ) ) Script for w__faculty "window" Open Event with nested loop (continued) long pi_number_of_rows long teach_number_of_rows long pi_loopcount long teach_loopcount string piNum string teachNum // get number of rows in pi result data window pi_number_of_rows = dw_pi_less_than_2_list RowCount // get number of rows in teach result data window teach_number_of_rows = dw_fac_teach_less_than_2_list .RowCount . for pi_loopcount = 1 ( ( to pi_number_of_rows piNum = dw_pi_less_than_2_list .GetltemString (pi_loopcount, for teach_loopcount = 1 1) to teach_number_of_rows teachNum = dw_fac_teach_less_than_2_list .GetltemString (teach_loopcount, IF dw_employee_link_faculty_list .Retrieve (piNum, teachNum) >= THEN COMMIT using SQLCA; ELSE ROLLBACK using SQLCA; MessageBox ("Retrieve", "Retrieve error - employee join faculty detail") END IF next next 56 1) ) ) Script for w faculty "window" Open Event with single for loop // associate each data window with a database connection // transaction object dw_fac_teach_less_than_2_list settransobject sqlca dw_pi_less_than_2_list settransobject (fms_trcursor) dw_employee_link_faculty_list settransobject sqlca . ( ) . ( ) . // disable edit control for each data window dw_f ac_teach_less_than_2_list Modify "datawindow readonly = yes " dw_pi_less_than_2_list .Modify ("datawindow. readonly = yes") dw_employee_link_faculty_list .Modify "datawindow. readonly = yes") ( . . ( // retrieve list of faculty from CMS for which teach instances // are less than two dw_fac_teach_less_than_2_list data window IF dw_fac_teach_less_than_2_list.Retrieve() = -1 THEN ROLLBACK using SQLCA; MessageBox ("Retrieve", "Faculty Teach Retrieval Error") ELSE COMMIT using SQLCA; END IF // retrieve list of employees from FMS for which PI instances // are less than 2 in dw_pi_less_than_2_list data window IF dw_pi_less_than_2_list.Retrieve() = -1 THEN ROLLBACK using fms_trCursor; messagebox ("Error", "Employee List Retrieval Error") ELSE COMMIT using fms_trCursor; END IF long number_of_rows long loopcount string piNum number_of_rows = dw_pi_less_than_2_list .RowCount FOR loopcount = 1 ( to number_of_rows piNum = dw_pi_less_than_2_list. Get ItemString( loopcount, IF dw_employee_link_faculty_list .Retrieve (piNum) COMMIT using SQLCA; ELSE ROLLBACK using SQLCA; MessageBox "Retrieve", "Retrieve error END IF >= 1) THEN ( NEXT 57 - employee join faculty detail") Screen capture of w faculty window HBQ Course Management System &t» facuky Course* Schedufino. FaultyJdL Last Name 3333 HALWACHS ; Help First Name | M Initial [THOMAS ORNA NAKAGAWA ORPD =URDUE SCHMORRCW ORTW TAYLOR: 3RUTZMAN SORDOtvt 3 ETER 3YLAN iames: 50NALD * 5&S&*S&S» •-.. Close . . . 58 LIST OF REFERENCES 1. Loomis, Mary E. S. , Object Databases The Essentials, Reading, MA: Addison- Wesley Publishing Company, 1996. 2. 3. J., An Introduction to Database Systems, Fourth Wesley Publishing Company, Reading, MA, 1986. Date, C. Edition, Volume 1, Addison- Spear, R. L., "A Relational/Object-Oriented Database Management System: " Master's Thesis, Naval Postgraduate School, Monterey, CA, September, 1992. R/OODBMS, 4. Elmasri, R. and Navathe, S. B., Fundamentals ofDatabase Systems, Second Edition, Inc., Redwood City, CA, 1994. Benjamin/Cummings Publishing Company, 5. Kent, W. Limitations of Record-Based Information Models, 4(1); 1979; 107-131. ACM Transactions on Database Systems, 6. Codd, E. J., Extending the Database Relational Model to Capture More Meaning, ACM Transactions on Database Systems, 4(4); 7. 8. 1979. C. T., An Effect of Set Type to Query Formulation in Relational Database Systems, Naval Postgraduate School, Monterey, CA, Report No NPS52-88-018, July, 1988. Wu, Jernigan, Kevin, "Making the Move to VLDBs," Databased Advisor, p. 36, March, 1997. 9. Pires, S. Alan, FMS Thesis reference, Master's Thesis, Naval Postgraduate School, Monterey, CA, March, 1997. 59 60 BIBLIOGRAPHY L., and Tucherman, L., A Software Tool for Modular Design, ACM Transactions on Database Systems, Vol. 16, No.2, June 1991, pp. 209-234. Casanova, M. A., Furtado, A. Codd, E. F., The Relational Model for Database Management, Version Addison- Wesley Publishing Company, 1990. 2, Reading, J., with Warden, A., Relational Database Writings, 1885-1989, Reading, Addison- Wesley Publishing Company, 1990. Date, C. J., with Darwin, H., Relational Database Writings, 1889-1991, Reading, Addison- Wesley Publishing Company, 1992. Date, C. MA: MA: MA: Grahne, G., The Problem of Incomplete Information in Relational Databases, Lecture Notes in Computer Science, Vol 554, Springer- Verlag Berlin Heidelberg, 1991. Kuper, G. M. and Moshe, Y. V., The Logical Data Model, ACM Transactions on Database Systems, Vol. 15, No.l, September 1990, pp. 379-413. View Updates in Relational Databases with an Independent Scheme, ACM Transactions on Database Systems, Vol. 15, No.l, March 1990, pp. 40-66. Langerak, R., Shasha, D. and Wang, Relations Are T., Optimizing, Equijoin Queries in Distributed Databases Where Hash Partitioned, ACM Transactions on Database Systems, Vol. 16, No.2, June 1991, pp. 279-308. Markowitz, V. M. and Shoshani, A., Representing Extended Entity-Relationship Transactions on Structures in Relational Databases: A Modular Approach, Database Systems, Vol. 17, No.3, September 1992, pp. 423-464. ACM 61 62 INITIAL DISTRIBUTION LIST 1. Defense Technical Information Center. 8725 John Ft. Belvoir, 2. Kingman Road., J. VA Ste 0944 22060-6218 Dudley Knox Library Naval Postgraduate School 411 Dyer Rd. Monterey, 3. CA 93943-5101 Director, Training and Education MCCDC, Code C46 1019 Elliot Rd. Quantico, 4. VA 22134-5027 Director, Marine Corps Research Center MCCDC, Code C40RC 2040 Broadway Street Quantico, 5. VA 22134-5107 Director, Studies and Analysis Division MCCDC, Code C45 300 Russell Road Quantico, VA 22134-5130 63 6. Chairman, Code CS Computer Science Department Naval Postgraduate School Monterey, 7. C. CA 93943-5101 Thomas Wu, Code CS/Wq .. Computer Science Department Naval Postgraduate School Monterey, 8. CA 93943-5101 Mark H. Barber, Code CS/Bm Computer Science Department Naval Postgraduate School Monterey, 9. CA 93943-5101 Maj. Barron D. Whitaker 375C Bergin Dr. Monterey, CA, 93940 64 DUDLEY KNOX LIBRARY -POSTGRADUATE SCHOOL £Y CA S3343-SJ01 DUDLEY KNOX LIBRARY 3 2768 00337607