Kristallynn D. Tolentino, MSIT

I. Introduction

"A relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model," as defined by E. F. Codd of IBM's San Jose Research Laboratory. A relational DBMS is special system software that is used to manage the organization, storage, access, security and integrity of data. This specialized software allows application systems to focus on the user interface, data validation and screen navigation. When there is a need to add, modify, delete or display data, the application system simply makes a "call" to the RDBMS.

Although there are many different types of database management systems, relational databases are by far the most common. Other types include hierarchical databases and network databases. Although database management systems have been around since the 1960s, relational databases did not become popular until the 1980s, when computing power skyrocketed and it became feasible to store data in sets of related tables and to provide real-time data access. RDBMS have been the predominant choice since the 1980s for the storage of information in new databases used for financial records, manufacturing and logistical information, personnel data, and much more. Relational databases have often replaced legacy hierarchical and network databases because they are easier to understand and use.

In the relational model, all data must be stored in relations (tables), and each relation consists of rows and columns. Each relation must have a header and a body. The header is simply the list of columns in the relation. The body is the set of data that actually populates the relation, organized into rows; each row of data is called a tuple, and the intersection of one row and one column holds a single value.

The second major characteristic of the relational model is the use of keys. These are specially designated columns within a relation, used to order data or to relate data to other relations. One of the most important keys is the primary key, which is used to uniquely identify each row of data. To make querying for data easier, most relational databases go further and physically order the data by the primary key. Foreign keys relate data in one relation to the primary key of another relation.

The relational model is concerned with what is required, which separates it from concerns of how the model will be implemented; the how is the concern of the relational DBMS. The relational model focuses on representing data through the relationships held between data items. This approach has its theoretical basis in set theory and predicate logic, so data is represented in the form of tuples. Data in the relational model is queried and manipulated using relational algebra or relational calculus. The relational DBMS, in turn, is concerned with how the relational model will be implemented: the data is represented in the form of tables and rows, and is queried using a query language, most commonly SQL (Structured Query Language).
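As a minimal illustration of the relations, tuples, and keys just described, here is a hedged sketch using Python's built-in sqlite3 module; the table and column names are invented for this example.

```python
# A minimal sketch of the relational concepts above, using Python's
# built-in sqlite3 module. Table and column names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce foreign keys in SQLite

# Each CREATE TABLE defines a relation: the header is the column list,
# the body is the set of rows (tuples) that will populate it.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- uniquely identifies each row
        name        TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        total       REAL NOT NULL
    )""")

conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.execute("INSERT INTO customer_order VALUES (10, 1, 99.50)")

# A declarative query joins the two relations through the key columns.
for row in conn.execute("""
        SELECT c.name, o.order_id, o.total
        FROM customer c JOIN customer_order o
          ON o.customer_id = c.customer_id"""):
    print(row)   # ('Alice', 10, 99.5)
```

The REFERENCES clause is the foreign key in action: it ties each order row back to the primary key of exactly one customer row.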
II. Statement of the Problem

1. DBMS Architecture - an "open problem" even up to now
1.1 What is the DBMS in the past? In the present? And in the future?
1.2 What are the security issues in database management systems?

III. Related Literature

A database is generally designed "to operate large quantities of information by inputting, storing, retrieving, and managing that information. Databases are set up so that one set of software programs provides all users with access to all the data." A Database Management System (DBMS) allows users to define, create and maintain a database, and it provides controlled access for handling data, which includes adding, retrieving and updating data. It helps the user manage and manipulate the data inside a database or data store.

There are four (4) main types of Database Management System (DBMS), each with its own category of applicability and usefulness, and the structure of each type of DBMS is different. One of these types is the Relational Database Management System (RDBMS). Based on the structure of an RDBMS, "the database relationships are treated in the form of a table," with three key notions in a relational DBMS: relation, domain and attributes. A statistical table composed of rows and columns is used to organize the database and its structure, and is actually a two-dimensional array in the computer's memory. (A network DBMS, by contrast, is built from the fundamental constructs of sets and records: sets express one-to-many relationships, and records contain fields.) As has been said, RDBMS are widely used around the world because they often replaced legacy hierarchical and network databases, being easier to understand and use.

Nowadays, the Relational Database Management System (RDBMS) is the type of DBMS most widely used around the world. The RDBMS comes in different products, such as Oracle, SQL Server, MySQL, Sybase, and DB2, among others. The classic RDBMS architecture, however, is a thirty-year-old design running on an old platform. According to Michael Stonebraker, a computer scientist specializing in databases, "Traditional databases are slow not because SQL is slow. It's because of their architecture and the fact that they are running code that is 30 years old." Here are the problems frequently encountered:

1. Performance level. It has a very poor performance level. It may cause lagging, or even the possibility that the database may crash.

2. Bloatware. Since it is an "Elephant System," bloat can occur. This affects the usefulness of the software because it consumes excessive disk space and large amounts of memory (RAM).

3. Scalability. The ability of a DBMS to handle big data is also an issue with old DBMS designs, especially when handling multiple transactions at the same time.

An RDBMS uses execution plans in order to evaluate, at run time, the data that the user requested; it uses a cost-based algorithm for handling task scheduling. Through this, queries against the defined elements of the database can be handled in an ad hoc manner.
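To make the point about execution plans concrete, the following hedged sketch asks SQLite's cost-based planner to reveal the plan it has chosen; the schema is hypothetical, and the exact plan text varies with the SQLite version.

```python
# A small sketch of how a cost-based optimizer exposes its execution plan.
# SQLite's EXPLAIN QUERY PLAN is used here purely for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT)")
conn.execute("CREATE INDEX idx_owner ON account(owner)")

# The planner chooses between a full table scan and an index search
# based on the estimated cost of each alternative.
for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM account WHERE owner = ?",
        ("Alice",)):
    print(row)  # e.g. (..., 'SEARCH account USING INDEX idx_owner (owner=?)')
```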
According to Tony Bain, "Even though RDBMS have provided database users with the best mix of simplicity, robustness, flexibility, performance, scalability, and compatibility, their performance in each of these areas is not necessarily better than that of an alternate solution pursuing one of these benefits in isolation." The RDBMS has delightful features to ensure concurrency, integrity, flexibility and other logical qualities, yet it still runs into one critical situation after another. After more than thirty years of technological innovation, the relational database architecture is becoming obsolete: it is no longer capable of handling very large data volumes, scalability, performance, concurrency and the other issues encountered with this kind of technology.

Performance and scalability are important in any DBMS. "There are five factors that influence database performance: workload, throughput, resources, optimization, and contention... database performance can be defined as the optimization of resource use to increase throughput and minimize the contention, enabling the largest possible workload to be processed." New ideas for database architecture are needed that can keep up with the pace of technological innovation. This is demanding, but the new designs still need to preserve the database capabilities that improve the performance level and support higher scalability of the services offered by a database management system. Performance and scalability through modern, innovative software architecture can satisfy the non-functional requirements of a DBMS only through a completely different implementation of a new design. Because the existing database architectures and code bases are too old for the present pace of technological change and for users' demands for services, running them is like running an "Elephant System." This affects how fast data can be retrieved, transferred, updated and saved, and how well crashes of the database or hardware can be avoided when there are simultaneous, continuous transactions over big data. One of the biggest problems with databases is precisely this existing legacy implementation.

Nowadays, newly developed database architectures and platforms are proposed and in use that support Online Transaction Processing (OLTP) or Online Analytical Processing (OLAP). The emphasis in "OLTP systems is put on very fast query processing, maintaining data integrity in multi-access environments and an effectiveness measured by number of transactions per second." OLAP, by contrast, is "a category of software tools that provides analysis of data stored in a database. OLAP tools enable users to analyze different dimensions of multidimensional data. For example, it provides time series and trend analysis views. OLAP often is used in data mining."
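The following sketch contrasts the two access patterns on a single toy table, again using Python's sqlite3 module; the schema and figures are invented.

```python
# Hedged sketch contrasting OLTP and OLAP access patterns on one table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)")
conn.executemany("INSERT INTO sale(region, year, amount) VALUES (?, ?, ?)",
                 [("north", 2013, 120.0), ("north", 2014, 150.0), ("south", 2014, 90.0)])

# OLTP: a short, key-based transaction that must finish in milliseconds.
conn.execute("UPDATE sale SET amount = amount + 10 WHERE id = ?", (1,))
conn.commit()

# OLAP: a scan-heavy analytical query over many rows and dimensions
# (here a time-series/trend style aggregate by region and year).
for row in conn.execute("""
        SELECT region, year, SUM(amount)
        FROM sale GROUP BY region, year ORDER BY region, year"""):
    print(row)
```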
The old SQL database structure used for OLTP is very slow and does not scale. Finance monitoring, website analytics, online gaming and more are the workloads that new OLTP systems handle, and all of these must respond on a real-time basis: processing and validation of data happen in real time. Furthermore, OLTP now requires support for transactions that can span a network and may include more than one company. That is why new OLTP software uses client-server processing and brokering software that allows transactions to run on different computer platforms in a network.

At present, there are still users and clients that use old database management systems, the legacy RDBMS, known as "Elephant systems." Using such a DBMS can lead to mediocre performance. Michael Stonebraker once said that these kinds of databases are "slow because they spend all of their time on not useful work" rather than on useful work.

One of the solutions for the new era of technology is a new kind of database known as NoSQL (short for "Not Only SQL"). It is "capable of high throughput." NoSQL does not adopt the Atomicity, Consistency, Isolation, Durability (ACID) guarantees or the Structured Query Language (SQL), in order to offer higher-performance DBMS services. Tim Perdue, a Director of Information Technology for a mid-sized tech company in Northern Virginia, said: "The idea is that both technologies can coexist and each has its place. The NoSQL movement has been in the news in the past few years as many of the Web 2.0 leaders have adopted a NoSQL technology. Companies like Facebook, Twitter, Digg, Amazon, LinkedIn and Google all use NoSQL in one way or another."

NoSQL emerged from the need for data storage that copes with more interconnected data and can handle complex data structures. "Since the rise of the web, the volume of data stored about users, objects, products and events has exploded. Data is also accessed more frequently, and is processed more intensively - for example, social networks create hundreds of millions of customized, real-time activity feeds for users based on their connections' activities." By this means, NoSQL provides two (2) technical approaches to address these scalability and agility challenges: manual sharding and a distributed cache. There are also two recently proposed NoSQL language standards, namely CQL and UnQL. According to Edmond Lau, "The main problems that a NoSQL aims to solve typically revolve around issues of scale. When data no longer fits on a single MySQL server or when a single machine can no longer handle the query load, some strategy for sharding and replication is required."
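A minimal sketch of the manual-sharding idea follows; the shard names and hash scheme are illustrative assumptions, not any particular product's API.

```python
# A minimal sketch of manual sharding: the application (not the
# database) picks which server holds a given key.
import hashlib

SHARDS = ["db-server-0", "db-server-1", "db-server-2"]  # hypothetical hosts

def shard_for(key: str) -> str:
    """Hash the key and map it onto one of the shards."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Every read or write for a user must be routed to the same shard.
print(shard_for("user:1001"))  # e.g. db-server-2
print(shard_for("user:1002"))  # routing is deterministic per key

# The drawback Lau alludes to: adding a shard changes len(SHARDS) and
# remaps most keys, which is why consistent hashing is often used instead.
```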
NoSQL is better suited to Web applications because of its ability to cope with real-time service over the web. But there are still problem areas that will be encountered with this kind of database. "These databases can be resource intensive. They demand higher CPU and RAM allocation than any relational database... when your website gets a traffic boost, be prepared to allocate more resources in a hurry. And that's the problem with these NoSQL databases. And that's the reason most shared Web hosting companies will not offer them to you on a shared hosting account. You need a Cloud or VPS or a dedicated server." Another issue is that, having given up ACID, rolling back data on your own is hard and can be error-prone. ACID stands for the following: Atomicity follows an all-or-nothing rule, meaning a unit of processing either completes fully or not at all. Consistency ensures that only valid data is written. Isolation "ensures that transactions are securely and independently processed at the same time without interference, but it does not ensure the order of transactions." Lastly, Durability assures that committed transactions are preserved and will survive permanently.

One of the enhancements made to overcome the scalability and performance problems of databases is a solution that prevents four (4) main sources of overhead: online failover, online failback, Local Area Network (LAN) partitioning and Wide Area Network (WAN) partitioning. Addressing these problems also requires a solution for write-ahead logging. One consideration is node speed, so that the system can accommodate greater scale by reducing network traffic and reducing the probability of hardware failures. If the database uses main-memory storage, is single-threaded and still provides durability, it can avoid locking/latching and can also avoid the log. NewSQL has this kind of ability: it preserves ACID and SQL but uses a new and different database architecture, implemented on a new platform, in order to provide performance and scalability. Its architecture provides much higher per-node performance, and it can handle the same workloads as NoSQL. Although NewSQL has these aptitudes, it is still new to the industry and needs further exploration.

Though NewSQL is good and has its own approach to consider, there are still a lot of users of NoSQL, and these users prefer NoSQL over both the old SQL systems and NewSQL. Users support the continued development of NoSQL despite questions about its maturity. "Enterprises want the reassurance that if a key system fails, they will be able to get timely and competent support. All RDBMS vendors go to great lengths to provide a high level of enterprise support... NoSQL databases are becoming an increasingly important part of the database landscape, and when used appropriately, can offer real benefits. However, enterprises should proceed with caution with full awareness of the legitimate limitations and issues that are associated with these databases."

Performance and scalability are two main things to contemplate in a Database Management System. Data partitioning is still the fundamental issue in high-performance database processing, and the data itself is getting more complex, including XML-based data, bio-informatics data and data streams. Another thing to consider is database security. According to Amichai Shulman, Co-founder and CTO of Imperva, Inc., there are ten database security threats: "Excessive Privilege Abuse, Legitimate Privilege Abuse, Privilege Elevation, Database Platform Vulnerabilities, SQL Injection, Weak Audit Trail, Denial of Service, Database Communication Protocol Vulnerabilities, Weak Authentication and Backup Data Exposure."

Excessive privilege abuse occurs when users (or applications) are granted database access privileges that exceed the requirements of their job function; these privileges may then be abused for malicious purposes. The solution to excessive privileges is query-level access control. Users may also abuse legitimate database privileges for unauthorized purposes. Attackers may take advantage of database platform software vulnerabilities to convert the access privileges of an ordinary user into those of an administrator. Vulnerabilities may be found in stored procedures, built-in functions, protocol implementations, and even SQL statements. Vulnerabilities in underlying operating systems (Windows 2000, UNIX, etc.) and in additional services installed on a database server may lead to unauthorized access, data corruption, or denial of service.

In a SQL injection attack, a perpetrator typically inserts (or "injects") unauthorized database statements into a vulnerable SQL data channel. Typically targeted data channels include stored procedures and Web application input parameters. These injected statements are then passed to the database, where they are executed. Using SQL injection, attackers may gain unrestricted access to an entire database.
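The following hedged sketch shows the injection mechanism and the standard parameterized-query defense in Python's sqlite3; the table and the "attacker input" are invented.

```python
# A sketch of the SQL injection risk described above, plus the
# parameterized-query defense; schema and input are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_user (name TEXT, secret TEXT)")
conn.execute("INSERT INTO app_user VALUES ('alice', 's3cret')")

malicious = "x' OR '1'='1"   # attacker-controlled web input

# VULNERABLE: string concatenation lets the input rewrite the statement,
# so the WHERE clause is always true and every row comes back.
rows = conn.execute(
    "SELECT * FROM app_user WHERE name = '" + malicious + "'").fetchall()
print(len(rows))  # 1 -- the injected predicate matched everything

# SAFE: a bound parameter is treated strictly as data, never as SQL.
rows = conn.execute(
    "SELECT * FROM app_user WHERE name = ?", (malicious,)).fetchall()
print(len(rows))  # 0 -- no user is literally named "x' OR '1'='1"
```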
Automated recording of all sensitive and/or unusual database transactions should be part of the foundation underlying any database deployment; a weak database audit policy represents a serious organizational risk on many levels. Denial of Service (DoS) is a general attack category in which access to network applications or data is denied to intended users. DoS conditions may be created via many techniques, many of which are related to the previously mentioned vulnerabilities. Database communication protocol attacks can be defeated with technology commonly referred to as protocol validation. Protocol validation technology essentially parses (disassembles) database traffic and compares it to expectations; in the event that live traffic does not match expectations, alerts may be raised or blocking actions taken.

Weak authentication schemes allow attackers to assume the identity of legitimate database users by stealing or otherwise obtaining login credentials, and an attacker may employ any number of strategies to obtain them. Lastly, there is database backup. Backup database storage media is often completely unprotected from attack; as a result, several high-profile security breaches have involved theft of database backup tapes and hard disks.

John Ottman, the author of Save the Database, Save the World, observes that many security approaches are network- or perimeter-based and not really centered on the database, and that this might be the biggest issue. "We have spent, as an industry, as a society, billions of dollars over the last 15 to 20 years on building security solutions for our infrastructure. Almost all of that has gone into network- and perimeter-oriented approaches. We have done some work with operating systems and spam and things like this. But it has really been focused on perimeter- and network-based security solutions and as our research shows, only 10% of databases have gotten that kind of focus so our message is that you have to protect the data where it lives - in the database. It is kind of like a bank locking the front door and leaving the bank vault open if you don't deal with the issue of database security," he said.

Ottman also said: "Obviously, a database administrator has universal access to the database and we have to have somebody who has universal access to manage the database. But the most common database security audit filing is a separation of duty violation, where database administrators are deemed to be privileged users who should have compensating control. In other words, it is theoretically possible for a database administrator to turn off audit and logging, do something to the database that might be nefarious - and then turn the audit back on, after they are done, so there are no footprints. That is a very typical SOX audit finding. And database activity monitoring is a pretty standard solution to resolve that. I think part of the issue is that the suggestion there is that database administrators are somehow the problem. Database administrators are actually the solution and there should be no thought of demonizing database administrators. They are critical to the solution set. But - compensating control of people who have privileged access to sensitive data is a critical issue in many regulatory filings." He also said that if we really want personally identifiable information to be protected, there has to be enforcement.

Database security professionals must also guard against inference. Basically, inference occurs when users are able to piece together information at one security level to determine a fact that should be protected at a higher security level.
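A small illustrative sketch of inference follows: two individually permitted aggregate queries combine to reveal a protected individual value. The schema and figures are invented.

```python
# An illustrative sketch of the inference problem: each aggregate query
# is individually allowed, yet together they reveal one person's salary.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary REAL)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [("alice", "it", 900.0), ("bob", "it", 700.0),
                  ("carol", "hr", 800.0)])

# Query 1 (permitted): total payroll of the IT department.
total_it = conn.execute(
    "SELECT SUM(salary) FROM employee WHERE dept = 'it'").fetchone()[0]

# Query 2 (permitted): total payroll of IT excluding one named person.
total_without_alice = conn.execute(
    "SELECT SUM(salary) FROM employee "
    "WHERE dept = 'it' AND name != 'alice'").fetchone()[0]

# Inference: subtracting the two exposes Alice's individual salary,
# a fact that should have been protected at a higher security level.
print(total_it - total_without_alice)  # 900.0
```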
IV. Finding and Analysis

DBMS Architecture - an "Open Problem" Even Up to Now

Database management systems are complex software systems which were often developed and optimized over years. Since the birth of the DBMS some thirty (30) years ago, DB researchers have faced the question of how to design a data-independent database management system, that is, a DBMS which offers an appropriate application programming interface (API) to the user and whose architecture is open for criticism, updates, refinement and innovation. For this purpose, an architectural model based on successive data abstraction steps over record-oriented data was proposed as a kind of standard, and it was later refined into a five-layer hierarchical DBMS model. Furthermore, we consider the interplay of the layered model with the transactional Atomicity, Consistency, Isolation and Durability (ACID) properties and outline the progress obtained.

In the seventies, the scientific discussion in the database (DB) area was dominated by heavy arguments concerning the most suitable data model. It essentially focused on the question of which abstraction level is appropriate for a DB application programmer. The network data model seems best characterized by "the more complex the pointer-based data structure, the more accurate is the mini-world representation." However, it offers only very simple operations, forcing the programmer to navigate through cursor-controlled data spaces. At that time, the decision concerning the most appropriate data model could be pinpointed to "record orientation and pointer-based, navigational use" versus "set orientation and value-based, declarative use." Far ahead of the common belief of his time, E. F. Codd taught us that simplicity is the secret of data independence - a property of the data model and of the DBMS implementing it. A high degree of data independence is urgently needed to let a system "survive" the permanent change in computer science in general and in the DB area in particular. Nowadays, however, important DBMS requirements also include data streams, unstructured or semi-structured documents, time series, spatial objects, and so on.

What were the recommendations to achieve the system properties for which the terms physical and logical data independence were coined? It is immediately clear that a monolithic approach to DBMS implementation is not very reasonable: it would mean mapping the data model functionality (e.g., SQL) in a single step to the interfaces offered by external storage devices, e.g., read/write block. Since the development of the DBMS, new system evolution requirements have been abundant: growing information demand led to enhanced standards with new object types, constraints, etc.; advances in research and development bred new storage structures and access paths; and the rapid change of the technologies used, especially Moore's law, had far-reaching consequences for storage devices, memory, connectivity (e.g., the Web), and so on.

Developing a hierarchically structured system offers the following important benefits:

• The implementation of higher-level system components is simplified by the use of lower-level system components.

• Lower-level system components are independent of functionality and modifications in higher-level system components.

• Testing of lower-level system components is possible before the higher system levels are put into use.

The resulting abstraction hierarchy hides some properties of a system level (an abstract machine) from higher-layer machines, while the implementation of higher-level operations extends the functionality of an abstract machine. System evolution is often restricted to the internals of such abstract machines, for example when a function implementation is replaced by a more efficient one. When new functionality extends their interfaces, the invocation of these operations implies "external" changes which are, however, limited to the next higher layer.
Description of the DBMS mapping hierarchy:

Level of abstraction                   | Objects                               | Auxiliary mapping data
Nonprocedural or algebraic access      | Tables, views, tuples                 | Logical schema description
Record-oriented, navigational access   | Records, sets, hierarchies, networks  | Logical and physical schema description
Record and access path management      | Physical records, access paths        | Free space tables, DB-key translation tables
Propagation control                    | Segments, pages                       | DB buffer, page tables
File management                        | Files, blocks                         | Directories, VTOCs, etc.

The architectural description embodies the major steps of dynamic abstraction from the level of physical storage up to the user interface. At the bottom, the database consists of huge volumes of bits stored on non-volatile storage devices, which are interpreted by the DBMS into meaningful information on which the user can operate. With each level of abstraction (proceeding upwards), the objects become more complex, allowing more powerful operations and being constrained by a growing number of integrity rules. The uppermost interface supports a specific data model, in our case declarative data access via SQL.

The bottom layer, called File Management, operates on the bit patterns stored on some external, non-volatile storage device. Often in collaboration with the operating system's file management, this layer copes with the physical characteristics of each type of storage device.

Propagation Control, the next higher layer, introduces different types of pages, which are fixed-length partitions of a linear address space mapped into physical blocks which are, in turn, stored on external devices by the file management layer. The strict distinction between pages and blocks offers additional degrees of freedom for the propagation of modified pages. For example, a page can be stored in different blocks during its lifetime in the database, thereby enabling atomic propagation schemes (supporting failure recovery based on logical logging). To effectively reduce physical I/O, this layer provides a (large) DB buffer which acts as a page-oriented interface (with fix/unfix operations) to the fraction of the DB currently resident in memory.

The Record and Access Path Management layer implements mapping functions much more complicated than those provided by the two subordinate layers. For performance reasons, the partitioning of data into segments and pages is still visible at this layer. It has to provide clustering facilities and maintain all physical object representations, that is, data records, fields, etc., as well as access path structures, such as B-trees, and internal catalog information. It typically offers a variety of access paths of different types to the navigational access layer. Especially with its clustering options and its provision of flexibly usable access paths tailored to the anticipated workloads, this layer plays a key role in overall DBMS performance.
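The following toy sketch mimics the fix/unfix discipline of the propagation-control layer just described; it is a simplification under stated assumptions, not a real buffer manager.

```python
# A toy sketch of propagation control: pages live in a bounded DB buffer
# with fix/unfix (pin/unpin) operations, and only unfixed pages may be
# evicted back to block storage. All names are illustrative.
class BufferManager:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.frames = {}       # page_no -> page bytes currently in memory
        self.fix_count = {}    # page_no -> number of users pinning the page
        self.storage = {}      # stand-in for blocks on non-volatile storage

    def fix(self, page_no: int) -> bytes:
        """Bring a page into the buffer and pin it so it cannot be evicted."""
        if page_no not in self.frames:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[page_no] = self.storage.get(page_no, b"\x00" * 4096)
        self.fix_count[page_no] = self.fix_count.get(page_no, 0) + 1
        return self.frames[page_no]

    def unfix(self, page_no: int) -> None:
        """Release one pin; an unpinned page becomes a victim candidate."""
        self.fix_count[page_no] -= 1

    def _evict(self) -> None:
        # Choose any unpinned page and propagate it back to storage.
        for page_no, count in self.fix_count.items():
            if count == 0 and page_no in self.frames:
                self.storage[page_no] = self.frames.pop(page_no)
                return
        raise RuntimeError("all pages are fixed; buffer too small")

buf = BufferManager(capacity=2)
page = buf.fix(7)   # page 7 is now resident and pinned
buf.unfix(7)        # page 7 may now be evicted when space is needed
```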
Extensions and Optimizations

While this explanation model of the DBMS architecture is still valid, enormous evolution and progress have been made during the last two decades concerning functionality, performance, and scalability. The fact that all these enhancements and changes could be absorbed by the proposed architecture is a strong indication that it is a salient DBMS model. We cannot elaborate on all extensions, let alone discuss them in detail, but we want to sketch some major improvements and changes.

Thirty (30) years ago, SQL - not yet standardized at that time - and the underlying relational model were simple. Today, we have to refer to SQL:2011 and an object-relational model, which are complex and not well understood in all their parts. Many of the new aspects and functions - such as user-defined types, type and table hierarchies, recursion, constraints, and triggers - have to be accommodated. While query translation and optimization initially started from solid foundations that enabled the integration of new mechanisms, and could be successfully improved, in particular by using refined statistics (especially histograms), some of the new language concepts turn out to be very hard to optimize. Furthermore, functionality for "arbitrary" join predicates, reuse of intermediate query evaluation results, sorting (internally usually optimized for relatively small sets of variable-length records in memory, as well as external sort/merge), etc. was improved and much better integrated. In particular, space-adaptable algorithms contribute great improvements and support load balancing and optimized throughput, even at high multiprogramming levels. Moreover, such machinery should not rely on special data preparation, and its optimal use should not require knowledge of system internals or expert experience. Furthermore, over-specialized use and tailoring to narrow applications do not promise practical success in DBMSs. Finally, many of these methods disregard the DBMS environment, where dependencies on locking and recovery, integration into optimizer decisions, and support for mixed and unexpected workload characteristics have to be considered.

The huge DB buffer capacities now available also facilitated buffer partitions, where each partition can be individually tailored to the anticipated locality behavior of a specific workload. Nevertheless, the buffering demands of VITA applications, considered in various projects, cannot reasonably be integrated, because the huge data volumes would have to be transferred through the layered architecture up to the application. OS people proposed various improvements in file systems, of which only some were helpful for DB management, e.g., distribution transparency. Log-structured files, for example, turned out to be totally unsuitable. Furthermore, there is still no transaction support available at this layer of abstraction. A lot of new storage technology was invented during the last thirty (30) years: disks of varying capacity, form and geometry, DVDs, WORM storage, electronic disks, etc. Their integration into our architectural model could be performed transparently as far as the standard file interfaces were concerned.

Architectural Variants

Up to now, we have intensively discussed the questions of data mapping and transactional support in a centralized DBMS architecture. In the last thirty (30) years, however, a variety of new data management scenarios emerged in the DBMS area.

Architectural Requirements

So far, our architectural layers perfectly match the invariants of set-oriented, record-like database management, such that they could be reused more or less unchanged in the outlined DBMS variants. However, recent requirements strongly deviate from this processing paradigm. The integration efforts of the last twenty (20) years were primarily based on a kind of loose coupling of components - called Extenders, DataBlades, or Cartridges - and a so-called extensibility infrastructure. Because these approaches could fulfill neither the demands for seamless integration nor the overblown performance and scalability expectations, future solutions may face major changes in the architecture.
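Before turning to the transactional rules, here is a hedged sketch of the external sort/merge technique mentioned in the optimization discussion above; the run size and data are invented, and real DBMS sort operators are far more sophisticated.

```python
# External sort/merge: sort runs that fit in memory, spill them to
# temporary files, then k-way merge the sorted runs.
import heapq
import tempfile

def external_sort(values, run_size=1000):
    runs = []
    # Phase 1: cut the input into memory-sized runs and sort each one.
    for i in range(0, len(values), run_size):
        run = sorted(values[i:i + run_size])
        f = tempfile.TemporaryFile(mode="w+")
        f.writelines(f"{v}\n" for v in run)
        f.seek(0)
        runs.append(f)
    # Phase 2: k-way merge of the sorted runs (heapq.merge is a k-way merge).
    merged = heapq.merge(*[(int(line) for line in f) for f in runs])
    return list(merged)

print(external_sort([5, 3, 8, 1, 9, 2], run_size=2))  # [1, 2, 3, 5, 8, 9]
```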
The Ten (10) Commandments of Database Management Systems (DBMS)

General Rules

1. Recovery based on logical logging relies on a matching operation-consistent state of the materialized DB at the time of recovery.

2. The lock granule must be at least as large as the log granule.

3. Crash recovery under non-atomic propagation schemes requires Redo Winners resp.

4. State logging requires a WAL (write-ahead logging) protocol (if pages are propagated before Commit).

5. Non-atomic propagation combined with logical logging is generally not applicable.

6. If the log granularity is smaller than the transfer unit of the system (block size), a system crash may necessitate media recovery.

7. Partial rollback within a transaction potentially violates the 2PL protocol.

8. Log information for Redo must be collected independently of measures for Undo.

9. Log information for Redo must be written at the latest in phase 1 of Commit.

10. To guarantee repeatability of results of all transactions using Redo recovery based on logical logging, their DB updates must be reproduced on a transaction basis (in single-user mode) in the original Commit sequence.
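As a toy illustration of the write-ahead logging discipline behind rules 4, 8 and 9, consider the following sketch; the structures are deliberately simplified assumptions, not a real recovery manager.

```python
# Write-ahead logging in miniature: the log record describing an update
# is appended (and would be flushed) BEFORE the data page may change.
log = []              # stand-in for the stable, append-only log file
database = {"x": 0}   # stand-in for the materialized DB on disk

def update(txn_id, key, new_value):
    old_value = database.get(key)
    # WAL rule: write the log record first ...
    log.append((txn_id, key, old_value, new_value))
    # ... only then may the modified page reach the database.
    database[key] = new_value

def undo(txn_id):
    """Roll a transaction back by replaying its log records in reverse."""
    for t, key, old_value, _ in reversed(log):
        if t == txn_id:
            database[key] = old_value

update("T1", "x", 42)
undo("T1")
print(database)   # {'x': 0} -- T1's effect has been undone
```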
Codd's Rules

There are thirteen rules (numbered 0 to 12) which Dr. E. F. Codd presented to define what a relational system must provide (his seminal paper on the relational model appeared in the Communications of the ACM in June 1970; the rules themselves were published in 1985).

Rule 0. Relational database management. "A relational database management system must use only its relational capabilities to manage the information stored in the database."

Rule 1. The information rule. All information in the database is to be represented in one and only one way, namely by values in column positions within rows of tables.

Rule 2. Guaranteed (logical) accessibility. This rule states the requirement for primary keys: every individual value in the database must be logically addressable by specifying the name of the table, the name of the column, and the primary key value of the row.

Rule 3. Systematic treatment of null values. The DBMS is required to support a representation of "missing information and inapplicable information" that is distinct from all regular values (for example, distinct from zero or any other number) and that is handled in a systematic way.

Rule 4. Dynamic online catalog. The description of the database (the catalog) must itself be stored relationally, and it must be accessible online to authorized users using the same relational query language they apply to ordinary data.

Rule 5. Comprehensive data sublanguage. The system must support at least one relational language (it may support more than one) that (a) has a linear syntax, (b) can be used both interactively and within application programs, and (c) supports data operations, security and integrity constraints, and transaction management operations (such as commit).

Rule 6. View updating. All views that are theoretically updatable must be updatable by the system.

Rule 7. High-level insert, update, and delete. The system must support INSERT, UPDATE, and DELETE operators.

Rule 8. Physical data independence. Changes to the physical level (how the data is stored, whether in arrays or linked lists, etc.) must not require a change to an application based on the structure.

Rule 9. Logical data independence. Changes made to tables that preserve the stored information must not require changes to application programs. Logical data independence is more difficult to achieve than physical data independence.

Rule 10. Integrity independence. Integrity constraints must be specified separately from application programs and stored in the catalog. It must be possible to change such constraints as appropriate without unnecessarily affecting existing applications.

Rule 11. Distribution independence. The RDBMS may be spread across more than one system and across several networks; however, the tables should appear the same to every user as they do to local users.

Rule 12. The nonsubversion rule. If the system provides a low-level interface, that interface cannot be used to subvert the system, e.g., by bypassing a relational security or integrity constraint.

What is the DBMS in the Past? In the Present? And in the Future?

In the late 1800s, Thomas Edison and George Westinghouse became embroiled in what has become known as "The War of the Currents." Edison had invested heavily in infrastructure supporting the use of direct current for the distribution of electricity. Westinghouse, having bought patents on the inventions of Tesla, advocated alternating current. For a short period of time, there were two sets of infrastructure that operated under different assumptions about how power should be transported and consumed. Fortunately, the technology was young and the infrastructure was immature, so the cost of competing standards was relatively low.

Fast-forward about a hundred years: data is the power that runs a modern business. When consumers of that power have different assumptions, a transformation is required, and very bad things (including the loss of infrastructure investment) can happen if that transformation is not carefully planned. Currently, there are at least three major data management paradigms in use - ISAM, SQL/relational, and XML - with XML poised to explode. Each makes different assumptions about how data models the organization's view of the real world. When different models try to operate on the same source of power, they must reconcile these differences, and the process of reconciling these assumptions can result in data loss, performance degradation, system fragility, or feature unavailability. The ideal solution is for every consumer to enjoy native and natural access to the power source, without one model compromising another.

Because data is the power for business, DBMSs (and their associated processes) tend to evolve more slowly than other, less mission-critical segments of computing infrastructure. In fact, database management models have remained relatively unchanged through the emergence and explosion of the Internet. It usually requires a significant shift in business practice to engender any change in database management. Distributed computing (such as web services) appears to be one such shift; it is growing rapidly, and many major vendors expect distributed computing to be the next major paradigm in application development. XML is establishing the system for distributed data management, and this system does not fit neatly with the assumptions made by existing infrastructure.

Database Concepts of Concern

The first assumptions involve three relatively basic (and familiar) database concepts: entities, attributes, and relationships. Each model addresses collections, operational efficiency, and the relationship of these concepts to the rest of the computing environment. There are other issues involved in data management (concurrency, operation atomicity, relational integrity, and so on), but these issues are not necessarily differentiating factors between models.

The first basic concept is that of the entity - the thing that is being stored, representative of something in the external world, such as a customer, invoice, or inventory item. It may be thought of as the most granular representation of data that retains context. The question of what exactly comprises an entity is more frequently resolved through data/business analysis than through application of the normalization formulas found in database textbooks.

The second concept is that of the attribute - a descriptor of an entity. Depending on your particular prejudices, you may think of attributes as fields or columns. Attributes rely on the entity for context.

The third concept is that of relationships. A customer entity and three order entities are useless in a business process unless you have some way of making their relationship persistent - that is, some way of denoting that the order entities were made by the person represented by the customer entity. In relational theory, this can be represented by foreign key relationships.
The first basic concept is that of entity - the thing that is being stored and is representative of something in the external world, such as a customer, invoice, or inventory item. It may be thought of as the most granular representation of data that retains context. The question of what exactly comprises an entity is more frequently resolved through data/business analysis than through application of the normalization formulas found in database textbooks. The second concept is that of attribute - a descriptor of an entity. Depending on your particular prejudices, you may think of attributes as fields or columns. Attributes rely on the entity for context. The third concept is that of relationships. A customer entity and three order entities are useless in a business process unless you have some way of making their relationship persistent; that is, some way of denoting that the order 31 entities were made by the person represented by the customer entity. In relational theory, this can be represented by foreign key relationships. Obviously, a database would not be of much use if it allowed you to have one customer entity, one order entity, and one inventory entity. It would be almost as useless if it did not let you access the collection of customer entities independently from the collection of order entities. Although a flat text file could act as the basis of these concepts, the practical requirement is that specific information can be quickly and efficiently retrieved. It is critical that the performance does not degrade as more entities are added. This performance requirement is most often met through the use of indexes or keys. Finally, there is the relationship between the database system and the rest of the computing environment - in particular, the operating system and application. Early on, the database paradigm was represented by a set of procedures and coding standards that dictated how a particular shop's application code interacted with operating-system code. The evolution of database systems has seen at least one consistent trend: the abstraction of the database paradigm from application and operating-system constraints and the encapsulation of that abstraction within a database infrastructure. 32 In other words, relational database application developers typically no longer worry about the offset and length of a particular attribute or the particular OS file that contains the attribute data. Those issues are abstracted within a database infrastructure that generally is viewed as, if not a black box, then a really, really dirty one with very tiny windows. Stages of Database Evolution Database technology has evolved through several stages, including ISAM, SQL/Relational, and XML: ISAM. Although ISAM has not been formally standardized as a data model, thanks to the dominance of Cobol and the effect of that dominance on database management, there is a common set of well-understood expectations for an ISAM DBMS. In the ISAM paradigm, entities are records. Attributes are understood to be data stored starting at a specific offset for a specific length. The application is responsible for maintaining relationships, usually performed in much the same way as the relational model, where entities are collected in OS files, and the application (and thus the developer) is responsible for knowing which set of records is in which file. 
The application can include multiple types of records in one file, but any differences in entity type within a file must be implemented, understood, and 33 maintained by the application. The DBMS does not understand any distinction between different entity types within the same entity collection. Efficiency is achieved through the use of indexes. Since the DBMS is responsible for maintaining index information and the DBMS does not make any distinction between entity types within an entity collection, an ISAM file indexes the same attributes for an entire collection. This can result in added responsibility for the application if multiple entity types are in the same collection. Furthermore, since the DBMS is unaware of any nonindexed attributes of the entity, the same entity can be viewed as having several different compositions, and there is no guarantee that the attribute indexed by the DBMS is an attribute that is meaningful to the application. An ISAM application acts as if it is operating on the physical representation of the record (which it is, in most implementations). As a result, much of the database management of the ISAM paradigm is closely tied to both the operating system and the application. SQL/Relational. From a theoretical standpoint, the SQL paradigm and relational model are not synonymous. In fact, SQL can be used to build result sets that do not meet relational requirements. However, the average computing professional is not interested in purely theoretical DBMSs, and when most people use a relational database, they are almost invariably using SQL to manipulate the 34 data (whether directly or under the covers, as is often the case with ADO). Collections of entities and attributes may be arbitrarily defined at run time through SQL predicates (as well as through views). Relationships are persisted in much the same way as the ISAM model (the constraints, such as primary key uniqueness, are formalized in the relational model, but the ISAM model is similar in practice). To summarize, the relational model abstracts the database from the operating system and to some extent from the application. There is no longer an exploitable interaction between the operating system and the DBMS. Furthermore, while the application may have foreknowledge of the database composition, it is incapable of using that knowledge in a manner that is not understood beforehand by the DBMS. An application can also be written that derives all of its information about the database at run time, which is certainly not the case with the ISAM paradigm. This abstraction frees the application and the database administrator from a number of concerns regarding the internals of data management, but it also demands that the application conform to the expectations of the model. The .NET framework is an example of the pipeline approach. The architecture proposed by Microsoft has a SQL database (which generally performs pipelined ISAM atomics) returning query results as a disconnected, inmemory database. This database can then be transformed as needed or viewed 35 as XML or as a record set. The power in this is the flexibility for developers; the weaknesses involve the performance and concurrency issues of the multilayered disconnected approach, the mapping that must be performed beforehand, and the inability to adapt quickly to changes without losing data. Architectural issues With all the solutions currently being offered, how do you know which is right? 
Much of the decision involves your need to communicate with legacy databases. If you know that your need to interact with XML data is isolated from your need to work with your legacy relational data, then an XML database solution most directly addresses the paradigm in which you plan to work. The most common cases, though, involve integration of XML data with legacy databases. Inertia suggests that the majority of adopted solutions are going to be based on the current solution offered by the vendor of the existing legacy store. The problem with this is that some of the existing solutions, such as the pipeline approach, will experience scalability problems that may not be apparent upon initial deployment.

What are the security issues in database management systems?

There are several co-related activities in the database area and in computer architecture that make the discussion of database machines and their implications for DBMS standards timely and meaningful. First, in the database area there is a drive toward more powerful database management systems which support high-level data models and languages. The motive for this drive is the requirement to greatly improve user/programmer productivity and to protect applications from changes in the user environment. However, supporting these interfaces by software means often introduces inefficiency into database management systems, because many levels of complex software are required to map the high-level data representations and languages to the low-level storage representation and machine code.

Second, the need for systems which handle very large databases is increasing rapidly. Very large databases complicate the problems of retrieval, update, data recovery, transaction processing, integrity, and security. Software solutions to these problems work well both for small databases supporting many applications and for large databases supporting only a few applications. However, the labor-intensive cost, time delays and reliability problems associated with software development and maintenance will soon become prohibitive as large and highly shared databases emerge. The search for hardware solutions to these problems is a necessary and viable alternative for balancing functionality and price/performance.

Third, the progress made in hardware technology in the past decade is phenomenal. The cost of memories, processors, terminals and communication devices has dropped, and will continue to drop, at a drastic rate. It is time for a re-evaluation of the traditional roles of hardware and software in solving the database management problems of today and tomorrow.

Database Security Issues

Daily Maintenance: Database audit logs require daily review to make certain that there has been no data misuse. This requires overseeing database privileges and consistently updating user access accounts. A database security manager also provides different types of access control for different users and assesses new programs that will operate on the database. If these tasks are performed on a daily basis, you can avoid a lot of problems with users that may pose a threat to the security of the database.

Varied Security Methods for Applications: More often than not, application developers will vary the methods of security for the different applications that are utilized within the database. This can create difficulty in creating policies for accessing the applications.
The database must also possess the proper access controls for regulating the varying methods of security; otherwise sensitive data is at risk.

Post-Upgrade Evaluation: When a database is upgraded, it is necessary for the administrator to perform a post-upgrade evaluation to ensure that security is consistent across all programs. Failure to perform this operation opens up the database to attack.

Split the Position: Sometimes organizations fail to split the duties between the IT administrator and the database security manager. Instead, the company tries to cut costs by having the IT administrator do everything. This can significantly compromise the security of the data because of the responsibilities involved in both positions. The IT administrator should manage the database while the security manager performs all of the daily security processes.

Application Spoofing: Hackers are capable of creating applications that resemble the existing applications connected to the database. These unauthorized applications are often difficult to identify, and they allow hackers access to the database via the application in disguise.

Manage User Passwords: Sometimes IT database security managers forget to remove the IDs and access privileges of former users, which leads to password vulnerabilities in the database. Password rules and maintenance need to be strictly enforced to avoid opening up the database to unauthorized users.

Windows OS Flaws: Windows operating systems are not effective when it comes to database security. Theft of passwords is prevalent, as are denial-of-service issues. The database security manager can take precautions through routine daily maintenance checks.

These are just a few of the database security problems that exist within organizations. The best way to avoid many of them is to employ qualified personnel and to separate the security responsibilities from the daily database maintenance responsibilities.

NoSQL - Current Trends and Issues in DBMS

Databases that would today be called "NoSQL" - that is, non-relational databases - already existed and were in use in the late 1960s, but they were not yet popular. Today, NoSQL databases are popular on the market because companies and organizations that used relational databases have been shifting to NoSQL. Some sources say that NoSQL databases are better to use than relational databases because they do not adopt ACID and SQL. According to Guy Harrison, "non-relational, 'cloud,' or 'NoSQL' databases are gaining mindshare as an alternative model for database management," because they offer elastic scalability, can handle big data, require less management, and use cheaper commodity servers when handling many transactions. "Their primary advantage is that, unlike relational databases, they handle unstructured data such as word-processing files, e-mail, multimedia, and social media efficiently."

There are a lot of NoSQL databases, with different approaches. Either way, "Relational databases are based on Edgar F. Codd's relational data model which assumes strictly structured data. The whole SQL language is constructed around this model and the databases which implement it are optimized for working that way. But in the past few years, there were attempts to add features to SQL which allow to work with unstructured data, like the SQL/XML extension which allows to store XML documents in fields of SQL tables and query their document-trees transparently. Document-oriented databases like MongoDB or CouchDB, on the other hand, were designed from the start to work with unstructured data and their query languages were designed around this concept, so when working with unstructured data they are usually much faster and more convenient to use."
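As a toy illustration of the document-oriented idea, the following sketch stores schema-free documents and matches them by field; the API shape loosely echoes document stores like MongoDB but is invented here.

```python
# A toy document store: records are schema-free documents, and two
# documents in the same collection need not share any fields.
class DocumentStore:
    def __init__(self):
        self.docs = []

    def insert(self, doc: dict) -> None:
        self.docs.append(doc)          # no fixed schema is enforced

    def find(self, query: dict) -> list:
        """Return documents whose fields match every key/value in query."""
        return [d for d in self.docs
                if all(d.get(k) == v for k, v in query.items())]

db = DocumentStore()
db.insert({"type": "email", "to": "alice@example.com", "subject": "hi"})
db.insert({"type": "post", "author": "bob", "tags": ["db", "nosql"]})

print(db.find({"type": "post"}))       # unstructured records coexist
```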
Five Advantages of NoSQL

1: Elastic scaling. For years, database administrators have relied on scaling up - buying bigger servers as database load increases - rather than scaling out - distributing the database across multiple hosts as load increases. However, as transaction rates and availability requirements increase, and as databases move into the cloud or onto virtualized environments, the economic advantages of scaling out on commodity hardware become irresistible. RDBMS might not scale out easily on commodity clusters, but the new breed of NoSQL databases are designed to expand transparently to take advantage of new nodes, and they are usually designed with low-cost commodity hardware in mind.

2: Big data. Just as transaction rates have grown out of recognition over the last decade, the volumes of data being stored have also increased massively. O'Reilly has cleverly called this the "industrial revolution of data." RDBMS capacity has been growing to match these increases, but as with transaction rates, the constraints on the data volumes that can practically be managed by a single RDBMS are becoming intolerable for some enterprises. Today, the volumes of "big data" that can be handled by NoSQL systems, such as Hadoop, outstrip what can be handled by the biggest RDBMS.

3: Goodbye DBAs (see you later?). Despite the many manageability improvements claimed by RDBMS vendors over the years, high-end RDBMS systems can be maintained only with the assistance of expensive, highly trained DBAs. DBAs are intimately involved in the design, installation, and ongoing tuning of high-end RDBMS systems. NoSQL databases are generally designed from the ground up to require less management: automatic repair, data distribution, and simpler data models lead to lower administration and tuning requirements - in theory. In practice, it is likely that rumors of the DBA's death have been slightly exaggerated; someone will always be accountable for the performance and availability of any mission-critical data store.

4: Economics. NoSQL databases typically use clusters of cheap commodity servers to manage the exploding data and transaction volumes, while RDBMS tends to rely on expensive proprietary servers and storage systems. The result is that the cost per gigabyte or per transaction/second for NoSQL can be many times less than the cost for RDBMS, allowing you to store and process more data at a much lower price point.

5: Flexible data models. Change management is a big headache for large production RDBMS. Even minor changes to the data model of an RDBMS have to be carefully managed and may necessitate downtime or reduced service levels. NoSQL databases have far more relaxed - or even nonexistent - data model restrictions. NoSQL key-value stores and document databases allow the application to store virtually any structure it wants in a data element. Even the more rigidly defined BigTable-based NoSQL databases (Cassandra, HBase) typically allow new columns to be created without too much fuss. The result is that application changes and database schema changes do not have to be managed as one complicated change unit.
In theory, this will allow applications to iterate faster, though, clearly, there can be undesirable side effects if the application fails to manage data integrity.

Five Challenges of NoSQL

The promise of the NoSQL database has generated a lot of enthusiasm, but there are many obstacles to overcome before NoSQL can appeal to mainstream enterprises. Here are a few of the top challenges.

1: Maturity. RDBMS systems have been around for a long time. NoSQL advocates will argue that their advancing age is a sign of their obsolescence, but for most CIOs, the maturity of the RDBMS is reassuring. For the most part, RDBMS systems are stable and richly functional. In comparison, most NoSQL alternatives are in pre-production versions with many key features yet to be implemented. Living on the technological leading edge is an exciting prospect for many developers, but enterprises should approach it with extreme caution.

2: Support. Enterprises want the reassurance that if a key system fails, they will be able to get timely and competent support. All RDBMS vendors go to great lengths to provide a high level of enterprise support. In contrast, most NoSQL systems are open source projects, and although there are usually one or more firms offering support for each NoSQL database, these companies are often small start-ups without the global reach, support resources, or credibility of an Oracle, Microsoft, or IBM.

3: Analytics and business intelligence. NoSQL databases have evolved to meet the scaling demands of modern Web 2.0 applications, and consequently most of their feature set is oriented toward the demands of those applications. However, data in an application has value to the business that goes beyond the insert-read-update-delete cycle of a typical Web application. Businesses mine information in corporate databases to improve their efficiency and competitiveness, and business intelligence (BI) is a key IT issue for all medium to large companies. NoSQL databases offer few facilities for ad hoc query and analysis: even a simple query requires significant programming expertise, and commonly used BI tools do not provide connectivity to NoSQL. Some relief is provided by the emergence of solutions such as Hive or Pig, which can provide easier access to data held in Hadoop clusters and, perhaps eventually, in other NoSQL databases. Quest Software has developed a product - Toad for Cloud Databases - that can provide ad hoc query capabilities to a variety of NoSQL databases.

4: Administration. The design goal for NoSQL may be a zero-admin solution, but the current reality falls well short of that goal. NoSQL today requires a lot of skill to install and a lot of effort to maintain.

5: Expertise. There are literally millions of developers throughout the world, in every business segment, who are familiar with RDBMS concepts and programming.
NoSQL systems part with more than just declarative queries over relational data. Transactional semantics, consistency, and durability are guarantees that organizations such as banks demand of databases. Transactions provide an all-or-nothing guarantee when combining several potentially complex operations into one, such as deducting money from one account and adding the money to another. Consistency ensures that when a value is updated, subsequent queries will see the updated value. Durability guarantees that once a value is updated, it will be written to stable storage (such as a hard drive) and be recoverable if the database crashes.
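A minimal sketch of that all-or-nothing guarantee follows, using Python's standard-library sqlite3 module (an RDBMS, used here only because it is self-contained and runnable anywhere); the accounts table, names, and balances are invented for the example.

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    con.executemany("INSERT INTO accounts VALUES (?, ?)",
                    [("alice", 100), ("bob", 0)])
    con.commit()

    try:
        with con:  # opens a transaction; commits on success, rolls back on error
            con.execute("UPDATE accounts SET balance = balance - 150 "
                        "WHERE name = 'alice'")
            # Enforce the business rule in code; a real schema might use
            # a CHECK constraint instead.
            (bal,) = con.execute("SELECT balance FROM accounts "
                                 "WHERE name = 'alice'").fetchone()
            if bal < 0:
                raise ValueError("insufficient funds")
            con.execute("UPDATE accounts SET balance = balance + 150 "
                        "WHERE name = 'bob'")
    except ValueError:
        pass  # the rollback leaves both balances untouched

    print(con.execute("SELECT * FROM accounts").fetchall())
    # [('alice', 100), ('bob', 0)] -- neither update took effect

Reproducing this behavior by hand on a store that lacks multi-record transactions is precisely the kind of application-level burden described above, and it is one reason organizations such as banks have been slow to move such workloads off the RDBMS.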
V. Conclusion

No database solution is perfect, whether it follows the oldest trend or the newest, because there is no finite solution or definitive answer for every DBMS. The reason is that innovation, and the current trends and issues in technology, change every minute and every second.

Organizations have a great deal of investment in infrastructure incorporating the ISAM and relational models. There are a number of competitive advantages to be gained from distributed computing (such as web services), and the common language of distributed computing is XML. The problem is that the XML model makes different assumptions about data than the ISAM and relational models do. The result is that businesses are now tasked with adapting existing infrastructure to a new, incompatible data model more quickly than ever before. There are several ways to accomplish this, but each has drawbacks. Some of these drawbacks are more likely than others to show up only as the system scales outward; other drawbacks are more obvious. It is therefore essential that the integration of distributed data not merely coast along the path of least resistance, but proceed in the manner best suited to the needs of the business.

There are several possible futures. As unlikely as it seems, distributed computing may turn out to be a fad. One particular mechanism of model adaptation may improve to the point where it satisfactorily addresses the needs of most businesses. One of the models may evolve to accept the assumptions of the other models, making it the reference model. On the other hand, the importance of distributed computing may simply force businesses to accept the cost of inefficient data transformation. An ideal solution is a DBMS that can apply the constraints of any particular model to the underlying data, allowing existing infrastructure to perform at current levels while providing native and natural support for new models as needed.

VI. Recommendation

NoSQL databases are becoming an increasingly important part of the database landscape and, when used appropriately, can offer real benefits. However, enterprises should proceed with caution, with full awareness of the legitimate limitations and issues associated with these databases. For a quarter of a century, the relational database (RDBMS) has been the dominant model for database management, but today non-relational, "cloud," or "NoSQL" databases are gaining mindshare as an alternative model. This paper has examined ten key aspects of these non-relational NoSQL databases: the top five advantages and the top five challenges.

Clearly, the RDBMS is not, as of now, the final frontier or the definitive solution to the trends and issues surrounding databases. We know for a fact that the RDBMS has been evolving since its development, yet some problems and issues remain unsolved by the system. A newer class of distributed databases aims to be deployable in any datacenter, in any cloud, anywhere, without the compromises inherent in other NewSQL solutions, and to eliminate the need for complex database workarounds like clustering, performance tuning, and sharding that are typically associated with bringing applications to the cloud.

As the researcher, I therefore recommend up-to-date tracking, continuous evolution, and professional growth for every DBMS, not only in industry and in the education sector, but also in the heart of every information technologist and computer scientist, and among all the people responsible for the evolution of the DBMS.