Database Systems: “Breaking Out of the Box”
Avi Silberschatz, Bell Laboratories
Stan Zdonik, Brown University
July 7, 1997
Mehmet Uner

The Paper’s Theme (Strategic Directions)
1) Database research should be devoted to the problems of data management, no matter where and in what form the data might be found.
2) Database management skills should be applied to new data management environments that potentially require radically new software architectures.

Outline
Introduction
Background
Our Skills
Scenarios
Barriers
Research
Conclusions
References

Introduction
The field of database systems research and development has been very successful over its 30-year history.
It has led to a $10 billion industry that touches virtually every major company in the world.
It is unthinkable to manage the large volume of valuable information that keeps corporations running without support from commercial database management systems (DBMS).
A DBMS is a very complex system incorporating a rich set of technologies.
It is suited to solving problems of large-scale data management in the corporate setting.

DBMS
DBMS requirements:
– Execution overhead.
– High level of expertise to install and maintain.
– Only manages data in fairly specific file formats.

Solution
At the same time:
– Data is changing rapidly.
– Data is stored in different places (e.g. files).
– Data is obtained in large volumes from external sources such as sensors.
Solution:
– Not a full-blown DBMS, but a lighter-weight solution.
– Instead of using an existing tool in a new application, it is better to embed reusable components.
– Use database system components, techniques and experience in new ways.

Examples
Some examples that could benefit from data management techniques but that typically do not make heavy use of database products:
– World Wide Web
– Personal Information Systems (e-mail)
– News Services
– Scientific Applications

Background
The database field was born with the release of IMS in the 1960s.
– IBM product
– Managed data as hierarchies
– Data has value and should be managed independently of the application
CODASYL, the best-known successor
– Based on a graph-based structure
Ted Codd published a paper in 1970
– Suggested the relational model

Background
Object-oriented principles in the 1980s
– Allow users to create their own application-specific types that can be managed by the DBMS.
Hybrid model in the 1990s
– Embeds object-oriented features in a relational context.

Our Skills
Database management systems have been concerned with the following problems:
– High performance
– Correctness
– Maintainability
– Reliability
These are approached from the point of view of slow memory devices that must be shared by multiple concurrent users.
This approach leads to a set of skills and techniques that can be applied and extended to other problems.

Skills and Techniques
Data Modeling
– Language for defining the structure of the database
– Language for manipulating those structures
Query Languages
– High-level language to retrieve data from the database (SQL)
Query Optimization and Evaluation
State-based Views
– Restricted and reorganized view of the database

Skills and Techniques
Data Management
– Automatic maintenance of data structures
– Efficient movement of data
Transactions
– A response to correctness problems introduced by concurrent access and update
Distributed Systems
Scalable Systems
– Database systems have been tuned to efficiently and reliably handle data volumes that exceed the size of physical memory by several orders of magnitude.

Scenarios
The way forward for future data management systems.
The technology that would support these scenarios constitutes a research agenda for the next decade.
1) Instant Virtual Enterprise
2) Personal Information Systems

Instant Virtual Enterprise
An “instant virtual enterprise” (IVE) is a group of companies that do not routinely function as a unit.
They come together to respond to a customer order or a request for proposal.
Computer-integrated manufacturing (CIM) is an example of an environment requiring IVE cooperation.
Engineering side
– Design, Production, Quality Assurance
Administrative side
– Planning, Production Control, Resource Management

Instant Virtual Enterprise
Companies in an IVE need to exchange and manage large amounts of data.
Companies will have many heterogeneous databases.
Sharing and exchanging data with coordinating information is critical.

IVE Scenario
Building an oil pipeline:
– Company A: building an oil pipeline
– Company Q: engineering firm (IVE)
– Company R: licenses their design
– Company S: engineering analysis

IVE Scenario
– Company T: actual fabrication
– Company U: casting
– Company V: design file conversion service
– Company W: documentation and archiving

IVE Scenario
Database capabilities needed:
– Executing a query for the design
– Data translation services for engineering analysis
– Coordination and configuration management
– Changes to an object in one subsystem require changes to one or more related objects in other subsystems.
– Security and access control over the information
– Archiving of information, even after the IVE disbands

Personal Information Systems Scenario
Provides information to an individual.
Uses a PID (Personal Information Device)
– PDA
– Handheld PC
– Laptop
Equipped with a wireless network connection.
Access to the Internet anywhere, anytime.

Personal Information Systems Scenario
Tightly integrated with the individual’s activities.
From morning to bedtime.
In the morning
– Local weather report
– List of reminders
– List of morning meetings
– Best route from home to work
– Personalized headlines
– Personalized investment report

Personal Information Systems Scenario
Throughout the day
– Tasks for the day
– List of customers to contact
– Summary of breaking news
– Best driving routes in the city
At the end of the day
– Next day’s activities
– Appointments

Personal Information Systems Scenario
The PID must continuously query remote databases and monitor broadcast information.
The PID will magnify today’s client-server performance, scalability and reliability problems.
Where should data reside: on the PID or on the server?

Barriers
A DBMS provides a tightly controlled and highly uniform environment.
For the new applications, database functionality should be provided outside the limits of a DBMS.
To realize the vision represented in the scenarios, a number of technical barriers must be removed.

Barriers
Overhead
– System requirements, expertise, planning, monetary cost
– The builder of a personalized newspaper service does not use a DBMS because there is no need for many of the advanced features.
– Only a subset of the traditional database services is needed by many new applications.
Scale
– Greater volume of data (petabytes)
– Hundreds of servers, with an even larger client population

Barriers
Schema Organization
– Traditionally, a schema is first created to describe the structure of the database, and then the database is populated.
– Many applications currently create data independently of a database system (scientific applications, web sites).
– The schema is incomplete or inconsistent.
– Schema management facilities are needed to adapt to the dynamic nature of foreign data.
Data Quality
– Information accessed from a WAN may be of varying quality.
– Future information systems must be able to react to the quality of the data source.
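
The Schema Organization barrier above can be made concrete with a small sketch. It assumes foreign data arrives as loosely structured records (plain Python dictionaries here) that were created without any database schema, and it derives an after-the-fact description of the fields it sees. The function name `infer_schema`, the record layout and the sensor readings are all hypothetical, chosen only for illustration; nothing here comes from the paper.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Set


def infer_schema(records: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Summarize the fields seen in a collection of loosely structured records.

    For every field, report the value types that occurred and whether the
    field was missing from at least one record (i.e. it is optional).
    """
    type_names: Dict[str, Set[str]] = defaultdict(set)
    counts: Dict[str, int] = defaultdict(int)
    total = 0
    for record in records:
        total += 1
        for field, value in record.items():
            type_names[field].add(type(value).__name__)
            counts[field] += 1
    return {
        field: {
            "types": sorted(type_names[field]),
            "optional": counts[field] < total,
        }
        for field in sorted(type_names)
    }


# Hypothetical sensor readings produced without any database schema.
readings = [
    {"station": "A1", "temp_c": 21.5},
    {"station": "B7", "temp_c": "n/a", "humidity": 40},
    {"station": "C3"},
]

schema = infer_schema(readings)
for field, description in schema.items():
    print(field, description)
```

A field that shows up with more than one type (`temp_c` above) or that is missing from some records is exactly the kind of incompleteness and inconsistency the slide mentions; a schema management facility would have to tolerate it rather than reject the data.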

Barriers
Heterogeneity
– Data exists in many forms.
– These dissimilar formats must be integrated to allow applications to access data in a high-level and uniform way.
Query Complexity
– Different characteristics in future environments
• Conventional: minimize the number of disk accesses
• Future: minimize the total “information bill”

Barriers
Ease of Use
– A highly trained, full-time staff is assumed to manage a DBMS.
– Yet most users have no training in database technology.
– A simple set of interfaces is needed.
Security
– As the amount of shared information grows, the need arises to restrict access to specific users or to specific uses.

Barriers
Guaranteeing Acceptable Outcomes
– Transaction management is a barrier to both system performance and the ability to specify acceptable outcomes.
– New or enhanced transaction technology is needed.
– Making data unavailable is not acceptable.
– Aborting transactions is unacceptable.
Technology Transfer
– Barrier between research and industry
• Insufficient knowledge of each other

Research
In order to achieve the vision and overcome these barriers, a number of central research topics must be addressed:
– Extensibility and Componentization
– Imprecise Results
– Schemaless Databases
– Ease of Use
– New Transaction Models
– Query Optimization
– Data Movement
– Security
– Database Mining

Research
– Extensibility and Componentization
• Build the DBMS in a modular way
• Lighter-weight applications
– Imprecise Results
• Web search engines do not provide 100% accuracy
• A general theory of imprecision must be developed
– Schemaless Databases
• Able to work with unstructured data

Research
– Ease of Use
• Better database interfaces are required.
– New Transaction Models
• Overcome blocking.
• Provide correctness.
– Query Optimization
• New indexing methods and query processing strategies.
• Cheaper, but with slower response time.
• Sensitive to bandwidth and power considerations.
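
The shift from minimizing disk accesses to minimizing a total “information bill” can be sketched as a cost model over candidate query plans. This is only an illustration under stated assumptions: the cost components (disk accesses, bytes transferred, a per-query access fee, and time spent waiting), the `Plan` and `Prices` types, and every number below are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Plan:
    name: str
    disk_accesses: int      # local page reads
    bytes_transferred: int  # data pulled over the network
    access_fee: float       # charge levied by the remote source
    latency_s: float        # expected response time in seconds


@dataclass(frozen=True)
class Prices:
    per_disk_access: float
    per_megabyte: float
    per_second_waiting: float


def information_bill(plan: Plan, prices: Prices) -> float:
    """Total cost of a plan: disk, network, source fees and waiting time."""
    megabytes = plan.bytes_transferred / 1_000_000
    return (
        plan.disk_accesses * prices.per_disk_access
        + megabytes * prices.per_megabyte
        + plan.access_fee
        + plan.latency_s * prices.per_second_waiting
    )


def cheapest(plans: Iterable[Plan], prices: Prices) -> Plan:
    """Pick the plan with the lowest information bill under the given prices."""
    return min(plans, key=lambda p: information_bill(p, prices))


plans = [
    Plan("fetch_all_then_filter", disk_accesses=10,
         bytes_transferred=50_000_000, access_fee=0.0, latency_s=2.0),
    Plan("filter_at_source", disk_accesses=200,
         bytes_transferred=1_000_000, access_fee=0.25, latency_s=6.0),
]

# A conventional optimizer would compare disk accesses only.
conventional_choice = min(plans, key=lambda p: p.disk_accesses)

# Hypothetical price lists: a metered wireless link versus a free wired link
# where the user's waiting time is what matters most.
wireless = Prices(per_disk_access=0.0001, per_megabyte=0.05, per_second_waiting=0.01)
wired = Prices(per_disk_access=0.0001, per_megabyte=0.0, per_second_waiting=0.5)

print(conventional_choice.name)
print(cheapest(plans, wireless).name)
print(cheapest(plans, wired).name)
```

With these made-up prices, the disk-only comparison always prefers the plan that ships all the data, while the bill-based comparison switches to filtering at the source once bandwidth is expensive, accepting a slower response in exchange for a lower cost.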

Research
– Data Movement
• In a distributed environment, the cost of moving data can be extremely high.
• Asymmetric communication channels (low-bandwidth lines)
– Security
• Formulation of an authorization model
• Interoperability between different security policies
– Database Mining
• Machine learning
• Statistical analysis
• Database technologies

Conclusions
Database research must be broadly defined.
The database community must apply its experience and expertise to new areas, and new solution packages must be found.
The vision is an integration that supports the application of database functionality in small modules that give just the right capability.
These modules should also represent a unified theory of information that allows for the querying of information of all types without having to switch languages or paradigms.

References
E. F. Codd, “A Relational Model of Data for Large Shared Data Banks,” Communications of the ACM, 13:6 (June 1970), pp. 377-387.
J. Gray, http://www.cs.washington.edu/homes/lazowska/cra/database.html
A. Silberschatz, M. Stonebraker, and J. Ullman, “Database Systems: Achievements and Opportunities,” SIGMOD Record, 19:4, pp. 6-22.
A. Silberschatz, M. Stonebraker, and J. Ullman, “Database Systems: Achievements and Opportunities Into the 21st Century,” http://www.cs.stanford.edu/pub/papers/lagii.ps
J. Toole and P. Young, http://www.hpcc.gov/cic/forum/CIC_Cover.html

Thanks!
Any Questions?