Database Systems
“Breaking Out of the Box”
Avi Silberschatz
Bell Laboratories
Stan Zdonik
Brown University
July 7, 1997
Mehmet Uner
The Paper’s Theme (Strategic Directions)
1) Database Research should be devoted to the
problems of data management no matter where
and in what form the data might be found.
2) Database management skills should be applied
to new data management environments that
potentially require radically new software
▪ Introduction
▪ Background
▪ Our Skills
▪ Scenarios
▪ Barriers
▪ Research
▪ Conclusions
▪ References
Introduction
▪ The field of database systems research and development has
been very successful over its 30-year history.
▪ It has led to a $10 billion industry that touches virtually every
major company in the world.
▪ It is unthinkable to manage the large volumes of valuable information
that keep corporations running without support from commercial
database management systems (DBMS).
▪ A DBMS is a very complex system incorporating a rich set of
▪ Suited for solving problems of large-scale data management in
the corporate setting.
DBMS Requirements:
▪ Execution Overhead.
▪ High level of expertise to install and
▪ Only manages data in fairly specific file
At the same time:
▪ Data is changing rapidly.
▪ Data is stored in different places (e.g. files)
▪ Data is obtained in large volumes from external sources
like sensors.
▪ What is needed is not a full-blown DBMS but a lighter-weight solution
▪ Instead of using an existing tool in a new application, it is
better to embed reusable components.
▪ Use database system components, techniques and
experience in new ways.
▪ Some examples that could benefit from data
management techniques but that typically do
not make heavy use of database products:
– World Wide Web
– Personal Information Systems (e-mail)
– News Services
– Scientific Applications
Background
▪ The database field was born with the release of IMS in the 1960s.
– IBM product
– Managed data as hierarchies
– Data has value and should be managed independently of the application
▪ CODASYL, the best-known successor
– Based on a graph-based structure.
▪ Ted Codd published a paper in 1970
– Suggested the relational model.
▪ Object-oriented principles in the 1980s
– Allow users to create their own application-specific
types that can be managed by the DBMS.
▪ Hybrid model in the 1990s
– Embeds object-oriented features in a relational context.
Our Skills
▪ Database Management Systems have been
concerned with the following problems:
– High Performance
▪ From the point of view of slow-memory devices that
must be shared by multiple concurrent users
▪ This approach leads to a set of skills and
techniques that can be applied and extended to
other problems.
Skills and Techniques
▪ Data Modeling
– Language for defining the structure of a database
– Language for manipulating those structures.
▪ Query Languages
– High-level language to retrieve data from the
database (SQL).
▪ Query Optimization and Evaluation
▪ State-based Views
– Restricted and reorganized view of the database.
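The four skills on this slide can be sketched together with Python's built-in sqlite3 module. This is only an illustration: the `employee` table, its columns, and the `research_staff` view are hypothetical names chosen for the example, not something taken from the paper.

```python
import sqlite3

# Hypothetical schema for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")

# Data definition: declare the structure of the database.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)

# Data manipulation: populate that structure.
conn.executemany(
    "INSERT INTO employee (name, dept, salary) VALUES (?, ?, ?)",
    [("Ada", "research", 120.0), ("Bob", "sales", 80.0), ("Cy", "research", 100.0)],
)

# View: a restricted and reorganized window onto the base table.
conn.execute(
    "CREATE VIEW research_staff AS "
    "SELECT name, salary FROM employee WHERE dept = 'research'"
)

# Query: declarative retrieval; the engine decides how to evaluate it.
rows = conn.execute("SELECT name FROM research_staff ORDER BY salary DESC").fetchall()
names = [row[0] for row in rows]
```

The query names only what is wanted (research staff, highest salary first); choosing an access path is left to the engine, which is the role of query optimization and evaluation.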
Skills and Techniques
▪ Data Management
– Automatic maintenance of data structures
– Efficient movement of data
▪ Transactions
– A response to correctness problems introduced by
concurrent access and update
▪ Distributed Systems
▪ Scalable Systems
– Database systems have been tuned to efficiently and
reliably handle data volumes that exceed the size of
physical memory by several orders of magnitude.
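As a small illustration of the transaction idea, the sketch below uses Python's built-in sqlite3 module to make a two-step update all-or-nothing. The `account` table and the `transfer` helper are hypothetical. The example shows only atomicity (rollback on failure); isolation between concurrent users is not demonstrated.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts atomically (hypothetical schema)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds and is committed
failed = transfer(conn, 1, 2, 500)  # would overdraw, so both updates are undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer neither account has changed, so the data never exposes a half-finished update.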
Scenarios
▪ Two scenarios point the way for future data management systems
▪ The technology that would support these scenarios
constitutes a research agenda for the next decade.
1) Instant Virtual Enterprise
2) Personal Information Systems
Instant Virtual Enterprise
▪ An “instant virtual enterprise” (IVE) is a group of
companies that do not routinely function as a unit.
▪ They come together to respond to a customer order or request
for proposal.
▪ Computer integrated manufacturing (CIM) is an example
of an environment requiring IVE cooperation.
▪ Engineering side
– Design, Production, Quality Assurance
▪ Administrative side
– Planning, Production Control, Resource Management
Instant Virtual Enterprise
▪ Companies in an IVE need to exchange and
manage large amounts of data
▪ Companies will have many heterogeneous
▪ Sharing and exchanging data with
coordinating information is critical
IVE Scenario
Building an oil pipeline
Company A
Company Q
Engineering Firm (IVE)
License their design
Company R
Engineering Analysis
Company S
IVE Scenario
Actual Fabrication
Company T
Company U
Design file conversion service
Company V
Documentation and Archiving
Company W
IVE Scenario
▪ Database Capabilities Needed:
– Executing a query for the design
– Data translation services for engineering analysis
– Coordination and configuration management
• Changes to an object in one subsystem require changes
to one or more related objects in other subsystems.
– Security and access control over the information
– Archiving of information, even after the IVE disbands
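The coordination requirement (a change in one subsystem forcing changes in related objects elsewhere) can be pictured as reachability in a dependency graph. The sketch below is one possible reading, not a mechanism described in the paper. The `ConfigurationManager` class and the object identifiers are hypothetical.

```python
from collections import defaultdict, deque

class ConfigurationManager:
    """Tracks which objects depend on which, across subsystems (hypothetical API)."""

    def __init__(self):
        self._dependents = defaultdict(set)  # object id -> ids that depend on it

    def add_dependency(self, source, dependent):
        """Record that `dependent` must be revisited whenever `source` changes."""
        self._dependents[source].add(dependent)

    def affected_by(self, changed):
        """Return every object reachable from `changed` via dependency edges."""
        seen, queue = set(), deque([changed])
        while queue:
            current = queue.popleft()
            for dep in self._dependents[current]:
                if dep != changed and dep not in seen:
                    seen.add(dep)
                    queue.append(dep)
        return seen

cm = ConfigurationManager()
cm.add_dependency("design:pipe_segment", "analysis:stress_report")
cm.add_dependency("design:pipe_segment", "fabrication:cut_list")
cm.add_dependency("analysis:stress_report", "docs:archive_entry")

stale = cm.affected_by("design:pipe_segment")
```

Here a change to the design object marks the analysis report, the fabrication list, and (transitively) the archived documentation as needing attention, even though each could live at a different company in the IVE.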
Personal Information Systems Scenario
▪ Provides information to an individual
▪ Uses a PID (Personal Information Device)
– Handheld PC
– Laptop
▪ Equipped with a wireless network connection
▪ Access to the Internet anywhere, anytime.
Personal Information Systems Scenario
▪ Tightly integrated with the individual’s activities,
from morning to bedtime.
▪ In the morning
– Local Weather Report
– List of Reminders
– List of Morning Meetings
– Best Route from home to work
– Personalized Headlines
– Personalized Investment Report
Personal Information Systems Scenario
▪ Throughout the day
– Tasks for the day
– List of customers to contact
– Summary of breaking news
– Best Driving Routes in the city
▪ At the end of the day
– Next day’s activities
– Appointments
Personal Information Systems Scenario
▪ PID must continuously query remote
databases and monitor broadcast
▪ PID will magnify today’s client-server
performance, scalability and reliability
▪ Where should data reside: on the PID or on the server?
Barriers
▪ DBMS provides a tightly controlled and
highly uniform environment
▪ For the new applications, database
functionality should be provided outside of
the limits of a DBMS.
▪ For the vision represented in the scenarios,
a number of technical barriers must be overcome.
▪ Overhead
– System requirements, expertise, planning, monetary cost
– Builders of a personalized newspaper service do not use a DBMS
because they have no need for many of its advanced features.
– Only a subset of the traditional database services is needed by many
new applications
▪ Scale
– Greater volume of data (petabytes)
– Hundreds of servers, client population even larger
▪ Schema Organization
– Conventionally, a schema is created first to describe the structure of
the database, and the database is then populated
– Many applications currently create data independently of a
database system (scientific applications, web sites).
– The schema for such data is incomplete or inconsistent.
– Schema management facilities are needed to adapt to the dynamic
nature of foreign data.
▪ Data Quality
– Information accessed from a WAN may be of varying quality.
– Future information systems must be able to react to the quality of
the data source.
▪ Heterogeneity
– Data exists in many forms
– These dissimilar formats must be integrated to allow
applications to access data in a high-level and uniform
▪ Query Complexity
– Different characteristics in future environments
• Conventional: minimize the number of disk accesses
• Future: minimize the total “information bill”
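To make the contrast between the two optimization goals concrete, here is a minimal sketch of a plan chooser. The slide does not say what the “information bill” consists of, so the additive cost model below (disk accesses, bytes transferred, provider access fee) and all of its weights are assumptions made purely for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    name: str
    disk_accesses: int      # the conventional cost driver
    bytes_transferred: int  # data shipped over the network
    access_fee: float       # charge levied by a remote data provider

def information_bill(plan, cost_per_access=0.001, cost_per_megabyte=0.05):
    """Hypothetical additive cost model; the weights are illustrative assumptions."""
    return (
        plan.disk_accesses * cost_per_access
        + plan.bytes_transferred / 1_000_000 * cost_per_megabyte
        + plan.access_fee
    )

plans = [
    Plan("scan remote source", disk_accesses=100,
         bytes_transferred=500_000_000, access_fee=2.0),
    Plan("use local cache", disk_accesses=5_000,
         bytes_transferred=0, access_fee=0.0),
]

# Conventional objective: fewest disk accesses.
fewest_accesses = min(plans, key=lambda p: p.disk_accesses)
# Future objective: smallest total bill under the assumed model.
cheapest_bill = min(plans, key=information_bill)
```

Under these assumed weights the two objectives disagree: the remote scan touches the disk least, but the local plan is cheaper once transfer volume and access fees are counted.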
▪ Ease of Use
– Highly trained, full-time staff is assumed to manage a
– Yet most users have no training in database technology.
– A simple set of interfaces is needed.
▪ Security
– As the amount of shared information grows, the need to
restrict access to specific users or for specific uses
▪ Guaranteeing Acceptable Outcomes
– Transaction management is a barrier to both system
performance and the ability to specify acceptable outcomes
– New or enhanced transaction technology is needed
– Making data unavailable is not acceptable
– Aborting transactions is unacceptable
▪ Technology Transfer
– Barrier between research and industry
• Insufficient knowledge of each other
Research
▪ In order to achieve the vision and overcome these
barriers, a number of central research topics must
be addressed:
– Extensibility and Componentization
– Imprecise Results
– Schemaless Databases
– Ease of Use
– New Transaction Models
– Query Optimization
– Data Movement
– Database Mining
– Extensibility and Componentization
• Build the DBMS in a modular way
• Lighter-weight applications
– Imprecise Results
• Web search engines do not provide 100%
• A general theory of imprecision must be developed
– Schemaless Databases
• Able to work with unstructured data
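One way to picture the schemaless idea is to query records that were created without any declared structure and to derive a description of the data afterwards. The sketch below is an assumption-laden illustration: the `select` and `inferred_schema` helpers and the sample records are hypothetical, not an interface proposed in the paper.

```python
_MISSING = object()  # sentinel so that an absent field never equals a real value

def select(records, **conditions):
    """Return records whose fields match every condition; absent fields never match."""
    return [
        r for r in records
        if all(r.get(field, _MISSING) == value for field, value in conditions.items())
    ]

def inferred_schema(records):
    """Collect the field names and value types actually observed in the data."""
    schema = {}
    for r in records:
        for field, value in r.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema

# Records of differing shape, as might arrive from files or web pages.
records = [
    {"title": "Pipeline design", "format": "cad", "size_kb": 2048},
    {"title": "Site survey", "author": "Company R"},
    {"url": "http://example.org/report", "format": "html"},
]

cad = select(records, format="cad")
schema = inferred_schema(records)
```

The schema here is discovered from the data rather than imposed before loading, which is the reverse of the conventional create-then-populate order.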
– Ease of Use
• Better database interfaces are required.
– New Transaction Models
• Overcome blocking.
• Provide correctness.
– Query Optimization
• New indexing methods, query processing strategies.
• Cheaper but slower response time.
• Sensitive to bandwidth and power considerations.
– Data Movement
• In a distributed environment, the cost of moving data can be
extremely high
• Asymmetric communication channels (low-bandwidth lines)
– Security
• Formulation of an authorization model
• Interoperability between different security policies
– Database Mining
• Machine Learning
• Statistical Analysis
• Database Technologies
Conclusions
▪ Database research must be broadly defined.
▪ The database community must apply its experience and
expertise to new areas and new solution packet must be
▪ The vision is an integration that supports the application of
database functionality in small modules that give just the
right capability.
▪ These modules should also represent a unified theory of
information that allows for querying information of all
types without having to switch languages or paradigms.
References
▪ E. F. Codd, “A Relational Model of Data for Large Shared Data Banks”,
Communications of the ACM, 13:6 (June 1970), pp. 377-387.
▪ J. Gray,
▪ A. Silberschatz, M. Stonebraker, and J. Ullman, “Database Systems:
Achievements and Opportunities,” SIGMOD Record, 19:4, pp. 6-22.
▪ A. Silberschatz, M. Stonebraker, and J. Ullman, “Database Systems:
Achievements and Opportunities Into the 21st Century”,
▪ J. Toole and P. Young,
Any Questions?