Autonomics and Data Management
Norman Paton
University of Manchester
If database management systems are to be effective
in an increasing range of challenging
environments, such as grids, then automation will
have to follow them into these new settings.
• Existing examples of automation:
– Database administration.
– Query processing.
– Data integration.
• Limitations in current practice.
• Opportunities presented by ubiquitous automation.
Example: Database Administration
• Database administration involves setting values for a lot of parameters:
– Where to put indexes.
– What views to materialise.
– How to allocate memory.
– Maximum number of concurrent transactions.
– Which disks to place data on.
– Which statistics to maintain.
– How often to refresh statistics.
– Which transaction isolation level to use.
• Autonomic database administration may set any of these automatically.
Multiprogramming Level
• The multiprogramming level (MPL) indicates the maximum
number of concurrent transactions that may be run.
• Problem: excessive lock conflicts may lead to thrashing, either
through deadlocks or significant amounts of blocking.
• Setting the MPL level:
– If too high, then risk of thrashing.
– If too low, then too many jobs waiting in queue.
• The risk of thrashing at a given MPL depends on the update
intensity of the transactions.
• G. Weikum, A. Mönkeberg, C. Hasse, P. Zabback: Self-tuning
Database Technology and Information Services: from Wishful
Thinking to Viable Engineering. VLDB 2002: 20-3.
Automating the Setting of MPL – 1
• Observation:
– Want to set the MPL as high as possible, but not too high!
– Identify a property that indicates that there is a high risk of thrashing.
– Conflict ratio:
• (# locks held by all transactions / # locks held by non-blocked transactions).
• Experimental and analytical studies indicated that a level of 1.3
or more means there is a high risk of thrashing.
Automating the Setting of MPL – 2
• Monitoring:
– Number of active transactions.
– Number of blocked transactions.
• Assessment:
– Conflict ratio exceeds 1.3.
• Response:
– Transaction admission policy:
• Block admission of new transactions from queue.
– Transaction cancellation policy:
• Cancel one or more blocking transactions.
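The monitor, assess and respond steps above can be sketched as a small admission check. This is a minimal illustration, not the implementation from Weikum et al.: the `LockStats` type and the function names are hypothetical, and the only value taken from the slides is the 1.3 threshold.

```python
from dataclasses import dataclass

CRITICAL_CONFLICT_RATIO = 1.3  # threshold quoted in the slides


@dataclass
class LockStats:
    """Hypothetical snapshot of lock-manager counters."""
    locks_held_all: int          # locks held by all transactions
    locks_held_non_blocked: int  # locks held by non-blocked transactions


def conflict_ratio(stats: LockStats) -> float:
    """Locks held by all transactions / locks held by non-blocked ones."""
    if stats.locks_held_non_blocked == 0:
        # Assumption: if every lock holder is blocked, treat the conflict
        # as maximal; with no locks at all there is no conflict.
        return float("inf") if stats.locks_held_all > 0 else 1.0
    return stats.locks_held_all / stats.locks_held_non_blocked


def admit_new_transaction(stats: LockStats) -> bool:
    """Admission policy: keep queued transactions waiting while the
    conflict ratio is at or above the critical level."""
    return conflict_ratio(stats) < CRITICAL_CONFLICT_RATIO
```

A cancellation policy would act on the same signal, but choosing which blocking transaction to cancel needs lock-graph information that this sketch does not model.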
Example: Query Evaluation
• Query optimization involves making lots of decisions:
– Which operators to use.
– What order to evaluate the operators in.
– What parallelism level to use.
– How to allocate work to parallel nodes.
• Adaptive query processing may revise any of the
decisions made by a query optimizer during query evaluation.
Adaptation for Load Balancing
• In partitioned parallelism, a task is divided into subtasks that are
run in parallel on different nodes.
• For a join, A ⋈ B is represented as the union of the results of plan
fragments Fi = Ai ⋈ Bi, for i = 1..P, where P is the level of parallelism.
• The time taken to evaluate the join is max(evaluation_time(Fi)),
for i = 1..P.
• As a result, any delay in completing a fragment Fi delays the
completion of the operator, so it is crucial to match fragment size
to node capabilities.
• Many join algorithms have state; as such changing the size of a
fragment allocated to a machine involves replicating or
relocating operator state.
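To make the max() argument concrete, the sketch below compares an even split with a split proportional to node speed. It assumes a deliberately simple cost model (fragment time = fragment size / node speed); the function names and the numbers are illustrative only.

```python
def join_time(fragment_sizes, node_speeds):
    """Completion time of a partitioned join: the slowest fragment decides.

    fragment_sizes: tuples assigned to each node.
    node_speeds: tuples per second each node can process (assumed constant).
    """
    return max(size / speed for size, speed in zip(fragment_sizes, node_speeds))


def proportional_allocation(total_tuples, node_speeds):
    """Split the work in proportion to node speed."""
    total_speed = sum(node_speeds)
    return [total_tuples * speed / total_speed for speed in node_speeds]


speeds = [100, 100, 50]             # the third node is half as fast
even = [500, 500, 500]              # even split of 1500 tuples
matched = proportional_allocation(1500, speeds)

print(join_time(even, speeds))      # 10.0: the slow node delays the join
print(join_time(matched, speeds))   # 6.0: all fragments finish together
```

The cost of moving operator state when fragment sizes change is not included here, which is exactly the part that makes adaptation expensive for stateful joins.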
Load Balancing: Flux
• When load imbalance is detected:
– Halt query execution.
– Compute new distribution policy (dp).
– Update hash tables by transferring data between nodes.
– Update dp in parent exchange.
– Resume query execution.
• M. Shah, J.M. Hellerstein, S. Chandrasekaran, M.J. Franklin,
Flux: An Adaptive Partitioning Operator for Continuous Query
Systems. 25-36, ICDE 2003.
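The repartitioning steps can be illustrated with a toy model in which a distribution policy maps hash buckets to nodes. This is not the Flux algorithm itself: the data structures, the "slowdown" load measure and the rule of moving the smallest bucket are all assumptions made for the sketch, and halting and resuming execution are not modelled.

```python
def node_work(policy, bucket_sizes, slowdown):
    """Estimated work per node: tuples held, scaled by a slowdown factor."""
    work = {node: 0.0 for node in slowdown}
    for bucket, node in policy.items():
        work[node] += bucket_sizes[bucket] * slowdown[node]
    return work


def rebalance_step(policy, tables, bucket_sizes, slowdown):
    """Move one bucket from the most loaded node to the least loaded node.

    policy: bucket -> node (the distribution policy).
    tables: node -> {bucket: rows}, updated in place to mimic the transfer
            of hash-table state between nodes.
    Returns the new distribution policy.
    """
    work = node_work(policy, bucket_sizes, slowdown)
    src = max(work, key=work.get)
    dst = min(work, key=work.get)
    candidates = [b for b, n in policy.items() if n == src]
    if src == dst or not candidates:
        return policy
    bucket = min(candidates, key=lambda b: bucket_sizes[b])  # cheapest to move
    tables[dst][bucket] = tables[src].pop(bucket)  # transfer operator state
    new_policy = dict(policy)
    new_policy[bucket] = dst                       # update the policy
    return new_policy
```

In a real system the new policy would also have to be installed in the operator that routes tuples to the partitions before execution resumes.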
Example: Data Integration
• Data integration involves assembling information
about the relationships between sources:
– What sources there are.
– The services provided by the source.
– The concepts represented in each source.
– How the data is represented.
– What relationships there are between extents.
– What mappings exist between source data types.
• Autonomic data integration involves inferring some of
the above data.
Inferring Web Service Annotations
• Web service annotations are useful for:
– discovering services.
– composing workflows.
– characterising and identifying mismatches.
• However, service annotation is expensive, as it requires:
– knowledge of the ontology used for annotation.
– knowledge of the web services to be annotated.
• (Semi)automatic annotation can be carried out using:
– schema matching and text classification techniques.
– workflow specifications.
• K. Belhajjame, S.M. Embury, N.W. Paton, R. Stevens and C.A.
Goble, Automatic Annotation of Web Services Based on Workflow
Definitions, Proc. 5th Intl. Semantic Web Conference, Springer, 116-129,
Inferring Web Service Annotations
• Use workflows to infer information about the
semantics of linked parameters:
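One plausible reading of this idea is sketched below; it is not the algorithm of the cited paper. The assumption is that when an annotated output is linked to an unannotated input in a workflow, the input must accept the output's concept, so its annotation is that concept or one of its superconcepts. The ontology, parameter names and function names are hypothetical.

```python
def ancestors(concept, parent_of):
    """Return the concept plus all its superconcepts (tree-shaped ontology)."""
    result = [concept]
    while concept in parent_of:
        concept = parent_of[concept]
        result.append(concept)
    return result


def infer_input_annotations(links, annotations, parent_of):
    """Infer candidate concepts for unannotated inputs from workflow links.

    links: (output_param, input_param) pairs taken from workflows.
    annotations: known param -> concept.
    Returns input_param -> set of candidate concepts.
    """
    candidates = {}
    for out_param, in_param in links:
        if in_param in annotations or out_param not in annotations:
            continue
        allowed = set(ancestors(annotations[out_param], parent_of))
        if in_param in candidates:
            candidates[in_param] &= allowed  # every link must be satisfied
        else:
            candidates[in_param] = allowed
    return candidates


parent_of = {"ProteinSequence": "Sequence", "DNASequence": "Sequence",
             "Sequence": "Thing"}
annotations = {"getProtein.out": "ProteinSequence", "getDNA.out": "DNASequence"}
links = [("getProtein.out", "align.in"), ("getDNA.out", "align.in")]
print(infer_input_annotations(links, annotations, parent_of))
```

Here the two links narrow `align.in` to `Sequence` or `Thing`; a human annotator would then confirm the most specific candidate.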
Summary on Examples of Automation
• Data management and integration are complex, with
many possibilities to benefit from automation.
• Automation has been applied in many different
settings, with many worthwhile results.
• The diversity in approaches to and technologies
associated with automation is great.
Limitations: Predictability
• Adaptive systems change system behaviour in
response to runtime feedback. Risks include:
– Reacting too quickly in response to temporary effects.
– Reacting too slowly to be effective.
– Reacting in a way that makes things worse.
• It can be difficult for developers of adaptive systems
to predict how effective their proposals might be.
• It sometimes takes several attempts to refine an
adaptive strategy.
Adaptive Load Balancing: Comparison
• Several existing strategies were compared, across a range of
environmental conditions.
• Conditions could be identified in which all of the proposals were
worse than not adapting.
• Published evaluations of the existing proposals gave no
indication of problematic cases.
• Several of the developers did not know under which
circumstances their approaches performed poorly.
• N.W. Paton, V. Raman, G. Swart, I. Narang, Autonomic Query
Parallelization using Non-dedicated Computers: An Evaluation
of Adaptivity Options, Proc. ICAC, 221-230, 2006.
Adaptive Load Balancing: Experiment
• Query:
– P⋈PS (P has 200,000 tuples, PS has 800,000 tuples).
– Simulation of parallel run on three nodes.
• Types of imbalance:
– Constant: A consistent external load exists on one of the
nodes throughout the experiment. The level of the external
load represents the number of external tasks that are
seeking to make full-time use of the machine.
– Periodic: The load on one of the machines comes and goes
during the experiment. The duration of the load indicates for
how long each load spike lasts; and the repeat duration
represents the gap between load spikes.
Results: Constant Imbalance
Periodic Imbalance (1s)
Designing Adaptive Strategies
• Overheads: pessimistic strategies carry out additional work on
the assumption that things will go wrong (e.g. replicating
operator state).
• Adaptation costs: optimistic strategies evaluate queries as
normal, but may pay a high price to carry out specific
adaptations when needed.
[Figure: strategies Adapt-3 and Adapt-1; axis label: Adaptation Cost.]
Limitations: Methodology
• Adaptive data management proposals are generally
described as specific algorithms or techniques:
– It is often not clear what methodology has been followed in
their development.
– It is not necessarily clear if there are well established
techniques that could have been used to direct their design.
• Approaches that have been applied in the design of
adaptive systems include:
– Systematic functional decomposition.
– Control theory.
Autonomic Computing Architecture
• Autonomic systems typically involve a control loop, with
monitoring information driving planning and decision making.
• IBM’s Autonomic Computing Toolkit provides components that
implement a functional decomposition known as MAPE
(Monitor, Analyze, Plan and Execute).
• The toolkit provides implementations for several of the
components (in particular Monitor and Analyze).
• J.O. Kephart, D.M. Chess, The Vision of Autonomic Computing,
IEEE Computer, 36(1), 41-50, 2003.
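The MAPE decomposition can be sketched as one pass of a control loop built from four pluggable functions. This does not use the API of IBM's toolkit; the function signatures, the symptom string and the action names are hypothetical.

```python
from typing import Callable, Optional


def mape_step(
    monitor: Callable[[], dict],
    analyze: Callable[[dict], Optional[str]],
    plan: Callable[[str], list],
    execute: Callable[[list], None],
) -> bool:
    """Run one pass of the loop; return True if an adaptation was executed."""
    observations = monitor()          # Monitor: read sensors
    symptom = analyze(observations)   # Analyze: is there a problem?
    if symptom is None:
        return False
    actions = plan(symptom)           # Plan: choose a response
    execute(actions)                  # Execute: drive effectors
    return True


# Usage: the conflict-ratio policy for the multiprogramming level.
applied = []
adapted = mape_step(
    monitor=lambda: {"conflict_ratio": 1.4},
    analyze=lambda obs: "thrashing_risk" if obs["conflict_ratio"] >= 1.3 else None,
    plan=lambda symptom: ["block_admission"],
    execute=applied.extend,
)
print(adapted, applied)
```

Keeping the four steps separate is what allows a generic Monitor to be reused while Analyze, Plan and Execute stay specific to each adaptation.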
Data Management and MAPE
• Sensors: what monitoring information should a database
platform expose to enable effective decision making?
• Effectors: what hooks should a database platform expose to
enable effective runtime adaptation?
• It is not straightforward:
– to retrofit sensing and effecting.
– to predict what may be needed.
• Monitor, Analyze, Plan and
Execute components may also
be able to be implemented in
different ways.
• Generic monitoring components
have been proposed for tracking
query progress and for
– A. Gounaris, N. Paton, A. Fernandes, R. Sakellariou,
Self-Monitoring Query Execution for Adaptive Query Processing,
Data and Knowledge Eng., 51(3), 325-348, 2004.
– L. Luo, J. Naughton, C.
Ellmann, M. Watzke, Towards a
progress indicator for database
queries, SIGMOD, 791-802,
Monitoring Query Progress
• Progress monitoring predicts properties of an operator
incrementally from monitored data.
• Raw monitoring data may count the number of tuples returned
by an operator, the average tuple size, etc.
• From such information, operator selectivity, result size and
runtime can be estimated.
• Unnest:
– selectivity σ = n_out / n_in
– cardinality = cardinality_operand * σ
– size = cardinality_operand * σ * avg(size_result_tuple)
– time = cardinality_operand * σ * tuple_build_cost
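The unnest estimates translate directly into code. The sketch below takes the observed selectivity as n_out / n_in and scales it by the operand cardinality; the function name, the argument names and the sample values are illustrative.

```python
def estimate_unnest(n_in, n_out, operand_cardinality,
                    avg_result_tuple_size, tuple_build_cost):
    """Estimate final properties of an unnest operator from counts so far.

    n_in / n_out: tuples consumed and produced up to now.
    operand_cardinality: (estimated) total number of input tuples.
    """
    if n_in == 0:
        raise ValueError("no input tuples observed yet")
    selectivity = n_out / n_in
    cardinality = operand_cardinality * selectivity
    return {
        "selectivity": selectivity,
        "cardinality": cardinality,
        "size": cardinality * avg_result_tuple_size,
        "time": cardinality * tuple_build_cost,
    }


# After 100 of 1000 input tuples, 250 result tuples have been produced.
estimate = estimate_unnest(n_in=100, n_out=250, operand_cardinality=1000,
                           avg_result_tuple_size=40, tuple_build_cost=0.002)
print(estimate["selectivity"], estimate["cardinality"])  # 2.5 2500.0
```

The estimate is only as good as the assumption that the tuples seen so far are representative of the rest of the input.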
Building Adaptive Databases
• Most adaptive database extensions involve hard
coding changes to the existing code base.
– Complex core infrastructure subject to intrusive changes.
– Steep learning curve for developers of adaptive extensions.
– Incremental changes result in reduced reuse.
• With respect to MAPE:
– Growing experience with generic monitoring.
– Considerable diversity in Analyze, Plan and Execute.
– Control theory provides some insights into decision making.
Control Theory
• Provides a systematic framework for computing a
change to an input given a measured output.
• Designs seek to exhibit SASO properties:
– Stable: bounded input gives bounded output.
– Accurate: measured output converges on desired value.
– Short Settling: converges to stable value quickly.
– No Overshoot: achieves objectives in a steady manner.
• Either find a control engineer, learn the book, or apply
a well established model.
– J.L. Hellerstein, Y. Diao, S. Parakh, D.M. Tilbury, Feedback
Control of Computing Systems, Wiley, 2004.
Control Theory: PID Controllers
PID Controllers Example
• Task: evaluating queries from a queue over a server.
• Objective: keep all query evaluation in memory to
avoid use of multi-pass algorithms.
• Goal for controller: keep the amount of free memory
at 512Mb in order to ensure condition met.
• Control parameter: multiprogramming level.
Proportional Controller
• Terminology:
– m: output signal.
– Kp: proportional gain.
– e: error.
• Definition: m = Kp · e.
• Query processing example:
– m: multiprogramming level.
– e: (amount of free memory – 512Mb).
– Kp: 1/(job size in Mb), assumed to be 0.01 for 100Mb jobs.
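As a worked instance of the query processing example, the sketch below computes the change in multiprogramming level from the amount of free memory. The 512Mb target and the gain of 0.01 come from the slides; rounding the result to a whole number of transactions is an assumption.

```python
TARGET_FREE_MB = 512  # goal from the slides: keep 512Mb free
KP = 0.01             # 1 / (job size in Mb), assuming 100Mb jobs


def mpl_change(free_memory_mb: float, kp: float = KP) -> int:
    """Proportional control: m = Kp * e, rounded to whole transactions."""
    error = free_memory_mb - TARGET_FREE_MB
    return round(kp * error)


print(mpl_change(1012))  # 5: room for five more 100Mb jobs
print(mpl_change(512))   # 0: on target
print(mpl_change(212))   # -3: reduce the level by three
```

A purely proportional controller reacts only to the current error, which is why integral and derivative terms are often added.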
Proportional Controller: Example
[Plot: m (multiprogramming level change) against e (error).]
Integrative and Derivative Controllers
• Integrative Controller:
– Controller output depends on level and duration of error.
– Ki: integral gain.
– Ti: integral time.
– Definition: m = Ki ∫ e dt, where Ki = Kp / Ti.
• Derivative Controller:
– Controller output depends on rate of change of error.
– Kd: derivative gain.
– Td: derivative time.
– Definition: m = Kd (de/dt), where Kd = Kp · Td.
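The three terms are usually combined into a single PID controller. The sketch below is a textbook discrete-time form, assuming the standard relations Ki = Kp / Ti and Kd = Kp · Td; the class name and its interface are illustrative.

```python
class PIDController:
    """Discrete-time PID: m = Kp*e + Ki*sum(e*dt) + Kd*(e - e_prev)/dt."""

    def __init__(self, kp, ti=None, td=0.0):
        self.kp = kp
        self.ki = kp / ti if ti else 0.0  # Ki = Kp / Ti (no integral term if Ti unset)
        self.kd = kp * td                 # Kd = Kp * Td
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt=1.0):
        """Return the control output for the latest error sample."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0              # no rate of change on the first sample
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


controller = PIDController(kp=0.5, ti=2.0, td=1.0)
print(controller.update(4.0))  # 3.0 = 2.0 (P) + 1.0 (I) + 0.0 (D)
print(controller.update(2.0))  # 1.5 = 1.0 (P) + 1.5 (I) - 1.0 (D)
```

The integral term removes steady-state error and the derivative term damps overshoot, which relates directly to the Accurate and No Overshoot properties in SASO.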
Control Theory for Data Management
• There are currently rather few examples of control
theory being used in data management. A recent
example in grid query processing:
– Anastasios Gounaris, Christos Yfoulis, Rizos Sakellariou and
Marios Dikaiakos, Self-optimizing Block Transfer in Web
Service Grids, WIDM, 2007.
• Modelling the relationship between measured values
and controlled inputs can be challenging.
• Many adaptive data management techniques change
more than an input parameter. For example:
– A query may be reoptimized by an adaptive query processor.
Limitations: Composability
• Many proposals for autonomic data management
focus on specific adaptations:
– Selecting views for materialization.
– Selecting data for replication.
– Selecting fields for indexing.
– Allocation of memory to functions.
• … however, such decisions are often inter-related,
and modelling the inter-relationships between such
strategies is challenging.
Query Processing Inter-Dependency
• Load imbalance results from inappropriate allocation of work
to resources in partitioned parallelism.
• Bottlenecks result from inappropriate allocation of work
to resources in pipelined parallelism.
• There is no benefit from resolving load imbalance if the
bottleneck is elsewhere.
• Resolving load imbalance may change the location of the
bottleneck.
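The interaction between the two kinds of parallelism can be shown with a small throughput model. It assumes that a partitioned stage runs at the pace of its most overloaded node and that a pipeline runs at the pace of its slowest stage; the rates and function names are illustrative.

```python
def stage_rate(node_rates, shares):
    """Throughput of a partitioned stage, limited by its most overloaded node.

    node_rates: tuples per second each node can process.
    shares: fraction of the stage's input sent to each node (sums to 1).
    """
    return min(rate / share for rate, share in zip(node_rates, shares) if share > 0)


def pipeline_rate(stage_rates):
    """Throughput of a pipeline, limited by its slowest stage."""
    return min(stage_rates)


join_nodes = [300, 100]
unbalanced = stage_rate(join_nodes, [0.5, 0.5])   # 200.0
balanced = stage_rate(join_nodes, [0.75, 0.25])   # 400.0

# Scan at 150 tuples/s: the scan is the bottleneck, so balancing gains nothing.
print(pipeline_rate([150, unbalanced]), pipeline_rate([150, balanced]))

# Scan at 300 tuples/s: balancing helps, and the bottleneck moves to the scan.
print(pipeline_rate([300, unbalanced]), pipeline_rate([300, balanced]))
```

This is why a load balancing strategy evaluated in isolation can look effective yet deliver no benefit, or a different problem, inside a full plan.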
Limitations: Semantics
• Property guarantees:
– Autonomic systems change behaviour mid-task.
– Non-trivial adaptations may leave uncertainty as to whether
an adaptation is meaning-preserving.
– Few adaptations have had their meaning-preserving
properties proved:
• K. Eurviriyanukul, A. Fernandes, N. Paton, A Foundation for the
Replacement of Pipelined Physical Join Operators in Adaptive
Query Processing, EDBT Workshops, 589-600, 2006.
Limitations: Semantics
• Performance guarantees:
– Autonomic behaviour may take certain risks with performance.
– Some proposals may redo work, leading to the need for
thresholds to remove the risk of continuous reoptimization:
• V. Markl, V. Raman, D. Simmen, G. Lohman, H. Pirahesh:
Robust Query Processing through Progressive Optimization.
SIGMOD Conference 2004: 659-67.
– Some algorithms provide bounded worst case performance:
• Daniel M. Yellin: Competitive algorithms for the dynamic
selection of component implementations. IBM Systems Journal
42(1): 85-97 (2003).
Summary on Limitations of Automation
• Automation is currently partial in scope and often ad
hoc in development.
• Automation is a second class citizen in data
management; there is interest in the benefits it can
bring but not so much in automation per se.
• As a result, automation in data management can be
seen as immature, with considerable scope for
improving the predictability, composability and clarity
of proposals through enhanced methodologies.
Increasing Manageability - 1
• Database products:
– Commercial database systems are typically associated with high
total cost of ownership, resulting in significant measure from high
administrative costs.
– Vendors are seeking to improve competitiveness by automating or
supporting management of their intrinsically complex products.
• Data management components:
– It has been suggested that current database products are too
complex, and that more data should be managed by lighter weight
components.
– As yet, there is little evidence that light-weight data management
components are being designed with automation in mind, but this is
perhaps a practical proposition.
Increasing Manageability - 2
• There are increasing needs to manage personal data, and data
management within workgroups or laboratories is often hindered
by the complexity of current data management platforms.
• Personal and workgroup data management often has evolving
requirements, but rarely needs the full range of capabilities of
current database products.
• Proposals in this space:
– Data services: I. Subasu, P. Ziegler, K. Dittrich: Towards
Service-Based Database Management Systems. BTW Workshops 2007:
– Data components: S. Chaudhuri, G. Weikum: Rethinking Database
System Architecture: Towards a Self-Tuning RISC-Style Database
System. VLDB 2000: 1-1.
Increasing Reach - 1
• Most automation in data management has sought to
ask the question:
– Which current requirements can be met better by increasing
the ranges of tasks that are carried out automatically?
• An alternative view gives rise to a different question:
– If we assume that there is to be no manual administration,
what sorts of data management system can be developed?
Increasing Reach - 2
• The vision of dataspaces is to support database style access
over diverse sources with minimal manual integration.
– A. Halevy, M. Franklin, D. Maier: Principles of dataspace systems.
PODS 2006: 1-9.
• Preliminary proposals match schemas automatically but
partially, thus giving approximate answers that can be ranked.
– J-P. Dittrich, M. Salles: iDM: A Unified and Versatile Data Model for
Personal Dataspace Management. VLDB 2006: 367-378.
– S. Abiteboul, N. Polyzotis: The Data Ring: Community Content
Sharing. CIDR 2007: 154-16.
• The challenge is to enable querying over structured data in a
personal file store, within an organisation or at internet scale,
with no manual integration.
• Automation is already in lots of places:
– Database administration.
– Query evaluation.
– Data integration.
• Automation in data management is not mature:
• If automation becomes a more central focus:
– Understanding of automation per se should improve.
– The nature of data management systems will change.