Simplicity as applied to Relational Databases
David Livingstone
IMLab, CEIS
May 2007

A Dangerous Topic
Suppose the talk is :
– difficult to understand ?
– boring ?
Simple things are ‘obvious’ with hindsight.
Simple things are of limited use.
Academics don’t ‘do’ simple things. They do difficult things.

Simplicity
Assumed to be good and worthwhile.
In research :
• Inherently important in the concepts dealt with;
• Important objective to achieve in the results.
Brought home to me in the feedback from my PhD Mid-Point Progression.
Need to deal with simplicity explicitly and clearly.

Overview
What about Simplicity ?
• My own introduction to simplicity
• What simplicity is NOT
• Importance of simplicity
‘Model’ of Simplicity.
“Open Database Project”.
My PhD project.

Simple Computer Languages
• APL
• Unix Shell Languages
1 means of holding data; 1 means of executing processes.
Completely generalised ‘means’.
High level of abstraction; algebra programming style.
Higher programming productivity (1 order of magnitude).
“No. of lines of valid code written per day is independent of the language”.
Power was the motivation for use, not simplicity.
• Relational DBs with Relational Algebra.

Simplicity in Physics
• Raise level of abstraction.
• Raise level of generality.
Example : electricity & magnetism are special cases of electromagnetism.
Example : unify the 4 forces of nature into one. (Needs 10 & 26 dimensions !)
Example : gravitons inferred from gravity waves.
Beauty : simplicity, elegance, symmetry. (Hyperspace by Michio Kaku).
Einstein developed relativity because symmetry was more fundamental than Newtonian space-time.
⇒ reform space-time to fit symmetry.

What Simplicity is NOT (1)
NOT Minimalism
Minimalism provides simplicity by limiting explicit functionality.
Minimalism ≠ Essentiality.
Essentiality maintains functionality.
Codd used ‘Essentiality’ to create relational DBs.
• Only one essential data construct, the relation.
• Earlier database models had 2 or more data constructs, but only the functionality of relations.
⇒ greater complexity.
(NB Each construct requires its own operators).

What Simplicity is NOT (2)
NOT (necessarily) intuitive.
“Intuition is simply a state of subconscious knowledge that comes about after extended practice”.
“Difficult tasks will always have to be taught. The trick is to ensure the technology is not part of the difficulty”. (Donald Norman).
“Although a programming language is unlikely to contribute directly to a solution, it may obstruct solution, even contributing to errors and oversights”. (Petre)

‘Intuition’ ≈ Skill.
The “2-year programming experience” Catch-22.
‘Wrong’ experience may require ‘un-learning’.
Proponents of OO Programming insist on the need to ‘think in OO terms’ to be able to program effectively.
“The use of COBOL cripples the mind; its teaching therefore should be regarded as a criminal offence”.
“It is practically impossible to teach good programming to students that had a prior exposure to Basic ... they are mentally mutilated beyond hope of regeneration”. (Dijkstra).
“The tools we use have a profound (and devious !) influence on our thinking habits and therefore on our thinking ability”. (Dijkstra)

Desire for Simplicity
“Everything should be made as simple as possible, but not simpler”. (Albert Einstein).
“Entities should not be multiplied without necessity”. (Ockham’s Razor).
“The aim of science is always to reduce complexity to simplicity”. (William James).
“Great engineering is simple engineering”. (James Martin).
“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away”. (Antoine de Saint-Exupery).
Note there’s an Irreducible Minimum.

Practical Importance of Simplicity
“If projects or programmes are overly complex, there is a good chance they are simply wrong”. (Brian Jones, IBM).
“Complexity leads to design problems & greater risk of error”. (Martyn Thomas, Praxis MD).
“There are no complex systems that are secure. Complexity .. almost always comes in the form of features or options”. (Ferguson & Schneier).
“.. that a product with fewer features might be more usable, more functional, & superior .. is considered blasphemous”. (Donald Norman).

Overview (2)
What about Simplicity ?
‘Model’ of Simplicity.
• The Simplicity Required
• Simplification Principles
“Open Database Project”.
My PhD project.

What Kind of Simplicity is Required ?
Mental Model Principle - “People understand & interact with systems & environments based on mental representations developed from experience”. (Universal Principles of Design).
“A good conceptual model is .. fundamental to good design”.
“Good designers present explicit conceptual models for users”.
“Start with a simple, cohesive conceptual model and use it to direct all aspects of the design”.
(D. Norman - “Design of Everyday Things”).
The simplicity required of a software product is that of its conceptual model.

Conceptual Integrity
Fred Brooks - “The Mythical Man-Month”.
• “Because ease of use is the purpose, the ratio of function to conceptual complexity is the ultimate test of system design. Neither function alone nor simplicity alone defines a good design”.
• “Simplicity and straightforwardness proceed from conceptual integrity. Every part must reflect the same philosophies & the same balancing of desiderata .. the same techniques in syntax & analogous notions in semantics”.

Simplification Principles
• Parsimony of concepts.
• Simplicity. Straightforward concepts. Terseness. Elegance.
• Generality. No limitations/exceptions.
  Exceptions / limitations, & ways round them, need to be modelled ⇒ complexity & less functionality.
• Orthogonality. Each concept is independent of every other concept, so that they can be combined in any arbitrary way.
• Uniformity. Consistency, regularity, naturalness.

Obtaining Simplicity in an Application
• Raise the level of abstraction as much as possible.
• Derive a conceptual model that has conceptual integrity.
  Use the simplification principles to achieve this.
  Use ‘essentiality’ to achieve simplicity & elegance.
• Implement the model with as much automation as is feasible. (Defaults can be useful).
• Separate the model from its implementation.
May need complex software to get a simple conceptual model.

Overview (3)
What about Simplicity ?
‘Model’ of Simplicity.
“Open Database Project”.
• The Third Manifesto
• What about SQL ?
• Produce a Proof of Concept of TTM.
My PhD project.

Today’s Relational DB ‘Problem’
How to handle more sophisticated data in a simple manner ?
[Figure: relation R with attributes A1, A2, A3, A4]
TTM (The Third Manifesto) ‘cleans up’ : relational aspects, confusion between scalars and containers, scalar typing.
⇒ greater simplicity and functionality.

The Third Manifesto (TTM)
by Chris Date & Hugh Darwen
• “A formal proposal for a solid foundation for data & database management systems”.
• Based on
  - the relational model,
  - type theory.
• Aim : to solve the problem of how to support new kinds of data (e.g. pictures, music, maps) in relational DBs.
• Concerns principles; derives a logical model.
• Implementation of the logical model is a separate matter. Physical Data Independence.

SQL Has Problems !
SQL is now more than 30 years old.
SQL doesn’t fully apply relational theory. For example :
• Doesn’t fully apply set theory (duplicate/sequenced rows, sequenced columns).
SQL sometimes contradicts relational theory. For example :
• Implementation pointers appear among logical data.
Poor language design. For example :
• Many ad hoc constraints on applicable expressions.
Complications in adding data types, nested containers.
SQL is unnecessarily complex & limited.

Open DB Project : TTM Proof of Concept
Uses RAQUEL (= Relational Algebra Query, Update & Executive Language) :
• One means of holding data, relations.
• One means of deriving relational values, operators.
• One means of manipulating relational variables, assignments/actions.
Unlike traditional programming languages, DBs need multiple ‘assignments’; e.g. retrieve, insert, delete.
These have been generalised, and include assigning integrity constraints. (Concept developed in earlier research).
Includes sublanguages to handle :
• DB aspects (cf. relational) : schemas and storage;
• scalar data types.
These have the same structure and style as the relational sublanguage.

Overview (4)
What about Simplicity ?
‘Model’ of Simplicity.
“Open Database Project”.
My PhD project.
• Nested containers & their complexity
• Removing their complexity without loss of functionality

10 Kinds of Container Type
[Figure: relation R with attributes A1, A2, A3, A4]
PhD ‘cleans up’ kinds of container type.
• Relations
• Records/structs
• Sets
• Dictionaries
• Bags
• Lists
• Queues
• Stacks
• Arrays
• Insertable arrays

Complexity Arising
From 1 kind of container type - the relation - to 10.
• 10 different structures,
• 10 different sets of operators.
Not always used in isolation, sometimes together.
⇒ must master how to combine them.
With n = 10 :
n(n - 1)/2 possible pairs ⇒ 45 possibilities.
n(n - 1)(n - 2)/2 possible trios ⇒ 360 possibilities.
In practice, no SQL product currently provides more than 4 different kinds of container type.

Reduction to 3 Container Types
Relations :
• Relations
• Records/structs
• Sets
• Dictionaries
Special implementations of relations.
Bags :
• Bags
Sequences :
• Lists
• Queues
• Stacks
• Arrays
• Insertable arrays
Different versions of sequences.

Generalisation of Kinds of Container
Generalise the containers.
• Relation ≡ set of tuples.
• Bag ≡ bag of tuples.
• Sequence ≡ sequence of tuples.
• Simple mapping between bags & sequences and sets / relations.
Generalise the operators.
• The corresponding operator is provided for each kind of container (as far as possible).
• Operators provide closure (as for relations), plus conversion operators.
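
The generalisation above can be illustrated with a minimal Python sketch. It is not RAQUEL and does not reflect the project's actual operators, which the slides do not show. The names `Emp`, `restrict`, `to_bag` and `to_relation` are hypothetical, and the choice of `frozenset`, `Counter` and `list` to stand in for relation, bag and sequence is an assumption made purely for illustration. The sketch shows one operator provided for all three kinds of container with closure (the result has the same kind as the input), plus two conversion operators.

```python
from collections import Counter
from typing import Callable, NamedTuple


class Emp(NamedTuple):
    """Hypothetical tuple type with two attributes."""
    name: str
    dept: str


def restrict(container, predicate: Callable[[Emp], bool]):
    """Keep the tuples that satisfy predicate.

    Closure: a relation yields a relation, a bag a bag, a sequence a sequence.
    """
    if isinstance(container, frozenset):   # relation = set of tuples
        return frozenset(t for t in container if predicate(t))
    if isinstance(container, Counter):     # bag = tuple -> multiplicity
        return Counter({t: n for t, n in container.items() if predicate(t)})
    if isinstance(container, list):        # sequence = ordered tuples
        return [t for t in container if predicate(t)]
    raise TypeError("expected a relation, bag or sequence")


def to_bag(sequence):
    """Conversion operator: forget the order, keep the duplicates."""
    return Counter(sequence)


def to_relation(container):
    """Conversion operator: forget order and duplicates (works for bag or sequence)."""
    return frozenset(container)


# The same data held as a sequence, a bag and a relation.
seq = [Emp("Ann", "IT"), Emp("Bob", "HR"), Emp("Ann", "IT")]
bag = to_bag(seq)
rel = to_relation(bag)


def in_it(e: Emp) -> bool:
    return e.dept == "IT"


seq_it = restrict(seq, in_it)   # still a list, duplicates and order kept
bag_it = restrict(bag, in_it)   # still a Counter, multiplicity kept
rel_it = restrict(rel, in_it)   # still a frozenset, one tuple
```

The point of the design is that the user learns one operator and one tuple concept; only the container kind changes what happens to duplicates and order.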
Exploit the Nesting of Containers
Nesting is orthogonal.
‘External’ as well as ‘internal’ containers can be sets, bags or sequences.
[Figure: relation R with attributes A1, A2, A3, A4]

Conclusion
• The simplicity required is that of the user’s conceptual model of the software product.
• Simplicity maximises the ‘power to weight’ ratio of the software for the user.
• The software implementation may need to be (very) complex to achieve simplicity for the user.
( Apply principles of simplicity again in a layered architecture ? )

Acknowledgements
Nick Rossiter (PhD Supervisor)
Open DB Project Group
Paul Irvine
Chris Date, Hugh Darwen (Third Manifesto authors)
Paul Vickers, Akhtar Ali (Mid-Point Progression)
Alas, the mistakes are all mine.