Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Database History Why we want to use them Copyright © 1998-2016 by Curt Hill Six Generations of Data Management • • • • • • • Manual – Prehistory Punch cards – 1900 Programmed Unit record – 1955 Online database – 1965 Client Server, Relational – 1980 Multimedia – 1995 All of these still continue Copyright © 1998-2016 by Curt Hill Manual • Writing has existed for millennia • Kings used writing to inventory their goods, record their laws • Sumerian tablets date from 2000 BC or before Copyright © 1998-2016 by Curt Hill Punch cards • Originally used by Jacquard to program silk weaving machines – Not really data management • Herman Hollerith used to record census data in 1890 • Created a suite of machines that would punch, sort, print and tabulate from cards – Programmed by rewiring control panels • Known as unit record or electronic accounting machines Copyright © 1998-2016 by Curt Hill Machines Copyright © 1998-2016 by Curt Hill Programmed Unit record • Stored program computers change the face of data management • Cards still used for input • Tapes store the data much more densely than cards • Programming removes the limits on what sort of calculations or transformations may be done on the data • Produced a file-oriented record processing approach Copyright © 1998-2016 by Curt Hill Consider an example • A college has many files that describe their system – – – – – – – Faculty with department and specialty Staff with position Payroll Catalog of courses Grades Students Among many others • We will look at payroll and grades as an example Copyright © 1998-2016 by Curt Hill Payroll File Fields • • • • • Name Address Salary/wage Earnings Year to Date Among very many more Copyright © 1998-2016 by Curt Hill Payroll File Name Addr Salary YTD Martin, Barb 94 Ridge Jones, Al 121 Colton $30,000 $21,402 Johnson, Tom 604 46th $32,000 $25,425 $35,000 $26,952 Copyright © 1998-2016 by Curt Hill Grades Fields • Course – Including section • • • • Student name Term Letter grade Instructor Copyright © 1998-2016 by Curt Hill Grades File Crs. Name Term Grade Instr Math 103 Smith, Bill Fall 03 A Johnson CIS 385 Smith, Bill Fall 03 C Hill Hist 101 Johns, Sarah Fall 03 B Rupert Copyright © 1998-2016 by Curt Hill Background Vocabulary • Fields – Collection of related characters • Records – Collection of related fields • Files – Collection of related records • Database – Collection of related files Copyright © 1998-2016 by Curt Hill How to use • Actions we might want on each file: – – – – Create Update (add, remove, change records) Sort Generate any of several reports • Each action for each file would be a program for an overworked programming staff – Typically a COBOL program – Eight programs, or sections of programs, for two files • These are typically done in a batch environment Copyright © 1998-2016 by Curt Hill Online database • Many things do not work well in batch mode: – Travel reservations need up to the second information – The database is born • Started out as disk based unit record, but that is not the best organization for this type of application • Originally developed into two models: – Hierarchical and network Copyright © 1998-2016 by Curt Hill Client Server, Relational • EF Codd suggests the relational model and he and others develop a substantial theoretical base • Queries may now be simple and short – Needs to know a schema, but not complete organization – This allows transmission of a simple query – Client server computing is born Copyright © 1998-2016 by Curt Hill Relational Database • The key – All the programs previously described are about the same – every update is nearly the same – All that changes is the underlying file • The solution – Describe the file in a general way – Generate a program that handles the file based on the description Copyright © 19982016 by Curt Hill How to describe a file • A file is a collection of records • Each record is a collection of fields – Typically only one type of record in a file • Each field is described by a: – Name – Type • For example numeric, string, boolean etc. – Length • Booleans have a predefined length, others require specification Copyright © 19982016 by Curt Hill SQL • Structured Query Language • Has become the “standard” for queries • A relational database does not have to accept SQL – Unless it wants to be commercially viable • SQL is mostly declarative but with some procedural features – Declarative – what is wanted – Procedural – how to get it Copyright © 1998-2016 by Curt Hill The Files Name Addr Salary YTD Martin, Barb 94 Ridge $32,000 $25,425 Jones, Al 121 Colton $30,000 $21,402 Johnson, Tom 604 46th Crs. Name $35,000 Term $26,952 Grade Instr Math 103 Smith, Bill Fall 03 A Johnson CIS 385 Smith, Bill Fall 03 C Hill Hist 101 Johns, Sarah Fall 03 B Rupert Copyright © 1998-2016 by Curt Hill What is wrong with the original example? • Redundancy in faculty description – Space is wasted – Discrepancies may occur between grades and payroll • Some reports need to access multiple files – Eg. Transcript generation – Complicates the programming issue Copyright © 19982016 by Curt Hill Advantages (1 of 3) • Data independence – Application program no longer need some or all of the files – Do not know or care how data is stored, aka abstraction – Simplifies application development • Efficient access – The DBMS employs sophisticated access techniques seldom used by normal programmers Copyright © 19982016 by Curt Hill Advantages (2 of 3) • Integrity constraints – The DBMS may check data in a way seldom done in normal file processing – Eg. Account validity • Security – A DBMS may enforce requirements on who can access the data and in what way Copyright © 19982016 by Curt Hill Advantages (3 of 3) • Administration – Minimize redundancy – Manage sharing of the data – Optimize for the enterprise, not a small group – Easier to backup the data • Concurrent access – Manages the simultaneous update problem Copyright © 19982016 by Curt Hill Disadvantages • A DBMS is: – – – – Complex Expensive Bulky Simple file access is much quicker and less expensive – In the era of inexpensive hardware these are not as significant as they once were • The view a DBMS provides may not be helpful to a particular application Copyright © 19982016 by Curt Hill The Champion • The relational model was undisputed champion for a long time • It is still the most widely used database model • There are now many alternatives • Most fill a niche for which the relational model is not very well suited – There are many such niches now Copyright © 1998-2016 by Curt Hill NoSQL • All manner of new types of databases have now appeared • These are the so called NoSQL databases – SQL is the universal query language – NoSQL may mean no SQL or Not Only SQL • This is a field that is not finished developing Copyright © 1998-2016 by Curt Hill Finally • Enterprises have learned that the database is foundational to business intelligence • It is imperative to be able to store and retrieve the data in both a simple way and also in way not previously anticipated Copyright © 1998-2016 by Curt Hill