* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Lecture 2
Survey
Document related concepts
Data Protection Act, 2012 wikipedia , lookup
Expense and cost recovery system (ECRS) wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Data center wikipedia , lookup
Concurrency control wikipedia , lookup
Data analysis wikipedia , lookup
Versant Object Database wikipedia , lookup
Relational model wikipedia , lookup
Information privacy law wikipedia , lookup
3D optical data storage wikipedia , lookup
Data vault modeling wikipedia , lookup
Clusterpoint wikipedia , lookup
Transcript
INFO 340 Lecture 2 Intro to Databases • Book – Need it – Order individually through UW Bookstore – Or Amazon Previous Class • How many cards did your system require total? • How difficult was it to answer the standard queries? – The ad-hoc queries? What if -- iTunes Store via File Based Approach • • Imagine all details of a song encapsulated in a single file (or nest of files) Problems – Difficulty in accessing data • We need that particular application to access the data – Duplication of data • do we want a separate copy of the album cover for every single song ? – Data dependence • What if we want to get at the same data, but with a different application? – File format incompatibilities • What if the applications use different file formats? Ie one uses text, another XML, another binary, etc – Fixed queries • Largely dependent on app developer – Results in fixed (non- ‘ad-hoc’) queries and/or – Multiple applications Limitations of Files • • • • • • Separation and isolation of data Duplication of data Data Dependence Incompatible file formats Fixed queries / proliferation of apps Two main problems emerge: – Data definition is embedded in application – No control over access and data manipulation. Introducing Databases • Solves many problems introduced in a file system. • Multi-user, simultaneous connections • Security • Separation of application and data • Common query language • One copy of data Database Def (pg 15) A shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization Types of databases • Relational (we’ll focus on this type in this class) • Network and Hierarchical model • Object Oriented Database are “Self Describing” • System catalog • Data dictionary • Meta-data It is this local or in-place description of the data that allows “program-data independence” (BTW good quiz item) Data abstraction • Similar to programming classes – You can change the internal coding to the object (instantiated class) as long as the external behavior (behavior presented to the outside world) doesn’t change • Similarly, in databases, we can separate the data from the application Logically related • Entities -- distinct object • Attributes -- describes some aspect of the object • Relationships -- associations between the entities DBMS Database Management System • Provides facility to define the database via a Data Definition Language -- DDL • Provides facility to store, retrieve, & manipulate data through Data Manipulation Language – SQL (Structured Query Language) is the standard – Note SQL & MySQL are 2 different things DBMS • Controlled access to database: – Authorization & possibly authentication – Control of integrity/consistency – Handle multiple simultaneous users (concurrency) – Handle recovery -- return to a known, consistent state – User-accessible catalog Database Application Program • Program that interacts with a database. • Database can expose Views to the application. Views help to – Present the data to the user in a useful format – Choose who sees what • ie not all should see payroll – Customize data presentation • Same db can have many different looks/appearances • Generally better than having multiple databases – Allows db to hide any evolutionary structural changes to the database ‘behind the scenes’ Three (or four) levels of the DBMS External Level Conceptual Level • User’s view of the DB • Can change per user • What data is stored • Complete view of the database and relationships Internal Level • How data is stored • Represents how data is physically stored. Physical Data Organization • Operating system / DBMS interaction. Database Design • Data structure first & application second – Data structure will be the basis – Data structure is also harder to change on the fly – At the design phase, want to think it through as much as possible – Data design decisions can make application design easier (or harder) Database Design • Data Administrator -- “Data Monk” • Database Administrator • Logical Database Designer – Business rules • Ie only HR or finance person can see payroll • Physical Database Designer • Application developers • User types – Naïve – Sophisticated DBMS Advantages • Control (though not elimination) of redundancy • Consistency - reduce points of update/modification – Caused by ‘same’ data existing in multiple places • • • • • • Data sharing Integrity - use of constraints Security - users, roles, privileges Standardization forced Concurrency Backup & recovery DBMS Disadvantages • You pay for robustness • Complexity - Lots of forethought • Cost (upfront, but generally not over time) • Higher impact of failure - centralization • Conversion cost Schema & Instance Schema -- Description of the database Instance -- A functioning database at any point in time -- sometimes refers to a particular running installation DDL & DML • DDL – Language to specify the db • System uses sets of tables itself to support user tables/schema – System catalog – Data dictionary • DML – Language to update the db & retrieve data – Procedural • How – Non-Procedural • What • SQL DBMS Functions • Data storage, retrieval, update • User-accessible catalog – Data about data -- metadata – System catalog • Transaction support -- ie ‘rollback’ – Failure of db, network, etc before operation is complete • Concurrency support • Authorization services • Data comm support • Integrity services • Utility services - imports, exports, analysis, monitoring/alerts Some Behind the Scenes Processes of DBMS • • • • • • • • • Query Processor File Manager DML preprocessor DDL compiler - creates metadata tables Authorization control Integrity checker Query optimizer Transaction processor Recovery manager Entity Relationship Modeling/Entity Relationship Diagram (ERD) Introduction Example - Used car business with multiple lots, multiple cars, multiple sales people color make address name Engine size Phone number car M year M has has 1 1 Sales lots location Lighted? size salesperson In-class work • Groups of 5 • Create ERD for following 3 scenarios – iTunes (w/customers, songs, artists) – Live inventory (w/computers, users, & ‘onnet’ snapshots) – Hacker attack incident database • Attacker, victim, locations, computers (addresses, OS’s, other pertinent info,etc) • Present your results for each Assignment • Read Chapter 3-4 of Connolly & Begg