NEU 221: Neuroinformatics Seminar
Introduction to Databases
Bertram Ludäscher
San Diego Supercomputer Center
U.C. San Diego
Outline
• What is a DB and why should I care?
• DB Basics & Architecture
• Relational Model (SQL, ER)
• Extended/Other Models
– Deductive Databases
– Object-Oriented Databases
– Semistructured/Graph-Databases
B. Ludäscher: Introduction to Databases
What is a Database?
• The term database can stand for ...
– a concrete collection of data (books@amazon,
CCDB@NCMIR)
– a system (software & hardware) for storing and managing
databases (=> Database Management System: DBMS + DB)
• Underlying data model => Type of DBMS (short: DB)
– relational model: based on relations (“tables”) and entities
– object-oriented model: complex objects, classes
– object-relational model: relations + objects
– XML: “semistructured” model, trees
• Specialized/extended models
– deductive DBs
– multimedia DBs
Functions of a DBMS (aka what does it buy me?)
• Persistent Data Storage
– but don’t forget to back up!
• Efficient & High-Level Querying of Very Large Datasets
– file systems + your homegrown “scans” won’t do for VLDBs!!
• Same for Updates: insert, delete, and modify
• Data Integrity, Security
– Checking/enforcement of integrity constraints
– Access control
• Concurrent (multi-user) Access, Transactions, Recovery
• Robust, Scalable Data Management Solutions
3-Level ANSI/SPARC Database Architecture
• external (user) level: View 1, View 2, ..., View n
• conceptual (logical) level: conceptual/logical schema
• internal (physical) level: physical schema
=> Data independence
– logical data independence
– physical data independence
Concurrency Control
• Concurrent execution of simultaneous requests
– long before web servers were around...
– transaction management guarantees consistency despite
concurrent/interleaved execution
• Transaction = sequence of DB operations (read/write)
– Atomicity: a transaction is executed completely or not at all
– Consistency: a transaction creates a new consistent DB state,
i.e., in which all integrity constraints are maintained
– Isolation: to the user, a transaction seems to run in isolation
– Durability: the effect of a successful (“committed”) transaction
remains even after system failure
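Atomicity and consistency can be tried out with a small experiment. The following is a minimal sketch using Python's built-in sqlite3 module; the account table, the names, and the amounts are made-up illustration data, and the CHECK constraint stands in for an integrity constraint. A transfer that would violate the constraint is rolled back as a whole, so the credit that was already applied does not survive.

```python
import sqlite3

# Hypothetical toy schema: balances must never become negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("anne", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a later failure has something to undo.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current state as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "anne", "bob", 30)       # fits: committed
failed = transfer(conn, "anne", "bob", 500)  # would overdraw anne: rolled back
print(ok, failed, balances(conn))
```

The second transfer leaves no trace: the database state is the one produced by the last committed transaction.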
The Relational Model
• Relation/Table Name:
– employee, dept
• Attributes = Column Names:
– Emp, Salary, Deptno, Name, Mgr
• Relational Schema:
– employee(Emp:string, Salary:float, DeptNo:integer), ...
• Tuple = Row of the table:
– (“anne”, “62000”, “2”)
• Relation = Set of tuples:
– {(...), (...), ...}

employee
Emp    Salary  Deptno
john   60k     1
anne   62k     2
bob    57k     1
jane   45k     3

dept
DeptNo  Name   Mgr
1       Toys   anne
2       Sales  anne
3       Shoes  tim
Database Design: Entity-Relationship (ER) Model
• Entities: Employee, Department
• Relationships: works-for, Manager
• Attributes: Name, Salary, since
• ER Model:
– initial, high-level DB design (conceptual model)
– easy to map to a relational schema (database tables)
– comes with more constraints (cardinalities, aggregation) and
extensions: EER (is-a => class hierarchies)
– related: UML (Unified Modeling Language) class diagrams
Example: Creating a Relational Database in SQL
CREATE TABLE employee (
    ssn     CHAR(11),
    name    VARCHAR(30),
    deptNo  INTEGER,
    PRIMARY KEY (ssn),
    FOREIGN KEY (deptNo) REFERENCES department );

CREATE TABLE department (
    deptNo   INTEGER,
    name     VARCHAR(20),
    manager  CHAR(11),
    PRIMARY KEY (deptNo),
    FOREIGN KEY (manager) REFERENCES employee(ssn) );
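To see the key constraints at work, the two statements can be loaded into an in-memory SQLite database. This is a sketch under assumptions: SQLite is only one possible DBMS, it enforces foreign keys only after PRAGMA foreign_keys = ON, and the SSNs and names below are invented sample values. Because the two tables reference each other, the department is inserted first with no manager and updated afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.executescript("""
CREATE TABLE employee (
    ssn     CHAR(11),
    name    VARCHAR(30),
    deptNo  INTEGER,
    PRIMARY KEY (ssn),
    FOREIGN KEY (deptNo) REFERENCES department );
CREATE TABLE department (
    deptNo   INTEGER,
    name     VARCHAR(20),
    manager  CHAR(11),
    PRIMARY KEY (deptNo),
    FOREIGN KEY (manager) REFERENCES employee(ssn) );
""")

# Break the circular dependency: department without manager, then employee,
# then fill in the manager. All sample values are made up.
conn.execute("INSERT INTO department VALUES (1, 'Toys', NULL)")
conn.execute("INSERT INTO employee VALUES ('123-45-6789', 'anne', 1)")
conn.execute("UPDATE department SET manager = '123-45-6789' WHERE deptNo = 1")
conn.commit()

# Department 99 does not exist, so the foreign key constraint rejects this row.
try:
    conn.execute("INSERT INTO employee VALUES ('987-65-4321', 'bob', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, employee_count)
```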
Important Relational Operations
• Select(Relation, Condition)
– keep only the rows of a table that satisfy a condition
• Project(Relation, Attributes)
– keep the columns of interest
• Join(Rel1, Att1, Rel2, Att2, Condition)
– find “matches” in a “related” table
– e.g. match Rel1.foreign key = Rel2.primary key
• Union (“OR”), Intersection (“AND”)
• Set-Difference (“NOT IN”)
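The operations above can be sketched in a few lines of Python by treating a relation as a set of tuples. The function signatures and the schema tuples are assumptions chosen for illustration (the join here takes only the two attributes to match, not a general condition); the data are the employee and dept tables from the relational model slide, with salaries written in thousands.

```python
# Relations as sets of tuples; attribute names live in a separate schema tuple.
EMP_SCHEMA = ("Emp", "Salary", "DeptNo")
employee = {("john", 60, 1), ("anne", 62, 2), ("bob", 57, 1), ("jane", 45, 3)}
DEPT_SCHEMA = ("DeptNo", "Name", "Mgr")
dept = {(1, "Toys", "anne"), (2, "Sales", "anne"), (3, "Shoes", "tim")}


def select(relation, condition):
    """Keep the tuples for which condition(tuple) is true."""
    return {t for t in relation if condition(t)}


def project(relation, schema, attributes):
    """Keep only the named columns; duplicates vanish because it is a set."""
    idx = [schema.index(a) for a in attributes]
    return {tuple(t[i] for i in idx) for t in relation}


def join(rel1, schema1, att1, rel2, schema2, att2):
    """Concatenate every pair of tuples that agree on att1 = att2."""
    i, j = schema1.index(att1), schema2.index(att2)
    return {t1 + t2 for t1 in rel1 for t2 in rel2 if t1[i] == t2[j]}


toys_staff = select(employee, lambda t: t[2] == 1)
managers = project(dept, DEPT_SCHEMA, ["Mgr"])
emp_mgr = project(
    join(employee, EMP_SCHEMA, "DeptNo", dept, DEPT_SCHEMA, "DeptNo"),
    EMP_SCHEMA + DEPT_SCHEMA,
    ["Emp", "Mgr"],
)

# Union, intersection and set-difference are Python's |, & and - on sets.
names = project(employee, EMP_SCHEMA, ["Emp"])
not_managers = names - managers
print(sorted(emp_mgr), sorted(not_managers))
```

Projection onto Mgr returns two tuples, not three: a relation is a set, so the duplicate "anne" disappears.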
Why (Declarative) Query Languages?
“If you have a hammer,
everything looks like a nail.”
„Die Grenzen meiner Sprache bedeuten die Grenzen meiner Welt.“
“The limits of my language mean the limits of my world.”
Ludwig Wittgenstein, Tractatus Logico-Philosophicus
• Things we talk and think about in PLs and QLs
– Assembly languages: registers, memory locations, jumps, ...
– C: if-then-else, for, while, memory (de-)allocation, pointers, ...
– Object-oriented languages:
• C++: C plus objects, methods, classes, ...
• Java: objects, methods, classes, references, ...
• Smalltalk: objects, objects, objects, ...
• OQL: object-query language
– Functional languages (Haskell, ML):
• (higher-order) mappings, recursion/induction, patterns, ...
=> Relational languages (SQL, Prolog)
• relations, relational operations: select, project, join, union, intersection, set-difference, ...
=> Semistructured/XML (Tree) & Graph Query Languages
Example: Querying a Relational Database
input tables

employee
Emp    Salary  Deptno
anne   62k     2
john   60k     1

dept
DeptNo  Mgr
1       anne
2       anne

SQL query (or view def.)
SELECT Emp, Mgr
FROM employee, dept
WHERE employee.DeptNo = dept.DeptNo

result of the join: answer (or view)
Emp    Mgr
john   anne
anne   anne
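The example can be reproduced with Python's built-in sqlite3 module. This is a sketch, not part of the slides: the column types are assumptions (the slide shows salaries as strings like 62k), and the result is collected into a set because SQL does not guarantee a row order without ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (Emp TEXT, Salary TEXT, DeptNo INTEGER);
CREATE TABLE dept (DeptNo INTEGER, Mgr TEXT);
INSERT INTO employee VALUES ('anne', '62k', 2), ('john', '60k', 1);
INSERT INTO dept VALUES (1, 'anne'), (2, 'anne');
""")

# The query from the slide: join the two tables on the department number.
rows = conn.execute(
    "SELECT Emp, Mgr FROM employee, dept WHERE employee.DeptNo = dept.DeptNo"
).fetchall()
result = set(rows)
print(sorted(result))
```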
Query Languages for Relational Databases
Deductive Databases (DATALOG) Syntax
DATALOG: Examples for Relational Operations
Recursive DATALOG Example: Transitive Closure
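A common textbook formulation of transitive closure in Datalog uses two rules: tc(X,Y) :- edge(X,Y) and tc(X,Y) :- edge(X,Z), tc(Z,Y). The sketch below evaluates these rules bottom-up in Python until no new facts appear (naive fixpoint iteration). The relation names edge and tc and the sample graph are assumptions for illustration, not taken from the slide.

```python
def transitive_closure(edge):
    """Naive bottom-up evaluation of the two-rule program

        tc(X, Y) :- edge(X, Y).
        tc(X, Y) :- edge(X, Z), tc(Z, Y).

    `edge` is a set of (source, target) pairs.
    """
    tc = set(edge)  # first rule: every edge is in the closure
    while True:
        # second rule: extend known paths by one edge at the front
        derived = {(x, y) for (x, z) in edge for (z2, y) in tc if z == z2}
        if derived <= tc:  # fixpoint reached: nothing new was derived
            return tc
        tc |= derived


# Hypothetical sample graph: a -> b -> c -> d
edge = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(edge)
print(sorted(closure))
```

Plain relational algebra (select, project, join, union) cannot express this query for paths of arbitrary length; the recursion is what the second rule adds.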
Non-Relational Datamodels
• Relational model is “flat”: atomic data values
– extension: nested relational model (“tables within tables”, cf.
nested HTML tables)
– values can be nested lists [...], tuples (...), sets {...}
– ISO standard(s): SQL
– identity is value based
• Object-oriented data model:
– complex (structured) objects with object-identity (oid)
– class and type hierarchies (sub-/superclass, sub-/supertype)
– OODB schema may be very close to “world model” (no
translation into tables)
(+) queries fit your OO schema
(-) (new) queries that don’t fit nicely
– ODMG standard, OQL (Object Query Language)
Example: Object Query Language (OQL)
SELECT DISTINCT STRUCT(
    E: e.name,
    C: e.manager.name,
    M: ( SELECT c.name
         FROM c IN e.children
         WHERE FOR ALL d IN e.manager.children: c.age > d.age ) )
FROM e IN Employees;
• Q: what does this OQL query compute?
• Note the use of path expressions like e.manager.children
=> Semistructured/Graph Databases
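One way to read the OQL query is to transcribe it into Python over plain objects. This is a sketch under assumptions: the Person and Employee classes and the sample people are invented, every employee in the queried collection is assumed to have a manager, and the inner subquery result is returned as a tuple. For each employee the query yields the employee's name, the manager's name, and the names of those children of the employee who are older than every child of the manager.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    name: str
    age: int


@dataclass
class Employee:
    name: str
    children: List[Person] = field(default_factory=list)
    manager: Optional["Employee"] = None


def query(employees):
    """Python reading of the OQL query; DISTINCT is done by skipping repeats."""
    result = []
    for e in employees:  # FROM e IN Employees
        row = (
            e.name,          # E: e.name
            e.manager.name,  # C: e.manager.name (a path expression)
            tuple(           # M: the nested subquery
                c.name
                for c in e.children
                if all(c.age > d.age for d in e.manager.children)
            ),
        )
        if row not in result:
            result.append(row)
    return result


# Hypothetical sample data.
tim = Employee("tim")
anne = Employee("anne", [Person("kim", 10)], manager=tim)
john = Employee("john", [Person("lee", 12), Person("max", 8)], manager=anne)
answer = query([anne, john])
print(answer)
```

Since tim has no children, the FOR ALL condition holds trivially for anne's child, just as all() over an empty sequence is true in Python.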
A Semistructured (Graph) Database
Querying Graphs with OO-Path Expressions
?- dblp."Inf. Systems".L."Michael E. Senko".
Answer:
L="Volume 1, 1975";
L="Volume 5, 1980".
?- dblp."Inf. Systems".L.P, substr("Volume",L),
P : person.spouse[lives_in = P.lives_in].
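The first query binds the variable L to every edge label that lies between "Inf. Systems" and "Michael E. Senko". A minimal sketch of that idea in Python models the graph as nested dictionaries keyed by edge labels. The toy graph below is an assumption built only to be consistent with the answer shown above (the third volume and its author are placeholders), and the function name is hypothetical.

```python
# Toy stand-in for the graph: each dictionary key is an edge label.
dblp = {
    "Inf. Systems": {
        "Volume 1, 1975": {"Michael E. Senko": {}},
        "Volume 5, 1980": {"Michael E. Senko": {}},
        "Volume 9, 1984": {"A. N. Other": {}},  # placeholder entry
    }
}


def bind_label(graph, first, last):
    """Bindings for L in the path expression  graph.first.L.last"""
    return sorted(
        label
        for label, subtree in graph.get(first, {}).items()
        if last in subtree
    )


bindings = bind_label(dblp, "Inf. Systems", "Michael E. Senko")
print(bindings)
```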
Constructs for Querying Graphs
Example: ?- dblp . any* . (if(vldb)| if(sigmod))