Geog 480: Principles of GIS
Guofeng Cao
CyberInfrastructure and Geospatial Information Laboratory
Department of Geography
National Center for Supercomputing Applications (NCSA)
University of Illinois at Urbana-Champaign
What we have learned
• Database concepts:
• Database and Database Management System (DBMS)
• Elements of a DBMS
• Transaction Management: Recovery and Concurrency
• Relational Database
o Relations (tables)
o Operations on relations:
select, project, join
o Relational database and spatial data:
• Structure of spatial data does not naturally fit with tables
• Performance is impaired by the need to perform multiple joins
with spatial data (spatial join)
• Indexes are non-spatial in a conventional relational database
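The three relational operations recalled above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the town and cinema tables and all of their rows are made up for illustration, and SQLite stands in for a full DBMS.

```python
import sqlite3

# Hypothetical data: two towns and three cinemas, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE town   (name TEXT PRIMARY KEY, population INTEGER);
    CREATE TABLE cinema (name TEXT PRIMARY KEY, town TEXT REFERENCES town(name));
    INSERT INTO town   VALUES ('Alphaville', 20000), ('Betatown', 80000);
    INSERT INTO cinema VALUES ('Rex', 'Alphaville'), ('Odeon', 'Betatown'), ('Plaza', 'Betatown');
""")

# select: keep only the rows (tuples) that satisfy a condition
selected = conn.execute("SELECT * FROM town WHERE population > 50000").fetchall()

# project: keep only some of the columns (attributes)
projected = conn.execute("SELECT name FROM town ORDER BY name").fetchall()

# join: combine rows of two relations that agree on a shared value
joined = conn.execute(
    "SELECT cinema.name, town.name, town.population "
    "FROM cinema JOIN town ON cinema.town = town.name "
    "ORDER BY cinema.name"
).fetchall()

print(selected)
print(projected)
print(joined)
```

Note that the join needs no spatial knowledge here; it only matches a town name stored in both tables.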
Conceptual data model
• A conceptual data model provides a
model of the proposed system that
is independent of implementation
details
• An effective conceptual model will
o provide a means for communication
between analysts, designers and
users
o aid the design of the system
o provide basic reference material for
the implemented system
Entity relationship model #1
• The entity relationship model is a
conceptual data modeling
technique where
o An entity type represents a
collection of similar objects
o An entity instance is an
occurrence of a particular entity
o An attribute type is a property
associated with an entity
• An attribute type that serves to
uniquely identify occurrences of an
entity type is called an identifier/key
o Identifiers/keys are usually
underlined
[Figure: E-R diagram labeling an entity type, an attribute type, and an identifier]
Entity relationship model #2
• Entity types are connected using
relationships
o A relationship type connects one or
more entity types
o A relationship occurrence is a
particular instance of a relationship
• Relationships may have their own
attributes independent of entities
• Entity, attribute, and relationship
types are shown in an entity
relationship diagram (E-R diagram)
[Figure: E-R diagram labeling a relationship type]
Entity relationship model #3
• Relationship types may be
o many-to-many: e.g., a town may have many roads, which in turn may
pass through many towns
o many-to-one: e.g., a town may have many cinemas, but a cinema can
be located in at most one town
o one-to-one: e.g., a cinema may have one manager who manages only
one cinema
• These constraints constitute cardinality conditions
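As a rough illustration of the three cases, the sketch below classifies a set of relationship occurrences by counting how often each entity appears. The function name cardinality and the sample pairs are hypothetical. A cardinality condition is a rule about the relationship type, so a sample of occurrences can only show the tightest condition it is consistent with.

```python
from collections import Counter


def cardinality(pairs):
    """Classify (left, right) relationship occurrences by the tightest fit."""
    pairs = set(pairs)
    left_counts = Counter(left for left, _ in pairs)     # occurrences per left entity
    right_counts = Counter(right for _, right in pairs)  # occurrences per right entity
    left_repeats = any(count > 1 for count in left_counts.values())
    right_repeats = any(count > 1 for count in right_counts.values())
    if left_repeats and right_repeats:
        return "many-to-many"
    if left_repeats or right_repeats:
        return "many-to-one"
    return "one-to-one"


# A town has many roads, and a road passes through many towns.
town_road = [("T1", "R1"), ("T1", "R2"), ("T2", "R1")]
# A town has many cinemas, but each cinema is in one town.
town_cinema = [("T1", "C1"), ("T1", "C2"), ("T2", "C3")]
# Each cinema has one manager, who manages only that cinema.
cinema_manager = [("C1", "M1"), ("C2", "M2")]

print(cardinality(town_road))
print(cardinality(town_cinema))
print(cardinality(cinema_manager))
```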
Entity relationship model #4
• In addition to cardinality conditions, relationships may also have
participatory conditions:
o optional or mandatory (indicated with a double line)
• A relationship from an entity to itself is called involutory
[Figure: the involutory relationship Buddies connects the entity type Drinkers to itself]
• A relationship connecting three entities is called a ternary
relationship
Design Guidelines
• Avoid redundancy.
• Don’t use an entity when an attribute will do.
Avoiding Redundancy
• Redundancy occurs when we say the same thing in two
different ways.
• Redundancy wastes space and (more importantly)
encourages inconsistency.
o The two instances of the same fact may become inconsistent if we change one and
forget to change the other, related version.
Example: Good or Bad?
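[E-R diagram: entity type Beers (attributes name, manf) connected by the relationship ManfBy to entity type Manfs (attributes name, addr)]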
This design states the manufacturer of a beer twice: as an
attribute and as a related entity.
Example: Good or Bad?
[E-R diagram: entity type Beers (attribute name) connected by the relationship ManfBy to entity type Manfs (attributes name, addr)]
This design gives the address of each manufacturer exactly once.
Example: Good or Bad?
[E-R diagram: a single entity type Beers with attributes name, manf, manfAddr]
This design repeats the manufacturer’s address once for each
beer; loses the address if there are temporarily no beers for a
manufacturer.
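Both problems with the single-table design can be reproduced in a few lines. This is a minimal sketch using Python's built-in sqlite3 module; the beer names, the manufacturer name, and the addresses are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Single-table design: the manufacturer's address is repeated for every beer.
conn.execute("CREATE TABLE beers (name TEXT PRIMARY KEY, manf TEXT, manfAddr TEXT)")
conn.executemany(
    "INSERT INTO beers VALUES (?, ?, ?)",
    [("Beer A", "Manf X", "1 Old Road"), ("Beer B", "Manf X", "1 Old Road")],
)

# Inconsistency: change the address on one row and forget the other.
conn.execute("UPDATE beers SET manfAddr = '2 New Road' WHERE name = 'Beer A'")
addresses = conn.execute(
    "SELECT DISTINCT manfAddr FROM beers WHERE manf = 'Manf X' ORDER BY manfAddr"
).fetchall()
print(addresses)  # the same manufacturer now has two different addresses

# Lost information: once the manufacturer has no beers, its address is gone too.
conn.execute("DELETE FROM beers WHERE manf = 'Manf X'")
remaining = conn.execute("SELECT COUNT(*) FROM beers").fetchone()[0]
print(remaining)
```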
Entity Versus Attributes
• An entity should satisfy at least one of the following
conditions:
o It is more than the name of something; it has at least one nonkey attribute.
o It is the “many” in a many-one or many-many relationship.
Example: Good or Bad?
[E-R diagram: entity type Beers (attribute name) connected by the relationship ManfBy to entity type Manfs (attribute name)]
Example: Good or Bad?
[E-R diagram: entity type Beers (attribute name) connected by the relationship ManfBy to entity type Manfs (attributes name, addr)]
• Manfs deserves to be an entity set because of the nonkey
attribute addr.
• Beers deserves to be an entity set because it is the “many” of
the many-one relationship ManfBy.
Example: Good or Bad?
[E-R diagram: a single entity type Beers with attributes name, manf]
There is no need to make the manufacturer an entity type,
because we record nothing about manufacturers besides their
name.
Extended entity relationship model
• The extended entity relationship model (EER) adds further features:
o An entity type E1 is a subtype of E2 if every occurrence of E1 is also an
occurrence of E2. In this case, E2 is a supertype of E1
o The operation of forming subtypes is called specialization; the inverse
operation of forming supertypes is called generalization
• For specialization (and conversely for generalization)
o A subtype has the same identifying attribute(s) as the supertype
o A subtype has all the attributes of the supertype, and possibly some more
o A subtype enters into all the relationships in which the supertype is
involved, and possibly some more.
• Subtypes and supertypes are organized into an inheritance hierarchy
Extended entity relationship model
• Subtypes may be:
o disjoint: where no occurrence of one subtype is an occurrence
of another
o overlapping: subtypes are not disjoint
• EER uses an extended diagrammatic notation to represent
specialization/generalization constructs
[Figure: EER notation for a supertype with disjoint subtypes and with overlapping subtypes]
EER for spatial information #1
• E-R or EER can be used to model spatial entities
• Most vector-based GIS use a similar structure
(e.g., the Coverage file or Geodatabase of ArcGIS)
[Figure: EER diagram relating the spatial entities node, directed arc, and area]
EER for spatial information #2
Relational database design: From E-R model to Database Schema
• An E-R model can be transformed into a relational database
scheme
• Advantageous features for a relational database scheme
are:
o Lack of redundancy (redundant data wastes space and causes
integrity problems)
o Fast access to data
• There usually exists a balance between space (lack of
redundancy) and speed (fast access to data)
o Many relations lead to lower redundancy, but more joins (slower
speed)
o Fewer relations lead to fewer joins (faster speed), but greater
redundancy (and integrity problems)
Example
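[E-R diagram: entity type Star (attributes Name, Birth year, gender) and entity type Film (attributes Title, director, year, length), connected by the many-to-many (M:N) relationship Cast (attribute Role)]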
Redundancy
• For example, the following relation and relation
scheme will be able to achieve fast access but
involve considerable redundancy
Removing redundancy
Building relational schemes
• Another guideline is to ensure relations are in first normal
form, a process known as normalization
• A first pass at building a relational scheme from an E-R
model is to:
o Convert each entity into a relation
o Convert each relationship into a relation
• However, not all relationships will require a relation
(combining relations)
o For entities in a mandatory many-to-one relationship, we can always
opt to define a single joined relation in the relation scheme,
known as posting the foreign key
Example: Relationship -> Relation
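[E-R diagram: entity type Drinkers (attributes name, addr) and entity type Beers (attributes name, manf), connected by the relationships Likes and Favorite; Buddies relates Drinkers to itself (roles 1 and 2), as does Married (roles husband and wife)]
Likes(drinker, beer)
Favorite(drinker, beer)
Buddies(name1, name2)
Married(husband, wife)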
Combining Relations
• It is OK to combine the relation for an entity E with the
relation R for a many-one relationship from E to another
entity.
• Example: Drinkers(name, addr) and Favorite(drinker, beer)
combine to make Drinker1(name, addr, favBeer).
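The combined relation Drinker1 can be sketched with Python's built-in sqlite3 module. The schema follows the relation named above. The manufacturer names, the second drinker, and the rejected beer name are made up, and SQLite stands in for the course's DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
    CREATE TABLE Beers (name TEXT PRIMARY KEY, manf TEXT);
    -- Favorite is many-one from Drinkers to Beers, so it is folded into
    -- Drinker1 as the foreign key favBeer instead of getting its own table.
    CREATE TABLE Drinker1 (
        name    TEXT PRIMARY KEY,
        addr    TEXT,
        favBeer TEXT REFERENCES Beers(name)
    );
    INSERT INTO Beers VALUES ('Bud', 'Manf X'), ('Miller', 'Manf Y');
    INSERT INTO Drinker1 VALUES ('Sally', '123 Maple', 'Bud');
""")

# Each drinker has at most one favorite, so one row per drinker is enough.
rows = conn.execute("SELECT name, addr, favBeer FROM Drinker1").fetchall()
print(rows)

# The foreign key keeps favBeer pointing at a beer that actually exists.
try:
    conn.execute("INSERT INTO Drinker1 VALUES ('Fred', '9 Oak', 'NoSuchBeer')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Because the relationship is many-one, the address still appears exactly once per drinker after combining.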
Risk with Many-Many Relationships
• Combining Drinkers with Likes would be a mistake. It leads
to redundancy, as:
name    addr        beer
Sally   123 Maple   Bud
Sally   123 Maple   Miller
Redundancy: Sally's address is stored once for each beer she likes.
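The same effect can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch using the Sally rows from the slide, with SQLite standing in for the course's DBMS. Combining Drinkers with Likes amounts to storing the result of their join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Drinkers (name TEXT PRIMARY KEY, addr TEXT);
    CREATE TABLE Likes (drinker TEXT, beer TEXT, PRIMARY KEY (drinker, beer));
    INSERT INTO Drinkers VALUES ('Sally', '123 Maple');
    INSERT INTO Likes VALUES ('Sally', 'Bud'), ('Sally', 'Miller');
""")

# Kept separate, Sally's address is stored exactly once.
stored_once = conn.execute(
    "SELECT COUNT(*) FROM Drinkers WHERE name = 'Sally'"
).fetchone()[0]

# A combined relation would hold one row per (drinker, beer) pair.
combined = conn.execute(
    "SELECT d.name, d.addr, l.beer "
    "FROM Drinkers d JOIN Likes l ON l.drinker = d.name "
    "ORDER BY l.beer"
).fetchall()

print(stored_once)
print(combined)
```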
• How to represent the following spatial data set in relations?
Object-orientation
Foundations of object-orientation
• The object is at the core of object-orientation
• Objects have attributes that model the static, data-oriented
aspects of a system (similar to tuples in a relation)
o The totality of attribute values constitutes the state of an object
• Objects also have operations that model the behavior of a
system
o Behaviors are also called methods
• Objects with similar behaviors are grouped into classes
o The set of behaviors for an object forms an interface
object = state + behavior
Example of object-orientation
Features of object-orientation
• The four main features of object-orientation from a
modeling perspective are:
o Reduces complexity: decomposes complex phenomena into simpler
objects
o Combats impedance mismatch: object-orientation can be applied at
every level of system development
o Promotes reuse: System development is more efficient if
constructed from collections of well-understood components
o Metaphorical power: Objects in object-orientation are metaphors for
physical objects, making the modeling process easier
• In addition, four key constructs are closely associated with
object-orientation: identity, encapsulation, inheritance, and
association
Identity and encapsulation
• An object has an identity that is independent of its attribute
values
o Even if an object changes all its attribute values, it retains its
identity
o Identity is immutable, created with an object and destroyed only
when that object is destroyed
• Objects hide the internal mechanisms of their behavior from
the external access to that behavior, called encapsulation
o What behaviors an object exhibits are separated from how those
behaviors are achieved
o Encapsulation promotes reuse, because changes to an object’s
internal mechanisms will not affect the object’s external interface
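Identity and encapsulation can both be seen in a small Python class. This is a sketch under assumptions: the Parcel class and its values are hypothetical, and the leading underscore is only a Python convention for "internal", not an enforced barrier.

```python
class Parcel:
    """A land parcel whose area is stored internally in square metres."""

    def __init__(self, owner, area_m2):
        self.owner = owner
        self._area_m2 = area_m2  # internal representation, hidden by convention

    def area_hectares(self):
        """External behavior: callers do not need to know how area is stored."""
        return self._area_m2 / 10_000


a = Parcel("Ann", 20_000)
b = Parcel("Ann", 20_000)  # same attribute values as a
alias = a                  # a second name for the same object

print(a is b)        # equal state, but two distinct identities
a.owner = "Bob"      # the state changes ...
print(alias is a)    # ... the identity does not
print(a.area_hectares())
```

If the class later stored hectares directly, area_hectares() could keep the same interface, so code that calls it would not need to change.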
Inheritance and polymorphism
• Classes may be organized into an inheritance hierarchy that allows
objects to share common properties
o A class that provides more specialized behaviors is a subclass
o A class that provides more generalized behaviors is a superclass
• Inheritance allows objects to perform different roles within specific
contexts, termed polymorphism
o Inclusion polymorphism is where a subclass is substituted for a superclass
o Overloading is where subclasses implement their own specialized versions
of general behaviors
• There exist two types of inheritance:
o Single inheritance: each class may have zero or one superclasses
o Multiple inheritance: each class may have zero or more superclasses
(requires some protocol for resolving behavioral conflicts)
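A short Python sketch of single inheritance and both kinds of polymorphism follows. The Shape, Rectangle, and Circle classes are hypothetical. What the slide calls overloading (a subclass supplying its own version of a general behavior) is usually called overriding in programming languages such as Python.

```python
import math


class Shape:
    """Superclass: the general behaviors shared by all shapes."""

    def area(self):
        raise NotImplementedError("each subclass supplies its own area()")

    def describe(self):
        # Written once here, yet it works for every subclass.
        return f"{type(self).__name__} with area {self.area():.2f}"


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width, self.height = width, height

    def area(self):  # specialized version of the general behavior
        return self.width * self.height


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):  # another specialized version
        return math.pi * self.radius ** 2


def total_area(shapes):
    """Accepts any Shape: subclasses are substituted for the superclass."""
    return sum(shape.area() for shape in shapes)


shapes = [Rectangle(2, 3), Circle(1)]
descriptions = [shape.describe() for shape in shapes]
print(descriptions)
print(total_area(shapes))
```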
Class diagram
[Figure: class diagram labeling a superclass, a subclass, a behavior, (single) inheritance, and overloading (polymorphism)]
Association
• An association groups objects together in order to model
phenomena with complex internal structure
• Aggregation is a type of association concerned with
part/whole relationships (e.g. a wheel is “part of” a car)
o Aggregation relationships will form a hierarchy often referred to as a
partonomy
• An association is homogeneous if it is formed from objects
all of the same class. E.g., a soccer team is a homogeneous
association (aggregation)
• An association is ordered where the ordering of component
objects is important. E.g., a polyline might be a linear
ordering of points
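The polyline example can be sketched as an ordered, homogeneous association in Python. The Point and Polyline classes and the coordinates are hypothetical. The point of the sketch is that changing the order of the component objects changes the object being modeled.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


class Polyline:
    """Ordered, homogeneous association: a sequence of Point objects."""

    def __init__(self, points):
        self.points = list(points)  # the list preserves the ordering

    def length(self):
        # Sum the distances between consecutive points, in order.
        return sum(
            math.dist((p.x, p.y), (q.x, q.y))
            for p, q in zip(self.points, self.points[1:])
        )


a, b, c = Point(0, 0), Point(3, 4), Point(3, 10)
line = Polyline([a, b, c])
reordered = Polyline([a, c, b])  # same parts, different order

print(line.length())
print(reordered.length())
```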
Object-oriented modeling #1
• Object-oriented modeling comprises defining the classes,
attributes, behaviors, associations, and inheritance for a
system
o Attributes for a class can be defined in a similar way to E-R modeling
• Behaviors for a class fall into three categories
o Constructors are behaviors that are activated when an object is
created, while destructors are activated when an object is destroyed
o Accessors are behaviors that may be used to examine the state of
an object
o Transformers are behaviors that change the state of an object
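The three categories map onto ordinary methods in Python. This is a minimal sketch with a hypothetical Road class. Destructors are left out because Python reclaims objects automatically and the timing of its optional __del__ hook depends on the interpreter.

```python
class Road:
    def __init__(self, name, lanes):
        """Constructor: runs when the object is created and sets its state."""
        self._name = name
        self._lanes = lanes

    def name(self):
        """Accessor: examines the state without changing it."""
        return self._name

    def lanes(self):
        """Accessor: examines the state without changing it."""
        return self._lanes

    def widen(self, extra_lanes=1):
        """Transformer: changes the state of the object."""
        self._lanes += extra_lanes


road = Road("Green Street", 2)
before = road.lanes()
road.widen()
after = road.lanes()
print(road.name(), before, after)
```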
Object-oriented modeling #2
• Defining associations and inheritance relationships is an
iterative and application-dependent process
• As a rule of thumb:
o Inheritance relationships can be detected by using the connection “is a” in a
sentence with two classes. E.g., ‘a car “is a” vehicle’
o Aggregation relationships can be detected using “part of” in a sentence. E.g., ‘a
steering wheel is “part of” a car’
Class diagrams
[Figure: class diagram labeling an attribute, a constructor, an accessor, a transformer, an association, and an aggregation]
Object-oriented DBMS
• A DBMS that utilizes an object-oriented data model is called an object-oriented DBMS (OODBMS)
• In addition to OO constructs, several other features are needed by
OODBMS
o Scheme management (ability to create and change class schemes)
o Automatic query optimization
o Storage and access management
o Transaction management
• There exist technical problems with achieving these features:
o System complexity means that there are no longer a few simple operators, like in
relational systems
o Encapsulation means that internal state may be hidden from DBMS
• As a result, performance for OODBMS is lower than for RDBMS
• Hybrid object-relational DBMS (ORDBMS) use a combination of
relational data management and an object-oriented “shell” for mediating
user access to the DBMS
Reading
• Chap. 2
• http://www.spatial.maine.edu/~max/oomodeling.pdf
Hands on
• Connecting to Server
o Use openssh client (Start → All Programs → OpenSSH)
o hostname = geog480.cigi.illinois.edu
o username = netid
o port = 22
o Enter your netid password when prompted
• If successful, you have just logged in to a Linux system (Ubuntu):
o Out of your comfort zone
Unix Basics
• Folders and directories
Unix Basic Commands
Directory command:
pwd               Print the name of the working directory
cd Exercise1      Change the working directory to Exercise1
mkdir Exercise2   Make a new directory and call it Exercise2
rmdir temp        Delete the (empty) directory temp
Basic file command:
ls               List the files and directories in the working (current) directory
cat File1        Display the contents of the file
mv File1 File3   Change the name of (move) file File1 to File3
cp File1 File3   Make a copy of File1 and call it File3
rm File4         Erase (remove) the file File4
less File1       Display the contents of File1 a page at a time, q to stop displaying
Connecting to Database
• psql -U username -d database_name
o username = geog480
o database_name = tutorial
o Enter password when prompted (same as username)
• Postgres Commands
o \l List all accessible databases
o \dt List all the tables in current DB
o \? Help
o \q Quit
Operating Database
• Create Table
o create table your_netid (key int, attr varchar(20), value float);
• Insert a row
o insert into your_netid values(1, 'attr0', 100);
o insert into your_netid values(2, 'attr1', 101);
o insert into your_netid values(3, 'attr1', 102);
• List contents of table (Notice that the select statement
allows you to view contents in the table and the where
clause allows you to filter which records you want to
view)
o select * from your_netid;
o select * from your_netid where attr='attr1';
o select * from your_netid where key=2;
o select key, value from your_netid limit 5;
• Update table contents
o update your_netid set attr='attr1' where key=1;
o update your_netid set value=105 where key=1;
• Sorting
o select * from your_netid Order by key asc;
o select * from your_netid Order by key desc;
• Counting
o select count(*) from your_netid;
o select count(*) from your_netid where attr like '%1';
• Max/Min/Avg
o select max(value) from your_netid;
o select avg(value) from your_netid where attr ilike '%1%';
• Delete Rows
o delete from your_netid where key=1;
• Copying a CSV file (postgres specific)
o \copy your_netid FROM 'your_file' with CSV HEADER
o You may use /srv/cigi/code/test.csv for your_file
• Drop Table
o drop table your_netid;
• End of this topic