Data Models for Ecological Databases
John Porter
Department of Environmental Sciences
University of Virginia
Characteristics of Ecological Data
[Figure: types of data plotted by data volume per dataset (low to high) against complexity/metadata requirements (low to high). Labels: satellite images, GIS, weather stations, business data, primary productivity, gene sequences, biodiversity surveys, population data, soil cores.]
Choosing a DBMS
• What tasks do you want the DBMS to accomplish?
  – query
  – sorting
  – analysis
• Is there a type of DBMS whose structure best mirrors that of the underlying data?
Database Management System (DBMS) Types
• File system-based
• Hierarchical
• Network
• Relational
• Object-oriented
Advantages and Disadvantages of using a DBMS
Advantages
• additional capabilities
  – sorting
  – query
  – integrity checking
• easy access to data
Disadvantages
• few graphical or statistical capabilities
• proprietary formats may limit archival quality of data
• require expertise and resources to administer
File-System Based
[Diagram: a directory containing files.]
• very simple and easy to set up
• inefficient
• few capabilities
Hierarchical
[Diagram: a tree rooted at Project, with nodes labeled Datasets, Investigators, Variables, Locations, Codes, and Methods.]
• efficient
• not very general
• e.g. phylogenetic structures, geographical images
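
A minimal sketch of what "efficient" but "not very general" can mean for a hierarchical store. The nesting used here (Variables and Locations under Datasets) is an assumption made for illustration; the slide's diagram does not survive in enough detail to confirm the exact parent of each node, and `find_path` is a hypothetical helper.

```python
# Hypothetical tree; node names come from the slide, the nesting is assumed.
PROJECT = {
    "name": "Project",
    "children": [
        {"name": "Datasets", "children": [
            {"name": "Variables", "children": []},
            {"name": "Locations", "children": []},
        ]},
        {"name": "Investigators", "children": []},
    ],
}


def find_path(node, target, path=()):
    """Return the single root-to-node path to `target`, or None if absent."""
    path = path + (node["name"],)
    if node["name"] == target:
        return path
    for child in node["children"]:
        found = find_path(child, target, path)
        if found is not None:
            return found
    return None


locations_path = find_path(PROJECT, "Locations")
print(" > ".join(locations_path))
```

Each node has exactly one parent, so a lookup follows one path from the root. The same property is the limitation: a location shared by two datasets would have to be stored twice, once under each parent.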
Network Database
[Diagram: Projects, Datasets, and Locations connected by links.]
Links are hard-coded into the database. They are not a property of the data.
• very flexible
• unwieldy to modify
• not widely used
Relational Database
[Diagram: Projects, Datasets, and Locations tables linked through shared fields such as Location_id and Data_id.]
Linkages are through the properties of the data itself, not hard-coded.
• widely used, mature
• table-oriented
• restricted range of structures
Object-Oriented
[Diagram: an object combining methods with a data structure.]
• developing, few commercial implementations
• diverse structures
• extensible
Data Modeling
• Data modeling is used to develop the structures used in a database
• Your data model affects
  – reliability of the data
  – efficiency and speed of queries
  – the complexity of the database
• Data modeling is an art, not a science!
Flat-file
Genus    Species  Common Name  Observer    Date
Quercus  alba     White Oak    Jones, D.   15-Jun-1998
Quercus  alba     White Oak    Smith, D.   12-Jul-1935
Quercus  alba     White Oat    Doe, J.     15-Sep-1920
Quercus  rubra    Red Oak      Fisher, K.  15-Jun-1998
Quercus  rubra    Red Oak      James, J.   15-Sep-1920
[Diagram: a single Observation entity with fields Genus, Species, Common Name, Observer, Date.]
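
Because a flat file repeats the genus, species, and common name on every row, the copies can drift apart. The sketch below transcribes the five rows from the flat-file slide and looks for taxa recorded under more than one common name. The function name `inconsistent_common_names` is hypothetical, introduced only for this illustration.

```python
from collections import defaultdict

# Rows from the flat-file slide: (genus, species, common name, observer, date).
FLAT_FILE = [
    ("Quercus", "alba", "White Oak", "Jones, D.", "15-Jun-1998"),
    ("Quercus", "alba", "White Oak", "Smith, D.", "12-Jul-1935"),
    ("Quercus", "alba", "White Oat", "Doe, J.", "15-Sep-1920"),
    ("Quercus", "rubra", "Red Oak", "Fisher, K.", "15-Jun-1998"),
    ("Quercus", "rubra", "Red Oak", "James, J.", "15-Sep-1920"),
]


def inconsistent_common_names(rows):
    """Map each (genus, species) with conflicting common names to those names."""
    names = defaultdict(set)
    for genus, species, common_name, _observer, _date in rows:
        names[(genus, species)].add(common_name)
    return {taxon: sorted(found) for taxon, found in names.items() if len(found) > 1}


conflicts = inconsistent_common_names(FLAT_FILE)
print(conflicts)
```

On these rows the check flags Quercus alba, which appears with two different common names. Nothing in the flat-file structure prevents that kind of inconsistency.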
Normalization
• One widely used approach for reducing errors within a database is to normalize your data structures
• Normalization is the process of eliminating duplicate or redundant information
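
One way to picture normalization is as a pass over the flat rows that stores each species once and leaves only a short code on every observation. This is a sketch under stated assumptions: the codes QRCALB and QRCRBR are taken from the two-table slide, the slides do not say how such codes are derived (so they are supplied as a lookup), and `normalize` is a hypothetical helper that keeps the first common name it sees for each code.

```python
# Flat rows: (genus, species, common name, observer, date).
FLAT_FILE = [
    ("Quercus", "alba", "White Oak", "Jones, D.", "15-Jun-1998"),
    ("Quercus", "alba", "White Oak", "Smith, D.", "12-Jul-1935"),
    ("Quercus", "alba", "White Oat", "Doe, J.", "15-Sep-1920"),
    ("Quercus", "rubra", "Red Oak", "Fisher, K.", "15-Jun-1998"),
    ("Quercus", "rubra", "Red Oak", "James, J.", "15-Sep-1920"),
]

# Codes as shown on the two-table slide; their derivation is not stated there.
SPEC_CODES = {("Quercus", "alba"): "QRCALB", ("Quercus", "rubra"): "QRCRBR"}


def normalize(rows, spec_codes):
    """Split flat rows into a species table and an observation table."""
    species = {}       # spec_code -> (genus, species, common name), stored once
    observations = []  # (spec_code, observer, date), one entry per flat row
    for genus, sp, common_name, observer, date in rows:
        code = spec_codes[(genus, sp)]
        # Assumption: the first spelling encountered is kept as the single record.
        species.setdefault(code, (genus, sp, common_name))
        observations.append((code, observer, date))
    return species, observations


species_table, observation_table = normalize(FLAT_FILE, SPEC_CODES)
for code, record in species_table.items():
    print(code, *record)
print(len(observation_table), "observations")
```

After the split, a common name lives in exactly one place, so a misspelling can no longer disagree with other copies of itself. A real migration would need a rule for deciding which of several conflicting values is correct, rather than simply keeping the first.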
Two-table Relational Database
Species table
Spec_code  Genus    Species  Common Name
QRCALB     Quercus  alba     White Oak
QRCRBR     Quercus  rubra    Red Oak

Observation table
Spec_code  Observer    Date
QRCALB     Jones, D.   15-Jun-1998
QRCALB     Smith, D.   12-Jul-1935
QRCALB     Doe, J.     15-Sep-1920
QRCRBR     Fisher, K.  15-Jun-1998
QRCRBR     James, J.   15-Sep-1920
[Diagram: Species and Observation entities linked through Spec_code.]
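
The two-table design can be tried out with Python's built-in `sqlite3` module. The sketch below is illustrative only: the table and column names (`species`, `observation`, `obs_date`) are assumptions chosen for the example, and SQLite stands in for whatever DBMS is actually used. The join on `spec_code` rebuilds the flat view on demand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces REFERENCES constraints when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE species (
        spec_code   TEXT PRIMARY KEY,
        genus       TEXT NOT NULL,
        species     TEXT NOT NULL,
        common_name TEXT NOT NULL
    );
    CREATE TABLE observation (
        spec_code TEXT NOT NULL REFERENCES species(spec_code),
        observer  TEXT NOT NULL,
        obs_date  TEXT NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO species VALUES (?, ?, ?, ?)",
    [
        ("QRCALB", "Quercus", "alba", "White Oak"),
        ("QRCRBR", "Quercus", "rubra", "Red Oak"),
    ],
)
conn.executemany(
    "INSERT INTO observation VALUES (?, ?, ?)",
    [
        ("QRCALB", "Jones, D.", "15-Jun-1998"),
        ("QRCALB", "Smith, D.", "12-Jul-1935"),
        ("QRCALB", "Doe, J.", "15-Sep-1920"),
        ("QRCRBR", "Fisher, K.", "15-Jun-1998"),
        ("QRCRBR", "James, J.", "15-Sep-1920"),
    ],
)

# The linkage is a property of the data: matching spec_code values.
rows = conn.execute("""
    SELECT s.genus, s.species, s.common_name, o.observer, o.obs_date
    FROM observation AS o
    JOIN species AS s ON s.spec_code = o.spec_code
    ORDER BY o.rowid
""").fetchall()
for row in rows:
    print(row)
```

Every joined row draws its common name from the single record in `species`, so all observations of one species necessarily agree.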
Complex Data Model
[Diagram: linked entities Species, Images, Observations, Internet Links, Locations, Observers, Specimens.]
Data Model for Metadata at VCR/LTER
[Diagram: entities Personnel, Projects, Mailing Lists, Dataset, Locations, Variable, and Codes, connected by optional and mandatory linkages.]
“Beanstalk” & “String of Pearls”
What    Value  Date      Location
Temp    23     10/19/00  SEV
Humid   95     10/19/00  SEV
Precip  0.01   10/18/00  VCR
[Diagram: the observation table is shown alongside linked Metadata (methods, units) and a Location Table (Lat/Lon).]
Beanstalk / String of Pearls
• Highly normalized
• Extremely flexible - capable of handling many different kinds of data
• Inefficient
  – Queries can be very slow
  – Can require large amounts of space
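
A small sketch of the trade-off, using the three rows from the “Beanstalk” slide. Storing one value per row means a new kind of measurement needs no change to the structure, but rebuilding a conventional record (one row per location and date) takes an extra pivot step. The helper name `pivot` and the choice of `(location, date)` as the grouping key are assumptions made for this example.

```python
from collections import defaultdict

# One measurement per row: (what, value, date, location), as on the slide.
LONG_ROWS = [
    ("Temp", 23.0, "10/19/00", "SEV"),
    ("Humid", 95.0, "10/19/00", "SEV"),
    ("Precip", 0.01, "10/18/00", "VCR"),
]


def pivot(rows):
    """Group single-value rows into one record per (location, date)."""
    wide = defaultdict(dict)
    for what, value, date, location in rows:
        wide[(location, date)][what] = value
    return dict(wide)


wide = pivot(LONG_ROWS)
for key, record in wide.items():
    print(key, record)
```

Every query that wants several variables side by side has to perform this regrouping, which is one reason such queries can be slow. Each stored value also repeats its date and location, which contributes to the space cost.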