‘The OO7 Benchmark’
M. J. Carey, D. J. DeWitt, J. F. Naughton
University of Wisconsin, Madison
Version of January 1991
‘The SEQUOIA 2000 Storage Benchmark’
M. Stonebraker, J. Frew, K. Gardels, J. Meredith
University of California, Berkeley
1993
Μπαζιάνα Περιστέρα
[email protected]
Advanced Topics in Databases

The OO7 Benchmark

Benchmark:
- A comprehensive test of OODBMS (object-oriented database management system) performance
- Aims to evaluate new techniques and algorithms for OODBMS implementation
- Gives performance metrics for OODBMS design
OO7 Benchmark:
- Is implemented on 4 OODB systems: E/Exodus, Objectivity/DB, Ontos, Versant; ObjectStore, from Object Design Inc., could not be tested
- Uncovers correctness and/or performance problems in the tested OODBMSs

Performance Characteristics tested by OO7

- Speed of pointer traversals: over cached data, over disk-resident data, sparse and dense traversals
- Efficiency of updates: to indexed and unindexed object fields, repeated and sparse updates, updates of cached data, creation and deletion of objects
- Performance of the query processor: on different kinds of queries

Related Work

- OO1 Benchmark
- HyperModel Benchmark
- Initial Sun Benchmark, evaluating Vbase
- A Complex Object Benchmark (ACOB), for client/server applications
- Magic Editor, UC Berkeley

OO1 Benchmark
(Object Operations, version 1; the Sun Benchmark)

- First standard benchmark
- Measures performance for navigation and simple updates
- Is based on a database consisting of:
  - Part objects with fields: id, type, (x,y) coordinates, build date
  - Connections between part objects
  - Each part has 3 ‘out-going’ and many ‘in-going’ connections
  - Database size: 20,000 and 200,000 parts, to model applications whose data fits in or exceeds memory
- OO1 Benchmark operations:
  - ‘Lookup’: fetches 1,000 random parts by id
  - ‘Traversal’: accesses 3,280 connected parts
  - ‘Insert’: adds 100 new parts to the database

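The figure of 3,280 parts for the OO1 traversal can be checked with a small sketch. It assumes the traversal follows all 3 outgoing connections from a starting part for 7 hops and counts repeat visits; the hop depth is not stated on the slide and is an assumption here, as are the helper names `build_parts` and `traverse`. Connections are drawn uniformly at random, which is a simplification.

```python
import random


def build_parts(num_parts, fanout=3, seed=0):
    """Give every part `fanout` outgoing connections to randomly chosen parts."""
    rng = random.Random(seed)
    return {pid: [rng.randrange(num_parts) for _ in range(fanout)]
            for pid in range(num_parts)}


def traverse(parts, start, hops):
    """Depth-first walk that counts every visit, including repeat visits."""
    if hops == 0:
        return 1
    return 1 + sum(traverse(parts, nxt, hops - 1) for nxt in parts[start])


parts = build_parts(20_000)                  # the smaller OO1 database size
visited = traverse(parts, start=0, hops=7)   # 7 hops is an assumed depth
print(visited)
```

With a fan-out of 3, the count is 1 + 3 + 9 + ... + 3^7, which is independent of which parts the random connections happen to reach.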
HyperModel Benchmark

- Richer schema and wider range of operations than OO1
- Performance test based on a hypertext application model
- HyperModel database: a graph of inter-connected nodes
- Node relationships:
  - One hierarchical 1:N relationship
  - Two M:N relationships
- Types of nodes:
  - k-1 levels of non-leaf nodes, which hold many integer values
  - 1 level of leaf nodes, which hold a text string or a bitmap
- HyperModel Benchmark operations:
  - Exact-match lookup (by integer attribute value)
  - Range query (1% and 10%)
  - Group lookup (follows a relationship from a random node to its related nodes)
  - Reference lookup (the inverse)

Why another benchmark?

- We want to evaluate a wide range of OODB features
- OO1 Benchmark:
  - Tests simple navigation and updates only
  - Does not use complex objects, which are important for many OODB applications
  - References link ‘nearby’ objects
  - Does not examine: density of traversals, traversals with updates, object queries
- HyperModel Benchmark has no tests for:
  - Object queries
  - Updates to indexed vs. non-indexed object attributes
  - Repeated object updates
  - Impact of transaction boundaries

OO7 Database Description

- OO7 aims to test many aspects of OODBMS performance, not to model a specific application
- Sizes of the OO7 benchmark database: small, medium, large

The design library (1)

- Basic component of the OO7 database:
  - A set of composite parts
- Design library in the OO7 database:
  - Is the set of all composite parts
  - Number of composite parts per module: NumCompPerModule (500)
- Attributes of a composite part:
  - Id (integer), buildDate (integer), Type (character array)
- Document object:
  - Associated with each composite part
  - Models the documentation for the composite part
  - Attributes: id (integer), title (character), text (character string of length DocumentSize)
- Composite part object and its document object:
  - Connected by a bi-directional association

The design library (2)

- A composite part contains:
  - A graph of atomic parts, the basic units from which a composite part is built
  - Number of atomic parts per composite part: NumAtomicPerComp
  - 20 atomic parts per composite part (small benchmark)
  - 200 atomic parts per composite part (medium and large benchmarks)
- Attributes of an atomic part:
  - Id (integer), buildDate (integer), x, y (integer), docId (integer), Type (character array)
- Each atomic part:
  - Is connected via bi-directional connections to other atomic parts (3, 6 or 9: NumConnPerAtomic)
  - Connections between atomic parts form a ring, plus additional random connections
- Connection objects:
  - Connect atomic parts
  - Attributes: length (integer), type (character array)

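The atomic-part graph described above can be sketched as follows. This is a simplified illustration, not the benchmark's generator: connections are modelled as plain outgoing id lists rather than connection objects with `length` and `type`, and the function names are hypothetical. One connection per part closes the ring; the remaining ones go to random parts.

```python
import random


def build_atomic_graph(num_atomic=20, num_conn=3, seed=0):
    """Return {atomic part id: [ids it connects to]} for one composite part."""
    rng = random.Random(seed)
    graph = {}
    for part in range(num_atomic):
        ring = [(part + 1) % num_atomic]      # the ring keeps the graph connected
        extra = [rng.randrange(num_atomic) for _ in range(num_conn - 1)]
        graph[part] = ring + extra
    return graph


def dfs_count(graph, root=0):
    """Depth-first search from the root part; count each atomic part once."""
    seen, stack = set(), [root]
    while stack:
        part = stack.pop()
        if part not in seen:
            seen.add(part)
            stack.extend(graph[part])
    return len(seen)


small = build_atomic_graph(num_atomic=20, num_conn=3)
medium = build_atomic_graph(num_atomic=200, num_conn=9)
print(dfs_count(small), dfs_count(medium))
```

Because of the ring, a depth-first search from any root reaches every atomic part of the composite part, whatever the random connections are.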
A composite part and its associated document object

Assembling Complex Design (1)

- We introduce the ‘assembly hierarchy’ to the database
- Each assembly object is made up of:
  - Either composite parts (base assembly level)
  - Or other assembly objects (complex assembly level)
- A base assembly object:
  - Attributes: Id (integer), buildDate (integer), Type (character array)
  - Has bi-directional associations with:
    - 3 ‘shared’ composite parts (on a module basis)
    - 3 ‘unshared’ composite parts (both counts: NumCompPerAssm)
- A complex assembly object (higher level of the hierarchy):
  - Attributes: Id (integer), buildDate (integer), Type (character array)
  - Has bi-directional associations with 3 sub-assemblies (NumAssmPerAssm):
    - Either base assemblies (if the complex assembly is at level two)
    - Or other complex assemblies (if the complex assembly is at a higher level)
- 7 levels in the assembly hierarchy (NumAssmLevels)

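The parameters on this slide fix the shape of one assembly hierarchy. A short sketch of the arithmetic follows, assuming the hierarchy is a complete tree in which level 1 holds the base assemblies and level 7 is the single root; the function name is hypothetical.

```python
def assembly_counts(num_assm_levels=7, num_assm_per_assm=3, num_comp_per_assm=3):
    """Count the objects in one complete assembly hierarchy."""
    # Base assemblies sit at the bottom level of the tree.
    base = num_assm_per_assm ** (num_assm_levels - 1)
    # Complex assemblies occupy every level above the bottom one.
    complex_ = sum(num_assm_per_assm ** depth
                   for depth in range(num_assm_levels - 1))
    return {
        "base_assemblies": base,
        "complex_assemblies": complex_,
        # References to unshared composite parts (the same count holds for shared ones).
        "unshared_part_references": base * num_comp_per_assm,
    }


counts = assembly_counts()
print(counts)
```

With 7 levels and a fan-out of 3 this gives 729 base assemblies and 364 complex assemblies per module.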
Assembling Complex Design (2)

buildDate:
- Range in base assemblies: 1000-1999
- Range in ‘young’ composite parts: 2000-2999
- Range in ‘old’ composite parts: 0-999
- The fraction of ‘young’ vs. ‘old’ composite parts: YoungCompFrac

Assembling Complex Design (3)

- Each assembly hierarchy is called a ‘module’
- Characteristics of a module:
  - Attributes: Id (integer), buildDate (integer), Type (character array)
  - Associated Manual object: a larger version of a document

Testbed Configuration

- 2 Sun workstations on an Ethernet LAN
- 1 Sun IPX workstation as server:
  - 48 MB memory
  - 424 MB disk drive (model Sun0424): holds system software and swap space
  - 1.3 GB disk drive (model Sun1.3G): holds the database (actual data)
  - 424 MB disk drive (model Sun0424): holds recovery information
- 1 Sun Sparc ELC workstation as client:
  - 24 MB memory
  - 207 MB disk drive (model Sun0207): holds system software and swap space
- SunOS runs on both workstations

Design Syntax
Software - E/Exodus (1)

- Exodus consists of:
  - Exodus Storage Manager (ESM)
  - E programming language
- The ESM:
  - Provides files of objects, B-trees, linear hashing
  - Uses a page-server architecture (current version 2.2)
    - Client processes request pages from the server via TCP/IP
    - The server answers:
      - either from its buffer pool
      - or by invoking a disk process to perform the I/O operation; after reading the page, it gives the page to the client process and keeps a copy in its buffer pool
  - Provides concurrency control: at page and file level, with a non-2PL protocol
  - Provides recovery services: via logging the changed portions of objects

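The request path of a page server, as described above, can be sketched in a few lines. This is an illustrative model only, not ESM code: the class name `PageServer`, the dictionary standing in for the disk process, and the LRU replacement policy are all assumptions made for the example.

```python
from collections import OrderedDict


class PageServer:
    """Toy page server: answer from the buffer pool, else read from 'disk'."""

    def __init__(self, disk, capacity):
        self.disk = disk              # page id -> bytes; stands in for the disk process
        self.capacity = capacity      # number of pages the buffer pool can hold
        self.pool = OrderedDict()     # buffer pool, least recently used page first
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.pool:      # hit: serve straight from the buffer pool
            self.pool.move_to_end(page_id)
            return self.pool[page_id]
        self.disk_reads += 1          # miss: perform the I/O operation
        page = self.disk[page_id]
        self.pool[page_id] = page     # keep a copy for later requests
        if len(self.pool) > self.capacity:
            self.pool.popitem(last=False)   # evict the least recently used page
        return page


server = PageServer(disk={1: b"p1", 2: b"p2", 3: b"p3"}, capacity=2)
for requested in (1, 1, 2, 3, 1):
    server.get_page(requested)
print(server.disk_reads, list(server.pool))
```

The second request for page 1 is a buffer-pool hit; the last one is a miss again because page 1 was evicted when page 3 arrived.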
Software - E/Exodus (2)

- The E language:
  - Extends C++, adding persistence, collections of persistent objects, and B-tree indexes
  - No support for associations or queries
- Current version of E:
  - Based on the GNU g++ compiler
  - Uses the ESM for storing persistent objects
  - Operations on persistent objects are compiled into calls to an interpreter, EPVM 3.0, which stores the objects in the buffer pool of the ESM client process
  - Pointers between such objects are swizzled at traversal and unswizzled when the ESM replaces the page
- Our experiment with E/Exodus:
  - Disk page size: 8 KB, the transfer unit between client and server
  - Client buffer pool: 12 MB; server buffer pool: 36 MB

Software – Objectivity/DB, ver. 2.1

- No page-server architecture; Objectivity/DB uses a file-server architecture:
  - No server process for handling data
  - Clients access database pages via NFS, so a separate lock server is needed; it was placed on the Sun IPX
- Recovery via shadows:
  - During a transaction, updates are written to a shadow database
  - At commit time, these updates are applied to the database; the shadows are used to recover if the commit fails
  - If the transaction aborts, the shadow database is deleted
- Objectivity/DB provides:
  - Persistence for C++: persistent objects are defined by inheritance from a persistent root class
  - Sets, relations and iterators
- Our experiment with Objectivity/DB:
  - Client buffer pool: 12 MB; server buffer pool: 36 MB
  - Database and shadow files are stored as Unix files

Software – Ontos, ver. 2.2 (1)

- Ontos employs a client-server architecture
- Objects are created in one of three storage managers (SM):
  - ‘In-memory’ SM: manages transient objects, as in plain C++
  - ‘Standard’ SM: implements an object-server architecture, in which objects are the unit of locking and of transfer between client and server
  - ‘Group’ SM: implements a page-server architecture; the OO7 benchmark composite parts, atomic parts and connection objects are created by the ‘group’ SM
- Ontos:
  - Provides sets, lists and associations
  - Does not provide a query optimizer for object-SQL
  - Supports nested transactions and optimistic concurrency control

Software – Ontos, ver. 2.2 (2)

- Recovery in Ontos:
  - Via REDO logging
  - During a transaction, all updates are buffered in virtual memory
  - At commit time, the updates are written to files on the server
  - After success, the updates are applied to the database
- Buffering at the client:
  - No client buffer pool
  - Objects are kept in the client’s virtual memory
- Our experiment with Ontos:
  - Disk page size: 7.5 KB, the transfer unit between client and server
  - Unix file systems are used to hold the database

Software – Versant 3.0 Beta

- Versant employs:
  - Client-server architecture
  - Object-server architecture
  - Objects as the unit of locking and of transfer between client and server
- Versant Object Manager:
  - Caches objects during a transaction
- Server Storage Manager:
  - Performs I/O with page granularity
- Versant C++ interface:
  - Adds persistence to C++
  - Does not modify the C++ compiler

Results (1)

Results: Traversals, Queries, Structural Modification Operations

Database Size:
Small Databases
Medium Databases
Results (2)

Traversals
- Are implemented as methods of the database objects
- Navigate from object to object, invoking the appropriate method at each object
- Run over ‘small’ and ‘medium’ OO7 benchmark databases
- On ‘small’ OO7 databases, run in two ways:
  - ‘cold’: the traversal begins with the client and server caches empty
  - ‘hot’: a ‘cold’ traversal is run first, then the same query is run 4 times, and the average of the middle 3 runs is reported

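The cold/hot measurement protocol can be sketched as a small timing harness. The slide leaves the exact meaning of "the middle 3 runs" open; this sketch assumes five back-to-back runs in total (one cold run plus four repeats) and averages runs 2 to 4. It does not model emptying the client and server caches, and the function names are hypothetical.

```python
import time
from statistics import mean


def time_runs(run_traversal, total_runs=5, clock=time.perf_counter):
    """Time `total_runs` back-to-back executions; the first one starts cold."""
    timings = []
    for _ in range(total_runs):
        start = clock()
        run_traversal()
        timings.append(clock() - start)
    return timings


def cold_and_hot(timings):
    """Cold time is run 1; hot time averages the middle three of five runs."""
    if len(timings) != 5:
        raise ValueError("expected one cold run followed by four repeats")
    return timings[0], mean(timings[1:4])


# Deterministic demonstration with a fake clock instead of real time.
ticks = iter([0, 10, 10, 12, 12, 14, 14, 16, 16, 30])
demo_timings = time_runs(lambda: None, clock=lambda: next(ticks))
cold, hot = cold_and_hot(demo_timings)
print(cold, hot)
```

Dropping the last run from the hot average, as assumed here, keeps any end-of-transaction work out of the reported hot time.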
Traversal T1: Raw traversal speed

‘Traverse the assembly hierarchy. As each base assembly is visited, visit
each of its referenced unshared composite parts. As each composite part
is visited, perform a depth first search on its graph of atomic parts.
Return a count of the number of atomic parts visited when done’
Medium DB
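A minimal sketch of Traversal T1 on a toy database may make the definition concrete. The `Assembly` and `CompositePart` classes and the `tiny_module` builder are hypothetical stand-ins, not the OO7 schema: the module here has one complex assembly over 3 base assemblies, each with 3 unshared composite parts of 4 atomic parts connected in a ring.

```python
from dataclasses import dataclass, field


@dataclass
class CompositePart:
    root: int      # id of the root atomic part
    graph: dict    # atomic part id -> ids of connected atomic parts


@dataclass
class Assembly:
    sub_assemblies: list = field(default_factory=list)   # set on complex assemblies
    unshared_parts: list = field(default_factory=list)   # set on base assemblies


def dfs_count(part):
    """Depth-first search over one composite part's atomic-part graph."""
    seen, stack = set(), [part.root]
    while stack:
        atomic = stack.pop()
        if atomic not in seen:
            seen.add(atomic)
            stack.extend(part.graph[atomic])
    return len(seen)


def t1(assembly):
    """Walk the hierarchy; at base assemblies, search each unshared composite part."""
    if assembly.sub_assemblies:
        return sum(t1(sub) for sub in assembly.sub_assemblies)
    return sum(dfs_count(part) for part in assembly.unshared_parts)


def ring(size):
    return {i: [(i + 1) % size] for i in range(size)}


def tiny_module():
    bases = [Assembly(unshared_parts=[CompositePart(0, ring(4)) for _ in range(3)])
             for _ in range(3)]
    return Assembly(sub_assemblies=bases)


t1_count = t1(tiny_module())
print(t1_count)
```

A composite part referenced by several base assemblies is searched once per reference, so the returned count includes repeat visits.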
Traversal T6: Sparse traversal speed

‘Traverse the assembly hierarchy. As each base assembly is visited, visit
each of its referenced unshared composite parts. As each composite part is
visited, visit the root atomic part. Return a count of the number of atomic
parts visited when done’
Traversal T2: Traversal with updates

‘Repeat Traversal T1, but update objects during the traversal. There are three types
of update patterns in this traversal. In each, a single update to an atomic part consists
of swapping its (x,y) attributes. The three types of updates are:
A: Update one atomic part per composite part
B: Update every atomic part as it is encountered
C: Update each atomic part in a composite part 4 times
When done, return the number of update operations that were actually performed’
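The three T2 update patterns differ only in which atomic parts are updated and how often. The sketch below applies one pattern to the atomic parts of a single composite part, represented as a list of dictionaries in visit order; that representation and the helper names are assumptions made for illustration.

```python
def swap_xy(part):
    """A single T2 update: swap the (x, y) attributes of an atomic part."""
    part["x"], part["y"] = part["y"], part["x"]


def t2_composite(atomic_parts, pattern):
    """Apply pattern A, B or C to one composite part; return the update count."""
    if pattern == "A":
        targets, repeats = atomic_parts[:1], 1   # one atomic part per composite part
    elif pattern == "B":
        targets, repeats = atomic_parts, 1       # every atomic part once
    elif pattern == "C":
        targets, repeats = atomic_parts, 4       # every atomic part four times
    else:
        raise ValueError(f"unknown update pattern: {pattern!r}")
    updates = 0
    for part in targets:
        for _ in range(repeats):
            swap_xy(part)
            updates += 1
    return updates


def sample_parts(count=20):
    return [{"id": i, "x": i, "y": 100 + i} for i in range(count)]


update_counts = {p: t2_composite(sample_parts(), p) for p in "ABC"}
print(update_counts)
```

Pattern C leaves the data unchanged, since four swaps cancel out, which is why the benchmark asks for the number of update operations performed rather than the final state.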
Traversal T3: Traversal with indexed field updates

‘Repeat Traversal T2, except that now the update is on the date field,
which is indexed. The specific update is to increment the date if it is odd,
and decrement the date if it is even’
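What makes T3 more expensive than T2 is that every update must also maintain the index on the date field. A minimal sketch, assuming the index is a plain mapping from date to the set of part ids (the names `build_index` and `t3_update` are hypothetical):

```python
from collections import defaultdict


def build_index(build_dates):
    """Index part ids by build date: date -> set of part ids."""
    index = defaultdict(set)
    for part_id, date in build_dates.items():
        index[date].add(part_id)
    return index


def t3_update(build_dates, index, part_id):
    """Increment an odd date, decrement an even one, and keep the index in step."""
    old = build_dates[part_id]
    new = old + 1 if old % 2 else old - 1
    build_dates[part_id] = new
    index[old].discard(part_id)       # remove the stale index entry
    if not index[old]:
        del index[old]
    index[new].add(part_id)           # insert the new index entry


build_dates = {1: 1001, 2: 1002}
date_index = build_index(build_dates)
t3_update(build_dates, date_index, 1)
print(build_dates, dict(date_index))
```

Applying the update twice to the same part restores its original date, so repeated updates toggle between two index entries.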
Traversals T8 and T9: Operations on Manual

‘Traversal T8 scans the manual object, counting the number of
occurrences of the character ‘I’. Traversal T9 checks to see if the first
and the last character in the manual objects are the same’
Traversal CU (Cached Update)

‘Perform traversal T1, followed by T2A, in a single transaction. Report
the total time minus the T1 hot time minus the T1 cold time’
Queries…
Query Q1: exact match lookup

‘Generate 10 random atomic part id’s; for each part id generated, lookup
the atomic part with that id. Return the number of atomic parts processed
when done’
Queries Q2, Q3 and Q7

Query Q2: ‘Choose a range for dates that will contain the last 1% of the
dates found in the database’s atomic parts. Retrieve the atomic parts
that satisfy this range predicate’
Query Q3: ‘Choose a range for dates that will contain the last 10% of
the dates found in the database’s atomic parts. Retrieve the atomic
parts that satisfy this range predicate’
Query Q7: ‘Scan all atomic parts’
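Q2, Q3 and Q7 are the same range query at three selectivities. The sketch below assumes that "the last 1% of the dates" means the top 1% of the span between the smallest and largest build date, which is one possible reading of the query text; a sorted list with binary search stands in for an index, and `range_query` is a hypothetical name.

```python
from bisect import bisect_left


def range_query(build_dates, fraction):
    """Return ids of atomic parts whose date lies in the top `fraction` of the date span."""
    ordered = sorted(build_dates.items(), key=lambda item: item[1])
    dates = [date for _, date in ordered]
    low, high = dates[0], dates[-1]
    cutoff = high - fraction * (high - low)
    first = bisect_left(dates, cutoff)        # first position with date >= cutoff
    return [part_id for part_id, _ in ordered[first:]]


# 1000 atomic parts with distinct build dates 1000..1999.
sample_dates = {part_id: 1000 + part_id for part_id in range(1000)}
q2 = range_query(sample_dates, 0.01)
q3 = range_query(sample_dates, 0.10)
q7 = range_query(sample_dates, 1.0)
print(len(q2), len(q3), len(q7))
```

With the predicate widened to the full span, the query degenerates into the scan of all atomic parts that Q7 asks for.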
Query Q4: path lookup

‘Generate 100 random document titles. For each title generated, find all
base assemblies that use the composite part corresponding to the
document. Also, count the total number of base assemblies that qualify’
Query Q5: single-level make

‘Find all base assemblies that use a composite part with a build date later
than the build date of the base assembly. Also, report the number of
qualifying base assemblies found’
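Q5 can be sketched as a filter over base assemblies. The data layout (dictionaries keyed by hypothetical ids) is an assumption for illustration; the build dates follow the ranges given earlier in the deck, with base assemblies in 1000-1999, ‘old’ composite parts in 0-999 and ‘young’ ones in 2000-2999.

```python
def q5(base_assemblies, composite_build_dates):
    """Return ids of base assemblies that use a composite part newer than themselves."""
    return sorted(
        assembly_id
        for assembly_id, (build_date, part_ids) in base_assemblies.items()
        if any(composite_build_dates[part_id] > build_date for part_id in part_ids)
    )


composite_build_dates = {"c_old": 500, "c_young": 2500}
base_assemblies = {
    "b1": (1500, ["c_old"]),              # uses only an old part
    "b2": (1200, ["c_old", "c_young"]),   # uses one young part
    "b3": (1999, ["c_young"]),
}
qualifying = q5(base_assemblies, composite_build_dates)
print(qualifying, len(qualifying))
```

Given those date ranges, a base assembly qualifies exactly when it uses at least one ‘young’ composite part, so YoungCompFrac controls the selectivity of this query.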
Query Q8: ad-hoc join

‘Find all pairs of documents and atomic parts where the document id in
the atomic part matches the id of the document. Also, return a count of
the number of such pairs encountered’
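Q8 is a value-based join between documents and atomic parts. One way to evaluate it is a hash join, sketched below; the choice of a hash join and the dictionary-based records are assumptions made for the example, not something the benchmark prescribes.

```python
def q8(documents, atomic_parts):
    """Pair each atomic part with the document whose id equals its docId."""
    documents_by_id = {doc["id"]: doc for doc in documents}   # build side
    return [
        (documents_by_id[part["docId"]], part)                # probe side
        for part in atomic_parts
        if part["docId"] in documents_by_id
    ]


documents = [{"id": 1, "title": "Composite Part 1"},
             {"id": 2, "title": "Composite Part 2"}]
atomic_parts = [{"id": 10, "docId": 1}, {"id": 11, "docId": 1},
                {"id": 12, "docId": 2}, {"id": 13, "docId": 99}]
pairs = q8(documents, atomic_parts)
print(len(pairs))
```

Atomic part 13 has no matching document in this sample and therefore contributes no pair.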
Structural Modifications:

1: Insert
‘Create five new composite parts, which includes creating a number of
new atomic parts (100 in the small configuration, 1000 in the large) and five new
document objects, and insert them into the database by installing
references to these composite parts into 10 randomly chosen base
assembly objects’
2: Delete
‘Delete the five newly created composite parts (and all of their associated
atomic parts and document objects)’
Small and large databases…..
Small Databases
Medium Databases
Conclusion

- The OO7 test results expose OODBMS performance characteristics that previous benchmarks did not reveal
- The requirements of each application determine which OODBMS suits it best

The SEQUOIA 2000 Storage Benchmark

- Collects the database requirements of Earth Sciences (ES):
  - Geography, hydrology, oceanography…
- Hopes to be applicable to the general database community as well
- Tests the performance of three DBMSs:
  - GRASS, IPW, POSTGRES
- Characteristics of ES applications:
  - Massive size: the database contains (satellite) images: 10^14 bytes (100 Tbytes)
  - Complex data types: support for arrays, spatial objects, complex objects
  - Sophisticated searching: B-trees are not adequate for searching arrays and spatial data
- Need for a new benchmark for ES applications

Scaling benchmark

- Regional benchmark: geographic region 1280 km x 800 km
- National benchmark: geographic region 5500 km x 3000 km
- Earth benchmark: the whole earth

Kinds of data:
- Raster data: tile area 1 km x 1 km (or 0.5 km x 0.5 km), tile size 10 bytes
  - regional benchmark contains: 2 x 1280 x 2 x 800 = 4,096,000 tiles
  - each: (5 observations) * (2 bytes) = 10 bytes
- Point data: a geographic feature is given by its name and its location (2 x 32 bits)
  - regional benchmark occupies 1.83 Mbytes
- Polygon data: polygons enclose regions of the same land use; average number of polygon sides = 50
  - regional benchmark occupies 19.1 Mbytes
- Directed graph data: the basic unit is a segment
  - regional benchmark occupies 47.8 Mbytes

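The raster tile count follows directly from the region size and the 0.5 km tile edge. A short sketch of that arithmetic, with hypothetical function and constant names, covering one full coverage of the region:

```python
BYTES_PER_TILE = 5 * 2   # 5 observations of 2 bytes each


def raster_tiles(width_km, height_km, tile_km=0.5):
    """Number of square tiles of edge `tile_km` covering the region."""
    return int(width_km / tile_km) * int(height_km / tile_km)


regional_tiles = raster_tiles(1280, 800)
regional_bytes = regional_tiles * BYTES_PER_TILE
national_tiles = raster_tiles(5500, 3000)
print(regional_tiles, regional_bytes, national_tiles)
```

At 10 bytes per tile, one coverage of the regional area comes to about 41 million bytes; the national region needs roughly 16 times as many tiles.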
Benchmark Queries…(1)

Query 1: data load
‘Create and load the data base and build any necessary secondary indexes’

Query 2: raster query
‘Select data for a given wavelength band and rectangular region ordered by
ascending time’
POSTQUEL

Query 3: raster query
‘Select data for a given time and geographic rectangle and then calculate an
arithmetic function of the five wavelength band values for each cell in the study
rectangle’
POSTQUEL

Query 4: raster query
‘Select data for a given time, wavelength band and geographic rectangle. Lower
the resolution of the image by a factor of 64 to a cell size of 4x4km and store it as a
new object’

Benchmark Queries…(2)

Query 5: Point query
‘Find the POINT record that has a specific name’

Query 6: Polygon query
‘Find all the polygons that intersect a specific rectangle and store them in the DBMS’

Query 7: Polygon query
‘Find all the polygons that are more than a specific size and within a specific circle’

Query 8: Spatial join
‘Show the landuse/landcover in a 50 km quadrangle surrounding a given point’

Query 9: Spatial join
‘Find the raster data for a given landuse type in a study rectangle for a given wavelength
and time’

Query 10: Spatial join
‘Find the names of all points within polygons of a specific vegetation type and create this
as a new DBMS object’

Query 11: Recursion
‘Find all segments of any waterway that are within 20 km downstream of a specific point’
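Query 11 is the one recursive query in the benchmark: it needs a traversal of the directed graph rather than a single lookup or join. A minimal sketch follows, assuming each segment record holds its length in kilometres and the ids of the segments immediately downstream, and that a segment qualifies when it begins less than 20 km downstream of the end of the starting segment. Both the data layout and that reading of "within 20 km" are assumptions.

```python
def downstream_segments(graph, start, max_km=20.0):
    """Return ids of segments that begin less than `max_km` downstream of `start`.

    graph maps segment id -> (length_km, [ids of the next segments downstream]).
    """
    best = {}                                   # segment id -> shortest distance found
    stack = [(nxt, 0.0) for nxt in graph[start][1]]
    while stack:
        segment, distance = stack.pop()
        if distance >= max_km:
            continue                            # begins beyond the limit
        if segment in best and best[segment] <= distance:
            continue                            # already reached by a shorter route
        best[segment] = distance
        length_km, next_segments = graph[segment]
        stack.extend((nxt, distance + length_km) for nxt in next_segments)
    return set(best)


waterway = {
    "a": (5.0, ["b"]),
    "b": (10.0, ["c", "d"]),
    "c": (15.0, ["e"]),
    "d": (3.0, []),
    "e": (4.0, []),
}
found = downstream_segments(waterway, "a")
print(sorted(found))
```

Segment "e" begins 25 km downstream of "a" in this sample and is therefore excluded.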
Results