ObjectStore
Martin Wasiak
ObjectStore Overview




Object-oriented database system
Can use normal C++ code to access
tuples
Easily add persistence to existing
applications
Critical operation: retrieving data
from a tuple must be as fast as possible
Why Object-oriented?


The needs of CAD, ECAD, and some other
engineering applications are different
from those of traditional database systems.
These applications need to store large
amounts of data that are inefficient to
retrieve using traditional relational
database systems.
CAD & CAM Applications


Store data objects that are connected
to one another forming complicated
networks.
Each object represents a part that
contains some attributes and is also
connected to another part (object).
What to Optimize?


Most CAD apps have to traverse a list of
those objects such as a list of vertices
in a 3-D CAD program.
Another example: traversing a network
of objects representing a circuit and
carrying out computation along the
way.
Bottlenecks!

Example:


SELECT p1.weight, p2.weight, p3.weight
FROM Pipes p1, Pipes p2, Pipes p3
WHERE p1.left_pipe_id=p2.right_pipe_id AND
p2.left_pipe_id=p3.right_pipe_id AND
p1.pipe_id=5 AND
p1.contents='Water'
Inefficient if we want to find the total weight
of our pipeline composed of thousands of
parts.
More Bottlenecks…



Our pipeline can be thought of as a long
linked list, so doing joins on it is
simply inefficient.
In C++ we can write a simple loop to
traverse a list or a tree type structure
with many pointers.
So instead of a join we simply
dereference a pointer!
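The loop described above might look like the following minimal sketch. The `Pipe` struct, its field names, and `total_weight` are hypothetical, chosen to echo the columns in the SQL example; they are not ObjectStore definitions.

```cpp
#include <string>

// Hypothetical pipe segment; the fields echo the SQL columns above.
struct Pipe {
    int pipe_id;
    double weight;
    std::string contents;
    Pipe* next;  // neighbouring segment, or nullptr at the end of the pipeline
};

// Walk the pipeline from `start` and sum the weights. Each hop is a single
// pointer dereference, where the SQL version needs one more self-join.
double total_weight(const Pipe* start) {
    double total = 0.0;
    for (const Pipe* p = start; p != nullptr; p = p->next) {
        total += p->weight;
    }
    return total;
}
```

The cost grows with the number of segments visited, and no extra join has to be written for each additional segment.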
Note About Pointers


Pointers stored on disk hold actual
virtual memory addresses, not
logical pointers within the file.
This means that when an object is paged
into memory, ObjectStore tries to place it
at the same memory address it was
loaded at last time.
Dereferencing Pointers





ObjectStore sets page permission to “no access”
if a record is not in memory.
Client tries to access the page it has no access
to.
Hardware detects an access violation and reports
memory fault to ObjectStore.
ObjectStore loads the record into memory and
sets the page permission to read-only.
Client tries to dereference the record and
succeeds.
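The fault-and-load sequence above can be imitated in software. The sketch below is only a simulation: it checks a permission flag on each read, where the real system relies on the virtual memory hardware to raise the fault. `FaultingCache`, `Page`, and `Access` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum class Access { None, ReadOnly };

struct Page {
    Access access = Access::None;  // "no access" until the page is loaded
    std::string data;              // stand-in for the records on the page
};

// Software stand-in for the hardware-assisted scheme (an assumption of this
// sketch; no real memory protection is involved).
class FaultingCache {
public:
    explicit FaultingCache(std::vector<std::string> disk)
        : disk_(std::move(disk)), pages_(disk_.size()) {}

    // Client-side read: "faults" the first time a page is touched.
    const std::string& read(std::size_t page_no) {
        Page& page = pages_.at(page_no);
        if (page.access == Access::None) {
            handle_fault(page_no);
        }
        return page.data;
    }

    int faults() const { return faults_; }

private:
    // Fault handler: load the page and mark it read-only.
    void handle_fault(std::size_t page_no) {
        ++faults_;
        pages_[page_no].data = disk_[page_no];
        pages_[page_no].access = Access::ReadOnly;
    }

    std::vector<std::string> disk_;
    std::vector<Page> pages_;
    int faults_ = 0;
};
```

After the first access the page stays readable, so later reads take the fast path with no fault.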
More Dereferencing

What if DB is bigger than VM pool?


Dynamically assign address space to db.
What if address of a record in db is
already in use?


Tag table keeps track of all objects in the
database.
Used to relocate pointers.
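Pointer relocation can be illustrated with a small sketch. The tag table's real layout is not described here, so `TagEntry` (one entry per pointer-valued word in a page) and `relocate` are assumptions made purely for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tag-table entry: the word at index `offset` holds a pointer.
struct TagEntry {
    std::size_t offset;
};

// The target segment was mapped at `old_base` when the page was written and
// is mapped at `new_base` now. Rewrite every tagged word; leave the rest.
void relocate(std::vector<std::uintptr_t>& page,
              const std::vector<TagEntry>& tags,
              std::uintptr_t old_base, std::uintptr_t new_base) {
    for (const TagEntry& tag : tags) {
        std::uintptr_t& slot = page.at(tag.offset);
        slot = new_base + (slot - old_base);
    }
}
```

Only tagged words are touched, which is why the system needs a record of where the pointers live: an untagged integer that happens to look like an address must not be changed.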
Client Caching



Client side caching is used to eliminate the
need to page over network and speed up
performance.
Server keeps track of all objects present in
client caches.
What if a client tries to modify an object that
exists in another client’s cache?

Callback message is sent to the client to check
whether the object is locked.
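One way to picture the callback exchange is the sketch below. It assumes page-granularity tracking and a simple policy (a holder gives the page up unless it has it locked); `Server`, `Client`, and their members are hypothetical names, not ObjectStore's interface.

```cpp
#include <map>
#include <set>

using ClientId = int;
using PageId = int;

struct Client {
    std::set<PageId> cached;
    std::set<PageId> locked;  // pages in use by the client's current transaction

    // Callback from the server: drop the page unless it is locked.
    bool release_if_unlocked(PageId page) {
        if (locked.count(page) != 0) return false;
        cached.erase(page);
        return true;
    }
};

class Server {
public:
    std::map<ClientId, Client> clients;

    // Record that `client` now has `page` in its cache.
    void note_cached(ClientId client, PageId page) {
        clients[client].cached.insert(page);
        holders_[page].insert(client);
    }

    // Returns true if `writer` may modify `page` now, false if some other
    // client still holds it locked and the writer has to wait.
    bool request_write(ClientId writer, PageId page) {
        std::set<ClientId>& holders = holders_[page];
        for (auto it = holders.begin(); it != holders.end();) {
            if (*it == writer) { ++it; continue; }
            if (!clients[*it].release_if_unlocked(page)) return false;
            it = holders.erase(it);
        }
        return true;
    }

private:
    std::map<PageId, std::set<ClientId>> holders_;  // who caches what
};
```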
Defining Relations


Relations are defined using pointers.
Pointers are kept in both directions to
facilitate updates.
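In plain C++, a relation with pointers in both directions could be sketched as follows. The `Employee` and `Department` types and the `assign` helper are hypothetical; ObjectStore's own relationship facility is not shown.

```cpp
#include <set>
#include <string>

struct Department;

struct Employee {
    std::string name;
    Department* dept = nullptr;     // forward direction
};

struct Department {
    std::string name;
    std::set<Employee*> employees;  // inverse direction
};

// Move `e` into `d`, updating both directions together so they never disagree.
void assign(Employee& e, Department& d) {
    if (e.dept != nullptr) {
        e.dept->employees.erase(&e);
    }
    e.dept = &d;
    d.employees.insert(&e);
}
```

Routing every update through one helper keeps the two directions consistent, which is the bookkeeping the database would otherwise have to do on the program's behalf.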
Associative Queries

Query against all_employees:

os_Set<employee*>& overpaid_employees =
all_employees [: salary >= 100000 :];

Query against employees of dept. d:

d->employees [: salary >= 100000 :];

Nested queries:

all_employees [: dept->employees
[: name == 'Fred' :] :];
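For comparison, the first query can be written as an ordinary loop in standard C++. This is not ObjectStore syntax (the `[: :]` form needs ObjectStore's own language support); the `Employee` struct and `overpaid` function are hypothetical.

```cpp
#include <string>
#include <vector>

struct Employee {
    std::string name;
    double salary;
};

// Linear-scan equivalent of: all_employees [: salary >= threshold :]
std::vector<const Employee*> overpaid(const std::vector<Employee>& all_employees,
                                      double threshold) {
    std::vector<const Employee*> result;
    for (const Employee& e : all_employees) {
        if (e.salary >= threshold) {
            result.push_back(&e);
        }
    }
    return result;
}
```

The hand-written loop always scans every element, whereas a declarative query leaves the system free to use an index when one exists.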
Versions



ObjectStore supports version control.
Allows teams to check out read-only and
read-write objects for extended period of
time. Also called “long transaction.”
Example: A new CPU can have people
working on the ALU while others work on LSU
(load/store unit) at the same time.
Performance




Relational database schemas are normalized
and queries usually involve joins of different
tables.
ObjectStore queries generally involve
embedded collections or paths through
objects.
In addition, indexes can be created over
those paths.
The problem ends up being how to
traverse a linked list as fast as possible!
Warm and Cold Cache Results


Cold cache is an “empty” cache.
Warm cache is… non-empty…
[Figure: bar chart of cold and warm cache times (sec.) by system: oodb1, ObjectStore, oodb3, oodb4, rdbms1, index.]
QuickStore (Part 2)

Similar to ObjectStore.


Both try to load objects into the same memory
addresses as before, since the pointers on
disk reflect the actual memory pages.
A few differences.

QS implements a buffer manager
based on a simplified version of the clock algorithm.
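For readers unfamiliar with clock replacement, here is a textbook second-chance sketch. How QuickStore simplifies the policy is not stated above, so this shows only the general idea; `ClockBuffer` and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class ClockBuffer {
    struct Frame {
        int page = -1;
        bool used = false;
        bool referenced = false;
    };

public:
    explicit ClockBuffer(std::size_t frame_count) : frames_(frame_count) {
        assert(frame_count > 0);
    }

    // Touch `page`. Returns true on a hit, false if the page had to be loaded.
    bool access(int page) {
        for (Frame& f : frames_) {
            if (f.used && f.page == page) {
                f.referenced = true;  // hit: earn a second chance
                return true;
            }
        }
        // Miss: sweep the hand, clearing reference bits, until a frame that
        // is free or unreferenced turns up.
        while (frames_[hand_].used && frames_[hand_].referenced) {
            frames_[hand_].referenced = false;
            advance();
        }
        frames_[hand_].page = page;
        frames_[hand_].used = true;
        frames_[hand_].referenced = false;  // this variant sets the bit on re-use only
        advance();
        return false;
    }

    bool contains(int page) const {
        for (const Frame& f : frames_) {
            if (f.used && f.page == page) return true;
        }
        return false;
    }

private:
    void advance() { hand_ = (hand_ + 1) % frames_.size(); }

    std::vector<Frame> frames_;
    std::size_t hand_ = 0;
};
```

With two frames, the accesses 1, 2, 1, 3 evict page 2 and keep page 1, because the repeated access to page 1 set its reference bit before the hand swept past.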
More QuickStore





Also built using C++.
Storage provided by the EXODUS storage
manager (ESM).
In the paper QS is compared to E and QS-B.
E uses software to emulate hardware paging.
QS-B uses bitmaps to keep track of pointers
which takes space.



Depth-first Traversal Test

Cold times on a small database of an object
built from components, with each component
containing atomic parts.
t1: depth-first traversal
including atomic parts.
t6: same as t1, but
excluding atomic parts.
[Figure: bar chart of response time (sec) for QS, E, and QS-B on t1 and t6.]
Conclusion



ObjectStore (OS) and QuickStore (QS) use virtual memory
hardware to facilitate loading of objects
from disk to memory.
VERY efficient for CAD/CAM applications
which rely heavily on traversals of
complicated networks of objects.
ObjectStore and QuickStore add
persistence to C++ programs.