Draft
Spinning the Structured Data Web:
Experience with SOFTools 1.2d
Douglas E. Dyer, PhD
Active Computing
31 May 04
Introduction
This technical note briefly describes the experience of interfacing SOFTools 1.2d to
publish information to the structured data web. The structured data web (SDW) is like
the world-wide web (WWW), except the intent is to store and serve variable values that
can be interpreted by algorithms rather than HTML and other content types that can only
be interpreted by humans. Large software systems traditionally have been designed from
the top down, a method that seems to limit scalability because of the coordination and
synchronization required among developers. The SDW is intended to increase scalability by
supporting bottom-up development, which requires less coordination and synchronization.
The SDW aims to reduce the developer effort needed to access and understand
information available in third-party applications. For applications wishing to publish
information, the SDW aims to be as simple as possible to reduce development effort and
maximize the number of publishers. Having previously defined a relational database
implementation of the structured data web, it was important to assess the level of effort
required to create an appropriate interface for a prototypical application and to
characterize performance of the system in terms of read and write times.
SOFTools 1.2d, a prototypical application
SOFTools 1.2d is a research prototype temporal plan authoring tool developed with
SOFTools version 1.3, the operational precursor to current versions of SOFTools. Like
the current operational version, 1.2d supports an XML file format for saving and loading
temporal plans. Although the file formats have changed over time and code
improvements have impacted performance, comparing plan load/save times for XML and
the relational database schema of our SDW should give some indication of expected
results for the current operational version of SOFTools as well as for other applications
that use XML file formats.
Interfacing level of effort for SOFTools 1.2d
To interface SOFTools 1.2d, generic database code used by another application (d3i) was
modified to enable SOFTools to connect to our SDW database and save element data.
This is about 50 lines of code that can be reused for other Tcl applications. Next, the
application name and user identity were hardwired. This is appropriate for test purposes,
but SOFTools 1.2d needs a stronger notion of user identity if it is to share information
reliably using the SDW. In general, most document-based applications will have this
requirement (which is relatively easy to meet). Finally, the procedure saveMatrixToDb
was written as appears below (Tcl/Tk source code):
proc saveMatrixToDb {} {
    global attributes el allElements
    global template userIdentity instance
    # Establish the publishing context on first use.
    if {![info exists template]} {setGlobals}
    if {![info exists instance]} {
        set instance [findLatestInstance $template $userIdentity]
        incr instance
    }
    # Plan-level metadata and timeline variables live in globals.
    foreach attrib $attributes(metadata) {global $attrib ; saveElementData $attrib [reval $attrib]}
    foreach attrib $attributes(timelinedata) {global $attrib ; saveElementData $attrib [reval $attrib]}
    # Each element attribute is held in the el array under "tag.attrib".
    foreach tag $allElements {
        set elementType [findElementType $tag]
        foreach attrib $attributes($elementType) {
            saveElementData "$tag.$attrib" $el($tag.$attrib)
        }
    }
}
The procedure saveElementData writes variables and values, along with relevant
metadata, to the SDW database using the context of application (template), user identity,
and instance.
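The note does not list saveElementData itself. The following is a minimal sketch of how such a procedure might assemble its SQL, assuming the element table has exactly the columns named in the query used by loadMatrixFromDb (element, value, template, userIdentity, instance). The helper names sqlQuote and buildElementInsert are hypothetical, and the additional metadata that the actual procedure writes is not shown.

```tcl
# Hypothetical helper: quote a value as an SQL string literal by
# doubling any embedded single quotes.
proc sqlQuote {value} {
    return "'[string map {' ''} $value]'"
}

# Hypothetical helper: build the INSERT statement for one variable/value
# pair in the context of application (template), user identity, and instance.
# The column names are assumed from the select query in loadMatrixFromDb.
proc buildElementInsert {template userIdentity instance element value} {
    set cols "template, userIdentity, instance, element, value"
    set vals [list [sqlQuote $template] [sqlQuote $userIdentity] \
                  [sqlQuote $instance] [sqlQuote $element] [sqlQuote $value]]
    return "insert into element ($cols) values ([join $vals {, }])"
}
```

A procedure like saveElementData could then pass the resulting string to the sql command used elsewhere in this note. Quoting the values also guards against plan text that contains apostrophes.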
At this point, SOFTools plans can be written to the SDW, and the programming level of
effort to get to this point was 75 lines of code and about 3 hours.
The intent of the SDW is to allow other developers to access information published by
third-party applications (e.g., SOFTools 1.2d). However, to test load times, the
procedure loadMatrixFromDb was written as shown:
proc loadMatrixFromDb {instance {user ""}} {
    global template userIdentity el
    if {![newMatrix]} return
    # Recall the whole plan with one query; each row is an {element value} pair.
    # Note: the optional user argument is unused; the query filters on userIdentity.
    set thePlan [sql "select element, value from element where template = '$template' and userIdentity = '$userIdentity' and instance = '$instance'"]
    set allDbElements {}
    foreach pair $thePlan {
        set var [lindex $pair 0]
        set val [lindex $pair 1]
        switch -glob $var {
            place* - eevent* - move* - separator* {
                # Element attributes go into the el array; collect the tags.
                set el($var) $val
                if {[string match "*.tag" $var]} {
                    lappend allDbElements $val
                }
            }
            default {global $var ; set $var $val}
        }
    }
    timeline $timelineDefList
    # These need to be done in a certain order
    foreach tag $allDbElements {if {[string match "place*" $tag]} {place $tag}}
    foreach tag $allDbElements {if {[string match "eevent*" $tag]} {eevent $tag}}
    foreach tag $allDbElements {if {[string match "move*" $tag]} {move $tag}}
    foreach tag $allDbElements {if {[string match "separator*" $tag]} {separator $tag}}
    global matrix_changed ; set matrix_changed 0
    gotoHHour
}
In this procedure, the entire plan is recalled using a single query and the tuples returned
are parsed to set variable values appropriately. The SOFTools canvas requires a specific
order of laying out the plan elements, slightly complicating the procedure. Writing this
procedure required another hour and brought the total lines of code to just over 100.
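The four nearly identical loops at the end of loadMatrixFromDb could be folded into one helper. The following is only a sketch, assuming (as in the listing) that each tag begins with its element type and that a drawing command of the same name exists. The name orderByType is hypothetical.

```tcl
# Hypothetical helper: return a flat list of type/tag pairs in the order
# the canvas needs (places, then events, then moves, then separators).
proc orderByType {tags {types {place eevent move separator}}} {
    set ordered {}
    foreach type $types {
        foreach tag $tags {
            if {[string match "${type}*" $tag]} {
                lappend ordered $type $tag
            }
        }
    }
    return $ordered
}

# Possible use inside loadMatrixFromDb (place, eevent, move, and separator
# are the SOFTools drawing commands from the listing):
#   foreach {type tag} [orderByType $allDbElements] {$type $tag}
```

Tags keep their original relative order within each type, so the result matches what the four separate loops would draw.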
Figure 1. An example SOFTools plan.
Performance
A key performance parameter for the SDW is the time required to read or load
information (just as it is for the WWW). For an example SOFTools plan of 21 elements
(places, movement between places, and events), SDW database retrieval time was
roughly 0.7s while total time to retrieve, parse, and display the plan was 1.8 seconds [1].
This total time compares favorably to SOFTools 1.2d load time of 3.6 seconds using its
native XML file format. XML files require additional parsing effort that relational
databases do not, an important consideration.
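The note does not say how these times were measured. One simple way to take comparable wall-clock measurements in Tcl is sketched below; the helper name elapsedSeconds is hypothetical and is not part of SOFTools.

```tcl
# Hypothetical helper: evaluate a script in the caller's scope and return
# the elapsed wall-clock time in seconds.
proc elapsedSeconds {script} {
    set start [clock clicks -milliseconds]
    uplevel 1 $script
    set stop [clock clicks -milliseconds]
    return [expr {($stop - $start) / 1000.0}]
}

# Possible use (loadMatrixFromDb is the procedure listed earlier):
#   puts "database load: [elapsedSeconds {loadMatrixFromDb 1}] s"
```

Timing the query alone and then the whole load procedure in this way would separate retrieval time from parsing and display time.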
A less important performance parameter for many applications is the time required to
write or save information. For SOFTools 1.2d’s native XML file and the example plan,
write times were found to be very fast: 0.7s. Using the SDW interface, SOFTools 1.2d
required nearly 6 seconds to save a plan to the database. The example plan of 21
elements includes 365 variables (in addition to some global variables, there are many
attributes for each plan element). The procedure saveMatrixToDb shows that each
variable requires an SQL operation on the database, not the most efficient method.
However, for many applications including interactive ones such as SOFTools, write
speed is of little consequence because writes should ideally be incremental, as changes
are dictated by the user. No explicit saving should be required.
[1] Network communication times were not considered. The SDW database was hosted on the same machine
as SOFTools 1.2d.
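Where a full save is still wanted, one way to reduce the per-variable cost would be to send many rows in a single statement. The following is a sketch only, assuming the database accepts the multi-row form insert ... values (...), (...), which not every engine does, and assuming the column names that appear in the loadMatrixFromDb query. The name buildBatchInsert is hypothetical.

```tcl
# Hypothetical helper: build one INSERT statement covering many
# element/value pairs. pairs is a flat list: element value element value ...
proc buildBatchInsert {template userIdentity instance pairs} {
    set rows {}
    foreach {element value} $pairs {
        set fields {}
        foreach v [list $template $userIdentity $instance $element $value] {
            # Quote each field as an SQL string literal, doubling single quotes.
            lappend fields "'[string map {' ''} $v]'"
        }
        lappend rows "([join $fields {, }])"
    }
    return "insert into element (template, userIdentity, instance, element, value) values [join $rows {, }]"
}
```

For example, buildBatchInsert $template $userIdentity $instance [array get el] would cover every element attribute held in the el array with one statement, in place of one SQL operation per variable.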
Summary
At least for SOFTools 1.2d, the SDW interface required only four hours and about 100
lines of code to implement, some of it reusable. Creating an SDW interface for
SOFTools 1.2d means that anyone who has access permission may read SOFTools plan
data from any location on the net, and may obtain example data useful for
understanding the meaning of SOFTools variables. Our implementation of the SDW
performed adequately, and read times were generally faster than with SOFTools' own XML file
format because less parsing is required.