XBrain
XQuerying the Brain Mapping Database
Stacy Tang, Yana Kadiyska
Jim Brinkley, Dan Suciu
1
Human Brain Project
• Problem
– Explosion of information due to proliferation of
techniques.
• NIH Goal
– WWW based information tools that allow
management, integration, and sharing of
research data.
2
Brain Mapping Database
• Study of language function through invasive
neurosurgical method, Cortical Stimulation
Mapping
• Combined with non-invasive methods such
as MRI, fMRI, PET scan
• 64 patients, 13 of them published
3
XML
• Document markup language
• Has become the standard for data exchange
between inter-enterprise applications
• Platform independent
• “Self-describing” data
4
XML example
<bib>
<book year="2000">
<title>Data on the Web</title>
<author><last>Abiteboul</last><first>Serge</first></author>
<author><last>Buneman</last><first>Peter</first></author>
<author><last>Suciu</last><first>Dan</first></author>
<publisher>Morgan Kaufmann Publishers</publisher>
<price>39.95</price>
</book>
<book year="1992">…</book>
...
</bib>
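As a minimal sketch of what "self-describing" data means in practice, the Python snippet below reads a trimmed copy of the example with the standard library's xml.etree.ElementTree. The trimmed document (one complete book, with the elided entries dropped) and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's example: one complete <book>; elided entries dropped.
BIB_XML = """
<bib>
  <book year="2000">
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
    <publisher>Morgan Kaufmann Publishers</publisher>
    <price>39.95</price>
  </book>
</bib>
"""

root = ET.fromstring(BIB_XML)
book = root.find("book")

# Element and attribute names carry the meaning; no external schema is needed to read them.
year = book.get("year")
title = book.findtext("title")
authors = [author.findtext("last") for author in book.findall("author")]
price = float(book.findtext("price"))

print(year, title, authors, price)
```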
5
SilkRoute
• Problem: data is stored in a relational database;
how do we translate it to XML for exchange?
• SilkRoute is a tool for publishing relational data in
XML.
• Allows querying of the data using XQuery.
• Developed by Dan and Yana, along with
collaborators from other institutions.
6
What SilkRoute Does
[Diagram. Inputs to SilkRoute: the public query, the user query, and the
relational schema. SilkRoute sends SQL to the RDBMS and receives tuples.
Output: XML.]
7
Objective of Project
To demonstrate the usability of SilkRoute
and XQuery for data sharing by applying them to
a real relational database -- the Brain Mapping
database.
8
Project Background
• Started as a CSE544 (Intro to Database)
project (Spring 2002 Quarter).
• Original project members: Hao Li and
myself.
• Demonstrated feasibility of project.
• Unfinished:
– Covered small part of database
– Depended on manual tweaking of data
– Minimal web interface
9
Tasks of the Project
1. Migrate database from MySQL to
PostgreSQL - automate as much as possible.
2. Complete XQuery-based public view for
the entire database.
3. Work with Yana to smooth out SilkRoute
issues - bug fixes, error handling, etc.
4. Web interface - add new features, improve
look and feel, improve UI.
10
1. MySQL to PostgreSQL
• Why is this necessary?
– Robustness
– Sub-select queries
• Problems: MySQL and PostgreSQL are
very different, and the data needs to be
cleaned up.
• The previous process involved too much
manual tweaking and needed improvement, so
scripts were written to automate it.
11
MySQL to PostgreSQL - Step 1
Make a dump of the MySQL database
- MySQL database is on tela.biostr
- Use a perl script to create a dump in a specified
directory.
12
MySQL to PostgreSQL - Step 2
Translate MySQL dump to PostgreSQL. Use
scripts to:
- clean up syntax
- rename table/column names that are reserved words
(user, public) in PostgreSQL.
- designate primary keys when lacking
- get rid of WIRM related tables
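The renaming step could be sketched in Python roughly as follows. This is not the project's actual script: it assumes the MySQL dump quotes identifiers with backticks, and the replacement names in RENAMES are illustrative guesses. A real script would also need to avoid touching backticks that occur inside string data.

```python
import re

# Reserved words named on the slide, mapped to hypothetical replacement names.
RENAMES = {"user": "UserPerson", "public": "is_public"}

# Assumption: identifiers in the dump are quoted with backticks, as mysqldump does by default.
BACKTICKED = re.compile(r"`([^`]+)`")


def translate_identifiers(sql: str) -> str:
    """Drop MySQL backtick quoting and rename identifiers that are reserved in PostgreSQL."""
    def fix(match: re.Match) -> str:
        name = match.group(1)
        return RENAMES.get(name.lower(), name)

    return BACKTICKED.sub(fix, sql)


print(translate_identifiers("CREATE TABLE `user` (`oid` int, `public` int);"))
```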
13
MySQL to PostgreSQL - Step 3
Create SQL files for running later (generated using
python scripts). The SQL files:
- correct some of the bad data
- add foreign key constraints (lacking in the MySQL
dump)
14
MySQL to PostgreSQL - Step 4
Import the data into PostgreSQL
- run the dump and generated SQL files in a specific
order to allow the data to be entered
- reorder the insert statements so as not to violate foreign
key constraints
- some bad rows still produce errors and are not inserted
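One way to choose an insert order that respects foreign keys is a topological sort of the tables. The sketch below uses Python's graphlib (Python 3.9+). The dependency map is an assumption inferred from column names in the schema (for example Surgery.patient pointing at Patient); it is not the project's actual constraint list.

```python
from graphlib import TopologicalSorter

# table -> tables it references through foreign keys (assumed from column names).
DEPENDS_ON = {
    "Patient": set(),
    "Surgery": {"Patient"},
    "CSMStudy": {"Surgery"},
    "StimSite": {"CSMStudy"},
    "Trial": {"CSMStudy", "StimSite"},
}


def insert_order(depends_on):
    """Return the tables so that every referenced table comes before its referrers."""
    return list(TopologicalSorter(depends_on).static_order())


order = insert_order(DEPENDS_ON)
print(order)
```

Loading the INSERT statements table by table in this order means a row's parent already exists when the row arrives. A cyclic set of constraints would raise graphlib.CycleError and would need the constraints added after loading instead.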
15
2. The Public View
• Provides a virtual view of the relational
database
• Very large (over 1000 lines)
• Data Privacy
– Choose not to publish some fields.
– Protect patient privacy, e.g. patient.initials,
patient.research_num, etc.
– Protect unpublished research data.
• How to translate graph to tree
– DB tables may not be hierarchical, so have to
force parent-child relationships for the DTD.
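The two ideas on this slide, withholding private fields and forcing flat tables into a parent-child tree, can be sketched in Python as below. The real public view is written in XQuery; the sample rows, the field set, and the function name here are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Fields withheld from the view; the names follow the slide's examples.
PRIVATE_FIELDS = {"initials", "research_num"}

# Hypothetical sample rows standing in for the Patient and Surgery tables.
PATIENTS = [
    {"oid": 1, "initials": "AB", "sex": "F", "is_public": 1},
    {"oid": 2, "initials": "CD", "sex": "M", "is_public": 0},
]
SURGERIES = [{"oid": 10, "patient": 1, "diagnosis": "example"}]


def build_public_view(patients, surgeries):
    """Nest each surgery under its patient, skipping non-public rows and private fields."""
    root = ET.Element("root")
    for patient in patients:
        if patient["is_public"] != 1:
            continue
        p_el = ET.SubElement(root, "patient", oid=str(patient["oid"]))
        for field, value in patient.items():
            if field in PRIVATE_FIELDS or field in ("oid", "is_public"):
                continue
            ET.SubElement(p_el, field).text = str(value)
        # The foreign key Surgery.patient becomes a parent-child relationship.
        for surgery in surgeries:
            if surgery["patient"] == patient["oid"]:
                s_el = ET.SubElement(p_el, "surgery", oid=str(surgery["oid"]))
                ET.SubElement(s_el, "diagnosis").text = surgery["diagnosis"]
    return root


view = build_public_view(PATIENTS, SURGERIES)
print(ET.tostring(view, encoding="unicode"))
```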
16
Brain Mapping DB – Schema
Patient(*oid,initials,first_name,last_name,location,registered,age,sex,viq,pnum,
is_public,handedness,wada,size,copy,pre,description,gao_research_num);
Surgery(*oid,patient,surgery_date,surgeon,diagnosis,side,lobe,grid);
CSMStudy(*oid,surgery,function,trial_data,site_data);
File(*oid,label,domain,locator,source,mime_type,submit_date,submitted_by,
version,context,description);
Photo(*oid,preference,image,csmstudy,image_pathname,image_filename);
StimSite(*oid,site_label,zone,lobe,csmstudy,anatomical_name);
Trial(*oid,trial_num,site_label,trial_time,current,slide,eeg_score,miriam_code,
confidence,comments,km_score,site_suffix,csmstudy,stimulation_site);
UserPerson(*oid,login,first_name,last_name,email,password,user_group);
17
Brain Mapping DB – Schema (cont)
SiteToAnatomyMap(*oid,csmstudy,photo,scene,author,map_date,
sitetoanatomyfile,rendered_map,sitetoanatomy_pathname,
sitetoanatomy_filename, preference,modtime);
SiteToAnatomyMapElement(*oid,sitetoanatomymap,stimsite,site_label,
ant_coord,sup_coord,right_coord,x,y,confidence);
Scene(*oid,imaging_study,description,description_file,preference,
ismapscene);
ImagingStudy(*oid,patient,image_date,billed,prefix,subject,suffix,
computed_image_pathname,computed_image_filename,
computed_coords_pathname,computed_coords_filename,
lowres_surface_pathname,lowres_surface_filename,aligned_pathname);
MRExam(*oid,imaging_study,exam_num,description,import_date,
import_info,location);
Rendering(*oid,rendering_type,preference,image,scene,image_pathname,
image_filename);
18
Brain Mapping DB – Schema (cont)
SceneComponent(*oid,scene,description,surface_model,volume);
SurfaceModel(*oid,volume,model_instance,format,model_file,
model_pathname,model_filename,preference);
RadialSliceModelInstance(*oid,volume,model,landmarks_file,instance_file,
expansion_factor,instance_pathname,instance_filename,preference,
landmarks_pathname,landmarks_filename,derived_from);
RadialSliceModel(*oid,pathname,filename,comment,theta_radials,slices,
training_set,model_file,preference);
MRSeries(*oid,mrexam,location,showing,total_images,plane,scan_start,
scan_end,psd,type,description,fov_x,fov_y,height,width,bytes_per_pixel,
bits_per_pixel,optical_disk,start_img,stop_img,threshold,tissue,first,last,
label,thickness,spacing);
MRSlice(*oid,sequence_num,image_file,mrseries);
AlignedVolume(*oid,series,format,volume_file,filename,tissue,patient);
19
Brain Mapping DB – Schema Diagram
[Diagram; legible labels: Patient, Surgery, ImagingStudy.]
20
Brain Mapping DB – Schema Diagram (cont)
21
Brain Mapping DB – Schema Diagram (cont)
22
The Public View – DTD Graph
23
The Public View – DTD Graph (cont)
24
The Public View - In XQuery
<root>
{
  for $patient in $cv/Patient
  where $patient/is_public/text() = "1"
  return
    <patient oid="{$patient/oid/text()}">
      <first_name> xxx </first_name>
      <last_name> xxx </last_name>
      <location> {$patient/location/text()} </location>
      <sex> {$patient/sex/text()} </sex>
      ...
      {
        for $surgery in $cv/Surgery
        where data($surgery/patient) = data($patient/oid)
        return
          <surgery oid="{$surgery/oid/text()}">
            <diagnosis> {$surgery/diagnosis/text()} </diagnosis>
            …
25
User Queries - A Simple Example
Sample Query 1 (written in XQuery):
List the last names of all patients who DID NOT have
surgery.
<results>
{
  for $p in $pv/root/patient
  where empty($p/surgery)
  return
    <last_name>{$p/last_name/text()}</last_name>
}
</results>
26
User Queries - A Simple Example
Alternative (written in XPath):
XPath is a subset of the XQuery language, and thus
perfectly acceptable to use for queries. You can’t do as
much with XPath, but it is very simple to write.
<results>
{
$pv/root/patient[empty(surgery)]/last_name
}
</results>
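To make the query's meaning concrete, here is a minimal Python sketch that evaluates the same condition over a tiny, made-up public view. ElementTree's path support has no empty() function, so the "no surgery child" test is written as a plain Python check. The sample document and patient names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-patient public view; only the first patient has a surgery.
PUBLIC_VIEW = """
<root>
  <patient oid="1"><last_name>Alpha</last_name><surgery oid="10"/></patient>
  <patient oid="2"><last_name>Beta</last_name></patient>
</root>
"""

root = ET.fromstring(PUBLIC_VIEW)

# Same condition as patient[empty(surgery)]/last_name: keep patients with no <surgery> child.
no_surgery = [
    patient.findtext("last_name")
    for patient in root.findall("patient")
    if patient.find("surgery") is None
]
print(no_surgery)
```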
27
User Queries - A Simple Example (cont)
Sample Query 1:
Intermediate SQL query generated by SilkRoute.
SELECT P78.last_name, P78.oid
FROM Patient AS P78
WHERE NOT EXISTS
  ( SELECT *
    FROM Surgery AS S99
    WHERE S99.patient = P78.oid);
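The anti-join pattern in this SQL can be tried out with a small, self-contained Python sketch. It uses an in-memory SQLite database instead of PostgreSQL, two-column stand-ins for the real tables, and made-up rows. The ORDER BY clause is added here only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (oid INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE Surgery (oid INTEGER PRIMARY KEY,
                          patient INTEGER REFERENCES Patient(oid));
""")
# Hypothetical rows: only patient 1 has a surgery.
conn.executemany("INSERT INTO Patient VALUES (?, ?)",
                 [(1, "Alpha"), (2, "Beta"), (3, "Gamma")])
conn.execute("INSERT INTO Surgery VALUES (?, ?)", (10, 1))

# NOT EXISTS keeps each patient for whom the correlated subquery returns no rows.
rows = conn.execute("""
    SELECT P78.last_name, P78.oid
    FROM Patient AS P78
    WHERE NOT EXISTS
      (SELECT * FROM Surgery AS S99 WHERE S99.patient = P78.oid)
    ORDER BY P78.oid
""").fetchall()
print(rows)
```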
28
User Queries - A Simple Example (cont)
Results (in XML):
<results>
<last_name>Chopra</last_name>
<last_name>Townes</last_name>
</results>
29
3. Improve “plumbing” between
SilkRoute and web application
Worked with Yana to improve error handling.
– If user inputs bad query, then return the parse
error back to the user.
– When SilkRoute encounters an error, gracefully
exit instead of bringing down the web page.
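The error-handling pattern described here might look roughly like the Python sketch below. The actual application is written in JSP and Java; run_query, QueryParseError, and the brace check are hypothetical stand-ins, not SilkRoute's API.

```python
class QueryParseError(Exception):
    """Hypothetical error raised when the user's query cannot be parsed."""


def run_query(query: str) -> str:
    """Stand-in for the query engine; rejects queries with unbalanced braces."""
    if query.count("{") != query.count("}"):
        raise QueryParseError("unbalanced braces")
    return "<results/>"


def handle_request(query: str) -> dict:
    """Return either the XML result or an error message, never an unhandled exception."""
    try:
        return {"ok": True, "body": run_query(query)}
    except QueryParseError as exc:
        # Bad user input: report the parse error back to the user.
        return {"ok": False, "body": f"Parse error: {exc}"}
    except Exception:
        # Any other failure: degrade gracefully instead of breaking the page.
        return {"ok": False, "body": "Internal error while running the query."}


print(handle_request("<results>{ $pv/root/patient </results>"))
```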
30
4. Web Interface
• Located at:
http://quad.biostr.washington.edu:8080/xbrain/
• Makes the application available over the web.
• Written in JSP and served by Tomcat.
• Talks to SilkRoute through a Java interface.
• Allows users to input their own queries and
get XML results.
• Added a feature letting certain “super”
users access a version of the public view
that contains all the patients (not just the 13
public ones).
31
Web Interface - System Diagram
[Diagram: a web browser (Internet) exchanges XQuery and XML with
quad.biostr.washington.edu, which hosts Tomcat4, SilkRoute, and Postgres;
MySQL also appears in the diagram.]
32
Web Interface - System Architecture
[Diagram: Tomcat (the application server) runs JSPs/servlets. The XBrain
pages (JSPs) use a Java API to reach Java classes, which connect to
SilkRoute and the DB.]
33
Web Interface - Screen Shots
[Seven slides of screen shots; images not reproduced here.]
34-40
Current Status & Future Work
• Currently, the website is up and running at
http://quad.biostr.washington.edu:8080/xbrain/
• Immediate Future
– Figure out who the super users are by looking in
the “UserPerson” table.
– Store user input in temporary files, to better
handle simultaneous users.
– Add Secure Sockets Layer (SSL) to ensure
secure transfer of XML data when user is
logged in.
– SilkRoute bug fixes.
41
Future Work
• Future:
– Graphical User Interface to help users formulate
user queries.
– Flexible format for visualizing results (e.g.,
comma separated values instead of XML).
– Extend this to other databases.
– Eventual goal of allowing multiple applications
to cooperate in a peer data management system.
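As a rough illustration of the comma-separated output idea, the Python sketch below flattens a simple result document into CSV. It assumes results shaped like the earlier sample (a flat list of child elements under <results>); the function name, column headers, and sample values are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def results_to_csv(xml_text: str) -> str:
    """Write one CSV row per child element of the root: its tag and its text."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["element", "value"])
    for child in root:
        writer.writerow([child.tag, (child.text or "").strip()])
    return buffer.getvalue()


SAMPLE = "<results><last_name>Alpha</last_name><last_name>Beta</last_name></results>"
csv_text = results_to_csv(SAMPLE)
print(csv_text)
```

Nested results would need a flattening rule, such as one column per path, which this sketch does not attempt.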
42
Team/Resources
• SilkRoute support: Yana
• Faculty: Dan Suciu, Jim Brinkley
43
Questions?
For more information, go to the XBrain webpage:
http://quad.biostr.washington.edu:8080/xbrain/
44