The Trio System for Data,
Uncertainty, and Lineage:
Overview and Demo
Anish Das Sarma
Stanford University
Original Motivation for the Project
New Application Domains
• Many involve data that is uncertain
(approximate, probabilistic, inexact, incomplete,
imprecise, fuzzy, inaccurate,...)
• Many of the same ones need to track the
lineage (provenance) of their data
Neither uncertainty nor lineage is
supported in current database systems
Sample Applications
Data integration
Information extraction
Scientific experiments
Sensor data management
Deduplication (“data cleaning”)
Approximate query processing
Our Goal
Develop a new kind of database management
system (DBMS) in which:
1. Data
2. Uncertainty
3. Lineage
are all first-class interrelated concepts
• With all the “usual” DBMS features
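As a rough illustration of what it could mean to treat data, uncertainty, and lineage together, here is a minimal Python sketch. It is not Trio's implementation and it is not TriQL: the class names, the `select` helper, the sample "saw" relation, and the confidence values are all hypothetical, chosen only to show one tuple carrying alternative values, a confidence per alternative, and a pointer back to the data it was derived from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternative:
    """One possible value of an uncertain tuple (hypothetical structure)."""
    values: tuple                      # the data
    confidence: float                  # the uncertainty
    lineage: frozenset = frozenset()   # the lineage: (source tuple id, alternative index)


@dataclass(frozen=True)
class UncertainTuple:
    tuple_id: str
    alternatives: tuple

    def may_be_absent(self, eps=1e-9):
        # If the confidences sum to less than 1, the tuple might not exist at all.
        return sum(a.confidence for a in self.alternatives) < 1.0 - eps


def select(tuples, predicate, out_prefix="r"):
    """Keep the alternatives that satisfy the predicate and record where each came from."""
    result = []
    for t in tuples:
        kept = tuple(
            Alternative(a.values, a.confidence, frozenset({(t.tuple_id, i)}))
            for i, a in enumerate(t.alternatives)
            if predicate(a.values)
        )
        if kept:
            result.append(UncertainTuple(f"{out_prefix}{len(result) + 1}", kept))
    return result


# Hypothetical base data: Amy is unsure which car she saw; Bob is certain.
saw = [
    UncertainTuple("s1", (Alternative(("Amy", "Honda"), 0.6),
                          Alternative(("Amy", "Toyota"), 0.3))),
    UncertainTuple("s2", (Alternative(("Bob", "Honda"), 1.0),)),
]

hondas = select(saw, lambda v: v[1] == "Honda")
for t in hondas:
    for a in t.alternatives:
        print(t.tuple_id, a.values, a.confidence, sorted(a.lineage))
```

Each result tuple keeps its confidence and names the base alternative it came from, so a later step could ask both "how sure are we?" and "where did this come from?" of the same row.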
Another “Trio” in Trio
1. Data Model
Simplest extension to relational model that’s
sufficiently expressive
2. Query Language
Simple extension to SQL with well-defined
semantics and intuitive behavior
3. System
A complete open-source DBMS that people
want to use
Another “Trio” in Trio
1. Data Model
Uncertainty-Lineage Databases (ULDBs)
2. Query Language
TriQL
3. System
Trio-One — built on top of standard DBMS
Demo
Ongoing and Future Work
• Efficient Confidence Computation
• Top-K Queries
• Aggregation
• External Lineage
• Data Modifications and Versioning
• Continuous Uncertainty
• Dependency Theory for ULDBs
• Marrying Trio and Bayes Nets
• System Development and Applications
Trio Players, Present and Past
Current
• Jennifer Widom, Jeffrey Ullman
• Parag Agrawal, Anish Das Sarma, Raghotham Murthy,
Martin Theobald
Alums
• Omar Benjelloun, Ashok Chandra, Julien Chaumond,
Alon Halevy, Chris Hayworth, Ander de Keijzer, Michi
Mutsuzaki, Shubha Nabar, Tomoe Sugihara
Thank you!
Search “stanford trio”