Interactive SQL Query Suggestion
Making Databases User-Friendly
Ju Fan, Guoliang Li, and Lizhu Zhou
Database Research Group, Tsinghua University
ICDE 2011 – Apr. 13, Hanover
Outline
• Motivation
• Overview of SQL Query Suggestion
• Template Suggestion
• SQL Generation
• Experiments
• Conclusion
5/22/2017
SQL Suggestion, ICDE 2011
2
SQL: Powerful Yet Difficult
• SQL is powerful, but inexperienced users find it difficult to pose queries
▪ They must be skilled in SQL syntax to express their query intent
▪ They must thoroughly understand the schema
SQL Assistant Tools
• Target Users
▪ Novice users who struggle with basic SQL syntax or the structure of the schema
• Limitations
▪ Only support metadata and SQL syntax
▪ Require users to join multiple tables manually
Keyword Search over RDB
• Keyword Search over Relational DB
▪ Data: A database with multiple tables
▪ Query: Keywords
▪ Answer: Joined tuples containing the keywords
• Limitations
▪ Cannot precisely capture users’ query intent
▪ May involve irrelevant results
▪ Cannot support aggregate functions, range
queries, etc.
SQL Suggestion from Keywords
Features of SQL Suggestion
• Objective: Assist users to formulate SQL
queries using keywords
• Main Features
▪ Query intent prediction
▪ Answer grouping
▪ Aggregation queries
▪ Range queries
Comparison of Query Paradigms
[Figure: Keyword Search, SQL Suggestion, and SQL compared by usability (easier) and expressiveness (more powerful)]
Problem Definition
• Data: A database with multiple tables
• Query: Keywords
• Answer: SQL queries
A Two-Step Framework
• User query: “count paper ir”
• Step 1: Template Suggestion
▪ Output: one of the relevant templates
• Step 2: SQL Generation
▪ Output: one of the generated SQL queries
SELECT COUNT (P.id)
FROM Paper P, Author A, Write W
WHERE A.name CONTAINS “ir”
AND A.id = W.aid AND P.id = W.pid
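To make the generated query concrete, it can be tried on a toy database. The following is only a sketch under assumptions: the rows are invented for illustration, SQLite is used for convenience, and `LIKE '%ir%'` stands in for the slide's `CONTAINS` predicate, which SQLite does not accept in that form.

```python
import sqlite3

# Toy schema mirroring the slide's Paper / Author / Write tables.
# All rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Paper  (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Author (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Write  (aid INTEGER REFERENCES Author(id),
                     pid INTEGER REFERENCES Paper(id));
INSERT INTO Paper  VALUES (1, 'Keyword search'), (2, 'Query suggestion'), (3, 'Indexing');
INSERT INTO Author VALUES (1, 'Irene'), (2, 'Bob');
INSERT INTO Write  VALUES (1, 1), (1, 2), (2, 3);
""")

# The suggested query, with LIKE substituted for CONTAINS (an assumption).
paper_count = conn.execute("""
    SELECT COUNT(P.id)
    FROM Paper P, Author A, Write W
    WHERE A.name LIKE '%ir%'
      AND A.id = W.aid AND P.id = W.pid
""").fetchone()[0]

print(paper_count)
```

With this sample data, only the author named Irene matches, so the query counts her two papers.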
Template Suggestion
Queryable Template
• The skeleton of SQL queries that models the
joined entities and their attributes.
• A template is an undirected graph
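One possible way to hold such a template in memory is sketched below. The `Template` class and its method names are hypothetical and are not taken from the slides; the only property carried over is that the graph is undirected, so each edge is stored as an unordered pair.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """Undirected graph over entity aliases (hypothetical representation)."""
    entities: dict                           # alias -> table name
    edges: set = field(default_factory=set)  # each edge is a frozenset of two aliases

    def join(self, a, b):
        # Storing an unordered pair makes join(a, b) and join(b, a) equivalent.
        self.edges.add(frozenset((a, b)))

    def neighbours(self, alias):
        # All aliases sharing an edge with the given alias.
        return {x for e in self.edges if alias in e for x in e if x != alias}


# The Paper - Write - Author template used in the running example.
pwa = Template({"P": "Paper", "W": "Write", "A": "Author"})
pwa.join("P", "W")
pwa.join("W", "A")

print(sorted(pwa.neighbours("W")))
```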
Template Generation
• Atom Entities
▪ E.g., Paper
• Expansion Rules
▪ E.g., P – W → P – W – A
• Combinatorial Explosion
▪ A ranking model for avoiding exploring all templates

Templates by size (P = Paper, A = Author, W = Write):

Size   ID   Template
1      T1   P
1      T2   A
1      T3   W
2      T4   P–W
2      T5   A–W
3      T6   P–W–A
3      T7   W–P–W
3      T8   W–A–W
…      …    …
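The size-by-size expansion can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure. It assumes a schema in which Write holds one foreign key to Paper and one to Author, and it assumes that a referencing entity may attach to at most one instance of each entity it references. The names `FOREIGN_KEYS`, `canonical`, `expand`, and `generate` are hypothetical.

```python
# Hypothetical schema: Write (W) holds foreign keys to Paper (P) and Author (A).
FOREIGN_KEYS = {"W": ["P", "A"]}
ENTITIES = ["P", "A", "W"]


def canonical(labels, edges):
    """Canonical string of a labelled tree: the smallest encoding over all roots."""
    adj = {i: [] for i in range(len(labels))}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    def encode(node, parent):
        kids = sorted(encode(c, node) for c in adj[node] if c != parent)
        return labels[node] + "(" + "".join(kids) + ")"

    return min(encode(root, None) for root in range(len(labels)))


def expand(labels, edges):
    """Yield every template obtained by attaching one more entity."""
    new = len(labels)
    for i, label in enumerate(labels):
        linked = [labels[v if u == i else u] for u, v in edges if i in (u, v)]
        # Node i references another entity through a foreign key not used yet.
        for target in FOREIGN_KEYS.get(label, []):
            if target not in linked:
                yield labels + [target], edges + [(i, new)]
        # Another entity references node i (allowed any number of times).
        for source, targets in FOREIGN_KEYS.items():
            if label in targets:
                yield labels + [source], edges + [(i, new)]


def generate(max_size):
    """Return {size: {canonical form: (labels, edges)}} up to max_size."""
    level = {canonical([e], []): ([e], []) for e in ENTITIES}
    by_size = {1: level}
    for size in range(2, max_size + 1):
        nxt = {}
        for labels, edges in level.values():
            for new_labels, new_edges in expand(labels, edges):
                nxt.setdefault(canonical(new_labels, new_edges), (new_labels, new_edges))
        by_size[size] = level = nxt
    return by_size


templates = generate(3)
sizes = [len(templates[s]) for s in (1, 2, 3)]
print(sizes)
```

Under these assumptions the sketch yields three templates of size 1, two of size 2, and three of size 3, which matches T1 to T8 in the table. The count keeps growing with size, which is why a ranking model is needed instead of exhaustive enumeration.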
Template Ranking Model
P(Q,T) = P(T) ∑_{R∈T} ∑_{k∈Q} P(k|R) P(R|T)
• Q: the keyword query (Keyword1 Keyword2 … Keywordn)
• R: an entity in template T (e.g., Paper, Write, Author)
• P(k|R): relevance of R to k (TF-IDF)
• P(R|T): importance of R to T (PageRank)
• P(T): query ability of T
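A minimal sketch of how the ranking formula can be evaluated is given below. All probabilities are made-up numbers chosen for illustration; in the slides they come from TF-IDF, PageRank, and the query ability of the template. The function name `template_score` and its parameters are hypothetical.

```python
def template_score(query, entities, p_template, p_kw_given_entity, p_entity_given_template):
    """P(Q,T) = P(T) * sum over R in T of (sum over k in Q of P(k|R)) * P(R|T)."""
    total = 0.0
    for entity in entities:
        # Inner sum: relevance of this entity to every keyword of the query.
        relevance = sum(p_kw_given_entity.get((k, entity), 0.0) for k in query)
        total += relevance * p_entity_given_template[entity]
    return p_template * total


# Hypothetical values for the template P-W-A and the query "count paper ir".
query = ["count", "paper", "ir"]
entities = ["P", "W", "A"]
p_kw_given_entity = {("paper", "P"): 0.8, ("ir", "A"): 0.5}
p_entity_given_template = {"P": 0.5, "W": 0.2, "A": 0.3}

score = template_score(query, entities, 0.1, p_kw_given_entity, p_entity_given_template)
print(score)
```

Here the score is 0.1 × (0.8 × 0.5 + 0.5 × 0.3), roughly 0.055. Keywords with no relevance to any entity of the template contribute nothing.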
Top-k Suggestion Algorithm
P(Q,T) = P(T) ∑_{R∈T} w_R P(R|T), where w_R = ∑_{k∈Q} P(k|R)
• Fagin Algorithm [5]
▪ Lists of templates for all entities ordered by P(R|T)
• Indexing
▪ Inverted Index:
◦ Entity-to-Template
▪ Forward Index:
◦ Template-to-Entity
[Figure: one sorted list per entity, P(P|T), P(A|T), and P(W|T), weighted by w_P, w_A, and w_W]
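The top-k search can be sketched with the threshold-algorithm (TA) variant from Fagin's family of algorithms. This is not necessarily the exact procedure used in the paper. As a simplifying assumption, each list entry here stores P(T)·P(R|T), so that the score is a plain weighted sum over the lists. All numbers and the names `build_inverted` and `top_k` are hypothetical.

```python
import heapq


def build_inverted(forward):
    """Entity-to-template lists, each sorted by descending value."""
    lists = {}
    for template, entries in forward.items():
        for entity, value in entries.items():
            lists.setdefault(entity, []).append((template, value))
    for entries in lists.values():
        entries.sort(key=lambda tv: tv[1], reverse=True)
    return lists


def top_k(lists, forward, weights, k):
    """Threshold algorithm: stop once the k-th best score reaches the threshold."""
    entities = [r for r in lists if weights.get(r, 0.0) > 0]
    seen = {}
    max_depth = max((len(lists[r]) for r in entities), default=0)
    for depth in range(max_depth):
        threshold = 0.0
        for r in entities:
            if depth < len(lists[r]):          # sorted access
                template, value = lists[r][depth]
                threshold += weights[r] * value
                if template not in seen:       # random access via forward index
                    seen[template] = sum(
                        weights[e] * forward[template].get(e, 0.0) for e in entities
                    )
        best = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(best) == k and best[-1][1] >= threshold:
            return best
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])


# Forward index (template-to-entity) with hypothetical values.
forward = {
    "T1": {"P": 0.9},
    "T2": {"A": 0.9},
    "T4": {"P": 0.5, "W": 0.4},
    "T6": {"P": 0.3, "W": 0.3, "A": 0.3},
}
weights = {"P": 0.6, "A": 0.3, "W": 0.1}   # w_R, hypothetical

lists = build_inverted(forward)
result = top_k(lists, forward, weights, k=2)
print([name for name, _ in result])
```

The inverted index supplies sorted access and the forward index supplies random access. An exhausted list contributes zero to the threshold, because any template not yet seen cannot contain that entity.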
SQL Generation
Match Keywords to Attributes
• Keyword-to-Attribute Mapping
▪ Selection σ
▪ Aggregation Φ
▪ Projection π
[Figure: a matching of the keywords database, count, and author to attributes of Paper (id, title, booktitle, year) and Author (id, name), with mappings labeled σ, Φ, or π]
SQL Generation Model
S(M) = ∑_{m∈M} ρ(k,A) I(A)
• M: a matching
• ρ(k,A): the degree of a mapping
• I(A): the importance of mapped attributes (entropy)
[Figure: a matching of the keywords database, count, and author to attributes of Paper (id, title, booktitle, year) and Author (id, name), with mappings labeled σ, Φ, or π]
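The scoring function can be illustrated with a short sketch. It assumes that I(A) is the Shannon entropy of the values stored in attribute A; the slides only say "entropy", so the exact definition may differ. The column values, the ρ values, and the function names `entropy` and `matching_score` are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the value distribution of an attribute."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def matching_score(matching, rho, importance):
    """S(M) = sum over mappings (k, A) in M of rho(k, A) * I(A)."""
    return sum(rho[(k, a)] * importance[a] for k, a in matching)


# Hypothetical column contents used to derive I(A).
columns = {
    "Paper.title": ["a", "b", "c", "d"],   # four distinct values
    "Author.name": ["x", "y"],             # two distinct values
}
importance = {attr: entropy(vals) for attr, vals in columns.items()}

# Hypothetical mapping degrees rho(k, A) for one candidate matching.
rho = {("database", "Paper.title"): 0.5, ("author", "Author.name"): 1.0}
matching = [("database", "Paper.title"), ("author", "Author.name")]

score = matching_score(matching, rho, importance)
print(score)
```

In this toy setting the title attribute has entropy 2 bits and the name attribute 1 bit, so the matching scores 0.5 × 2 + 1.0 × 1 = 2.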
Best SQL Query Generation
• Optimization Problem
▪ Maximize S(M) = ∑_{m∈M} ρ(k,A) I(A)
• Weighted Set Covering Problem (NP-hard)
▪ A greedy approximation algorithm
• Extensions
▪ Find Top-k matchings
Experiment Setup
• Data sets
▪ DBLP: More than one million publication records
▪ DBLife: Activity information about people in the database community
• Query sets, e.g.,
▪ count author mining (DBLP)
▪ database jim gray (DBLife)
• Baseline method: DISCOVER-II
• User study for effectiveness evaluation
Template Suggestion
Precision-Recall Curves on the DBLife data set
SQL Generation
Query: database Jim Gray
Precisions on the DBLife data set
Record Retrieval
Query: count author mining
Advantages of SQL Suggestion
• Support aggregation functions
• Support meta-data matching
Precisions on the DBLife data set
Efficiency
Efficiency Comparison (DBLife)
Scalability
Scalability (DBLP)
Conclusion
• An effective and user-friendly keyword-based method
• Assists users in formulating SQL queries
• Suggests templates relevant to keyword queries
• Generates SQL queries from templates
• Extensive experiments
Future Work
• This study opens many new interesting and
challenging problems
▪ Cardinality estimation of suggested SQL queries
▪ Personalized SQL suggestion
Thanks
Demo: http://tastier.cs.tsinghua.edu.cn/sqlsugg
My Homepage: http://dbgroup.cs.tsinghua.edu/fanju
Comparison with Existing Work
• CN-Based Methods
▪ Better template ranking
▪ SQL Generation
▪ Aggregation functions, range queries, etc.