Download Keyword Searching and Browsing in Databases using BANKS

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Open Database Connectivity wikipedia , lookup

Concurrency control wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Database wikipedia , lookup

Functional Database Model wikipedia , lookup

ContactPoint wikipedia , lookup

Clusterpoint wikipedia , lookup

Versant Object Database wikipedia , lookup

Relational model wikipedia , lookup

Database model wikipedia , lookup

Transcript
Keyword Searching and Browsing
in Databases using BANKS
Seoyoung Ahn
Mar 3, 2005
The University of Texas at Arlington
Outline
 Introduction
 Database and Query Model
 Searching for the best answers
 Browsing features of BANKS
 Experiment
 Conclusion
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Introduction
 Search engines on Web have popularized an
unstructured querying and browsing

Simple and user-friendly

Users just type in keywords and follow hyperlink
 Relational databases are commonly searched using
structured query language

Users need to know the schema
 Keyword searching techniques cannot be used on
data stored in databases

It often splits across the tables/tuples due to normalization
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Introduction(cond..)
 BANKS (Browsing And Keyword
Searching)
 a system which enables keyword-based search on
relational databases, together with data and schema
browsing
User
Seoyoung Ahn
HTTP
BANKS
system
JDBC
Database
Keyword Searching and Browsing in Databases using BANKS
Introduction(cond..)
 BANKS (Browsing And Keyword
Searching)
 a framework for keyword querying of relational
database
 a novel and efficient heuristic algorithm for executing
keyword queries
 key features of BANKS system
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Outline
 Introduction
 Database and Query Model

Informal Model

Formal Model

Query and Answer Model
 Searching for the best answers
 Browsing features of BANKS
 Experiment
 Conclusion
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
 Informal Model
 Model Description
database

each tuple in db
fk-pk-Link
Seoyoung Ahn
directed graph
node in the graph


directed edge
Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
 The Schema
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
 A Fragment of the Database
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
 Informal Model(cond.)
 An answer to a query should be a subgraph
connecting nodes matching the keywords.
 The importance of a link depends upon the type of
the link i.e. what relations it connects and on its
semantics
 Ignoring directionality would cause problems because
of “hubs” which are connected to a large numbers of
nodes.
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
 Formal Database Model
Nodes and edges
 Node Weight : N(u)
Depends on the prestige
Set the node prestige = the indegree of the node
Nodes that have multiple pointers to them get a higher
prestige
Node score N = root node weight
+ ∑ leaf node weight
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
 Formal Database Model (Cond.)
 Edge Weights
Some pupluar tuples can be connected many other
tuples  Edge with forward and backward edge weights
Weight of a forward link = the strength of the proximity
relationship between two tuples
(set to 1 by default)
Weight of a backward link = indegree of edges pointing
to the node
Total edge weight = ∑ edge weights
Edge score E = 1 / Total edge weight
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
 Formal Database Model (Cond.)
 Overall relevance score
= Node weights + Edge Weight
Normalize in the range [0,1]
 Combine using weighting factor 
Additive: (1- ) E + N;
multiplicative: E * N
Seoyoung Ahn

Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
 Query and Answer Model
 Query
A set of keywords e.g.{k1,k2,…kn}
A set of nodes Si = {S1,S2,…Sn}
Locate nodes matching search terms t1,t2,…tn
 Answer Model
A rooted directed tree connecting keyword nodes
 Relevance score of an answer tree
Relevance scores of it nodes and its edge weight
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
 Answer Model
 A rooted directed tree connecting keyword
nodes
 Multiple answers
Ranked by proximity + prestige
Proximity  edges weights
Prestige  indegree of nodes
 Relevance score of an answer tree
Relevance scores of it nodes and its edge weight
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
 Result of query “sudarshan soumen”
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Outline
 Introduction
 Database and Query Model
 Searching for the best answers

Backward expanding search algorithm
 Browsing features of BANKS
 Experiment
 Conclusion
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Searching for the best answers
 Backward expanding search algorithm

Offers a heuristic solution for incrementally computing query
results.

Assume that the graph fits in memory

Start at leaf nodes each containing a query keyword

Run concurrent single source shortest path algorithm from each
such node

Traverses the graph edges backwards

Confluence of backward paths identify answer tree roots
Output a node whenever it is on the intersection of the sets of nodes
reached from each keyword

Answer trees may not be generated in relevance order
Insert answers to a small buffer (heap)
Output highest ranked answer from buffer to user when buffer is full
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Searching for the best answers
 Model (Query : Charuta Sudarshan Roy )
BANKS: Keyword search…
paper
writes
Charuta
Seoyoung Ahn
S. Sudarshan
Prasan Roy
author
Keyword Searching and Browsing in Databases using BANKS
Outline
 Introduction
 Database and Query Model
 Searching for the best answers
 Browsing features of BANKS
 Experiment
 Conclusion
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Browsing
 BANKS system provides
 A rich interface to browse data stored in a relational
database
 Automatically generates browsable views of database
relations and query results
 Schema browsing and data browsing
 A hyperlink to the referenced tuple
 Templates for several predefined ways of displaying
data
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Browsing
 Data browsing
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Browsing
 Schema browsing
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Outline
 Introduction
 Database and Query Model
 Searching for the best answers
 Browsing features of BANKS
 Experiment
 Conclusion
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Error scores vs parameter choices
 The rankings are
relatively stable
across different
choices of parameter
values
  = 0.2 coupled
with log scaling of
edges weights
does best
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Outline
 Introduction
 Database and Query Model
 Searching for the best answers
 Browsing features of BANKS
 Experiment
 Conclusion
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS
Conclusion
 BANKS system
 provides an integrated browsing and keyword
querying system for relational databases
 allows users with no knowledge of database systems
or schema to query and browse relational database
with ease
Seoyoung Ahn
Keyword Searching and Browsing in Databases using BANKS