Implementing P3P Using Database Technology
Rakesh Agrawal
Jerry Kiernan
Ramakrishnan Srikant
Yirong Xu
IBM Almaden Research Center
The Context for This Work
 Central theme of our current research
– How to design information systems that respect the privacy of
individual information while not impeding information flow
 An important aspect
– Users should be able to express how they would like their
information to be treated
– Businesses should be able to state what they are going to do with
the information they collect
– Data exchange should only happen if the two are compatible
– P3P provides mechanisms for accomplishing this goal
 Other aspects
– Mechanisms for enforcing that businesses act according to their
stated policies (“Hippocratic Databases”)
– Mechanisms for doing analytics at aggregate level while respecting
privacy of individual data (“Privacy Preserving Data Mining”)
Outline
 Overview of P3P (Platform for Privacy Preferences)
 Architectures for implementing P3P
– Client-centric (prevailing)
– Server-centric (our proposal)
 Use of database technology for implementing
server-centric architecture
 Performance
 Conclusion and future work
What is P3P
 Traditional privacy policies do not work
– by the lawyers, for the lawyers
 New W3C recommendation (standard) since April 2002
 A standard way to communicate privacy practices
– Privacy policies: encode a web site’s data-collection and data-use
practices in the P3P policy language
– Privacy preferences: specify a user’s preferences in the APPEL language
– Matching: programmatically compare a preference against a policy
P3P Policy for Volga
<POLICY>
... ...
<STATEMENT>
<PURPOSE><current/><telemarketing/></PURPOSE>
<RECIPIENT><ours/><delivery/></RECIPIENT>
<RETENTION><indefinitely/></RETENTION>
<DATA-GROUP>
<DATA ref="#user.name"/>
<DATA ref="#user.home-info.telecom.telephone"/>
</DATA-GROUP>
</STATEMENT>
</POLICY>
APPEL Preference for Jane
<appel:RULESET>
<appel:RULE behavior="block">
<POLICY>
<STATEMENT>
<PURPOSE appel:connective="or">
<telemarketing/><contact/>
</PURPOSE>
</STATEMENT>
</POLICY>
</appel:RULE>
<appel:RULE behavior="request">
<appel:OTHERWISE/>
</appel:RULE>
</appel:RULESET>
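As a rough illustration of the matching step, the sketch below checks Jane's rules against Volga's statement using Python's standard `xml.etree.ElementTree`. It is a simplification under stated assumptions: only the `or` connective on PURPOSE is handled, namespaces are omitted, and the catch-all rule is represented by an empty set rather than by parsing `appel:OTHERWISE`. The function name `first_matching_behavior` and the `RULES` encoding are hypothetical and not part of any P3P tool.

```python
import xml.etree.ElementTree as ET

# Volga's policy, reduced to the PURPOSE element (namespaces omitted).
POLICY_XML = """
<POLICY>
  <STATEMENT>
    <PURPOSE><current/><telemarketing/></PURPOSE>
  </STATEMENT>
</POLICY>
"""

# Jane's preference as (behavior, purposes that trigger it). An empty set
# stands in for the catch-all rule that contains appel:OTHERWISE.
RULES = [
    ("block", {"telemarketing", "contact"}),
    ("request", set()),
]

def first_matching_behavior(policy_xml, rules):
    """Return the behavior of the first rule that fires (simplified 'or' semantics)."""
    policy = ET.fromstring(policy_xml)
    declared = {p.tag for p in policy.findall("./STATEMENT/PURPOSE/*")}
    for behavior, purposes in rules:
        # A catch-all rule always fires; otherwise any shared purpose fires the rule.
        if not purposes or declared & purposes:
            return behavior
    return None

print(first_matching_behavior(POLICY_XML, RULES))
```

Rules are tried in order, so the `block` rule fires for Volga because the policy declares `telemarketing`; a policy declaring only `current` would fall through to `request`.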
Current Implementations
 Tools for creating policies
– IBM Tivoli Privacy Wizard
– P3PEdit
 Tools for creating preferences
– JRC APPEL Preference Editor
 Tools for matching preferences
– AT&T Privacy Bird
– Microsoft Internet Explorer 6.0
– JRC P3P Proxy
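Policy-Preference Matching (Client-Centric)
[Figure: client-side matching with a specialized APPEL engine next to the browser; the web server supplies the policy.]
1. Browser to web server: request policy
2. Web server to browser: send policy
3. Browser to APPEL engine: policy and user preference
4. APPEL engine to browser: result of matching
5. Browser to web server: request web page if policy conforms to preference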
Server-Centric Architecture
 We propose a server-centric architecture for deploying
P3P:
– Server-side matching
– Reuse proven database technology
 Store privacy policies in a database system
 Query the database for matching preferences against
privacy policies
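Policy-Preference Matching (Server-Centric)
[Figure: the browser talks to the web server, which uses an APPEL-to-query converter and a database holding policy metadata.]
1. Browser to web server: send preference and URI of a web page
2. Web server to APPEL-to-query converter: preference and web page URI
3. Converter to database (policy metadata): query
4. Database to converter: query results
5. Web server to browser: send result of matching preference against policy
6. Browser to web server: request web page if policy conforms to preference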
Alternative Architectures
 Two orthogonal dimensions for implementing P3P
– What matching engine should be used?
– Where should the matching take place?

                     Client     Server
Specialized Engine   Current    ?
Database Engine      ?          Proposed
Discussion of Server-Centric Solution
 Advantages of server-side matching
– Support for thin, mobile clients
– Better support for new privacy-sensitive applications
– Extra information for policy refinement
– Easier upgrade of P3P specification
 Advantages of using database
– No reinvention, reuse of proven technology
– Better management of policies
– Infrastructure for policy enforcement
 Disadvantages
– Greater amount of trust in the server
Variations of the Server-Centric Architecture
 Relational tables + SQL queries
 Relational tables + XML view + XQueries
 Native XML store + XQueries
Storing Policies in Database
[Figure: Policy Creation Wizard → P3P policies → Shredder → SQL inserts → Database (policy metadata)]
Storing Policies (cont.)
[Figure: relational schema for shredded policies]
Policy (policy_id, name, …)
Statement (statement_id, policy_id, retention, consequence)
Purpose (statement_id, policy_id, purpose, required)
Recipient (statement_id, policy_id, recipient, required)
Datagroup (datagroup_id, statement_id, policy_id, base)
Data (data_id, datagroup_id, statement_id, policy_id, ref, …)
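The shredding step can be sketched as follows, using Python's built-in `sqlite3` in place of DB2 so the example runs anywhere. This is a minimal sketch under assumptions: only the Policy, Statement, Purpose, and Recipient tables are created, the `consequence` and `required` columns are left out, namespaces are omitted, and the `shred` function is a hypothetical helper rather than the authors' shredder.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Volga's statement without the DATA-GROUP part (namespaces omitted).
POLICY_XML = """
<POLICY>
  <STATEMENT>
    <PURPOSE><current/><telemarketing/></PURPOSE>
    <RECIPIENT><ours/><delivery/></RECIPIENT>
    <RETENTION><indefinitely/></RETENTION>
  </STATEMENT>
</POLICY>
"""

# Subset of the schema on the slide; column choice is an assumption.
SCHEMA = """
CREATE TABLE Policy    (policy_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Statement (statement_id INTEGER, policy_id INTEGER, retention TEXT);
CREATE TABLE Purpose   (statement_id INTEGER, policy_id INTEGER, purpose TEXT);
CREATE TABLE Recipient (statement_id INTEGER, policy_id INTEGER, recipient TEXT);
"""

def shred(conn, name, policy_xml):
    """Insert one policy into the tables and return its policy_id."""
    root = ET.fromstring(policy_xml)
    policy_id = conn.execute(
        "INSERT INTO Policy (name) VALUES (?)", (name,)).lastrowid
    for statement_id, stmt in enumerate(root.findall("STATEMENT"), start=1):
        retention = stmt.find("RETENTION/*")
        conn.execute(
            "INSERT INTO Statement VALUES (?, ?, ?)",
            (statement_id, policy_id,
             retention.tag if retention is not None else None))
        for purpose in stmt.findall("PURPOSE/*"):
            conn.execute("INSERT INTO Purpose VALUES (?, ?, ?)",
                         (statement_id, policy_id, purpose.tag))
        for recipient in stmt.findall("RECIPIENT/*"):
            conn.execute("INSERT INTO Recipient VALUES (?, ?, ?)",
                         (statement_id, policy_id, recipient.tag))
    return policy_id

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
volga_id = shred(conn, "Volga", POLICY_XML)
purposes = [row[0] for row in conn.execute(
    "SELECT purpose FROM Purpose WHERE policy_id = ? ORDER BY purpose",
    (volga_id,))]
print(purposes)
```

Each child element of PURPOSE and RECIPIENT becomes one row keyed by `(statement_id, policy_id)`, which is what lets later queries link tables by foreign key.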
Converting APPEL into Queries
String main(Rule r) {
    // rule behavior becomes the select list; the rule body becomes the WHERE clause
    String sql = "SELECT '" + r.behavior() + "'" +
                 " FROM " + applicablePolicy() +
                 " WHERE " + connect(r);
    return sql;
}
String connect(Expression e) {
    // match attributes of e
    String sqlAttr = genAttr(e);
    // match subexpressions of e
    String sqlSub = "";
    String theta = e.connective(); // theta is either "or" or "and"
    for each subexpression se of e do {
        // theta separates terms; it must not trail the last one
        if (!sqlSub.isEmpty()) sqlSub += " " + theta + " ";
        sqlSub += "EXISTS (" + path(se) + " AND " + connect(se) + ")";
    }
    if (sqlSub.isEmpty()) return sqlAttr; // leaf: no subexpressions
    return sqlAttr + " AND (" + sqlSub + ")";
}
String path(Expression e) {
    return "SELECT *" +
           " FROM " + e.name() +
           " WHERE " + e.foreignKey() + " = " +
           e.parent().primaryKey();
}
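The recursion above can be sketched in runnable form as follows. This is an illustrative Python rendering, not the authors' converter: the `Expr` class, the `JOIN_KEYS` map, and the function names are assumptions, leaf elements such as `telemarketing` are modeled as values of a column rather than as nested expressions, and values are interpolated into the SQL text only for readability (real code should use bound parameters).

```python
from dataclasses import dataclass, field

# Join columns linking each table to its parent (assumed from the schema slide).
JOIN_KEYS = {
    "Statement": ["policy_id"],
    "Purpose": ["statement_id", "policy_id"],
}

@dataclass
class Expr:
    table: str                                     # table the APPEL element maps to
    connective: str = "and"                        # "and" or "or"
    column: str = ""                               # column holding leaf values, if any
    values: list = field(default_factory=list)     # leaf elements, e.g. telemarketing
    children: list = field(default_factory=list)   # nested element expressions

def gen_attr(e):
    """Predicate over the leaf values of e, combined with e's connective."""
    if not e.values:
        return ""
    tests = [f"{e.table}.{e.column} = '{v}'" for v in e.values]
    return "(" + f" {e.connective.upper()} ".join(tests) + ")"

def path(e, parent):
    """Subquery that joins e's table to its parent's table by key columns."""
    joins = [f"{e.table}.{k} = {parent.table}.{k}" for k in JOIN_KEYS[e.table]]
    return f"SELECT * FROM {e.table} WHERE " + " AND ".join(joins)

def connect(e):
    """Recursively build the predicate for e and its subexpressions."""
    subs = []
    for se in e.children:
        inner = connect(se)
        sub = path(se, e) + (f" AND {inner}" if inner else "")
        subs.append(f"EXISTS ({sub})")
    parts = []
    attr = gen_attr(e)
    if attr:
        parts.append(attr)
    if subs:
        # Parenthesize so an OR between subqueries cannot escape into the AND chain.
        parts.append("(" + f" {e.connective.upper()} ".join(subs) + ")")
    return " AND ".join(parts)

def to_sql(behavior, root):
    """The rule behavior becomes the select list; the rule body the WHERE clause."""
    where = connect(root)
    return f"SELECT '{behavior}' FROM {root.table}" + (f" WHERE {where}" if where else "")

rule = Expr("Policy", children=[
    Expr("Statement", children=[
        Expr("Purpose", connective="or", column="purpose",
             values=["telemarketing", "contact"])])])
sql = to_sql("block", rule)
print(sql)
```

Joining the terms with the connective, instead of appending it after every term, avoids a dangling `or` at the end of the generated predicate.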
Converting APPEL into SQL
 Recursive algorithm
 APPEL behavior → select list
 APPEL elements → SQL predicates
 Link predicates by foreign keys
APPEL
<appel:RULE behavior="block">
<POLICY>
<STATEMENT>
<PURPOSE appel:connective="or">
<telemarketing/>
<contact/>
</PURPOSE>
</STATEMENT>
</POLICY>
</appel:RULE>
SQL
SELECT 'block'
FROM Policy
WHERE
  EXISTS(
    SELECT *
    FROM Statement
    WHERE Statement.policy_id = Policy.policy_id AND
      EXISTS(
        SELECT *
        FROM Purpose
        WHERE Purpose.statement_id = Statement.statement_id
          AND Purpose.policy_id = Statement.policy_id
          AND (Purpose.purpose = 'telemarketing'
               OR Purpose.purpose = 'contact')))
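A query of this shape can be tried against a small in-memory database. The sketch below uses Python's `sqlite3` rather than DB2, writes the OR condition in parentheses with both subqueries closed, and adds a `Policy.policy_id = ?` filter as a stand-in for selecting the applicable policy. The sample rows, the second policy named 'Quiet', and the helper `behavior_for` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Policy    (policy_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Statement (statement_id INTEGER, policy_id INTEGER);
CREATE TABLE Purpose   (statement_id INTEGER, policy_id INTEGER, purpose TEXT);
INSERT INTO Policy VALUES (1, 'Volga'), (2, 'Quiet');
INSERT INTO Statement VALUES (1, 1), (1, 2);
INSERT INTO Purpose VALUES
    (1, 1, 'current'), (1, 1, 'telemarketing'), (1, 2, 'current');
""")

# The slide's query plus an assumed filter that picks the applicable policy.
MATCH_SQL = """
SELECT 'block'
FROM Policy
WHERE Policy.policy_id = ?
  AND EXISTS (
    SELECT * FROM Statement
    WHERE Statement.policy_id = Policy.policy_id
      AND EXISTS (
        SELECT * FROM Purpose
        WHERE Purpose.statement_id = Statement.statement_id
          AND Purpose.policy_id = Statement.policy_id
          AND (Purpose.purpose = 'telemarketing'
               OR Purpose.purpose = 'contact')))
"""

def behavior_for(policy_id):
    """Return 'block' if the rule fires for the policy, else None."""
    row = conn.execute(MATCH_SQL, (policy_id,)).fetchone()
    return row[0] if row else None

print(behavior_for(1))  # policy that declares telemarketing
print(behavior_for(2))  # policy that declares only current
```

Without the parentheses around the two purpose tests, SQL's precedence (AND binds tighter than OR) would let the `contact` test match Purpose rows belonging to any statement, not just the one being checked.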
Converting APPEL into XQuery
APPEL
<appel:RULE behavior="block">
<POLICY>
<STATEMENT>
<PURPOSE appel:connective="or">
<telemarketing/>
<contact/>
</PURPOSE>
</STATEMENT>
</POLICY>
</appel:RULE>
XQuery
if (document("policy")/POLICY/STATEMENT/PURPOSE[telemarketing or contact])
then
return <block/>
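The same path-based test can be approximated with the XPath subset in Python's standard `xml.etree.ElementTree`. That subset has no `or` operator, so this sketch evaluates one path per alternative and combines the results in Python. It illustrates the idea only and is not an XQuery engine; the function name `xpath_behavior` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Volga's policy, reduced to the PURPOSE element (namespaces omitted).
POLICY_XML = """
<POLICY>
  <STATEMENT>
    <PURPOSE><current/><telemarketing/></PURPOSE>
  </STATEMENT>
</POLICY>
"""

def xpath_behavior(policy_xml):
    """Return 'block' if some PURPOSE has a telemarketing or contact child."""
    root = ET.fromstring(policy_xml)
    # One path per alternative, since the XPath subset lacks 'or'.
    hit = any(
        root.find(f"./STATEMENT/PURPOSE[{name}]") is not None
        for name in ("telemarketing", "contact")
    )
    return "block" if hit else None

print(xpath_behavior(POLICY_XML))
```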
Performance Experiments
 Experiment Setup
– Windows NT 4.0 Server with dual 600 MHz processors and 512 MB of
memory
– DB2 UDB 7.1
– Public domain APPEL engine (from JRC)
– XTable (aka XPERANTO) prototype for the XQuery alternative
Datasets
 29 P3P policies from Fortune 1000 company web sites
 5 APPEL preferences from JRC test suite

Policy       # Statements   Size (KB)
Average      2              4.4
Max          5              11.9
Min          1              1.6

Preference   # Rules        Size (KB)
Very High    10             3.1
High         7              2.8
Medium       4              2.1
Low          2              0.9
Very Low     1              0.3
Average      4.8            1.9
Experiment Results
Time for matching a preference against a policy (seconds)

            APPEL     SQL       SQL      SQL
            Engine    Convert   Query    Total    XQuery
Average     2.63      0.08      0.08     0.16     1.65
Max         9.08      0.14      0.24     0.34     5.00
Experiment Results
Matching times for different preferences (seconds)

            APPEL     SQL       SQL      SQL
Preference  Engine    Convert   Query    Total    XQuery
Very High   2.65      0.09      0.08     0.17     2.63
High        2.68      0.10      0.14     0.24     2.33
Medium      2.66      0.13      0.14     0.27     -
Low         2.60      0.06      0.03     0.09     1.51
Very Low    2.54      0.04      < 0.01   0.05     0.31
 Latency of the SQL implementation is more than
acceptable for practical deployment
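To put the averages in proportion, the ratios can be computed directly from the Average row of the first results table. The snippet below is plain arithmetic on those reported figures; the variable names are illustrative.

```python
# Average matching times in seconds, copied from the Average row of the
# first results table (APPEL engine, SQL total, XQuery).
avg = {"appel_engine": 2.63, "sql_total": 0.16, "xquery": 1.65}

sql_speedup = avg["appel_engine"] / avg["sql_total"]
xquery_speedup = avg["appel_engine"] / avg["xquery"]

print(f"SQL vs APPEL engine: {sql_speedup:.1f}x")
print(f"XQuery vs APPEL engine: {xquery_speedup:.1f}x")
```

On these figures the SQL path is roughly 16 times faster than the APPEL engine on average, while the XQuery path is about 1.6 times faster.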
Why APPEL is Slow
 Significant cost for augmenting data elements
appearing in a policy with categories predefined in P3P
base schema
 The APPEL engine incurs this cost on every
preference check
 The SQL implementation incurs this cost only when
shredding policies into the database, so it is amortized
over a large number of matchings of different
preferences
Why XQuery is Slow
 Significant cost for the XML view to convert
XQueries into SQL against the relational database
 Untapped optimization opportunities
Summary
 P3P is an important application area for database
systems
 Server-centric architecture reuses database
technology for implementing P3P
 Adequate performance for it to be used in
practical deployment of P3P
Future Work
 Checking policies against preferences before accessing
web sites is only a small aspect of enabling web users
to gain control over their private information
 P3P will not succeed unless it provides a mechanism for
enforcing that a site acts according to its stated policy
 To this end, we are implementing the Hippocratic
Database architecture (VLDB-02)
– XPref: XPath-based privacy preference language (WWW-03)
– Order-preserving encryption
– Access control through query analysis and rewriting
– Nibbling at open problems outlined in the Hippocratic Database
vision
Backup
Proxy model
 Imagine a site that has policies for all
companies, and checks user preferences
– an individual company can also adopt our technology
[Figure: browsers send preferences over the Internet to a Preference Matching site backed by DB2 that stores the IBM, ATT, and Ford policies; IBM, ATT, and Ford each also hold their own policy in DB2.]
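Policy-Preference Matching (Client-Centric)
[Figure: detailed client-centric flow between the browser (with reference file cache and APPEL engine) and the web server.]
1. Browser to web server: request reference file
2. Web server to browser: send reference file
3. Browser to reference file cache: reference file
4. Browser to reference file cache: URI of a web page
5. Reference file cache to browser: URI of the applicable policy
6. Browser to web server: request policy
7. Web server to browser: send policy
8. Browser to APPEL engine: policy and user preference
9. APPEL engine to browser: result of matching
10. Browser to web server: request web page if policy conforms to preference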
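Policy-Preference Matching (Server-Centric)
[Figure: the browser talks to the web server, which uses an APPEL-to-SQL converter and a database holding policy metadata.]
1. Browser to web server: send preference and URI of a web page
2. Web server to APPEL-to-SQL converter: preference and web page URI
3. Converter to database (policy metadata): SQL query
4. Database to converter: query results
5. Web server to browser: send result of matching preference against policy
6. Browser to web server: request web page if policy conforms to preference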