Implementing P3P Using Database Technology
Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu
IBM Almaden Research Center

The Context for This Work
Central theme of our current research
  – How to design information systems that respect the privacy of individual information while not impeding information flow
An important aspect
  – Users should be able to express how they would like their information to be treated
  – Businesses should be able to state what they are going to do with the information they collect
  – Data exchange should only happen if the two are compatible
  – P3P provides mechanisms for accomplishing this goal
Other aspects
  – Mechanisms for enforcing that businesses act according to their stated policies (“Hippocratic Databases”)
  – Mechanisms for doing analytics at the aggregate level while respecting the privacy of individual data (“Privacy Preserving Data Mining”)

Outline
Overview of P3P (Platform for Privacy Preferences)
Architectures for implementing P3P
  – Client-centric (prevailing)
  – Server-centric (our proposal)
Use of database technology for implementing the server-centric architecture
Performance
Conclusion and future work

What is P3P
Traditional privacy policies do not work
  – by the lawyers, for the lawyers
New W3C recommendation (standard) since April 2002
A standard way to communicate privacy practices
  – Privacy Policies: encode a web site’s data-collection and data-use practices in the P3P policy language
  – Privacy Preferences: specify a user’s preferences in the APPEL language
  – Matching: programmatically compare a preference against a policy

P3P Policy for Volga
    <POLICY> ... ...
      <STATEMENT>
        <PURPOSE><current/><telemarketing/></PURPOSE>
        <RECIPIENT><ours/><delivery/></RECIPIENT>
        <RETENTION><indefinitely/></RETENTION>
        <DATA-GROUP>
          <DATA ref="#user.name"/>
          <DATA ref="#user.home-info.telecom.telephone"/>
        </DATA-GROUP>
      </STATEMENT>
    </POLICY>

APPEL Preference for Jane
    <appel:RULESET>
      <appel:RULE behavior="block">
        <POLICY>
          <STATEMENT>
            <PURPOSE appel:connective="or">
              <telemarketing/><contact/>
            </PURPOSE>
          </STATEMENT>
        </POLICY>
      </appel:RULE>
      <appel:RULE behavior="request">
        <appel:OTHERWISE/>
      </appel:RULE>
    </appel:RULESET>

Current Implementations
Tools for creating policies
  – IBM Tivoli Privacy Wizard
  – P3PEdit
Tools for creating preferences
  – JRC APPEL Preference Editor
Tools for matching preferences
  – AT&T Privacy Bird
  – Microsoft Internet Explorer 6.0
  – JRC P3P Proxy

Policy-Preference Matching (Client-Centric)
[Diagram: Browser with APPEL Engine (client-side matching, specialized engine) and Web Server]
  1. Browser requests the policy from the web server
  2. Web server sends the policy
  3. Policy and user preference are passed to the APPEL engine
  4. APPEL engine returns the result of matching
  5. Browser requests the web page if the policy conforms to the preference

Server-Centric Architecture
We propose a server-centric architecture for deploying P3P:
  – Server-side matching
  – Reuse proven database technology
Store privacy policies in a database system
Query the database to match preferences against privacy policies

Policy-Preference Matching (Server-Centric)
[Diagram: Browser, Web Server, APPEL to Query Converter, Database (policy metadata)]
  1. Browser sends the preference and the URI of a web page
  2. Web server passes the preference and web page URI to the APPEL to Query Converter
  3. Converter issues the query against the database
  4. Database returns the query results
  5. Web server sends the result of matching the preference against the policy
  6. Browser requests the web page if the policy conforms to the preference

Alternative Architectures
Two orthogonal dimensions for implementing P3P
  – What matching engine should be used?
  – Where should the matching take place?

                          Client     Server
    Specialized Engine    Current    ?
    Database Engine       ?          Proposed

Discussion of Server-Centric Solution
Advantages of server-side matching
  – Support for thin, mobile clients
  – Better support for new privacy-sensitive applications
  – Extra information for policy refinement
  – Easier upgrade of the P3P specification
Advantages of using a database
  – No reinvention, reuse of proven technology
  – Better management of policies
  – Infrastructure for policy enforcement
Disadvantages
  – Greater amount of trust in the server

Variations of the Server-Centric Architecture
  – Relational tables + SQL queries
  – Relational tables + XML view + XQueries
  – Native XML store + XQueries

Storing Policies in Database
[Diagram: Policy Creation Wizard -> P3P policies -> Shredder -> SQL inserts -> Database (policy metadata)]

Storing Policies (cont.)
    Policy:     policy_id, name, …
    Statement:  statement_id, policy_id, retention, consequence
    Purpose:    statement_id, policy_id, purpose, required
    Recipient:  statement_id, policy_id, recipient, required
    Datagroup:  datagroup_id, statement_id, policy_id, base
    Data:       data_id, datagroup_id, statement_id, policy_id, ref, …

Converting APPEL into Queries
    String main(Rule r) {
        // the rule's behavior becomes the select list
        String sql = "SELECT '" + r.behavior() + "'"
                   + " FROM " + applicablePolicy()
                   + " WHERE " + connect(r);
        return sql;
    }

    String connect(Expression e) {
        // match attributes of e
        String sqlAttr = genAttr(e);
        // match subexpressions of e, joined by the connective
        String theta = e.connective();   // theta is either "or" or "and"
        String sqlSub = "";
        for each subexpression se of e do {
            if (sqlSub is not empty) sqlSub += " " + theta + " ";
            sqlSub += "EXISTS (" + path(se) + " AND " + connect(se) + ")";
        }
        // a leaf expression has no subexpressions to match
        if (sqlSub is empty) return sqlAttr;
        return sqlAttr + " AND (" + sqlSub + ")";
    }

    String path(Expression e) {
        // link the element's table to its parent by foreign key
        return "SELECT * FROM " + e.name()
             + " WHERE " + e.foreignKey() + " = " + e.parent().primaryKey();
    }

Converting APPEL into SQL
APPEL
    <appel:RULE behavior="block">
      <POLICY>
        <STATEMENT>
          <PURPOSE appel:connective="or">
            <telemarketing/>
            <contact/>
          </PURPOSE>
        </STATEMENT>
      </POLICY>
    </appel:RULE>
Recursive algorithm
  – APPEL behavior -> select list
  – APPEL elements -> SQL predicates
  – Link predicates by foreign keys
SQL
    SELECT 'block'
    FROM Policy
    WHERE EXISTS (
      SELECT * FROM Statement
      WHERE Statement.policy_id = Policy.policy_id
        AND EXISTS (
          SELECT * FROM Purpose
          WHERE Purpose.statement_id = Statement.statement_id
            AND Purpose.policy_id = Statement.policy_id
            AND (Purpose.purpose = 'telemarketing'
                 OR Purpose.purpose = 'contact')))

Converting APPEL into XQuery
APPEL
    <appel:RULE behavior="block">
      <POLICY>
        <STATEMENT>
          <PURPOSE appel:connective="or">
            <telemarketing/>
            <contact/>
          </PURPOSE>
        </STATEMENT>
      </POLICY>
    </appel:RULE>
XQuery
    if (document("policy")/POLICY/STATEMENT/PURPOSE[telemarketing OR contact])
    then return <block/>

Performance Experiments
Experiment Setup
  – Windows NT 4.0 Server with dual 600 MHz processors and 512 MB memory
  – DB2 UDB 7.1
  – Public domain APPEL engine (from JRC)
  – XTable (aka XPERANTO) prototype for the XQuery alternative

Datasets
29 P3P policies from Fortune 1000 company web sites
5 APPEL preferences from the JRC test suite

    Policy     # Statements   Size (KB)
    Average    2              4.4
    Max        5              11.9
    Min        1              1.6

    Preference   # Rules   Size (KB)
    Very High    10        3.1
    High         7         2.8
    Medium       4         2.1
    Low          2         0.9
    Very Low     1         0.3
    Average      4.8       1.9

Experiment Results
Time for matching a preference against a policy (seconds)

               APPEL Engine   SQL Convert   SQL Query   SQL Total   XQuery
    Average    2.63           0.08          0.08        0.16        1.65
    Max        9.08           0.14          0.24        0.34        5.00

Experiment Results
Matching times for different preferences (seconds)

    Preference   APPEL Engine   SQL Convert   SQL Query   SQL Total   XQuery
    Very High    2.65           0.09          0.08        0.17        2.63
    High         2.68           0.10          0.14        0.24        2.33
    Medium       2.66           0.13          0.14        0.27        -
    Low          2.60           0.06          0.03        0.09        1.51
    Very Low     2.54           0.04          < 0.01      0.05        0.31

Latency of the SQL implementation is more than acceptable for practical deployment

Why APPEL is Slow
Significant cost for augmenting the data elements appearing in a policy with the categories predefined in the P3P base schema
The APPEL engine incurs this cost for every preference check
The SQL implementation incurs this cost only when shredding policies into the database, so it is amortized over a large number of matchings of different preferences

Why XQuery is Slow
Significant cost for the XML view to convert XQueries into SQL against the relational database
Untapped optimization opportunities

Summary
P3P is an important application area for database systems
Server-centric architecture reuses database technology for implementing P3P
Performance is adequate for practical deployment of P3P

Future Work
Checking policies against preferences before accessing web sites is only a small aspect of enabling web users to gain control over their private information
P3P will not succeed unless it provides mechanisms for enforcing that a site acts according to its stated policy
To this end, we are implementing the Hippocratic Database architecture (VLDB-02)
  – XPref: XPath-based privacy preference language (WWW-03)
  – Order preserving encryption
  – Access control through query analysis and rewriting
  – Nibbling at open problems outlined in the Hippocratic Database vision

Backup

Proxy model
Imagine a site that has policies for all companies, and checks user preferences
  – an individual company can adopt our technology also
[Diagram: browsers with preferences, the Internet, a preference-matching site backed by DB2, and the IBM, ATT, and Ford policies, each stored in DB2]

Policy-Preference Matching (Client-Centric)
[Diagram: Browser with Reference File Cache and APPEL Engine, and Web Server]
  – Browser requests the reference file; web server sends the reference file
  – Reference file and URI of a web page give the URI of the applicable policy
  – Browser requests the policy; web server sends the policy
  – Policy and user preference are passed to the APPEL engine, which returns the result of matching
  – Browser requests the web page if the policy conforms to the preference

Policy-Preference Matching (Server-Centric)
[Diagram: Browser, Web Server, APPEL to SQL Converter, Database (policy metadata)]
  1. Browser sends the preference and the URI of a web page
  2. Web server passes the preference and web page URI to the APPEL to SQL Converter
  3. Converter issues the SQL query against the database
  4. Database returns the query results
  5. Web server sends the result of matching the preference against the policy
  6. Browser requests the web page if the policy conforms to the preference
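
Worked sketch: server-side matching as a SQL query
The server-centric flow above stores shredded policies in relational tables and evaluates a preference as a SQL query. The following is a minimal sketch of that round trip, not the authors' implementation. It assumes SQLite in place of DB2, a hypothetical three-table subset of the schema (Policy, Statement, Purpose), hypothetical helper names (shred, match), and a hypothetical second policy ("NoMarketing") added only for contrast. The policy_id = ? predicate is an assumed stand-in for selecting the applicable policy. The query is the SQL form of Jane's "block" rule, with the OR predicates parenthesized.

```python
import sqlite3

# Hypothetical, simplified subset of the deck's schema (assumption, not the full design).
SCHEMA = """
CREATE TABLE Policy    (policy_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Statement (statement_id INTEGER, policy_id INTEGER, retention TEXT);
CREATE TABLE Purpose   (statement_id INTEGER, policy_id INTEGER, purpose TEXT);
"""

# SQL form of Jane's "block" rule; "policy_id = ?" is an assumed way
# to restrict the query to the applicable policy.
BLOCK_RULE_SQL = """
SELECT 'block'
FROM Policy
WHERE Policy.policy_id = ?
  AND EXISTS (
    SELECT * FROM Statement
    WHERE Statement.policy_id = Policy.policy_id
      AND EXISTS (
        SELECT * FROM Purpose
        WHERE Purpose.statement_id = Statement.statement_id
          AND Purpose.policy_id = Statement.policy_id
          AND (Purpose.purpose = 'telemarketing'
               OR Purpose.purpose = 'contact')))
"""


def shred(conn, policy_id, name, statements):
    """Insert one policy; statements is a list of (retention, [purposes])."""
    conn.execute("INSERT INTO Policy VALUES (?, ?)", (policy_id, name))
    for statement_id, (retention, purposes) in enumerate(statements, start=1):
        conn.execute("INSERT INTO Statement VALUES (?, ?, ?)",
                     (statement_id, policy_id, retention))
        conn.executemany("INSERT INTO Purpose VALUES (?, ?, ?)",
                         [(statement_id, policy_id, p) for p in purposes])


def match(conn, policy_id):
    """Return 'block' if the rule fires; otherwise fall through to 'request'."""
    row = conn.execute(BLOCK_RULE_SQL, (policy_id,)).fetchone()
    return row[0] if row else "request"


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Volga's statement lists the purposes current and telemarketing.
shred(conn, 1, "Volga", [("indefinitely", ["current", "telemarketing"])])
# Hypothetical policy without telemarketing or contact purposes.
shred(conn, 2, "NoMarketing", [("stated-purpose", ["current"])])

print(match(conn, 1), match(conn, 2))
```

In this sketch the policy is shredded once and each preference check is a single query, which is the amortization argument made for the SQL implementation. The fall-through to "request" mirrors the OTHERWISE rule in Jane's ruleset.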