Online Auditing
Kobbi Nissim, Microsoft
Based on a position paper with Nina Mishra

The Setting
[Figure: a user sends a query q = (f, i1,…,ik) to the statistical database and receives f(di1,…,dik).]
• Dataset: {d1,…,dn}
  – Entries di: Real, Integer, Boolean
• Query: q = (f, i1,…,ik)
  – f: Min, Max, Median, Sum, Average, Count…
• Some users are bad…

Auditing
[Figure: the user submits a new query qi+1. The auditor, which holds the query log q1,…,qi and has access to the statistical database, replies either "Here’s the answer" or "Query denied (as the answer would cause privacy loss)".]

Auditing
• [Adam, Wortmann 89] classify auditing as a query restriction method
  – Such methods limit the queries users may post, usually imposing some structure (e.g. combinatorial, algebraic)
  – “Auditing of an SDB involves keeping up-to-date logs of all queries made by each user (not the data involved) and constantly checking for possible compromise whenever a new query is issued”
• Partial motivation: May allow for more queries to be posed, if no privacy threat occurs
• Early work: Hofmann 1977, Schlorer 1976, Chin, Ozsoyoglu 1981, 1986
• Recent interest: Kleinberg, Papadimitriou, Raghavan 2000, Li, Wang, Wang, Jajodia 2002, Jonsson, Krokhin 2003

Design choices in Prior Work
• Out of the scope for this talk (but important):
  – Very weak privacy guarantee: Privacy breached (only) when a database entry may be uniquely deduced
  – Exact answers given
• Important for this talk:
  – Data taken into account in decision procedure
    • Answers to q1,…,qi and qi+1 taken into account
    • Denials ignored

Some Prior Work on Auditors
Auditor                      Data          Queries   Breach                          Complexity
Sum/Max [Chin]               real          sum/max   di learned                      NP-hard
Boolean [KPR00]              0/1           sum       di learned                      NP-hard
Max [KPR00]                  real          max       di learned                      PTIME
Interval based [LWWJ02]      di ∈ [a,b]    sum       di learned within an accuracy   PTIME
Generalized results [JK03]                                                           NP-hard / PTIME

Example 1: Sum/Max auditing
di real, sum/max queries
User: q1 = sum(d1,d2,d3)
Auditor: sum(d1,d2,d3) = 15
User: q2 = max(d1,d2,d3)
Auditor: Denied (the answer would cause privacy loss)
User: Oh well… q2 is denied iff d1 = d2 = d3 = 5. I win!
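The leak in Example 1 can be reproduced in a few lines. The following is only an illustrative sketch, not the auditor from any of the cited papers: the function names `naive_audit` and `attacker_inference` are hypothetical, and the code hard-codes a single scenario (one sum query followed by one max query over the same real-valued entries). In that scenario the max answer pins down an entry uniquely exactly when all entries are equal, so a data-dependent denial tells the user precisely that.

```python
from fractions import Fraction


def naive_audit(data):
    """Answer sum(data), then answer or deny max(data), as a naive auditor would.

    Assumption: the auditor denies only when answering would let the user
    deduce some entry uniquely. After sum(data) is known, max(data) does
    that exactly when every entry equals the mean, i.e. max * k == sum.
    Returns (sum_answer, max_answer), where max_answer is None on denial.
    """
    total = sum(data)
    largest = max(data)
    breach = largest * len(data) == total  # decision looks at the data
    return total, (None if breach else largest)


def attacker_inference(total, max_answer, k):
    """What the user learns about the k entries from the two replies.

    A denial can only mean that all entries are equal, so each entry
    is total / k. Otherwise nothing is pinned down (returns None).
    """
    if max_answer is None:
        return [Fraction(total, k)] * k
    return None


# Scenario from Example 1: the denial itself reveals d1 = d2 = d3 = 5.
total, reply = naive_audit([5, 5, 5])
leaked = attacker_inference(total, reply, 3)
```

With the data `[3, 5, 7]` the same sum of 15 is reported, but the max query is answered (7), and no single entry can be deduced.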

Example 2: Interval Based Auditing
di ∈ [0,100], sum queries, accuracy = 1 (PTIME)
User: q1 = sum(d1,d2)
Auditor: Sorry, denied
User: q2 = sum(d2,d3)
Auditor: sum(d2,d3) = 50
User infers: d1,d2 ∈ [0,1], d3 ∈ [49,50]

Sounds Familiar?
Colonel Oliver North, on the Iran-Contra Arms Deal: On the advice of my counsel I respectfully and regretfully decline to answer the question based on my constitutional rights.
David Duncan, Former auditor for Enron and partner in Andersen: Mr. Chairman, I would like to answer the committee's questions, but on the advice of my counsel I respectfully decline to answer the question based on the protection afforded me under the Constitution of the United States.

What about Max Auditing?
d1 d2 d3 d4 d5 d6 d7 d8 … dn-1 dn, di real
q1 = max(d1,d2,d3,d4): answer M1234
q2 = max(d1,d2,d3): answer M123 / denied. If denied: d4 = M1234
q3 = max(d1,d2): answer M12 / denied. If denied: d3 = M123
Recover 1/8 of the database!

What about Boolean Auditing?
d1 d2 d3 d4 d5 d6 d7 d8 … dn-1 dn, di Boolean
q1 = sum(d1,d2): answer 1 / denied
q2 = sum(d2,d3): answer 1 / denied
…
qi is denied iff di = di+1, so the user learns the database up to complement
Let di, dj, dk be not all equal, where qi-1, qi, qj-1, qj, qk-1, qk were all denied
A further query sum(di,dj,dk) is answered with 1 or 2, which distinguishes the database from its complement
Recover the entire database!

What are the Problems?
• Obvious problem: denied queries ignored
  – Algorithmic problem: not clear how to incorporate denials in the decision
• Subtle problem:
  – Query denials leak (potentially sensitive) information
    • Users cannot decide denials by themselves
[Figure: within the possible assignments to {d1,…,dn}, the assignments consistent with (q1,…,qi), and among them the region where qi+1 is denied.]

A Spectrum of Auditors
Decision data, from “safe” to “unsafe”:
  – q1,…,qi, qi+1
  – q1,…,qi, qi+1 and a1,…,ai, ai+1
Examples:
• Size overlap restriction
• Algebraic structure
• Sum/Max, Interval based, Boolean, Max
• Cell suppression
• k-anonymity
*Note: can work in “unsafe” region, but need to prove denials do not leak crucial information

Simulatable Auditing*
An auditor is simulatable if a simulator exists s.t.:
[Figure: the auditor receives qi+1, has the query log q1,…,qi and the statistical database, and outputs Deny/answer. The simulator receives only qi+1, q1,…,qi and a1,…,ai, and outputs Deny/answer.]
Simulation ⇒ denials do not leak information
* “self auditors” in [DN03]

Summary
• Subtleties in current definition of auditors allow for information leakage, and potentially, privacy breaches
  – Denials are not taken into account
  – Auditor uses information not available to user
• Simulatable auditors provably don’t leak information in decision
  – New starting point for research on auditors
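
One simple way to obtain a simulatable auditor is to make the deny/answer decision a function of the queries alone, as at the "safe" end of the spectrum. The sketch below illustrates this for sum queries over real-valued entries. It is an assumption-laden example rather than the construction from the talk: the class `QueryOnlySumAuditor` and helper `rref` are hypothetical names, and the breach test used (some entry becomes a linear combination of the answered sums) is the standard linear-algebra criterion for exact disclosure under sum queries. Because `would_deny` never reads the data, a user can run it too, so a denial reveals nothing the user could not already compute.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form with exact arithmetic; zero rows are dropped."""
    rows = [[Fraction(x) for x in r] for r in rows]
    pivot_row = 0
    for col in range(len(rows[0]) if rows else 0):
        if pivot_row == len(rows):
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, len(rows)) if rows[r][col] != 0), None)
        if src is None:
            continue
        rows[pivot_row], rows[src] = rows[src], rows[pivot_row]
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [x / pivot for x in rows[pivot_row]]
        # Clear this column in every other row.
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return [r for r in rows if any(x != 0 for x in r)]


class QueryOnlySumAuditor:
    """Sum-query auditor whose decisions depend only on the query sets."""

    def __init__(self, data):
        self._data = list(data)
        self._answered = []  # 0/1 indicator vectors of the answered queries

    def _indicator(self, indices):
        return [1 if i in indices else 0 for i in range(len(self._data))]

    def would_deny(self, indices):
        """Deny iff answering would determine some single entry exactly.

        Uses only past and current query sets, never self._data, so the
        user can simulate this decision.
        """
        reduced = rref(self._answered + [self._indicator(indices)])
        # A row with exactly one nonzero entry means that entry is determined.
        return any(sum(1 for x in row if x != 0) == 1 for row in reduced)

    def ask(self, indices):
        """Return the sum over the given indices, or None if denied."""
        if self.would_deny(indices):
            return None
        self._answered.append(self._indicator(indices))
        return sum(self._data[i] for i in indices)
```

For instance, after `sum(d0,d1,d2)` has been answered, the query `sum(d0,d1)` is denied for every dataset, since the difference of the two answers would always reveal `d2`. The price of this approach is that it may deny more queries than a data-dependent auditor would.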