Online Auditing
Kobbi Nissim
Microsoft
Based on a position paper with Nina Mishra
The Setting
[Diagram: a user sends a query q = (f, i1,…,ik) to the statistical database and receives f(di1,…,dik).]
• Dataset: {d1,…,dn}
– Entries di: Real, Integer, Boolean
• Query: q = (f, i1,…,ik)
– f: Min, Max, Median, Sum, Average, Count…
• Some users are bad…
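The query model above can be sketched in a few lines of Python. This is only an illustration: the names `AGGREGATES` and `answer` are hypothetical, and indices are taken to be 1-based to match the d1,…,dn notation.

```python
from statistics import median

# Hypothetical table of the aggregate functions f listed on the slide.
AGGREGATES = {
    "min": min,
    "max": max,
    "median": median,
    "sum": sum,
    "average": lambda xs: sum(xs) / len(xs),
    "count": len,
}

def answer(dataset, query):
    """Evaluate q = (f, i1, ..., ik) as f(d_i1, ..., d_ik); indices are 1-based."""
    f, *indices = query
    return AGGREGATES[f]([dataset[i - 1] for i in indices])

dataset = [4, 5, 6]
print(answer(dataset, ("sum", 1, 2, 3)))  # sum of d1, d2, d3
print(answer(dataset, ("max", 1, 2)))     # max of d1, d2
```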
Auditing
[Diagram: the user submits a new query qi+1. The auditor, which holds the query log q1,…,qi and has access to the statistical database, replies with either “Here’s the answer” or “Query denied (as the answer would cause privacy loss)”.]
Auditing
• [Adam, Wortmann 89] classify auditing as a query restriction
method
– Such methods limit the queries users may post, usually
imposing some structure (e.g. combinatorial, algebraic)
– “Auditing of an SDB involves keeping up-to-date logs of all
queries made by each user (not the data involved) and
constantly checking for possible compromise whenever a
new query is issued”
• Partial motivation: May allow for more queries to be
posed, if no privacy threat occurs
• Early work: Hofmann 1977, Schlorer 1976, Chin,
Ozsoyoglu 1981, 1986
• Recent interest: Kleinberg, Papadimitriou, Raghavan 2000,
Li, Wang, Wang, Jajodia 2002, Jonsson, Krokhin 2003
Design choices in Prior Work
• Out of the scope for this talk (but important):
– Very weak privacy guarantee: Privacy breached
(only) when a database entry may be uniquely
deduced
– Exact answers given
• Important for this talk:
– Data taken into account in decision procedure
• Answers to q1,…,qi and qi+1 taken into account
• Denials ignored
Some Prior Work on Auditors
Auditor                     | Data        | Queries | Breach                               | Complexity
Sum/Max [Chin]              | real        | sum/max | di learned                           | NP-hard
Boolean [KPR00]             | 0/1         | sum     | di learned                           | NP-hard
Max [KPR00]                 | real        | max     | di learned                           | PTIME
Interval based [LWWJ02]     | di ∈ [a,b]  | sum     | di learned within a given accuracy   | PTIME
Generalized results [JK03]  |             |         |                                      | NP-hard / PTIME
Example 1: Sum/Max auditing
di real, sum/max queries
User: q1 = sum(d1,d2,d3)
Auditor: sum(d1,d2,d3) = 15
User: q2 = max(d1,d2,d3)
Auditor: Denied (the answer would cause privacy loss)
User: Oh well… q2 is denied iff d1 = d2 = d3 = 5. I win!
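The leak in this example can be checked by brute force. The sketch below is not the auditor from the cited work: it assumes a small integer domain (0 to 15) as a stand-in for the reals, and the names `consistent`, `breached`, `naive_auditor` and `attacker_view` are hypothetical. The auditor denies a query when its answer, combined with earlier answers, would fix some entry uniquely, and it ignores past denials.

```python
from itertools import product

DOMAIN = range(16)  # assumption: small non-negative integers stand in for reals
N = 3               # three entries d1, d2, d3

def consistent(log):
    """All datasets over DOMAIN that agree with every answered (function, answer) pair."""
    return [d for d in product(DOMAIN, repeat=N) if all(f(d) == a for f, a in log)]

def breached(candidates):
    """True if some entry takes the same value in every candidate dataset."""
    return any(len({d[i] for d in candidates}) == 1 for i in range(N))

def naive_auditor(data, log, f):
    """Answer f(data), or return None (deny) if the answer would pin down an entry."""
    a = f(data)
    if breached(consistent(log + [(f, a)])):
        return None          # denied; the denial is not recorded anywhere
    log.append((f, a))
    return a

def attacker_view(log, f):
    """Datasets consistent with the answers so far on which the auditor would deny f."""
    return [d for d in consistent(log) if naive_auditor(d, list(log), f) is None]

secret = (5, 5, 5)
log = []
print(naive_auditor(secret, log, sum))  # 15
print(naive_auditor(secret, log, max))  # None, i.e. denied
print(attacker_view(log, max))          # [(5, 5, 5)]
```

The attacker never sees the data, only the answer 15 and the denial, yet the denial alone narrows the candidates to a single dataset.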
Example 2: Interval Based Auditing
di ∈ [0,100], sum queries, accuracy = 1 (PTIME)
User: q1 = sum(d1,d2)
Auditor: Sorry, denied
User: q2 = sum(d2,d3)
Auditor: sum(d2,d3) = 50
User deduces: d1,d2 ∈ [0,1] and d3 ∈ [49,50]
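The deduction in this example can be spelled out step by step. The derivation assumes (the slide does not state the rule explicitly) that the auditor denies a sum query exactly when its answer s would confine some entry to an interval of length at most 1.

```latex
\begin{align*}
d_1 + d_2 = s,\ d_1, d_2 \in [0,100]
  &\;\Rightarrow\; d_1, d_2 \in [\max(0,\, s-100),\ \min(100,\, s)] \\
q_1 \text{ denied}
  &\;\Rightarrow\; s \le 1 \ \text{or}\ s \ge 199
  \;\Rightarrow\; d_1, d_2 \in [0,1] \ \text{or}\ d_1, d_2 \in [99,100] \\
d_2 + d_3 = 50,\ d_3 \ge 0
  &\;\Rightarrow\; d_2 \le 50
  \;\Rightarrow\; d_1, d_2 \in [0,1] \\
d_3 = 50 - d_2
  &\;\Rightarrow\; d_3 \in [49,50]
\end{align*}
```

Taken alone, the answer to q2 only gives d2, d3 ∈ [0,50]; it is the earlier denial that narrows all three entries to intervals of length 1.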
Sounds Familiar?
Colonel Oliver North, on the Iran-Contra Arms Deal:
On the advice of my counsel I respectfully and
regretfully decline to answer the question based on
my constitutional rights.
David Duncan, Former auditor for Enron and
partner in Andersen:
Mr. Chairman, I would like to answer the
committee's questions, but on the advice of
my counsel I respectfully decline to answer
the question based on the protection afforded
me under the Constitution of the United
States.
What about Max Auditing?
d1 d2 d3 d4 d5 d6 d7 d8 … dn-1 dn (di real)
User: q1 = max(d1,d2,d3,d4)
Auditor: M1234
User: q2 = max(d1,d2,d3)
Auditor: M123 / denied
If denied: d4 = M1234
User: q3 = max(d1,d2)
Auditor: M12 / denied
If denied: d3 = M123
Recover 1/8 of the database!
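The 1/8 figure can be reproduced with a small simulation. This is a simplified sketch, not the max auditor of [KPR00]: it assumes distinct values, and the hypothetical `make_naive_max_auditor` denies only in the one situation the attack exploits, namely when a query drops a single index from an earlier answered query and the maximum would decrease.

```python
from itertools import permutations

DENIED = None  # marker returned for a refused query

def make_naive_max_auditor(data):
    """Hypothetical simplified max auditor over distinct values; ignores past denials."""
    log = []  # answered queries only: (frozenset of indices, answer)

    def ask_max(indices):
        indices = frozenset(indices)
        answer = max(data[i] for i in indices)
        for prev_indices, prev_answer in log:
            dropped = prev_indices - indices
            # Dropping one entry lowers the max only if that entry was the max,
            # so answering would reveal the dropped entry exactly.
            if indices < prev_indices and len(dropped) == 1 and answer < prev_answer:
                return DENIED
        log.append((indices, answer))
        return answer

    return ask_max

def attack_block(ask_max, start):
    """Run q1, q2, q3 on four consecutive entries; return (index, value) if one leaks."""
    i = start
    m1234 = ask_max([i, i + 1, i + 2, i + 3])
    m123 = ask_max([i, i + 1, i + 2])
    if m123 is DENIED:
        return (i + 3, m1234)   # denial means d4 = M1234
    m12 = ask_max([i, i + 1])
    if m12 is DENIED:
        return (i + 2, m123)    # denial means d3 = M123
    return None

# Exhaust all orderings of four distinct values and count the leaked entries.
leaks = 0
for block in permutations((1, 2, 3, 4)):
    learned = attack_block(make_naive_max_auditor(block), 0)
    if learned is not None:
        index, value = learned
        assert block[index] == value   # every inferred value is correct
        leaks += 1
print(leaks, "of", 24 * 4, "entries recovered")  # 12 of 96, i.e. 1/8
```

Over the 24 orderings, q2 is denied in 6 and q3 in another 6, so half of the blocks give up one of their four entries.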
What about Boolean Auditing?
d1 d2 d3 d4 d5 d6 d7 d8 … dn-1 dn (di Boolean)
User: q1 = sum(d1,d2)
Auditor: 1 / denied
User: q2 = sum(d2,d3)
Auditor: 1 / denied
…
qi denied iff di = di+1 ⇒ learn database/complement
Let di, dj, dk not all equal, where qi-1, qi, qj-1, qj, qk-1, qk all denied
User: q = sum(di,dj,dk)
Auditor: 1 or 2
Recover the entire database!
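The first phase of this attack, recovering the database up to complement from the pattern of denials, can be sketched as follows. The brute-force auditor is an assumption made for illustration (practical only for a handful of entries), and the names `make_naive_boolean_auditor` and `recover_up_to_complement` are hypothetical. The final disambiguating query on di, dj, dk is not simulated.

```python
from itertools import product

def make_naive_boolean_auditor(data):
    """Brute-force sum auditor over 0/1 data; denies if an answer would fix any entry."""
    n = len(data)
    log = []  # answered (indices, answer) pairs; denials are not recorded

    def ask_sum(indices):
        answer = sum(data[i] for i in indices)
        constraints = log + [(tuple(indices), answer)]
        candidates = [c for c in product((0, 1), repeat=n)
                      if all(sum(c[i] for i in idx) == a for idx, a in constraints)]
        if any(len({c[i] for c in candidates}) == 1 for i in range(n)):
            return None  # denied
        log.append((tuple(indices), answer))
        return answer

    return ask_sum

def recover_up_to_complement(ask_sum, n):
    """Ask sum(d_i, d_i+1) for each i; a denial means the two entries are equal."""
    guess = [0]  # arbitrary starting value, so the result may be the complement
    for i in range(n - 1):
        denied = ask_sum([i, i + 1]) is None
        guess.append(guess[-1] if denied else 1 - guess[-1])
    return guess

data = [1, 1, 0, 1, 0, 0, 0, 1]
guess = recover_up_to_complement(make_naive_boolean_auditor(data), len(data))
print(guess)  # [0, 0, 1, 0, 1, 1, 1, 0], the complement of data
```

An answer of 1 says the two entries differ and a denial says they are equal, so every response carries one bit about the data whether or not the query is answered.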
What are the Problems?
• Obvious problem: denied queries ignored
– Algorithmic problem: not clear how to incorporate denials in
the decision
• Subtle problem:
– Query denials leak (potentially sensitive) information
• Users cannot decide denials by themselves
[Diagram: within the set of possible assignments to {d1,…,dn}, the region of assignments consistent with (q1,…,qi), and within it the region where qi+1 is denied.]
A Spectrum of Auditors
Decision data
– Queries only: q1,…,qi, qi+1
– Queries and answers: q1,…,qi, qi+1 with a1,…,ai, ai+1
Examples
• Size overlap restriction
• Algebraic structure
• Sum/Max, Interval based,
Boolean, Max
• Cell suppression
• k-anonymity
“safe”
“unsafe”
*Note: can work in “unsafe” region, but need to prove denials do not
leak crucial information
Simulatable Auditing*
An auditor is simulatable if a simulator exists s.t.:
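[Diagram: the auditor receives qi+1 and q1,…,qi, has access to the statistical database, and outputs deny/answer. The simulator receives only qi+1, q1,…,qi and a1,…,ai, and outputs the same deny/answer decision.]
Simulation ⇒ denials do not leak information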
* `self auditors’ in [DN03]
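The contrast between a data-dependent decision and a simulatable one can be illustrated on Boolean sum queries. The rule used here is one conservative way to obtain a simulatable auditor and is an assumption of this sketch, not something stated in the slides: deny whenever some dataset consistent with the past answers would lead to a breach. All function names are hypothetical, and the brute-force enumeration is only practical for a few entries.

```python
from itertools import product

def consistent(n, log):
    """All 0/1 datasets of length n that agree with the answered (indices, answer) pairs."""
    return [c for c in product((0, 1), repeat=n)
            if all(sum(c[i] for i in idx) == a for idx, a in log)]

def breached(n, candidates):
    """True if some entry has the same value in every candidate dataset."""
    return any(len({c[i] for c in candidates}) == 1 for i in range(n))

def naive_denies(data, log, idx):
    """Decision that looks at the true answer to the new query (uses the data)."""
    a = sum(data[i] for i in idx)
    return breached(len(data), consistent(len(data), log + [(idx, a)]))

def simulatable_denies(n, log, idx):
    """Decision computed from the queries and past answers only, never from the data."""
    for world in consistent(n, log):
        a = sum(world[i] for i in idx)
        if breached(n, consistent(n, log + [(idx, a)])):
            return True
    return False

query = (0, 1)
print(naive_denies((0, 0, 1), [], query))  # True:  the decision depends on the data
print(naive_denies((0, 1, 1), [], query))  # False: so a denial reveals d1 = d2
print(simulatable_denies(3, [], query))    # True for every dataset
```

Because `simulatable_denies` takes no data argument, a user could compute the same decision, so the denial itself tells them nothing new. In this tiny setting the price is that the pair query is refused for every dataset.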
Summary
• Subtleties in current definition of auditors
allow for information leakage, and
potentially, privacy breaches
– Denials are not taken into account
– Auditor uses information not available to user
• Simulatable auditors provably don’t leak
information in decision
– New starting point for research on auditors