Download personalized anonymity

Personalized Privacy Preservation: beyond k-anonymity and ℓ-diversity SIGMOD 2006 Presented By Hongwei Tian Outline       What is Privacy Breaching? Drawback of K-Anonymity and ℓ-Diversity Personalized Anonymous How Adversary attacks? How Data owner defeats the attacks? Experiments What is Privacy Breaching  Mainly, there are two classes  Compare prior belief and posterior belief   prior belief < posterior belief : help adversary 50% --> 80% “Bob has cancer” prior belief > posterior belief : reasonable? 80% --> 50% “Bob has cancer” =>I don’t think so. And many others have the same thought with me.  What is Privacy Breaching  Only posterior belief  K-Anonymity: posterior belief ≤ 1/ k   ℓ-Diversity: posterior belief = p ≤ threshold   In a QI group, there are at least k tuples. In a QI group, p percent of the tuples appear in the largest sub-group Personalized: posterior belief = Prbreach ≤ threshold  Prbreach : Breaching probability Drawback of K-Anonymity and ℓ-Diversity  a k-anonymous table only prevents association between individuals and tuples.  ℓ-Diversity and Personalized methods both prevent association between individuals and sensitive values. Drawback of K-Anonymity and ℓ-Diversity  a k-anonymous table may lose considerable information  ℓ-Diversity also has this problem. Drawback of K-Anonymity and ℓ-Diversity  Consider such a situation:   In one QI group, all tuples come from the same individual v Adversary only knows v in this QI group from external datasets I am Bob and I am unlucky that I have so many diseases Bob must be here Aha, I know Bob has four diseases Drawback of K-Anonymity and ℓ-Diversity  Do not take into account personal anonymity requirements Personalized Anonymous   personalized anonymity: a person can specify the degree of privacy protection for her/his sensitive values. So far, the literature has focused on a universal approach that exerts the same amount of privacy preserving for all persons, without catering for their concrete needs. Personalized Anonymous Personalized Anonymous  BREACH PROBABILITY:  For a tuple t ∈ T, its breach probability Pbreach(t) equals the probability that an adversary can infer from T∗ that any of the associations {o, v1},..., {o, vx} exists in T, where v1, ..., vx are the leaf values in SUBTR(t.GN). Personalized Anonymous  BREACH PROBABILITY    Data owner and Adversary both can compute it Data owner want to Pbreach(t) < threshold, then the privacy of the individual corresponding to t holds Adversary hope to get a Pbreach(t) > threshold, which breaches the privacy of the individual.  How the adversary do (attack)? ??? How Adversary attacks  Adversary know One Individual One Tuple (Primary Case) Possible reconstruction P(5,4) ×3 ×3=1080; Breaching reconstruction 2 × P(4,3) × 3 ×3=432; Pbreach(t) = 432/1080=2/5 How Adversary attacks  Adversary know One Individual Multiple Tuples (Non-Primary Case) Possible reconstruction 54 ×3 ×3=5625; Breaching reconstruction 2 × 53 × 3 ×3 － 52× 3 ×3 =2025; Pbreach(t) = 2025/5625=9/25 How Data owner defeats the attacks  The formal computation for Pbreach(t).  Primary Case  Non-Primary Case Overlap disjoint n=5,b=2 n=2,b=2, c=1/3 How Data owner defeats the attacks  Utility Measure: Information Loss How Data owner defeats the attacks  Algorithm Picture Group1 Split New Table … Table SA-Generalization Group N SA-Generalization Replace if having more utility How Data owner defeats the attacks  Algorithm      Start from all QI values are roots and SA values have been generalized to satisfy every Pbreach (t)< threshold Top-down split QI attributes, in order to increase utility information “Single Split” means every time, only one attribute can be split into its direct children SA-Generalization guarantee in every QI group, every Pbreach (t)< threshold, then the whole table prevents privacy Every iteration, it should find a “Split” and after SA-Generalization, the utility information increases; otherwise, it quits How Data owner defeats the attacks  Algorithm     Bottom-up generalize SA values to improve privacy If tuples in Sprob satisfy privacy requirement, means all tuples in G satisfy privacy requirement All SA values of the tuples which dissatisfy privacy requirement will be generalized to the parent of the Guarding Node which approaches the root mostly Finish when no tuples in Sprob dissatisfy privacy requirement; or no possibility to generalize. Experiments      Adult dataset (http://www.ipums.org) 5 QI Attributes, 1 SA Attribute Pri-leaf, Pri-mixed, Nonpri-leaf, Nonpri-mixed Breaching Threshold =0.25 All Weight for attributes =1 Experiments  Breaching Probability References  Xiaokui Xiao and Yufei Tao. Personalized Privacy Preservation. In SIGMOD 2006.  A. Machanavajjhala, J. Gehrke, and D. Kifer. l-diversity: Privacybeyond k-anonymity. In ICDE, 2006.  K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Incognito: Efficient full-domain k-anonymity. In SIGMOD, 2005.  A. Evfimievski, J. Gehrke, and R. Srikant. Limiting privacy breaching in privacy preserving data mining. In ACM Symposium on Principles of Database Systems, 2003.

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download personalized anonymity