Survey
Overview of Privacy Preserving Techniques
  - A high-level summary of state-of-the-art privacy preserving techniques and research areas
  - Focus on the problems and the basic ideas

Outline
  - Privacy problem in computing
  - Major techniques
      - Data perturbation
      - Data anonymization
      - Cryptographic methods
  - Privacy in different application areas
      - Data mining
      - Data publishing
      - Databases
      - Data outsourcing
      - Social network
      - Mobile computing

Privacy vs. Security
  Network security
  - Assumption: the two parties trust each other, but the communication network is not trusted.
  - [Diagram: Alice (encrypting data) -> communication channel -> Bob (decrypting data)]
  - Bob knows the original data that Alice owns.
  Privacy problems
  - Information about a person or a single party
  - Parties do not trust each other: curious parties (including malicious insiders) may look at sensitive contents
  - Parties follow protocols honestly (semi-honest assumption)
  - [Diagram: Alice delivers "sanitized" data to Bob]
  - Bob is an untrusted party. He may try to figure out some private information from the sanitized data.

Two categories
  (1) Transformation based methods
  - [Diagram: Alice -> communication channel -> Bob, a "curious party", who receives the transformed data]
  - Bob works on the transformed data only.
  - Bob does not know the original data.
  (2) Cryptographic protocol methods
  - Some protocol using cryptographic primitives
  - [Diagram: Party 1, Party 2, ..., Party n each hold their own data and see only statistical info / intermediate results and info from other parties]

Computing scenarios
  - Web model
      [Diagram: users' private info and data flow to Web Apps]
  - Collaboration model
      [Diagram: Party 1, Party 2, ..., Party n, each holding its own data]
  - Outsourcing model
      [Diagram: the data owner exports data to a service provider in order to use the service]

Issues with data transformation
  - Techniques performing the transformation
      - Transformation should preserve important information
      - How much information loss
      - How to recover the information from the transformed data
  - Threat model
      - Attacks reconstructing the original data from the transformed data
      - Attacks finding significant additional information
  - The cost
      - Transforming data
      - Recovering the important information

Transformation techniques
  - Data perturbation
      - Additive perturbation
      - Multiplicative perturbation
      - Randomized responses
  - Data anonymization
      - k-anonymization
      - l-diversity
      - t-closeness
      - m-invariance

Attacks on transformation techniques
  - Data reconstruction and noise reduction techniques (on data perturbation)
      - Random matrix theory
      - Spectral analysis
  - Inference attacks (on data anonymization)
      - Utilizing background knowledge

Cryptographic approaches
  Using the following cryptographic primitives:
  - Secure multiparty computation (SMC)
      - Yao's millionaire problem
          - Alice wants to know whether she has more money than Bob
          - Alice and Bob cannot know the exact amount of each other's money
          - Alice knows only the result
  - Oblivious transfer
      - Bob holds n items. Alice wants to know the i-th item.
      - Bob cannot know i (Alice's privacy)
      - Alice knows nothing except the i-th item
  - Homomorphic encryption
      - Allows computation on encrypted data
      - E.g., E(X)*E(Y) = E(X+Y)
  Characteristics
  - Pro: preserving total privacy
  - Con: expensive, limited number of parties
  Applications: for distributed datasets (the corporate model)
  - Protocols for data mining algorithms
  - Statistical analysis (matrix, vector computation)
  - Often discussed in two-party (or a small number of parties) scenarios.

Privacy-preserving data mining
  - Purpose
      - Mining the models without leaking information about individual records
  - Topics
      - Basic statistics (mean, variance, etc.)
      - Data classification
      - Data clustering
      - Association rule mining
      - Privacy of mined models

Privacy preserving database applications [Du&Atallah2000]
  - Statistical databases
  - Private information retrieval
  - Outsourced databases

Social Network Privacy
  - Publishing social network structure
      - Anonymization is a popular method
      - Attacks can be applied to reveal the mapping [163,167]
          - Characteristics of subgraph
          - Adversarial background knowledge

Social network privacy
  - Privacy settings of SN
      - Help users set/tune privacy settings
      - Understand the relationship between privacy and functionalities of SN
          - They are a pair of conflicting factors

Privacy in Mobile computing
  - Preserving location privacy
      - User-defined or system supplied privacy policies [Bamba&Liu2008, Beresford&Stajano2003]
      - Extending k-anonymity techniques to location cloaking [Gedik&Liu2008, Gruteser&Grunwald2002]
      - Pseudonymity of user identities: frequently changing internal id. [Beresford&Stajano2003]
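
Example: additive homomorphic encryption

The homomorphic property quoted under the cryptographic approaches, E(X)*E(Y) = E(X+Y), can be made concrete with a small sketch. The slides do not name a scheme; the code below assumes the Paillier cryptosystem, which has exactly this additive property. The function names (keygen, encrypt, decrypt, add_encrypted) and the tiny primes 1009 and 1013 are illustrative choices only, and keys this small offer no security.

```python
import math
import secrets


def keygen(p, q):
    """Build a toy Paillier key pair from two distinct primes p and q."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                 # common simple choice of generator
    mu = pow(lam, -1, n)      # modular inverse of lam mod n (Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer m with 0 <= m < n under the public key."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiply ciphertexts; the result decrypts to the sum of the plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


# Illustrative toy primes (an assumption, far too small for real use).
pub, priv = keygen(1009, 1013)
cx = encrypt(pub, 1234)
cy = encrypt(pub, 5678)
total = decrypt(pub, priv, add_encrypted(pub, cx, cy))
print(total)  # prints 6912
```

A party holding only the public key can call add_encrypted on cx and cy without learning 1234 or 5678; only the holder of the private key can decrypt the sum. The sum is computed modulo n, so the plaintexts must be small enough that X+Y stays below n.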