Privacy Preserving Data Mining
Benjamin Fung
bfung(at)cs.sfu.ca
Privacy Preserving Data Mining
• What is data mining?
– Non-trivial extraction of implicit, previously unknown,
and potentially useful information from large data
sets or databases [W. Frawley, G. Piatetsky-Shapiro, and C. Matheus, 1992]
• What is privacy preserving data mining?
– Study of achieving some data mining goals without
sacrificing the privacy of the individuals
Scenario (Information Sharing)
• A data owner wants to release a person-specific data
table to another party (or the public) for the purpose of
classification analysis without sacrificing the privacy of
the individuals in the released data.
[Figure: the data owner releases person-specific data to the data recipients]
Privacy Threat
• If a description of (Education, Sex) is so specific that not
many people match it, releasing the table will link a unique
individual, or a small number of individuals, to
sensitive information.
Education    Sex   Age   Class   # of Recs.
9th          F     30    0G3B    3
10th         M     32    0G4B    4
11th         F     35    2G3B    5
12th         F     37    3G1B    4
Bachelors    F     42    4G2B    6
Bachelors    F     44    4G0B    4
Masters      M     44    4G0B    4
Masters      F     44    3G0B    3
Doctorate    F     44    1G0B    1
Total:                           34

Education    Sex   Diagnosis       …
Bachelors    F     Depression      …
Bachelors    M     Heart disease   …
Masters      F     Depression      …
Masters      F     Heart disease   …
Doctorate    F     Knee injury     …

[Figure labels: Data recipients, Adversary]
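The threat above can be checked mechanically by counting how many records share each (Education, Sex) combination. The following is a minimal sketch, not part of the original slides: the function names group_sizes and small_groups and the threshold k are hypothetical, and the rows are copied from the first table on this slide.

```python
from collections import Counter

# Rows copied from the slide: (education, sex, age, class label, number of records).
ROWS = [
    ("9th", "F", 30, "0G3B", 3),
    ("10th", "M", 32, "0G4B", 4),
    ("11th", "F", 35, "2G3B", 5),
    ("12th", "F", 37, "3G1B", 4),
    ("Bachelors", "F", 42, "4G2B", 6),
    ("Bachelors", "F", 44, "4G0B", 4),
    ("Masters", "M", 44, "4G0B", 4),
    ("Masters", "F", 44, "3G0B", 3),
    ("Doctorate", "F", 44, "1G0B", 1),
]


def group_sizes(rows):
    """Count the records that share each (Education, Sex) description."""
    sizes = Counter()
    for education, sex, _age, _label, count in rows:
        sizes[(education, sex)] += count
    return sizes


def small_groups(rows, k):
    """Return the descriptions matched by fewer than k records.

    k is a hypothetical threshold chosen by the data owner; the slides
    do not state a value.
    """
    return {group: n for group, n in group_sizes(rows).items() if n < k}


sizes = group_sizes(ROWS)
risky = small_groups(ROWS, k=2)
print(risky)
```

With k=2 the only flagged description is (Doctorate, F), which matches a single record, so anyone who knows that a person in the table is a woman with a doctorate can pick out her row.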
Solution: Generalization
Original table:

Education    Sex   Age   Class   # of Recs.
9th          F     30    0G3B    3
10th         M     32    0G4B    4
11th         F     35    2G3B    5
12th         F     37    3G1B    4
Bachelors    F     42    4G2B    6
Bachelors    F     44    4G0B    4
Masters      M     44    4G0B    4
Masters      F     44    3G0B    3
Doctorate    F     44    1G0B    1

Generalized table (Masters and Doctorate replaced by Grad School):

Education    Sex   Age   Class   # of Recs.
9th          F     30    0G3B    3
10th         M     32    0G4B    4
11th         F     35    2G3B    5
12th         F     37    3G1B    4
Bachelors    F     42    4G2B    6
Bachelors    F     44    4G0B    4
Grad School  M     44    4G0B    4
Grad School  F     44    4G0B    4
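The generalization step shown on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the top-down specialization algorithm from the references: the TAXONOMY mapping contains only the one generalization visible on the slide, the names parse_class and generalize are hypothetical, and a class label such as 3G1B is assumed to mean 3 records of one class and 1 of the other, which is consistent with the record counts in the table.

```python
import re
from collections import defaultdict

# Hypothetical one-level taxonomy: only the mapping visible on the slide.
TAXONOMY = {"Masters": "Grad School", "Doctorate": "Grad School"}

# Rows copied from the slide: (education, sex, age, class label, number of records).
ROWS = [
    ("9th", "F", 30, "0G3B", 3),
    ("10th", "M", 32, "0G4B", 4),
    ("11th", "F", 35, "2G3B", 5),
    ("12th", "F", 37, "3G1B", 4),
    ("Bachelors", "F", 42, "4G2B", 6),
    ("Bachelors", "F", 44, "4G0B", 4),
    ("Masters", "M", 44, "4G0B", 4),
    ("Masters", "F", 44, "3G0B", 3),
    ("Doctorate", "F", 44, "1G0B", 1),
]


def parse_class(label):
    """Split a label such as '3G1B' into (3, 1); assumes the nGmB format."""
    good, bad = re.fullmatch(r"(\d+)G(\d+)B", label).groups()
    return int(good), int(bad)


def generalize(rows, taxonomy):
    """Replace Education values by their parent and merge identical rows."""
    merged = defaultdict(lambda: [0, 0, 0])  # [good, bad, records]
    for education, sex, age, label, count in rows:
        key = (taxonomy.get(education, education), sex, age)
        good, bad = parse_class(label)
        merged[key][0] += good
        merged[key][1] += bad
        merged[key][2] += count
    return [
        (education, sex, age, f"{good}G{bad}B", count)
        for (education, sex, age), (good, bad, count) in merged.items()
    ]


generalized = generalize(ROWS, TAXONOMY)
for row in generalized:
    print(row)
```

After the merge, the single Doctorate record is no longer distinguishable from the three Masters records with the same Sex and Age, so the smallest group touched by the generalization grows from 1 record to 4 while the class counts are preserved.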
References
1. K. Wang, B. C. M. Fung, and P. S. Yu. Template-Based Privacy
Preservation in Classification Problems. In Proc. of the 5th IEEE
International Conference on Data Mining (ICDM 2005), Houston,
TX, USA, November 27-30, 2005.
2. K. Wang, B. C. M. Fung, and G. Dong. Integrating Private
Databases for Data Analysis. In Proc. of the 2005 IEEE
International Conference on Intelligence and Security Informatics
(ISI 2005), pages 171-182, Atlanta, GA, USA, May 19-20, 2005.
3. B. C. M. Fung, K. Wang, and P. S. Yu. Top-Down Specialization for
Information and Privacy Preservation. In Proc. of the 21st IEEE
International Conference on Data Engineering (ICDE 2005), pages
205-216, Tokyo, Japan, April 5-8, 2005.
For more information, visit http://www.cs.sfu.ca/~bfung