New Approach to Quantification of Privacy on Social Network Sites
IEEE AINA 2010
Tran Hong Ngoc (VNU, Vietnam)
Isao Echizen (NII, Japan)
Kamiyama Komei (UEC, Japan)
Hiroshi Yoshiura (UEC, Japan)
Presenter: Yu-Song Syu
Social Network Sites

Growth of SNSs


Leads to an explosion in online information sharing
With SNSs


People share information with friends
This information includes sensitive data

Location, age, career, …
Intruders in SNSs

By compiling statistics, intruders may obtain
personal information and use it for:





Commercial purpose
Identity theft
Physical harm
…
How to get such information?
http://www.iis.sinica.edu.tw

Usually, people do not know how much private
information they reveal about themselves and others
Privacy Metric

Based on probability and entropy

Helps users know how much private information may
leak from their blog sentences

Defines the Leaked Privacy Value, Δ, as the amount
of knowledge that intruders can learn about a
“problem of interest”
Proposed System Model
Info. Retrieval techniques
based on NLP methods
Quantification of Privacy
System Model
Find the information about someone
Prefecture, age, city, university, …
Blog sentences that users post
Event
Event & Blog Set
[Figure: an event containing BlogSet_i and BlogSet_j]

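Event:
$$\Omega^{(k)} = \{\, x \mid x \in U,\ 0 \le p(x) \le 1 \,\}$$

Blog Set:
$$\omega_i^{(k)} = \{\, x \mid x \in \Omega^{(k)},\ 0 \le p(x) \le 1 \,\}, \quad \omega_i^{(k)} \subseteq \Omega^{(k)}$$

Intersection:
$$\tilde{\omega}^{(k)} = \bigcap_{i=1}^{n} \omega_i^{(k)} = \{\, x \mid 0 \le p(x) \le 1,\ x \in \omega_1^{(k)},\ x \in \omega_2^{(k)},\ \ldots,\ x \in \omega_n^{(k)} \,\}$$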
Blog Set / Joint Blog Set
Both are assumed to never be empty
Example: Prefecture
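The intersection step can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `joint_blog_set` and the candidate prefectures are hypothetical and are not taken from the slides. Each blog set is modelled as the set of values that one blog sentence still leaves possible.

```python
from functools import reduce


def joint_blog_set(blog_sets):
    """Intersect the candidate sets left open by each blog sentence."""
    if not blog_sets:
        raise ValueError("need at least one blog set")
    joint = reduce(lambda a, b: a & b, (set(s) for s in blog_sets))
    if not joint:
        # The model assumes a blog set / joint blog set is never empty.
        raise ValueError("joint blog set is empty")
    return joint


# Hypothetical candidate prefectures implied by three blog sentences.
blog_sets = [
    {"Tokyo", "Kanagawa", "Saitama", "Chiba"},
    {"Tokyo", "Kanagawa", "Osaka"},
    {"Tokyo", "Kanagawa", "Chiba"},
]
joint = joint_blog_set(blog_sets)
print(sorted(joint))  # ['Kanagawa', 'Tokyo']
```

Every additional sentence can only shrink the joint set, which is why posting more tends to narrow down the value an intruder is looking for.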
Before Proposed Metric…
Math Background

Entropy (Uncertainty)
Event

Conditional Entropy

Joint Entropy
Possible Value
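For reference, the standard Shannon definitions of the three quantities named on this slide are given here for discrete random variables. The symbols X and Y are assumptions chosen for illustration and may differ from the slide's own notation.

```latex
% Entropy (uncertainty) of X, summing over its possible values x
H(X) = -\sum_{x} p(x) \log_2 p(x)

% Conditional entropy of Y given X
H(Y \mid X) = -\sum_{x} \sum_{y} p(x, y) \log_2 p(y \mid x)

% Joint entropy of X and Y, and the chain rule linking the three
H(X, Y) = -\sum_{x} \sum_{y} p(x, y) \log_2 p(x, y) = H(X) + H(Y \mid X)
```

When X and Y are independent, H(Y | X) = H(Y), so the joint entropy reduces to H(X) + H(Y).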
Why Use Entropy?

Idea:
Difference of uncertainty = leaked privacy
Privacy Leakage Metric

Leaked Privacy Value:

The reduction in privacy caused by posting: the privacy
value before the sentences are posted minus the privacy
value after they are posted
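$$\Delta = \underbrace{H(\{\Omega^{(k)}\})}_{\text{before}} - \underbrace{H(\{\tilde{\omega}^{(k)}\})}_{\text{after}}$$
where $\Omega^{(k)}$ is the $k$-th event, $\tilde{\omega}^{(k)}$ is its joint blog set, and $m$ is the number of events:
$$H(\{\Omega^{(k)}\}) = H(\Omega^{(1)}, \Omega^{(2)}, \ldots, \Omega^{(m)})$$
and
$$H(\{\tilde{\omega}^{(k)}\}) = H(\tilde{\omega}^{(1)}, \tilde{\omega}^{(2)}, \ldots, \tilde{\omega}^{(m)})$$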
Experiments

Dataset:


Statistical Survey Department, Statistics Bureau,
Ministry of Internal Affairs and Communications
Problem of Interest:

Gaining information about the victim of an
accident that happened in Japan’s subway and
was discussed by SNS users
Experiments - Prefecture
Experiments - Age
[Figures: experimental results for the Prefecture and Age events]
Experiments – Total Leaked Privacy

Total Leaked Privacy Before & After Blogging
Conclusions

Proposed a new metric to quantify how much
private information is leaked from blog posts on
SNSs

SNS users can see whether a post carelessly
exposes private information

Based on probability and entropy, the proposal
is simpler than others yet effective, as shown
in the experiments