Privacy Preservation of Trajectory Data: A Survey
Huo Zheng
OUTLINE
• Motivating Applications
• Privacy-preserving in Different Scenarios
• Conclusions & Future work
Motivating Applications
1. Trajectory data publication & analysis
2. LBS
3. ITS
4. Trajectory data outsourcing
Solutions-overview
[Figure: works arranged by scenario (Data Publication, LBS, ITS, Data Outsourcing) and by technique (Suppression, Anonymization, Perturbation, Encryption); the placement of each work is not recoverable from the transcript.]
Works shown: [Terrovitis MDM’08], [Gruteser ISE’04], [Abul ICDMW’07], [Ghinita TDP’09], [Nergiz TDP’09], [Hoh MobiSys’08], [Abul ICDE’08], [Xu INFOCOM’08], [Gidofalvi MDM’07], [Divanis SIAM’09], [Hoh SecureCom’05], [Lee CIKM’09], [You PALMS’07], [Xu Proposal’10]
Scenario #1: Trajectory data publication & analysis
[Figure: the four scenarios: trajectory data publication & analysis, LBS, ITS, trajectory data outsourcing]
Solutions #1 Overview
• Protecting trajectory data privacy against attackers involves the following aspects:
▫ Preventing the adversary from identifying the owner of a trajectory
▫ Protecting sensitive location samples in trajectory data
▫ Accounting for attackers’ background knowledge: for example, knowing a user’s home and workplace can help an adversary infer the trajectory’s owner
▫ Protecting data privacy while preserving the utility of the data
[Figure: balance between data privacy and data utility]
Dummies
• Basic Idea
▫ Increase the number of possible trajectories from the adversary’s perspective
▫ Decrease the disclosure of the user’s real trajectory
• Method
▫ Generate dummy trajectories that resemble human movement behavior
▫ Generate dummy trajectories whose distances exceed a predefined distance deviation
[You PALMS’07]
Dummies (cont’)
• Procedure
▫ Set a disclosure rate
▫ Generate dummies
  - Random
  - Trajectories with intersections
  - Rotate
  - Compute distance deviation
[Figure: trajectories from source to destination]
[You PALMS’07]
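The "rotate" and "compute distance deviation" steps can be illustrated with a minimal sketch. It assumes a dummy is obtained by rotating the real trajectory around one of its own samples (so the two trajectories intersect there) and that the distance deviation is the mean Euclidean distance between samples at the same index. The function names, the choice of pivot, and the threshold are hypothetical and are not taken from [You PALMS’07].

```python
import math

def rotate(trajectory, pivot, angle_deg):
    """Rotate every (x, y) sample of a trajectory around a pivot point."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    px, py = pivot
    return [
        (px + (x - px) * cos_a - (y - py) * sin_a,
         py + (x - px) * sin_a + (y - py) * cos_a)
        for x, y in trajectory
    ]

def distance_deviation(real, dummy):
    """Mean Euclidean distance between samples taken at the same index
    (an assumed definition, used here only for illustration)."""
    return sum(math.dist(p, q) for p, q in zip(real, dummy)) / len(real)

def rotated_dummy(real, angle_deg, min_deviation):
    """Rotate the real trajectory around its middle sample. Return the dummy
    only if it deviates enough from the real trajectory, otherwise None."""
    pivot = real[len(real) // 2]
    dummy = rotate(real, pivot, angle_deg)
    if distance_deviation(real, dummy) >= min_deviation:
        return dummy
    return None

real = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
dummy = rotated_dummy(real, angle_deg=90, min_deviation=1.0)
too_close = rotated_dummy(real, angle_deg=1, min_deviation=1.0)
```

Here the 90-degree rotation is accepted, while the 1-degree rotation is rejected because the dummy would stay almost on top of the real trajectory.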
Pros and cons
• Pros
▫ Attackers cannot tell which trajectory is the real user trajectory, up to a threshold set by the user
▫ Simple and easy to understand
• Cons
▫ High storage cost: protecting a single trajectory requires storing several dummy trajectories, which also lowers data utility
▫ High disclosure rate against adversaries with strong background knowledge
[You PALMS’07]
Suppress locations in trajectory data publication
• Basic Idea
▫ Suppress location samples in a trajectory database
• Procedure
▫ Decide which locations to suppress
  - If the location sample is sensitive, suppress it.
  - If the location sample may reveal other information, suppress it.
▫ Suppress those locations when publishing the data

Trajectory database:
Id   Loc1     Loc2     Loc3      Loc4      Loc5
01   (1, 3)   (1, 5)   (2, 6)    (2, 9)    (3, 10)
02   (2, 5)   (4, 8)   (5, 10)   (5, 15)   (5, 20)
03   (0, 2)   (4, 2)   (5, 4)    (5, 10)   (6, 11)
04   (2, 3)   (2, 5)   (2, 8)    (3, 9)    (3, 15)

Sensitive locations:
Loc       Name
(2, 6)    Clinic
(5, 20)   Hotel
(3, 15)   Bar
[Terrovitis MDM’08]
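As a minimal sketch of the first suppression rule (drop samples that are themselves sensitive), the example data from this slide can be processed as follows. The function and variable names are hypothetical, and the second rule (samples that may reveal other information) is not modeled here.

```python
# Sensitive locations from the slide's example.
SENSITIVE = {(2, 6): "Clinic", (5, 20): "Hotel", (3, 15): "Bar"}

# Trajectory database from the slide's example.
TRAJECTORIES = {
    "01": [(1, 3), (1, 5), (2, 6), (2, 9), (3, 10)],
    "02": [(2, 5), (4, 8), (5, 10), (5, 15), (5, 20)],
    "03": [(0, 2), (4, 2), (5, 4), (5, 10), (6, 11)],
    "04": [(2, 3), (2, 5), (2, 8), (3, 9), (3, 15)],
}

def suppress(trajectories, sensitive):
    """Return a copy of the database without the sensitive location samples."""
    return {
        tid: [loc for loc in samples if loc not in sensitive]
        for tid, samples in trajectories.items()
    }

published = suppress(TRAJECTORIES, SENSITIVE)
for tid, samples in published.items():
    print(tid, samples)
```

Trajectories 01, 02, and 04 each lose one sample, while trajectory 03 is published unchanged because it visits no sensitive location.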
Privacy preservation in the publication of trajectories
• Motivation
▫ The Octopus RFID card is commonly used by Hong Kong residents to pay for transportation and for transactions at point-of-sale services;
▫ If the Octopus company publishes the data directly, it may cause privacy linkage, since other agencies may have partial knowledge about the same person.
[Figure: locations a1 and a3]
ID   Trajectory
t1   a1->a3
[Terrovitis MDM’08]
An Example
[Terrovitis MDM’08]
Pros and Cons
• Pros
▫ Protects moving objects’ privacy even when adversaries have partial knowledge
▫ Easy to understand, with low computation cost
• Cons
▫ May cause serious information loss if too many location samples are suppressed
Never Walk Alone
• Motivation
▫ Because GPS devices are imprecise, a recorded location is really an uncertainty region whose radius δ represents the possible location imprecision
• Key Idea
▫ Anonymize trajectories in the same time span under uncertainty δ
[Abul ICDE’08]
Never Walk Alone (cont’)
• Key Methods
▫ Preprocessing
  - Make trajectories uniform over the same time span [t1, tn]
▫ Clustering
  - Greedy clustering based on the Euclidean distance
▫ (K, δ)-anonymity
  - Space translation
[Figure: trajectories plotted over x, y, and time]
[Abul ICDE’08]
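The space translation step can be sketched under stated assumptions: trajectories in a cluster are already time-aligned, the cluster center is the per-timestamp mean of the samples, and each sample is pulled toward that center until it lies within δ/2 of it, so that any two trajectories in the cluster end up within δ of each other. These rules and all names below are illustrative assumptions, not the exact procedure of [Abul ICDE’08].

```python
import math

def colocalized(tr_a, tr_b, delta):
    """True if two time-aligned trajectories stay within delta at every timestamp."""
    return all(math.dist(p, q) <= delta for p, q in zip(tr_a, tr_b))

def space_translate(cluster, delta):
    """Pull every sample toward the per-timestamp centroid of its cluster until
    it lies within delta / 2 of that centroid (assumed translation rule)."""
    radius = delta / 2
    translated = [list(tr) for tr in cluster]
    for i in range(len(cluster[0])):
        cx = sum(tr[i][0] for tr in cluster) / len(cluster)
        cy = sum(tr[i][1] for tr in cluster) / len(cluster)
        for j, tr in enumerate(cluster):
            x, y = tr[i]
            d = math.dist((x, y), (cx, cy))
            if d > radius:
                scale = radius / d
                translated[j][i] = (cx + (x - cx) * scale, cy + (y - cy) * scale)
    return translated

cluster = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(4.0, 0.0), (5.0, 0.0)],
]
anonymized = space_translate(cluster, delta=2.0)
```

Samples that already lie within δ/2 of the center are left untouched, which is how the inherent location uncertainty reduces the distortion needed.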
Pros and cons
• Pros
▫ It exploits the inherent uncertainty of location in order to reduce the amount of distortion needed to anonymize data;
▫ It is a simple, efficient, and effective method.
• Cons
▫ It assumes a uniform uncertainty level, which is not suitable for some applications;
▫ Because the uncertainty level is bounded, distortion grows rapidly as K increases.
Towards trajectory anonymity
• Motivation
▫ To improve the utility of the published data
  - Most data mining and statistical applications work on atomic trajectories
• Procedure
▫ Trajectory grouping
  - Logic cost metric
▫ K-Anonymity
▫ Reconstruction
[Nergiz TDP’09]
An Example
[Figure panels: anonymization tr* of tr1 and tr2; anonymization of tr* and tr3; reconstruction by randomly selecting points; complete]
[Nergiz TDP’09]
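The anonymize-then-reconstruct flow of this example can be sketched roughly as follows. The sketch assumes trajectories of equal length whose samples are matched by index, generalizes matched samples into bounding boxes, and reconstructs an atomic trajectory by picking one random point per box. The matching rule, names, and sample coordinates are hypothetical simplifications and do not reproduce the method of [Nergiz TDP’09].

```python
import random

def to_boxes(trajectory):
    """Represent each (x, y) sample as a degenerate box (x0, y0, x1, y1)."""
    return [(x, y, x, y) for x, y in trajectory]

def merge(boxes_a, boxes_b):
    """Generalize index-aligned boxes into the smallest box covering both."""
    return [
        (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
        for a, b in zip(boxes_a, boxes_b)
    ]

def reconstruct(boxes, rng):
    """Randomly select one point inside each box to form an atomic trajectory."""
    return [(rng.uniform(x0, x1), rng.uniform(y0, y1)) for x0, y0, x1, y1 in boxes]

tr1 = [(0, 0), (2, 1), (4, 2)]
tr2 = [(1, 1), (3, 0), (5, 3)]
tr3 = [(0, 2), (2, 2), (6, 2)]

tr_star = merge(to_boxes(tr1), to_boxes(tr2))   # anonymization of tr1 and tr2
tr_star = merge(tr_star, to_boxes(tr3))         # anonymization of tr* and tr3
released = reconstruct(tr_star, random.Random(0))
```

Releasing reconstructed points instead of boxes keeps the published data in the atomic form that mining tools expect.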
Conclusions
• Privacy preservation for trajectory data publication has been widely studied.
• Several methods have been proposed for trajectory data privacy preservation; most of them derive from privacy-preserving data publication.
• The challenge lies in preserving privacy under high-frequency sampling while providing high data utility.
Scenario #2: LBS
[Figure: the four scenarios: trajectory data mining, LBS, ITS, trajectory data outsourcing]
Solutions #2 overview
• Protecting trajectory data privacy against attackers involves the following aspects:
▫ Protecting trajectory privacy against non-trustworthy LBS servers
▫ Protecting users’ privacy when they acquire LBS services, for example when sending queries
▫ Protecting data privacy while providing high quality of service
[Figure: balance between moving objects’ privacy and QoS]
Navigational path privacy protection
• Motivation
▫ The navigational path query, which determines a route from a source to a destination, is one of the most popular LBS
▫ Issuing path queries to non-trustworthy service providers may pose privacy threats
• Example
▫ Mr. Q asks “How to get to the psychiatrist from home?”; since he is going to a psychiatrist, the service provider may infer that he has a mental health problem
[Figure: the user sends queries to service providers and receives results]
[Lee CIKM’09]
Navigational path privacy protection (cont’)
• Solutions
▫ Landmark: replace both the source and destination of a path query Q(s, t) with other locations, resulting in another path query Q(s’, t’)
▫ Cloaking: cloak both the source and destination into locations at the same street level; the result may be irrelevant
[Lee CIKM’09]
Navigational path privacy protection (cont’)
• Solutions
▫ Obfuscate a path query by injecting some fake sources and destinations
• Three methods
▫ Independent obfuscated path query
▫ Shared obfuscated path query
▫ Anti-collusion obfuscated path query
[Figure: source s (Mr. Q’s home) inside source set S; destination t (clinic) inside destination set T]
[Lee CIKM’09]
System overview
Independent obfuscated query: obfuscate one independent path query by randomly injecting fake locations
S = {sA, s1}, T = {tA, t1, t2}
Pb = 1/(2×3) = 1/6
Shared obfuscated query: obfuscate two or more path queries together by injecting fake locations
S = {sA, s1, sB}, T = {tA, t1, t2, tB}
Pb = 1/(3×4) = 1/12
Anti-collusion obfuscated query: inject more fake locations in order to get a low breach probability
S = {sA, s1, s2, sB}, T = {tA, t1, t2, t3, tB}
Pbmin = 1/(4×5) = 1/20; Pbmax = 1/(2×3) = 1/6
[Lee CIKM’09]
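The breach probabilities above are consistent with Pb = 1/(|S|·|T|), that is, the chance of guessing the true (source, destination) pair when every pair is equally likely. The short sketch below reproduces the three values; the helper name is hypothetical, and the fifth destination in the anti-collusion case is given the assumed name "t3".

```python
from fractions import Fraction

def breach_probability(sources, destinations):
    """Probability of guessing the true (source, destination) pair when all
    |S| * |T| combinations are equally likely."""
    return Fraction(1, len(sources) * len(destinations))

independent = breach_probability({"sA", "s1"}, {"tA", "t1", "t2"})
shared = breach_probability({"sA", "s1", "sB"}, {"tA", "t1", "t2", "tB"})
# Fifth destination name "t3" is an assumption made for illustration.
anti_collusion = breach_probability(
    {"sA", "s1", "s2", "sB"}, {"tA", "t1", "t2", "t3", "tB"}
)

print(independent, shared, anti_collusion)
```

Using exact fractions keeps the results directly comparable with the values on the slide.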
Pros and Cons
• Pros
▫ Provides a framework that obfuscates path queries in order to protect mobile users’ trajectory privacy
▫ Mixing in fake sources and destinations greatly reduces the breach probability
• Cons
▫ Provides weak privacy protection when the adversary has strong background knowledge
Cut-Enclose
• Motivation
▫ Problems with existing methods: overlapping trajectory anonymity rectangles may cause location privacy linkage
▫ Problems with simple cut-enclose: simply cutting and enclosing may leak privacy at the joints between grid cells
[Figure: anonymity rectangles for the time intervals [ti-1, ti], [ti, ti+1], [ti+1, ti+2]; time delay factor]
[Gidofalvi MDM’07]
Cut-Enclose (cont’)
• Procedure
▫ Users set privacy levels (individual privacy level / region sensitivity level);
▫ Separate the 2D space into grids;
▫ According to the user-specified individual privacy level (CRP/IRP) or region sensitivity level (IIP), combine grid cells into partitions;
▫ Anonymize trajectory pieces in each partition with a time delay factor.
• Partitioning schemes
▫ Common Regular Partitioning (CRP)
▫ Individual Regular Partitioning (IRP)
▫ Individual Irregular Partitioning (IIP)
[Figure: anonymized trajectory]
[Gidofalvi MDM’07]
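The cut-and-enclose idea can be illustrated with a simplified sketch that assumes a single uniform grid shared by all users. Each trajectory is cut wherever it enters a new grid cell, and each piece is replaced by the rectangle of its cell. Privacy levels, merging of cells into larger partitions, and the time delay factor are deliberately left out, and all names are hypothetical.

```python
import math
from itertools import groupby

def cell_of(point, cell_size):
    """Index of the grid cell that contains a point."""
    x, y = point
    return (math.floor(x / cell_size), math.floor(y / cell_size))

def cut_enclose(trajectory, cell_size):
    """Cut a trajectory at grid-cell changes and replace each piece by the
    rectangle (x_min, y_min, x_max, y_max) of the cell that contains it."""
    rectangles = []
    pieces = groupby(trajectory, key=lambda p: cell_of(p, cell_size))
    for (cx, cy), _piece in pieces:
        rectangles.append((
            cx * cell_size, cy * cell_size,
            (cx + 1) * cell_size, (cy + 1) * cell_size,
        ))
    return rectangles

trajectory = [(1, 1), (2, 3), (6, 3), (7, 8)]
anonymized = cut_enclose(trajectory, cell_size=5)
print(anonymized)
```

Because the rectangles are grid cells, they never partially overlap, which avoids the linkage problem of overlapping anonymity rectangles.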
Anonymity with historical data
• Motivation
▫ Existing cloaking methods depend heavily on the network density;
▫ Existing methods are not suitable for time-series sequences
  - The cloaking boxes form a trajectory that may disclose a user’s trajectory.
[Xu INFOCOM’08]
Anonymity with historical data (cont’)
• Procedure
▫ Cloaking K-1 additive trajectories
1. Linear: the cloaking result is considered as a new base trajectory T0
2. Quadratic: the selection of the new trajectory is based on its distance to T, not T0
▫ Cloaking one additive trajectory
1. Select a pivot for each footprint;
2. Choose the one with the smallest MBC and index No. as the next pivot;
3. Repeat until all points of the base trajectory are anonymized.
[Figure: base trajectory T0 with labels C1 to C4 and c1 to c4, and additive trajectories Ta (points a1 to a8) and Tb (points b1 to b7)]
[Xu INFOCOM’08]
Scenario #3: ITS
[Figure: the four scenarios: trajectory data mining, LBS, ITS, trajectory data outsourcing]
Privacy preserving traffic monitoring
• Motivation
▫ GPS-equipped vehicles send their location information to a traffic monitoring center at regular intervals
▫ The location traces might reveal sensitive places that drivers have visited
[Hoh MobiSys’08]
Privacy preserving traffic monitoring (cont’)
• Key Idea
▫ Minimizing tracking time reduces the risk that an adversary can correlate an identity with sensitive locations
• Method
▫ A time-to-confusion level
▫ An uncertainty level
[Hoh MobiSys’08]
Conclusions
• Trajectory data privacy preservation in online applications is necessary, yet no dominant method exists to solve this problem.
• The challenge lies in preserving the privacy of the current trajectory without leaking location privacy, while providing high-quality online services.
Scenario #4: Trajectory data outsourcing
[Figure: the four scenarios: trajectory data mining, ITS, LBS, trajectory data outsourcing]
Solutions #4 overview
• Motivation
▫ The cloud is emerging as a new way of delivering DaaS;
▫ More and more agencies are moving their data to the cloud, and they worry about privacy and security in the cloud;
▫ Privacy protection in the cloud is necessary.
[Figure: Dark Cloud vs. Green Cloud]
[Xu Proposal’10]
Privacy Threats in the Cloud
• Users’ Query Privacy
▫ E.g., Mr. Q wants to protect his query against the Cloud, since his query is about mental disease
• Data Privacy of the Data Owner
• Mutual Privacy
▫ Semi-honest model
[Figure: Data Owner, Cloud, and client, with data, query, and result flows]
[Xu Proposal’10]
Main Framework
1. The Data Owner encrypts the database R and sends it to the Cloud.
2. The Data Owner sends a shadow index E(I) and S^-1() to the client, and sends E^-1() to the Cloud for the following processing.
3. E(i) is retrieved locally by the client, encrypted as Ec(E(i)), and sent back to the Cloud for decryption.
4. The Cloud decrypts Ec(E(i)) to get Ec(i) and returns it to the client.
5. If it is a leaf node, the client decrypts it with S^-1() to get the result. If it is not a leaf node, the client gets the next i.
[Xu Proposal’10]
Research issue
• Efficient Privacy-Preserving Query Processing Techniques
▫ Challenges lie in complex queries, especially distance-based queries; a typical example is the k-nearest neighbor (kNN) query
• Privacy-Aware Query Result Authentication Techniques
▫ If the cloud is malicious or does not follow the protocol faithfully, the client needs to authenticate the correctness of query results
[Figure: the query “Nearest Clinic” is sent to the Cloud, which returns results]
[Xu Proposal’10]
CONCLUSIONS
• This survey discussed trajectory data privacy preservation techniques
▫ Online trajectory data privacy preservation is service-centric; the trade-off is between QoS and privacy preservation
▫ Offline trajectory data privacy preservation is data-centric; the trade-off is between data quality and privacy preservation
• Most of the techniques deal with this problem in free space, and most of them are offline algorithms
FUTURE WORK
○ Complete the survey in the following aspects:
1. Privacy preserving in time-series data.
2. Privacy preserving in outsourcing data.
3. ……
○ Trajectory data protection in online applications
● Trajectory data protection in data publication / data
outsourcing
1. ITS/LBS
2. Trajectory data outsourcing
References
• G. Gidofalvi, X. Huang, and T. B. Pedersen. Privacy-Preserving Data Mining on Moving Object Trajectories. In Proceedings of MDM’07, 2007.
• J. Krumm. Inference Attacks on Location Tracks. In Proceedings of the 5th International Conference on Pervasive Computing (Pervasive 2007), May 2007.
• M. Terrovitis and N. Mamoulis. Privacy Preserving in the Publication of Trajectories. In Proceedings of MDM’08, 2008.
• A. Gkoulalas-Divanis and V. S. Verykios. A Privacy-Aware Trajectory Tracking Query Engine. In Proceedings of SIGKDD 2008.
• M. E. Nergiz, M. Atzori, Y. Saygin, and B. Guc. Towards Trajectory Anonymization: a Generalization-Based Approach. Transactions on Data Privacy 2 (2009) 47-75.
• T.-H. You, W.-C. Peng, and W.-C. Lee. Protecting Moving Trajectories with Dummies. In Proceedings of PALMS 2007.
• H. Kido, Y. Yanagisawa, and T. Satoh. An Anonymous Communication Technique Using Dummies for Location Based Services. In Proceedings of ICPS 2005.
• O. Abul, F. Bonchi, and M. Nanni. Never Walk Alone: Uncertainty for Anonymity in Moving Objects Databases. In Proceedings of ICDE 2008.
• G. Ghinita. Private Queries and Trajectory Anonymization: a Dual Perspective on Location Privacy. Transactions on Data Privacy, 2009, 3-19.
• V. Rastogi and S. Nath. Differentially Private Aggregation of Distributed Time-Series with Transformation and Encryption. In Proceedings of SIGMOD’10, 2010.
• T. Xu and Y. Cai. Exploring Historical Location Data for Anonymity Preservation in Location-based Services. In Proceedings of INFOCOM’08, 2008.
• K. C. K. Lee, W. Lee, H. V. Leong, and B. Zheng. Navigational Path Privacy Protection. In Proceedings of CIKM’09, 2009.
References (cont’)
• A. Gkoulalas-Divanis, V. S. Verykios, and M. F. Mokbel. Identifying Unsafe Routes for Network-Based Trajectory Privacy. In Proceedings of SPC’09, 2009.
• O. Abul, M. Atzori, F. Bonchi, and F. Giannotti. Hiding Sensitive Trajectory Patterns. In Proceedings of ICDMW’07, 2007.
• M. Gruteser and X. Liu. Protecting Privacy in Continuous Location-Tracking Applications. In IEEE Security and Privacy, 2004.
• X. Pan, X. Meng, and J. Xu. Distortion-based Anonymity for Continuous Queries in Location-Based Mobile Services. In Proceedings of SIGGIS’09, 2009.
• S. Mukherjee, Z. Chen, and A. Gangopadhyay. A Privacy-Preserving Technique for Euclidean Distance-Based Mining Algorithms Using Fourier-Related Transforms. In VLDB Journal (2006) 15:293–315.
• B. Hoh, M. Gruteser, H. Xiong, and A. Alrabady. Preserving Privacy in GPS Traces via Uncertainty-Aware Path Cloaking. In Proceedings of CCS’07, 2007.
Thanks for your time!
I got your interests~
Q&A