Adam Sweeney
CS 898AB
Feb. 23, 2017
Community-Enhanced De-anonymization of Online Social
Networks
1
Introduction
• Online Social Networks (OSN) host vast
amounts of personal information
• Many applications
– Targeted advertising
– Health care
– Study of human behavior
• Information is anonymized before it is provided to
customers
• Adversaries will want to de-anonymize the data
2
Introduction (cont.)
• Narayanan and Shmatikov demonstrated a
method of de-anonymizing across social
networks using ‘network alignment’
• This paper focuses on network alignment de-anonymization with no additional attributes
– But the two networks have a high overlap
• Authors propose a divide-and-conquer approach
– Divide network into communities
– Use two-stage mapping
3
Agenda
Definitions and Attack Models
Background
Community Enhanced De-Anonymization
Degree of Anonymity
Evaluation
Results
Related Work
4
Definitions and Attack
Models
5
Definitions
• Undirected graph, 𝐺(𝑉, 𝐸)
• Clique or k-clique
– Fully connected sub-graph, k can specify size of
clique
• A community-blind algorithm is one that does not
see communities, like that of Narayanan and
Shmatikov
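A minimal sketch of the k-clique definition above, assuming the graph is stored as an adjacency dict of sets. The helper names `is_clique` and `k_cliques` and the toy graph are illustrative and do not come from the paper.

```python
from itertools import combinations

def is_clique(adj, nodes):
    """Return True if every pair of distinct nodes in `nodes` is adjacent."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))

def k_cliques(adj, k):
    """Enumerate all k-cliques by brute force (only sensible for tiny graphs)."""
    return [set(c) for c in combinations(sorted(adj), k) if is_clique(adj, c)]

# Toy undirected graph G(V, E): a triangle a-b-c with a pendant node d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
triangles = k_cliques(adj, 3)  # the only 3-clique is {a, b, c}
```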
6
Attack Model
• Assume adversary has access to two networks,
𝐺(𝑉, 𝐸) and 𝐺′(𝑉′, 𝐸′), where 𝑉 ∩ 𝑉′ ≠ ∅ and 𝐸 ∩ 𝐸′ ≠ ∅
– Focus on cases where 𝑉 ≈ 𝑉′ and 𝐸 ≈ 𝐸′
• Goals of the attacker
– Align the anonymized network with the ‘reference’
network
– Re-identify anonymized users
– Reveal private information
• Problem changes if both networks are
anonymized
7
Background
8
Background
• A community is typically regarded as a group
of densely connected nodes, where there are
few connections to nodes outside the community
• Communities often overlap
– Paper focuses on disjoint, non-overlapping
communities to simplify the problem
• Degree of anonymity
– 0 ≤ 𝐴(𝑋) ≤ 1, where 𝐴(𝑋) = 1 indicates complete
anonymity
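The slides do not give a formula for 𝐴(𝑋). One common way to obtain a value in [0, 1] is normalized Shannon entropy over the candidate identities of a node. The sketch below assumes that formulation. It may differ from the measure in the paper, and the function name `degree_of_anonymity` is hypothetical.

```python
import math

def degree_of_anonymity(probs):
    """Normalized Shannon entropy of a candidate distribution, in [0, 1].

    Assumption: `probs` holds the attacker's probability for each candidate
    identity of one node and sums to 1.
    """
    n = len(probs)
    if n <= 1:
        return 0.0  # a single candidate leaves no anonymity
    entropy = sum(p * math.log2(1 / p) for p in probs if p > 0)
    return entropy / math.log2(n)

uniform = degree_of_anonymity([0.25, 0.25, 0.25, 0.25])  # all candidates equally likely
certain = degree_of_anonymity([1.0, 0.0, 0.0, 0.0])      # node fully re-identified
```

Under this assumption, a uniform distribution gives complete anonymity and a distribution concentrated on one candidate gives none.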
9
Community Enhanced De-Anonymization
10
Community Enhanced De-Anonymization
• Any community detection method can be used to
identify communities
– Authors used Infomap
• Communities need to be mapped (2 methods)
– Identifying seed communities
• Needs pre-identified seed mappings
– Creating a network of communities
• Community structure is considered a high-level, coarse
grained graph (communities are nodes)
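The "network of communities" idea can be sketched as follows, assuming a community label per node is already available (the authors used Infomap; the labels here are hand-written). Weighting each community-level edge by the number of inter-community links is an assumption made for illustration, and `community_graph` is a hypothetical helper.

```python
from collections import defaultdict

def community_graph(edges, membership):
    """Collapse a node-level edge list into a weighted graph of communities."""
    weights = defaultdict(int)
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        if cu != cv:  # intra-community edges vanish in the coarse view
            weights[frozenset((cu, cv))] += 1
    return dict(weights)

# Two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
membership = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
coarse = community_graph(edges, membership)
```

The coarse graph is much smaller than the original, so aligning two such graphs is a far smaller search problem than aligning the node-level networks directly.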
11
Community Enhanced De-Anonymization (cont.)
• Community mapping allows for the identification
of additional seeds
– Searching within communities provides a
comparatively narrow scope
12
Community Enhanced De-Anonymization (cont.)
• Finding additional seeds is called “seed
enrichment”
• Seed enrichment at the community level is done
by using two distance metrics
– A node’s degree, and a node’s clustering coefficient
– Metrics are computed and tested between each pair
of nodes across mapped communities
– Community-blind algorithm is run locally (between two
mapped communities)
• Finally, apply community-blind algorithm to the
whole network using all mapped nodes as seeds
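A rough sketch of seed enrichment between one pair of mapped communities, using the two features named above (degree and clustering coefficient). The combined distance, the `max_dist` threshold, and the rule that a match must be the unique best candidate are all assumptions for illustration. The slides do not specify them, and the local run of the community-blind algorithm is not shown.

```python
from itertools import combinations

def clustering(adj, u):
    """Local clustering coefficient: fraction of neighbour pairs that are linked."""
    nbrs = adj[u]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))

def enrich_seeds(adj1, comm1, adj2, comm2, max_dist=0.05):
    """Propose node pairs across two mapped communities with near-identical features."""
    seeds = {}
    for u in comm1:
        du, cu = len(adj1[u]), clustering(adj1, u)
        # Hypothetical distance: relative degree gap plus clustering-coefficient gap.
        scored = sorted(
            (abs(du - len(adj2[v])) / max(du, len(adj2[v]), 1)
             + abs(cu - clustering(adj2, v)), v)
            for v in comm2
        )
        best_dist, best_node = scored[0]
        unique = len(scored) == 1 or scored[1][0] > best_dist
        if best_dist <= max_dist and unique:
            seeds[u] = best_node
    return seeds

# Same structure in both networks: hub with three neighbours, two of them linked.
adj1 = {"h": {"a", "b", "c"}, "a": {"h", "b"}, "b": {"h", "a"}, "c": {"h"}}
adj2 = {"H": {"A", "B", "C"}, "A": {"H", "B"}, "B": {"H", "A"}, "C": {"H"}}
new_seeds = enrich_seeds(adj1, ["h", "a", "b", "c"], adj2, ["H", "A", "B", "C"])
```

In this toy case the hub and the pendant node are matched, while the two structurally identical nodes `a` and `b` are left unmapped because neither candidate stands out.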
13
Degree of Anonymity
14
Degree of Anonymity
• Community structure may reveal information
about true mappings
– Even if a node cannot be mapped by a de-anonymization algorithm
• Skipping over a lot of math
• This measure estimates an upper bound on anonymity
– Quantifies the minimum possible damage from a de-anonymization attack
15
Evaluation
16
Evaluation Overview
• Simulation-based experiments using real-world
network datasets
• For each experiment
– Prepare a copy of original network
– Partially alter the structure
– Compare network alignment of community-blind
against community-aware algorithms
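The slides do not say which scores were used to compare the two algorithms. One plausible way to do it, assuming the true node correspondence is known from the simulation, is to count correct mappings and derive precision and recall. The function name `score_mapping` and the toy mappings are hypothetical.

```python
def score_mapping(predicted, truth):
    """Return (correct, precision, recall) for a predicted node mapping."""
    correct = sum(1 for u, v in predicted.items() if truth.get(u) == v)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(truth) if truth else 0.0
    return correct, precision, recall

truth = {"a": "A", "b": "B", "c": "C", "d": "D"}
blind = {"a": "A", "b": "C"}             # made-up output of a community-blind run
aware = {"a": "A", "b": "B", "c": "C"}   # made-up output of a community-aware run

blind_score = score_mapping(blind, truth)
aware_score = score_mapping(aware, truth)
```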
17
Data Sets
• Network of co-authorships between scientists
that posted to a specific archive
– Authors are connected if they wrote a paper together
• Twitter mention network
– Users who mutually mentioned each other
• Smaller sub-section of the same Twitter mention
network
18
Experimental Setup
• Original network is assumed to be anonymized
– Prepare an array of networks with different noise
levels
– Θ = 0.10 means that 10% of edges are re-wired
• After noisy networks are created, a percentage
of nodes are randomly removed from all
networks
– For example, when Θ = 0.10, an additional 5% of
nodes are removed
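The noise procedure can be sketched as below. The exact re-wiring rule is not given in the slides, so this version makes an assumption: each selected edge is replaced by a random edge that is absent from the original network. `rewire` and `drop_nodes` are hypothetical helpers, and the 20-node ring is a toy input.

```python
import random

def rewire(edges, theta, rng):
    """Replace a fraction `theta` of the edges with random edges not in the original."""
    original = {frozenset(e) for e in edges}
    nodes = sorted({n for e in original for n in e})
    noisy = set(original)
    n_rewire = round(theta * len(original))
    for old in rng.sample(sorted(original, key=sorted), n_rewire):
        noisy.remove(old)
        while True:  # assumes the graph is sparse enough for a free pair to exist
            new = frozenset(rng.sample(nodes, 2))
            if new not in original and new not in noisy:
                noisy.add(new)
                break
    return noisy

def drop_nodes(edge_set, fraction, rng):
    """Remove a random fraction of nodes together with their incident edges."""
    nodes = sorted({n for e in edge_set for n in e})
    dropped = set(rng.sample(nodes, round(fraction * len(nodes))))
    return {e for e in edge_set if not (e & dropped)}

ring = [(i, (i + 1) % 20) for i in range(20)]
rng = random.Random(0)                 # fixed seed so the run is repeatable
noisy = rewire(ring, 0.10, rng)        # Θ = 0.10: 2 of the 20 edges re-wired
pruned = drop_nodes(noisy, 0.05, rng)  # then 5% of nodes removed
```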
19
Experimental Setup (cont.)
• Eccentricity threshold of node-mapping algorithms set to
0.1
– Threshold of community mapping set to 0
– Observed that more mapped communities always
returned more correctly mapped nodes
• More false positives, but effect is limited
• Both algorithms given the same set of initial
seeds
– Mimics prior knowledge of attacker
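For context, eccentricity in Narayanan and Shmatikov's algorithm measures how much the best candidate match stands out from the rest: the gap between the top two scores divided by the standard deviation of all scores. The sketch below follows that definition as I understand it. Using the population standard deviation and the helper names are assumptions.

```python
import statistics

def eccentricity(scores):
    """(best - second best) / standard deviation of all candidate scores."""
    if len(scores) < 2:
        return 0.0
    top, second = sorted(scores, reverse=True)[:2]
    spread = statistics.pstdev(scores)
    return (top - second) / spread if spread > 0 else 0.0

def accept_match(scores, threshold=0.1):
    """Accept the best candidate only if it is distinctive enough."""
    return eccentricity(scores) >= threshold

clear_winner = accept_match([9.0, 2.0, 1.0, 1.0])  # one candidate stands out
tied_top = accept_match([5.0, 5.0, 4.9, 4.8])      # top two candidates tie
```

A low threshold such as 0.1 accepts most matches whose best candidate is not tied, which fits the slide's remark that more mappings bring more false positives.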
20
Results
21
Results
22
Results (cont.)
23
Results for Overlapping Data Sets
24
Related Work
25
Graph Anonymization
• Can be classified into four approaches
• Clustering
– Many possible mappings from clusters to mappings,
including the original mapping
• Clustering with constraints
– Merges nodes of a cluster into a single node
– Decides which edges to include such that
equivalence class nodes have same constraints as
original data
26
Graph Anonymization (cont.)
• Modification of graph
– Approach used in this paper
– Re-wiring, node removal
– Attempts to subvert attacks based on a known
structure
• Hybrid
– Any combination of the prior three
27
De-Anonymization Attacks Based on
Structure
• Leverage patterns of connectivity
• Active attack
– Adversary chooses victims ahead of time
– Create Sybils and attempt to form connections to the
victims
– Adversary can force unique structure that can be
identified from anonymized graph
• Passive attack
– Small group of attackers identifies its location in the
network
– Attempt to discover existence of edges
28
De-Anonymization Attacks Based on
Other Attributes
• Use a victim’s public and non-sensitive data
• Users that are part of multiple social networks
share different data
– More public on one, more private on another
• Not a trivial problem to match users across
networks
– Exploit activity patterns
– Tagging behavior
– Item preferences
– Communication patterns
29
Network Alignment
• Of interest in other fields
• A biological context
– Map two protein interaction networks to infer the
functions of unknown proteins in each species
30
Review
Definitions and Attack Models
Background
Community Enhanced De-Anonymization
Degree of Anonymity
Evaluation
Results
Related Work
31
Reference
• Nilizadeh, Shirin, Apu Kapadia, and Yong-Yeol
Ahn. "Community-Enhanced De-anonymization
of Online Social Networks." Proceedings of the
2014 ACM SIGSAC Conference on Computer
and Communications Security (CCS '14), 2014.
32
Questions
33