Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Distributed operating system wikipedia , lookup
Backpressure routing wikipedia , lookup
Recursive InterNetwork Architecture (RINA) wikipedia , lookup
Airborne Networking wikipedia , lookup
IEEE 802.1aq wikipedia , lookup
List of wireless community networks by region wikipedia , lookup
ATCN: The Mathematics of Social Networks (15/12/08) Thrasyvoulos Spyropoulos (Akis) [email protected] Complex Networks Internet Paper X Paper Y Paper A Paper B Paper C Paper D Article Citations 2 Complex Networks (2) A complex network is a graph What are the statistical properties of this graph? Node degrees? Path length? How can we model this graph? Why care about complex networks? Connectivity (epidemiology, resilience) Spread (information, disease) Search (Web page, person) 3 Complex Network Measurements Measurements of Internet connectivity (traceroute maps), Citation network, Protein network, IMDB actor collaborations. Metrics: Path length: how many hops between two nodes on average? 1 dij n(n - 1)/2 i j Degree distribution: probability p(k) of a node having k neighbors Clustering coefficient (transitivity): If links A-B and B-C exist: probability that A-C exists? C (1) triangles centered at node i triples centered at node i i i 4 Measurement Findings: Path Length Milgram’s experiment => Small World Phenomenon Short paths exist between most nodes: Path length l << total nodes N Line network: path length l = O(N) “Small world” = avg. path length l is at most logN M. Newman (2003) 5 Measurement Findings: Degree Distribution The degree distributions of most real-life networks follow a power law p(k) = Ck-α Right-skewed/Heavy-tail distribution there is a non-negligible fraction of nodes that has very high degree (hubs) scale-free: f(ax) = bf(x), no characteristic scale, average is not informative Power-law distribution gives a line in the log-log plot log p(k) = -α logk + logC log frequency frequency degree α log degree 6 Measurement Findings: Degree Distribution Taken from [Newman 2003] 7 Collective Statistics (M. Newman 2003) α : power-law exponent (typically 2 ≤ α ≤ 3) 8 Measurement Findings: Transitivity/Clustering Is the friend of your friend, your friend also? Clustering coefficient: density of triangles (local clusters) in the graph C (1) triangles centered at node i triples centered at node i i i C = Prob (link A-C exists | A-B, B-C exist) If C > Prob(link A-C exists) => network is clustered In a random network, C → 0 as N → ∞ 1 4 C 3 2 (1) 3 3 1 1 6 8 5 9 Communities Community: higher link density between nodes Between communities: lower link density 10 Measurement Findings: Transitivity/Clustering 11 Summary of Findings Most real networks have… 1. Short paths between nodes (“small world”) 2. Transitivity/Clustering coefficient that is finite > 0 3. Degree distribution that follows a power law Q1. Can we design graph models that exhibit similar characteristics? Q2. Can we explain how/why these phenomena occur in the first place? 12 Random (or Poisson or Erdös-Renyi) Graphs The basic Gn,p (Erdös-Renyi) random graph model: n : the number of vertices 0≤p≤1 for each pair (i,j), generate the edge (i,j) independently with probability p p(degree = k): p(k) B(n; k; p) pk 1 pnk Average node degree z = np n k Simple: very well studied in the literature (average) path length l is logn/logz => Small World Very short paths exist between any nodes. Why?? Number of nodes k hops far = zk => zl = n 13 Random Graphs (2) Is path length enough? Degree Distribution: zk z As N → ∞ the degree distribution is Poisson p(k) P(k; z) e k! Highly concentrated around the mean Probability of very high node degrees is exponentially small Very different from power law! Clustering Coefficient: The probability of two of your neighbors also being neighbors is p, independent of local structure clustering coefficient C = p when z is fixed C = z/n =O(1/n) No Navigability: Cannot find a node using local criteria (remember Milgram?) 14 Small World Graphs Short paths must be combined with High clustering coefficient Ease of navigation: find a short path using only local criteria Watts and Strogatz model [WS98] Start with a ring, where every node is connected to the next k nodes With probability p, rewire every edge (or, add a shortcut) to a random node order p=0 randomness 0<p<1 p=1 15 Small World Graphs (2) Clustering Coefficient – Characteristic Path Length log-scale in p When p = 0, C = 3(k-2)/4(k-1) ~ ¾ L = n/k For small p, C ~ ¾ L ~ logn 16 Scale-free Graphs: What About Power Laws? The configuration model input: the degree sequence [d1,d2,…,dn] 4 1 3 2 process: - Create di copies of node I; link them randomly - Take a random matching (pairing) of the copies • self-loops and multiple edges are allowed But: Too artificial! 17 Preferential Attachment in Networks “The rich get richer” First considered by [Price 65] as a model for citation networks each new paper is generated with m citations (on average) new papers cite previous papers with probability proportional to their indegree (citations) what about papers without any citations? - each paper is considered to have a “default” citation - probability of citing a paper with degree k, proportional to k+1 Power law with exponent α = 2+1/m 18 Barabasi-Albert model The BA model (undirected graph) input: some initial subgraph G0, and m the number of edges per new node the process: - nodes arrive one at the time - each node connects to m other nodes selecting them with probability proportional to their degree - if [d1,…,dt] is the degree sequence at time t, the node t+1 links to node i with probability d d i d i i i 2mt Results in power-law with exponent α = 3 Various Problems: cannot account for every power law observed (Web), correlates age with degree, etc. 19 Processes on Social/Complex Networks Percolation Connectivity of graph as nodes/links are added or removed Network resilience to attacks, immunization, etc. Epidemiological Processes Disease spread Rumor spread Network Navigation Search for a node/data using only local (“greedy”) algorithms Design a network that facilitates fast navigation 20 Search in Scale-Free Networks Locating a Piece of Data on a Network (e.g. Database, Web) Breadth-first Search If data not available locally => ask all neighbors one-by-one If a neighbors doesn’t have it => returns list of its own neighbors Summary: Query 1-hop neighbors, then 2-hop neighbors, etc. Popularity-based Search (Adamic et al.) If data not available locally => find neighbor with highest node degree This neighbor takes over: looks for its own highest degree neighbor Quickly find “hubs” => connected to large subsets of nodes 21 Milgram’s experiment revisited What did Milgram’s experiment show? (a) There are short paths in large networks that connect individuals (b) People are able to find these short paths using a simple, greedy, decentralized algorithm Small world models take care of (a) Kleinberg: what about (b)? 22 Kleinberg’s model Consider a directed 2-dimensional lattice For each vertex u add q shortcuts choose vertex v as the destination of the shortcut with probability proportional to [d(u,v)]-r when r = 0, we have uniform probabilities 23 Searching in a Small World: Algorithm Given a source s and a destination t, the search algorithm 1. knows the positions of the nodes on the grid (geography information) 2. knows the neighbors and shortcuts of the current node (local information) 3. operates greedily, each time moving as close to t as possible (greedy operation) 4. knows the neighbors and shortcuts of all nodes seen so far (history information) 24 Small World (r < 2 or r > 2) “Long-range” contacts are chosen uniformly: r = 0 (Watts/Strogatz) Short paths (O(logn)) between any two nodes DO exist BUT: a local greedy algorithm (1-4) needs expected time (Kleinberg) Ω(n2/3) Generally: - When r<2 a local greedy algorithm (1-4) needs expected time Ω(n(2-r)/3). - When r>2 a local greedy algorithm (1-4) needs expected time Ω(n(r-2)/(r-1)). 25 Small World (r = 2) Short paths (O(logn)) between any two nodes exist (Kleinberg) When r=2, an algorithm that uses only local information at each node (not 4) can reach the destination in expected time O(log2n) Generalizes for a d-dimensional lattice, when r=d (query time is independent of the lattice dimension) - d = 1, the Watts-Strogatz model Why? r > 2: “long-range contacts” are too “short” => too little progress r < 2: “long-range contacts do not provide any “geographic cues” r = 2: just the right navigational cues 26 Applications: Using Social Networks for Routing in Delay Tolerant Networks Represent node contacts over time as a social graph Navigate social graph Mobility Model using Social Networks Use a social network as an overlay between “actors”/nodes Determine mobility decisions based on social network 27 SimBet: DTN Routing Using Social Networks Mobility Model => (limited duration) contacts between nodes Convert contacts over time → Social Graph 1. 2. 3. Contact between node A and B => social “link” between A and B Algorithm: Each node A keeps track of past contacts For every node j that A has encountered: create a link A-j Each node may exchange its connectivity table with other nodes it encounters (or even other nodes’ connectivity tables) 28 SimBet: DTN Routing Using Social Networks (2) Exploit Social Network Analysis Techniques in order to: Identify bridging ties - Centrality Identify clusters - Similarity Similarity: Number of common neighbors between two nodes 29 SimBet: DTN Routing Using Social Networks (3) Degree centrality popular nodes in the network N C D ( pi ) a( pi , p k ) k 1 Closeness centrality the distance of a given node to each node in the network CC ( pi ) Betweenness centrality N 1 N d( p , p i k 1 k ) the extent to which a node can facilitate communication to other nodes in the network N j 1 C B ( pi ) j 1 k 1 g jk ( pi ) g jk 30 SimBet Utility Calculation Goal: to select node that represents the best trade off across both attributes Sim n (d ) SimUtiln (d ) Sim n (d ) Sim m (d ) Bet n (d ) BetUtiln (d ) Bet n (d ) Bet m (d ) Combined: SimBetUtiln (d ) SimUtiln BetUtiln where 1 31 Social Networks Graph model: Vertices = humans, Weighted Edges = strength of interaction 32 Mapping Communities to Locations Assume a grid with different locations of interest Geographic consideration might gives us the candidate locations 33 Mobility Between Communities pc(i) = attraction of node i to community/location c p1(B)(t) pC(i) w j C p2(B)(t) ij {j C} p3(B)(t) 34