The Internet
  How valuable is a network? Metcalfe’s Law
  Domain Name System: translates between names and IP addresses
  Properties of the Internet
    Heterogeneity
    Redundancy
    Packet-switched
  1.08 billion online (Computer Industry Almanac 2005)
  Who has access? How important is access?
CompSci 001 4.1

Tim Berners-Lee
  I want you to realize that, if you can imagine a computer doing something, you can program a computer to do that. Unbounded opportunity... limited only by your imagination. And a couple of laws of physics.
  TCP/IP, HTTP
  How, Why, What, When?

Graphs: Structures and Algorithms
  How do packets of bits/information get routed on the internet?
    Message divided into packets on client (your) machine
    Packets sent out using routing tables toward destination
      • Packets may take different routes to destination
      • What happens if packets are lost or arrive out-of-order?
    Routing tables store local information, not global (why?)
  What about The Oracle of Bacon, Erdos Numbers, and Word Ladders?
    All can be modeled using graphs
    What kind of connectivity does each concept model?
  Graphs are everywhere in the world of algorithms (world?)

Vocabulary
  Graphs are collections of vertices and edges (vertex also called node)
    Edge connects two vertices
      • Direction can be important, directed edge, directed graph
      • Edge may have associated weight/cost
  A vertex sequence v_0, v_1, …, v_(n-1) is a path where v_k and v_(k+1) are connected by an edge.
    If some vertex is repeated, the path contains a cycle
  A graph is connected if there is a path between any pair of vertices
  [Figure: two weighted graphs, one linking NYC, Phil, Wash DC, and Boston with numeric edge weights, one linking LGA, LAX, DCA, and ORD with dollar-valued edge weights]

Network/Graph questions/algorithms
  What vertices are reachable from a given vertex?
    Two standard traversals: depth-first, breadth-first
    Find connected components, groups of connected vertices
  Shortest path between any two vertices (weighted graphs?)!
  Longest path in a graph
    No known efficient algorithm
  Longest shortest path: Diameter of graph
  Visit all vertices without repeating? Visit all edges?
    With minimal cost? Hard!
  What are the properties of the network?
    Structural: Is it connected?
    Statistical: What is the average number of neighbors?

Network Nature of Society
  Slides from Michael Kearns - Univ. of Pennsylvania

Emerging science of networks
  Examining apparent similarities between many human and technological systems & organizations
  Importance of network effects in such systems
    How things are connected matters greatly
    Structure, asymmetry and heterogeneity
    Details of interaction matter greatly
    The metaphor of viral spread
    Dynamics of economic and strategic interaction
    Qualitative and quantitative; can be very subtle
  A revolution of measurement, theory, breadth of vision
CompSci 001 (M. Kearns) 4.7

“Real World” Social Networks
  Example: Acquaintanceship networks
    vertices: people in the world
    links: have met in person and know last names
    hard to measure
  Example: scientific collaboration
    vertices: math and computer science researchers
    links: between coauthors on a published paper
    Erdos numbers: distance to Paul Erdos
    Erdos was definitely a hub or connector; had 507 coauthors
  How do we navigate in such networks?

Online Social Networks
  A somewhat recent example: Friendster
    vertices: subscribers to www.friendster.com
    links: created via deliberate invitation
  More recent and interesting: thefacebook
    Join the Computer Science 1 group!
  Older example: social interaction in LambdaMOO
    LambdaMOO: chat environment with “emotes” or verbs
    vertices: LambdaMOO users
    links: defined by chat and verb exchange
    could also examine “friend” and “foe” sub-networks

Content Networks
  Example: document similarity
    vertices: documents on the web
    links: defined by document similarity (e.g. Google)
    here’s a very nice visualization
    not the web graph, but an overlay content network
  Of course, every good scandal needs a network
    vertices: CEOs, spies, stock brokers, other shifty characters
    links: co-occurrence in the same article
  Then there are conceptual networks
    a thesaurus defines a network
    so do the interactions in a mailing list

Business and Economic Networks
  Example: eBay bidding
    vertices: eBay users
    links: represent bidder-seller or buyer-seller
    fraud detection: bidding rings
  Example: corporate boards
    vertices: corporations
    links: between companies that share a board member
  Example: corporate partnerships
    vertices: corporations
    links: represent formal joint ventures
  Example: goods exchange networks
    vertices: buyers and sellers of commodities
    links: represent “permissible” transactions

Physical Networks
  Example: the Internet
    vertices: Internet routers
    links: physical connections
    vertices: Autonomous Systems (e.g. ISPs)
    links: represent peering agreements
    latter example is both physical and business network
  Compare to more traditional data networks
  Example: the U.S. power grid
    vertices: control stations on the power grid
    links: high-voltage transmission lines
    August 2003 blackout: classic example of interdependence

US Power Grid

Content Networks
  Example: Document similarity
    Vertices: documents on web
    Edges: Weights defined by similarity
    See TouchGraph GoogleBrowser
  Conceptual network: thesaurus
    Vertices: words
    Edges: synonym relationships

Enron

Acquaintanceship & more

Network Models (Barabasi)
  Differences between Internet, Kazaa, Chord
    Building, modeling, predicting
  Static networks, Dynamic networks
    Modeling and simulation
  Random and Scale-free
    Implications?
  Structure and Evolution
    Modeling via Touchgraph

Web-based social networks
  http://trust.mindswap.org
    Myspace        73,000,000
    Passion.com    23,000,000
    Friendster     21,000,000
    Black Planet   17,000,000
    Facebook        8,000,000
  Who’s using these, what are they doing, how often are they doing it, why are they doing it?

Golbeck’s Criteria
  Accessible over the web via a browser
  Users explicitly state relationships
    Not mined or inferred
  Relationships visible and browsable by others
    Reasons?
  Support for users to make connections
    Simple HTML pages don’t suffice

CSE 112, Networked Life (UPenn)
  Find the person in Facebook with the most friends
    Document your process
  Find the person with the fewest friends
    What does this mean?
  Search for profiles with some phrase that yields 30-100 matches
    Graph degrees/friends, what is distribution?

CompSci 1: Overview CS0
  Audioscrobbler and last.fm
    Collaborative filtering
    What is a neighbor?
    What is the network?

What can we do with real data?
  How do we find a graph’s diameter?
    This is the maximal shortest path between any pair of vertices
    Can we do this in big graphs?
  What is the center of a graph?
    From rumor mills to terrorists
    How is this related to diameter?
  Demo GUESS (as augmented at Duke)
    IM data, Audioscrobbler data

My recommendations at Amazon

And again…

Collaborative Filtering
  Goal: predict the utility of an item to a particular user based on a database of user profiles
    User profiles contain user preference information
    Preference may be explicit or implicit
      • Explicit means that a user votes explicitly on some scale
      • Implicit means that the system interprets user behavior or selections to impute a vote
  Problems
    Missing data: voting is neither complete nor uniform
    Preferences may change over time
    Interface issues
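
The collaborative filtering goal stated above (predict an item's utility for one user from a database of user profiles) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method the slides or any particular site use: the `votes` table, the user and item names, and the 1-5 scale are all made up, and the prediction rule is one common neighbor-based scheme (Pearson-weighted deviations from each user's mean vote). Missing votes are simply absent from a profile, which mirrors the "voting is neither complete nor uniform" problem.

```python
from math import sqrt

# Hypothetical explicit votes on a 1-5 scale (assumed data, not from the slides).
# An item missing from a profile means the user never voted on it.
votes = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 5, "D": 4},
    "cat": {"A": 1, "B": 5, "D": 2},
}

def mean(user):
    """Average of the votes the user actually cast."""
    vals = votes[user].values()
    return sum(vals) / len(vals)

def similarity(u, v):
    """Pearson-style correlation over items both users voted on.

    Returns 0.0 when there are fewer than two shared items or when
    either user shows no variation on the shared items.
    """
    common = set(votes[u]) & set(votes[v])
    if len(common) < 2:
        return 0.0
    mu, mv = mean(u), mean(v)
    num = sum((votes[u][i] - mu) * (votes[v][i] - mv) for i in common)
    den_u = sqrt(sum((votes[u][i] - mu) ** 2 for i in common))
    den_v = sqrt(sum((votes[v][i] - mv) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)

def predict(user, item):
    """Predict the user's vote on item from neighbors who voted on it.

    Falls back to the user's own mean vote when no neighbor is informative.
    """
    neighbors = [v for v in votes if v != user and item in votes[v]]
    weights = {v: similarity(user, v) for v in neighbors}
    norm = sum(abs(w) for w in weights.values())
    if norm == 0:
        return mean(user)
    offset = sum(w * (votes[v][item] - mean(v)) for v, w in weights.items())
    return mean(user) + offset / norm
```

Calling `predict("ann", "D")` estimates how ann would vote on an item she has not rated. Here a "neighbor" is any other user who voted on the item, weighted by how closely their past votes track ann's, which is one possible answer to the earlier question "What is a neighbor?".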
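
The questions on the "What can we do with real data?" slide (diameter as the maximal shortest path between any pair of vertices, and the center of a graph) can be made concrete with breadth-first search. The sketch below assumes a small, connected, undirected, unweighted graph; the `graph` dictionary and its vertex names are invented for illustration. It runs one BFS per vertex, so it is fine for small graphs but becomes expensive on large ones, which is the point behind "Can we do this in big graphs?".

```python
from collections import deque

# Hypothetical undirected graph as adjacency lists (assumed example data).
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

def bfs_distances(graph, source):
    """Shortest hop counts from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentricity(graph, v):
    """Distance from v to the vertex farthest from it (graph assumed connected)."""
    return max(bfs_distances(graph, v).values())

def diameter(graph):
    """Longest shortest path: the largest eccentricity over all vertices."""
    return max(eccentricity(graph, v) for v in graph)

def center(graph):
    """Vertices whose eccentricity is smallest, in sorted order."""
    ecc = {v: eccentricity(graph, v) for v in graph}
    radius = min(ecc.values())
    return sorted(v for v, e in ecc.items() if e == radius)
```

Both quantities come from the same per-vertex eccentricities: the diameter is their maximum and the center is the set of vertices attaining their minimum, which is one way to read "How is this related to diameter?".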