Computation, Information, and Intelligence (COMS/ENGRI/INFO/COGST 172), Fall 2005
10/17/05: Lecture 22 aid - PageRank and hubs ’n’ authorities

Agenda: Further discussion of PageRank and the hubs-and-authorities algorithm.

Announcements: An organizational suggestion: if you haven’t already done so, it might be worthwhile to acquire a three-ring binder and a three-hole punch. Then, if one took notes for each lecture on loose-leaf paper, one could keep one’s lecture notes appropriately interleaved with each lecture’s handouts in the binder.

I. Reminder: the main PageRank equation

Let ε be some number between 0 and 1, and let n be the number of documents.

\[
\mathrm{score}^{(t+1)}(d_j) = \frac{\epsilon}{n} + (1 - \epsilon) \sum_{d \text{ pointing to } d_j} \frac{\mathrm{score}^{(t)}(d)}{\mathrm{outdegree}(d)}
\]

II. Some facts about probabilities

• The probability of a non-impossible event e1 happening and then an event e2 happening is the probability that e1 happens times the probability that e2 happens given that e1 happened.
• The probability of either (but not both) of two mutually exclusive alternative events e1 and e2 happening is the probability of e1 happening plus the probability of e2 happening.
• The sum of the probabilities over all possible mutually exclusive alternatives for a given probabilistic choice must be 1.

III. The “random surfer” model

Upon arriving at a document, the user either chooses to follow an existing hyperlink or to randomly jump to any document on the Web. The two cases have probability (1 − ε) and ε, respectively (note that these sum to 1), and in either case, the choice among the resulting alternatives is made uniformly at random.

We then interpret \(\mathrm{score}^{(t)}(d_j)\) as the probability that the surfer is at document \(d_j\) at time t.

IV. Document set for comparison calculations

We have to use a slightly different example than that from last time’s handout because there’s a slight but annoying technical problem in applying PageRank (even with “mini-links”/random jumps) when there are documents with zero out-degree.

[Figure: link graph over the five documents V, W, X, Y, and Z.]
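The update in Section I, read through the random-surfer model of Section III, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `pagerank`, the uniform starting scores, and the link structure in `EXAMPLE_LINKS` are hypothetical (the links of the handout's document set are not reproduced in the text), so the numbers it prints should not be expected to match the handout's plot.

```python
def pagerank(links, epsilon=0.15, iterations=80):
    """Repeatedly apply
        score(t+1)(dj) = epsilon/n + (1 - epsilon) * sum over d pointing to dj
                         of score(t)(d) / outdegree(d).

    `links` maps each document to the list of documents it points to.
    Assumption: every link target also appears as a key of `links`.
    """
    docs = sorted(links)
    n = len(docs)
    # The handout notes a technical problem with zero out-degree documents,
    # so this sketch simply refuses such input.
    if any(len(targets) == 0 for targets in links.values()):
        raise ValueError("every document needs at least one outgoing link")

    # Assumption: start from the uniform distribution (not stated in the handout).
    score = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        # Random-jump term: probability epsilon, spread uniformly over n documents.
        new_score = {d: epsilon / n for d in docs}
        # Link-following term: probability (1 - epsilon), spread uniformly
        # over the outgoing links of the current document.
        for d, targets in links.items():
            share = (1 - epsilon) * score[d] / len(targets)
            for target in targets:
                new_score[target] += share
        score = new_score
    return score


# Hypothetical link structure on documents V, W, X, Y, Z (illustration only).
EXAMPLE_LINKS = {
    "V": ["W"],
    "W": ["X", "Y"],
    "X": ["Y"],
    "Y": ["Z"],
    "Z": ["V", "Y"],
}

if __name__ == "__main__":
    for doc, s in sorted(pagerank(EXAMPLE_LINKS).items()):
        print(doc, round(s, 4))
```

Because the jump and link-following probabilities sum to 1 and each choice is made uniformly at random, the scores remain a probability distribution at every iteration, in line with the third fact in Section II.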
V. PageRank results

ε = 0.15.

[Figure: "PageRank scores, epsilon=.15". Scores of documents V, W, X, Y, and Z plotted against iteration (0 to 80); vertical axis from 0 to 0.5.]

VI. Hubs-and-authorities results

Since we are only interested in relative comparisons, we have not aligned the axes of the three plots.

[Figure: two plots, "Authority scores" and "Hub scores". Each shows the scores of documents V, W, X, Y, and Z plotted against iteration (0 to 16).]