CS522: Algorithmic and Economic
Aspects of the Internet
Instructors:
Nicole Immorlica
Mohammad Mahdian
Previously in this class

Ranking using the hyperlink structure:


HITS
PageRank
Today
Dealing with web spam
An axiomatic approach to PageRank
Next Lecture: Kamal Jain
Recap

The PageRank of a page p is the probability
of p in the stationary distribution of a random
walk that in each stage with probability 1 – ε
follows a random link from the current page,
and with probability ε, starts from a random
page.

Typically, ε = 0.15.
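
The random walk in the recap can be sketched as a short power iteration. This is a minimal illustration, not the instructors' code: the function name `pagerank`, the dictionary graph format, the fixed iteration count, and the treatment of pages without out-links (they jump to a uniformly random page) are all assumptions made for the example.

```python
def pagerank(graph, eps=0.15, iterations=200):
    """Approximate the stationary distribution of the random walk.

    graph maps each page to the list of pages it links to.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Assumption: a page with no out-links always jumps to a random page.
        dangling = sum(rank[u] for u in nodes if not graph[u])
        # Every page receives an equal share of the restarting mass.
        base = (eps * (1.0 - dangling) + dangling) / n
        new_rank = {v: base for v in nodes}
        for u in nodes:
            for v in graph[u]:
                # With probability 1 - eps, follow a random out-link of u.
                new_rank[v] += (1.0 - eps) * rank[u] / len(graph[u])
        rank = new_rank
    return rank


# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
example = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(example)
print(sorted(ranks, key=lambda v: -ranks[v]))
```

On this toy graph page c ranks first, since it collects links from both a and b.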
The collusion problem

What if a group of nodes “collude” to increase
the PageRank of one or more in the group?

Zhang, Goel, Govindan, Mason, and Van
Roy, WAW 2004.

Define “amplification” of a group of nodes,
and prove that it is always at most O(1/ε).
The collusion problem

Question: Is collusion really a problem?

Experiment (on a web subgraph, and blogstreet):



Take, say, the 1000th and the 1001st nodes in the
PageRank order.
Each of these nodes removes all links to other pages, and
adds a link to the other.
Compute PageRanks in the new graph.

Results: Ranks of the colluding nodes increase
significantly.
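
The experiment can be imitated on a toy graph. The sketch below is only an illustration under stated assumptions: it uses a hypothetical directed 6-cycle instead of a web subgraph or blogstreet, and the helper names `pagerank` and `collude` are invented for this example.

```python
def pagerank(graph, eps=0.15, iterations=200):
    """Power iteration; pages without out-links jump to a random page."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[u] for u in nodes if not graph[u])
        base = (eps * (1.0 - dangling) + dangling) / n
        new_rank = {v: base for v in nodes}
        for u in nodes:
            for v in graph[u]:
                new_rank[v] += (1.0 - eps) * rank[u] / len(graph[u])
        rank = new_rank
    return rank


def collude(graph, a, b):
    """Copy the graph; a and b drop all out-links and link only to each other."""
    new_graph = {u: list(links) for u, links in graph.items()}
    new_graph[a] = [b]
    new_graph[b] = [a]
    return new_graph


# Hypothetical toy graph: 0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 0.
cycle = {i: [(i + 1) % 6] for i in range(6)}
before = pagerank(cycle)
after = pagerank(collude(cycle, 2, 3))
for node in (2, 3):
    print(node, round(before[node], 4), round(after[node], 4))
```

In the cycle every node starts with PageRank 1/6. After nodes 2 and 3 collude, the walk can only leave the pair by restarting, so both of their PageRanks rise well above 1/6.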

Exercise: Go to eBay and search for PageRank.
Finding colluding groups

Approach 1: Find a set S with the largest
amplification.

However, it can be shown that this problem is
NP-hard.
Finding colluding nodes

Approach 2: Identify colluding individuals

Observation: If we increase ε, the PageRank
of a colluding individual decreases (often
proportional to 1/ε).

Heuristic: Compute PageRanks for multiple
values of ε, and compute the correlation of
the PageRank of each node with 1/ε. Nodes
with high correlation are probably colluding.
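
The heuristic can be sketched as follows. The choice of ε values, the use of the Pearson correlation coefficient, and the names `pearson` and `collusion_scores` are assumptions for this example; the slides do not specify them, and no detection threshold is implied.

```python
def pagerank(graph, eps=0.15, iterations=1000):
    """Power iteration; pages without out-links jump to a random page."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[u] for u in nodes if not graph[u])
        base = (eps * (1.0 - dangling) + dangling) / n
        new_rank = {v: base for v in nodes}
        for u in nodes:
            for v in graph[u]:
                new_rank[v] += (1.0 - eps) * rank[u] / len(graph[u])
        rank = new_rank
    return rank


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x * var_y < 1e-30:
        return 0.0  # a (nearly) constant series has no meaningful correlation
    return cov / (var_x * var_y) ** 0.5


def collusion_scores(graph, eps_values=(0.05, 0.1, 0.15, 0.2, 0.3, 0.5)):
    """Correlate each node's PageRank with 1/eps across several eps values."""
    inv_eps = [1.0 / e for e in eps_values]
    runs = [pagerank(graph, eps=e) for e in eps_values]
    return {v: pearson(inv_eps, [r[v] for r in runs]) for v in graph}


# Hypothetical toy graph: a 6-cycle in which nodes 2 and 3 link only to each other.
graph = {0: [1], 1: [2], 2: [3], 3: [2], 4: [5], 5: [0]}
scores = collusion_scores(graph)
suspects = sorted(scores, key=lambda v: -scores[v])
print(suspects[:2])
```

On this toy graph only the colluding pair has a positive correlation with 1/ε; the PageRank of every other node grows with ε, so its correlation is negative.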
Dealing with collusion

We can “punish” colluding individuals by
increasing their ε, so that they cannot pass
their reputation on to others.
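
One way to read this idea is a random walk in which every node has its own restart probability. The sketch below assumes that reading; the function `pagerank_per_node_eps`, the toy graph, and the penalty value 0.9 are hypothetical choices for illustration, not values from the lecture.

```python
def pagerank_per_node_eps(graph, eps, iterations=1000):
    """PageRank where eps[u] is the restart probability used at node u."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Mass that restarts at a random page this step; a page without
        # out-links is assumed to restart with probability 1.
        reset = sum((eps[u] if graph[u] else 1.0) * rank[u] for u in nodes)
        new_rank = {v: reset / n for v in nodes}
        for u in nodes:
            for v in graph[u]:
                # Only a 1 - eps[u] fraction of u's rank flows along its links.
                new_rank[v] += (1.0 - eps[u]) * rank[u] / len(graph[u])
        rank = new_rank
    return rank


# Hypothetical toy graph: a 6-cycle in which nodes 2 and 3 link only to each other.
graph = {0: [1], 1: [2], 2: [3], 3: [2], 4: [5], 5: [0]}
uniform = {v: 0.15 for v in graph}
punishing = dict(uniform)
punishing[2] = punishing[3] = 0.9  # hypothetical penalty for the suspects

base = pagerank_per_node_eps(graph, uniform)
punished = pagerank_per_node_eps(graph, punishing)
for node in (2, 3):
    print(node, round(base[node], 4), round(punished[node], 4))
```

With the larger ε, nodes 2 and 3 hand most of their rank back to the whole graph instead of to each other, so both scores drop.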

Experimental results
Explaining PageRank

Axiomatic approach



Define a set of “natural” axioms
Prove that PageRank satisfies these axioms
Prove that any page ranking algorithm satisfying
these axioms outputs the same ranking as
PageRank
Axiomatic Approaches: Voting

Consider a democracy where people submit
preference lists over candidates.

A voting rule (or social welfare function)
outputs a global ordering of candidates for
every set of preference lists.
Voting Axioms

Unanimity: If everyone prefers the candidate
x to y, then the global ordering also ranks x
above y.

Independence of irrelevant alternatives (IIA):
For any two candidates x and y, changes in
people’s rankings of candidates other than x
and y should not affect the relative position of
x and y in the global ordering.
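
To make IIA concrete, the sketch below uses the Borda count, a voting rule that is not discussed in these slides and is chosen here only as a familiar example. The two five-voter profiles are hypothetical. No voter changes their mind about x versus y between the profiles, yet the global order of x and y flips.

```python
def borda(profile):
    """Borda count: position i in an m-candidate list earns m - 1 - i points."""
    score = {c: 0 for c in profile[0]}
    for ballot in profile:
        for position, candidate in enumerate(ballot):
            score[candidate] += len(ballot) - 1 - position
    # Highest total first; ties broken alphabetically (an arbitrary choice).
    return sorted(score, key=lambda c: (-score[c], c))


def prefers(ballot, a, b):
    """True if this voter ranks a above b."""
    return ballot.index(a) < ballot.index(b)


# Only candidate z moves between the two profiles.
profile_1 = [["x", "y", "z"]] * 3 + [["y", "z", "x"]] * 2
profile_2 = [["x", "z", "y"]] * 3 + [["y", "x", "z"]] * 2

same_xy = all(
    prefers(b1, "x", "y") == prefers(b2, "x", "y")
    for b1, b2 in zip(profile_1, profile_2)
)
print(same_xy, borda(profile_1), borda(profile_2))
```

The first profile yields the order y, x, z and the second yields x, y, z, so the Borda count violates IIA on this pair of profiles.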
Arrow’s (Im)possibility Theorem

Theorem [Arrow, 1951]: With three or more
candidates, the only voting rule satisfying
unanimity and IIA is dictatorship.

Extensions


Similar results hold for social choice functions
where a single candidate (winner) must be chosen
[Muller-Satterthwaite, 1977]
Majority rule arises naturally when we relax IIA or
restrict the preference domain of people (i.e.,
impose rules on how they can rank candidates).
Axiomatic Approach: PageRank

Agents are the nodes of a graph. Each agent
“votes” for other agents, and the votes are
represented by a directed graph G.

A ranking algorithm is a function mapping
every directed graph to an ordering of its
nodes.
PageRank Axioms





Isomorphism: The ranking procedure should be
independent of the names of the nodes.
Self edge: Adding self loops should not harm a node
and should not affect other nodes.
Vote by committee: The importance that a gives to b and c
by voting should not change if a votes via a committee.
Collapsing: If two nodes vote similarly, and are
linked to by disjoint sets of nodes, the ranking does
not change when they are collapsed to one node.
Proxy: There is an equal distribution of importance.
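
The isomorphism axiom is the easiest one to check numerically. The sketch below treats a ranking algorithm as a function from a directed graph to an ordering of its nodes and confirms, on one hypothetical three-node graph, that renaming the nodes only renames the output. It uses the damped PageRank from the recap as a stand-in; the helper names `pagerank`, `ranking`, and `relabel` are assumptions for this example, and a single check is an illustration, not a proof.

```python
def pagerank(graph, eps=0.15, iterations=200):
    """Power iteration; pages without out-links jump to a random page."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[u] for u in nodes if not graph[u])
        base = (eps * (1.0 - dangling) + dangling) / n
        new_rank = {v: base for v in nodes}
        for u in nodes:
            for v in graph[u]:
                new_rank[v] += (1.0 - eps) * rank[u] / len(graph[u])
        rank = new_rank
    return rank


def ranking(graph):
    """A ranking algorithm: directed graph in, ordering of its nodes out."""
    scores = pagerank(graph)
    return sorted(graph, key=lambda v: -scores[v])


def relabel(graph, names):
    """Rename every node of the graph according to the mapping names."""
    return {names[u]: [names[v] for v in links] for u, links in graph.items()}


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
names = {"a": "x", "b": "y", "c": "z"}

original = ranking(graph)
renamed = ranking(relabel(graph, names))
print(original, renamed)
```

Here the ordering of the renamed graph is exactly the original ordering with each name replaced, as the axiom requires.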
PageRank: Altman and Tennenholtz

Theorem: PageRank satisfies these axioms.

Theorem: PageRank is the only ranking
algorithm that satisfies these axioms (i.e., every
other ranking algorithm that satisfies the
axioms outputs the same ranking as PageRank).