The Goldreich-Levin Theorem:
List-decoding the Hadamard code
Outline



Motivation
Probability review
Theorem and proof
Hadamard Codes




[2^n, n, 2^(n-1)]_2 linear code
The encoding of a message x ∈ F^n is given by
all 2^n scalar products <x,y> for y ∈ F^n
(Note: all string-related math here is mod 2.)
Why is the relative distance 1/2?
We will see a probabilistic algorithm that
provides list decoding for Hadamard codes
when up to 1/2-ε of the bits are corrupted
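The definition above can be checked on a small case. The following is a minimal Python sketch (the helper names `inner` and `hadamard_encode` are illustrative, not from the slides) that encodes every 3-bit message and collects the Hamming distances between distinct codewords. Since <x,y> - <z,y> = <x+z,y> is 1 for exactly half of all y when x ≠ z, every pair should differ in 2^(n-1) places, which is where the relative distance 1/2 comes from.

```python
from itertools import product

def inner(x, y):
    """Inner product of two bit tuples, mod 2."""
    return sum(a & b for a, b in zip(x, y)) % 2

def hadamard_encode(x):
    """Codeword of x: the values <x, y> for all y in {0,1}^n, in lexicographic order."""
    return tuple(inner(x, y) for y in product((0, 1), repeat=len(x)))

n = 3
messages = list(product((0, 1), repeat=n))
codewords = {x: hadamard_encode(x) for x in messages}

# Set of Hamming distances over all pairs of distinct codewords.
distances = {
    sum(a != b for a, b in zip(codewords[x], codewords[z]))
    for x in messages for z in messages if x != z
}
print(distances)  # {4}, i.e. 2^(n-1) for n = 3
```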
Low error case: p = 3/4+ε


Unique decoding
Probabilistic algorithm:
Estimate-Had(x):
  For j = 1…k (k to be fixed)
    Choose rj ∈ {0,1}^n uniformly at random
    aj ← f(rj+x) - f(rj)
  Return majority(a1,…,ak)

Now set the ith bit of the solution to
Estimate-Had(ei)
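The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the oracle `f`, the secret `s`, the single corrupted input and the choice k = 201 are all hypothetical test values, and bit strings are represented as tuples.

```python
import random

def inner(x, y):
    """Inner product of two bit tuples, mod 2."""
    return sum(a & b for a, b in zip(x, y)) % 2

def add(x, y):
    """Coordinate-wise sum of two bit tuples, mod 2."""
    return tuple(a ^ b for a, b in zip(x, y))

def estimate_had(f, x, k, rng):
    """Majority vote over k random self-corrections f(r + x) - f(r) (mod 2)."""
    n = len(x)
    votes = 0
    for _ in range(k):
        r = tuple(rng.randrange(2) for _ in range(n))
        votes += f(add(r, x)) ^ f(r)  # subtraction mod 2 is XOR
    return 1 if 2 * votes > k else 0

def decode(f, n, k, rng):
    """Recover the message bit by bit by estimating the value at each unit vector e_i."""
    units = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]
    return tuple(estimate_had(f, e, k, rng) for e in units)

# Hypothetical setup: a 4-bit secret and an oracle that is wrong on 1 of the 16 inputs.
s = (1, 0, 1, 1)
corrupted = {(0, 1, 1, 0)}

def f(y):
    return inner(s, y) ^ (y in corrupted)

recovered = decode(f, n=4, k=201, rng=random.Random(0))
print(recovered)
```

Using an odd k avoids ties in the majority vote.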
Analysis

Analysis:
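Choose rj ∈ {0,1}^n uniformly at random
aj ← f(rj+x) - f(rj)

If both f(rj+x) and f(rj) are correct then
aj = f(rj+x) - f(rj) = <s, rj+x> - <s, rj> = <s,x>

Each of rj+x and rj is uniformly distributed, so each query is wrong
with probability 1-p. Using a union bound we get
Pr[aj ≠ <s,x>] ≤ 2(1-p) = 1/2-2ε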
Analysis (contd.)




Since we take a majority vote of a1,…,ak we
can use the fact that they’re independent to
get a Chernoff bound of at most e^(-kε²) on
the probability of error
The probability of getting some bit wrong is
Pr[Estimate-Had(ei) is wrong for some i] ≤ n·e^(-kε²)
Taking k = O(log n/ε²) gives an O(n log n/ε²)
algorithm with arbitrarily small error
Note that the union bound doubles the per-query error probability, so
this approach doesn’t work with p ≤ 3/4
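As a worked example of the parameter choice above, the sketch below solves n·e^(-kε²) ≤ δ for k. The function name `repetitions` and the target failure probability `delta` are illustrative assumptions, not part of the slides.

```python
import math

def repetitions(n, eps, delta):
    """Smallest integer k with n * exp(-k * eps**2) <= delta."""
    return math.ceil(math.log(n / delta) / eps ** 2)

k = repetitions(n=1000, eps=0.1, delta=0.01)
# Number of repetitions per bit, and the resulting bound on the failure probability.
print(k, 1000 * math.exp(-k * 0.1 ** 2))
```

Halving ε roughly quadruples k, while k grows only logarithmically in n.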
Decoding - The noisy scenario


If the number of corrupted places m is less than d/2 (d is the minimum distance) then there's a unique solution
If d/2<m<d there could be multiple solutions
List Decoding



Fix an (n, k, d) code C, and suppose there is
an unknown message x ∈ F^k
We are given a vector y ∈ F^n which is equal to
the codeword C(x) with at most m of the
places corrupted
Suppose we want to find the possible values
x' ∈ F^k for the original message so that
dH(C(x'),y) ≤ m
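To make the definition concrete, here is a small exhaustive-search sketch on the Hadamard code with 3-bit messages (length 8, minimum distance 4). The function `list_decode` and the two received words are illustrative assumptions. The search is exponential in the message length, so it only serves to show what the output list looks like.

```python
from itertools import product

def inner(x, y):
    """Inner product of two bit tuples, mod 2."""
    return sum(a & b for a, b in zip(x, y)) % 2

def hadamard_encode(x):
    """Codeword of x: the values <x, y> for all y, in lexicographic order."""
    return tuple(inner(x, y) for y in product((0, 1), repeat=len(x)))

def list_decode(y, n, m):
    """All n-bit messages whose codeword is within Hamming distance m of y."""
    result = []
    for x in product((0, 1), repeat=n):
        c = hadamard_encode(x)
        if sum(a != b for a, b in zip(c, y)) <= m:
            result.append(x)
    return result

n = 3
one_flip = (0, 1, 0, 0, 0, 0, 0, 0)   # all-zero codeword with 1 place corrupted
two_flips = (0, 1, 1, 0, 0, 0, 0, 0)  # all-zero codeword with 2 places corrupted

print(list_decode(one_flip, n, 1))    # [(0, 0, 0)]
print(list_decode(two_flips, n, 2))   # [(0, 0, 0), (0, 1, 1), (1, 1, 1)]
```

With one corrupted place (m < d/2) the original message is the only candidate; with two corrupted places three messages are equally close to the received word.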
List decoding Had



Input: function f() that agrees with Had(s) on a p
fraction of the function inputs:
Pr_x[f(x)=<s,x>] = p
Assume calling the function has O(1) cost.
Output: a list of possible messages.
A message is possible..
General case: p = 1/2+ε




List decoding
Theorem (Goldreich-Levin): there exists a
probabilistic algorithm that solves this problem.
Specifically:
Output: List L of strings such that each possible
solution s appears with high probability:
Pr_x[f(x)=<s,x>] ≥ 1/2+ε ⇒ Pr[s ∈ L] ≥ 1/2
Run time: poly(n/ε)
Basic probability theory review



Random variables (discrete)
Expected value (μ)
E(X) = Σ_x x·p(x)
Variance (σ²)
Var(X) = E[(X-E(X))²]
= E[X²] - E[X]²
Binary random variables

Pr(X=1)=p, Pr(X=0)=1-p

Often used as indicator variables

E(X)=…

Var(X) = p(1-p) ≤ 1/4
Majority votes



Consider a probabilistic algorithm that returns a
binary value (0 or 1), with probability > 1/2 of
returning the correct result
We can amplify the probability of getting the correct
answer by calling the algorithm multiple times and
deciding by the majority vote
In order for this to work well there should be some
independence between the algorithm’s results in
each invocation
Independence

Events A1,...,An are independent if
Pr[A1,...,An] = Pr[A1]...Pr[An]

Likewise, random variables X1,...,Xn are
independent if for each possible assignment
x1,...,xn:
Pr[X1=x1,...,Xn=xn] = Pr[X1=x1]...Pr[Xn=xn]
Pairwise independence

A set of r.v.'s (or events) is pairwise
independent if each pair of the set is
independent

Does one type of independence imply the
other?
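Full independence implies pairwise independence, but the converse fails. A standard counterexample, sketched below with exact arithmetic, takes two fair bits X and Y together with Z = X xor Y. The helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips X and Y, plus Z = X xor Y (4 equally likely outcomes).
outcomes = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]

def prob(event):
    """Exact probability of an event under the uniform distribution on outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def pairwise_independent():
    """Check Pr[Vi=a, Vj=b] = Pr[Vi=a]·Pr[Vj=b] for every pair of variables."""
    for i, j in ((0, 1), (0, 2), (1, 2)):
        for a, b in product((0, 1), repeat=2):
            joint = prob(lambda o: o[i] == a and o[j] == b)
            if joint != prob(lambda o: o[i] == a) * prob(lambda o: o[j] == b):
                return False
    return True

def mutually_independent():
    """Check the product rule for all three variables at once."""
    for a, b, c in product((0, 1), repeat=3):
        joint = prob(lambda o: o == (a, b, c))
        marginals = (prob(lambda o: o[0] == a)
                     * prob(lambda o: o[1] == b)
                     * prob(lambda o: o[2] == c))
        if joint != marginals:
            return False
    return True

print(pairwise_independent(), mutually_independent())  # True False
```

Any two of the three variables look like independent fair coins, yet any two of them determine the third.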
Chernoff bound

The probability that a majority of n independent
events occur, each having probability p ≥ 1/2+ε,
has the lower bound
Pr ≥ 1 - exp(-2nε²)
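For a feel of how tight the bound is, the sketch below compares it with the exact binomial probability of a majority when every event has probability exactly 1/2+ε. The function names and the choice ε = 0.1 are illustrative assumptions.

```python
from math import comb, exp

def majority_exact(n, p):
    """Exact probability that more than n/2 of n independent events occur."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))

def chernoff_bound(n, eps):
    """Lower bound 1 - exp(-2 n eps^2) on the same probability."""
    return 1 - exp(-2 * n * eps ** 2)

eps = 0.1
for n in (11, 101, 501):
    # Columns: n, exact majority probability, Chernoff lower bound.
    print(n, round(majority_exact(n, 0.5 + eps), 4), round(chernoff_bound(n, eps), 4))
```

Both columns approach 1 exponentially fast in n, which is why O(log n/ε²) repetitions suffice when the votes are independent.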
Chebyshev inequality

For any r.v. X with expected value μ and
variance σ²:
Pr(|X-μ| ≥ a) ≤ σ²/a²

Can be used to get a lower bound for the
probability of getting a majority of n pairwise
independent events with p ≥ 1/2+ε:
Pr ≥ 1 - 1/(4nε²)
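The majority bound follows from Chebyshev in three short steps, written out below. The notation is an assumption introduced for this derivation: X_i is the indicator of the i-th event and S is the number of events that occur.

```latex
% Assumed notation: X_i is the indicator of the i-th event, S = X_1 + ... + X_n.
\begin{align*}
E[S] &= \sum_{i=1}^{n} E[X_i] \ge n\left(\tfrac12 + \varepsilon\right), \\
\mathrm{Var}(S) &= \sum_{i=1}^{n} \mathrm{Var}(X_i) \le \frac{n}{4}
  \quad\text{(pairwise independence removes the covariance terms)}, \\
\Pr\left[S \le \tfrac{n}{2}\right]
  &\le \Pr\left[\,|S - E[S]| \ge n\varepsilon\,\right]
  \le \frac{\mathrm{Var}(S)}{(n\varepsilon)^2}
  \le \frac{1}{4n\varepsilon^2}.
\end{align*}
```

The failure probability decays only like 1/n here, compared with the exponential decay available under full independence.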
No error case

In this case we can recover the ith bit of the
secret string by computing f(ei) where ei is
the string with 1 at the ith position and 0
everywhere else.
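In code, the noise-free case is a single pass over the unit vectors. The sketch below assumes an error-free oracle `f` built from a hypothetical secret `s`.

```python
def inner(x, y):
    """Inner product of two bit tuples, mod 2."""
    return sum(a & b for a, b in zip(x, y)) % 2

def recover(f, n):
    """Read bit i of the secret as f(e_i), where e_i is the i-th unit vector."""
    return tuple(f(tuple(1 if j == i else 0 for j in range(n))) for i in range(n))

s = (1, 0, 0, 1, 1)            # hypothetical secret string

def f(x):
    return inner(s, x)         # oracle with no errors

print(recover(f, len(s)))      # (1, 0, 0, 1, 1)
```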
The algorithm (almost)



Suppose that we somehow know the values of Had(s) in m
places. Specifically, we are given the strings r1,…,rm and
the values b1,…,bm where bj = <s,rj>
We can then try to compute the value of Had(s) in any x:
Estimate-With-Guess(x , r1,…,rm , b1,…,bm):
  For each J ⊆ {1,...,m} (J ≠ ∅)
    aJ ← f(x + Σ_{j∈J} rj) - Σ_{j∈J} bj
  Return majority of all aJ
Now get the bits of s by calling Estimate-With-Guess with ei
as before
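A minimal Python sketch of Estimate-With-Guess follows. The oracle here is noise-free and the values bj are computed from a known secret, purely so that the output can be checked; both are assumptions made for the demonstration.

```python
import random
from itertools import combinations

def inner(x, y):
    """Inner product of two bit tuples, mod 2."""
    return sum(a & b for a, b in zip(x, y)) % 2

def add(x, y):
    """Coordinate-wise sum of two bit tuples, mod 2."""
    return tuple(a ^ b for a, b in zip(x, y))

def estimate_with_guess(f, x, rs, bs):
    """Majority over all nonempty J of f(x + sum_{j in J} r_j) - sum_{j in J} b_j (mod 2)."""
    m = len(rs)
    ones = total = 0
    for size in range(1, m + 1):
        for J in combinations(range(m), size):
            point, guess = x, 0
            for j in J:
                point = add(point, rs[j])
                guess ^= bs[j]
            ones += f(point) ^ guess
            total += 1
    return 1 if 2 * ones > total else 0   # total = 2^m - 1 is odd, so no ties

rng = random.Random(1)
n, m = 5, 4
s = (1, 0, 0, 1, 1)                        # hypothetical secret

def f(y):
    return inner(s, y)                     # noise-free oracle, for illustration only

rs = [tuple(rng.randrange(2) for _ in range(n)) for _ in range(m)]
bs = [inner(s, r) for r in rs]             # the "somehow known" values <s, rj>
units = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]
print(tuple(estimate_with_guess(f, e, rs, bs) for e in units))  # (1, 0, 0, 1, 1)
```

Only m random strings are drawn, yet the vote is taken over 2^m - 1 points.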
Analysis




The idea here is that due to linearity we can
get the correct values in more places than
we are given
For any J ⊆ {1,...,m} define rJ = Σ_{j∈J} rj.
Then <s, rJ> = <s, Σ_{j∈J} rj> = Σ_{j∈J} <s, rj> = Σ_{j∈J} bj
If the rj's are uniformly random so are the rJ's (for J ≠ ∅)
The probability of getting aJ wrong is
therefore the probability of getting f(x+rJ)
wrong, which is bounded by 1/2-ε
But!



The rJ's are not independent, so the Chernoff
bound can’t be used
However, they are pairwise independent so
we can use Chebyshev
Pr[EWG(x , r1,…,rm , b1,…,bm) ≠ <s,x>] ≤ 1/(2^m ε²)
when the ri's are independent and chosen
uniformly and for each i, bi=<s,ri>

We can recover all bits with an error of at
most n/(2^m ε²). Taking 2^m = O(n/ε²) gives an
O(n²/ε²) algorithm with arbitrarily small error
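As a worked example of this parameter choice, the sketch below picks the smallest m for which the bound n/(2^m ε²) drops below a target δ. The function name `guesses_needed` and the value of `delta` are illustrative assumptions.

```python
import math

def guesses_needed(n, eps, delta):
    """Smallest m with n / (2**m * eps**2) <= delta."""
    return math.ceil(math.log2(n / (delta * eps ** 2)))

m = guesses_needed(n=100, eps=0.1, delta=0.25)
# m, the number 2^m of subsets voted over, and the resulting error bound.
print(m, 2 ** m, 100 / (2 ** m * 0.1 ** 2))
```

Since m grows only logarithmically in n/ε², the number of possible guesses for b1,…,bm stays polynomial.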
Completing the algorithm



We don’t actually have the correct values for the
bi's
But if m is small we can try all 2^m combinations –
one of them must be correct!
The final algorithm:
1. Choose r1,…,rm randomly
2. For each (b1,…,bm) ∈ {0,1}^m:
2.1 For i=1,..,n
ai ← EWG(ei , r1,…,rm , b1,…,bm)
2.2 Output (a1,…,an)

Complexity: O(n³/ε⁴)
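Putting the steps together, the following Python sketch outputs one candidate string per guess. It is an illustration under stated assumptions, not a tuned implementation: the oracle, the secret `s`, the 16 corrupted inputs and the parameters n = 6, m = 8 are hypothetical. The answers f(ei + rJ) are cached because they do not depend on the guess; the vote counting still costs 2^m guesses × n bits × 2^m subsets, matching the complexity above.

```python
import random
from itertools import combinations, product

def inner(x, y):
    """Inner product of two bit tuples, mod 2."""
    return sum(a & b for a, b in zip(x, y)) % 2

def add(x, y):
    """Coordinate-wise sum of two bit tuples, mod 2."""
    return tuple(a ^ b for a, b in zip(x, y))

def goldreich_levin(f, n, m, rng):
    """Return 2^m candidate strings, one per guess (b_1, ..., b_m)."""
    rs = [tuple(rng.randrange(2) for _ in range(n)) for _ in range(m)]
    subsets = [J for size in range(1, m + 1) for J in combinations(range(m), size)]
    # r_J = sum of r_j over j in J, computed once and shared by all guesses.
    r_sub = []
    for J in subsets:
        r = (0,) * n
        for j in J:
            r = add(r, rs[j])
        r_sub.append(r)
    units = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]
    # Query f once per (i, J); the answers do not depend on the guess.
    answers = [[f(add(e, r)) for r in r_sub] for e in units]
    candidates = []
    for bs in product((0, 1), repeat=m):
        b_sub = [sum(bs[j] for j in J) % 2 for J in subsets]
        candidates.append(tuple(
            1 if 2 * sum(a ^ b for a, b in zip(row, b_sub)) > len(subsets) else 0
            for row in answers
        ))
    return candidates

# Hypothetical demo: 6-bit secret, oracle wrong on 16 of the 64 inputs (p = 3/4).
rng = random.Random(0)
s = (1, 0, 1, 1, 0, 1)
corrupted = set(rng.sample(list(product((0, 1), repeat=6)), 16))

def f(y):
    return inner(s, y) ^ (y in corrupted)

candidates = goldreich_levin(f, n=6, m=8, rng=rng)
print(len(candidates), s in candidates)
```

The list always has 2^m entries; whether it contains s depends on the random choice of r1,…,rm.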
Back to the Goldreich-Levin theorem

The only thing we assumed about the
desired output string s was the agreement of
Had(s) with f(). So in fact the algorithm
produces with high probability any string with
the same agreement.
Alternative algorithm?