* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Probability --
Survey
Document related concepts
Transcript
Discrete Math CS 2800 Prof. Bart Selman [email protected] Module Probability --- Part e) 1) The Probabilistic Method 2) Randomized Algorithms 1 The Probabilistic Method 2 2 The Probabilistic Method Method for providing non-constructive existence proofs: Thm. If the probability that a randomly selected element of the set S does not have a particular property is less than 1, then there exists an element in S with this property. Alternatively: If the probability that a random element of S has a particular property is larger than 0, then there exists at least one element with that property in S. Note: We saw an earlier example of the probabilistic method when discussing the 7/8 alg. for 3-CNF. 3 Example: Lower bound for Ramsey numbers Recall the definition of Ramsey number R(k,k): Let R(k,k) be the minimal n such that if the edges of the complete graph on n nodes are colored Red and Blue, then there either is a complete subgraph of k nodes with all edges Red or a complete subgraph of k nodes with all edges Blue. R(3,3) = 6. So, any complete 6 node graphs has either a Red or a Blue triangle. (Proof: see “party problem”.) 4 Reminder: “The party problem” Dinner party of six: Either there is a group of 3 who all know each other, or there is a group of 3 who are all strangers. By contradiction. Assume we have a party of six where no three people all know each other Let’s say she knows 3 others. and no three people are all strangers. She either knows or doesn’t know each other person. If any of those 3 know each other, we have a blue , which means 3 people know each other. Contradicts assumption. So they all must be strangers. But then we have three strangers. Contradicts assumption. But there are 5 other people! So, she knows, or doesn’t know, at least 3 others. (GPH) The case where she doesn’t know 3 others is similar. Also, leads to constradiction. So, such a party does not exist! QED Consider one person. How do we get a lower bound on R(k,k)? E.g., lower bound for R(3,3)? We need to find an n such that the complete graph on n nodes does not contain a Red or a Blue triangle. E.g. So, R(3,3) > 5. I.e., we have a lower bound on the Ramsey Number for k=3. Can we do this for R(k,k) in general? Very difficult to construct the graphs, but… we can prove they exist for non-trivial k. Thm. For k ≥ 4, R(k,k) ≥ 2k/2 Unbiased coloring is not essential. Only matters that each possible coloring has some non-zero probability. So, e.g., k = 20, then there exists a Red/Blue coloring of the complete graph with 2^10 – 1 = 1023 nodes that does not have any complete monochromatic sub graph of size 20. (But we have no idea of how to find such a coloring!) Proof: Consider a sample space where each possible coloring of the n-node complete graph is equally likely. A sample coloring can be obtained by randomly coloring each edge. I.e., with probability ½ set edge to Blue, otherwise Red. Let n < 2k/2 Aside: each particular coloring has probability (1/2) (n * (n-1) /2) We want to show that there is a larger than 0 probability of getting a coloring with no monochromatic k-clique. Hmm… Why? Consider a subset of k nodes. There will be subsets, S_1, S_2, … S_ . of such Let E_i event that S_i is a monochromatic subgraph (a Red or a Blue clique). So, a sample is a randomly selected graph coloring. What is the probability of E_i? Why? Note: number of edges in clique of size k is: So, the probability that the randomly selected coloring will have some monochromatic k-clique is: If we can show that this probability is strictly less than 1, then we know that there must exist some coloring that does not have any monochromatic k-clique! We’ll do this. It’s mainly a matter of rewriting the combinatorial expression for the probability and finding an upper bound. First step: Can we just add up the individual probabilities? But, computing the exact value of very tricky. Why?? is Events are not disjoint and not independent! A coloring can have multiple monochromatic cliques. Also, having e.g. several Blue cliques in part of the graph, makes it more likely to have more Blue cliques on other cliques that share many edges with the Blue cliques. So, all kinds of subtle dependencies! Would need to use the inclusion-exclusion formula (many terms)! Fortunately, we “just” need to upper bound the probability. So, we can truncate the formula to the first set of terms. This gives us Boole’s Inequality: p(E1 U E2) ≤ p(E1) + p(E2) General form, see exercise 15, sect. 6.2. ? So, we get: <1 So, we have “upper bounded” our probability. What’s left? We need to show that the left hand side is strictly less than 1. “Just” combinatorics… O(nk) Note: We have many terms in the sum, but p(E_i) can be pretty small. Ex. 17, 5.4 by assumption: by assumption: So, Note: Proof is non-constructive. No effective method known for finding such coverings! So, when and So, is the probability of having a monochromatic k-clique is strictly less than 1. Therefore, there must exist edge colorings in our sample space that do not have such monochromatic k-cliques! So, R(k,k) ≥ 2k/2 QED Probabilistic Algorithms 14 14 Monte Carlo Algorithm Probabilistic Algorithms --- algorithms that make random choices at one or more steps. Decision problems --- problems for which the answer is True or False. E.g., is this number composite? Monte Carlo algorithms – there is a small probability that they will return an incorrect answer. 15 Probabilistic Primality Test: Miller’s Test Let n be a positive integer and let n-1 = 2s t (i.e. divide out the factors of 2), where s is a non-negative integer and t is an odd positive integer. We say that n passes the Miller test for the base b if either bt ≡1 (mod n) or b 2jt 1(mod n), for some j, 0 j s 1 It can be shown that a composite integer n passes the Miller’s test for fewer than n/4 bases b, with 1 < b < n. 16 Probabilistic Primality Testing Goal of the algorithm is to decide the question – “Is n composite?” The basic structure of randomized primality tests is as follows: 1. Randomly pick a number 1 < b < n. 2. Does n pass the Miller test for b? If n fails the test, then n is a composite number, and the answer is True; b is known as a witness for the compositeness, and the test STOPS. Otherwise the answer is “unknown” 3. Repeat step 1, k times until the required certainty is achieved. After k iterations, if n is not found to be a composite number, then it can be declared probably prime. 17 Probabilistic primality testing Note: This algorithm can only make mistakes when the answer is unknown. The probability that a composite integer n passes the Miller’s test for a randomly selected base b is less than ¼. By repeating it k times, given that the iterations are independent, the probability that n is composite but the algorithm responds that is prime is less than ¼ k 18 Probabilistic primality testing By taking k sufficient large, we can make the probability of error really, really small. 10 iterations less than 1 in 106 30 iterations less than 1 in 1018 If we use n as prime as one of the two primes used in the RSA cryptosystem and n is actually a composite, the procedures used to decrypt messages will not produce the original encrypted message. They key is then discarded and two new possible primes are used. 19 Probability theory is a very rich topic of increasing importance in computer science, but for now, this ends our probabilistic adventures. And, our adventures in Discrete Mathematics!! Of course, you’re still left with the popular party question: How does discrete math differ from indiscrete math?