Background Knowledge
A brief review of:
• Counting
• Probability
• Statistics
• Information Theory

Counting: Permutations
The number of possible permutations of r objects chosen from n objects is
n(n-1)(n-2) … (n-r+1) = n! / (n-r)!
We denote this number as nPr. Recall that the factorial of a number x, written x!, is defined as x! = x(x-1)(x-2) … (2)(1).

Permutations with indistinguishable objects: assume we have a total of n objects, of which r1 are alike, r2 are alike, …, rk are alike. The number of distinct permutations of the n objects is
n! / (r1! r2! … rk!)

Counting: Combinations
Assume we wish to select r objects from n objects, where we do not care about the order in which the r objects are selected. The number of possible combinations of r objects from n objects is
n(n-1)(n-2) … (n-r+1) / r! = n! / [(n-r)! r!]
We denote this number as C(n,r). (A Python sketch of these counting formulas appears after the Multivariate Distributions section below.)

Statistical and Inductive Probability
Statistical: relative frequency of occurrence after many trials.
Inductive: degree of belief in a certain event.
We will be concerned with the statistical view only.
[Figure: law of large numbers; the proportion of heads in repeated coin flips converges to 0.5 as the number of flips grows.]

The Sample Space
The space of all possible outcomes of a given process or situation is called the sample space S.
Example: cars crossing a check point, classified by color and size:
S = {red & small, blue & small, red & large, blue & large}

An Event
An event is a subset of the sample space.
Example: event A, red cars crossing the check point irrespective of size:
A = {red & small, red & large}

The Laws of Probability
The probability of the sample space S is 1: P(S) = 1.
The probability of any event A satisfies 0 <= P(A) <= 1.
Law of addition: if A and B are mutually exclusive events, then the probability that either one of them will occur is the sum of the individual probabilities:
P(A or B) = P(A) + P(B)
If A and B are not mutually exclusive:
P(A or B) = P(A) + P(B) - P(A and B)

Conditional Probabilities
Given that A and B are events in sample space S, and P(B) is different from 0, the conditional probability of A given B is
P(A|B) = P(A and B) / P(B)
If A and B are independent, then P(A|B) = P(A).

The Laws of Probability: Law of Multiplication
What is the probability that both A and B occur together?
P(A and B) = P(A) P(B|A)
where P(B|A) is the probability of B conditioned on A. If A and B are statistically independent, then P(B|A) = P(B), and therefore
P(A and B) = P(A) P(B)

Random Variables
Definition: a variable that can take on several values, each value having a probability of occurrence. There are two types of random variables:
• Discrete: takes on a countable number of values.
• Continuous: takes on a range of values.

Discrete Variables
For every discrete variable X there is a probability function P(x) = P(X = x). The cumulative probability function for X is defined as F(x) = P(X <= x).

Continuous Variables
With every continuous variable X we associate a probability density function f(x), which generalizes the concept of a histogram. The probability of falling in an interval is the area lying between the two endpoints:
Prob(x1 < X <= x2) = ∫_{x1}^{x2} f(x) dx
The cumulative probability function is defined as
F(x) = Prob(X <= x) = ∫_{-∞}^{x} f(u) du

Multivariate Distributions
P(x,y) = P(X = x and Y = y) is the joint distribution of X and Y.
P'(x) = Prob(X = x) = Σ_y P(x,y) is called the marginal distribution of X. The same can be done on Y to define the marginal distribution of Y, P''(y).
If X and Y are independent, then P(x,y) = P'(x) P''(y).
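As a quick check on the counting formulas above, here is a minimal Python sketch using only the standard library (math.perm and math.comb require Python 3.8 or later). The MISSISSIPPI letter counts are just a stock illustration of permutations with indistinguishable objects.

    import math

    n, r = 5, 3

    # Permutations: nPr = n! / (n-r)!
    print(math.perm(n, r))                             # 60
    print(math.factorial(n) // math.factorial(n - r))  # 60, same value

    # Permutations with indistinguishable objects: the letters of
    # "MISSISSIPPI" (11 letters: I x4, S x4, P x2, M x1).
    counts = [4, 4, 2, 1]
    distinct = math.factorial(sum(counts))
    for c in counts:
        distinct //= math.factorial(c)
    print(distinct)                                    # 34650 = 11! / (4! 4! 2! 1!)

    # Combinations: C(n,r) = n! / [(n-r)! r!]
    print(math.comb(n, r))                             # 10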
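The statistical (relative-frequency) view can also be seen numerically. The short simulation sketch below, with an arbitrary seed and arbitrary flip counts, shows the proportion of heads settling toward 0.5 as the number of flips grows, as in the law-of-large-numbers figure.

    import random

    random.seed(0)  # arbitrary seed, for a reproducible run
    for flips in (10, 100, 1_000, 10_000, 100_000):
        heads = sum(random.random() < 0.5 for _ in range(flips))
        print(f"{flips:>7} flips: proportion of heads = {heads / flips:.4f}")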
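To tie together marginal distributions, conditional probability, and the independence test P(x,y) = P'(x) P''(y), here is a toy sketch built on the car example; the joint probabilities are invented purely for illustration.

    # Hypothetical joint distribution P(color, size) for the car example.
    P = {('red', 'small'): 0.20, ('red', 'large'): 0.30,
         ('blue', 'small'): 0.10, ('blue', 'large'): 0.40}

    # Marginal distributions P'(color) and P''(size), found by summing
    # out the other variable.
    P_color, P_size = {}, {}
    for (color, size), p in P.items():
        P_color[color] = P_color.get(color, 0.0) + p
        P_size[size] = P_size.get(size, 0.0) + p
    print({c: round(p, 2) for c, p in P_color.items()})  # {'red': 0.5, 'blue': 0.5}
    print({s: round(p, 2) for s, p in P_size.items()})   # {'small': 0.3, 'large': 0.7}

    # Conditional probability: P(red | small) = P(red and small) / P(small).
    print(round(P[('red', 'small')] / P_size['small'], 4))  # 0.6667

    # Independence would require P(x,y) = P'(x) P''(y) for every cell.
    independent = all(abs(P[(c, s)] - P_color[c] * P_size[s]) < 1e-12
                      for c in P_color for s in P_size)
    print(independent)  # False for these numbers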
Expectations: The Mean
Let X be a discrete random variable that takes the values x1, x2, x3, …, xn, and let P(x1), P(x2), P(x3), …, P(xn) be their respective probabilities. Then the expected value of X, E(X), is defined as
E(X) = x1 P(x1) + x2 P(x2) + x3 P(x3) + … + xn P(xn) = Σ_i xi P(xi)

The Binomial Distribution
What is the probability of getting x successes in n trials? Assumption: all trials are independent and the probability of success remains the same. Let p be the probability of success and let q = 1 - p. Then the binomial distribution is defined as
P(x) = C(n,x) p^x q^(n-x)   for x = 0, 1, 2, …, n
The mean equals np.

The Multinomial Distribution
We can generalize the binomial distribution to a random variable that takes more than just two values. We have n independent trials, and each trial can result in one of k different values with probabilities p1, p2, …, pk. The probability of seeing the first value x1 times, the second value x2 times, and so on (where x1 + x2 + … + xk = n) is
P(x1, x2, …, xk) = [n! / (x1! x2! … xk!)] p1^x1 p2^x2 … pk^xk

Other Distributions
Poisson: P(x) = e^(-μ) μ^x / x!
Geometric: f(x) = p (1-p)^(x-1)
Exponential: f(x) = λ e^(-λx)
Others: Normal, χ², t, and F.

Entropy of a Random Variable
A measure of uncertainty, or entropy, associated with a random variable X is defined as
H(X) = - Σ_i pi log pi
where the logarithm is in base 2. This is the "average amount of information or entropy of a finite complete probability scheme" (Introduction to Information Theory by F. Reza).

Example of Entropy
Consider two mutually exclusive events A and B that together exhaust the outcomes (example: flipping a biased coin).
P(A) = 1/256, P(B) = 255/256: H(X) = 0.0369 bit
P(A) = 1/2, P(B) = 1/2: H(X) = 1 bit
P(A) = 7/16, P(B) = 9/16: H(X) = 0.989 bit

Entropy of a Binary Source
As a function of the probability of one of the two outcomes, the entropy of a binary source is concave downward, reaching its maximum of 1 bit at probability 0.5.
[Figure: entropy of a binary source vs. outcome probability, peaking at 1 bit at 0.5.]

Derived Measures
Average information per pair (joint entropy):
H(X,Y) = - Σ_x Σ_y P(x,y) log P(x,y)
Conditional entropy:
H(X|Y) = - Σ_x Σ_y P(x,y) log P(x|y)
Mutual information:
I(X;Y) = Σ_x Σ_y P(x,y) log [P(x,y) / (P(x) P(y))]
       = H(X) + H(Y) - H(X,Y)
       = H(X) - H(X|Y)
       = H(Y) - H(Y|X)
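The binomial formula, and the definition of the mean, transcribe directly into code. The sketch below assumes a fair coin with n = 10 and p = 0.5 (an arbitrary example) and checks that E(X) = Σ_i xi P(xi) agrees with the stated mean np.

    import math

    def binomial_pmf(x: int, n: int, p: float) -> float:
        # P(x) = C(n,x) p^x q^(n-x), with q = 1 - p
        q = 1.0 - p
        return math.comb(n, x) * p**x * q**(n - x)

    n, p = 10, 0.5
    pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
    print(round(pmf[5], 4))    # 0.2461, the most likely number of successes
    print(round(sum(pmf), 4))  # 1.0: the outcomes form a complete scheme

    # E(X) = sum of x * P(x), which should equal n*p = 5.0
    print(round(sum(x * px for x, px in enumerate(pmf)), 4))  # 5.0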
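The entropy and mutual-information formulas can be verified numerically as well. This sketch reproduces the three biased-coin entropies worked out above and then evaluates I(X;Y) = H(X) + H(Y) - H(X,Y) on a small joint distribution whose values are made up for illustration.

    import math

    def entropy(probs):
        # H = -sum(p * log2(p)); terms with p = 0 contribute nothing.
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(round(entropy([1/256, 255/256]), 4))  # 0.0369 bit
    print(round(entropy([1/2, 1/2]), 4))        # 1.0 bit
    print(round(entropy([7/16, 9/16]), 3))      # 0.989 bit

    # Mutual information on a made-up joint distribution over two
    # binary variables: I(X;Y) = H(X) + H(Y) - H(X,Y).
    P = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
    Px = [P[(0, 0)] + P[(0, 1)], P[(1, 0)] + P[(1, 1)]]  # marginal of X
    Py = [P[(0, 0)] + P[(1, 0)], P[(0, 1)] + P[(1, 1)]]  # marginal of Y
    I = entropy(Px) + entropy(Py) - entropy(P.values())
    print(round(I, 4))  # 0.2781 bit: knowing Y reduces uncertainty about X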