Background Knowledge
Brief Review of:
• Counting
• Probability
• Statistics
• Information Theory
Counting: Permutations
• Permutations:
The number of possible permutations of r objects chosen
from n objects is
n (n-1) (n-2) … (n-r+1) = n! / (n-r)!
• We denote this number as nPr.
Remember that the factorial of a number x, written x!, is defined as
x! = (x) (x-1) (x-2) … (2)(1)
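As a quick sanity check, the count above can be computed directly; a minimal Python sketch (the function name perm_count is mine):

```python
from math import factorial

def perm_count(n, r):
    """Number of permutations of r objects chosen from n: n! / (n-r)!."""
    return factorial(n) // factorial(n - r)

print(perm_count(5, 2))  # 20 = 5 * 4
```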
Counting: Permutations
• Permutations with indistinguishable objects:
• Assume we have a total of n objects.
• r1 are alike, r2 are alike, …, rk are alike.
• The number of possible permutations of the n objects is
n! / (r1! r2! … rk!)
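A small sketch of the same count, assuming the group sizes are given as a list (the example word and function name are mine):

```python
from math import factorial

def multiset_permutations(n, group_sizes):
    """Permutations of n objects where each group of identical objects has the given size."""
    count = factorial(n)
    for r in group_sizes:
        count //= factorial(r)
    return count

# "MISSISSIPPI": 11 letters with M=1, I=4, S=4, P=2 indistinguishable copies.
print(multiset_permutations(11, [1, 4, 4, 2]))  # 34650
```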
Counting: Combinations
• Combinations:
• Assume we wish to select r objects from n objects.
• In this case we do not care about the order in
which we select the r objects.
• The number of possible combinations of r objects
from n objects is
n (n-1) (n-2) … (n-r+1) / r! = n! / ((n-r)! r!)
• We denote this number as C(n,r).
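For reference, Python's standard library exposes this count directly (math.comb, available since Python 3.8):

```python
from math import comb

# C(n, r) = n! / ((n-r)! r!)
print(comb(5, 2))   # 10
print(comb(10, 3))  # 120
```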
Statistical and Inductive Probability
• Statistical:
Relative frequency of occurrence after many trials
• Inductive:
Degree of belief in a certain event
We will be concerned with the statistical view only.
[Figure: law of large numbers: the proportion of heads approaches 0.5 as the number of flips of a coin grows.]
The Sample Space
The space of all possible outcomes of a given process
or situation is called the sample space S.
Example: cars crossing a check point based on color and size:
S = { red & small, blue & small, red & large, blue & large }
An Event
• An event is a subset of the sample space.
Example: Event A: red cars crossing a check point,
irrespective of size:
A = { red & small, red & large } ⊂ S
The Laws of Probability
The probability of the sample space S is 1, P(S) = 1
The probability of any event A is such that 0 <= P(A) <= 1.
Law of Addition
If A and B are mutually exclusive events, then the probability that
either one of them will occur is the sum of the individual probabilities:
P(A or B) = P(A) + P(B)
If A and B are not mutually exclusive:
P(A or B) = P(A) + P(B) – P(A and B)
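A tiny worked check of the addition law on one fair die, treating probability as relative frequency over equally likely outcomes (the helper name prob is mine):

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # one fair die

def prob(event):
    """Probability of an event given as a predicate over the outcomes."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))

p_a = prob(lambda o: o % 2 == 0)             # A: "even"           -> 3/6
p_b = prob(lambda o: o > 4)                  # B: "greater than 4" -> 2/6
p_ab = prob(lambda o: o % 2 == 0 and o > 4)  # A and B = {6}       -> 1/6
print(p_a + p_b - p_ab)                      # P(A or B) = 2/3
```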
Conditional Probabilities
• Given that A and B are events in sample space S, and P(B) is
different from 0, the conditional probability of A given B is
P(A|B) = P(A and B) / P(B)
• If A and B are independent then P(A|B) = P(A)
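A minimal sketch with two fair dice showing the definition of P(A|B) and the independence check (the event definitions and helper names are mine):

```python
from fractions import Fraction

SPACE = [(i, j) for i in range(1, 7) for j in range(1, 7)]  # two fair dice

def prob(event):
    return Fraction(sum(1 for o in SPACE if event(o)), len(SPACE))

a = lambda o: o[0] + o[1] == 7   # A: the sum is 7
b = lambda o: o[0] == 3          # B: the first die shows 3

p_a_given_b = prob(lambda o: a(o) and b(o)) / prob(b)
print(p_a_given_b, prob(a))      # both 1/6, so here P(A|B) = P(A): A and B are independent
```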
The Laws of Probability
• Law of Multiplication
What is the probability that both A and B occur together?
P(A and B) = P(A) P(B|A)
where P(B|A) is the probability of B conditioned on A.
If A and B are statistically independent:
P(B|A) = P(B) and then
P(A and B) = P(A) P(B)
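And a one-line check of the multiplication law for independent events, drawing a single card from a standard 52-card deck:

```python
from fractions import Fraction

p_heart = Fraction(13, 52)          # A: the card is a heart
p_ace = Fraction(4, 52)             # B: the card is an ace
p_ace_of_hearts = Fraction(1, 52)   # A and B
print(p_ace_of_hearts == p_heart * p_ace)  # True: P(A and B) = P(A) P(B)
```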
Random Variable
Definition: A variable that can take on several values,
each value having a probability of occurrence.
• There are two types of random variables:
Discrete: take on a countable number of values.
Continuous: take on a range of values.
Discrete Variables
• For every discrete variable X there will be a probability function
P(x) = P(X = x).
• The cumulative probability function for X is defined as
F(x) = P(X <= x).
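A minimal sketch of P(x) and F(x) for one fair die (the names pmf and cdf are mine):

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # P(x) = P(X = x)

def cdf(x):
    """F(x) = P(X <= x), the cumulative probability function."""
    return sum(p for value, p in pmf.items() if value <= x)

print(cdf(3))  # 1/2
print(cdf(6))  # 1
```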
Random Variable
Continuous Variables:
• Concept of histogram.
• With every continuous variable X we associate a probability density
function f(x). The probability is the area lying between
two values:
Prob(x1 < X <= x2) = ∫_{x1}^{x2} f(x) dx
• The cumulative probability function is defined as
F(x) = Prob(X <= x) = ∫_{-∞}^{x} f(u) du
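A sketch of the "area under f(x)" idea, using an exponential density as the example and a crude midpoint-rule integral (the parameter value and helper names are mine):

```python
import math

lam = 2.0

def f(x):
    """Probability density f(x) = lam * e^(-lam x), an exponential density."""
    return lam * math.exp(-lam * x)

def prob_between(x1, x2, steps=100_000):
    """Approximate Prob(x1 < X <= x2) = integral of f from x1 to x2 (midpoint rule)."""
    width = (x2 - x1) / steps
    return sum(f(x1 + (i + 0.5) * width) for i in range(steps)) * width

print(prob_between(0.5, 1.0))                        # ~0.23254
print(math.exp(-lam * 0.5) - math.exp(-lam * 1.0))   # exact: F(1.0) - F(0.5)
```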
Multivariate Distributions
• P(x,y) = P(X = x and Y = y).
• P'(x) = Prob(X = x) = ∑_y P(x,y)
It is called the marginal distribution of X.
The same can be done on Y to define the marginal
distribution of Y, P''(y).
• If X and Y are independent then
P(x,y) = P'(x) P''(y)
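A sketch of computing marginals from a joint table and testing independence; the joint probabilities below are invented for illustration, reusing the red/blue and small/large example:

```python
joint = {  # P(x, y): color x, size y
    ("red", "small"): 0.30, ("red", "large"): 0.20,
    ("blue", "small"): 0.30, ("blue", "large"): 0.20,
}

p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p   # P'(x)  = sum over y of P(x, y)
    p_y[y] = p_y.get(y, 0.0) + p   # P''(y) = sum over x of P(x, y)

# This particular table factorizes, so X and Y are independent:
print(all(abs(joint[(x, y)] - p_x[x] * p_y[y]) < 1e-12 for (x, y) in joint))  # True
```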
Expectations: The Mean
• Let X be a discrete random variable that takes the following
values:
x1, x2, x3, …, xn.
Let P(x1), P(x2), P(x3), …, P(xn) be their respective
probabilities. Then the expected value of X, E(X), is
defined as
E(X) = x1 P(x1) + x2 P(x2) + x3 P(x3) + … + xn P(xn)
E(X) = Σ_i xi P(xi)
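As a tiny check, the mean of one fair die computed from the definition:

```python
from fractions import Fraction

values = range(1, 7)                                   # x1, ..., x6
expectation = sum(x * Fraction(1, 6) for x in values)  # E(X) = sum_i xi P(xi)
print(expectation)                                     # 7/2
```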
The Binomial Distribution
• What is the probability of getting x successes in n trials?
• Assumption: all trials are independent and the probability of
success remains the same.
Let p be the probability of success and let q = 1-p;
then the binomial distribution is defined as
P(x) = C(n,x) p^x q^(n-x) for x = 0, 1, 2, …, n
The mean equals n p.
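A minimal sketch of the binomial formula and its mean (the function name binomial_pmf is mine):

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = C(n, x) * p^x * q^(n-x) with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(binomial_pmf(3, 10, 0.5))                              # 0.1171875 (3 heads in 10 flips)
print(sum(x * binomial_pmf(x, 10, 0.5) for x in range(11)))  # mean = n p = 5.0
```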
The Multinomial Distribution
• We can generalize the binomial distribution to the case where the
random variable takes more than just two values.
• We have n independent trials. Each trial can result in k different
values with probabilities p1, p2, …, pk.
• What is the probability of seeing the first value x1 times, the
second value x2 times, and so on?
P(x1,x2,…,xk) = [n! / (x1! x2! … xk!)] p1^x1 p2^x2 … pk^xk
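A sketch of the multinomial formula under the same assumptions (the function name is mine):

```python
from math import factorial

def multinomial_pmf(counts, probs):
    """P(x1,...,xk) = [n! / (x1! ... xk!)] p1^x1 ... pk^xk."""
    n = sum(counts)
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)
    value = coeff
    for x, p in zip(counts, probs):
        value *= p**x
    return value

# A fair die rolled 6 times: probability that each face appears exactly once.
print(multinomial_pmf([1] * 6, [1 / 6] * 6))  # ~0.0154
```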
Other Distributions
• Poisson
P(x) = e^(-u) u^x / x!
• Geometric
f(x) = p (1-p)^(x-1)
• Exponential
f(x) = λ e^(-λx)
• Others:
• Normal
• χ², t, and F
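Minimal sketches of the first three formulas above, with u, p, and λ (written lam) as the parameters:

```python
import math

def poisson_pmf(x, u):
    return math.exp(-u) * u**x / math.factorial(x)  # P(x) = e^(-u) u^x / x!

def geometric_pmf(x, p):
    return p * (1 - p)**(x - 1)                     # f(x) = p (1-p)^(x-1), x = 1, 2, ...

def exponential_pdf(x, lam):
    return lam * math.exp(-lam * x)                 # f(x) = lam e^(-lam x), x >= 0

print(poisson_pmf(2, 1.0), geometric_pmf(3, 0.5), exponential_pdf(0.25, 2.0))
```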
Entropy of a Random Variable
The measure of uncertainty, or entropy, associated
with a random variable X is defined as
H(X) = - Σ_i pi log pi
where the logarithm is in base 2.
This is the “average amount of information or entropy of a finite
complete probability scheme” (F. Reza, An Introduction to Information Theory).
Example of Entropy
There are two possible events A and B forming a complete scheme
(example: flipping a biased coin).
• P(A) = 1/256, P(B) = 255/256
H(X) = 0.0369 bit
• P(A) = 1/2, P(B) = 1/2
H(X) = 1 bit
• P(A) = 7/16, P(B) = 9/16
H(X) = 0.989 bit
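A small sketch that reproduces the three values above (the function name entropy is mine):

```python
import math

def entropy(probs):
    """H(X) = -sum_i p_i log2 p_i, with 0 log 0 taken as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(round(entropy([1/256, 255/256]), 4))  # 0.0369
print(entropy([1/2, 1/2]))                  # 1.0
print(round(entropy([7/16, 9/16]), 3))      # 0.989
```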
Entropy of a Binary Source
It is a function concave downward, with a maximum of 1 bit at p = 0.5.
[Figure: the binary entropy function plotted for p between 0 and 1.]
Derived Measures
Average information per pair (joint entropy):
H(X,Y) = - Σ_x Σ_y P(x,y) log P(x,y)
Conditional Entropy:
H(X|Y) = - Σ_x Σ_y P(x,y) log P(x|y)
Mutual Information:
I(X;Y) = Σ_x Σ_y P(x,y) log [P(x,y) / (P(x) P(y))]
= H(X) + H(Y) – H(X,Y)
= H(X) – H(X|Y)
= H(Y) – H(Y|X)
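A sketch that computes H(X,Y), H(X|Y), and I(X;Y) from a small joint table and checks the identities above; the joint distribution is made up for illustration:

```python
import math

joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}  # P(x, y), invented

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

xs = {x for x, _ in joint}
ys = {y for _, y in joint}
p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}   # marginal of X
p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}   # marginal of Y

h_xy = H(joint.values())
h_x, h_y = H(p_x.values()), H(p_y.values())
print(h_x + h_y - h_xy)    # I(X;Y), about 0.278 bits here
print(h_xy - h_y)          # H(X|Y) = H(X,Y) - H(Y), about 0.722 bits
print(h_x - (h_xy - h_y))  # H(X) - H(X|Y) gives I(X;Y) again
```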