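Permutations & Combinations and Distributions

Krishna V. Palem
Kenneth and Audrey Kennedy Professor of Computing
Department of Computer Science, Rice University

Contents

Permutations and Combinations
Calculating probabilities using combinations
Distribution
Proof of Law of Large Numbers
Binomial Distribution
Normal Distribution

Law of Large Numbers

The law of large numbers (LLN) describes the long-term stability of the mean of a random variable. Given a random variable with a finite expected value, if its values are sampled repeatedly, then as the number of observations increases, their mean tends to approach and stay close to the expected value. For example, consider the coin toss experiment: the frequency of heads (or tails) will increasingly approach 50% over a large number of trials.

Mathematically, if the mean of the first $n$ observations is $\bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$, then

$$\bar{X}_n \to \mu \quad \text{as } n \to \infty.$$

Proof of Law of Large Numbers

First, let us derive the Chebyshev inequality, which simplifies the derivation of the law of large numbers.

Chebyshev Inequality: Let $X$ be a discrete random variable with expected value $\mu = E(X)$, and let $\varepsilon > 0$ be any positive real number. Then

$$P(|X - \mu| \ge \varepsilon) \le \frac{V(X)}{\varepsilon^2},$$

where $V(X)$ is the variance of $X$.

Proof of Chebyshev Inequality

Let $m(x)$ denote the distribution function of $X$. Then the probability that $X$ differs from $\mu$ by at least $\varepsilon$ is given by

$$P(|X - \mu| \ge \varepsilon) = \sum_{|x - \mu| \ge \varepsilon} m(x).$$

We know that

$$V(X) = \sum_{x} (x - \mu)^2 \, m(x).$$

But $V(X)$ is clearly at least as large as

$$\sum_{|x - \mu| \ge \varepsilon} (x - \mu)^2 \, m(x).$$

Replacing $(x - \mu)^2$ with $\varepsilon^2$ to get a lower bound,

$$V(X) \ge \sum_{|x - \mu| \ge \varepsilon} \varepsilon^2 \, m(x) = \varepsilon^2 \, P(|X - \mu| \ge \varepsilon).$$

Hence, we get

$$P(|X - \mu| \ge \varepsilon) \le \frac{V(X)}{\varepsilon^2}.$$

Proof of Law of Large Numbers

Let $X_1, X_2, \ldots, X_n$ be an independent trials process, with finite expected value $\mu = E(X_j)$ and finite variance $\sigma^2 = V(X_j)$. Let $\bar{X}_n$ be the mean of $X_1, X_2, \ldots, X_n$. Hence,

$$V(X_1 + X_2 + \cdots + X_n) = n\sigma^2.$$

Equivalently,

$$V(\bar{X}_n) = \frac{\sigma^2}{n}, \qquad \text{and also } E(\bar{X}_n) = \mu.$$

But from Chebyshev's inequality, we have

$$P(|X - \mu| \ge \varepsilon) \le \frac{V(X)}{\varepsilon^2}.$$

Replacing $X$ with $\bar{X}_n$, we get

$$P(|\bar{X}_n - \mu| \ge \varepsilon) \le \frac{\sigma^2}{n\varepsilon^2}.$$

Hence, we get

$$P(|\bar{X}_n - \mu| < \varepsilon) \ge 1 - \frac{\sigma^2}{n\varepsilon^2}.$$

As $n$ approaches infinity, the right-hand side approaches 1.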
Hence, for any $\varepsilon > 0$, we have obtained

$$\lim_{n \to \infty} P(|\bar{X}_n - \mu| < \varepsilon) = 1.$$

Binomial Distribution

The binomial distribution is the discrete probability distribution of the number of successes in a sequence of $n$ independent yes/no experiments, each of which yields success with probability $p$:

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} \quad \text{for } k = 0, 1, 2, \ldots, n,$$

where

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$

is called the 'Binomial Coefficient'.

Binomial Distribution

The binomial distribution is a very interesting distribution in the sense that it can be applied in a wide variety of practical situations.

An example: assume 5% of a very large population to be green-eyed. You pick 40 people randomly. The number of green-eyed people you pick is a random variable $X$ which follows a binomial distribution with $n = 40$ and $p = 0.05$.

Let us see how this distribution varies with different values of $n$ and $p$ with respect to $X$.

[Figure: binomial distribution for the previous example. Horizontal axis: X = number of green-eyed people; vertical axis: probability.]

For the previous example, this graph shows the variation in probability. Notice how it peaks near the mean ($np = 2$) and dies away toward the ends.

Another elementary example of a binomial distribution is: roll a standard die ten times and count the number of sixes. Denote the number of sixes by the random variable $X$. The distribution of this random number $X$ is a binomial distribution with $n = 10$ and $p = 1/6$. Can you plot this distribution and see how it varies with $X$?

In-Class Exercise

Let us try out an example of a binomial distribution: consider rolling a standard die 20 times.

Q) Denote the number of times the outcome of the roll is an even number by a random variable X. Compute $P(X = 8)$, the probability of getting exactly eight even numbers in the 20 rolls.

Q) Denote the number of times the outcome of the roll is '6' by the random variable Y.
Compute $P(Y = 4)$, the probability of getting the outcome '6' exactly 4 times in the 20 rolls.

Q) Denote the number of times the outcome of the roll is '2' by the random variable Z. Compute $P(Z \le 4)$, the probability that the outcome '2' appears at most 4 times in the 20 rolls.

Use the binomial distribution to solve these questions.

Attributes of Binomial Distribution

If $X \sim B(n, p)$ (that is, $X$ is a binomially distributed random variable with $n$ events (trials) in total and probability of success $p$ in each event), then:

Expected value or mean of $X$ is $E(X) = np$
Variance of $X$ is $V(X) = np(1 - p)$
Standard deviation of $X$ is $\sqrt{np(1 - p)}$

Video on Binomial Distribution: A Summary

Deriving the Expectation of Binomial Distribution

If $X \sim B(n, p)$, then the expected value of $X$ is

$$E(X) = np.$$

We apply the definition of the expected value of a discrete random variable to the binomial distribution:

$$E(X) = \sum_{k=0}^{n} k \binom{n}{k} p^k (1 - p)^{n - k}.$$

The first term in the summation (for $k = 0$) equals 0 and can be removed. In the rest of the summation, we expand the $\binom{n}{k}$ term:

$$E(X) = \sum_{k=1}^{n} k \, \frac{n!}{k!\,(n - k)!} \, p^k (1 - p)^{n - k} = \sum_{k=1}^{n} \frac{n \, (n - 1)!}{(k - 1)!\,(n - k)!} \, p \cdot p^{k - 1} (1 - p)^{n - k}.$$

Since $n$ and $p$ are independent of the sum, we get

$$E(X) = np \sum_{k=1}^{n} \binom{n - 1}{k - 1} p^{k - 1} (1 - p)^{(n - 1) - (k - 1)}.$$

Let $m = n - 1$ and $s = k - 1$. The limits are changed accordingly:

$$E(X) = np \sum_{s=0}^{m} \binom{m}{s} p^{s} (1 - p)^{m - s}.$$

This has the form of the binomial theorem, $(x + y)^m = \sum_{s=0}^{m} \binom{m}{s} x^{m - s} y^{s}$, with $x = 1 - p$ and $y = p$ (here $m$ and $s$ play the roles usually written as $n$ and $k$). Hence, as $(x + y) = ((1 - p) + p) = 1$, we get

$$E(X) = np \cdot 1^m = np.$$

Derivation of Variance of Binomial Distribution

We have seen that variance is equal to

$$V(X) = E(X^2) - (E(X))^2.$$

In using this formula we see that we now also need the expected value of $X^2$:

$$E(X^2) = \sum_{k=0}^{n} k^2 \binom{n}{k} p^k (1 - p)^{n - k}.$$

We can use the experience gained before in deriving the mean. We know how to process one factor of $k$. This gets us as far as

$$E(X^2) = np \sum_{s=0}^{m} (s + 1) \binom{m}{s} p^{s} (1 - p)^{m - s}$$

(again, with $m = n - 1$ and $s = k - 1$).
We split the sum into two separate sums and we recognize each one:

$$E(X^2) = np \left[ \sum_{s=0}^{m} s \binom{m}{s} p^{s} (1 - p)^{m - s} + \sum_{s=0}^{m} \binom{m}{s} p^{s} (1 - p)^{m - s} \right].$$

The first sum is identical in form to the one we calculated for the mean (above); it sums to $mp$. The second sum is unity. Using this result in the expression for the variance, along with the mean ($E(X) = np$), we get

$$V(X) = np(mp + 1) - (np)^2 = np\big((n - 1)p + 1\big) - (np)^2 = np(1 - p).$$

In-Class Exercise

Let us continue the previous example of the binomial distribution: consider rolling a standard die 100 times instead of 20 times.

Q) Denote the number of times the outcome of the roll is '2' by the random variable X. Compute $P(X \ge 60)$ for this event. (Difficult)

What if we roll the die a million times and need to compute the probability that X is greater than or equal to 100,000? (Impossible!)

How to Compute Distributions for Large 'N'?

Abraham de Moivre noted that the shape of the binomial distribution approaches a very smooth curve as the number of events increases; he considered a coin toss experiment. De Moivre tried to find a mathematical expression for this curve, so that probabilities involving a large number of events could be found more easily. This led to the discovery of the normal curve.

Example by De Moivre

Coin toss experiment, with random variable X = number of heads. As the number of events 'N' increases, the distribution can be approximated by a curve.

Video on Galton Board Game

Demonstrates how the binomial distribution gives rise to a normal (Gaussian) distribution as the number of trials/events tends to infinity.

Video on Normal Distribution (first 2 mins only)

Normal Distribution

To indicate that a real-valued random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2 > 0$, we write

$$X \sim N(\mu, \sigma^2).$$

The normal distribution is defined by the following equation:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}.$$

All normal distributions are symmetric and have bell-shaped density curves with a single peak.
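De Moivre's observation can be checked numerically. The following is a minimal sketch, assuming a hypothetical experiment of n = 100 fair coin tosses; the function names `binomial_pmf` and `normal_pdf` are illustrative and do not come from the slides. It prints the binomial probability of k heads next to the normal density with the same mean np and standard deviation sqrt(np(1-p)).

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), using the binomial coefficient C(n, k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x; sigma is the standard deviation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Assumed parameters for illustration: 100 tosses of a fair coin.
n, p = 100, 0.5
mu = n * p                          # mean np
sigma = math.sqrt(n * p * (1 - p))  # standard deviation sqrt(np(1-p))

# Compare the discrete probabilities with the continuous curve.
for k in (40, 45, 50, 55, 60):
    print(k, round(binomial_pmf(k, n, p), 5), round(normal_pdf(k, mu, sigma), 5))
```

The two columns should agree closely around the mean, which is the sense in which the binomial shape "approaches a smooth curve" for large N. Note that `normal_pdf` takes the standard deviation, not the variance.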
Note: The normal distribution is a continuous probability distribution, while the binomial distribution is a discrete probability distribution.

In-Class Exercise

Let us try out an example of a normal distribution: consider a coin toss experiment with 1000 tosses.

Q) Denote the number of times the outcome of the toss is heads by a random variable X. Compute $P(X \le 600)$, the probability that heads occurs at most 600 times.

How would you use the binomial distribution to solve this question?

A) $$P(X \le 600) = \sum_{k=0}^{600} \binom{1000}{k} \left(\frac{1}{2}\right)^{1000} \quad \text{(Difficult)}$$

How would you use the normal distribution to solve this question?

A) Since the original event follows a binomial distribution and we use the normal distribution to approximate it, we can use $\mu = np$ and $\sigma^2 = np(1 - p)$. Hence, for $x \le 600$: $\mu = 1000 \cdot \frac{1}{2} = 500$ and $\sigma^2 = 1000 \cdot \frac{1}{2} \cdot (1 - \frac{1}{2}) = 250$, so $\sigma = \sqrt{250} \approx 15.81$. Substituting this in the normal distribution equation, we get

$$P(X \le 600) \approx \int_{0}^{600} \frac{1}{\sqrt{2\pi \cdot 250}} \, e^{-\frac{(x - 500)^2}{2 \cdot 250}} \, dx.$$

Calculating, with $z = (600 - 500)/15.81 \approx 6.32$, we get probability of $x \le 600 \approx 1$ (the tail above 600 is of order $10^{-10}$).

Source of calculation: http://stattrek.com/Tables/Normal.aspx

Examples of Few Applications of Normal Distribution

Approximately normal distributions occur in many situations:

In counting problems: binomial random variables, associated with yes/no questions; Poisson random variables, associated with rare events.
In physiological measurements of biological specimens: the logarithm of measures of size of living tissue (length, height, weight); the length of inert appendages (hair, claws, nails, teeth) of biological specimens, in the direction of growth.
Measurement errors.
Financial variables.
Light intensity: the intensity of laser light is normally distributed.
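The 1000-toss exercise above is small enough to check both ways on a computer. The sketch below is one possible way to do it, not the method used in the slides: `binomial_cdf` sums the exact binomial terms with rational arithmetic, and `normal_cdf` evaluates the normal approximation through the error function `math.erf`. Both function names are assumptions made for this illustration.

```python
import math
from fractions import Fraction


def binomial_cdf(x, n, p):
    """Exact P(X <= x) for X ~ B(n, p), summing C(n, k) p^k (1-p)^(n-k)."""
    p = Fraction(p)  # exact rational arithmetic avoids floating-point underflow
    total = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))
    return float(total)


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2); sigma is the standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Parameters of the exercise: 1000 tosses of a fair coin, at most 600 heads.
n, p, x = 1000, 0.5, 600
mu = n * p                          # 500
sigma = math.sqrt(n * p * (1 - p))  # sqrt(250), about 15.81

exact = binomial_cdf(x, n, p)
approx = normal_cdf(x, mu, sigma)
print("z-score:", (x - mu) / sigma)
print("exact binomial:", exact)
print("normal approximation:", approx)
```

The point to watch is that `normal_cdf` expects the standard deviation $\sqrt{np(1-p)}$, not the variance $np(1-p)$; passing 250 instead of about 15.81 would give a very different answer.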
In-Class Exercise

Let us try out the previously stated "nearly impossible" problem using a normal distribution: consider a coin toss experiment with 1,000,000 tosses.

Q) Denote the number of times the outcome of the toss is heads by a random variable X. Compute $P(X \le 100{,}000)$, the probability that heads occurs at most 100,000 times.

How would you use the binomial distribution to solve this question?

A) $$P(X \le 100{,}000) = \sum_{k=0}^{100{,}000} \binom{1{,}000{,}000}{k} \left(\frac{1}{2}\right)^{1{,}000{,}000} \quad \text{(Difficult)}$$

How would you use the normal distribution to solve this question?

In-Class Exercise

The original event follows a binomial distribution, and we can use the normal distribution to approximate it. We know that $\mu = np$ and $\sigma^2 = np(1 - p)$. Hence, for $x \le 100{,}000$: $\mu = 1{,}000{,}000 \cdot \frac{1}{2} = 500{,}000$ and $\sigma^2 = 1{,}000{,}000 \cdot \frac{1}{2} \cdot (1 - \frac{1}{2}) = 250{,}000$, so $\sigma = 500$. Substituting this in the normal distribution equation, we get

$$P(X \le 100{,}000) \approx \int_{0}^{100{,}000} \frac{1}{\sqrt{2\pi \cdot 250{,}000}} \, e^{-\frac{(x - 500{,}000)^2}{2 \cdot 250{,}000}} \, dx.$$

Calculating the integral with limits from 0 to 100,000, with $z = (100{,}000 - 500{,}000)/500 = -800$, we get probability of $x \le 100{,}000 \approx 0$ (800 standard deviations below the mean, far smaller than any tabulated value).

Source of calculation: http://stattrek.com/Tables/Normal.aspx

Examples of Few Applications of Normal Distribution

Approximately normal distributions occur in many situations:

In counting problems: binomial random variables, associated with yes/no questions; Poisson random variables, associated with rare events.
In sports statistical analyses: calculating mean physical attributes such as heights and weights, and their standard deviations; estimating the probabilities of winning games.
Measurement errors.
Financial variables.
Light intensity: the intensity of laser light is normally distributed.

END

Example Application of Bayes Theorem