DISCRETE PROBABILITY DISTRIBUTIONS
(ENGSTAT Notes of AM Fillone)

Discrete Uniform Distribution

A discrete uniform distribution is one in which the random variable assumes each of its values with equal probability.

Discrete Uniform Distribution. If the random variable X assumes the values x1, x2, …, xk with equal probabilities, then the discrete uniform distribution is given by

    f(x; k) = 1/k,   x = x1, x2, …, xk.

Theorem 4.1. The mean and variance of the discrete uniform distribution f(x; k) are

    \mu = \frac{1}{k}\sum_{i=1}^{k} x_i   and   \sigma^2 = \frac{1}{k}\sum_{i=1}^{k} (x_i - \mu)^2.

Binomial and Multinomial Distributions

The Bernoulli Process. The Bernoulli process must possess the following properties:
1. The experiment consists of n repeated trials.
2. Each trial results in an outcome that may be classified as a success or a failure.
3. The probability of success, denoted by p, remains constant from trial to trial.
4. The repeated trials are independent.

Binomial Distribution. A Bernoulli trial can result in a success with probability p and a failure with probability q = 1 - p. The probability distribution of the binomial random variable X, the number of successes in n independent trials, is

    b(x; n, p) = \binom{n}{x} p^x q^{n-x},   x = 0, 1, 2, …, n.

Theorem 4.2. The mean and variance of the binomial distribution b(x; n, p) are \mu = np and \sigma^2 = npq.

Multinomial Distribution. If a given trial can result in the k outcomes E1, E2, …, Ek with probabilities p1, p2, …, pk, then the probability distribution of the random variables X1, X2, …, Xk, representing the number of occurrences of E1, E2, …, Ek in n independent trials, is

    f(x_1, x_2, \ldots, x_k; p_1, p_2, \ldots, p_k, n) = \binom{n}{x_1, x_2, \ldots, x_k} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k},

with \sum_{i=1}^{k} x_i = n and \sum_{i=1}^{k} p_i = 1.

Hypergeometric Distribution. The probability distribution of the hypergeometric random variable X, the number of successes in a random sample of size n selected from N items of which k are labeled success and N - k are labeled failure, is

    h(x; N, n, k) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}},   x = 0, 1, 2, …, n.

Theorem 4.3. The mean and variance of the hypergeometric distribution h(x; N, n, k) are

    \mu = \frac{nk}{N}   and   \sigma^2 = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right).

(A numerical check of this pmf and of Theorem 4.3 is sketched below, after the Negative Binomial Distribution.)

Multivariate Hypergeometric Distribution. If N items can be partitioned into the k cells A1, A2, …, Ak with a1, a2, …, ak elements, respectively, then the probability distribution of the random variables X1, X2, …, Xk, representing the number of elements selected from A1, A2, …, Ak in a random sample of size n, is

    f(x_1, x_2, \ldots, x_k; a_1, a_2, \ldots, a_k, N, n) = \frac{\binom{a_1}{x_1}\binom{a_2}{x_2}\cdots\binom{a_k}{x_k}}{\binom{N}{n}},

with \sum_{i=1}^{k} x_i = n and \sum_{i=1}^{k} a_i = N.

Negative Binomial Distribution. If repeated independent trials can result in a success with probability p and a failure with probability q = 1 - p, then the probability distribution of the random variable X, the number of the trial on which the kth success occurs, is given by

    b*(x; k, p) = \binom{x-1}{k-1} p^k q^{x-k},   x = k, k + 1, k + 2, …
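As a quick numerical check of the hypergeometric pmf h(x; N, n, k) and Theorem 4.3 above, the following minimal sketch evaluates the formula directly and compares it with scipy.stats.hypergeom. It assumes a Python environment with SciPy installed, and the parameter values are purely illustrative; note that SciPy orders its arguments as (population size, successes in population, sample size), which differs from the (N, n, k) order used in these notes.

    # Minimal sketch (assumes SciPy is installed; parameter values are illustrative).
    from math import comb, isclose
    from scipy.stats import hypergeom

    N, n, k = 50, 10, 12   # population size, sample size, items labeled success

    def h(x, N, n, k):
        # h(x; N, n, k) = C(k, x) C(N-k, n-x) / C(N, n)
        return comb(k, x) * comb(N - k, n - x) / comb(N, n)

    # SciPy's hypergeom takes (M, n, N) = (population, successes in population, sample size).
    for x in range(n + 1):
        assert isclose(h(x, N, n, k), hypergeom.pmf(x, N, k, n), abs_tol=1e-12)

    mu = n * k / N                                         # Theorem 4.3 mean
    var = (N - n) / (N - 1) * n * (k / N) * (1 - k / N)    # Theorem 4.3 variance
    assert isclose(mu, hypergeom.mean(N, k, n))
    assert isclose(var, hypergeom.var(N, k, n))
    print(mu, var)   # 2.4, approximately 1.489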
Geometric Distribution. If repeated independent trials can result in a success with probability p and a failure with probability q = 1 - p, then the probability distribution of the random variable X, the number of the trial on which the first success occurs, is given by

    g(x; p) = p q^{x-1},   x = 1, 2, 3, …

Theorem 4.4. The mean and variance of a random variable following the geometric distribution are given by

    \mu = 1/p   and   \sigma^2 = (1 - p)/p^2.

Poisson Distribution and the Poisson Process

Experiments yielding numerical values of a random variable X, the number of outcomes occurring during a given time interval or in a specified region, are often called Poisson experiments. A Poisson experiment is derived from the Poisson process and possesses the following properties:
1. The number of outcomes occurring in one time interval or specified region is independent of the number that occurs in any other time interval or region of space. In this sense the Poisson process has no memory.
2. The probability that a single outcome will occur during a very short time interval or in a small region is proportional to the length of the time interval or the size of the region, and does not depend on the number of outcomes occurring outside this time interval or region.
3. The probability that more than one outcome will occur in such a short time interval or fall in such a small region is negligible.

The number X of outcomes occurring in a Poisson experiment is called a Poisson random variable and its probability distribution is called the Poisson distribution.

Poisson Distribution. The probability distribution of the Poisson random variable X, representing the number of outcomes occurring in a given time interval or specified region denoted by t, is given by

    p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!},   x = 0, 1, 2, …,

where \lambda is the average number of outcomes per unit time or region and e = 2.71828 …

Theorem 4.5. The mean and variance of the Poisson distribution p(x; \lambda t) both have the value \lambda t.

Theorem 4.6. Let X be a binomial random variable with probability distribution b(x; n, p). When n → ∞, p → 0, and \mu = np remains constant, b(x; n, p) → p(x; \mu).

Derivation of the Binomial Probability Distribution

Consider a binomial experiment consisting of n trials and represented by the symbols SFSFFFSSSF…SFS, where the letter in the ith position (left to right) denotes the outcome of the ith trial.

Objective: to find the probability p(x) of observing x successes in the n trials.

Procedure:
1. Sum the probabilities of all simple events that contain x successes (S's) and (n - x) failures (F's), that is, sequences of the form SS…S FF…F (x S's followed by (n - x) F's) or some other arrangement of these symbols.
2. Since the trials are independent, the probability of a particular simple event implying x successes is P(SS…S FF…F) = p^x q^{n-x}.
3. The number of these equiprobable simple events is equal to the number of ways of selecting x positions (trials) for the x S's from a total of n positions, namely \binom{n}{x} = \frac{n!}{x!(n-x)!}.
4. The sum of the probabilities of these simple events is
   p(x) = (number of simple events implying x successes) × (probability of one of these equiprobable simple events), or
   p(x) = \binom{n}{x} p^x q^{n-x}.
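The counting argument in steps 1-4 can be checked by brute force for a small n: enumerate every sequence of S's and F's, add up the probabilities of the simple events containing x successes, and compare the totals with \binom{n}{x} p^x q^{n-x}. A minimal sketch follows; the values of n and p are illustrative.

    # Brute-force check of the binomial derivation above for small n.
    from itertools import product
    from math import comb, isclose

    n, p = 5, 0.3
    q = 1 - p

    # Steps 1-2: each simple event is an arrangement of S's and F's; by
    # independence its probability is p^x q^(n-x), where x is its number of S's.
    total = [0.0] * (n + 1)
    for outcome in product("SF", repeat=n):
        x = outcome.count("S")
        total[x] += p**x * q**(n - x)

    # Steps 3-4: there are C(n, x) such equiprobable simple events, so each sum
    # should equal b(x; n, p) = C(n, x) p^x q^(n-x).
    for x in range(n + 1):
        assert isclose(total[x], comb(n, x) * p**x * q**(n - x))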
Derivation of the Multinomial Probability Distribution p(n1, n2, …, nk)

Let k = 3 categories.
1. Let the three outcomes corresponding to the k = 3 categories be denoted A, B, and C, with respective category probabilities p1, p2, and p3. Any observation of the outcome of n trials results in a simple event such as

   Trial:   1  2  3  4  5  6  …  n
   Outcome: C  A  A  B  A  C  …  B

   where the outcome of each trial is indicated by the letter that was observed.
2. Now consider a simple event that results in x1 A outcomes, x2 B outcomes, and x3 C outcomes, where x1 + x2 + x3 = n. The probability of the simple event
   AA…A BB…B CC…C   (x1 A's, x2 B's, x3 C's)
   is p_1^{x_1} p_2^{x_2} p_3^{x_3}.
3. The number of simple events that imply x1 A's, x2 B's, and x3 C's in the sample space S is equal to the number of different ways that the x1 A's, x2 B's, and x3 C's can be arranged in the n distinct positions of the setup above, namely \frac{n!}{x_1! x_2! x_3!}.
4. It then follows that the probability of observing x1 A's, x2 B's, and x3 C's in n trials is the sum of the probabilities of these simple events:
   P(x_1, x_2, x_3) = \frac{n!}{x_1! x_2! x_3!} p_1^{x_1} p_2^{x_2} p_3^{x_3}.
   (A small numerical check of this formula appears at the end of these derivations.)

Derivation of the Negative Binomial Distribution

1. Every simple event that results in x trials until the rth success contains (x - r) F's and r S's: the first (x - 1) positions hold (x - r) F's and (r - 1) S's in some order, for example F F S F F … S F, and the xth position holds the rth S.
2. The number of different simple events that result in (x - r) F's before the rth S is the number of ways of arranging the (x - r) F's and (r - 1) S's, namely
   \binom{(x-r)+(r-1)}{x-r} = \binom{x-1}{x-r}.
3. Since the probability associated with each of these simple events is p^r q^{x-r}, we have
   p(x) = \binom{x-1}{x-r} p^r q^{x-r},
   which agrees with b*(x; r, p) above, since \binom{x-1}{x-r} = \binom{x-1}{r-1}.

Derivation of the Hypergeometric Probability Distribution

Note: the total number of simple events in S is equal to the number of ways of selecting n elements from N, namely \binom{N}{n}. Here r denotes the number of items labeled success (written k in the definition above).
1. A simple event implying x successes is a selection of n elements in which x are S's and (n - x) are F's. Since there are r S's from which to choose, the number of different ways of selecting x of them is a = \binom{r}{x}.
2. The number of ways of selecting (n - x) F's from among the total of (N - r) is b = \binom{N-r}{n-x}.
3. The number of ways of selecting x S's and (n - x) F's, that is, the number of simple events implying x successes, is a · b = \binom{r}{x}\binom{N-r}{n-x}.
4. Finally, since the selection of any one set of n elements is as likely as any other, all the simple events are equiprobable, and thus
   p(x) = (number of simple events that imply x successes) / (number of simple events) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}.
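The k = 3 multinomial formula derived above can be spot-checked numerically. A minimal sketch follows; it assumes SciPy is installed, and the counts and category probabilities are illustrative.

    # Direct evaluation of P(x1, x2, x3) from steps 3-4 of the multinomial
    # derivation, cross-checked against scipy.stats.multinomial.
    from math import factorial, isclose
    from scipy.stats import multinomial

    n = 8
    p1, p2, p3 = 0.5, 0.3, 0.2    # category probabilities, sum to 1
    x1, x2, x3 = 4, 3, 1          # observed counts, sum to n

    # (number of arrangements) * (probability of one arrangement)
    prob = (factorial(n) / (factorial(x1) * factorial(x2) * factorial(x3))
            * p1**x1 * p2**x2 * p3**x3)

    assert isclose(prob, multinomial.pmf([x1, x2, x3], n=n, p=[p1, p2, p3]))
    print(prob)   # 0.0945

ENGSTAT Notes of AM Fillone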