Stat 141  DISCRETE RANDOM VARIABLES AND DISTRIBUTIONS (continued)  10/12/04

• Discrete random variables, variances, probability mass functions and c.d.f.'s. Comparisons between expected frequencies (theoretical pmf) and observed frequencies.
• Binomial examples
• Poisson distribution
• Poisson approximation to the Binomial

Announcements:
Homework 2: due Thu. October 14, 2004; on the web.
Lab 1: due Fri. October 15, 2004 (or Monday 18); on the web.
Next Thursday's lecture: continuous random variables, density functions, c.d.f.'s. Examples: the uniform, exponential and normal density families. Read chapter 3 and the beginning of chapter 4.

Variance

E[X] does not say anything about the spread of the values. Spread is measured by the variance

    Var(X) = E[(X − µ)²],  where µ = E[X],

which can also be written

    Var(X) = E[X²] − (E[X])² = σ².

The unit in which the variance is measured is not the same as that of X, so we very often use the standard deviation

    SD(X) = √Var(X) = σ.

For a linear change of variable:

    Var(aX + b) = a² Var(X),   SD(aX + b) = |a| SD(X).

Example for the Binomial: expected frequencies in samples of 5 insects from a large population where the infected proportion is 40% (2423 samples in all).

    Y     C(5,Y)   0.4^Y     0.6^(5−Y)   p(Y)      f_exp     f_obs
    0        1     1.00000   0.07776     0.07776    188.41     202
    1        5     0.40000   0.12960     0.25920    628.04     643
    2       10     0.16000   0.21600     0.34560    837.39     817
    3       10     0.06400   0.36000     0.23040    558.26     535
    4        5     0.02560   0.60000     0.07680    186.09     197
    5        1     0.01024   1.00000     0.01024     24.81      29
    Total                                1.00000   2423.0     2423
    Mean                                 2.00000   2.000004   1.98721
    Std Dev                              1.09545   1.09543    1.11934

Mean of the Binomial: µ = np. Standard deviation: σ = √(npq).

Consequence: when we don't know the parameter p, we estimate it from the sample:

    p̂ = (# infected insects) / (total # insects).

We call this the estimator; it is actually the maximum likelihood estimator, the value of p that makes the data the most likely.
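The expected-frequency column of the table can be reproduced with R's built-in binomial pmf (a quick check; the sample size 2423 and the parameters n = 5, p = 0.4 are taken from the notes above):

```r
# Binomial(5, 0.4) pmf and expected frequencies for 2423 samples of 5 insects
p.Y   <- dbinom(0:5, size = 5, prob = 0.4)   # theoretical p(Y) column
f.exp <- 2423 * p.Y                          # expected-frequency column
round(data.frame(Y = 0:5, p.Y, f.exp), 2)

# theoretical mean and sd: np = 2, sqrt(npq) = sqrt(5 * 0.4 * 0.6)
mu    <- sum(0:5 * p.Y)
sigma <- sqrt(sum((0:5 - mu)^2 * p.Y))
```

Rounding f.exp to two decimals gives 188.41, 628.04, 837.39, ..., matching the f_exp column above.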
For the Binomial, E[X] = np and E[X²] = np + n(n − 1)p², so that the variance is

    Var(X) = E[X²] − (E[X])² = np + n(n − 1)p² − (np)² = n(p − p²) = npq.

Odds Ratios and Mode

The odds of k successes relative to (k − 1) successes are:

    P(X = k) / P(X = k − 1) = ((n − k + 1)/k) · (p/q).

This is very useful for computing the probability mass of the binomial by recursion.

Property: For X a B(n, p) random variable with probability of success p neither 0 nor 1, as k varies from 0 to n, P(X = k) first increases monotonically and then decreases monotonically (it is unimodal), reaching its highest value when k is the largest integer less than or equal to (n + 1)p, i.e. k = floor((n + 1)p). Indeed,

    P(X = k) ≥ P(X = k − 1)  is equivalent to  (n − k + 1)p ≥ k(1 − p),  iff  (n + 1)p ≥ k.

The value where the probability mass function takes on its maximum is called the mode.

Poisson random variable

Motivation: situations occur where an event happens at random over a period of time: a tap drips a drop about every 5 minutes, or a police office receives emergency calls, or typos appear on a page. We have to take a period of time over which the rate is roughly unchanged (not like the police calls in the early morning vs. the late afternoon).

Poisson Distribution

Large samples, small p: what happens to the Binomial situation? Suppose that p ≈ 2/1000 and we do 1000 trials; how many 'successes' will we see? Around 2, so the expected value of the rv is 2. What is the probability mass distribution like? Theoretically (compare with the R output below):

    (1 − p)^1000 = (1 − 2/1000)^1000 ≈ 0.135 ≈ exp(−2)
    C(1000,1) p (1 − p)^999 = 1000 · (2/1000) · 0.135 = 2 × 0.135 = 2 exp(−2)
    C(1000,2) p² (1 − p)^998 = (1000·999/2) (2/1000)² · 0.135 ≈ (2²/2) exp(−2)
    C(1000,3) p³ (1 − p)^997 = (1000·999·998/(3·2)) (2/1000)³ · 0.135 ≈ (2³/3!) exp(−2)
    ...
    C(1000,r) p^r (1 − p)^(1000−r) ≈ (2^r / r!) exp(−2)
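The odds-ratio recursion described above gives the whole binomial pmf without computing any binomial coefficients; a minimal sketch in R (the function name binom.pmf is ours, not a standard one):

```r
# Build the Binomial(n, p) pmf by the recursion
#   P(X = k) = P(X = k-1) * ((n - k + 1)/k) * (p/q),  starting from P(X = 0) = q^n
binom.pmf <- function(n, p) {
  q  <- 1 - p
  pk <- numeric(n + 1)
  pk[1] <- q^n                                  # P(X = 0)
  for (k in 1:n)
    pk[k + 1] <- pk[k] * ((n - k + 1) / k) * (p / q)
  pk
}

pk <- binom.pmf(5, 0.4)
all.equal(pk, dbinom(0:5, 5, 0.4))   # agrees with the built-in pmf
which.max(pk) - 1                    # the mode: floor((5 + 1) * 0.4) = 2
```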
> (1-2/1000)^1000
[1] 0.1350645
> exp(-2)
[1] 0.1353353
> (1-2/1000)^999
[1] 0.135332
> (1-2/1000)^998
[1] 0.1356064
> (1-2/1000)^997
[1] 0.1358782
> pbinom(0:9,1000,0.002)
 [1] 0.13506 0.40573 0.67668 0.8573 0.94753 0.98354 0.99551
 [8] 0.9989 0.99976 0.99995
> ppois(0:9,2)
 [1] 0.13533 0.40600 0.67668 0.8571 0.94734 0.98344 0.9955
 [8] 0.9989 0.99976 0.99995
> ppois(0:9,2)-pbinom(0:9,1000,0.002)
 [1]  2.707608e-04  2.709416e-04 -9.040431e-08 -1.807784e-04 -1.806276e-04
 [6] -1.082439e-04 -4.802677e-05 -1.711494e-05 -5.120733e-06 -1.323387e-06

Most useful approximation:

    (1 − λ/n)^n ≈ exp(−λ).

So for np < 5, p small and n large, we always use the Poisson approximation to the Binomial.

The mean of the Poisson, with distribution given by the formula below, is µ = λ = np. For the variance, consider the Binomial approximation where np = µ: the variance of the Binomial is σ² = npq, so the variance of the Poisson will be σ² = µ(1 − p) ≈ µ. In fact the Poisson has its mean and variance equal, and this provides a way of testing whether data follow a Poisson distribution.

Examples: the number of weevils in azuki beans; men killed by horse kicks in the Prussian army (Ȳ = 0.610); the number of double sixes in 100 throws of 2 dice.

Definition: A discrete random variable taking on values 0, 1, 2, ... with the probability mass function

    P(X = k) = λ^k e^(−λ) / k!

is called a Poisson random variable. We can check this is a probability mass function because

    Σ_{k=0}^∞ e^(−λ) λ^k / k! = e^(−λ) e^λ = 1.

Expectation:

    E[X] = Σ_{k=0}^∞ k · e^(−λ) λ^k / k! = λ Σ_{k=1}^∞ e^(−λ) λ^(k−1) / (k−1)! = λ.

Variance: E[X²] = λ(λ + 1), so that Var(X) = E[X²] − λ² = λ.

Geometric

This is the number of trials until a success is obtained in a sequence of Bernoulli(p) trials:

    P(X = k) = p q^(k−1).

Expectation: E[X] = 1/p. Variance: Var(X) = (1 − p)/p².

Continuous Random Variables

A continuous random variable takes on an uncountable infinity of possible values; here we will define it with the help of a density function.
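The geometric moments can be checked by simulation; a sketch (note that R's rgeom counts the failures before the first success, so we add 1 to match the definition above, which counts trials up to and including the first success):

```r
# Simulate Geometric(p) = number of trials up to and including the first success
set.seed(1)
p <- 0.25
x <- rgeom(100000, prob = p) + 1   # +1: rgeom counts failures, not trials
mean(x)                            # close to E[X]   = 1/p        = 4
var(x)                             # close to Var(X) = (1-p)/p^2  = 12
```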
A density is a continuous non-negative function f defined on all the reals and such that its integral is equal to 1. Then

    P{X ∈ B} = ∫_B f(x) dx.

For B = [a, b] an interval:

    P{a ≤ X ≤ b} = ∫_a^b f(x) dx.

If we take b = a, we see that for all a the probability that a continuous random variable takes on that exact value is 0; this is the big difference with discrete random variables:

    P{a ≤ X ≤ a} = ∫_a^a f(x) dx = 0.

This implies:

    P{a ≤ X ≤ b} = P{a ≤ X < b} = P{a < X < b}.

Intuitively, for a very small width δx the probability will be proportional to the density at x:

    P{x ≤ X ≤ x + δx} ≈ f(x) δx.

The density is the relative concentration of the variable along the horizontal axis. If the vertical value of the density is high at a point, it means that a class containing this point will have a high expected frequency. The expected frequency of a class is measured as the area under the curve between the two vertical lines delimiting the class. One can never observe that a continuous rv equals exactly 3, for instance, since that event has probability 0.

Examples with simulations

Here we show intuitively how the density arises as the limit of a histogram.

Example 1: one uniform random variable chosen between 0 and 1, and the sum of two uniform random variables, X = U1 + U2:

unif1=runif(100000)
hist(unif1)
unif2=apply(matrix(unif1,nrow=2,byrow=T),2,sum)
hist(unif2)

[Figure: histogram of unif1, roughly flat on (0, 1); histogram of unif2, triangular on (0, 2).]

Sum of four uniform random variables, X = U1 + U2 + U3 + U4:

unif4=apply(matrix(unif1,nrow=4,byrow=T),2,sum)
hist(unif4)

[Figure: histogram of unif4, bell-shaped on (0, 4).]

Cumulative Distribution Function

Definition: Let X be a continuous real-valued random variable; its cumulative distribution function is

    F_X(x) = P(X ≤ x).

Theorem: If X has a density f(x), then:

    1. F(x) = ∫_{−∞}^x f(t) dt is the cdf, and
    2. d/dx F(x) = f(x).
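Both parts of the theorem can be illustrated numerically; a sketch in R using the built-in standard normal density (dnorm) and cdf (pnorm), one of the density families coming in the next lecture:

```r
# 1. F(x) = integral of f from -Inf to x
F1 <- integrate(dnorm, lower = -Inf, upper = 1)$value
c(F1, pnorm(1))                    # the two values agree

# 2. d/dx F(x) = f(x), via a symmetric difference quotient
h  <- 1e-5
dF <- (pnorm(1 + h) - pnorm(1 - h)) / (2 * h)
c(dF, dnorm(1))                    # the two values agree
```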