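Ch5. Probability Densities II
Dr. Deshi Ye
[email protected]

5.4 Other Prob. Distribution
Uniform distribution: equally likely outcomes.

$$f(x) = \begin{cases} \dfrac{1}{\beta - \alpha} & \text{for } \alpha < x < \beta \\ 0 & \text{elsewhere} \end{cases}$$

Mean of uniform:

$$\mu = \int_{\alpha}^{\beta} x \cdot \frac{1}{\beta - \alpha}\,dx = \frac{\alpha + \beta}{2}$$

Variance of uniform:

$$\mu_2' = \int_{\alpha}^{\beta} x^2 \cdot \frac{1}{\beta - \alpha}\,dx = \frac{\alpha^2 + \alpha\beta + \beta^2}{3}, \qquad \sigma^2 = \mu_2' - \mu^2 = \frac{(\beta - \alpha)^2}{12}$$

Ex.
Students believe that they will get final scores between 80 and 100. Suppose that the final scores given by the instructors have a uniform distribution. What is the probability that a student gets a final score no less than 85?

Solution
The density is $f(x) = \dfrac{1}{100 - 80} = \dfrac{1}{20} = 0.05$ for $80 \le x \le 100$, so

$$P(85 \le x \le 100) = (\text{Base})(\text{Height}) = (100 - 85)(0.05) = 0.75$$

5.6 The Log-Normal Distr.
Log-Normal distribution:

$$f(x) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\beta}\, x^{-1} e^{-(\ln x - \alpha)^2 / (2\beta^2)} & \text{for } x > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases}$$

It has a long right-hand tail. By letting $y = \ln x$,

$$P(a < X < b) = \int_{\ln a}^{\ln b} \frac{1}{\sqrt{2\pi}\,\beta}\, e^{-(y - \alpha)^2 / (2\beta^2)}\,dy$$

Hence

$$P(a < X < b) = F\!\left(\frac{\ln b - \alpha}{\beta}\right) - F\!\left(\frac{\ln a - \alpha}{\beta}\right)$$

where $F$ is the standard normal distribution function.

Mean of Log-Normal
Mean and variance are

$$\mu = e^{\alpha + \beta^2/2}, \qquad \sigma^2 = e^{2\alpha + \beta^2}\left(e^{\beta^2} - 1\right)$$

Proof. Start from

$$\mu = \int_0^{\infty} x \cdot \frac{1}{\sqrt{2\pi}\,\beta}\, x^{-1} e^{-(\ln x - \alpha)^2 / (2\beta^2)}\,dx$$

and substitute $y = \ln x$.

Gamma distribution

$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta} & \text{for } x > 0,\ \alpha > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases}$$

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx, \qquad \Gamma(\alpha) = (\alpha - 1)\,\Gamma(\alpha - 1)$$

$\Gamma(\alpha) = (\alpha - 1)!$ when α is a positive integer.

Mean and Variance: $\mu = \alpha\beta$, $\sigma^2 = \alpha\beta^2$

The Exponential Distribution
By letting $\alpha = 1$ in the Gamma distribution:

$$f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta} & \text{for } x > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases}$$

Mean and Variance: $\mu = \beta$, $\sigma^2 = \beta^2$

5.8 The Beta Distribution
Used when a random variable takes on values in the interval [0, 1].

$$f(x) = \begin{cases} \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1} & \text{for } 0 < x < 1,\ \alpha > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases}$$

Mean and Variance:

$$\mu = \frac{\alpha}{\alpha + \beta}, \qquad \sigma^2 = \frac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$$

Beta distribution
Is used extensively in Bayesian statistics
Models events that are constrained to take place within an interval defined by a minimum and a maximum value
Is used extensively in PERT, CPM, and project management

5.9 Weibull Distribution

$$f(x) = \begin{cases} \alpha\beta\, x^{\beta - 1} e^{-\alpha x^{\beta}} & \text{for } x > 0,\ \alpha > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases}$$

Mean and Variance:

$$\mu = \alpha^{-1/\beta}\, \Gamma\!\left(1 + \frac{1}{\beta}\right), \qquad \sigma^2 = \alpha^{-2/\beta} \left\{ \Gamma\!\left(1 + \frac{2}{\beta}\right) - \left[\Gamma\!\left(1 + \frac{1}{\beta}\right)\right]^2 \right\}$$

Weibull distribution
Is most commonly used in life data analysis
Manufacturing and delivery times in industrial engineering
Fading channel modeling in wireless communication

5.10 Joint distribution
Experiments are conducted where two or more random variables are observed simultaneously in order to determine not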
only their individual behavior but also the degree of relationship between them.

Two discrete random variables

$$f(x_1, x_2) = P(X_1 = x_1,\ X_2 = x_2)$$

This is the probability that $X_1$ takes the value $x_1$ and $X_2$ takes the value $x_2$.

EX.

              x1 = 0   x1 = 1   x1 = 2
    x2 = 0     0.1      0.4      0.1
    x2 = 1     0.2      0.2      0

Marginal probability distributions

$$f_1(x_1) = P(X_1 = x_1) = \sum_{\text{all } x_2} f(x_1, x_2)$$

EX.

              x1 = 0   x1 = 1   x1 = 2
    x2 = 0     0.1      0.4      0.1
    x2 = 1     0.2      0.2      0
    f1(x1)     0.3      0.6      0.1

Conditional Probability distribution
The conditional probability of $X_1$ given that $X_2 = x_2$ is

$$f_1(x_1 \mid x_2) = \frac{f(x_1, x_2)}{f_2(x_2)} \quad \text{for all } x_1, \text{ provided } f_2(x_2) \ne 0$$

If the two random variables are independent, then

$$f_1(x_1 \mid x_2) = f_1(x_1) \quad \text{for all } x_1 \text{ and } x_2$$

$$f(x_1, x_2) = f_1(x_1)\, f_2(x_2)$$

EX.
With reference to the previous example, find the conditional probability distribution of $X_1$, given that $X_2 = 1$. Are $X_1$ and $X_2$ independent?

Solution.

$$f_1(0 \mid 1) = \frac{f(0, 1)}{f_2(1)} = \frac{0.2}{0.4} = 0.5$$

$$f_1(1 \mid 1) = \frac{f(1, 1)}{f_2(1)} = \frac{0.2}{0.4} = 0.5$$

$$f_1(2 \mid 1) = \frac{f(2, 1)}{f_2(1)} = \frac{0}{0.4} = 0$$

Since $f_1(0 \mid 1) = 0.5 \ne 0.3 = f_1(0)$, $X_1$ and $X_2$ are dependent.

Continuous variables
If $X_1, X_2, \ldots, X_k$ are $k$ continuous random variables, we refer to $f(x_1, x_2, \ldots, x_k)$ as the joint probability density of these random variables:

$$P(a_1 \le X_1 \le b_1, \ldots, a_k \le X_k \le b_k) = \int_{a_k}^{b_k} \cdots \int_{a_2}^{b_2} \int_{a_1}^{b_1} f(x_1, x_2, \ldots, x_k)\,dx_1\,dx_2 \cdots dx_k$$

EX. P179.

$$f(x_1, x_2) = \begin{cases} 6 e^{-2x_1 - 3x_2} & \text{for } x_1 > 0,\ x_2 > 0 \\ 0 & \text{elsewhere} \end{cases}$$

Find the probability that the first random variable takes a value between 1 and 2 and the second random variable takes a value between 2 and 3.

Marginal density
Marginal density of $X_1$:

$$f_1(x_1) = \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_2$$

For the previous example:

$$f_1(x_1) = \int_0^{\infty} 6 e^{-2x_1 - 3x_2}\,dx_2 = 2 e^{-2x_1} \quad \text{for } x_1 > 0$$

Distribution function

$$F(x_1, x_2) = \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} f(s, t)\,dt\,ds$$

$$F_1(x_1) = F(x_1, \infty), \qquad F_2(x_2) = F(\infty, x_2)$$

Independent
Two random variables are independent if and only if

$$f(x_1, x_2) = f_1(x_1)\, f_2(x_2) \quad \text{for all } (x_1, x_2)$$

Properties of Expectation
Consider a function g(x) of a single random variable X. For example: g(x) = 9x/5 + 32.
If X has probability density f(x), then the mean or expectation of g(X) is given by

$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f(x)\,dx$$

or, in the discrete case,

$$E[g(X)] = \sum_{\text{all } x_i} g(x_i)\, f(x_i)$$

Properties of Expectation
If a and b are constants, then (with D denoting the variance)

$$E[aX + b] = a\,E[X] + b, \qquad D[aX + b] = a^2 D[X]$$

Proof. Both in the continuous and the discrete case.

Covariance
The covariance of $X_1$ and $X_2$ is used to measure the relationship between them:

$$E[(X_1 - \mu_1)(X_2 - \mu_2)]$$

Theorem. When $X_1$ and $X_2$ are independent, their covariance is 0:

$$E[(X_1 - \mu_1)(X_2 - \mu_2)] = 0$$

5.11 Checking Normal
Question: does a data set appear to be generated by a normally distributed random variable?
Collect data: the last 4 digits of students' mobile numbers.

Simple approach
A histogram can be checked for lack of symmetry.
A single long tail certainly contradicts the assumption of a normal distribution.

Normal scores plot
Also called Q-Q plot, normal quantile plot, normal order plot, or rankit plot.
Normal scores: an idealized sample from the standard normal distribution. It consists of the values of z that divide the axis into equal probability intervals.
For example, for n = 4:

$$m_1 = -z_{0.2} = -0.84, \quad m_2 = -z_{0.4} = -0.25, \quad m_3 = z_{0.4} = 0.25, \quad m_4 = z_{0.2} = 0.84$$

Steps to construct normal score plot
1) Order the data from smallest to largest: $d_1 \le d_2 \le \cdots \le d_n$
2) Obtain the normal scores $m_1, m_2, \ldots, m_n$
3) Plot the i-th ordered observation $d_i$ versus the i-th normal score $m_i$, for all i. That is, plot the points $(d_i, m_i)$.

Normal scores in Minitab
In Minitab, the normal scores are calculated in a different way. The i-th normal score is

$$\Phi^{-1}\!\left(\frac{i - 3/8}{n + 1/4}\right)$$

where $\Phi^{-1}(x)$ is the inverse cumulative distribution function of the standard normal.

Property of Q-Q plot
If the data come from a normal distribution, the normal scores plot will resemble a straight line. For data from a standard normal distribution, it resembles a 45° line through the origin.

5.12 Transform observation to near normality
When the histogram or normal scores plot indicates that the assumption of a normal distribution is invalid, transformations of the data can often improve the agreement with normality.
Make large values smaller: $-\dfrac{1}{x}$, $\ln x$, $x^{1/4}$, $\sqrt{x}$
Make large values larger: $x^2$, $x^3$

Simulation
Suppose we need to simulate values from the normal distribution with specified $\mu$ and $\sigma^2$.
From $z = \dfrac{x - \mu}{\sigma}$, the value x can be calculated from the value of a standard normal variable z as $x = \mu + \sigma z$.
1) z can be obtained from the value of a uniform variable u by numerically solving u = F(z)
2) Box-Muller-Marsaglia method

Box-Muller-Marsaglia
It starts with a pair of independent uniform variables $u_1$ and $u_2$, and produces two standard normal variables:

$$z_1 = \sqrt{-2 \ln(u_2)}\, \cos(2\pi u_1), \qquad z_2 = \sqrt{-2 \ln(u_2)}\, \sin(2\pi u_1)$$

Then

$$x_1 = \mu + \sigma z_1, \qquad x_2 = \mu + \sigma z_2$$

Simulation from exponential distribution
Suppose we wish to simulate an observation from the exponential distribution

$$F(x) = 1 - e^{-0.3x}, \quad 0 < x < \infty$$

The computer would first produce the value u from the uniform distribution. Then, solving u = F(x) for x,

$$x = -\frac{\ln(1 - u)}{0.3}$$

Population and sample
When investigating a physical phenomenon, production process, or manufactured unit, the objects under study share some common characteristics. Relevant data must be collected.
Unit: the source of each measurement. A single entity, usually an object or person.
Population: the entire collection of units.

[Figure: a sample drawn from a population]

Key terms
Population: all items of interest
Sample: portion of the population
Parameter: summary measure about the population
Statistic: summary measure about the sample

Examples

    Population                                       Unit      Variables
    All students currently enrolled in school        student   GPA, number of credits
    All books in library                             book      Replacement cost

Sample
Statistical population: the set of all measurements corresponding to each unit in the entire population of units about which information is sought.
Sample: a sample from a statistical population is the subset of measurements that are actually collected in the course of investigation.
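The key terms above (population, sample, parameter, statistic) can be made concrete with a small sketch. Everything in it is an assumption for illustration: the GPA values are synthetic numbers from a random generator, not real data, and `draw_sample` is a hypothetical helper name.

```python
import random
import statistics


def draw_sample(population, n, seed=0):
    """Return a simple random sample of n measurements, drawn without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical statistical population: one synthetic GPA measurement per unit (student).
_rng = random.Random(42)
population = [round(_rng.uniform(2.0, 4.0), 2) for _ in range(1000)]

# The sample is the subset of measurements actually collected.
sample = draw_sample(population, 50)

parameter = statistics.mean(population)  # summary measure about the population
statistic = statistics.mean(sample)      # summary measure about the sample

print(f"population mean (parameter): {parameter:.3f}")
print(f"sample mean (statistic):     {statistic:.3f}")
```

The parameter is fixed once the population is fixed, while the statistic changes with the sample that happens to be drawn (try a different `seed`).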
Sample
Needs to be representative of the population.
Needs to be large enough to contain sufficient information to answer the question about the population.

Discussion
P10, Review Exercises 1.2
A radio-show host announced that she wanted to know which singer was the favorite among college students in your school. Listeners were asked to call and name their favorite singer. Identify the population, in terms of preferences, and the sample. Is the sample likely to be representative? Comment. Also describe how to obtain a sample that is likely to be more representative.
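Returning to the discrete example of Section 5.10, the marginal and conditional calculations can be checked with a short sketch. The joint probabilities are the ones from that example; the function names (`marginal`, `conditional_x1_given_x2`, `is_independent`) are hypothetical helpers, not part of any library.

```python
import math

# Joint probabilities f(x1, x2) from the discrete example in Section 5.10,
# keyed as (x1, x2).
joint = {
    (0, 0): 0.1, (1, 0): 0.4, (2, 0): 0.1,
    (0, 1): 0.2, (1, 1): 0.2, (2, 1): 0.0,
}


def marginal(joint, axis):
    """Marginal distribution of variable `axis` (0 for X1, 1 for X2):
    sum the joint probabilities over the other variable."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


def conditional_x1_given_x2(joint, x2):
    """f1(x1 | x2) = f(x1, x2) / f2(x2), defined only when f2(x2) != 0."""
    f2 = marginal(joint, axis=1)[x2]
    if f2 == 0:
        raise ValueError("f2(x2) must be nonzero")
    return {a: p / f2 for (a, b), p in joint.items() if b == x2}


def is_independent(joint):
    """True when f(x1, x2) = f1(x1) * f2(x2) holds for every pair."""
    f1, f2 = marginal(joint, 0), marginal(joint, 1)
    return all(
        math.isclose(p, f1[a] * f2[b], abs_tol=1e-12)
        for (a, b), p in joint.items()
    )


print(marginal(joint, 0))                 # marginal distribution f1(x1)
print(conditional_x1_given_x2(joint, 1))  # conditional distribution of X1 given X2 = 1
print(is_independent(joint))              # whether the product rule holds
```

Floating-point sums such as 0.1 + 0.2 are not exactly 0.3, which is why the independence check compares with a tolerance rather than `==`.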
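The normal scores from Section 5.11 can also be computed directly. The sketch below, which assumes Python 3.8+ for `statistics.NormalDist`, implements both the equal-probability scores from the n = 4 example and the approximation quoted on the Minitab slide. The function names and the four data values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def equal_probability_scores(n):
    """Values of z that split the axis into n + 1 equal-probability intervals."""
    return [STD_NORMAL.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]


def minitab_style_scores(n):
    """Scores from the formula on the Minitab slide:
    Phi^{-1}((i - 3/8) / (n + 1/4)) for i = 1..n."""
    return [STD_NORMAL.inv_cdf((i - 3 / 8) / (n + 1 / 4)) for i in range(1, n + 1)]


def normal_scores_points(data, scores=minitab_style_scores):
    """Pair the ordered observations d_i with the normal scores m_i."""
    ordered = sorted(data)
    return list(zip(ordered, scores(len(ordered))))


# Hypothetical observations, for illustration only.
data = [3.1, 1.2, 4.8, 2.5]
for d, m in normal_scores_points(data):
    print(f"{d:5.2f}  {m:+.3f}")
```

Plotting the resulting pairs gives the normal scores plot. The two score definitions give slightly different values for small n, but both are symmetric about zero.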
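The two simulation recipes from the Simulation slides (Box-Muller-Marsaglia for the normal distribution, and solving u = F(x) for the exponential distribution) can be sketched as follows. The function names, the seed, and the values mu = 10 and sigma = 2 are assumptions for illustration; the rate 0.3 is the one from the exponential example.

```python
import math
import random


def box_muller(u1, u2):
    """Map two independent Uniform(0, 1) values to two standard normal values."""
    r = math.sqrt(-2.0 * math.log(u2))
    theta = 2.0 * math.pi * u1
    return r * math.cos(theta), r * math.sin(theta)


def simulate_normal(mu, sigma, n, rng):
    """Generate n values x = mu + sigma * z, with z produced by box_muller."""
    values = []
    while len(values) < n:
        u1 = rng.random()
        u2 = 1.0 - rng.random()  # lies in (0, 1], so log(u2) is always defined
        z1, z2 = box_muller(u1, u2)
        values.extend([mu + sigma * z1, mu + sigma * z2])
    return values[:n]


def simulate_exponential(rate, rng):
    """Inverse transform: solve u = 1 - exp(-rate * x) for x."""
    u = rng.random()  # lies in [0, 1), so 1 - u > 0
    return -math.log(1.0 - u) / rate


rng = random.Random(2024)  # fixed seed so the run is reproducible
normal_values = simulate_normal(mu=10.0, sigma=2.0, n=20000, rng=rng)
exp_values = [simulate_exponential(0.3, rng) for _ in range(20000)]

print(sum(normal_values) / len(normal_values))  # should be close to mu = 10
print(sum(exp_values) / len(exp_values))        # should be close to 1 / 0.3
```

Using `1.0 - rng.random()` for u2 is a small design choice: `random()` can return exactly 0, and `log(0)` is undefined, whereas the shifted value never reaches 0.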