Last Update: 5th May 2011

SESSION 37 & 38
Continuous Probability Distributions

Lecturer: Florian Boehlandt
University: University of Stellenbosch Business School
Domain: http://www.hedge-fundanalysis.net/pages/vega.php

Learning Objectives
1. Normal Distribution
2. Z-Score and Transformation

Quantitative Data
Discrete data: variables can only assume certain values and are typically collected by counting observations. Described by the binomial and Poisson distributions.
Continuous data: variables can assume any value within a range (real numbers) and are collected by measurement. Described by continuous probability distribution functions (most notably normal, Student-t, F, chi-squared).

Normal Distribution
The normal distribution is the most important of all continuous distribution functions because of its central role in statistical inference. The probability density function of a normal random variable is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \qquad -\infty < x < +\infty$$

Thus, the function is fully characterized by the population mean μ and the population standard deviation σ (variance σ²). The normal distribution function is symmetric about the mean, and the random variable ranges between -∞ and +∞. The function is asymptotic: the curve approaches the horizontal axis in both tails but never touches it.

Normal Probabilities
To calculate the probability that a normal random variable falls into a given interval, the area under the curve over that interval needs to be computed. The density has no closed-form antiderivative, so this calculation is beyond the scope of introductory integral calculus. Fortunately, probability tables may be used to determine the area underneath the curve. This requires the use of a standard normal variable:

$$Z = \frac{X - \mu}{\sigma}$$

Assuming the population parameters are known, Z may be calculated for any X. This is called Z-score conversion. The corresponding probabilities (area underneath the curve) are then determined from normal probability tables.
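The density function and the Z-score conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `normal_pdf` and `z_score` and the values μ = 100, σ = 15 are assumptions chosen only for the demonstration.

```python
from math import exp, isclose, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Density of a normal random variable with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


def z_score(x, mu, sigma):
    """Z-score conversion: distance of x from the mean in standard deviations."""
    return (x - mu) / sigma


# Illustrative (assumed) population parameters, not taken from the slides.
mu, sigma = 100.0, 15.0

print(z_score(130.0, mu, sigma))  # 2.0, i.e. two standard deviations above the mean

# Symmetry about the mean: equal density at equal distances on either side.
print(isclose(normal_pdf(mu - 20, mu, sigma), normal_pdf(mu + 20, mu, sigma)))  # True
```

The Z-score is unit-free, which is why a single table for the standard normal variable is enough for every combination of μ and σ.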
Normal Probabilities Table
THE STANDARD NORMAL DISTRIBUTION (Z) – Areas under the curve

Z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
…     …       …       …       …       …       …       …       …       …       …
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441

The numbers in the left column give the value of z to one decimal place, and the column headings specify the second decimal place. The table provides the probability that a standard normal variable falls between 0 and z. Because the normal curve is symmetric, most textbooks display only one half of the distribution function: P(Z > 0) = P(Z < 0) = 0.5, and the area between -z and 0 equals the area between 0 and z. It is assumed that P(Z > 3.1) ≈ 0.

Example
Suppose that the amount of time to assemble a computer (assembly time is an interval variable) is normally distributed with a mean μ = 50 minutes and a standard deviation σ = 10 minutes. What is the probability that the computer is assembled in a time between 45 and 60 minutes (i.e. we want to find the probability P(45 < X < 60))?

Solution
Step 1: Transformation of the variables

$$P(45 < X < 60) = P\left(\frac{45 - 50}{10} < \frac{X - \mu}{\sigma} < \frac{60 - 50}{10}\right) = P(-0.5 < Z < 1)$$

Step 2: Determine the probabilities for the Z-scores
The probability we seek is actually the sum of two probabilities:

$$P(-0.5 < Z < 1) = P(-0.5 < Z < 0) + P(0 < Z < 1)$$

Or, by symmetry, using the table entries for z = 0.5 (0.1915, from a row not shown in the excerpt above) and z = 1.0:

$$P(-0.5 < Z < 1) = P(0 < Z < 0.5) + P(0 < Z < 1) = 0.1915 + 0.3413 = 0.5328$$

Exercise
Consider an investment whose return is normally distributed with a mean of 10% and a standard deviation of 5%.
a) What is the probability of losing money (Hint: P(X < 0))?
b) Find the probability of losing money when the standard deviation is equal to 10%.

Finding values of z
There is a family of problems that require researchers to determine the value of Z for a given probability, i.e. to read the table in reverse.
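The table lookups can be checked numerically in both directions. The sketch below is an illustration rather than part of the original slides: it assumes Python's standard library (`math.erf` and `statistics.NormalDist`, Python 3.8 or later) in place of the printed table, and the helper names `table_area`, `prob_between` and `z_for_table_area` are hypothetical.

```python
from math import erf, sqrt
from statistics import NormalDist


def signed_area(z):
    """Area under the standard normal curve from 0 to z (negative for z < 0)."""
    return 0.5 * erf(z / sqrt(2))


def table_area(z):
    """Table entry: probability that Z falls between 0 and |z|."""
    return abs(signed_area(z))


def prob_between(a, b, mu, sigma):
    """P(a < X < b) for a normal X, via Z-score conversion of both bounds."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return signed_area(z_b) - signed_area(z_a)


def z_for_table_area(area):
    """Reverse lookup: the positive z whose table entry equals `area` (0 <= area < 0.5)."""
    return NormalDist().inv_cdf(0.5 + area)


print(round(table_area(1.0), 4))                 # 0.3413, the entry in row 1.0, column 0.00
print(round(prob_between(45, 60, 50, 10), 4))    # 0.5328, the assembly-time example
print(round(z_for_table_area(0.45), 3))          # 1.645
```

Working with the signed area keeps `prob_between` valid whether the interval straddles the mean or lies entirely on one side of it.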
A financial example of such an inverse problem is value-at-risk: below which rate of return do 5% of all observations fall? This requires locating the probability in the body of the normal probability table and reading off the associated z-score. For the lower 5% tail, the table area between 0 and z is 0.5 - 0.05 = 0.45, which corresponds to z ≈ -1.645. From there, a rearranged version of the Z-score formula gives X:

$$X = \mu + Z\sigma$$

Central Limit Theorem
The sampling distribution of the mean of a random sample drawn from any population is approximately normal for sufficiently large sample sizes. The larger the sample size, the more closely the sampling distribution of $\bar{x}$ will resemble the normal distribution.
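Both ideas above can be tried out with a short simulation. This is a sketch under stated assumptions, not part of the original slides: the return parameters (mean 10%, standard deviation 5%) are borrowed from the earlier exercise purely for illustration, the population for the Central Limit Theorem check is an exponential distribution with mean 1 and standard deviation 1 (chosen because it is strongly skewed), and the helper names are hypothetical. The check relies on the standard result that the sample mean has standard error σ/√n; for a normal distribution about 68% of values lie within one standard deviation of the mean.

```python
import random
from statistics import NormalDist


def return_at_tail(mu, sigma, tail_prob):
    """Value x with P(X < x) = tail_prob, using X = mu + z * sigma."""
    z = NormalDist().inv_cdf(tail_prob)  # z-score with tail_prob of area to its left
    return mu + z * sigma


def sample_means(n, draws, rng):
    """Means of `draws` random samples of size n from an exponential population."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(draws)]


def share_within_one_se(means, n, pop_mean=1.0, pop_sd=1.0):
    """Fraction of sample means within one standard error of the population mean."""
    se = pop_sd / n ** 0.5
    return sum(abs(m - pop_mean) <= se for m in means) / len(means)


# Rate of return that 5% of observations fall below (assumed mu = 10%, sigma = 5%).
print(return_at_tail(10.0, 5.0, 0.05))

# As n grows, the share should drift towards the normal value of roughly 0.68.
rng = random.Random(2011)  # fixed seed so the run is repeatable
for n in (1, 5, 50):
    print(n, share_within_one_se(sample_means(n, 4000, rng), n))
```

For n = 1 the share reflects the skewed population itself; the movement towards 0.68 as n increases is the behaviour the theorem describes.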