Math/Stat 352, Lecture 9
Section 4.5: Normal distribution

Abraham de Moivre (1667–1754)
A French mathematician who introduced the Normal distribution in his book "The Doctrine of Chances: or, a Method for Calculating the Probabilities of Events in Play", first published in 1718 and considered the first textbook on probability.

Pierre-Simon Laplace (1749–1827)
A French mathematician and astronomer. Extended de Moivre's result in the book "Analytical Theory of Probabilities", published in 1812 (the so-called de Moivre-Laplace theorem). The de Moivre-Laplace theorem suggests a normal approximation to the central part of the Binomial distribution.

Johann Carl Friedrich Gauss (1777–1855)
A German mathematician and scientist who contributed significantly to many fields, including number theory, statistics, analysis, differential geometry, geodesy, geophysics, astronomy, and optics. Referred to mathematics as "the queen of sciences." Rigorously justified the method of least squares in 1809 using the Normal distribution for errors.

Charles Sanders Peirce (1839–1914)
An American philosopher, logician, mathematician, and scientist.

Francis Galton (1822–1911)
A prolific English scientist.

Wilhelm Lexis (1837–1914)
An eminent German statistician, economist, and social scientist; a founder of the interdisciplinary study of insurance.

Coined the term "Normal distribution" around 1875.

Karl Pearson (March 27, 1857 – April 27, 1936)
Established the discipline of mathematical statistics. Made the term "Normal distribution" popular.

[Figure: pmf of Binomial(100, 0.9), np = 90; x-axis: number of successes; y-axis: dbinom(q, 100, 0.9). The probability is concentrated around 90.]

[Figure: pmfs of Bin(100, 0.9), np = 90, and Bin(100, 0.1), np = 10, over the full range 0 to 100; x-axis: number of successes; y-axis: dbinom(q, 100, 0.1).]
Only a small fraction of the possible outcomes has non-negligible probability (i.e., only a small part can be seen in an experiment). Elsewhere the probability is very small, but not 0.

Poisson? Binomial? Normal?
[Figure: Binomial(1000, 0.03), Poisson(30), and N(30, 30) plotted together; x-axis: number of successes; y-axis: dpois(q, 30).]

[Figure: Poisson(1) (y-axis: dpois(q, 1)) and Binomial(1000, 0.001) (y-axis: dbinom(q, 1000, 1/1000)); x-axis: number of successes, 0 to 10.]
The distributions are not symmetric: np = λ = 1 is too small for the normal approximation.

Poisson? Binomial? Normal?
Rule of thumb: if n is large (n > 100), p is small (p < 0.05), and both np and n(1−p) are not small (say > 10), then
B(n, p) ≈ P(np) ≈ N(np, np(1−p)).

The Normal Distribution
The normal distribution (also called the Gaussian distribution) is by far the most commonly used distribution in the sciences. It provides a good model for many, although not all, continuous populations.
The normal distribution is continuous. The mean of a normal population may have any value, and the variance may have any positive value.

[Figure: Normal densities for (mean, std) = (0, 0.6), (0, 1), (0, 3), (5, 1), (7, 1); x-axis: value of random variable, −10 to 10; y-axis: density.]
Properties of the Normal pdf:
• Centered at the mean.
• Bell-shaped curve, symmetric.
• A larger standard deviation gives a "flatter" curve with longer "tails".

Normal R.V.: pdf, mean, and variance
The pdf of a normal random variable with mean µ and variance σ², X ~ N(µ, σ²), is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}, \qquad -\infty < x < \infty.$$
If X ~ N(µ, σ²), then the mean and variance of X are given by
$$\mu_X = \mu, \qquad \sigma_X^2 = \sigma^2.$$
Note: the normal pdf is symmetric, so the mean equals the median.

Normal distributions: what are the most likely observations?
68-95-99.7% Rule. Let X ~ N(µ, σ²).
About 68% of the observations are in the interval µ ± σ.
About 95% of the observations are in the interval µ ± 2σ.
About 99.7% of the observations are in the interval µ ± 3σ.
The proportion of normal observations that are within a given number of standard deviations of the mean is the same for any normal population.
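The 68-95-99.7% rule can be checked numerically. The sketch below is only an illustration, not part of the lecture: it assumes Python's standard-library statistics.NormalDist, and the helper name prob_within is made up for this example. It also tries a second, arbitrarily chosen normal population to show that the proportions do not depend on µ or σ.

```python
from statistics import NormalDist


def prob_within(k, mu=0.0, sigma=1.0):
    """Return P(mu - k*sigma < X < mu + k*sigma) for X ~ N(mu, sigma^2).

    prob_within is a hypothetical helper name used only in this sketch.
    Note that NormalDist takes the standard deviation, not the variance.
    """
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


# Standard normal, then an arbitrary second population (mean 50, SD 5).
for mu, sigma in [(0.0, 1.0), (50.0, 5.0)]:
    probs = [round(prob_within(k, mu, sigma), 4) for k in (1, 2, 3)]
    print(f"mu={mu}, sigma={sigma}: within 1, 2, 3 SDs -> {probs}")
```

Both lines of output should show roughly 0.68, 0.95, and 0.997, which is the sense in which the rule holds "for any normal population".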
Standard Normal Distribution
The standard normal distribution is a normal distribution with mean 0 and variance 1. It is usually denoted by Z: Z ~ N(0, 1).
We can convert observations from any X ~ N(µ, σ²) to "standard units":
$$z = \frac{x - \mu}{\sigma}$$
This process is often called "standardization". The value of X in standard units is often called the "z-score". Standard units tell how many standard deviations an observation is from the population mean.

Computing standard normal probabilities: Table A.2
P(Z < 0.47) = 0.6808
P(Z ≤ 1.38) = P(Z < 1.38) = 0.9162
Reminder: the total area under a pdf curve is 1, so P(Z > 1.38) = 1 − 0.9162 = 0.0838.
P(0.71 < Z < 1.28) = P(Z < 1.28) − P(Z < 0.71)
Symmetry in action:
P(Z < 0.67) = P(Z > −0.67) (by symmetry)
P(Z < 0.67) = 1 − P(Z < −0.67) (by symmetry)

Standard Normal Percentiles
Given that P(Z < a) = 0.95, find a. Here a is called the 95th percentile of Z.
Look inside the table for 0.95. The closest entries are 0.9495 (z = 1.64) and 0.9505 (z = 1.65). Since 0.95 is the midpoint between these two probabilities, use the midpoint of the corresponding z-values: a = 1.645.
If one of the available probabilities is closer to the one we need, use the z-value corresponding to that probability.

Finding Probabilities for any Normal Variable
Problem: Let X ~ N(µ, σ²); find P(a < X < b).
The probability that X lies within any interval is given by the integral of the pdf of X over that interval:
$$P(a < X < b) = \int_a^b f(x)\,dx, \qquad \text{where } f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty.$$
This integral does not have a closed-form solution. How to proceed?
Solution: Convert X to standard normal ("standardize") and use the z-table.

Computing normal probabilities: any X ~ N(µ, σ²)
Let X ~ N(50, 25), so σ = 5. Find P(42 < X < 52).
$$P(42 < X < 52) = P\left(\frac{42 - 50}{5} < Z < \frac{52 - 50}{5}\right) = P(-1.6 < Z < 0.4) = 0.6554 - 0.0548 = 0.6006.$$
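The standardize-then-look-up procedure above can be mirrored in a few lines of code. This is a minimal sketch under stated assumptions: it uses Python's standard-library statistics.NormalDist in place of Table A.2, and the function name normal_prob is invented for this illustration.

```python
from statistics import NormalDist

# Standard normal Z ~ N(0, 1); its cdf plays the role of the z-table.
Z = NormalDist(0.0, 1.0)


def normal_prob(a, b, mu, sigma):
    """Return P(a < X < b) for X ~ N(mu, sigma^2) by standardizing.

    normal_prob is a hypothetical helper name used only in this sketch.
    """
    z_a = (a - mu) / sigma  # lower endpoint in standard units
    z_b = (b - mu) / sigma  # upper endpoint in standard units
    return Z.cdf(z_b) - Z.cdf(z_a)


# Worked example: X ~ N(50, 25), so sigma = 5.
p = normal_prob(42, 52, mu=50, sigma=5)
print(f"P(42 < X < 52) = {p:.4f}")

# 95th percentile of Z, the value a with P(Z < a) = 0.95.
a = Z.inv_cdf(0.95)
print(f"95th percentile of Z = {a:.3f}")
```

The first value should agree with the table-based answer 0.6006, and the second with a = 1.645, up to rounding.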
Computing Normal Percentiles: any Normal distribution
Let X ~ N(50, 25). Find the 40th percentile of X.
Find the 40th percentile of the standard normal: z = −0.25.
"Destandardize":
$$-0.25 = \frac{a - 50}{5}, \quad \text{thus } a = 48.75.$$

Example. A process manufactures ball bearings whose diameters are normally distributed with mean 2.505 cm and standard deviation 0.008 cm. Specifications call for the diameter to be in the interval 2.5 ± 0.01 cm. What proportion of the ball bearings will meet the specifications?
Soln: Let X = diameter of a ball bearing. X ~ N(2.505, 0.008²).
P(2.49 < X < 2.51) = (standardize) = P(−1.88 < Z < 0.63) = 0.7357 − 0.0301 = 0.7056.

Example: ball bearings, contd. Suppose the machine was recalibrated, so that the mean diameter is now 2.5 cm. To what value must the standard deviation be lowered so that 95% of the diameters will meet the specifications?
Soln: X = diameter of a ball bearing, X ~ N(2.5, σ²). Want σ such that P(2.49 < X < 2.51) = 0.95.
Standardize:
$$P\left(\frac{2.49 - 2.5}{\sigma} < Z < \frac{2.51 - 2.5}{\sigma}\right) = P\left(\frac{-0.01}{\sigma} < Z < \frac{0.01}{\sigma}\right) = 0.95.$$
Need z-values that satisfy this equation: −1.96 = −0.01/σ and 1.96 = 0.01/σ.
Thus σ = 0.01/1.96 = 0.0051 cm.

Linear Functions of Normal Random Variables
Let X ~ N(µ, σ²) and let a ≠ 0 and b be constants. Then
$$aX + b \sim N(a\mu + b,\; a^2\sigma^2).$$
Let X₁, X₂, …, Xₙ be independent and normally distributed with means µ₁, µ₂, …, µₙ and variances σ₁², σ₂², …, σₙ². Let c₁, c₂, …, cₙ be constants, and let c₁X₁ + c₂X₂ + … + cₙXₙ be a linear combination. Then
$$c_1 X_1 + c_2 X_2 + \dots + c_n X_n \sim N\left(c_1\mu_1 + c_2\mu_2 + \dots + c_n\mu_n,\; c_1^2\sigma_1^2 + c_2^2\sigma_2^2 + \dots + c_n^2\sigma_n^2\right).$$

Example
A chemist measures the temperature of a solution in °C. The measurement is denoted C, and is normally distributed with mean 40 °C and standard deviation 1 °C. The measurement is converted to °F by the equation F = 1.8C + 32. What is the distribution of F?
Soln: Let X = temperature in °C, X ~ N(40, 1). Let Y = temperature in °F. Then Y = 1.8X + 32.
Since Y is a linear function of a normal random variable, Y has a normal distribution.
Mean of Y = 1.8(40) + 32 = 104 °F. Variance of Y = (1.8)²(1) = 3.24.
So, Y ~ N(104, 3.24).

Distributions of Functions of Normals
Let X₁, X₂, …, Xₙ be independent and identically normally distributed with mean µ and variance σ². Then
$$\bar{X} \sim N\left(\mu,\; \frac{\sigma^2}{n}\right).$$
Let X and Y be independent, with X ~ N(µ_X, σ_X²) and Y ~ N(µ_Y, σ_Y²). Then
$$X + Y \sim N(\mu_X + \mu_Y,\; \sigma_X^2 + \sigma_Y^2)$$
$$X - Y \sim N(\mu_X - \mu_Y,\; \sigma_X^2 + \sigma_Y^2)$$

SAMPLING DISTRIBUTION OF THE SAMPLE MEAN
EXAMPLE: Students in a university have a weight distribution that is known to be normal with mean 150 lb and standard deviation 20 lb, i.e. N(150, 20²). Let X₁, X₂, …, X₁₆ represent the weights of 16 randomly selected students from this university. If X̄ is the average weight for this sample, find P(X̄ > 160).
Solution: Since the sample came from a normal distribution, the sample mean has a normal distribution as well:
X̄ ~ N(µ, σ²/n) = N(150, 20²/16) = N(150, 25).
Thus,
$$P(\bar{X} > 160) = P\left(\frac{\bar{X} - 150}{5} > \frac{160 - 150}{5}\right) = P(Z > 2) = 1 - P(Z \le 2) = 1 - 0.9772 = 0.0228.$$

EXAMPLE, CONTD. An elevator at this university has a capacity of 1,500 pounds. What is the probability that 9 students who enter the elevator will have a safe ride, i.e. that their total weight is less than 1,500 lb?
Solution: The sample mean has a normal distribution:
X̄ ~ N(µ, σ²/n) = N(150, 20²/9) = N(150, 44.44), so the standard deviation of X̄ is 20/3 = 6.67.
Also, P(Total weight < 1500) = P(X̄ < 1500/9) = P(X̄ < 166.67). So,
$$P(\bar{X} < 166.67) = P\left(\frac{\bar{X} - 150}{6.67} < \frac{166.67 - 150}{6.67}\right) = P(Z < 2.5) = 0.9938.$$

Estimating the Parameters
If X₁, …, Xₙ are a random sample from a N(µ, σ²) distribution, µ is estimated with the sample mean X̄ and σ² is estimated with the sample variance
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2.$$
As with any sample mean, the uncertainty (standard deviation) in X̄ is σ/√n, which we replace with s/√n if σ is unknown.
The sample mean is an unbiased estimator of µ.
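The sample-mean examples and the estimators above can be sketched in code. This is an illustration only, assuming Python's standard-library statistics module; the helper name sample_mean_dist and the list of weights are hypothetical and were made up for this sketch, they are not data from the lecture.

```python
import math
import statistics
from statistics import NormalDist


def sample_mean_dist(mu, sigma, n):
    """Distribution of the mean of n i.i.d. N(mu, sigma^2) observations.

    sample_mean_dist is a hypothetical helper name. NormalDist takes a
    standard deviation, so we pass sigma / sqrt(n).
    """
    return NormalDist(mu, sigma / math.sqrt(n))


# Example: weights ~ N(150, 20^2), n = 16, find P(Xbar > 160).
p_heavy = 1.0 - sample_mean_dist(150, 20, 16).cdf(160)
print(f"P(Xbar > 160) = {p_heavy:.4f}")

# Elevator example: 9 students, total below 1500 lb <=> Xbar < 1500/9.
p_safe = sample_mean_dist(150, 20, 9).cdf(1500 / 9)
print(f"P(total < 1500) = {p_safe:.4f}")

# Estimating the parameters from a sample (hypothetical, made-up weights).
weights = [142.0, 155.5, 160.2, 148.3, 151.7, 139.9, 163.4, 150.0]
n = len(weights)
x_bar = statistics.mean(weights)    # estimates mu
s2 = statistics.variance(weights)   # sample variance, divisor n - 1
std_err = math.sqrt(s2 / n)         # s / sqrt(n), uncertainty in x_bar
print(f"x_bar = {x_bar:.2f}, s^2 = {s2:.2f}, s/sqrt(n) = {std_err:.2f}")
```

The first two probabilities should match the table-based answers 0.0228 and 0.9938 up to rounding. Note that statistics.variance uses the n − 1 divisor, which is the sample variance s² defined above; statistics.pvariance would divide by n instead.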