The Normal Distribution
“the bell curve”
Some slides downloaded from www.registart.co.uk

The Most Important Distribution in Statistics!
Describes the characteristics of many real-world data sets:
– test scores for large groups of students
– actual sizes (length, width) of jeans at Kohl’s
– eyesight of all 20-year-olds in Kissimmee
– actual lifetimes of 1000 AA batteries
– testosterone level of all male students at GHS
– length of middle finger of 250 students

Characteristics
Symmetric, bell-shaped curve.
X can take any value (continuous RV).
Shape of curve depends on 2 parameters:
– Center of distribution is the population mean μ
– Spread is determined by the standard deviation σ
Most values fall around the mean, but a few values are much smaller and a few are much larger (equal chance).

Probability Density Function (PDF)
“X is distributed normally with mean μ and standard deviation σ”
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \qquad -\infty < x < \infty$$

Shape Depends on Mean, Std. Dev
[Figure: bell-shaped curves of Density against Grades (40 to 100), one with Mean = 70, SD = 5 and one with Mean = 70, SD = 10]

As a Histogram (Area of rectangle = probability)
[Figure: Symmetrical Binomial Distribution B(10, 0.5), P(X = r) against r]
Decrease interval size...
[Figure: Symmetrical Binomial Distribution B(30, 0.5), P(X = r) against r]
Decrease interval size more...
[Figure: Binomial Distribution B(100, 0.5), P(X = r) against r]
Almost a nice continuous curve

Probability = Area under Curve
Curve describes probability of getting a range of values
– e.g., P(X > 60), P(X < 30), P(20 < X < 50)
Area under whole curve = 1
Probability of getting a specific number is 0, e.g.
P(X = 60) = 0
– so P(X < 60) is the same as P(X ≤ 60)

Probability that X is less than a #
P(X < 23) [or P(X ≤ 23)]
[Figure: normal density f(x) against x, mean 20, std dev 5, illustrating P(X < 23)]

Probability that X is more than a #
P(X > 23) = 1 – P(X ≤ 23)
P(X > 23) [or P(X ≥ 23)]
[Figure: normal density f(x) against x, mean 20, std dev 5, illustrating P(X > 23)]

Probability that X is between 2 #s
P(13 < X < 21) = P(X < 21) – P(X < 13)
P(13 < X < 21) [or P(13 ≤ X < 21), etc.]
[Figure: normal density f(x) against x, mean 20, std dev 5, illustrating P(13 < X < 21)]

Standard Deviation Graph (H&H p.730)

Draw with GDC
Set window
– X from μ - 3σ to μ + 3σ (99.7% of all values)
– Y from 0 to 1/(σ√(2π))
Draw
– 2nd PRGM (DRAW) CLRDRAW (#1)
– 2nd VARS (DISTR) DRAW ShadeNorm(lower limit, upper limit, [μ, σ])
  if omit [μ, σ] → 0, 1

Draw with GDC (con’t)
For normally distributed X with mean 15, std dev 2:
P(8 ≤ X ≤ 12): use ShadeNorm(8, 12, 15, 2)
P(X ≥ 17): use ShadeNorm(17, E99, 15, 2)
P(X ≤ 16): use ShadeNorm(-E99, 16, 15, 2)
use E99 in place of ∞, -E99 in place of -∞

Calculate with GDC
For normally dist’d X with mean 71.5, std dev 3.8:
2nd VARS (DISTR) #2 normalcdf(lower limit, upper limit, [μ, σ])
if omit [μ, σ] → 0, 1
P(62.1 ≤ X ≤ 68.7): use normalcdf(62.1, 68.7, 71.5, 3.8)
P(X ≥ 89.0): use normalcdf(89.0, E99, 71.5, 3.8)
P(X ≤ 42.5): use normalcdf(-E99, 42.5, 71.5, 3.8)
Note: P(62.1 ≤ X ≤ 68.7) is just P(X ≤ 68.7) - P(X ≤ 62.1)

Standard Normal Distribution (Z-Distn)
To make a table of values for X, need to know both μ
and σ
– One table for each combination of μ and σ
– LOTS of tables!!!
Make a new random variable Z = (X – μ)/σ
Z has the standard normal distribution
Need only one table of values for Z, since μ = 0 and σ = 1 always
Symmetric, so P(Z < -k) = P(Z > k)

The Standard Normal Distribution (Z)
Z ~ N(0, 1)

Z-Values (“Z-Scores”)
Value of Z is just the # of standard deviations from the mean:
– Z = -2 corresponds to X = μ - 2σ
– Z = -1 corresponds to X = μ - σ
– Z = 0 corresponds to X = μ
– Z = 1 corresponds to X = μ + σ
– Z = 2 corresponds to X = μ + 2σ
Etc.
(Insert graph of preceding slide)

Z-Values with GDC
P(-1.5 ≤ Z ≤ 2.1): use normalcdf(-1.5, 2.1)
(Omitting μ and σ means μ = 0 and σ = 1)
If starting with X-values (μ ≠ 0 and/or σ ≠ 1), don’t forget to convert to Z, then back to X

The Standard Normal Distribution
Z ~ N(0, 1)
The probabilities are given by the area under the curve
P(Z < -1.6) = 0.0548
By symmetry:
P(Z < -1.6) = P(Z > 1.6)
P(Z < -1.6) = 1 - P(Z < 1.6)

Z-Values from Tables
Table in formula packet “Area under the standard normal curve (topic 6.11)”
Gives probability that Z is less than (actually < or ≤) a specified value
Table is for positive values of Z, only
Before using table, convert X-values to Z

Reading Table of Z-Values
(INSERT picture of table), with animations showing how to read z to 2 decimal places
Highlight Z-values on top and on left, highlight cross-indexed area

Example: Using Table of Z-Values
For X with mean μ = 26, std dev σ = 1.4, find P(X < 27.1)
Z = (X – μ)/σ = (27.1 – 26)/1.4 = 0.786 → use Z = 0.79
P(X < 27.1) ≈ P(Z < 0.79) = 0.7852
compare to answer from normalcdf(-E99, 27.1, 26, 1.4):
P(X < 27.1) = 0.7840 (slightly different because we rounded Z)
P(X < 27.106) = P(Z < 0.79) (no rounding) = 0.7852 (to 4 d.p.’s)

Extending the Table
Table from formula packet only works for:
– P(Z < z)
– Positive Z-values
What to do if you want P(Z >
z), or if Z is a negative value?
Think of the graph and which areas you should shade…

Calculating P(Z > z) from Table
Use the fact that the total area under the bell curve equals 1
P(Z < z) + P(Z > z) = 1 (remember P(Z = z) = 0)
P(Z > z) = 1 – P(Z < z)

Example: P(Z > z) from Table
Find P(Z > 2.58)
P(Z > 2.58) = 1 – P(Z < 2.58)
From table, P(Z < 2.58) = 0.9951
P(Z > 2.58) = 1 – 0.9951 = 0.0049

Example: P(X > x)
For X with mean μ = 26 and std dev σ = 1.4, find P(X > 26.8)
Z = (X – μ)/σ = (26.8 – 26)/1.4 = 0.571 → use Z = 0.57
P(X > 26.8) ≈ P(Z > 0.57) = 1 - P(Z < 0.57) = 1 - 0.7157 = 0.2843
compare to P(X > 26.8) using normalcdf(26.8, E99, 26, 1.4):
P(X > 26.8) = 0.2839 (again, difference due to rounding)

Using Table with Negative Z’s
Use the fact that the bell curve is symmetric!
(insert graph)
P(Z < -z) = P(Z > z) = 1 – P(Z < z)
P(Z > -z) = P(Z < z)

Example: Using Table with Z < 0
Given normally dist’d X with μ = 54.4, σ = 6.7, find P(X < 49.8)
Z = (X – μ)/σ = (49.8 – 54.4)/6.7 = -0.687 → use Z = -0.69
P(Z < -0.69) = P(Z > 0.69) = 1 – P(Z < 0.69) = 1 – 0.7549 = 0.2451
Compare: normalcdf(-E99, 49.8, 54.4, 6.7) = 0.2462

Using Table for P(a < X < b)
Subtract the areas
P(a < X < b) = P(X < b) – P(X < a)
INSERT pictures

Example: P(a < X < b) from Table
Given normally dist’d X with μ = 54.4, σ = 6.7, find P(45.0 < X < 49.8)
Z1 = (X – μ)/σ = (45.0 – 54.4)/6.7 = -1.40
Z2 = (X – μ)/σ = (49.8 – 54.4)/6.7 = -0.69
P(-1.40 < Z < -0.69) = P(Z < -0.69) – P(Z < -1.40)
= P(Z > 0.69) – P(Z > 1.40)
= [1 – P(Z < 0.69)] – [1 – P(Z < 1.40)]
= [1 – 0.7549] – [1 – 0.9192] = 0.1643
Compare: normalcdf(45.0, 49.8, 54.4, 6.7) = 0.1659

Inverse Normal Probabilities
Now we work backwards:
– know the probability
– want to find corresponding value of X (or Z)
Examples of questions:
– Find k such that P(X ≤ k) = 95.4%
– If P(-0.10 < X < b) = 0.357 (i.e., 35.7%), find b
– Find μ so that P(X > 0.771) = 80.8%
Could use trial and error, but there’s a better way

Inverse Normal Probabilities by GDC
Use 2nd
VARS (DISTR) #3 invNorm(area, [μ, σ])
μ and σ are optional
If omitted, then: μ = 0, σ = 1 (omit when using Z-score, not X)

Example: Inv. Normal Prob. by GDC
X is normally distributed with μ = 70, σ = 10.
Find k such that P(X ≤ k) = 0.954 (i.e., 95.4%)
2nd VARS (DISTR) invNorm(0.954, 70, 10) = 86.8
k = 86.8
Check: normalcdf(-E99, 86.8, 70, 10) = 0.954

Inverse Normal Probabilities by Table
Table in formula packet (2 pages) “Inverse Normal Probabilities (topic 6.11)”
Gives the value z for which P(Z < z) (actually < or ≤) equals a specified probability
Table is for probabilities between 0.5 and 0.999, and only for positive values of Z
Before using table, convert X-values to Z

Reading Inverse Probability Table
(INSERT picture of table), with animations showing how to read z to 2 decimal places
Highlight probabilities on top and on left, highlight cross-indexed Z-score

Examples: Using Inverse Table
Find k such that P(Z < k) = 0.695
p = 0.695 → read Z = 0.5101
k = 0.5101
Check: normalcdf(-E99, 0.5101) = 0.695 (omit μ, σ)
Find k such that P(Z > k) = 0.128
P(Z < k) = 1 – P(Z > k) = 1 – 0.128 = 0.872
p = 0.872 → read Z = 1.1359
k = 1.1359
Check: normalcdf(1.1359, E99) = 0.128 (omit μ, σ)

Example: Using Inverse Table for X
X is dist’d normally with μ = 24.6, σ = 0.8
For what value of k is P(X < k) = 0.602?
p = 0.602 → read Z = 0.2585
Z = (X – μ)/σ, so X = Zσ + μ = (0.2585)(0.8) + 24.6 = 24.8
Check: normalcdf(-E99, 24.8, 24.6, 0.8) = 0.599 (difference due to rounding X)

Extending the Inverse Table
Table from formula packet only works for 0.5 < p < 0.999 and Z > 0
What to do if p < 0.5?
– use P(Z < k) + P(Z > k) = 1
What to do if P(Z > k)?
– use P(Z > k) = P(Z < -k) (symmetry)

Example: Inverse Table for p < 0.5
For what value of k is P(Z < k) = 0.210?
P(Z < k) = 1 – P(Z > k)
P(Z > k) = 1 – 0.210 = 0.79 (which is > 0.5, so we can use the table now)
By symmetry, P(Z > k) = P(Z < -k) (the table requires P(Z < k))
p = 0.79 → read Z = 0.8064
-k = 0.8064
k = -0.8064
Check: normalcdf(-E99, -0.8064) = 0.210

Example: Inverse Table, P(a < X < b)
Insert example…

Example: Inverse Table, μ = ?
X is dist’d normally with σ = 1.75 but unknown μ. Find μ if P(X > 4.92) = 0.4.
P(X > 4.92) = 0.4, so with k = (4.92 – μ)/σ:
P(Z > k) = 0.4
1 - P(Z < k) = 0.4
P(Z < k) = 1 - 0.4 = 0.6 → read Z = 0.2534
Z = (X – μ)/σ
0.2534 = (4.92 – μ)/1.75
μ = 4.92 – (0.2534)(1.75) = 4.48
Check: normalcdf(4.92, E99, 4.48, 1.75) = 0.401 (difference due to rounding μ)
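
Cross-checking without a GDC
The calculator steps in these slides can be cross-checked in software. The following is only a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist` class. The helper names `normalcdf` and `inv_norm` and the constant `E99` are hypothetical stand-ins chosen to mirror the GDC functions; they are not part of any library.

```python
from statistics import NormalDist

# Stand-in for the calculator's E99 (used in place of infinity), as in the slides.
E99 = 1e99


def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the N(mu, sigma) curve between lower and upper.

    Hypothetical helper mirroring the GDC's normalcdf; omitting mu and
    sigma gives the standard normal distribution Z ~ N(0, 1).
    """
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


def inv_norm(area, mu=0.0, sigma=1.0):
    """Value k with P(X <= k) = area, mirroring the GDC's invNorm."""
    return NormalDist(mu, sigma).inv_cdf(area)


# Forward problems: P(X < 27.1) and P(45.0 < X < 49.8) from the table examples.
p_less = normalcdf(-E99, 27.1, 26, 1.4)          # slides report 0.7840
p_between = normalcdf(45.0, 49.8, 54.4, 6.7)     # slides report 0.1659

# Inverse problem: find k such that P(X <= k) = 0.954 for mu = 70, sigma = 10.
k = inv_norm(0.954, 70, 10)                      # slides report 86.8

# Unknown mean: sigma = 1.75 and P(X > 4.92) = 0.4, so P(Z < z) = 0.6
# and mu = x - z * sigma.
z = inv_norm(1 - 0.4)
mu = 4.92 - z * 1.75                             # slides report 4.48

print(f"P(X < 27.1)        = {p_less:.4f}")
print(f"P(45.0 < X < 49.8) = {p_between:.4f}")
print(f"k                  = {k:.2f}")
print(f"mu                 = {mu:.2f}")
```

Because the sketch does not round Z to two decimal places, its results should agree with the normalcdf and invNorm answers in the slides and differ slightly from the table-based ones.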