The Normal Distribution
MARE 250
Dr. Jason Turner

Define Normal
  A variable is normally distributed if its distribution has the shape of a normal curve (bell-shaped curve)

Normal Curve
  The normal curve associated with a normal distribution is:
  Bell shaped
  Centered at μ
  Almost entirely contained within 3 std dev of the mean (μ - 3σ to μ + 3σ)

So, am I Normal?
  Standardized normal distribution: mean 0, std dev 1
  Associated curve: standard normal curve
  You can standardize a variable by subtracting its mean and then dividing by its std dev: z = (x - μ) / σ

Properties of Normality
  1. Total area under the standard normal curve (SNC) is 1
  2. SNC extends indefinitely in both directions, approaching but never touching the horizontal axis
  3. SNC is symmetric about 0; the right and left halves are mirror images
  4. Most of the area under the SNC lies between -3 and 3 (std dev)

Properties of Normality
  1. 68.26% of all possible observations lie within 1 std dev of the mean, between μ - σ and μ + σ
  2. 95.44% of all possible observations lie within 2 std dev of the mean, between μ - 2σ and μ + 2σ
  3. 99.74% of all possible observations lie within 3 std dev of the mean, between μ - 3σ and μ + 3σ

Assessing Normality
  Large samples: a histogram can give a rough indication of normality
  Small samples: normality is difficult to judge from a histogram; a more sensitive graphical technique is needed

Assessing Normality
  Normal probability plot: a plot of the observed values of the variable versus the normal scores (the observations expected for a normally distributed variable)
  For a normally distributed variable, the sample data should be highly correlated (1:1 ratio, linear relationship) with the normal scores

Probability Plots - PP
  [Figure: Probability Plot of Weight, Normal. Axes: Weight, Percent. Mean 192.2, StDev 110.5, N 143, RJ 0.955, P-Value <0.010]

When Using Probability Plots
  The decision of whether a probability plot is linear is subjective
  A limited set of sample observations is being used to assess all possible observations

Guidelines for Probability Plots
  Plot is roughly linear: accept as reasonable that the variable is approximately normally distributed
  Plot shows deviations from linear: conclude that the variable is probably not normally distributed

Testing for Normality
  How do we test for normality?
  Use the linear correlation coefficient: compute the linear correlation coefficient between the sample data and the normal scores

Normality Tests
  Many statistical tests require normal data
  You must verify normality with a test
  Three primarily used tests:
  Anderson-Darling (more powerful)
  Ryan-Joiner (similar to Shapiro-Wilk)
  Kolmogorov-Smirnov

Probability Plots - PP
  H0 hypothesis: data are normally distributed
  If the p-value is less than α, reject H0: the data do not follow a normal distribution
  [Figure: Probability Plot of Weight, Normal. Axes: Weight, Percent. Mean 192.2, StDev 110.5, N 143, RJ 0.955, P-Value <0.010]
  [Figure: Histogram of Weight, Normal. Axes: Weight, Frequency. Mean 192.2, StDev 110.5, N 143]

This is not a Test…
  Hypothesis testing: used for making decisions or judgments
  Hypothesis: a statement that something is true
  A hypothesis test typically involves two hypotheses: the null and alternative hypotheses

Hypothesis Testing 101
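
As a worked illustration of the normality check described in these slides, the Python sketch below standardizes a variable, computes normal scores, correlates them with the sorted sample data, and applies the "reject H0 if p-value < α" rule. It is a minimal sketch under stated assumptions, not the procedure any particular statistics package uses: the function names are hypothetical, the plotting positions (i - 3/8) / (n + 1/4) are one common convention among several, the two data sets are made up for illustration, and the p-value is supplied by the user rather than computed.

```python
import math
from statistics import NormalDist, mean, stdev


def standardize(values):
    """Subtract the mean, then divide by the std dev (z-scores)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def normal_scores(n):
    """Expected standard normal values for n ordered observations.

    Assumption: plotting positions (i - 3/8) / (n + 1/4); other
    conventions exist and give slightly different scores.
    """
    snd = NormalDist()  # standard normal: mean 0, std dev 1
    return [snd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def pearson_r(xs, ys):
    """Linear correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def probability_plot_r(values):
    """Correlation between the sorted sample data and the normal scores."""
    ordered = sorted(values)
    return pearson_r(ordered, normal_scores(len(ordered)))


def reject_normality(p_value, alpha=0.05):
    """H0: data normally distributed. Reject H0 when p-value < alpha."""
    return p_value < alpha


# Hypothetical data: one set built to be exactly linear in the normal
# scores, one strongly right-skewed set for contrast.
linear_data = [190 + 110 * z for z in normal_scores(25)]
skewed_data = [math.exp(z) for z in normal_scores(25)]

print("r, linear data:", round(probability_plot_r(linear_data), 3))
print("r, skewed data:", round(probability_plot_r(skewed_data), 3))

# The Weight plot reports P-Value < 0.010, so 0.010 is used as an upper bound.
print("Reject H0 for Weight at alpha = 0.05:", reject_normality(0.010, 0.05))
```

A correlation close to 1 corresponds to a roughly linear probability plot; a noticeably lower value corresponds to a plot that deviates from linear. Turning that correlation into a p-value requires critical values for the chosen test, which this sketch does not attempt.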