Standard Scores
• Raw scores are standardized by subtracting the mean and dividing by the standard deviation
  For samples: $z_i = (X_i - \bar{X}) / s$
  For populations: $z_i = (X_i - \mu) / \sigma$
  – So-called z-scores always have mean 0 and s.d. 1
  – They allow score comparison across populations or samples
    • Beware, however, if the distributions are different
    • One use is to enter the table of the normal distribution
    • Another is to construct T-scores
  – They indicate how many s.d. units above or below the mean an observation is
  – This linear transformation leaves the distribution shape unchanged

The Normal Curve
• Don’t be confused:
  – Any set of scores can be converted to z-scores, i.e. standardized, regardless of their distribution
  – The standard normal distribution table is used only if the distribution is assumed to be normal
• What is the normal curve?
  – It is a mathematical equation that provides a theoretical distribution of scores
  – Many variables are approximately normally distributed
  – Sampling distributions of many statistics, very important to inferential statistics, are (asymptotically) normally distributed
• You do not need to know the equation, but
  – The text gives it as $Y = \frac{N}{\sqrt{2\pi}\,\sigma} e^{-(X-\mu)^2 / (2\sigma^2)}$, where N is the total frequency
  – And refers to the percentage of scores, or percent of area under the curve, between 2 z-scores, below a z-score, etc.
• With this equation, the area under the curve is N
• It does not matter that one often does not know what N is, as we can divide both sides of the equation by N
  – Letting f(X) = Y/N, we have $f(X) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(X-\mu)^2 / (2\sigma^2)}$
• This is the equation of the normal distribution with mean µ and s.d. σ; setting µ = 0 and σ = 1 gives the standard normal distribution
• It describes a probability distribution; we will return to it when we cover probability theory

Two Normal Curves
[Figure: "Normal Distribution". f(x) plotted against X (0 to 200) for two normal curves, sigma = 15 and sigma = 25]

Using the Standard Normal Distribution Table (Areas under the Normal Curve)
• The equation is not needed, because one can obtain the necessary information
from the table
[Figure: standard normal curve f(Z) over Z from -5 to 5, with the table areas marked: B, the area between the mean and z, and C, the area beyond z]

Examples
For a certain school district, the mean SAT-Q is 620 with a standard deviation of 100. Assume the scores are normally distributed.
Q: What is the percentile rank of the score 500?
A: z = (500 - 620)/100 = -1.20. The area to the left of -1.20 (C in the table) is .1151. Rounding, the score is at the 12th percentile.
Q: What is the percentile rank of the score 795?
A: z = (795 - 620)/100 = 1.75. The area between the mean and z (B in the table) is .4599. Adding that to the area to the left of the mean (.5000) gives .9599; rounding, the score is at the 96th percentile.
Q: What percent of scores are between 650 and 750?
A: z1 = (650 - 620)/100 = 0.30 and z2 = (750 - 620)/100 = 1.30. Area1 (B) = .1179 and Area2 (B) = .4032. Fraction = .4032 - .1179 = .2853, rounded to 29%.

Scores on an aptitude test are normally distributed with a mean of 100 and a standard deviation of 15.
Q: What score is at the 50th percentile?
A: 100 (How do we know that without any calculation?)
Q: What score is at the 40th percentile?
A: Find the z-score such that C = .40 and B = .10, and take the negative of that value. That operation yields z = -0.25. Next, solve for the score X, using $z = (X - \mu)/\sigma$. Rearrange to obtain $X = z \cdot \sigma + \mu$. Substituting, X = -0.25(15) + 100 = 96.25.
Q: What score is at the 75th percentile?
A: Find the (positive) z-score such that C = .25 and B = .25, and proceed as above.
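The table lookups in these examples can be cross-checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist` class. The helper names `percentile_rank`, `percent_between`, and `score_at_percentile` are hypothetical, introduced here only for illustration; they are not from the slides or the text.

```python
from statistics import NormalDist


def percentile_rank(score, mean, sd):
    """Percent of a normal(mean, sd) distribution that falls below `score`."""
    return 100 * NormalDist(mean, sd).cdf(score)


def percent_between(low, high, mean, sd):
    """Percent of a normal(mean, sd) distribution between `low` and `high`."""
    dist = NormalDist(mean, sd)
    return 100 * (dist.cdf(high) - dist.cdf(low))


def score_at_percentile(pct, mean, sd):
    """Score below which `pct` percent of a normal(mean, sd) distribution falls."""
    return NormalDist(mean, sd).inv_cdf(pct / 100)


# SAT-Q examples: mean 620, s.d. 100
print(round(percentile_rank(500, 620, 100)))       # 12th percentile
print(round(percentile_rank(795, 620, 100)))       # 96th percentile
print(round(percent_between(650, 750, 620, 100)))  # 29 percent

# Aptitude test examples: mean 100, s.d. 15
print(score_at_percentile(50, 100, 15))
print(f"{score_at_percentile(40, 100, 15):.2f}")
print(f"{score_at_percentile(75, 100, 15):.2f}")
```

The computed 40th-percentile score is about 96.2 rather than 96.25, because the table lookup rounds z to two decimals (-0.25) while `inv_cdf` uses the unrounded value. For the 75th percentile the same call gives a score of about 110.1.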