Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Nominal Scaling Measures of Central Tendency, Spread, and Shape • The lowest level of scaling • In Nominal Scaling, each of the values serves as a representation. • Each observation belongs to one mutually exclusive category Dr. J. Kyle Roberts and has no logical order • Examples: • Gender • Ethnicity • School Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Ordinal Scaling Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Interval Scaling • In Interval Scaling, each of the values has a specific order that • In Ordinal Scaling, each of the values is in rank order. reflects equal differences between observations. • Each observation belongs to one mutually exclusive category, • Each observation belongs to one mutually exclusive category, but we now have logical order. • Examples: with logical order, and equal differences between each of the points and no absolute zero. • Examples: • Letter Grades • Place Finished in a Race • Likert-type Scaling • Temperature (Fahrenheit and Celcius) • IQ Scores • SAT/GRE Scores Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Central Tendency Ratio Scaling Measures of Spread/Variation Confidence Intervals Measures of Shape Measures of Central Tendency • In Ratio Scaling, each of the values has a specific order that • The measures of central tendency try and give us a picture of reflects equal differences and a ”true” zero. what is going on at the middle of the distribution • Each observation belongs to one mutually exclusive category, with logical order, equal differences between each of the points, and has a ”true” zero. • Examples: • Kelvin Scale • Height and Weight • Speed Levels of Scale Central Tendency • There are 3 types of measures of central tendency • The mode - this is the most frequently appearing score • The median - this is also called the ”middle” score • The mean - this is also called the ”average” score Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Central Tendency Outline Levels of Scale Central Tendency The Mode The Median The Mean Measures of Spread/Variation Range Standard Deviation Z-Scores Standard Error of the Mean Confidence Intervals Measures of Shape The Normal Distribution Skewness Kurtosis Measures of Spread/Variation Confidence Intervals Measures of Shape The Mode • The mode is the most frequently appearing score • There can be as many modes as there are pieces of data in a dataset • Datasets with two modes are usually referred to as bimodal Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Central Tendency Representing the Mode • X = 1,2,2,3,3,3,4,4,5 • Y = 1,1,1,2,2,3,4,4,4,5 Histogram of Y Data 30 30 Percent of Total Percent of Total 25 20 10 20 15 10 5 0 0 1 2 3 4 5 1 2 x Levels of Scale Central Tendency 3 4 5 y Measures of Spread/Variation Confidence Intervals Measures of Shape The Median • The median is often called the ”middle” score • To compute the median, you first arrange the scores in ascending order • For datasets with an odd number of scores, the median is the middle score. • For datasets with an even number of scores, the median is the score halfway between the two middle scores • If X = 1,2,3,4,10, the median would be 3 • If X = 1,2,3,10, the median would be 2.5 • If X = 3,2,10,1, the median would still be 2.5 Confidence Intervals Measures of Shape Confidence Intervals Measures of Shape Outline • Suppose we have 2 separate datasets Histogram of X Data Measures of Spread/Variation Levels of Scale Central Tendency The Mode The Median The Mean Measures of Spread/Variation Range Standard Deviation Z-Scores Standard Error of the Mean Confidence Intervals Measures of Shape The Normal Distribution Skewness Kurtosis Levels of Scale Central Tendency Measures of Spread/Variation Outline Levels of Scale Central Tendency The Mode The Median The Mean Measures of Spread/Variation Range Standard Deviation Z-Scores Standard Error of the Mean Confidence Intervals Measures of Shape The Normal Distribution Skewness Kurtosis Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Central Tendency The Mean X= i=1 i=1 (Xi ) n • Given the dataset X = 1,2,3,4 X= Levels of Scale Central Tendency 1+2+3+4 10 = = 2.5 4 4 Measures of Spread/Variation Confidence Intervals Measures of Shape Range • The range is used to describe the number of units on the scale of measurement • It is computed as: • (Highest score - Lowest score) • Suppose we have the dataset X = 1,2,3,4, the range would be: • (4 - 1) = 3 Measures of Shape Confidence Intervals Measures of Shape Levels of Scale Central Tendency The Mode The Median The Mean Measures of Spread/Variation Range Standard Deviation Z-Scores Standard Error of the Mean Confidence Intervals Measures of Shape The Normal Distribution Skewness Kurtosis Pn (Xi )/n = Confidence Intervals Outline • The mean is also called the ”average” score n X Measures of Spread/Variation Levels of Scale Central Tendency Measures of Spread/Variation Outline Levels of Scale Central Tendency The Mode The Median The Mean Measures of Spread/Variation Range Standard Deviation Z-Scores Standard Error of the Mean Confidence Intervals Measures of Shape The Normal Distribution Skewness Kurtosis Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Standard Deviation Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Computing the Standard Deviation • Suppose that we have a dataset where X = 1,2,3,4 r • The standard deviation is also sometimes referred to as the (1 − 2.5)2 + (2 − 2.5)2 + (3 − 2.5)2 + (4 − 2.5)2 4−1 r 2 2 (−1.5) + (−0.5) + (0.5)2 + (1.5)2 = 3 r 2.25 + 0.25 + 0.25 + 2.25 = 3 r 5 = 3 √ = 1.67 SDX = mean deviation • The SD represents the ”average distance” of each score from the mean s Pn SDX = − X)2 n−1 i=1 (Xi = 1.29 Levels of Scale Central Tendency Measures of Spread/Variation Outline Levels of Scale Central Tendency The Mode The Median The Mean Measures of Spread/Variation Range Standard Deviation Z-Scores Standard Error of the Mean Confidence Intervals Measures of Shape The Normal Distribution Skewness Kurtosis Confidence Intervals Measures of Shape Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Z-Scores • As opposed to the other measure of variation, the z-score is a measure of deviation for a single individual, as opposed to a group of scores. • The Z-score represents the number of Standard Deviation units a given piece of data is from the mean. Zi = Xi − X SDX Raw Score 1 2 3 4 Z-Score -1.16 -0.39 0.39 1.16 Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Central Tendency Outline Central Tendency Confidence Intervals Measures of Shape Standard Error of the Mean Levels of Scale Central Tendency The Mode The Median The Mean Measures of Spread/Variation Range Standard Deviation Z-Scores Standard Error of the Mean Confidence Intervals Measures of Shape The Normal Distribution Skewness Kurtosis Levels of Scale Measures of Spread/Variation • The standard error of the mean is really just computed for the purposes of computing Confidence Intervals SDX SEMX = √ n • For our dataset, X = 1,2,3,4 , 1.29 SEMX = √ 4 = 0.645 Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Central Tendency Confidence Intervals Measures of Spread/Variation Confidence Intervals Measures of Shape Area Under the Curve • Suppose that we have a uniform normal distribution of 0.3 0.3 0.4 95% area at z=1.95 0.4 68.26% area at z=1 population scores between -1 and +1 [U (−1, 1)] and we take many samples of n=10. 4 0.2 6 dnorm(x, 0, 1) 8 0.2 dnorm(x, 0, 1) Percent of Total 10 0 [1] 6.570396e-05 > sd(res) [1] 0.1825853 0.0 0.5 1.0 0.0 > mean(res) −0.5 0.0 −1.0 0.1 0.1 2 −3 −2 −1 0 x 1 2 3 −3 −2 −1 0 x 1 2 3 Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Computing 95% Confidence Intervals Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Graphing 95% Confidence Intervals 8 6 Let’s look back at our previous dataset where X=1,2,3,4. In this case, we noted that the mean of X = 2.5 and the SD of X = 1.29. This would make the SEM = 1.29/2 = 0.645. 95% confidence intervals for any dataset are computed as: > x <- c(1, 2, 3, 4) > y <- c(1, 3, 5, 7, 9) > plotmeans(c(x, y) ~ rep(c("x", "y"), c(4, 5))) c(x, y) = 2.5 ± 1.96 ∗ 0.645 = 2.5 ± 1.264 ● 4 CI95% = X ± 1.96 ∗ SEM 2 ● n=4 n=5 x y rep(c("x", "y"), c(4, 5)) Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Thinking Through Confidence Intervals 13 12 11 ● ● Given the properties of the above two studies, which study would have the LARGER confidence intervals? 7 8 Thought Question > study1 <- rnorm(30, 10, 5) > study2 <- rnorm(30, 10, 10) > plotmeans(c(study1, study2) ~ rep(c("Study 1", + "Study 2"), c(30, 30))) 10 Study 2 10 10 30 Confidence Intervals 9 mean SD n Study 1 10 5 30 Measures of Spread/Variation Thinking Through Confidence Intervals c(study1, study2) Suppose that we have two different datasets with the following properties: Central Tendency n=30 n=30 Study 1 Study 2 rep(c("Study 1", "Study 2"), c(30, 30)) Measures of Shape Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Thinking Through Confidence Intervals Confidence Intervals Measures of Shape Thought Question 12 11 ● ● 10 c(study1, study2) 13 14 Study 2 10 10 30 > study1 <- rnorm(30, 10, 10) > study2 <- rnorm(300, 10, 10) > plotmeans(c(study1, study2) ~ rep(c("Study 1", + "Study 2"), c(30, 300))) 9 Study 1 10 10 300 Measures of Spread/Variation Thinking Through Confidence Intervals Suppose that we have two different datasets with the following properties: mean SD n Central Tendency 8 Given the properties of the above two studies, which study would have the LARGER confidence intervals? n=30 n=300 Study 1 Study 2 rep(c("Study 1", "Study 2"), c(30, 300)) Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Central Tendency Outline Confidence Intervals Specific Normal Distributions N(10,1) N(10,4) N(12,1) 0.4 0.3 f(x) Levels of Scale Central Tendency The Mode The Median The Mean Measures of Spread/Variation Range Standard Deviation Z-Scores Standard Error of the Mean Confidence Intervals Measures of Shape The Normal Distribution Skewness Kurtosis Measures of Spread/Variation 0.2 0.1 0.0 6 8 10 x 12 14 Measures of Shape Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale The standard normal distribution Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Evaluating the Normal Distribution • The standard normal distribution is the one with µ = 0 and σ = 1. We often use Z to denote it, i.e. Z ∼ N (0, 12 ). • Its density function is often written φ(z) and 1 2 φ(z) = √ e−z /2 2π • As given from the previous slide, we assume that the normal distribution has a structure such that mean meadian mode. • We can also evaluate the shape of our distribution relative to it’s deviation from the standard normal distribution through multiple means. −∞<z <∞ 0.4 • Skewness - a measure of shape determining whether or not the ”left half” looks like the ”right half” 0.3 • Kurtosis - a measure of shape determining relative hieght to width 0.2 • Gaussian probability density plot 0.1 0.0 −3 Levels of Scale −2 Central Tendency −1 0 Measures of Spread/Variation 1 2 Confidence Intervals 3 Measures of Shape Levels of Scale Central Tendency Outline Levels of Scale Central Tendency The Mode The Median The Mean Measures of Spread/Variation Range Standard Deviation Z-Scores Standard Error of the Mean Confidence Intervals Measures of Shape The Normal Distribution Skewness Kurtosis Measures of Spread/Variation Confidence Intervals Measures of Shape Skewness • Skewness measures typically range from -3 to +3 • A skewness of 0 means that the left half looks exactly like the right half • Positively skewed means that the mean is influenced by outliers making it ”seem” larger than the relative mean of the population from which the sample was drawn n n X X n xi − x̄ 3 n skew = = zi3 (n − 1)(n − 2) SDx (n − 1)(n − 2) i=1 i=1 Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Skewness Examples - Normal Distribution Histogram for Normally Distributed Data QQ plot for a Normal Distribution Histogram for Positively Skewed Data 10 5 ●●●● ● ● ● ● ● ● ● ● 100 80 60 0 ● 50 QQ plot for Positively Skewed Data 100 −2 0 0 2 Measures of Spread/Variation Measures of Shape QQ plot for Negatively Skewed Data ● ●●●● ●●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● 0 25 rskt(200, 2, −2) 20 15 −5 ● ● −10 10 ● ●● ● 5 −15 ● 0 ● −15 −10 −5 rskt(200, 2, −2) 0 −3 −2 −1 0 qnorm 20 40 60 80 −3 rskt(200, 2, 2) Confidence Intervals 30 ● 0 qnorm Histogram for Negatively Skewed Data ● 1 2 3 Levels of Scale Central Tendency ● ● 10 20 Skewness Examples - Negatively Skewed Distribution −20 40 0 150 Central Tendency ● 20 ● ●● ●●●● rnorm(1000, 100, 15) Measures of Shape 30 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 120 Confidence Intervals 60 ● Percent of Total rnorm(1000, 100, 15) Percent of Total Measures of Spread/Variation ● 140 Percent of Total Central Tendency Skewness Examples - Positively Skewed Distribution 15 Levels of Scale Levels of Scale rskt(200, 2, 2) Levels of Scale ● ● ●●● ●● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●●● −2 −1 0 1 2 3 qnorm Measures of Spread/Variation Outline Levels of Scale Central Tendency The Mode The Median The Mean Measures of Spread/Variation Range Standard Deviation Z-Scores Standard Error of the Mean Confidence Intervals Measures of Shape The Normal Distribution Skewness Kurtosis Confidence Intervals Measures of Shape Levels of Scale Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Levels of Scale Kurtosis Central Tendency Measures of Spread/Variation Confidence Intervals Measures of Shape Graphical Examples of Kurtosis • Although all of these theoretically could be normal, they illustrate leptokurtic and platykurtic distributions. • Kurtosis measures typically range from -3 to +3 Normal Platykurtic Leptokurtic • A kurtosis of 0 means that your distribution has the same 0.4 relative height to width properties as a normal distribution • Positive kurtosis means that your distribution is leptokurtic (taller and skinnier) 0.3 • Negative kurtosis means that your distribution is platykurtic kurt = n(n + 1) (n − 1)(n − 2)(n − 3) f(x) (shorter and fatter) n X i=1 xi − x̄ SDx 4 − 0.2 1)2 3(n − (n − 2)(n − 3) 0.1 0.0 6 8 10 x 12 14