CRIM 483
Measuring Variability

Variability
Variability refers to the spread or dispersion of scores.
Variability captures the degree to which scores within a dataset differ from one another:
– High variability = large distance between scores (score set: 7, 6, 3, 3, 1)
– Low variability = small distance between scores (score set: 4, 2, 3, 3, 1)
– No variability = no distance between scores (score set: 4, 4, 4, 4, 4)
Variability and the mean are used together to describe the characteristics of a distribution (sample) and to show how distributions differ from one another.
There are 3 measures of variability: range, standard deviation, and variance.

Range
The range is the most general measure of variability.
The formula: $r = h - l$
– r = range
– h = highest score
– l = lowest score
Calculating the range provides a general estimate of how widely scores differ from one another.
Examples:
– Highest age = 35, lowest age = 21: 35 - 21 = 14 years difference between age scores in the sample
– Highest age = 50, lowest age = 15: 50 - 15 = 35 years difference between age scores in the sample
– Which sample has the greatest variability with regard to age?

Standard Deviation
Standard deviation (SD) = average amount of variability from the mean in the set of scores (average distance from the mean)
– Standard deviation is used most often to measure variability
– Reported (as a rule) in combination with means
– The greater the SD, the larger the average distance between the scores and the mean
Formula to calculate the SD: $s = \sqrt{\dfrac{\sum (x - \text{mean})^2}{n - 1}}$
– s = standard deviation
– ∑ = sigma (sum of)
– x = individual score
– mean = mean of all scores
– n = sample size

Clarification of Formula
Why not add up the deviations from the mean?
– The sum of deviations from the mean is always equal to zero (a good way to check your work)
Why square the deviations?
– To get rid of the negative signs, so that the deviations do not sum to 0
Why take the square root?
– To return to the same units that you started with

Unbiased v. Biased Estimates
Unbiased
– You produce an unbiased estimate by dividing by (n - 1) in the SD formula
– This artificially forces the SD to be larger than it would be otherwise
– Why? This produces a more conservative estimate that we can feel more comfortable with; it is safer to overestimate than to underestimate
Biased
– You produce a biased estimate by dividing by (n) in the SD formula
– Use the biased estimate when you are merely describing your sample and have no intention of comparing it to the population
Ultimately, the larger your sample size, the less difference there is between the unbiased and biased estimates (p. 40).

In Sum…
– Must always compute the mean first
– SDs play a critical role later when comparing scores between groups (e.g., do male and female attitudes differ?)
– Like means, SDs are sensitive to extreme scores

Variance
The final method of measuring variability is variance.
Very similar to the SD formula, but without the square root
– Formula to calculate the variance: $s^2 = \dfrac{\sum (x - \text{mean})^2}{n - 1}$
– s² = variance
– ∑ = sigma (sum of)
– x = individual score
– mean = mean of all scores
– n = sample size
Variance is difficult to interpret and apply by itself.
Variance has greater utility in the formulas of more advanced statistics.

Standard Deviation v. Variance
Both measure variability, dispersion, or spread.
SD expresses variability in the original units, and variance expresses variability in units squared.
– Example from book re: circuit board assembly
  8.6 boards assembled/hour on average
  1.59 = SD: on average, workers differ from the mean by 1.59 boards
  2.53 = Variance: the difference from the mean is 2.53 boards squared

Differences in Variability
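
As a rough illustration of the range, SD, and variance formulas from the earlier slides, the sketch below applies them to the three score sets used to introduce high, low, and no variability. The function name `describe` and the dictionary keys are hypothetical choices made for this example; they do not come from the course materials.

```python
import math


def describe(scores):
    """Compute the range, variance, and SD of a list of scores.

    Hypothetical helper for illustration only; it follows the slide
    formulas r = h - l and s = sqrt(sum((x - mean)^2) / (n - 1)).
    """
    n = len(scores)
    mean = sum(scores) / n                      # the mean must be computed first
    deviations = [x - mean for x in scores]     # distance of each score from the mean
    sum_sq = sum(d ** 2 for d in deviations)    # squaring removes the negative signs
    return {
        "range": max(scores) - min(scores),         # highest score minus lowest score
        "sum_of_deviations": sum(deviations),       # should be (about) zero
        "variance": sum_sq / (n - 1),               # unbiased variance, in units squared
        "sd_unbiased": math.sqrt(sum_sq / (n - 1)), # divide by n - 1
        "sd_biased": math.sqrt(sum_sq / n),         # divide by n (describing the sample only)
    }


score_sets = {
    "high variability": [7, 6, 3, 3, 1],
    "low variability": [4, 2, 3, 3, 1],
    "no variability": [4, 4, 4, 4, 4],
}

for label, scores in score_sets.items():
    print(label, describe(scores))
```

For the high-variability set the mean is 4, the squared deviations sum to 24, and so the range is 6, the variance is 24/4 = 6, and the unbiased SD is about 2.45; dividing by n instead gives a smaller biased SD of about 2.19. The same relationship between SD and variance appears in the circuit board example, where 1.59 squared is roughly 2.53.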