Chapter 3: A Statistics Refresher
What are Scales of Measurement?
•
Nominal
– Categories
•
Ordinal
– Magnitude
•
Interval
– Magnitude and equal interval
•
Ratio
– All of the above, plus a true zero point
Examples
•
Groups of low, medium, high?
•
How much you like something?
– 1 = not at all, 5 = very much
Frequency Distribution
•
Summary information of scores and their occurrence.
Grouped Frequency Distribution
•
Intervals replace specific values
– Identification of 12-15 class intervals recommended
Example of G.F.D.
•
20 students surveyed.
– Values are the number of alcoholic beverages.
– How do you create a frequency distribution using six classes?
Steps
•
Step 1: Find the highest and lowest
values: H = 11 and L = 0.
•
Step 2: Find the range:
R = H – L = 11 – 0 = 11.
•
Step 3: Select the number of classes
desired. In this case: 6.
•
Step 4: Find the class width by dividing the range by the number of classes: Width = 11/6 = 1.83, which is rounded up to 2.
•
Step 5: Select a starting point for the lowest class limit (0) and add the class width repeatedly to get the lower class limits: 0, 2, 4, 6, 8, 10.
•
Step 6: Upper class limits will be 1, 3, 5, 7, 9, and 11.
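The six steps above can be sketched in Python. This is a minimal illustration rather than the chapter's own procedure: the function name `class_limits` is hypothetical, and the code assumes whole-number scores, so each upper limit is one less than the next lower limit.

```python
import math

def class_limits(highest, lowest, num_classes):
    """Return (lower, upper) class limits for a grouped frequency distribution."""
    score_range = highest - lowest                    # Step 2: R = H - L
    width = math.ceil(score_range / num_classes)      # Step 4: 11/6 = 1.83, rounded up to 2
    lowers = [lowest + i * width for i in range(num_classes)]  # Step 5
    uppers = [low + width - 1 for low in lowers]                # Step 6
    return list(zip(lowers, uppers))

limits = class_limits(highest=11, lowest=0, num_classes=6)
print(limits)  # [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11)]
```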
Most Common Graphs
•
Histogram
•
Frequency Polygon
•
Ogive
Histogram
•
Intervals = columns; Height = frequency
Frequency Polygon
•
Intervals = points; plotted at the midpoint of each interval.
Ogive
•
Represents the cumulative frequency.
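The cumulative frequencies that an ogive plots are running totals of the class frequencies. The sketch below uses made-up frequencies for six classes; they are an assumption for illustration, since the transcript does not list the survey data.

```python
from itertools import accumulate

# Hypothetical frequencies for six class intervals (20 scores in total).
frequencies = [5, 6, 4, 3, 1, 1]

# Running total: each entry counts the scores falling in that class or any lower one.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [5, 11, 15, 18, 19, 20]
```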
Measures of Central Tendency
•
Mean
•
Median
•
Mode
Mean
•
Average score
Mode
•
Most commonly occurring score
•
Can you have more than one?
•
Can you have no mode?
Mode Example
•
Eleven different sports cars were tested for 0-60 mph times.
•
Data set: 4.2, 4.5, 4.5, 4.5, 5, 5.2, 5.5, 5.5, 5.5, 6.2, 6.2
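The data set above can be checked with Python's standard `statistics` module. `multimode` returns every value tied for the highest frequency, which answers the question of whether there can be more than one mode.

```python
from statistics import multimode

# 0-60 mph times from the example above.
times = [4.2, 4.5, 4.5, 4.5, 5, 5.2, 5.5, 5.5, 5.5, 6.2, 6.2]

# 4.5 and 5.5 each occur three times, so the data set is bimodal.
modes = multimode(times)
print(modes)  # [4.5, 5.5]
```

When every score occurs exactly once, `multimode` returns all of the scores, which is usually read as the distribution having no mode.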
Median (MD)
•
Rank the scores; the median is the middle score.
•
Number of scores above and below it?
– 50th percentile
•
With an even number of scores, the median is the average of the two middle scores.
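Both cases can be sketched with `statistics.median`. The scores are the sports car times from the mode example; dropping the last score to get an even count is done purely for illustration.

```python
from statistics import median

times = [4.2, 4.5, 4.5, 4.5, 5, 5.2, 5.5, 5.5, 5.5, 6.2, 6.2]

# Odd count (11 scores): the median is the 6th ranked score.
odd_median = median(times)

# Even count (10 scores): the median is the average of the 5th and 6th
# ranked scores, here 5 and 5.2.
even_median = median(times[:-1])

print(odd_median, even_median)
```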
Measures of Variability
•
Extent of dispersion around central tendency
•
Variability is useful in interpreting individual differences in the distribution
•
Range
•
Variance
•
Standard Deviation
Range
•
Difference between extreme scores
•
Highest score is 100, lowest score is 55?
Variance
•
Variance is equal to the sum of the squared deviations divided by the total number of scores
Variance Problem
•
Summing the deviations about the mean cancels out the differences (i.e., the sum is always 0)
•
Square the deviations
•
Squaring the deviations results in positive numbers, but they are inflated because they are in squared units
•
What do we do?
Standard Deviation
•
Square root of the variance (the average of the squared deviations about the mean)
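The chain from deviations to variance to standard deviation can be traced on a small set of scores. The scores below are hypothetical. Following the definition above, the variance divides by the total number of scores (N), not by N - 1.

```python
import math
from statistics import pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores, mean = 5

mean = sum(scores) / len(scores)
deviations = [x - mean for x in scores]

# The deviations about the mean always cancel out.
print(sum(deviations))  # 0.0

# Squaring removes the signs; averaging the squares gives the variance.
variance = sum(d ** 2 for d in deviations) / len(scores)

# Taking the square root returns to the original score units.
sd = math.sqrt(variance)
print(variance, sd)  # 4.0 2.0

# The standard library's population versions give the same values.
assert pvariance(scores) == variance
assert pstdev(scores) == sd
```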
Standard Deviation Pluses
•
Expresses individual standings
•
Shows the dispersion of a distribution
•
Allows comparisons across independent distributions
– As long as they have the same SD
Properties of a Curve
•
Skewness
– Positive and Negative
•
Kurtosis
– Leptokurtic
– Mesokurtic: Normal Curve
– Platykurtic
What is a Normal Curve?
•
Bell-shaped curve representing a symmetrical distribution of scores
•
Mean, mode, median are equal
Figure of a Normal Curve
•
The normal curve is a frequency polygon
What Does the Area Under the Normal Curve Convey?
•
Area of the curve describes the proportional distribution of scores
•
Typically understood in terms of standard deviations from the mean
Standard Deviation Cutpoints of a Normal Curve
•
The median (50th percentile) cuts the distribution in half
•
34% falls between the mean and one standard deviation above (or below) it
•
68% between -1 and +1 standard deviations
•
95% between -2 and +2 standard deviations
•
99.7% between -3 and +3 standard deviations
Area Under the Normal Curve
•
34% falls between the mean and one standard deviation above (or below) it
•
68% between -1 and +1 standard deviations
•
47.5% of the curve is between 0 and 1.96 sd units
•
95% of the curve is between –1.96 and +1.96 sd units
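The areas under the normal curve can be computed instead of memorized. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `area_between` is hypothetical. The printed percentages are rounded to one decimal place.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal curve: mean 0, SD 1

def area_between(low, high):
    """Proportion of the curve between two cutpoints given in SD units."""
    return std_normal.cdf(high) - std_normal.cdf(low)

cutpoints = [(0, 1), (-1, 1), (-2, 2), (-3, 3), (0, 1.96), (-1.96, 1.96)]
for low, high in cutpoints:
    print(f"{low:+.2f} to {high:+.2f} SD: {area_between(low, high):.1%}")
```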
Skewed Distribution Problems
•
We can’t normally make comparisons
•
Mathematically, we can try to normalize the distribution (perform a non-linear transformation)
•
Convert raw scores to percentile ranks then to z-scores
•
Normalizing is generally not desirable; you want to obtain a normal curve to begin with
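The percentile-rank route to normalized z-scores can be sketched as follows. The function name and the skewed scores are hypothetical. The mid-rank definition of percentile rank used here (scores below plus half of the tied scores, divided by N) is an assumption, as the slides do not say which definition they intend.

```python
from statistics import NormalDist

def normalized_z_scores(raw_scores):
    """Map raw scores to z-scores through percentile ranks (a non-linear transformation)."""
    n = len(raw_scores)
    std_normal = NormalDist()
    z_scores = []
    for x in raw_scores:
        below = sum(1 for s in raw_scores if s < x)
        equal = sum(1 for s in raw_scores if s == x)
        # Percentile rank as a proportion (assumed mid-rank definition);
        # it always lies strictly between 0 and 1.
        proportion = (below + 0.5 * equal) / n
        # z-score with that proportion of the normal curve below it.
        z_scores.append(std_normal.inv_cdf(proportion))
    return z_scores

skewed = [1, 1, 2, 2, 3, 4, 6, 9, 15, 30]   # hypothetical, positively skewed
z = normalized_z_scores(skewed)
print([round(value, 2) for value in z])
```

Unlike a linear standard score, this transformation changes the shape of the distribution: the long right tail of the raw scores is pulled in toward the mean.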
What are Standard Scores?
•
Standard scores are linearly transformed scores
•
Raw scores are mathematically converted to a common scale, which makes scores comparable
•
A standard score distribution maintains the same shape as the original raw score
distribution
•
Best example is the z score
Standard score (z scores)
•
z scores are expressed in standard deviation units (mean = 0, SD = 1)
•
What is the z score of a score of 50 on a test with a mean of 30 and a standard deviation of
10?
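The question above can be worked through with the usual z-score formula, z = (raw score - mean) / SD. The function name `z_score` is hypothetical.

```python
def z_score(raw, mean, sd):
    """Express a raw score as a number of standard deviations from the mean."""
    return (raw - mean) / sd

# A score of 50 on a test with a mean of 30 and a standard deviation of 10.
z = z_score(raw=50, mean=30, sd=10)
print(z)  # 2.0
```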
Standard score (T scores)
•
T scores are expressed in standard deviation units as well (mean = 50, SD = 10)
•
What is the T score of a raw score falling 2 standard deviations above the mean?
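Because T scores have a mean of 50 and a standard deviation of 10, a T score can be obtained from a z score as T = 50 + 10z. A minimal sketch, with the hypothetical function name `t_score`:

```python
def t_score(z):
    """Convert a z score to a T score (mean 50, SD 10)."""
    return 50 + 10 * z

# A raw score falling 2 standard deviations above the mean has z = 2.
t = t_score(2)
print(t)  # 70
```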