Basic Statistics
(Vocabulary introduced in the Introduction to Research and Statistics course)
CENTRAL TENDENCY
Distribution: A collection, or group, of scores from a sample on a single variable. Often, but
not necessarily, these scores are arranged in order from smallest to largest.
Mean: The arithmetic average of a distribution of scores.
Median: The score in a distribution that marks the 50th percentile. It is the score at which
50% of the distribution falls below and 50% falls above. (Less affected by extreme scores
than the mean.)
Mode: The score in a distribution that occurs most frequently.
Negative skew: In a skewed distribution, when most of the scores are clustered at the higher
end of the distribution with a few scores creating a tail at the lower end.
Outliers: Extreme scores - more than two standard deviations above or below the mean.
Positive skew: In a skewed distribution, when most of the scores are clustered at the lower
end of the distribution with a few scores creating a tail at the higher end.
Population: The group from which data are collected or a sample selected. The population
encompasses the entire group for which the data are alleged to apply.
Sample: An individual or group, selected from a population, from whom or which data are
collected.

X: An individual score in a distribution.
X̄: The mean of a sample.
μ: The mean of the population.
n: The number of cases, or scores, in a sample.
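
The central tendency terms above can be illustrated with a short Python sketch. The scores are hypothetical, invented for this example, and the outlier check uses the two-standard-deviation rule of thumb from the definition above together with the sample standard deviation (n - 1 denominator), which is an assumption of this sketch.

```python
import statistics

# Hypothetical scores, invented for illustration (not from the source).
scores = [2, 3, 3, 4, 5, 5, 5, 6, 30]

n = len(scores)                     # number of cases in the sample
mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # score at the 50th percentile
mode = statistics.mode(scores)      # most frequent score
sd = statistics.stdev(scores)       # sample standard deviation (n - 1)

# Rule of thumb from the glossary: scores more than two standard
# deviations above or below the mean are treated as outliers.
outliers = [x for x in scores if abs(x - mean) > 2 * sd]

# A mean pulled above the median suggests a tail at the higher end,
# that is, positive skew.
positively_skewed = mean > median

print(f"n={n} mean={mean} median={median} mode={mode}")
print(f"outliers={outliers} positively_skewed={positively_skewed}")
```

Here the single extreme score of 30 pulls the mean above the median, while the median and mode stay at the center of the cluster.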
VARIABILITY
Boxplot: A graphic representation of the distribution of scores that includes the range, the
median, and the interquartile range. (Top of box represents 75th percentile, bottom represents
25th percentile.)
Range: The difference between the largest score and the smallest score of a distribution.
(Remember, each score represents an interval extending 1/2 point on either side of it.)
Text that is outside of the parentheses is quoted from the following source:
Urdan, T. C. (2001). Statistics in plain English. Mahwah, NJ: Lawrence Erlbaum.
Standard deviation: The typical amount by which the scores in a distribution deviate from
the mean of the distribution. (The larger the standard deviation, the broader the distribution
of scores.)
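
As a rough illustration of the variability terms above, the following Python sketch computes the range, the interquartile range, and the standard deviation for a hypothetical set of scores. The quartiles come from `statistics.quantiles` with its default method; other software may use a different quartile convention and give slightly different values.

```python
import statistics

# Hypothetical scores, invented for illustration (not from the source).
scores = [4, 7, 8, 10, 12, 15, 21]

# Range: largest score minus smallest score.
score_range = max(scores) - min(scores)

# Quartiles: the 25th, 50th, and 75th percentiles. In a boxplot the
# box runs from q1 (bottom) to q3 (top), with the median inside it.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # interquartile range

# Sample standard deviation (n - 1 in the denominator).
sd = statistics.stdev(scores)

print(f"range={score_range} q1={q1} median={q2} q3={q3} iqr={iqr}")
print(f"sd={sd:.3f}")
```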
Z SCORES
Raw scores: These are the individual observed scores on measured variables.
Standard score: A raw score that has been converted to a z score by subtracting the mean
from it and dividing by the standard deviation of the distribution. (Standard scores allow you
to compare across different types of tests and different scales.)
z score: One type of standard score that is frequently used in educational research.
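
A minimal sketch of the conversion, z = (raw score - mean) / standard deviation, is shown below. The helper name `z_score` and the two tests with their means and standard deviations are hypothetical, chosen only to show how standard scores allow comparison across scales.

```python
def z_score(raw, mean, sd):
    """Convert a raw score to a z score: (raw - mean) / sd."""
    return (raw - mean) / sd

# Hypothetical example: one student takes two tests on different scales.
# Test A has mean 70 and standard deviation 10; the student scores 85.
# Test B has mean 500 and standard deviation 100; the student scores 600.
z_a = z_score(85, 70, 10)
z_b = z_score(600, 500, 100)

# The larger z score marks the better performance relative to each group.
print(f"z on test A = {z_a}, z on test B = {z_b}")
```

Although 600 is the larger raw score, the student did relatively better on test A: 1.5 standard deviations above the mean versus 1.0 on test B.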
STATISTICS
Degrees of freedom: A number used to approximate the number of observations in the data
set for the purpose of determining statistical significance. (In an independent samples t-test
it is n - 2, where n is the total number of cases in the two groups combined.)
Inferential statistics: Statistics generated from sample data that are used to make inferences
about the characteristics of the population the sample is alleged to represent.
Descriptive statistics: Statistics that describe the characteristics of a given sample or
population. These statistics are only meant to describe the characteristics of those sampled.
Effect size: A way of determining the practical significance of a statistic by reducing the
impact of sample size on the outcome. (Effect sizes are reported in standard deviation units.
Remember the scale is approximately –3 to +3.)
Null hypothesis: The hypothesis that there is no effect in the population (e.g., that two
population means are not different from each other). The null is related to the data analysis
section and is usually not reported in published studies; instead, the research hypothesis (e.g.
“Our hypothesis is that…”) is reported.
p value: The probability of obtaining a statistic of a given size from a sample of a given size
by chance, or due to random error. (p < .05 means that, if the null hypothesis were true,
results at least as extreme as those obtained in the study would occur by chance fewer than
5 times out of every 100. The conventional alpha level is .05, but researchers can set alpha at
any level; the decision is usually related to the amount of error that can be tolerated in the
findings.)
Statistical significance: When the probability of obtaining a statistic of a given size due
strictly to random sampling error, or chance, is less than the selected alpha level, the result is
said to be statistically significant. It also represents a rejection of the null hypothesis.
Type I error: Rejecting the null hypothesis when in fact the null hypothesis is true.
(Remember, Type I errors can have dramatic consequences; they lead you to report findings
as unusual, challenging what is currently accepted, when no real effect exists. Setting your
alpha level at .05 or less limits the probability of a Type I error to that level.)
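
To make the degrees of freedom and effect size entries above concrete, here is a small Python sketch for two independent groups. The data are hypothetical, and the effect size shown is Cohen's d with a pooled standard deviation, which is one common choice and not necessarily the measure the source has in mind.

```python
import math
import statistics

# Hypothetical scores for two independent groups (not from the source).
group_a = [12, 14, 15, 16, 18]
group_b = [10, 11, 13, 14, 17]

n_a, n_b = len(group_a), len(group_b)

# Degrees of freedom for an independent samples t-test:
# total number of cases minus 2.
df = n_a + n_b - 2

# Pooled variance: the two sample variances weighted by their
# own degrees of freedom.
var_a = statistics.variance(group_a)
var_b = statistics.variance(group_b)
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df

# Cohen's d: the mean difference expressed in standard deviation units.
mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
cohens_d = mean_diff / math.sqrt(pooled_var)

print(f"df={df} cohens_d={cohens_d:.2f}")
```

Because d is expressed in standard deviation units, it does not grow simply because the sample gets larger, which is what makes it useful for judging practical significance.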
CORRELATION
Continuous variables: Variables measured using an interval or ratio scale. (You need to use
continuous variables when calculating the Pearson product-moment correlation coefficient.)
Correlation coefficient: A statistic that reveals the strength and direction of the relationship
between two variables.
Direction: A characteristic of a correlation that describes whether two variables are
positively or negatively related to each other. (Remember, a negative correlation isn’t bad, it
only shows direction.)
Negative correlation: A descriptive feature of a correlation indicating that as scores on one
of the correlated variables increase scores on the other variable decrease, and vice-versa.
Positive correlation: A descriptive feature of a correlation indicating that as scores on one of
the correlated variables increase scores on the other variable also increase.
Pearson product-moment correlation coefficient: A statistic indicating the strength and
direction of the relation between two continuous variables.
Perfect negative correlation: A correlation coefficient of r = -1.0. Increasing scores of a
given size on one variable are perfectly associated with decreasing scores of a related size on
the second variable in the correlation.
Perfect positive correlation: A correlation coefficient of r = +1.0. Increasing scores of a
given size on one variable are perfectly associated with increasing scores of a related size on
the second variable in the correlation.
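
The correlation terms above can be sketched in Python as follows. The function `pearson_r` is a hypothetical helper written for this example, using the standard formula (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations), and the data are invented to produce a perfect positive and a perfect negative correlation.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical continuous variables (not from the source).
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases by 2 for every extra hour
falling = [10, 8, 6, 4, 2]   # decreases by 2 for every extra hour

r_pos = pearson_r(hours, rising)
r_neg = pearson_r(hours, falling)

print(f"r_pos={r_pos:+.1f} r_neg={r_neg:+.1f}")
```

The sign of r gives the direction of the relationship and its absolute value gives the strength.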
T-TESTS
Independent samples t-test: A test of the statistical similarity between the means of two
independent samples on a single variable. (Do mean differences exist between two groups?)
Dependent samples t-test: A test comparing the means of paired, or dependent, samples on a
single variable. Each score at Time 1 is matched with the score for the same individual at
Time 2. (Also called a ‘matched’ or ‘paired’ t-test.)
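
The two tests defined above can be sketched by computing their t statistics by hand in Python. All data are hypothetical, and the independent samples version assumes equal variances in the two groups (the pooled-variance form); looking up the p value for each t and its degrees of freedom is left to a t table or a statistics package.

```python
import math
import statistics

# Independent samples: two separate hypothetical groups.
group_a = [12, 14, 15, 16, 18]
group_b = [10, 11, 13, 14, 17]
n_a, n_b = len(group_a), len(group_b)
df_ind = n_a + n_b - 2
pooled_var = ((n_a - 1) * statistics.variance(group_a)
              + (n_b - 1) * statistics.variance(group_b)) / df_ind
# Standard error of the difference between the two means.
se_ind = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_ind = (statistics.mean(group_a) - statistics.mean(group_b)) / se_ind

# Dependent samples: the same hypothetical individuals at two times.
time_1 = [10, 12, 9, 14, 11]
time_2 = [12, 15, 10, 15, 14]
# Work with each individual's change score (Time 2 minus Time 1).
diffs = [after - before for before, after in zip(time_1, time_2)]
df_dep = len(diffs) - 1
se_dep = statistics.stdev(diffs) / math.sqrt(len(diffs))
t_dep = statistics.mean(diffs) / se_dep

print(f"independent: t={t_ind:.3f}, df={df_ind}")
print(f"dependent:   t={t_dep:.3f}, df={df_dep}")
```

Note the different degrees of freedom: the independent test uses the total number of cases minus 2, while the dependent test uses the number of pairs minus 1.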