Basic Statistics
(Vocabulary introduced in the Introduction to Research and Statistics course)

CENTRAL TENDENCY

Distribution: A collection, or group, of scores from a sample on a single variable. Often, but not necessarily, these scores are arranged in order from smallest to largest.
Mean: The arithmetic average of a distribution of scores.
Median: The score in a distribution that marks the 50th percentile. It is the score at which 50% of the distribution falls below and 50% falls above. (More stable than the mean, because it is less affected by extreme scores.)
Mode: The score in a distribution that occurs most frequently.
Negative skew: In a skewed distribution, when most of the scores are clustered at the higher end of the distribution with a few scores creating a tail at the lower end.
Outliers: Extreme scores, more than two standard deviations above or below the mean.
Positive skew: In a skewed distribution, when most of the scores are clustered at the lower end of the distribution with a few scores creating a tail at the higher end.
Population: The group from which data are collected or a sample is selected. The population encompasses the entire group for which the data are alleged to apply.
Sample: An individual or group, selected from a population, from whom or which data are collected.
X: An individual score in a distribution.
X̄: The mean of a sample.
μ: The mean of the population.
n: The number of cases, or scores, in a sample.

VARIABILITY

Boxplot: A graphic representation of the distribution of scores that includes the range, the median, and the interquartile range. (Top of box represents the 75th percentile, bottom represents the 25th percentile.)
Range: The difference between the largest score and the smallest score of a distribution. (Remember, each score represents a range from 1/2 point below to 1/2 point above its recorded value.)

Text that is outside of the parentheses is quoted from the following source: Urdan, T. C. (2001). Statistics in plain English. Mahwah, NJ: Lawrence Erlbaum.
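The central tendency, outlier, and range definitions above can be illustrated with a short sketch. The Python example below uses only the standard library. The scores are invented for illustration and do not come from the course or from Urdan (2001); the choice of the sample standard deviation for the outlier rule and the mean-versus-median skew check are assumptions, not part of the definitions.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 24]

n = len(scores)                           # number of cases in the sample
mean = statistics.mean(scores)            # arithmetic average
median = statistics.median(scores)        # score at the 50th percentile
mode = statistics.mode(scores)            # most frequent score
score_range = max(scores) - min(scores)   # largest minus smallest score

# Outliers under the glossary's rule: more than two standard deviations
# above or below the mean. Using the sample standard deviation (n - 1
# denominator) is an assumption; the glossary does not say which to use.
sd = statistics.stdev(scores)
outliers = [x for x in scores if abs(x - mean) > 2 * sd]

# Rough rule of thumb (not a formal skewness statistic): a mean above the
# median hints at a tail on the higher end, i.e. positive skew.
if mean > median:
    skew_hint = "positive"
elif mean < median:
    skew_hint = "negative"
else:
    skew_hint = "none"

print(f"n={n} mean={mean} median={median} mode={mode} range={score_range}")
print(f"outliers={outliers} skew hint={skew_hint}")
```

In this made-up sample, the single extreme score of 24 pulls the mean up to 6 while the median stays at 4, which is one way to see why the median is described as more stable than the mean.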
Standard deviation: How much, on average, the scores in a distribution deviate from the mean of the distribution. (The larger the standard deviation, the broader the distribution of scores.)

Z SCORES

Raw scores: These are the individual observed scores on measured variables.
Standard score: A raw score that has been converted to a z score by subtracting the mean from it and dividing by the standard deviation of the distribution. (Standard scores allow you to compare across different types of tests and different scales.)
z score: One type of standard score that is frequently used in educational research.

STATISTICS

Degrees of freedom: A number used to approximate the number of observations in the data set for the purpose of determining statistical significance. (In an independent samples t-test it is n1 + n2 - 2, the total number of cases in both groups minus 2.)
Inferential statistics: Statistics generated from sample data that are used to make inferences about the characteristics of the population the sample is alleged to represent.
Descriptive statistics: Statistics that describe the characteristics of a given sample or population. These statistics are only meant to describe the characteristics of those sampled.
Effect size: A way of determining the practical significance of a statistic by reducing the impact of sample size on the outcome. (Effect sizes are reported in standard deviation units. Remember, the scale runs from approximately -3 to +3.)
Null hypothesis: The hypothesis that there is no effect in the population (e.g., that two population means are not different from each other). The null is related to the data analysis section and is usually not reported in published studies; instead, the research hypothesis (e.g., “Our hypothesis is that…”) is reported.
p value: The probability of obtaining a statistic of a given size from a sample of a given size by chance, or due to random error. (p < .05 means that results like those obtained in the study would occur by chance alone fewer than 5 times out of every 100. The conventional cutoff is .05, but researchers can set their alpha level anywhere; the decision usually depends on the amount of error that can be tolerated in the findings.)
Statistical significance: When the probability of obtaining a statistic of a given size due strictly to random sampling error, or chance, is less than the selected alpha level, the result is said to be statistically significant. It also represents a rejection of the null hypothesis.
Type I error: Rejecting the null hypothesis when in fact the null hypothesis is true. (Remember, Type I errors can have dramatic consequences; they state that your findings are unusual and challenge what is currently accepted. Setting your alpha level at .05 or lower limits the risk of a Type I error.)

CORRELATION

Continuous variables: Variables measured using an interval or ratio scale. (You need to use continuous variables when calculating the Pearson product-moment correlation coefficient.)
Correlation coefficient: A statistic that reveals the strength and direction of the relationship between two variables.
Direction: A characteristic of a correlation that describes whether two variables are positively or negatively related to each other. (Remember, a negative correlation isn’t bad; it only shows direction.)
Negative correlation: A descriptive feature of a correlation indicating that as scores on one of the correlated variables increase, scores on the other variable decrease, and vice versa.
Positive correlation: A descriptive feature of a correlation indicating that as scores on one of the correlated variables increase, scores on the other variable also increase.
Pearson product-moment correlation coefficient: A statistic indicating the strength and direction of the relation between two continuous variables.
Perfect negative correlation: A correlation coefficient of r = -1.0. Increasing scores of a given size on one variable are perfectly associated with decreasing scores of a related size on the second variable in the correlation.
Perfect positive correlation: A correlation coefficient of r = +1.0. Increasing scores of a given size on one variable are perfectly associated with increasing scores of a related size on the second variable in the correlation.

T-TESTS

Independent samples t-test: A test of the statistical similarity between the means of two independent samples on a single variable. (Do mean differences exist between two groups?)
Dependent samples t-test: Test comparing the means of paired, or dependent, samples on a single variable. Each score at Time 1 is matched with the score for the same individual at Time 2. (Also called a ‘matched’ or ‘paired’ samples t-test.)
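The z score, correlation, and t-test definitions above can be sketched in a few lines of standard-library Python. This is an illustrative sketch rather than the method used in the course or in Urdan (2001): the function names and all data are hypothetical, the z scores use the population standard deviation, and the independent samples t statistic uses the pooled-variance form, which assumes roughly equal variances in the two groups. It stops at the t statistic and degrees of freedom and does not compute a p value.

```python
import math
import statistics


def z_scores(scores):
    """Convert raw scores to z scores: (raw score - mean) / standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population SD; this choice is an assumption
    return [(x - mean) / sd for x in scores]


def pearson_r(x, y):
    """Pearson r computed as the average product of paired z scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


def independent_t(group1, group2):
    """Pooled-variance t statistic and its degrees of freedom (n1 + n2 - 2)."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se
    return t, df


def dependent_t(time1, time2):
    """Paired t statistic on Time 2 minus Time 1 differences; df = pairs - 1."""
    diffs = [b - a for a, b in zip(time1, time2)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Hypothetical data, invented for illustration only.
hours = [1, 2, 3, 4, 5]
grades = [2, 4, 6, 8, 10]     # rises in step with hours
absences = [10, 8, 6, 4, 2]   # falls in step with hours

r_pos = pearson_r(hours, grades)     # perfect positive correlation
r_neg = pearson_r(hours, absences)   # perfect negative correlation
t_ind, df_ind = independent_t([4, 5, 6], [1, 2, 3])
t_dep, df_dep = dependent_t([1, 2, 3], [2, 4, 3])

print(round(r_pos, 3), round(r_neg, 3))
print(round(t_ind, 3), df_ind)
print(round(t_dep, 3), df_dep)
```

Because every score is first converted to a z score, the correlation does not depend on the original measurement scales, which is the same property that lets standard scores be compared across different tests.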