Download Class 11 Data Analysis

Class Meeting #11
Data Analysis

Types of Statistics
- Descriptive Statistics: used to describe things, frequently groups of people.
  - Central Tendency
  - Variability
  - Relative Standing
  - Relationship
- Inferential Statistics: used to make inferences and draw conclusions.
  - Parametric (t-test, ANOVA, multiple regression)
  - Non-Parametric (chi-square)

Types of Analysis
- Univariate – looks at one variable at a time.
- Bivariate – looks at variables two at a time.
- Multivariate – looks at three or more variables at a time.

Types of Variables
- Independent (or Predictor)
  - The variable measured first in time and from which a prediction is made. The “cause” variable.
- Dependent (or Predicted)
  - The variable measured later in time and which one wishes to predict. The “effect” variable.

Dayton, C. M. & Stunkard, C. L. (1971). Statistics for Problem Solving. New York: McGraw-Hill Book Company.
Charles, C. M. & Mertler, C. A. (2002). Introduction to Educational Research, 4th Edition. Boston: Allyn and Bacon.

Measurement Scales
- Nominal – A scale that measures data by name only, such as gender, hair color, race.
- Ordinal – A scale that measures data by rank order only, such as medical condition, military rank, socioeconomic status.
- Interval – A scale that measures data by using equal intervals, such as temperature, percentage correct on a test.

Nominal Scales
- A number is used to represent a category.
- The number has no meaning beyond serving as a label.
- Categories are mutually exclusive but qualitatively different.

Ordinal Scales
- A number is used to represent a category.
- Beyond serving as a label, the number conveys only rank order.
- Categories are mutually exclusive but qualitatively different.
- The categories are ordered in a meaningful way.
- Differences between consecutive units of measurement can be unequal.

Interval Scales
- A number is used to represent a specific amount.
- The numbers are meaningful in that they represent equal-sized units that correspond to equal increases in amounts of the underlying attribute.
- The scale may include a zero value, but the zero is not meaningful. It is only a convenient starting point for measurement.

Ratio Scales
- A number is used to represent a specific amount.
- The numbers are meaningful in that they represent equal-sized units that correspond to equal increases in amounts of the underlying attribute.
- In addition, there is a true zero on the scale that represents a true absence of the attribute being measured.

Organizing Data

Frequency Distribution
A table showing the number of test takers who received each of the scores possible (simple frequency distribution), or the number of test takers who scored within a specified interval range (grouped frequency distribution).

X (score)    f (frequency)
9            3
8            6
7            8
6            4
5            2
4            4
3            3
2            1

Displaying Data
- Histogram (bar graph)
- Frequency Polygon (line graph)
- Scatter Plots

Bar Graph
- Sometimes referred to as “column graph”
- Useful in presenting or comparing differences between groups
- Sometimes used to show how groups differ over time
Nichol & Pexman

Bar Graph
[Figure: example bar graph. Series: East, West, North. Horizontal axis: 1st Qtr through 4th Qtr. Vertical axis: 0 to 90.]

Effective Elements for Bar Graphs
- Dependent variable is on the vertical axis.
- Independent variable is on the horizontal axis.
- Length of vertical axis should be 2/3 to 3/4 the length of the horizontal axis.
- Positive values increase to the right (horizontal axis) or up (vertical axis).
- Negative values increase to the left (horizontal axis) or down (vertical axis).
- Highest value on either scale is larger than the highest data value.
- Bars are clearly differentiated from one another.
- Bars are of the same width.
Nichol & Pexman

Line Graph
- Used to present a change in one or more dependent variables as a function of an independent variable
- Particularly useful in demonstrating a trend or an interaction
- Must have at least 3 data points
Nichol & Pexman

Line Graph
[Figure: line graph]

Effective Elements for Line Graphs
- Dependent variable is on the vertical axis.
- Independent variable is on the horizontal axis.
- Length of vertical axis should be 2/3 to 3/4 that of the horizontal axis.
- Positive values increase to the right (horizontal) or up (vertical).
- Negative values increase to the left (horizontal) or down (vertical).
- No more than four lines or curves per graph.
- Lines within the graph can be clearly differentiated from one another.
Nichol & Pexman

Scatter Plot
- Present values of single events as a function of two variables scaled along the vertical and horizontal axes.
- Purpose is usually to explore the relationship between two variables.
- A linear relationship (high correlation) may be indicated if the data points are clustered along the diagonal within the area of the plot.
Nichol & Pexman

Scatter Plot
[Figure: scatter plot]

Effective Elements for Plots
- Length of vertical axis should be 2/3 to 3/4 the length of the horizontal axis.
- Zero points are indicated on the axes.
- Data points are represented by symbols that are approximately the same size as lowercase letters used in text on the figure.
Nichol & Pexman

Measures of Central Tendency
- Mean (arithmetic average)
- Median (middle score in the distribution, better known as the 50th percentile)
- Mode (most frequently occurring score)

Comparing Measures of Central Tendency
- The mean is more stable over time because each score in the distribution enters into the computation. It is, however, more affected by extreme scores.
- The median is less affected by extreme scores.
- The mode is easiest to determine but is the least stable.
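The comparison of mean, median, and mode above can be checked with a small example. The sketch below uses Python's standard `statistics` module; the scores are hypothetical values invented for illustration and do not come from the class materials.

```python
from statistics import mean, median, mode

# Hypothetical test scores, invented for illustration only.
scores = [4, 5, 5, 6, 7, 8, 9]

# The same scores with one extreme high value (an outlier) appended.
scores_with_outlier = scores + [40]


def central_tendency(values):
    """Return the mean, median, and mode of a list of scores."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
    }


before = central_tendency(scores)
after = central_tendency(scores_with_outlier)

print("without outlier:", before)
print("with outlier:   ", after)
```

In this made-up data set the single extreme score moves the mean by several points, shifts the median by only half a point, and leaves the mode unchanged, which is the pattern the slide describes.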
Extreme Scores
Extreme scores, or “outliers,” are individual low or high values in a group (or distribution) of scores that greatly affect the value of the mean.

Measures of Variability
- Range (R): The difference between the highest and lowest scores in a distribution.
- Standard Deviation (SD): The estimate of variability that accompanies the mean in describing a distribution.

Comparing Measures of Variability
- Standard deviation is more reliable than range.
- Standard deviation is used in calculation of other statistics such as standard scores and error scores.

Measures of Relationship
- Paired Samples t-test compares the means of two variables. It computes the difference between the two variables for each case, and tests to see if the average difference is significantly different from zero.
- t-test for Independent Samples compares the mean scores of two groups on a given variable.

Measures of Relationship
- One-Way ANOVA* Used to test for differences among two or more independent groups.
* Analysis of Variance

Measures of Relationship
- Pearson’s Chi Square: A general test for the existence of a relationship between two or more nominal level variables.
- Coefficient of Correlation (r): Expresses the degree of relationship between two sets of scores.

Statistical Significance
- p > .05 means that differences could have occurred by chance more than 5 times in 100 samples. (NOT significant)
- p < .05 means that differences could have occurred by chance less than 5 times in 100 samples. (significant)
- p < .01 means that differences could have occurred by chance less than 1 time in 100 samples. (more significant)

Error
- Type I – You conclude that a relationship exists between variables when in reality there is none.
- Type II – You conclude that a relationship does not exist between variables when in reality there is one.
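The paired samples t-test described above (take the difference for each case, then test whether the average difference departs from zero) can be sketched in a few lines. This is a minimal illustration using only Python's standard library; the function name `paired_t` and the pre/post scores are hypothetical, and the sketch stops at the t statistic and degrees of freedom rather than computing a p value.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical pre- and post-test scores for the same five people,
# invented for illustration only.
pre = [10, 12, 9, 14, 11]
post = [12, 15, 10, 15, 14]


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired scores.

    Assumes at least two cases and that the differences are not all
    identical (otherwise the standard error would be zero).
    """
    if len(first) != len(second):
        raise ValueError("paired scores must have the same length")
    # Difference between the two variables for each case.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD over sqrt(n)).
    standard_error = stdev(diffs) / sqrt(n)
    # How many standard errors the average difference lies from zero.
    t_statistic = mean(diffs) / standard_error
    return t_statistic, n - 1


t_statistic, df = paired_t(pre, post)
print(f"t = {t_statistic:.2f}, df = {df}")
```

The resulting statistic would then be compared with a critical value from a t table (or passed to statistical software) to decide whether the difference is significant at the chosen p level.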
