Analyzing experiment data
Enter your data into a statistical program such as SPSS or SAS, which are available to UT Austin faculty, staff, and students for a modest fee through Information Technology Services (ITS). Inspect the data for errors that
can occur during data entry or when respondents provide inconsistent answers. For large databases, check at
least five percent of entered data for accuracy. If you find any errors, check the remainder of the data and
correct the errors.
You may need to recode some answers to questions that have an "other" response option. For example, one
person may answer the question, "Do you consider yourself African American, Caucasian, Asian, Hispanic, or
other?" by circling "other" and writing, "Chinese." To maintain consistency, code the answer as "Asian"
rather than "other."
Calculate means for outcome measures. Determine if outcome and other variables are normally distributed, a requirement for many statistical tests. If a variable is not normally distributed, consult with a statistician to determine if you need to transform the variable.
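Most statistical packages include formal normality checks. As one example, and assuming you work in Python rather than SPSS or SAS, the Shapiro-Wilk test in scipy gives a rough indication (the scores below are invented):

    import numpy as np
    from scipy import stats

    # Hypothetical outcome scores for one group of students.
    scores = np.array([72, 85, 90, 66, 78, 88, 95, 70, 81, 84])

    stat, p = stats.shapiro(scores)        # Shapiro-Wilk test of normality
    print(f"W = {stat:.3f}, p = {p:.3f}")  # p < .05 suggests a departure from normality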
While comparing means will give you a rough sense of differences on outcome measures, you must use
statistical tests to demonstrate that these differences are unlikely to have occurred by chance. Many statistical programs report a p value, the probability of obtaining group differences at least as large as those observed if chance alone were operating. For example, a p value of .05 indicates a 5% probability that differences of that size would arise by chance rather than because of the intervention. Prior to analyzing your data, set the p value that you will use as a criterion for statistical significance. A p value of .05 is most often used as a cutoff.
Single-group experiment
If you are comparing pre- and posttest scores for a single group, use a t-test for dependent means (also called
a paired samples t-test, repeated measures t-test, or t-test for dependent samples) to determine if there is a
statistically significant change. The easiest way to accomplish this is to enter the data into a statistical
program like Excel or SPSS and to use a pull-down menu to run the test. If you are using Excel, click Data
Analysis on the Tools menu to perform statistical analyses. If Data Analysis is not a listed option, you will
need to install the Analysis ToolPak by clicking Add-Ins on the Tools menu.
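If you prefer a scripting language to Excel or SPSS, the same test takes a few lines. The sketch below uses Python's scipy library with hypothetical pre- and posttest scores:

    import numpy as np
    from scipy import stats

    # Hypothetical pre- and posttest scores for the same ten students.
    pre  = np.array([61, 70, 55, 68, 72, 64, 59, 75, 66, 71])
    post = np.array([67, 74, 60, 70, 78, 66, 65, 80, 70, 77])

    t, p = stats.ttest_rel(pre, post)   # t-test for dependent (paired) means
    print(f"t = {t:.2f}, p = {p:.4f}")  # significant at the .05 level if p < .05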
To compare scores at three or more points in time, one option is a repeated measures analysis of variance (ANOVA), also called an ANOVA for correlated samples. A significant F value for an ANOVA tells you that,
overall, scores differ at different times, but it does not tell you which scores are significantly different from
each other. To answer that question, you must perform post-hoc comparisons after you obtain a significant F,
using tests such as Tukey's and Scheffe's, which set more stringent significance levels as you make more
comparisons. However, if you make specific predictions about differences between means, you can test these
predictions with planned comparisons, which enable you to set significance levels at p < .05. Planned
comparisons are performed in place of an overall ANOVA.
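For readers working in Python rather than a point-and-click package, the statsmodels library includes a repeated measures ANOVA. The sketch below is illustrative only; the scores are invented and each student is measured at three time points:

    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    # Hypothetical long-format data: eight students, each measured at three times.
    data = pd.DataFrame({
        "student": list(range(1, 9)) * 3,
        "time":    ["pre"] * 8 + ["mid"] * 8 + ["post"] * 8,
        "score":   [60, 62, 58, 65, 70, 55, 68, 63,
                    64, 66, 61, 70, 74, 58, 71, 66,
                    69, 70, 64, 75, 80, 61, 76, 70],
    })

    # Repeated measures ANOVA (ANOVA for correlated samples).
    result = AnovaRM(data, depvar="score", subject="student", within=["time"]).fit()
    print(result)  # a significant F suggests that scores differ across time points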
Linear regression enables you to predict the level of an outcome variable using one or more continuous
variables. For example, you might institute an instructional innovation: students provide and receive on-line
feedback from fellow students every week on essay organization and clarity. Before starting the innovation,
you have students complete measures of openness to feedback and communication skill, and you use these
scores to predict their degree of writing improvement at the end of the semester.
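As an illustration of such a regression, and assuming hypothetical scores and variable names, the sketch below fits the model in Python with statsmodels:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical scores: two pre-semester predictors and end-of-semester improvement.
    data = pd.DataFrame({
        "openness":      [3.2, 4.1, 2.8, 3.9, 4.5, 3.0, 3.7, 4.2],
        "communication": [2.9, 3.8, 3.1, 4.0, 4.4, 2.7, 3.5, 4.1],
        "improvement":   [1.1, 2.0, 0.8, 1.9, 2.4, 0.9, 1.6, 2.2],
    })

    # Predict writing improvement from the two continuous predictors.
    model = smf.ols("improvement ~ openness + communication", data=data).fit()
    print(model.summary())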
If participants are assessed multiple times, hierarchical linear models (HLM) may be a better choice than a repeated measures ANOVA. HLM is particularly well suited to analyzing data from repeated measurements or data
in a hierarchical structure. For example, in much educational research, students are grouped within
classrooms, which are grouped within schools. HLM takes into account that students from a classroom or
school have more in common than individuals who are randomly sampled from a larger population. HLM
requires specialized software, available to UT faculty and staff at a discount.
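As a rough sketch of the idea, rather than the specialized HLM package itself, a comparable random-intercept (multilevel) model can be fit with the open-source statsmodels library in Python; the data and variable names below are hypothetical:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical data: one posttest score per student, students nested in classrooms.
    data = pd.DataFrame({
        "classroom": [1] * 5 + [2] * 5 + [3] * 5,
        "pretest":   [60, 65, 58, 70, 63, 72, 68, 75, 66, 71, 55, 61, 59, 64, 57],
        "posttest":  [66, 70, 63, 77, 69, 80, 74, 83, 72, 79, 60, 67, 64, 70, 62],
    })

    # Random-intercept model: students in the same classroom share a classroom effect.
    model = smf.mixedlm("posttest ~ pretest", data, groups=data["classroom"]).fit()
    print(model.summary())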
You might also compute correlations to determine whether there is a statistically significant positive or
negative relationship between two continuous variables. For example, you could determine if writing
improvement is significantly related to course satisfaction. Be aware, however, that computing correlations
between several variables increases the chances of finding a relationship due to chance alone, and that
finding significant correlations between variables does not tell you what causes those relationships.
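A correlation of this kind can be computed in any statistical package; the sketch below uses Python's scipy with invented scores for writing improvement and course satisfaction:

    from scipy import stats

    # Hypothetical scores for the same eight students.
    improvement  = [1.1, 2.0, 0.8, 1.9, 2.4, 0.9, 1.6, 2.2]
    satisfaction = [3.5, 4.6, 3.1, 4.2, 4.8, 3.3, 4.0, 4.5]

    r, p = stats.pearsonr(improvement, satisfaction)
    print(f"r = {r:.2f}, p = {p:.4f}")  # a significant r still says nothing about causation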
If you need additional help from someone knowledgeable about statistics, contact the research consulting
staff at UT Austin's Division of Statistics & Scientific Computation.
Field and controlled experiment
To test for significant differences between two separate groups of students, the most commonly used option
is the t-test for independent groups (also called an independent samples t-test, or the t-test for independent
means). Administer a version of your outcome measure before your intervention to make sure there are no statistically significant pre-existing differences. If there are pre-test differences and you randomly
assigned participants to the groups, you can control for these differences using an Analysis of Covariance
(ANCOVA) procedure. You cannot use an ANCOVA, however, to control for pre-existing group differences in
a field experiment, so consult with a statistician in this case.
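For those working in Python rather than SPSS, the sketch below shows both steps with invented scores: an independent samples t-test with scipy, then an ANCOVA fit as a regression of posttest scores on group membership plus the pre-test covariate using statsmodels:

    import pandas as pd
    from scipy import stats
    import statsmodels.formula.api as smf

    # Hypothetical posttest scores for an intervention group and a control group.
    intervention = [78, 84, 90, 75, 88, 82, 85, 79]
    control      = [72, 76, 81, 70, 74, 77, 73, 75]

    t, p = stats.ttest_ind(intervention, control)  # t-test for independent means
    print(f"t = {t:.2f}, p = {p:.4f}")

    # ANCOVA sketch: adjust the group comparison for pre-test scores.
    data = pd.DataFrame({
        "group":    ["intervention"] * 8 + ["control"] * 8,
        "pretest":  [70, 75, 80, 68, 78, 73, 76, 71,
                     69, 72, 77, 66, 70, 74, 70, 72],
        "posttest": intervention + control,
    })
    ancova = smf.ols("posttest ~ group + pretest", data=data).fit()
    print(ancova.summary())  # the group coefficient is adjusted for pre-test differences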
Make sure your data meet statistical assumptions for t-tests and other statistical procedures. For t-tests, the
distribution of the outcome variable (for example, test scores) should be roughly normal or bell-shaped.
When you are comparing two sets of scores, the spread of scores (variance) should be roughly equal for both
sets. In addition, your outcome variable should be on a continuous scale (for example, age) and cannot be
categorical (for example, dyslexic versus not dyslexic individuals).
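One common check on the equal-variance assumption is Levene's test; here is a short scipy sketch with hypothetical scores for two groups:

    from scipy import stats

    # Hypothetical test scores for two groups.
    group_a = [78, 84, 90, 75, 88, 82, 85, 79]
    group_b = [72, 76, 81, 70, 74, 77, 73, 75]

    stat, p = stats.levene(group_a, group_b)  # tests equality of variances
    print(f"W = {stat:.2f}, p = {p:.3f}")     # p < .05 suggests unequal variances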
When you assign individuals to groups based on a cutoff score on a placement measure, such as a math
achievement test, you can use a regression-discontinuity design. Participants who score above the cutoff are
assigned to one group, while participants below the cutoff are assigned to a second group. The effect of the
treatment is estimated by using the placement score to predict scores on an outcome measure, such as a
second test grade, and plotting regression lines separately for each group. If the treatment provided to one
group is effective, you should see a "jump" or discontinuity of the regression lines at the cutoff point.
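One common way to estimate that jump is to regress the outcome on a treatment indicator, the placement score centered at the cutoff, and their interaction; the coefficient on the indicator estimates the discontinuity. The sketch below, in Python with statsmodels and invented data, is purely illustrative:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical placement scores and later outcomes; the cutoff is 50.
    data = pd.DataFrame({
        "placement": [32, 38, 41, 45, 48, 52, 55, 60, 66, 71],
        "outcome":   [40, 44, 47, 50, 53, 68, 70, 74, 79, 83],
    })
    cutoff = 50
    data["treated"]  = (data["placement"] >= cutoff).astype(int)
    data["centered"] = data["placement"] - cutoff

    # Separate regression lines on each side of the cutoff; the 'treated'
    # coefficient estimates the jump (discontinuity) at the cutoff point.
    model = smf.ols("outcome ~ treated + centered + treated:centered", data=data).fit()
    print(model.params)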
To compare three or more groups, use an independent samples analysis of variance (one-way ANOVA). Again, you will need to
conduct post-hoc comparisons after obtaining a significant F value to determine differences between specific
groups.
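Here is a short Python sketch of a one-way ANOVA followed by Tukey post-hoc comparisons, using scipy and statsmodels with hypothetical scores for three groups:

    import numpy as np
    from scipy import stats
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    # Hypothetical outcome scores for three separate groups of students.
    g1 = [70, 74, 68, 72, 75, 71]
    g2 = [78, 82, 80, 77, 84, 79]
    g3 = [73, 76, 74, 72, 78, 75]

    F, p = stats.f_oneway(g1, g2, g3)   # independent samples (one-way) ANOVA
    print(f"F = {F:.2f}, p = {p:.4f}")

    # Post-hoc comparisons after a significant F value.
    scores = np.array(g1 + g2 + g3)
    groups = np.array(["g1"] * 6 + ["g2"] * 6 + ["g3"] * 6)
    print(pairwise_tukeyhsd(scores, groups))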
Additional information
Aron, A., & Aron, E. N. (2002). Statistics for Psychology (3rd ed.). Upper Saddle River, NJ: Prentice Hall.
Lane, D. M. (2003). Tests of linear combinations of means, independent groups. Retrieved June 21, 2006 from the Hyperstat Online textbook: http://davidmlane.com/hyperstat/confidence_intervals.html
Linear regression. (n.d.). Retrieved June 21, 2006 from the Yale University Department of Statistics Index of Courses 1997-98 Web site: http://www.stat.yale.edu/Courses/1997-98/101/linreg.htm
Lowry, R. P. (2005). Concepts and Applications of Inferential Statistics. Retrieved June 21, 2006 from: http://faculty.vassar.edu/lowry/webtext.html
Osborne, J. W. (2000). Advantages of hierarchical linear modeling. Practical Assessment, Research & Evaluation, 7(1). Retrieved June 21, 2006 from: http://PAREonline.net/getvn.asp?v=7&n=1
T-test. (n.d.). Retrieved June 21, 2006 from the Georgetown University Department of Psychology, Research Methods and Statistics Resources Web site: http://www1.georgetown.edu/departments/psychology/resources/researchmethods/statistics/8318.html
Trochim, W. M. K. (2002). The Regression-Discontinuity Design. Retrieved June 21, 2006 from: http://www.socialresearchmethods.net/kb/quasird.htm