User Study Evaluation
Human-Computer Interaction

Hypothesis
• A statement of prediction
• Describes what you expect will happen in your study
• Alternative hypothesis (H1): your prediction, i.e. a claim of difference in the population
  • e.g. Participants will commit more errors with interface A than with interface B
• Null hypothesis (H0): no difference or no effect
  • e.g. Participants will commit the same number of errors with interface A and interface B, or participants will commit more errors with interface B than with interface A

Hypothesis: one- or two-tailed?
• Alternative hypothesis
  • One-tailed: Participants will commit more errors with interface A than with interface B (i.e. directional)
  • Two-tailed: There will be a significant difference in the number of errors participants commit with interface A and with interface B, but I don’t know if there will be more or fewer (i.e. non-directional)
• Can’t prove the alternative hypothesis, can only reject the null hypothesis
• If the data support your prediction, reject the null hypothesis
• Not rejecting the null hypothesis ≠ accepting it

Metrics
• What you are measuring
• Some types of metrics
  • Objective: facts of an event
    • Time to complete task (continuous)
    • Errors (discrete, i.e. distinct and separate, can be counted)
  • Subjective: a person’s opinion
    • Satisfaction

Metrics
• Types of metrics
  • Objective: facts of an event
  • Subjective: a person’s opinion
  • *Both* are important
• How to measure
  • Instrumentation: record data within your system
  • Questionnaires / Surveys
    • Scales
    • Free-response
• Let’s discuss appropriateness of each
• Let’s look at a very popular survey (SUS, the System Usability Scale)

Analysis
• Most of what we do involves:
  • Normally distributed results
  • Independent testing
  • Homogeneous population
• Recall, we are testing the hypothesis by trying to reject the null hypothesis

Analysis
• 3 main steps for analysis
  • Data Preparation: Cleaning and organizing the data for analysis
    • Checking the data for accuracy
    • Transforming data (e.g. reverse coding survey data)
  • Descriptive Statistics: Describing the data
    • Provide simple summaries about the sample and the measures
    • Simply describing what is, what the data shows
  • Inferential Statistics: Testing Hypotheses and Models
    • Try to infer from the sample data what the population thinks
    • Make judgments of the probability that an observed difference between groups is a dependable one or one that might have happened by chance

Data preparation
• Checking data for accuracy
  • Are the responses legible/readable?
  • Are all important questions answered?
  • Are the responses complete?
  • Is all relevant contextual information included (e.g., date, time, place, researcher)?

Data preparation
• Data transformations
  • Missing values
    • Depending on the program, you may need to designate specific values to represent missing values, e.g. -99
  • Scale totals
    • Add or average across individual items
  • Item reversals
    • Likert scale: sometimes the ratings for items need to be reversed
    • 1 (strongly disagree) to 5 (strongly agree)
    • “I generally feel good about myself.”
    • “Sometimes I feel like I'm not worth much as a person.”
    • What does a 5 mean in each case?

Descriptive statistics
• Simple summaries of sample and measures, i.e. data
• Describing what is or what the data shows
• Central tendency: estimate of the “center” of a distribution of values
  • Mean: average across a set of values
    • 15 + 15 + 18 + 25 + 33 = 106
    • µ = 106/5 = 21.2
  • Median: score found in the middle of a set of values
    • 15, 15, 18, 25, 33: median = 18
  • Mode: most frequently occurring value
    • 15, 15, 18, 25, 33: mode = 15
• Describe the data with a number and a graph

Inferential statistics
• Try to reach conclusions that go beyond the immediate data, i.e. draw inferences
  • e.g. want to compare the average performance of 2 groups to see if there’s a difference
• t-test: statistical test used to determine whether two observed means are statistically different

t-test
• What does it mean to say that the averages for two groups are statistically different?

t-test
• Variability is the noise that may make it harder to see the group difference
• Variance: measure of variability around the mean
• Standard deviation: square root of the variance

t-test
• t is a ratio: the difference between the group means divided by the standard error of that difference
• Rule of thumb: |t| > 1.96 is significant at alpha = .05 (two-tailed) for large samples; small samples need a larger critical value

t-test
• Once computed, look up the t-value to see whether the ratio is large enough to say that the difference between the groups is not likely to have been a chance finding.
• To test the significance, you need to set a risk level (called the alpha level). The accepted standard is an alpha level of .05.
  • 5 times out of 100 you would find a statistically significant difference between the means even if there was none (i.e., by "chance").
• Degrees of freedom (df). For an independent-samples t-test, the df = sum of the persons in both groups minus 2.
• Given the alpha level and the df, look up the critical value in a t table to determine whether the computed t-value is large enough to be significant.
• If yes, conclude that the means for the 2 groups are different (even given the variability) and reject the null hypothesis.
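The three analysis steps above (data preparation, descriptive statistics, inferential statistics) can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the helper names `reverse_code` and `independent_t`, the survey ratings, and the error counts for interfaces A and B are all made up for the example, and the t-test assumes equal variances in the two groups (pooled variance). The critical value 2.306 is the two-tailed value for alpha = .05 and df = 8 from a standard t table.

```python
import math
import statistics


def reverse_code(rating, low=1, high=5):
    """Reverse a Likert rating so that 5 means the same thing on every item."""
    return low + high - rating


def independent_t(group_a, group_b):
    """Independent-samples t statistic (pooled variance) and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2  # persons in both groups minus 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# 1. Data preparation: drop missing values (-99), reverse a negatively worded item.
MISSING = -99
raw_ratings = [2, 1, MISSING, 4, 5]  # hypothetical answers to the "not worth much" item
clean_ratings = [reverse_code(r) for r in raw_ratings if r != MISSING]

# 2. Descriptive statistics: the example values from the slide.
scores = [15, 15, 18, 25, 33]
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

# 3. Inferential statistics: hypothetical error counts, 5 participants per interface.
errors_a = [5, 7, 6, 9, 8]
errors_b = [3, 4, 2, 5, 6]
t_value, df = independent_t(errors_a, errors_b)

CRITICAL_T = 2.306  # two-tailed, alpha = .05, df = 8 (from a t table)
reject_null = abs(t_value) > CRITICAL_T

print(clean_ratings, mean_score, median_score, mode_score)
print(t_value, df, reject_null)
```

With these made-up numbers the group means are 7 and 4 errors, and the t-value exceeds the critical value, so the null hypothesis of no difference would be rejected. With real study data, a statistics package would normally be used to get the exact p-value.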
α and p values
• α value: probability of making a Type I error (rejecting the null hypothesis when it is really true)
• p value: probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true. The lower the p value, the stronger the evidence against the null hypothesis (the higher the statistical significance)

Relationship between α and p values
• Once the alpha level has been set, a statistic (like t) is computed.
• Each statistic has an associated probability value called a p-value, or the likelihood of a statistic at least that extreme occurring due to chance, given the sampling distribution.
• Alpha sets the standard for how extreme the data must be before we can reject the null hypothesis. The p-value indicates how extreme the data are.
• Compare the p-value with alpha to determine whether the observed data are statistically significantly different from the null hypothesis (reject the null hypothesis when p ≤ α)

Kinds of t-tests
Formula is slightly different for each:
• Single-sample:
  • tests whether a sample mean is significantly different from a pre-existing value (e.g. norms)
• Paired-samples:
  • tests the relationship between 2 linked samples, e.g. means obtained in 2 conditions by a single group of participants
• Independent-samples:
  • tests the relationship between 2 independent populations
• Which test fits your situation?

t and alpha values

Independent samples t-test
• Example: social presence questionnaire
  • “I perceived I was in the presence of a patient in the room with me.”
• http://www.vassarstats.net/tu.html

Correlations
• Correlation: relationship between two variables
• Pearson’s product-moment correlation coefficient: r
• http://bdaugherty.tripod.com/KeySkills/lineGraphs.html
• http://www.socscistatistics.com/tests/pearson/Default2.aspx
• http://en.wikipedia.org/wiki/Correlation_and_dependence
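As a rough illustration of Pearson’s r, the sketch below computes the coefficient directly from its definition (the covariance of the two variables divided by the product of their spreads). The function name `pearson_r` and the task-time and error data are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data: task time in seconds and number of errors for 5 participants.
task_time = [10, 20, 30, 40, 50]
errors = [1, 3, 2, 5, 4]

r = pearson_r(task_time, errors)
print(round(r, 3))
```

r ranges from -1 (perfect negative relationship) through 0 (no linear relationship) to +1 (perfect positive relationship). For these made-up values r comes out at about 0.8, a strong positive relationship between task time and errors.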