Emotions during learning: The first steps toward an affect-sensitive intelligent tutoring system

Scotty D. Craig, Sidney K. D'Mello, Barry Gholson, Amy Witherspoon, Jeremiah Sullins, & Arthur C. Graesser
Institute for Intelligent Systems, University of Memphis, United States
[email protected]

Abstract. In an attempt to discover links between learning and emotions, this study adopted an emote-aloud procedure in which participants were recorded as they verbalized their affective states while interacting with an intelligent tutoring system, AutoTutor. Participants' facial expressions were coded using the Facial Action Coding System and analyzed using association rule mining techniques. The resulting rules are discussed along with implications for the larger project of improving the AutoTutor system into a nonintrusive affect-sensitive intelligent tutoring system.

While the 20th century was rich in learning theory, these theories mostly ignored the importance of the link between a person's emotions, or affective states, and learning (Meyer & Turner, 2002). However, toward the end of the century, emotions started to receive more attention. Some seminal contributions to the literature include the Facial Action Coding System of Ekman and Friesen (1978), Stein and Levine's (1991) theory of goals and emotion, the cognitive theory of emotion proposed by Ortony, Clore, and Collins (1988), and Russell's (2003) recent theory of emotion.

Ekman and Friesen (1978) highlighted the expressive aspects of emotions with their Facial Action Coding System. This system specifies how "basic emotions" can be identified by coding specific facial behaviors based on the muscles that produce them. Each movement in the face is referred to as an action unit; there are approximately 58 action units. These prototypical facial patterns were used to identify the emotions of happiness, sadness, surprise, disgust, anger, and fear (Ekman & Friesen, 1978; Elfenbein & Ambady, 2002).
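As a sketch of how such prototype-based coding can work, the following toy example matches observed action units against a few commonly cited prototype AU combinations. This is illustrative only and greatly simplified; the prototype table below is our own shorthand, and the full FACS specification is far more detailed.

```python
# Illustrative prototype table: emotion -> set of action units (AUs).
# These AU combinations (e.g., happiness = AU6 + AU12) are commonly cited
# examples, not the complete Ekman & Friesen (1978) specification.
PROTOTYPES = {
    "happiness": {6, 12},       # cheek raiser + lip corner puller
    "surprise": {1, 2, 5, 26},  # inner/outer brow raise, upper lid raise, jaw drop
    "sadness": {1, 4, 15},      # inner brow raise, brow lowerer, lip corner depressor
}

def match_emotions(observed_aus):
    """Return emotions whose full prototype AU set appears in the observation."""
    return sorted(e for e, aus in PROTOTYPES.items() if aus <= set(observed_aus))

print(match_emotions([1, 2, 5, 26]))  # a surprise-like configuration
print(match_emotions([6, 12, 25]))    # a happiness-like configuration
```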
The coding system was tested primarily on static pictures rather than on expressions changing over time. Unfortunately, for researchers interested in the role of emotions in learning, it is doubtful whether these six emotions are frequent and functionally significant in the learning process (Kapoor, Mota, & Picard, 2001). More generally, some researchers have challenged the adequacy of basing a complete theory of emotions on these "basic" emotions (Rozin & Cohen, 2003).

The claim has been made that cognition, motivation, and emotions are the three components of learning (Snow, Corno, & Jackson, 1996). Emotion has traditionally been viewed as a source of motivational energy (Harter, 1981; Miserandino, 1996; Stipek, 1998), but is often not regarded as an independent factor in learning or motivation (Ford, 1992; Meyer & Turner, 2002). However, in the last decade, the link between emotions and learning has received more attention (e.g., Craig, Graesser, Sullins, & Gholson, in press; Kort, Reilly, & Picard, 2001; Meyer & Turner, 2002; Picard, 1997).

Kort, Reilly, and Picard (2001) have recently proposed a four-quadrant model that explicitly links learning and affective states. The learning process is divided by two axes, vertical and horizontal, labeled learning and affect respectively. The learning axis ranges from "constructive learning" at the top, where new information is being integrated into schemas, to "un-learning" at the bottom, where misconceptions are hopefully identified and removed from schemas. The affect axis ranges from positive affect on the right to negative affect on the left. According to this model, learners move around the circle from a state of ease, to encountering misconceptions, to discarding misconceptions, to new understanding, and then back into a state of ease. For a more detailed description of this model see Kort et al. (2001) or Craig et al. (in press).
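A minimal sketch of the quadrant structure just described, assuming a learner's state can be placed numerically on the two axes (coordinates in [-1, 1]); the quadrant labels are our shorthand, not Kort et al.'s own terminology:

```python
def quadrant(affect, learning):
    """Map an (affect, learning) coordinate, each in [-1, 1], to a quadrant.

    affect   > 0: positive affect;       affect   < 0: negative affect
    learning > 0: constructive learning; learning < 0: un-learning
    """
    if learning >= 0:
        return ("I: positive affect, constructive learning" if affect >= 0
                else "II: negative affect, constructive learning")
    return ("III: negative affect, un-learning" if affect < 0
            else "IV: positive affect, un-learning")

# A learner at ease, integrating new material, sits in quadrant I;
# one wrestling with a fresh misconception drifts toward II/III.
print(quadrant(0.7, 0.5))
print(quadrant(-0.4, -0.6))
```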
Much of the current research into the link between emotions (or affective states) and learning has come from the area of user modeling. Work in this field has focused on identifying users' emotions as they interact with computer systems such as tutoring systems (Fan, Sarrafzadeh, Overmyer, Hosseini, Biglari-Abhari, & Bigdeli, 2003) or educational games (Conati, 2002). However, many of these systems assess only intensity or valence (Ball & Breeze, 2000), or a single affective state (Hudlicka & McNeese, 2002). For example, Guhe, Gray, Schoelles, and Ji (2004) have recently created a system in which a user is monitored in an attempt to detect confusion during interaction with an intelligent tutoring system. The limitation of this approach is that one affective state is not sufficient to encompass the whole gamut of learning (Conati, 2002). Craig et al. (in press), for example, recently presented evidence that boredom, confusion, and flow were all correlated with learning gains. Another problem with the single-state detection approach is that a person's reaction to the presented material can change depending on their goals, preferences, expectations, and knowledge state (Conati, 2002).

The current research reports the first step in a larger project to integrate affect sensing into an intelligent tutoring system, AutoTutor (Graesser, K. Wiemer-Hastings, P. Wiemer-Hastings, Harter, Kreuz, & TRG, 1999; Graesser, Person, Harter, & TRG, 2001). The purpose of this study is twofold. First, we want to identify affective states that occur frequently during learning. The affective states of interest in this study were anger, boredom, confusion, contempt, curiosity, disgust, eureka, and frustration. Second, we have adopted the Facial Action Coding System (Ekman & Friesen, 1978) as a method to identify these affective states.
Association rule mining techniques were employed to locate sets of action units that frequently occur together and to extract association rules that could conditionally influence the presence of action units on the face. Association rule mining (Agarwal, Imielinski, & Swami, 1993) is a widely used technique for finding interesting associations among sets of data items. Association rules are probabilistic in nature and take the form "Antecedent → Consequent [support, confidence]". The antecedent is an item or set of items whose occurrence influences the occurrence of the consequent (also an item or set of items). The support of a rule measures its usefulness and is the probability that a record in the data set contains both the antecedent and the consequent. The confidence measures its certainty and is the conditional probability that a record containing the antecedent also contains the consequent; it provides a measure of the influence the antecedent has on the presence of the consequent. The a priori algorithm (Agarwal & Srikant, 1994) was first used to mine sets of frequently co-occurring action units (called frequent itemsets). Association rules were then obtained from these frequent itemsets.

Methods

Participants. The participants in this study were 7 undergraduates, selected from the Department of Psychology subject pool at the University of Memphis. Two participants were discarded from the current analysis due to insignificant contributions.

Materials. Participants interacted with a computer program called AutoTutor on topics in computer literacy. AutoTutor asked questions about computer hardware. The questions were deep-level (such as why, how, and what-if) and required about a paragraph of information to answer correctly. AutoTutor holds a mixed-initiative dialog to assist students in answering each question. The conversation typically takes 30-100 conversational turns.
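The support/confidence definitions and frequent-itemset mining described earlier can be sketched as follows. This is a brute-force illustration over a handful of invented action-unit records, not the ARMiner implementation used in the study; the record values and thresholds are made up.

```python
# Minimal association-rule mining sketch: support = P(antecedent and
# consequent), confidence = P(consequent | antecedent). Invented data.
from itertools import combinations

# Each record: the set of action units observed together in one clip.
records = [{1, 2}, {1, 2, 14}, {1, 2}, {4, 7}, {4, 7, 12}, {4}]

def support(itemset):
    """Fraction of records containing every item in the itemset."""
    return sum(itemset <= r for r in records) / len(records)

def frequent_itemsets(min_support, max_size=3):
    """Brute-force search for itemsets meeting the support threshold."""
    items = sorted(set().union(*records))
    found = []
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            s = support(set(combo))
            if s >= min_support:
                found.append((set(combo), s))
    return found

def rules(min_support, min_confidence):
    """Derive 'antecedent -> consequent [support, confidence]' rules."""
    out = []
    for itemset, s in frequent_itemsets(min_support):
        for ante_size in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), ante_size):
                conf = s / support(set(ante))
                if conf >= min_confidence:
                    out.append((set(ante), itemset - set(ante), s, conf))
    return out

for ante, cons, s, c in rules(min_support=0.3, min_confidence=0.9):
    print(f"{sorted(ante)} -> {sorted(cons)} [support={s:.2f}, confidence={c:.2f}]")
```

On this toy data the surviving rules happen to mirror the shape of the study's results (1 → 2, 2 → 1, 7 → 4), since the records were invented with those co-occurrences in mind.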
In addition to giving students short feedback on their contributions, AutoTutor gives hints, asserts missing information, and corrects student misconceptions (Graesser et al., 1999).

Emote-aloud procedure. During the interaction with AutoTutor, the participants were video recorded. They were asked to perform an emote-aloud procedure in which they stated aloud when they experienced an affective state. Participants were given a list of 8 affective states along with definitions: anger, boredom, confusion, contempt, curiosity, disgust, eureka, and frustration. The affective states were functionally defined for the participants. Anger was defined as a strong feeling of displeasure and usually of antagonism. Boredom was defined as the state of being weary and restless through lack of interest. Confusion was defined as failing to differentiate something from an often similar or related other. Contempt was defined as the act of despising; a lack of respect or reverence for something. Curiosity was defined as an active desire to learn or to know. Disgust was defined as marked aversion aroused by something highly distasteful. Eureka was defined as a feeling used to express triumph on a discovery. Frustration was defined as making vain or ineffectual all efforts however vigorous; a deep chronic sense or state of insecurity and dissatisfaction arising from unresolved problems or unfulfilled needs. All definitions were taken from Merriam-Webster Online (2003).

Procedure

As participants came into the lab, they completed an informed consent form followed by a 24-item pretest on computer hardware. The participants then interacted with AutoTutor, during which they engaged in the emote-aloud activity. Afterwards, a 24-item posttest on computer hardware was given, followed by debriefing.

Data Treatment

Scoring procedure. Each participant's video was divided into 10-second clips that ended in an emote-aloud utterance.
Since the expression of emotions tends to be very fast, lasting only about 3 seconds (Ekman, 1992), two raters independently scored the three seconds before each utterance using the Facial Action Coding System (Ekman & Friesen, 1978). For those three seconds, the raters watched the clips and recorded action units and the time of observation. Clips with multiple emotions were not included in the analyses, leaving a total of 201 valid clips. The two raters demonstrated high reliability, with an overall Kappa score of .80.

The raters' coding of action units was used to create an action unit database. Each record in the database consisted of one or more action units from the same clip, grouped by the time stamp at which they were observed. This allowed multiple action unit records for the same emote-aloud utterance and increased the database to 437 records.

Data cleaning. Due to a lack of observations for Contempt (n = 8), Curiosity (n = 3), and Disgust (n = 5), these emotions were not included in the current analysis. Although Anger and Eureka had a non-trivial number of records (26 and 58, respectively), they were excluded from this analysis because the associated records were not evenly distributed among the participants. We found that 88% of Anger's records came from one participant, which would lead to biased results that would not accurately reflect the data from the other subjects. This data cleaning procedure left reliable data only for boredom, confusion, and frustration.

Data selection. After eliminating the subjects and emotions mentioned above, data sets with a maximum number of records were selected for each emotion. Each data set consisted of records randomly selected (without replacement) from each of the subjects, while ensuring that each subject contributed an equal number of records to the data set.
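The balanced selection step described above can be sketched as follows. The per-subject record counts below are hypothetical; the paper does not report them.

```python
# Balanced data-set construction: each data set draws records without
# replacement from every subject, each subject contributing the same
# number of records (the smallest subject's count). Data is invented.
import random

def balanced_sample(records_by_subject, rng):
    """Build one data set with k records per subject, k = min subject count."""
    k = min(len(recs) for recs in records_by_subject.values())
    sample = []
    for recs in records_by_subject.values():
        sample.extend(rng.sample(recs, k))  # without replacement within a data set
    return sample

# Hypothetical per-subject records for one emotion.
by_subject = {
    "s1": list(range(10)),
    "s2": list(range(10, 16)),
    "s3": list(range(16, 25)),
}
rng = random.Random(0)
# One hundred such data sets per emotion, as in the study.
data_sets = [balanced_sample(by_subject, rng) for _ in range(100)]
```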
One hundred such data sets were selected for each of the three emotions (Boredom, Confusion, Frustration), yielding 300 data sets.

Results

The association rule mining techniques applied to the itemsets revealed significant relations between our three affective states and several unique action units. Table 1 highlights significant results obtained from the itemset mining. An itemset here refers to an action unit or a set of action units that frequently occur over the entire database. The coverage for each itemset is the frequency of its presence in all 100 randomly selected data sets; coverage of 100% indicates that the itemset was observed in all the data sets for an emotion. The average support is an average of the support values over all data sets (within an emotion) in which the itemset appeared. Action units marked with an asterisk (*) are considered to be of secondary importance.

Emotion       Action Units   Avg Support (%)   Coverage (%)   Description
Frustration   1              20.2              100            Inner brow raise
              2              18.8              100            Outer brow raise
              1, 2           18.8              100            Inner and outer brow raised together
              14*            11.6              66             Dimpler
Confusion     4              23.9              95             Brow lowerer
              7              20.2              78             Lid tightener
              4, 7           18.6              73             Brow lowered with tightened lids
              12*            18.5              95             Lip corner puller
Boredom       43             23.9              40             Eye closure

Table 1. Frequent action units

The association rule mining techniques also revealed significant association rules between our three affective states and several unique action units. Table 2 presents the strongest association rules mined from the frequent itemsets listed above. The strength of these rules derives from having a confidence of 100% for all rules presented. A correlation analysis on the rules presented below ensures that the antecedent and consequent of each rule are positively correlated.
Emotion       Rule     Avg Support (%)   Coverage (%)   Description
Frustration   1 → 2    18.8              100            Presence of an inner brow raise will trigger an outer brow raise
              2 → 1    19.2              53             Presence of an outer brow raise will trigger an inner brow raise
Confusion     7 → 4    18.6              52             Tightened lids will lead to a lowered brow
Boredom       None*

Table 2. Association rules

Discussion

When association rule mining techniques were applied to our itemsets, they revealed the action units primarily associated with frustration, confusion, and boredom (see Tables 1 and 2 above). Action units 1, 2, and 14 were primarily associated with frustration, with a strong association found between action units 1 and 2 occurring together. Confusion displayed associations with action units 4, 7, and 12, and also showed a unique association rule between action units 7 and 4. Boredom showed an association with action unit 43, eye closure. While boredom did not display any association rules between action units, it did show several weaker trends between eye blinks and various mouth movements, such as mouth opening and closing and jaw drop. However, more research would be required to establish the reliability of these associations.

This study is part of a larger research program that investigates the role of affect during the learning experience. Our research program has three main objectives. The first is to identify the emotions (or affective states) that are most important during learning; identifying these emotions involves current theories of emotion, research on learning, and our own empirical research. Our second objective is to find methods to reliably identify these emotions during learning. We are currently exploring non-intrusive ways of identifying emotions as learners interact with the AutoTutor program.
Some of the technologies we are exploring include a video camera that can identify facial features, a posture detector that monitors the learner's position during learning, and features of the dialog exhibited while learners are interacting with AutoTutor. Finally, we will program AutoTutor to respond appropriately to the emotions exhibited by learners.

References

Agarwal, R., Imielinski, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. Proceedings of the ACM-SIGMOD International Conference on Management of Data (pp. 207-216). Washington, DC.

Agarwal, R., & Srikant, R. (1994). Fast algorithms for mining association rules in large databases. Proceedings of the International Conference on Very Large Data Bases (pp. 487-499). Santiago, Chile.

Ball, G., & Breeze, J. (2000). Emotion and personality in a conversational agent. In J. Cassell, J. Sullivan, S. Prevost, & E. Churchill (Eds.), Embodied conversational agents (pp. 189-219). Boston: The MIT Press.

Craig, S. D., Graesser, A. C., Sullins, J., & Gholson, B. (in press). Affect and learning: An exploratory look into the role of affect in learning. Journal of Educational Media.

Ekman, P. (1992). Are there basic emotions? Psychological Review, 99, 550-553.

Ekman, P., & Friesen, W. V. (1978). The facial action coding system: A technique for the measurement of facial movement. Palo Alto: Consulting Psychologists Press.

Elfenbein, H. A., & Ambady, N. (2002). On the universality and cultural specificity of emotion recognition: A meta-analysis. Psychological Bulletin, 128, 203-235.

Fan, C., Sarrafzadeh, A., Overmyer, S., Hosseini, H. G., Biglari-Abhari, M., & Bigdeli, A. (2003). A fuzzy approach to facial expression analysis in intelligent tutoring systems. In A. Méndez-Vilas & J. A. Mesa González (Eds.), Advances in technology-based education: Towards a knowledge-based society, Vol. 3 (pp. 1933-1937). Badajoz, Spain: Junta De Extremadura.

Ford, M. E. (1992). Motivating humans: Goals, emotions, and personal agency beliefs. London: Sage.

Graesser, A. C., Person, N., Harter, D., & the Tutoring Research Group (2001). Teaching tactics and dialog in AutoTutor. International Journal of Artificial Intelligence in Education, 12, 257-279.

Graesser, A., Wiemer-Hastings, K., Wiemer-Hastings, P., Kreuz, R., & the Tutoring Research Group (1999). AutoTutor: A simulation of a human tutor. Journal of Cognitive Systems Research, 1, 35-51.

Guhe, M., Gray, W. D., Schoelles, M. J., & Ji, Q. (2004, July). Towards an affective cognitive architecture. Poster session presented at the Cognitive Science Conference, Chicago, IL.

Harter, S. (1981). A new self-report scale of intrinsic versus extrinsic orientation in the classroom: Motivational and informational components. Developmental Psychology, 17, 300-312.

Hudlicka, E., & McNeese, D. (2002). Assessment of user affective and belief states for interface adaptation: Application to an Air Force pilot task. User Modeling and User-Adapted Interaction, 12(1), 1-47.

Kapoor, A., Mota, S., & Picard, R. (2001). Toward a learning companion that recognizes affect. In Proceedings of Emotional and Intelligent II: The tangled knot of social cognition, AAAI Fall Symposium. AAAI Press.

Kort, B., Reilly, R., & Picard, R. (2001). An affective model of interplay between emotions and learning: Reengineering educational pedagogy -- building a learning companion. In T. Okamoto, R. Hartley, Kinshuk, & J. P. Klus (Eds.), Proceedings of the IEEE International Conference on Advanced Learning Technologies: Issues, achievements and challenges (pp. 43-48). Madison, WI: IEEE Computer Society.

Merriam-Webster, Incorporated. (n.d.). Merriam-Webster Online. Retrieved July 7, 2003, from http://www.m-w.com

Meyer, D. K., & Turner, J. C. (2002). Discovering emotion in classroom motivation research. Educational Psychologist, 37(2), 107-114.

Miserandino, M. (1996). Children who do well in school: Individual differences in perceived competence and autonomy in above-average children. Journal of Educational Psychology, 88, 203-214.

Ortony, A., Clore, G. L., & Collins, A. (1988). The cognitive structure of emotions. New York: Cambridge University Press.

Picard, R. W. (1997). Affective computing. Cambridge, MA: MIT Press.

Rozin, P., & Cohen, A. B. (2003). High frequency of facial expressions corresponding to confusion, concentration, and worry in an analysis of naturally occurring facial expressions of Americans. Emotion, 3, 68-75.

Russell, J. A. (2003). Core affect and the psychological construction of emotion. Psychological Review, 110, 145-172.

Stein, N. L., & Levine, L. J. (1991). Making sense out of emotion. In W. Kessen, A. Ortony, & F. Craik (Eds.), Memories, thoughts, and emotions: Essays in honor of George Mandler (pp. 295-322). Hillsdale, NJ: Erlbaum.

Stipek, D. (1998). Motivation to learn: From theory to practice (3rd ed.). Boston: Allyn and Bacon.

Acknowledgements

This research was supported by the National Science Foundation (REC 0106965 and ITR 0325428) and the DoD Multidisciplinary University Research Initiative (MURI) administered by ONR under grant N00014-00-1-0600. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of ONR or NSF. We would like to thank the Tutoring Research Group for the use of the AutoTutor program in this research. The Tutoring Research Group (TRG) is an interdisciplinary research team of approximately 35 researchers from psychology, computer science, physics, and education (visit http://www.autotutor.org). We would also like to thank Laurentiu Cristofor for the use of the ARMiner client-server data mining application used for the association rule mining.