Speech, Natural Language, & Affect in Tutorial Dialogue Systems

Prof. Diane J. Litman
Computer Science Department, Intelligent Systems Program, & Learning Research and Development Center
http://www.cs.pitt.edu/~litman

A few words about me…

Currently
• Professor in CS and ISP
• Research Scientist at LRDC
• ITSPOKE research group (3 PhD students, 1 CS ugrad, 2 postdocs, 1 programmer)
• AI Research (speech and natural language, intelligent tutoring)
• Discourse and dialogue
• Prosody, spoken dialogue systems
• Speech and language technology for education (take my spring seminar!)
• Reinforcement learning, user simulation
• Affective computing
• AI and education
• Cognitive science

Previously
• Member Technical Staff, AT&T Labs Research, NJ
• Assistant Professor, CS at Columbia University, NY
• AI Research (speech and NLP, knowledge representation and reasoning, plan recognition)

Speech-based Computer Tutors

What are they? Example:
Tutor: Well, if an object has non zero constant velocity, is it moving or staying still?
Student: Moving
Tutor: Yep. If it’s moving, then its position is changing. So then what will happen to the packet’s horizontal displacement from the point of its release?
Student: It will change

Intersection of two fields:
• Intelligent Tutoring Systems (ITS)
• Spoken Dialogue Systems (SDS)

Intelligent Tutoring Systems (ITS)

Education
• Classroom instruction [most frequent form]
• Human (one-on-one) tutoring [most effective form]
• Computer tutors – Intelligent Tutoring Systems
  • Not as good as human tutors

Ways to address the performance gap
• Language technologies
  • Text-based dialogue
  • Talking heads
  • Speech-based dialogue: react to how something is said in addition to what is said
• Affective computing

Adding speech to ITS: Spoken Dialogue Systems (SDS)

• Systems that interact with users via speech
• Advantages
  • Naturalness
  • Efficiency
  • Eyes- and hands-free
• Domains
  • Information access [Raux et al., 2005; Rudnicky et al., 1999; Zue et al., 2000]
  • Tutoring [Graesser et al., 2001; Litman and Silliman, 2004; Pon-Barry et al., 2006]
  • Assistants [Allen et al., 2001; Rayner et al., 2005; Acomb et al., 2007]

Challenges in ITS

• What does it mean to teach a subject?
• What to teach?
• Designing instruction
• Delivering instruction
• Understanding the human learning process

Challenges in SDS

• Automated speech recognition (ASR)
  • Sphinx, Microsoft Speech, Dragon NaturallySpeaking
• Natural language understanding (NLU)
• Dialogue Management (DM)
  • How to keep the conversation going? Best strategy?
  • How to detect errors in communication?
  • How to recover from errors?
• Spoken language generation

Outline
• ITSPOKE
• Main research tools & projects
• Comparing systems
• Modeling learning
• Interactions between phenomena
• Other projects

How to do research in speech-based computer tutors

ITSPOKE (Intelligent Tutoring SPOKEn Dialogue System) [Litman and Silliman, 2004]
• Speech-enabled version of the Why2-Atlas computer tutor [VanLehn, Jordan, Rosé et al., 2002]
• Domain: Qualitative physics

Sample ITSPOKE problem
Suppose a man is in a free-falling elevator and is holding his keys motionless right in front of his face. He then lets go. What will be the position of the keys relative to the man's face as time passes? Explain.

• Back-end is Why2-Atlas system [VanLehn, Jordan, Rosé et al., 2002]
• Sphinx2 speech recognition and Cepstral text-to-speech

Human-Computer Excerpt

Tutor26: Let me ask my question in a more general way. What happens to the velocity of a body when there is no force acting on it?
Student27: dammit (ASR: it is)
Tutor28: Could you please repeat that?
Student29: same (ASR: i same)
Tutor30: Right. And what happens to the HORIZONTAL velocity of a body when there is no HORIZONTAL force acting on it?
Student31: zero (ASR: the zero)
Tutor32: Well... OK, let's consider how forces in the vertical direction affect the motion of a body in a horizontal direction in a different scenario <…omitted…>
Student33: oh gravity you already said this (ASR: beats gravity you're exert less)
Tutor34: Fine. Are there any other forces acting on the apple as it falls?
Student35: no why are you doing this again (ASR: no y and to it yes)
Tutor36: Correct. The only force acting on the apple as it falls to the earth is the earth's gravitational force. In which direction does gravitational force act?
Student37: downward you computer (ASR: downward you computer)

How ITSPOKE/WHY works

• Simplified conversation structure
• Question-answer format
• Tutoring information authored in a hierarchical structure - KCDs [VanLehn, Jordan, Rosé et al., 2002]
[Diagram: Problem, Essay, Dialogue with ITSPOKE (Q1, Q2, Q3)]

Essay submission & analysis

ITSPOKE behavior
[Diagram: questions Q1 to Q5, with a remediation subdialogue]

Sample KCD (Knowledge Construction Dialogue)

Outline
• ITSPOKE
• Main research tools & projects
• Comparing systems
• Modeling learning
• Interactions between phenomena
• Other projects

Comparing systems

Metrics
• Subjective metrics
  • Questionnaire at the end – agreement with statements like:
    “It was easy to learn from the tutor”
    “I enjoyed working with the tutor”
    “It was easy to lose track of where I was in the conversation”
  • Problems
    • Unreliable
    • Need for standardization (psychometrics)

Comparing systems (2)

• Objective metrics
  • Learning (gain)
  • Time spent with the computer tutor
  • Correctness of student answers
  • Number of help/hint requests
  • Speech recognition performance
[Diagram: PreTest, PostTest, Learning]

Comparing systems - projects

• Speech vs. typed tutoring [Litman, Rosé, Forbes-Riley et al., 2006]
  • Human-speech vs. Human-typed
  • Computer-speech (ITSPOKE) vs. Computer-typed (WHY)
• No improvement in learning
• No degradation due to ASR
• Prerecorded vs. Synthesized [Forbes-Riley et al., 2006]
• Better posttest scores
• Shorter instruction time
• No degradation due to synthesized voice
• High ASR errors -> prefer synthesized more
• Navigation Map ITSPOKE vs. regular ITSPOKE [Rotaru and Litman, 2007; Rotaru, PhD thesis]
• Affective ITSPOKE vs. regular ITSPOKE [Forbes-Riley, Litman et al., 2008a, 2008b, in progress]

The Navigation Map

• Outline of the conversation
• Theoretical concept: discourse structure
• Hypotheses
  • Easier for users to follow the conversation
  • Preferred / learn more

Discourse segment hierarchy
Intention/purpose structure; discourse segments

Solution walkthrough
TUTOR1: To analyze this problem we will first describe the motion of the person and his keys while he is holding them. Then we will look at the motion of the person and his keys after he lets go of them. Let's begin by looking at the motion of the man and his keys while he is holding them. How does his velocity compare to that of his keys?
USER1: it’s the same
TUTOR2: Yes. So while he is holding his keys they both have the same velocity. Now let's see what happens when he releases them. To do so we will first find the individual forces exerted on the man and his keys, then both net forces, both accelerations, and finally we will describe and compare the velocities. So what are the forces exerted on the man after he releases his keys? Please, specify their directions (e.g., vertically up).
USER2: vertically down gravity
TUTOR3: Good. Because there is just one force, gravity's, it is trivial to determine the NET force (i.e., the vector sum of all forces). So what is the direction of the NET force on the person?
..............

Two time frames: before release, after release
  Before release
    Man’s velocity ? keys’ velocity
  After release
    Recipe: Forces, Net force, Acceleration, Velocity
    Man: Forces/acceleration
      Forces on the man
      Net force on the man
      ………….
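The discourse segment hierarchy above is a tree, which is what lets the Navigation Map show the conversation as an indented outline. A minimal sketch of that idea follows. The `Segment` class, the `outline` helper, and the exact nesting are illustrative assumptions based on one reading of the slide, not ITSPOKE's actual data structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """One discourse segment: its purpose plus any nested sub-segments.

    Hypothetical type for illustration; not taken from ITSPOKE.
    """
    purpose: str
    children: List["Segment"] = field(default_factory=list)


def outline(segment: Segment, depth: int = 0) -> List[str]:
    """Return the hierarchy as indented lines via a depth-first walk."""
    lines = ["  " * depth + segment.purpose]
    for child in segment.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Assumed nesting, using the segment labels that appear on the slide.
walkthrough = Segment("Solution walkthrough", [
    Segment("Before release", [
        Segment("Man's velocity ? keys' velocity"),
    ]),
    Segment("After release", [
        Segment("Forces"),
        Segment("Net force"),
        Segment("Acceleration"),
        Segment("Velocity"),
    ]),
])

print("\n".join(outline(walkthrough)))
```

A tree keeps each segment's purpose attached to its position, so the same structure could drive both the on-screen outline and any per-segment measurements.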

Experiment design

• Within-subjects design
  • 1 problem with the NM; 1 without the NM (noNM)
• Rate tutor after each problem
  • 16 questions, 1 (Strongly Disagree) – 5 (Strongly Agree) scale
• Two conditions (to account for order and problem)
  • F (First): 1st problem NM; 2nd problem noNM
  • S (Second): 1st problem noNM; 2nd problem NM

Experiment procedure
[Diagram: Read, Pretest, Problem 1 (F condition: NM; S condition: noNM), Questionnaire, Problem 2 (F condition: noNM; S condition: NM), Questionnaire, Posttest, NM Survey, Interview. Differences due to NM.]

Experiment design (2)

• ITSPOKE dialogue history was disabled
• Compare Audio-Only versus Audio+Visual (NM)
[Figure: NM, noNM]

Results – subjective metrics

NM trend/significant effects on system perception during the dialogue. Each entry gives the question, the p value (NMPres), and the average ratings (conditions: NM, noNM). Rating scale: 1 - Strongly Disagree … 5 - Strongly Agree.

Structure
• Identify tutoring structure: "... the tutor had a clear and structured agenda behind its explanations" (p = 0.008; 4.4 > 3.9)
• Follow the tutoring structure: "... it was easy to lose track of where I was in the interaction with the tutor" (p = 0.012; 2.7 < 3.3)

Integration
• Forward looking integration: "... it was easy to figure out where the tutor's instruction was leading me" (p = 0.017; 4.1 > 3.6)
• Backward looking integration: "... when the tutor asked me a question I knew why it was asking me that question" (p = 0.054; 4.0 > 3.5)

Correct answers
• Know the correct answer: "... whenever I answered incorrectly, it was easy to know the correct answer after the tutor corrected me" (p = 0.085; 4.1 > 3.7)
• Know if correct: "... I knew whether my answer to the tutor's question was correct or incorrect" (p = 0.358; 3.6 > 3.4)

Level of concentration
• Level of concentration: "... a high level of concentration is required to follow the tutor" (p = 0.004; 3.7 < 4.3)

Outline
• ITSPOKE
• Main research tools & projects
• Comparing systems
• Modeling learning
• Interactions between phenomena
• Other projects

Modeling learning

Problem: What contributes to/causes learning?
Correlations with learning
• Events that significantly correlate with learning
• Correlation does not imply causality, but it is a requirement for it
What events to measure? … correctness … time spent
[Diagram: PreTest, PostTest, Learning]

What events?

• Time on task (+), number of student words (+) [Litman, Rosé, Forbes-Riley et al., 2006] [Forbes-Riley, Rotaru and Litman, 2008]
• Student emotions [Forbes-Riley, Rotaru and Litman, 2008]
  • Neutral on certainty (-)
  • Neutral on frustration (-)
• Type of turns – on human-human [Forbes-Riley et al., 2005]
  • Student: introduce new concept (+)
  • Tutor: control dialogue (-)
• Discourse structure inspired parameters [Rotaru and Litman, 2006]
Computational implications?

Intuition 1 – Conditioning

[Diagram: a dialogue with each student answer marked Correct or Incorrect. Student learned?]
It is more important to be correct at specific “places in the dialogue”.
Phenomena related to performance:
• are not uniformly important across the dialogue
• have more weight at specific places in the dialogue
Discourse structure can be used to define “places in the dialogue”.

Intuition 1 - Results

• Transition–correctness parameters
  • PopUp–Correct, PopUp–Incorrect
[Diagram: question hierarchy Q1, Q2 (Q2.1, Q2.2), Q3]
• Interpretation: capture successful learning events or failed learning opportunities
• Generalizes across corpora
• ITSPOKE modification: engage in an additional remediation dialogue

Intuition 2 – Discrimination

[Diagram: the dialogue of a student that learned less and the dialogue of a student that learned more have different discourse structure]

Intuition 2 - Results

• Transition–transition parameters
  • Push–Push
[Diagram: question hierarchy Q1, Q2 (Q2.1 (Q2.1.1, Q2.1.2), Q2.2), Q3]
• Interpretation: system uncovers potential major knowledge gaps

Other events

• Psychology inspired
  • Models of reading comprehension – Landscape Model [Ward and Litman, 2005]
  • Alignment model – lexical and prosodic convergence [Ward and Litman, 2007a, 2007b]
• NLP inspired
  • Cohesion – lexical co-occurrence [Ward and Litman, 2006]

From Correlations to Causality

• Correlation does not imply causality
• But can inform modifications
  • E.g. more instruction after PopUp-Incorrect events
  • E.g. different instruction depending on student uncertainty
[Diagram: question hierarchy Q1, Q2 (Q2.1, Q2.2), Q3. Incorrect -> more tutoring]

Outline
• ITSPOKE
• Main research tools & projects
• Comparing systems
• Modeling learning
• Interactions between phenomena
• Other projects

Interactions between phenomena

• Things interact in a dialogue
  • Student correctness -> tutor reply
  • Student emotion -> tutor reply
• Why look for interactions?
  • Capture human tutor behavior
  • Extract new patterns
  • Allow us to formulate hypotheses
• How to find interactions?
  • Dependency tests: χ² (Chi-Square)
  • Example with 2 windows

Projects

• Certainty -> human tutor reply [Forbes-Riley and Litman, 2005]
  • Student uncertainty associated with / Student certainty associated with:
    Increase in Bottom-up replies
    Decrease in Expansions
    Increase in Restatements
• Speech recognition errors [Rotaru and Litman, 2005, 2006a, 2006b]
  • Speech recognition errors -> Next student state
    Increase in frustration
  • Student State -> Speech recognition errors
    Incorrect, Uncertain, Frustrated -> more speech errors
  • Discourse Structure -> Speech recognition errors

Other projects

• Affective computing (Kate Forbes-Riley’s postdoc)
  • Emotion prediction
    What are the important emotions in tutoring
    How to predict them
  • Emotion adaptation/handling
    Model human tutor behavior
    Formulate hypotheses from empirical analysis
• Reinforcement Learning and User Modeling
  • System learns best way to react from rewards (Min Chi’s PhD)
  • Needs a lot of data -> user simulations (Hua Ai’s PhD)

Resources

• Recommended classes
  • Introduction to Natural Language Processing
  • Foundations of Artificial Intelligence
  • Machine Learning
  • Knowledge Representation
• Seminar classes
  • Advanced Topics in Artificial Intelligence (Speech and Language Technology for Educational Applications (this spring!), Affective Spoken Dialogue Systems, Spoken Dialogue Systems, etc.)
• Other resources
  • ITSPOKE Group Meetings
  • NLP @ Pitt
  • DoD @ CMU
  • YRRSDS
  • ISP Forum
  • PSLC

Further information

• Visit my homepage and talk with me
  http://www.cs.pitt.edu
• Take my seminar (CS 3710), projects course (CS 2002)
• Talk with members of the ITSPOKE group
  http://www.cs.pitt.edu/~litman/itspoke.html
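
Supplementary sketch: the χ² dependency test named on the "Interactions between phenomena" slide checks whether two categorical variables in a dialogue (for example, whether a turn was misrecognized and whether the next student turn is frustrated) occur together more often than independence would predict. The version below is a plain-Python illustration under stated assumptions: the function name `chi_square` and the counts are hypothetical and are not taken from the ITSPOKE corpora or from the cited papers. The only outside fact used is the standard critical value 3.841 for one degree of freedom at the 0.05 level.

```python
from typing import List


def chi_square(table: List[List[float]]) -> float:
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts, invented for illustration only:
# rows = previous student turn misrecognized? (yes, no)
# cols = current student turn frustrated?     (yes, no)
counts = [[30, 70],
          [20, 180]]

# Critical value of the chi-square distribution, df = 1, alpha = 0.05.
CRITICAL_DF1_P05 = 3.841

stat = chi_square(counts)
dependent = stat > CRITICAL_DF1_P05
print(f"chi-square = {stat:.2f}, dependent at 0.05: {dependent}")
```

A 2 x 2 table has (2 - 1) x (2 - 1) = 1 degree of freedom, which is why a single critical value suffices here; larger tables need the critical value for their own degrees of freedom.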