Speech, Natural Language, & Affect
in Tutorial Dialogue Systems
Prof. Diane J. Litman
Computer Science Department,
Intelligent Systems Program, &
Learning Research and Development Center
http://www.cs.pitt.edu/~litman
A few words about me…

Currently
• Professor in CS and ISP
• Research Scientist at LRDC
• ITSPOKE research group (3 PhD students, 1 CS ugrad, 2 postdocs, 1 programmer)
• AI Research (speech and natural language, intelligent tutoring)
    - Discourse and dialogue
    - Prosody, spoken dialogue systems
    - Speech and language technology for education (take my spring seminar!)
    - Reinforcement learning, user simulation
    - Affective computing
    - AI and education
    - Cognitive science
Previously
• Member Technical Staff, AT&T Labs Research, NJ
• Assistant Professor, CS at Columbia University, NY
• AI Research (speech and NLP, knowledge representation and reasoning, plan recognition)

Speech-based Computer Tutors
• What are they?
• Example
    Tutor: Well, if an object has non zero constant velocity, is it moving or staying still?
    Student: Moving
    Tutor: Yep. If it’s moving, then its position is changing. So then what will happen to the packet’s horizontal displacement from the point of its release?
    Student: It will change
• Intersection of two fields:
    - Intelligent Tutoring Systems (ITS)
    - Spoken Dialogue Systems (SDS)

Intelligent Tutoring Systems (ITS)
• Education
    - Classroom instruction [most frequent form]
    - Human (one-on-one) tutoring [most effective form]
• Computer tutors – Intelligent Tutoring Systems
    - Not as good as human tutors
    - Ways to address the performance gap
    - Language technologies
        Text-based dialogue
        Talking heads
        Speech-based dialogue: react to how something is said in addition to what is said
        Affective computing

Adding speech to ITS
• Spoken Dialogue Systems (SDS)
    - Systems that interact with users via speech
• Advantages
    - Naturalness
    - Efficiency
    - Eyes- and hands-free
• Domains
    - Information access [Raux et al., 2005; Rudnicky et al., 1999; Zue et al., 2000]
    - Tutoring [Graesser et al., 2001; Litman and Silliman, 2004; Pon-Barry et al., 2006]
    - Assistants [Allen et al., 2001; Rayner et al., 2005; Acomb et al., 2007]

Challenges in ITS
• What does it mean to teach a subject?
• What to teach?
• Designing instruction
• Delivering instruction
• Understanding the human learning process

Challenges in SDS
• Automated speech recognition (ASR)
    - Sphinx, Microsoft Speech, Dragon NaturallySpeaking
• Natural language understanding (NLU)
• Dialogue Management (DM)
    - How to keep the conversation going? Best strategy?
    - How to detect errors in communication?
    - How to recover from errors?
• Spoken language generation

Outline
• ITSPOKE
• Main research tools & projects
    - Comparing systems
    - Modeling learning
    - Interactions between phenomena
• Other projects

How to do research in speech-based computer tutors
• ITSPOKE (Intelligent Tutoring SPOKEn Dialogue System) [Litman and Silliman, 2004]
    - Speech-enabled version of the Why2-Atlas computer tutor [VanLehn, Jordan, Rose et al., 2002]
    - Domain: Qualitative physics
• Sample ITSPOKE problem
    Suppose a man is in a free-falling elevator and is holding his keys motionless right in front of his face. He then lets go. What will be the position of the keys relative to the man's face as time passes? Explain.

• Back-end is Why2-Atlas system [VanLehn, Jordan, Rose et al. 2002]
• Sphinx2 speech recognition and Cepstral text-to-speech
Human-Computer Excerpt
Tutor26: Let me ask my question in a more general way. What happens
to the velocity of a body when there is no force acting on it?
Student27: dammit (ASR: it is)
Tutor28 : Could you please repeat that?
Student29 : same (ASR: i same)
Tutor30 : Right. And what happens to the HORIZONTAL velocity of a
body when there is no HORIZONTAL force acting on it?
Student31: zero (ASR: the zero)
Tutor32 : Well... OK, let's consider how forces in the vertical direction
affect the motion of a body in a horizontal direction in a different
scenario <…omitted…>
Student33: oh gravity you already said this (ASR: beats gravity you're
exert less)
Tutor34: Fine. Are there any other forces acting on the apple as it falls?
Student35: no why are you doing this again (ASR: no y and to it yes)
Tutor36: Correct. The only force acting on the apple as it falls to the
earth is the earth's gravitational force. In which direction does
gravitational force act?
Student37: downward you computer (ASR: downward you computer)
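
The gap between what the student said and what the recognizer heard in the excerpt above can be quantified with word error rate (WER), a standard measure of speech recognition performance. The following is a minimal sketch, not ITSPOKE's own scoring code; the function name `word_error_rate` is an assumption introduced here, and the utterance pairs are copied from the excerpt.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Assumes the reference (what the student said) is not empty.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# (student utterance, ASR hypothesis) pairs taken from the excerpt
pairs = [
    ("dammit", "it is"),
    ("same", "i same"),
    ("downward you computer", "downward you computer"),
]
for said, heard in pairs:
    print(f"{said!r} -> {heard!r}: WER = {word_error_rate(said, heard):.2f}")
```

WER can exceed 1.0 when the hypothesis contains more words than the reference, as in the first pair.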
How ITSPOKE/WHY works
• Simplified conversation structure
    - Question-answer format
    - Tutoring information authored in a hierarchical structure - KCDs [VanLehn, Jordan, Rosé et al., 2002]
[Figure: Problem, Essay, Dialogue with ITSPOKE (Q1, Q2, Q3)]
[Figure: Essay submission & analysis; ITSPOKE behavior (Q1, Q2, Q3, Q4, Q5, remediation subdialogue)]
[Figure: Sample KCD (Knowledge Construction Dialogue)]

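
One way to picture the hierarchical question-answer structure with remediation subdialogues is as a small tree that is walked in order, descending into a subdialogue only after an incorrect answer. This is a minimal sketch under stated assumptions: the `Question` class, the `walk` function, and the particular placement of Q4 and Q5 under Q2 are hypothetical and are not taken from the Why2-Atlas or ITSPOKE implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Question:
    """A tutor question with an optional remediation subdialogue (hypothetical)."""
    label: str
    remediation: List["Question"] = field(default_factory=list)


def walk(questions: List[Question], is_correct: Callable[[str], bool]) -> List[str]:
    """Return the labels of the questions asked, in order.

    An incorrect answer triggers the question's remediation subdialogue
    before the dialogue moves on to the next question at the same level.
    """
    asked = []
    for question in questions:
        asked.append(question.label)
        if not is_correct(question.label) and question.remediation:
            asked.extend(walk(question.remediation, is_correct))
    return asked


# Assumed layout: Q4 and Q5 form the remediation subdialogue of Q2.
dialogue = [
    Question("Q1"),
    Question("Q2", remediation=[Question("Q4"), Question("Q5")]),
    Question("Q3"),
]

wrong_answers = {"Q2"}
print(walk(dialogue, lambda label: label not in wrong_answers))
```

A student who answers everything correctly sees only the top-level questions; a wrong answer to Q2 lengthens the dialogue by the subdialogue.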
Comparing systems
• Metrics
• Subjective metrics
    - Questionnaire at the end – agreement with statements like:
        “It was easy to learn from the tutor”
        “I enjoyed working with the tutor”
        “It was easy to lose track of where I was in the conversation”
    - Problems
        Unreliable
        Need for standardization (psychometrics)

Comparing systems (2)
• Objective metrics
    - Learning (gain)
    - Time spent with the computer tutor
    - Correctness of student answers
    - Number of help/hint requests
    - Speech recognition performance
[Figure: PreTest, PostTest, Learning]

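
The slides do not say how learning gain is computed from the pretest and posttest. Two commonly used definitions are sketched below as an assumption, not as the definition used in the ITSPOKE studies: the raw gain and the normalized gain, with scores expressed as fractions between 0 and 1. The function names are introduced here for illustration.

```python
def raw_gain(pretest, posttest):
    """Raw learning gain: posttest score minus pretest score."""
    return posttest - pretest


def normalized_gain(pretest, posttest):
    """Share of the possible improvement that was actually achieved.

    Scores are fractions in [0, 1]; a perfect pretest leaves no room to
    improve, so the normalized gain is undefined in that case.
    """
    if pretest >= 1.0:
        raise ValueError("normalized gain is undefined for a perfect pretest")
    return (posttest - pretest) / (1.0 - pretest)


# Made-up scores for one student
print(raw_gain(0.4, 0.7), normalized_gain(0.4, 0.7))
```

The normalized form makes students with different pretest scores more comparable, since a student who starts high has less room to improve.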
Comparing systems - projects
• Speech vs. typed tutoring [Litman, Rose, Forbes-Riley et al., 2006]
    - Human-speech vs. Human-typed
        Better posttest scores
        Shorter instruction time
    - Computer-speech (ITSPOKE) vs. Computer-typed (WHY)
        No improvement in learning
        No degradation due to ASR
• Prerecorded vs. Synthesized [Forbes-Riley et al., 2006]
    - No degradation due to synthesized voice
    - High ASR errors -> prefer synthesized more
• Navigation Map ITSPOKE vs. regular ITSPOKE [Rotaru and Litman, 2007; Rotaru, PhD thesis]
• Affective ITSPOKE vs. regular ITSPOKE [Forbes-Riley, Litman et al., 2008a, 2008b, in progress]

The Navigation Map
• Outline of the conversation
    - Theoretical concept: discourse structure
• Hypotheses
    - Easier for users to follow the conversation
    - Preferred / learn more

Discourse segment hierarchy
Intention/purpose structure
Discourse segments
Solution walkthrough
TUTOR1: To analyze this problem we will first describe the
motion of the person and his keys while he is holding
them. Then we will look at the motion of the person and
his keys after he lets go of them.
Let's begin by looking at the motion of the man
and his keys while he is holding them. How does his
velocity compare to that of his keys?
USER1: it’s the same
TUTOR2: Yes. So while he is holding his keys they both
have the same velocity. Now let's see what happens
when he releases them. To do so we will first find the
individual forces exerted on the man and his keys, then
both net forces, both accelerations, and finally we will
describe and compare the velocities.
So what are the forces exerted on the man after
he releases his keys? Please, specify their directions
(e.g., vertically up).
USER2: vertically down gravity
TUTOR3: Good. Because there is just one force, gravity's, it
is trivial to determine the NET force (i.e., the vector sum
of all forces). So what is the direction of the NET force
on the person?
..............
Two time frames: before release, after release
    Before release
        Man’s velocity ? keys’ velocity
    After release
        Recipe: Forces → Net force → Acceleration → Velocity
        Man: Forces/acceleration
            Forces on the man
            Net force on the man
            ………….
            ………….
            ………….

Experiment design
• Within-subjects design
    - 1 problem with the NM; 1 without the NM (noNM)
• Rate tutor after each problem
    - 16 questions, 1 (Strongly Disagree) – 5 (Strongly Agree) scale
• Two conditions (to account for order and problem)
    - F (First): 1st problem NM; 2nd problem noNM
    - S (Second): 1st problem noNM; 2nd problem NM
[Figure: Experiment procedure for the F and S conditions. Labels: Read, Pretest, Problem 1, Questionnaire, Problem 2, Questionnaire, Posttest, NM Survey, Interview, Differences due to NM]

Experiment design (2)
• ITSPOKE dialogue history was disabled
• Compare Audio-Only versus Audio+Visual (NM)
[Figure: NM, noNM]

Results – subjective metrics
• NM trend/significant effects on system perception during the dialogue:

Rating scale: 1 - Strongly Disagree ……. 5 - Strongly Agree

Dimension | Question | p (NMPres) | Average rating (NM vs. noNM)
Structure: identify tutoring structure | ... the tutor had a clear and structured agenda behind its explanations | 0.008 | 4.4 > 3.9
Structure: follow the tutoring structure | ... it was easy to lose track of where I was in the interaction with the tutor | 0.012 | 2.7 < 3.3
Integration: forward looking integration | ... it was easy to figure out where the tutor's instruction was leading me | 0.017 | 4.1 > 3.6
Integration: backward looking integration | ... when the tutor asked me a question I knew why it was asking me that question | 0.054 | 4.0 > 3.5
Correct answers: know the correct answer | ... whenever I answered incorrectly, it was easy to know the correct answer after the tutor corrected me | 0.085 | 4.1 > 3.7
Correct answers: know if correct | ... I knew whether my answer to the tutor's question was correct or incorrect | 0.358 | 3.6 > 3.4
Level of concentration: level of concentration | ... a high level of concentration is required to follow the tutor | 0.004 | 3.7 < 4.3

Modeling learning
• Problem: What contributes to/causes learning?
• Correlations with learning
    - Events that significantly correlate with learning
    - Does not imply causality but it is a requirement for it
    - What events to measure?
[Figure: PreTest, … correctness, … time spent, PostTest, Learning]

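
A correlation analysis of this kind can be sketched as follows: compute the Pearson correlation between a per-student event measure and the posttest score, optionally controlling for the pretest with a first-order partial correlation. The slides do not specify which correlation was used, so the choice of a partial correlation is an assumption, and the data below are made up purely for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation between x and y after controlling for z (first-order)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up per-student values, one entry per student
time_on_task = [30, 45, 50, 60, 75]        # minutes
posttest = [0.50, 0.60, 0.70, 0.80, 0.85]  # fraction correct
pretest = [0.40, 0.50, 0.40, 0.60, 0.50]   # fraction correct

print(pearson(time_on_task, posttest))
print(partial_corr(time_on_task, posttest, pretest))
```

Controlling for the pretest asks whether the event is related to the posttest beyond what the student's prior knowledge already explains. A significance test on the coefficient would still be needed before calling the correlation significant.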
What events?
• Time on task (+), number of student words (+) [Litman, Rose, Forbes-Riley et al., 2006] [Forbes-Riley, Rotaru and Litman, 2008]
• Student emotions [Forbes-Riley, Rotaru and Litman, 2008]
    - Neutral on certainty (-)
    - Neutral on frustration (-)
• Type of turns – on human-human [Forbes-Riley et al., 2005]
    - Student: introduce new concept (+)
    - Tutor: control dialogue (-)
• Discourse structure inspired parameters [Rotaru and Litman, 2006]
• Computational implications?

Intuition 1 – Conditioning
• It is more important to be correct at specific “places in the dialogue”.
• Phenomena related to performance:
    - not uniformly important across the dialogue
    - have more weight at specific places in the dialogue.
• Discourse structure can be used to define “places in the dialogue”
[Figure: Student learned? A dialogue shown as a sequence of turns, each marked Correct or Incorrect]

Intuition 1 - Results
• Correctness
• Transition – correctness parameters
    - PopUp–Correct, PopUp–Incorrect
• Interpretation: Capture successful learning events or failed learning opportunities
    - Generalizes across corpora
    - ITSPOKE modification: engage in an additional remediation dialogue
[Figure: Q1, Q2 (Q2.1, Q2.2), Q3]

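
Transition–correctness parameters such as PopUp–Incorrect can be pictured as counts over consecutive turns, where each turn is labeled with its depth in the discourse structure and with the correctness of the student answer. The sketch below uses a simplified three-way transition scheme (Push, PopUp, Advance) based only on depth; that scheme, the tuple layout, and the example dialogue are assumptions made here and may differ from the annotation used in the cited work.

```python
from collections import Counter


def transition(prev_depth, depth):
    """Simplified transition label derived from discourse-structure depth."""
    if depth > prev_depth:
        return "Push"      # entering a subdialogue
    if depth < prev_depth:
        return "PopUp"     # returning from a subdialogue
    return "Advance"       # next question at the same level


def transition_correctness_counts(turns):
    """Count transition-correctness pairs over consecutive turns.

    Each turn is a (question label, depth, answered correctly) tuple.
    The first turn has no incoming transition and is not counted.
    """
    counts = Counter()
    for (_, prev_depth, _), (_, depth, correct) in zip(turns, turns[1:]):
        outcome = "Correct" if correct else "Incorrect"
        counts[f"{transition(prev_depth, depth)}-{outcome}"] += 1
    return counts


# Made-up dialogue: Q2 is answered incorrectly, which opens Q2.1 and Q2.2
turns = [
    ("Q1", 1, True),
    ("Q2", 1, False),
    ("Q2.1", 2, True),
    ("Q2.2", 2, True),
    ("Q3", 1, False),
]
counts = transition_correctness_counts(turns)
print(counts)
```

Per-student counts like these are the kind of parameter that could then be correlated with learning.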
Intuition 2 – Discrimination
[Figure: two dialogues with different discourse structure, one from a student that learned less and one from a student that learned more]

Intuition 2 - Results
• Transition – Transition parameters
    - Push–Push
• Interpretation: system uncovers potential major knowledge gaps
[Figure: Q1, Q2 (Q2.1 (Q2.1.1, Q2.1.2), Q2.2), Q3]

Other events
• Psychology inspired
    - Models of reading comprehension – Landscape Model [Ward and Litman, 2005]
    - Alignment model – lexical and prosodic convergence [Ward and Litman, 2007a, 2007b]
• NLP inspired
    - Cohesion – lexical co-occurrence [Ward and Litman, 2006]

From Correlations to Causality
• Correlation does not imply causality
• But can inform modifications
    - E.g. more instruction after PopUp-Incorrect events
    - E.g. different instruction depending on student uncertainty
[Figure: Q1, Q2 (Q2.1, Q2.2), Q3; Incorrect → more tutoring]

Interactions between phenomena
• Things interact in a dialogue
    - Student correctness → tutor reply
    - Student emotion → tutor reply
• Why look for interactions?
    - Capture human tutor behavior
    - Extract new patterns
    - Allow us to formulate hypotheses
• How to find interactions?
    - Dependency tests: χ2 (Chi-Square)
    - Example with 2 windows

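
A chi-square dependency test of this kind can be sketched on a contingency table that cross-tabulates one phenomenon (for example, student correctness) with the one that follows it (for example, the type of tutor reply). The function below computes the Pearson chi-square statistic in plain Python; the counts and the row and column meanings are made up for illustration, and 3.841 is the standard critical value for one degree of freedom at the 0.05 level.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table of counts.

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Critical value for a 2x2 table (1 degree of freedom) at p = 0.05
CRITICAL_DF1_P05 = 3.841

# Made-up counts. Rows: student turn correct / incorrect.
# Columns: two hypothetical tutor reply types.
table = [
    [20, 10],
    [10, 20],
]
statistic = chi_square(table)
print(statistic, statistic > CRITICAL_DF1_P05)
```

A statistic above the critical value suggests the two phenomena are not independent; it says nothing about the direction of the dependency, which has to be read from the observed and expected counts.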
Projects
• Certainty → human tutor reply [Forbes-Riley and Litman, 2005]
    - Student uncertainty associated with
    - Student certainty associated with
    - Increase in Bottom-up replies
    - Decrease in Expansions
    - Increase in Restatements
• Speech recognition errors [Rotaru and Litman, 2005, 2006a, 2006b]
    - Speech recognition errors → Next student state
        Increase in frustration
    - Student State → Speech recognition errors
        Incorrect, Uncertain, Frustrated → more speech errors
    - Discourse Structure → Speech recognition errors

Other projects
• Affective computing (Kate Forbes-Riley’s postdoc)
    - Emotion prediction
        What are the important emotions in tutoring
        How to predict them
    - Emotion adaptation/handling
        Model human tutor behavior
        Formulate hypotheses from empirical analysis
• Reinforcement Learning and User Modeling
    - System learns best way to react from rewards (Min Chi’s PhD)
    - Needs a lot of data -> user simulations (Hua Ai’s PhD)

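
The idea of learning how to react from rewards, with a user simulation supplying the data, can be illustrated with a toy example. Everything below is hypothetical: the two student states, the two tutor actions, the reward values, and the deterministic simulated student are invented for illustration and do not describe the cited thesis work. The update rule is the standard Q-learning update, applied here by sweeping over every simulated transition instead of sampling dialogues.

```python
STATES = ("correct", "incorrect")   # correctness of the last student answer
ACTIONS = ("move_on", "remediate")  # what the tutor does next

# Hypothetical simulated student: (state, action) -> (reward, next state)
SIMULATED_STUDENT = {
    ("correct", "move_on"): (1.0, "correct"),
    ("correct", "remediate"): (0.5, "correct"),    # wasted time
    ("incorrect", "move_on"): (0.0, "incorrect"),  # gap stays open
    ("incorrect", "remediate"): (0.5, "correct"),
}


def learn_policy(sweeps=200, alpha=0.5, gamma=0.9):
    """Learn a state -> action policy with Q-learning updates."""
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(sweeps):
        for (state, action), (reward, next_state) in SIMULATED_STUDENT.items():
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


policy = learn_policy()
print(policy)
```

With a real system the transitions would be noisy and far more numerous, which is why a simulated user is attractive as a source of training dialogues.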
Resources
• Recommended classes
    - Introduction to Natural Language Processing
    - Foundations of Artificial Intelligence
    - Machine Learning
    - Knowledge Representation
• Seminar classes
    - Advanced Topics in Artificial Intelligence (Speech and Language Technology for Educational Applications (this spring!), Affective Spoken Dialogue Systems, Spoken Dialogue Systems, etc.)
• Other resources
    - ITSPOKE Group Meetings
    - NLP @ Pitt
    - DoD @ CMU
    - YRRSDS
    - ISP Forum
    - PSLC

Further information
• Visit my homepage and talk with me
    - http://www.cs.pitt.edu
• Take my seminar (CS 3710), projects course (CS 2002)
• Talk with members of the ITSPOKE group
    - http://www.cs.pitt.edu/~litman/itspoke.html