Sequence Package Analysis
A New Data Mining Tool to Speed Up Wiretap Analysis
Amy Neustein, Ph.D.
Linguistic Technology Systems
[email protected]

Sequence Package Analysis: A New Data Mining Tool to Speed Up Wiretap Analysis
Question: Why do we need a new tool?
Answer:
1) In the real world, speakers do not always use “key” words that can be spotted in a dialog.
2) In crime- or terror-related dialog, speakers will deliberately avoid the use of key words that can identify names, places, dates, etc.

Sequence Package Analysis
How Does SPA Work?
1) Add rather than Replace: SPA adds a layer of intelligence to standard dialog systems.
2) Mines audio data: SPA goes beyond a conventional search for words and word strings.
3) Examines a series of related speaking turns that are discretely packaged as a sequence.

WHAT DOES SPA DO?
1) SPA permits the discovery of “key” words (e.g., the name of a location where a crucial meeting among terrorists will take place) that are not in the preset lexicon.
2) SPA permits rapid and efficient data mining of large volumes of audio text by spotting sequence packages in the dialog.

ADVANTAGES OF SPA
• Can be applied to different languages: works by identifying interactional features of dialog (conversational sequence patterns) rather than a preset glossary of words.
• Can perform data mining in real time: permits a human analyst to be brought in immediately when high alarm content is being produced in the dialog.

Dialog Example
Speaker “A”: Come to the intersection near Juniors? (the question mark shows an upward intonation)
0.2-0.5 second pause (speaker then pauses briefly)
Speaker “B”: 1.2 second pause
Speaker “A”: You know the thoroughfare with the big traffic light?
Speaker “B”: Juniors, yeah.

THE SEQUENCE PACKAGE
Speaker “A”: Come to the intersection near Juniors?
0.2-0.5 second pause
Speaker “B”: 1.2 seconds of silence

• A noun referent (“Juniors”) with an upward intonation.
• A brief pause, giving the listener the chance to show recognition or ask for clarification.
• Silence by the listener, which indicates lack of understanding or confusion.

Speaker “A”: You know the thoroughfare with the big traffic light?
Speaker “B”: Juniors, yeah.

• Clarification of the noun referent (“You know the thoroughfare with...”).
• Repeat of the noun referent (“Juniors”), the source of the recognition trouble, followed by a recognitional marker (“Yeah”).

Finding the Sequence Package in the Dialog Example
Look for a concatenation of these utterance components:
• noun referent with upward intonation
• brief pause
• silence
• clarification of noun referent
• repeat of noun referent that was initial source of the recognition trouble
• recognitional marker

Private Industry Applications for SPA
• Technical Support Centers
• Help desks
• Call Centers
• Broadcast Media
• Depositions
• Courtroom Testimony
• Corporate Communications
• Conference Managers
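
Illustrative Sketch: Spotting the Sequence Package
The utterance components listed under "Finding the Sequence Package in the Dialog Example" can be checked mechanically once a dialog has been annotated with intonation and timing. The Python sketch below is only a toy illustration of that idea, not SPA's actual implementation, which the slides do not describe. Everything in it is an assumption made for the example: the `Event` record, the `find_sequence_package` function, the marker list, the 1.0 second silence threshold, the 0.3 second pause value, and the shortcut of treating the last word of the rising-intonation turn as the noun referent.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical recognitional markers; a real system would need a richer,
# language-specific inventory.
RECOGNITION_MARKERS = {"yeah", "yes", "right", "ok", "okay"}


@dataclass
class Event:
    """One annotated speaking turn (hypothetical annotation format)."""
    speaker: str
    text: str = ""            # empty text means the speaker stayed silent
    rising: bool = False      # True if the turn ends with upward intonation
    pause_after: float = 0.0  # seconds the same speaker pauses after the turn
    silence: float = 0.0      # seconds of silence when text is empty


def words(text: str) -> List[str]:
    """Lowercase word tokens, punctuation dropped."""
    return re.findall(r"[a-z']+", text.lower())


def find_sequence_package(
    events: List[Event],
    brief: Tuple[float, float] = (0.2, 0.5),  # brief-pause window from the slides
    min_silence: float = 1.0,                 # assumed threshold for listener silence
) -> Optional[str]:
    """Return the noun referent of the first sequence package found, else None."""
    for i in range(len(events) - 3):
        first, gap, clarification, reply = events[i:i + 4]
        tokens = words(first.text)
        # 1) noun referent with upward intonation
        if not (tokens and first.rising):
            continue
        referent = tokens[-1]  # assumption: referent is the final word of the turn
        # 2) brief pause by the same speaker
        if not (brief[0] <= first.pause_after <= brief[1]):
            continue
        # 3) silence by the listener
        if gap.speaker == first.speaker or gap.text or gap.silence < min_silence:
            continue
        # 4) clarification turn by the first speaker
        if clarification.speaker != first.speaker or not clarification.text:
            continue
        # 5) + 6) listener repeats the referent and adds a recognitional marker
        reply_tokens = words(reply.text)
        if (reply.speaker == gap.speaker
                and referent in reply_tokens
                and RECOGNITION_MARKERS & set(reply_tokens)):
            return referent
    return None


# The dialog example from the slides, annotated by hand.
dialog = [
    Event("A", "Come to the intersection near Juniors?", rising=True, pause_after=0.3),
    Event("B", silence=1.2),
    Event("A", "You know the thoroughfare with the big traffic light?", rising=True),
    Event("B", "Juniors, yeah."),
]
print(find_sequence_package(dialog))  # prints: juniors
```

The point of the sketch is that "Juniors" is recovered without appearing in any preset word list: it is identified purely by its position in the conversational sequence. A dialog in which the listener responds at once, with no silence and no clarification turn, yields no package.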