Machine Translation
Dai Xinyu
2006-10-27
Outline
• Introduction
• Architecture of MT
• Rule-Based MT vs. Data-Driven MT
• Evaluation of MT
• Development of MT
• MT problems in general
• Some Thinking about MT from recognition
Introduction
"I have a text in front of me which is written in Russian but I am going to pretend that it is really written in English and that it has been coded in some strange symbols. All I need do is strip off the code in order to retrieve the information contained in the text"
Machine translation: the use of computers to translate from one language to another
• The classic acid test for natural language processing.
• Requires capabilities in both interpretation and generation.
• About $10 billion spent annually on human translation.
http://www.google.com/language_tools?hl=en
Introduction - MT past and present
• mid-1950s - 1965:
  • Great expectations
• The dark ages for MT:
  • Academic research projects
• 1980s - 1990s:
  • Successful specialized applications
• 1990s:
  • Human-machine cooperative translation
• 1990s - now:
  • Statistics-based MT
  • Hybrid-strategy MT
• Future prospects:
  • ???
Interest in MT
• Commercial interest:
  • The U.S. has invested in MT for intelligence purposes
  • MT is popular on the web; it is the most used of Google’s special features
  • The EU spends more than $1 billion on translation costs each year
  • (Semi-)automated translation could lead to huge savings
Interest in MT
• Academic interest:
  • One of the most challenging problems in NLP research
  • Requires knowledge from many NLP sub-areas, e.g., lexical semantics, parsing, morphological analysis, statistical modeling, …
  • Being able to establish links between two languages allows for transferring resources from one language to another
Areas Related to MT
• Linguistics
• Computer Science
  • AI
  • Compilers
  • Formal Semantics
  • …
• Mathematics
  • Probability
  • Statistics
  • …
• Informatics
• Recognition
Architecture of MT (Levels of Transfer)
Rule-Based MT vs. Data-Driven MT
• Rule-Based MT
• Data-Driven MT
  • Example-Based MT
  • Statistics-Based MT
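Rule-Based MT
[Figure: linguistics, semantics, cognitive science, and artificial intelligence feed into writing rules; the rules drive a translation system that takes a natural-language input x and produces a translation result.]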
“Man, this is so boring.”
“Hmm, every time he sees “banco”, he either types “bank” or “bench” … but if he sees “banco de…”, he always types “bank”, never “bench”…”
[Figure label: Translated documents]
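The observation in the cartoon can be sketched as simple counting over translated documents. The snippet below is a minimal illustration only: the `observations` list is invented toy data, and `translation_counts` is a hypothetical helper, not part of any MT system described in these slides.

```python
from collections import Counter, defaultdict

# Invented toy data: a Spanish phrase containing "banco" and the English
# word a human translator chose for it.
observations = [
    ("el banco", "bank"),
    ("el banco", "bench"),
    ("un banco", "bench"),
    ("banco de españa", "bank"),
    ("banco de crédito", "bank"),
]

def translation_counts(observations):
    """Count translations of "banco", split by whether "de" follows it."""
    counts = defaultdict(Counter)
    for source, target in observations:
        tokens = source.split()
        i = tokens.index("banco")
        # The context is "banco de" when the next token is "de", else "banco".
        context = "banco de" if tokens[i + 1:i + 2] == ["de"] else "banco"
        counts[context][target] += 1
    return counts

counts = translation_counts(observations)
print(dict(counts["banco"]))     # both "bank" and "bench" occur
print(dict(counts["banco de"]))  # only "bank" occurs
```

With this toy data, bare "banco" is ambiguous between the two translations, while "banco de" is always rendered as "bank", which is the kind of regularity a data-driven system tries to pick up.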
Example-Based MT
• origins: Nagao (1981)
• first motivation: collocations, bilingual differences of syntactic structures
• basic idea:
  • human translators search for analogies (similar phrases) in previous translations
  • MT should seek matching fragments in a bilingual database and extract their translations
• aim to have less complex dictionaries, grammars, and procedures
• improved generation (using actual examples of target-language (TL) sentences)
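The "seek a matching fragment in a bilingual database" step can be sketched as nearest-example retrieval. This is a rough illustration under stated assumptions: `EXAMPLES` is an invented toy example base, `closest_example` is a hypothetical helper, and word-level similarity from the standard library's `difflib.SequenceMatcher` stands in for whatever matching measure a real EBMT system would use.

```python
from difflib import SequenceMatcher

# Invented toy example base of (source sentence, stored translation) pairs.
EXAMPLES = [
    ("el banco está cerrado", "the bank is closed"),
    ("la tienda está abierta", "the shop is open"),
    ("buenos días", "good morning"),
]

def closest_example(sentence, examples=EXAMPLES):
    """Return (similarity, source, target) for the most similar stored example."""
    def similarity(pair):
        # Compare word sequences rather than characters.
        return SequenceMatcher(None, sentence.split(), pair[0].split()).ratio()
    best = max(examples, key=similarity)
    return similarity(best), best[0], best[1]

score, source, target = closest_example("el museo está cerrado")
print(score, source, target)
```

This covers retrieval only. A full EBMT system would go on to replace the mismatched fragment (here "banco" vs. "museo") with its own translation and recombine the pieces.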
EBMT still going
• Bilingual corpus collection
• Storage
• Searching and matching
• …
Statistical MT Basics
• Based on the assumption that translations exhibit observable statistical regularities
• origins: Warren Weaver (1949)
  • Shannon’s information theory
• core process is the probabilistic ‘translation model’, taking source-language (SL) words or phrases as input and producing target-language (TL) words or phrases as output
• succeeding stage involves a probabilistic ‘language model’, which synthesizes TL words as ‘meaningful’ TL sentences
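Statistical MT
[Figure: statistical learning. A learning system builds a probabilistic model from natural-language inputs $x_1, x_2, \ldots, x_n$; a prediction system then uses that model to produce the prediction $\hat{p}(x_{n+1})$ for a new natural-language input $x_{n+1}$.]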
Statistical MT schema
Statistical MT processes
• Bilingual corpora: original and translation
• little or no linguistic ‘knowledge’; based on word co-occurrences in SL and TL texts (of a corpus), relative positions of words within sentences, length of sentences
• Alignment: sentences aligned statistically (according to sentence length and position)
• Decoding: compute probability that a TL string is the translation of an SL string (‘translation model’), based on:
  • frequency of co-occurrence in aligned texts of corpus
  • position of SL words in SL string
• Adjustment: compute probability that a TL string is a valid TL sentence (based on a ‘language model’ of allowable bigrams and trigrams)
• search for TL string that maximizes these probabilities
$$\arg\max_e P(e \mid f) = \arg\max_e P(f \mid e)\,P(e)$$
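The search for the TL string that maximizes $P(f \mid e)\,P(e)$ follows from Bayes' rule: $P(e \mid f) = P(f \mid e)\,P(e) / P(f)$, and $P(f)$ does not depend on $e$, so it can be dropped from the argmax. The sketch below shows that selection step over a fixed candidate list. All probabilities are invented toy values, and `decode` is a hypothetical helper; a real decoder searches a very large space of candidates rather than a short list.

```python
import math

# Invented toy values for the SL string f = "el banco".
# P(f | e): a word-based translation model that is blind to word order.
translation_model = {"the bank": 0.4, "the bench": 0.4, "bank the": 0.4}
# P(e): a language model that prefers fluent English strings.
language_model = {"the bank": 0.05, "the bench": 0.01, "bank the": 0.0001}

def decode(candidates, translation_model, language_model):
    """Return the candidate e that maximizes P(f | e) * P(e)."""
    def log_score(e):
        # Sum log-probabilities instead of multiplying small numbers.
        return math.log(translation_model[e]) + math.log(language_model[e])
    return max(candidates, key=log_score)

best = decode(list(translation_model), translation_model, language_model)
print(best)
```

Here the translation model alone cannot separate the three candidates, so the language model decides the outcome, which is the division of labour the two-stage description above implies.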
Language Modeling
• Determines the probability of some English sequence $e_1^l$ of length $l$
• P(e) is normally approximated as:
$$P(e_1^l) = P(e_1)\,P(e_2 \mid e_1) \prod_{i=3}^{l} P(e_i \mid e_{i-m}^{i-1})$$
where $m$ is the size of the context, i.e. the number of previous words that are considered:
  • m=1: bi-gram language model
  • m=2: tri-gram language model
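For the m=1 case, the approximation can be sketched with plain relative-frequency counts. This is a minimal sketch under assumptions not in the slides: the `corpus` is invented toy data, the `<s>` and `</s>` boundary markers are added for convenience (the formula above starts from $P(e_1)$ instead), and no smoothing is applied, so any unseen bigram yields probability zero.

```python
from collections import Counter

def train_bigram_lm(sentences):
    """Return prob(word, prev), the relative-frequency estimate of P(word | prev)."""
    history, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        history.update(tokens[:-1])              # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    def prob(word, prev):
        return bigrams[(prev, word)] / history[prev] if history[prev] else 0.0
    return prob

def sentence_prob(prob, sentence):
    """Multiply the bigram probabilities along the sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        p *= prob(word, prev)
    return p

# Invented toy training corpus.
corpus = ["the bank is closed", "the bank is open", "the bench is wet"]
prob = train_bigram_lm(corpus)
print(sentence_prob(prob, "the bank is closed"))  # 2/9, about 0.222
print(sentence_prob(prob, "bank the is closed"))  # 0.0: "bank" never starts a sentence
```

Practical language models smooth these estimates so that unseen word sequences still receive a small non-zero probability.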
Translation Modeling
• Determines the probability that the foreign word f is a translation of the English word e
• How to compute P(f | e) from a parallel corpus?
• Statistical approaches rely on the co-occurrence of e and f in the parallel data: if e and f tend to co-occur in parallel sentence pairs, they are likely to be translations of one another
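One standard way to turn co-occurrence into estimates of P(f | e) is expectation-maximization in the style of IBM Model 1 (see Brown et al., 1993, in Further Reading). The sketch below is a simplified illustration, not the full model: it omits the NULL word, starts from a uniform table, and runs a fixed number of iterations. The `corpus` is invented toy data and `train_ibm1` is a hypothetical helper name.

```python
from collections import defaultdict

def train_ibm1(pairs, iterations=10):
    """Estimate t[(f, e)], an approximation of P(f | e), with EM."""
    f_vocab = {f for fs, _ in pairs for f in fs}
    # Uniform initialization: every foreign word is equally likely for any e.
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of e being aligned to f
        total = defaultdict(float)  # expected count of e being aligned at all
        # E-step: share each foreign word among the English words of its pair.
        for fs, es in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: re-estimate the table from the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

# Invented toy parallel corpus: (foreign words, English words).
corpus = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]
t = train_ibm1(corpus)
print(round(t[("das", "the")], 3), round(t[("Haus", "the")], 3))
```

Although "the" co-occurs with "das", "Haus", and "Buch", only "das" appears in both sentence pairs that contain "the", so the probability mass shifts toward t(das | the) over the iterations.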
SMT issues
• ignores previous MT research (new start, new ‘paradigm’)
• basically ‘direct’ approach:
  • replaces SL word by most probable TL word,
  • reorders TL words
• decoding is effectively a kind of ‘back translation’
• originally wholly word-based (IBM ‘Candide’ 1988); now predominantly phrase-based (i.e. alignment of word groups); some research on syntax-based
• mathematically simple, but huge amount of training (large databases)
• problems for SMT:
  • translation is not just selecting the most frequent ‘equivalent’ (wider context)
  • no quality control of corpora
  • lack of monolingual data for some languages
  • insufficient bilingual data (Internet as resource)
  • lack of structure information of language
• merit of SMT: evaluation as integral process of system development
Rule-Based MT & SMT
• SMT is a black box: no way of finding out how it works in particular cases, or why it succeeds sometimes and not others
• RBMT: rules and procedures can be examined
• RBMT and SMT are apparent polar opposites, but ‘rules’ are gradually being incorporated in SMT models
  • first, morphology (even in versions of the first IBM model)
  • then, ‘phrases’ (with some similarity to linguistic phrases)
  • now also, syntactic parsing
Rule-Based MT & SMT
• Comparison from the following perspectives:
  • Theoretical background
  • Knowledge expression
  • Knowledge discovery
  • Robustness
  • Extensibility
  • Development cycle
Evaluation of MT
• Manual:
  • Precision / fluency / completeness
  • 信达雅 (faithfulness, expressiveness, elegance)
• Automatic evaluation:
  • BLEU: percentage of word sequences (n-grams) in the MT output that also occur in reference texts
  • NIST
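The n-gram overlap at the core of BLEU can be sketched as clipped ("modified") n-gram precision. This covers one ingredient only: the full BLEU score also combines several n-gram orders with a geometric mean and applies a brevity penalty, both omitted here. The function names and example sentences are illustrative assumptions.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Share of candidate n-grams found in the references, with clipped counts."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count observed in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, c in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], c)
    # A candidate n-gram is credited at most as often as a reference uses it.
    clipped = sum(min(c, max_ref[gram]) for gram, c in cand_counts.items())
    return clipped / sum(cand_counts.values())

refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
print(modified_precision("the the the the the the the".split(), refs, 1))  # 2/7
print(modified_precision("the cat sat on the mat".split(), refs[:1], 2))   # 3/5
```

Clipping is what stops a degenerate output such as "the the the …" from scoring perfectly: "the" occurs at most twice in any one reference, so only two of its seven occurrences count.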
Development of MT - MT System
MT Development - Research
[Figure: MT approaches plotted against two axes.
Axis "Knowledge Acquisition Strategy", from all manual to fully automated: hand-built by experts; hand-built by non-experts; learn from annotated data; learn from unannotated data.
Axis "Knowledge Representation Strategy", from shallow/simple to deep/complex: word-based only; phrase tables; syntactic constituent structure; semantic analysis; interlingua.
Entries on the chart: electronic dictionaries; original direct approach; typical transfer system; classic interlingual system; original statistical MT; example-based MT; "New Research Goes Here!"]
MT problems in general
• Characteristics of language
  • Ambiguous
  • Dynamic
  • Flexible
• Knowledge
  • How to express
  • How to discover
  • How to use
Some Thinking about MT from recognition
• The human brain
  • Memory
  • Progress - Learning
  • Model
  • Pattern
• Translation by human…
• Translation by machine…
Further Reading
• Arturo Trujillo, Translation Engines: Techniques for Machine Translation, Springer-Verlag London Limited, 1999
• P.F. Brown, et al., A Statistical Approach to Machine Translation, Computational Linguistics, 1990, 16(2)
• P.F. Brown, et al., The Mathematics of Statistical Machine Translation: Parameter Estimation, Computational Linguistics, 1993, 19(2)
• Bonnie J. Dorr, et al., Survey of Current Paradigms in Machine Translation
• Makoto Nagao, A Framework of a Mechanical Translation between Japanese and English by Analogy Principle, in A. Elithorn and R. Banerji (Eds.), Artificial and Human Intelligence, NATO Publications, 1984
• W.J. Hutchins, Machine Translation: Past, Present, Future, Chichester: Ellis Horwood, 1986
• Daniel Jurafsky & James H. Martin, Speech and Language Processing, Prentice-Hall, 2000
• Christopher D. Manning & Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999
• James Allen, Natural Language Understanding, The Benjamin/Cummings Publishing Company, Inc., 1987