Definite Clause Grammar
• The most popular approach to parsing in Prolog is
Definite Clause Grammar (DCG), which is a
generalization of Context Free Grammar (CFG).
• Parsing is one of the important applications of
Prolog and logic programming.
• The DCG formalism is essentially independent of
Prolog: a compiler or interpreter for it can be
written in any programming language that
permits the unification of arguments.
• DCG is nevertheless easy to implement in Prolog
because grammar rules are similar to Prolog rules.
• Let us see the relationship of DCG with Prolog
and its systematic evolution.
• Here we start from Prolog and work towards DCG to
justify the claim that DCG is a by-product of Prolog.
• We begin by defining Context Free Grammar,
where rules are expressed in Backus Normal Form
(BNF).
• The general form of a CFG rule is as follows:
<non_terminal> ::= <body>
where body is a sequence of terminal
and non-terminal symbols of the grammar.
• Consider the following grammar for a small subset
of English sentences, defined using a BNF-like
notation.
<sentence>    ::= <noun_phrase>, <verb_phrase>
<noun_phrase> ::= <determiner>, <noun>
<verb_phrase> ::= <verb>, <noun_phrase> | <verb>
<determiner>  ::= a | the | an
<noun>        ::= apple | boy | girl | song
<verb>        ::= eats | sings
• Declarative meaning of first rule is that a sentence
can take a form in which noun_phrase is followed
by a verb_phrase.
• The parse tree for the sentence “the girl sings a
song” is given as follows:
sentence
    noun_phrase
        determiner
            the
        noun
            girl
    verb_phrase
        verb
            sings
        noun_phrase
            determiner
                a
            noun
                song
• This grammar is context free and does not take
care of number agreement or other semantic
information.
• The sentence “the girl sing a song” is also parsed if
the verb sing is available in the lexicon, although it
is syntactically wrong.
• Semantically incorrect sentences are also
parsed. For example, “the apple eats a boy” is
correct according to the above grammar.
• The reason is simply that we have not incorporated
any context-sensitive or semantic information.
• If we specify that the subject of eat should
be animate (a living thing) and the object
should be edible, then semantically incorrect
sentences can be rejected.
• All these semantic features can be added and are
explained later.
• The CFG grammar can be easily coded into Prolog
rules.
• Each non_terminal symbol becomes a unary
predicate whose argument is a sentence or phrase
it identifies.
sentence(X) :- append(Y, Z, X), np(Y), vp(Z).    % (1)
np(X) :- append(Y, Z, X), det(Y), noun(Z).       % (2)
vp(X) :- append(Y, Z, X), verb(Y), np(Z).        % (3)
vp(X) :- verb(X).                                % (4)
• The rules for terminal words are coded as facts.
det([a]). det([the]). det([an]).
noun([boy]). noun([apple]). noun([girl]).
noun([song]). verb([eats]). verb([sings]).
Goal: ?- sentence([the, girl, sings, a, song]).
• Note that a sentence is given as a list of
words, each represented by a Prolog atom.
• In rule (1), X is instantiated to [the, girl, sings, a,
song], but Y and Z are uninstantiated variables.
• The goal append generates all possible pairs of
values of Y and Z from X, since X is the
concatenation of the two lists Y and Z.
• The following pairs of Y and Z lists are obtained.
Y = [ ], Z = [the, girl, sings, a, song]
Y = [the], Z = [girl, sings, a, song]
Y = [the, girl], Z = [sings, a, song]
Y = [the, girl, sings], Z = [a, song]
Y = [the, girl, sings, a], Z = [song]
Y = [the, girl, sings, a, song], Z = [ ]
• Out of the six pairs listed above, Y = [the, girl] and
Z = [sings, a, song] is the only pair for
which the sub goals np(Y), vp(Z) are satisfied.
• Here we notice that a lot of unnecessary searching
is done, most of which is useless. There is a
more direct method which avoids the generation of
such pairs.
• The call to append suggests that a difference-list
might be a more appropriate structure for parsing.
• Using the difference-list concept, append reduces to
a single fact: append(X - Z, Z - Y, X - Y).
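The append-based recognizer described above can be tried out directly. The following is a minimal sketch of it, assuming a Prolog system such as SWI-Prolog where append/3 comes from library(lists); the helper splits/2 is a hypothetical name added only to show the pairs that append/3 enumerates.

```prolog
:- use_module(library(lists)).

% Recognizer for the small English grammar. Every rule uses append/3
% to guess a split of the word list and then checks both halves.
sentence(X) :- append(Y, Z, X), np(Y), vp(Z).
np(X)       :- append(Y, Z, X), det(Y), noun(Z).
vp(X)       :- append(Y, Z, X), verb(Y), np(Z).
vp(X)       :- verb(X).

% Terminal words coded as facts over one-word lists.
det([a]).      det([the]).    det([an]).
noun([boy]).   noun([apple]). noun([girl]).  noun([song]).
verb([eats]).  verb([sings]).

% splits(+List, -Pairs): hypothetical helper that collects every
% Y-Z pair that append(Y, Z, List) generates.
splits(List, Pairs) :-
    findall(Y-Z, append(Y, Z, List), Pairs).
```

The goal ?- sentence([the, girl, sings, a, song]). succeeds, and ?- splits([the, girl, sings, a, song], Ps). returns the six candidate pairs, of which only one survives the np and vp tests.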
Difference List : (Incomplete Data Structure)
• The difference list is an alternative data structure for
representing a list.
• An incomplete list is an example of such a structure. For
example, [1,2,3 | X] is an incomplete list whereas
[1,2,3,4] is a complete list.
• Consider the complete list [1, 2, 3]. We can represent it
as the difference of any of the following pairs of lists:
[1, 2, 3, 5, 8] and [5, 8]
[1, 2, 3, 6, 7, 8, 9] and [6, 7, 8, 9]
[1, 2, 3] and [ ].
• Each of these is an instance of the pair of
incomplete lists [1,2,3 | X] and X. We call such a pair a
difference-list.
• We denote the difference list by A-B, where A is the first
argument and B is the second argument of the difference-list.
Such a representation makes some list
operations more efficient.
Example: Concatenating two lists represented in the form
of difference lists.
Solution: When two lists represented as difference lists
are concatenated (or appended), we get the appended list
by simply unifying the appropriate arguments, as given
below:
diff_append(A - B, B - C, A - C).
Graphical representation of the append program
for difference lists: the list A-B followed by the list B-C
gives the list A-C.
• If we have to append two lists [1,2,3] and [4,5,6], then
we execute the following goal using the difference-list rule
given above.
Goal: ?- diff_append([1,2,3 | X] - X, [4,5,6 | Y] - Y, N).
Search tree:
?- diff_append([1,2,3 | X] - X, [4,5,6 | Y] - Y, N).
    {A = [1,2,3 | X], B = X = [4,5,6 | Y],
     C = Y, N = A - C = [1,2,3,4,5,6 | Y] - Y}
    succeeds
Answer:
X = [4,5,6 | Y]
N = [1,2,3,4,5,6 | Y] - Y
• This program cannot be used directly for concatenating two
complete lists.
• Each list has to be represented in difference-list
notation. This representation has a nontrivial limitation:
the first list gets changed, because its tail variable X
becomes bound by the call.
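A small sketch of the difference-list append, together with the limitation just mentioned, is given below. The helper dl_to_list/2 is a hypothetical name introduced here to turn the result back into an ordinary list; it is not part of the original program.

```prolog
% Appending two difference lists is a single unification step.
diff_append(A - B, B - C, A - C).

% dl_to_list(+DiffList, -List): hypothetical helper that closes the
% open tail with [] so the result becomes a complete list.
dl_to_list(List - [], List).
```

With the goal ?- diff_append([1,2,3 | X] - X, [4,5,6 | Y] - Y, N), dl_to_list(N, L). the variable L is bound to the complete list of six elements. Note that X is bound to [4,5,6 | Y] by the call, so the first difference list can no longer be reused as a representation of [1,2,3].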
• Consider the following rule
sentence(X) :- append(Y, Z, X), np(Y), vp(Z).
• We can rewrite it using difference lists as follows:
sentence(X-Y) :- append(X-Z, Z-Y, X-Y),
                 np(X-Z), vp(Z-Y).
• Since append(X-Z, Z-Y, X-Y) is always true, we can
remove it from the rule.
• Therefore, the modified rule becomes
sentence(X-Y) :- np(X-Z), vp(Z-Y).    (1)
• For the sake of convenience, we can write (1) as
sentence(X, Y) :- np(X, Z), vp(Z, Y).
Interpretation: There is a sentence in the difference
of the two lists X and Y if there is a noun_phrase in the
difference of the two lists X and Z and a verb_phrase in
the difference of Z and Y.
• The np clause decides how much of the sequence
is to be consumed and what is to be left for the vp
clause to work on.
• A terminal symbol in Prolog is coded using the
difference-list concept as terminal([token | X], X),
which means that there is a terminal symbol
in the difference of the two lists [token | X] and X.
• For example:
det([the | X], X).
noun([girl | X], X).
verb([sing | X], X).
• The complete program in Prolog using difference
lists is written as follows:
sentence(X, Y) :- np(X, Z), vp(Z, Y).
np(X, Y) :- det(X, Z), noun(Z, Y).
vp(X, Y) :- verb(X, Z), np(Z, Y).
vp(X, Y) :- verb(X, Y).
det([a | X], X). det([an | X], X). det([the | X], X).
noun([boy | X], X). noun([girl | X], X).
noun([song | X], X). noun([apple | X], X).
verb([sing | X], X). verb([sings | X], X).
verb([eats | X], X).
• The above grammar using DCG rules is given as
sentence --> np, vp.
np       --> det, noun.
vp       --> verb.
vp       --> verb, np.
det      --> [a].
det      --> [an].
det      --> [the].
noun     --> [boy].
noun     --> [girl].
noun     --> [song].
noun     --> [apple].
verb     --> [sing].
verb     --> [sings].
verb     --> [eats].
• In most Prolog systems, a DCG handler is
built in that translates DCG rules into Prolog rules.
• The actual grammar rules are Prolog structures
with main functor -->, which is declared as an infix
operator at the beginning of the program.
• The Prolog system checks whether a term read in has
--> as its functor and, if so, translates it into a proper
Prolog clause. For example,
sentence --> np, vp.
is translated into
sentence(X, Y) :- np(X, Z), vp(Z, Y).
and
det --> [the].
is translated into
det([the | X], X).
• The DCG rules are translated into Prolog rules by
adding two difference-list arguments.
• The query is a normal Prolog goal, so the user
adds the extra arguments explicitly:
?- sentence([the, girl, sings, a, song], []).
• The goal is satisfied using the above DCG grammar
rules. However, the following goal is also satisfied.
?- sentence([the, girl, sing, a, song], []).
• In order to avoid this, number agreement
between subject and verb can be easily
incorporated in the DCG grammar.
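As an illustration of the translation described above, the grammar can be loaded as DCG rules and queried either with the two extra arguments written by hand or through the standard phrase/2 and phrase/3 predicates. This is a sketch that assumes a Prolog system with a built-in DCG translator (for example SWI-Prolog).

```prolog
% DCG version of the small English grammar. The translator adds the
% two difference-list arguments to every nonterminal.
sentence --> np, vp.
np       --> det, noun.
vp       --> verb, np.
vp       --> verb.

det  --> [a].
det  --> [an].
det  --> [the].
noun --> [boy].
noun --> [girl].
noun --> [song].
noun --> [apple].
verb --> [sing].
verb --> [sings].
verb --> [eats].
```

The goals ?- sentence([the, girl, sings, a, song], []). and ?- phrase(sentence, [the, girl, sings, a, song]). are equivalent ways of asking the same question. The goal ?- phrase(np, [the, girl, sings, a, song], Rest). shows how np decides what is consumed: it leaves Rest = [sings, a, song] for the verb phrase.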
Adding Extra Arguments
• The grammar rules considered so far are of a
restricted kind.
• Let us consider one useful extension, which allows
phrase types to have extra arguments.
• One way to resolve the problem of number
agreement is to duplicate the grammar rules for
singular and plural with different names.
• Express the grammar rules by saying that there are
two kinds of sentences, viz., singular sentences and
plural sentences. For example,
sentence  --> sing_sent.
sentence  --> plur_sent.
sing_sent --> sing_np, sing_vp.
sing_np   --> s_det, sing_noun.
sing_vp   --> sing_verb, np.
sing_vp   --> sing_verb.
np        --> sing_np.
np        --> plur_np.
• The rules for plur_sent are defined similarly. It is
clear that this is not an elegant way of handling
singular and plural sentences.
• These sentences have a lot of structure in common.
• A better way is to associate an extra argument with
phrase types, indicating singular or plural.
• In the grammar shown below, the argument M
corresponds to the number of the entire sentence and M1
to the number of the noun phrase inside the verb phrase.
• The modified grammar incorporating number
agreement arguments is rewritten as follows:
sentence       --> sentence1(M).
sentence1(M)   --> np(M), vp(M).
np(M)          --> det(M), noun(M).
vp(M)          --> verb(M).
vp(M)          --> verb(M), np(M1).
det(singular)  --> [a].
det(singular)  --> [an].
det( _ )       --> [the].
noun(singular) --> [boy].
noun(singular) --> [girl].
noun(singular) --> [apple].
noun(plural)   --> [apples].
noun(plural)   --> [girls].
noun(singular) --> [song].
verb(singular) --> [sings].
verb(plural)   --> [sing].
verb(singular) --> [eats].
verb(plural)   --> [eat].
• Goal:
?- sentence([the, girl, sing, a, song], []).
• This goal now fails, because the singular subject
‘the girl’ does not agree with the plural verb ‘sing’.
• Note that we have added context
sensitivity to a context free grammar by adding
an extra argument.
• This type of grammar is called a DCG grammar,
as nonterminal symbols can have arguments, in
contrast to CFG.
• Further, we can introduce arguments to express
other important information as well, such as
an extra argument to return a parse structure
for a syntactically correct sentence rather than
simply saying 'yes' or 'no'.
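The effect of the number argument can be checked with a runnable version of the grammar. The sketch below follows the rules given above; the only changes are that the unused variables are written as _ to avoid singleton warnings, and that the standard phrase/2 predicate is used for the queries.

```prolog
% Number agreement: M is shared between the subject noun phrase and
% the verb, so both must be singular or both plural.
sentence     --> sentence1(_).
sentence1(M) --> np(M), vp(M).
np(M)        --> det(M), noun(M).
vp(M)        --> verb(M).
vp(M)        --> verb(M), np(_).   % the object may have its own number

det(singular)  --> [a].
det(singular)  --> [an].
det(_)         --> [the].
noun(singular) --> [boy].
noun(singular) --> [girl].
noun(singular) --> [apple].
noun(singular) --> [song].
noun(plural)   --> [apples].
noun(plural)   --> [girls].
verb(singular) --> [sings].
verb(plural)   --> [sing].
verb(singular) --> [eats].
verb(plural)   --> [eat].
```

With these rules, ?- phrase(sentence, [the, girl, sings, a, song]). and ?- phrase(sentence, [the, girls, sing, a, song]). succeed, while ?- phrase(sentence, [the, girl, sing, a, song]). fails.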
Construction of Parse Structure
• In Prolog, root(sub_tree1, sub_tree2, ....) is a
representation of the following tree.
root
    sub_tree1
    sub_tree2
    sub_tree3
    …
• Here sub_tree1, sub_tree2 etc. are themselves
trees.
• Consider the following parse structure tree of the
correct sentence 'the girl sings a song'.
sent
    n_p
        d
            the
        n
            girl
    v_p
        v
            sings
        n_p
            d
                a
            n
                song
• In Prolog, the above parse tree is coded as
sent(
    n_p(d(the), n(girl)),
    v_p(v(sings),
        n_p(d(a), n(song))
    )
)
• Here sent, n_p, v_p, n, v, d are user-defined
functor names representing sentence, noun phrase,
verb phrase, noun, verb and determiner.
• These names can be the same as the predicate names, but
for the sake of clarity we use different names.
• The parse structure tree P of a sentence is constructed
as follows:
P = sent(NP, VP), where NP and VP are the
parse structures of the noun phrase and verb phrase
respectively.
NP = n_p(D, N), where D and N are the parse
structures of the determiner and noun respectively.
VP = v_p(V, NP), where V and NP are the parse
structures of the verb and of the noun phrase in the verb phrase.
• This indicates that we can construct the parse structure of a
sentence by using the parse structures of n_p (NP) and v_p
(VP).
• The grammar rules with argument P for the parse structure
tree and M for the number of a sentence are given below:
sentence(P)               --> sentence(M, P).
sentence(M, sent(NP, VP)) --> np(M, NP), vp(M, VP).
np(M, n_p(D, N))          --> det(M, D), noun(M, N).
vp(M, v_p(V))             --> verb(M, V).
vp(M, v_p(V, NP1))        --> verb(M, V), np(M1, NP1).
det(singular, d(a))       --> [a].
det( _ , d(the))          --> [the].
noun(singular, n(girl))   --> [girl].
noun(plural, n(girls))    --> [girls].
noun(singular, n(song))   --> [song].
verb(singular, v(sings))  --> [sings].
verb(plural, v(sing))     --> [sing].
• A DCG rule is translated to a Prolog rule by adding two
difference-list arguments as follows:
DCG rule:
sentence(M, sent(NP, VP)) --> np(M, NP), vp(M, VP).
Prolog rule:
sentence(M, sent(NP, VP), X, Y) :-
    np(M, NP, X, Z), vp(M, VP, Z, Y).
• Here X, Y and Z stand for parts of the input sentence
and are added automatically by the DCG handler.
Goal: ?- sentence(P, [the, girl, sings, a, song], []).
Search tree:
?- sentence(M, P, [the, girl, sings, a, song], []).
    P = sent(NP, VP)
?- np(M, NP, [the, girl, sings, a, song], Z), vp(M, VP, Z, []).
    NP = n_p(D, N)
?- det(M, D, [the, girl, sings, a, song], Z1), noun(M, N, Z1, Z), …..
    D = d(the), Z1 = [girl, sings, a, song]
?- noun(M, N, [girl, sings, a, song], Z), vp(M, VP, Z, []).
    M = singular, N = n(girl), Z = [sings, a, song]
?- vp(singular, VP, [sings, a, song], []).
    VP = v_p(V, NP1)
?- verb(singular, V, [sings, a, song], Y), np(M1, NP1, Y, []).
    V = v(sings), Y = [a, song]
?- np(M1, NP1, [a, song], []).
    NP1 = n_p(D1, N1)
?- det(M1, D1, [a, song], X), noun(M1, N1, X, []).
    M1 = singular, D1 = d(a), X = [song]
?- noun(singular, N1, [song], []).
    N1 = n(song)
succeeds
Answer:
P = sent(n_p(d(the), n(girl)), v_p(v(sings), n_p(d(a), n(song))))
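The construction of the parse structure can be reproduced with the sketch below. The grammar follows the rules above; print_tree/1 is a hypothetical helper added here to display a term of the form root(sub_tree1, sub_tree2, ...) as an indented tree, and it assumes member/2 from library(lists) and the forall/2 built-in of SWI-Prolog.

```prolog
:- use_module(library(lists)).

% Grammar that returns the parse structure P and checks number M.
sentence(P)               --> sentence(_, P).
sentence(M, sent(NP, VP)) --> np(M, NP), vp(M, VP).
np(M, n_p(D, N))          --> det(M, D), noun(M, N).
vp(M, v_p(V))             --> verb(M, V).
vp(M, v_p(V, NP1))        --> verb(M, V), np(_, NP1).

det(singular, d(a))       --> [a].
det(_, d(the))            --> [the].
noun(singular, n(girl))   --> [girl].
noun(plural, n(girls))    --> [girls].
noun(singular, n(song))   --> [song].
verb(singular, v(sings))  --> [sings].
verb(plural, v(sing))     --> [sing].

% print_tree(+Tree): hypothetical helper that writes one node per
% line, indenting each level of the tree by two spaces.
print_tree(Tree) :- print_tree(Tree, 0).

print_tree(Tree, Depth) :-
    Tree =.. [Root | SubTrees],        % split functor and sub trees
    Indent is Depth * 2,
    tab(Indent), write(Root), nl,
    Depth1 is Depth + 1,
    forall(member(Sub, SubTrees), print_tree(Sub, Depth1)).
```

A typical use is ?- phrase(sentence(P), [the, girl, sings, a, song]), print_tree(P). which binds P to the parse structure and then prints it level by level.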
Adding Extra Tests
• So far we have seen that the grammar rule
translator (DCG handler) adds two extra
arguments to each atom of a rule when
converting DCG grammar rules to Prolog clauses.
• Sometimes it is desirable to specify Prolog sub
goals in DCG grammar rules.
• This can be easily achieved by putting the Prolog sub
goals inside curly brackets.
• At the time of conversion, the DCG handler leaves
sub goals enclosed in { } unchanged and removes
the brackets.
• This is useful when defining a lexicon.
• Suppose we want to add new nouns such as
banana, apples and orange to the grammar specified
earlier. We would write the following noun rules.
noun(singular, n(banana)) --> [banana].
noun(plural, n(apples))   --> [apples].
noun(singular, n(orange)) --> [orange].
• We notice that there is a lot of information to be
specified for each noun, even though we know that
every noun occupies only one element of the input
list and gives rise to a small parse tree with the
functor 'n'.
• A much more economical way is to
express the common information about all
nouns in one place and the information about
a particular word somewhere else.
• We move the word details out of the grammar rules and
into the lexicon, leaving one general rule in the grammar.
• The lexicon may be stored in a separate file, which
may grow or shrink according to need. This
file is loaded into the main program containing the
grammar rules at the time of execution by the consult
predicate.
Abstract DCG rule for Noun:
noun(M, n(N)) --> [N], { is_noun(M, N) }.
Equivalent Prolog rule:
noun(M, n(N), [N | X], X) :- is_noun(M, N).
• Here, is_noun is a normal Prolog predicate used to
express an individual word.
• The argument M represents the number of a noun and
N represents the noun word.
• Curly brackets indicate that the sub goals inside
them remain unchanged after translation from
DCG rule to Prolog rule.
• The nouns in the lexicon are specified as follows:
is_noun(singular, banana).
is_noun(plural, apples).
is_noun(singular, orange).
• Similarly, the abstract DCG rule for verbs and the
lexicon for verbs are defined as follows:
verb(M, v(V)) --> [V], { is_verb(M, V) }.
is_verb(singular, eats).
is_verb(plural, eat).
is_verb(singular, sings).
is_verb(plural, sing).
• Here we notice that each noun or verb is still
specified as singular or plural, even though the two
forms differ only by some characters added or removed at
the end of the token, e.g., banana / bananas, eat
/ eats etc.
• Conversion from singular to plural or
vice versa could be handled by morphological
rules. In that case one needs to specify only one
form of the token.
• For the sake of simplicity we include both
forms in the lexicon.
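A very simple morphological rule of the kind mentioned above could look like the following sketch. It is only an assumption for illustration: base_noun/1 and base_verb/1 are hypothetical predicates that store one form per word, and the rule handles nothing beyond adding or removing a final "s", so irregular words are not covered.

```prolog
% Hypothetical base lexicon: only one form of each word is stored.
base_noun(banana).
base_noun(orange).
base_noun(girl).
base_verb(eat).
base_verb(sing).

% Nouns: the stored form is singular, the plural adds a final "s".
is_noun(singular, W) :- base_noun(W).
is_noun(plural, W)   :- atom(W), atom_concat(Base, s, W), base_noun(Base).

% Verbs: the stored form is plural, the singular adds a final "s".
is_verb(plural, W)   :- base_verb(W).
is_verb(singular, W) :- atom(W), atom_concat(Base, s, W), base_verb(Base).

% The abstract DCG rules stay exactly as before.
noun(M, n(N)) --> [N], { is_noun(M, N) }.
verb(M, v(V)) --> [V], { is_verb(M, V) }.
```

For example, ?- phrase(noun(M, T), [bananas]). gives M = plural and T = n(bananas) although only banana is stored, and ?- phrase(verb(M, T), [eats]). gives M = singular.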
• Complete DCG grammar with abstract rules
sentence(P)               --> sentence(M, P).
sentence(M, sent(NP, VP)) --> np(M, NP), vp(M, VP).
np(M, n_p(D, N))          --> det(M, D), noun(M, N).
vp(M, v_p(V))             --> verb(M, V).
vp(M, v_p(V, NP1))        --> verb(M, V), np(M1, NP1).
noun(M, n(N))             --> [N], { is_noun(M, N) }.
verb(M, v(V))             --> [V], { is_verb(M, V) }.
det(M, d(D))              --> [D], { is_det(M, D) }.
Lexicon: (can be stored separately in a file)
is_noun(singular, girl). is_noun(plural, girls).
is_noun(singular, song). is_noun(singular, banana).
is_noun(plural, apples). is_noun(singular, orange).
is_verb(singular, eats). is_verb(plural, eat).
is_verb(singular, sings). is_verb(plural, sing).
is_det(singular, a). is_det(singular, an). is_det( _ , the).
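To illustrate how a separate lexicon can grow without touching the grammar, the sketch below keeps the abstract rules and declares is_noun/2 as dynamic so that a word can be added at run time. Keeping everything in one file and using assertz/1 instead of consulting a lexicon file is a simplification made here; the noun mango is a hypothetical new entry.

```prolog
:- dynamic is_noun/2.   % allows the noun lexicon to be extended

% Grammar with abstract rules for nouns, verbs and determiners.
sentence(P)               --> sentence(_, P).
sentence(M, sent(NP, VP)) --> np(M, NP), vp(M, VP).
np(M, n_p(D, N))          --> det(M, D), noun(M, N).
vp(M, v_p(V))             --> verb(M, V).
vp(M, v_p(V, NP1))        --> verb(M, V), np(_, NP1).
noun(M, n(N))             --> [N], { is_noun(M, N) }.
verb(M, v(V))             --> [V], { is_verb(M, V) }.
det(M, d(D))              --> [D], { is_det(M, D) }.

% Lexicon.
is_noun(singular, girl).
is_noun(plural, girls).
is_noun(singular, banana).
is_noun(plural, apples).
is_verb(singular, eats).
is_verb(plural, eat).
is_det(singular, a).
is_det(singular, an).
is_det(_, the).
```

Before the lexicon is extended, ?- phrase(sentence(P), [the, girl, eats, a, mango]). fails. After ?- assertz(is_noun(singular, mango)). the same goal succeeds, with no change to the grammar rules.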
Construction of Semantic Representation
• The semantic representation of a sentence is obtained by
applying the principle of compositionality, which
means that the semantic representation is composed
from the semantic representations of its constituents.
• The semantic representation of a sentence gives the meaning
of the sentence; its computation relies on two things.
– The representation of each individual word in the lexicon (which
contains its semantic representation as one of its arguments).
– The rules for semantic composition.
• For example, the construction of the semantic
representation of a noun phrase is based on the
semantic representations of the determiner and the noun.
• Here we take the semantic representation of
a sentence to be a first order predicate logic formula
that specifies the meaning of the sentence.
• For example, the semantic representation of the sentence
‘every man is mortal’ is (∀X) (man(X) → mortal(X)).
• We call the semantic representation the Logical Form (LF).
• The computation of the semantic representation is
particularly simple in Prolog.
• The use of variables allows us to specify the structures.
• Covington (1988) has used lambda (λ) notation and
DCG grammar for semantic representation.
• For example, man(X) and mortal(X) can be represented in
lambda notation as λX . man(X) and λX . mortal(X).
• A λ expression is an unnamed function which, when applied
to a given value of X, say ‘john’, gives man(john) and
mortal(john), i.e., (λX . man(X)) john = man(john).
• Lambda notation provides a way to encode formulae
with missing information.
• For example, ‘john likes mary’ has the LF
likes(john, mary).
• λ expressions for the verb ‘likes’ are given as follows:
λX . likes(X, mary)      ->  "_ likes mary"
λY . λX . likes(X, Y)    ->  "_ likes _"
• In Prolog, there is no standard notation for lambda
expressions, but it is not difficult to construct one and
use it in a Prolog program.
• A lambda expression is merely a two-argument
term whose arguments are a variable and a term in
which that variable occurs.
• We use the character ^ as an infix operator to hold
a lambda expression. Therefore, we write
X^likes(X, mary) for λX . likes(X, mary)
Y^X^likes(X, Y) for λY . λX . likes(X, Y)
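Applying such a lambda term to a value amounts to a single unification. The sketch below shows one possible way to do it; reduce/3 is a hypothetical name, and copy_term/2 is used so that the stored lambda term itself stays unbound and can be applied again to other values.

```prolog
% reduce(+Lambda, ?Arg, -Result): apply the lambda term Var^Body to
% Arg. A fresh copy of the term is unified with Arg^Result, so the
% variable takes the value Arg and Result becomes the filled-in body.
reduce(Lambda, Arg, Result) :-
    copy_term(Lambda, Arg^Result).
```

For example, ?- reduce(X^man(X), john, R). gives R = man(john). Applying the two-place term in two steps, ?- reduce(Y^X^likes(X, Y), mary, R1), reduce(R1, john, R2). first fills in the object and then the subject, giving R2 = likes(john, mary).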
Semantic Representation of Lexical items
• In the lexicon, the semantic representation of each word
is given.
• The syntactic category of a word can be noun,
verb, determiner, adjective, adverb, pronoun or
conjunction.
• Let us see the semantic representation of a word in
each category.
• The LF of ‘man’ is X^man(X).
• The entry for ‘man’, in the translated (difference-list)
form of a DCG rule, is defined as follows:
noun(X^man(X), [man | T], T).
• The general DCG grammar rule for a noun N is written
as: noun(X^N(X)) --> [N], {is_noun(N)}.
• The lexicon entry for the noun ‘man’ is: is_noun(man).
• Here N(X) is schematic: standard Prolog does not allow a
variable as a functor, so in a program the term is built
with the =.. operator.
• Similarly, general grammar rules for transitive
(verbs requiring objects) and intransitive (verbs
requiring no objects) verbs are given as:
t_verb(Y^X^V(X, Y)) --> [V], {is_tverb(V)}.
int_verb(X^V(X))    --> [V], {is_intverb(V)}.
• Lexicon entries are as follows:
is_tverb(likes). is_tverb(teaches). is_tverb(loves).
is_intverb(cry). is_intverb(laugh).
Determiner and Adjective
• The main types of modifiers of nouns are determiners
and adjectives. Examples of determiners are
a, an, the, some, all, exist, every, each, etc.
• Syntactically, determiners correspond to
quantifiers. Normally, the quantifier ∃ corresponds to
‘a’, ‘an’, ‘exist’ and ∀ corresponds to ‘all’,
‘every’, ‘each’.
• The case of a determiner is more complex. In fact, it
determines the relationship between restrictor and
scope.
• To understand the meaning of restrictor and scope,
let us consider the following LFs:
every man is mortal: (∀X) (man(X) -> mortal(X))
every girl likes doll: (∀X) (girl(X) -> (∃Y) (doll(Y) ∧ likes(X, Y)))
• Here, (∀X) man(X) and (∀X) girl(X) are called the
restrictor of the quantifier on X, which restricts the set
of values of X that the quantifier can pick out.
• The rest of the formulae, ‘mortal(X)’ and ‘(∃Y) (doll(Y)
∧ likes(X, Y))’, are the scope of the quantifier, which
is supposed to be true for appropriate values of X.
• Hence for any quantified variable X, the quantifier is a
relation between the set of values of X that satisfy the
restrictor, and the set of values of X that satisfy both
the restrictor and the scope.
• So a determiner takes a restrictor and a scope and puts
them together with appropriate notation.
• The semantic representation of a determiner is of the
form:
(X^Restrictor) ^ (X^Scope) ^ Formula
or
X^Restrictor^Scope^Formula
• The variable X is explicit here so that the corresponding
variables in different terms will be unified.
• The entries for determiners are defined as follows:
• For determiners {every, all, each}, the lexicon
entry is:
det((X^Restrictor) ^ (X^Scope) ^ all(X, Restrictor -> Scope))
    --> [every] / [all] / [each].
or
det(X^Restrictor^Scope^all(X, Restrictor -> Scope))
    --> [every] / [all] / [each].
• For determiners {a, an, any}, the lexicon entry is:
det((X^Restrictor) ^ (X^Scope) ^ exist(X, Restrictor ∧ Scope))
    --> [a] / [an] / [any].
or
det(X^Restrictor^Scope^exist(X, Restrictor ∧ Scope))
    --> [a] / [an] / [any].
• Adjectives are words that can modify nouns.
An adjective takes arguments derived from the noun part.
• For example, in the sentence ‘the cute girl visits a
man’, ‘cute’ is an adjective associated with the noun
‘girl’. The LF of the above sentence is given as:
(∀X) [ (girl(X), cute(X)) → (∃Y) (man(Y) ∧ visits(X, Y)) ]
• Semantic representations of pronouns and proper
nouns are the tokens themselves. The lexical entries along
with the LF are given as follows:
proper_noun(D) --> [D] and pronoun(P) --> [P]
where D ∈ {john, mary, ram, rita, sita, …}
and P ∈ {she, he, her, his, their, him, ….}
DCG Grammar Generating Semantic Representation
• The following DCG grammar rules, using lambda notation, are capable of
generating the LF of sentences like ‘every girl likes a doll’,
'every man likes a woman' etc.
sentence(LF) --> np(X^F^LF), vp(X^F).
np(X^F^LF)   --> det(X^B^F^LF), noun(X^B).
vp(X^F)      --> verb(X^Y^LFV), np(Y^LFV^F).
det(X^Res^Scope^all(X, Res -> Scope))  --> [every].
det(X^Res^Scope^exist(X, Res ∧ Scope)) --> [a].
noun(X^N(X))      --> [N], {is_noun(N)}.
verb(X^Y^V(X, Y)) --> [V], {is_verb(V)}.
is_noun(girl). is_noun(doll). is_noun(man). is_noun(woman).
is_verb(likes).
• In Prolog, since the values of variables are assigned using
unification, we can easily rewrite the above grammar as
follows:
sentence(LF) --> np(X, F, LF), vp(X, F).
np(X, F, LF) --> det(X, B, F, LF), noun(X, B).
vp(X, F)     --> verb(X, Y, LFV), np(Y, LFV, F).
det(X, Res, Scope, all(X, Res -> Scope))  --> [every].
det(X, Res, Scope, exist(X, Res ∧ Scope)) --> [a].
noun(X, N(X))       --> [N], {is_noun(N)}.
verb(X, Y, V(X, Y)) --> [V], {is_verb(V)}.
is_noun(girl). is_noun(doll). is_noun(man). is_noun(woman).
is_verb(likes).
Goal: ?- sentence(LF, [every, girl, likes, a, doll], []).
?- np(X, F, LF, [every, girl, likes, a, doll], S1), vp(X, F, S1, []).
?- det(X, B, F, LF, [every, girl, likes, a, doll], S2), noun(X, B, S2, S1), ..
    LF = all(X, B -> F), S2 = [girl, likes, a, doll]
?- noun(X, B, [girl, likes, a, doll], S1), vp(X, F, S1, []).
    B = girl(X), S1 = [likes, a, doll]
?- is_noun(girl), vp(X, F, [likes, a, doll], []).
?- vp(X, F, [likes, a, doll], []).
?- verb(X, Y, LFV, [likes, a, doll], S), np(Y, LFV, F, S, []).
    LFV = likes(X, Y), S = [a, doll]
?- is_verb(likes), np(Y, likes(X, Y), F, [a, doll], []).
?- np(Y, likes(X, Y), F, [a, doll], []).
?- det(Y, B1, likes(X, Y), F, [a, doll], S3), noun(Y, B1, S3, []).
    F = exist(Y, B1 ∧ likes(X, Y)), S3 = [doll]
?- noun(Y, B1, [doll], []).
    B1 = doll(Y)
?- is_noun(doll).
succeeds
Substituting the bindings back, the result is constructed as follows:
F = exist(Y, doll(Y) ∧ likes(X, Y))
B = girl(X)
LF = all(X, B -> F)
Therefore, we get
LF = all(X, girl(X) -> exist(Y, doll(Y) ∧ likes(X, Y)))
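The derivation above can be reproduced with a runnable version of the grammar. Two adjustments are assumptions made for this sketch: the operator & is declared to stand in for ∧, and the terms written schematically as N(X) and V(X, Y) are built with =.. because a variable cannot be used as a functor in standard Prolog. The implication is wrapped in parentheses so that it is accepted as an argument by any standard reader.

```prolog
:- op(900, xfy, &).   % assumed infix operator standing in for the logical "and"

sentence(LF) --> np(X, F, LF), vp(X, F).
np(X, F, LF) --> det(X, B, F, LF), noun(X, B).
vp(X, F)     --> verb(X, Y, LFV), np(Y, LFV, F).

% A determiner combines restrictor and scope into a quantified formula.
det(X, Res, Scope, all(X, (Res -> Scope))) --> [every].
det(X, Res, Scope, exist(X, Res & Scope))  --> [a].

% Build girl(X), likes(X, Y), ... from the word found in the input.
noun(X, B)      --> [N], { is_noun(N), B =.. [N, X] }.
verb(X, Y, LFV) --> [V], { is_verb(V), LFV =.. [V, X, Y] }.

is_noun(girl).
is_noun(doll).
is_noun(man).
is_noun(woman).
is_verb(likes).
```

The goal ?- phrase(sentence(LF), [every, girl, likes, a, doll]). binds LF to a term of the form all(X, (girl(X) -> exist(Y, doll(Y) & likes(X, Y)))), with fresh variables in place of X and Y.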
Natural Language Query Interface to Prolog
Database
• The query is given in English, and the NLQ system
evaluates it and generates the answer in English.
• The concept of difference lists is used to develop this
interface.
• The NLQ system can also recognize the user’s query on the basis
of incomplete or slightly erroneous input.
• Here we are not concerned with whether the query is
syntactically correct; rather, the query should have an
unambiguous meaning.
• Meaningful words should be correctly spelled.
• For example, in the query ‘what is the age of rajan’, the
meaningful words are age and rajan. The other words are
optional and are ignored while parsing the query.
• The “nlq” system initiates interaction with the user by asking for a
query, which is given as a string of words, as follows:
nlq :- input(Query), convert_to_list(Query, Qlist),
       writeln('Answer:- '), main_module(LF, Qlist, []), nl, loop.
• “convert_to_list” takes the query and converts it into a list of
words.
• This list is passed on to the logical form generator, which converts
it into a logical form (LF) using a lexicon of words that
contains the logical forms corresponding to meaningful words.
• The logical form is represented using a unary predicate with
one argument. It preserves the meaning of the query.
For example, the logical form of the query
'find the age of rajan' is age(rajan).
• A unique LF corresponds to different types of
queries having the same intended meaning; for example, the queries
‘how old is rajan’, ‘what is the age of rajan’, ‘find out
rajan age’ etc. all have the same LF = age(rajan).
main_module(LF, Qlist, []) :- query(LF, Qlist, []), !, eval(LF, I),
       display(LF, I), full_stop.
main_module(LF, Qlist, []) :- write('Not able to understand'), full_stop.
• The logical form generator uses various patterns of a meaningful
word W and an employee name N to parse an input query.
• Using W and N, the logical form W(N) is generated.
• A meaningful word, for example 'salary’, is stored in
the lexicon file as:
word(X, salary(X), [salary | T], T).
where salary(X) is the LF corresponding to name X.
• Further, the LF is given to the evaluator module “eval”, which
evaluates it and finds the answer word using the database files.
• We consider a database of employee information.
• Finally, the display module gets the answer word from the evaluator
and generates a suitable answer in natural language using the
words involved in the LF.
• If the interpreter is not able to recognize the meaningful
words, it displays an appropriate message.
• The database file contains the following facts.
• employee(Name, Identity_number)
• personal_record(Name, Age, Qualification,
Address, M_status)
• official_record(Identity_number, Designation,
Salary, Department,
Experience, Official_address)
nlq :- input(Query), convert_to_list(Query, Qlist),
       writeln('Answer:- '), main_module(LF, Qlist, []), nl, loop.
main_module(LF, Qlist, []) :- query(LF, Qlist, []), !,
       eval(LF, I), display(LF, I), full_stop.
main_module(LF, Qlist, []) :- write('Not able to understand'),
       full_stop.
input(Query) :- nl, writeln('Input your query:- '), nl,
       read(Query).
convert_to_list(Query, Qlist) :- change(Query, X1),
       gen(X1, X2), convert(X2, Qlist), nl.
loop :- writeln('Do you want to quit?(y/n) '), read(X), nl,
       X = 'y', !.
loop :- nlq.
• Query module
/* Various patterns of meaningful word W and employee name N.
Using W and N, logical form W(N) is generated. */
query(LF, S0, S1) :- gp1(LF, S0, S1), !.     % W N
query(LF, S0, S1) :- gp2(LF, S0, S1), !.     % N W
query(LF, S0, S1) :- gp11(LF, S0, S1), !.    % W - N
query(LF, S0, S1) :- gp3(LF, S0, S1), !.     % - N W
query(LF, S0, S1) :- gp4(LF, S0, S1), !.     % - W - N
query(LF, S0, S1) :- gp5(LF, S0, S1), !.     % N - - W
query(LF, S0, S1) :- gp6(LF, S0, S1), !.     % - W N -
query(LF, S0, S1) :- gp8(LF, S0, S1), !.     % - N - W
/*********** Grouping different patterns ****************/
gp1(LF, S0, S2) :- word(W, LF, S0, S1), ename(W, S1, S2).    % W N
gp2(LF, S0, S2) :- ename(W, S0, S1), word(W, LF, S1, S2).    % N W
gp3(LF, S0, S2) :- op_wd(S0, S1), gp2(LF, S1, S2).           % - N W
gp4(LF, S0, S4) :- op_wd(S0, S1), word(W, LF, S1, S2),
                   op_wd(S2, S3), ename(W, S3, S4).          % - W - N
gp5(LF, S0, S4) :- ename(W, S0, S1), op_wd(S1, S2),
                   op_wd(S2, S3), word(W, LF, S3, S4).       % N - - W
gp6(LF, S0, S3) :- op_wd(S0, S1), gp1(LF, S1, S2),
                   op_wd(S2, S3).                            % - W N -
gp7(LF, S0, S2) :- op_wd(S0, S1), gp3(LF, S1, S2).           % - - N W
ename(X, [X | T], T) :- emp(X, _).
/************* Evaluation module ******************/
eval(salary(X), I) :- emp(X, N), off_rec(N, _, I, _, _, _).
eval(qual(X), Q) :- per_rec(X, _, Q, _, _).
eval(age(X), A) :- per_rec(X, A, _, _, _).
eval(status(X), S) :- per_rec(X, _, _, _, S).
eval(desig(X), D) :- emp(X, N), off_rec(N, D, _, _, _, _).
eval(experience(X), I) :- emp(X, N), off_rec(N, _, _, _, I, _).
eval(address(X), S) :- per_rec(X, _, _, S, _).
eval(oadd(X), D) :- emp(X, N), off_rec(N, _, _, _, _, D).
eval(depart(X), D) :- emp(X, N), off_rec(N, _, _, D, _, _).
/*************** Display module *******************/
display(salary(X), I) :- write(X), write(' earns '), write('Rs. '),
writeln(I).
display(qual(X), I) :- write('Qualification of '), write(X),
write(' is '), writeln(I).
display(age(X), I) :- write(X), write(' is '), write(I),
writeln(' years old').
display(status(X), I) :- write(X), write(' is '), writeln(I).
display(desig(X), I) :- write(X), write(' is working as '),
writeln(I).
display(experience(X), I) :- write(X),
write(' has been working for '), write(I), writeln(' years').
/************** Lexicon file (lexicon) ******************/
% Meaningful Lexicons
word(X, salary(X), [salary | S], S).
word(X, salary(X), [income | S], S).
word(X, salary(X), [earn | S], S).
word(X, salary(X), [earns| S], S).
word(X, salary(X), [earnings | S], S).
word(X, salary(X), [emoluments | S], S).
word(X, desig(X), [designation | S], S).
word(X, desig(X), [post | S], S).
word(X, desig(X), [title | S], S).
word(X, desig(X), [position | S], S).
word(X, address(X), [address | S], S).
word(X, address(X), [residence | S], S).
word(X, address(X), [reside | S], S).
word(X, address(X), [resides | S], S).
word(X, address(X), [live | S], S).
word(X, address(X), [lives | S], S).
word(X, address(X), [place, of, stay | S], S).
word(X, address(X), [stay | S], S).
word(X, address(X), [stays | S], S).
word(X, status(X), [status | S], S).
word(X, status(X), [marital, status | S], S).
word(X, oadd(X), [work, place | S], S).
word(X, oadd(X), [place, of, work | S], S).
word(X, oadd(X), [official, address | S], S).
/*************** Database file (dbase) ******************/
% emp(Name, Id_number )
emp(rajan, 100). emp(raju, 110). emp(sita, 120).
emp(gita, 130). emp(mark, 140). emp(mary, 150).
emp(mike, 160). emp(jimmy, 170). emp(saumya, 180).
% per_rec(Name, Age, Qualification, Address, M_status)
per_rec(rajan, 44, 'Ph.D', a1, married).
per_rec(raju, 56, 'M.Tech', a2, unmarried).
per_rec(sita, 25, 'B.A', a3, unmarried).
per_rec(gita, 45, 'M.Sc', a4, married).
per_rec(mark, 40, 'M.Sc', a5, divorced).
per_rec(mary, 35, 'MBBS', a6, unmarried).
per_rec(mike, 55, 'M.Sc', a7, married).
/* off_rec(Id_number, Designation, Salary, Department,
Experience, Official_address)
*/
off_rec(100, manager, 20000, 'NIIT', 12, 'New Delhi').
off_rec(110, scientist, 35000, 'DRDO', 10, 'New Delhi').
off_rec(120, clerk, 4000, 'IITD', 2, 'New Delhi').
off_rec(130, assistant, 5500, 'DRDO', 8, 'New Delhi').
off_rec(140, manager, 25000, 'State Bank', 12, 'Mumbai').
off_rec(150, doctor, 21000, 'AIIMS', 10, 'New Delhi').
off_rec(160, scientist, 20000, 'IITM', 22, 'Chennai').
off_rec(170, lecturer, 10000, 'Delhi university', 2, 'Delhi').
off_rec(180, counciler, 2500, 'Boston Ltd', 1, 'Mumbai').
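The path from a word list to an answer can be tried on a cut-down version of the system. The sketch below is not the full program: it keeps one employee, three query patterns, and two eval clauses. Several parts are assumptions made here because the original listing does not show them: the definition of op_wd (taken to skip one or more optional words), the lexicon entries for age and old, and the helper answer/3.

```prolog
% Small extract of the database.
emp(rajan, 100).
per_rec(rajan, 44, 'Ph.D', a1, married).
off_rec(100, manager, 20000, 'NIIT', 12, 'New Delhi').

% Lexicon of meaningful words with their logical forms.
word(X, salary(X), [salary | S], S).
word(X, salary(X), [earns | S], S).
word(X, age(X), [age | S], S).     % assumed entry
word(X, age(X), [old | S], S).     % assumed entry

ename(X, [X | T], T) :- emp(X, _).

% Assumption: op_wd skips one or more words that carry no meaning.
op_wd([_ | T], T).
op_wd([_ | T], S) :- op_wd(T, S).

% Three of the patterns, written out directly.
query(LF, S0, S) :- word(N, LF, S0, S1), ename(N, S1, S), !.     % W N
query(LF, S0, S) :- ename(N, S0, S1), word(N, LF, S1, S), !.     % N W
query(LF, S0, S) :- op_wd(S0, S1), word(N, LF, S1, S2),
                    op_wd(S2, S3), ename(N, S3, S), !.           % - W - N

% Evaluation of a logical form against the database.
eval(salary(X), I) :- emp(X, N), off_rec(N, _, I, _, _, _).
eval(age(X), A)    :- per_rec(X, A, _, _, _).

% answer(+Words, -LF, -Value): hypothetical helper that parses the
% whole word list and evaluates the resulting logical form.
answer(Words, LF, Value) :-
    query(LF, Words, []),
    eval(LF, Value).
```

For example, ?- answer([what, is, the, age, of, rajan], LF, V). and ?- answer([how, old, is, rajan], LF, V). both give LF = age(rajan) and V = 44, which illustrates that differently worded queries map to the same logical form.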