1.
In this talk we are going to present a totally lexicalist morphology (TLM), the
developed form of a totally lexicalist grammar, GASG.
2.
In a series of articles we argued for the principle of total lexicalism within the generative
paradigm and the necessity of the elaboration of a grammar, a “Generalized Argument Structure
Grammar,” (and its computational implementation) serving as the model of this metatheoretical
principle.
3.
GASG is based on lexical items ('li's), which are signs whose inner structure is rich enough to
capture all features of the relevant "environmental" words. It is easy to accept that case
marking and agreement have a straightforward "totally lexicalist" treatment; word order, however,
and especially adjacency relations between realizations of li's, require an optimality-like
device: "environmental requirements" in li's are assigned ranks (1, the highest, then 2,
3, …), which permit the indirect satisfaction of requirements by satisfying other requirements of
higher ranks.
Let's see an example: the lexical item 'talált' ('found'). In GASG lexical items consist of four
components. The first one shows the own word: the fully inflected word 'talált'. The second
component provides the morphosyntactic characterization of the own word (v5, a constant) and
of the environmental words, which are variables to be unified with the own words of other lexical
items. In the example the own word v5 'talált' is characterized as a transitive verb in the past
tense, and there are four environmental words because of the two arguments of 'find.' Why are
there four environmental words? Because a nominal argument has two pillars: the nominal
pillar and the "referential" pillar. If, e.g., the subject of the sentence is 'Péter', the two pillars
(V5.11 and V5.12) coincide: both are to be unified with the same own word ('Péter'). As for the
object, the requirements are that the nominal pillar (V5.21) be in the accusative case, and that the
referential pillar (V5.22) be characterized as an indefinite element (with respect to the definite
conjugation of the finite verb).
The immprec ('immediate precedence') requirements account for word order; you can see the
already mentioned rank parameters here, but I will return to them later.
The third and fourth components of the lexical items account for the semantics; the point now,
however, is how the syntactic relations change in our new approach.
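The four-component structure just described can be sketched as a simple data structure. This is only an illustrative rendering, not the authors' actual formalism: the class and field names (`LexicalItem`, `Requirement`) and the concrete feature values are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Requirement:
    target: str     # variable naming an environmental word, e.g. "V5.21"
    features: dict  # features the unified own word must satisfy
    rank: int       # 1 is the strongest rank

@dataclass
class LexicalItem:
    own_word: str           # component 1: the fully inflected own word
    own_features: dict      # component 2a: morphosyntactic description (a constant)
    env_requirements: list  # component 2b: environmental words (variables)
    semantics: str          # components 3-4: partial semantic representation

# 'talált' with its four environmental words: two pillars per nominal argument.
talalt = LexicalItem(
    own_word="talált",
    own_features={"cat": "verb", "transitive": True, "tense": "past"},
    env_requirements=[
        Requirement("V5.11", {"case": "nom"}, rank=3),      # subject, nominal pillar
        Requirement("V5.12", {}, rank=3),                   # subject, referential pillar
        Requirement("V5.21", {"case": "acc"}, rank=3),      # object, nominal pillar
        Requirement("V5.22", {"definite": False}, rank=3),  # object, referential pillar
    ],
    semantics="find(e, x, y)",
)
```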
4.
Let's take a potential sentence: a series of words w1, w2, …, wn. In phrase-structure grammars
the proof of grammaticality is, in practice, that a tree can be built upon the series of words. In
GASG, by contrast, the point is the relations between the given words; technically, variables in
lexical items should be identified with constants in the descriptions of other li's (and vice versa).
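This variable-constant identification can be mimicked with a toy checker: a sentence counts as grammatical when every environmental requirement of every li is matched by the own description of some other li. The data shapes (`own` / `env` dictionaries) are invented for illustration; the real GASG descriptions are far richer.

```python
def grammatical(lexical_items):
    """Toy GASG-style check: every environmental requirement (a dict of
    feature variables) must unify with the own features of another li."""
    for i, li in enumerate(lexical_items):
        for req in li["env"]:
            matched = any(
                j != i and all(other["own"].get(k) == v for k, v in req.items())
                for j, other in enumerate(lexical_items)
            )
            if not matched:
                return False
    return True

items = [
    {"own": {"cat": "verb", "word": "talált"}, "env": [{"case": "nom"}, {"case": "acc"}]},
    {"own": {"case": "nom", "word": "Péter"}, "env": []},
    {"own": {"case": "acc", "word": "lányt"}, "env": []},
]
```

With all three items present, every requirement unifies and the check succeeds; dropping the accusative noun leaves the verb's second requirement unsatisfied.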
5.
Let's take a concrete example: […] As you see, a tree can be built upon this sentence, so it is
grammatical in phrase-structure grammars; and now you can see the relations between the words,
with ranks. For example, the noun 'székbe' ('into the chair') and its article are in mutual
search on rank five. Or the verb 'beültetheti' is searching for a nominative and an accusative
noun, and for their determiner pillars. Most of the relations are mutual, but there are exceptions: the
adjective 'büszke', for instance, needs a noun ('lányt'), but this search is of course not mutual.
As I mentioned, an implementation was developed for this theory, which decides whether a series
of words is grammatical or not, and if it is, it also generates the syntactic and semantic
representations of the sentence (for non-compound neutral Hungarian sentences).
6.
So in the earlier model the li's were assigned to words, and the lexical description of
morphologically complex words, which are very frequent in languages like Hungarian, was claimed
to be calculable in a multiple lexical inheritance network (in a way that, to tell the truth, was
never specified).
A better method (TLM) is proposed here, which suits the principle of total lexicalism directly:
each single morpheme within a word is assigned a lexical item of its own.
7.
Let's see again a series of words, but the point is now the series of morphemes within the words.
There are certain grammars in which trees can be built upon the morphemes as well, provided
they are well-formed sequences. In GASG, however, the proof of well-formedness is again that the
variables in lexical items, which are now morphemes, can be identified with constants; so
relations are declared between the morphemes.
And of course the tree upon the words, and the relations between the words, are presented as well.
8.
Let's see the previous example again: […], and let's see the relations between the morphemes
(li's). The simplest case, of course, is when the word contains only one morpheme (e.g.
'Péter'): there is no seeking here. In the case of 'Marira', 'lányt' and 'székbe' //point at
them, so you need not say them in English// we find a stem and a case-marking
morpheme, where the suffix is searching for the stem. In this sentence the verb 'beültetheti' is
morphologically the most complex. Below the word you can see the phonologically and
morphologically motivated arrows: every suffix is searching for the stem; they want to be
adjacent to it. The morpheme order depends on the ranks, that is, on how strong these
requirements are, but I will come back to this a bit later. It can also be the case that a suffix is
searching for the (immediately) preceding suffix because of its phonological requirements. The
arrows above the word account for the semantic representation: they show the scope of a morpheme,
which is in general the last morpheme before the given li, but there are exceptions; e.g. the
semantic scope of the derivative element 'tet' is not the stem 'ül' but the more complex 'beül'
(the stem and the particle together, because the scopal relations are of course inherited). And you
can see the relations between the words again.
9.
So what are the arguments for TLM? I am not going to talk about the inner argument, the
legitimacy of GASG and the principle of total lexicalism; for that, see the references at our
homepage (first slide).
But there are further arguments, such as our treatment of the word-level problem across
languages, or of word-internal scopal ambiguities. Further advantages of the model are its easy
feature checking and the indirect satisfaction concerning morpheme order, word order and argument
structures. In the rest of my talk I am going to discuss these arguments in detail.
10.
First, the problem of word level. There is a radical superficial difference between languages with
respect to word level, but the meaning is of course the same in languages of opposite
morphological types. This is not a problem in our system, because there is almost no difference
with respect to the background li's. Let's see an example: […] Here you can see the (phonologically
underspecified) own words of the five relevant li's, whose appropriate phonological realizations
are to be filtered out on the basis of (morpho-)phonological (word-internal) environmental
requirements. And finally you can see the own predicates: partial / underspecified semantic
representations in van Eijck and Kamp's style. The equations among (proto-)referents are also
collected.
11.
The next argument is the easy feature checking of our system. We can
account for such phonological phenomena as vowel harmony, lowering, V ~ ∅ alternation,
linking vowels, lengthening, shortening, etc. The theory and the implementation are worked out
mainly for nouns at the moment; the verb paradigm needs further research. We have two
morpheme types, i.e. two kinds of lexical items: stems and suffixes. But they are very similar:
the descriptions of the li's encode all kinds of own features and environmental requirements:
phonological, morphological and syntactic ones alike. Phonologically, two kinds of
requirements are needed. The first one accounts for the choice among the possible realizations of the
given morpheme (li) (I mean lengthening stems, for example, or the linking vowel in a suffix);
technically these are variables in the own words. For example, in the case of 'bokor' ('bush'),
which contains a V ~ ∅ alternation, the own word is 'bokOr', and the phonological realization
depends on the following suffix. Or in the case of the suffix 'ban/ben' ('in') the own word is 'bAn',
and the frontness of the vowel depends on the frontness of the stem (vowel harmony).
The other kind of requirement concerns how lexical items affect the phonological
realizations of other lexical items in the same word (e.g. lowering stems or suffixes, or again
vowel harmony). Technically we store the relevant features in an array, which is not the same in
the two cases: for stems the relevant features are the quality of the vowels and whether the stem
is lowering or not; for suffixes the point is whether the li causes lengthening,
shortening, epenthesis and lowering or not. E.g. the plural suffix ('Vk') triggers all four
phenomena [kutyák, tüzek, bokrok, asztalokat], but the already mentioned 'ban/ben' only the first
one (lengthening).
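As a minimal sketch of the first kind of requirement (choosing a realization of an underspecified own word), one can realize the variable A of 'bAn' from the stem's vowels. The frontness test is deliberately simplified (the last harmonic vowel decides, and neutral-vowel subtleties are ignored); the function names are invented, not the implementation's.

```python
FRONT_VOWELS = set("eéiíöőüű")
BACK_VOWELS = set("aáoóuú")

def is_front(stem: str) -> bool:
    # Simplified vowel harmony: the last harmonic vowel of the stem decides.
    for ch in reversed(stem):
        if ch in FRONT_VOWELS:
            return True
        if ch in BACK_VOWELS:
            return False
    return False

def realize_bAn(stem: str) -> str:
    # The underspecified own word 'bAn' surfaces as -ban or -ben,
    # copying the frontness of the stem.
    return stem + ("ben" if is_front(stem) else "ban")
```

So 'bokor' yields 'bokorban' and 'tűz' yields 'tűzben'.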
12.
So the program creates the proper form of a word by identifying the variables with constants, i.e.
by finding the relations between the li's. It "sees" all the possible li's of the given word at once,
because we work with a total proposal, and it can take into consideration all kinds of information
at once (the relevant phonological information, or other kinds of information, morphological,
syntactic or lexical, if needed). The direction of the search depends on the variables within the
li's. It can be forward (to the next suffix, in the case of lengthening), backward (to the stem, in
the case of vowel harmony, or to the preceding suffix, in some forms of V ~ ∅ alternation), or to a
suffix somewhere between the stem and the given li (roundness harmony).
Impl.
And now I show some examples to demonstrate that this mechanism really works. For example,
if the well-formedness of the word 'bokorban' is in question, you can see that it is well-formed,
because the lexical items are printed out. Here you can see the own words of these li's (with
variables, of course), the English translations, and their features (phonological, morphological and
syntactic features). (We have a much simplified syntax and no semantics in this version at the
moment; but see the Georgian one.) Of course 'bokorben' or 'bokrban' are out. Another
example: 'tüzeken'. There is shortening in the stem (because the next morpheme triggers it),
lowering in the plural suffix (because 'tűz' is a lowering stem), and frontness harmony in the
case-marking suffix 'on/en/ön'. The roundness harmony ('tűzön') is blocked here by a previous
(non-round) morpheme (which happens to be the preceding morpheme). And finally a simpler
example: 'kalapokat', where the first linking vowel is 'o', because 'kalap' is not a
lowering stem, but the second linking vowel (in the accusative case-marking morpheme) is 'a',
because the plural suffix triggers lowering.
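The linking-vowel logic of 'kalapokat' and 'tüzeken' can be caricatured in a few lines. The feature dictionary (front, round, lowering) stands in for the feature array mentioned above; the function and its keys are invented for illustration, not taken from the implementation.

```python
def plural(stem: str, features: dict) -> str:
    """Attach the plural 'Vk', choosing the linking vowel from the
    stem's stored features (a toy version of the feature array)."""
    if features["lowering"]:
        vowel = "e" if features["front"] else "a"   # lowering stems: a/e
    elif features["front"]:
        vowel = "ö" if features["round"] else "e"   # front: e/ö
    else:
        vowel = "o"                                 # back, non-lowering: o
    return stem + vowel + "k"
```

So the non-lowering back stem 'kalap' gives 'kalapok', while the (already shortened) lowering stem 'tüz' gives 'tüzek'.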
13.
The simplest example of indirect satisfaction is the calculation of the order of morphemes
within words. Every suffix would like to be adjacent to the stem, but these requirements are not
equally strong. For example, in the case of verbs, the strongest one (with rank 1) is the
requirement of the suffix 'hat', which accounts for modality; on rank 2 there are two kinds of
suffixes, the tense- and mood-marking morphemes; and the agreement suffixes are on rank 3.
According to the definition, if a requirement cannot be satisfied directly (there is more than one
suffix in the word), it can be satisfied indirectly. If a suffix A wants to be adjacent to the stem
on rank rA, and a suffix B wants to be adjacent to the stem on rank rB, and rA < rB (rA strictly
smaller than rB), then the morpheme order is: stem, A, B. Equality of ranks is not allowed; that is
why 'stem tense mood agr' is not a proper order. In this case the mood-marking suffix will be the
morpheme 'volna', which is on rank 4, so it will appear at the end of the word (actually, 'volna' is
an independent word, but it is part of the word in our approach).
In the case of nouns the problem and the solution are quite similar, but there are differences. The
main difference is rank 3.5: in the case of fractional ranks, equality is allowed. //some example
is needed here, so that they see it: kalapéiéi// And our treatment of
postpositions is the same as that of the suffix 'volna': though they are independent words in
Hungarian, in our system they are regarded as case-marking suffixes because of their behaviour.
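Since every suffix wants adjacency to the stem and smaller ranks are stronger, the surface morpheme order falls out of a simple sort. A sketch under illustrative ranks (the stable sort stands in for the fractional-rank case, where equal ranks keep their given order):

```python
def morpheme_order(stem, suffixes):
    # suffixes: list of (morpheme, rank) pairs; the stronger (smaller)
    # rank wins the position adjacent to the stem. Python's sort is
    # stable, so equal (fractional) ranks preserve their input order.
    ordered = sorted(suffixes, key=lambda pair: pair[1])
    return [stem] + [morpheme for morpheme, _ in ordered]

# Verbal template: modality 'hat' (rank 1) beats agreement (rank 3),
# so 'ül' plus agreement 'i' plus 'het' comes out as ül-het-i.
assert morpheme_order("ül", [("i", 3), ("het", 1)]) == ["ül", "het", "i"]
```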
14.
The definition which accounts for word order in a sentence (sentence-level immprec) is much
more complicated than the immprec relations within the word. The formal definition can be found
in the references. Informally: a requirement of rank n concerning the
immediate precedence of word w1 relative to w2 can be satisfied either directly, by the fact
that w1 does immediately precede w2, or indirectly, by permitting certain words to be
inserted between w1 and w2: those, and only those, whose immprec requirement to w1 or w2 is
stronger, or which are dependents of such words, or dependents of dependents, etc.
You can see an example here […]. The adjective–noun adjacency is very important in
Hungarian (because there is no agreement between them); that is why its immprec ranks are 1 or
2 (very strong). The article–noun relation is on rank 5, so the possessive noun 'lány', which wants
to precede the word 'nővérét' on rank 4, can be inserted (because 4 < 5). But this word brings
along a further dependent, the adjective 'vigyázó', which also brings a dependent, etc.
So the main difference between word-level immprec and sentence-level immprec is that on word
level the inserted li's (morphemes) never bring further dependents.
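The insertion condition can be approximated in code: only words with a stronger immprec requirement toward w1 or w2, or their dependents (transitively), may stand between w1 and w2. Everything here (the precomputed `stronger` set, the `dependents` map) is an illustrative simplification of the formal definition in the references.

```python
def immprec_satisfied(sentence, w1, w2, stronger, dependents):
    """sentence: list of words; stronger: words whose own immprec
    requirement to w1 or w2 outranks the checked one; dependents:
    word -> iterable of its dependents."""
    i, j = sentence.index(w1), sentence.index(w2)
    if i > j:
        return False
    # Close the set of licensed interveners under dependency.
    allowed, frontier = set(stronger), list(stronger)
    while frontier:
        word = frontier.pop()
        for dep in dependents.get(word, ()):
            if dep not in allowed:
                allowed.add(dep)
                frontier.append(dep)
    return all(word in allowed for word in sentence[i + 1 : j])
```

In the article–noun example, 'lány' (rank 4 < 5) may intervene, and it licenses its dependent 'vigyázó' as well.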
15.
GASG (+TLM), due to the already mentioned separation of "template-forming" rank
parameters and semantically motivated ranks, sheds new light on
problems of word-internal scopal ambiguity, discussed in Bartos (1999), which seem to violate
such basic principles of modern (Chomskyan) generative theory as Cinque's (1999) Hypothesis
on the Universal Hierarchy of Functional Projections, Baker's (1985) Mirror Principle and the
universal that operators c-command their scopes.
For example, the tense–mood ambiguity: […] Semantically there are two different meanings
because of the two different scope relations. […] e10, which is a Davidsonian argument referring
to the event […]
So it is not excluded in GASG that the eventuality "in the past," denoted by e31 in the li
belonging to -ett, is permitted to be identified both with the permission (e20) and with the eating
(e10), which accounts for the ambiguity. One additional factor is to be assumed: the
"template-forming" ranks are stronger than rank 3, which is responsible for the adjacency between
li3 and the li whose eventuality is claimed to be "in the past." This is our explanation for the
universal tendency expressed by the hypotheses above: languages tend to avoid conflicts between
"template-forming" and semantically motivated morphological adjacency ranks.
The problem and the solution are the same in the case of the tense–modality ambiguity 'evett
volna', where the two meanings are 'he intended to eat (but he does not intend to any more)'
and 'it would be the case that he ate'.
And finally there are similar phenomena in syntax as well, e.g. in the case of the Hungarian focus
the semantic requirements are stronger than the ones for the argument structure; that is why in
Hungarian the focused element precedes its scope: it stands before the verb (Péter MARIT szereti).
In English, however, the requirements for argument structure are stronger, so the focus stays in its
original position (Peter loves MARY). But in the case of question words the semantic requirements
are stronger in both languages, so the verb is preceded in both cases: Péter KIT szeret; WHOM does
Peter love. The stricter the word order in a language, the stronger the ranks for the position
of the arguments, while li's like operators have weaker requirements. That is why in Hungarian
(which has a less strict word order) the operators, in general, precede their scopes.
16.
And the last argument discussed here is the indirect satisfaction of environmental requirements,
now for the argument structure. This means that in TLM not only the calculation of the order of
words or morphemes can be treated this way (the basic tool of GASG: ranked immprec
requirements), but also the modification of argument structures, cases and agreement features.
According to the definition you can see here, the two central arguments of a verb have types: the
argument Y has arg-type (Y1,Y2), the argument Z has arg-type (Z1,Z2), and if
arg-type(Y1,Y2) is smaller than arg-type(Z1,Z2) (which means that Y1<Z1, or Y1=Z1 and Y2<Z2),
then Y is nominative and Z is accusative. If arg-type(Y1,Y2) = arg-type(Z1,Z2) (Y=Z), then Y is
nominative. For example, the verb 'ás' ('dig') has the central arg-pair Y, Z, where Y has
arg-type (0,-1) (where 0 means //what was it, again?// and minus means that the argument is
agentive), and Z has arg-type (0,+1) (where plus means that the argument is patient-like). In
the case of 'ül' ('sit'), for instance, which is not a transitive verb, both arg-types are (0,-1), which
means that the only central argument of this verb is an agent in the nominative case.
This system can account for the Hungarian -(t)At causativization easily: if we have the input
central-arg-pair(Y, arg-type(Y1,Y2), Z, arg-type(Z1,Z2)), and a condition which says that
arg-type(Y1,Y2) < 0 (so the argument is mildly agentive; arg-type(Y1,Y2) < 0 means that either
Y1<0, or Y1=0 and Y2<0), then the output is central-arg-pair(X, arg-type(Y1,Y2)-1, Z,
arg-type(Z1,Z2)), where arg-type(Y1,Y2)-1 means arg-type(Y1-1,*). Let's see some examples: if we
take the input 'ül', the central arg-pair of 'ültet' is (X, arg-type(-1,*), Y, arg-type(0,-1)), so the
new argument will be nominative (with the smaller arg-type) and the original subject will now be
the object. In the other example, the transitive 'ás', the result of the causativization is
central-arg-pair(X, arg-type(-1,*), Z, arg-type(0,+1)), so the new argument will again be nominative,
the previous object stays accusative, and the previous subject cannot occupy a central argument
position anymore. For further details, see the references.
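The arg-type calculus sketched above translates almost directly into code: arg-types are pairs compared lexicographically, and -(t)At causativization prepends a new, more agentive argument. The '*' of arg-type(Y1-1,*) is rendered as None; the function names are invented for this sketch.

```python
def case_assignment(arg_y, arg_z):
    # Lexicographic comparison of arg-type pairs: the smaller arg-type
    # is nominative; on equality (Y = Z) the verb has one nominative arg.
    if arg_y == arg_z:
        return ("Y: nom",)
    return ("Y: nom", "Z: acc") if arg_y < arg_z else ("Z: nom", "Y: acc")

def causativize(arg_y, arg_z):
    # -(t)At requires a mildly agentive Y: arg-type(Y) < (0, 0).
    assert arg_y < (0, 0), "input subject must be mildly agentive"
    new_causer = (arg_y[0] - 1, None)  # arg-type(Y1 - 1, *)
    return new_causer, arg_z

# 'ül' (sit): sole argument (0,-1); 'ültet' adds the causer (-1, *),
# and the original subject's arg-type becomes the object's.
assert causativize((0, -1), (0, -1)) == ((-1, None), (0, -1))
```

For transitive 'ás' the same rule yields ((-1, None), (0, 1)): the causer is the new nominative, the object stays accusative, and the old subject is ousted from the central arg-pair.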
So what does indirect satisfaction mean in TLM? Let's see the example 'beültethetlek' again. The
stem 'ül' ('sit') requires its word to have an argument structure with the arg-pair you can see here,
which consists of a 3sg argument in the nominative. It is assumed, however, that the li 'be' and the
li 'tet' "force" a bigger argument structure, due to stronger ranks, with a locative argument and a
second central argument (which is going to be the subject, while the original subject is going to be
the object). And the version of -(A)lAk alone decides the person and number of the subject and the
person of the object, overriding the "earlier" predictions. So finally we get a transitive verb with a
locative argument, where the subject is not 3sg anymore but 1sg, and we have information about
the object as well.
And finally I show some examples from the implementation.
Kérek egy könyvet az okos kutyákról.
Péter beültetheti a Marira büszke lányt a székbe.
Reviews:
Reviewer 1: Unclear how the approach relates to current empirical or theoretical work on
morphology. The idea that "each single morpheme within words is to be assigned a lexical item"
is not new; it is, e.g., elaborated in Wunderlich's Minimalist Morphology. On the other hand,
there are also influential non-lexicalist ('a-morphous', or realisational) theories. Empirical
comparison is only drawn with Cinque and Baker, rather than with those genuine morphological
theories. So what are the advantages of TLM? Is there evidence from e.g. psycholinguistics or
connectionist simulations? The abstract looks very complex and ambitious for a 30 min talk. I
wasn't able to comprehend all the details in (1)–(4).
Needs further research...
Reviewer 2: How are the ranks motivated? Why are the lexical entries and
the rule types so complicated as they are? Seems to be cooked together
very quickly.
In GASG the rewriting and transformational rule system is replaced by a system furnished with
ranks. It is generative: well-formed, and only well-formed, word-order sequences can be calculated.
Particular languages should be scrutinized, and we strive to find universals.
Reviewer 3: The formalism is very articulate and precise, but I'm not
sure what the issue is, what problem they're trying to solve, and how the work relates to or
improves on previous work.
The issue is that a syntactician always needs a lexicon, but the lexicon does not need syntax, so
the grammar can be homogeneous.
For the implementation: implementation is a necessary requirement for alternative frameworks,
because it verifies their formalizability.