1.
In this talk we are going to present a totally lexicalist morphology (TLM), the
developed form of a totally lexicalist grammar, GASG.
2.
In a series of articles we argued for the principle of total lexicalism within the generative
paradigm and the necessity of the elaboration of a grammar, a “Generalized Argument Structure
Grammar,” (and its computational implementation) serving as the model of this metatheoretical
principle.
3.
GASG is based on lexical items ('li's), which are signs whose inner structure is rich enough to
capture all features of the relevant "environmental" words. It is easy to accept that case
marking and agreement have a straightforward "totally lexicalist" treatment; word order, however,
and especially adjacency relations between realizations of li's, require an optimality-like
device: "environmental requirements" in li's are assigned ranks (1, the highest, then 2,
3, …), which permit the indirect satisfaction of requirements by satisfying other requirements of
higher ranks.
Let's see an example: the lexical item 'talált' ('found'). In GASG lexical items consist of four
components. The first one shows the own word: the fully inflected word 'talált'. The second
component provides the morphosyntactic characterization of the own word (v5, a constant) and
of the environmental words, which are variables to be unified with the own words of other lexical
items. In the example the own word v5 'talált' is characterized as a transitive verb in the past
tense, and there are four environmental words because of the two arguments of 'find.' Why are
there four environmental words? Because a nominal argument has two pillars: the nominal
pillar and the "referential" pillar. If, e.g., the subject of the sentence is 'Péter', the two pillars
(V5.11 and V5.12) coincide: both are to be unified with the same own word ('Péter'). As for the
object, the requirements are that the nominal pillar (V5.21) be in the accusative case, and that the
referential pillar (V5.22) be characterized as an indefinite element (with respect to the definite
conjugation of the finite verb).
The immprec ('immediate precedence') requirements account for word order; you can see the
already mentioned rank parameters here, but I will return to them later.
The third and fourth components of the lexical items account for the semantics; the point now,
however, is how the syntactic relations change in our new approach.
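The four-component structure just described can be sketched as a simple data structure. This is only an illustrative rendering, not the authors' actual formalism: the class and field names (`LexicalItem`, `Requirement`) and the concrete feature values are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Requirement:
    target: str     # variable naming an environmental word, e.g. "V5.21"
    features: dict  # features the unified own word must satisfy
    rank: int       # 1 is the strongest rank

@dataclass
class LexicalItem:
    own_word: str           # component 1: the fully inflected own word
    own_features: dict      # component 2a: morphosyntactic description (a constant)
    env_requirements: list  # component 2b: environmental words (variables)
    semantics: str          # components 3-4: partial semantic representation

# 'talált' with its four environmental words: two pillars per nominal argument.
talalt = LexicalItem(
    own_word="talált",
    own_features={"cat": "verb", "transitive": True, "tense": "past"},
    env_requirements=[
        Requirement("V5.11", {"case": "nom"}, rank=3),      # subject, nominal pillar
        Requirement("V5.12", {}, rank=3),                   # subject, referential pillar
        Requirement("V5.21", {"case": "acc"}, rank=3),      # object, nominal pillar
        Requirement("V5.22", {"definite": False}, rank=3),  # object, referential pillar
    ],
    semantics="find(e, x, y)",
)
```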
4.
Let's take a potential sentence: a series of words w1, w2, …, wn. In phrase-structure grammars
the proof of grammaticality is, in practice, that a tree can be built upon the series of words. In
GASG, by contrast, the point is the relations between the given words; technically, variables in
lexical items should be identified with constants in the descriptions of other li's (and vice versa).
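This variable-constant identification can be mimicked with a toy checker: a sentence counts as grammatical when every environmental requirement of every li is matched by the own description of some other li. The data shapes (`own` / `env` dictionaries) are invented for illustration; the real GASG descriptions are far richer.

```python
def grammatical(lexical_items):
    """Toy GASG-style check: every environmental requirement (a dict of
    feature variables) must unify with the own features of another li."""
    for i, li in enumerate(lexical_items):
        for req in li["env"]:
            matched = any(
                j != i and all(other["own"].get(k) == v for k, v in req.items())
                for j, other in enumerate(lexical_items)
            )
            if not matched:
                return False
    return True

items = [
    {"own": {"cat": "verb", "word": "talált"}, "env": [{"case": "nom"}, {"case": "acc"}]},
    {"own": {"case": "nom", "word": "Péter"}, "env": []},
    {"own": {"case": "acc", "word": "lányt"}, "env": []},
]
```

With all three items present, every requirement unifies and the check succeeds; dropping the accusative noun leaves the verb's second requirement unsatisfied.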
5.
Let's take a concrete example: […] As you see, a tree can be built upon this sentence, so it is
grammatical in phrase-structure grammars; and now you can see the relations between the words,
with ranks. For example, the noun 'székbe' ('into the chair') and its article are in mutual
search on rank five. Or the verb 'beültetheti' is searching for a nominative and an accusative
noun, and for their determiner pillars. Most of the relations are mutual, but there are exceptions: the
adjective 'büszke', for instance, needs a noun ('lányt'), but this search is of course not mutual.
As I mentioned, an implementation was developed for this theory, which decides whether a series
of words is grammatical or not, and if it is, it also generates the syntactic and semantic
representations of the sentence (for non-compound neutral Hungarian sentences).
6.
So in the earlier model the li's were assigned to words, and the lexical description of
morphologically complex words, which are very frequent in languages like Hungarian, was claimed
to be calculable in a multiple lexical inheritance network (in a way that, to tell the truth, was
never specified).
A better method (TLM) is proposed here, which suits the principle of total lexicalism directly:
each single morpheme within a word is assigned a lexical item of its own.
7.
Let's see again a series of words, but the point is now the series of morphemes within the words.
There are certain grammars in which trees can be built upon the morphemes as well, provided
they are well-formed sequences. In GASG, however, the proof of well-formedness is again that the
variables in lexical items, which are now morphemes, can be identified with constants; so
relations are declared between the morphemes.
And of course the tree upon the words, and the relations between the words, are presented as well.
8.
Let's see the previous example again: […], and let's see the relations between the morphemes
(li's). The simplest case, of course, is when the word contains only one morpheme (e.g.
'Péter'): there is no seeking here. In the case of 'Marira', 'lányt' and 'székbe' //point at
them, so you need not say them in English// we find a stem and a case-marking
morpheme, where the suffix is searching for the stem. In this sentence the verb 'beültetheti' is
morphologically the most complex. Below the word you can see the phonologically and
morphologically motivated arrows: every suffix is searching for the stem; they want to be
adjacent to it. The morpheme order depends on the ranks, that is, on how strong these
requirements are, but I will come back to this a bit later. It can also be the case that a suffix is
searching for the (immediately) preceding suffix because of its phonological requirements. The
arrows above the word account for the semantic representation: they show the scope of a morpheme,
which is in general the last morpheme before the given li, but there are exceptions; e.g. the
semantic scope of the derivative element 'tet' is not the stem 'ül' but the more complex 'beül'
(the stem and the particle together, because the scopal relations are of course inherited). And you
can see the relations between the words again.
9.
So what are the arguments for TLM? I am not going to talk about the inner argument, the
legitimacy of GASG and the principle of total lexicalism; for that, see the references at our
homepage (first slide).
But there are further arguments, such as our treatment of the word-level problem across
languages, or of word-internal scopal ambiguities. Further advantages of the model are its easy
feature checking and the indirect satisfaction concerning morpheme order, word order and argument
structures. In the rest of my talk I am going to discuss these arguments in detail.
10.
First, the problem of word level. There is a radical superficial difference between languages with
respect to word level, but the meaning is of course the same in languages of opposite
morphological types. This is not a problem in our system, because there is almost no difference
with respect to the background li's. Let's see an example: […] Here you can see the (phonologically
underspecified) own words of the five relevant li's, whose appropriate phonological realizations
are to be filtered out on the basis of (morpho-)phonological (word-internal) environmental
requirements. And finally you can see the own predicates: partial / underspecified semantic
representations in van Eijck and Kamp's style. The equations among (proto-)referents are also
collected.
11.
The next argument is the easy feature checking of our system. We can
account for such phonological phenomena as vowel harmony, lowering, V ~ ∅ alternation,
linking vowels, lengthening, shortening, etc. The theory and the implementation are worked out
mainly for nouns at the moment; the verb paradigm needs further research. We have two
morpheme types, i.e. two kinds of lexical items: stems and suffixes. But they are very similar:
the descriptions of the li's encode all kinds of own features and environmental requirements:
phonological, morphological and syntactic ones alike. Phonologically, two kinds of
requirements are needed. The first one accounts for the choice among the possible realizations of the
given morpheme (li) (I mean lengthening stems, for example, or the linking vowel in a suffix);
technically these are variables in the own words. For example, in the case of 'bokor' ('bush'),
which contains a V ~ ∅ alternation, the own word is 'bokOr', and the phonological realization
depends on the following suffix. Or in the case of the suffix 'ban/ben' ('in') the own word is 'bAn',
and the frontness of the vowel depends on the frontness of the stem (vowel harmony).
The other kind of requirement concerns how lexical items affect the phonological
realizations of other lexical items in the same word (e.g. lowering stems or suffixes, or again
vowel harmony). Technically we store the relevant features in an array, which is not the same in
the two cases: for stems the relevant features are the quality of the vowels and whether the stem
is lowering or not; for suffixes the point is whether the li causes lengthening,
shortening, epenthesis and lowering or not. E.g. the plural suffix ('Vk') triggers all four
phenomena [kutyák, tüzek, bokrok, asztalokat], but the already mentioned 'ban/ben' only the first
one (lengthening).
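As a minimal sketch of the first kind of requirement (choosing a realization of an underspecified own word), one can realize the variable A of 'bAn' from the stem's vowels. The frontness test is deliberately simplified (the last harmonic vowel decides, and neutral-vowel subtleties are ignored); the function names are invented, not the implementation's.

```python
FRONT_VOWELS = set("eéiíöőüű")
BACK_VOWELS = set("aáoóuú")

def is_front(stem: str) -> bool:
    # Simplified vowel harmony: the last harmonic vowel of the stem decides.
    for ch in reversed(stem):
        if ch in FRONT_VOWELS:
            return True
        if ch in BACK_VOWELS:
            return False
    return False

def realize_bAn(stem: str) -> str:
    # The underspecified own word 'bAn' surfaces as -ban or -ben,
    # copying the frontness of the stem.
    return stem + ("ben" if is_front(stem) else "ban")
```

So 'bokor' yields 'bokorban' and 'tűz' yields 'tűzben'.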
12.
So the program creates the proper form of a word by identifying the variables with constants, i.e.
by finding the relations between the li's. It "sees" all the possible li's of the given word at once,
because we work with a total proposal, and it can take into consideration all kinds of information
at once (the relevant phonological information, or other kinds of information, morphological,
syntactic or lexical, if needed). The direction of the search depends on the variables within the
li's. It can be forward (to the next suffix, in the case of lengthening), backward (to the stem, in
the case of vowel harmony, or to the preceding suffix, in some forms of V ~ ∅ alternation), or to a
suffix somewhere between the stem and the given li (roundness harmony).
Impl.
And now I show some examples to demonstrate that this mechanism really works. For example,
if the well-formedness of the word 'bokorban' is in question, you can see that it is well-formed,
because the lexical items are printed out. Here you can see the own words of these li's (with
variables, of course), the English translations, and their features (phonological, morphological and
syntactic features). (We have a much simplified syntax and no semantics in this version at the
moment; but see the Georgian one.) Of course 'bokorben' or 'bokrban' are out. Another
example: 'tüzeken'. There is shortening in the stem (because the next morpheme triggers it),
lowering in the plural suffix (because 'tűz' is a lowering stem), and frontness harmony in the
case-marking suffix 'on/en/ön'. The roundness harmony ('tűzön') is blocked here by a previous
(non-round) morpheme (which happens to be the preceding morpheme). And finally a simpler
example: 'kalapokat', where the first linking vowel is 'o', because 'kalap' is not a
lowering stem, but the second linking vowel (in the accusative case-marking morpheme) is 'a',
because the plural suffix triggers lowering.
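The linking-vowel logic of 'kalapokat' and 'tüzeken' can be caricatured in a few lines. The feature dictionary (front, round, lowering) stands in for the feature array mentioned above; the function and its keys are invented for illustration, not taken from the implementation.

```python
def plural(stem: str, features: dict) -> str:
    """Attach the plural 'Vk', choosing the linking vowel from the
    stem's stored features (a toy version of the feature array)."""
    if features["lowering"]:
        vowel = "e" if features["front"] else "a"   # lowering stems: a/e
    elif features["front"]:
        vowel = "ö" if features["round"] else "e"   # front: e/ö
    else:
        vowel = "o"                                 # back, non-lowering: o
    return stem + vowel + "k"
```

So the non-lowering back stem 'kalap' gives 'kalapok', while the (already shortened) lowering stem 'tüz' gives 'tüzek'.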
13.
The simplest example of indirect satisfaction is the calculation of the order of morphemes
within words. Every suffix would like to be adjacent to the stem, but these requirements are not
equally strong. For example, in the case of verbs, the strongest one (with rank 1) is the
requirement of the suffix 'hat', which accounts for modality; on rank 2 there are two kinds of
suffixes, the tense- and mood-marking morphemes; and the agreement suffixes are on rank 3.
According to the definition, if a requirement cannot be satisfied directly (there is more than one
suffix in the word), it can be satisfied indirectly. If a suffix A wants to be adjacent to the stem
on rank rA, and a suffix B wants to be adjacent to the stem on rank rB, and rA < rB (rA strictly
smaller than rB), then the morpheme order is: stem, A, B. Equality of ranks is not allowed; that is
why 'stem tense mood agr' is not a proper order. In this case the mood-marking suffix will be the
morpheme 'volna', which is on rank 4, so it will appear at the end of the word (actually, 'volna' is
an independent word, but it is part of the word in our approach).
In the case of nouns the problem and the solution are quite similar, but there are differences. The
main difference is rank 3.5: in the case of fractional ranks, equality is allowed. //some example
is needed here, so that they see it: kalapéiéi// And our treatment of
postpositions is the same as that of the suffix 'volna': though they are independent words in
Hungarian, in our system they are regarded as case-marking suffixes because of their behaviour.
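Since every suffix wants adjacency to the stem and smaller ranks are stronger, the surface morpheme order falls out of a simple sort. A sketch under illustrative ranks (the stable sort stands in for the fractional-rank case, where equal ranks keep their given order):

```python
def morpheme_order(stem, suffixes):
    # suffixes: list of (morpheme, rank) pairs; the stronger (smaller)
    # rank wins the position adjacent to the stem. Python's sort is
    # stable, so equal (fractional) ranks preserve their input order.
    ordered = sorted(suffixes, key=lambda pair: pair[1])
    return [stem] + [morpheme for morpheme, _ in ordered]

# Verbal template: modality 'hat' (rank 1) beats agreement (rank 3),
# so 'ül' plus agreement 'i' plus 'het' comes out as ül-het-i.
assert morpheme_order("ül", [("i", 3), ("het", 1)]) == ["ül", "het", "i"]
```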
14.
The definition which accounts for word order in a sentence (sentence-level immprec) is much
more complicated than the immprec relations within the word. The formal definition can be found
in the references. Informally: a requirement of rank n concerning the
immediate precedence of word w1 relative to w2 can be satisfied either directly, by the fact
that w1 does immediately precede w2, or indirectly, by permitting certain words to be
inserted between w1 and w2: those, and only those, whose immprec requirement to w1 or w2 is
stronger, or which are dependents of such words, or dependents of dependents, etc.
You can see an example here […]. The adjective–noun adjacency is very important in
Hungarian (because there is no agreement between them); that is why its immprec ranks are 1 or
2 (very strong). The article–noun relation is on rank 5, so the possessive noun 'lány', which wants
to precede the word 'nővérét' on rank 4, can be inserted (because 4 < 5). But this word brings
along a further dependent, the adjective 'vigyázó', which also brings a dependent, etc.
So the main difference between word-level immprec and sentence-level immprec is that on word
level the inserted li's (morphemes) never bring further dependents.
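The insertion condition can be approximated in code: only words with a stronger immprec requirement toward w1 or w2, or their dependents (transitively), may stand between w1 and w2. Everything here (the precomputed `stronger` set, the `dependents` map) is an illustrative simplification of the formal definition in the references.

```python
def immprec_satisfied(sentence, w1, w2, stronger, dependents):
    """sentence: list of words; stronger: words whose own immprec
    requirement to w1 or w2 outranks the checked one; dependents:
    word -> iterable of its dependents."""
    i, j = sentence.index(w1), sentence.index(w2)
    if i > j:
        return False
    # Close the set of licensed interveners under dependency.
    allowed, frontier = set(stronger), list(stronger)
    while frontier:
        word = frontier.pop()
        for dep in dependents.get(word, ()):
            if dep not in allowed:
                allowed.add(dep)
                frontier.append(dep)
    return all(word in allowed for word in sentence[i + 1 : j])
```

In the article–noun example, 'lány' (rank 4 < 5) may intervene, and it licenses its dependent 'vigyázó' as well.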
15.
GASG (+TLM), due to the already mentioned separation of "template-forming" rank
parameters and semantically motivated ranks, sheds new light on
problems of word-internal scopal ambiguity, discussed in Bartos (1999), which seem to violate
such basic principles of modern (Chomskyan) generative theory as Cinque's (1999) Hypothesis
on the Universal Hierarchy of Functional Projections, Baker's (1985) Mirror Principle and the
universal that operators c-command their scopes.
For example, the tense–mood ambiguity: […] Semantically there are two different meanings
because of the two different scope relations. […] e10, which is a Davidsonian argument referring
to the event […]
So it is not excluded in GASG that the eventuality "in the past," denoted by e31 in the li
belonging to -ett, is permitted to be identified both with the permission (e20) and with the eating
(e10), which accounts for the ambiguity. One additional factor is to be assumed: the
"template-forming" ranks are stronger than rank 3, which is responsible for the adjacency between
li3 and the li whose eventuality is claimed to be "in the past." This is our explanation for the
universal tendency expressed by the hypotheses above: languages tend to avoid conflicts between
"template-forming" and semantically motivated morphological adjacency ranks.
The problem and the solution are the same in the case of the tense–modality ambiguity 'evett
volna', where the two meanings are 'he intended to eat (but he does not intend to any more)'
and 'it would be the case that he ate'.
And finally there are similar phenomena in syntax as well, e.g. in the case of the Hungarian focus
the semantic requirements are stronger than the ones for the argument structure; that is why in
Hungarian the focused element precedes its scope: it stands before the verb (Péter MARIT szereti).
In English, however, the requirements for argument structure are stronger, so the focus stays in its
original position (Peter loves MARY). But in the case of question words the semantic requirements
are stronger in both languages, so the verb is preceded in both cases: Péter KIT szeret; WHOM does
Peter love. The stricter the word order in a language, the stronger the ranks for the position
of the arguments, while li's like operators have weaker requirements. That is why in Hungarian
(which has a less strict word order) the operators, in general, precede their scopes.
16.
And the last argument discussed here is the indirect satisfaction of environmental requirements,
now for the argument structure. This means that in TLM not only the calculation of the order of
words or morphemes can be treated this way (the basic tool of GASG: ranked immprec
requirements), but also the modification of argument structures, cases and agreement features.
According to the definition you can see here, the two central arguments of a verb have types: the
argument Y has arg-type (Y1,Y2), the argument Z has arg-type (Z1,Z2), and if
arg-type(Y1,Y2) is smaller than arg-type(Z1,Z2) (which means that Y1<Z1, or Y1=Z1 and Y2<Z2),
then Y is nominative and Z is accusative. If arg-type(Y1,Y2) = arg-type(Z1,Z2) (Y=Z), then Y is
nominative. For example, the verb 'ás' ('dig') has the central arg-pair Y, Z, where Y has
arg-type (0,-1) (where 0 means //what was it, again?// and minus means that the argument is
agentive), and Z has arg-type (0,+1) (where plus means that the argument is patient-like). In
the case of 'ül' ('sit'), for instance, which is not a transitive verb, both arg-types are (0,-1), which
means that the only central argument of this verb is an agent in the nominative case.
This system can account for the Hungarian -(t)At causativization easily: if we have the input
central-arg-pair(Y, arg-type(Y1,Y2), Z, arg-type(Z1,Z2)), and a condition which says that
arg-type(Y1,Y2) < 0 (so the argument is mildly agentive; arg-type(Y1,Y2) < 0 means that either
Y1<0, or Y1=0 and Y2<0), then the output is central-arg-pair(X, arg-type(Y1,Y2)-1, Z,
arg-type(Z1,Z2)), where arg-type(Y1,Y2)-1 means arg-type(Y1-1,*). Let's see some examples: if we
take the input 'ül', the central arg-pair of 'ültet' is (X, arg-type(-1,*), Y, arg-type(0,-1)), so the
new argument will be nominative (with the smaller arg-type) and the original subject will now be
the object. In the other example, the transitive 'ás', the result of the causativization is
central-arg-pair(X, arg-type(-1,*), Z, arg-type(0,+1)), so the new argument will again be nominative,
the previous object stays accusative, and the previous subject cannot occupy a central argument
position anymore. For further details, see the references.
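The arg-type calculus sketched above translates almost directly into code: arg-types are pairs compared lexicographically, and -(t)At causativization prepends a new, more agentive argument. The '*' of arg-type(Y1-1,*) is rendered as None; the function names are invented for this sketch.

```python
def case_assignment(arg_y, arg_z):
    # Lexicographic comparison of arg-type pairs: the smaller arg-type
    # is nominative; on equality (Y = Z) the verb has one nominative arg.
    if arg_y == arg_z:
        return ("Y: nom",)
    return ("Y: nom", "Z: acc") if arg_y < arg_z else ("Z: nom", "Y: acc")

def causativize(arg_y, arg_z):
    # -(t)At requires a mildly agentive Y: arg-type(Y) < (0, 0).
    assert arg_y < (0, 0), "input subject must be mildly agentive"
    new_causer = (arg_y[0] - 1, None)  # arg-type(Y1 - 1, *)
    return new_causer, arg_z

# 'ül' (sit): sole argument (0,-1); 'ültet' adds the causer (-1, *),
# and the original subject's arg-type becomes the object's.
assert causativize((0, -1), (0, -1)) == ((-1, None), (0, -1))
```

For transitive 'ás' the same rule yields ((-1, None), (0, 1)): the causer is the new nominative, the object stays accusative, and the old subject is ousted from the central arg-pair.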
So what does indirect satisfaction mean in TLM? Let's see the example 'beültethetlek' again. The
stem 'ül' ('sit') requires its word to have an argument structure with the arg-pair you can see here,
which consists of a 3sg argument in the nominative. It is assumed, however, that the li 'be' and the
li 'tet' "force" a bigger argument structure, due to stronger ranks, with a locative argument and a
second central argument (which is going to be the subject, while the original subject is going to be
the object). And the version of -(A)lAk alone decides the person and number of the subject and the
person of the object, overriding the "earlier" predictions. So finally we get a transitive verb with a
locative argument, where the subject is not 3sg anymore but 1sg, and we have information about
the object as well.
And finally I show some examples from the implementation.
Kérek egy könyvet az okos kutyákról.
Péter beültetheti a Marira büszke lányt a székbe.
Reviews:
Reviewer 1: Unclear how the approach relates to current empirical or theoretical work on
morphology. The idea that "each single morpheme within words is to be assigned a lexical item"
is not new; it is, e.g., elaborated in Wunderlich's Minimalist Morphology. On the other hand,
there are also influential non-lexicalist ('a-morphous', or realisational) theories. Empirical
comparison is only drawn with Cinque and Baker, rather than with those genuine morphological
theories. So what are the advantages of TLM? Is there evidence from e.g. psycholinguistics or
connectionist simulations? The abstract looks very complex and ambitious for a 30 min talk. I
wasn't able to comprehend all the details in (1)–(4).
Needs further research...
Reviewer 2: How are the ranks motivated? Why are the lexical entries and
the rule types so complicated as they are? Seems to be cooked together
very quickly.
In GASG the rewriting and transformational rule system is replaced by a system furnished with
ranks. It is generative: well-formed, and only well-formed, word-order sequences can be calculated.
Particular languages should be scrutinized, and we strive to find universals.
Reviewer 3: The formalism is very articulate and precise, but I'm not
sure what the issue is, what problem they're trying to solve, and how the work relates to or
improves on previous work.
The issue is that a syntactician always needs a lexicon, but the lexicon does not need syntax, so
the grammar can be homogeneous.
For the implementation: implementation is a necessary requirement for alternative frameworks,
because it verifies their formalizability.