Lexicon-Grammar: The Representation of Compound Words

Maurice Gross
University Paris 7
Laboratoire Documentaire et Linguistique 1
2, place Jussieu
F-75221 Paris CEDEX 05

The essential feature of a lexicon-grammar is that the elementary unit of computation and storage is the simple sentence: subject-verb-complement(s). This type of representation is obviously needed for verbs: limiting a verb to its shape has no meaning other than typographic, since a verb cannot be separated from its subject and essential complement(s) 2. We have shown (M. Gross 1975) that given a verb, or equivalently a simple sentence, the set of syntactic properties that describes its variations is unique: in general, no other verb has an identical syntactic paradigm 3. As a consequence, the properties of each verbal construction must be represented in a lexicon-grammar. The lexicon has no significance taken as an isolated component, and the grammar component, viewed as independent of the lexicon, will have to be limited to certain complex sentences.

Since be-Adjective forms are close to verbs, their description is quite similar, that is, they are considered as sentences. We have applied the lexicon-grammar representation not only to the two obvious predicative parts of speech, verb and adjective, but to nouns and adverbs as well. In the same way as one adjoins the verb to be to adjectives, we have systematically introduced support verbs (Vsup) for nouns and adverbs, as in the following examples (Z.S. Harris 1976, M. Gross 1982, 1986):

  Vsup =: to be Prep       The text is in contradiction with the law
  Vsup =: to have          This text has a certain importance for Bob 4
  Vsup =: to occur, etc.   Accidents occur at random

Support verbs are frequent in technical texts, and may have stylistic variants, as in this last example:

  The accident (was, happened, occurred, took place) late at night

1. This research has been partly financed by contract "PRC Informatique Linguistique" 1985-86 from the Ministry of Research.
2. The notion of essential complement has been refined through the systematic study of 12,000 verbs of French (M. Gross 1975; J.-P. Boons, A. Guillet, C. Leclère 1976a, 1976b, 1987) and a study of adverbials, that is, of nonessential complements (M. Gross 1986). The subject and/or the complements may be transformed and/or omitted through various syntactic operations, in particular by nominalizing the verb (G. Gross, R. Vivès 1986), but the full information can be recovered (Z.S. Harris 1982).
3. A line of "+" and "-" marks in Figure 1 is such a paradigm.
4. Neither example is an isolated entry of the lexicon-grammar; both are rather (Z.S. Harris 1964) transforms of other forms:
  This text contradicts the law
  This text is important for Bob

Grammatical elements such as determiners, prepositions and conjunctions do not belong to the lexicon-grammar in the same sense as the four major parts of speech do, since they are parts of structures or rules. For example, prepositions appear in the columns of the lexicon-grammar.

An early representation of verbs in a lexicon-grammar of about 12,000 verbs is given in Figure 1. Each row of the matrix is an entry whose main construction is defined by a table or class code. In Figure 1, the code 6 corresponds to the class of constructions: subject-verb-direct sentential complement, noted:

  (1) N0 V Que P

(N0 is the subject and P stands for sentence). Each column is a syntactic property, and corresponds to a structure into which V may enter, roughly a syntactic transform of the main structure (1). For example, in columns we have placed the Passive forms, Extraposed and Pronominal forms. Thus, the related structures are semantically close. A "+" sign at the intersection of a row and a column indicates that the entry in the row is accepted in the structure associated to the column; a "-" sign corresponds to inacceptability.

The process of accumulation that led to the formalized lexicon-grammar of 12,000 French verbs has run into what seemed at first to be a minor problem of representation of words: the difference between simple and compound words. On the one hand, there are simple words such as the verb know and complex (idiomatic) forms such as keep in mind. Both forms play the same syntactic and semantic role in sentences such as:

  Bob knows that Max has moved to Tampa
  Bob keeps in mind that Max has moved to Tampa

but the lexical content (one word vs three) requires different identification procedures (simple dictionary lookup vs a certain amount of syntactic analysis).

The representation of Figure 1 treats two forms such as to know (someone, something) and to keep (someone, something) in mind in the same way, thus emphasizing the semantic equivalence between simple and compound verbs. But compound terms raise a problem of representation. The unit of representation in a linear lexicon is roughly the word 5 as defined by its written form, that is, a sequence of letters separated from neighboring sequences by boundary blanks. As a consequence, compound words cannot be directly put into a dictionary the way simple words are. An identification procedure is needed for their occurrences in texts, and this procedure will make use of the various simple parts of the compound utterance. Hence, the formal linguistic properties of compound terms will determine both the procedure of identification in texts and the type of storage they require.
[Figure 1. Table 6: Verbs with Sentential Complements (from M. Gross 1975). Rows are verb entries, among them observer, obtenir, officialiser, omettre, orchestrer, oublier, ouïr, palper, parapher, passer sous silence, penser, percevoir, perdre de vue, perforer, pérorer; columns are syntactic properties marked "+" or "-".]

We thus have to discuss the main types of compounds and to single out those properties that bear on automatic parsing and dictionary lookup.

1. Compound adverbs

We call adverb any circumstantial complement, including sentential phrases, as in the following examples:

  (1) The show took place (nightly, at night, during a busy night, the night Bob missed his plane)

By compound adverbs, or frozen or idiomatic adverbs, we mean adverbs that can be separated into several words, with some or all of their words frozen, that is, semantically and/or syntactically noncompositional. In (1), at night is a compound adverb; the lack of compositionality is apparent from lexical restrictions such as:

  *at day, *at afternoon, *at evening

and by the impossibility of inserting material that is a priori plausible, syntactically and semantically:

  *at (coming, present) night
  *at (cold, dark) night
  etc.

The two other adverbs of (1) are free forms. Thus, the determiners and modifiers (Adj and Relclause) of:

  during Det Adj night Relclause

can vary freely (within semantic constraints):

  during the (coming, present) night
  during a (cold, dark) night

In the same way, the event associated with the sentence S in the form:

  the night (E, that) S

can be expressed by a large variety of unconstrained forms.

Notice that nightly can also be considered as a frozen compound, though constituted not of words but of a word and a suffix. Again, lack of compositionality stems from the observation that daily, weekly, monthly, yearly, which a priori are compounds of the same formal type, have a regular formation, in the sense that their interpretation is homogeneous. Thus, nightly is an isolated case, as opposed to an open series of identical forms with a different interpretation.

Frozen or compound adverbs constitute the simplest case of compound forms because they do not allow variations of their components. As mentioned above, in at night no adjective is authorized. Moreover, one cannot insert a determiner: *at (a, this) night, the plural is forbidden: *at nights, and no relative clause can be appended: *at night (that, which) was agreed on.

Such observations are general, and apply to many adverbs of varied form and lexical content:

  It rained cats and dogs
  *many cats and dogs
  *big cats and dogs
  *cat and dog

  from time to time
  *from times to times
  *from a time to another time
  *from long time to long time

Consequently, these compound adverbs could be identified by a simple recognition procedure, for they do not require any lemmatization or syntactic analysis to be reduced to a dictionary form, as is the case with verb forms for example.

A lexical study of compound adverbs has been performed in French and a systematic inventory has been compiled from various dictionaries. Running texts have been examined as well. It is interesting to note that whereas in current dictionaries there are about 1,500 one word adverbs, most of them in -ment (-ly), we have found over 5,000 compound adverbs. These compound adverbs have been classified according to their syntactic shape. The syntactic forms are described at the elementary level of sequences of parts of speech. We use symbols with obvious interpretations such as Prep, Det, Adj, N, V, Conj (for conjunction) and W for a variable ranging over verb complements, etc. We write:

  Prep N               =: at night
  Prep Det N           =: in the end
  Prep Det Adj N       =: in the long run
  Prep Det N of Det N  =: in every sense of the word, at the point of a gun
  N Conj N             =: time and again
  V W                  =: to begin with
  S                    =: all things being equal

Figure 2 shows the classes that have been defined on this basis, together with examples and the number of items in each class.

[Figure 2 (Tableau 2). Frozen Adverbs (M. Gross 1986): classes of French compound adverbs defined by their sequence of categories, each with an example and a number of items, e.g. PC, Prep C: en bref; PDETC, Prep Det C: contre toute attente; PCDN, Prep C de N: au moyen de N.]

The examples discussed so far are entirely frozen. Hence, as a practical matter, they can be located in a text by using the search function available for strings in any text editor system. There are however more complex examples that require deeper analysis. Consider for example the idiomatic adverb in the sentence:

  Max proposed solutions from the top of his hat

It is largely frozen: no other determiner is allowed, no adjectives can be appended to either noun, etc., but the person of the possessive adjective Poss may vary. This possessive adjective must refer to the subject of the sentence, and varies accordingly:

  *Max proposed ideas from the top of your hat
  *My sister proposed ideas from the top of his hat
  Bob and Max proposed ideas from the top of their hat(s)

In this case, the recognition procedure is no longer a simple string matching operation, since a variable slot must be dealt with inside the fixed string. More general matching rules are required here 6. Once this compound adverb has been identified in a text to be processed, it can be given an interpretation, for example in terms of a simple adverb such as leisurely or lightly, and the referential information carried by Poss can then be ignored. However, one can easily construct particular discourses where the obligatory coreference relation involved will disambiguate some analysis. Thus, not only must the variation of Poss be accounted for at the lexical level, but its referential information has to be kept for possible use in a parser.

Other compound adverbs offer different degrees of variation. There are cases where one part of the adverb is frozen and another part is entirely free:

  Max organized a party in honor of Bob
  Max hid the car at the far end of the parking lot

The parts in honor, at the far end are frozen. For example, they do not allow modifiers. The parts of N are free, for we observe variations such as:

  Max organized a party in his honor
  Max hid the car at the far end, I think, of the parking lot

Consider the adverbials:

  for the sake of ruining things
  for the sake of Bob
  for God's sake

We call the combination for-sake frozen, since the noun sake does not occur elsewhere than in adverbial phrases with the preposition for: it cannot be the subject or object of any verb. On the other hand, the modifiers of sake are quite varied and regular from the point of view of the syntax of noun modifiers 7.

There are also cases of seemingly free adverbs which require an ad hoc treatment. For example, dates such as:

  Monday March 13, 1968 at 9 p.m.

are described in a natural way by a finite automaton. Technical or specialized families of adverbs come close to being frozen adverbs:

  (2) Max elected Bob on the (first, second) ballot
  (3) Max ate his noodles in a bowl

The special semantic relations that hold between the adverbial complement and the rest of the sentence are limited. There are few verbs such as to eat which combine with in a bowl and which have the non locative interpretation of (3). The usual interpretation is that found in:

  Max put his noodles in a bowl

5. Note that words or roots are often considered as units in most attempts to devise semantic representations.
6. PROLOG rules are particularly well adapted to recognizing such frozen forms (P. Sabatier 1980).
7. There are nonetheless restrictions on them: *for a heavenly sake

Entering frozen adverbs into a lexicon-grammar raises many new questions. The bulk of adverbs can be described by means of the following type of derivation (Z.S. Harris 1976):

  Bob left; that Bob left occurred at 9
  = Bob left, this occurred at 9
  = Bob left at 9

and support verbs play a crucial role here. However, there are cases where no general support verb is found and where adverbs have to be considered as a part of the elementary sentence. Consider the adverb in:

  Bob sang at the top of his voice

It is syntactically and semantically analogous to free adverbs such as noisily, powerfully.
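Adverbs with a possessive slot, such as from the top of Poss hat or at the top of Poss voice, call for more than plain string search: the fixed words must match exactly while the slot varies with the subject. The Python sketch below is one hedged way to picture this. It uses a regular expression rather than the PROLOG rules mentioned in the notes, and the template syntax, the function names and the `SUBJECT_POSS` table (standing in for whatever earlier analysis links a subject to its possessive) are all assumptions made for the illustration.

```python
import re

POSSESSIVES = ["my", "your", "his", "her", "its", "our", "their"]


def compile_template(template):
    """Turn e.g. 'from the top of Poss hat' into a regex with one variable slot."""
    parts = []
    for word in template.split():
        if word == "Poss":
            # The only variable position: any possessive adjective.
            parts.append("(?P<poss>" + "|".join(POSSESSIVES) + ")")
        else:
            # Every other word is frozen and must match literally.
            parts.append(re.escape(word))
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


def find_adverb(template, text):
    """Return (matched string, possessive or None), or None if absent."""
    m = compile_template(template).search(text)
    if m is None:
        return None
    poss = m.groupdict().get("poss")
    return m.group(0), (poss.lower() if poss else None)


# Hypothetical result of an earlier step: the possessive that corefers
# with each subject. A real parser would compute this.
SUBJECT_POSS = {"Max": "his", "My sister": "her", "Bob and Max": "their"}


def is_frozen_reading(subject, template, text):
    # The frozen reading requires the slot to corefer with the subject.
    found = find_adverb(template, text)
    return found is not None and SUBJECT_POSS.get(subject) == found[1]
```

The matcher returns the possessive it found instead of discarding it, so that the referential information stays available to a later parsing step. Plural variants such as hat(s) are not handled here.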
For the two free adverbs noisily and powerfully, a derivational source involving the adjective is available:

  The way Bob sang was (noisy, powerful)

This is not the case for at the top of his voice, which is practically limited to modifying the verbs of saying. Moreover the obligatory coreference link of his leads to a representation where this adverb is not analyzed. Thus two semantically similar types of adverbs have to be represented quite differently in the lexicon-grammar.

All the situations just exemplified with adverbs are quite common, and are also encountered with nouns, adjectives and verbs. The paradox of representation they lead to can only be solved by introducing a complex level of semantic equivalence for the entries of the lexicon-grammar.

2. Compound nouns

Compound nouns form the bulk of the lexicon of languages. Language creativity is largely associated with the growth of technical vocabularies, which consist mainly of technical nouns. Compound nouns number in the millions for European languages. They are usually built from the vocabulary of simple words by means of grammatical rules which may involve grammatical words. By definition, their meaning is noncompositional. The compound nouns can be described in terms of the sequence of their grammatical categories, in the same way as for adverbs (M. Gross, D. Tremblay 1985). We have for example:

  Det N           =: the moon
  Adj N           =: crude oil
  N of N          =: stroke of luck, board of (governors, regents)
  Det N of Det N  =: the talk of the town
  N N             =: test tube, color TV

Such nouns can become quite complex in various technical fields.

In general, compound nouns allow variations of determiners and modifiers, but many situations are encountered:

- the moon is a frozen combination (definite article-noun) which behaves like a proper name, because of its unicity of reference. It cannot be modified by adjectives without losing its reference: *the (big, yellow) moon;
- crude oil takes restricted determiners. Since it is a mass noun, there are difficulties in accepting its plural. It can be modified by adjectives and nouns as in (cheap, high quality) crude oil, but these cannot modify oil: *crude, (cheap, high quality) oil;
- stroke of luck has unrestricted determiners and modifiers, but no insertion is allowed immediately before or next to of; in particular luck cannot be modified: *stroke of good luck 8;
- board of governors can be modified in several ways: board and governors take separate determiners and modifiers: the powerful boards of the twelve governors of my bank. Such a compound noun comes close to being a free form. It is the limited number of second nouns such as director, governor or regent that suggests we are dealing with a compound noun. Also, the meaning of these phrases is noncompositional in the sense that they have a legal or institutional meaning that their components do not clearly have.

The variations of compound nouns we have enumerated can be partly handled by attaching a finite automaton to a given entry, and this automaton will describe the main grammatical changes allowed. The adjunction of free relative clauses to compound nouns may require a different treatment.

The kinds of variation of compound nouns are so numerous that determining whether a given nominal construction is a compound noun or not almost requires an original demonstration. Thus, automatizing the construction of a lexicon is an activity that will present severe limitations.

Determining the support verbs for compound nouns does not seem to raise other problems than those encountered with simple nouns.

Compound nouns raise other questions in some languages:

- in German, where no blanks occur between components, segmentation is a problem;
- in French (G. Gross 1985), where the spelling of the plural is in general not standardized, extra variations have to be expected.

Compound modifiers

Adjectives, noun complements and relative clauses can be complex and yet apply to free nouns. From the point of view developed here, that is, the representation in terms of sequences of grammatical categories allowing for efficient matching procedures with texts, they do not differ from adverbs and nouns. Examples are:

  The table is as clean as a new pin
  The book is up to date
  Bob is the world's (best, worst) teacher
  They discussed it, on a take it or leave it basis

3. Compound verbs

Compound verbs, or frozen sentences as we have termed them (M. Gross 1982), can be described as sequences of categories. We write Ni for variable noun phrases and Ci for frozen noun phrases. For subjects: i = 0, for complements: i = 1, 2. Examples are:

  (1) N0 V C1          =: Bob hit the jackpot
  (2) N0 V N1 Prep C2  =: Bob took your project into account
  (3) N0 V C1 Prep C2  =: Bob took the bull by the horns
  (4) N's C0 V C1      =: Bob's dream came true

We outlined earlier the description of a lexicon-grammar of French verbs and the reasons why compound verbs had to be separated from simple ones. Systematic search through dictionaries (monolingual, bilingual, and specialized) has yielded close to 20,000 compound verbs belonging to the same level of language as the 12,000 simple verbs. A syntactic classification has been built for them (Figure 3).

[Figure 3 (Tableau 3). Frozen Verbs (M. Gross 1982): classes of French compound verbs defined by their sequence of categories, each with an example and a number of items, e.g. N0 V C1 Prép C2: Il met de l'eau dans son vin; N0 V N1 Prép C2: Ils ont passé Max par les armes; C0 être Prép C1: Les rieurs sont du côté de Max.]

Compound verbs are the most complex forms that have to be entered into a lexicon 9. The compounds discussed previously were simple because by and large they were topologically connex, that is, either their parts could not be separated by any extraneous linguistic material or else the inserted material could be easily described (i.e. by means of a finite automaton).

In the case of compound verbs, the various parts of each utterance remain syntactically independent. Thus, the verbs of (1)-(4) can take any tensed form, as in:

  At that time, Bob will be hitting the jackpot

Sentential inserts can separate a verb from its complements:

  Bob hit, it seems to me, the jackpot

In example (2), the direct complement N1 is free and general; hence, sentential structures can separate the verb from its second (frozen) complement:

  Bob took the fact that Jo was absent yesterday into account

Notice that parts of compound verbs may be recognized directly, for example the jackpot, or into account, but these parts may be ambiguous, whereas the full utterances can rarely be confused with free forms 10.

4. Some conclusions

How to organize the lexicon of compound utterances is an open question. From a computational point of view, many solutions are available for the lookup of a compound term:

(i) In classical algorithms in which left-to-right analysis is essential, the compound term could be viewed as an extension of the first major element met while scanning the sentence. For example, the adjective long is the first such element of the compound adverb in the long run. Among many other possibilities, the program, pausing on the word long, would test the occurrence of "the" and "in" to the left of long, and the occurrence of run to the right. Notice that the left-to-right constraint has to be somewhat relaxed in order to test both left and right contexts of long.

(ii) In a futuristic view of parsing involving parallel computing, one might envision several levels of lexicon. At the first level, long on the one hand and run on the other would lead to two sets of constructions whose intersection would contain the compound in the long run; the latter can then be searched for in the input text. For compound verbs, one would have to synthesize a matching utterance, rather than simply looking it up. Such a procedure can always be simulated sequentially.

In all cases, the representation of utterances which we have used, namely the sequences of syntactic categories, allows for the separation of the lexicon of compound forms into classes for which direct access can be provided. In this way, dictionary lookup can be sped up 11.

REMARK

In favor of left-to-right analysis one could point to the fact that complex terms can often be abbreviated and that abbreviations are mostly right truncations. In such situations the remaining part (the leftmost part) of the truncated term must carry the information that describes the right context in order to allow reconstruction of the reduced part. There are however examples where abbreviations are carried out on the left part of a term (e.g. a programming language = a language).

Preliminary figures have shown that compound terms form the essential part of a lexicon-grammar. It is also interesting to observe that they force both the linguist and the computer specialist to adopt a much more abstract view of language:

- semantically, by definition, compound utterances cannot be decomposed into simple utterances; in other terms, meaning is not compositional for compounds. Hence, in a certain sense, one has to recognize that meaning has not much to do with words;
- syntactically, it has become a rather general habit to attach properties to individual words. In the case of compounds this mode of representation is no longer possible: why privilege one part of a compound with marks rather than some other part? For example, there is no reason to attach the Passive marking to the verb rather than to either of the complements of the utterance to put the cart before the horse. Lexicon-grammar representations eliminate such questions by delocalizing the syntactic information and by attaching it to the full sentence.

In this sense, compound expressions provide a powerful motivation for representing lexical and syntactic phenomena in the form of a lexicon-grammar.

8. Stroke of bad luck would be a different compound word, whose relation to stroke of luck is only etymological.
9. There are however a limited number of frozen discourses such as: It was for all the world as if S, which need an extra level of complexity (L. Danlos 1985).
10. As a matter of fact, when an utterance is found to be ambiguous, with one analysis as a frozen form and the other as a free form, ignoring competing free forms altogether is a good parsing strategy.
11. The same use of sequences of syntactic categories is found in string grammar (Z.S. Harris 1961), which has proven to be quite efficient in syntactic recognition (N. Sager 1981, M. Salkoff 1973, 1979).

REFERENCES

Boons, Jean-Paul; Guillet, Alain; and Leclère, Christian. 1976a. La structure des phrases simples en français. I Constructions intransitives, Geneva: Droz, 377p.
Boons, Jean-Paul; Guillet, Alain; and Leclère, Christian. 1976b. La structure des phrases simples en français. III Classes de constructions transitives, Rapport de recherches No 6, Paris: University Paris 7, L.A.D.L., 143p.
Boons, Jean-Paul; Guillet, Alain; and Leclère, Christian. 1987. La structure des phrases simples en français. II Classes de constructions locatives, Paris: Cantilène.
Danlos, Laurence. 1985. Génération automatique de textes en langues naturelles, Paris: Masson, 239p.
Gross, Gaston. 1985. Le lexique électronique des mots composés du français, Rapport ATP CNRS, Paris: University Paris XIII.
Gross, Gaston; Vivès, Robert, eds. 1986. Syntaxe des noms, Langue française 69, Paris: Larousse, 128p.
Gross, Maurice. 1975. Méthodes en syntaxe, Paris: Hermann, 414p.
Gross, Maurice. 1981. Les bases empiriques de la notion de prédicat sémantique, Langages 63, Paris: Larousse, pp.7-52.
Gross, Maurice. 1982. Une classification des phrases figées du français, Revue québécoise de linguistique, Vol. 11, No 2, Montréal: Presses de l'Université du Québec à Montréal, pp.151-185.
Gross, Maurice. 1986. Grammaire transformationnelle du français. III Syntaxe de l'adverbe, Paris: Cantilène.
Gross, Maurice; Tremblay, Diane. 1985. Etude du contenu d'une banque terminologique, Rapport de recherche du LADL, Paris: MIDIST.
Harris, Zellig S. 1961. String Analysis of Sentence Structure, Papers on Formal Linguistics, The Hague: Mouton.
Harris, Zellig S. 1964. The Elementary Transformations, Transformations and Discourse Analysis Papers 54, in Harris, Zellig S. 1970, Papers in Structural and Transformational Linguistics, Dordrecht: Reidel, pp.482-532.
Harris, Zellig S. 1976. Notes du cours de syntaxe, Paris: Le Seuil, 237p.
Harris, Zellig S. 1982. A Grammar of English on Mathematical Principles, New York: Wiley Interscience, 429p.
Sabatier, Paul. 1980. Dialogue en français avec un ordinateur, Doctoral thesis, Marseille: Groupe d'intelligence artificielle.
Sager, Naomi. 1981. Natural Language Information Processing. A Computer Grammar of English and Its Applications, Reading: Addison-Wesley, xv-399p.
Salkoff, Morris. 1973. Une grammaire en chaîne du français. Analyse distributionnelle, Paris: Dunod, xiv-199p.
Salkoff, Morris. 1979. Analyse syntaxique du français. Grammaire en chaîne, Amsterdam: John Benjamins B.V., 334p.