Lecture 5 - ELTE / SEAS
BMN ANGD A2 Linguistic Theory
Lecture 5: Derivation versus Representation

1 Standard and Extended Standard Theory

The beginning of the 1970s marked a shift in linguistic theorising from a number of perspectives, one of which, arising from the introduction of constraints, we discussed last week. Alongside this, the X-bar theory of phrase structure was introduced (Chomsky 1970), which at the time was taken to be a set of constraints on possible phrase structure rules, in a similar way to how island constraints constrained possible movements. Thus we went from a theory which looked as in (1) in the 1960s to one which looked as in (2) in the 1970s:

(1)  Phrase Structure Rules + Lexicon → Deep Structures → Transformations → Surface Structures

(2)  Phrase Structure Rules + Lexicon + X-bar Theory → D-Structures → Transformations + Constraints on Transformations → S-Structures

The change of terms from 'Deep Structure' to 'D-structure' and 'Surface Structure' to 'S-structure' was merely a way to avoid the words 'deep' and 'surface', which Chomsky felt had unfortunate connotations. The important differences are in the additions to the theory. The original theory became known as Standard Theory and that in (2) as Extended Standard Theory.

2 Further Developments

Before we continue to the main theme of the present lecture, it is necessary to mention a few developments that also took place in the 1970s which, while having no direct bearing on our topic, will make our discussion more straightforward if they are introduced at this point.

The first concerns the complementiser and associated elements. During the 60s, if the complementiser was included in discussion at all, it was simply assumed to be part of the S node, along with the subject and VP:

(3)  [S that [NP the man] [VP ran away]]

Indeed, it was only in the last part of the 1960s that the term complementiser started to be used to refer to this element (Rosenbaum 1967).
In 1970, Bresnan proposed that the complementiser stood outside the clause, forming another constituent with it, which she labelled S′. This analysis was rapidly accepted, along with the assumption that the position occupied by the complementiser also hosted the fronted wh-element in interrogatives and relative clauses. The standard analyses of such clauses were as below:

(4)  a  [S′ [COMP that] [S the man ran away]]
     b  [S′ [why [COMP]] [S the man ran away]]

As we can see, the wh-element was not assumed to occupy the same position as the complementiser, but to be adjoined to it. We need not delay over the justifications for these assumptions, but they are necessary to know about for the following discussion.

The second important development, which has its roots in several works of the early 1970s but is usually attributed to Fiengo (1974), is the assumption that when an element undergoes a movement, it leaves behind an abstract, phonologically empty element of the same category as itself, known as a trace. Traces are usually indicated by the letter 't', and this is indexed with the moved element to show which part of the structure it is the trace of. Again, we do not need to go too far into the justification of this assumption at the moment, but it is easy to see that it limits the power of transformations in terms of what they are capable of doing:

(5)  [S NP [Aux was] [VP [V seen] [NP John]]]  →  [S [NP John] [Aux was] [VP [V seen]]]

(6)  [S NP [Aux was] [VP [V seen] [NP John]]]  →  [S [NP John1] [Aux was] [VP [V seen] [NP t1]]]

As we can see from (5), if the trace convention is not used, the transformation rearranges the structure in some quite drastic ways, as whole parts of the tree are made to disappear. Moreover, the lexical items involved also undergo changes: the verb, which is in a transitive environment at D-structure, is in an intransitive environment at S-structure, which is tantamount to saying that a transitive verb is made intransitive by the transformation.
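The trace convention can also be pictured procedurally. Below is a toy sketch in Python (using flat token lists rather than real trees; this is entirely my own illustration, not a formalism from the literature) of a movement operation that obeys the convention: it relocates an element, leaves a coindexed placeholder, and changes nothing else.

```python
def move_with_trace(tokens, src, dst, index=1):
    """Move the token at position src to position dst, leaving behind a
    coindexed trace, as in (6). A flat token list stands in for a tree."""
    tokens = list(tokens)
    moved = tokens[src]
    tokens[src] = f"t{index}"           # the trace marks the original position
    tokens.insert(dst, f"{moved}{index}")  # the moved element carries the index
    return tokens

# D-structure of (6): __ was seen John  ->  S-structure: John1 was seen t1
d_structure = ["was", "seen", "John"]
s_structure = move_with_trace(d_structure, src=2, dst=0)
print(s_structure)  # ['John1', 'was', 'seen', 't1']
```

Nothing disappears from the structure: the only effects are the relocation itself and the insertion of the trace, which is the sense in which traced movement is a weaker operation than the trace-free transformation in (5).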
In (6), where a trace is assumed, the only thing the transformation does is to move the object into subject position and insert the trace itself. Structurally and lexically, nothing much changes. Without traces, then, transformations have to be assumed to be capable of more radical changes and hence are more powerful mechanisms. Having introduced these notions, we can now proceed to the main topic of the lecture.

3 Filters

Besides using constraints on transformations to counteract over-generation, Ross also used another device, which has become known as a filter. Filters do the same job as constraints, deeming ungrammatical some structure which would be predicted to be grammatical by the free application of a transformation. However, instead of imposing a restriction on the transformation itself, filters work by imposing restrictions on the structures that are formed by transformations, i.e. S-structures. A simple example can be taken from a paper by Ross written in 1972. We have previously mentioned the idea of early transformational grammar that gerunds might be derived from underlying sentences by a transformation.
Thus, there is an obvious relationship between the following examples, which is exactly the sort of thing that transformations were formulated to account for:

(7)  a  John wrote a letter
     b  John's writing a letter (pleased his mother)

However we are to deal with this relationship, note that it is the first verbal element which takes the gerund -ing form, whether this be a main verb or an auxiliary:

(8)  a  John's writing a letter                 (John wrote a letter)
     b  John's having written a letter          (John had written a letter)
     c  the letter's being written (by John)    (the letter was written (by John))

However, there is an unexpected gap in this paradigm: we cannot form a gerund when the progressive auxiliary is the first element:

(9)  a  John's having been writing a letter     (John had been writing a letter)
     b  * John's being writing a letter         (John was writing a letter)

Note that there is nothing wrong with the underlying sentence associated with the gerund in (9b), so the problem seems to be due only to whatever is involved in forming the gerund itself. Ross's suggestion was that what is wrong with (9b) has nothing to do with the process by which it is formed: we are simply not allowed to have two verbal elements in the '-ing' form next to each other. That this is so can be seen from the following data, which are produced using different mechanisms to those in (9):

(10)  a  it began to rain         it began raining
      b  it is beginning to rain  * it is beginning raining

In this case we have a verb in its continuous form followed by a gerund complement, rather than an auxiliary in its gerund form followed by a continuous verb. Hence the two structures are different, yet the result is the same: two adjacent verbs in the -ing form are ungrammatical.
We might propose therefore a condition which simply states that however the structure is formed, if the result has two adjacent -ing verbs, it will be ungrammatical:

(11)  The Double-ing Filter
      * … V-ing V-ing …

The first thing to note is that this does not work in the same way that a constraint does: a constraint directly restricts the operation of a transformation and so limits what S-structures can be associated with a given D-structure. A filter, on the other hand, simply says that certain S-structures, no matter how they are formed, are ungrammatical, and so completely ignores the issue of the relationship between D- and S-structure. We can extend the model of the grammar given in (2) in the following way to include filters:

(12)  Phrase Structure Rules + Lexicon + X-bar Theory → D-Structures → Transformations + Constraints on Transformations → S-Structures + Filters

A number of points are raised by these suggestions. Why, for example, do we need two different mechanisms to do the same job? Both constraints and filters check the over-generation of transformations, so why do we need both? I suspect that the answer to this question, at the time the early filters were proposed, was simply that they were the most obvious way to state the restriction. To formulate a constraint that prevented the generation of double -ing structures, although perhaps not impossible, would result in something rather complex and inelegant. The Double-ing Filter at least has simplicity on its side. Moreover, there is an underlying assumption here that might not be totally accurate: that every phenomenon that might be ruled out by a filter is the result of some transformation. This is not guaranteed. For example, the offending phenomenon may be the result of the phrase structure rules and survive into S-structure because transformations do not affect it. In this case it would be difficult to get a constraint on transformations to rule out the problematic S-structure.
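Seen this way, a filter such as the Double-ing Filter in (11) is simply a yes/no predicate over surface strings: it inspects the output, with no reference to the derivation that produced it. The following is a minimal sketch of that idea; the (form, category) token representation and the tag names are my own illustrative assumptions, not anything proposed in the literature.

```python
def double_ing_filter(tokens):
    """Return False (ungrammatical) if two adjacent verbal elements both
    carry the -ing form, as in (11): * ... V-ing V-ing ...
    Tokens are (form, category) pairs; the tagging is illustrative."""
    verbal = {"V", "Aux"}
    for (form1, tag1), (form2, tag2) in zip(tokens, tokens[1:]):
        if (tag1 in verbal and tag2 in verbal
                and form1.endswith("ing") and form2.endswith("ing")):
            return False
    return True

# (10a) it began raining: no adjacent -ing pair, so it passes the filter
ok = [("it", "N"), ("began", "V"), ("raining", "V")]
# (10b) * it is beginning raining: two adjacent -ing verbs, filtered out
bad = [("it", "N"), ("is", "Aux"), ("beginning", "V"), ("raining", "V")]

print(double_ing_filter(ok))   # True
print(double_ing_filter(bad))  # False
```

The point the sketch makes concrete is that the predicate is indifferent to how the string was built: however many different routes generate the offending configuration, one surface check rules them all out.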
This said, however, there is a serious theoretical point here concerning the nature of human grammatical systems. Given that constraints and filters both perform the same function, a system that made use of just one of these mechanisms would obviously be simpler than one that makes use of both. Only if there is empirical evidence that both are needed would we be wise to continue with the more complex assumption. As it turns out, although it may not be possible to replicate the effect of a particular filter by a particular constraint, or vice versa, it is always possible to replicate the effects of any system which makes use of only filters with one which makes use of only constraints, and vice versa. For example, we can arrange the grammar so that a particular S-structure configuration is not generated, without there being a specific constraint which accounts for its absence. Thus the question is still valid as to which of the two mechanisms for countering over-generation should be used. This turns out to be an extremely important issue, which ultimately relates to the question of whether we need transformations or not. We will return to this issue in a later section.

A second point that one might raise concerning the use of filters is that, as they are associated with surface phenomena, are they not rather superficial themselves, merely translating observations into grammatical mechanisms meant to account for those observations? In other words, are filters not just hopelessly descriptive in nature, unable to achieve any level of explanation? The issue revolves around just how surface-based the filters have to be. For example, the Double-ing Filter seems to be very surface-based, given that it directly addresses a particular structure of English. However, this does not necessarily mean that it cannot be restated in a less surface-based way.
There is evidence that the double-ing effect is not just a one-off observation, but is part of a larger set of observations which indicate that something more general is going on. For example, Chomsky and Lasnik (1977) proposed a filter to account for the following observation:

(13)  a  solitude is what she longs for
      b  for him to leave is what she longs for
      c  she longs for solitude
      d  * she longs for for him to leave
      e  she longs for him to leave

The verb long takes a PP complement headed by for. It can also be associated with an infinitival clause introduced by the complementiser for. However, it cannot have the preposition and the complementiser together, though as shown by (13b), there is nothing to stop these appearing in the same construction, as long as they are separated. Thus we might suppose a filter which rules out the adjacency of the preposition and complementiser for:

(14)  The For-For Filter
      * … for [S′ for …

Obviously this filter is just as surface-based as the Double-ing Filter, but the two also share something in common: both involve the adjacency of two elements with identical forms expressing different grammatical entities, namely the gerund and progressive ing, and the preposition and complementiser for. Is this just coincidence, or is there something deeper going on? We can note that similar observations can be made concerning other phenomena in other languages. Take NPs in Hungarian, for example:

(15)  a  az ő kalapja
         the he hat
         'his hat'
      b  az ember kalapja
         the person hat
         'the person's hat'
      c  * az az ember kalapja
      d  az embernek a kalapja

In (15a) we see that the Hungarian determiner a(z) precedes the possessor. In this case, this determiner must be associated with the possessed noun, not the possessor, as Hungarian pronouns cannot co-occur with determiners:

(16)  * az ő olvasott egy könyvet
        the he read a book

The possessor can also be an NP with its own determiner, as shown in (15b).
However, we cannot have both a determiner for the possessed noun and one for the possessor together in such a way that the two determiners are adjacent (15c), though they are fine when separated (15d). Again, we seem to have a ban on two identical forms with different functions occurring adjacent to each other.

Strangely enough, similar observations can be made in phonology too. Leben (1973) proposed a kind of phonological filter which ruled out two similar tones being adjacent. So while we find HL and LH tone patterns in various languages, we do not find HH and LL patterns. This restriction was termed the Obligatory Contour Principle. It may be that the double-ing, double-for and double-az phenomena are syntactic versions of violations of the Obligatory Contour Principle, in which case it seems that there is something very deep-rooted, and not at all surface-based, that is being filtered out in the linguistic system. What this deep filter is and how to formulate it are other matters which take us beyond the boundaries of this lecture, however.

On the other hand, it has to be admitted that many of the filters proposed in the 1970s were not particularly deep and had the feel of descriptive devices waiting for something more explanatory to come along and replace them. For example, it is a straightforward observation that complementisers such as that only introduce subordinate clauses, not main clauses:

(17)  a  I think [that he left]
      b  * that he left

From a traditional point of view, this is not remarkable, as complementisers were classified as subordinators, i.e. things which introduce subordinate clauses. Yet from the generative position, there was something to be explained. To start with, given that main clauses can take a fronted wh-element, and assuming these to move to the COMP position, main clauses must have COMP positions, even if these cannot be filled by a complementiser.
Moreover, not all subordinate clauses are introduced by a visible complementiser. The idea emerged, then, that all clauses have a COMP position, but that this may be left unfilled, or perhaps filled by some phonologically empty complementiser:

(18)  [S′ [COMP e] [S he left]]

The question that arises is why the COMP position must be empty in main clauses, given that it may be filled in subordinate clauses. Chomsky and Lasnik (1977) proposed that this was due to a filter that deemed root clauses ungrammatical if they begin with a complementiser:

(19)  The Root Clause Filter
      * [S′ COMP [ … , if COMP is filled by an overt complementiser and S′ is root

Apparently not all languages behave like this, and so this has to be seen as a language-specific filter, which perhaps makes it even more unappealing, as it simply describes the surface distribution of a category in certain languages. It might be better to deal with this in terms of a lexical property ascribed to complementisers, more or less admitting that they have a subordinating function.

Another very well known, but ultimately descriptive, filter from the same paper as the Root Clause Filter accounts for the following observations:

(20)  a  a man [who1 COMP [I met t1]]
      b  a man [who1 (that) [I met t1]]    (who deleted)
      c  * a man [who1 that [I met t1]]

A restrictive relative clause in English may be introduced by a wh-element, in which case the COMP position is left empty. On the other hand, the wh-element may undergo a deletion process, in which case the COMP position may or may not be left empty. What is not allowed is for the wh-element to be undeleted and the COMP position to be filled. Thus Chomsky and Lasnik proposed the following:

(21)  The Doubly Filled COMP Filter
      * [S′ WH + COMP [ … , if WH is not deleted or COMP is not empty

While this has the feel of a descriptive device, it is difficult to come up with a deeper account of it.
Certainly it would be hard to claim that it was the result of some lexical property.

One of Chomsky and Lasnik's descriptive filters turned out to be the basis of a deeper principle of the 1980s, which we will talk about next week. The observation was a rather puzzling one, which had been known about since Perlmutter (1971). It seems that while the presence of a complementiser does not affect the grammaticality of a clause with a wh-element extracted from object position, it does when the wh-element is extracted from subject position:

(22)  a  who1 did you think [that Mary called t1]
      b  who1 did you think [Mary called t1]

(23)  a  * who1 did you think [that t1 called Mary]
      b  who1 did you think [t1 called Mary]

Given that what seems to be the problem here is the combination of a movement from the subject position and an overt complementiser, Chomsky and Lasnik proposed the following filter:

(24)  The that-trace Filter
      * [that [ t …

In other words, when the complementiser is immediately followed by a trace, the result is ungrammatical. While this is clearly descriptive, as mentioned above, it did serve as the starting point for a more explanatory account in the 1980s. We will review this development in a later lecture.

The final filter we will discuss here is also one introduced by Chomsky and Lasnik. The starting point for this is the following set of observations:

(25)  a  they want [him to leave]
      b  they want [__ to leave]
      c  they want very much [for him to leave]
      d  * they want very much [for __ to leave]

The examples in (25) demonstrate the distribution pattern of subjects in infinitive clauses. Essentially, this subject can be overt when it is immediately preceded by certain verbs or by the for complementiser. However, the subject can only be covert in the absence of the complementiser.
While this might appear to be related to that-trace effects, in that an overt complementiser is followed by a covert element in subject position, it turns out that the two phenomena are unrelated. To start with, the covert subject in (25) is not a trace, as none of these examples involves movement: the subject of want is semantically related to this verb and hence has not moved from another position. Chomsky and Lasnik propose that the covert position is occupied by a phonologically null pronoun, which they term PRO.[1] Another reason to suspect that this observation is independent of the that-trace filter is the fact that there are dialects of English in which (25d) is grammatical. But even in these dialects that-trace violations are not allowed, demonstrating that the two have different sources. The filter proposed by Chomsky and Lasnik is again rather descriptive in nature:

(26)  The for-to Filter
      * [for [__ to …

In other words, this filter rules out structures in which the complementiser for is immediately followed by the infinitival to in the surface string (i.e. there is no phonological material between them). Despite this, it laid the foundations of one of the most successful applications of filters.

Shortly before the publication of their paper, Chomsky and Lasnik received a letter from Jean-Roger Vergnaud commenting on a draft of the paper they had sent him. In this letter, Vergnaud proposed a different account of the data in (25), based on the idea that NPs occupy positions in which they are assigned Case, even if this is not realised morphologically in a language.

[1] This analysis has its roots in another transformational process called Equi-NP deletion, which we will introduce in a future lecture. Chomsky and Lasnik's analysis was subsequently taken up and as such played a role in the elimination of deletion-type transformations as part of the programme of restricting their power.
The Case positions Vergnaud suggested are the subject of a finite clause (which Vergnaud called Subject Case, but which is traditionally called Nominative) and positions following verbs and prepositions (which Vergnaud calls Governed Case, traditionally Accusative). To account for the for-to filter data, Vergnaud suggests that Case is relevant for all NPs, except the phonologically null one which appears in the subject position of infinitives. The idea is that the complementiser preceding the infinitive forces the subject position to be in the Governed Case, and this is incompatible with a null subject. Chomsky (1981) reworked Vergnaud's informal account into a more formal theory which made use of the following filter:

(27)  The Case Filter
      * NP, where NP is overt and does not sit in a Case position

The more formal aspects of this theory involved the definition of Case positions, which we do not need to go into here. Descriptively speaking, they are the same positions as Vergnaud proposed. This theory then accounts for the surface distribution of NPs in general, including observations about why some NPs undergo movements and others do not:

(28)  a  it seems [he is rich]
      b  * it seems [him to be rich]
      c  he1 seems [t1 to be rich]

Note that when the complement clause is finite, its subject is not forced to move (28a), but when it is non-finite it is (28b and c). As the subject position of a finite clause is a Case position, the Case Filter is satisfied by (28a), but the subject position of a non-finite clause is only a Case position if it is preceded by certain verbs (such as believe) or by the complementiser for. Hence in (28b) the Case Filter is violated and the sentence is ungrammatical. This is obviously a far more general account, which subsumes the effects of the for-to filter.

To conclude this section, I think it is worthwhile pointing out the importance of Chomsky and Lasnik's paper.
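As a closing illustration, the Case Filter in (27) can likewise be read as a predicate over S-structures, checking each overt NP against a set of licensed positions. The sketch below hard-codes the Vergnaud-style positions described above; the position labels and the (form, position, overt) triples are my own illustrative assumptions, not notation from Chomsky (1981).

```python
# Case positions, following Vergnaud's informal proposal as described above:
# the subject of a finite clause (Nominative) and positions governed by a
# verb or preposition (Accusative). Labels are illustrative only.
CASE_POSITIONS = {"subject-of-finite-clause", "object-of-verb",
                  "object-of-preposition"}

def case_filter(nps):
    """(27): every OVERT NP must occupy a Case position.
    Each NP is a (form, position, overt) triple; covert NPs are exempt."""
    return all(position in CASE_POSITIONS
               for form, position, overt in nps if overt)

# (28a) it seems [he is rich]: 'he' is the subject of a finite clause -- passes
print(case_filter([("it", "subject-of-finite-clause", True),
                   ("he", "subject-of-finite-clause", True)]))   # True
# (28b) * it seems [him to be rich]: 'him' sits in a Caseless position
print(case_filter([("it", "subject-of-finite-clause", True),
                   ("him", "subject-of-infinitive", True)]))     # False
# A covert subject (PRO) in an infinitive is ignored by the filter
print(case_filter([("PRO", "subject-of-infinitive", False)]))    # True
```

Even this toy version shows why the Case Filter subsumes the for-to filter: nothing about the complementiser for needs to be mentioned, because an overt subject of an infinitive fails the position check regardless of what precedes it.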
Even though the specific filters they proposed in this paper were highly descriptive in nature, many of them laid the foundations of much of the work done in the 1980s, which reached a level of explanation not until then achieved. Much of this later work was also based on filters, though filters of a far more general kind than those originally proposed.

4 Is Derivation Necessary?

Finally in this lecture we turn to an issue that the notion of filters gives rise to and which has been the source of much disagreement since they were proposed in the 1970s. As we have already mentioned, although filters and constraints work in different ways, so that it is not always possible to replicate the effects of a particular constraint by a particular filter, and vice versa, it is always possible to replicate the effects of a constraint by a grammar that does not make use of constraints at all, but uses only filters. For example, consider the Complex NP Constraint, which claimed that no transformation could move any element out of a clause contained in an NP. Such a constraint, being a constraint on transformations, is in effect a constraint on which D- and S-structures can be paired. It would follow that no filter could do the same job, as filters only examine the S-structure and not the D-structure. However, with the development of the trace convention, certain aspects of D-structure came to be encoded in S-structure, as the trace marks the original position of the moved element. Hence an S-structure annotated with traces is like a D- and an S-structure collapsed into one. Because of this, it is possible to come up with a filter which does the same job as the Complex NP Constraint:

(29)  The Complex NP Filter
      * … XP1 … [NP … [S … t1 …

I present this only as a demonstration of what traces allow us to do, rather than as a serious proposal. The point is that traces, by making D-structures visible at S-structure, in many ways negate the need for D-structures altogether.
One might have thought that if we do not need D-structures, and therefore transformations and derivations, it would be simpler to do away with them. However, there are two main arguments that are used by transformationalists to support the derivational approach. The first is the obviously surface-based, descriptive nature of filters, which often merely translate observations into grammatical mechanisms. As we have seen, this is not a necessary property of filters, and hence it remains to be shown whether or not filters can handle all grammatical phenomena in a more explanatory way. The second argument is that in order to achieve full descriptive power, non-derivational theories always have to adopt mechanisms that do the same sort of things that transformations do. For example, suppose we generate structures with traces in place, rather than having them inserted by a movement rule. There will have to be a set of filters which ensure that any 'moved' element is always associated with a trace in the relevant position. Transformationalists would claim, then, that the mechanisms that allow traces to be generated, plus those which ensure their valid inclusion in a structure, add up to a mechanism equivalent to a transformation, and hence no real simplicity is gained.

Nonetheless, non-transformational theories have been proposed, starting from the end of the 1970s. An early example is Lexical Functional Grammar, which assumes two levels of representation for syntactic expressions: one, an f-structure, which is not a constituent structure but which represents semantic relationships between the elements of the expression, such as what is the subject and the object of a given verb; and the other, a c-structure, which is a simplified constituent structure that does not indicate movement at all. Another example is Generalised Phrase Structure Grammar, which had various mechanisms for dealing with apparently displaced elements, including slash categories, which we mentioned a few weeks ago.
More modern theories also seem to be divided along these lines. The Minimalist Programme, which started in the early 1990s, is a derivational theory which assumes just one syntactic level of representation, with transformations of a general kind being involved in the step-by-step building of that structure. It makes use of one very general filter, Full Interpretation, which is satisfied only if every element in the structure is able to receive a valid interpretation with respect to the position it is placed in. In Optimality Theory, which also started in the early 1990s, grammaticality is determined solely by filters; only, instead of defining grammaticality by absolute conformity to the filters, the filters are used to determine which of a number of possible structures is best, and hence grammaticality is a relative rather than an absolute matter. In this way the filters are kept as general as possible, as they do not necessarily describe exact surface phenomena.

The fact that both derivational and representational theories still survive is an indication that the debate between the two has been inconclusive. In fact, it is very difficult to argue in theoretical terms between them, and virtually impossible to distinguish between them on empirical grounds. In this situation we are left with Chomsky's original suggestion, made in 1957: the theories have to be continually developed, as explicitly as possible, to see if anything materialises which might argue for or against either one. We appear to be a long way off from this position.

References

Bresnan, Joan W. 1970. On complementisers: toward a syntactic theory of complement types. Foundations of Language 6, 297-321.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam and Howard Lasnik 1977. Filters and control. Linguistic Inquiry 8, 425-504.
Fiengo, Robert 1974. Semantic Conditions on Surface Structure. Unpublished doctoral dissertation, MIT, Cambridge, Mass.
Leben, William 1973. Suprasegmental Phonology. Unpublished doctoral dissertation, MIT, Cambridge, Mass.
Perlmutter, D. 1971. Deep and Surface Constraints in Syntax. New York: Holt, Rinehart and Winston.
Rosenbaum, Peter S. 1967. The Grammar of English Predicate Complement Constructions. Cambridge, Mass.: MIT Press.
Ross, John R. 1972. Doubl-ing. In J. Kimball (ed.), Syntax and Semantics 1, 157-186. New York: Seminar Press.
Vergnaud, Jean-Roger 1977. Letter to Noam Chomsky and Howard Lasnik. http://norvin.dlp.mit.edu/~norvin/24.902/Vergnaud.pdf