Undergraduate Dissertation – Extended Proposal

Provisional title: “Paraphrasing metonymic verbs using vector space models.”

1. Research question

The aim of the study is to develop a model which will allow the automatic paraphrasing of constructions that employ metonymic verbs. Consider the following sentences:

(1) The cook finished eating the meal.
(2) The cook finished the meal.

In sentence (1) the aspectual verb ‘finish’ combines with a Verb Phrase denoting the event of eating a meal; thus (1) refers to the termination of an event. Contrast this with (2), where ‘finish’ instead combines with a Noun Phrase referring to a specific meal. The resulting sentence concerns the termination of an unspecified event involving ‘the meal’. Interestingly, the structure of (2), [NP[V[NP]]], does not include an event whose termination the sentence could be referring to. Arguments that can pair with the aspectual verb ‘finish’ are restricted to those with temporal or eventive meanings. This restriction is not directly satisfied by ‘the meal’, yet human judges are able to make sense of (2). Katsika et al. (2012) suggest that the fact that sentences like (2) make sense despite this conflict means that “a temporal/eventive argument is supplied to [aspectual verbs] at some point during the interpretation of the sentence” (p. 59). Jackendoff (1997) describes logical metonymy as an instance of “enriched composition” (p. 49), and Utt et al. (2013) succinctly define it as consisting of an event-selecting verb combining with an entity-denoting noun. The interpretation of sentences like (2) entails the recovery of a covert event (e.g. eating, making, cooking). My study aims to perform this recovery using distributional semantics, extracting data from the British National Corpus.

2. Motivation for the study

My interest in focusing on verbs stems partly from the fact that other aspects of language have received more attention in past computational studies of semantics.
Existing computational accounts of metonymy in the literature sometimes explore other instances of metonymy, such as those involving toponyms or proper names in general (Markert and Nissim 2006). Additionally, the abundance of psycholinguistic studies of verbal metonymy, as opposed to the relative scarcity of papers from a computational / distributional semantics point of view, encourages me to pursue my research question. The frequency with which metonymy occurs in natural language, and the ease with which humans interpret it through context and knowledge of the world, make metonymy an interesting challenge to model computationally.

3. Research context

Shutova et al.’s 2012 paper on using techniques from distributional semantics to compute likely candidates for the meanings of metaphors has been a major influence in the preparation of this extended proposal. The methodology introduced below echoes their study in part, particularly in prompting me to think about possible obstacles and improvements with regard to the ranking algorithm. Additionally, it may be useful to return to this study in order to evaluate my results, seeing how they compare to the state-of-the-art 0.52 precision they obtain on a very similar task.

In the literature, the interpretation of metonymic sentences has been described in terms of a type clash between the “verb’s selectional restrictions and the noun’s type” (Utt et al. 2013). Psycholinguistic studies conducted on this feature of language include McElree et al. (2001) and Traxler et al. (2002). The latter tested combinations of metonymic and non-metonymic verbs with both entity- and event-denoting nouns (e.g. The cook [finished / saw]V [the meal / the fight]NP). The study found that sentences featuring a metonymic verb and an entity-denoting object (‘The cook finished the meal’ – the “coercion combination”) involved higher processing costs.
It remains to be seen what role such psycholinguistic studies and human judgements will play in the analysis of my data. At the moment this serves to emphasize that my background reading has not been restricted to the computational domain, and to show awareness of the accounts researchers have given for metonymy (regardless of whether their approach was based on experiments centred on human evaluation or on distributional semantics).

4. Methodology

In broad terms, the study follows three steps: first, the derivation of potential interpretations for candidate sentences using distributional techniques; second, the word sense disambiguation of these interpretations; and finally, an evaluation of the study. This last phase may take the form of an informal verification by the author of whether the model is returning sensible possibilities, or of a more rigorous assessment through Amazon Mechanical Turk.

A first step is the identification of verbs which occur most frequently in, and lend themselves easily to, metonymic constructions. Utt et al. (2013) ask the questions “What is a metonymic verb?” and “Are all metonymic verbs alike?” (p. 31). They develop empirical answers to these questions by introducing a measure of eventhood which captures the extent to which “verbs expect objects that are events rather than entities” (p. 31). Utt et al. provide both a useful list of metonymic verbs and one of non-metonymic verbs. The list builds on the datasets provided by two previous psycholinguistic studies: Traxler et al. (2002) and Katsika et al. (2012). This gives me both a starting point and an empirical account of which verbs are most suitable for study and which are best for a control condition (particularly for any human judgement tasks that I may wish to perform as evaluation).
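Utt et al.’s actual eventhood measure is distributional and I will not reproduce it here, but the intuition can be sketched in a few lines. The following is a toy illustration of my own, not their method: the set of event-denoting nouns and the object lists are invented placeholders rather than BNC data.

```python
# Toy sketch of the intuition behind an eventhood score: the fraction of a
# verb's direct objects that denote events rather than entities.
# NOTE: the event-noun set and the object lists below are invented
# placeholders, not Utt et al.'s (2013) measure or real corpus data.

EVENT_NOUNS = {"fight", "meeting", "war", "lecture"}  # assumed event-denoting nouns

def eventhood(direct_objects):
    """Fraction of observed direct objects that denote events."""
    if not direct_objects:
        return 0.0
    eventive = sum(1 for noun in direct_objects if noun in EVENT_NOUNS)
    return eventive / len(direct_objects)

# A metonymic verb like 'finish' takes a mix of eventive and entity objects;
# a non-metonymic control like 'see' takes mostly entity objects.
finish_objects = ["meal", "fight", "meeting", "book", "war"]
see_objects = ["meal", "house", "fight", "book", "car", "tree"]

print(eventhood(finish_objects))         # 0.6
print(round(eventhood(see_objects), 2))  # 0.17
```

A higher score for ‘finish’ than for ‘see’ is the pattern that would motivate placing the former on the metonymic list and the latter in the control condition.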
The existence of the list is useful since it allows me to bypass the ongoing debate regarding whether individual verbs lend themselves to metonymy (a debate approached by theorists (Pustejovsky 1991), psycholinguists (McElree et al. 2001), and computational linguists (Lapata et al. 2003)) and instead concentrate on my research question.

4.1 Creating vectors

An approach that I think will work well is a traditional bag-of-words vector space. The target word will be each metonymic verb whose distribution I wish to analyse (the number of verbs under consideration will be between one and three, per my advisor’s recommendations). The space, constructed using the British National Corpus, will use the 2000 most frequent context words as dimensions (following Erk and Padó 2008). However, I will adjust Erk and Padó’s surface window size. They originally used a size of 10 words, yet after reading Shutova et al.’s (2012) arguments for a smaller surface window (backed up by their experimental success) I will be using a window of 3 words, more suitable for the present task. This shares some similarities with Lapata and Lascarides’ (2003) approach, which was reimplemented by Shutova et al. (2012). I am hopeful that the use of a newer version of the BNC (cf. Lapata and Lascarides) and a more robust parser (cf. Shutova et al.’s use of RASP – I will be using NLTK’s parser trained on a portion of the Penn Treebank) will yield accurate data.

I have yet to establish all of the finer details of the ranking algorithm, but have some ideas about how it will function. The time-tested metric of cosine similarity between target and candidate vectors is an obvious starting point for generating a ranking of paraphrases. However, some of my background reading (namely Shutova et al. 2012 and Utt et al. 2013) leads me to believe that this will not be sufficient on its own.
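The vector-construction and ranking steps just described can be sketched as follows. This is a minimal illustration under stated assumptions: the toy corpus stands in for the parsed BNC, a hand-picked context-word set stands in for the 2000 most frequent context words, and the candidate list is invented for the example.

```python
from collections import Counter
import math

WINDOW = 3  # surface window of 3 words, following Shutova et al. (2012)

def build_vector(target, tokens, context_words):
    """Bag-of-words vector: co-occurrence counts of context words
    within WINDOW tokens of each occurrence of the target word."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo, hi = max(0, i - WINDOW), min(len(tokens), i + WINDOW + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] in context_words:
                vec[tokens[j]] += 1
    return vec

def cosine(v1, v2):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v1[k] * v2[k] for k in v1)
    n1 = math.sqrt(sum(x * x for x in v1.values()))
    n2 = math.sqrt(sum(x * x for x in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

# Invented toy corpus standing in for the BNC; in practice the dimensions
# would be the 2000 most frequent context words (Erk and Padó 2008).
tokens = ("the cook finished eating the meal the guest finished eating "
          "the meal the cook finished the meal quickly").split()
context_words = {"cook", "guest", "meal", "eating", "quickly"}

target_vec = build_vector("finished", tokens, context_words)

# Rank candidate covert events for 'finished the meal' by cosine similarity.
candidates = ["eating", "quickly", "drinking"]
ranking = sorted(candidates,
                 key=lambda c: cosine(target_vec,
                                      build_vector(c, tokens, context_words)),
                 reverse=True)
print(ranking)  # ['eating', 'quickly', 'drinking']
```

Even on this toy corpus the limitation noted above shows: the non-eventive ‘quickly’ ranks close behind ‘eating’ purely because it shares contexts with the target, which is why cosine similarity alone is unlikely to be sufficient.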
Rather, it is possible that I will have to introduce some refinements to cut down on noise (Erk and Padó 2008 have some good ideas concerning the filtering of vectors rarely seen in the data) and implement word sense disambiguation at this stage.

4.2 Evaluation

The aforementioned Katsika et al. (2012) paper is an eye-tracking study which classes metonymic verbs into “aspectual” and “psychological” predicates. Katsika et al. isolate the processing behaviour for each category and suggest two sources of eventive meaning: “compositional versus inferential” (p. 12). While not strictly relevant at this stage, such studies could help me account for results in the processing of metonymic verbs by human judges (that is, should the performance of my model be evaluated through Amazon Mechanical Turk or a similar system).

Alternatively, a simpler performance evaluation could be carried out using standardised tests. This would make for a simple validation for which human performance is already known. There is precedent for this approach in Rapp (2003) and Turney (2006), who test vector-based representations of meaning on TOEFL and SAT exams and compare their models to the average human score. (A note on this approach: I find that Rapp’s account of his model’s impressive 30% accuracy advantage over humans does not place nearly enough emphasis on the fact that the TOEFL is taken almost exclusively by speakers of English as a second language. It would make more sense to compare a model to exams taken primarily by native speakers, as done by Turney.)

References

Erk, K. & Padó, S. (2008). A Structured Vector Space Model for Word Meaning. Proceedings of the Conference on Empirical Methods in NLP, 897–906.
Jackendoff, R. (1997). The Architecture of the Language Faculty. MIT Press, Cambridge.
Katsika, A., Braze, D., Deo, A., & Piñango, M. M. (2012). Complement coercion: Distinguishing between type-shifting and pragmatic inferencing. The Mental Lexicon, 7(1), 58–76.
Lapata, M. & Lascarides, A. (2003). A Probabilistic Account of Logical Metonymy. Computational Linguistics, 29(2), 263–317.
Lapata, M., Keller, F., & Scheepers, C. (2003). Intra-sentential context effects on the interpretation of logical metonymy. Cognitive Science, 27(4), 649–668.
Markert, K. & Nissim, M. (2006). Metonymic proper names: A corpus-based account. Trends in Linguistics Studies and Monographs, 171-152.
McElree, B., Traxler, M., Pickering, M., Seely, R., & Jackendoff, R. (2001). Reading time evidence for enriched composition. Cognition, 78(1), 17–25.
Pustejovsky, J. (1991). The generative lexicon. Computational Linguistics, 17(4), 409–441.
Rapp, R. (2003). Word sense discovery based on sense descriptor dissimilarity. Proceedings of the Ninth Machine Translation Summit, 315–322.
Shutova, E., Van de Cruys, T., & Korhonen, A. (2012). Unsupervised Metaphor Paraphrasing using a Vector Space Model. Proceedings of COLING 2012, 1121–1130.
Traxler, M. J., Morris, R. K., & Seely, R. E. (2002). Processing subject and object relative clauses: Evidence from eye movements. Journal of Memory and Language, 47(1), 69–90.
Turney, P. D. (2006). Similarity of semantic relations. Computational Linguistics, 32(3), 379–416.
Utt, J., Lenci, A., Padó, S., & Zarcone, A. (2013). The curious case of metonymic verbs: A distributional characterization. Proceedings of the International Conference on Computational Semantics, 30–39.