Chapter 5
PROPOSED SYSTEM DESIGN

5.1. English Language Structure

English is a member of the West Germanic group of the Germanic subfamily of the Indo-European family of languages and is spoken by about 470 million people throughout the world. It is the most widely scattered of the great speech communities and the most commonly used auxiliary language in the world. The syntactic structure of a language is determined by its word order. Words are classified into 8 parts of speech (POS): noun, pronoun, adjective, verb, adverb, conjunction, preposition and interjection. The arrangement of these POS in a sentence is determined by the structure the language follows. English follows the Subject-Verb-Object (SVO) structure [2] [4]. The POS structure of English is shown in Figure 5.1.

Figure 5.1 POS Structure of English Language (Part of Speech (English): Noun, Pronoun, Verb, Adjective, Adverb, Preposition, Conjunction, Interjection)

5.2. Marathi Language Structure

Marathi is spoken primarily by the native people of Maharashtra, a state in the Indian subcontinent. About 90 million people speak Marathi worldwide. It is the oldest of the Indo-Aryan regional languages. It is thought to be approximately 1300 years old, and it is considered that the language evolved from Sanskrit and Prakrit (a group of languages spoken in ancient India) and that its syntax and grammar derive from Pali. Marathi words can be divided into the following categories: Naam, Sarvanam, Kriyapad, Visheshan, Shabdayogi Avyay, Kriya Visheshan Avyay, Ubhayanvayi Avyay and Kevalprayogi Avyay. The POS (SHABDANCHYA JATI) of Marathi are shown in Figure 5.2. Marathi follows the Subject-Object-Verb (SOV) structure [1] [3] [11].

Figure 5.2 POS Structure of Marathi Language

5.3. Part of Speech

Part of speech is the common name for a word class, a category into which words are placed according to the work they do in a sentence.

NOUN - Words that name persons, places and things are called nouns. Table 5.1 shows examples of nouns in English and Marathi.

Table 5.1 Nouns in English and Marathi language
Word class | English words | Marathi words
Noun | Sai, garden, cow | साई, बाग, गाय

PRONOUN - Pronouns refer to and replace nouns (the names of people, places and things) that have already been mentioned, or that the speaker/writer assumes are understood by the listener/reader. Table 5.2 shows examples of pronouns in English and Marathi.

Table 5.2 Pronouns in English and Marathi language
Word class | English word example | Marathi word example
Pronoun | I, you, he, she, ours, them, who | मी, तू, तो, ती, ते, आम्ही, कोण, मला

VERB - The verb identifies an action or a state of being. Table 5.3 shows examples of verbs in English and Marathi.

Table 5.3 Verb in English and Marathi language
Word class | English word example | Marathi word example
Verb | go, write, sing, reads | जाणे, लिहिणे, गाणे, वाचणे

ADJECTIVE - To talk or write about a person, place or thing, you use nouns like girl, house or tree. To give the reader a clearer picture of what you mean, you add "detail" words in front of the noun, like little, blue, rich, old. Words that tell more about nouns or pronouns are called adjectives. Table 5.4 shows examples of adjectives in English and Marathi.

Table 5.4 Adjective in English and Marathi language
Word class | English word example | Marathi word example
Adjective | big, small, black, one, clean | मोठा, लहान, काळा, एक, स्वच्छ

ADVERB - The adverb modifies a verb, an adjective or another adverb. Table 5.5 shows examples of adverbs in English and Marathi.
Table 5.5 Adverb in English and Marathi language
Word class | English word example | Marathi word example
Adverb | softly, often, lovely, very | हळुवार, कधीतरी, प्रेमळ, खूप

PREPOSITION - A preposition shows a relationship between a noun (or pronoun) and other words in a sentence. Table 5.6 shows examples of prepositions in English and Marathi.

Table 5.6 Preposition in English and Marathi language
Word class | English word example | Marathi word example
Preposition | up, over, against, by, for | वर, वरती, विरुद्ध, बाजूला/कडून, यासाठी

CONJUNCTION - Conjunctions, like prepositions, are joining words or connectives. They are used to join words, phrases or clauses. Conjunctions can be found in any position in a sentence except the very end. Table 5.7 shows examples of conjunctions in English and Marathi.

Table 5.7 Conjunction in English and Marathi language
Word class | English word example | Marathi word example
Conjunction | and, but, or, yet | आणि, परंतु, किंवा, पर्यंत

INTERJECTION - An interjection is a word or group of words used to express strong feeling or emotion. It can be an actual word or merely a sound, and it is followed by an exclamation mark (!) or a comma [2] [4]. Table 5.8 shows examples of interjections in English and Marathi.

Table 5.8 Interjection in English and Marathi language
Word class | English word example | Marathi word example
Interjection | ah, whoops, ouch, wow, alas | अबब, आहा, बापरे

5.4. Inflectional Properties of Words

Words are classified into two types depending on inflection.

Inflectional Words (विकारी शब्द):
· Noun (नाम)
· Pronoun (सर्वनाम)
· Adjective (विशेषण)
· Verb (क्रियापद)

Non-Inflectional Words (अविकारी शब्द):
· Adverb (क्रियाविशेषण अव्यय)
· Preposition (शब्दयोगी अव्यय)
· Interjection (उद्गारवाचक / केवलप्रयोगी अव्यय)
· Conjunction (उभयान्वयी अव्यय)

Words are inflected on the basis of a change in the following [6] [7]:
· Gender (Masculine, Feminine, Neuter) [लिंग]
· Multiplicity (Singular, Plural) [वचन]
· Tense (Present, Past, Future) [काळ]
· Case (Nominative, Accusative, Instrumental, Dative, Ablative, Genitive, Locative, Vocative) [विभक्ती]

5.5. Lexical Representation of Inflectional Words

The lexical representation of each inflectional word class is explained below [5] [8] [9] [10].

Noun inflection
Noun inflection is performed on the basis of a change in gender, multiplicity or case (Vibhakti). The inflection of a word can be determined from the word ending. The lexical representation of a noun is shown in Figure 5.3.

Figure 5.3 Lexical Representation of Noun in Database

The database structure for nouns is given in Table 5.9.

Table 5.9 Database structure for Nouns in English and Marathi language
English word | Marathi word | Gender | Number
Mango | [illegible] | 1 | 1
Mangos | [illegible] | 1 | 2

Verb inflections
Verbs are inflected for:
· Gender: Masculine, Feminine and Neuter.
· Number: Singular, Plural.
· Person: First, Second and Third.
· Tense: Present, Past and Future.
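The noun records of Table 5.9 could be held in memory roughly as in the sketch below. This is only an illustration, not the database design used in this work: the names `NounEntry`, `NOUN_LEXICON`, `lookup_noun` and `describe` are hypothetical, the Marathi forms are illustrative values, and the meaning of the numeric codes (1 = masculine, 2 = feminine, 3 = neuter; 1 = singular, 2 = plural) is an assumption inferred from the feature codes that appear in this chapter.

```python
from dataclasses import dataclass

# Numeric feature codes. This mapping is an assumption inferred from the
# chapter's tables and tags; the code 3 for neuter in particular is a guess.
GENDER = {1: "masculine", 2: "feminine", 3: "neuter"}
NUMBER = {1: "singular", 2: "plural"}


@dataclass(frozen=True)
class NounEntry:
    english: str  # English surface form, stored in lower case
    marathi: str  # Marathi surface form (illustrative value)
    gender: int   # key into GENDER
    number: int   # key into NUMBER


# Hypothetical rows in the spirit of Table 5.9.
NOUN_LEXICON = [
    NounEntry("mango", "आंबा", 1, 1),
    NounEntry("mangos", "आंबे", 1, 2),
]


def lookup_noun(english_word):
    """Return the first entry whose English form matches, or None."""
    key = english_word.lower()
    for entry in NOUN_LEXICON:
        if entry.english == key:
            return entry
    return None


def describe(entry):
    """Spell out the numeric feature codes of an entry as text."""
    return f"{entry.english}: {GENDER[entry.gender]}, {NUMBER[entry.number]}"
```

Under these assumptions, `lookup_noun("Mangos")` returns the plural entry and `describe` renders it as `mangos: masculine, plural`. A linear scan keeps the sketch short; an actual system would query the database instead.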
The lexical representation of a verb is shown in Figure 5.4.

Figure 5.4 Lexical Representation of Verb in Database

The database structure for verbs is given in Table 5.10.

Table 5.10 Database structure for verb in English and Marathi language
English word | Marathi word | Gender | Number | Person
write | [illegible] | 2 | 1 | 1
Writes | [illegible] | 1 | 2 | 2

Adjective inflection
An adjective is a word which is joined to a noun to qualify it. Inflection of an adjective depends upon gender, multiplicity and the attachment of postpositions to the noun modified by the adjective. When genitive case markers or some prepositions are attached to nouns, they produce adjectives. The lexical representation of an adjective is shown in Figure 5.5.

Figure 5.5 Lexical Representation of Adjective in Database

The database structure for adjectives is given in Table 5.11.

Table 5.11 Database structure for Adjective in English and Marathi language
English word | Marathi word | Gender | Number
big | [illegible] | 1 | 1
big | [illegible] | 2 | 1
big | [illegible] | 1 | 2

Pronoun inflection
A pronoun is a word that can be substituted for a noun or a noun phrase. Pronoun inflection is similar to noun inflection, but there are some special cases which need to be handled separately. For verbs such as "like", "want", "will", "need" and "would", the pronoun inflections differ from the general case. Some sentences have the same structure, parse tree, gender, multiplicity and case but a different translation of the pronoun. The lexical representation of a pronoun is shown in Figure 5.6.

Figure 5.6 Lexical Representation of Pronoun in Database

The database structure for pronouns is given in Table 5.12.

Table 5.12 Database structure for Pronoun in English and Marathi language
English word | Marathi word | Gender | Number | Person
he | [illegible] | 1 | 1 | 3
she | [illegible] | 2 | 1 | 3

5.6. Proposed System Design

The architecture of the proposed system is shown in Figure 5.7.
Figure 5.7 Proposed System Architecture

Input Unit
In this unit an English sentence is given as input to the system for translation into a Marathi sentence. For example, the English input sentence is: She is going.

Sentence Tokenizer
In this step the words and punctuation marks are separated into tokens. This is called tokenization (see footnote 1). The tokens are passed to the morphological analyzer unit. The given English sentence is separated into tokens as:
Word (0) = She
Word (1) = is
Word (2) = going

Footnote 1: S. B. Kulkarni, Karbhari V. Kale, “Tokenization in Natural Language Processing”, in Proceedings of the IEEE-sponsored International Conference “Advances in Computer Vision and Information Technology 07”, Aurangabad, 28-30 November 2007.

Morphological Analyzer and POS Tagging
In this step the morphological features are considered. The tokens are identified and part-of-speech tagging is done based on the database. For each word, the morphological analyzer generates the word with its full grammatical information: its word class (noun, verb, adjective and so on) along with inflectional properties such as gender, number and person. The tagged tokens are:
Word (0) = (PRP, 2, 1, 3)
Word (1) = (VBZ, 1)
Word (2) = (VBG, 1, 1)

Syntactic Parser
The input for this unit is generated by the morphological analyzer unit. The parser receives the tagged words from the morphological analyzer and checks that they compose a grammatically correct sentence. Only if the given sentence is grammatically correct is it sent on for further processing. The syntactic parse tree is:
[ROOT [S [NP [PRP She]] [VP [VBZ is] [VP [VBG going]]]]]

Marathi Sentence Generator
The structures of English and Marathi differ. The structure of English is Subject + Verb + Object (SVO), whereas the structure of Marathi is Subject + Object + Verb (SOV).
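The tokenization and tagging steps above, together with the SVO to SOV word-order difference, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the proposed system: `TAG_LEXICON`, `tokenize`, `tag` and `svo_to_sov` are hypothetical names, the in-memory dictionary stands in for the database, and the reordering function assumes the subject, verb and object have already been identified by a parser.

```python
import re

# Hypothetical tag lexicon: lower-cased word -> (POS tag, feature codes...).
# The three entries copy the tags listed in this chapter; a real system
# would read them from the database.
TAG_LEXICON = {
    "she": ("PRP", 2, 1, 3),
    "is": ("VBZ", 1),
    "going": ("VBG", 1, 1),
}


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def tag(tokens):
    """Pair each word token with its lexicon tag; unknown words get ("UNK",)."""
    words = [t for t in tokens if t[0].isalnum()]  # drop punctuation tokens
    return [(w, TAG_LEXICON.get(w.lower(), ("UNK",))) for w in words]


def svo_to_sov(subject, verb, obj):
    """Reorder already identified constituents (lists of words) to SOV."""
    return subject + obj + verb


tokens = tokenize("She is going.")
tagged = tag(tokens)
reordered = svo_to_sov(["Sai"], ["eats"], ["mango"])
```

Here `tokens` is `['She', 'is', 'going', '.']`, `tagged` pairs the three words with the tags shown above, and `reordered` is `['Sai', 'mango', 'eats']`. The last example uses an invented sentence with an explicit object, since "She is going" has none.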
While translating a sentence from English to Marathi, the reordering step plays an important role. In this unit the Marathi sentence is generated according to the rules in the database; the correct Marathi sentence is produced with the help of the reordering rules stored there. The output after reordering is given below:
[Marathi output not preserved in the source text]

Output Unit
In this unit the generated Marathi sentence is displayed. The output generated is:
[Marathi output not preserved in the source text]

Thus the given English input sentence is translated into a Marathi sentence with the help of the proposed system. Various sets of experiments were carried out, and each step of the proposed system is shown in the Experimental Work.

5.7. Summary

This chapter described the structure of the English and Marathi languages. It also presented the inflectional properties of the word classes and their lexical representation. Finally, it explained the architecture and working of the proposed system.

5.8. References

1. Navalkar, Ganpatrao, “The Student’s Marathi Grammar”, Asian Educational Services, 2001.
2. Wren and Martin, “New Edition High School English Grammar & Composition”, S. Chand.
3. Gowilkar, Leela, Marathiche Vyakran, Mehta Publishing House, 2006.
4. Walambe, M. R., Sugam Marathi Vyakaran Lekha, Nitin Prakashan, 2005.
5. Bharati, Akshar, Vineet Chaitanya, Rajeev Sangal, Natural Language Processing: A Paninian Perspective, Prentice-Hall of India, 1995.
6. B. Hettige, A. S. Karunananda, “Developing Lexicon Databases for English to Sinhala Machine Translation”, Second International Conference on Industrial and Information Systems, ICIIS 2007, 8-11 August 2007, Sri Lanka, ©2007 IEEE.
7. Promila Bahadur, A. K. Jain, D. S. Chauhan, “EtranS-English to Sanskrit Machine Translation”, ICWET 2012, Bombay, ACM, 2012; “English to Sanskrit machine translation semantic mapper”, International Journal of Engineering Science and Technology, Vol. 2(10), 2010, 5313-5318.
8. Uday C. Patkar, Prakash R. Devale, “Transformation Of Multiple English Text Sentences To Vocal Sanskrit Using Rule Based Technique”, International Journal of Computers and Distributed Systems, www.ijcdsonline.com, Vol. No. 2, Issue 1, December 2012, ISSN: 2278-5183.
9. B. Hettige, A. S. Karunananda, “Developing Lexicon Databases for English to Sinhala Machine Translation”, Second International Conference on Industrial and Information Systems, ICIIS 2007, 8-11 August 2007, Sri Lanka, 1-4244-1152-1/07/$25.00 ©2007 IEEE.
10. Remya Rajan, Remya Sivan, Remya Ravindran, K. P. Soman, “Rule Based Machine Translation from English to Malayalam”, 2009 International Conference on Advances in Computing, Control, and Telecommunication Technologies, 978-0-7695-3915-7/09 $26.00 © 2009 IEEE, DOI 10.1109/ACT.2009.113.
11. Pallavi Bagul, Archana Mishra, Prachi Mahajan, Medinee Kulkarni, Gauri Dhopavkar, “Rule Based POS Tagger for Marathi Text”, (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 5 (2), 2014, 1322-1326, ISSN: 0975-9646.