Chapter 5
PROPOSED SYSTEM DESIGN
5.1. English Language Structure
English is a member of the West Germanic group of the Germanic subfamily
of the Indo-European family of languages and is spoken by about 470 million people throughout
the world. It is the most widely scattered of the great speech communities and
the most commonly used auxiliary language in the world. The syntactic structure of a
language is determined by its word order. Words are classified into eight parts of speech
(POS) [noun, pronoun, verb, adjective, adverb, preposition, conjunction, interjection]. The
arrangement of these POS in a sentence is determined by the structure the
language follows. English follows the Subject-Verb-Object (SVO) structure [2] [4]. The
POS structure of the English language is shown in Figure 5.1.
[Figure: Part of Speech (English): Noun, Pronoun, Verb, Adjective, Adverb, Preposition, Conjunction, Interjection]
Figure 5.1 POS Structure of English Language
5.2. Marathi Language Structure
Marathi is the language spoken primarily by the native people of Maharashtra, a state in
the Indian subcontinent. About 90 million people speak Marathi worldwide. It
is the oldest of the Indo-Aryan regional languages. It is thought to be approximately 1300
years old, and it is considered to have evolved from Sanskrit and Prakrit (a
group of languages spoken in ancient India), with its syntax and grammar derived from Pali.
Marathi words can be divided into the following categories [Naam, Sarvanam, Kriyapad,
Visheshan, Shabdayogi Avyay, Kriya Visheshan Avyay, Ubhayanvayi Avyay,
Kevalprayogi Avyay]. The POS (SHABDANCHYA JATI) for the Marathi language are
shown in Figure 5.2. Marathi follows the Subject-Object-Verb (SOV) structure
[1] [3] [11].
Figure 5.2 POS Structure of Marathi Language
5.3. Part of Speech
Part of speech is the common name for a word class--a category into which words are
placed according to the work they do in a sentence.
NOUN -- Words that name persons, places, and things are called nouns. Table
5.1 shows examples of nouns in English and Marathi.
Table 5.1 Nouns in English and Marathi language
Word class | English words | Marathi words
noun | Sai, garden, cow | साई, बाग, गाय
PRONOUN -- Pronouns refer to and replace nouns (the names of people, places, and
things) that have already been mentioned, or that the speaker/writer assumes are
understood by the listener/reader. Table 5.2 shows examples of pronouns in
English and Marathi.
Table 5.2 Pronouns in English and Marathi language
Word class | English word example | Marathi word example
pronoun | I, you, he, she, ours, them, who | मी, तू, तो, ती, ते, आम्ही, कोण, मला
VERB -- A verb identifies an action or a state of being. Table 5.3 shows
examples of verbs in English and Marathi.
Table 5.3 Verb in English and Marathi language
Word class | English word example | Marathi word example
verb | go, write, sing, reads | जाणे, लिहिणे, गाणे, वाचणे
ADJECTIVE -- To talk or write about a person, place, or thing, you use nouns like girl,
house, or tree. To describe those nouns and give the reader a clearer picture of
what you mean, you add "detail" words in front of the noun, like little, blue, rich, old.
Words that tell more about nouns or pronouns are called adjectives. Table 5.4
shows examples of adjectives in English and Marathi.
Table 5.4 Adjective in English and Marathi language
Word class | English word example | Marathi word example
adjective | big, small, black, one, clean | मोठा, लहान, काळा, एक, स्वच्छ
ADVERB -- An adverb modifies a verb, an adjective, or another adverb. Table 5.5
shows examples of adverbs in English and Marathi.
Table 5.5 Adverb in English and Marathi language
Word class | English word example | Marathi word example
adverb | softly, often, lovely, very | हळुवार, कधीतरी, प्रेमळ, खूप
PREPOSITION -- A preposition shows a relationship between a noun (or pronoun) and other
words in a sentence. Table 5.6 shows examples of prepositions in English
and Marathi.
Table 5.6 Preposition in English and Marathi language
Word class | English word example | Marathi word example
preposition | up, over, against, by, for | वर, वरती, विरुद्ध, बाजूला/कडून, यासाठी
CONJUNCTION -- Conjunctions, like prepositions, are joining words or
connectives. They are used to join words, phrases, or clauses. Conjunctions can
be found in any position in a sentence except the very end. Table 5.7 shows
examples of conjunctions in English and Marathi.
Table 5.7 Conjunction in English and Marathi language
Word class | English word example | Marathi word example
conjunction | and, but, or, yet | आणि, परंतु, किंवा, पर्यंत
INTERJECTION -- An interjection is a word or group of words used to express
strong feeling or emotion. It can be an actual word or
merely a sound, and it is followed by an exclamation mark (!) or a comma [2] [4].
Table 5.8 shows examples of interjections in English and Marathi.
Table 5.8 Interjection in English and Marathi language
Word class | English word example | Marathi word example
interjection | ah, whoops, ouch, wow, alas | अबब, आहा, बापरे
5.4. Inflectional Properties of Words
Word classes fall into two types depending on whether they inflect:
Inflectional Words (विकारी शब्द)
· Noun (नाम)
· Pronoun (सर्वनाम)
· Adjective (विशेषण)
· Verb (क्रियापद)
Non-Inflectional Words (अविकारी शब्द)
· Adverb (क्रियाविशेषण अव्यय)
· Preposition (शब्दयोगी अव्यय)
· Interjection (उद्गारवाचक / केवलप्रयोगी अव्यय)
· Conjunction (उभयान्वयी अव्यय)
Words are inflected for changes in:
Gender (Masculine, Feminine, Neuter) [लिंग]
Multiplicity (Singular, Plural) [वचन]
Tense (Present, Past, Future) [काळ]
Case (Nominative, Accusative, Instrumental, Dative,
Ablative, Genitive, Locative, Vocative) [विभक्ती] [6][7]
5.5. Lexical Representation of Inflectional Words
The lexical representation of each inflectional word class is explained below [5] [8] [9]
[10].

Noun Inflection
Noun inflection is performed on the basis of a change in Gender, Multiplicity, or Case
(Vibhakti). The inflection of a word can be determined from the word ending. The
lexical representation for noun is shown in Figure 5.3.
Figure 5.3 Lexical Representation of Noun in Database
The database structure for noun is given in Table 5.9.
Table 5.9 Database structure for Nouns in English and Marathi language
English Word | Marathi word | Gender | Number
Mango | | 1 | 1
Mangos | | 1 | 2
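The rows of Table 5.9 can be pictured as records keyed by the English word. The following is a minimal Python sketch, not the actual database schema of the proposed system. The names `Gender`, `Number`, `NounEntry`, `NOUN_LEXICON`, and `lookup_noun` are hypothetical. The reading of the codes (1 = masculine or singular, 2 = feminine or plural) is an assumption inferred from the entries for "he" and "she" in Table 5.12, and the code 3 for neuter is likewise assumed. The Marathi field is kept as an empty placeholder.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Gender(IntEnum):
    MASCULINE = 1  # assumed meaning of code 1
    FEMININE = 2   # assumed meaning of code 2
    NEUTER = 3     # assumed; this code does not appear in the tables


class Number(IntEnum):
    SINGULAR = 1   # assumed meaning of code 1
    PLURAL = 2     # assumed meaning of code 2


@dataclass(frozen=True)
class NounEntry:
    english: str
    marathi: str   # placeholder, to be filled from the lexicon
    gender: Gender
    number: Number


# Hypothetical in-memory stand-in for the noun table (Table 5.9).
NOUN_LEXICON = {
    "mango": NounEntry("mango", "", Gender(1), Number(1)),
    "mangos": NounEntry("mangos", "", Gender(1), Number(2)),
}


def lookup_noun(word: str) -> Optional[NounEntry]:
    """Return the lexicon entry for a noun, ignoring case, or None."""
    return NOUN_LEXICON.get(word.lower())
```

Storing the codes as integer enumerations keeps the numeric values used in the tables while giving them readable names in code.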
Verb inflections
Verbs are inflected for:
Gender: Masculine, Feminine, and Neuter.
Number: Singular, Plural.
Person: First, Second, and Third.
Tense: Present, Past, and Future.
The lexical representation for verb is shown in Figure 5.4.
Figure 5.4 Lexical Representation of Verb in Database
The database structure for verb is given in Table 5.10.
Table 5.10 Database structure for verb in English and Marathi language
English Word | Marathi word | Gender | Number | Person
write | | 2 | 1 | 1
Writes | | 1 | 2 | 2
Adjective inflection
An adjective is a word which is joined to a noun to qualify it. Inflection of an adjective depends
upon gender, multiplicity, and the attachment of postpositions to the noun modified by the
adjective. When genitive case markers or some prepositions are attached to nouns, they
produce adjectives. The lexical representation for adjective is shown in Figure 5.5.
Figure 5.5 Lexical Representation of Adjective in Database
The database structure for adjective is given in Table 5.11.
Table 5.11 Database structure for Adjective in English and Marathi language
English Word | Marathi word | Gender | Number
big | | 1 | 1
big | | 2 | 1
big | | 1 | 2
Pronoun inflection
A pronoun is a word that can be substituted for a noun or a noun phrase. Pronoun
inflection is similar to noun inflection, but there are some special cases which need to
be handled separately. For verbs such as "like", "want", "will", "need", and "would", the
pronoun inflections are different from the general cases. Some sentences have the same
structure, with the same parse tree, gender, multiplicity, and case, but have a different
translation of the pronoun. The lexical representation for pronoun is shown in Figure 5.6.
Figure 5.6 Lexical Representation of Pronoun in Database
The database structure for pronoun is given in Table 5.12.
Table 5.12 Database structure for Pronoun in English and Marathi language
English Word | Marathi word | Gender | Number | Person
he | | 1 | 1 | 3
she | | 2 | 1 | 3
5.6. Proposed System Design
The architecture of the proposed system can be represented as in figure 5.7.
Figure 5.7 Proposed System Architecture

Input Unit
In this unit an English sentence is given as input to the system for translation into
a Marathi sentence. For example, the English input sentence is:
She is going.

Sentence Tokenizer
In this step the words and punctuation marks are separated into tokens. This is
called tokenization (footnote 1). These tokens are passed to the morphological analyzer unit.
The given English sentence is separated into tokens as:
Word (0) = She
Word (1) = is
Word (2) = going
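The tokenization step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tokenizer used in the proposed system: the names `TOKEN_PATTERN`, `tokenize`, and `words_only` are hypothetical, and the regular expression is one simple way to separate words from punctuation marks. The word list above omits the full stop, so the sketch filters punctuation before numbering the words.

```python
import re
from typing import List

# A token is either a run of letters, digits, or apostrophes (a word)
# or a single non-space character that is none of those (punctuation).
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9']+|[^\sA-Za-z0-9']")


def tokenize(sentence: str) -> List[str]:
    """Split a sentence into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(sentence)


def words_only(tokens: List[str]) -> List[str]:
    """Keep only the tokens that contain at least one letter or digit."""
    return [t for t in tokens if any(ch.isalnum() for ch in t)]


tokens = tokenize("She is going.")
for i, word in enumerate(words_only(tokens)):
    print(f"Word ({i}) = {word}")
```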
Footnote 1: S. B. Kulkarni, Karbhari V. Kale, "Tokenization in Natural Language Processing", in proceedings of the IEEE
sponsored International Conference "Advances in Computer Vision and Information Technology 07", held at
Aurangabad during 28-30 November 2007.

Morphological Analyzer and POS Tagging
In this step the morphological features are considered. The tokens are
identified and part-of-speech tagging is done based on the database. For each
word, the morphological analyzer returns its full grammatical information:
the word class (such as noun, verb, or adjective) along with its
inflectional properties (gender, number, and person).
The tagged tokens are shown as
Word (0) = (PRP, 2, 1, 3)
Word (1) = (VBZ, 1)
Word (2) = (VBG, 1, 1)
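A database-driven tagger of this kind can be sketched as a dictionary lookup. The Python below is an illustrative assumption, not the system's implementation: `LEXICON`, `UNKNOWN`, and `tag_tokens` are hypothetical names, and the `UNK` fallback is not described in the text. For the pronoun, the numeric fields appear to be gender, number, and person, matching the entry for "she" in Table 5.12. The meaning of the numeric fields in the verb tags is not spelled out, so they are copied as opaque codes.

```python
# Hypothetical lexicon: lowercased word -> tag tuple as printed in the text.
# The first field is the POS tag; the remaining fields are feature codes.
LEXICON = {
    "she": ("PRP", 2, 1, 3),
    "is": ("VBZ", 1),
    "going": ("VBG", 1, 1),
}

UNKNOWN = ("UNK",)  # assumed fallback for words missing from the lexicon


def tag_tokens(tokens):
    """Look up each token in the lexicon and return (token, tag tuple) pairs."""
    return [(tok, LEXICON.get(tok.lower(), UNKNOWN)) for tok in tokens]


for i, (tok, tag) in enumerate(tag_tokens(["She", "is", "going"])):
    print(f"Word ({i}) = {tag}")
```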

Syntactic Parser
The input for this unit is generated by the morphological analyzer unit. It
receives the tagged words from the morphological analyzer and arranges them into
a sentence structure, checking that the sentence is grammatically correct. Only if the
given sentence is grammatically correct is it sent for further processing. The syntactic
parse tree can be given as:
[ROOT
[S
[NP [PRP She]]
[VP [VBZ is]
[VP [VBG going]]]]]
Parse Tree
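The bracket notation of the parse tree can be read into a nested data structure with a short recursive routine. The sketch below is a hypothetical Python illustration (`parse_bracketed` and `leaves` are assumed names). It only reads a tree that has already been produced and does not perform the grammaticality check of the syntactic parser itself.

```python
import re


def parse_bracketed(text):
    """Parse '[LABEL child ...]' notation into nested (label, children) tuples."""
    tokens = re.findall(r"\[|\]|[^\s\[\]]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        assert tokens[pos] == "[", "expected '['"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(read_node())  # nested constituent
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        pos += 1  # consume the closing ']'
        return (label, children)

    return read_node()


def leaves(node):
    """Return (POS tag, word) pairs in left-to-right order."""
    label, children = node
    pairs = []
    for child in children:
        if isinstance(child, str):
            pairs.append((label, child))
        else:
            pairs.extend(leaves(child))
    return pairs


TREE = "[ROOT [S [NP [PRP She]] [VP [VBZ is] [VP [VBG going]]]]]"
tree = parse_bracketed(TREE)
print(leaves(tree))
```

Walking the leaves recovers the tagged word sequence, which is the form a later reordering step would operate on.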

Marathi Sentence generator
The structures of English and Marathi are different: English follows
Subject + Verb + Object (SVO), whereas Marathi follows
Subject + Object + Verb (SOV).
While translating a sentence from English to Marathi, the reordering step therefore
plays an important role. In this unit the sentences are generated in Marathi
according to the rules in the database. The correct Marathi sentence is
generated with the help of the reordering rules in the database. The output after
reordering is given below:

Output Unit
In this unit the generated Marathi Sentence is displayed. The output generated is
given as
Thus the given English input sentence is translated into a Marathi sentence with the help of
the proposed system. Various sets of experiments were carried out, and each step of the proposed
system is shown in the Experimental Work.
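The reordering performed by the Marathi sentence generator can be illustrated with a single hard-coded rule. The Python sketch below is an assumption for illustration only: the actual reordering rules are stored in the database and are not listed in this chapter. The function name `reorder_svo_to_sov`, the role labels, and the example sentence are hypothetical. Lexical transfer to Marathi words is omitted, so the output keeps the English words in SOV order.

```python
def reorder_svo_to_sov(constituents):
    """Reorder (role, text) pairs from S-V-O order to S-O-V order.

    Roles are 'S' (subject), 'V' (verb group), and 'O' (object).
    Constituents with the same role keep their relative order.
    """
    target_order = {"S": 0, "O": 1, "V": 2}
    # sorted() is stable, so ties preserve the input order.
    return sorted(constituents, key=lambda c: target_order[c[0]])


# Hypothetical English constituents; translation of the words is omitted.
svo = [("S", "Sai"), ("V", "writes"), ("O", "a letter")]
sov = reorder_svo_to_sov(svo)
print(" ".join(text for _, text in sov))  # prints "Sai a letter writes"
```

A sentence without an object, such as the running example "She is going", is left unchanged by this rule because the subject already precedes the verb group.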
5.7. Summary
In this chapter the structures of the English and Marathi languages are given. The
inflectional properties and the lexical representation of the word classes are
also presented. The architecture and working of the proposed system are explained.
5.8. References
1. Navalkar, Ganpatrao, “The Student’s Marathi Grammar”, Asian Educational
Services, 2001.
2. Wren and Martin, “New Edition High School English Grammar & Composition”, S.
Chand.
3. Gowilkar, Leela, Marathiche Vyakran, Mehta Publishing House, 2006.
4. Walambe, M.R, Sugam Marathi Vyakaran Lekha, Nitin Prakashan, 2005.
5. Bharati, Akshar, Vineet Chaitanya, Rajeev Sangal, Natural Language Processing: A
Paninian Perspective, Prentice-Hall of India, 1995.
6. B. Hettige, A. S. Karunananda, “Developing Lexicon Databases for English to
Sinhala Machine Translation”, Second International Conference on Industrial and
Information Systems, ICIIS 2007, 8-11 August 2007, Sri Lanka, ©2007 IEEE.
7. Promila Bahadur, A. K. Jain, D. S. Chauhan, “EtranS - English to Sanskrit Machine
Translation”, ICWET 2012, Bombay, ACM 2012; “English to Sanskrit machine
translation semantic mapper”, International Journal of Engineering Science and
Technology, Vol. 2(10), 2010, 5313-5318.
8. Uday C. Patkar, Prakash R. Devale, “Transformation Of Multiple English Text
Sentences To Vocal Sanskrit Using Rule Based Technique”, International Journal of
Computers and Distributed Systems, www.ijcdsonline.com, Vol. No. 2, Issue 1,
December 2012, ISSN: 2278-5183.
9. B. Hettige, A. S. Karunananda, “Developing Lexicon Databases for English to
Sinhala Machine Translation”, Second International Conference on Industrial and
Information Systems, ICIIS 2007, 8-11 August 2007, Sri Lanka, 1-4244-1152-1/07/$25.00 ©2007 IEEE.
10. Remya Rajan, Remya Sivan, Remya Ravindran, K. P. Soman, “Rule Based Machine
Translation from English to Malayalam”, 2009 International Conference on Advances
in Computing, Control, and Telecommunication Technologies, 978-0-7695-3915-7/09 $26.00 © 2009 IEEE, DOI 10.1109/ACT.2009.113.
11. Pallavi Bagul, Archana Mishra, Prachi Mahajan, Medinee Kulkarni, Gauri
Dhopavkar, “Rule Based POS Tagger for Marathi Text”, (IJCSIT) International
Journal of Computer Science and Information Technologies, Vol. 5 (2), 2014, 1322-1326, ISSN: 0975-9646.