Toward an Ontology of the Sumerian Language
Part 1. The Sumerian Language
§ 1. The Sumerian language has characteristics which it shares with many other languages of
the world, living or dead. Unfortunately, it does not seem to share its full set of
characteristics with any other known language. In other words, it has contradictory and
conflicting aspects and inconsistent or ambiguous relationships between its parts, and it
stands as a unicum among languages. Of course, that simply means that we Assyriologists do
not have, to date, a reliable reconstruction of Sumerian grammar. In fact, it is common
knowledge among scholars of third-millennium B.C. Mesopotamia that every Sumerologist has his
own grammar in mind, which differs from that of every other scholar in the field on this or
that point. So, in speaking about Sumerian, we have to deal with a sort of strange
grammatical monstrum (slides n. 1-3, which show the areas of Sumerian development).
Joking aside, there is one very good reason for the difficulties Sumerologists have in
reconstructing a reliable grammar of the Sumerian language, and it is historical in nature.
Throughout its history, from the very first attestations on the tablets of Uruk at the end
of the IVth millennium until the collapse of Sumerian culture at the beginning of the IInd,
the Sumerians never strove to express in their writing everything that was necessary, or
that we feel to be necessary, in order to read the text itself. To put it differently, they
wrote only the nucleus of the information, hence the term “nuclear writing”, and even though
they added more and more grammatical elements over time, they never regarded it as
obligatory to give the reader of a given text all the elements that would allow him to read
the text as it was conceived in the mind of the writer. The word which best describes the
attitude of the reader of a Sumerian text is “hospitalization”: the reader had to add, where
necessary, the grammatical elements which the writer had not written, because the writer
considered what had been written sufficient for anybody to understand his message and
consequently to “hospitalize” it grammatically (slides n. 4-11, which show examples of
Sumerian texts from each period).
§ 2. Typologically, the Sumerian language is agglutinative: the word (verb, substantive,
adjective etc.) we read in a given text is identical with the same word we find in the
vocabulary. To put it differently, in Sumerian we do not have the idea of “root”, that is, a
linguistic reality which bears a basic semantic value and which, as such, does not form part
of the vocabulary (in other words, the Sumerian language does not know any inflection). The
meaning is specified by morphemes added, agglutinated, to the unchangeable word in a fixed
order: after it in the case of the substantive, or before and after it in the case of the
verb. Moreover, generally speaking, each morpheme has only one meaning and as such is
transparent and immediately recognizable (slide n. 12).
For instance, in the Latin expression filiis, “to the children”, we can detect a root,
*fili-, and a suffix, *-is, which carries at least three different meanings: masculine,
plural and dative. In Sumerian we should add a different morpheme for each one of these
specifications; in this specific case we should have *dumu-nita2-ene-ra, that is,
*child-male-plural-dative, “to the children”. Together with the noun they refer to, the
morphemes form a group which is often called a “chain”, so we can speak of noun chains and,
as we shall see, of verbal chains. It is interesting to note that Sumerian has a two-class
system for the substantive: class A for gods and human beings, and class B for everything
else.
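The noun chain described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slot names (`gender`, `number`, `case`), the `MORPHEMES` table and both function names are hypothetical, and the only morphemes included are those of the starred example *dumu-nita2-ene-ra.

```python
# Hypothetical slot order for a noun chain, following the starred example in the text.
NOUN_SLOTS = ("gender", "number", "case")

# Illustrative morphemes taken from *dumu-nita2-ene-ra; not a full inventory.
MORPHEMES = {
    ("gender", "male"): "nita2",
    ("number", "plural"): "ene",
    ("case", "dative"): "ra",
}


def build_noun_chain(word, **features):
    """Attach one morpheme per feature to the unchanged word, in fixed slot order."""
    suffixes = [
        MORPHEMES[(slot, features[slot])]
        for slot in NOUN_SLOTS
        if slot in features
    ]
    return "-".join([word, *suffixes])


def gloss_noun_chain(chain):
    """Read a chain back: each suffix maps to exactly one meaning."""
    labels = {morpheme: value for (_slot, value), morpheme in MORPHEMES.items()}
    word, *suffixes = chain.split("-")
    return [word] + [labels[suffix] for suffix in suffixes]


chain = build_noun_chain("dumu", case="dative", number="plural", gender="male")
print(chain)                    # dumu-nita2-ene-ra
print(gloss_noun_chain(chain))  # ['dumu', 'male', 'plural', 'dative']
```

The order of the keyword arguments does not matter, because the chain is always assembled in the fixed slot order; this mirrors the one-morpheme-one-meaning transparency that the Latin suffix *-is lacks.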
§ 3. Sumerian is an ergative language. That means that the subject of a sentence with one
participant is morphologically identical with the patient of a sentence with two participants.
Let us give an example (slide n. 13):
lugal-e e2-Ø in(=i+n)-du3-Ø     The king built a temple
lugal-Ø i3-gin-Ø                The king went
In the first sentence we have two participants, the “temple” and the “king”, while in the
second only the “king”, so to speak, participates in the action. Now, it is evident that the
way in which the participants are treated morphologically is exactly the opposite of what
happens, for instance, in Latin, which is not an ergative language but has a
nominative-accusative structure:
rex templum exstruxit           The king built a temple
rex ivit                        The king went
In this case the word rex has its nominative case-marker, *-s (*reg-s), which means that it is
the subject of the action in both sentences and is therefore morphologically marked as such.
In Sumerian, on the other hand, the word lugal, “king”, is, as we have seen, treated
differently in the two instances from the morphological point of view. In the first sentence
lugal takes the case-marker *-e, called “ergative”, which means that the king is the actor
(from Greek ergázomai, “I work, I act”), while the word e2, “temple”, is left unmarked, or
rather is marked with the zero case-marker, which is called absolutive (in many ergative
languages the patient is left unmarked). In the second sentence lugal has no case-marker,
that is, the word is in the absolutive, because the king does not act upon something else;
if we want, he is exercising his power only on himself.
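The contrast between the two alignments can be made explicit with a small sketch. The role labels (`agent`, `patient`, `subject`) and the function names are assumptions introduced here for illustration; only the markers *-e and *-Ø and the two example sentences come from the text.

```python
def sumerian_marker(role):
    """Ergative-absolutive: only the agent of a two-participant clause is marked."""
    # The sole participant ("subject") and the patient share the zero marker.
    return "-e" if role == "agent" else "-Ø"


def latin_case(role):
    """Nominative-accusative: sole participant and agent pattern together."""
    return "accusative" if role == "patient" else "nominative"


def mark(noun, role):
    """Attach the Sumerian case-marker to a noun for the given role."""
    return noun + sumerian_marker(role)


# "The king built a temple": two participants.
print(mark("lugal", "agent"), mark("e2", "patient"))  # lugal-e e2-Ø
# "The king went": one participant.
print(mark("lugal", "subject"))                       # lugal-Ø
```

In this toy model `sumerian_marker` groups `subject` with `patient`, while `latin_case` groups `subject` with `agent`, which is exactly the difference the two pairs of sentences illustrate.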
§ 4. The verb is the part of the grammar that poses the biggest problems for a reliable
reconstruction of the function of the many morphemes we can detect in its structure.
First of all, Sumerian has a series of prefixes: one such morpheme seems to be necessary in a
finite verbal phrase and is always present in a finite verbal form, indicating perhaps that
the structure is to be understood as having an acting subject. To date, we do not have any
satisfactory explanation for these prefixes, and this remains the most troublesome and
problematic aspect of the entire Sumerian grammar.
Moreover, the Sumerian verb presents a characteristic which is admittedly rather rare among
the languages of the world and which can be labeled “verbal incorporation”. That simply
means that the verb, which is always found at the end of a given sentence, according to the
SOV pattern (subject-object-verb), incorporates, absorbs, so to speak, inside its structure
(we shall soon see how) not only the indication of ergative and absolutive, but also a series
of other noun-phrases which have been used by the speaker (slide n. 14).
For instance, in the sentence “The king drank beer in the garden with the general”, in the
verbal chain at the end of the sentence we should find, after a prefix which always starts a
verb-phrase: the morpheme for the ergative (lugal-e), *-n-, in preverbal position; the one
for the absolutive (kaš-Ø), marked with *-Ø in postradical position; and the infixes, called
dimensional in Sumerian grammar, of the comitative, *-n+da-, and of the locative, *-b+a-. So
the form should be translated literally:
(FINITE VERB)-him+with-it+in-he(past)-drink-(it)
At the beginning we have the prefix *i-, one of the series of prefixes, chosen here purely
for didactic purposes. Then we find the so-called dimensional infixes, which appear in a
fixed order (it is possible to find up to four such infixes in a verbal form, although the
average is two); in this case: comitative (with him)-locative (in it), recalling the two
noun-phrases in the sentence. Then we have the morpheme *-n-, in preverbal position,
recalling the ergative and indicating that: 1) it is third person of class A (human being);
2) it is a singular subject; 3) it is a past action. Finally, we find a zero-morpheme in
post-verbal position recalling the absolutive (kaš, beer).
So you can easily see how central the verbal chain is in the Sumerian language.
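The fixed slot order of the verbal chain can be sketched as follows. This is a minimal illustration, not a model of the full Sumerian verb: `SLOT_ORDER` covers only the six positions of the example, the class and function names are hypothetical, and the root is left as the placeholder `ROOT` because the text gives only its gloss.

```python
from dataclasses import dataclass

# Assumed slot order, limited to the positions that occur in the example sentence.
SLOT_ORDER = ["prefix", "comitative", "locative", "ergative", "root", "absolutive"]


@dataclass(frozen=True)
class Slot:
    name: str      # position in the verbal chain
    morpheme: str  # surface morpheme as cited in the text
    gloss: str     # literal translation of the morpheme


def order_chain(slots):
    """Arrange the morphemes in the fixed order of the verbal chain."""
    return sorted(slots, key=lambda slot: SLOT_ORDER.index(slot.name))


def gloss_chain(slots):
    """Produce the literal, morpheme-by-morpheme translation of the chain."""
    return "-".join(slot.gloss for slot in order_chain(slots))


# Morphemes of "The king drank beer in the garden with the general",
# deliberately listed out of order.
example = [
    Slot("root", "ROOT", "drink"),  # placeholder: the root is not cited in the text
    Slot("absolutive", "Ø", "(it)"),
    Slot("ergative", "n", "he(past)"),
    Slot("locative", "b+a", "it+in"),
    Slot("comitative", "n+da", "him+with"),
    Slot("prefix", "i", "(FINITE VERB)"),
]

print(gloss_chain(example))
# (FINITE VERB)-him+with-it+in-he(past)-drink-(it)
```

Whatever order the noun-phrases are supplied in, the chain comes out in the same sequence, which is what makes the verbal chain a compact summary of the whole sentence.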
Let us now pass to the description of the ontology created in order to represent, in a
knowledge-oriented base of information, the characteristics we have briefly described above.
Part 2. The ontology of Sumerian language
The construction of an ontology of the Sumerian language was part of a wider project of the
company Epistematica s.r.l. for testing "Semantic Web technology" and its application to
computational linguistics. In this context, the company had already completed a project to
build an ontology of Esperanto, trying to demonstrate the possibility of building a linguistic
parser with reasoning skills. After this experience, it seemed appropriate to test the
procedure on a language that was not artificial like Esperanto but natural, a language which
had an evolving grammar and a more complex history. The choice of Sumerian for the next step
was due to two reasons: firstly, it is a natural language, with all the problems of
"irrationality" that this entails, but one whose textual corpus is closed and can be treated
as a unit that cannot be further modified; secondly, the transparency of its grammar, owing
to its agglutinative character, allows an easier identification of morphemes. It is
important to stress that the next step should be the formal description of a living language
(e.g. Turkish). I started to write an ontology of Sumerian with Dr. Marco Romano, an expert
in the theory of the Semantic Web, who helped me to write the ontology using programs for
managing ontologies in OWL format: Protégé 3.2 and RacerPro 1.8.
An ontology is composed of two elements: a T-Box (Terminological Box) and an A-Box
(Assertional Box) (slide n. 15). The T-Box is the taxonomic part of the ontology, where the
concepts (and the hierarchy that exists among them) are defined, and where we formalize the
relationships among different concepts. It is the "shape" of the ontology. The A-Box is the
part of the ontology that contains the facts, where individual instances are classified as
belonging to a specific class and where properties are defined for the class of each
instance. It is the "substance" of the ontology. In the case of Sumerian (slide n. 16), the
T-Box is the Sumerian grammar, while the A-Box is represented by every Sumerian text that can
be formalized in the grammar described in the T-Box. The possibility of applying this system
to the Sumerian language (which is still so problematic for scholars) could be very
important: once the T-Box is formalized, and once we apply a great number of Sumerian texts
to it, we could let the machine tell us where our reconstruction of the grammar is right and
where, on the contrary, the reconstruction formalized in the T-Box cannot be applied to a
text. So, it would be possible to add new texts (and there are hundreds of thousands in
museums all over the world) to understand where our reconstruction of the language is right
and where we need to seek new grammatical solutions. Given that the ontology was an
"experiment", it was decided not to use a long text, but one which could show some of the
most common grammatical features of the Sumerian language, in order to obtain a small but
consistent and fully instantiated A-Box. After analyzing some texts, the choice fell on a
foundation brick of Ur-Namma, king and founder of the Third Dynasty of Ur,
who ruled in Mesopotamia between 2112 and 2095 BC (slide n. 17). This means (slide n. 18)
that in the T-Box I use, for example, the class of nominal chain (which has two sub-classes:
possessive and case-marker) and the class of verbal chain (which has, for example, the
sub-classes of prefix, dimensional infix and so on). The transliteration (that is, the
rendering in Latin characters of each cuneiform sign) and the translation of the text are as
follows (slides n. 19-20):
dNanna                     to the God Nanna
lugal-a-ni                 his king
Ur-dNamma                  Ur-dNamma
lugal-Urim5ki-ma-ke4       king of Ur
e2-a-ni                    his temple
mu-na-du3                  he built
bad3-Urim5ki-ma            the walls of Ur
mu-na-du3                  he built
This text represents our A-Box. The T-Box was formalized using the grammar in this text, and
after about two months of work the program could distinguish each grammatical element of the
text (slide n. 21); that is, it recognized, for example, that lugal-a-ni was a substantive +
possessive adjective:
dNanna                     noun (= god) + (dative case)
lugal-a-ni                 noun (= substantive) + possessive adjective (a-ni)
Ur-dNamma                  noun (= personal name)
lugal-Urim5ki-ma-ke4       noun (= substantive) + genitive (city name + ak) + ergative
e2-a-ni                    noun (substantive) + possessive adjective (a-ni)
mu-na-du3                  verbal chain: prefix (mu) + dimensional infix (na) + verbal root (du3)
bad3-Urim5ki-ma            noun (substantive) + genitive (city name + a<k>)
mu-na-du3                  verbal chain: prefix (mu) + dimensional infix (na) + verbal root (du3)
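The division of labour between T-Box and A-Box can be illustrated with a toy sketch in plain Python. It does not reproduce the OWL ontology or the reasoning done with Protégé and RacerPro: the class names follow those mentioned in the text, `VerbalRoot` is an assumed extra class, and the two recognition rules (for mu-na- and -a-ni) are deliberately incomplete, so that one form of the inscription is reported as unexplained.

```python
# T-Box: a toy concept hierarchy (class -> parent class).
# "VerbalRoot" is an assumption; the other names follow the classes cited in the text.
TBOX = {
    "NominalChain": None,
    "Possessive": "NominalChain",
    "CaseMarker": "NominalChain",
    "VerbalChain": None,
    "Prefix": "VerbalChain",
    "DimensionalInfix": "VerbalChain",
    "VerbalRoot": "VerbalChain",
}


def top_class(cls):
    """Walk up the hierarchy to the chain type a class belongs to."""
    while TBOX[cls] is not None:
        cls = TBOX[cls]
    return cls


def assert_facts(form):
    """Return A-Box assertions (individual, class) for one transliterated form."""
    parts = form.split("-")
    if parts[:2] == ["mu", "na"]:
        root = "-".join(parts[2:])
        return [("mu", "Prefix"), ("na", "DimensionalInfix"), (root, "VerbalRoot")]
    if parts[-2:] == ["a", "ni"]:
        return [("a-ni", "Possessive")]
    return []  # no rule applies: the toy grammar cannot account for this form


def analyze(text):
    """Split a text into A-Box assertions and forms the T-Box cannot explain."""
    abox, unexplained = [], []
    for form in text:
        facts = assert_facts(form)
        if facts:
            abox.extend(facts)
        else:
            unexplained.append(form)
    return abox, unexplained


abox, unexplained = analyze(["lugal-a-ni", "e2-a-ni", "mu-na-du3", "bad3-Urim5ki-ma"])
print(abox)
print(unexplained)  # ['bad3-Urim5ki-ma']
```

The list of unexplained forms is the interesting output: in the same spirit as the experiment, it points to the places where the formalized grammar would have to be extended, here with a rule for the genitive.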
As I said above, this ontology is only an experiment, an attempt; nevertheless, this work
shows that it is possible to apply the technologies of the Semantic Web to a natural language
as well. This seems to be the right track, and I am sure that these technologies will be able
to provide important new tools not only for Sumerian, but also for many other areas of
linguistics. This work has produced a final report (slide n. 22), available at
http://dx.doi.org/10.1683/ab0002, and an ontology in OWL format (Ur_Namma.owl), available at
http://dx.doi.org/10.1683/me0004.