Searching in Medieval Nordic texts
Odd Einar Haugen, 29 February 2016
University of Bergen
IMPORTANT NOTICE: This is a brief introduction to three search facilities for Medieval Nordic texts. Please note that the interface for all facilities may change without announcement, so these instructions may turn out to be outdated, at least in parts, in the not too distant future. Hopefully, it will retain some use, and if you would like to update the instructions or supplement them do not hesitate to contact me at <[email protected]>.
1. Reading texts & making morphological queries in the Menota archive
The Medieval Nordic Text Archive can be accessed without password or registration on the CLARINO website (which contains a number of other corpora):
http://clarino.uib.no/menota
The catalogue of the archive lists all available texts. They have been encoded at up to three levels, i.e. the facsimile, the diplomatic and the normalised level (indicated by Facs/Dipl/Norm in the table head), but the majority only on the diplomatic level, Dipl. The facsimile level means that the text is displayed very close to the source, the diplomatic level that it is slightly normalised (and thus easier to read), and the normalised level means that the orthography has been fully regularised. For the time being, this only applies to Old Icelandic and Old Norwegian texts.
Ten of the more than 30 texts in the archive have been lemmatised, indicated by Lemma in the table head, and nine of these have also been annotated for grammatical forms, Analysis. Of these, five are major Old Norwegian texts (AM 243 bα fol, DG 4–7, DG 8 II, Holm perg 6 fol and Holm perg 17 4to). The remaining morphologically annotated texts are either Old Icelandic or Old Swedish.
By clicking any file name in the second column of the catalogue you will be able to read it on screen. If it has been encoded on more than one level, you can view all levels side-by-side simply by selecting the boxes for Facs, Dipl and Norm below the title line.
Note that you do not need to have any specific font installed in order to read the texts in the archive, even on the facsimile level. The archive uses the Web Open Font Format (WOFF), meaning that an underlying font, Andron Mega Corpus, is being used to display all necessary characters on screen.
Credit: Most of the work on developing the display and search facilities in Menota has been done by Paul Meurer, Uni Research, Bergen.
1.1 Searching for words in all files (“free text search”)
Click on Query in the right column, stay in Basic search, and select Show query expression.
In the query field, write konongr within the square brackets: ["konongr"]
Then hit Run query, and you will get a list of all the 4044 occurrences of exactly this word (in this specific orthography) in the archive.
NB! Do not use “smart quotes”; it should be ["konongr"], not [“konongr”]
Now, in the query field, write ["en"] ["konongr"], hit Run query, and you will get the 44 examples of exactly this two-word combination.
So, continue with the query ["en"] ["konongr"] ["take"], hit Run query, and you will see the 7 examples of exactly this three-word combination.
So there are only 7 examples of this phrase in the archive? Unfortunately not. There are 7 examples of the exact orthographic combination en konongr take. If the manuscript has en konungr take or en konongr taki or en konungr taki, they will not come up. In other words, you need to be familiar with the orthography of the sources in order to make good queries.
The result will be displayed in a Key Word In Context (KWIC) format. Clicking on any word here makes a dialogue box pop up, and clicking on Show context in the top line of this box takes you to the exact page where the word is located. [For the time being, this only works in certain texts.]
1.2 Searching for words in annotated files
If you want to see all examples of the lemma konungr you have to restrict your query to the annotated files. For the time being, the majority of these are Old Norwegian, but Old Icelandic texts are being added, and there are a few Old Swedish files, too.
So, in the query field write: [lemma="konungr"] and hit Run query. The result is no less than 3240 hits. This list will include all forms of the word.
Since the annotated files have information about the grammatical forms, you may specify this, if you like, to narrow down your query.
So by typing [lemma="konungr" & msa=("cN" "nP")] you get all 39 examples of the lemma konungr in case Nominative and number Plural.
Fine! But you are not happy with the definite forms, so you narrow down your search to [lemma="konungr" & msa=("cN" "nP" "sI")], adding species Indefinite to the list, and suddenly you are down to 30 examples.
So what about any combinations of sá hinn in the texts, in whatever forms? Just type the query [lemma="sá"] [lemma="hinn"], and you get the 248 examples of these two lemmata (in this sequence) in all their variant forms.
Too much? Only interested in the feminine singular nominative? So type
[lemma="sá" & msa=("cN" "nS" "gF")] [lemma="hinn" & msa=("cN" "nS" "gF")]
and you get the 48 examples of su hin (with minor orthographic variations).
Now, are there any forms in preterite subjunctive of fara? If you type the query [lemma="fara" & msa=("tPT" "mSU" "nP")] you will get the 6 examples of the preterite subjunctive plural of this verb, irrespective of person, since you did not specify this. You might have added "p1" for the 1st person, however. The order of the categories does not matter: ("tPT" "mSU" "nP" "p1") is as good as ("p1" "nP" "tPT" "mSU").
From this point, you can make your own searches. Two things are useful to know, however: the orthography of the lemmata, and the grammatical abbreviations.
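The query patterns in this section are regular enough to be assembled by a small script. The following Python sketch is only an illustration: the helper names (word_token, lemma_token, sequence) are hypothetical and not part of Menota, and the spacing inside the generated expressions is an assumption about what the query field accepts. It merely builds strings that you would then paste into the query field.

```python
# Hypothetical helpers for composing Menota query strings (not part of Menota).
CURLY_QUOTES = "\u201c\u201d"  # the "smart quotes" the guide warns against


def _check(text):
    """Refuse curly quotes, since queries must use straight quotation marks."""
    if any(ch in CURLY_QUOTES for ch in text):
        raise ValueError(f"smart quotes are not allowed in queries: {text!r}")
    return text


def word_token(form):
    """Exact word form in manuscript orthography, e.g. ["konongr"]."""
    return f'["{_check(form)}"]'


def lemma_token(lemma, *features):
    """Lemma, optionally narrowed by msa features such as "cN" or "nP"."""
    expr = f'lemma="{_check(lemma)}"'
    if features:
        values = " ".join(f'"{_check(f)}"' for f in features)
        expr += f" & msa=({values})"
    return f"[{expr}]"


def sequence(*tokens):
    """Adjacent tokens, in the given order (spacing is an assumption)."""
    return " ".join(tokens)


print(sequence(word_token("en"), word_token("konongr"), word_token("take")))
print(lemma_token("konungr", "cN", "nP", "sI"))
```

Because the order of the msa features does not matter, the features can be passed to lemma_token in any order.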
1.3 Lemma orthography
In the annotation of the Old Norwegian texts, the orthography of lemmas follows the rules of the Old Norwegian Dictionary (Gammelnorsk Ordboksverk). These differ from standard Old Norse orthography in four major aspects:
1. Consonantal i is spelled i, e.g. siá, kveðia (not sjá, kveðja)
2. No distinction is made between o and ǫ, e.g. diofull and koma (not djǫfull vs. koma)
3. The diphthong ey is spelled øy, e.g. høyra (not heyra)
4. The h is not written before l, n or r, e.g. lutr, nakki and ringr (not hlutr, hnakki, hringr)
More than 5% of the lemmas are affected by these rules, so if you do not get the expected hits on a query, please remember to check these rules once more.
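The four rules above can be approximated mechanically. The Python sketch below is a rough illustration, not a tool used by Menota or Gammelnorsk Ordboksverk: the function name to_gno_spelling is hypothetical, and the handling of rule 4 is a simplifying assumption (only a word-initial h is dropped). It may help when a query with a standard Old Norse lemma returns no hits.

```python
import unicodedata


def to_gno_spelling(lemma):
    """Approximate the four lemma spelling rules (a rough sketch only)."""
    s = unicodedata.normalize("NFC", lemma)  # make sure o-ogonek is one character
    s = s.replace("ey", "\u00f8y")           # rule 3: ey is spelled øy
    s = s.replace("j", "i")                  # rule 1: consonantal i is spelled i
    s = s.replace("\u01eb", "o")             # rule 2: ǫ is not distinguished from o
    if s[:2] in ("hl", "hn", "hr"):          # rule 4: no h before l, n, r
        s = s[1:]                            # (assumption: word-initial only)
    return s


for standard in ("sj\u00e1", "dj\u01ebfull", "heyra", "hlutr", "hnakki", "hringr"):
    print(standard, "->", to_gno_spelling(standard))
```

The result is only a first guess at the lemma form; the dictionary itself remains the authority.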
1.4 List of abbreviations
The abbreviations are for the time being identical with those used in the XML annotation of texts in Menota.
Category        Features
Person          p1   p2   p3   pU
Number          nS   nP   nU
Tense           tPS  tPT  tU
Mood            mIN  mSU  mIP  mU
Finiteness      fF   fI   fP   fU
Voice           vA   vR   vU
Gender          gM   gF   gN   gU
Case            cN   cG   cD   cA   cU
Degree          rP   rC   rS   rU
Definiteness    sI   sD   sU
Individual abbreviations
A  = accusative
A  = active
c  = case
C  = comparative
D  = dative
D  = definite
F  = feminine
F  = finite
f  = finiteness
g  = gender
G  = genitive
I  = indefinite
I  = infinitive
IN = indicative
IP = imperative
M  = masculine
m  = mood
N  = neuter
N  = nominative
n  = number
P  = participle
p  = person
P  = plural
P  = positive
PS = present
PT = preterite
r  = degree
R  = reflexive
s  = definiteness
S  = singular
S  = superlative
SU = subjunctive
t  = tense
U  = undecided
v  = voice
Note that identical abbreviations are never used in the same context; for example A means active in the abbreviation vA, but accusative in the abbreviation cA.
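Each feature code in the list of abbreviations consists of a lower-case category letter followed by an upper-case (or numeric) value, which is why the same value letter can be reused across categories. The following Python sketch makes this explicit. It is an illustration only: the table is transcribed from the list above, and the names CATEGORIES and decode are hypothetical.

```python
# Category letter -> (category name, value codes), transcribed from the list above.
CATEGORIES = {
    "p": ("person", {"1": "first", "2": "second", "3": "third"}),
    "n": ("number", {"S": "singular", "P": "plural"}),
    "t": ("tense", {"PS": "present", "PT": "preterite"}),
    "m": ("mood", {"IN": "indicative", "SU": "subjunctive", "IP": "imperative"}),
    "f": ("finiteness", {"F": "finite", "I": "infinitive", "P": "participle"}),
    "v": ("voice", {"A": "active", "R": "reflexive"}),
    "g": ("gender", {"M": "masculine", "F": "feminine", "N": "neuter"}),
    "c": ("case", {"N": "nominative", "G": "genitive", "D": "dative", "A": "accusative"}),
    "r": ("degree", {"P": "positive", "C": "comparative", "S": "superlative"}),
    "s": ("definiteness", {"I": "indefinite", "D": "definite"}),
}


def decode(tag):
    """Split a feature code such as "cA" into (category, value)."""
    category, values = CATEGORIES[tag[0]]
    value = tag[1:]
    if value == "U":  # U means undecided in every category
        return category, "undecided"
    return category, values[value]


# The same value letter is disambiguated by the category letter in front of it.
print(decode("vA"))
print(decode("cA"))
```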
2. Morphological and syntactic queries in the PROIEL interface
Menotec was an infrastructure project funded by the Norwegian Research Council 2010–2012. Today, it is part of the Menota network and the texts annotated in the project are for the time being hosted at the University of Oslo. See
https://en.wikipedia.org/wiki/Menotec
The four texts which were annotated morphologically and syntactically as part of the Menotec project are now available on the PROIEL website (and also through INESS, as explained below):
http://foni.uio.no:3000
The PROIEL project was initiated by Dag T. T. Haug at the University of Oslo, see
http://www.hf.uio.no/ifikk/english/research/projects/proiel
PROIEL requires registration and password, but this is a very simple and quick process. Do not let this process put you off!
On the PROIEL website, you will be offered no less than 11 languages (from the classical and medieval period), but you will rush down to Old Norse, of course. In addition to the four Old Norwegian texts from the Menotec project, there will be two Old Norwegian texts added after the Menotec project, and an Old Icelandic one, the Eddic poems, from another, Icelandic project. They are grouped together under the heading of Old Norse (called non in ISO) in the usual meaning of this term, i.e. Old Norwegian (up to ca. 1350) and Old Icelandic (up to ca. 1500).
One of the texts, Upps DG 8 I, has not yet been fully annotated, but it is expected that it will have a full morphological and syntactic annotation before the summer of 2016.
Begin by choosing a manuscript, e.g. Strengleikar, and among the chapters of this manuscript, select Laustik. After having selected a part of a manuscript, it will be displayed on screen, usually over more than one page. Laustik is a short text, so the whole text will fit on one screen.
In the text, each sentence is indicated by a small digit in blue. Click one of them to get access to the full morphological and syntactical annotation of the sentence. The morphological annotation is displayed first. Below each word from the text (in bold), you will find (a) the word class, (b) the grammatical form, and (c) the lemma of each word. Scroll down to see the syntactic analysis of the text (see ill. 1 below).
While the morphological annotation and display is more or less self-explanatory, the syntactic analysis is less so. Menotec is using dependency analysis, which was implemented for a number of other texts as part of the PROIEL project.
Explained in very few words, dependency analysis assigns a syntactic function to each word in the sentence. From a configurational point of view, dependency analysis is simpler than phrase structure analysis, but unlike phrase structure analysis, each word is described with regard to its function in the sentence. Furthermore, dependency analysis does not have to deal with word order (although this is indexed), so it works well with languages of a comparatively free word order, including one of the most extreme cases in Old Norse, the skaldic poems.
Ill. 1. This is what the morphological and syntactical analysis of a sentence looks like in the PROIEL interface. Fear not – there are more complex sentences than this one!
What the analysis in ill. 1 says is that the verb ‘er’ is the PRED(ICATE) of the sentence, and that it has two dependents, the SUB(JECT) ‘lioð’ and the OBL(IQUE) ‘her’ (stating the locale), and that the word ‘Laustiks’ is an ATR (attribute) of ‘lioð’. See the appendix below for a full list of functions in the PROIEL scheme.
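For readers who find a data structure easier to grasp than a tree diagram, the analysis just described can be written down as a list of words, each pointing to its head. The Python sketch below is only an illustration of the idea of dependency analysis; the names Token, SENTENCE and dependents are hypothetical and do not reflect how PROIEL stores its data.

```python
from collections import namedtuple

# Each word carries its syntactic function and the position of its head
# (None for the root of the sentence).
Token = namedtuple("Token", "form function head")

# "Laustiks lioð er her", analysed as described for ill. 1.
SENTENCE = [
    Token("Laustiks", "ATR", 1),   # attribute of 'lioð'
    Token("lioð", "SUB", 2),       # subject of 'er'
    Token("er", "PRED", None),     # predicate, root of the sentence
    Token("her", "OBL", 2),        # oblique under 'er'
]


def dependents(sentence, head_index):
    """Return (function, form) for every word governed by the given head."""
    return [(t.function, t.form) for t in sentence if t.head == head_index]


print(dependents(SENTENCE, 2))  # the dependents of the predicate 'er'
print(dependents(SENTENCE, 1))  # the dependents of the subject 'lioð'
```

Note that the word order plays no role in the relations themselves; it only survives as the position of each word in the list.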
There is a search page in the PROIEL web application which is not very different from the one in the Menota archive. It can be accessed from Search in the top menu line.
Guidelines for the morphological and syntactic annotation of Old Norwegian (including a large number of practical examples) have been written by Odd Einar Haugen and Fartein Th. Øverland and can be downloaded here, in parallel versions in Norwegian (nynorsk) and English:
https://bells.uib.no/bells/issue/view/157 (Norwegian text)
https://bells.uib.no/bells/issue/view/158 (English text)
In the screenshot below, we have searched for the lemma “lióð” (which in an Old Norse dictionary would be “ljóð”) in the text Strengleikar, and having sub(ject) as its function. In other words, this is an example of a combined morphological and syntactic search, but the possibilities for the latter are rather limited. For the time being, this interface is best suited for morphological queries, as is the Menota interface explained above.
Ill. 2. The search interface in the PROIEL web application
Do you think it is too much of an investment to learn the basics of dependency analysis? Maybe not. On the PROIEL site, in addition to wonderful Old Norwegian texts, the whole corpus of the Eddic poems – one of the most important Old Norse texts – is available in a high-quality annotation. What about Gothic? It is all there.
3. Morphological and syntactic queries via the INESS portal
The INESS project is another infrastructure project funded by the Norwegian Research Council (2010–2016). It is headed by Victoria Rosén at the University of Bergen. Much of the work on developing the portal has been done by Paul Meurer, Uni Research, Bergen. INESS offers access to a large (and increasing) number of treebanks, including the Old Norse texts on the PROIEL website.
INESS can be accessed without registration or password through this link:
http://clarino.uib.no/iness
After having read the opening page, click on Treebank selection in the left column. This will take you to a list of available treebanks. Select Old Norse from the category of Languages.
This will take you to an overview of the contents, which is more or less identical to the one in the PROIEL web application.
Now, look for the same sentence in Strengleikar as suggested above! Select this text by clicking on non-strleik-dep (currently at the bottom of the screen) and in the next window, you will be offered a lot of metadata. Lo and behold! Click “Accept” in the dialogue box. By the way, the first three letters in non-strleik-dep are the above-mentioned ISO abbreviation for “Old Norse”.
Now, click on Sentence overview in the left column. You will now see the whole text displayed on the screen, from sentence 1 to sentence 2491.
You can in fact read the whole text here, although it would not be a particularly convenient way of reading it, unless you are a computer scientist. Other people prefer to go to the University Library in Uppsala and read it directly from the parchment. Or, faute de mieux, from the Menota site.
Now, go to sentence 957 by typing this number in the Go to row box.
Fine! You have come to the first sentence of Laustiks ljóð, which you might guess from the words Laustiks lioð er her ‘The lais of Laustik is [i.e. begins] here’.
How do you know that Laustiks ljóð, the fifth lais in Strengleikar, begins exactly here? Well, you do not know. However, assuming that you can access this text in another source, such as in the Menota archive, in PROIEL or in a printed edition of the text, you may find a characteristic word – like “Laustiks” – and search for this in INESS. Simply type a suitable word into the Query box:
[word="Laustiks"]
You will get one hit, with the ID number 957. Fine – this is it!
Click Reset in the Query box to delete the query, then type 957 in the Go to row box (as above). You will be back at the beginning of Laustiks ljóð.
Click on the sentence. The result will be as displayed in ill. 3 below.
Ill. 3. The INESS display of sentence 957 in Strengleikar – syntax only.
Now, move the cursor over the word, and it changes into a hand. Click on it, and a box crops up, as shown in ill. 4. This gives you exactly the same type of morphological information as in the PROIEL interface, ill. 1.
Ill. 4. The INESS display of sentence 957 in Strengleikar – with morphology, too.
3.1 Morphological queries
For a start, one can easily search for members of any word class by using the following abbreviations:
noun, common               = noun-com
noun, proper               = noun-prop
adjective                  = adj
pronoun, personal          = pron-pers
pronoun, reflexive         = pron-refl
pronoun, interrogative     = pron-intrg
pronoun, indefinite        = pron-indef
determiner, demonstrative  = det-dem
determiner, quantifier     = det-quant
determiner, possessive     = det-poss
verb                       = verb
adverb, general            = adv-gen
adverb, interrogative      = adv-intrg
preposition                = prep
conjunction                = conjnct
subjunction                = subjnct
infinitive marker          = inf-mark
interjection               = intject
unassigned                 = unassgn
foreign word               = foreign-word
The query should be typed in the search field. For example, the query
[pos="subjnct"]
will yield all subjunctions in the text, while
[pos="subjnct"] & [pos="conjnct"]
will yield all subjunctions and conjunctions in the text.
NB! In all INESS queries, use the ordinary quotation marks, "...", not the curly quotes “...” (the quotation marks are sometimes converted to curly ones behind your back).
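If you keep your queries in a text file or a word processor, curly quotes can slip in unnoticed. A minimal Python sketch for repairing them before pasting a query into the search field might look as follows; the function name straighten_quotes is hypothetical and not part of INESS.

```python
# Map the curly double quotes to the ordinary quotation mark.
QUOTE_MAP = str.maketrans({"\u201c": '"', "\u201d": '"'})


def straighten_quotes(query):
    """Replace curly double quotes with ordinary ones; leave the rest untouched."""
    return query.translate(QUOTE_MAP)


# A query whose quotation marks were converted to curly ones:
damaged = "[pos=\u201csubjnct\u201d]"
print(straighten_quotes(damaged))
```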
If you want to search for one or more specific morphological categories, they have to be specified with a suitable set of abbreviations. For example
[morph=("3" "sg" "pret" "ind" "act")]
is a query for words in 3rd person, singular, preterite, indicative, active
and
[morph=("3" "sg" "pret" "ind" "refl")]
is a query for words in 3rd person, singular, preterite, indicative, reflexive.
So, what you should do is to insert the morphological categories you are looking for using the abbreviations in the list below. Note that mood and finiteness have been merged, since a verb is either finite and has a specific mood, or it is non-finite and does not have any mood, but is either an infinitive or a participle. Also note that not all features in this list are actually used in the corpus.
(1) Person
first                            = 1
second                           = 2
third                            = 3
unspecified person               = unsp-pers
(2) Number
singular                         = sg
plural                           = pl
unspecified number               = unsp-numb
(3) Tense
present                          = pres
preterite                        = pret
unspecified tense                = unsp-tns
(4) Mood & Finiteness
indicative                       = ind
subjunctive                      = subjv
imperative                       = imp
indicative or subjunctive        = ind/subjv
indicative or imperative         = ind/imp
subjunctive or imperative        = subjv/imp
unspecified for mood             = unsp-mood
infinitive                       = inf
participle                       = part
unspecified for finiteness       = unsp-finit
(5) Voice
active                           = act
reflexive                        = refl
unspecified voice                = unsp-vc
(6) Gender
masculine                        = masc
feminine                         = fem
neuter                           = neut
masculine or feminine            = masc/fem
masculine or neuter              = masc/neut
feminine or neuter               = fem/neut
masculine or feminine or neuter  = masc/fem/neut
unspecified gender               = unsp-gen
(7) Case
nominative                       = nom
genitive                         = gen
dative                           = dat
accusative                       = acc
accusative or dative             = acc/dat
genitive or dative               = gen/dat
accusative or nominative         = acc/nom
oblique (i.e. non-nominative)    = obl
unspecified case                 = unsp-case
(8) Degree
positive                         = pos
comparative                      = comp
superlative                      = supl
unspecified degree               = unsp-deg
(9) Definiteness
indefinite                       = indef
definite                         = def
unspecified definiteness         = unsp-def
(10) Inflection
inflecting                       = infl
non-inflecting                   = non-infl
More examples! The following query
[morph=("sg" "masc" "nom")]
yields all words having the features singular, masculine, nominative.
However, these categories are found in several word classes, such as nouns, adjectives and determiners. So if you want to search for categories within a specific word class, you should begin by specifying the word class:
[pos="noun-com" & morph=("sg" "masc" "nom")]
[pos="adj" & morph=("sg" "masc" "nom")]
[pos="det-dem" & morph=("sg" "masc" "nom")]
Since all Menotec texts have a full morphological annotation, you can easily search for all instances of a certain lemma (headword). So,
[lemma="konungr"]
is a query for words of the lemma “konungr”, while
[lemma="hvat"] & [lemma="er"]
is a query for all instances of the two lemmas “hvat” and “er” (although not necessarily in the same part of the sentence).
NB! Please note that if you move from one query to another, the tree from the previous query may still be on the screen. Just click on Next match (or Previous match) and a result will be displayed – with the word highlighted in red.
Then, you might like to combine a query for a certain lemma with specific morphological categories, for example
[lemma="konungr" & morph=("sg" "gen" "def")]
which will yield all instances of the lemma konungr in singular, genitive, definite, typically forms like konungsins.
Important (again)!
Remember that the lemmas in the Old Norwegian texts follow the orthography used by Gammelnorsk Ordboksverk. This differs from the standard Old Norse orthography in four main ways, as specified in section 1.3 above.
So this is how you (for the time being) search for instances of genitive singular of the lemma jǫrð in the Old Norwegian corpus, remembering to use the spelling iorð:
[lemma="iorð" & morph=("sg" "gen")]
So why are there no hits for the following query?
[lemma="iorð" & morph=("sg" "gen" "def")]
Well, there can be an error in the way the query has been typed (an extra space somewhere, for example), a technical fault (horribile dictu!) – or, most likely, there simply are no instances of this particular form in the corpus. Which in itself is a valuable result.
Finally – good news! The sequence of morphological features is not fixed. Test it yourself, mixing features as you like:
[morph=("3" "sg" "pret" "ind" "act")]
[morph=("act" "sg" "pret" "ind" "3")]
Both queries give the same result. How practical!
3.2 Syntactic queries
Each query should be opened with the ‘>’ sign and then one or more of the functional categories in the vocabulary of dependency analysis. See the appendix below for an overview of the functions in use.
The query
>aux
yields all auxiliaries (aux) (which, by the way, is a heterogeneous function). Note that this is a query which will give a high number of matches – so do not forget to marvel over how fast the search is!
And the next query,
>pred #x >comp #y >pred
yields all predicates (pred) which dominate a complement (comp) which dominates a new predicate (pred)
Now a list of more complex queries:
>pred #x >obl & #x >sub
yields all predicates (pred) which dominate two sisters, an oblique (obl) and a subject (sub)
>pred #x >sub & #x >aux & #x >obj
yields all predicates (pred) which dominate a subject (sub), an auxiliary (aux) and an object (obj)
>obj #x >atr #a & #x >atr #b != #a
yields all objects (obj) which dominate two attributes (atr)
And, finally, a few more abstract (as it were) queries:
#x >~ #y
is the general query for a secondary dependency, such as between an external object (xobj) and a subject (sub), or between coordinated words
#x >~xsub #y
is a specific query for the secondary dependency external subject (xsub)
!(>sub)
yields all sentences without a subject (sub).
These are only some examples. Try to make your own combinations, and in case you are missing something, have a look at the full documentation in INESS:
http://clarino.uib.no/iness/page?page-id=INESS_Search
3.3 Combined morphological and syntactical queries
Each search should be opened with the ‘>’ sign and then one or more of the functional categories. These can be specified with lemma or morphological features, as they were explained in 3.1 above.
So, the query
>sub [lemma="dagr"]
yields all subjects (sub) with the lemma “dagr”.
And this query
>sub [morph=("pl" "masc" "nom" "indef") & lemma="sunr"]
yields all subjects (sub) with the morphology plural, masculine, nominative, indefinite (strong), and having the lemma “sunr”
>pred #x >obl & #x >sub [morph=("sg" "masc" "nom")]
yields all predicates (pred) which dominate two sisters, an oblique (obl) and a subject (sub) (with the morphology singular, masculine, nominative)
>pred #x >obl & #x >sub [morph=("sg" "masc" "nom") & lemma="hverr"]
yields all predicates (pred) which dominate two sisters, an oblique (obl) and a subject (sub) (with the morphology singular, masculine, nominative and having the lemma “hverr”)
And, finally, the query
>pred #x >obl [morph=("sg" "masc" "dat")] & #x >sub [morph=("sg" "masc" "nom")]
yields all predicates (pred) which dominate two sisters, an oblique (obl) (with the morphology singular, masculine, dative) and a subject (sub) (with the morphology singular, masculine, nominative).
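The combined queries above all follow one pattern: a function label, optionally followed by a bracketed specification, and several such edges tied to the same predicate #x. The Python sketch below assembles such strings. It is an illustration under stated assumptions: the helper names (node, edge, pred_with) are hypothetical and not part of INESS, and the spacing between the parts is an assumption about what the search field accepts.

```python
# Hypothetical helpers for composing INESS query strings (not part of INESS).

def node(lemma=None, morph=(), pos=None):
    """Bracketed specification, e.g. [morph=("sg" "masc" "nom") & lemma="hverr"]."""
    parts = []
    if pos:
        parts.append(f'pos="{pos}"')
    if morph:
        parts.append("morph=(" + " ".join(f'"{m}"' for m in morph) + ")")
    if lemma:
        parts.append(f'lemma="{lemma}"')
    return "[" + " & ".join(parts) + "]" if parts else ""


def edge(function, spec=""):
    """One dependency edge such as >sub, optionally with a specification."""
    return f">{function} {spec}".strip()


def pred_with(*edges):
    """A predicate #x which dominates every one of the given edges."""
    return ">pred #x " + " & #x ".join(edges)


print(edge("sub", node(lemma="dagr")))
print(pred_with(
    edge("obl", node(morph=("sg", "masc", "dat"))),
    edge("sub", node(morph=("sg", "masc", "nom"))),
))
```

The helpers only build text; whether a given combination returns any matches still has to be tested in the INESS search field.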
Confused?
Getting a little confused when receiving the results of a query? INESS will list all hits in the Menotec corpus, but grouped according to the four manuscripts. This will be the line of matches for the above query:
As you can see, the number of matches varies between the four manuscripts, even if they are roughly of the same length.
Also note that if you want to check matches in some of the other manuscripts, you may have to go through the license routine before getting access to the manuscript in question. You may be asked to accept the license even if you already have done that; just click “Accept” and then “Sentence overview” to get back to where you were.
Appendix: Syntactic functions in the PROIEL (and INESS) databases
This list of functions has been taken from the BeLLS annotation guidelines, pp. 199–201 (see the links in section 2 above).
In the Menotec project, the functions AG, PART, ARG, PER and ADNOM are not being used, but they have been included here for the sake of completeness.
Abbreviation  Name  Explanation
PRED  predicate
  - main verb in independent clauses
  - conjunction coordinating main verbs
  - main verb in dependent clauses (excluding indirect questions, see COMP)
  - empty verb when verb is missing
PARPRED  parenthetic predicate
  - interpolated sentence fragment, typically the main verb in direct speech
VOC  vocative
  - word of address
  - interjections
AUX  auxiliary word
  - the auxiliary verbs hafa, geta, fá
  - the auxiliary verbs munu, skulu, mega, vilja
  - infinitive marker
  - negation of main verb
  - introductory conjunction
  - the expletive particles of and um
  - the interrogative particle hvárt
SUB  subject
  - the subject of a verb (excluding oblique subjects, see ch. 7.3.3)
  - the accusative in accusative with infinitive constructions
OBJ  object
  - the object of a verb (excluding oblique subjects and noun phrases, see ch. 7.3.2)
OBL  oblique
  - oblique words under PRED (dative and genitive objects)
  - oblique subjects, in dative and accusative
  - indirect objects (always dative)
  - objects of prepositions
  - objects of adjectives that govern genitive or dative
  - expressions of source or goal with verbs of motion
XOBJ  external object
  - infinitives that have external subjects
  - predicatives, both to subjects and objects
COMP  complement
  - the subjunction in nominal clauses
  - the finite verb in indirect questions
  - the infinitive in accusative with infinitive constructions
  - infinitives as subjects or predicatives
  - the subjunction en ‘than’ in comparison
NARG  adnominal argument
  - noun arguments, objective genitive in particular
AG  agent
  - the logical subject in passive (and reflexive) constructions, typically a preposition phrase introduced by af
ADV  adverbial
  ADV is a function which can modify:
  - finite verbs
  - prepositions
  - adjectives
  - other adverbs
  ADV can be:
  - adverbs
  - preposition phrases
  - nouns in oblique cases
  - subordinate clauses
XADV  external adverbial
  - an adverbial that has an external subject
ATR  attribute
  - adjectives under nouns
  - possessives under nouns
  - determiners under nouns
  - restrictive relative clauses
APOS  apposition
  - words subordinated to nouns, but which have the same scope
  - non-restrictive relative clauses
PART  partitive
  - words or phrases that tell us to which group or whole a noun belongs
ARG  argument
  - OBL / OBJ
PER  peripheral
  - argument / adjunct
NONSUB  non-subject
  - OBJ / OBL / ADV
  - anomalies
ADNOM  adnominal
  - ATR / APOS / PART / NARG
REL  relative
  - restrictive relative clause (ATR) / non-restrictive relative clause (APOS)
XSUB  external subject
  - the external subject of XOBJ
PID  predicate identity
  - the secondary dependency between implicit and explicit verbs
  - a verb that shares arguments with another verb