MODS capture by Zotero as observed in American Social History Online
http://www.dlfaquifer.org/home

Preliminary notes for Aquifer Metadata Working Group
Laura Akerman, 2008-10-29, revised by Laura 2009-03-25
[email protected]

Issues arranged by Zotero field name:

creator: #4, #5
creatorTypes: #6
identifier: #2
itemType: #1, #9, #14, #15, #16
language: #3
place: #7
publisher: #8
title: #13
Tags tab: #10
no Zotero field mapping; don't we need one? #11, #12

Analysis arranged by MODS element; issues are numbered sequentially.

abstract

Mapping appears OK. MODS abstract maps to Zotero abstractNote. (Note that this kind of note appears as a field in the Info tab of Zotero, whereas other types of notes, such as those in the MODS "note" field, are "pushed" to the Notes tab.)

accessCondition

Mapping appears OK. MODS accessCondition, of any type, maps to the Zotero rights element.

classification

Mapping appears OK. The MODS classification element is mapped to a Zotero callNumber element.

extension

Mapping appears OK; no Zotero mappings for extension were found, or expected.

genre

Issue #1. MODS genre is only mapped if it matches one of the Zotero itemType types. Many more kinds of form/genre terms are used in that element, more detailed and from different angles (literary or film genres such as "cartoons" or "mystery stories", or physical genres such as "stereographs"). If Zotero could add this field to all itemType field sets, so that all genre terms could be captured there, that would be lovely. If it can't, could we consider mapping these terms to the Tags tab, along with subjects? (Can Zotero differentiate kinds of Tags, for subject and genre?)

Examples of Aquifer MODS records containing a <genre> element:

title: Journal of a voyage across the Atlantic: with notes on Canada & the United States, and return to Great Britain in 1844 (genre - Biography)
title: Every girl pulling for victory : Victory Girls : United War Work Campaign (genre - Posters)

identifier

Issue #2.
Cannot fully assess processing of the MODS identifier element to a Zotero identifier field (?) from the translator code; it apparently calls other code not present in the translator (processIdentifiers). It would be helpful to know how that code operates; then we can determine whether there are any issues.

Example Aquifer MODS records with identifier elements:

title: Every girl pulling for victory : Victory Girls : United War Work Campaign (genre - Posters)
   <identifier>msp00003</identifier>
title: Wild wild women
   <identifier type="local" displayLabel="Call number">091074</identifier>

Note: in Zotero, or when exported to either Zotero RDF or MODS, no identifier field appears.

language

Issue #3. No mapping of MODS language/languageTerm (with type attribute of either term or code) could be found. This is puzzling because Zotero has a "language" item field.

Example Aquifer MODS records with language elements:

title: Cortés Nos Chingó In A Big Way The Hüey (has two elements, for English and Spanish)
title: A travers la somme devastee : le cimetiere Allemand de Nontecourt, dans le fond, o droite, vue des ruines du village de Nontecourt, o gauche, des ruines du bourg de Monchy-la-Gache (French)

location

Mapping appears OK. MODS location/physicalLocation is mapped to the Zotero archiveLocation element. MODS location/url is mapped to the Zotero url element.

name

Issue #4. Some names are not coming through in the Zotero metadata record; only the date at the end of the name string appears. NACO Authority File names are often qualified by date.
Example: captured MODS for "Famous actresses of the day in America" from Aquifer shows

   Author: 1869-1935 , (first)

The MODS record has

   <name type="personal">
     <namePart>Strang, Lewis Clinton</namePart>
     <namePart type="date">1869-1935</namePart>
     <role>
       <roleTerm authority="marcrelator" type="text">creator</roleTerm>
     </role>
   </name>

It appears that the code deals appropriately with MODS name/namePart elements that have "family" or "given" type attributes (mapping them to Zotero creator.lastName and creator.firstName elements), but then assumes that any other namePart subelement should be stored in the variable "backup name" and run through the Zotero "cleanAuthor" utility. This is probably designed for namePart elements with no type attribute, which (if in AACR2 form) look like Lastname, Firstname M. I. However, two other defined MODS namePart type attributes are not dealt with: @type=date and @type=termsOfAddress. These need to be either specifically ignored or appended to the end of the name somehow (if this is possible in Zotero?). The results make it appear that the Zotero translator is processing them through "cleanAuthor" as if they were a name.

Issue #5. There is a comment in the code area where the Zotero "creator" field is mapped: "// TODO: institutional authors". Please follow through on this. Right now, MODS name elements with type "corporate" or "conference" are showing up in Zotero looking like this:

   Author: United States, (first)

where the MODS record has:

   <name type="corporate">
     <namePart>United States</namePart>
     <role>
       <roleTerm authority="marcrelator" type="text">creator</roleTerm>
     </role>
   </name>

Corporate bodies don't have a first name, so the (first) need not display.
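The behaviour requested in Issues #4 and #5 can be illustrated with a small sketch. The MODS translator itself is JavaScript; the Python below is only an illustration of the logic, not the translator's code. The function name creator_from_name and the output keys "name" and "singleField" are hypothetical, and joining multiple corporate nameParts with ". " is an assumption. Date and termsOfAddress parts are simply ignored here, which is one of the two options named above.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return an element's tag name without any XML namespace part."""
    return tag.rsplit("}", 1)[-1]


def creator_from_name(name_el):
    """Build a creator dict (hypothetical shape) from one MODS <name> element."""
    family, given, untyped = [], [], []
    for part in name_el:
        if local(part.tag) != "namePart":
            continue
        text = (part.text or "").strip()
        part_type = part.get("type")
        if part_type == "family":
            family.append(text)
        elif part_type == "given":
            given.append(text)
        elif part_type in ("date", "termsOfAddress"):
            continue  # qualifiers: ignored here rather than parsed as a name
        elif text:
            untyped.append(text)

    if name_el.get("type") in ("corporate", "conference"):
        # Assumption: institutional names go into one field, with no first name.
        return {"name": ". ".join(untyped), "singleField": True}
    if family or given:
        return {"lastName": " ".join(family), "firstName": " ".join(given)}
    # Untyped AACR2-style string: "Lastname, Firstname M. I."
    last, _, first = " ".join(untyped).partition(",")
    return {"lastName": last.strip(), "firstName": first.strip()}


# The two name elements quoted in Issues #4 and #5 (role elements omitted).
PERSONAL = ET.fromstring(
    '<name type="personal"><namePart>Strang, Lewis Clinton</namePart>'
    '<namePart type="date">1869-1935</namePart></name>'
)
CORPORATE = ET.fromstring(
    '<name type="corporate"><namePart>United States</namePart></name>'
)

print(creator_from_name(PERSONAL))
print(creator_from_name(CORPORATE))
```

With this logic the Strang record would yield a last name of "Strang" and a first name of "Lewis Clinton" instead of the date, and "United States" would be kept as a single-field name.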
Example Aquifer records containing corporate body names include:

Title: Olympic Boulevard, State Route 173, looking east from point 200 feet west of Irolo Street, Los Angeles County, 1940
Title: Organization and historical sketch of the Women's Anthropological Society of America

Issue #6. Only 3 terms, when found in MODS name/role/roleTerm, are mapped to Zotero's "creatorTypes", and these are only mapped if the "code" form of the term is used. Many more mappings are possible.

The current mapping handles code "edt" mapped to "editor", "ctb" mapped to "contributor", and "trl" mapped to "translator". The MODS roleTerm element can contain either a code or a term (governed by the "type" attribute), and a standard vocabulary used is the MARC Relators code list (http://www.loc.gov/marc/relators/relators.html), referred to in the MODS documentation. Below are examples of more mappings that could be made from this list. The Zotero team may wish to review the definitions in the MARC documentation to see whether they are in harmony with Zotero definitions or whether more mappings could be made.

MARC code   MARC term                       Map to Zotero
edt*        Editor                          creatorTypes.editor
ctb*        Contributor                     creatorTypes.contributor
trl*        Translator                      creatorTypes.translator
ive         Interviewee                     creatorTypes.interviewee
ivr         Interviewer                     creatorTypes.interviewer
drt         Director                        creatorTypes.director
aus         Author of screenplay, etc.      creatorTypes.scriptwriter
pro         Producer                        creatorTypes.producer
act         Actor                           creatorTypes.castMember
spn         Sponsor                         creatorTypes.sponsor
inv         Inventor                        creatorTypes.inventor
rcp         Recipient                       creatorTypes.recipient
prf         Performer                       creatorTypes.performer
cmp         Composer                        creatorTypes.composer
lbt         Librettist                      creatorTypes.wordsBy
ctg         Cartographer                    creatorTypes.cartographer
prg         Programmer                      creatorTypes.programmer
art         Artist                          creatorTypes.artist
cmm         Commentator                     creatorTypes.commenter
cwt         Commentator for written text    creatorTypes.commenter

* The code form is already mapped; only the term form would be a new mapping.

Example Aquifer records containing some of these terms:

Title: Map of city and county of San Francisco (Cartographer)
Title: Performance by Tito Vasconcelos (prf)
Title: White Eagle and Pura Fé sing Rudy Martin's songs (cmp, prf, as well as additional codes not mentioned above: mus, lyr, voc)

note

Mapping appears OK. The MODS note element is assigned to a variable and "pushed" to the Zotero Notes tab for this item.

originInfo

Mappings that appear OK:

The MODS originInfo/edition subelement is mapped to the Zotero edition field.
There are mappings from the MODS originInfo subelements copyrightDate, dateIssued, or dateCreated (in that order) to one Zotero date field.
MODS originInfo/dateModified is mapped to a Zotero lastModified element.
MODS originInfo/dateCaptured is mapped to a Zotero accessDate element.

Issue #7. MODS originInfo/place/placeTerm is mapped to the Zotero place field only if type="text".
It would be possible to use a table built from the MARC Code List for Countries (http://www.loc.gov/marc/countries/) to look up the text form when only type="code" is present here. For most MODS records, especially those mapped from MARC, a type="text" form is likely to be present, so this may not be worth the effort.

Aquifer records where only a type="code" form of originInfo/place/placeTerm is present in the MODS record:

Title: White Eagle and Pura Fé sing Rudy Martin's songs
Title: Biographical dictionary and portrait gallery of representative men of Chicago and the World's Columbian Exposition

Issue #8. MODS originInfo/publisher. Not sure about this one: it appears that MODS publisher maps to a Zotero "publisher" element, except when the Zotero itemType is "website" or "webpage", in which case it is mapped to Zotero "publicationTitle"! Is this because the "publisher" field is not defined for the "webpage" set of elements? If so, this is problematic on two fronts:

a. Items published on the web as webpages or websites can have publishers (entities responsible for the webpages or sites).
b. See under "physicalDescription" the note about setting itemType to Zotero webpage based on the value "electronic" in the MODS physicalDescription/form element. This means that, for example, digitized books could end up with an itemType of "webpage", and their publisher would not be captured.

This appears to have been partially improved since the first draft of this analysis. For records getting type "Website", the publisher element still does not appear as publisher, but it no longer shows up as "Website title" either. MODS <relatedItem type="host"> appears to be mapped to "Website title", which makes more sense.

Aquifer records whose Zotero itemType comes through as "website" or "webpage" and which have an <originInfo><publisher> element:

Title: Studies on Inbreeding. Publisher is The Wistar Institute of Anatomy and Biology
Title: The woman who wouldn't. Publisher is G.P. Putnam's Sons

Note that the publisher (in these cases, of the original item) does not show up anywhere in the Zotero record. Both of these items are actually digitized books.

part

Mapping to Zotero "volume", "issue", or "section" field: may be OK; not a complete mapping. The code maps part/detail elements that have type "volume", "issue", or "section" to Zotero fields with the same name. It uses variables, first looking for a part subelement of the relatedItem element, then looking at part as a top-level element. The code maps part/detail/number if it is present; otherwise it maps detail as text (is this possible?). It seems to ignore the part/detail/caption and part/detail/title subelements, as well as the part/detail "level" attribute. The Aquifer group may have more comment on this later, if and when we have time and examples of MODS records using the "part" element to test with.

Page(s): Seems OK. Maps start and end pages to a Zotero "pages" element, separated by a dash if start and end are different pages.

I have done a lot of hunting but have been unable to find Aquifer records using the top-level MODS part element. This is a newer MODS element and has apparently had limited application in the Aquifer metadata collections.

physicalDescription

Issue #9. Zotero is using the MODS physicalDescription/form element with @authority="marcform", where the content is "electronic", to set the Zotero itemType element to "webpage". This may have unintended effects, because almost anything from the web captured in Zotero could get the designation "electronic" (particularly if it is a MODS record converted from MARC, where this is mapped from a "fixed field code" that is widely used for web resources of all types). It would be better to omit this mapping.

Aquifer records containing <physicalDescription><form authority="marcform">electronic</form></physicalDescription> which are getting inappropriately mapped to the "Web Page" itemType instead of "Book":

Title: Studies on Inbreeding.
Publisher is The Wistar Institute of Anatomy and Biology
Title: The woman who wouldn't. Publisher is G.P. Putnam's Sons

recordInfo

Mapping is OK. There is a mapping of the content of recordInfo/recordContentSource to Zotero's source field, and a mapping of recordInfo/recordIdentifier to Zotero's accessionNumber field.

relatedItem

Mapping appears OK. For MODS relatedItem type="host", the titleInfo/title with type="abbreviated" is mapped to the Zotero journalAbbreviation element, and to the publicationTitle element if that has not been mapped from other content yet. Since serials are generally the types of "hosts" for which abbreviated titles are supplied, this seems safe.

For MODS relatedItem type="series", titleInfo/title is mapped to Zotero's series element; titleInfo/partTitle is mapped to Zotero's seriesTitle element; titleInfo/subtitle is mapped to Zotero's seriesText element; titleInfo/partNumber is mapped to Zotero's seriesNumber.

subject

Issue #10. Mapping of MODS subject subelements is missing a lot! Subject is "pushed" to the Tags tab in Zotero. Only the MODS subject/topic subelements are mapped; this leaves out many other types of subject, or parts of subjects, which could be useful to Zotero users (who wouldn't likely care about separation of the "subject facets"). Subelements for name, titleInfo, geographic, temporal, and occupation could be mapped directly; geographicCode, hierarchicalGeographic, and cartographics might be more difficult to map and are less critical (records containing these elements usually have other types of subject terms for the same entities that are more easily mapped). The genre subelement under subject is a special case.

#10a. Would it be possible, based on the attribute authority="lcsh" in the subject element, to map all the subelements of such a MODS subject into one string, sequentially, with a space, two dashes, and a space as a delimiter?
(This is how LC subject headings are intended to be viewed, but it may not fit with Zotero's functionality.)

#10b. Right now there seems to be no way to "group" or differentiate types of tags... is anything in the offing? If we mix different kinds of subjects there (or subjects plus other kinds of descriptors such as genre), it becomes difficult to "map back out" the tags field to MODS or other metadata formats that make these distinctions. That being said, mapping subject/genre or MODS genre to the Tags tab is an option. If neither 10a nor 10b is possible, it would be better to leave the subject/genre subelement unmapped.

An example MODS subject, from the Aquifer record "Washington and his comrades in arms":

   <subject authority="lcsh">
     <geographic>United States</geographic>
     <topic>History</topic>
     <temporal>Revolution, 1775-1783</temporal>
   </subject>

In "LCSH display form": United States -- History -- Revolution, 1775-1783.

The translator will only pick up "History" from this subject element. That's missing a lot.

Other Aquifer records with multifaceted subjects:

Title: Southern women in the recent educational movement in the South (topic/temporal/geographic)
Title: A history of Williams College (corporate name/topic)
Title: The memorial life of General William Tecumseh Sherman (personal/topic, geographic/topic/temporal)

Some of the Aquifer records don't separate the facets into separate subelements, but just have a string with dashes. From an example title, Letter to Adelina from Juanita Wolfskill:

   <subject authority="lcsh">
     <geographic>Orange County (Calif.)--History</geographic>
   </subject>

Because this example is under the subelement <geographic>, it doesn't get mapped to tags in Zotero.
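The idea in #10a, combined with mapping the other simple subelements, can be sketched as follows. This is an illustration only, written in Python rather than the translator's JavaScript; the function names facet_text and tags_from_subject are hypothetical. It assumes that genre, geographicCode, hierarchicalGeographic, and cartographics are left unmapped, and that name facets are joined with a comma.

```python
import xml.etree.ElementTree as ET

# Subelements whose text can be used directly as a tag or heading facet.
TEXT_FACETS = {"topic", "geographic", "temporal", "occupation"}


def local(tag):
    """Return an element's tag name without any XML namespace part."""
    return tag.rsplit("}", 1)[-1]


def facet_text(el):
    """Return display text for one subject subelement, or '' if unmapped."""
    kind = local(el.tag)
    if kind in TEXT_FACETS:
        return (el.text or "").strip()
    if kind == "name":
        parts = [(p.text or "").strip() for p in el if local(p.tag) == "namePart"]
        return ", ".join(p for p in parts if p)
    if kind == "titleInfo":
        titles = [(t.text or "").strip() for t in el if local(t.tag) == "title"]
        return " ".join(t for t in titles if t)
    return ""  # genre, geographicCode, etc. are deliberately left unmapped


def tags_from_subject(subject_el):
    """Return the tag strings for one MODS <subject> element."""
    facets = [text for text in map(facet_text, subject_el) if text]
    if subject_el.get("authority") == "lcsh" and facets:
        # One heading string, facets in document order, " -- " delimiter.
        return [" -- ".join(facets)]
    return facets  # otherwise one tag per facet


WASHINGTON = ET.fromstring(
    '<subject authority="lcsh"><geographic>United States</geographic>'
    "<topic>History</topic><temporal>Revolution, 1775-1783</temporal></subject>"
)
print(tags_from_subject(WASHINGTON))
```

Under these assumptions the Washington example yields the single tag "United States -- History -- Revolution, 1775-1783", and the undivided Orange County heading is captured even though it sits in <geographic>.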
If an undivided heading such as the Orange County example is placed in the <topic> subelement instead (which is how the MODS instructions say to treat undifferentiated "subject headings"), it does get mapped, dashes and all, as in the Aquifer example record titled Ruins of Prager's Department Store:

   <subject>
     <topic>Earthquakes--California--San Francisco--Photographs</topic>
   </subject>

Zotero captures this and two other similar <topic> subjects as Tags that look like this:

   Earthquakes--California--San Francisco--Photographs

Note that this example does not identify the subject as an LC subject heading. It follows the form of printed LCSH but doesn't have the subelement structure.

In summary, the options for Zotero seem to be:

1. Map all types of subject subelements (name, title, topic, chronological, geographic, and form) separately as Tags.
2. Map multiple subelements of subjects having authority="lcsh" as one string, with each subelement in the sequence separated from the others by two dashes.
3. Allow "typing" of Tags so that topic tags, name tags, geographic tags, and time period tags can be grouped together (probably a new feature).

tableOfContents

Issue #11. The translator does not appear to make use of this element at all. If there is a Zotero "abstractNote" field, why not have a "contentsNote" field? Or, if the size of some tables of contents might be an issue, could the content at least get "pushed" to the Notes tab?

An example of an Aquifer ASHO record with a TOC is the title "The people of the Eastern Orthodox churches, the separated churches of the east, and other Slavs: report of the Commission Appointed by the Missionary Department of New England to Consider the Work of Co-operating with the Eastern Orthodox Churches, the Separated Churches of the East, and Other Slavs"

targetAudience

Issue #12. There appears to be no Zotero field to map the content of the targetAudience field to.
This is not a major issue, but if adding a Zotero field is not feasible, could this be a separate kind of Tag, in which case the element could be mapped to the Tags tab?

Example Aquifer record with a targetAudience field:

titleInfo

Issue #13. It appears that the current code creates a Zotero newItem.title for each MODS titleInfo/title if the @type is not equal to "abbreviated". However, when there are multiple titleInfo elements, only one seems to get picked to appear in the brief list and in the "title" box when viewing in the Zotero plugin. Zotero might even prefer a titleInfo element with an attribute - but it shouldn't. The titleInfo element without any type attribute is the actual title of the work and should be the primary one to capture and display. Although MODS does not require it, standard practice is to always have at least one "typeless" titleInfo element (and usually only one). If there's a way to capture and use titles with type attributes (translated, alternative, uniform, as well as abbreviated) elsewhere, that would be nice, but one of these should not get to be the "title" when a typeless titleInfo exists, just because the translator encounters it in a certain order (last?). Please change the logic to prefer a titleInfo element with no type attribute as the source for title.

An Aquifer record's multiple titleInfo elements:

   <titleInfo>
     <nonSort>The </nonSort>
     <title>Constitution of the United States of America</title>
     <subTitle> as proposed by the Convention, held at Philadelphia, September 17, 1787, and since ratified by the several states : with the several amendments thereto </subTitle>
   </titleInfo>
   <titleInfo type="uniform">
     <title>Constitution</title>
   </titleInfo>

The only title showing up in Zotero is "Constitution".

typeOfResource

Issue #14. There is no Zotero type for "sheet music", so "book" is being used. We want to request a "sheet music" type.
A Metadata Working Group member has agreed to work on the request for Zotero elements for this type; that information will be furnished later.

Aquifer records for sheet music may be found in several collections of sheet music (Music for the Nation: American Sheet Music, 1820-1860, 1870-1885; Musical Scores; and the Starr Sheet Music Collection).

Title: Who tied the can on the old dog's tail? Did you tie the can on the old dog's tail?
Title: If I should take a notion to jump into the ocean.

Issue #15. Zotero's mapping to its itemType element does not take MODS typeOfResource values into account at all. These are more likely to be present than genre elements and are required in many contexts, including Aquifer. While a one-to-one mapping isn't possible for all Zotero types, the following could help provide better defaults than the all-purpose "book" when other data doesn't supply a different type.

text: could map to Zotero itemTypes.book by default (could also be periodical, newspaper, theses, letter, but those mappings would have to come from genre)
cartographic: could map to itemTypes.map by default
notated music: need an itemType for this! book by default...
sound recording, sound recording-musical, sound recording-nonmusical: all 3 could map to Zotero itemTypes.audioRecording
still image: could map to itemTypes.artWork by default, although that's too specific (since there's no generic "image" itemType in Zotero)
moving image: could map to itemTypes.videoRecording by default
three dimensional object: (uh, probably won't encounter one of these on the web; perhaps itemTypes.artWork would be a better guess than itemTypes.text)
software, multimedia: could map to itemTypes.computerProgram
mixed material: not sure there's a Web equivalent of this ("collections"), and we probably won't get any, but the closest match is itemTypes.webpage

Example Aquifer records with typeOfResource:

text:
- Biographical sketches of the founder and principal alumni of the Log college (book; captures as Book)
- Witness log (2 page handwritten document; captures as Book)
text:
- The New England home magazine (a serial; captures as Journal article)
cartographic:
- Peninsula between Delaware & Chesopeak Bays (a map; captures as Book)
notated music:
- Oh! You beautiful doll, you great, big beautiful doll! (sheet music; captures as Book)
sound recording-nonmusical:
- Title: Poetry reading and Creator: Frost, Robert, 1874-1963 (streaming audio; captures as Book)
still image:
- Flowering Currant at Boonville Mendocino county (digitized photograph; captures as Book)
moving image:
- Visitin' 'round at Coolidge Corners (streaming video; captures as Film)
- Chavela Vargas en vivo en El Hábito (versión sin editar) (streaming video; captures as Book)
software, multimedia:
Aquifer doesn't contain any true examples of this type right now, although some records are mis-coded as that type (but are really electronic texts).
mixed material: is the MODS type for collections (such as archival collections).
- W. Stewart Evans Collection, 1967-1979

Other issues:

Issue #16. There's a "TODO: thesis type" note in the code section that deals with Zotero itemTypes. We feel it would be very useful to map to this Zotero type from MODS if possible. Note that the MARC Genre Terms list (http://www.loc.gov/marc/sourcecode/genre/genrelist.html), which maps to various MARC fixed field values, contains the term "thesis"; MODS records using that vocabulary in the genre element will have an authority attribute of "marcgt". This would be one "hook" in a MODS record that could map to a "thesis" itemType; there may be other possibilities.
Aquifer records for theses:

The development of Chicago and vicinity as a manufacturing center prior to 1880 (does not contain a genre element with "thesis", but contains a "thesis note"; captured as Book)
The progress of the fire in San Francisco April 18th-21st, 1906 : as shown by an analysis of original documents (contains a genre element with "academic dissertations" and a "thesis note"; captured as Book)

I wasn't able to find an Aquifer example of a genre element containing "thesis". Presence of the word at the beginning of a note element may be a more likely marker at present, especially for MODS records mapped from MARC. Presence of "dissertation" in the genre field might also be useful.
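Issues #15 and #16 together suggest an order for choosing an item type: look for thesis "hooks" first, then fall back on typeOfResource defaults, and only then on "book". The sketch below illustrates that order in Python (the translator itself is JavaScript, and this is not its code). The function name guess_item_type is hypothetical, the item type names are copied from these notes and may not match Zotero's exact identifiers, and only top-level elements of the record are examined.

```python
import xml.etree.ElementTree as ET

# Defaults proposed under Issue #15 (names as written in these notes).
DEFAULTS = {
    "text": "book",
    "cartographic": "map",
    "notated music": "book",  # no sheet music type exists yet (Issue #14)
    "sound recording": "audioRecording",
    "sound recording-musical": "audioRecording",
    "sound recording-nonmusical": "audioRecording",
    "still image": "artWork",
    "moving image": "videoRecording",
    "three dimensional object": "artWork",
    "software, multimedia": "computerProgram",
    "mixed material": "webpage",
}


def local(tag):
    """Return an element's tag name without any XML namespace part."""
    return tag.rsplit("}", 1)[-1]


def guess_item_type(mods_el):
    """Pick an item type from the top-level elements of one MODS record."""
    genres, notes, resource = [], [], None
    for child in mods_el:
        kind = local(child.tag)
        text = (child.text or "").strip().lower()
        if kind == "genre":
            genres.append(text)
        elif kind == "note":
            notes.append(text)
        elif kind == "typeOfResource" and resource is None:
            resource = text

    # Issue #16 hooks: "thesis" or "dissertation" in genre, or a thesis note.
    if any("thesis" in g or "dissertation" in g for g in genres):
        return "thesis"
    if any(n.startswith("thesis") for n in notes):
        return "thesis"
    # Issue #15: typeOfResource default, else the all-purpose "book".
    return DEFAULTS.get(resource, "book")


MAP_RECORD = ET.fromstring(
    "<mods><typeOfResource>cartographic</typeOfResource></mods>"
)
print(guess_item_type(MAP_RECORD))
```

A genre-based mapping for periodical, newspaper, or letter could be added ahead of the typeOfResource fallback in the same way; it is left out here because the notes do not specify which genre terms to use.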