Download Folksonomy - Columbia University
Folksonomies and Social Tagging

What Are Tags?
• Keywords or terms associated with or assigned to a piece of information
• They enable keyword-based classification and search of information

Basic Model for Tagging Systems
[Diagram: USER, RESOURCES, TAGS]

Don’t confuse tags with keywords or full-text searching
• Keywords are behind the scenes, tags are often visibly aggregated for use and browsing
• Keywords cannot be hyperlinked
• Keywords imply searching, tags imply linking
• Full-text searching is passive, tagging is active
• Tagging is more about connecting items than categorizing them

Tags can be …
• Descriptions of the subject matter
• Where the item is located
• The intended use of the item
• Individual (gift from mom)
• Different people have different tagging patterns
• Tagging systems encourage differences

Tags are
• Non-hierarchical
• A way to create links between items by the creation of sets of objects
• A means of connecting with others interested in the same things

Tagging Systems Define
• Who can tag
• What can be tagged
• What kinds of tags can be used
• Tagging systems may result in the creation of a “folksonomy”

Types of Tagging Systems
• Managing personal information
• Social bookmarking
• Collecting and sharing digital objects
• Improving the e-commerce experience

Why is tagging so popular?
• It is easy and enjoyable
• It has a low cognitive cost
• It is quick to do
• It provides self and social feedback immediately

Putting the social in tagging
• Tags allow for social interaction because when we navigate by tags we are directly connecting with others
• People tag for their own benefit

Tags, and therefore social tags, are
• Dynamic categorization systems
• Often created on-the-fly
• Chosen as relevant to the user – not to the creator, cataloger or researcher
• A social activity (more on this later)
• Hopefully one small step toward a more interactive and responsive library system

What is a folksonomy?
• Folksonomy refers to an “emergent, grassroots taxonomy”
  – An aggregate collection of tags
  – A bottom-up categorical structure development
  – An emergent thesaurus
• A term coined by Thomas Vander Wal

Why do folksonomies work?
• The searcher defines the access, but
• The aggregation of the terms has public value
• It’s a typically messy democratic approach

What makes folksonomies popular?
• Their dynamic nature works well with dynamic resources
• They’re personal
• They lower barriers to cooperation

Tagging and the consequent folksonomies work best when
• It’s easy to do
• It’s not commercial in nature
• Taggers have ownership
• Taggers are more likely to tag their own stuff than they are your stuff
• It has been shown to work well on the Web

The unexpected development: terminological consensus
• Collective action yields common terms
• Stabilization may be caused by imitation and shared knowledge
• The wisdom of the crowd

Is your tagging influenced by my tagging?
• Of course it is!
• People are beginning to tag in ways that make it easier for others to find like stuff
• Shared meaning consequently evolves for tags
• Most used tags become most visible

Strengths of folksonomies
• Cost-effective way to organize the Internet
• Social benefits
• It’s inclusive
• For many environments, they work well

Collocation issues
• They do not yield the level of clarity that controlled vocabularies do
• Term ambiguity – words with multiple meanings
• No synonym control

Issues with specificity
• Variable specificity for related terms
• Broadness of terms impacts precision – terms are often imprecise
• Mixed perspectives

Issues with structure
• Singular and plural forms create redundant headings
• No guidelines for the use of compound headings, punctuation, word order
• No scope notes
• No cross references

Issues with accuracy
• Collective ‘wisdom’ of the tagging community
• How does wrong information impact retrieval?
• Conflicting cultural norms
• Sometimes authority counts

“Spagging” and other problems
• Opening doors to opinion tags
• Tagging wars
• “Spagging”: spam tagging

Tidying up the tags…?
• Lists of tagging norms have been developed
• Are there programmatic solutions?
• Users know they are looking at tags
• By tidying, do we destroy the essence of why this works?
• Do we realistically have the resources?

Recommendations: Don’t assume that one size fits all
• Retain controlled vocabularies in the catalog
• Explore ways to use controlled vocabularies to help organize the Internet by re-purposing controlled vocabularies that already exist
• Invite folksonomies to the party in the catalog to gain their benefits
• Explore ways to combine the two systems

Recommendations: When you invite folksonomies into the catalog, do so strategically and carefully
• Don’t put terms in the same index as controlled vocabularies
• Find ways to associate terms applied across editions of works
• Need for mediation, or at least observation
• The crowd is not necessarily the best arbiter of specific terminology

Recommendations: Always remember why people tag
• People tag things because they want to find them, not because they want others to find them
• Be aware that this will impact the quality of the terms, and their frequency

Recommendations: Controlled vocabularies could be better utilized than they currently are
• Subject structures are underutilized in the ILS
• Controlled vocabularies that exist are not being exported to the Web
• Well-connected terms foster discovery – let’s connect them. Index those cross references where available

Where are folksonomies found?
• Folksonomies are found in social bookmark managers such as Del.icio.us (http://del.icio.us/) and Furl (http://www.furl.net/), which allow users to:
  – Add bookmarks of sites they like to their personal collections of links
  – Organize and categorize these sites by adding their own terms, or tags
  – Share this collection with other people with the same interests.
• The tags are used to collocate bookmarks: (a) within a user’s collection; and (b) across the entire system, e.g., the page http://del.icio.us/tag/blogging will show all bookmarks that are tagged with “blogging” by any user.

Social Bookmarking and Social Tagging
• What is social bookmarking?
  – public sharing of links
  – association of tags (keywords) with links
  – network of related links created by users
  – network of related tags created by users
• What is tagging?
  – act of associating a term with a link or article
  – labelling or classifying for personal use
• Tagging creates an association between user, item and set of tags

Inter-term relationships
• There are no clearly defined relations between and among the terms in the vocabulary, unlike formal taxonomies and classification schemes, where there are multiple kinds of explicit relationships (e.g., broader, narrower, and related terms) between and among terms.
• Folksonomies are simply the set of terms that a group of users tagged content with; they are not a predetermined set of classification terms or labels.

Popular folksonomy sites
• Del.icio.us (http://del.icio.us)
• Flickr (http://www.flickr.com)
• Frassle (http://www.frassle.org)
• Furl (http://www.furl.net)
• Simpy (http://www.simpy.com)
• Spurl (http://www.spurl.com)
• Technorati (http://www.technorati.com)

The popularity of folksonomies
• The growing popularity of folksonomies can be attributed to two principal factors:
  – An increasing need to exert control over the mass of digital information that we accumulate on a daily basis.
  – A desire to “democratize” the way in which digital information is described and organized by using categories and terminology that reflect the views and needs of the actual end-users, rather than those of an external organization or body.

What is Social Bookmarking?
• Social bookmarking is a server-side, web-based service which allows users to create, manage and share their personal bookmarks in a social community.
• Social bookmarking systems have three major axes: users, tags, and URLs.
• Social bookmarking systems are a type of folksonomy.

…then what is folksonomy?
• Folksonomy is a collaboratively generated, open-ended labeling system that enables users to categorize content by freely chosen labels.
• Thomas Vander Wal coined the term by combining “folk” + “taxonomy”.
• While folksonomy appears to be the most popular name, other names for the same phenomenon have been proposed, including: folk classification, folk taxonomy, ethnoclassification, distributed classification, social classification, open tagging, free tagging, faceted hierarchy, etc.

Social Bookmarking as a Classification System
• A classification system is a structured scheme for categorizing knowledge, entities or objects to improve access or study, created according to alphabetical, associative, hierarchical, numerical, ideological, spatial, chronological, or other criteria.
• Traditional methods for organizing information include controlled vocabularies, taxonomies, thesauri, and ontologies.

Function of Social Bookmarking
• Method for organizing and storing information
  – Social bookmarking as a type of sense making
  – Allows users to organize personal information their way
• Connects users to other related topics and ideas
  – Gives the users the ability “to sort the wheat from the chaff”
  – More narrowed focus, vetted by humans as opposed to computers
  – Collective wisdom: tags are ranked by popularity.
• Connects users to other users
  – Allows users to interact with other users’ methods
  – “Eavesdropping on someone else’s thought pattern”

Social Bookmarking Characteristics
Common elemental characteristics of social bookmarking (folksonomic) systems.
• Tag – a single word label that is applied to an object (URL)
• Tagging – the process of organizing an object by assigning a label or “tag”
• Tag bundle – a group of tags linked by another tag or “super tag”
• Tag cloud – a visual weighted list of a set or subset of tags

Example of a Tag Cloud

Tagging Issues
• Tagging is Good
  – dynamic distributed classification
  – related tag networks
  – tag cloud shows extent of collection
  – user terminology
  – diversity
• Tagging is Bad
  – mob indexing
  – no controlled vocabulary
  – poor browsing experience
  – no thesaurus
  – consensus by a mob or no consensus

Tagging Issues
• spelling variations
• spelling mistakes
• potentially mistaken term usage
• acronyms, homonyms, synonyms
• sesquipedalians (terms made by sticking many smaller terms together, e.g. information_seeking_behaviour)
• non subject tags (e.g. affective tags, time and task tags)

Patterns in Tagging (3 studies of tags)
• Are categories emerging in social tagging that will complement those developed through professional methods?
• What do tag convergence and co-word usage suggest about the utility of tagging?
• What implications does the use of affective or time and task related tags have for the organisation of information?
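
The tag cloud and co-word notions used in these slides can be made concrete with a few lines of code. What follows is only a minimal sketch under stated assumptions: each bookmark is stored as a (user, resource, tags) record, a cloud weight is simply the number of users who applied a tag, and co-words are unordered pairs of tags used together in one bookmark. The names `bookmarks`, `tag_frequencies`, and `co_word_counts` are illustrative and are not taken from any of the sites mentioned. The sample records are the five del.icio.us entries quoted on the Tag Lists slide; the resource URL is assumed to be the Shirky article those entries refer to.

```python
from collections import Counter
from itertools import combinations

# Assumed resource for the sample entries (the Shirky article).
URL = "http://shirky.com/writings/ontology_overrated.html"

# One record per bookmark: (user, resource, tags).
bookmarks = [
    ("nayma", URL, ["folksonomy", "tags", "web2.0", "ontology"]),
    ("zeft", URL, ["ontology"]),
    ("chrysoberyl", URL, ["2.0", "libraries", "thinky"]),
    ("peleke12", URL, ["ontology", "shirky", "tagging"]),
    ("alisaepstein", URL, ["folksonomy", "folksonomies", "tagging", "web2.0"]),
]


def tag_frequencies(records):
    """Count how many bookmarks carry each tag (a simple tag cloud weight)."""
    counts = Counter()
    for _user, _resource, tags in records:
        counts.update(set(tags))  # a tag counts once per bookmark
    return counts


def co_word_counts(records):
    """Count how often two tags appear together in the same bookmark."""
    pairs = Counter()
    for _user, _resource, tags in records:
        # Sort so that each unordered pair has one canonical key.
        pairs.update(combinations(sorted(set(tags)), 2))
    return pairs


frequencies = tag_frequencies(bookmarks)
co_words = co_word_counts(bookmarks)

for tag, count in frequencies.most_common(4):
    print(tag, count)
```

A tag cloud renderer would map these counts to font sizes, and a co-word graph would draw an edge for each pair with a non-zero count; both steps are left out here.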
Convergence and Divergence in Tags
• When enough people tag a site, a set of more frequently applied tags will emerge that starts to look like a reasonable description of the item
• Tag trends do not follow standard power laws for term usage (80/20 rule)
  – the drop-off tends to be much slower at first before suddenly returning to the normal power law

Tag Frequency 1
[Figure: tag frequency graph for http://shirky.com/writings/ontology_overrated.html; x-axis: tag, y-axis: frequency (0 to 300)]

Tag Frequency 2
[Figure: tag frequency graph for http://www.ariadne.ac.uk/issue54/tonkin-et-al/; x-axis: tag, y-axis: frequency (0 to 45)]

Tagging Patterns
• Consensus forms after a certain number of users have tagged an item
  – the first item was tagged by 2250 people, the second by only 49
• Frequency graphs suggest a relative consensus on terms, but tag lists and co-word graphs do not
  – high frequency tags are used frequently but not necessarily with other high frequency terms
  – tagging patterns may show group consensus and trends in user communities.

Tag Lists
• Shirky 2005 (http://del.icio.us/url/97c30ea798555e7b8380bc1f4925233d):
• by nayma to folksonomy tags web2.0 ontology
• by zeft to ontology
• by chrysoberyl to 2.0 libraries thinky
• by peleke12 to ontology shirky tagging
• by alisaepstein to folksonomy folksonomies tagging web2.0

Co-word Graph of Tags
[Figure: co-word graph of tags]

Comparison of Tags with Controlled Vocabulary
• 1. Study of tag use and types of tags on articles compared to subject headings on CiteULike (like del.icio.us, but it indexes journal articles, which have more metadata)
  – the most common relationship between the terms was "related but not in the thesaurus"
  – the next most common were RT (related term) and then equivalence
• 2. Study comparing tags and LCSH on LibraryThing
  – without further context it is extremely difficult to tell whether an apparently anomalous tag in a tag cloud is a mistake

Non Subject Tags
• Some time and task or affective tags are very popular
  – cool, fun, funny, toread appeared in the main del.icio.us tag cloud
• ToRead and fun are popular tags on all three sites
• Affective terms appear more frequently on CiteULike and Connotea than expected
  – biology articles are more often listed as toread; math and physics as fun

Utility of Tagging
• Tagging can be useful for providing a good picture of how users see the material
  – Steve Museum project: found that users used very different terminology and tagged specific items seen in the picture which had been absent from professional cataloguing

Tagging Discussion
• Tagging has all the problems of free text search/automatic indexing
• But tag groups tend to converge on a useful set of terms after a threshold number of users
• Users use some terminology which is rare or completely absent from subject heading lists (e.g. time and task tags)
• User terms are often not part of a formal thesaurus

Social Bookmarking Characteristics
• Common elemental characteristics of social bookmarking (folksonomic) systems.
  – Tag – a single word label that is applied to an object (URL)
  – Tagging – the process of organizing an object by assigning a label or “tag”
  – Tag bundle – a group of tags linked by another tag or “super tag”. Bundles are a way to group together common tags. For instance, if you have the tags "design", "painting", and "moma", you may want to group these together into a bundle called "art".
  – Tag cloud – a visual weighted list of a set or subset of tags

Folksonomies and user vocabulary
• In information retrieval systems (IRS), the vocabulary used to organize content may be based upon the choices of the authors of the materials, the designer of the IRS, or the designer of the controlled vocabulary in place.
• Folksonomies reflect users’ choices in diction, terminology, and precision.
• Folksonomies can adapt very quickly to changes in user needs and vocabulary, and adding new terms to a folksonomy incurs virtually no cost for either the user or the system.

Folksonomies and online communities
• Folksonomies create a sense of community amongst their users. Most social bookmark managers will recommend new links and other members’ folders or sites that are strongly related to an individual member by analyzing his or her linking pattern.
• As soon as users assign a tag to an item, they can see the cluster of items carrying the same tag. This feedback loop leads to a form of asymmetrical communication between users through metadata.
• The users of a system negotiate the meaning of the terms in the folksonomy.

Ambiguity
• The terms in a folksonomy may have inherent ambiguity as different users apply terms to documents in different ways.
  – E.g., the tag “ANT” has been used to refer to “Actor Network Theory”, a sociological term, as well as Apache Ant, a Java programming tool

Polysemy
• The polysemous tag “port” could refer to a sweet fortified wine, a porthole, a place for loading and unloading ships, the left-hand side of a ship or aircraft, or a channel endpoint in a communications system.

Synonyms
• Folksonomies provide for no synonym control; the terms “mac”, “macintosh”, and “apple”, for example, are used to describe Apple Macintosh computers.
• Both singular and plural forms of terms appear (e.g., flower and flowers), thus creating a number of redundant headings.
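
One possible programmatic response to the synonym and plural problems described above is to map variant tags onto a preferred form before counting them. The sketch below is a hypothetical illustration, not a description of how any tagging site works: the `SYNONYMS` table and the `normalize` helper are invented for this example, and the plural rule (drop a trailing "s" only when the singular already occurs in the collection) is a deliberately naive assumption.

```python
from collections import Counter

# Hypothetical synonym ring: each variant points at one preferred tag.
# Mapping "apple" here is lossy, because "apple" is itself ambiguous.
SYNONYMS = {"mac": "macintosh", "apple": "macintosh"}


def normalize(tag, vocabulary):
    """Return a preferred form for a raw tag (illustrative rules only)."""
    tag = tag.strip().lower()
    tag = SYNONYMS.get(tag, tag)
    # Naive plural folding: only merge when the singular is a known tag.
    if tag.endswith("s") and tag[:-1] in vocabulary:
        tag = tag[:-1]
    return tag


raw_tags = ["mac", "Macintosh", "apple", "flower", "flowers"]
vocabulary = {t.strip().lower() for t in raw_tags}

merged = Counter(normalize(t, vocabulary) for t in raw_tags)
print(dict(merged))
```

Whether such tidying is desirable is a separate question; the earlier slide on tidying up the tags asks whether it would destroy the essence of why tagging works.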
Specificity
• Related terms that describe an item vary along a continuum of specificity ranging from very general to very specific; so, for example, documents tagged “perl” and “javascript” may be too specific for some users, while a document tagged “programming” may be too general for others.

Syntax
• Folksonomies provide no guidelines for the use of compound headings, punctuation, word order, and so forth; for example, should one use the tag “vegan cooking” or “cooking, vegan”?

Incorrect Usage
• Tags could be applied incorrectly; the term “archeology”, for example, is used to tag items pertaining to both dinosaurs and primitive microbes

Consensus
• Users strive to achieve a degree of consensus over the general meaning of tags.
• As a URL receives more and more bookmarks, the set of tags used in those bookmarks becomes stable across different users.
• This stabilization is facilitated through imitation and shared knowledge. Del.icio.us shows users the tags most commonly used by others who bookmarked the same URL already; users can easily select those tags for use in their own bookmarks, thus imitating the choices of previous users.

Folksonomies and controlled vocabularies
• Folksonomies are not necessarily antithetical to controlled vocabularies.
• Once you have a preliminary system in place, you can use the most common tags to develop a controlled vocabulary that truly speaks the users’ language
  – E.g., you can link related tags such as “nyc,” “newyork,” and “newyorkcity”; it may be possible to align these terms with established controlled vocabularies, such as the Getty Thesaurus of Geographic Names, in order to provide a greater range of related terms.

Other uses for folksonomies
• Could be used to organize resources for an intranet, course collection, etc.
• Could be used to enhance the customizable features of library catalogues. Clients could organize and tag items of interest from the catalogue, as well as external sources (if allowable).
  – Could share these tags and sources with other clients with similar interests. This could lead to a user-directed reader advisory service.
  – Could use folksonomies to supplement existing LCSH vocabulary in the catalogue, e.g., LCSH does not contain terms for the popular film genres “cult”, “drama,” or “action.”

Advantages of Social Bookmarking
• Low “cognitive” cost – large grassroots community users vs. expert metadata specialists or catalogers
• Self-moderating and democratic
• Flexible, inclusive, adaptive and current
• Immediate feedback
• Usability – easy to use
• Great at serendipitous discovery

Disadvantages of Social Bookmarking
• Low precision/recall due to synonymous and polysemous tags
• Basic level problem – granularity of tags (too specific, too general)
• Lack of hierarchy – no parent-child, broad-narrow relationship
• Highly susceptible to malicious users.
  – Meta noise – incorrect or malicious tags
  – Gaming – cheating the system
  – Spamming – a universal plague of all social systems
• Fails as a search system, bad at finding specific items

Conclusions
• Folksonomies are undoubtedly fraught with the problems typical of uncontrolled vocabularies, but their growing popularity suggests that people are interested in, and motivated to assign, their own metatags to items of interest.
• One cannot help but wonder whether such enthusiasm for metadata would be the same if people were asked to use only prescribed and standardized vocabularies.

Other Areas to Explore
• The cognitive and behavioural aspects of folksonomy use:
  – What is the tagging behaviour of people who use folksonomies?
  – Why do people choose the tags they use; what motivates them to modify these tags; how often do they modify them?
  – How are folksonomies used communally?
  – How do folksonomies foster consensus in the use of tags?
  – How does the community affect which tags are used and how?
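
The precision and recall cost listed among the disadvantages above can be illustrated with a small worked example. This is a hedged sketch using an entirely hypothetical four-item collection: a user looking for material on port wine searches the polysemous tag “port”, while one relevant item carries only the variant tag “portwine”. The item names and the `precision_recall` helper are assumptions made for illustration.

```python
def precision_recall(retrieved, relevant):
    """Standard set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: item id -> tags applied by users.
items = {
    "wine-guide": {"port", "wine"},
    "douro-tasting": {"portwine"},       # synonym variant, missed by the query
    "harbour-map": {"port", "shipping"},  # other sense of "port"
    "tcp-howto": {"port", "networking"},  # other sense of "port"
}

relevant = {"wine-guide", "douro-tasting"}
retrieved = {item for item, tags in items.items() if "port" in tags}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy collection polysemy lowers precision (two of the three retrieved items concern harbours or networking) and the missing synonym control lowers recall (one of the two relevant items is never retrieved).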
Folksonomies in Libraries
• Libraries can’t continue to rely exclusively on in-house cataloging
• We can achieve our overall goals while allowing new mechanisms along the way
• Users are one additional source of metadata we must tap
• We must match appropriate metadata needs to the tasks users are best equipped to perform
• Good interfaces for metadata collection will be key
• We must use the best ideas for user participation, and adapt them for the library environment