Download Folksonomy - Columbia University

Folksonomies and Social Tagging
What Are Tags?
• Keywords or terms associated with or
assigned to a piece of information
• They enable keyword-based classification
and search of information
Basic Model for Tagging Systems
USER
RESOURCES
TAGS
Don’t confuse tags with keywords
or full-text searching
• Keywords are behind the scenes, tags are often
visibly aggregated for use and browsing
• Keywords cannot be hyperlinked
• Keywords imply searching, tags imply linking
• Full-text searching is passive, tagging is active
• It’s more about connecting items rather than
categorizing them.
Tags can be …
• Descriptions of the subject matter
• Where the item is located
• The intended use of the item
• Individual (gift from mom)
• Different people have different tagging
patterns
• Tagging systems encourage differences
Tags are
• Non-hierarchical
• A way to create links between items by the
creation of sets of objects
• A means of connecting with others
interested in the same things
Tagging Systems Define
• Who can tag
• What can be tagged
• What kinds of tags can be used
• Tagging systems may result in the creation
of a “folksonomy”
Types of Tagging Systems
• Managing personal information
• Social bookmarking
• Collecting and sharing digital objects
• Improving the e-commerce experience
Why is tagging so popular?
• It is easy and enjoyable
• It has a low cognitive cost
• It is quick to do
• It provides self and social feedback
immediately
Putting the social in tagging
• Tags allow for social interaction because
when we navigate by tags we are directly
connecting with others
• People tag for their own benefit
Tags, and therefore social tags are
• Dynamic categorization systems
• Often created on-the-fly
• Chosen as relevant to the user – not to the
creator, cataloger or researcher
• A social activity (more on this later)
• Hopefully one small step toward a more
interactive and responsive library system
What is a folksonomy?
• Folksonomy refers to an “emergent,
grassroots taxonomy”
– An aggregate collection of tags
– A bottom-up categorical structure
development
– An emergent thesaurus
• A term coined by Thomas Vander Wal
Why do folksonomies work?
• The searcher defines the access, but
• The aggregation of the terms has public
value
• It’s a typically messy democratic approach
What makes folksonomies
popular?
• Their dynamic nature works well with
dynamic resources
• They’re personal
• They lower barriers to cooperation
Tagging and the consequent
folksonomies work best when
• It’s easy to do
• It’s not commercial in nature
• Taggers have ownership
• Taggers are more likely to tag their own
stuff than they are your stuff
• It has been shown to work well on the
Web
The unexpected development:
terminological consensus
• Collective action yields common terms
• Stabilization may be caused by imitation
and shared knowledge
• The wisdom of the crowd
Is your tagging influenced by my
tagging?
• Of course it is!
• People are beginning to tag in ways that
make it easier for others to find like stuff
• Shared meaning consequently evolves for
tags
• Most used tags become most visible
Strengths of folksonomies
• Cost-effective way to organize Internet
• Social benefits
• It’s inclusive
• For many environments, they work well
Collocation issues
• They do not yield the level of clarity that
controlled vocabularies do
• Term ambiguity – words with multiple
meanings
• No synonym control
Issues with specificity
• Variable specificity for related terms
• Broadness of terms impacts precision –
terms are often imprecise
• Mixed perspectives
Issues with structure
• Singular and plural forms create redundant
headings
• No guidelines for the use of compound
headings, punctuation, word order
• No scope notes
• No cross references
Issues with accuracy
• Collective ‘wisdom’ of the tagging
community
• How does wrong information impact
retrieval
• Conflicting cultural norms
• Sometimes authority counts
“Spagging” and other problems
• Opening doors to opinion tags
• Tagging wars
• “Spagging”: spam tagging
Tidying up the tags…?
• Lists of tagging norms have been
developed
• Are there programmatic solutions?
• Users know they are looking at tags
• By tidying, do we destroy the essence of
why this works?
• Do we realistically have the resources?
Recommendations
Don’t assume that one size fits all
• Retain controlled vocabularies in the catalog
• Explore ways to use controlled vocabularies to
help organize the internet by re-purposing
controlled vocabularies that already exist
• Invite Folksonomies to the party in the catalog to
gain their benefits
• Explore ways to combine the two systems
Recommendations
When you invite folksonomies into the
catalog, do so strategically, and carefully
• Don’t put terms in the same
index as controlled vocabularies
• Find ways to associate terms applied across
editions of works
• Need for mediation, or at least observation
• The crowd is not necessarily the best arbiter of
specific terminology
Recommendations
Always remember why people tag
• People tag things because they want to
find them, not because they want others to
find them
• Be aware that this will impact the quality of
the terms, and their frequency
Recommendations
Controlled vocabularies could be better
utilized than they currently are
• Subject structures are underutilized in the
ILS
• Controlled vocabularies that exist are not
being exported to the Web
• Well-connected terms foster discovery –
let’s connect them. Index those cross
references where available
Where are folksonomies found?
• Folksonomies are found in social bookmarks
managers such as Del.icio.us (http://del.icio.us/) and
Furl (http://www.furl.net/), which allow users to:
– Add bookmarks of sites they like to their personal
collections of links
– Organize and categorize these sites by adding their own
terms, or tags
– Share this collection with other people with the same
interests.
• The tags are used to collocate bookmarks: (a) within
a user’s collection; and (b) across the entire system,
e.g., the page http://del.icio.us/tag/blogging will
show all bookmarks that are tagged with “blogging”
by any user.
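The two levels of collocation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Del.icio.us or Furl is implemented: the `Bookmark` tuple, the `build_indexes` function, and the sample users and URLs are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple

class Bookmark(NamedTuple):
    """One tagging act: a user attaches a set of tags to a URL."""
    user: str
    url: str
    tags: frozenset

def build_indexes(bookmarks):
    """Return (per_user, system_wide) indexes mapping tags to sets of URLs."""
    per_user = defaultdict(lambda: defaultdict(set))   # user -> tag -> URLs
    system_wide = defaultdict(set)                     # tag -> URLs
    for b in bookmarks:
        for tag in b.tags:
            per_user[b.user][tag].add(b.url)
            system_wide[tag].add(b.url)
    return per_user, system_wide

# Hypothetical sample data, for illustration only.
bookmarks = [
    Bookmark("alice", "http://example.org/a", frozenset({"blogging", "tools"})),
    Bookmark("bob", "http://example.org/b", frozenset({"blogging"})),
    Bookmark("alice", "http://example.org/c", frozenset({"recipes"})),
]

per_user, system_wide = build_indexes(bookmarks)
alice_blogging = per_user["alice"]["blogging"]   # (a) within one user's collection
all_blogging = system_wide["blogging"]           # (b) across the entire system
```

The system-wide index plays the role of a page such as http://del.icio.us/tag/blogging: it gathers every bookmark carrying the tag, regardless of who applied it.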
Social Bookmarking and Social
Tagging
• what is social bookmarking?
– public sharing of links
• association of tags (keywords) with links
– network of related links created by users
• network of related tags created by users
• what is tagging?
– act of associating a term with a link or article
– labelling or classifying for personal use
• Tagging creates an association between user,
item and set of tags
Inter-term relationships
• There are no clearly defined relations
between and among the terms in the
vocabulary, unlike formal taxonomies and
classification schemes, where there are
multiple kinds of explicit relationships (e.g.,
broader, narrower, and related terms)
between and among terms.
• Folksonomies are simply the set of terms
that a group of users tagged content with;
they are not a predetermined set of
classification terms or labels.
Popular folksonomy sites
• Del.icio.us (http://del.icio.us)
• Flickr (http://www.flickr.com)
• Frassle (http://www.frassle.org)
• Furl (http://www.furl.net)
• Simpy (http://www.simpy.com)
• Spurl (http://www.spurl.com)
• Technorati (http://www.technorati.com)
The popularity of folksonomies
• The growing popularity of folksonomies can
be attributed to two principal factors:
– An increasing need to exert control over the mass
of digital information that we accumulate on a
daily basis.
– A desire to “democratize” the way in which digital
information is described and organized by using
categories and terminology that reflect the views
and needs of the actual end-users, rather than
those of an external organization or body.
What is Social Bookmarking?
• Social bookmarking is a server side web
based service which allows users to
create, manage and share their personal
bookmarks in a social community.
• Social bookmarking systems have three
major axes: users, tags, and URLs.
• Social bookmarking systems are a type of
folksonomy.
…then what is folksonomy?
• Folksonomy is a collaboratively generated,
open-ended labeling system that enables users
to categorize content by freely chosen labels.
• Thomas Vander Wal coined the phrase by
combining “folk” + “taxonomy”.
• While folksonomy appears to be the most
popular, other names for the same phenomenon
have been proposed, including: folk
classification, folk taxonomy, ethnoclassification,
distributed classification, social classification,
open tagging, free tagging, faceted hierarchy,
etc.
Social Bookmarking as a
Classification System
• A classification system is a structured scheme
for categorizing knowledge, entities or objects to
improve access or study, created according to
alphabetical, associative, hierarchical,
numerical, ideological, spatial, chronological, or
other criteria.
• Traditional methods for organizing information
include controlled vocabularies, taxonomies,
thesauri, and ontologies.
Function of Social Bookmarking
• Method for organizing and storing information
– Social bookmarking as a type of sense making
– Allows users to organize personal information their way
• Connects users to other related topics and ideas
– Gives the users the ability “to sort the wheat from the chaff”
– More narrowed focus, vetted by humans as opposed to
computers
– Collective Wisdom - tags are ranked by popularity.
• Connects users to other users
– Allows users to interact with other users’ methods
– “Eavesdropping on someone else’s thought pattern”
Social Bookmarking Characteristics
Common elemental characteristics of social
bookmarking (folksonomic) systems.
• Tag – a single word label that is applied to an
object (URL)
• Tagging – the process of organizing an object by
assigning a label or “tag”
• Tag bundle – a group of tags linked by another
tag or “super tag”
• Tag cloud - a visual weighted list of a set or
subset of tags
Example of a Tag Cloud
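The "weighted" part of a tag cloud can be sketched as a mapping from tag counts to font sizes. The function below is a minimal illustration, not the method of any particular site: the logarithmic scaling, the size range, and the sample counts are all assumptions made for the example.

```python
import math

def tag_cloud_sizes(counts, min_size=10, max_size=32):
    """Map each tag's count to a font size, scaled linearly on log(count).

    Counts must be positive integers. Log scaling is an assumption here;
    it keeps a few very popular tags from dwarfing everything else.
    """
    if not counts:
        return {}
    logs = {tag: math.log(n) for tag, n in counts.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo
    sizes = {}
    for tag, value in logs.items():
        # If every tag has the same count, give them all the minimum size.
        weight = (value - lo) / span if span else 0.0
        sizes[tag] = round(min_size + weight * (max_size - min_size))
    return sizes

counts = {"tagging": 100, "folksonomy": 10, "toread": 1}  # hypothetical counts
sizes = tag_cloud_sizes(counts)
```

With these sample counts the most used tag is drawn at the maximum size and the least used at the minimum, which is what makes the most used tags the most visible.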
Tagging Issues
• Tagging is Good
– dynamic distributed classification
– related tag networks
– tag cloud shows extent of collection
– user terminology
– diversity
• Tagging is Bad
– mob indexing
– no controlled vocabulary
– poor browsing experience
– no thesaurus
– consensus by a mob or no consensus
Tagging Issues
• spelling variations
• spelling mistakes
• potentially mistaken term usage
• acronyms, homonyms, synonyms
• sesquipedalians (terms made by sticking
many smaller terms together e.g.
information_seeking_behaviour)
• non subject tags (e.g. affective tags, time
and task tags)
Patterns in Tagging (3 studies of
tags)
• Are categories emerging in social tagging
that will complement those developed
through professional methods?
• What does tag convergence and co-word
usage suggest about the utility of tagging?
• What implications does the use of affective
or time and task related tags have for the
organisation of information?
Convergence and Divergence in
Tags
• When enough people tag a site, a set of
more frequently applied tags will emerge
that start to look like a reasonable
description of the item
• tag trends do not follow standard power
laws for term usage (80/20 rule)
– the drop off tends to be much slower at first
before suddenly returning to the normal power
law
[Figure: Tag Frequency 1. Tag frequency graph for http://shirky.com/writings/ontology_overrated.html. Vertical axis: Frequency (0 to 300). Tags on the horizontal axis: library, Internet, semantic, ontologies, information, web, Blog, reference, categories, Categorization, toread, tag, semanticweb, folksonomies, metadata, taxonomy, article, del.icio.us, shirky, web2.0, tags, classification, folksonomy, tagging, ontology]
[Figure: Tag Frequency 2. Tag frequency graph for http://www.ariadne.ac.uk/issue54/tonkin-et-al/. Vertical axis: Frequency (0 to 45). Tags on the horizontal axis: socialbookmarking, Social_networking, network, library, Information, taxonomy, socialtagging, indexing, articles, academic, cataloging, kcb201, classification, bookmarking, Web2.0, research, socialnetworking, del.icio.us, folksonomies, article, tags, folksonomy, collaboration, social, tagging]
Tagging Patterns
• Consensus forms after a certain number of
users have tagged an item
– the first item was tagged by 2250 people, the second by only 49
• frequency graphs suggest a relative consensus
on terms, but tag lists and co-word graphs do
not
– high frequency tags used frequently but not
necessarily with other high frequency terms
– tagging patterns may show group consensus and
trends in user communities.
Tag Lists
• Shirky 2005
(http://del.icio.us/url/97c30ea798555e7b8380bc1f4925233d):
• by nayma to folksonomy tags web2.0 ontology
• by zeft to ontology
• by chrysoberyl to 2.0 libraries thinky
• by peleke12 to ontology shirky tagging
• by alisaepstein to folksonomy folksonomies
tagging web2.0 653
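Tag frequency and co-word counts can both be derived from tag lists like the ones above. The sketch below uses only the five posts listed on this slide; it treats the trailing "653" as one of alisaepstein's tags, which is an assumption, and the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

# The five posts listed above: user -> tags.
posts = {
    "nayma": ["folksonomy", "tags", "web2.0", "ontology"],
    "zeft": ["ontology"],
    "chrysoberyl": ["2.0", "libraries", "thinky"],
    "peleke12": ["ontology", "shirky", "tagging"],
    "alisaepstein": ["folksonomy", "folksonomies", "tagging", "web2.0", "653"],
}

# Tag frequency: how many posts used each tag.
tag_freq = Counter(tag for tags in posts.values() for tag in set(tags))

# Co-word counts: how often two tags were applied together in the same post.
co_word = Counter()
for tags in posts.values():
    for pair in combinations(sorted(set(tags)), 2):
        co_word[pair] += 1

top_tag, top_count = tag_freq.most_common(1)[0]
top_pair, pair_count = co_word.most_common(1)[0]
```

On this tiny sample "ontology" is the most frequent tag (3 of 5 posts), yet the only pair that co-occurs more than once is "folksonomy" with "web2.0". Five posts prove nothing, but they show how a frequency list and a co-word list can tell different stories.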
Co-word Graph of Tags
Comparison of Tags with Controlled
Vocabulary
• Study 1: tag use and types of tags on articles,
compared with subject headings, on CiteULike (like
del.icio.us, but it indexes journal articles, which
have more metadata)
– most common relationship between the terms was
"related but not in the thesaurus"
– next most common were related term (RT) and then equivalence
• Study 2: comparison of tags and LCSH on
LibraryThing
– without further context, it is extremely difficult
to tell whether an apparently anomalous tag in a
tag cloud is a mistake
Non Subject Tags
• some time and task or affective tags are very
popular
– cool, fun, funny, toread appeared in main del.icio.us
tag cloud
• ToRead and fun are popular tags on all three
sites
• affective terms appear more frequently on
CiteULike and Connotea than expected
– biology articles more often listed as toread; math and
physics as fun
Utility of Tagging
• tagging can be useful for providing a good
picture of how users see the material
– Steve Museum project: found that users used
very different terminology and tagged specific
items seen in the picture which had been
absent from professional cataloguing
Tagging Discussion
• tagging has all the problems of free text
search/automatic indexing
• but, tag groups tend to converge on a useful set
of terms after a threshold number of users
• users use some terminology which is rare or
completely absent from subject heading lists
(e.g. time and task tags)
• user terms often not part of formal thesaurus
Social Bookmarking Characteristics
• Common elemental characteristics of social
bookmarking (folksonomic) systems.
– Tag – a single word label that is applied to an object
(URL)
– Tagging – the process of organizing an object by
assigning a label or “tag”
– Tag bundle – a group of tags linked by another tag or
“super tag”. Bundles are a way to group together
common tags. For instance, if you have the tags
"design", "painting", and "moma", you may want to
group these together into a bundle called "art".
– Tag cloud - a visual weighted list of a set or subset of
tags
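The bundle example above ("design", "painting", and "moma" grouped under "art") can be sketched as a simple mapping from a bundle name to its member tags. The function names and the sample bookmarks are hypothetical; real bookmarking services may store bundles differently.

```python
def resolve_bundle(bundles, name):
    """Return the set of tags grouped under a bundle name."""
    return set(bundles.get(name, ()))

def items_in_bundle(tagged_items, bundles, name):
    """Return items carrying at least one tag from the bundle, sorted by URL."""
    wanted = resolve_bundle(bundles, name)
    return sorted(item for item, tags in tagged_items.items() if wanted & set(tags))

bundles = {"art": {"design", "painting", "moma"}}   # example from the text

# Hypothetical bookmarks, for illustration only.
tagged_items = {
    "http://example.org/exhibit": ["moma", "nyc"],
    "http://example.org/fonts": ["design"],
    "http://example.org/recipe": ["cooking"],
}

art_items = items_in_bundle(tagged_items, bundles, "art")
```

A bundle adds one level of grouping without changing the tags themselves, so the underlying vocabulary stays flat.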
Folksonomies and user
vocabulary
• In information retrieval systems (IRS), the
vocabulary used to organize content may be based
upon the choices of the authors of the materials, the
designer of the IRS, or the designer of the controlled
vocabulary in place.
• Folksonomies reflect users’ choices in diction,
terminology, and precision.
• Folksonomies can adapt very quickly to changes in
user needs and vocabulary, and adding new terms to
a folksonomy incurs virtually no cost for either the
user or the system.
Folksonomies and online
communities
• Folksonomies create a sense of community amongst
their users. Most social bookmark managers will
recommend new links and other members’ folders or
sites that are strongly related to an individual
member by analyzing his or her linking pattern.
• As soon as users assign a tag to an item, they can
see the cluster of items carrying the same tag. This
feedback loop leads to a form of asymmetrical
communication between users through metadata.
• The users of a system negotiate the meaning of the
terms in the folksonomy.
Ambiguity
• The terms in a folksonomy may have
inherent ambiguity as different users
apply terms to documents in different
ways.
– E.g., the tag “ANT” has been used to refer
to “Actor Network Theory”, a sociological
term, as well as Apache Ant, a Java
programming tool
Polysemy
• The polysemous tag “port” could refer
to a sweet fortified wine, a porthole, a
place for loading and unloading ships,
the left-hand side of a ship or aircraft,
or a channel endpoint in a
communications system.
Synonyms
• Folksonomies provide for no synonym
control; the terms “mac”, “macintosh”,
and “apple”, for example, are used to
describe Apple Macintosh computers.
• Both singular and plural forms of terms
appear (e.g., flower and flowers), thus
creating a number of redundant
headings.
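One crude programmatic response to the singular/plural problem is to fold a plural tag into its singular form only when both forms already occur in the vocabulary. The sketch below is an illustrative heuristic under that assumption, not a recommended stemmer: it only handles a trailing "s" and will miss pairs such as "category" and "categories".

```python
from collections import Counter

def merge_plurals(tag_counts):
    """Fold a tag ending in 's' into its singular form when both are present."""
    vocabulary = {tag.lower() for tag in tag_counts}
    merged = Counter()
    for tag, n in tag_counts.items():
        key = tag.lower()
        singular = key[:-1]
        # Only merge when the singular form is itself a tag in use,
        # so words like "glass" are left alone.
        if key.endswith("s") and singular in vocabulary:
            key = singular
        merged[key] += n
    return merged

# Hypothetical counts; "flower" and "flowers" are the example from the text.
tag_counts = {"flower": 3, "flowers": 2, "Flowers": 1, "glass": 4}
merged = merge_plurals(tag_counts)
```

Requiring the singular to be attested keeps the rule conservative, at the cost of leaving many redundant headings untouched.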
Specificity
• Related terms that describe an item
vary along a continuum of specificity
ranging from very general to very
specific; so, for example, documents
tagged “perl” and “javascript” may be
too specific for some users, while a
document tagged “programming” may
be too general for others.
Syntax
• Folksonomies provide no guidelines for
the use of compound headings,
punctuation, word order, and so forth;
for example, should one use the tag
“vegan cooking” or “cooking, vegan”?
Incorrect Usage
• Tags could be applied incorrectly; the
term “archeology”, for example, is used
to tag items pertaining to both
dinosaurs and primitive microbes
Consensus
• Users strive to achieve a degree of consensus over
the general meaning of tags.
• As a URL receives more and more bookmarks, the
set of tags used in those bookmarks becomes stable
across different users.
• This stabilization is facilitated through imitation and
shared knowledge. Del.icio.us shows users the tags
most commonly used by others who bookmarked
the same URL already; users can easily select those
tags for use in their own bookmarks, thus imitating
the choices of previous users.
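The suggestion mechanism described above can be sketched as "return the tags most often applied to this URL so far". This is an assumption about the general idea, not Del.icio.us's actual algorithm; `suggest_tags` and the sample bookmarks are hypothetical.

```python
from collections import Counter

def suggest_tags(bookmarks, url, limit=5):
    """Return the tags most often applied to `url` by earlier users.

    `bookmarks` is an iterable of (url, tags) pairs, one per user bookmark.
    """
    counts = Counter()
    for saved_url, tags in bookmarks:
        if saved_url == url:
            # Count each tag once per bookmark, in a deterministic order.
            counts.update(sorted(set(tags)))
    return [tag for tag, _ in counts.most_common(limit)]

# Hypothetical bookmarks, for illustration only.
page = "http://example.org/essay"
bookmarks = [
    (page, ["ontology", "tagging", "folksonomy"]),
    (page, ["ontology", "tagging"]),
    (page, ["ontology"]),
    ("http://example.org/other", ["recipes"]),
]

suggestions = suggest_tags(bookmarks, page, limit=2)
```

Because each new user sees and can reuse the leading tags, popular choices tend to be reinforced, which is one way the stabilization described above could arise.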
Folksonomies and controlled
vocabularies
• Folksonomies are not necessarily antithetical
to controlled vocabularies.
• Once you have a preliminary system in place,
you can use the most common tags to
develop a controlled vocabulary that truly
speaks the users’ language
– E.g., you can link related tags such as “nyc,”
“newyork,” and “newyorkcity”; it may be possible
to align these terms with established controlled
vocabularies, such as the Getty Thesaurus of
Geographic Names, in order to provide a greater
range of related terms.
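Linking related tags, as in the "nyc" example above, can be sketched as a synonym ring that maps each variant to one preferred term. The three variants come from the text; the preferred label "New York City", the function names, and the sample input are assumptions. The sketch does not query the Getty Thesaurus of Geographic Names or any other external vocabulary.

```python
# Each preferred term maps to the variant tags that should be linked to it.
# The variants come from the text; the preferred label is an assumption.
SYNONYM_RINGS = {
    "New York City": {"nyc", "newyork", "newyorkcity"},
}

def build_lookup(rings):
    """Invert the rings into a variant -> preferred-term lookup table."""
    return {
        variant: preferred
        for preferred, variants in rings.items()
        for variant in variants
    }

def normalize(tags, lookup):
    """Replace known variants with their preferred term; keep other tags as-is."""
    seen = []
    for tag in tags:
        term = lookup.get(tag.lower(), tag)
        if term not in seen:      # drop duplicates created by the mapping
            seen.append(term)
    return seen

lookup = build_lookup(SYNONYM_RINGS)
normalized = normalize(["NYC", "newyork", "pizza"], lookup)
```

The users' own tags stay untouched in storage; the mapping is applied at indexing or search time, so the folksonomy and the controlled vocabulary can coexist.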
Other uses for folksonomies
• Could be used to organize resources for an intranet,
course collection, etc.
• Could be used to enhance the customizable features
of library catalogues. Clients could organize and tag
items of interest from the catalogue, as well as
external sources (if allowable).
– Could share these tags and sources with other clients with
similar interests. This could lead to a user-directed reader
advisory service.
– Could use folksonomies to supplement existing LCSH
vocabulary in the catalogue, e.g., LCSH does not contain
terms for the popular film genres “cult”, “drama,” or
“action.”
Advantages of Social Bookmarking
• Low “cognitive” cost – large grassroots
community users vs. expert metadata
specialists or catalogers
• Self moderating and democratic
• Flexible, inclusive, adaptive and current
• Immediate Feedback
• Usability – easy to use
• Great at serendipitous discovery
Disadvantages of Social
Bookmarking
• Low Precision/Recall due to synonymous and
polysemous tags
• Basic Level problem – Granularity of tags (too specific,
too general)
• Lack of hierarchy – no parent-child, broad-narrow
relationship
• Highly susceptible to malicious users.
– Meta Noise - incorrectly "malicious" tags
– Gaming - cheating the system
– Spamming - a universal plague of all social systems
• Fails as a search system, bad at finding specific items
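The effect of polysemous and synonymous tags on precision and recall can be made concrete with a small worked example. Precision is the share of retrieved items that are relevant; recall is the share of relevant items that were retrieved. The collection below is hypothetical and reuses the polysemous tag "port" discussed earlier.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical collection: item -> tags.
items = {
    "wine-guide": {"port"},
    "harbour-map": {"port"},
    "tcp-tutorial": {"port"},
    "socket-howto": {"networking"},
}
relevant = {"tcp-tutorial", "socket-howto"}    # what a networking searcher wants
retrieved = {name for name, tags in items.items() if "port" in tags}

precision, recall = precision_recall(retrieved, relevant)
```

Searching on "port" retrieves three items, only one of them relevant (precision 1/3), and misses the relevant item tagged "networking" (recall 1/2): polysemy hurts precision, while the lack of synonym control hurts recall.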
Conclusions
• Folksonomies are undoubtedly fraught with
the problems typical of uncontrolled
vocabularies, but their growing popularity
suggests that people are interested in, and
motivated to assign, their own metatags to
items of interest.
• One cannot help but wonder whether such
enthusiasm for metadata would be the same
if people were asked to use only prescribed
and standardized vocabularies.
Other Areas to Explore
• The cognitive and behavioural aspects of
folksonomy use:
– What is the tagging behaviour of people who use
folksonomies?
– Why do people choose the tags they use; what
motivates them to modify these tags; how often
do they modify them?
– How are folksonomies used communally?
– How do folksonomies foster consensus in the use
of tags?
– How does the community affect which tags are
used and how?
Folksonomies in Libraries
• Libraries can’t continue to rely exclusively on in-house
cataloging
• We can achieve our overall goals while allowing new
mechanisms along the way
• Users are one additional source of metadata we must
tap
• We must match appropriate metadata needs to the tasks
users are best equipped to perform
• Good interfaces for metadata collection will be key
• We must use the best ideas for user participation, and
adapt them for the library environment