When Information Technology “Goes Social”
Marti Hearst
UCB SIMS
SLA Meeting, Oct 19, 2000

The SLA Hearst

Standard Information Retrieval Ranking
Problem: there is a lot of useful information available. Which pieces in particular does the user want to view right now?
Procedure:
– match the words that reside in documents against the words users state in their query
– combinations of words are proxies for the underlying meaning of the documents

Standard Web Search Engine Architecture
(Diagram: crawl the web; check for duplicates, store the documents; DocIds; create an inverted index; search engine servers; inverted index; user query; show results to user)

Document Processing Steps
(Figure from Baeza-Yates & Ribeiro-Neto)

Spam
Email Spam:
– Undesired content
Web Spam:
– Content is disguised as something it is not, in order to
  » Be retrieved more often than it otherwise would
  » Be retrieved in contexts that it otherwise would not be retrieved in

Web Spam
What are the types of Web spam?
– Add extra terms to get a higher ranking
  » Repeat “cars” thousands of times
– Add irrelevant terms to get more hits
  » Put a dictionary in the comments field
  » Put extra terms in the same color as the background of the web page
– Add irrelevant terms to get different types of hits
  » Put “free beer” in the title field in sites that are selling cars
– Add irrelevant links to boost your link analysis ranking
There is a constant “arms race” between web search companies and spammers

Information Retrieval Goes Social
A new way of
– using words as weapons?
– using words to mean other than what they say?
– subliminal authoring?
Ranking algorithms must now adopt a posture of “defensive searching”.

Information Retrieval Goes Social
Thirty years of research on ranking algorithms had never remotely considered what happens when IR goes social. For the underlying assumptions of IR, these problems are almost absurd.
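
The word-matching procedure and the term-repetition spam described above can be sketched in a few lines. This is only an illustration under stated assumptions: the score is a raw sum of term counts (real engines use more elaborate weighting), and the names build_inverted_index and rank and the two sample pages are hypothetical, not taken from the talk.

```python
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to a dict of {doc_id: number of occurrences}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index


def rank(index, query):
    """Order documents by the summed raw counts of the query terms (assumed scoring)."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    return [doc_id for doc_id, _ in scores.most_common()]


# Hypothetical pages: one honest listing, one stuffed with repeated and irrelevant terms.
docs = {
    "honest": "used cars for sale with full service history",
    "stuffed": "cars cars cars cars cars free beer",
}
index = build_inverted_index(docs)

print(rank(index, "cars"))       # the stuffed page ranks above the honest one
print(rank(index, "free beer"))  # only the stuffed page is retrieved
```

A ranker that trusts word counts this literally has no defense against an author who chooses words to mislead it, which is the sense in which these problems look absurd from inside the classic IR assumptions.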

Information Retrieval Goes Social
The IR side isn’t all innocent
– issues of ranking sites higher in return for payoffs

Traditional Information Filtering
At least 20 years of research on information filtering
– A stream of information flows by; filter out the items not of interest, or retain those of interest
Focus: how to identify which documents are about a particular topic
– financial news, terrorist activity
A classification problem
Usually single-user judgements only

Information Filtering Goes Social
ABC news call ~1993. They’ve heard about categorization software. They want to identify:
– news programming about sex and violence
With the WWW, rapid commercial adoption of filtering software for:
– adult content. This was not on the research radar screen.
– Major use of filtering now: taste alignment.
– Major technique: pooled judgements
  » Examples: Ringo, DirectHit

Why Does this Happen?
Computer Scientists are not trained to think of the social interactions in the use of their systems
There wasn’t good reason to see this happening soon.
– The PARC Tapestry project (CACM 35 (12), 1992)
– Collaborative Filtering, but ahead of its time

Domain Names

Hypertext Then
Proceedings of ACM Hypertext 89
– 28 papers:
  » Navigation, engineering, knowledge representation, implementation & interfaces, applications, IR, usability of links, fiction and writing
– 9 panels:
  » Interchanging hypertexts
  » Narrative and Consciousness
  » Lessons from ACM hypertext project
  » Indexing
  » Expert Systems
  » Higher Education: A Reality Check
  » Software Engineering
  » Cognitive Aspects
  » Confessions: What’s Wrong with our Systems

Hypertext Then
Much discussion on
– semantics of link types
– navigation paths
– not getting lost (still an issue!)
– how to author documents
What about social implications?

Hypertext Then
Two papers are relevant.
– Amy Pearl, Sun’s Link Service: A Protocol for Open Linking
  » discusses use of a separate link repository to allow linking between objects that reside on different systems
  » simply assumes bidirectional linking
  » concerned with technical difficulties
– Bob Glushko, Design Issues for Multi-Document Hypertexts
  » considers the question of whether links should be allowed outside of documents
  » concludes they should, but in a cautionary manner

Course Gedanken Experiment
What happens if bi-directional links are possible? Required?
My naïve pre-social CS-y thoughts:
– easier to link footnotes and their citations
– easier to link papers, lectures, to author’s home page
– easier to find related information

Course Gedanken Experiment
What the socially-savvy SIMS students said about bi-directional links
Basically, overall a negative thing.
– link “spamming”
  » people who hate Microsoft overburdening them with links
  » sexual harassment
– use for false endorsements
– alliances, negotiation for cross-linking, a link market
– inability to hide confidential information
– advertisers would be affected
– redirect unwanted links to another page

What’s Going On?
Before going social, most hypertext was
– within a single “document” or user group
– incompatible with outside hypertext
– seen as useful as a new way of reading complex documents
After going social, hypertext is
– seen as useful for linking information in quite far-flung places, assembled by people who don’t know each other or have access to each other’s systems
– social issues follow
Without appropriate safeguards, pages might also have to adopt “defensive linking”

CS and the Social Sciences
A subset of CS has long engaged with social sciences and humanities
Artificial Intelligence (since early 60’s)
  » psychology (cognitive science)
  » linguistics
  » philosophy
More recently, HCI
  » human-computer interaction
  » psychology (cognitive science, human factors)
  » ethnography
But … sociology … NOT

What is this leading to?
I might be suggesting the topic: how should CS research be changed?
Instead, I think these effects are interesting in their own right.

Turning the Tables
The standard way to incorporate a field (like sociology) into a CS project would be for the purposes of building better systems.
– recommend information better
– filter information better
Instead, what if the goal is to build systems to better understand society?

Talk Re-Cap
When information processing systems “go social” they are used in radical, often unexpected ways
Now that information processing systems have gone social, it is time to use them to help us better understand society
Let’s Turn the Tables
– Create technology to aid study of society