Proceedings of the 39th Hawaii International Conference on System Sciences - 2006

Genres of Spam: Expectations and Deceptions

Wendy L. Cukier, PhD, Ryerson University
Susan Cody, PhD, Ryerson University
Eva J. Nesselroth, MA Comm, Ryerson University

Abstract

This paper is a pilot study that explores how the concept of genre can be applied to the massive set of digital documents known as ‘spam’. The authors studied 300 spam messages collected over 15 weeks from a university email system. Messages were coded based on content, form and specific features as well as on the manifest relationship to existing genres of communication. The paper argues that spam is not a single genre but many genres. For the most part, the genres evoked in spam are adaptations of print to Internet, including information artifacts, pamphlets, business cards, order forms, bulletins, advertisements, and “Nigerian letters”. With spam, however, the concept of genre operates at several levels. Often, there is a contradiction between the manifest genre and the underlying purposes. The paper concludes that spam exploits genre by conforming to known forms while at the same time breaching those norms.

1. Introduction

Why examine spam? Spam is one of the principal forms of communication on the Internet today. The rapid growth of the Internet has spurred a quick and adaptive evolution of new forms of spam that take advantage of the speed, breadth, and accessibility of the medium. While much attention has been paid to the prevention of spam, little analysis has been conducted on the content or purposes of spam. This paper applies the concept of genre to the analysis of spam and identifies both the applicability and limitations of current genre analysis to this electronic communication category.

Throughout this study, we ask key questions regarding spam and genre. For example, what is the importance of studying spam?
How have researchers examined spam? What characteristics of spam have allowed it to elude more inquisitive study? What contribution can a change of focus make to the understanding of issues in genre theory and its application to digital documents? How can this focus help researchers map the evolution of genres within digital documents? Spam is typically treated as a single document set. This is because it has been viewed primarily from the perspective of the typical receiver in most deliberative instances as a “constrained organizational actor” [27]. By definition, genre presupposes or depends upon contextual or situational coherence, a community, or an organization [49]. As Orlikowski, Yates, and Yoshioka indicate, genres of communication are “socially recognized types of communicative actions — such as memos, meetings…that are habitually enacted by members of the community to realize particular social purposes” [49]. Genre analysis, according to these authors, is a useful way to examine “how a community communicates” [49]. Genres may also extend to the formation of social relationships which can have a significant effect on the quality of work [7]. Toms and Campbell argue: “Recognizing genre will facilitate effective user-document interaction” and so a particular “genre can be seen as an interface metaphor” [60]. However, the overall character of spam works against genre recognition. Individual spam genres employ misdirection or disguise in purpose and/or function in order to (1) bypass filters; (2) invite the receiver’s attention; or (3), in the cases of some advertisement types, to disarm the subject and overcome resistance to selecting the product/service. 
0-7695-2507-5/06/$20.00 (C) 2006 IEEE

Investigations have dealt with spam in terms of sector representation, ratios and costs to organizations and, more recently, researchers have taxonomized types of spam that present threats to systems. The reasons for these research choices are measurably evident. According to MessageLabs, a US consultancy firm, spam now accounts for around 65% of all email traffic [39]. For the European Commission (EC), the cost of spam to Internet users worldwide amounts to some 10 billion Euros per year [39]. A recent study estimated that loss of productivity due to spam messages represents an annual cost of about $1930 per employee [39]. In another study conducted by the Radicati Group, researchers found that 4% of corporate email users and 11% of consumer users had lost money to an email scam [53]. As many as 39% had clicked on a link in a spam message, and 13% of corporate users and 11% of consumers had purchased a product advertised via spam [53]. And because of the low cost of producing and sending spam, responses from as few as 0.00001 per cent of targets make this enterprise profitable [43].

Our study shows that genre analysis needs to recognize that one genre can mask another and that it may be necessary to penetrate the multiple layers of a typical spam message to uncover the message’s true intent. Thus, genre becomes a tool for deception. Fraudulent emails arrive disguised as ordinary communications which, at first glance, take the shape of a recognizable form. For example, a spam “memo” may evoke the expectations of “memo” in order to catch the reader’s attention and win the reader’s confidence. A memo is a recognizable genre and common format of email messages. On closer analysis of its structural content, the memo may in fact reveal itself to be an advertisement for a product.
Yet, at the same time, the memo-advertisement may in fact not be an advertisement at all. Its real purpose may be to harvest email addresses, ‘phish’ for consumer information, distribute a virus, or perpetrate fraud.

2. Literature Review

Because this paper brings together several topics, we have reviewed the literature on spam, genre, advertising and the Internet to provide the relevant context for our analysis.

2.1. Spam

Zeltsin defines spam as including all electronic messages that are unsolicited or unwanted, sent to a large number of users (in bulk), without regard to the identity of the individual user, and usually having commercial purposes [65]. These can also include viruses that spread via email, or fraud and scam mechanisms. Estimates from early 2002 showed that one out of twelve email messages fit the description of spam. During 2002, spam numbers escalated to an average frequency of one out of three email messages. In 2003, Ferris Research estimated that an Internet user takes an average of 4.4 seconds to handle a spam message and that approximately 20 billion such messages are sent each day. Cumulatively, the time spent handling spam approaches 25 million hours per day [20]. Participants of the 2003 World Summit on the Information Society (WSIS) recognized that spam is a “significant and growing problem for users, networks and the Internet as a whole” [61].

To date, most research on spam has examined its negative impacts, technical characteristics, regulatory issues and the technologies to prevent it from overwhelming “legitimate” communications. Many scholars are concentrating on developing a web spam taxonomy in an effort to identify instances of spam, to prevent spam and to counterbalance the effect of spamming [12, 17, 26, 31, 51]. The literature explores emerging forms of spamming and the ways in which spam is used to collect email addresses, distribute viruses and perpetrate deceptive marketing practices or fraud.
For example, spam masquerading as advertising or gibberish is often designed to circumvent filters with the sole purpose of “harvesting” email addresses. In “brute force” and “dictionary” attacks, spam programs send spam to every possible combination of letters at a domain, or to common names and words [9]. Spammers continually discover new techniques to send spam. Spam may mix content and use orthographic inventions (e.g., ‘sec’s’ in lieu of ‘sex’) and gibberish (e.g., Subject: “Pittsburgh pullover diorite chimera bray”) to avoid lexical detection by filters. Viruses, worms, and malware, such as Melissa, Love Bug and MyDoom, also use spamming techniques to propagate after a user unwittingly activates them. Viruses and worms may install open proxies that can be used to relay spam or to install software which transforms a computer into a “zombie” (i.e., a computer owned by an unsuspecting user, through which spam is sent).

Phishing attacks steal consumers’ personal identity data and financial account credentials. Social-engineering schemes use ‘spoofed’ emails to lead consumers to counterfeit websites designed to trick recipients into divulging financial data such as credit card numbers, account usernames, passwords and social security numbers. By hijacking the brand names of banks, e-retailers and credit card companies, phishers often convince recipients to respond. Technical subterfuge schemes plant crimeware onto PCs to steal credentials directly, often using Trojan keylogger spyware. Pharming crimeware misdirects users to fraudulent sites or proxy servers, typically through DNS hijacking or poisoning [4]. Phishing and scams are distributed as spam, directly leading to identity theft and fraud. Phishing spam increased 52 per cent in January, 2004. The statistics show that the response rate to this type of fraud is around five per cent [65].
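The orthographic inventions and gibberish subject lines described above can be illustrated with a small sketch. The Python fragment below is an illustrative assumption, not a filter used in any of the cited studies: the word list `FLAGGED_WORDS` and the function `naive_lexical_filter` are hypothetical names. It shows why exact word matching, the simplest form of lexical detection, misses a respelled word or a string of unrelated words.

```python
import re

# Hypothetical list of flagged words; real filters use far larger rule sets.
FLAGGED_WORDS = {"sex", "viagra"}

def naive_lexical_filter(subject: str) -> bool:
    """Return True if any token in the subject exactly matches a flagged word."""
    tokens = re.findall(r"[a-z]+", subject.lower())
    return any(token in FLAGGED_WORDS for token in tokens)

if __name__ == "__main__":
    samples = [
        "cheap sex pills",                           # plain spelling
        "hot sec's tonight",                         # orthographic invention
        "Pittsburgh pullover diorite chimera bray",  # gibberish subject line
    ]
    for subject in samples:
        verdict = "blocked" if naive_lexical_filter(subject) else "passed"
        print(f"{verdict}: {subject}")
```

Only the first subject line is caught. The respelled word tokenizes to "sec" and "s", neither of which is on the list, and the gibberish line contains no flagged word at all.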
While there is a growing body of research on online advertising, surprisingly little attention has focused on examining the types or functions of spam. Most of the research to date on spam has been motivated primarily by an interest in technological and regulatory issues. An early study published in the Association for Computing Machinery (ACM) classified 400 unique messages sent to the AT&T and Lucent subdomains under study for three months in 1997 [3]. The leading categories were money-making opportunities (36%); and adult entertainment including singles services, and sexually oriented products or services (11%) [3]. Regulatory issues were the main focus of the article. More recently, the Federal Trade Commission (FTC) analyzed a sample of 1000 items from 11 million collected messages and again, investment/business opportunities (20%), adult-oriented spam (18%), and finance (17%) were the most common categories [19] (See Table 1). For example, the FTC study noted, in spite of the Direct Marketing Association guidelines which stipulate that messages must provide instructions for removal, only 36% of messages in the sample provided these [59]. Further, comments from email administrators suggest that many of these instructions were likely faulty or deliberately misleading [19]. The study also notes that fewer than 10% of the sample identified the name, postal address, phone number, and email address of the sender [19]. Similarly, Jacobsson and Carlsson’s experiment with false email accounts corroborates the failure of most spam emails to conform to regulatory requirements including identifying the sender and providing options to unsubscribe [32]. Again, the focus of these studies was the regulatory compliance of mass mailings. Despite the preponderance of these messages, research has been almost exclusively forensic. Spam content, style, and genre have been largely overlooked and few researchers have gone beyond broadly categorizing the types of spam messages. 
One exception is Orasan and Krishnamurthy’s investigation of the linguistic characteristics of junk email based on an analysis of 673 files, comparing them to a corpus of leaflets extracted from the British National Corpus (BNC) [44]. They noted a number of linguistic differences including shorter sentences, limited vocabulary and increased use of personal pronouns such as “you” in spam [44]. The researchers examined the occurrence of key words such as: free, money, investment, credit, fast, Internet, email, sex, weight and miracle [44]. However, the article is primarily descriptive, drawing few implications.

2.2. Genre

2.2.1. Genre History. Northrop Frye, the Canadian literary scholar, is probably the most cited source on genre. In the book, Anatomy of Criticism, Frye proposes that virtually all literature could be categorized according to universal genres with defined structure, rules and characteristics [25]. Genres are the literary conventions or “codes” associated with particular forms, for example epic, tragedy, and allegory. Miller defines genre as “typified rhetorical action based in recurrent situations” [41]. Swales notes that genres have similar structures, stylistic features, content and intended audiences [58]. However, as Chandler notes in “An Introduction to Genre Theory”, hierarchical taxonomy of genres is not a neutral or “objective procedure” [10]. “A genre is ultimately an abstract conception rather than something that exists empirically in the world”, notes Feuer [21]. Thus, one theorist’s genre may be another’s sub-genre or even super-genre; and what is technique, style, mode, formula or thematic grouping to one, may be treated as a genre by another. Miller suggests that “the number of genres in any society…depends on the complexity and diversity of the society” [41].
Some have focused not only on exploring the conventions of the form, but also on the “interpretive and cultural-historical aspects of compound mediation that are so important in understanding the use of documents” [56]. In other words, in addition to considering the questions ‘What is the purpose of this genre?’ and ‘What material goes into one?’ the academic approach to genre also explores the social and political dimensions of the context from which the genre emerged [1, 7].

2.2.2. Genre and Advertisements. The study of advertising and consumer behavior has a long history among both marketing and communications scholars. In 1987, Holbrook and Batra explored the effects that certain types of ads could produce on specific groups of people [30]. Advertisement content and structure have likewise been studied by Mick and by Mitchell and Olson [40, 42]. Laskey, Day and Crask argue that advertisements should be categorized not only by the message content (i.e., what is said) but also by structured format (i.e., how it is said) [36]. The message’s construction is an important feature, as articulated by Eldridge and by Wells, Burnett and Moriarty [18, 63]. Scholars of advertising have employed genre, narrative and other concepts as analytic tools. Stern, for example, explored the use of classical allegory in contemporary advertising [57]. Communications scholars have used genre to explore various forms of media, for example, when examining the advertising aimed at children [28]. Puto and Wells distinguished information and transformation advertising, and Holbrook and Hirschman examined the role of fantasies, feelings and fun in advertising [52, 31].

2.2.3. Genre and Electronic Communication. Genre and genre repertoire have been proposed as analytic tools for investigating the structuring of communicative practices within a community.
For example, Orlikowski and Yates propose that genres of organizational communications (memos, meetings, expense forms, training seminars) are habitually enacted by members of a community for particular social purposes [46]. Subsequently, they examined the communication exchanged by a group of distributed knowledge workers in a multiyear, inter-organizational project and suggest that the group’s communicative practices evolved in response to community norms, project events, time pressure and media capabilities [45]. Building on structuration theory, Orlikowski and Yates describe iterative relationships between communications genres and organizational practices [45]. This approach has subsequently been applied in other contexts; for example, Crowston and Williams initially analyzed 100 websites as a pilot project, and subsequently extended their analysis to 1000 websites [13]. They conclude that genres provide a useful tool for analyzing uses of the Web [13]. In Crowston’s subsequent work, he notes the limitations of top-down genre analysis using pre-existing categories and explains that bottom-up analysis allows for multi-dimensional definitions of genre as they become apparent, thus providing more flexibility in the face of limited forms [12].

Others have examined personal home pages as an emerging genre. Structural features included the presence of personal information, formulaic welcome messages and iconographic technical features. Davidson explores genres in the context of medical information systems [15]. Herring et al. examine Weblogs in the context of traditional and new media [29]. Atunes and Costa explore genres of electronic meeting communications [6]. Akesson et al. explore the genre of on-line newspapers, examining content, form and functionality [3]. Genre hierarchies, embedded genre and genre systems, genre repertoires, and genre change are discussed by Crowston and Williams [13].

2.3. Advertising and the Internet

Scholars have only recently begun to examine issues related to advertising on the Internet. For example, Palmer conducted a genre-based analysis of target advertisements or “netvertising” focusing on linguistic characteristics [50]. Moreover, Shamdasani et al. examine the roles of website reputation and of the relevance between the website content and the banner ad’s product category [54]. They differentiate between high-involvement products, which are relevance-driven, and low-involvement products, which are reputation-driven. In addition, Kunz and Osborne analyze new formats of advertising on the Internet with a focus on banner and streaming media ads and their impact [35]. McMillan explores the array of forms and ways in which online advertising differs from conventional advertisements, particularly in terms of the compression of the selling cycle, interactivity, intrusiveness, and the capacity for personalization [38]. She builds a typology of Internet advertising based on function such as brand building messages, corporate communications, direct response messages, and electronic transactions. Her two-by-two matrix differentiates location (external/internal) and purpose (communication/call to action) [38].

3. Our Study

Our sample comprises 300 spam items received in a single organizational account over a limited period of time. This set evaded the organization’s spam filters. We analyzed the set using a series of levels, including sector, heading, relationship between heading and content, rhetorical purpose, content, structure, tone and other features and forms characteristic of genre analysis. We compare our results to previous studies on electronic media and genre.

3.1. Purpose

This project represents the first phase of a longitudinal study of spam in which we explore the potential application of genre analysis.
In this paper, we attempt to situate spam in the context of traditional as well as emerging electronic communication genres. We seek to characterize the properties of spam based on a temporally limited sample. Our principal purpose is to provide an empirical snapshot of spam with the intention of examining its uses of genre. This is intended to represent the first part of a larger longitudinal study which will analyze the evolution, adaptation and uses of genre.

3.2. Data Collection

Spam messages that passed through the filter at a large Canadian university were collected over a 15-week period (February 21, 2005 to June 12, 2005). A total of 300 messages were collected for analysis. These were messages that had passed through the spam filter process and represent a tiny fraction of the spam received by the university. The university’s email server receives approximately 200,000 email messages each day. Of these, two thirds are typically blocked by the server using Postfix and another 10% are flagged as spam.

The university uses a multi-layered approach to the spam problem. First, black-lists of known sources of spam or viruses are used to block messages from high-risk sites. Next, Postfix, the mail transfer application, applies a basic filter and, in cases of virus or spam attacks, can be used to delete messages of a particular type. Third, a virus check is performed by the server. At the fourth level of filtration, SpamAssassin, a Perl-based program, acts as the spam filter. If an email fails some of these tests, it may be sent to the recipient’s inbox with **SPAM** noted in the subject line. However, only one quarter of the spam used in this study was flagged with this subject header. Also, the occurrence of flagged words (e.g., ‘penis’) will lessen an email’s chances of reaching the inbox.
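The four filtering layers described above can be sketched as a chain of checks. The Python fragment below is a simplified illustration under stated assumptions, not the university's configuration: it does not call Postfix or SpamAssassin, and every name in it (`Message`, `BLACKLISTED_HOSTS`, `BLOCKED_PHRASES`, `FLAGGED_WORDS`, `deliver`) and every rule value is hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional

@dataclass(frozen=True)
class Message:
    sender_host: str
    subject: str
    body: str
    has_virus: bool = False  # stand-in for the result of a real virus scan

# Hypothetical rule data; the study does not publish the actual lists.
BLACKLISTED_HOSTS = {"bulk-mailer.example"}
BLOCKED_PHRASES = ("known worm signature",)
FLAGGED_WORDS = ("replica", "preapproved")

DROP, FLAG = "drop", "flag"
Layer = Callable[[Message], Optional[str]]

def blacklist_layer(msg: Message) -> Optional[str]:
    # Layer 1: block messages from known high-risk sources.
    return DROP if msg.sender_host in BLACKLISTED_HOSTS else None

def basic_filter_layer(msg: Message) -> Optional[str]:
    # Layer 2: basic filter that deletes messages of a particular type.
    text = msg.body.lower()
    return DROP if any(phrase in text for phrase in BLOCKED_PHRASES) else None

def virus_layer(msg: Message) -> Optional[str]:
    # Layer 3: virus check.
    return DROP if msg.has_virus else None

def scoring_layer(msg: Message) -> Optional[str]:
    # Layer 4: content tests; a failing message is flagged, not dropped.
    text = f"{msg.subject} {msg.body}".lower()
    return FLAG if any(word in text for word in FLAGGED_WORDS) else None

LAYERS: List[Layer] = [blacklist_layer, basic_filter_layer, virus_layer, scoring_layer]

def deliver(msg: Message) -> Optional[Message]:
    """Return the message as it reaches the inbox, or None if it is blocked."""
    for layer in LAYERS:
        verdict = layer(msg)
        if verdict == DROP:
            return None
        if verdict == FLAG:
            return replace(msg, subject="**SPAM** " + msg.subject)
    return msg

if __name__ == "__main__":
    inbox = deliver(Message("mail.example", "Your replica watch", "See link"))
    print(inbox.subject if inbox else "blocked")
```

In this sketch a message such as "Hi there" from an unlisted host reaches the inbox unflagged, which mirrors the observation that only one quarter of the collected spam carried the **SPAM** marker.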
Recognizing the range of factors that affect the amount and type of spam an individual may receive (Internet use habits, newsgroups, technical characteristics of spam filters, etc.), we do not propose that this set of messages is a representative sample of all possible variations of spam sent to a broad range of recipients. However, the researchers find that the overall composition of the spam in the sample roughly reflects the composition of spam recorded in much larger studies [3]. Subsequent work will explore the differences between the messages tracked by the filter and those which were not.

3.3. Data Analysis

The methodology used combined both quantitative and qualitative content analysis. Different approaches to “reading” text are not mutually exclusive and applying multiple perspectives to text may address the limitations of individual techniques in isolation. While critical theorists have tended to reject quantitative strategies for determining the content or meaning of media messages (given the importance of considering both the manifest and latent meanings), even Kracauer granted that quantitative studies might serve as a supplement to qualitative analysis [34]. The sheer volume of mass media texts poses problems in terms of heterogeneity as well as quantity.

As a starting point and for the purpose of coding, we used the predefined categories of spam from the FTC study [19] combined with Crowston and Kwasnik’s notions of faceted genre classification [12]. The coders created new categories where none existed and for hybrids (that appeared to be combinations of categories). These messages were coded twice by three coders for sector or subject, source information, function, “genre”, format, subject line/body relationships, structure, addressee, signer, action desired, tone and regulatory compliance. The coding results were cross-checked for inter-rater reliability.
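The paper does not state which reliability statistic was used for the cross-check. As a hedged illustration only, the sketch below computes Fleiss' kappa, a common agreement measure when more than two coders assign categorical labels. The function name `fleiss_kappa` and the sample genre labels are assumptions made for this example, not data from the study.

```python
import math
from collections import Counter
from typing import Hashable, Sequence

def fleiss_kappa(ratings: Sequence[Sequence[Hashable]]) -> float:
    """ratings[i] holds the labels given to message i, one label per coder."""
    if not ratings:
        raise ValueError("at least one rated item is required")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(labels) != n_raters for labels in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    category_totals: Counter = Counter()
    agreement_sum = 0.0
    for labels in ratings:
        counts = Counter(labels)
        category_totals.update(counts)
        # Share of coder pairs that agree on this item.
        pairs_agreeing = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += pairs_agreeing / (n_raters * (n_raters - 1))

    observed = agreement_sum / len(ratings)
    total_ratings = len(ratings) * n_raters
    # Agreement expected by chance from the overall category proportions.
    expected = sum((c / total_ratings) ** 2 for c in category_totals.values())
    if math.isclose(expected, 1.0):
        return math.nan  # undefined when only one category was ever used
    return (observed - expected) / (1.0 - expected)

if __name__ == "__main__":
    # Hypothetical codings of four messages by three coders.
    sample = [
        ["memo", "memo", "memo"],
        ["letter", "letter", "memo"],
        ["gibberish", "gibberish", "gibberish"],
        ["memo", "letter", "memo"],
    ]
    print(round(fleiss_kappa(sample), 3))
```

A value of 1 indicates complete agreement among coders, and values near 0 indicate agreement no better than chance.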
Subsets representing each sector and genre were further analyzed using qualitative discourse analysis to explore recurrent themes and connotative use of language. Coding for sector or subject drew on previous studies [12] with space provided for ‘other’.

3.4. Findings

3.4.1. Categories of spam (sector or subject)

As noted, the most common categories of spam in our study paralleled the categories in the FTC study. These are listed in Table 1 below. Financial and business opportunities were the most common categories, defined by sector or subject, followed by adult-oriented products and services.

Table 1: Common categories of spam (defined by sector or subject)

Category Financial Business Opportunity Adult-Oriented Health (including prescriptions) Computer hard/software Sales Other
Sub-Category Mortgage Loans Other Adult-oriented Male enhancement Unknown Recruiting Gaming Misc. News/sports Politics
Our Study (2005) Total % per cat. FTC (2003) N= 1000 26.7% 20% 22.7% 18% 21.7% 21.7% 10% 45 15.0% 15% 7% 7 23 5 5 4 3 1 2.3% 7.7% 1.7% 1.7% 1.3% 1.0% 0.3% 2.3% 16% 13.7% 10%
N= 300 26 4 23 % 8.7% 1.3% 7.7% 24 8.0% 35 11.7% 30 10.0% 65

* Male enhancement advertisements could be classified as adult-oriented or health. We have combined them with adult-oriented for the purposes of comparison.

There were differences between the composition of our sample and the much larger random sample used by the FTC. For example, our sample contained considerably fewer product and service advertisements and more health-related spam. This might have been a reflection of individual Internet behavior (i.e., visiting websites that create a ‘cookie’ which will attract specific kinds of spam) as well as the filter used by the organization.

3.4.2. Manifest “Genre”

While we initially thought that we would be able to define the spam genre through content and discourse analysis, it became apparent that spam is not a single genre but a collection of genres.
Among the spam, we found messages that resembled a wide range of well-established document genres (see Table 2). The most common genre of spam, at least on the surface, is a personalized memo which includes a description of a product or service with an embedded URL for more information. Almost sixty percent (59.7%) of the spam was in this form. The next most common form of spam appeared to be a letter, which might describe a service, but more often appears to be a scam to obtain personal information. For example, we found nine examples of spam versions of “Nigerian Letters”. Spam also took the form of “confirmations” of orders or preapproved applications modeled on standard invoice and purchase order forms. For example, “You have been preapproved – your new application number 34”; “Your new application number 56”; or “PGF ALERT: Purchase Order Created for You”, again invoking a well-established business communications form. Testimonials, a common form of advertising and direct mail, were used in 2% of the cases (e.g., “I have always worried about the size of my penis…”). Only 2% of the spam analyzed resembled conventional promotional pamphlets and only 2.7% resembled conventional display advertisements. News bulletins (2.7%) such as announcements on stock prices as well as warnings and announcements (1.7%) such as “Remove all these popup messages today!” and “Microsoft virus warning – September 8th” were found. Newsletters (2%) and catalogs (1%) also appeared. There were contest winners (e.g., “Winner – winning notification”) reflecting a common form of direct mail.

Table 2: Manifest “genres” of spam

Types of Genre                                      No. of data    %
Memo (with URL/link)                                179            59.7%
Letter                                              24             8.0%
Order confirmation, preapproved, application no.    24             8.0%
Gibberish                                           17             5.7%
Display advertisement                               8              2.7%
News bulletin                                       8              2.7%
Newsletter                                          6              2.0%
Pamphlet (HTML)                                     6              2.0%
Testimonial                                         6              2.0%
Announcement/warning                                5              1.7%
Press release                                       4              1.3%
Catalog                                             3              1.0%
Contest winner                                      2              0.7%
HTML code                                           2              0.7%
URL submission form                                 2              0.7%
Article                                             1              0.3%
Business card                                       1              0.3%
Form                                                1              0.3%
Order forms/price list                              1              0.3%
Total                                               300            100%

At the same time, we also found spam that was not easily linked to existing genres, for example “gibberish” spam (5.7%), which consisted of combinations of words and/or letters which appeared to be random; for example, “gibbon perplexbarrett inactive briberyknead…”. This form of communication is the most apparent example of a new genre: gibberish spam is principally designed to circumvent filters in order to confirm email addresses or to deliver viruses to unsuspecting recipients.

3.4.3. Regulatory Compliance. Previous studies of spam, as noted above, revealed that a small percentage of spam complies with regulatory requirements found in the United States. In our study, we found that less than 1% of messages informed recipients of their right to non-contact and just over one third (36.7%) included non-contact information, some of it very disguised. Sixty-three per cent of the messages we examined did not comply with privacy requirements at all. Again this is, at least in part, because many of the spam communications were not what they initially appeared to be.

3.4.4. Unclear or Misleading Spam. Previous studies have revealed that most spam messages are deceptive to some degree. The FTC report on a sample of 1000 incidents of spam finds that 66% of messages contained false “from” lines, “subject” lines, or message text [19]. We found that the subject line was indicative of the contents of the message in more than one third (36.9%) of the messages examined. These included subject lines such as: “prescription drugs”, “replica rolex”, “job offer”, “name brand software”. In another 30.6% of cases, the subject line was loosely associated with the content of the message without specifying up front what was being sold.
For example, the subject line “men’s silent secret” preceded an email selling impotence drugs. In 27.1% of the cases, the subject line was a teaser apparently intended to arouse interest: “Good idea”, “Fact or Fiction”, “Power, Possibilities, Opportunities” are examples. In some cases (2.7%), the subject had no relationship to the content (for example, promises of sex were in the subject line while software was in the text) or there was simply no subject line at all (2.7%).

There is other evidence of deceptive and misleading spam. We were unable to accurately categorize the genuine intent of all emails, but in 55% of the cases, the provided links did not work when tested after the study period, suggesting that they were in fact not what they purported to be. Approximately 15% of the links within the spam emails were live working links, and 30% of the links redirected the user to another site.

Table 3: Different Subject Lines of Spam

Category of subject line                    %
Subject line directly reflects content      36.9%
Subject line associated with content        30.6%
Subject line is a teaser                    27.1%
Subject line has no relation to content     2.7%
No subject line                             2.7%
Total                                       100%

3.4.5. Use of Language. One of the most intriguing features of spam as advertisement (rather than spoof, Trojan horse, virus, or phishing device), is its reliance on extraordinary compression and exclusivity of text. Most images are available only sequentially rather than synchronously, accessed through the following steps of text acceptance: 1) subject line, 2) message, and 3) link. Spam advertisers use a multi-layered and gnomic textual approach. They make mimetic use of email as a medium of intimacy, interiorism, and breaking down of social barriers (public/private, conscious/deliberate vs. unconscious/impulsive domains, etc.).
Spammers use words that might be part of the realm of resistance to an inquiry, objection, or serious consideration of a proposal or purchase. They “neutralize” these words/concepts by misappropriating or contextualizing them in the “message” language. By classifying characteristics that have so far escaped notice, researchers can contribute to the understanding of genre and to the practice of genre analysis.

Many spam messages use language as a technique for thwarting spam filters. We found that 33% of messages added language, quotations or “alphabet soup”. For example, a spam entitled “Top meds bought online”, with the text “Same medicine, different price!” and a link, also included the following gnomic saying, “television has brought murder back into the home, back where it belongs.” A Google search on these additional texts revealed a wide range of sources from the Bible to quotations from Eleanor Roosevelt.

3.4.6. Evolving Genres. It is apparent that traditional modes of correspondence have emerged in the electronic medium of spam. While there is not enough space in this paper to explore the use of language in more detail, it must be noted that every genre of spam operates according to a certain semiotic logic of recognizable signs and indices. For example, the “Nigerian letter” is an identifiable genre in print communications. In the mid-1990s, there was an explosion of direct mail scams, many originating or purporting to originate in Nigeria. The proliferation of these letters was so great that the United States Secret Service actually issued an “Advance Fee Fraud Advisory” regarding Nigerian letters [61]. The Federal Trade Commission also identified the “Nigerian letter” in its study of false claims in spam [19]. These letters are usually longer than other forms of spam, often exceeding two printed pages. The language in this genre of spam is usually ornate and pleading, playing upon the recipient’s altruism, conscience, or even greed.
Generally, these letters ask the reader to send money or banking information in order to release funds from the estate of a deceased wealthy diplomat. Often, these letters prey upon the reader’s vulnerability and pity. For instance, one letter asks the reader to sympathize with the ‘obnoxious’ treatment of women in the writer’s country (Sierra Leone), since she was not permitted to inherit her late husband’s estate because she had no male children. Only with the support of the reader, to whom she promised a portion, could she access the funds. Another letter from Sudan asks the reader for support in the aftermath of the “Attack of tsunami”. This genre, like many others, is a carry-over from traditional postal letters, but email affords the senders a much broader pool of recipients as well as greatly reduced cost. In addition, we saw evidence that the electronic format was able to adapt quickly to current events.

3.4.7. Multiple Layers of Genre. Typical genre analysis may fail to adequately decode the intent of spam messages because spam uses recognizable cultural markers or indices to frame the genre in a way that fools the reader. The memo format is used to imply that the recipient has a business or personal relationship with the sender. Informal subject lines such as “Hi there” and “Haven’t heard from you in awhile” are used to imply that the messages are coming from friends, though they may contain adult content or ads for software. Very few of the sampled emails actually resemble advertisements. The form of one genre is therefore used in place of another as a mask to fool the recipient. However, the deceptive use of manifest genre goes far beyond the masquerading of advertisements as memos, letters, confirmation forms, etc. In many cases, the memo (which is actually an advertisement) is not really intended to sell anything, but rather is meant to verify an email address, collect personal financial information, or distribute a virus.
Hence, there are multiple layers of genre, each used to mask one deception with another. Our study provides further evidence for Crowston and Kwasnik’s [12] argument that top-down analysis using pre-existing categories is restrictive, and it lends further support to bottom-up analysis that can account for emerging forms.

4. Conclusions and Implications

What makes the electronic spam message resistant to current genre analysis is that its “communities” are multiple and multi-layered, sometimes overt and sometimes covert, sometimes private and sometimes public. As such, a bulk advertisement emailed indiscriminately is in some ways similar to print “junk” or direct mail. Because spam has been defined to include anything from viruses to joke forwards from friends, seeking a rigid definition of it (or, as Zeltsin suggests, of “unsolicited” or of “bulk”) works against an open inquiry and openness to “the evolution of the phenomenon” [65]. Some bona fide new forms emerge with hidden purposes. For example, “ruse” spam uses lures, misdirection, or a string of nonsense to procure actual email addresses, sorted out behind the ostensible screen.

Our analysis, while preliminary, suggests that spam covers a range of genres, serving a wide range of purposes. While some characteristics are common to many types of spam, there is more evidence to suggest that spam memos, advertisements, letters, and contest announcements are significantly different forms, even though in many cases they may serve similar purposes. Moreover, in the case of spam, “genre” is not fixed. One person’s unwelcome spam may be another person’s welcome opportunity. To one “organizational actor” [27] an apparent spam “theme” may be oppressive and intrusive, suggesting that the organization is heedless of members’ environmental “hygiene”.
To another, the recurrence of a theme may conjure a community of the like-minded, a sub-culture.

Spam genres are actually hybrids. For the most part, the messages resemble traditional genres in their manifest form in order to increase the likelihood of eliciting certain behaviors; however, the actual purposes of the spam are often radically different from what they seem to be. While spam clearly embraces a range of genres, these operate on a variety of levels. Further work focusing on analyzing the function, content and style of spam may be valuable to understanding its position in the “ecology” of genres of electronic communications.

5. References

[1] Agre, P.E., “Designing Genres for New Media: Social, Economic, and Political Contexts”, 1997. From: http://dlis.gseis.ucla.edu/people/pagre/genre.htm
[2] Åkesson, M., Ihlström, C. and Svensson, J., “Genre Structured Design Patterns – The Case of Online Newspapers”, Universiteit van Halmstad, 2003. From: http://w3.msi.vxu.se/users/per/IRIS27/iris27-1106.pdf
[3] Anon., “What does spam advertise?”, Association for Computing Machinery, Communications of the ACM, Vol. 41, Iss. 8, New York, Aug 1998, 80.
[4] Anti-Phishing Working Group, 2005. From: http://www.antiphishing.org
[5] Askehave, I. and Nielsen, A. E., “Digital genres: a challenge to traditional genre theory”, Information Technology & People, Vol. 18, No. 2, 2005, 120-141.
[6] Atunes, P. and Costa, C. J., “From Genre Analysis to the Design of Meetingware”, Proceedings of the International ACM SIGGROUP Conference on Supporting Group Work, 2003, 302.
[7] Bergquist, M. and Ljungberg, J., “Genres in Action: Negotiating Genres in Practice”, Proceedings from The 32nd Hawaii International Conference on System Sciences, Volume 2, Hawaii, 1999.
[8] Carliner, S. and Bosworth, T., “Genre: A Useful Construct for Researching Online Communication for the Workplace”, Information Design Journal, Vol. 12, Iss. 2, John Benjamins Publishing, 2004, 124.
[9] Center for Democracy & Technology, “Why am I Getting All this Spam? Unsolicited Commercial E-mail Research Six Month Report”, March, 2003. From: http://www.cdt.org/speech/spam/030319spamreport.shtml
[10] Chandler, D., “An Introduction to Genre Theory”. From: www.aber.ac.uk/media/Documents/intgenre. Aug. 11, 1997.
[11] Cranor, L.F. and La Macchia, B.A., “Spam!”, Communications of the ACM, Vol. 41, Iss. 8, August, 1998, 74. From: http://lorrie.cranor.org/pubs/spam/spam/htm
[12] Crowston, K. and Kwasnik, B.H., “A Framework for Creating a Facetted Classification for Genres: Addressing Issues of Multidimensionality”, Proceedings from The 37th Hawaii International Conference on Systems Science, Hawaii, January, 2004.
[13] Crowston, K. and Williams, M., “Reproduced and Emergent Genres of Communication on the World Wide Web”, The Information Society, Vol. 16, Iss. 20, 2000, 1.
[14] Crowston, K. and Williams, M., “The Effects of Linking on Genres of Web Documents”, Presented at The Hawaii International Conference on Systems Science, Hawaii, January 1999.
[15] Davidson, E. J., “Analyzing Genre of Organizational Communication in Clinical Information Systems”, Information Technology & People, Vol. 13, Iss. 3, West Linn, 2000, 196.
[16] Doring, N., “Personal Home Pages on the Web: A Review of Research”, Journal of Computer Mediated Communication, Vol. 7, Iss. 3, 2002.
[17] Drucker, H., Wu, D., and Vapnik, N., “Support vector machines for spam categorization”, IEEE Trans. Neural Networks, Vol. 10, 1999, 1048.
[18] Eldridge, C., “The Role of Advertising”, in Advertising’s Role in Society, Wright, J.S., and Mertes, J.F. (Eds.), West Publishing Co., St. Paul, 1974.
[19] Federal Trade Commission, “False Claims in Spam, A Report by the FTC’s Division of Marketing Practices”, April 30, 2003.
From: http://www.ftc.gov/reports/spam/030429spamreport.pdf
[20] Ferris Research, http://www.ferris.com/
[21] Feuer, J., “Genre Study”, in Channels of Discourse, Reassembled: Television and Contemporary Criticism, 2nd ed, pg. 138, Chapel Hill, University of North Carolina Press, 1992.
[22] Firth, D. and Lawrence, C., “State of Research Review: Genre Analysis in Information Systems Research”, Journal of Information Technology Theory and Application (JITTA), forthcoming.
[23] Firth, K., Shaw, P. and Cheng, H., “The Construction of Beauty: A Cross-Cultural Analysis of Women’s Magazine Advertising”, Journal of Communication, Vol. 55, Iss. 1, New York, Mar 2005, 56.
[24] Freedman, A. and Medway, P., “Locating genre studies: antecedents and prospects”, in Genre and the New Rhetoric, Freedman, A. and Medway, P. (Eds.), Taylor and Francis, London, 1994.
[25] Frye, N., The Anatomy of Criticism, Princeton University Press, Princeton, New Jersey, 1957.
[26] Gyongyi, Z., and Garcia-Molina, H., “Web Spam Taxonomy”, Technical Report TR 2004-25, Stanford University, 2004.
[27] Hasselbladh, H., and Kallinikos, J., “The Project of Rationalization: A Critique and Re-appraisal of Neoinstitutionalism in Organization Studies”, Organization Studies, Vol. 21, Iss. 4, 2000, 697.
[28] Hawkins, R.P., Pingree, S., “Television and Behaviour: Ten years of scientific progress and implications for the eighties: Vol. 2”, National Institute of Mental Health, 1982, 224-247.
[29] Herring, S.C., Scheidt, L.A., Bonus, S. and Wright, E., “Bridging the Gap: A Genre Analysis of Weblogs”, Proceedings of The 37th Hawaii International Conference on System Sciences, 2004.
[30] Holbrook, M.B. and Batra, R., “Assessing the Role of Emotions as Mediators of Consumer Responses to Advertising”, Journal of Consumer Research, December 14, 1987, 404.
[31] Holbrook, M.B.
and Hirschman, E.C., “The Experiential Aspects of Consumption: Consumer Fantasies, Feelings and Fun”, Journal of Consumer Research, September 9 1982, 132.
[32] Jacobsson, A., and Carlsson, B., “Privacy and Unsolicited Commercial E-mail”, Proceedings of the Seventh Nordic Workshop on Secure IT Systems, Gjövik, Norway, 2003.
[33] Jung, J., “An Empirical Study of Spam Traffic and the use of DNS Black Lists”, Proceedings of the 4th ACM SIGCOMM conference on Internet measurement, 2004.
[34] Kracauer, S., “The challenge of qualitative content analysis”, Public Opinion Quarterly, Vol. 16, 1952, 631.
[35] Kunz, M. B., and Osborne, P., “What Impact will New Standards have on Internet Advertising?”, Proceedings of Society for Marketing Advances, New Orleans, LA, 2001.
[36] Laskey, H.A., Day, E., and Crask, M.R., Social Communication in Advertising: Persons, Products, and Images of Well-Being, Methuen, Toronto, 1989.
[37] Martin, B.A.S., Van Durme, J., Raulas, M. and Merisavo, M., “Email Advertising, Exploratory Insights from Finland”, Journal of Advertising Research, Vol. 43, Iss. 3, 2003, 293.
[38] McMillan, S., “Internet Advertising: One Face or Many?”, in Internet Advertising: Theory and Research, Schumann, D. and Thorson, E. (Eds.), 2nd ed., forthcoming.
[39] MessageLabs, 2005. From: http://www.messagelabs.com
[40] Mick, D.G., “Toward a Semiotic of Advertising Story Grammars”, in Marketing Signs: New Directions in the Study of Signs for Sale, Umiker-Sebeok, J. (Ed.), Mouton de Gruyter, Berlin, 1987.
[41] Miller, C.R., “Genre as Social Action”, in Genre and the New Rhetoric, Freedman and Medway (Eds.), Taylor & Francis, London, 1994.
[42] Mitchell, A.A. and Olsen, J.C., “Are Product Attribute Beliefs the Only Mediator of Advertising Effects on Brand Attitude?”, Journal of Marketing Research, 18 August, 1981, 318-332.
[43] Orange Coast IBM PC User Group.
[44] Orasan, C.
and Krishnamurthy, R., “A Corpus-based investigation of junk emails”, in Proceedings of Language Resources and Evaluation Conference (LREC-2002), Las Palmas, Spain, 2002.
[45] Orlikowski, W.J. and Yates, J., “Genre Repertoire: The Structuring of Communicative Practices in Organizations”, Administrative Science Quarterly, Vol. 39, Iss. 4, Ithaca, December 1994, 541.
[46] Orlikowski, W.J. and Yates, J., “Genres of Organizational Communication: A Structurational Approach to Studying Communication and Media”, Academy of Management, The Academy of Management Review, Vol. 17, Iss. 2, Briarcliff, April 1992, 299.
[47] Orlikowski, W.J. and Yates, J. and Okamura, K., “Explicit and Implicit Structuring of Genres in Electronic Communication: Reinforcement and Change of Social Interaction”, Organization Science, Vol. 10, Iss. 1, Jan-Dec 1999, 83.
[48] Orlikowski, W.J. and Yates, J. and Okamura, K., “Constituting Genre Repertoires: Deliberate and Emergent Patterns of Electronic Media Use”, Academy of Management Journal, Best Paper Proceedings, Briarcliff Manor, 1995, 353.
[49] Orlikowski, W., Yates, J., and Yoshioka, T., Community-based interpretive schemes: Exploring the use of cyber meetings with a global organization, MIT, Cambridge, 2000.
[50] Palmer, J.C., “Netvertising and ESP: Genre-Based Analysis of Target Advertisements and its Application in the Business English classroom”, Iberica, Vol. 1, 1999, 39.
[51] Pelletier, L., Almhana, P., and Choulakian V., “Adaptive Filtering of Spam”, Communication Networks and Services Research, 2004.
[52] Puto, C. P. and Wells, W.D., “Informational and Transformational Advertising: The Differential Effects of Time”, Advances in Consumer Research, Vol. 11, Provo, UT, 1984, 638.
[53] Radicati Group, Email Hygiene Survey Results, 2005. From: http://www.radicati.com/email-survey2005.shtml
[54] Shamdasani, P., Stanaland, A., and Tan, J.
“Location, Location, Location: Insights for advertising placement on the Web”, Journal of Advertising Research, Vol. 41, Iss. 4, 2001, 7.
[55] Spinuzzi, C., “Describing Assemblages: Genre Sets, Systems, Repertoires, and Ecologies”, Computer Writing and Research Lab, White paper #040505-2, Austin, Texas, May 5, 2004.
[56] Spinuzzi, C., Tracing Genres through Organizations: A Sociocultural approach to Information Design, MIT Press, Boston, 2003.
[57] Stern, B., “Other-speak: Classical Allegory and Contemporary Advertising”, Journal of Advertising, Vol. 19, Iss. 3, 1990, 14.
[58] Swales, J.M., Genre Analysis: English in Academic and Research Settings, Cambridge University Press, Cambridge, 1990.
[59] The Direct Marketing Association. From: http://www.the-dma.org/guidelines/onlineguidelines.shtml
[60] Toms, E.G. and Campbell, D.G., “Genre as Interface Metaphor: Exploiting Form and Function in Digital Environments”, Proceedings of The 32nd Hawaii International Conference on System Sciences, January, Hawaii, 1999.
[61] United States Secret Service, “Public Awareness Advisory Regarding ‘4-1-9’ or ‘Advance Fee Fraud’ Schemes”. From: http://www.secretservice.gov/alert419.shtml
[62] Viser, V., “Thematics and Products in American Magazine Advertising Containing Children, 1940-1950”, Communication Quarterly, Vol. 47, Iss. 1, 1999, 118.
[63] Wells, W.D., Burnett, J. and Moriarty, S., Advertising Principles and Practice, Prentice-Hall, Englewood Cliffs, NJ, 1989.
[64] World Summit on Information Society Declaration, “Declaration of Principles: Building the Information Society: A Global Challenge in the New Millennium”, World Summit on the Information Society, December 12, 2003. From: http://www.itu.int/wsis/docs/geneva/official/dop.html
[65] Zeltsin, Z., “General Overview of Spam and Technical Measures to Mitigate the Problem”, ITU-T SG 17 Interim Rapporteur Meeting, November, 2004.