Understanding library users you don't see
Techniques for tracking and analyzing library Web resources

Marshall Breeding
Director for Innovative Technologies and Research
Vanderbilt University
http://staffweb.library.vanderbilt.edu/breeding
[email protected]
Saturday June 24

Theme
For many libraries, the number of visitors to their Web site and electronic resources exceeds the number that visit their physical premises. It's vital for libraries to understand how these remote visitors approach the Web site, not only to measure use but to improve the resources themselves. Marshall Breeding will present a number of practical techniques that libraries can use to better understand the use of their Web-based resources. Topics will include the basics of analyzing the server logs of the library's Web site, transaction logs from the OPAC, the complexities of measuring use of subscription-based electronic resources, and techniques for enhancing applications to better record how they are used.

Understanding remote users
- Vital to providing relevant library services
- More patrons may use library resources remotely through the Web than from physical library facilities
- Must work harder to ensure that Web-based services meet patron needs
- Move beyond hit counters and raw statistics to more sophisticated analysis and assessment

Analysis goals
- Improve usability
- Web site diagnostics
- Understand user needs
- Content selection decisions
- Improve quality of service
- Marketing
- Budget justification
- Strategy to increase interest and activity

Data sources for tracking remote use
- Web server logs
- Application logs
- Remote tracking data (Google Analytics)
- Vendor-provided use statistics (e-resources)

Enterprise approach to analytics
- Multiplicity of resources to track
  - Web servers
  - OPACs
  - E-resources
  - Databases
  - Repositories
- Important to track the flow of use among all the library’s Web-based resources
- Beyond the library: study flow to and from higher-level Web sites and portals (University -> Courseware ->
  Library)

Web server logs
- Web servers are routinely configured to record detailed information about each request.
- Common elements include:
  - File requested
  - Date / time stamp
  - Status code
  - Request directive (GET, POST, HEAD)
  - Referrer (where the user came from)
  - User agent (browser and platform data)

Example Web log
Raw data for analysis process

    2006-06-20 05:01:43 129.59.150.105 GET /index.pl - 80 - c-69-250-131199.hsd1.md.comcast.net Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive 200 0 0 11752

Exploiting referral data
- The query string component of the referrer can be parsed to reveal search terms and other interesting information
    http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive
- User typed “september 11 television archive” in Google to find our site
- Important to study how users get to your site (example: TV News public Web queries vs OpenWeb)

Analysis methodology
- Go beyond simply counting pages
- Identify sessions
- Categorize users
- Determine use patterns
- Measure interest
- Time spent on Web site
- Bounce rate
- Page overlay analysis

Move from measurement to impact
- Establish site goals
- Benchmark current use
- Implement goal-oriented improvements
- Measure impact
- Repeat as needed
- (Example: enhancement of TV News OpenWeb)

Appropriate data filtering
- Requests from indexing bots (crawlers) can skew statistics
- Count user requests and bot requests separately
- Performance monitors
- Link checkers
- Monitoring crawler activity is an important component of SEO and Web site discoverability strategies.

Resource Discovery
- How do users get to your site?
- Track performance of the Web site relative to major search engines
- SEO – Search engine optimization
- Few users begin with library Web sites

Troubling statistic
Where do you typically begin your search for information on a particular topic?
College Students Response:
    89%  Search engines (Google 62%)
     2%  Library Web Site (total respondents -> 1%)
     2%  Online Database
     1%  E-mail
     1%  Online News
     1%  Online bookstores
     0%  Instant Messaging / Online Chat
OCLC. Perceptions of Libraries and Information Resources (2005) p. 1-17.

Library Discovery Model
- Web
- Library Web Site / Catalog

Library as search destination

TV News OpenWeb project
- Dramatic increase in Web site activity and loan requests through systematic and controlled exposure of metadata to Google and other search engines
- SEO (Search Engine Optimization) strategy
- Helped the Archive become financially self-sufficient.

Examples of Web reporting and analysis tools

Selected utilities
- Analog – free, open source
- NetTracker – enterprise level Web analysis application
- Google utilities
  - Sitemap – process for submitting Web pages for optimized indexing by Google with some assessment capabilities
  - Analytics – sophisticated approach for measuring Web site performance

Analog
- Free, open source application
- Basic Web statistics application
- Includes fairly full set of static metrics
- Command line utility – generates Web report
- Windows, Unix, Linux, etc.

NetTracker
- Unica Corporation
- Enterprise level Web analytics
- http://www.sane.com/

[Screens shown: NetTracker Executive Dashboard, Bandwidth Trends, Content, Keyword Summary, Referrers, Pages Viewed]

Google SiteMaps
- XML specification for systematically submitting URLs that represent a Web site
- Makes indexing more efficient but does not affect PageRank
- SiteMap interface provides utilities for monitoring how the site has been indexed, with some analytical information on terms used to find your Web site.

[Screens shown: Google SiteMaps Top Searches, Page Analysis]

Google Analytics
- Available at no cost from Google
- Must receive invitation code
- Slanted toward e-commerce
- “Conversion University” – training on how to optimize Web site for high conversion rates.
- Allows Webmasters to establish site goals and measure performance

[Screens shown: Google Analytics main, overview, Browser Versions, Top Content, Entrance-Bounce Rates, Navigational Analysis, Goal tracking]

Application-level reporting and analysis
- Content management systems and other dynamically driven Web environments can provide additional usage information.
- Can offer additional information beyond raw Web logs
- More capabilities for identifying use based on user categories
- Reporting can be built into the business logic of the application

Examples from the TV News Web Site
- Reports of use by user category and institution
- Statistics on resource use
- Data on search types, query terms, etc.
- Ability to track all aspects of business activity

Other sources of use data
- ILS OPAC logs
- Proxy server logs and reports
- Link resolver logs and reports

Limitations
- Can’t know the intent of the user
- User success can only be estimated
- Difficult to obtain trends by user type
- More aggressive reporting might intrude on privacy
- Few libraries require the level of user authentication needed to determine use by type of patron

Additional Information
- Breeding, Marshall. Strategies for Measuring and Implementing E-use. ALA TechSource. May-June 2002. 79 pages.
- Breeding, Marshall. “Analyzing Web server logs to improve a site’s usage.” Computers in Libraries. Information Today. Medford, CT. October 2005.

Handout
- Presentation will be available after the conference at: http://staffweb.library.vanderbilt.edu/breeding/presentations/ala2006.ppt
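
Worked sketch: referral data and bot filtering

The "Exploiting referral data" and "Appropriate data filtering" slides can be illustrated with a few lines of Python. This is only a minimal sketch, not the tooling used for the TV News site. The field positions are assumed from the single example log line in the slides, and a real server log declares its own field order. The function names (parse_line, search_terms, is_bot) and the short list of bot markers are hypothetical and chosen for illustration. Only the Python standard library is used.

```python
from urllib.parse import urlsplit, parse_qs

# Example log line from the slides, with slide-wrapping spaces removed.
LOG_LINE = (
    "2006-06-20 05:01:43 129.59.150.105 GET /index.pl - 80 - "
    "c-69-250-131199.hsd1.md.comcast.net "
    "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) "
    "http://www.google.com/search?hl=en&lr=&safe=off"
    "&q=september+11+television+archive "
    "200 0 0 11752"
)

# Hypothetical, deliberately short list of substrings that suggest a crawler.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def parse_line(line):
    """Split one space-delimited log line into the fields used here.

    ASSUMPTION: field positions match the example line above; adjust
    the indexes for a log that records fields in a different order.
    """
    fields = line.split()
    return {
        "date": fields[0],
        "time": fields[1],
        "method": fields[3],
        "path": fields[4],
        # This log style writes spaces inside the user agent as '+'.
        "user_agent": fields[9].replace("+", " "),
        "referrer": fields[10],
        "status": int(fields[11]),
    }


def search_terms(referrer, param="q"):
    """Return the search phrase carried in the referrer's query string.

    Returns None when there is no referrer or no such parameter.
    The parameter name 'q' is what the Google URL in the slides uses;
    other search engines may use a different name.
    """
    if referrer in ("", "-"):
        return None
    values = parse_qs(urlsplit(referrer).query).get(param)
    return values[0] if values else None


def is_bot(user_agent):
    """Rough check for crawler traffic so it can be counted separately."""
    agent = user_agent.lower()
    return any(marker in agent for marker in BOT_MARKERS)


if __name__ == "__main__":
    record = parse_line(LOG_LINE)
    print(record["path"], record["status"])
    print("search terms:", search_terms(record["referrer"]))
    print("bot request:", is_bot(record["user_agent"]))
```

Run over a whole log file, the same three functions would let a site tally search phrases by frequency and keep user and crawler requests in separate counts, in line with the filtering advice in the slides.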