Understanding library users you don't see
Techniques for tracking and analyzing library Web resources
Marshall Breeding
Director for Innovative Technologies and Research
Vanderbilt University
http://staffweb.library.vanderbilt.edu/breeding
Saturday June 24
Theme


For many libraries, the number of visitors to their Web site and
electronic resources exceeds the number who visit their physical
premises. It's vital for libraries to understand how these remote
visitors approach the Web site, not only to measure use but to
improve the resources themselves. Marshall Breeding will present a
number of practical techniques that libraries can use to better
understand the use of their Web-based resources.
Topics will include the basics of analyzing the server logs of the
library's Web site, transaction logs from the OPAC, the complexities
of measuring use of subscription-based electronic resources, and
techniques for enhancing applications to better record how they are
used.
Understanding remote users
Vital to providing relevant library services
More users may use library resources remotely through the Web than from physical library facilities
Must work harder to ensure that Web-based services meet patron needs
Move beyond hit counters and raw statistics to more sophisticated analysis and assessment

Analysis goals
Improve usability
Web site diagnostics
Understand user needs
Content selection decisions
Improve quality of service
Marketing
Budget justification
Strategy to increase interest and activity
Data sources for tracking remote use
Web server logs
Application logs
Remote tracking data (Google Analytics)
Vendor-provided use statistics (e-resources)

Enterprise approach to analytics

Multiplicity of resources to track:
  Web servers
  OPACs
  E-resources
  Databases
  Repositories
Important to track the flow of use among all the library’s Web-based resources
Beyond the library: study flow to and from higher-level Web sites and portals (University -> Courseware -> Library)
Web server logs

Web servers are routinely configured to record detailed information about each request.
Common elements include:
  File requested
  Date / time stamp
  Status code
  Request method (GET, POST, HEAD)
  Referrer (where the user came from)
  User agent (browser and platform data)
Example Web log

Raw data for analysis process
2006-06-20 05:01:43 129.59.150.105 GET /index.pl - 80 -
c-69-250-131199.hsd1.md.comcast.net
Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322)
http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive
200 0 0 11752
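A minimal Python sketch of how a raw entry like the one above might be split into named fields. The field names and their order are assumptions read off the sample line, not taken from the server's configuration; a real log declares its own field list, and the meaning of the three trailing numbers is not stated in the slides.

```python
# Field names and order are ASSUMPTIONS inferred from the sample entry;
# check the server's own field declaration before relying on them.
FIELDS = [
    "date", "time", "server_ip", "method", "uri_stem", "uri_query",
    "port", "username", "client_host", "user_agent", "referrer",
    "status", "trailing_1", "trailing_2", "trailing_3",
]

def parse_log_line(line):
    """Split one space-delimited log entry into a dict keyed by FIELDS."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    # The sample encodes spaces inside the user agent as "+".
    record["user_agent"] = record["user_agent"].replace("+", " ")
    record["status"] = int(record["status"])
    return record

SAMPLE = (
    "2006-06-20 05:01:43 129.59.150.105 GET /index.pl - 80 - "
    "c-69-250-131199.hsd1.md.comcast.net "
    "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) "
    "http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive "
    "200 0 0 11752"
)

record = parse_log_line(SAMPLE)
print(record["method"], record["uri_stem"], record["status"])
```

Splitting on whitespace works here only because this format encodes embedded spaces; formats that quote fields instead would need a different tokenizer.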
Exploiting referral data


The query string component of the referrer can be parsed to reveal search terms and other interesting information
http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive
  User typed “september 11 television archive” in Google to find our site
Important to study how users get to your site (example: TV News Public Web queries vs. OpenWeb)
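The referrer parsing described above can be sketched in Python with the standard library. The function name `search_terms` and the default parameter name `q` are illustrative assumptions; other search engines may carry the query in a different parameter.

```python
from urllib.parse import urlsplit, parse_qs

def search_terms(referrer, param="q"):
    """Return the search phrase carried in a referrer URL, or None."""
    query = urlsplit(referrer).query
    # parse_qs decodes "+" and %XX escapes and drops blank values by default.
    values = parse_qs(query).get(param)
    return values[0] if values else None

referrer = ("http://www.google.com/search?hl=en&lr=&safe=off"
            "&q=september+11+television+archive")
print(search_terms(referrer))  # september 11 television archive
```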
Analysis methodology
Go beyond simply counting pages
Identify sessions
Categorize users
Determine use patterns
Measure interest:
  Time spent on Web site
  Bounce rate
  Page overlay analysis
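One way the session, time-on-site, and bounce-rate measures above might be computed, as a rough Python sketch. It assumes a visitor is identified by a single key (such as client host) and that 30 minutes of inactivity ends a session; both are common conventions rather than anything stated in the slides, and the page names are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed inactivity cutoff

def sessionize(requests):
    """Group (visitor, timestamp, page) tuples into per-visitor sessions."""
    sessions = []
    last_seen = {}  # visitor -> (time of last request, that visitor's open session)
    for visitor, when, page in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(visitor)
        if previous and when - previous[0] <= SESSION_GAP:
            session = previous[1]
        else:
            session = []
            sessions.append(session)
        session.append((when, page))
        last_seen[visitor] = (when, session)
    return sessions

def bounce_rate(sessions):
    """Share of sessions that viewed exactly one page."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if len(s) == 1) / len(sessions)

def time_on_site(session):
    """Elapsed time between the first and last request of a session."""
    return session[-1][0] - session[0][0]

start = datetime(2006, 6, 20, 5, 0)
requests = [
    ("visitor-a", start, "/index.pl"),
    ("visitor-a", start + timedelta(minutes=4), "/search.pl"),
    ("visitor-b", start + timedelta(minutes=1), "/index.pl"),
    ("visitor-a", start + timedelta(hours=2), "/index.pl"),
]
sessions = sessionize(requests)
print(len(sessions), bounce_rate(sessions), time_on_site(sessions[0]))
```

Note that time on site measured this way is zero for every single-page session, which is one reason bounce rate is reported separately.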
Move from measurement to impact
Establish site goals
Benchmark current use
Implement goal-oriented improvements
Measure impact
Repeat as needed
(Example: enhancement of TV News OpenWeb)

Appropriate data filtering
Requests from indexing bots (crawlers) can skew statistics
Count user requests and bot requests separately
Performance monitors
Link checkers
Monitoring crawler activity is an important component of SEO and Web site discoverability strategies.
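A simple Python sketch of counting user and bot requests separately. It classifies by substring match on the user agent; the marker list is illustrative and far from exhaustive, and crawlers that do not identify themselves would be counted as users.

```python
# Illustrative markers only (an assumption); production filters usually
# rely on a maintained list of known crawler user agents.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

def is_bot(user_agent):
    """Guess whether a user agent string belongs to an automated client."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

def split_counts(user_agents):
    """Tally requests into separate 'user' and 'bot' buckets."""
    counts = {"user": 0, "bot": 0}
    for ua in user_agents:
        counts["bot" if is_bot(ua) else "user"] += 1
    return counts

agents = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)",
    "Googlebot/2.1 (+http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
]
print(split_counts(agents))  # {'user': 1, 'bot': 2}
```

Keeping the bot count rather than discarding it preserves the crawler activity data that the last point above calls out as useful for SEO.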
Resource Discovery
How do users get to your site?
Track performance of the Web site relative to major search engines
SEO – Search engine optimization
Few users begin with library Web sites

Troubling statistic
Where do you typically begin your search for information on a particular topic?
College Students Response:
  89% Search engines (Google 62%)
  2% Library Web Site (total respondents -> 1%)
  2% Online Database
  1% E-mail
  1% Online News
  1% Online bookstores
  0% Instant Messaging / Online Chat
OCLC. Perceptions of Libraries and Information Resources (2005) p. 1-17.
Library Discovery Model
[Diagram labels: Web; Library Web Site / Catalog]
Library as search Destination
TV News OpenWeb project
Dramatic increase in Web site activity and loan requests through systematic and controlled exposure of metadata to Google and other search engines
SEO (Search Engine Optimization) strategy
Helped the Archive become financially self-sufficient.

Examples of Web reporting and analysis tools
Selected utilities
Analog – free, open source
NetTracker – enterprise-level Web analysis application
Google utilities
  Sitemap – process for submitting Web pages for optimized indexing by Google with some assessment capabilities
  Analytics – sophisticated approach for measuring Web site performance
Analog
Free Open Source application
Basic Web statistics application
Includes fairly full set of static metrics
Command line utility – generates Web report
Windows, Unix, Linux, etc.

NetTracker
Unica Corporation
Enterprise-level Web analytics
http://www.sane.com/

NetTracker Executive Dashboard
NetTracker Bandwidth Trends
NetTracker Content
NetTracker Keyword Summary
NetTracker Referrers
NetTracker Pages Viewed
Google SiteMaps
XML specification for systematically submitting URLs that represent a Web site
Makes indexing more efficient but does not affect PageRank
SiteMap interface provides utilities for monitoring how the site has been indexed with some analytical information on terms used to find your Web site.
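As an illustration of what such an XML URL list looks like, here is a small Python sketch that builds one. It uses the sitemaps.org 0.9 namespace, which is an assumption and may differ from the schema version current when this talk was given; the URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org 0.9 protocol (assumed; see note above).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(url, "loc").text = loc
    return ET.tostring(urlset, encoding="unicode")

sitemap = build_sitemap([
    "http://library.example.edu/",
    "http://library.example.edu/record?id=1&view=full",
])
print(sitemap)
```

For a database-driven site, the list of URLs would typically be generated from the records themselves so that each item has an indexable page.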

Google SiteMaps Top Searches
Google SiteMaps Page Analysis
Google Analytics
Available at no cost from Google
Must receive invitation code
Slanted toward e-commerce
“Conversion University” – training on how to optimize Web site for high conversion rates.
Allows Webmasters to establish site goals and measure performance
Google Analytics main
Google Analytics overview
Google Analytics Browser Versions
Google Analytics Top Content
Google Analytics Entrance-Bounce Rates
Google Analytics Navigational Analysis
Google Analytics Goal tracking
Application-level reporting and analysis
Content management systems and other dynamically driven Web environments can provide additional usage information.
Can offer additional information beyond raw Web logs
More capabilities for identifying use based on user categories
Reporting can be built into the business logic of the application
Examples from the TV News Web Site
Reports of use by user category and institution
Statistics on resource use
Data on search types, query terms, etc.
Ability to track all aspects of business activity
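A hypothetical Python sketch of reporting built into application logic. The `UsageLog` class, its field names, and the sample categories are all invented for illustration and do not describe the TV News site's actual implementation; a real application would persist events to a database rather than a list.

```python
from collections import Counter
from datetime import datetime, timezone

class UsageLog:
    """Hypothetical in-application recorder (illustration only)."""

    def __init__(self):
        self.events = []

    def record(self, user_category, institution, action, detail=""):
        """Store one usage event with context a raw Web log lacks."""
        self.events.append({
            "when": datetime.now(timezone.utc),
            "user_category": user_category,
            "institution": institution,
            "action": action,
            "detail": detail,
        })

    def count_by(self, field):
        """Tally events by any recorded field, e.g. 'user_category'."""
        return Counter(event[field] for event in self.events)

log = UsageLog()
log.record("faculty", "Example University", "search", "september 11")
log.record("public", "none", "search", "election coverage")
log.record("faculty", "Example University", "loan_request")
print(log.count_by("user_category"))
print(log.count_by("action"))
```

Because the application knows who is signed in and what they are doing, it can attach user category and institution to each event, which the Web server log alone cannot supply.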
Other sources of use data
ILS OPAC logs
Proxy server logs and reports
Link resolver logs and reports

Limitations
Can’t know the intent of the user
User success can only be estimated
Difficult to obtain trends by user type
More aggressive reporting might intrude on privacy
Few libraries require the level of user authentication needed to determine use by type of patron
Additional Information
Breeding, Marshall. Strategies for Measuring and Implementing E-use. ALA TechSource. May-June 2002. 79 pages.
Breeding, Marshall. “Analyzing Web server logs to improve a site’s usage.” Computers in Libraries. Information Today. Medford, CT. October 2005.
Handout

Presentation will be available after the conference at:
http://staffweb.library.vanderbilt.edu/breeding/presentations/ala2006.ppt