Best Practices For Project Web Sites
Based on experiences from previous programmes

Brian Kelly
UK Web Focus
UKOLN
University of Bath
Email: [email protected]
URL: http://www.ukoln.ac.uk/

What Happens When The Funding Stops?
When the project funding finishes:
☺ The project gracefully turns into a fully-fledged service, with new funding from JISC, the EU, your institution, etc.
☹ The project staff all leave and the Web site is shut down, is moved and can’t be found, or is broken and there is no-one with the interest, expertise or permissions to fix it
An aim of this talk is to consider ways to help your project migrate to an ongoing service, or to minimise disruption if additional funding is not forthcoming

Contents
Web Site Dissemination
• We’ve Been Here Before
• Web-Based Dissemination
• News Feeds
• Standards
• Mirroring, Migration & Preservation
• Monitoring & Benchmarking
• Thoughts on Browsers
• Conclusions
Embedding Web Service
You want people to know about your project – but you also want your project deliverables to be embedded within institutions

We’ve Been Here Before
Who remembers:
CTI Projects
• CBL applications locked into obsolete hardware
TLTP Projects
• CBL developers using Toolbook on standalone PC, which could not be deployed on campus LAN
eLib Projects
• Web sites disappear
• Other issues (Stephen Pinfield’s talk)
EU Programmes
• …

Survey of EU Web Sites
WebWatching Telematics For Libraries Project Web Sites (Fourth Framework)
• Exploit Interactive article published in Oct 2000
• Web site availability: Yes – 65, Never – 16, Domain Gone – 11, Page Gone – 12
• Server details: Apache – 41, Netscape – 3, IIS – 10, NCSA – 3, Other – 6 (e.g.
Mac, GN)
• See <http://www.exploit-lib.org/issue7/webwatch/>

Survey of eLib Web Sites
WebWatching eLib Project Web Sites
• Ariadne article published in Jan 2001
• Of 71 Web sites, 3 domains no longer available and 2 entry points have gone
• LinkPopularity.com results shown:
    SOSIG      7,076
    OMNI       5,830
    EEVL       3,865
    History    2,605
    Netskills  2,363
    Ariadne    2,144
    …
    xxx        ~10
• Survey also includes:
  - Analysis of entry points (links, HTML, accessibility)
  - Nos. of pages indexed by AltaVista: 0 in some cases
    - Due to robots.txt file
    - Due to frames interface or other robots barrier
• See <http://www.ariadne.ac.uk/issue26/web-watch/>

Web Site Promotion
You want:
• Your quality pages to be found in a timely fashion by users of search engines
• To encourage others to link to you
To ensure this happens you should:
• Have a domain and URL naming policy
• Exploit the Robots Exclusion Protocol
• Be aware of barriers to robots (which may also be barriers to humans)
• Think about a linking policy and procedures

URL Naming Policy
Issues:
• Having your own domain is a good idea (e.g. http://www.ariadne.ac.uk/)
• Short URLs are good (more memorable; search engines tend not to index deeply)
• Sub-domains may be a useful compromise (e.g. http://ariadne.bath.ac.uk/)
• Keep URLs short by using directory defaults:
    www.ariadne.ac.uk/issue5/metadata/intro.htm
    www.ariadne.ac.uk/issue5/metadata/
  Shorter, less prone to typos and allows for format and language negotiation, new server management tools, etc.
    …/issue5/metadata/intro.fr.html
    …/issue5/metadata/intro.pdf
    (.cfm, .asp, .jsp)

Planning Search Engine Strategy
You search for your project name and find a personal page of a former colleague with informal information ☹
To avoid this:
• Distinguish between (a) initial information about the project (b) information for project partners, funders, etc.
and (c) information for end users
• Use search engine techniques to:
  - Ban search engines from indexing certain pages
  - Promote other pages as appropriate

Robots
Make use of the Robots Exclusion Protocol (REP) to ban robots from indexing:
• Non-public areas (e.g. area for partners)
• Pre-release Web sites
• Pages prior to an official launch
Note: Remember to switch off the ban after launch!
Example /robots.txt in Web root:
    User-agent: *
    Disallow: /partners
    Disallow: /draft
Note that use of directories to group related resources will have many benefits: controlling indexing robots, mirroring and auditing software, etc.

Other Barriers To Indexing
Other barriers to indexing robots:
Frames
  - Most search engines can’t index framesets and rely on appropriate <NOFRAMES> tags
Flash (and other proprietary formats)
  - Most search engines can’t index proprietary formats
Poorly implemented JavaScript pages
  - Search engines may not have JavaScript interpreters and can’t index text generated by JavaScript
Poorly implemented user-agent negotiation (client- or server-side)
  - Most search engines don’t have a Netscape or IE user-agent string and so will index “Upgrade to Netscape”
Invalid HTML pages
  - Search engines may not be as tolerant of HTML errors as Web browsers

Accessibility
• Robots have similarities to the visually impaired
• Good design for robots is likely to be good design for people with disabilities (and vice versa)
• Make use of Bobby (both versions) to check accessibility – see <http://www.cast.org/bobby/>
You should formulate plans for making your Web site search-engine friendly and accessible

Other Ways Of Dissemination
Users find your Web site by:
• Search engines
• Following a link
• Entering a URL which they found on a mouse mat, pen, in an article, etc.
Links to your Web site are valuable as they:
• Drive traffic to your Web site
• Improve ranking in citation-based search engines such as AltaVista
Possible problems with links:
• “Link-spamming services” ☹
• Being in the
“Web sites that suck” portal
• Resources needed to encourage linking

Encouraging Links
You can:
• Submit to directories (e.g. Yahoo!)
• Use directory (and search engine) submission services
• Have clear entry points with static URLs for key menu pages
• Think about who you want to link to you and why they would do so
• Target them and think of motivation (e.g. attractive small icon)
• Monitor trends in links (e.g. try <http://www.linkpopularity.com/>)

Monitoring
You may find it useful to:
• Monitor the status of your Web site
  - Nos. of pages indexed
  - Nos. of links to your Web site
  - Accessibility of your Web site
  - Compliance with standards
  - Downtime of the service
• Monitor trends
  - Do the findings change over time / after dissemination?
• Compare your findings with your peers
  - Comparison with other projects
  - Comparison with other institutions
  - Comparison with other communities

Monitoring
Many evaluation tools and Web services are available (some for free)
See <http://www.ukoln.ac.uk/web-focus/events/workshops/pub-lib-2000/workshop/> for exercises from the Auditing and Evaluating Web Sites workshop (and new workshop next week)

Embedding Your Service
So you’ve now:
• Produced a high-quality Web site which is easily found, well-linked and accessible
What Next?
• You may want institutions to install your service
• You may want institutions to install scripts which integrate with your service
• You may want institutions to install software on users’ desktop PCs
Your project may simply be a proof-of-concept, and you aren’t too concerned about deployment. But what if your project is so good that others want to deploy it?

Standards, Architectures, Applications, Resources
Let’s agree on the standards and be agnostic on the applications used to implement the standards, provided services are interoperable
Standards: concerned with protocols and file formats
  - Open standards vs. Proprietary
  - HTML / XML vs. PDF
  - CSS / XSL vs.
HTML
Architectures: models for implementing systems
  - Which standards are applicable
  - NT / Unix
  - File system / database application
  - HTML tools / content management
Applications: software products used to implement systems
  - Apache / IIS
  - FrontPage / Dreamweaver
  - Oracle / SQLServer / MySQL
  - ColdFusion vs ASP vs JSP
Resources: financial and staff costs needed to implement systems
  - Development vs. Migration costs
  - Use of in-house expertise
  - In-house vs. out-sourced
  - Licensed vs. open source

Barriers to Embedding
In order to persuade institutions to deploy your service:
• You will have to convince the SysAdmin your software:
  - Doesn’t have security holes
  - Won’t degrade the performance of the service
  - Won’t require updates to any system libraries
  - Won’t require any reconfiguration of server software
  - Will be maintained and is adequately documented
  - Is worth him (usually) spending his time on the work
• You may have to convince the IT Service’s management
• You may need buy-in from the user of your service (e.g. the Library)
How big a barrier do you think this will be?

RDN-Include – A Case Study
Subject gateways (such as SOSIG & EEVL) are successful but institutions:
• May feel they are taking users off-site
• May feel that they should be doing (or seen to be doing) the job locally
• Feel that their users will be disoriented by leaving the local look-and-feel (landscape)
RDN-Include was developed:
• To allow institutions to provide access to RDN hubs using the institution’s own look-and-feel and URL
Short paper on this work given at WWW 10.
See <http://www.ukoln.ac.uk/web-focus/papers/www10/>

RDN-Include and RDNi-Lite
RDN-Include was developed:
• As a CGI script written in Perl
• Requires the institution to install the CGI
• Requires the RDN to update its tables
RDNi-Lite was developed:
• To provide a lightweight alternative to RDNi
• To allow the service to be tried and implemented by an HTML author
• Implemented using JavaScript
• See <http://www.rdn.ac.uk/rdn-i/>

<script type="text/javascript" src="http://www.rdn.ac.uk/rdn-i/cgibin/rdnilite.cgi?tags=RDNTEXT,RDNLIST&template=http://www.mmu.ac.uk/services/library/rdni/rdnitemplate.html"></script>
It’s implemented using a single line of JavaScript

News Feeds
Providing automated news feeds which can be included in third-party Web sites with no manual intervention is a good way to support dissemination

Extension to News Feeds
The RDN:
• Wants to provide news feeds about developments by RDN hubs
• It’s using the RSS standard for news feeds (an XML/RDF application)
• A CGI-based RSS parser (and authoring tool) has been created
• To allow potential users to try it out easily, a JavaScript parser has also been written
• See <http://rssxpress.ukoln.ac.uk/>
Can this (slightly) heavyweight CGI solution, complemented by a lightweight JavaScript solution, be used within your project?

Mirroring and Preservation
Another way to embed your service remotely is for it to be mirrored:
• Use of Web mirroring software to install service at another location (e.g.
overseas to overcome network bandwidth problems or behind a firewall)
• Issues about whether you are mirroring output from a service or the service itself (affected by push vs pull mode of mirroring)
• JISC, for example, may wish to mirror your service in order to preserve it (once funding runs out and everyone leaves)
Note that you may wish to mirror only the project deliverables Web site, and not the Web site for partners or the Web site about the project – another reason for having separate Web sites

Benchmarking
You are responsible for designing the architecture of your Web site and monitoring its effectiveness
Certain things may be best done centrally:
• Ensuring compliance with contractual agreements (Web site still exists, conforms with accessibility guidelines, etc.)
• Benchmarks across programme in order to make comparisons, spot best practices, identify where advice & guidance is needed, etc.
• Not intended as league tables (projects will have different funding levels, remits, communities, levels of visibility, etc.)
Plans to produce a briefing document on “Web Portal Guidelines For Programme Coordinators” for JISC (and EU?)

Words On Browser Support
The aim:
• Services would degrade gracefully for old browsers
This has not happened ☹
My concern - Can I make assumptions about:
• Frames & JavaScript support?
• Support for CSS (stylesheets)?
• Browser plugins (e.g. Flash)?
• …

Words On Browser Support
Possible solutions:
• Design for mid-1990s Web technologies
• Client-side (JavaScript) user-agent sniffing
• Server-side (e.g. PHP, JSP, ASP) user-agent sniffing
• Design assuming support for current standards
Should JISC aim to define minimum browser standards?
Note:
• Design of richly functional, accessible services using flawed 1990s applications is difficult
• Pre 4.7 versions of Netscape are no longer supported (security concerns – see <http://home.netscape.com/cms/certinfo.html>)
• Netscape moving out of browser market?
See <http://browserwatch.internet.com/news/stories2001/news-20010606-1.html>

Conclusions
To conclude:
• Make plans for the architecture of your Web service (URL naming, mirrorability, dissemination, etc.) at the start
• Monitor aspects of your Web service
• Design your service so that it can be embedded in other institutions (which will have different cultures, resource levels and priorities to your own)
• Don’t forget the people issues (liaison, listening, etc.) not covered in this talk
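
Supplementary sketch for the Robots slide: one possible way to check which paths the /robots.txt example from the talk (Disallow: /partners and Disallow: /draft) closes to indexing robots, using Python's standard urllib.robotparser module. The helper name is_indexable and the sample path /partners/minutes.html are assumptions made for this illustration; they are not part of the talk.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the /robots.txt example on the Robots slide.
ROBOTS_TXT = """\
User-agent: *
Disallow: /partners
Disallow: /draft
"""


def is_indexable(path: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if a robot identifying as `agent` may fetch `path`.

    Hypothetical helper: it parses the robots.txt text locally, so no
    network access is needed.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # /issue5/metadata/ is the directory-default URL from the URL Naming
    # Policy slide; /partners/minutes.html is an invented example path.
    for path in ("/", "/issue5/metadata/", "/partners/minutes.html", "/draft/"):
        status = "indexable" if is_indexable(path) else "banned"
        print(f"{path}: {status}")
```

Disallow rules match by path prefix, which is why grouping related resources in directories makes the ban easy to express. Running a check like this after an official launch is one way to confirm that the ban has been switched off.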