Future Technologies: The Development Of Web Standards

Brian Kelly
UK Web Focus
UKOLN
University of Bath
Email Address: [email protected]
URL: http://www.ukoln.ac.uk/

UKOLN is funded by the Library and Information Commission, the Joint Information Systems Committee of the Higher Education Funding Councils, as well as by project funding from the JISC's Electronic Libraries Programme and the European Union. UKOLN also receives support from the University of Bath where it is based.

Contents
• Introduction
• Web Standards Overview
• Web Standards:
  – Data Formats
  – Transport
  – Addressing
  – Metadata
• Deployment Issues

Aims of Talk
• To give a brief overview of web architecture
• To describe developments to web standards
• To explain why the developments are needed

About UK Web Focus
UK Web Focus:
• JISC-funded post
• Advises UK HE community on web developments
• Represents JISC on World Wide Web Consortium (W3C)
• Organises events (e.g. national web managers workshop)
• Dissemination of information: e.g. see Web Focus and WebWatch columns in Ariadne <http://www.ariadne.ac.uk/> and column in Exploit Interactive <http://www.exploit-lib.org/>

Why Care About Standards?
This talk covers development of web standards, not web applications.
An understanding of web standards is needed:
• To appreciate when solutions are proprietary
• To provide flexibility and interoperability
• To avoid developing home-grown application solutions, when protocol solutions are in the offing
The seminar is aimed at:
• Enthusiastic information providers
• Web applications developers (e.g. CBL)
• Web support staff
• Web researchers
• Other interested parties

Standardisation
Proprietary
• De facto standards
• Often initially appealing (cf PowerPoint, PDF)
• May emerge as standards
W3C
• Produces W3C Recommendations on Web protocols
• Managed approach to developments
• Protocols initially developed by W3C members
• Decisions made by W3C, influenced by member and public review
ISO
• Produces ISO Standards
• Can be slow moving and bureaucratic
• Produce robust standards
IETF
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols developed by interested individuals
• "Rough consensus and working code"
[Diagram labels giving example formats and protocols: HTML extensions, PDF and Java?, PNG, HTML, Z39.50, Java?, HTTP, URN, whois++]

The Web Vision
Tim Berners-Lee's vision for the Web:
• Evolvability is critical
• Automation of information management: If a decision can be made by machine, it should
• All structured data formats should be based on XML
• Migrate HTML to XML
• All logical assertions to map onto RDF model
• All metadata to use RDF
See keynote talk at WWW 7 conference at <URL: http://www.w3.org/Talks/1998/0415-Evolvability/slide1-1.htm>

What Is The Web?
The web can be regarded as a distributed multimedia hypertext system which is based on three core architectural components:
• Transport (HTTP)
• Data Format (HTML)
• Addressing (URL)
[Diagram: Addressing (URL), Transport (HTTP), Data format (HTML)]

HTML 4.0, CSS 2.0 and DOM
HTML 4.0 used in conjunction with CSS 2.0 (Cascading Style Sheets) and the DOM provides an architecturally pure, yet functionally rich environment.
HTML 4.0 - W3C-Rec
• Improved forms
• Hooks for stylesheets
• Hooks for scripting languages
• Table enhancements
• Better printing
CSS 2.0 - W3C-Rec
• Support for all HTML formatting
• Positioning of HTML elements
• Multiple media support
DOM - W3C-Rec
• Document Object Model
• Hooks for scripting languages
• Permits changes to HTML & CSS properties and content
Problems
• Changes during CSS development
• Netscape & IE incompatibilities
• Continued use of browsers with known bugs

HTML Limitations
HTML 4.0 / CSS 2.0 have limitations:
• Difficulties in introducing new elements
  – Time-consuming standardisation process (<ABBREV>)
  – Dictated by browser vendor (<BLINK>, <MARQUEE>)
• Area may be inappropriate for standardisation:
  – Covers specialist area (maths, music, ...)
  – Application-specific (<STUD-NUM>)
• HTML is a display (output) format
• HTML's lack of arbitrary structure limits functionality:
  – Find all memos copied to John Smith
  – How many unique tracks on Jackson Browne CDs

XML
XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENTNUMBER>, <PART-NO>, etc)
• Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998
• Support from industry (SGML vendors, Microsoft, etc.)
• Support in IE 5 and Netscape 5(?)

XML Deployment
Ariadne issue 15 has article on "What Is XML?"
Describes how XML support can be provided:
• Natively by new browsers
• Back end conversion of XML to HTML
• Client-side conversion of XML to HTML / CSS
• Java rendering of XML
The conversion approaches are examples of intermediaries.
See http://www.ariadne.ac.uk/issue15/what-is/

XLink, XPointer and XSL
XLink will provide sophisticated hyperlinking missing in HTML:
• Links that lead user to multiple destinations
• Bidirectional links
• Links with special behaviours:
  – Expand-in-place / Replace / Create new window
  – Link on load / Link on user action
• Link databases
Example of an extended link:
  <commentary xml:link="extended" inline="false">
    <locator href="smith2.1" role="Essay"/>
    <locator href="jones1.4" role="Rebuttal"/>
    <locator href="robin3.2" role="Comparison"/>
  </commentary>
XPointer will provide access to arbitrary portions of XML resource.
XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents).

XHTML
XHTML (Extensible HTML):
• HTML as an XML application
• Enables XML tools to be used
• Main differences:
  – Elements in lower case
  – Elements must be closed: <li>List element</li>
  – Attribute values must be quoted: <img src="logo.gif" height="20" …
  – Empty elements thus: <br />
• Proposed recommendation
• See <http://www.w3.org/TR/xhtml1/>

Other XML Applications
XML is being used as the basis of new Web data formats, such as:
• SVG (Scalable Vector Graphics): See <http://www.w3.org/Graphics/SVG/>
• SMIL (Synchronized Multimedia Integration Language): See <http://www.w3.org/AudioVideo/>
• MathML: See <http://www.w3.org/MarkUp/Math/>

Addressing
URLs (e.g. http://www.bristol-poly.ac.uk/depts/music/) have limitations:
• Lack of long-term persistency
  – Organisation changes name
  – Department shut down or merged
  – Directory structure reorganised
• Inability to support multiple versions of resources (mirroring)
URNs (Uniform Resource Names):
• Proposed as solution
• Difficult to implement (no W3C activity in this area)

Addressing - Solutions
DOIs (Digital Object Identifiers):
• Proposed by publishing industry as a solution
• Aimed at supporting rights ownership
• Business model needed
PURLs (Persistent URLs):
• Provide single level of redirection
Pragmatic Solution:
• URLs don't break - people break them
• Design URLs to have long life-span
NOTE: URL naming can affect how well a web site is indexed by search engines – see <http://www.ukoln.ac.uk/web-focus/events/concertation/libraries-nov99/>

Transport
HTTP/0.9 and HTTP/1.0:
• Design flaws and implementation problems
HTTP/1.1:
• Addresses some of these problems
• 60% server support
• Performance benefits! (60% packet traffic reduction)
• Is acting as fire-fighter
• Not sufficiently flexible or extensible
HTTP/NG:
• Radical redesign using object-oriented technologies
• Undergoing trials
• Gradual transition (using proxies)
• Integration of application (distributed searching?)
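
The Transport slide does not say which HTTP/1.1 features produce the performance benefits. One feature that is certainly part of HTTP/1.1 is the persistent connection: several requests can share one TCP connection instead of opening a new one each time. The sketch below is illustrative only and is not taken from the talk. It starts a throwaway local server (the handler class, function name and paths are all hypothetical) and checks that two requests leave from the same client port, which means the same connection was reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    """Hypothetical demo handler; speaks HTTP/1.1 so connections persist."""

    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        # A Content-Length header lets the client know where the body ends,
        # so the connection can stay open for the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def client_ports_for_two_requests():
    """Send two GETs over one HTTPConnection; return the local port used for each."""
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    ports = []
    try:
        for path in ("/first", "/second"):
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # drain the body before reusing the connection
            ports.append(conn.sock.getsockname()[1])
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    first, second = client_ports_for_two_requests()
    print("same TCP connection reused:", first == second)
```

Under HTTP/1.0 without keep-alive, each request would need its own connection, with the extra set-up and tear-down packets that implies.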
Metadata
Metadata - the missing architectural component from the initial implementation of the web
[Diagram: Addressing (URL), Transport (HTTP), Data format (HTML), Metadata]
Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management

Metadata Examples
DSig (Digital Signatures initiative):
• Key component for providing trust on the web
• DSig 2.0 will be based on RDF and will support signed assertion:
  – This page is from the University of Bath
  – This page is a legally-binding list of courses provided by the University
P3P (Platform for Privacy Preferences):
• Developing methods for exchanging Privacy Practices of Web sites and users
Note that discussions about additional rights management metadata are currently taking place.

Example - Sitemaps
See http://www.elsop.com/linkscan/map.html
Sitemaps could provide navigational alternatives to browsing a site by following links.
Configurable site maps could enable end users to define hierarchies.
Sitemaps could be used by automated programs in B2B applications (cf 24 Hour Museum, HE Mall).

RDF
RDF (Resource Description Framework):
• Highlight of WWW 7 conference
• Provides a metadata framework ("machine understandable metadata for the web")
• Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping (MCF)
• Applications include:
  – cataloging resources
  – electronic commerce
  – digital signatures
  – intellectual property rights
  – resource discovery
  – intelligent agents
  – content rating
  – privacy
• See <URL: http://www.w3.org/Talks/1998/0417-WWW7-RDF>

RDF Model
RDF:
• Based on a formal data model (directed labelled graphs)
• Syntax for interchange of data
• Schema model
[Diagram: RDF Data Model. Labels: Resource, Property, PropName, PropObj, InstanceOf, PropertyType, Value. Example shown: page.html, Cost £0.05, ValidUntil 11-May-98]

Browser Support for RDF
Mozilla (Netscape's source code release) provides support for RDF.
Mozilla supports site maps in RDF, as well as bookmarks and history lists.
[Image labels: Trusted 3rd Party Metadata; Embedded Metadata, e.g. sitemaps]
Image from http://purl.oclc.org/net/eric/talks/www7/devday/

RDF Conclusion
• RDF is a general-purpose framework
• RDF provides structured, machine-understandable metadata for the Web
• Metadata vocabularies can be developed without central coordination
• RDF Schemas describe the meaning of each property name
• Signed RDF is the basis for trust

Deployment Issues
How can new technologies be deployed?
• Expect (hope) everyone will move to new browsers
• Use technologies in backwards-compatible manner
• Develop additional protocols e.g.
  – Transparent Content Negotiation
  – CC/PP (see http://www.w3.org/TR/NOTE-CCPP)
• User-agent negotiation
• Use of proxy intermediaries

Deployment Issues
More sophisticated deployment techniques can be adopted to overcome deficiencies in the simple model.
Original Model
[Diagram: HTML resource, Web server, browser]
• Web server simply sends file to client
• File contains redundant information (for old browsers) plus client interrogation support
Sophisticated Model
[Diagram: HTML / XML / database resource, backend processing, intelligent Web server, server proxy, client proxy, browser. The proxies are labelled as an example of an intermediary]
Intermediaries can provide functionality not available at client:
• DOI support
• XML support / format conversion
• Authentication

Conclusions
To conclude:
• Standards are important, especially for national initiatives and other large-scale services
• Proprietary solutions are often tempting because:
  – They are available
  – They are often well-marketed and well-supported
  – They may become standardised
  – Solutions based on standards may not be properly supported by applications
• Metadata is a big growth area
• Intermediaries (brokers) are likely to have a key role to play in deploying standards-based solutions
• Intelligent servers are likely to be important
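
Worked illustration (not part of the original slides): the RDF Model slide describes metadata as a directed labelled graph and uses the values page.html, Cost £0.05 and ValidUntil 11-May-98. The sketch below shows one simplified reading of that example as (resource, property, value) statements. It assumes both properties attach directly to page.html; the original diagram may instead attach ValidUntil to the Cost statement. The names TRIPLES and properties_of are hypothetical, and no RDF library or RDF syntax is used.

```python
# Each statement is one labelled arc in the graph:
# (resource, property name, value). Values come from the RDF Model slide;
# attaching both properties directly to page.html is an assumption.
TRIPLES = [
    ("page.html", "Cost", "£0.05"),
    ("page.html", "ValidUntil", "11-May-98"),
]


def properties_of(triples, resource):
    """Collect every property/value pair asserted about one resource."""
    return {prop: value for subject, prop, value in triples if subject == resource}


if __name__ == "__main__":
    for name, value in properties_of(TRIPLES, "page.html").items():
        print(f"page.html --{name}--> {value}")
```

Because every statement has the same three-part shape, a program can answer questions such as "what does page.html cost?" without knowing the vocabulary in advance. That uniformity is the sense in which the slides call RDF metadata machine understandable.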