LIS650 lecture 0
Introductory lecture
Thomas Krichel
2004-01-23

administrative matters
• Course home page is at http://wotan.liu.edu/home/krichel/lis650p04s
• First quiz next lecture!
• Deadline to finish web site: one week after the end of the last lecture.
• You will not be able to change your web site between the deadline and the time that the grade is issued!
• Subscribe to class mailing list https://lists.liu.edu/mailman/listinfo/cwp-lis650-krichel

today
• introduction to the course
• talk about you
• the basic ingredients of the web, without html
• introduction to our basic technical set up
• introduction to html

Course history
• Course was first run as an institute from 2002-05-13 to 2002-05-17.
• Title was “Webmastering I: the static web site”.
• To the curriculum committee, this title did not sound academic enough.
• Since “Web Site Architecture and Design” is now the full title, WeSaD (pronounced like “wizard”) is the official abbreviation.
• Webmastering is still what we want to learn.

teaching WeSaD
• WeSaD combines many aspects:
  – Authoring pages
  – Work on the organization of data to fit onto pages
  – Set display style of different pages
  – Organize the contribution of data
  – Maintain a technical web installation
• Some of them can be learned in a course, but others cannot.
• Emphasis has to be on learnable elements.

teaching philosophy
• Pointing and clicking in a piece of software is not enough
• Explain underlying principles
• Promote standards
  – HTML 4.01
  – CSS level 2.1
• Avoid proprietary software

WeSaD contents
• Deals with the maintenance of a static web site. Such a web site remains the same whatever the user does with it.
• Topics include
  – html
  – css
  – site usability and information architecture, as far as relevant for static web sites
  – http, uri, web server

things this course does not do
• Forms: HTML lets you design forms that users fill in, but you do not have the programming skills to do something with the form.
• Any HTML elements that require executable contents are not covered.
• Frames: they allow you to show several documents in one browser window. Most experts advise against them.
• We do not cover image maps.
• We do not cover some advanced CSS properties.

Other courses: webmastering II
• Deals with building dynamic web sites.
  – Users fill in a form
  – Users submit the form
  – Web server returns a page that is specific to the request of the user.
• Teaches a language called PHP, which is widely used to generate such web sites.
  – Introduces you to computer programming
  – Trains analytical thinking.

other courses: webmastering III
• Deals with XML
  – XML is a syntax to encode any kind of data.
  – XML can be constrained to only allow certain types of data (XML Schema)
  – XML can be transformed to render the data in various ways (XSLT)
• Achieves a separation of contents and presentation of a web page.
• Advanced course; covers both Schema and Transformation

The world wide web
The World Wide Web (Web) is a network of information resources. The Web relies on three mechanisms to make these resources readily available to the widest possible audience:
  – A uniform naming scheme for locating resources on the Web (e.g., URIs).
  – Protocols, for access to named resources over the Web (e.g., HTTP).
  – Hypertext, for easy navigation among resources (e.g., HTML).

URI introduction
• Every resource available on the Web (HTML document, image, video clip, program, etc.) has an address that may be encoded by a Uniform Resource Identifier, or "URI".
• URIs typically consist of three pieces:
  – The naming scheme of the mechanism used to access the resource.
  – The name of the machine hosting the resource.
  – The name of the resource itself, given as a path.

example URI
• http://openlib.org/home/krichel
  This URI may be read as follows: There is a document available via the HTTP protocol, residing on the site openlib.org, accessible via the path "/home/krichel".
• mailto:krichel@openlib.org
  This URI may be read as follows: There is an email user krichel in the domain openlib.org to whom email may be sent.

client / server protocol
• The web operates mostly on http.
• This is a client-server protocol.
• The client software runs on the local PC that you are using.
  – It is called a web browser or user agent.
• Our server is a piece of hardware called wotan.liu.edu
  – It runs the Debian GNU/Linux operating system on an Intel architecture.
  – It runs http daemon software that answers http requests. The particular software is called Apache.

communication with the server
• The protocol for communicating with the server is the secure shell, ssh for short. It is based on public-key cryptography.
• We use two ssh clients
  – For file editing and manipulation, we use putty.
  – For file transfer, we use winscp.
  – Both are available on the web.
• Telnet and ftp servers are not available on wotan.liu.edu. Telnet and ftp do not encrypt the communication stream; therefore they are not secure.

registration time
• As part of the course, you are being provided with web space on the server wotan.liu.edu, at the URL http://wotan.liu.edu/~username where username is a user name that you will choose now.
• It is my intention to maintain this web space for you into the foreseeable future.
• You should also choose a password, now.
• I will now register you.

login time
• Use putty to connect to wotan.liu.edu on port 22
• Set other attributes of the session as you like, using the menu on the left, for example
  – colors
  – font shapes and sizes
  – bell
• Save the session as “wotan” (in the first screen) to save all the customization.
• You do not normally need to log in to the machine, unless you want to work with it directly.

free software
• I maintain the wotan.liu.edu server, but you can build your own server if
  – you have Internet access
  – you have an old PC to spare
• All the server software, as well as putty and winscp, is free and open-source.
• It is one of my fundamental beliefs that free information should run on free software.
• The library community can learn a hell of a lot from the free software community.
• See my talk at http://openlib.org/home/krichel/presentations/new_york_2003-11-07.ppt

installing software at home
• Go to your favorite search engine to search for
  – putty
  – winscp
• Download and run the Windows-style installers to install both pieces of software.
• Download and install a recent version of at least two browsers. I suggest
  – Netscape Navigator at http://channels.netscape.com/ns/browsers/download.jsp
  – Opera at http://www.opera.com

putty and winscp
• You can maintain files directly on wotan.liu.edu
  – by logging into wotan.liu.edu
  – using a file editor there, for example nano
  – past experience has shown that this is hard for students with no UNIX experience.
• You can also maintain text files locally
  – each time you make a change, you save the file and upload it to wotan.liu.edu using winscp.
  – you can use Notepad locally to maintain text files
  – I do not recommend using WordPad or Word.

create a web page in MS notepad
• Open Microsoft notepad. Type the text

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <html>
    <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title></title></head><body>
    <div></div></body>
    </html>

Saving the web page
• Save as “empty.html”.
• If you want to open it again in notepad
  – open notepad
  – select file/open
  – list all files
  – select empty.html
• Don't click on the file.
• Don't choose edit in the context menu.

upload and view file
• Once you have your file “empty.html”, use the menus of winscp to upload it to the public_html directory of your home directory on wotan.liu.edu.
• It has to be in public_html !
• Once it is there, use a web browser to view it at http://wotan.liu.edu/~user/empty.html, where user is your user id.
• Then validate it at http://validator.w3.org.
  – enter the URL of the page that you want to validate
  – hit the validate button
• It has to be in public_html !

public_html
• Is your web directory. It is automagically created for you when Thomas registers you.
• The web server will map requests to http://wotan.liu.edu/~user/file to show the file /home/user/public_html/file.
• Here user stands for your user id, and file is the file name.
• If file ends with “.html” or “.htm”, the web browser will be told that the file is an HTML file. It will be rendered accordingly by the browser.

index.html
• The web server on wotan will map requests to http://wotan.liu.edu/~user to show the file ~user/public_html/index.html
• If this file is not there, the server will prepare an HTML document from the list of files that it finds in the directory and send it to the user agent.
• Once you have a file index.html, the web user no longer sees the listing of files in your directory.

HTML and XHTML
• HTML is the hypertext markup language
• HTML is a markup language that is widely used on the World Wide Web (WWW)
• The latest, and probably last, version of HTML is at http://www.w3.org/TR/html4/
• The W3C, the standards body for the WWW, has issued XHTML, a replacement for HTML that is compatible with XML.
• We will ignore XHTML for the rest of the course.

what is markup?
• Everything in a document that is not content. It can be given in two ways
• 1: Procedural
  – Codes identify point size, style, font, etc.
  – Usually only understood by the defining tool
  – Example: Microsoft Word
• 2: Descriptive
  – Describes purpose of text within the document
  – Chapter head, Paragraph, Section Head, TOC
  – Structure and style are kept separate
  – Example: LaTeX, SGML

SGML
• Standard Generalized Markup Language
• Descriptive approach with three separate layers
  – structure: types of information in document
  – content: the information itself
  – style: matches typesetting with structure
• Developed for the publishing industry by a group around Goldfarb.
• So complicated that no software implements it fully
• Document Type Definition (DTD)
  – Defines the structure

Document Type Definition (DTD)
• Describes information the document handles
  – e.g. Title, TOC, Chapter, Section
• Relationships between fields
  – e.g. a Chapter contains Sections
• Consistency
• Logical structure
• Information defined by tags

HTML
• HyperText Markup Language
• Is defined by an SGML DTD
  – Head, Title, Body, Paragraph, etc.
  – Headings, Bold, Italic, etc.
  – Table, List, Image, etc.
  – Links to other documents
  – Forms
• Style applied by Web Browser
  – User has some control

HTML history
• HTML was a very bare-bones language when first invented by Tim Berners-Lee. It did not describe pages with much visual appeal.
• In the 90s, successful browsers invented “extensions” that aimed to stretch the visual boundaries of HTML.
• Some of these extensions found their way into the official HTML spec issued by the W3C.

“my HTML”
• I will teach HTML 4.01. This version has two different DTDs:
  – the loose DTD
  – the strict DTD
• I will only cover the tags of the strict DTD
• The loose DTD has more tags, but all the functionality of these tags is best done with style sheets.
• Thus, pages created with HTML only will look rather boring.
• But we do cover style sheets later.

HTML tags
• HTML markup is written as tags. Tags are (typically) written as pairs
  – begin with <tag> (the tag start)
  – end with </tag> (the tag end)
  – tag is the tag name
• Can be nested
• Can contain non-markup data
• Tag names are case-insensitive, but it is best to use the same case consistently, for human readability.

attributes to tags
• <atag attribute_name_one="value_one" attribute_name_two="value_two">
• Here attribute_name_one and attribute_name_two are attribute names, and value_one and value_two are attribute values.
• I will say tag <tag> “requires” attribute "attribute" if the attribute is mandatory.
• I will say tag <tag> “takes” attribute "attribute" if the attribute is optional.
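
To make the vocabulary of attribute names and attribute values concrete, here is a minimal sketch in Python, which is not part of this course and is used only for illustration. It lists the names and values found in a start tag using the standard library module html.parser. The class name AttributeLister and the function name list_attributes are made up for this illustration.

```python
from html.parser import HTMLParser


class AttributeLister(HTMLParser):
    """Collect (tag name, attribute name, attribute value) triples."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser reports
        # tag names and attribute names in lower case.
        for name, value in attrs:
            self.found.append((tag, name, value))


def list_attributes(markup):
    """Return all attributes of all start tags in the given markup."""
    parser = AttributeLister()
    parser.feed(markup)
    parser.close()
    return parser.found


if __name__ == "__main__":
    sample = '<atag attribute_name_one="value_one" attribute_name_two="value_two">'
    for tag, name, value in list_attributes(sample):
        print(tag, name, value)
```

Because the parser lower-cases tag and attribute names, the sketch also shows that <A HREF="..."> and <a href="..."> are the same markup, while attribute values are reported exactly as written.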
Example

    <a href="http://openlib.org/home/krichel" title="homepage of Thomas Krichel">Thomas Krichel</a>

  – the whole thing is an <a> tag. (I surround tag names with <>)
  – “href” is an attribute name
  – “http://openlib.org/home/krichel” is the value of the "href" attribute (I surround attribute names with straight quotes)
  – “Thomas Krichel” is character data.

Characters: concept
• A character set combines two things
  – Character repertoire: a set of characters, e.g. "A", "ﺾ", "‼", "₣"
  – Character code positions: a number defined for each character in the repertoire.
• Character encoding is a way to encode the code positions in bytes
• To correctly display a document, the user agent needs to know both!

playing safe with characters
• Only use the characters on the US keyboard; don't insert symbols.
• Save as ASCII or UTF-8.
• Never save as "Unicode" within MS Notepad.
• If you encounter a character that is not on your keyboard, use an SGML entity.

Special Characters
• Inserted as an entity reference
  – Format can be &code;
    • Ex. &amp; to insert an ampersand
  – Codes are often abbreviations of the character names
  – Codes can be in hex form
    • Ex. &#x26; to insert an ampersand
• http://www.w3.org/TR/REC-html40/sgml/entities.html has the list

classifying tags
• There is a whole bunch of different tags.
• We can group tags together in different ways.
• In the following, I will explain some of the ways.
  – block-level vs text-level tags
  – tags that require closing vs those that do not.

block-level vs text-level tags
• Block-level tags contain data that is aligned vertically by visual user agents.
• Text-level tags are aligned horizontally by visual user agents.
• There are a number of reasons behind this distinction
  – Block-level tags can contain other block-level tags and text-level tags.
  – Text-level tags cannot contain block-level tags.
  – Visual user agents start a new line at the beginning of block-level tags.
  – Multidirectional text would be impossible without it.
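
The rule that text-level tags cannot contain block-level tags can be checked mechanically. The following is a rough sketch in Python (not part of this course, used only for illustration) built on the standard library module html.parser. The two tag lists are deliberately partial and were chosen for this illustration; the strict DTD defines the full sets. The names NestingChecker and find_nesting_problems are made up here, and the sketch is no substitute for the W3C validator.

```python
from html.parser import HTMLParser

# Partial lists, chosen for illustration only.
BLOCK_LEVEL = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6",
               "ul", "ol", "dl", "pre", "blockquote", "table", "address"}
TEXT_LEVEL = {"a", "em", "strong", "span", "code", "cite", "abbr", "q", "sub", "sup"}


class NestingChecker(HTMLParser):
    """Record block-level tags that start inside an open text-level tag."""

    def __init__(self):
        super().__init__()
        self.open_text_level = []  # stack of currently open text-level tags
        self.problems = []         # (block-level tag, enclosing text-level tag)

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_LEVEL and self.open_text_level:
            self.problems.append((tag, self.open_text_level[-1]))
        elif tag in TEXT_LEVEL:
            self.open_text_level.append(tag)

    def handle_endtag(self, tag):
        # Close the most recently opened matching text-level tag,
        # together with anything that was opened after it.
        if tag in self.open_text_level:
            last = len(self.open_text_level) - 1 - self.open_text_level[::-1].index(tag)
            del self.open_text_level[last:]


def find_nesting_problems(markup):
    """Return a list of (block-level tag, text-level tag) violations."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    print(find_nesting_problems("<div><em>fine</em></div>"))
    print(find_nesting_problems("<em><div>wrong</div></em>"))
```

The sketch only tracks text-level tags that are written as pairs, so it ignores tags without a closing tag and does not try to model every rule of the DTD.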
common frame for pages
• We look at empty.html again. Here is the start again

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

• This is an SGML document type declaration.
• It says which kind of HTML it is.
• Use empty.html as a start to compose all your pages.

special topic: images
• The appeal of the web to the masses has a lot to do with its capability to transport images.
• Image formats are independent of the web, but there are two classic formats that are widely supported by user agents.
  – GIF
  – JPEG

GIF
• stands for Graphics Interchange Format.
• developed by CompuServe.
• unresolved patent issues around its LZW compression make the format abhorred by the free software community.
• 256 colors maximum
• uses a lossless compression technique

GIF has three tricks
• interlacing:
  – when downloading the file, the browser can show a sparse subset of rows first (every eighth row in the first pass) and fill in the rest in later passes
  – the user gets an idea of the picture before it is sharp
• transparency
  – some GIFs are transparent, so you can see them on top of what already exists on the page
  – technically, the GIF designates one color as the transparent (background) color, and pixels of that color are ignored by the user agent
• animation
  – some GIFs are in fact sequences of GIFs that can be rendered one after the other.

JPEG
• The Joint Photographic Experts Group is a standard-making body for images
• JPEG images can have millions of colors (24 bits per pixel).
• The compression is lossy, i.e. the JPEG file will look like the original image, but not be the same.
• The compression does not work well with drawings.
• There are no copyright and patent problems with JPEG

working with wotan
• You can work with wotan directly if you like. Use putty to connect to wotan.liu.edu, then type

    cd public_html

• You can start from empty.html, the file that validates, and copy it to test.html

    cp empty.html test.html
    nano test.html

• Then you can change test.html to try out the tags as I discuss them here.
working on the local machine
• Open empty.html on your web site and save it as test.html
• Edit it with notepad to be safe
• Open it with Internet Explorer to see the rendered html
• To validate
  – you have to upload the file first to your public_html directory on wotan.liu.edu
  – then use the W3C validator at http://validator.w3.org

literature
• I work from the text of the official standard at http://www.w3.org/TR/html4/
• To work with it faster, I made a copy at http://wotan.liu.edu/~krichel/html4/
• You can work from any HTML book.

Homework
• Look at the course home page http://wotan.liu.edu/home/krichel/lis650p04s
• Send [email protected] your secret word for course result delivery.
• Prepare a summary, one page at most, of the type of website that you want to build; bring a printed copy with you next week.
• Prepare for the quiz at the beginning of next lecture.

http://openlib.org/home/krichel
Thank you for your attention!