Chapter 27 HTTP and WWW
McGraw-Hill ©The McGraw-Hill Companies, Inc., 2004

HTTP
Hypertext Transfer Protocol (HTTP) is used mainly to access data on the World Wide Web.
Supports a hypertext environment in which a user can jump from one document to another.
Functions like FTP and SMTP.
Transfers files and uses the services of TCP; uses TCP port 80.
Transfers data between client and server.
HTTP information is read and interpreted by the HTTP server and the HTTP client.

Figure 27.1 HTTP Transaction
HTTP itself is a stateless protocol.
The client initializes the transaction by sending a request message.
The server replies by sending a response.
Two types of HTTP messages: request and response.

Figure 27.2 Request Message
Request line
  Request type (method)
  Uniform Resource Locator (URL): address of the web page
    Method: protocol used to retrieve the document
    Host computer: name of the computer where the information is located
    Port: [optional] port number of the server
    Path: path name of the file where the information is located
  Version: HTTP 1.1, 1.0, or 0.9
Headers
Body

Figure 27.3 Request line, URL

Methods
The request method is the actual command or request that a client issues to the server.
GET: Client wants to retrieve a document from the server.
HEAD: Client wants information about a document, not the document itself.
POST: Client provides information to the server.
PUT: Client provides a document to the server.
PATCH: Similar to PUT, but the request carries only the differences that should be applied to the existing file.
COPY: Copies a file to another location. The source is in the request line and the destination is in the entity header.
MOVE: Moves a file to another location.
DELETE: Removes a document from the server.
LINK: Creates a link or links from a document to another location.
UNLINK: Deletes links created by the LINK method.
OPTIONS: Used by the client to ask the server about available options.

Figure 27.5 Response Message
Status line
  HTTP version
  Status code: three-digit field, as in FTP and SMTP
  Status phrase: explains the status code in text form
Headers
  Exchange additional information between client and server.
  Format: header name, colon, space, header value.
Body

Figure 27.6 Status Line
Header Format

Figure 27.8 Header Categories
General header: gives information about the message.
Request header: specifies the client's configuration and the client's preferred document format.
Response header: specifies the server's configuration and special information about the request.
Entity header: gives information about the body of the document. Mostly present in response messages, but also in request messages that use the POST and PUT methods.
A request message has only general, request, and entity headers.
A response message has only general, response, and entity headers.

Example 1
This example retrieves a document. We use the GET method to retrieve an image with the path /usr/bin/image1. The request line shows the method (GET), the URL, and the HTTP version (1.1). The header has two lines that show that the client can accept images in GIF and JPEG format. The request does not have a body. The response message contains the status line and four lines of header. The header lines define the date, server, MIME version, and length of the document. The body of the document follows the header (see Fig. 27.9, next slide).

Figure 27.9 Example 1

Example 2
This example retrieves information about a document. We use the HEAD method to retrieve information about an HTML document (see the next section). The request line shows the method (HEAD), URL, and HTTP version (1.1).
The header is one line showing that the client can accept the document in any format (wild card). The request does not have a body. The response message contains the status line and five lines of header. The header lines define the date, server, MIME version, type of document, and length of the document (see Fig. 27.10, next slide). Note that the response message does not contain a body.

Figure 27.10 Example 2

Features of HTTP 1.1

Persistent connection
Default option in HTTP 1.1.
The server leaves the connection open for more requests after sending a response.
The server can close the connection at the request of a client or when a timeout has been reached.
Usually the length of the data is sent along with each response. When the length is not known, the server informs the client of this and closes the connection after sending the data, so the client knows that the end of the data has been reached.

Nonpersistent connection
Used by HTTP 1.0.
One TCP connection is made for each request/response.
The client opens a TCP connection and sends a request.
The server sends the response and closes the connection.
The client reads the data until it encounters an end-of-file marker; the client then closes the connection.
For N different images in different files, the connection must be opened and closed N times, which imposes high overhead on the server.

Proxy Server
HTTP supports proxy servers.
A proxy server is a computer that keeps copies of responses to recent requests.
If a proxy server is present, the HTTP client sends its request to the proxy server, and the proxy server checks its cache.
If the response is not stored in the cache, the proxy server sends the request to the corresponding server.
Incoming responses are sent to the proxy server and stored for further requests from other clients.
Reduces the load on the original server, decreases traffic, and improves latency.
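The proxy behaviour described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name CachingProxy and the function fake_origin are hypothetical names invented for this illustration, no real network traffic is involved, and the sketch ignores expiry and validation of cached copies.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy model of a proxy server that caches responses by URL.

    `fetch_from_origin` is a hypothetical stand-in for the real request
    to the origin server; it is not part of any HTTP library.
    """

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._cache: Dict[str, str] = {}
        self.origin_requests = 0  # how often the origin server was contacted

    def get(self, url: str) -> str:
        # Cache hit: answer from the stored copy, no origin traffic.
        if url in self._cache:
            return self._cache[url]
        # Cache miss: forward the request, then store the response
        # so that later requests from other clients can reuse it.
        self.origin_requests += 1
        response = self._fetch(url)
        self._cache[url] = response
        return response


def fake_origin(url: str) -> str:
    # Hypothetical origin server used only for this illustration.
    return f"contents of {url}"


proxy = CachingProxy(fake_origin)
first = proxy.get("/usr/bin/image1")   # miss: forwarded to the origin
second = proxy.get("/usr/bin/image1")  # hit: served from the cache
print(first == second, proxy.origin_requests)
```

The second request never reaches the origin, which is the source of the reduced server load and lower latency listed above.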
World Wide Web (WWW)

Figure 27.11 Distributed services
Repository of information spread all over the world.
Unique combination of flexibility, portability, and user friendliness.
The WWW today is a distributed client-server service, in which a client using a browser can access a service using a server. However, the service provided is distributed over many locations called websites.

Figure 27.12 Hypertext
Documents are linked using pointers.
Hypertext documents contain only text; hypermedia documents can contain pictures, graphics, and sound.
A unit of hypertext or hypermedia available on the Web is called a page.
The main page for an organization or an individual is called a homepage.

Figure 27.13 Browser architecture
A browser has three parts:
Controller: receives input from the keyboard or mouse and uses the client programs to access the document.
Client programs: used by the controller to access the document.
Interpreters: after the document has been accessed, the controller uses one of the interpreters (HTML or Java) to display the document on the screen.

Figure 27.14 Categories of Web documents
Static documents
Fixed-content documents that are created and stored in the server.
The client can get only a copy of the document.
The contents in the server can be changed, but the user cannot change them.

Figure 27.15 Static Document

HTML (Hypertext Markup Language)
Figure 27.16 Boldface Tags
Language for creating web pages.
Tags are instructions to the browser.
HTML allows us to embed formatting instructions in the file itself.
HTML lets us use only ASCII characters for both the main text and the formatting instructions.
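As a rough illustration of formatting instructions embedded in ASCII text, the sketch below separates tags from main text using Python's standard html.parser module. The class name TagLister and the sample sentence are assumptions made for this illustration; they do not come from the slides, and a real browser interpreter does far more than this.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Collects tags and main text separately (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = []  # formatting instructions, in document order
        self.text = []  # main text with the tags removed

    def handle_starttag(self, tag, attrs):
        self.tags.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.tags.append(f"</{tag}>")

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


# Sample page: text and formatting instructions share one ASCII string.
page = "This is the text to be <B>boldfaced</B>."

lister = TagLister()
lister.feed(page)
lister.close()

print(lister.tags)  # html.parser reports tag names in lowercase
print(lister.text)
print(page.isascii())
```

The point of the sketch is only that tags and text travel together in one plain ASCII file and are told apart by the < and > signs.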
Figure 27.18 Beginning and Ending Tags

Structure of a web page
Head
  First part of a web page.
  Contains the title of the page and other parameters that the browser will use.
Body
  The actual contents of a page are in the body, which includes text and tags.
  Tags define the appearance of the document.
Tags
  Marks that are embedded into the text.
  Enclosed in two signs (< and >); they usually come in pairs.
  The beginning tag starts with the name of the tag, and the ending tag starts with a slash followed by the name of the tag.

Table 27.1 Common tags

Beginning Tag   Ending Tag   Meaning

Skeletal Tags
<HTML>          </HTML>      Defines an HTML document
<HEAD>          </HEAD>      Defines the head of the document
<BODY>          </BODY>      Defines the body of the document

Title and Header Tags
<TITLE>         </TITLE>     Defines the title of the document
<Hn>            </Hn>        Defines a header of level n

Text Formatting Tags
<B>             </B>         Boldface
<I>             </I>         Italic
<U>             </U>         Underlined
<SUB>           </SUB>       Subscript
<SUP>           </SUP>       Superscript

Data Flow Tags
<CENTER>        </CENTER>    Centered
<BR>                         Line break

List Tags
<OL>            </OL>        Ordered list
<UL>            </UL>        Unordered list
<LI>            </LI>        An item in a list

Image Tag
<IMG>                        Defines an image

Hyperlink Tag
<A>             </A>         Defines an address (hyperlink)

Executable Contents
<APPLET>        </APPLET>    The document is an applet

Example 3
This example shows how tags are used to let the browser format the appearance of the text.
    <HTML>
    <HEAD>
    <TITLE> First Sample Document </TITLE>
    </HEAD>
    <BODY>
    <CENTER>
    <H1><B> ATTENTION </B></H1>
    </CENTER>
    You can get a copy of this document by:
    <UL>
    <LI> Writing to the publisher
    <LI> Ordering online
    <LI> Ordering through a bookstore
    </UL>
    </BODY>
    </HTML>

Example 4
This example shows how tags are used to import an image and insert it into the text.

    <HTML>
    <HEAD>
    <TITLE> Second Sample Document </TITLE>
    </HEAD>
    <BODY>
    This is the picture of a book:
    <IMG SRC="Pictures/book1.gif" ALIGN=MIDDLE>
    </BODY>
    </HTML>

Example 5
This example shows how tags are used to make a hyperlink to another document.

    <HTML>
    <HEAD>
    <TITLE> Third Sample Document </TITLE>
    </HEAD>
    <BODY>
    This is a wonderful product that can save you money and time.
    To get information about the producer, click on
    <A HREF="http://www.phony.producer"> Producer </A>
    </BODY>
    </HTML>

Figure 27.19 Dynamic Document
Dynamic documents do not exist in a predefined format.
A dynamic document is created by a Web server whenever a browser requests the document.
When a request arrives, the Web server runs an application program that creates the dynamic document.
The server returns the output of the program as a response to the browser that requested the document.
Because a fresh document is created for each request, the contents of a dynamic document can vary from one request to another.
An example is getting the date and time from the server.

Steps involved in handling dynamic documents
The server examines the URL to find out whether it defines a dynamic document.
If the URL defines a dynamic document, the server executes the program.
The server sends the output of the program to the client (browser).

Common Gateway Interface (CGI)
Technology that creates and handles dynamic documents.
CGI is a set of standards that defines how a dynamic document should be written, how input data should be supplied to the program, and how the output result should be used.
CGI programs can be written in C, C++, Perl, …
"Common" indicates that the standard defines a set of rules that are common to any language or platform.
"Gateway" means that a CGI program is a gateway that can be used to access other resources such as databases and graphics packages.
"Interface" means that there is a set of predefined terms, variables, calls, and so on that can be used in any CGI program.

Example 6
Example 6 is a CGI program written in Bourne shell script. The program accesses the UNIX utility (date) that returns the date and the time. Note that the program output is in plain text.

    #!/bin/sh
    # The head of the program
    echo Content-type: text/plain
    echo
    # The body of the program
    now=`date`
    echo $now
    exit 0

Example 7
Example 7 is similar to Example 6 except that the program output is in HTML.

    #!/bin/sh
    # The head of the program
    echo Content-type: text/html
    echo
    # The body of the program
    echo "<HTML>"
    echo "<HEAD><TITLE> Date and Time </TITLE></HEAD>"
    echo "<BODY>"
    now=`date`
    echo "<CENTER><B> $now </B></CENTER>"
    echo "</BODY>"
    echo "</HTML>"
    exit 0

Example 8
Example 8 is similar to Example 7 except that the program is written in Perl.

    #!/bin/perl
    # The head of the program
    print "Content-type: text/html\n";
    print "\n";
    # The body of the program
    print "<HTML>\n";
    print "<HEAD><TITLE> Date and Time </TITLE></HEAD>\n";
    print "<BODY>\n";
    $now = `date`;
    print "<CENTER><B> $now </B></CENTER>\n";
    print "</BODY>\n";
    print "</HTML>\n";
    exit 0;

Figure 27.20 Active document
For active documents, we need a program to be run at the client side, for example to run animations.
When a browser requests an active document, the server sends a copy of the document in the form of byte code.
The document is then run at the client (browser) site; the client can also store this document in its own storage area.
An active document is stored in binary code in the server.

Creation, compilation and execution
At the server site, a programmer writes a program in source code and stores it in a file.
The code is compiled into byte code and stored in a file. The path name of this file is the one used by a URL to refer to the file.
In this file, each program command (statement) is in binary form, and each identifier (variables, constants, function names, and so on) is referred to by a binary offset address.
The client (browser) requests a copy of the binary code, which is probably transported in compressed form from the server to the client (browser).
The client (browser) uses its own software to change the binary code into executable code. The software links all the library modules and makes the program ready for execution.
The client (browser) runs the program and creates the result, which can include animation or interaction with the user.

Java
Java is a combination of a high-level programming language, a runtime environment, and a class library that allows a programmer to write an active document (an applet) and a browser to run it.
A Java program can also be a stand-alone program that does not use a browser.
Java is an object-oriented language like C++, but without operator overloading or multiple inheritance.
Java is platform-independent and does not use pointer arithmetic.
Because Java is object-oriented, a programmer defines a set of objects and a set of operations (methods) to operate on those objects.
Java is a typed language, which means that the programmer must declare the type of any piece of data before using it.
Java is also a concurrent language, which means the programmer can use multiple threads to create concurrency.
Classes and Objects
An object is an instance of a class that uses methods (procedures or functions) to manipulate encapsulated data.

Inheritance
Inheritance defines a hierarchy of objects, in which one object can inherit data and methods from other objects.
In Java, we can define a class as the base class that contains data and methods common to many classes.
Inherited classes can inherit these data and methods and can also have their own data and methods.

Packages
Java has a rich library of classes, which allows the programmer to create and use different objects in an applet.

Figure 27.21 Skeleton of an applet
An applet is an active document written in Java.
It is actually the definition of a publicly inherited class, which inherits from the Applet class defined in the java.applet library.
The programmer can define private data and public and private methods in this definition.
The client process (browser) creates an instance of this applet.
The browser then uses the public methods defined in the applet to invoke private methods or to access data.

Figure 27.23 Creation and compilation
Use an editor to create a Java source file.
The name of the file is the same as the name of the publicly inherited class, with the "java" extension.
The Java compiler creates the bytecode for the file, with the "class" extension.
The result is an applet that can be run by a browser.

Figure 27.24 HTML document carrying an applet
To use an applet, an HTML document is created and the name of the applet is inserted between the <APPLET> tags.
The tag also defines the size of the window used for the applet.

Example 9
In this example, we first import two packages, java.awt and java.applet.
They contain the declarations and definitions of classes and methods that we need. Our example uses only one publicly inherited class called First. We define only one public method, paint. The browser can access the instance of First through the public method paint. The paint method, however, calls another method called drawString, which is defined in the Graphics class of java.awt.

    import java.applet.*;
    import java.awt.*;

    public class First extends Applet
    {
        public void paint (Graphics g)
        {
            g.drawString ("Hello World", 100, 100);
        }
    }

Example 10
In this example, we modify the program in Example 9 to draw a line. Instead of the method drawString, we use another method called drawLine. This method needs four parameters: the x and y coordinates at the beginning of the line and the x and y coordinates at the end of the line. We use 0, 0 for the beginning and 80, 90 for the end.

    import java.applet.*;
    import java.awt.*;

    public class Second extends Applet
    {
        public void paint (Graphics g)
        {
            g.drawLine (0, 0, 80, 90);
        }
    }