Web Programming
Robert M. Dondero, Ph.D.
Princeton University

Objectives
  You will learn:
    The fundamentals of web programming...
      The hypertext markup language (HTML)
      Uniform resource locators (URLs)
      The hypertext transfer protocol (HTTP)

HTML
  A simple instance of the Standard Generalized Markup Language (SGML)
    SGML: you define the tags
    HTML: set of tags is predefined

HTML Documents
  An HTML document:
    Consists of plain text
    Text contains content and markup
      <tag>...</tag>
      <tag attribute="value">... </tag>
      <tag>
      <tag />
    Markup describes presentation, i.e. how the text should be presented/rendered
    Tags and attributes are case insensitive

Rendering HTML Documents
  Renderer usually is a Web browser
  Popular browsers:
    Microsoft Internet Explorer (53%)
    Mozilla Firefox (29%)
    Google Chrome (8%)
    Apple Safari (6%)
    Opera (2%)
    (Several others)

Rendering HTML Documents
  Browser notes:
    Each browser provides a way to examine the HTML document it has rendered
      Firefox: View → Page Source
      Good learning/debugging tool
    There are substantial differences among browsers
      Microsoft doesn't feel obliged to conform to standards
      Use Firefox for Web-related assignment(s)

Versions of HTML
  HTML 4.01 Strict
    Requires complete adherence to HTML 4.01 spec
  HTML 4.01 Transitional
    Allows some deprecated elements and attributes
    We'll use

Variants of HTML
  XML
    More strict syntax
      All tags must be closed by another
      All tags must be correctly nested
      ...
    You define the tags!
      Tags can be semantic in nature
    See upcoming "XML" lecture

Variants of HTML
  XHTML 1.0
    Same as HTML 4.01, but requires use of XML syntax

HTML Details
  See showhtml.html
  Document structure
    <!DOCTYPE ...>
    <html>
    <head>
    <title>...</title>
    </head>
    <body>
    ...
    </body>
    </html>

HTML Details
  Comments
  Heading tags
  Paragraph-level tags
  Empty tags
  Physical character formatting tags
  Logical character formatting tags
  Entity references
  Character references
  Lists
  Tables

Uniform Resource Locators
  Uniform Resource Locator (URL)
  Format: protocol://host:port/file
    protocol
      We'll use http
      Others: file, https, ftp, mailto, …
      See http://en.wikipedia.org/wiki/URI_scheme
    host
      An IP address or domain name
      Recall "Network Programming" lecture

Uniform Resource Locators
    port
      A number
      Recall "Network Programming" lecture
      For HTTP protocol, default port is 80
    file
      A filename
      Can specify a path
      Default file specified in web server settings
        Often index.html, index.php

Uniform Resource Locators
  Examples:
    http://www.cs.princeton.edu/~rdondero/index.html
    http://www.cs.princeton.edu:80/~rdondero/index.html

HTML (cont.)
  See showhtml.html (again)
  Links and anchors
    Each "page link" specifies a URL
  Forms
    Each form specifies a URL
  User commands browser to fetch page at a URL by:
    Typing the URL in the browser
    Clicking on a page link
    Submitting a form

Hypertext Transfer Protocol
  Hypertext Transfer Protocol (HTTP)
    A client/server protocol
    Server = web server
      Apache web server
      Apache Tomcat web server (written in Java, can interpret Java)
      Microsoft Internet Information Services (IIS) web server (for MS Windows)
    Client = browser (usually, but not necessarily)

HTTP Details
  Question: What happens when you:
    Type a URL which specifies the HTTP protocol?
    Click on a page link whose URL specifies the HTTP protocol?
    Submit a form whose URL specifies the HTTP protocol?
  Answer...

HTTP Details
  [Diagram: Browser → Socket → Web Server → File system]
  Browser sends through the socket:
    GET file HTTP/1.1        (Or could be POST; see next lecture)
    Host: host               (Redundant. Why?)
    <Blank line>
  Web Server → File system: file

HTTP Details
  [Diagram: File system → Web Server → Socket → Browser]
  Web Server sends through the socket:
    HTTP/1.1 200 OK
    Date: date
    Server: server
    …                        (There are many others...)
    Content-Type: text/html
    <Blank line>
    <Contents of file>       (A "program" interpreted by the browser as per the content type)

HTTP Content Types
  Content types
    text/html
    text/plain
    image/gif
    image/jpeg
    audio/mp4
    ...
  See this page: http://en.wikipedia.org/wiki/Internet_media_type

The Princeton CS Web Server
  Place html files in CS Dept file system (penguins) in this directory:
    ~YourLoginid/public_html
  Change directory/file permissions:
    chmod 755 ~YourLoginid
    chmod 755 ~YourLoginid/public_html
    chmod 644 ~YourLoginid/public_html/yourFile.html
  Browse to files using this URL:
    http://www.cs.princeton.edu/~YourLoginid/yourFile.html

The Princeton CS Web Server
  Beware:
    Web server demands that directories/files be accessible to all
    Rules concerning plagiarism apply

HTTP via a Browser
  Use a browser to visit:
    http://www.cs.princeton.edu
    http://www.cs.princeton.edu:80
    http://www.cs.princeton.edu/~rdondero/
    http://www.cs.princeton.edu/~rdondero/index.html

HTTP via Telnet
  Using telnet:

    $ telnet www.cs.princeton.edu 80
    GET / HTTP/1.1
    Host: www.cs.princeton.edu
    <Enter>

    $ telnet www.cs.princeton.edu 80
    GET /~rdondero/ HTTP/1.1
    Host: www.cs.princeton.edu
    <Enter>

    $ telnet www.cs.princeton.edu 80
    GET /~rdondero/index.html HTTP/1.1
    Host: www.cs.princeton.edu
    <Enter>

HTTP via Python Code
  See browser.py
  Try:
    browser.py www.cs.princeton.edu 80 /
    browser.py www.cs.princeton.edu 80 /~rdondero/
    browser.py www.cs.princeton.edu 80 /~rdondero/index.html

HTTP via Java Code
  See Browser.java
  Try:
    java Browser www.cs.princeton.edu 80 /
    java Browser www.cs.princeton.edu 80 /~rdondero/
    java Browser www.cs.princeton.edu 80 /~rdondero/index.html

Summary
  We have covered:
    The fundamentals of web programming...
      The hypertext markup language (HTML)
      Uniform resource locators (URLs)
      The hypertext transfer protocol (HTTP)
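
The URL format and the HTTP request/response exchange covered in this lecture can be tied together in a few lines of Python. What follows is a minimal sketch, not the course's browser.py (which is not reproduced here). The function names split_url, build_request, split_response and fetch are hypothetical, chosen only for illustration. The sketch assumes plain HTTP (no TLS), adds a "Connection: close" header that does not appear on the slides so that the server ends the connection after one response, and does not decode chunked transfer encoding.

```python
import socket
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # default port for the HTTP protocol


def split_url(url):
    """Split protocol://host:port/file into (protocol, host, port, file)."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_HTTP_PORT
    path = parts.path or "/"  # no file given: ask for the server's default file
    return parts.scheme, parts.hostname, port, path


def build_request(host, path):
    """Build a GET request: request line, headers, then a blank line."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # assumption: lets us simply read until EOF
        "",                   # the blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw):
    """Separate status line and headers from the body at the blank line."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


def fetch(url, timeout=10.0):
    """Send one GET request over a TCP socket and return the parsed response."""
    _, host, port, path = split_url(url)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection
                break
            chunks.append(chunk)
    return split_response(b"".join(chunks))
```

Calling fetch("http://www.cs.princeton.edu/~rdondero/index.html") would return the status line, a dictionary of response headers (including content-type, which tells the client how to interpret the body), and the raw body bytes. A server may well answer with a redirect status rather than 200 OK; the sketch returns whatever status line it receives without following redirects.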