AF_IC01_U1. HTTP protocol and web servers
Unit 1
Table of contents
Introduction
Prerequisites
HTTP (HyperText Transfer Protocol)
URL / URI
Web servers
Web applications
HTTP security
HTTP transactions
Bibliography
Introduction
In this unit you will learn about the HTTP protocol, which makes browsing the Web possible.
You will see its most relevant features and learn what a URL is. Web servers are the other
topic: you will understand what a web server is, what it is for, and how to install and
configure one. Security and transactions are explained at the end of the unit.
Prerequisites
Some of the prerequisites for this unit:
• OSI and TCP/IP stack
• Knowledge of HTML
• Database concepts
HTTP (HyperText Transfer Protocol)
HTTP is a protocol designed to transfer hypertext (text that can be linked to other texts). It is
most commonly used to transfer HTML pages, but other formats such as text files and images can
be transferred as well.
TO KNOW MORE
Some more information about HTML at Wikipedia
Wikipedia. HTML
The most important features are:
• Application layer protocol.
• Uses Uniform Resource Identifiers (URI), specifically Uniform Resource Locators (URL),
  described in the next section, which make it possible to identify every resource on the Internet.
• Client-server architecture (request/response paradigm).
  - Default port is 80.
  - Communication normally runs over TCP (transport layer). HTTP/3 runs over QUIC, which uses UDP.
• Connectionless and stateless protocol.
  - The server responds only to the current request, which it treats as unrelated to any other
    request or connection.
  - Originally, a connection was set up for every requested file. For instance: if a
    webpage has 2 images, 3 connections are needed: one for the HTML page, and one
    for each image. Since version 1.1, a keep-alive mechanism allows connections to be reused.
• Open to new data types.
  - Uses MIME (Multipurpose Internet Mail Extensions) to determine the type of
    data (designed for the SMTP protocol, but also used with HTTP).
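As a small illustration of how MIME types label data, the sketch below uses Python's standard mimetypes module to guess the type a server might announce for a file, based only on its extension. The file names are hypothetical, and a real web server may be configured with a different mapping.

```python
import mimetypes

def guess_mime_type(filename):
    """Return the MIME type suggested by the file extension, or a generic fallback."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    # application/octet-stream is the usual label for unknown binary data.
    return mime_type or "application/octet-stream"

# Hypothetical file names, chosen only for illustration.
for name in ("index.html", "logo.png", "notes.txt"):
    print(name, "->", guess_mime_type(name))
```

The type found this way is what would typically travel in the Content-Type header of the response.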
Although HTTP is a connectionless and stateless protocol, there are some ways to give it
memory, that is, to remember which requests are related (for example, to keep a user identified on a website):
• Cookies (web cookie, browser cookie). RFC 6265
  - A small piece of data stored on your own computer, which the browser sends back to the website
    with later requests. With these cookies, information can be retrieved and
    users' activity can be recognized.
  - Cookies cannot install viruses or other malware, but they can gather a lot of information
    (passwords, for example).
• HTTP authentication. RFC 2617
  - Uses a username and password to log into a web server.
• Storing data on the server (IP address…).
• Embedding a query in the URL.
  - Example: … moodle2/course/view.php?id=16, where 16 indicates the number of the
    course.
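Two of these mechanisms can be sketched at the level of the header values they exchange. The following is a minimal illustration, not a complete client: it parses the value of a Cookie header and builds the value of an Authorization header for the Basic scheme. The cookie names, the user name and the password are hypothetical.

```python
import base64
from http.cookies import SimpleCookie

def parse_cookie_header(header_value):
    """Turn the value of a Cookie request header into a plain dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}

def basic_auth_header(username, password):
    """Build the value of an Authorization header for the Basic scheme."""
    credentials = f"{username}:{password}".encode("utf-8")
    # Base64 is an encoding, not encryption: anyone can decode it.
    return "Basic " + base64.b64encode(credentials).decode("ascii")

# Hypothetical values, for illustration only.
print(parse_cookie_header("session_id=abc123; lang=ca"))
print(basic_auth_header("student", "secret"))
```

Because Basic credentials are only Base64-encoded, they are as exposed as a password written in the URL unless the connection itself is encrypted.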
URL / URI
URL stands for Uniform Resource Locator, and URI stands for Uniform Resource Identifier. In the
context of HTTP, the term URL is normally used. Both are strings that give a unique address
to each resource available on the Internet.
URL
Every resource on the Internet is identified by a unique address, the URL.
The URL of a resource is its Internet address, and allows the browser to find and display it correctly.
It is a combination of:
• Protocol
• Host
• Path
• Filename
In this case, the format is: protocol://host/path/file. Example: http://ca.wikipedia.org/wiki/HTTP.
A URL can carry more components, however. The complete format is:
protocol://user:password@host:port/path/file?query#fragment
Let's look at each component in more detail.
Protocol. Examples of protocols that can be used to retrieve data:
• http: Hypertext Transfer Protocol
• https: HTTP over SSL/TLS
• gopher: The Gopher protocol
• ftp: File Transfer Protocol
• mailto: Electronic mail address
• ldap: LDAP (Lightweight Directory Access Protocol)
• file: Host-specific file names
• news: USENET news
• nntp: USENET news using NNTP access
• telnet: Reference to interactive sessions
• wais: Wide Area Information Servers
• prospero: Prospero Directory Service
• user:password specifies the user and the password on the server.
• Careful! The password is written in the URL itself, so it is transferred in clear text.
• host:port specifies the transport address, that is, the host machine and the service
  requested.
• The host machine can be identified by its IP address or by a DNS name.
• If the port is omitted, port 80 is used by default for HTTP.
• path indicates the path of the file, as seen from the browser.
• To know where the file is located on the server, you must add the server's root directory at the
  beginning.
• Example: http://www.domain.cat/path/file.html
  - Root directory: /var/www/htdocs
  - Location on the server: /var/www/htdocs/path/file.html
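The mapping from the path in the URL to the location on the server can be sketched as follows. This is a simplified illustration that assumes a POSIX-style file system and the root directory from the example; the function name is hypothetical, and a real server would also have to reject paths that try to climb out of the root directory (for example with "..").

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

def server_location(url, root="/var/www/htdocs"):
    """Prepend the server's root directory to the path part of a URL."""
    url_path = urlsplit(url).path
    # Strip the leading slash so the path is joined below the root directory.
    return str(PurePosixPath(root) / url_path.lstrip("/"))

print(server_location("http://www.domain.cat/path/file.html"))
```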
• file: the file itself can be either an HTML file or a file written in a web programming
  language (explained in the next section).
• The query is used to pass parameters to the server.
• It is a list of parameter-value pairs separated by ampersands.
  - ?param1=value1&param2=value2&...
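As a brief sketch of how such a query can be built and read back, the example below uses Python's standard urllib.parse module. The parameter names and values are hypothetical; the second value contains a space and an ampersand to show that special characters have to be escaped so that they are not confused with the separators.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical parameters; the second value needs escaping.
params = {"param1": "value1", "param2": "value 2 & more"}

query = urlencode(params)
print("?" + query)

# Reading the query back recovers the original pairs.
print(parse_qsl(query))
```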
• #fragment specifies a position within the document (defined by an anchor).
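To see all the components of the complete format at once, the following sketch splits a URL with Python's standard urllib.parse module. The URL itself is hypothetical (in particular the port 8080 and the fragment name are invented for the example).

```python
from urllib.parse import urlsplit

# Hypothetical URL that uses every component of the complete format.
url = "http://user:password@www.domain.cat:8080/path/file.html?param1=value1&param2=value2#section"

parts = urlsplit(url)
print("protocol:", parts.scheme)
print("user:    ", parts.username)
print("password:", parts.password)
print("host:    ", parts.hostname)
print("port:    ", parts.port)
print("path:    ", parts.path)
print("query:   ", parts.query)
print("fragment:", parts.fragment)
```

Note that the standard library uses the name "scheme" for what this unit calls the protocol.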
TO KNOW MORE
If you want to take a look at the specifications of the complete format, see the following sites:
RFC 1738. Uniform Resource Locator
RFC 3986. Uniform Resource Identifier (pay attention to section 1.1.3)
Web servers
HTTP is used to transfer resources. Besides files, these resources can be the result of
running a program, a database query, the automatic translation of a document, etc.
Therefore, for a web server, resources can be:
• files, or
• the result of a program execution.
A web server is a machine running software that accepts HTTP requests from clients (typically
web browsers) and delivers web content in response.
The pages delivered by the server can be:
• Static: the document (an HTML file) already exists in the file system.
• Dynamic: the document is generated on the fly by a script or program executed by
  the web server.
  - Example: PHP, ASP, JSP pages.
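The difference between static and dynamic pages can be sketched with a tiny server built on Python's standard http.server module. This is only an illustration under stated assumptions: the page contents, the /time path, the handler name and port 8000 are all hypothetical, and the static page is kept in memory instead of being read from a real file system.

```python
from datetime import datetime
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical static content, held in memory instead of a real file system.
STATIC_PAGES = {"/index.html": "<html><body>Hello, static world</body></html>"}

def build_page(path):
    """Return (status_code, html) for a request path."""
    if path in STATIC_PAGES:
        # Static page: the document already exists and is returned unchanged.
        return 200, STATIC_PAGES[path]
    if path == "/time":
        # Dynamic page: the document is generated at request time.
        now = datetime.now().strftime("%H:%M:%S")
        return 200, f"<html><body>Server time: {now}</body></html>"
    return 404, "<html><body>Not Found</body></html>"

class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, html = build_page(self.path)
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    """Start the demo server; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), DemoHandler).serve_forever()
```

Calling serve() and opening http://127.0.0.1:8000/index.html returns the same document every time, while http://127.0.0.1:8000/time returns a page that changes with each request.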
Activity: My first web application
Web applications
Web applications are programs run by the web server or by the browser in order to generate
dynamic web pages.
Two types must be distinguished:
• Applications on the client side:
  - The web client (browser) executes the code provided by the web server.
  - The browser must be able to run applications (also called scripts). Modern
    browsers can do so.
  - The programming languages are usually JavaScript or Flash (also Java applets).
• Applications on the server side:
  - The web server executes the web application and generates the dynamic web page.
  - The generated web page is sent to the client using the HTTP protocol.
Applications | Advantages
Client side | If the application is loaded into the client, traffic can be reduced between the server and the client using modern technologies (AJAX).
Server side | The host machine does not need any additional capacity. They can be light clients.
Three tiers (3-tier architecture) can be distinguished in web applications, each providing a
specific functionality. These 3 tiers are:
• First tier: the presentation layer, which includes the browser and the web server.
• Second tier: a program or script capable of generating web content.
• Third tier: provides access to databases.
[Figure: 3-tier architecture. Labels: Client (web browser); Server (web server, script or application, file system, database); 1st tier, 2nd tier, 3rd tier.]
This architecture is only used for dynamic pages. For static pages only the first tier is used,
which accesses the file system to retrieve an HTML file. For dynamic pages, the following
scheme is followed:
1. User data is retrieved (1st tier).
2. The server uses this data to execute a program or script (2nd tier), which
accesses a database (3rd tier).
3. This process generates a new web page, and the result is sent to the browser (1st
tier again).
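The three steps can be sketched in a few lines. This is a minimal illustration, not a real web application: it assumes an in-memory SQLite database, and the table, the course name and the function names are all hypothetical. The query string plays the role of the user data sent by the browser.

```python
import sqlite3
from html import escape
from urllib.parse import parse_qs

def create_database():
    """3rd tier: an in-memory database with one hypothetical table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("INSERT INTO courses VALUES (16, 'Hypothetical course')")
    return db

def render_course_page(query_string, db):
    """2nd tier: use the user data to query the database and build a page."""
    # Step 1: retrieve the user data sent by the browser.
    raw_id = parse_qs(query_string).get("id", [""])[0]
    course_id = int(raw_id) if raw_id.isdigit() else -1
    # Step 2: run a parameterized query against the database.
    row = db.execute("SELECT name FROM courses WHERE id = ?", (course_id,)).fetchone()
    # Step 3: generate the page that is sent back to the browser.
    name = row[0] if row else "Unknown course"
    return f"<html><body><h1>{escape(name)}</h1></body></html>"

db = create_database()
print(render_course_page("id=16", db))
```

Passing the user data as a query parameter (the "?" placeholder) rather than pasting it into the SQL text keeps untrusted input from being interpreted as SQL.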
General scheme of web technologies:
Client (browser): HTML, XML, JavaScript, Applet, Flash, …
  ↔
Web server: Apache, IIS, Tomcat, …
Server (programming language): JSP, ASP, PHP, Servlets, CGI → application, …
  ↔
Data (database): MySQL, MSSQL, Oracle, PostgreSQL, …
TO KNOW MORE
A survey of the market share of the most significant web servers can be
found at
Netcraft January 2013 Web Server Survey
Activity: Practice. LAMP server configuration
HTTP security
One of the main weaknesses of the HTTP protocol is security: all the information it sends over
the Internet travels unencrypted.
In order to secure this protocol, HTTPS was developed. It runs HTTP over the SSL/TLS protocols,
which provide encryption and authentication. It uses port 443.
Not only must the communication be encrypted, but the identity of whoever is sending the
data must also be certified. A trusted third party (certification authority, CA) issues these certificates.
Information about these CAs can be viewed through the browser.
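As a small sketch of what this means for a client, the example below uses Python's standard ssl and http.client modules. It makes no network connection; it only shows the default ports and the checks that a default client-side TLS context applies to the server's certificate.

```python
import http.client
import ssl

# Default ports defined by the standard library for each protocol.
print("HTTP port: ", http.client.HTTP_PORT)
print("HTTPS port:", http.client.HTTPS_PORT)

# A client-side TLS context with the library's recommended defaults.
context = ssl.create_default_context()

# The server must present a certificate signed by a trusted CA ...
print("Certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
# ... and the certificate must match the host name being contacted.
print("Host name checked:   ", context.check_hostname)
```

A context like this one can be passed to http.client.HTTPSConnection through its context parameter, so that the connection is refused when the certificate cannot be verified.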
HTTP transactions
A simple HTTP transaction could be:
1. A client requests a web page.
2. The server responds by sending the requested resource.
Basically, these transactions are made up of two messages: the request and the response.
Both consist of a header, which starts with an initial line, and a body (which may be empty).
There are several request methods (GET, POST, HEAD…) but they share a common format.
The initial line has 3 fields separated by blank spaces:
method resource version_of_protocol
Example: GET http://www.xtec.cat/web/guest/home HTTP/1.0
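The three-field format of the initial line can be checked with a short parser. This is only a sketch of the format described above, with a hypothetical function name; it does not validate the method or the protocol version.

```python
def parse_request_line(line):
    """Split 'method resource version_of_protocol' into its three fields."""
    fields = line.strip().split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    method, resource, version = fields
    return method, resource, version

print(parse_request_line("GET http://www.xtec.cat/web/guest/home HTTP/1.0"))
```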
TO KNOW MORE
To become familiar with the GET and POST methods and the differences between them,
take a look at
HTTP methods
The response is quite similar. The format of the initial line is as follows:
version_of_protocol response_code message
Example: HTTP/1.0 403 Forbidden
The response code can be very useful when a problem appears. Codes are
classified in ranges, and each range corresponds to a type of result.
Range | Meaning
100 - 199 | Informational
200 - 299 | OK
300 - 399 | Redirection
400 - 499 | Client Error
500 - 599 | Server Error
Example: when a file is not found because its URL was mistyped or copied incorrectly, the
server sends a 404 (Not Found) error.
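The initial line of a response and the ranges of response codes can be sketched as follows. The function names are hypothetical; the range names are the ones listed in the table of response codes, and the last line uses Python's standard http.HTTPStatus to look up the usual message for a code.

```python
from http import HTTPStatus

# Ranges taken from the table of response codes, keyed by the first digit.
RANGES = {1: "Informational", 2: "OK", 3: "Redirection", 4: "Client Error", 5: "Server Error"}

def parse_status_line(line):
    """Split 'version_of_protocol response_code message' into its three fields."""
    version, code, message = line.strip().split(" ", 2)
    return version, int(code), message

def status_range(code):
    """Return the meaning of the range a response code belongs to."""
    return RANGES.get(code // 100, "Unknown")

version, code, message = parse_status_line("HTTP/1.0 403 Forbidden")
print(code, message, "->", status_range(code))

# The standard library also knows the usual message for each code.
print(HTTPStatus(404).phrase)
```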
Activity: Listen and Watch. HTTP 500 Internal Server Error.
Bibliography
• Instal·lació i manteniment de serveis d’Internet. Institut Obert de Catalunya (IOC). 2006 edition.
• Instal·lació i manteniment de serveis d’Internet. Editorial McGraw-Hill. 2006 edition.
• RFC 1945. Hypertext Transfer Protocol - HTTP/1.0
• RFC 2616. Hypertext Transfer Protocol - HTTP/1.1