Download 1 Introduction

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Distributed firewall wikipedia , lookup

Net bias wikipedia , lookup

RapidIO wikipedia , lookup

Wake-on-LAN wikipedia , lookup

Lag wikipedia , lookup

Video on demand wikipedia , lookup

Cracking of wireless networks wikipedia , lookup

Nonblocking minimal spanning switch wikipedia , lookup

Deep packet inspection wikipedia , lookup

Content-control software wikipedia , lookup

Real-Time Messaging Protocol wikipedia , lookup

Hypertext Transfer Protocol wikipedia , lookup

Transcript
The Design and Implementation of a Linux
LVS-based Content Switch
Thesis Proposal
by
Weihong Wang
Computer Science Department
University of Colorado at Colorado Springs
9/12/2000
Approved by:
Dr. Edward Chow
(Advisor)
Dr. Marijike Augusteijn
Dr. Jugal Kalita
1 Introduction
With the rapid growth of Internet traffic, the workload on the servers is increasing
dramatically. Nowadays, servers are easily overloaded, especially for a popular web
server. One solution to overcome the overloading problem of the server is to build
scalable servers on a cluster of servers[1] [2]. With the scalable servers, when load
increases, one or more servers can be added into the server cluster to meet the increasing
request. A very efficient way to accomplish this is to use a load balancer to distribute
load among servers in the cluster. Load balancing can be done in two levels, Transport
level or layer4 switch and Application level which is also known as content switch[16].
1.1
Layer 4 Switching/Transport Level Load Balancing
Linux Virtual Server[3] is an example of Transport level load balancing approach. On
LVS ( Linux Virtual Server) cluster, there are one or two load balancer and a number of
real servers. The front end of the real servers is a load balancer, which schedules the
requests to different real servers and make parallel service of the cluster to appear as a
virtual service on a single IP address called VIP(virtual IP). The load balancer can
redirect the requests to the real servers with different algorithms, such as round-robin,
weighted round-robin, least connected and weighted least connected. The architecture of
the cluster is transparent to end users, and the users see only a single virtual server. As
the traffic increases, network administrator can choose to plug more servers into the
cluster. LVS is a highly scalable and available server cluster.
The advantage of layer 4 load balancing in switching is the overhead of load balancing is
small and the maxim number of real server nodes can reach 25 or up to 100. The reason
is when a user request arrives at the load balancer, the load balancer only examine the
source IP and port number of the incoming packet, the packet matching process can be
easily speeded up with the hash table based on these two fields. The common problem of
layer 4 switching is that it is content blind, and does not take the advantages of the
content information in the request messages.
1.2
Content Switching/Application Level Load Balancing
Application level load balancing (also known as content switching) provides the highest
level of control over the incoming web traffic. When making a load balancing decisions,
the content switch can check the header/content including HTTP meta header, URL, the
pay load of the incoming packet, rather than simply checking TCP/UDP port number or
IP address. By examining the content of the request, these switches can make decisions
on how to route the request to the real servers. The content switching system can achieve
better performance through load balancing the requests over a set of specialized web
servers, or achieve consistent user-perceived response time through persistent
connections (also called sticky connections).
1.2.1
How does the content switch work?
The process of a content switch can be simplified into the following two steps:
1. Packet classification or content switch rule matching[4,-12].
2. Routing (load balancing).
The challenge in the above process lies in the process of identifying packet class (packet
classification). The packet classification is a collection of rules. Each rule defines a class
that contains packets with similar characteristics. The rule also uniquely specifies the
action for each defined class. There are many trade-offs to be made when defining the
rules. For example, when the rule is defined deep into the content, the content switch can
route the requests based on more precise information. On the other hand, the deep search
for certain fields in the message produces processing overhead. Also rules may be
contradict each other. The conflict needs to be resolved. The efficient rule matching
algorithms become a challenging research issue.
1.2.2
Available Content Switch products
There are several commercial products with content switch features available on the
market
Cisco Content Engine 2.20 [14] (CE)
Cisco CE can support HTTP and HTTPS proxy server. The performance of CE is very
similar to the content switch, the only difference is that CE is for the proxy server instead
of the load balancer. CE examines the web request and makes the action decision such as
block, cache, and proxy.
An example of the rules looks like this:
rule no-cache url-regex\.*cgi-bin.*
This rule configures that the incoming packets with the url matching the pattern “*cgibin*” will not be forwarded to the proxy servers.
Cisco Network Based Application Recognition (NBAR)[15]
NBAR is a classification engine that recognizes a wide variety of application requests for
better utilization of the network resource. For example it classifies the HTTP traffic by
examining the domain name or mime type in the url of a request.
An example of rules:
Route(config)# class-map foo
Route(config)# match protocol http mime “*jpeg”
Route(config)# bandwidth 32 kb
The rule defines that any http packet with mime type “jpeg” will be routed with 32 kb
bandwidth.
Intel Action/Classification Engines(ACEs)[13]
An ACE classifies incoming packets according to the protocol and predicated definition
in its Network Classification Language(NCL)[13] rules file, and triggers action in the
associated action files.
The syntax of NCL is as below:
rule allpackets {ether} {action_all()}
Allpackets is the name of the rule, ether is a predicated class matching condition,
action_all() is the action function call.
There are several other content switch products, but the rules are similar to the three
mentioned above. Different classification rules can result in different web server
performance.
2 Goal
The goal of this thesis work is to design and implement an efficient content web switch,
which can load balance the web requests both on the Transport/L4 level and the
Application level. The incoming packets will be routed based on IP address, TCP/UDP
port number, URL regular expression, HTTP meta header, SSL session ID, and the
values of XML tags. The tasks include:

Design scenarios and content switching rules for packet classification and routing .

Design efficient data structures and algorithms for the content switch rule matching

Analyze the conflict among a set of rules

Investigate possible solutions for the rule conflicting problem.
3 Thesis Plan
The thesis will include the following activities:

Study and analyze the existing content switch technologies, especially the content
switching rules.

Study Linux kernel and Linux Virtual Server software.

Study Apatch web server, and their HTTP request processing.

Design and implement the content switch.

3.1
Test functionality and reliability of the design.
Design
The proposed content switch will be able to cover five different content switching cases.
The content switch will distribute the incoming requests to different web servers based on
the routing policy. The five different classes of routing policies are as follow:
1. Based on the source IP address and TCP/UDP port number.
2. Based on the URL regular expression.
3. Based on the HTTP meta header.
4. Based on the SSL session ID.
5. Based on values of XML tags.
The majority of the design tasks will fall into the following two categories:

matching incoming packets to the specific classs.

routing rules.
Since the content switch is a very new technology, there are only a few products available
on the market. Not much research has been done in the rule matching area for the content
switching. Therefore, design an efficient rule matching algorithm is going to be very
challenging. The main focus of the design will be on designing efficient rules and
algorithms for the construction of the content switch.
3.2
Implementation
Linux Virtual Server[3] is a public domain open source package. LVS enables transport
level load balancing in a web server cluster, and can be extended to support application
level load balancing. We will reuse the code that perform the interception of IP packets
and replace the rule matching code. New features will be added.
3.3
Testing
The implementation will be tested rigorously on a computer network in our lab. The
network configuration will include one content switch, four real servers and some basic
routing policies.
3.4
Deliverables
The deliverables will include:

Design documentation for the content switch.

Source code for implementing the design on the Red hat Linux 6.2 system.

Testing documentation.
4 Reference
[1]. “ Windows 2000 clustering Technologies: Cluster Service Architecture”, Microsoft
White Paper, 2000. http://www.microsoft.com.
[2]. “Network Load Balancing Technical Overview”, Micosoft White Paper, 2000.
http://www.microsoft.com.
[3] “Linux Virtual Server”, http://www.linuxvirtualserver.org
[4] George Apostolopoulos, David Aubespin, Vinod Peris, Prashant Pradhan, Debanjan
Saha, “ Design, Implementation and Performance of a Content-Based Switch”, Proc.
Infocom2000, Tel Aviv, March 26 - 30, 2000,
http://www.ieee-infocom.org/2000/papers/440.ps
[5]. Pankaj Gupta 1 and Nick McKeown, “Dynamic Algorithms with Worst-case
Performance for Packet Classification”, Proc. IFIP Networking, May 2000, Paris,
France.
http://www-cs-students.Stanford.edu/~pankaj/paps/ifip00.pdf
[6] Anja Feldmann S. Muthukrishnan “Tradeoffs for Packet Classification”,
Proceedings of Gigabit Networking Workshop GBN 2000, 26 March 2000 - Tel
Aviv, Israel
http://www.comsoc.org/socstr/techcom/tcgn/conference/gbn2000/anja-paper.pdf
[7] Thomas Woo, “A Modular Approach to Packet Classification: Algorithms and
Results”, Proc. Infocom2000, Tel Aviv, March 26 - 30, 2000,
http://www.ieee-infocom.org/2000/papers/609.ps
[8] Pankaj Gupta and Nick McKcown, “Packet Classification on Multiple Fields”, Proc.
Sigcomm, September 1999, Harvard University.
http://www-cs-students.Stanford.edu/~pankaj/paps/sig99.pdf
[9] Pankaj Gupta and Nick McKeown, “Packet Classification using Hierarchical
Intelligent Cuttings”, Proc. Hot Interconnects VII, August 99, Stanford. Also in
IEEE Micro, pp 34-41, Vol. 20, No. 1, January/February 2000.
http://www-cs-students.Stanford.edu/~pankaj/paps/hoti99.pdf
[10] V. Srinivasan S. Suri G. Varghese, “Packet Classification using Tuple Space
Search”, Proc. Sigcomm99, August 30 - September 3, 1999, Cambridge United
States, Pages 135 - 146
http://www.acm.org/pubs/articles/proceedings/comm/316188/p135-srinivasan/p135srinivasan.pdf
[11] Andrew Begel, Steven McCanne and Susan L. Graham, “BPF+: exploiting global
data-flow optimization in a generalized packet filter architecture” Proc. Sigcomm99,
August 30 - September 3, 1999, Cambridge United States, Pages 123 - 134
http://www.acm.org/pubs/articles/proceedings/comm/316188/p123-begel/p123begel.pdf
[12] T. V. Lakshman and D. Stiliadis, “High-speed policy-based packet forwarding using
efficient multi-dimensional range matching”, Proceedings of the ACM SIGCOMM
'98, August 31 - September 4, 1998, Vancouver Canada Pages 203 - 214
http://www.acm.org/pubs/articles/proceedings/comm/285237/p203-lakshman/p203lakshman.pdf
[13] “Intel IX-API”. http://www.intel.com/design/IXA/whitepapers/ixapi.html.
[14]”Release Notes for Cisco Content Engine Software”. http://www.cisco.com”.
[15] “Network-Based Application Recognition Enhancements”. http://www.cisco.com.
[16] Gregory Yerxa and James Hutchinson, “Web Content Switching”,
http://www.networkcomputing.com.