Download 1 Introduction

The Design and Implementation of a Linux LVS-based Content Switch Thesis Proposal by Weihong Wang Computer Science Department University of Colorado at Colorado Springs 9/12/2000 Approved by: Dr. Edward Chow (Advisor) Dr. Marijike Augusteijn Dr. Jugal Kalita 1 Introduction With the rapid growth of Internet traffic, the workload on the servers is increasing dramatically. Nowadays, servers are easily overloaded, especially for a popular web server. One solution to overcome the overloading problem of the server is to build scalable servers on a cluster of servers[1] [2]. With the scalable servers, when load increases, one or more servers can be added into the server cluster to meet the increasing request. A very efficient way to accomplish this is to use a load balancer to distribute load among servers in the cluster. Load balancing can be done in two levels, Transport level or layer4 switch and Application level which is also known as content switch[16]. 1.1 Layer 4 Switching/Transport Level Load Balancing Linux Virtual Server[3] is an example of Transport level load balancing approach. On LVS ( Linux Virtual Server) cluster, there are one or two load balancer and a number of real servers. The front end of the real servers is a load balancer, which schedules the requests to different real servers and make parallel service of the cluster to appear as a virtual service on a single IP address called VIP(virtual IP). The load balancer can redirect the requests to the real servers with different algorithms, such as round-robin, weighted round-robin, least connected and weighted least connected. The architecture of the cluster is transparent to end users, and the users see only a single virtual server. As the traffic increases, network administrator can choose to plug more servers into the cluster. LVS is a highly scalable and available server cluster. The advantage of layer 4 load balancing in switching is the overhead of load balancing is small and the maxim number of real server nodes can reach 25 or up to 100. The reason is when a user request arrives at the load balancer, the load balancer only examine the source IP and port number of the incoming packet, the packet matching process can be easily speeded up with the hash table based on these two fields. The common problem of layer 4 switching is that it is content blind, and does not take the advantages of the content information in the request messages. 1.2 Content Switching/Application Level Load Balancing Application level load balancing (also known as content switching) provides the highest level of control over the incoming web traffic. When making a load balancing decisions, the content switch can check the header/content including HTTP meta header, URL, the pay load of the incoming packet, rather than simply checking TCP/UDP port number or IP address. By examining the content of the request, these switches can make decisions on how to route the request to the real servers. The content switching system can achieve better performance through load balancing the requests over a set of specialized web servers, or achieve consistent user-perceived response time through persistent connections (also called sticky connections). 1.2.1 How does the content switch work? The process of a content switch can be simplified into the following two steps: 1. Packet classification or content switch rule matching[4,-12]. 2. Routing (load balancing). The challenge in the above process lies in the process of identifying packet class (packet classification). The packet classification is a collection of rules. Each rule defines a class that contains packets with similar characteristics. The rule also uniquely specifies the action for each defined class. There are many trade-offs to be made when defining the rules. For example, when the rule is defined deep into the content, the content switch can route the requests based on more precise information. On the other hand, the deep search for certain fields in the message produces processing overhead. Also rules may be contradict each other. The conflict needs to be resolved. The efficient rule matching algorithms become a challenging research issue. 1.2.2 Available Content Switch products There are several commercial products with content switch features available on the market Cisco Content Engine 2.20 [14] (CE) Cisco CE can support HTTP and HTTPS proxy server. The performance of CE is very similar to the content switch, the only difference is that CE is for the proxy server instead of the load balancer. CE examines the web request and makes the action decision such as block, cache, and proxy. An example of the rules looks like this: rule no-cache url-regex\.*cgi-bin.* This rule configures that the incoming packets with the url matching the pattern “*cgibin*” will not be forwarded to the proxy servers. Cisco Network Based Application Recognition (NBAR)[15] NBAR is a classification engine that recognizes a wide variety of application requests for better utilization of the network resource. For example it classifies the HTTP traffic by examining the domain name or mime type in the url of a request. An example of rules: Route(config)# class-map foo Route(config)# match protocol http mime “*jpeg” Route(config)# bandwidth 32 kb The rule defines that any http packet with mime type “jpeg” will be routed with 32 kb bandwidth. Intel Action/Classification Engines(ACEs)[13] An ACE classifies incoming packets according to the protocol and predicated definition in its Network Classification Language(NCL)[13] rules file, and triggers action in the associated action files. The syntax of NCL is as below: rule allpackets {ether} {action_all()} Allpackets is the name of the rule, ether is a predicated class matching condition, action_all() is the action function call. There are several other content switch products, but the rules are similar to the three mentioned above. Different classification rules can result in different web server performance. 2 Goal The goal of this thesis work is to design and implement an efficient content web switch, which can load balance the web requests both on the Transport/L4 level and the Application level. The incoming packets will be routed based on IP address, TCP/UDP port number, URL regular expression, HTTP meta header, SSL session ID, and the values of XML tags. The tasks include:  Design scenarios and content switching rules for packet classification and routing .  Design efficient data structures and algorithms for the content switch rule matching  Analyze the conflict among a set of rules  Investigate possible solutions for the rule conflicting problem. 3 Thesis Plan The thesis will include the following activities:  Study and analyze the existing content switch technologies, especially the content switching rules.  Study Linux kernel and Linux Virtual Server software.  Study Apatch web server, and their HTTP request processing.  Design and implement the content switch.  3.1 Test functionality and reliability of the design. Design The proposed content switch will be able to cover five different content switching cases. The content switch will distribute the incoming requests to different web servers based on the routing policy. The five different classes of routing policies are as follow: 1. Based on the source IP address and TCP/UDP port number. 2. Based on the URL regular expression. 3. Based on the HTTP meta header. 4. Based on the SSL session ID. 5. Based on values of XML tags. The majority of the design tasks will fall into the following two categories:  matching incoming packets to the specific classs.  routing rules. Since the content switch is a very new technology, there are only a few products available on the market. Not much research has been done in the rule matching area for the content switching. Therefore, design an efficient rule matching algorithm is going to be very challenging. The main focus of the design will be on designing efficient rules and algorithms for the construction of the content switch. 3.2 Implementation Linux Virtual Server[3] is a public domain open source package. LVS enables transport level load balancing in a web server cluster, and can be extended to support application level load balancing. We will reuse the code that perform the interception of IP packets and replace the rule matching code. New features will be added. 3.3 Testing The implementation will be tested rigorously on a computer network in our lab. The network configuration will include one content switch, four real servers and some basic routing policies. 3.4 Deliverables The deliverables will include:  Design documentation for the content switch.  Source code for implementing the design on the Red hat Linux 6.2 system.  Testing documentation. 4 Reference [1]. “ Windows 2000 clustering Technologies: Cluster Service Architecture”, Microsoft White Paper, 2000. http://www.microsoft.com. [2]. “Network Load Balancing Technical Overview”, Micosoft White Paper, 2000. http://www.microsoft.com. [3] “Linux Virtual Server”, http://www.linuxvirtualserver.org [4] George Apostolopoulos, David Aubespin, Vinod Peris, Prashant Pradhan, Debanjan Saha, “ Design, Implementation and Performance of a Content-Based Switch”, Proc. Infocom2000, Tel Aviv, March 26 - 30, 2000, http://www.ieee-infocom.org/2000/papers/440.ps [5]. Pankaj Gupta 1 and Nick McKeown, “Dynamic Algorithms with Worst-case Performance for Packet Classification”, Proc. IFIP Networking, May 2000, Paris, France. http://www-cs-students.Stanford.edu/~pankaj/paps/ifip00.pdf [6] Anja Feldmann S. Muthukrishnan “Tradeoffs for Packet Classification”, Proceedings of Gigabit Networking Workshop GBN 2000, 26 March 2000 - Tel Aviv, Israel http://www.comsoc.org/socstr/techcom/tcgn/conference/gbn2000/anja-paper.pdf [7] Thomas Woo, “A Modular Approach to Packet Classification: Algorithms and Results”, Proc. Infocom2000, Tel Aviv, March 26 - 30, 2000, http://www.ieee-infocom.org/2000/papers/609.ps [8] Pankaj Gupta and Nick McKcown, “Packet Classification on Multiple Fields”, Proc. Sigcomm, September 1999, Harvard University. http://www-cs-students.Stanford.edu/~pankaj/paps/sig99.pdf [9] Pankaj Gupta and Nick McKeown, “Packet Classification using Hierarchical Intelligent Cuttings”, Proc. Hot Interconnects VII, August 99, Stanford. Also in IEEE Micro, pp 34-41, Vol. 20, No. 1, January/February 2000. http://www-cs-students.Stanford.edu/~pankaj/paps/hoti99.pdf [10] V. Srinivasan S. Suri G. Varghese, “Packet Classification using Tuple Space Search”, Proc. Sigcomm99, August 30 - September 3, 1999, Cambridge United States, Pages 135 - 146 http://www.acm.org/pubs/articles/proceedings/comm/316188/p135-srinivasan/p135srinivasan.pdf [11] Andrew Begel, Steven McCanne and Susan L. Graham, “BPF+: exploiting global data-flow optimization in a generalized packet filter architecture” Proc. Sigcomm99, August 30 - September 3, 1999, Cambridge United States, Pages 123 - 134 http://www.acm.org/pubs/articles/proceedings/comm/316188/p123-begel/p123begel.pdf [12] T. V. Lakshman and D. Stiliadis, “High-speed policy-based packet forwarding using efficient multi-dimensional range matching”, Proceedings of the ACM SIGCOMM '98, August 31 - September 4, 1998, Vancouver Canada Pages 203 - 214 http://www.acm.org/pubs/articles/proceedings/comm/285237/p203-lakshman/p203lakshman.pdf [13] “Intel IX-API”. http://www.intel.com/design/IXA/whitepapers/ixapi.html. [14]”Release Notes for Cisco Content Engine Software”. http://www.cisco.com”. [15] “Network-Based Application Recognition Enhancements”. http://www.cisco.com. [16] Gregory Yerxa and James Hutchinson, “Web Content Switching”, http://www.networkcomputing.com.

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download 1 Introduction