SUBMITTED TO IEEE TRANSACTIONS ON SYSTEM, MAN, AND CYBERNETICS
A Gateway-based Defense System for Distributed
Denial-of-Service Attacks in High-Speed Networks
Dong Xuan, Shengquan Wang, Ye Zhu, Riccardo Bettati, and Wei Zhao
Abstract: We describe a defense system to contain Distributed Denial-of-Service (DDoS) flooding attacks in high-speed networks. We aim at protecting TCP-friendly traffic, which forms a large portion of Internet traffic. DDoS flooding attacks tend to establish large numbers of malicious traffic flows to congest the network. These flows are marked as TCP flows and use spoofed source identifiers to hide their identities. Current network equipment lacks countermeasures for this kind of DDoS attack. We describe a gateway-based countermeasure approach. A gateway is a device that is inserted at some point of the network. We envision that the gateway devices deployed in the network collaboratively perform the desired countermeasure functions, including detection of DDoS flooding attacks and access control of network traffic. Given the nature of DDoS attacks in high-speed networks and the limitation of defense resources, it is impossible for a gateway to work at the level of individual on-going traffic flows. We use a group-based strategy in which we partition the network under DDoS attack into several subnetworks and handle the traffic from the same subnetwork as an aggregate. This approach is applied both in attack detection and in access control. With this strategy, the system is freed from the overhead of handling individual flows and can focus on groups of traffic flows.
I. INTRODUCTION
Recent events have shown how various forms of scripts
and other forms of automation can be used to harness large
numbers of largely unprotected resources on the Internet
to mount security attacks on very large scales. The most
prominent form of such attacks is the distributed denial
of service (DDoS) attack. Given the large amounts of resources available to the attacker, critical components of a
victim can be easily overwhelmed, and so the service provided by the victim effectively disrupted. These attacks
typically exhaust link bandwidth, router processing capacity, and/or network stack resources, to achieve their objective of breaking network connectivity to the victims.
Dong Xuan is with the Department of Computer and Information Science, The Ohio State University, Columbus, OH 43210. E-mail: [email protected].
Shengquan Wang, Ye Zhu, Riccardo Bettati, and Wei Zhao are with the Department of Computer Science, Texas A&M University, College Station, TX 77843. E-mail: {swang, bettati, zhao}@cs.tamu.edu, [email protected].
Very little has been done to date in terms of early detection and containment of this form of DDoS attack. This is largely due to the difficulties encountered in designing such systems. Difficulties arise from three aspects:
First, it is difficult to maintain high throughput for friendly TCP traffic under a DDoS attack. Current DDoS defense strategies based on packet dropping cannot avoid dropping significant numbers of TCP-friendly packets, because it is difficult to separate TCP-friendly traffic from malicious traffic in high-speed networks. Since TCP traffic is inherently responsive, additional dropping of TCP traffic significantly amplifies the effect caused by the DDoS attack.
Second, DDoS flooding attacks are inherently difficult to detect. The attack flows hide their identities by using spoofed source identifiers and by marking themselves as TCP flows, although their dynamics are completely unresponsive. An individual attack flow looks like a friendly flow in terms of network bandwidth consumption. However, millions of such mini-flows (generated by using spoofed sources) congest the network. Naturally, the attack flows aggregate together with the friendly TCP flows (they may even come from the same sources).
Finally, the large number of flows involved in massive DDoS attacks requires large amounts of resources to be devoted to classifying, monitoring, and countering malicious flows. Limited system resources, such as CPU processing capacity and buffer space, are easily exhausted when trying to distinguish millions of such attack flows from the friendly TCP flows. Individual malicious flows can operate significantly below the detection level of current monitoring technology.
In this study, we aim at designing a defense system that
contains DDoS flooding attacks in high-speed networks.
The objectives are to (a) maximize friendly traffic throughput while reducing attack traffic as much as possible, (b)
minimize the disturbance of the defense system on delay
performance of friendly traffic, and (c) achieve high compatibility to the existing systems.
We adopt the following two main strategies to achieve
these objectives:
• A gateway-based defense strategy: We adopt a gateway-based approach. In this context, a gateway is a device that is inserted at some point of the network. We envision that the gateway devices deployed in the network collaboratively perform the desired countermeasure functions, including detection of DDoS flooding attacks and access control of network traffic.
• A group-based defense strategy: Given the nature of DDoS attacks in high-speed networks and the limitation of defense resources, it is impossible for a gateway to work at the level of individual on-going traffic flows. In this study, we adopt a group-based strategy. The basic idea is that we partition the network under DDoS attack into several subnetworks and apply the same treatment to all traffic from the same subnetwork. The idea is applied both in attack detection and in access control. With this strategy, the system is freed from the overhead of handling individual flows and can focus on groups of traffic flows.
Besides the above main defense strategies, we adopt efficient defense approaches at each stage of DDoS defense. At the stage of attack detection, we design TCP-ACK based attack detection and use statistical sampling to efficiently obtain knowledge of the traffic under DDoS attack. At the stage of access control, we classify the traffic into different classes according to their geometry similarity and the degree to which they are damaged by the DDoS attack. We design a multi-class RED control block with a Class-Based Queueing (CBQ) scheduler to control the consumption of bandwidth, aiming to achieve the maximum possible throughput.
The rest of the paper is organized as follows: Previous
work related to our study is discussed in Section II. In Section III, we introduce the network model used in this paper. We also categorize DDoS flooding attacks, and identify the one we will address in this study. The defense system and gateway architecture are studied in Section IV and
Section V respectively. Attack detection strategy and access control strategy are described in Section VI and Section VII respectively. We describe gateway cooperation in
Section VIII. In Section IX, we discuss extension of the
proposed system. We summarize the paper in Section X.
II. PREVIOUS WORK
Recent work on DoS can be categorized into one of the following three classes: Network Infrastructure Protection, DoS Detection, and DoS Response. We briefly elaborate on each of them and describe at least one example for each category.
• Network Infrastructure Protection: This line of work focuses on attack prevention and defense through a robust infrastructure. Work at the University of Michigan, for example, starts from an analysis of Internet routing instabilities [3], and studies the use of methods to prevent attackers from getting network information, and methods to automate back-tracing of DDoS attacks. BBN's Secure Border Gateway Protocol (S-BGP) [2] architecture employs three security mechanisms to render BGP robust against attacks: A Public Key Infrastructure is used to support the authentication of Autonomous Systems and BGP routers, and of various authorizations. A BGP transitive path attribute is employed to carry digital signatures (in "attestations") covering the routing information in BGP UPDATEs. IPsec is used to provide data and partial sequence integrity, and to enable BGP routers to authenticate each other for exchanges of BGP control traffic. Anderson et al. [7] address methods to render protocols enforceable. In such cases, behavioral properties must be checked in a no-trust relation. This can be done by having appropriate countermeasures to respond to misbehavior or by modifying protocols to carry enough state to verify correct operation. This work focuses mostly on TCP.
• DoS Detection: BBN's Source Path Isolation Engine (SPIE) attempts to locate the source of attacks by tracing back the path of packets. The result is a graph back to a set of origins, some of which may be the attackers. Work at Network Associates attempts to enhance the attack detection and response capacity via active network (AN) technology. The work is based on CITRA (Cooperative Intrusion Trace-back and Response Architecture) and the Intruder Detection and Isolation Protocol architecture [8]. Since packets belonging to DDoS attacks do not have readily identifiable flow signatures, some researchers developed the concept of Aggregate-based Congestion Control, which can be used to counter some forms of DDoS flooding attacks [5].
• DoS Response: Mechanisms for response use a combination of (a) restricting the access of the attacker by limiting access to resources, (b) re-routing to isolate critical components, and (c) back-tracing and offensive attack suppression.
III. MODELS
A. Network Model
In this work, we restrict ourselves to a single domain, and we focus our attention on domains that are not transit domains. This means that either the sources or the destinations of traffic flows belong to the domain. We also assume that the domain is fully within our jurisdiction. This means, for example, that we can deploy our gateways anywhere in the domain, and that gateways know the exact topology of the domain.
B. DDoS Flooding Attack Model
In this paper, we propose a defense system against DDoS flooding attacks. This form of DDoS attack is caused by attackers breaking into a large number of geographically dispersed machines and harnessing their computing and communication resources for large-scale, coordinated attacks on victim sites. These attacks typically exhaust link bandwidth, router processing capacity, and/or network stack resources to achieve their objective of breaking network connectivity to the victims.
Network resources can be consumed by DDoS flooding attacks in two forms:
• A1: When the attack originates from only a small number of hosts, the individual attack flows must consume bandwidth very aggressively, and the attack flows can easily be identified by their bandwidth consumption behavior.
• A2: If more hosts are involved as sources of the attack, individual flows can be made to behave in a much more compliant fashion, and so can behave similarly to the TCP or UDP flows expected in the system. The attacker can achieve this by frequently changing sources (i.e., using spoofed sources) to hide flow identities. If such flows are multiplexed with friendly traffic, it becomes very difficult to detect and drop the attack traffic without also losing many friendly packets, which TCP traffic tolerates poorly.
In addition, flows in a DDoS flooding attack tend to
be non-responsive, i.e., they use UDP-style dynamics to
congest the network. However, the attack traffic may be
marked as:
• B1: TCP
• B2: UDP
in the IP packet header.
Some DDoS flooding attacks may spoof their sources,
others may not, hence we can categorize the attacks as
• C1: spoofed-source attacks
• C2: non-spoofed-source attacks
As mentioned above, we work in the network with a single domain. There are two possibilities of the distribution
of the attack sources:
• D1: all attack sources are outside the network.
• D2: there may be some attack sources inside the network.
We use a 4-tuple to represent the different cases of DDoS flooding attacks. For instance, ⟨A2, B1, C1, D1⟩ represents the case in which the attack uses an extraordinarily large number of attack flows with TCP headers and spoofed sources to congest the network, and the real attack sources are outside the attacked network.
Obviously, there may be mixed cases. For example, attacks may use both TCP-marked and UDP-marked flows to congest the network. In this study, we work on Case ⟨A2, B1, C1, D1⟩, where many hosts have been harnessed into flooding the victim with non-responsive traffic marked as TCP, where the source addresses are spoofed, and where all sources are outside the considered network. We believe that this is one of the most typical and challenging cases. In Section IX, we will discuss how to extend our work to the other cases.
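As an illustration only, the 4-tuple taxonomy above can be written down as a small data structure. The class name `AttackCase` and its field names are hypothetical and chosen for this sketch; only the labels A1/A2, B1/B2, C1/C2, and D1/D2 come from the text.

```python
from itertools import product
from typing import NamedTuple


class AttackCase(NamedTuple):
    """One case of the attack taxonomy (field names are illustrative)."""
    sources: str    # "A1": few aggressive sources, "A2": many compliant-looking sources
    marking: str    # "B1": marked as TCP, "B2": marked as UDP
    spoofing: str   # "C1": spoofed sources, "C2": non-spoofed sources
    location: str   # "D1": all sources outside the network, "D2": some inside


# All pure (non-mixed) cases: 2 x 2 x 2 x 2 combinations.
ALL_CASES = [
    AttackCase(*labels)
    for labels in product(("A1", "A2"), ("B1", "B2"), ("C1", "C2"), ("D1", "D2"))
]

# The case addressed in this study.
STUDIED_CASE = AttackCase("A2", "B1", "C1", "D1")


def is_studied(case: AttackCase) -> bool:
    """Return True if `case` is the one this paper works on."""
    return case == STUDIED_CASE
```

Mixed cases, such as attacks using both TCP-marked and UDP-marked flows, would need a set of such tuples rather than a single one.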
IV. SYSTEM OVERVIEW
In this section, we give the overview of the whole defense system to DDoS attacks.
Fig. 1. A Part of a Network with Gateways
The defense system centers around the gateway. A gateway is a device that is inserted at some point of the network. It is a unit external to the existing network equipment. With this strategy, no change is needed to the current network equipment or network protocols, and high compatibility can be achieved. Figure 1 illustrates a part of a network with several gateways deployed. The basic functions of the gateways are attack detection and access control. Gateways in the network cooperate with each other to achieve high defense efficiency. They may share attack detection results and work on different portions of the on-going traffic.
Given the nature of DDoS attack in high speed networks
and the limitation of defense resources, it is impossible for
the gateway to work on the individual level of on-going
traffic flows. In this study, we adopt a group-based strategy. The basic idea is that we partition the network under
DDoS attack into several subnetworks, and handle the traffic from the same subnetworks as an aggregate. Our idea of
grouping can be applied to all stages of defense including
attack detection and access control. With this strategy, the
system can be free from the overhead to handle individual
flows, and focus on the grouped traffic.
In the following sections, we will first describe the architecture of the basic unit in our defense system, then introduce two main defense functions: attack detection and
access control.
V. GATEWAY ARCHITECTURE
As mentioned above, the gateway is the basic unit in our proposed defense system. The basic functionality of the gateway is to perform attack detection and traffic access control. Figure 2 shows the basic architecture of one gateway.
Fig. 2. The Gateway Architecture (block labels: Access Control Module, Classifier, RED queues, Scheduler, Attack Detection Module, Traffic Sampling, Checking, Sampling Rules, Signaling Module, Network, Traffic DB)
There are three modules within a gateway:
• Attack Detection Module (AD Module, in short): This module is responsible for obtaining knowledge of the network traffic suffering from a DDoS attack. As mentioned above, given the limited resources, it is impossible for a gateway to obtain individual flow information. We propose a way to obtain overall knowledge of the traffic under DDoS attack, such as the percentage of bad traffic. This knowledge is used by the Traffic Access Control Module. The AD Module selects a portion of the on-going traffic on which to perform attack detection. Accordingly, this module can be further divided into the traffic sampling sub-module and the checking sub-module. The traffic is sampled and selected by the traffic sampling sub-module, and queued in the buffer between the two sub-modules for checking. The traffic handled by this module is copied from the on-going traffic; hence this module introduces little disturbance to the on-going friendly traffic.
• Traffic Access Control Module (TAC Module, in short): This module takes response actions on the on-going traffic based on the knowledge obtained by the AD Module. The response actions are packet dropping and packet forwarding. Recall that we aim to protect TCP traffic. We reserve a limited amount of bandwidth for UDP traffic overall, and perform comprehensive control on TCP traffic. The control is based on equations of RED and TCP. The results of detection, i.e., the overall degree of damage by the DDoS attack in each group, are used in bandwidth assignment to maximize the total friendly TCP traffic throughput.
• Signaling Module (SIG Module, in short): This module provides communication channels among gateways. Gateways cooperate with each other via these channels by exchanging networking information and coordination rules.
It is not necessary for a gateway to have all three of the above modules. Some gateways may have just the Attack Detection and Signaling Modules, and are dedicated to attack detection. Others may have just the Access Control and Signaling Modules, focusing on access control.
Cooperation among gateways is introduced to make sure that gateways work on different and proper portions of the on-going traffic. Ultimately, the countermeasure behavior of the individual gateways, together with their cooperation, constitutes the working scenario of the whole defense system.
VI. ATTACK DETECTION
The main purpose of attack detection is to obtain the
knowledge of the traffic that may be under DDoS attack.
In the following, we will first introduce the basic strategy
for attack detection, and then discuss how to adopt this
technology in high speed networks.
A. TCP-ACK based Attack Detection
As mentioned earlier, we aim at protecting friendly TCP flows. Under an attack, there may be millions of low-bandwidth, unresponsive flows marked as TCP traffic present.
Thus, a successful classification mechanism has to be in place. In this study, we decide to keep track of the TCP-friendly flows rather than the attack flows.
We identify the friendly TCP flows based on TCP semantics. Two characteristics of TCP semantics distinguish it from UDP. The first is that a TCP flow (connection) goes through a three-way handshake during connection establishment. An unresponsive attack flow with a spoofed source, although marked as a TCP flow, cannot establish a real TCP connection. The reason is that its source is unlikely to receive the SYN-ACK packet from the receiver, since that packet is destined to the spoofed source rather than to the real source. Unfortunately, it is very difficult for the gateway to monitor the three-way connection establishment for individual flows. The second characteristic is that within an established TCP flow (connection), the sender and receiver keep exchanging ACK packets (possibly piggybacked on data packets) to confirm successful transmission. The matching degree of ACK packets between the sender and the receiver of a flow can be used to decide whether the flow is a friendly one or not. In this study, we rely on detecting the matching degree of ACK packets to identify the friendly traffic. We call this approach TCP-ACK based attack detection.
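The text does not give a formula for the matching degree, so the following is only a minimal sketch under a stated assumption: the matching degree of an aggregate is taken to be the fraction of forward data bytes that are acknowledged in the reverse direction. The `Observation` record and its fields are hypothetical and stand in for whatever the gateway extracts from TCP headers.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One observed TCP segment (hypothetical record, not from the paper)."""
    group: str          # aggregate (e.g. subnetwork) the flow belongs to
    forward: bool       # True: sender -> receiver, False: receiver -> sender
    payload_bytes: int  # data bytes carried by a forward segment
    acked_bytes: int    # newly acknowledged bytes reported by a reverse segment


def ack_matching_degree(observations):
    """Return, per group, acknowledged bytes divided by sent bytes (capped at 1).

    Assumption of this sketch: traffic from spoofed sources is never
    acknowledged, so it lowers the ratio of the group it is mixed into.
    """
    sent = {}
    acked = {}
    for obs in observations:
        if obs.forward:
            sent[obs.group] = sent.get(obs.group, 0) + obs.payload_bytes
        else:
            acked[obs.group] = acked.get(obs.group, 0) + obs.acked_bytes
    # Groups without forward data are skipped because the ratio is undefined.
    return {g: min(1.0, acked.get(g, 0) / s) for g, s in sent.items() if s > 0}
```

Under this assumption, one minus the matching degree gives a rough estimate of the share of unacknowledged (likely bad) traffic in a group.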
B. Discussion
The TCP-ACK based approach can be applied to identify whether an individual flow is a friendly TCP flow or not. Ideally, the gateway would keep track of all the individual flows and perform attack detection flow by flow. However, in high-speed networks, thousands of flows pass through a gateway. The approach may not be feasible for the following reasons:
• The overhead for the gateway to keep and manage per-flow information is very large. The gateway has to use a very large table to keep per-flow information, and spend significant processing power on table management (i.e., lookup, add, and delete).
• The overhead for the gateway to read and examine each packet header is significant, given the large number of packets passing through the gateway.
To reduce both the storage and processing overheads mentioned above, we adopt the following schemes:
• Flow aggregation (or grouping): Instead of working on individual flows, our scheme works on groups of flows. According to TCP semantics, an individual TCP flow has a profile of its ACK amount. For a group of TCP flows, we can likewise obtain a profile of their ACK amount. We can use this profile to estimate the overall damage to a group of flows that may be under DDoS attack. With this scheme, the gateway needs only a table of limited size to keep flow information, and the overhead of table management is also reduced. The problem is that the precision of the estimate of the overall damage to a group of flows decreases as the number of flows (i.e., the population) in the group increases. An interesting issue is, given the group number (footnote 1), how to group the traffic to achieve maximum fairness among groups in terms of the precision of estimation. We design a heuristic grouping algorithm. The basic idea of the algorithm is to let each group have a similar amount of traffic, i.e., a similar traffic population (footnote 2). For one gateway, the routes of the on-going traffic form a tree rooted at the gateway itself. Our algorithm assigns group numbers recursively to sub-trees, driven by the total traffic population of the sub-trees. The larger the population of a sub-tree, the larger the group number the sub-tree can be assigned (footnote 3).
1 The group number means the number of groups into which the traffic can be split. Generally speaking, the group number is much smaller than the number of flows in the network. It may be determined by the processing power and storage of the gateway that can be used in attack detection.
2 In this study, we use traffic population to represent the amount of traffic in some time unit, say, a second; it is equivalent to the traffic arrival rate.
In our algorithm, the information about the traffic population of the
networks with the certain degree of granularity assume to be available.
We believe that the information can be obtained with much less over-
• Traffic sampling: We can use statistical sampling to examine a randomly selected subset of packets, rather than examining every packet in the traffic. With this scheme, the overhead of reading packet headers can be reduced. As long as the sample size is sufficiently large, a desired degree of confidence can be maintained.
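To make the remark about sample size concrete, the sketch below uses the standard normal-approximation formula for estimating a proportion, n = z^2 p(1 - p) / m^2, where m is the tolerated margin of error. This formula is a textbook choice assumed here, not one stated in the paper, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, p_guess=0.5):
    """Packets to sample so the estimated bad-traffic ratio is within
    `margin` of the true ratio at the given confidence level.

    p_guess = 0.5 is the worst case and gives the largest sample size.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil(z * z * p_guess * (1.0 - p_guess) / (margin * margin))


def estimate_bad_ratio(packet_is_bad, sample_size, rng):
    """Estimate the bad-traffic ratio from a uniform random sample.

    `packet_is_bad` is a list of booleans, one per packet; in a gateway the
    verdict would come from the checking sub-module instead.
    """
    sample = rng.sample(packet_is_bad, sample_size)
    return sum(sample) / sample_size
```

For a margin of 0.05 at 95% confidence this yields 385 packets per group, independent of how many packets the group actually carries, which is why sampling scales to high-speed links.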
VII. ACCESS CONTROL
Attack detection itself is not the final goal of the defense
system. Once the detection is done, the system should take
action based on the detection results. As mentioned above,
the group-based approach is also applied here. We classify the traffic into different groups (or classes) and assign
different bandwidth to achieve the overall maximum TCP
throughput. In the following, we will first introduce how
to classify traffic, and then concentrate on how to control
traffic by using RED and CBQ technologies.
A. Classification
The goal of traffic classification is to put traffic that shares a certain degree of similarity together. Since traffic within the same class is treated uniformly, it is very important to make sure that the traffic in the same class shares a certain degree of similarity. The similarities include:
• Damaging similarity: The degree of damage by the DDoS attack should be similar for the traffic in the same class. It is unfair to group traffic with a very low degree of damage together with traffic seriously damaged by the DDoS attack.
• Geometry similarity: We will use a TCP-equation based solution to control the bandwidth consumption of the traffic. The solution requires that the traffic share some geometry similarities. In this way, the delay and other behaviors can be approximated as being the same.
In this study, we use the notion of the overall variance in (1) to describe the damaging similarity. The smaller the variance, the higher the degree of damaging similarity of a group. In contrast to the damaging similarity, it is difficult to quantify the similarity of geometry. In this study, we require that only the traffic from brother nodes (sibling nodes in the routing tree) can be grouped into one class.
In reality, the number of classes into which the traffic can be grouped is fixed. It may be determined by the number of RED queues at the output link of the gateway. In this case, with the above considerations about similarity, the problem of classification can be defined as follows:
head than the one in attack detection.
Given the number of classes $|G|$ and the bad traffic ratio $e_j$ for traffic group $j$, classify the traffic into different classes $G_i$ to minimize the variance of traffic:
$$\sigma^2 = \frac{1}{P} \sum_{i \in G} \sum_{j \in G_i} (e_i - e_j)^2 P_j, \tag{1}$$
where $G$ is the set of class IDs, $G_i$ is the set of traffic for class $i$, $P$ is the total population, and $e_i$ is the average bad traffic ratio, i.e.,
$$P = \sum_{i \in G} \sum_{j \in G_i} P_j, \tag{2}$$
$$e_i = \frac{\sum_{j \in G_i} P_j e_j}{\sum_{j \in G_i} P_j}. \tag{3}$$
TABLE I
GROUPS GENERATED BY THE GROUPING ALGORITHM
(each line lists the sources of one group)
23, 26
24
25
27, 29, 30
28
31, 32, 33, 34
35, 36, 37, 38
39, 40, 41
42
43, 44, 45, 46
47
48
49, 50
51, 52, 53, 54
55, 56, 57
58
59, 60, 61, 62
63, 65
64
66
67, 68, 69, 70
71, 74
72
73
75, 76, 78
77
79, 80, 81, 82
83
84, 85, 86
Note that the bad traffic ratio ej and the population Pj
for traffic group j are obtained in the Attack Detection
module of the gateway. The classifier in the Access Control Module uses the information to classify the traffic into
different classes.
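The objective in (1) through (3) can be evaluated directly for any candidate classification. The following sketch does so; the input layout (a list of classes, each a list of `(P_j, e_j)` pairs) is an assumption made for illustration.

```python
def class_variance(classes):
    """Variance of traffic as in Eq. (1).

    `classes` is a list of classes G_i; each class is a list of
    (population P_j, bad traffic ratio e_j) pairs, one per traffic group.
    """
    total_population = sum(p for cls in classes for p, _ in cls)  # Eq. (2)
    weighted_sq_dev = 0.0
    for cls in classes:
        class_population = sum(p for p, _ in cls)
        if class_population == 0:
            continue  # an empty class contributes nothing
        # Population-weighted average bad ratio of the class, Eq. (3).
        e_i = sum(p * e for p, e in cls) / class_population
        weighted_sq_dev += sum(p * (e_i - e) ** 2 for p, e in cls)
    return weighted_sq_dev / total_population                     # Eq. (1)
```

For example, two groups with ratios 0.1 and 0.3 and equal populations placed in one class contribute a squared deviation of 0.01 each around the class average 0.2, while a class containing a single group contributes zero.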
The problem is NP-hard. We design a heuristic classification algorithm that runs in polynomial time. The basic idea of this algorithm is that we sort the child nodes of each node in increasing order of the bad traffic ratio, and then recursively assign class numbers to each node to obtain the minimum variance. Since we have sorted the child nodes of each node, it is easy to prove that our algorithm is polynomial. The details of the algorithm are in Appendix A.
The performance measure of the classification algorithm is the variance of traffic (the variance, in short) defined in (1). For the purpose of comparison, we introduce a low-bound and an up-bound for the variance. They are obtained by randomly generating a large number (footnote 4) of classification plans. The low-bound is the minimum variance among those resulting from these randomly generated plans. The up-bound is the average of the variances resulting from these plans. The variance resulting from our classification algorithm should be smaller than the up-bound and close to the low-bound.
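The two reference values can be reproduced with a few lines of code. The sketch below is a simplification: a random plan assigns every traffic group to a uniformly chosen class and ignores the brother-node restriction, which the paper's own random plans may or may not respect. Function names and the input layout are hypothetical.

```python
import random


def plan_variance(groups, plan):
    """Variance (1) of a plan; groups[j] = (P_j, e_j), plan[j] = class of group j."""
    total_population = sum(p for p, _ in groups)
    members_by_class = {}
    for (p, e), cls in zip(groups, plan):
        members_by_class.setdefault(cls, []).append((p, e))
    weighted_sq_dev = 0.0
    for members in members_by_class.values():
        class_population = sum(p for p, _ in members)
        e_i = sum(p * e for p, e in members) / class_population
        weighted_sq_dev += sum(p * (e_i - e) ** 2 for p, e in members)
    return weighted_sq_dev / total_population


def random_plan_bounds(groups, n_classes, n_plans=1000, seed=0):
    """Return (low-bound, up-bound): minimum and average variance over random plans."""
    rng = random.Random(seed)
    variances = [
        plan_variance(groups, [rng.randrange(n_classes) for _ in groups])
        for _ in range(n_plans)
    ]
    return min(variances), sum(variances) / len(variances)
```

Because the low-bound is only the best of a finite number of random plans, a good heuristic can in principle fall below it; it is a reference value rather than a true lower bound.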
Fig. 3. Network Topology for Simulation
Fig. 4. Classification Results (variance versus the number of classes; curves: Classification Alg, Low-bound, Up-bound)
In the following, we evaluate the performance of the classification algorithm. For the purpose of evaluation, we generate the 4-ary tree shown in Figure 3. The traffic from each leaf has a population randomly generated between 1,600 kbps and 32,000 kbps. With the grouping algorithm, the tree is grouped as shown in Table I. In Table I, all traffic from the sources in the same row forms one group. For example, traffic from sources 23 and 26 forms one group. In the following, we use this grouped tree as the input of the classification algorithm.
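The grouping step that produces a table such as Table I is described only in outline (recursive assignment of group numbers to sub-trees according to their traffic population). The sketch below is one possible reading of that outline, not the authors' algorithm: each sub-tree first receives one group, spare groups go to the sub-tree with the largest population per group, and when there are fewer groups than sub-trees, sibling sub-trees are packed into the lightest group first. `Node`, `leaves`, `population`, and `assign_groups` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    population: float = 0.0                      # meaningful for leaves only
    children: List["Node"] = field(default_factory=list)


def leaves(node):
    """All leaf nodes (traffic sources) under `node`."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def population(node):
    """Total traffic population of the sub-tree rooted at `node`."""
    return sum(leaf.population for leaf in leaves(node))


def assign_groups(node, k):
    """Split the sources under `node` into at most k groups of similar population."""
    if k <= 1 or not node.children:
        return [[leaf.name for leaf in leaves(node)]]
    kids = sorted(node.children, key=population, reverse=True)
    if k < len(kids):
        # Fewer groups than sub-trees: merge siblings, heaviest first,
        # always into the currently lightest group.
        bins = [[0.0, []] for _ in range(k)]
        for kid in kids:
            lightest = min(bins, key=lambda b: b[0])
            lightest[0] += population(kid)
            lightest[1].extend(leaf.name for leaf in leaves(kid))
        return [names for _, names in bins]
    # One group per sub-tree, then hand out spare groups by population.
    share = {id(kid): 1 for kid in kids}
    for _ in range(k - len(kids)):
        splittable = [c for c in kids if len(leaves(c)) > share[id(c)]]
        if not splittable:
            break
        neediest = max(splittable, key=lambda c: population(c) / share[id(c)])
        share[id(neediest)] += 1
    groups = []
    for kid in kids:
        groups.extend(assign_groups(kid, share[id(kid)]))
    return groups
```

In this reading, groups never mix sources from different parents unless a whole sub-tree is merged, which is in line with rows of Table I such as "71, 74" alongside the single-source rows "72" and "73".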
Figure 4 shows the evaluation results of our algorithm together with the two bounds. As expected, the variance resulting from our algorithm is very close to the low-bound and much smaller than the up-bound. Note that our algorithm's performance improves as the number of classes increases. This can be explained by the fact that as the class number increases, the algorithm has more freedom to classify the traffic to achieve its objective.
4 In our simulation, the number is 1000.
B. TCP-Equation Based Access Control
The overall goal of access control is to achieve maximum TCP throughput under DDoS attack. The problem
this step faces is how to smartly drop traffic to achieve
the goal. In this study, we design a multi-class RED control block attached with a CBQ scheduler (see the Access
Control module of the gateway in Figure 2). The block is
composed of several RED queues. Different queues will
have different bandwidth assignment. The bandwidth assignment determines the drop probability of the traffic.
The total TCP throughput will be the sum of the
throughput of all the classes of traffic. We can express
it as follows:
achieved, and also the overall delay performance of the
TCP traffic will be less disturbed by packet dropping performed by the gateway. Both of these are the main objectives of our work.
To gain the advantages of the stability, we have to derive the stable conditions of systems. The basic idea of
deriving these conditions is that first to describe the application traffic behavior into differential equations, then
obtain the transfer functions of the system, finally use the
control method to analyze the whole system. The differential equations to describe TCP behavior have been derived
in [6]. The equations are listed as follows:
dWi (t)
dt
Ttotal = p1 (1 − δ1 )(1 − e1 ) + p2 (1 − δ2 )(1 − e2 )
+ . . . + pn (1 − δn )(1 − en ).
(4)
p̄ is defined as the vector of arrival rates of the different classes of traffic,
$$\bar{p} = \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_n \end{pmatrix}. \tag{5}$$
ē is defined as the vector of arrival probabilities of the bad traffic in the different classes of traffic,
$$\bar{e} = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}. \tag{6}$$
δ̄ is defined as the optimal vector of drop probabilities of the different classes of traffic,
$$\bar{\delta} = \begin{pmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_n \end{pmatrix}. \tag{7}$$
Among the above three vectors, p̄ and ē are the results of classification; δ̄ is what we want to determine at this step. An intuitive way to determine the drop probability is to let the traffic with small $e_i$ have a small drop probability. While this way is easy, it may not achieve a high overall TCP throughput. The reason is that TCP traffic is responsive: the dynamic behavior of TCP traffic in response to packet loss should be considered. Ideally, the maximum throughput should be achieved at the stable point of the system. If the system is stable, a long-term (in other words, stable) maximum throughput can be achieved. The dynamics of the system are described by
$$\frac{dW_i(t)}{dt} = \frac{1}{R_i(t)} - \frac{W_i(t)\,W_i(t - R_i(t))}{2R_i(t)}\,\delta_i(t - R_i(t)), \tag{8}$$
$$\frac{dq_i(t)}{dt} = -C_i^r + \sum_{i=1}^{n} \frac{W_i(t)}{R_i(t)}, \tag{9}$$
where, for class $i$ traffic, $W_i(t)$ is the TCP window function, $R_i(t)$ is the TCP round trip time function, $\delta_i(t)$ is the RED drop probability function, $q_i(t)$ is the RED queue length function, and $C_i^r$ is the bandwidth for the friendly TCP traffic of each class (in fact, $C_i^r = C_i - p_i e_i$, where $C_i$ is the bandwidth assigned to the i-th class of traffic).
Equation (8) describes the TCP congestion control mechanism of additive increase and multiplicative decrease. Equation (9) describes the change of the RED queue.
Based on the above two equations, we can obtain the stable conditions and stable points as follows:
• Stable conditions: According to [1], the stable conditions are as follows:
$$\frac{L_i^{red\text{-}tcp}\,(R_i^+ C_i^r)^3}{(2N_i^-)^2} \le \sqrt{\frac{w_g^2}{K^2} + 1}, \tag{10}$$
where
$$w_g = 0.1 \min\left\{ \frac{2N_i^-}{(R_i^+)^2 C_i^r},\ \frac{1}{R_i^+} \right\}, \tag{11}$$
$L_i^{red\text{-}tcp}$ is the RED curve slope, one of the RED parameters, $R_i^+$ is the upper bound of the round trip time, $N_i^-$ is the lower bound of the number of flows, $K = \log(1-\alpha)/\Delta$, $\alpha$ is the averaging factor used in calculating the average queue length, and $\Delta$ is the sample time.
• Stable points: The stable points can be obtained from the differential equations (8) and (9) [1] as long as the bad traffic can be regarded as stable:
$$W_i^2 \delta_i = 2, \tag{12}$$
$$W_i = \frac{R_i C_i^r}{N_i}, \tag{13}$$
where $W_i$ is the window size and $R_i$ is the round trip time when the system is stable, and $N_i$ is the number of TCP friendly flows in the class (see footnote 6).
Observing the above equations, we find that the drop probabilities $\delta_i$ have a clear relationship with the assigned bandwidth $C_i$: the more bandwidth is assigned to a class of traffic, the lower the drop probability of that class.
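As a worked illustration, equations (12) and (13) can be combined into $\delta_i = 2\,(N_i / (R_i C_i^r))^2$. The following is a minimal Python sketch of that calculation, not the authors' implementation. The function name and the numeric values are hypothetical, and bandwidth is assumed to be expressed in packets per second so that the window is in packets.

```python
def stable_drop_probability(c_i, p_i, e_i, r_i, n_i):
    """Return (W_i, delta_i) at the stable point of equations (12) and (13).

    c_i: bandwidth assigned to class i (packets/s, assumed unit)
    p_i: arrival rate of class i (packets/s, assumed unit)
    e_i: arrival probability of bad traffic in class i
    r_i: stable round trip time of class i (s)
    n_i: number of TCP friendly flows in class i
    """
    c_r = c_i - p_i * e_i  # bandwidth left for friendly TCP: C_i^r = C_i - p_i e_i
    if c_r <= 0 or r_i <= 0 or n_i <= 0:
        raise ValueError("C_i^r, R_i and N_i must all be positive")
    w = r_i * c_r / n_i      # equation (13)
    delta = 2.0 / (w * w)    # equation (12) solved for delta_i
    return w, delta


# Hypothetical class: more assigned bandwidth yields a lower drop probability.
w_low, d_low = stable_drop_probability(c_i=1000.0, p_i=400.0, e_i=0.5, r_i=0.125, n_i=10)
w_high, d_high = stable_drop_probability(c_i=1800.0, p_i=400.0, e_i=0.5, r_i=0.125, n_i=10)
```

A returned value above 1 means the class cannot reach a stable point with that bandwidth, since a drop probability must lie in [0, 1].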
Having derived the stable conditions of the system, we now consider the constraints on the system:
• The sum of the bandwidth assigned to all classes of traffic must be no greater than the total bandwidth available. Hence, we have
$$\sum_{i=0}^{n} C_i \le \beta C, \tag{14}$$
where $C_i$ is the bandwidth assigned to the i-th class of traffic, $C$ is the link bandwidth, and $\beta$ is a parameter (see footnote 7).
• Since $\delta_i$ is the traffic loss rate, or drop probability, for the i-th class of traffic, we have
$$0 \le \delta_i \le 1, \tag{15}$$
for $i = 0, 1, \ldots, n$.
Our problem now becomes finding the optimal traffic drop vector δ̄ that maximizes the total TCP throughput expressed in equation (4) under constraints (10)-(15). This is a constrained nonlinear programming (NLP) problem, which can be solved using Lagrange multipliers and the Kuhn-Tucker conditions.
By solving this NLP problem, we obtain the optimal drop probability for each class of traffic and the link bandwidth assignment for each class of traffic. The RED and CBQ scheduler can work with these parameters to achieve our objectives. Note that the traffic is dynamic: the vectors p̄ and ē may change, and the drop probabilities and bandwidth assignments need to be adjusted accordingly.
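Before a candidate assignment is handed to the RED and CBQ scheduler, it can be checked against the constraints. The following Python sketch only tests feasibility under conditions (10), (11), (14) and (15); it does not solve the NLP problem, and all function and parameter names are hypothetical.

```python
import math


def red_stability_ok(l_red, r_plus, c_r, n_minus, alpha, sample_time):
    """Check stable condition (10), with w_g taken from (11), for one class."""
    if not 0.0 < alpha < 1.0 or sample_time <= 0:
        raise ValueError("alpha must lie in (0, 1) and sample_time must be positive")
    k = math.log(1.0 - alpha) / sample_time  # K as defined in the text; only K^2 is used
    w_g = 0.1 * min(2.0 * n_minus / (r_plus ** 2 * c_r), 1.0 / r_plus)  # equation (11)
    lhs = l_red * (r_plus * c_r) ** 3 / (2.0 * n_minus) ** 2
    rhs = math.sqrt(w_g ** 2 / k ** 2 + 1.0)
    return lhs <= rhs


def allocation_feasible(bandwidths, drops, link_bandwidth, beta):
    """Check the bandwidth budget (14) and the drop probability bounds (15)."""
    within_budget = sum(bandwidths) <= beta * link_bandwidth
    valid_drops = all(0.0 <= d <= 1.0 for d in drops)
    return within_budget and valid_drops
```

Since p̄ and ē change over time, a check of this kind would be repeated each time the assignment is recomputed.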
VIII. GATEWAY COOPERATION
As mentioned above, because the capacity of each gateway is limited, gateways need to cooperate with each other to achieve high defense performance. Cooperation among gateways serves the following goals:
• Reducing duplication in processing the on-going traffic among gateways.
• Selecting the proper portion of the on-going traffic to process.
• Sharing the detection results among gateways.
6 Ni can be estimated at the stage of attack detection.
7 β can be a value between 0 and 1. It is related to the overall percentage of the bad traffic in the whole traffic passing through this gateway.
There are two schemes to reduce duplication. One scheme is to explicitly mark the IP header once a packet is selected for further checking. Subsequent gateways need not select the marked packet, so duplication is avoided. While this scheme is effective in terms of duplication reduction (in fact, it avoids duplication entirely), it is not compatible with the existing IP protocols. Furthermore, the overhead of writing to packets is significant. We prefer the second scheme, in which an explicit coordination approach is used. With this scheme, carefully designed rules in the attack detection module (i.e., the sampling rules in Figure 2) coordinate different gateways to select different portions of the on-going traffic. The rules also direct the classifier to put the undetected portion of traffic into one specific class. The access control module reserves a certain amount of bandwidth for this portion of traffic. The following example shows how the sampling rules can reduce duplication among gateways. The rules guarantee that Gateways I and J select different portions of the traffic based on the source address information.
• At Gateway I: If the last digit of an incoming packet's source address is X, the packet will be selected.
• At Gateway J: If the last digit of an incoming packet's source address is Y, the packet will be selected.
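The two rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the digit sets assigned to Gateways I and J are hypothetical (the text names single digits X and Y), and "last digit" is taken to mean the last decimal digit of a dotted-quad IPv4 source address.

```python
import ipaddress

# Hypothetical rule table: each gateway owns a disjoint set of last digits.
GATEWAY_DIGITS = {
    "I": {0, 1, 2, 3, 4},
    "J": {5, 6, 7, 8, 9},
}


def selected_by(gateway, src):
    """Return True if `gateway` should select a packet with source address `src`."""
    # IPv4Address validates the address and normalizes its text form.
    text = str(ipaddress.IPv4Address(src))
    last_digit = int(text[-1])
    return last_digit in GATEWAY_DIGITS[gateway]
```

Because the digit sets are disjoint, no packet is selected by both gateways, which is the duplication-avoiding property the rules are meant to guarantee.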
Besides reducing duplication, cooperation among gateways can also help individual gateways make a smart selection of the portion of the on-going traffic to process. For example, one gateway can inform its neighboring gateways to select the traffic that it does not have enough capacity to handle. Cooperation in this example belongs to the dynamic explicit coordination approach. With this approach, the defense load can be distributed dynamically among gateways depending on the network situation. Cooperation at this point can also be static. The following distance-based traffic selection falls into this category.
Reducing the bandwidth consumption of attack traffic is our basic approach to DDoS flooding attacks. Different attack traffic may have different targets with different paths and, accordingly, different potential for bandwidth consumption damage. Generally speaking, attack traffic with a longer path will consume more bandwidth and cause more damage than traffic with a shorter path. Hence, it is beneficial for the gateway to process more of the packets with longer remaining paths from the gateway to their destinations, as opposed to those with shorter remaining paths.
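One possible way to bias selection toward longer remaining paths is sketched below in Python. The linear weighting, the function name and its parameters are all assumptions made for illustration; the text does not prescribe a particular weighting, and the remaining hop count is assumed to be available to the gateway (for example from its routing information).

```python
def selection_probability(remaining_hops, max_hops, base_rate):
    """Scale a base sampling rate linearly with the remaining path length.

    remaining_hops: hops from this gateway to the packet's destination (assumed known)
    max_hops: hop count at which the full base rate is applied (hypothetical parameter)
    base_rate: sampling probability for the longest paths, in [0, 1]
    """
    if max_hops <= 0:
        raise ValueError("max_hops must be positive")
    if not 0.0 <= base_rate <= 1.0:
        raise ValueError("base_rate must lie in [0, 1]")
    clamped = min(max(remaining_hops, 0), max_hops)  # keep the ratio within [0, 1]
    return base_rate * clamped / max_hops
```

With this weighting, a packet that still has 15 hops to travel is sampled more often than one with 2 hops left, so the gateway spends its capacity where dropped attack traffic saves the most downstream bandwidth.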
Gateways can also exchange the detected traffic information to complement their locally obtained databases. This is particularly useful among gateways that are on the same path of the attack traffic. Recall that some gateways in our system may not have the attack detection module. Sharing detection information is particularly useful for this type of gateway.
IX. EXTENSIONS
In this study, we do not discuss the issue of gateway deployment. Interested readers can refer to the work reported in [4].
Recall that in Section III, we used a 4-tuple to model DDoS flooding attacks. In this study, we focus on containing the attack in case ⟨A2, B1, C1, D1⟩, where attackers use an extraordinarily large amount of attack traffic with TCP headers and spoofed sources to congest the network, and the real attack sources are outside the attacked network. We believe that this is one of the most typical and challenging cases. In this section, due to space limitations, we discuss only how to extend our work to DDoS flooding attacks that use UDP or mixed traffic, i.e., UDP and TCP traffic, to congest the network.
In cases where attackers use pure UDP traffic, that is,
traffic marked as UDP, to congest the network, at least
80 percent of the TCP friendly traffic can be easily separated from the attack traffic. Also, since UDP flows are
not responsive flows, the friendly UDP flows are tolerant
to some degree of packet losses. Hence, packet dropping
can be relatively easy to perform on the UDP flows (both
attack and friendly flows) to control their bandwidth consumption. Also, the defense system can monitor the bandwidth usage of individual UDP flows to identify the attack
traffic.
In cases where attackers use mixed traffic, i.e., UDP and TCP traffic, to congest the network, we have to handle both the TCP and UDP attacks. Due to resource limitations, the defense system may have to spend most of its capacity on protecting TCP flows by: (1) discriminating between TCP and UDP traffic by strictly limiting the bandwidth usage of UDP traffic, say, allowing UDP traffic to use at most 10 percent of the bandwidth in all cases, so that the bandwidth for TCP traffic can be guaranteed; (2) adopting the approaches proposed in this study to protect friendly TCP traffic.
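The bandwidth limit in step (1) can be illustrated with a short Python sketch. The function name and parameters are hypothetical, and the 10 percent default simply mirrors the example figure in the text rather than a recommended setting.

```python
def split_bandwidth(link_bandwidth, udp_demand, udp_cap=0.10):
    """Return (tcp_bandwidth, udp_bandwidth) with UDP capped at a fixed share.

    link_bandwidth: total link bandwidth
    udp_demand: bandwidth currently requested by UDP traffic (same unit)
    udp_cap: maximum fraction of the link that UDP may use (assumed default 0.10)
    """
    if not 0.0 <= udp_cap <= 1.0:
        raise ValueError("udp_cap must lie in [0, 1]")
    udp_bandwidth = min(udp_demand, udp_cap * link_bandwidth)
    tcp_bandwidth = link_bandwidth - udp_bandwidth  # everything else is kept for TCP
    return tcp_bandwidth, udp_bandwidth
```

However large the UDP demand grows during an attack, TCP traffic keeps at least 90 percent of the link under the default cap.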
X. CONCLUSION
We have proposed a defense system for DDoS flooding attacks. The countermeasure behavior of the individual gateways, together with their cooperation, constitutes the working scenario of the whole defense system.
Our system is compatible in the sense that it adopts the gateway-based approach, and no changes are needed to the existing systems and network protocols. The system is efficient and feasible in high speed networks in the sense that it adopts the group-based approach and is free from the overhead of handling individual flows.
In this study we propose several efficient defense approaches at each stage of DDoS defense. At the stage of attack detection, we design TCP-ACK based attack detection, and use statistical sampling to efficiently obtain knowledge of the traffic under DDoS attack. At the stage of access control, we classify the traffic into different classes according to their geometry similarity and the degree of damage caused by the DDoS attack. We design a multi-class RED control block with a Class-based Queuing (CBQ) scheduler to control the consumption of bandwidth, aiming to achieve the maximum possible throughput.
Currently, we are implementing the prototype in the Linux environment. We are also investigating how to integrate existing detection technologies, such as spoofed-source filtering schemes, into our defense system.
REFERENCES
[1] C. V. Hollot, V. Misra, D. Towsley and W.-B. Gong, A Control Theoretic Analysis of RED, in Proceedings of IEEE Infocom, 2001.
[2] S. Kent, C. Lynn, J. Mikkelson and K. Seo, Secure Border Gateway Protocol (S-BGP) - Real World Performance and Deployment Issues, in Proceedings of the Network and Distributed System Security Symposium (NDSS 2000), Feb. 2000.
[3] G. Labovitz, G. R. Malan and F. Jahanian, Origins of Internet Routing Instability, in Proceedings of IEEE Infocom'99.
[4] B. Li, M. J. Golin, G. F. Italiano and X. Deng, On the Optimal Placement of Web Proxies in the Internet, in Proceedings of IEEE Infocom'99.
[5] R. Mahajan, S. Bellovin, S. Floyd, J. Ioannidis, V. Paxson and S. Shenker, Controlling High Bandwidth Aggregates in the Network, submitted to ACM SIGCOMM 2001.
[6] V. Misra, W.-B. Gong and D. Towsley, Fluid-based Analysis of a Network of AQM Routers Supporting TCP Flows with an Application to RED, in Proceedings of ACM SIGCOMM, 2000.
[7] S. Savage, N. Cardwell, D. Wetherall and T. Anderson, TCP Congestion Control with a Misbehaving Receiver, ACM Computer Communication Review, vol. 29, no. 5, October 1999.
[8] D. Schnackenberg, K. Djahandari and D. Sterne, Infrastructure for Intrusion Detection and Response, in Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX), 2000.
[9] Defense Information Systems Agency, Network Warfare Simulation, URL: http://www.disa.mil/D8/netwars
APPENDIX A: THE ALGORITHM OF CLASSIFICATION
Input: tree T with root ROOT, traffic population P_i and bad traffic ratio e_i going through node i, and class number CN.
Output: classified tree CT and variance VAR.
1. sort tree T, such that for each father node, all its children's e_i's are ordered increasingly;
2. each node is initialized with class number 0;
3. call Classification(ROOT, CN, CT, VAR);
4. return CT and VAR.
Fig. 5. The Algorithm of Classification
Classification(i, CN_i, CT, VAR)
1. if i is not a leaf
   1.1. assign each child j with class number CN_j, such that $\sum_{j \in C_i} CN_j = CN_i$ and only brother nodes can be grouped together (C_i is the set of children of node i);
   1.2. for each class number assignment
      1.2.1. for each child j
         1.2.1.1. RET = Classification(j, CN_j, CT, VAR);
         1.2.1.2. if RET = FALSE
                     goto 1.2;
                  else
                     continue;
      1.2.2. compute the current variance cur_VAR, set VAR = min{cur_VAR, VAR} and update its corresponding CT;
2. if i is a leaf
   2.1. if CN_i > 1
           return FALSE;
        else
           return TRUE;
Fig. 6. Procedure Classification
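Step 1.2 of the procedure iterates over every class number assignment allowed by step 1.1. The Python sketch below enumerates only the assignments that satisfy the sum constraint of step 1.1; the function name is hypothetical, and the grouping of brother nodes and the variance computation are not modeled, since the appendix does not define them in detail.

```python
def class_number_assignments(total, children):
    """Yield every tuple of non-negative class numbers, one per child, summing to `total`.

    total: class number CN_i of the parent node
    children: number of children of the parent node
    """
    if children == 0:
        # No children left: valid only if the whole class number has been used up.
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        # Give `first` classes to the first child and split the rest recursively.
        for rest in class_number_assignments(total - first, children - 1):
            yield (first,) + rest
```

The number of assignments grows combinatorially with the class number and the number of children, which gives a rough sense of the search space the procedure explores at each internal node.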