Unveiling Anomalies in Large-scale Networks via Sparsity and Low Rank
Morteza Mardani, Gonzalo Mateos, and Georgios Giannakis
ECE Department, University of Minnesota
Acknowledgments: NSF grants no. CCF-1016605, EECS-1002180
Asilomar Conference, November 7, 2011
Context
Backbone of IP networks
Traffic anomalies: changes in origin-destination (OD) flows
Failures, transient congestion, DoS attacks, intrusions, flooding
Motivation: congestion induced by anomalies limits end-user QoS provisioning
Goal: given superimposed OD-flow measurements per link, identify anomalies by leveraging the sparsity of anomalies and the low rank of traffic
Model
Graph G(N, L) with N nodes, L links, and F flows (F >> L)
(as) Single path per OD flow x_{f,t}
Packet counts per link l and time slot t:
y_{l,t} = Σ_f r_{l,f} (x_{f,t} + a_{f,t}) + v_{l,t},  with routing coefficients r_{l,f} ∈ {0,1}
Matrix model across T time slots:
Y = R(X + A) + V,  where Y is L×T and the routing matrix R is L×F
[Figure: network topology; flows f1 and f2 share link l, and an anomaly hits flow f1]
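To make the matrix model concrete, here is a minimal synthetic sketch. All dimensions, the random routing-matrix construction, and the anomaly magnitudes are illustrative choices, not values from the talk (and the random routing does not enforce the single-path assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
L, F, T, r = 10, 30, 50, 2            # links, flows, time slots, traffic rank (illustrative)

# Hypothetical routing matrix: each flow crosses a few random links, entries in {0,1}
R = (rng.random((L, F)) < 0.2).astype(float)

# Low-rank flow traffic X = U V' and a sparse anomaly matrix A
U, V = rng.random((F, r)), rng.random((T, r))
X = U @ V.T
A = np.zeros((F, T))
A[rng.integers(0, F, 5), rng.integers(0, T, 5)] = 10.0   # a few anomalous (flow, time) pairs

V_noise = 0.01 * rng.standard_normal((L, T))
Y = R @ (X + A) + V_noise             # L x T matrix of link counts: Y = R(X + A) + V

# Aggregation preserves low rank: rank(RX) <= rank(X)
assert np.linalg.matrix_rank(R @ X) <= r
```

The final assertion checks the key structural fact used later: routing compresses flows into links, but the link-level traffic R X keeps the low rank of X.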
Low rank and sparsity
X: traffic matrix is low-rank [Lakhina et al '04]
[Figure: |x_{f,t}| versus time index (t)]
A: anomaly matrix is sparse across both time and flows
[Figure: |a_{f,t}| versus time index (t) and versus flow index (f); only a few large entries]
Objective and criterion
Given link counts Y and the routing matrix R, identify the sparse anomaly matrix A when the traffic X is low rank
R is fat (L << F), but RX remains low rank since rank(RX) <= rank(X)
Low rank ⇒ sparse vector of singular values ⇒ nuclear norm ||·||_*; sparsity ⇒ l1 norm
(P1)  min_{X,A} (1/2)||Y − X − RA||_F^2 + λ_*||X||_* + λ_1||A||_1   (X here denotes the low-rank link-level traffic)
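A minimal sketch of one way to solve an estimator of this form, via alternating proximal steps: singular-value thresholding for the nuclear-norm term and an entrywise soft threshold for the l1 term. This is an illustrative solver, not the algorithm of the talk; `solve_p1`, the parameter defaults, and the iteration count are assumptions:

```python
import numpy as np

def svt(M, tau):
    """Singular-value thresholding: prox operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(M, tau):
    """Entrywise soft threshold: prox operator of tau * l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def solve_p1(Y, R, lam_star=1.0, lam1=0.1, iters=200):
    """Alternating minimization for
    min_{X,A} 0.5||Y - X - R A||_F^2 + lam_star ||X||_* + lam1 ||A||_1."""
    L, T = Y.shape
    F = R.shape[1]
    X, A = np.zeros((L, T)), np.zeros((F, T))
    c = np.linalg.norm(R, 2) ** 2              # Lipschitz constant for the A-subproblem
    for _ in range(iters):
        X = svt(Y - R @ A, lam_star)           # exact prox step in X
        grad = R.T @ (X + R @ A - Y)           # gradient of the quadratic in A
        A = soft(A - grad / c, lam1 / c)       # proximal-gradient step in A
    return X, A

# Tiny illustrative run on random data (sizes are arbitrary)
rng = np.random.default_rng(0)
R = (rng.random((8, 20)) < 0.3).astype(float)
Y = rng.standard_normal((8, 30))
X_hat, A_hat = solve_p1(Y, R)
```

Both subproblem updates decrease the objective, so the iteration is stable; in practice λ_* and λ_1 trade off rank of the recovered traffic against sparsity of the flagged anomalies.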
Distributed approach
Centralized setup: Y and R stack the per-node link measurements, Y = [Y_1', ..., Y_N']'
Goal: given (Y_n, R_n) per node n ∈ N and single-hop exchanges only, match the centralized estimates
(P2): nonconvex reformulation of (P1) via the factorization X = LQ', where L is L×ρ with ρ ≥ r
Nonconvex, but the distributed solution reduces complexity: LT + FT → ρ(L + T) + FT unknowns
M. Mardani, G. Mateos, and G. B. Giannakis, "In-network sparsity-regularized rank minimization: Algorithms and applications," IEEE Trans. Signal Proc., 2012 (submitted).
Separable regularization
Key result [Recht et al '11]:
||X||_* = min_{L,Q: X = LQ'} (1/2)(||L||_F^2 + ||Q||_F^2)
New formulation, equivalent to (P2):
(P3)  min_{L,Q,A} (1/2)||Y − LQ' − RA||_F^2 + (λ_*/2)(||L||_F^2 + ||Q||_F^2) + λ_1||A||_1
Proposition 1. If {L̄, Q̄, Ā} is a stationary point of (P3) whose residual satisfies ||Y − L̄Q̄' − RĀ|| ≤ λ_*, then {X̄ = L̄Q̄', Ā} is a global optimum of (P1).
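The key result can be checked numerically: the balanced SVD factorization attains the nuclear norm, and any other factorization only upper-bounds it. A small sketch (matrix sizes and the reshuffling matrix G are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 8))   # 6x8, rank <= 3

U, s, Vt = np.linalg.svd(X, full_matrices=False)
nuc = s.sum()                                   # nuclear norm ||X||_*

# Balanced factorization L = U sqrt(S), Q = V sqrt(S) attains the minimum
Lf = U @ np.diag(np.sqrt(s))
Qf = Vt.T @ np.diag(np.sqrt(s))
attained = 0.5 * (np.linalg.norm(Lf, 'fro')**2 + np.linalg.norm(Qf, 'fro')**2)
assert np.isclose(attained, nuc)

# Any other factorization X = (L G)(Q G^{-T})' only upper-bounds the nuclear norm
G = rng.standard_normal((6, 6)) + 3 * np.eye(6)  # invertible reshuffling (almost surely)
other = 0.5 * (np.linalg.norm(Lf @ G, 'fro')**2
               + np.linalg.norm(Qf @ np.linalg.inv(G).T, 'fro')**2)
assert other >= nuc - 1e-9
```

This separability is what makes the nuclear-norm term splittable across nodes in (P3): each node only handles Frobenius norms of its local factors.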
Distributed algorithm
(P4): consensus reformulation of (P3), enforced with neighboring nodes
Network connectivity implies (P3) ⇔ (P4)
Alternating-direction method of multipliers (AD-MoM) solver
Primal variables per node n: local copies of the factors and the anomaly map
Message passing: each node exchanges its local updates only with single-hop neighbors
Distributed iterations
Dual variable updates
Primal variable updates
Attractive features
Highly parallelizable with simple recursions: updates rely on the entrywise soft threshold S_τ(x) and avoid costly F×F matrix operations
Low overhead for message exchanges: Q_n[k+1] is T×ρ and A_n[k+1] is sparse
Recap
(P1): centralized, convex
(P2): LQ' factorization, nonconvex
(P3): separable regularization, nonconvex
(P4): consensus, nonconvex
Chain of optimality: stationary point of (P4) ⇒ stationary point of (P3) ⇒ global optimum of (P1)
Optimality
Proposition 2. If the AD-MoM iterates converge, then: i) all nodes reach consensus, and ii) the limit point is the global optimum of (P1).
AD-MoM can converge even for nonconvex problems
Simple distributed algorithm that optimally identifies network anomalies
Per-node anomaly estimates are consistent across flows and time
Synthetic data
Random network topology: N = 20, L = 108, F = 360, T = 760; minimum hop-count routing
[Figure: random network topology]
[Figure: ROC curves (detection vs. false-alarm probability) for true and estimated anomalies: proposed method (per time and flow) vs. PCA-based methods with r = 5, 7, 9]
At P_f = 10^-4, the proposed method achieves P_d = 0.97
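The ROC curves compare detection probability against false-alarm probability over the anomaly support. A minimal sketch of how one such operating point can be computed from true and estimated anomaly maps (the function name and the thresholding rule on |a_hat| are illustrative assumptions):

```python
import numpy as np

def roc_point(A_true, A_hat, thresh):
    """P_d and P_f when flagging (flow, time) entries with |a_hat| above thresh."""
    truth = A_true != 0                       # ground-truth anomaly support
    flag = np.abs(A_hat) > thresh             # declared anomalies
    p_d = flag[truth].mean() if truth.any() else 0.0
    p_f = flag[~truth].mean() if (~truth).any() else 0.0
    return p_d, p_f

# Toy example: one true anomaly, correctly recovered above the threshold
A_true = np.array([[0.0, 5.0], [0.0, 0.0]])
A_hat = np.array([[0.1, 4.0], [0.2, 0.05]])
p_d, p_f = roc_point(A_true, A_hat, 1.0)      # p_d = 1.0, p_f = 0.0
```

Sweeping `thresh` over the range of |a_hat| traces out the full ROC curve.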
Real data
Abilene network data, Dec. 8-28, 2008: N = 11, L = 41, F = 121, T = 504
[Figure: ROC curves for true and estimated anomalies: proposed method (per time and flow) vs. PCA-based methods with r = 1, 2, 4]
[Figure: estimated anomaly amplitudes across flows and time]
At P_f = 0.03: P_d = 0.92, estimation error Q_e = 27%
Concluding summary
Anomalies challenge QoS provisioning
Unveiling anomalies via convex optimization
Leveraging sparsity and low rank
Distributed algorithm
Identify when and where anomalies occur
Ongoing research
Missing data
Online implementation
Thank You!