Download 2 Tomography of a Network, Basics

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Quantum tomography wikipedia , lookup

Transcript
2 Tomography of a Network, Basics
2.1 Origin of Tomography
The word Tomography has a Greek etymology. Taking its origin from the roots “ ”
tomos] which can be translated as “writing” and “” [grapos] which can be
translated as “section”, this term means to write something that is possible to be observed
only from the internal side of a of structure.
The Tomography approach has gained its popularity first as a very important tool for
different medical applications. This term has recently given the name to many kinds of
medical tests and investigations. For example, the axial computerized tomography,
generally known as TAC, is a radiology technique based on a particular system of
measurements of the coefficients of absorption of the X-Rays from the fabrics subject to
examination in a scanning axial crossroads. This invention changed radically the science of
the knowledge of a lot of hidden issues that are incomparably more difficult to investigate
than those that could be traced by the direct sight.
But why are these notions being discussed and what is their common aspect with the
studies of the Network? The answer to this question can be well illustrated by considering
the Internet infrastructure today as the cybernetic equivalent of a human organism. The last
miles connection from the Internet to personal computers are similar to thousands of
capillaries and small and medium sized Internet service providers (ISPs) are interconnected
like the arteries to the backbone transit providers. The global infrastructure of Internet
consists, therefore, in a complex array of interconnected telecommunication networks,
where each network is a part of an Internetwork, and is very difficult to be diagnostically
analyzed. That is why the described analogy can be used here to introduce how the term
Tomography can be applied to the Internet world.
The suggested solution is to develop a method for estimating internal behaviour patterns by
using the measurements taken at the network edges only, like, for example, exploiting endto-end traffic to reconstruct the network internal performance. This new types of research
were a necessary condition for the continuous growth of the Internet, which was originally
a small controlled network in the late 1970’s serving the needs of few users and rapidly
developed then into a huge multi-layered heterogeneous collection of terminals. This
growth required a continuous extension of monitoring techniques able to provide managing
it in space and time.
Emanuele Orlando
Obtaining the information about network performance, such as link loss rate and queuing
delays, plays an important role in isolation of network congestion and detection of
performance degradation. The behaviour of the global network, such as routing algorithms,
servicing strategies and performance verifications, can benefit from monitoring techniques
reporting information. That is why monitoring of large communications system is simply
the minimum necessary condition to diagnose all, or at least a part of most important vital
components for the efficient network operation.
But why the Tomography can be considered as the exceptional approach the world of the
network is looking for? Why cannot the normal internal approach, which has always been
used during the Internet growth, continue to be applied?
2.2 Internal and External Measurements
It is important to differentiate between the common internal approach and the external
trade motif of Tomography science. Internal direct measurements are made within the
network elements, as, for example, packet lost, delay or traffic characteristics, inside the
network. External measurements are made along the network on the end-to-end or edge-toedge paths. From Figure 2.1 it is possible to realize the basic difference of measurements
between these two approaches.
Internal approach is becoming irrelevant because of the high number of its potential
limitations. The measurement process imposes potential load to the network. In fact, the
cooperation between internal servers or routers can lead to impracticable overhead in terms
of extra communication requirements and coordination efforts. Besides, the measurements
access can be limited by service providers, because, for example, they might consider such
kind of information as highly confidential, therefore, some administrative problems can
take place. Finally, internal measurements investigate the average behaviour, especially the
throughput or average queuing with a restricted number of possible statistics.
For these reasons, external approach has the most important advantage: it allows to carry
out the measurements that do not require a dedicated cooperation from internal network
devices and minimize the network load. These aspects can be overcome by means of
tomography.
Studying and monitoring the network by means of making the measurements only at the
network edges lead to the lack of cooperation between the internal devices, which, in turn,
can be the reason for the waste of resources. Computing and elaborate these
measurements from two dedicated external calculators on the edge provide with the
opportunity of separation between network and its management.
4
2 Tomography of a Network, Basics
External side
Network
2
Network
1
Router
End-host
Figure 2.1: Internal measurement, is made directly between two elements of the
network. External, is obtained indirectly from the measurement along a path,
from host 1 to host 2.
2.3 Topics of Network Tomography Approaches
With the emerging of Internet tomography there appeared a broad new science of reporting
several groups of methods to investigate inference of internal network behaviour based on
external end-to-end network measurements. This approach is based on the inference theory,
the goal of which is to obtain “less price” network statistics revealing hidden network
structure and helping to detect isolated congestion, routing faults or anomalous traffic.
The term tomography can have different meaning depending on the branch of studies it is
being applied for. It can be used as a tool for the following areas:
 Link-level delay estimate
Inference of the probability distribution on links along global path, where the
measurements are made. An experimental application in a LAN (Local Area Network) is
mentioned in this text (Section 5).
 Link load at limit and Throughput
Estimates link load limit, allows to manage possible congestions or bottlenecks, knowing
the traffic intensity estimate.
 Topology identification (Topography)
Estimate of the internal network connectivity and estimate of the topology configuration.
These are some of the most important applications showing how the “inference” world can
use its complex tools, for the estimation of a large number of spatially distributed
parameters[1].
5
Emanuele Orlando
Figure 2.2: The collection of the measurements along the network are input for
large-scale inference strategies to estimate some parameters of the network, like
the probability loss rate or delay distribution of internal links[2].
The most difficult task is to adopt the simplest possible model to a large-scale estimation,
taking into consideration the error under the minimum upper bound request. It is essential
to know the degree of the reliability of an estimation.
Often the accuracy can be considered as the equivalent to increase in computation efforts,
requiring the application of an intricate analysis of network, which is a complex and
difficult to estimate algorithm process. Usually, it is better to use a more simple model,
because the results of it will be sufficient for inference of the performance characteristics. It
is important to have a trade-off between the accuracy and computations to be able to apply
these results to the network.
Tomography approach enables to move the focus from a detailed real analysis of the
network (as, for example, queuing or traffic modeling) to a set of measurements and large
scale inference strategies. This aspect is shown in the Figure 2.2.
The measurements play here a very important role. Tomography approach must be able to
avoid conditioning the same network by the measurements, which may be Passive or
Active. The passive measurements use the existing traffic along a path in the network. To
conduct the measurement it is necessary to sample the network traffic flow. The active
measurements are used to generate a traffic flow and notify the network behavior. These
6
2 Tomography of Network, Basics
are two different kinds of measurement, and the use of them depends on the ability to apply
it to the network. Both approaches play an important role because they do not overload the
system being used to obtain measurement with a sufficient reliability. For example, in case
of passive measurement, the sample is not influenced by a particular state of the
network. It is important to conduct measurement independently from the spatial and
temporal structure. Probe tools, on the contrary, are necessary to apply for active
measurement.
The first important conclusion can be made: network tomography is a science able to
capture a network performance adjusting the active or passive measurement to an inference
strategy.
2.4 Basic Definitions
Getting to the core of a network it is possible to recognize several different structures along
a large scale network. They are the basic definitions, but are fundamental to explain and
define in the inference method. The simplest way to depict a network is shown in Figure
2.3. The goal is to visualize the connectivity topology of a large and widely spread
network. Each node can represent a computer terminal, router or subnetwork, depending on
the accuracy level of the logical representation. The connection between two adjacent
nodes is called link. A link is a direct connection between two nodes. The union of adjacent
links defines a path. This is a logical representation, where each logical link is a chain of
physical links connected by routers. Each path is built by a source-node and a destinationnode. To identify the measurement on this path, the source-node transmits a message
(packets) to the destination-node, passing through several nodes sharing the same path. A
possible inference strategy is to use the obtained measurement and to carry out some
internal link characteristics, such as loss rate estimation or link delay estimation. If a path is
defined, there exist two forms of tomography approach, depending on the hierarchical
structure level of the network.
i.
Link-level parameter estimation
In this context, the target of tomography strategy is the estimation of parameters
belonging to links, such as time delay estimation. The goal is to estimate the time delay
necessary to cross a link along a path, having the measurement on the integer path. It is an
extremely challenging task if the possible reasons, which make the real time swing, are
taken into consideration. The time delay can have many random components determining
the behavior, ranging from the propagation delay, to the queuing at the router and router
packet servicing delay, bearing in mind the possibility of the packet dropped event. This
happens, usually, due to the overload of finite output buffer of the routers across the link or
power failures.
ii.
Path-level traffic intensity estimation
7
Emanuele Orlando
It is an estimate operating on a higher hierarchical level. Usually, this approach
focuses on the traffic intensity estimation crossing all paths in the network. The
measurements consist of counts of packets that pass through the nodes, combining all the
possible origin-destination paths. This represents a global approach to measure the
network. The complexity increases not only for the random measurement along each path,
but also due to the fact that there exist no fixed paths in the network.
link
node
path
Figure 2.3: The collection of the measurements along the network are input for
large-scale inference strategies to estimate some parameters of the network, like
the probability loss rate or delay distribution of internal link [2].
The first model is of lower complexity. It can be applied, for example, to conduct the
estimations in a LAN. This approach is used in the practical application enounced in
Chapter 6.
A large scale network can be represented by many mathematical models, the goal of which
is to simulate its real behavior. It is possible to resolve many Network Tomography
problems with the help of these models. In this work there will be given a short description
of Linear Model.
2.5 Linear Model
Linear model is a rough approximation of a Network Tomography inference problem
Y=A+
8
(2.1)
2 Tomography of Network, Basics
where y is a vector of the measurement, and each component can represent, for example,
end-to-end delay, or packet counts, depending by the inference strategy adopted. A
represents the Routing Matrix or Traffic Matrix.  is a vector of quantities which lead
information, which are usually the parameters to estimate, such as mean delay, logarithms
of loss rate, or packet transmission probability over a link.  is an addictive vector error
which absorbs all the error effects, from measurement errors in vector y or from the
parameter estimation .
A is a binary matrix, which capture the topology of the network. Figure 2.4 depicts a
simple example showing how this matrix is constructed. The i, j-th element is equal to 1
only if there is the connection between nodes i and j. Using the topology it is possible to
consider the example of unicast loss rate tomography. The goal is to obtain the
measurement of the probability of a packet right transmission over a link, which can be
seen in the Figure 2.5, using active measurement. Packets are sent by the sender 0 and
received by the nodes 2 and 3. n k and m k denote the number of packets sent and received
by receiver node k. Measurements are obtained simply by the number of the packets
arrivals and define the packet arrival probability p̂ k =m k /n k .  j for j =1,2,3 represents the
parameter associated with each link to be estimated. This is a simplified example and the
error  is ignored.
0
1
j
Link
2 2
Link 1
Link 3
3
i
a
a
A   11 12
 a21 a22
1 1
A
1 0
a13 
a23 
0
1

Figure 2.4: The matrix A is generated allowing 1 where exists a link along the
i-th path.
The particular application is shown in the equation 2.2.
ˆ 2  1 1
 log p


ˆ 3  1 0
 log p
9
 1 
0 
 2
1   
 3 
(2.2)
Emanuele Orlando
Large scale network inference estimates the network parameter  if y is being measured.
Although it is a simplified inference model, it represents a lot of complex aspects. Usually
 vector is replaced by an x vector function of , parameter to estimate. x represents a
measurable quantity containing the hidden parameter to estimate. Each component of x
vector depends from the respective  component.
x j  f ( j )
j= 1,….,J
(2.3)
This relationship may be more or less complex and depends on the inference strategy.
Besides, as it can be noticed from the Equation 2.2, this dependence is a rough
approximation, which add, an error to the estimated results. This correlation must be only
clear enough.
sender
K=0
1 
 y2 
 
 y   A  2 
 3

 3 

1
2
K=2
K=1
3
K=3
 y2 
y 
 3
Figure 2.5: y represents the measurements at the end hosts.  j represents the parameter to
estimate over j-th link.
Another important point of the linear model is the time dependence. In fact, all the
variables of the Equation 2.1 are time-dependent. This scenario reflects the dynamical
nature of the real network, which increases the complexity of the described model.
yt = A xt + t
(2.4)
The matrix A depends on time, too, but can also be assumed independent in the first
approximation under the hypothesis of knowledge of the topology of the network. That is
why the estimation problem involves tracking time-dependent parameters. yt becomes a
vector of measurements recorded at given time t at a number of different measurement
sites.
10
2 Tomography of Network, Basics
The target of the inference strategy is to estimate x, and then the respective dependent
parameter . This is a specific case of inverse problem. But the real dimension of A can
range depending on different sizes of the network. In a small network, such as shown in
Figure 2.4 , A is a 2 x 3 dimension matrix, with two packet parameters and two
measurements sites. But A can reach up to ten thousands of rows and columns for a large
subnet of the Internet. The methods to solve this inverse problem depend not only on the
matrix A, but also on the nature of the error . Its components are assumed to be
independent, with Gaussian, Poisson, binomial or multinomial distribution. The nature of
this error can change the inference strategy, either reducing or increasing the inverse
problem. Gaussian distributed noise with covariance that is independent of A allows to
use the recursive linear least squares methods and other iterative equations solvers. This is
the most simple case for computation, but is also the most rare one. In fact, usually the
error is Poisson, binomial and multinomial distributed and more sophisticated statistical
approach is required, as, for example, the use of non-linear least squares methods,
maximum likelihood approach with expectation-maximization algorithm (EM) or
maximum at posteriori algorithm (MAP)[3]. This recursive algorithm requires the inverse
of matrix A, but usually A is not of full rank (for example, in Equation 2.2) which
increases the mathematical computation. This scenario shows the degree of complexity of
the inference field and the necessity to choose the right strategy for parameter estimation.
In fact, the most important is to reach the trade-off between statistical accuracy and
computational overhead. The statistical accuracy is required to obtain an estimate error
within the prefixed bound, without too complicated computational efforts.
Another approach of inference strategy is the Maximum Likelihood Estimation (MLE).
MLE gives good results in terms of efficiency but is too difficult to calculate [3,4]. That is
why it has been recently replaced with the maximum pseudo likelihood estimation
(MPLE)[1,5]. This approach reduces the computational burden and provides the good
statistical efficiency.
The basics of the inference model will be described in next paragraph and this approaches
will be described.
2.6 Inference Model
The goal of the inference model is to attribute a value to an unknown parameter using a
statistics of measurement. The estimates are necessary when the parameter cannot directly
be measured. Usually, the parameter is, as in the example described above, a
multidimensional vector of parameter. Therefore, the goal will be to estimate the vector of
these parameters. It is preferable first to study a general one-dimension parameter estimate
and then to extend these notions to the multidimensional case.
11
Emanuele Orlando
Let  the real parameter to estimate and Y the measurement, the estimate of which is being
constructed. Y is a vector of N samples (at different time) y  ( y1, y2 ,.., yN ) or a
continuous observation within a given time interval. The target is to give a value to the
parameter using N samples y  ( y1, y2 ,.., yN ) of the integer observation. Each
component yn , the sample of the measurement, represents identical random variable. It is
possible to define the probability density function f (y; ), which expresses the dependence
of data y from the parameter .
Figure 2.6 depicts a general estimate process. A source of parameters generates a vector ,
identifying a point in the parameters space. This means that the target of the inference
estimate has been fixed. The data measurements Y are points of the observations space.
Each measurement generates a specific point in this space. The function f(y; ) expresses
the statistic connection between these spaces. Using these data the inference strategy
should be focused to discover the value of the parameters adopting an estimate rule
θ̂ =g(y). Function f(y; ) plays a very important role in this application. This function
represents the probability density of data if the parameter to estimate varies. It is actually a
group of density probability functions, because each of them corresponds to a different
value of the parameter. The knowledge of this function allows to apply the method of
statistic inference called Maximum Likelihood Estimation method (MLE).
f (Y ;  )
Parameters
source
Estimator

Y
Parameters
space
Observations
space
ˆ  g ( y )
Figure 2.6: General estimate process. A parameter source generates a vector of
parameters to estimate, defined in a parameter space. The measurements define
the linking function f(Y;) between two space to obtain the estimator of the
vector of parameters [3].
Given a sample of measurement a point in the observation space is defined (Figure 2.6).
After this observation the estimate can be obtained by choosing the value of the parameter
which has given the measurement y with the highest probability.
The Maximum Likelihood Estimate (MLE) is the estimated value ˆ , at which the sample of
measurement y  ( y1, y2 ,.., yN ) gives the maximum probability density function f(Y=y; ).
When Y is observed, f(Y=y; )=L() becomes the function of the only parameter  to
estimate. It represents the maximum likelihood function which is to be maximized to obtain
12
2 Tomography of Network, Basics
the value ˆ . Under the independence hypothesis of the random variables yn , the maximum
likelihood function is
n
L( )   f ( xi ; )
i1
(2.5)
and the maximum likelihood estimate is the parameter which maximizes this function
ˆ  arg L( )

(2.6)
where arg represents the argument of the function L().
The Figure 2.7 depicts the case of a mean estimate of a random variable with normal
distribution. For the given measurement there should be defined the interval where the
parameter which gives the maximum of L() has to be chosen.
f(x;)
1
2
x
x+x
3

Figure 2.7. If the measurements are contained in (x, x+x), the estimate more
likelihood of the parameter  is ˆ   2 [3].
The maximum likelihood estimate gives statistically good results and it is characterized by
the following basic properties: Unbiased [3], Efficient[3], Consistent[3].


 Unbiased : E ˆ  X    . An estimator is unbiased if the average on the number
experiments of ˆ  X  gives a result equal to the true value of .
 Efficient: An estimator is efficient if it is unbiased and its variance reaches the CramerRao Bound (CRB).
13
Emanuele Orlando
 Consistent: lim E{xˆ}   and lim Var{xˆ}  0 . An estimator is consistent if it improves
N 
N 
the performance of its estimation when the number of samples increases.
In the multidimensional approach the parameter is replaced by a vector of parameter
  (1,2 ,.., J ) . The measurements are the components of the vector Y  (Y1, Y2 ,.., YJ ) .
This approach is basically the same, although the mathematical efforts increase. Maximum
likelihood estimate represents the unknown location of the point in the multidimensional
parameter space, given the vector of vectors Y of the measurements. In this context
maximum likelihood function is a multidimensional one, depending on the vector of
parameters . ˆ represents the set of parameters which have given the measurement
y = ( y1 , y2 , .., yJ ) with the higher probability. In the described application vector ˆ can
represent the result of an inference strategy in a link-level delay estimation. Each
component θj is the delay estimate on j-th link.
Now it is important to link maximum likelihood approach to the practical application and
to introduce a variant called Maximum Pseudo Likelihood Estimate (MPLE). This approach
has the same statistical characteristics as the maximum likelihood, but is much easier to
compute, especially by adopting a linear model.
2.7 Maximum-Pseudo Likelihood Approaches
Consider the linear model of the Equation 2.1 omitting the error .
Y= A X
(2.7)
X  ( X1, X 2 ,.., X J ) is a J-dimensional vector. Each component represents the dependence
of the parameter to estimate through the density function f j
X j  f j ( j )
j=1,..,J
(2.8)
The goal is to estimate networks dynamic parameters, such as link delay or traffic flow
counts, obtaining X and then, if f j is known, to estimate the j-th parameter. In this
application the independence of all the components of X is required. This is the necessary
assumption for the application of likelihood approaches.
Y  (Y1, Y2 ,..YI ) is a I-dimensional vector of measurements. A is the routing matrix, as
shown in Figure 2.4. The example of this application is demonstrated in Figure 2.8. This is
an arbitrary multicast tree with four receivers. It is multicast, because in this application
root sends a packet probing to all the receivers[6]. In this case it can be noticed in Figure
2.9 how the Equation 2.7 can be applied. Each component Yj represents the one way delay
14
2 Tomography of Network, Basics
measured from the root to node j. The connection between node i and its parent defines link
i. Matrix A is given, because the topology of the network is known. It is obtained with a
spanning tree algorithm. It is a I x J matrix, so there are I receivers and J internal links.
Therefore, Y1,..,Y4 are the measurements (at nodes 4-7), and X1,..X7 are the delays over
internal links 1-7.
root
1
2
4
3
5 6
7
Figure 2.8: A multicast tree, with four receivers. A root sends a packet to all
four receivers.
root
Path1
x3f(3)
1 Lin
2
4
k3
 Y1  1
  
 Y2   1
 Y3  1
 Y  1
 4 
3
5 6
7
1 0 1 0 0
1 0 0 1 0
0 1 0 0 1
0 1 0 0 0
 X1 
 
X2
0 
 X3 
0 
X4
0 
 X5 
1   
 X6 
X 
 7
Y1=y1
Figure 2.9: A path is defined from the root to the end host . It is possible to
obtain four paths. Each link measurements is represented by xi .
The goal of the maximum likelihood strategy is to estimate the components of , given the
measurement Y and using the dependence of the Equation 2.7 and the Equation 2.8. From
the probability density f(Y; ) for this application in Figure 2.8 the maximum likelihood
function is obtained when the measurement has been fixed.
15
Emanuele Orlando
ˆ  arg L( y1 , y2 , y3 , y4 ;θ )
(2.9)
The main problem is that a multidimensional maximization requires a lot of efforts when
the number of variables to estimate increases. This example is a simplified case with seven
unknown values. If this strategy will be applied to a bigger network, the maximization
algorithm will become more complex. Estimating the vector ˆ means, in fact, resolving a
multidimensional maximization problem, the essence of which is to find a
multidimensional vector with components that represent a point of maximum. Usually this
research is conducted using the iterative algorithm, such as Expectation Maximum
algorithm [7,8]. After a number of steps the algorithm reaches a stationary point, which
does not change from one step to another. It represents the result of the maximization. The
problem is contained in the number of steps required from the algorithm. If the dimension
of the vector to estimate increases, the number of steps increases, too. Besides, there is an
inevitable loss of accuracy and it is more relevant when the dimension of the vector
increases. If the estimate is required for a real time service, this application cannot be used.
The requested time is too long and depends on the dimension of the network. To be less
restricted by these boundaries a variant of the maximum likelihood function, which is
called pseudo maximum likelihood function, should be introduced.
The main idea of this method is to decompose the original model into a series of simpler
subproblems and to construct the pseudo-likelihood function by multiplying the marginal
likelihoods of such subproblems. One of them is obtained by selecting the pairs of rows
from the routing matrix A. Let S be the set of subproblems obtained by selecting all the
possible pairs of rows from the I x J matrix A.
A: S ={ s= (i1,i2) : 1 i1 i2 I}
(2.10)
For each subproblem sS the observed subvector is obtained.
Y S  AS X S
(2.11)
 Yi 
1
s
Y   
Yi
(2.12)
 2 
X S is the vector which contains the variables to estimate for the subproblem s using the
Equation 2.11. AS represents the subrouting matrix selecting rows i1 and i2 . For each s let
l s (Y s ; θ s ) the marginal likelihood function with θ s the parameters of s. The hypothesis of
independence of subproblems defines the pseudo-likelihood function as the product of the
16
2 Tomography of Network, Basics
marginal likelihood functions of all the subproblems. The Equation 2.13 shows the pseudolikelihood function.
N
LP ( y1,..., yN ; )    f s ( yts ; s )
n1 sS
(2.13)
Let l s ( yts ; s )  log f s ( yts ; s ) the log-likelihood function of subproblem s, the pseudo loglikelihood function is defined as
N
LP ( y1,..., yN ; )    l s ( yts ; s )
n1 sS
(2.14)
Maximum of this function represents the Maximum Pseudo-Likelihood Estimation (MPLE)
of parameters  . The Pseudo function allows to divide the analyses of the global parameter
 in some marginal analyses, which is the great advantage of this method. In fact, it allows
to conduct a number of simpler computation, as, for example, to investigate several
subproblems, instead of conducting a complex investigation. Using pseudo-EM algorithm,
which is a variant of EM-algorithm, it is possible to maximize pseudo likelihood
function[9].
Let l s ( X s ; θ s ) be the log-likelihood function of the subproblem s if all the data are given.
It is the likelihood function of the dependence in the Equation 2.8.
Let  (k ) be the estimate of  obtained in the k-th step. The iterative function Q( , ( k ) ) is
maximized in k+1 steps, until it reaches a stationary value of  . It is defined as
N
Q( , (k ) )    E
s(k )
sS n1 
(l s ( xts ; s ) | yns )
(2.15)
The investigation of this maximum multiplies many subexpectations. A subexpectation
represents the expectation of a subproblem. This is the main advantage of MPLE. The right
choice of the starting point for this iterative algorithm makes it easy to reach faster the
estimate.
Figure 2.10 shows how a marginal likelihood function is obtained. A subproblem consists
in the choice of a subtree considering only two receivers. If I is the number of end
receivers, there are I(I-1)/2 subtrees. Configuration of a subtree is equivalent to selecting
two rows of the matrix A. Figure 2.11 can be used as an example for a larger network. The
the hypothesis of independence, it is necessary to come back to the global model only to
17
Emanuele Orlando
root
i2
1
2
4
 Y1  1
  
 Y2   1
 Y3  1
  
 Y4  1
3
5
As
6
i1
7
1 0 1 0 0
1 0 0 1 0
0 1 0 0 1
0 1 0 0 0
Y s  As X s
As
 X1 
 
X2
0 
 X3 
0 
X4
0 
 X5 
1   
 X6 
X 
 7
S ={ s= (i1,i2) : 1 i1 i2 I}
Figure 2.10: A subproblem is identified selecting two rows of the matrix A. This
choice is equivalent to identify two end receivers.
multiply all the subtrees. This assumption is required, even if it does not represent the
reality, where usually the subproblems are dependent.
root
1
2
3
4
root
5 6
7
1
3
Figure 2.11: When two end receivers are chosen a subtree is identified, and it is
more simple to calculate the marginal likelihood function for this subproblem.
18
2 Tomography of Network, Basics
2.8 Computational Complexity
Usually maximum likelihood method is infeasible for real size networks. It observes all
possible internal delay vectors X. The conducted studies show that the computational
complexity grows at a non-polynomial rate [1]. The Pseudo likelihood method, on the
contrary, reduces this augment. To compare both methods it is necessary to apply them to a
small network as shown in the Figure 2.8. In this case the performances of MLE and MPLE
are comparable. MPLE is more robust than MLE, because it has a smaller standard
deviation of the error generated in each estimate. This is a consequence of the partition
adopted in the pseudo method. The Pseudo function, in fact, is a product of less complex
likelihood function on subproblems. N is the number of edge nodes, where the
measurements are made, and complexity is the number of operations to be computed. The
complexity of MLE is proportional to N 5 and, in case of MPLE, to N 3.5 [1]. For this
reason a small network can be compared by complexity, but not a larger one.
Another performance to test is the computational speed. MPLE is faster than MLE, because
a matrix of a subproblem has a smaller dimension than the total matrix. This difference in
time increases when the number of edge nodes increases. Pseudo likelihood is a variant of
the normal likelihood model, and it becomes really important when the size of the network
increases. In a LAN (Local Area Network) the models are equivalent. In this work there is
described the likelihood model with a maximum expectation algorithm to obtain a linkdelay estimation on a LAN. To introduce the both of them it is necessary to have a global
vision of this scenario. It can be a valid variant whenever the dimension of the network to
analyze is huge.
Besides, the goal is to use a model without any a priori information, which could be helpful
for the estimation. This is the case of the MAP model (Maximum a Posteriori)[3].
Measurement from the edges will be then the only available information.
19
Emanuele Orlando
20