IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 60, NO. 3, MARCH 2012
Longitudinal-Partitioning-Based Waveform
Relaxation Algorithm for Efficient Analysis of
Distributed Transmission-Line Networks
Sourajeet Roy, Student Member, IEEE, Anestis Dounavis, Member, IEEE, and Amir Beygi, Student Member, IEEE
Abstract—In this paper, a waveform relaxation algorithm is presented for efficient transient analysis of large transmission-line networks. The proposed methodology represents lossy transmission
lines as a cascade of lumped circuit elements alternating with lossless line segments, where the lossless line segments are modeled
using the method of characteristics. Partitioning the transmission
lines at the natural interfaces provided by the method of characteristics allows the resulting subcircuits to be weakly coupled by
construction. The subcircuits are solved independently using a proposed hybrid iterative technique that combines the advantages of
both traditional Gauss–Seidel and Gauss–Jacobi algorithms. The
overall algorithm is highly parallelizable and exhibits good scaling
with both the size of the network involved and the number of CPUs
available. Numerical examples have been presented to illustrate
the validity and efficiency of the proposed work.
Index Terms—Convergence analysis, delay, longitudinal partitioning, transient simulation, signal integrity, transmission line,
waveform relaxation.
I. INTRODUCTION
WITH the constant increase in operating frequencies,
interconnects need to be modeled as distributed
transmission lines for accurate signal integrity analysis of
modern integrated circuits (IC) [1]. Accurate modeling of large
distributed networks using commercial circuit solvers with integrated circuit emphasis (like SPICE) requires significant central
processing unit (CPU) time and memory, thereby making
them computationally prohibitive for fast transient simulation.
The waveform relaxation (WR) algorithm has emerged as an
attractive technique to reduce the simulation costs of such large
networks [2]–[23]. Typically, waveform relaxation attempts to
break a large circuit into smaller subcircuits that can be solved
iteratively in sequence or in parallel. Each iteration involves an
exchange of voltage/current waveforms between the subcircuits
for the response to converge to the actual solution.
Manuscript received September 26, 2011; accepted November 21, 2011. Date of publication January 18, 2012; date of current version March 02, 2012. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada, Canada Foundation for Innovation, Canadian Microelectronics Corporation and Ministry of Research and Innovation - Early Research Award.
The authors are with the Department of Electrical and Computer Engineering, University of Western Ontario, London, ON, Canada N6A 5B9 (e-mail: [email protected]; [email protected]; [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMTT.2011.2178261
Presently, two approaches exist for application of waveform relaxation to transmission line networks. One such approach is the transverse partitioning scheme [11]–[14] where multiconductor transmission lines (MTLs) are partitioned into single lines by assuming weak capacitive and inductive coupling between the lines. The coupling between the lines is represented as time-domain relaxation sources introduced into the circuit model of each line.
An alternative waveform relaxation algorithm is based on
longitudinal partitioning of the network into repeated subcircuits [4]–[8], [10], [16]. While longitudinal partitioning
schemes based on the generalized method of characteristics
(MoC) have been reported in [4]–[8], more recent works [16]
have focused on partitioning the line based on segmentation
models such as the conventional resistive-inductive-conductive-capacitive (RLGC) lumped model [24]. Partitioning
techniques based on segmentation models have a common
limitation that since each segment directly feeds into the next
segment, the adjacent segments are strongly coupled in physical
space. This is reflected in the fact that blindly partitioning the
conductor between segments requires resolving the stringent
Dirichlet’s transmission condition across the partition and
consequently exhibits poor convergence [16]. The work of
[16] accelerated the convergence of the WR algorithm by
artificially exchanging additional voltage/current waveforms
(i.e., increasing the overlap between subcircuits) followed by
optimization routines.
More recently, in [25], a WR algorithm based on the delay
extraction-based passive compact transmission-line (DEPACT)
segmentation model [26], [27] was presented for two conductor
transmission-line networks. The DEPACT model represents
lossy transmission lines as a cascade of lumped circuit elements
alternating with lossless line segments where the lossless line
segments are realized in the time domain using the MoC [24],
[28]. The work of [25] exploited the inherent weak coupling
across the natural interfaces provided by the MoC [4]–[8] to
longitudinally partition the transmission line at these interfaces
into smaller, disjoint subcircuits. The iterative solution of the
subcircuits was performed using the sequential Gauss–Seidel
(GS) technique and was shown to naturally achieve fast convergence without the need of any artificial exchange of waveforms
or optimization techniques as proposed in [16].
This work extends the concepts of [25] to multiconductor
transmission-line systems. Furthermore, the efficiency of the
proposed algorithm for any general transmission-line network
(two conductor or multiconductor) has been investigated on parallel processing-based platforms. To this end, two highly parallelizable iterative techniques have been implemented—the traditional Gauss–Jacobi (GJ) and a novel hybrid technique that
combines the complementary features of Gauss–Seidel (GS) and
the Gauss–Jacobi (GJ). This hybrid technique exhibits superior convergence properties when compared to the traditional
GJ algorithm while maintaining its high parallelizability with
respect to the number of CPUs available. In addition, a mathematical framework has been provided to demonstrate the scalability of the algorithm with respect to both the size of the network involved and the number of CPUs available for parallel
processing. Numerical examples have been provided to illustrate the validity and efficiency of the proposed WR algorithm
over full SPICE simulations.
The paper is organized as follows. Section II deals with the
background of waveform relaxation algorithms and concludes
with a review of the DEPACT model [26], [27]. Section III
presents the details of the proposed algorithm and Section IV describes the mathematical framework for analyzing the computational cost of the proposed work. The numerical examples and
conclusions are presented in Sections V and VI, respectively.
II. BACKGROUND AND DEPACT MODEL
In order to explain the contributions of the proposed work,
here we briefly discuss the background of general waveform
relaxation algorithms followed by a review of the DEPACT
model.
A. Background of Waveform Relaxation Algorithms
Waveform relaxation, from its introduction in [2], has proven
to be an attractive algorithm to address the issue of exorbitant
computational costs for solving large networks using traditional
circuit solvers like SPICE. The algorithm is based on partitioning large networks into smaller subcircuits where the coupling between the subcircuits is represented using time-domain
relaxation sources introduced into each subcircuit. Assuming an
initial guess for the waveforms of the relaxation sources, the
subcircuits are solved independently. The present solution of
the subcircuits is then used to update the relaxation sources for
the next iteration. This process is repeated until the error between two successive iterations falls within a prescribed error
tolerance. Solving the individual subcircuits using modern parallel processing resources has allowed the utilization of multiprocessor hardware and provided significant CPU savings in
memory and time compared with traditional full circuit simulation [14]. It is noted that the main limitation of relaxation algorithms is the speed of convergence of the iterations. Several
methods have been reported to speed up convergence, such as
time windowing [3], overlapping subdomains [22], [23], and optimization [16], [22].
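The iterate-and-update cycle described above can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation: two scalar "subcircuits" with hypothetical coefficients are each integrated over the whole time window with forward Euler, using the other's waveform from the previous iteration as its relaxation source, until two successive iterations agree within a tolerance.

```python
def solve_subcircuit(a, coupling, drive, other, h):
    """Integrate x' = a*x + coupling*other(t) + drive with forward Euler.

    `other` is the neighbour's waveform from the previous iteration and
    plays the role of the relaxation source.
    """
    x = [0.0]
    for n in range(len(other) - 1):
        x.append(x[n] + h * (a * x[n] + coupling * other[n] + drive))
    return x


def waveform_relaxation(steps=200, h=0.01, tol=1e-9, max_iter=100):
    """Gauss-Jacobi style relaxation over the full time window."""
    x = [0.0] * (steps + 1)  # initial guess for both waveforms
    y = [0.0] * (steps + 1)
    for iteration in range(1, max_iter + 1):
        # Both solves only use waveforms of the previous iteration,
        # so they are independent of each other.
        x_new = solve_subcircuit(-2.0, 1.0, 1.0, y, h)
        y_new = solve_subcircuit(-2.0, 1.0, 0.0, x, h)
        error = max(
            max(abs(p - q) for p, q in zip(x_new, x)),
            max(abs(p - q) for p, q in zip(y_new, y)),
        )
        x, y = x_new, y_new
        if error < tol:
            return x, y, iteration
    raise RuntimeError("waveform relaxation did not converge")


def direct_solution(steps=200, h=0.01):
    """Reference: the same two equations integrated as one coupled system."""
    x, y = [0.0], [0.0]
    for n in range(steps):
        x.append(x[n] + h * (-2.0 * x[n] + y[n] + 1.0))
        y.append(y[n] + h * (-2.0 * y[n] + x[n]))
    return x, y


x_wr, y_wr, iterations = waveform_relaxation()
x_ref, y_ref = direct_solution()
```

The relaxed waveforms agree with the coupled reference to within the tolerance; the number of iterations needed is the convergence cost that the acceleration techniques cited above try to reduce.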
B. Review of DEPACT Model
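A general coupled MTL system for quasi-transverse electromagnetic (TEM) mode of propagation is described by the Telegrapher's partial differential equations [24]
$$\frac{\partial}{\partial z}\mathbf{V}(z,s) = -\left(\mathbf{R}(s) + s\mathbf{L}(s)\right)\mathbf{I}(z,s), \qquad \frac{\partial}{\partial z}\mathbf{I}(z,s) = -\left(\mathbf{G}(s) + s\mathbf{C}(s)\right)\mathbf{V}(z,s) \tag{1}$$
where $\mathbf{V}(z,s)$ and $\mathbf{I}(z,s)$ represent the spatial distribution of the voltage and current along the longitudinal direction $z$, and $\mathbf{R}(s)$, $\mathbf{L}(s)$, $\mathbf{G}(s)$, and $\mathbf{C}(s)$ are the frequency-dependent resistive, inductive, conductive, and capacitive per-unit-length (p. u. l.) parameters of the line, respectively. The solution of the above equations can be written as an exponential matrix function [29], [30] as
$$\begin{bmatrix}\mathbf{V}(l,s)\\ \mathbf{I}(l,s)\end{bmatrix} = e^{(\mathbf{A}(s)+s\mathbf{B})l}\begin{bmatrix}\mathbf{V}(0,s)\\ \mathbf{I}(0,s)\end{bmatrix} \tag{2}$$
where $l$ is the length of the line,
$$\mathbf{A}(s) = -\begin{bmatrix}\mathbf{0} & \mathbf{R}(s)+s\left(\mathbf{L}(s)-\mathbf{L}_\infty\right)\\ \mathbf{G}(s)+s\left(\mathbf{C}(s)-\mathbf{C}_\infty\right) & \mathbf{0}\end{bmatrix}, \qquad \mathbf{B} = -\begin{bmatrix}\mathbf{0} & \mathbf{L}_\infty\\ \mathbf{C}_\infty & \mathbf{0}\end{bmatrix} \tag{3}$$
and $\mathbf{L}_\infty$ and $\mathbf{C}_\infty$ are the p. u. l. inductive and capacitive parameters at the maximum frequency of interest $f_{\max}$. Typically, the solution of (2) does not have an exact time domain counterpart and hence segmentation based modeling techniques [26], [27], [29]–[34] are generally used to derive an equivalent time domain expression of (2). Of these segmentation algorithms, the DEPACT is suitable for electrically long transmission lines due to the fact that it explicitly extracts the delay of the network, leading to a smaller number of lumped segments.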
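However, extracting the delay terms $e^{s\mathbf{B}l}$ from $e^{(\mathbf{A}+s\mathbf{B})l}$ is not a trivial task since the matrices $\mathbf{A}$ and $\mathbf{B}$ do not commute (i.e., $\mathbf{A}\mathbf{B} \neq \mathbf{B}\mathbf{A}$). To approximate $e^{(\mathbf{A}+s\mathbf{B})l}$ in terms of a product of exponentials, a modified Lie product [35] is used as
$$e^{(\mathbf{A}+s\mathbf{B})l} \approx \prod_{k=1}^{m} e^{\mathbf{A}\frac{l}{2m}}\, e^{s\mathbf{B}\frac{l}{m}}\, e^{\mathbf{A}\frac{l}{2m}} \tag{4}$$
where $m$ is the number of sections. The associated error of the approximation scales as $O(1/m^2)$ [34] (i.e., (4) quickly converges to the exponential matrix of (2) with increase in the number of sections $m$). Equation (4) provides a methodology of discretizing the transmission line into a cascade of alternating subsections with the individual stamps of $e^{\mathbf{A}l/2m}$ and $e^{s\mathbf{B}l/m}$, as illustrated in Fig. 1 (for single lines) and Fig. 2 (for MTLs).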
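The second-order convergence of a symmetric product of exponentials can be checked numerically. The sketch below is an illustration only: it uses two hypothetical, non-commuting real 2×2 matrices in place of the line matrices (with the Laplace variable and line length set to one) and a truncated Taylor series as the matrix exponential, which is adequate for matrices of this small norm.

```python
def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(x, c):
    return [[c * v for v in row] for row in x]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def expm(x, terms=40):
    """Matrix exponential by truncated Taylor series (small-norm inputs only)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = scale(matmul(term, x), 1.0 / n)
        result = add(result, term)
    return result


def symmetric_product(a, b, m):
    """Cascade of m sections, each exp(A/2m) * exp(B/m) * exp(A/2m)."""
    half_a = expm(scale(a, 0.5 / m))
    full_b = expm(scale(b, 1.0 / m))
    section = matmul(matmul(half_a, full_b), half_a)
    total = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(m):
        total = matmul(total, section)
    return total


def max_error(a, b, m):
    exact = expm(add(a, b))
    approx = symmetric_product(a, b, m)
    return max(abs(exact[i][j] - approx[i][j])
               for i in range(2) for j in range(2))


# Hypothetical stand-ins for the "loss" and "delay" matrices; A*B != B*A.
A = [[0.0, -0.5], [-0.3, 0.0]]
B = [[0.0, -1.0], [-2.0, 0.0]]
errors = {m: max_error(A, B, m) for m in (4, 8, 16)}
```

Doubling the number of sections should reduce the error by roughly a factor of four, which is the behaviour expected of an $O(1/m^2)$ approximation.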
The exponential matrix $e^{\mathbf{A}l/2m}$ represents the attenuation losses of the transmission line. Since $e^{\mathbf{A}l/2m}$ does not contain the delay terms associated with $\mathbf{L}_\infty$ and $\mathbf{C}_\infty$, it can be approximated by a low-order rational function, which in turn can be realized in SPICE using either lumped RLC elements or lumped dependent sources [26], [27]. As a result, the subsections with stamps of $e^{\mathbf{A}l/2m}$ are replaced by a macromodel referred to as “lumped circuit elements” in Figs. 1 and 2. On the other hand, the matrix $e^{s\mathbf{B}l/m}$ contains only $\mathbf{L}_\infty$ and $\mathbf{C}_\infty$ and can be modeled as a lossless line using the MoC [24], [28]. As a result, the subsections with stamps of $e^{s\mathbf{B}l/m}$ are replaced by the equivalent MoC circuit [24], [28] in Figs. 1 and 2. More detailed derivations of a SPICE realization of the DEPACT model of (4) have been provided in [26] and [27]. The rational macromodel describing the lossy sections and the MoC equations describing the lossless sections both enjoy exact representations in the time domain and together approximate the frequency domain solution of (2) as a set of delayed ordinary differential equations in the time domain which can be solved by SPICE.
ROY et al.: WAVEFORM RELAXATION ALGORITHM FOR EFFICIENT ANALYSIS OF TRANSMISSION-LINE NETWORKS
Fig. 1. SPICE equivalent circuit of a two conductor transmission line using DEPACT.
Section III discusses the development of the proposed WR
algorithm based on the DEPACT model of (4).
III. DEVELOPMENT OF PROPOSED ALGORITHM
Here, we begin by describing the proposed longitudinal partitioning scheme for single lines and the methodology to iteratively solve the subcircuits. From this discussion, the algorithm
is extended to MTLs.
A. Proposed Partitioning Scheme for Single Lines
The DEPACT model of (4) provides a methodology to discretize two conductor transmission lines into an alternating cascade of lossy and lossless line segments (Fig. 1). To better explain the proposed partitioning methodology, consider the equations for the $k$th lossless line segment in Fig. 1 given as follows:
$$\begin{bmatrix} V_{2,k}(s) \\ -I_{2,k}(s) \end{bmatrix} = e^{s\mathbf{B}l/m} \begin{bmatrix} V_{1,k}(s) \\ I_{1,k}(s) \end{bmatrix} \tag{5}$$
where $V_{1,k}$, $V_{2,k}$ are the near and far end voltages, respectively, and $I_{1,k}$, $I_{2,k}$ are the near and far end currents, respectively, of the $k$th lossless line segment (both currents taken as flowing into the segment). Using simple algebraic manipulations on (5) followed by converting the resultant equations into the time domain provides the following MoC relation [24], [26], [27]:
$$\begin{aligned} v_{1,k}(t) - Z_0\, i_{1,k}(t) &= w_{1,k}(t), & w_{1,k}(t) &= 2v_{2,k}(t-\tau) - w_{2,k}(t-\tau) \\ v_{2,k}(t) - Z_0\, i_{2,k}(t) &= w_{2,k}(t), & w_{2,k}(t) &= 2v_{1,k}(t-\tau) - w_{1,k}(t-\tau) \end{aligned} \tag{6}$$
where $Z_0 = \sqrt{L_\infty/C_\infty}$ and $\tau = (l/m)\sqrt{L_\infty C_\infty}$ are the characteristic impedance and the delay of each lossless section, respectively. The MoC equations of (6) can be realized by the simple circuit equivalent of Fig. 1. From Fig. 1, it is observed that the MoC provides natural interfaces across which information is exchanged using the time delayed equations of (6) rather than the more stringent Dirichlet’s transmission conditions. As a result, partitioning the transmission lines at these interfaces as shown in Fig. 3 was found to yield reliably efficient convergence without the need for artificial overlap of subcircuits and optimization like [16]. From (6), it can be further concluded that the delayed sources $w_{1,k}(t)$, $w_{2,k}(t)$ serve as the relaxation sources responsible for ensuring the coupling between the subcircuits for the proposed WR algorithm. The next section describes the methodology to iteratively solve the subcircuits and update the relaxation sources.
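How delayed sources carry information across a lossless section can be seen in a small time-stepping sketch. This is an illustration in the standard Branin form of the MoC, not the paper's SPICE realization: the 1 V step source, the resistive terminations, the delay expressed in whole time steps, and all names are assumptions, and both port currents are taken as flowing into the line.

```python
def simulate_lossless_line(z0=50.0, rs=50.0, rl=50.0, delay_steps=5, steps=20):
    """Step response of a lossless line driven through rs and loaded by rl."""
    v1, v2, w1, w2 = [], [], [], []
    for n in range(steps):
        if n >= delay_steps:
            # Each delayed source depends only on waveforms from one delay ago
            # at the opposite end of the line.
            w1_n = 2.0 * v2[n - delay_steps] - w2[n - delay_steps]
            w2_n = 2.0 * v1[n - delay_steps] - w1[n - delay_steps]
        else:
            w1_n = 0.0
            w2_n = 0.0
        # Near end: 1 V step behind rs, combined with v1 - z0*i1 = w1.
        i1 = (1.0 - w1_n) / (rs + z0)
        # Far end: load rl (v2 = -rl*i2), combined with v2 - z0*i2 = w2.
        i2 = -w2_n / (rl + z0)
        v1.append(w1_n + z0 * i1)
        v2.append(w2_n + z0 * i2)
        w1.append(w1_n)
        w2.append(w2_n)
    return v1, v2


v1_matched, v2_matched = simulate_lossless_line()
v1_mismatch, v2_mismatch = simulate_lossless_line(rl=150.0)
```

With matched terminations the far end stays at zero until one delay has elapsed and then settles at half the source voltage. With a 150 Ω load the reflection reaches the near end after two delays. In both cases the two ends interact only through the delayed sources, which is what makes these interfaces natural partitioning points.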
B. Iterative Solution of Subcircuits for Single Lines
Typically, two techniques exist for the iterative solution of the subcircuits: the Gauss–Seidel (GS) and the Gauss–Jacobi (GJ) techniques. According to the GS technique, the $r$th iterative solution of any $k$th subcircuit requires the present ($r$th) solution of all of the preceding $(k-1)$ subcircuits as well. This translates to a sequential solution of the subcircuits where all of the relaxation sources are updated after solution of each individual subcircuit [3]. On the other hand, according to the GJ iterative technique, the $r$th iterative solution of any $k$th subcircuit requires only the previous ($(r-1)$th) solution of all subcircuits. This corresponds to a possible parallel solution of the subcircuits where the relaxation sources are only updated when the solution of all subcircuits is complete [3]. The above discussion shows that the GS technique involves $m+1$ updates or exchanges of information per iteration, where $m+1$ is the number of subcircuits, compared with GJ that involves only one exchange of information. Thus, GS exhibits better convergence than GJ [15]. However, a potential drawback of GS is that it does not naturally lend itself to parallel processing like the GJ technique since the present solution of any $k$th subcircuit is dependent on the present solution of all previous $(k-1)$ subcircuits.
In [25], a sequential GS iterative technique to solve the subcircuits was implemented. In this work, with the focus being on highly parallelizable iterative techniques, two schemes are proposed: first, the traditional GJ technique, followed by a hybrid technique that combines the complementary features of GS and GJ.
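The trade-off between the two update orders can be illustrated with an algebraic analogy. The sketch below is not the waveform setting itself: each "subcircuit" is reduced to one unknown in a hypothetical chain of linear equations in which only neighbours are coupled, and the two iterations differ only in whether updated values are used immediately (GS) or only after the full sweep (GJ).

```python
def neighbour_sum(x, k):
    left = x[k - 1] if k > 0 else 0.0
    right = x[k + 1] if k < len(x) - 1 else 0.0
    return left + right


def gauss_jacobi(n=8, diag=4.0, rhs=1.0, tol=1e-10, max_iter=500):
    """All unknowns are updated from the previous sweep (parallelizable)."""
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        new = [(rhs + neighbour_sum(x, k)) / diag for k in range(n)]
        change = max(abs(a - b) for a, b in zip(new, x))
        x = new
        if change < tol:
            return x, iteration
    raise RuntimeError("Gauss-Jacobi did not converge")


def gauss_seidel(n=8, diag=4.0, rhs=1.0, tol=1e-10, max_iter=500):
    """Each unknown immediately uses the newest values (sequential)."""
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        change = 0.0
        for k in range(n):
            updated = (rhs + neighbour_sum(x, k)) / diag
            change = max(change, abs(updated - x[k]))
            x[k] = updated
        if change < tol:
            return x, iteration
    raise RuntimeError("Gauss-Seidel did not converge")


x_gj, iterations_gj = gauss_jacobi()
x_gs, iterations_gs = gauss_seidel()
```

Both iterations reach the same solution, with GS needing fewer sweeps; the inner loop of GS, however, cannot be distributed over processors because each update waits for the one before it.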
Fig. 2. SPICE equivalent circuit of an MTL using DEPACT.
Fig. 3. Partitioning of single line into subcircuits for waveform relaxation.
1) Gauss–Jacobi (GJ): This discussion begins by considering a general two-conductor transmission line discretized into $m+1$ subcircuits, as illustrated in Fig. 3. Prior to beginning the $r$th iteration, it is assumed that the $(r-1)$th iteration has been completed for all $m+1$ subcircuits and waveforms of all of the relaxation sources have been updated to $w_{1,k}^{(r)}(t)$, $w_{2,k}^{(r)}(t)$, $1 \le k \le m$. For $r = 1$, the waveforms of the relaxation sources, $w_{1,k}^{(1)}(t)$, $w_{2,k}^{(1)}(t)$, are simply the initial guess.
For the $r$th iteration, considering the $k$th subcircuit of Fig. 3, the corresponding relaxation sources with known waveforms $w_{2,k-1}^{(r)}(t)$, $w_{1,k}^{(r)}(t)$ serve as the input excitation. This translates to the following terminal conditions for the $k$th subcircuit:
$$v_{2,k-1}^{(r)}(t) - Z_0\, i_{2,k-1}^{(r)}(t) = w_{2,k-1}^{(r)}(t), \qquad v_{1,k}^{(r)}(t) - Z_0\, i_{1,k}^{(r)}(t) = w_{1,k}^{(r)}(t) \tag{7}$$
The terminal conditions of (7) along with the equations of the corresponding lumped circuit elements together form the set of ordinary differential equations describing the $k$th subcircuit, which can be solved for a self consistent solution of the waveforms $v_{2,k-1}^{(r)}(t)$, $v_{1,k}^{(r)}(t)$. It is noted that the relaxation sources of (7) (i.e., $w_{2,k-1}^{(r)}(t)$, $w_{1,k}^{(r)}(t)$) of each $k$th subcircuit are assumed to be known beforehand and, hence, considered independent of the present ($r$th) solution of the remaining $m$ subcircuits. This particular aspect allows the subcircuits to be solved in parallel on a multiprocessor machine.
Once all of the subcircuits are solved, the voltage waveforms $v_{1,k}^{(r)}(t)$, $v_{2,k}^{(r)}(t)$, determined from the present ($r$th) iteration, are used to update the relaxation sources for the future $(r+1)$th iteration using (6) as follows:
$$\begin{aligned} w_{1,k}^{(r+1)}(t) &= 2v_{2,k}^{(r)}(t-\tau) - w_{2,k}^{(r)}(t-\tau) \\ w_{2,k}^{(r+1)}(t) &= 2v_{1,k}^{(r)}(t-\tau) - w_{1,k}^{(r)}(t-\tau) \end{aligned} \qquad 1 \le k \le m \tag{8}$$
The total $2m$ equations of (8) required to update all of the relaxation sources, being decoupled, can be solved in parallel as well. Using the updated values of (8) as the new source waveforms for the next $(r+1)$th iteration, the subcircuits are solved again. This iterative cycle continues until the absolute error satisfies a predefined tolerance expressed as
$$\left\| \mathbf{v}^{(r)}(t) - \mathbf{v}^{(r-1)}(t) \right\| \le \varepsilon \tag{9}$$
where $\mathbf{v}^{(r)}(t)$ collects the voltage waveforms of the $r$th iteration and $\varepsilon$ is the predefined error tolerance.
2) Hybrid GS–GJ: To explain this contribution, the $m+1$ subcircuits of Fig. 3 are considered to be divided among two groups: group A containing the odd numbered subcircuits and group B containing the even numbered subcircuits, where the total number of subcircuits within each group is defined as
$$N_A = \frac{(m+1) + \operatorname{mod}(m+1,\,2)}{2} \ \text{(group A)}, \qquad N_B = \frac{(m+1) - \operatorname{mod}(m+1,\,2)}{2} \ \text{(group B)} \tag{10}$$
and $\operatorname{mod}(\cdot)$ represents the modulus function. Since, for the specific case of longitudinal partitioning, coupling exists between an odd-numbered and an even-numbered subcircuit only (and not between two odd-numbered or two even-numbered subcircuits themselves), the $r$th iterative solution of any subcircuit in any group is independent of the present ($r$th) solution of any other subcircuit within the same group and rather depends on the present ($r$th) solution of particular subcircuits within the opposite group. This coupling is addressed using a nested iterative technique. The outer iteration solves groups A and B in sequence (using GS), updating the relaxation sources after every group solution. The inner iteration solves the subcircuits within each group in parallel (using GJ). This forms the basis of the proposed hybrid iterative technique and is illustrated in Fig. 4.
Fig. 4. Hybrid GS–GJ iterative technique.
In each iteration, the GS sequence begins with group A before proceeding to group B. Hence, prior to beginning the $r$th iteration, it is assumed that the $(r-1)$th iteration has been completed for all subcircuits and those relaxation sources responsible for exciting only the odd numbered subcircuits (group A) in Fig. 3 have been updated to $w_{1,k}^{(r)}(t)$ (odd $k$) and $w_{2,k}^{(r)}(t)$ (even $k$), where $w_{1,k}$ and $w_{2,k}$ are the delayed sources at the near and far ends of the $k$th lossless section. If $r = 1$, the waveforms of the above relaxation sources are simply the initial guess. For the $r$th iteration, using the above relaxation sources with known waveforms as the input excitation to the corresponding subcircuits of group A, the $N_A$ subcircuits can be solved in parallel via the GJ technique explained in the previous section. Once the GJ is concluded, the voltage waveforms determined from the present ($r$th) iteration of group A are used to update the relaxation sources responsible for exciting only the even numbered subcircuits (group B) of Fig. 3 as
$$\begin{aligned} w_{1,k}^{(r)}(t) &= 2v_{2,k}^{(r)}(t-\tau) - w_{2,k}^{(r)}(t-\tau), & k \text{ even} \\ w_{2,k}^{(r)}(t) &= 2v_{1,k}^{(r)}(t-\tau) - w_{1,k}^{(r)}(t-\tau), & k \text{ odd} \end{aligned} \tag{11}$$
The equations of (11) can be solved in parallel, similar to (8).
The relaxation sources of (11) serve as the input for the corresponding subcircuits of group B, and the $N_B$ subcircuits can also be solved in parallel using the GJ technique. The voltage waveforms determined from the present ($r$th) iteration of group B are used to update the relaxation sources responsible for exciting only the subcircuits of group A for the future $(r+1)$th iteration as
$$\begin{aligned} w_{1,k}^{(r+1)}(t) &= 2v_{2,k}^{(r)}(t-\tau) - w_{2,k}^{(r)}(t-\tau), & k \text{ odd} \\ w_{2,k}^{(r+1)}(t) &= 2v_{1,k}^{(r)}(t-\tau) - w_{1,k}^{(r)}(t-\tau), & k \text{ even} \end{aligned} \tag{12}$$
The equations of (12) can be solved in parallel as well. The above iterative cycle continues until the absolute error of the iterations satisfies the error tolerance as in (9). It is noted that the hybrid technique provides more frequent exchange of waveforms using (11)–(12) compared with traditional GJ, which allows only a single exchange of (8). As a result, the hybrid technique exhibits better convergence than GJ. In Section III-C, the proposed algorithm is extended for MTLs.
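The odd/even grouping can be illustrated on the same kind of algebraic analogy as a chain of neighbour-coupled unknowns. This is a sketch under stated assumptions, not the authors' solver: each hypothetical "subcircuit" is a single linear equation, group A holds the odd-numbered ones, group B the even-numbered ones, and every update inside a group depends only on the opposite group.

```python
def neighbour_sum(x, k):
    left = x[k - 1] if k > 0 else 0.0
    right = x[k + 1] if k < len(x) - 1 else 0.0
    return left + right


def gauss_jacobi(n=9, diag=4.0, rhs=1.0, tol=1e-10, max_iter=500):
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        new = [(rhs + neighbour_sum(x, k)) / diag for k in range(n)]
        change = max(abs(a - b) for a, b in zip(new, x))
        x = new
        if change < tol:
            return x, iteration
    raise RuntimeError("Gauss-Jacobi did not converge")


def hybrid_gs_gj(n=9, diag=4.0, rhs=1.0, tol=1e-10, max_iter=500):
    x = [0.0] * n
    group_a = range(0, n, 2)  # subcircuits 1, 3, 5, ... (0-based indices)
    group_b = range(1, n, 2)  # subcircuits 2, 4, 6, ...
    for iteration in range(1, max_iter + 1):
        previous = list(x)
        # Outer loop: groups in sequence (GS). Inner step: all members of a
        # group are computed from the same snapshot, so they are independent
        # of each other and could be dispatched to parallel workers (GJ).
        for group in (group_a, group_b):
            updates = {k: (rhs + neighbour_sum(x, k)) / diag for k in group}
            for k, value in updates.items():
                x[k] = value
        change = max(abs(a - b) for a, b in zip(x, previous))
        if change < tol:
            return x, iteration
    raise RuntimeError("hybrid iteration did not converge")


x_gj, iterations_gj = gauss_jacobi()
x_hybrid, iterations_hybrid = hybrid_gs_gj()
```

In this analogy the hybrid ordering needs fewer outer iterations than GJ while every inner step remains parallel within its group, mirroring the argument made above for the waveform case.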
C. Extension for Multiconductor Transmission Lines
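To better explain the partitioning methodology for MTLs, the equations for the $k$th lossless line segment in Fig. 2 are provided as
$$\begin{bmatrix}\mathbf{V}_{2,k}(s)\\ -\mathbf{I}_{2,k}(s)\end{bmatrix} = e^{s\mathbf{B}l/m}\begin{bmatrix}\mathbf{V}_{1,k}(s)\\ \mathbf{I}_{1,k}(s)\end{bmatrix} \tag{13}$$
It is observed that (13) leads to $2N$ coupled equations for an MTL with $N$ lines. However, the coupled lossless sections can be decoupled into $N$ single lossless lines using a linear transformation of modal voltages/currents as
$$\mathbf{V}_{1,k} = \mathbf{T}_V\hat{\mathbf{V}}_{1,k},\quad \mathbf{V}_{2,k} = \mathbf{T}_V\hat{\mathbf{V}}_{2,k},\quad \mathbf{I}_{1,k} = \mathbf{T}_I\hat{\mathbf{I}}_{1,k},\quad \mathbf{I}_{2,k} = \mathbf{T}_I\hat{\mathbf{I}}_{2,k} \tag{14}$$
where $\mathbf{T}_V$ and $\mathbf{T}_I$ are constant matrices chosen to diagonalize $\mathbf{L}_\infty$ and $\mathbf{C}_\infty$ and have the following properties [24]:
$$\mathbf{T}_V^{-1}\mathbf{L}_\infty\mathbf{T}_I = \hat{\mathbf{L}},\qquad \mathbf{T}_I^{-1}\mathbf{C}_\infty\mathbf{T}_V = \hat{\mathbf{C}},\qquad \mathbf{T}_I^{t} = \mathbf{T}_V^{-1} \tag{15}$$
where $\hat{\mathbf{L}}$ and $\hat{\mathbf{C}}$ are diagonal matrices and the superscript $t$ denotes the transpose of the matrix. Replacing (14) and (15) in (13) and performing the same algebraic manipulations as in Section III-A followed by converting the resultant equations into the time domain, the decoupled lossless sections can be represented using the MoC equations similar to (6) as
$$\begin{aligned} \hat v_{1,k,j}(t) - Z_{0,j}\,\hat i_{1,k,j}(t) &= \hat w_{1,k,j}(t), & \hat w_{1,k,j}(t) &= 2\hat v_{2,k,j}(t-\tau_j) - \hat w_{2,k,j}(t-\tau_j) \\ \hat v_{2,k,j}(t) - Z_{0,j}\,\hat i_{2,k,j}(t) &= \hat w_{2,k,j}(t), & \hat w_{2,k,j}(t) &= 2\hat v_{1,k,j}(t-\tau_j) - \hat w_{1,k,j}(t-\tau_j) \end{aligned} \tag{16}$$
where $j = 1, \ldots, N$ represents the line number, $Z_{0,j} = \sqrt{\hat L_{jj}/\hat C_{jj}}$ and $\tau_j = (l/m)\sqrt{\hat L_{jj}\hat C_{jj}}$ represent the characteristic impedance and delay of each lossless section, respectively, of the $j$th line and
$$\hat{\mathbf{v}}_{1,k}(t) = \left[\hat v_{1,k,1}(t), \ldots, \hat v_{1,k,N}(t)\right]^{t}, \qquad \hat{\mathbf{i}}_{1,k}(t) = \left[\hat i_{1,k,1}(t), \ldots, \hat i_{1,k,N}(t)\right]^{t} \tag{17}$$
(and likewise for the far end), where $\hat{\mathbf{v}}_{1,k}(t)$, $\hat{\mathbf{v}}_{2,k}(t)$, $\hat{\mathbf{i}}_{1,k}(t)$, $\hat{\mathbf{i}}_{2,k}(t)$ are the time-domain counterparts of the vectors $\hat{\mathbf{V}}_{1,k}$, $\hat{\mathbf{V}}_{2,k}$, $\hat{\mathbf{I}}_{1,k}$, $\hat{\mathbf{I}}_{2,k}$, respectively, defined in (14). The MoC equations (16) for MTLs can be realized using the equivalent circuit of Fig. 2, where the matrices $\mathbf{T}_V$ and $\mathbf{T}_I$ arising from the similarity transformation of (14) are grouped with the lumped representation of the lossy section. It is observed that, similar to the single-line case of Fig. 1, the MoC provides natural interfaces for MTLs across which information is exchanged using the time delayed equations of (16). Hence, longitudinally partitioning transmission lines at these interfaces, as shown in Fig. 5, is expected to yield efficient convergence of the proposed WR algorithm. The following section describes the iterative solution of the subcircuits of Fig. 5.
Fig. 5. Partitioning of MTLs into subcircuits for waveform relaxation.
D. Iterative Solution of Subcircuits for MTLs
Once the MTL network is partitioned using the above methodology, both the GJ and hybrid GS–GJ iterative techniques can be used to solve the subcircuits as explained below. The iterative procedures (GJ and hybrid GS–GJ) for MTLs are similar to that of the two conductor line, with the main difference being that the MoC equations of (6) now have to be extended to consider the decoupled equations of (16).
1) GJ for MTLs: This discussion begins by considering a general MTL discretized into $m+1$ subcircuits as illustrated in Fig. 5. Assuming that the waveforms of all of the relaxation sources $\hat w_{1,k,j}^{(r)}(t)$, $\hat w_{2,k,j}^{(r)}(t)$ are known from the previous $(r-1)$th iteration and are used as input excitations for the subcircuits of Fig. 5, the terminal conditions required for the $r$th iterative solution of the $k$th subcircuit are changed from (7) to include the effect of MTLs described by (16) as
$$\hat v_{2,k-1,j}^{(r)}(t) - Z_{0,j}\,\hat i_{2,k-1,j}^{(r)}(t) = \hat w_{2,k-1,j}^{(r)}(t), \qquad \hat v_{1,k,j}^{(r)}(t) - Z_{0,j}\,\hat i_{1,k,j}^{(r)}(t) = \hat w_{1,k,j}^{(r)}(t), \qquad 1 \le j \le N \tag{18}$$
Since the relaxation sources of (18) (i.e., $\hat w_{2,k-1,j}^{(r)}(t)$, $\hat w_{1,k,j}^{(r)}(t)$) of each $k$th subcircuit are assumed known beforehand and independent of the present ($r$th) solution of the remaining $m$ subcircuits, the subcircuits can be solved in parallel, similar to two conductor lines. The $r$th iterative solution of all of the $m+1$ subcircuits provides the self consistent solution of the waveforms $\hat v_{1,k,j}^{(r)}(t)$, $\hat v_{2,k,j}^{(r)}(t)$, which are thereafter used to update the relaxation sources for the future $(r+1)$th iteration using (16) as
$$\begin{aligned} \hat w_{1,k,j}^{(r+1)}(t) &= 2\hat v_{2,k,j}^{(r)}(t-\tau_j) - \hat w_{2,k,j}^{(r)}(t-\tau_j) \\ \hat w_{2,k,j}^{(r+1)}(t) &= 2\hat v_{1,k,j}^{(r)}(t-\tau_j) - \hat w_{1,k,j}^{(r)}(t-\tau_j) \end{aligned} \qquad 1 \le k \le m,\ 1 \le j \le N \tag{19}$$
This iterative cycle continues until the absolute error satisfies a predefined tolerance $\varepsilon$ as
$$\left\| \hat{\mathbf{v}}^{(r)}(t) - \hat{\mathbf{v}}^{(r-1)}(t) \right\| \le \varepsilon \tag{20}$$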
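The modal decoupling of (14) and (15) used in this section can be illustrated for the simplest case, a symmetric pair of coupled lines. The sketch below is an assumption-laden example rather than the paper's network: the p. u. l. values are hypothetical, and the even/odd mode matrix is chosen as both the voltage and current transformation, which is valid here because it is orthogonal.

```python
import math


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]


# Hypothetical symmetric coupled pair (values per cm).
L_INF = [[4.0e-9, 1.0e-9], [1.0e-9, 4.0e-9]]          # H/cm
C_INF = [[1.0e-12, -0.2e-12], [-0.2e-12, 1.0e-12]]    # F/cm

# Even/odd mode transformation; orthogonal, so its transpose is its inverse
# and the same matrix can serve as both T_V and T_I.
S = 1.0 / math.sqrt(2.0)
T = [[S, S], [S, -S]]
T_INV = transpose(T)

L_MODAL = matmul(matmul(T_INV, L_INF), T)
C_MODAL = matmul(matmul(T_INV, C_INF), T)


def modal_line(j, section_length_cm):
    """Characteristic impedance and delay of decoupled modal line j."""
    z0 = math.sqrt(L_MODAL[j][j] / C_MODAL[j][j])
    tau = section_length_cm * math.sqrt(L_MODAL[j][j] * C_MODAL[j][j])
    return z0, tau


z0_even, tau_even = modal_line(0, 1.0)
z0_odd, tau_odd = modal_line(1, 1.0)
```

Both transformed matrices come out diagonal, so each mode behaves as an independent single lossless line with its own characteristic impedance and delay, and each can be handled with the single-line MoC relations.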
2) Hybrid GS–GJ for MTLs: The characteristic of longitudinal partitioning where couplings exist between an odd numbered and an even numbered subcircuit only (and not between two odd-numbered or two even-numbered subcircuits themselves) is applicable to MTLs as well. Hence, the hybrid iterative technique of Fig. 4 can be easily extended to MTLs.
Assuming that the waveforms of all of the relaxation sources responsible for exciting the subcircuits of group A, $\hat w_{1,k,j}^{(r)}(t)$ (odd $k$) and $\hat w_{2,k,j}^{(r)}(t)$ (even $k$), are known from the previous $(r-1)$th iteration, the $N_A$ subcircuits of group A can be solved in parallel via the GJ technique explained in the previous section. Once the GJ is concluded, the voltage waveforms determined from the present ($r$th) iteration of group A are used to update the relaxation sources responsible for exciting only the subcircuits of group B as
$$\begin{aligned} \hat w_{1,k,j}^{(r)}(t) &= 2\hat v_{2,k,j}^{(r)}(t-\tau_j) - \hat w_{2,k,j}^{(r)}(t-\tau_j), & k \text{ even} \\ \hat w_{2,k,j}^{(r)}(t) &= 2\hat v_{1,k,j}^{(r)}(t-\tau_j) - \hat w_{1,k,j}^{(r)}(t-\tau_j), & k \text{ odd} \end{aligned} \tag{21}$$
The relaxation sources of (21) now serve as the input for the corresponding subcircuits of group B and the $N_B$ subcircuits can also be solved in parallel using the GJ technique. The voltage waveforms determined from the present ($r$th) iteration of group B are used to update the relaxation sources responsible for exciting only the subcircuits of group A for the future $(r+1)$th iteration as
$$\begin{aligned} \hat w_{1,k,j}^{(r+1)}(t) &= 2\hat v_{2,k,j}^{(r)}(t-\tau_j) - \hat w_{2,k,j}^{(r)}(t-\tau_j), & k \text{ odd} \\ \hat w_{2,k,j}^{(r+1)}(t) &= 2\hat v_{1,k,j}^{(r)}(t-\tau_j) - \hat w_{1,k,j}^{(r)}(t-\tau_j), & k \text{ even} \end{aligned} \tag{22}$$
The above iterative cycle continues until the absolute error of the iterations satisfies the error tolerance as in (20). Equations (21)–(22) provide twice the amount of waveform exchange compared to the single waveform exchange of (19) and hence, the hybrid technique exhibits improved convergence compared with the GJ technique.
IV. COMPUTATIONAL COMPLEXITY OF THE PROPOSED ALGORITHM
The analysis begins by considering a general MTL network of Fig. 2 discretized into $m$ DEPACT sections. Assuming each DEPACT section to be described using $n$ delayed ordinary differential equations, the size of the overall circuit matrix describing the original network is $mn \times mn$. The computational complexity of directly inverting the above matrix to perform time-domain analysis is $O((mn)^3)$ or $O(m^3n^3)$ [36], [37]. However, the matrices obtained by traditional circuit simulators are sparse by nature and can be solved more efficiently using sparse matrix routines at a cost of $O((mn)^{\alpha})$, where the exponent $\alpha$ depends on the sparsity of the matrix [11]. For large distributed networks, the interconnects have to be discretized into many segments to accurately capture the response at the output ports. For such cases, the super linear scaling of the computational cost for traditional circuit simulators is a major factor limiting their applicability. To address the above issue in the proposed WR algorithm, the DEPACT sections are separated into subcircuits, each described using delayed differential equations, which can now be solved independently. The total computational cost of the proposed WR algorithm is mathematically quantified using the following lemmas.
Lemma 1: For $m+1$ subcircuits, the computational cost of the proposed WR algorithm using traditional GJ iterations is $O(N_{GJ}(m+1)/p)$, where $N_{GJ}$ is the number of iterations and $p$ is the number of CPUs available for parallel processing.
Proof: For typical WR algorithms, the total computational cost can be divided into two parts: the first part is to solve the subcircuits independently and the next is to update the relaxation sources.
It is assumed that the cost of solving one subcircuit scales as $a n^{\alpha}$, where $a$ is the scaling coefficient. Using a GJ iterative technique where the task of independently solving $m+1$ subcircuits can be distributed over $p$ CPUs, the total cost of solving the subcircuits per iteration is given by $a n^{\alpha}(m+1)/p$. The second stage of the algorithm involves updating the $2mN$ relaxation sources using (8) and (19). This translates to the solution of $2mN$ linear algebraic equations in the time domain per iteration. Since the equations are all decoupled, they can be solved independently in parallel using $p$ CPUs for a cost of $2bmN/p$, where $b$ is the scaling coefficient for the second part of the proposed WR algorithm. Since, within the context of this analysis, $N$ is a constant, the above cost can be rewritten as $\tilde{b}m/p$.
The total cost of each iteration is the sum of the above costs given as
$$C_{it} = a n^{\alpha}\frac{m+1}{p} + \tilde{b}\frac{m}{p} \tag{23}$$
where $C_{it}$ is the cost of each GJ iteration. Since the above process needs to be redone for $N_{GJ}$ iterations, the total cost of the proposed algorithm using traditional GJ is
$$C_{GJ} = N_{GJ}\left(a n^{\alpha}\frac{m+1}{p} + \tilde{b}\frac{m}{p}\right) \tag{24}$$
where $C_{GJ}$ is the total cost of the proposed algorithm using GJ. It is observed that the solution of the $2mN$ linear algebraic equations to update the relaxation sources of (8) and (19) does not involve any matrix inversion. On the other hand, the solution of each subcircuit involves the inversion of a matrix of size $n$. As a result, the cost of solving the subcircuits (first part) is found to dominate over the cost of updating the relaxation sources (second part) [13] (i.e., $a n^{\alpha} \gg \tilde{b}$). Hence, the result of (24) can be simplified to
$$C_{GJ} \approx N_{GJ}\, a n^{\alpha}\frac{m+1}{p} \tag{25}$$
where, within the context of this work, $n$ is a function of the number of MTLs $N$ and is treated as a constant. Equation (25) demonstrates that the proposed WR algorithm scales as $O(N_{GJ}(m+1)/p)$ when using the traditional GJ. The following lemma extends the above analysis to the hybrid iterative technique.
Lemma 2: For $m+1$ subcircuits, the computational cost of the proposed WR algorithm using the hybrid GS–GJ iterations is $O(N_{H}(m+1)/p)$, where $N_{H}$ is the number of iterations.
Fig. 6. Circuit of Example 1.
Proof: The cost of the proposed WR algorithm using the hybrid iterative technique can be divided into two parts: the first part is to solve the $N_A$ subcircuits and update the relaxation sources using (21). The second part is to solve the $N_B$ subcircuits and update the relaxation sources using (22). Since updating the relaxation sources using (21) and (22) does not require any matrix inversion, the contribution of solving (21) and (22) is minimal compared with the cost of the solution of each subcircuit. As a result, the total cost of the hybrid iterative technique can be approximated as simply the cost of the independent solution of the $N_A$ and $N_B$ subcircuits.
The computational cost of solving the $N_A$ subcircuits per iteration using the GJ technique with $p$ parallel CPUs is given by $a n^{\alpha}N_A/p$ (from Lemma 1). Similarly, the cost of the $N_B$ subcircuits per iteration is approximated as $a n^{\alpha}N_B/p$. Since the solution of the $N_A$ and $N_B$ subcircuits proceeds in sequence, the total cost of the hybrid technique per iteration is the sum of the above two costs, given here as
$$C_{it} = a n^{\alpha}\frac{N_A}{p} + a n^{\alpha}\frac{N_B}{p} \tag{26}$$
Multiplying the above cost with the number of iterations (in this case, $N_{H}$) provides an estimate of the full computational cost of the proposed WR algorithm using the proposed GS–GJ hybrid iterative technique as follows:
$$C_{H} = N_{H}\, a n^{\alpha}\frac{N_A + N_B}{p} \tag{27}$$
From the definition of $N_A$ and $N_B$ in (10), (27) can be approximated to
$$C_{H} \approx N_{H}\, a n^{\alpha}\frac{m+1}{p} \tag{28}$$
Equation (28) demonstrates that the proposed WR algorithm scales as $O(N_{H}(m+1)/p)$ when using the hybrid iterative technique.
Comparing the scaling of (25) and (28) with the number of available CPUs $p$, it is appreciated that the hybrid iterative technique retains the same high degree of parallelizability as the GJ technique. However, the hybrid technique has the added advantage of faster convergence over the GJ counterpart due to the greater exchange of waveforms using (11) and (12) and (21) and (22) compared with the single exchange of (8) and (19).
It is observed that the main reason behind the attractiveness of the proposed algorithm [whether using GJ as in (25) or the hybrid technique as in (28)] is the ability to solve the subcircuits independently. This translates to an almost linear scaling of the computational costs of the proposed algorithm with the number of DEPACT sections $m$, unlike SPICE, which suffers from a super linear scaling. In addition, using GJ and the hybrid technique provides an additional advantage over SPICE (and GS based WR algorithms like [25]) of dividing the computational cost of the proposed algorithm over multiple CPUs $p$. These results will be validated using the numerical examples in Section V.
V. NUMERICAL EXAMPLES
Three examples are presented here to demonstrate the validity
and efficiency of the proposed algorithm. For a fair comparison
of the proposed work with full SPICE simulations, all of the
subcircuits of the WR iterations are also solved using SPICE.
A customized C++ code is used to extract the waveforms of each subcircuit and update the relaxation sources without any external communication between the user and the SPICE engine.
The scheduling of each subcircuit solve (whether using GJ or
GS–GJ technique) is automated using MATLAB 2010b. Within
the context of this work, full SPICE simulations refer to the
DEPACT algorithm of [26] and [27].
Example 1: The objective of this example is to demonstrate
the accuracy of the proposed WR algorithm and the superior
convergence of the hybrid iterative technique over the traditional GJ technique. For this example a transmission line network consisting of seven transmission line segments as shown
in Fig. 6 is considered. The p.u.l. parameters of the network are 0.25 /cm, 4 nH/cm, pF/cm, mmho/cm, and 5 /cm, where the last term represents the skin effect losses as a function of
frequency [38], [39]. The network is excited by a trapezoidal
voltage source of rise time
0.1 ns, pulsewidth
5 ns,
amplitude of 2 V, and loaded with two SPICE level 49 CMOS inverters using 180-nm technology.
To illustrate the accuracy of the proposed algorithm, the line length of each segment is set to 30 cm. In this case, the number of subcircuits required is 420. The network is then solved using both the proposed work and the full SPICE simulation. The proposed work uses the hybrid iterative technique to solve the subcircuits on a sequential platform with a predefined error tolerance and with the initial guess of the relaxation sources set to the dc solution of zero. The transient responses at the far end of the network using the proposed WR algorithm and full SPICE simulations are shown in Fig. 7.
Fig. 7. Transient response for Example 1 using the proposed algorithm and full SPICE simulation. All line lengths are 30 cm. (a) and (b) Transient responses at the output ports.
Next, the convergence properties of the proposed hybrid technique are compared with the traditional GJ technique. For each algorithm, the number of iterations is varied from 1 to 10 and the scaling of the associated error [defined in (9)] is displayed in Fig. 8. It is observed that the proposed hybrid technique shows significantly faster convergence than the traditional GJ algorithm. This is due to the fact that the proposed hybrid technique involves twice the amount of information exchange as the GJ technique for the same number of iterations (see Sections III-B and III-D).
Fig. 8. Convergence properties of the proposed hybrid iterative technique compared to GJ.
Example 2: The objective of this example is to illustrate the
computational efficiency of the proposed work over full SPICE
simulations for MTL structures. For this example, a seven-coupled line network with the physical dimensions as shown in
Fig. 9(a) is considered. The p. u. l. parameters for this example
are extracted from the HSPICE field solver [38] and include frequency dependent parameters. For the following analyses, the
MTL network topology is shown in Fig. 9(b), where lines 1, 3,
5, and 7 are excited with trapezoidal voltage sources of rise time
0.1 ns, pulsewidth
5 ns, and amplitude of 2 V.
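For readers who want to reproduce the excitation, a trapezoidal pulse with the stated rise time, pulsewidth, and amplitude can be sketched as a simple piecewise-linear function. The text gives no fall time or delay, so the sketch assumes a fall time equal to the rise time, zero delay, and a pulsewidth measured along the flat top; the function name `trapezoid` is hypothetical.

```python
def trapezoid(t, amplitude=2.0, rise=0.1e-9, width=5e-9, fall=None, delay=0.0):
    """Value of a single trapezoidal pulse at time t (seconds).

    Assumptions (not stated in the text): fall time defaults to the rise
    time, the pulse starts at `delay`, and `width` is the flat-top duration.
    """
    if fall is None:
        fall = rise
    t = t - delay
    if t <= 0.0:
        return 0.0
    if t < rise:
        return amplitude * t / rise                      # rising edge
    if t <= rise + width:
        return amplitude                                 # flat top
    if t < rise + width + fall:
        return amplitude * (1.0 - (t - rise - width) / fall)  # falling edge
    return 0.0


# Sample the source on a uniform 0.05-ns grid over the first 6 ns.
samples = [(k * 0.05e-9, trapezoid(k * 0.05e-9)) for k in range(121)]
```

The sampled pairs can be written out as a piecewise-linear source for a circuit simulator or used directly as the input of a custom solver.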
This example begins with a demonstration of the performance of the proposed work compared with full SPICE simulations as the size of the network increases. The line length of the network in Fig. 9(b) is increased from 0 to 200 cm in steps of 10 cm. To accurately model the network, the number of subcircuits is increased in steps of 16 for each 10-cm step and ranges from 0 to 320. For each case, the network is solved using both the proposed work and the full SPICE simulation. The proposed work uses both the hybrid technique and the traditional GJ technique on a sequential platform with a predefined error tolerance and with the initial guess of the relaxation sources set to the dc solution of zero. For this particular error tolerance, the number of iterations required for convergence is found to be consistently between 5 and 6. The accuracy of the proposed work (with the hybrid technique) compared to full SPICE simulation is illustrated in Fig. 10 for a line length of 50 cm (i.e., for 80 subcircuits). The scaling of the computational cost of both the proposed work and full SPICE simulation with the line length is shown in Fig. 11(a). It is observed from Fig. 11(a) that the proposed work scales almost linearly for both GJ and the hybrid algorithm, as predicted in (25) and (28), respectively, while the full SPICE solution of the original network scales superlinearly for this example. In addition, the hybrid iterative technique converges twice as fast as the traditional GJ technique.
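One way to quantify the difference between near-linear and superlinear scaling is to fit a power law to measured run times. The sketch below assumes a cost model of the form t = c N^alpha and estimates alpha by least squares on a log-log scale. The power-law form, the function name `fit_power_law`, and the timing data are all assumptions for illustration; the timings are synthetic and are not taken from Fig. 11.

```python
import math


def fit_power_law(sizes, times):
    """Least-squares fit of log(t) = log(c) + alpha * log(N).

    Returns (alpha, c). An alpha close to 1 indicates linear scaling;
    an alpha clearly above 1 indicates superlinear scaling.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    c = math.exp(mean_y - alpha * mean_x)
    return alpha, c


# Synthetic run times (seconds) over 16 to 320 subcircuits in steps of 16.
sizes = [16 * k for k in range(1, 21)]
linear_times = [0.05 * n for n in sizes]              # hypothetical WR-like cost
superlinear_times = [0.002 * n ** 1.8 for n in sizes]  # hypothetical direct-solve cost

alpha_linear, _ = fit_power_law(sizes, linear_times)
alpha_super, _ = fit_power_law(sizes, superlinear_times)
print(f"fitted exponents: {alpha_linear:.2f} and {alpha_super:.2f}")
```

With measured CPU times in place of the synthetic lists, the fitted exponent gives a single number for comparing the scaling of two solvers.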
Fig. 9. Transmission line structure of Example 2.
Fig. 10. Transient response for Example 2 using the proposed WR algorithm and the full SPICE simulation. Line length of the network is 50 cm. (a) and (b) Transient responses at the output ports.
Fig. 11. Scaling of computational cost for Example 2. (a) Scaling of computational cost with line length. (b) Scaling of CPU speed up with number of CPUs.
Next, the performance of the proposed work is demonstrated on a parallel platform. The length of the network is fixed at the corner of our design space and the network is solved using both the proposed work and full SPICE simulation. The proposed WR iterations are performed using both the hybrid technique and the traditional GJ technique, where the number of processors is varied for the same error tolerance as before. The CPU speed up offered by both iterative techniques over full SPICE simulations is shown in Fig. 11(b) and summarized in Table I. The speed up for either iterative technique scales almost linearly with the number of processors, thereby demonstrating the high parallelizability of both, as theoretically expected from (25) and (28). The minor deviation of Fig. 11(b) from the exactly linear scaling of (25) and (28) with respect to the number of CPUs is due to the incurred communication overheads between processors.
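The speed up and its deviation from ideal linear scaling can be tabulated from run times as in the sketch below. The timings are hypothetical (a fixed amount of divisible work plus a small overhead per additional CPU) and are not the values of Table I; the function name `speedup_report` and the overhead model are likewise assumptions.

```python
def speedup_report(t_reference, t_parallel):
    """Return (cpus, speed up over reference, parallel efficiency) rows.

    t_reference: run time of the reference simulation in seconds.
    t_parallel:  dict mapping number of CPUs to run time in seconds;
                 it must contain an entry for 1 CPU.
    Efficiency is measured against the single-CPU run; 1.0 means the
    run time falls exactly in proportion to the number of CPUs.
    """
    t_single = t_parallel[1]
    rows = []
    for cpus in sorted(t_parallel):
        speedup = t_reference / t_parallel[cpus]
        efficiency = t_single / (cpus * t_parallel[cpus])
        rows.append((cpus, speedup, efficiency))
    return rows


# Hypothetical timings: 40 s of divisible work plus 0.25 s overhead per extra CPU.
timings = {p: 40.0 / p + 0.25 * (p - 1) for p in (1, 2, 4, 8)}
report = speedup_report(400.0, timings)
for cpus, speedup, efficiency in report:
    print(f"{cpus} CPUs: speed up {speedup:.1f}x, efficiency {efficiency:.2f}")
```

In this toy model the efficiency drops below 1 as CPUs are added, which mirrors the kind of deviation from linear scaling that communication overhead produces.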
TABLE I
CPU TIME COMPARISON FOR EXAMPLE 2
Example 3: For this example, a network consisting of a cascade of subnetworks as shown in Fig. 12 is considered. Each subnetwork consists of the three-coupled MTL structure of [40] with a fixed line length. For the following analysis, lines one and three of the network are excited with a trapezoidal voltage source (characterized by its rise time and pulsewidth) with an amplitude of 5 V. Each subnetwork is modeled using eight subcircuits.
Fig. 12. Circuit of Example 3.
In this analysis, the number of subnetworks (see Fig. 12) is increased from 0 to 50 in steps of 5 (i.e., the number of subcircuits is increased from 0 to 400 in steps of 40). For each case, the network is solved using both the proposed work and the full SPICE simulation. The WR iterations for the proposed work are performed using the hybrid technique on a sequential machine with a predefined error tolerance and with the initial guess of the relaxation sources set to the dc solution of zero. For this particular error tolerance, the number of iterations required for convergence was found to be consistently between 6 and 7. The scaling of the computational cost of both the proposed work and full SPICE simulation with the number of subnetworks is demonstrated in Fig. 13(a). Similar to the previous example, the proposed WR algorithm shows linear scaling with the size of the network compared to the superlinear scaling of full SPICE.
Fig. 13. Scaling of computational cost for Example 3. (a) Scaling of computational cost with number of subnetworks. (b) Scaling of computational cost with number of CPUs.
Next, the performance of the proposed work is demonstrated on a parallel platform. The number of subnetworks is fixed at the corner of our design space and the network is solved using both the proposed work and full SPICE simulation. The proposed WR iterations are performed on a parallel platform where the number of processors is varied and the same error tolerance is used with an initial guess of the relaxation sources set to the dc solution of zero. The scaling of the CPU speed up offered by the proposed algorithm over full SPICE simulations as a function of the number of processors is shown in Fig. 13(b) and summarized in Table II. As expected, the speed up for the proposed WR algorithm scales almost linearly with the number of processors, similar to Example 2.
TABLE II
CPU TIME COMPARISON FOR EXAMPLE 3
VI. CONCLUSION
In this paper, a longitudinal-partitioning-based waveform relaxation algorithm for efficient transient analysis of distributed
transmission-line networks is presented. The proposed methodology represents lossy transmission lines as a cascade of lumped
circuit elements alternating with lossless line segments, where
the lossless line segments are modeled using the method of characteristics. Partitioning the transmission lines at the natural interfaces provided by the method of characteristics allows the
resulting subcircuits to be weakly coupled by construction. The
subcircuits are solved independently using a hybrid iterative
technique that combines the fast convergence of the proposed
GS technique with the parallelizability of the GJ technique. Numerical examples illustrate that the proposed algorithm exhibits
good scaling with both the size of the network and the number of
CPUs available for parallel processing, thereby providing significant savings in run time costs compared with full SPICE simulations.
REFERENCES
[1] R. Achar and M. Nakhla, “Simulation of high-speed interconnects,”
Proc. IEEE, vol. 89, no. 5, pp. 693–728, May 2001.
[2] E. Lelarasmee, A. E. Ruehli, and A. L. Sangiovanni-Vincentelli, “The
waveform relaxation method for time-domain analysis of large-scale
integrated circuits,” IEEE Trans. Comput.-Aided Des. (CAD) Integr.
Circuits Syst., vol. CAD-1, no. 3, pp. 131–145, Jul. 1982.
[3] J. White and A. L. Sangiovanni-Vincentelli, Relaxation Techniques for
the Simulation of VLSI Circuits. Norwell, MA: Kluwer, 1987.
[4] F. Y. Chang, “The generalized method of characteristics for waveform
relaxation analysis of lossy coupled transmission lines,” IEEE Trans.
Microw. Theory Tech., vol. 37, no. 12, pp. 2028–2038, Dec. 1989.
[5] F. Y. Chang, “Waveform relaxation analysis of RLCG transmission
lines,” IEEE Trans. Circuits Syst., vol. 37, no. 11, pp. 1394–1415, Nov.
1990.
[6] F. Y. Chang, “Relaxation simulation of transverse electromagnetic
wave propagation in coupled transmission lines,” IEEE Trans. Circuits
Syst., vol. 38, no. 8, pp. 916–936, Aug. 1991.
[7] F. Y. Chang, “Waveform relaxation analysis of nonuniform lossy transmission lines characterized with frequency dependent parameters,”
IEEE Trans. Circuits Syst., vol. 38, no. 12, pp. 1484–1500, Dec. 1991.
[8] F. Y. Chang, “Transient simulation of nonuniform coupled lossy
transmission lines characterized with frequency-dependent parameters—Part I: Waveform relaxation analysis,” IEEE Trans. Circuits
Syst. I, Fundam. Theory Appl., vol. 39, no. 8, pp. 585–603, Aug. 1992.
[9] J. Mao and Z. Li, “Waveform relaxation solution of ABCD matrices
of nonuniform transmission lines for transient analysis,” IEEE Trans.
Comput.-Aided Des. (CAD) Integr. Circuits Syst., vol. 13, no. 11, pp.
1409–1412, Nov. 1994.
[10] F. C. M. Lau and E. M. Deeley, “Transient analysis of lossy coupled transmission lines in a lossy medium using the waveform relaxation method,” IEEE Trans. Microw. Theory Tech., vol. 43, no. 3, pp.
692–697, Mar. 1995.
[11] N. M. Nakhla, A. E. Ruehli, R. Achar, and M. S. Nakhla, “Simulation
of coupled interconnects using waveform relaxation and transverse partitioning,” IEEE Trans. Adv. Packag., vol. 29, no. 1, pp. 78–87, Feb.
2006.
[12] N. Nakhla, A. E. Ruehli, M. S. Nakhla, R. Achar, and C. Chen, “Waveform relaxation techniques for simulation of coupled interconnects
with frequency-dependent parameters,” IEEE Trans. Adv. Packag.,
vol. 30, no. 2, pp. 257–269, May 2007.
[13] D. Paul, N. M. Nakhla, R. Achar, and M. S. Nakhla, “Parallel simulation of massively coupled interconnect networks,” IEEE Trans. Adv.
Packag., vol. 33, no. 1, pp. 115–127, Feb. 2010.
[14] Y.-Z. Xie, F. G. Canavero, T. Maestri, and Z.-J. Wang, “Crosstalk analysis of multiconductor transmission lines based on distributed analytical representation and iterative technique,” IEEE Trans. Electromagn.
Compatibil., vol. 52, no. 3, pp. 712–727, Aug. 2010.
[15] R. Achar, M. S. Nakhla, H. S. Dhindsa, A. R. Sridhar, D. Paul, and N.
M. Nakhla, “Parallel and scalable transient simulator for power grids
via waveform relaxation (PTS-PWR),” IEEE Trans. Very Large-Scale
Integr. (VLSI) Syst., vol. 19, no. 2, pp. 319–332, Feb. 2011.
[16] M. Al-Khaleel, A. E. Ruehli, and M. J. Gander, “Optimized waveform
relaxation methods for longitudinal partitioning of transmission lines,”
IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 9, pp. 1732–1743,
Aug. 2009.
[17] M. J. Gander and A. Stuart, “Space-time continuous analysis of waveform relaxation for the heat equation,” SIAM J. Sci. Comput., vol. 19,
no. 6, pp. 2014–2031, Nov. 1998.
[18] E. Giladi and H. B. Keller, “Space time domain decomposition for parabolic problems,” Numer. Math., vol. 93, no. 2, pp. 279–313, 2002.
[19] W. T. Beyene, “Application of multilinear and waveform relaxation
methods for efficient simulation of interconnect-dominated nonlinear
networks,” IEEE Trans. Adv. Packag., vol. 31, no. 3, pp. 637–648, Aug.
2008.
[20] V. B. Dmitriev-Zdorov and B. Klaassen, “An improved relaxation approach for mixed system analysis with several simulation tools,” in
Proc. EURO-DAC, 1995, pp. 274–279.
[21] V. B. Dmitriev-Zdorov, “Generalized coupling as a way to improve
the convergence in relaxation-based solvers,” in Proc. EURO-DAC/
EUROVHDL Exhib., Geneva, Switzerland, Sep. 1996.
[22] M. J. Gander and L. Halpern, “Optimized Schwarz waveform relaxation methods for advection reaction diffusion problems,” SIAM J.
Numer. Anal., vol. 45, no. 2, pp. 666–697, Apr. 2007.
[23] M. J. Gander, “Overlapping Schwarz waveform relaxation methods for
parabolic problems,” in Proc. Algoritmy, 1997, pp. 425–431.
[24] C. R. Paul, Analysis of Multiconductor Transmission Lines. New
York: Wiley-Interscience, 2008.
[25] S. Roy and A. Dounavis, “Longitudinal partitioning based waveform
relaxation algorithm for transient analysis of long delay transmission
lines,” in IEEE MTT-S Int. Microw. Symp. Dig., Baltimore, Jun. 2011,
pp. 1–4.
[26] N. Nakhla, A. Dounavis, R. Achar, and M. S. Nakhla, “DEPACT: Delay
extraction-based passive compact transmission-line macromodeling algorithm,” IEEE Trans. Adv. Packag., vol. 28, no. 1, pp. 13–23,
Feb. 2005.
[27] N. Nakhla, M. S. Nakhla, and R. Achar, “Simplified delay extraction-based passive transmission line macromodeling algorithm,” IEEE
Trans. Adv. Packag., vol. 33, no. 2, pp. 498–509, May 2010.
[28] F. H. Branin, Jr., “Transient analysis of lossless transmission lines,”
Proc. IEEE, vol. 55, no. 11, pp. 2012–2013, Nov. 1967.
[29] A. Odabasioglu, M. Celik, and L. T. Pileggi, “PRIMA: Passive
reduced-order interconnect macromodeling algorithm,” IEEE Trans.
Comput.-Aided Des. (CAD) Integr. Circuits Syst., vol. 17, no. 8, pp.
645–653, Aug. 1998.
[30] A. Dounavis, R. Achar, and M. Nakhla, “Efficient passive circuit
models for distributed networks with frequency-dependent parameters,” IEEE Trans. Adv. Packag., vol. 23, no. 8, pp. 382–392, Aug.
2000.
[31] A. Dounavis, R. Achar, and M. Nakhla, “A general class of passive
macromodels for lossy multiconductor transmission lines,” IEEE
Trans. Microw. Theory Tech., vol. 49, no. 10, pp. 1686–1696, Oct.
2001.
[32] A. Cangellaris, S. Pasha, J. Prince, and M. Celik, “A new discrete transmission line model for passive model order reduction and macromodeling of high-speed interconnections,” IEEE Trans. Adv. Packag., vol.
22, no. 3, pp. 356–364, Aug. 1999.
[33] Q. Yu, J. M. L. Wang, and E. S. Kuh, “Passive multipoint moment
matching model order reduction algorithm on multiport distributed interconnect networks,” IEEE Trans. Circuits Syst. I, Fundam. Theory
Appl., vol. 46, no. 1, pp. 140–160, Jan. 1999.
[34] E. Gad and M. Nakhla, “Efficient simulation of nonuniform transmission lines using integrated congruence transform,” IEEE Trans. Very
Large-Scale Integr. (VLSI) Syst., vol. 12, no. 5, pp. 1307–1320, May
2004.
[35] F. Fer, “Resolution de l’equation matricielle par
produit infini d’exponentielles matricielles,” Acad. Roy. Belg. Cl. Sci.,
vol. 44, no. 5, pp. 818–829, 1958.
[36] J. D. Dixon, “Exact solution of linear equations using p-adic expansions,” Numerische Mathematik, vol. 40, no. 1, pp. 137–141, 1982.
[37] W. Eberly, M. Giesbrecht, P. Giorgi, A. Storjohann, and G. Villard,
“Solving sparse integer linear systems,” in Proc. ISSAC’06, Genova,
Italy, Jul. 2006, pp. 63–70.
[38] “HSPICE U-2008.09-RA,” Synopsys, Inc.
[39] “HSPICE Signal Integrity User Guide,” Synopsys, Inc., Sep. 2005.
[40] M. Celik and A. C. Cangellaris, “Efficient transient simulation of lossy
packaging interconnects using moment-matching techniques,” IEEE
Trans. Compon., Packag., Manuf. Technol. B, vol. 19, no. 1, pp. 64–73,
Feb. 1996.
Sourajeet Roy (S’11) received the B.Tech. degree in
electrical engineering from Sikkim Manipal University, India, in 2006, and the M.E.Sc. degree from University of Western Ontario, London, ON, Canada, in
2009, where he is currently working toward the Ph.D.
degree.
His research interests include modeling and simulation of high speed interconnects, signal and power
integrity analysis of electronic packages and design
and implementation of parallel algorithms.
Mr. Roy was the recipient of the Vice-Chancellors
Gold Medal for academic excellence at the undergraduate level.
Anestis Dounavis (S’00–M’03) received the B.Eng.
degree from McGill University, Montreal, QC,
Canada, in 1995, and the M.Sc. and Ph.D. degrees
from Carleton University, Ottawa, ON, Canada,
in 2000 and 2004, respectively, all in electrical
engineering.
He currently serves as an Associate Professor
with the Department of Computer and Electrical Engineering, University of Western Ontario, London,
ON, Canada. His research interests are in electronic
design automation, simulation of high-speed and
microwave networks, signal integrity and numerical algorithms.
Dr. Dounavis was the recipient of the Ottawa Centre for Research and Innovation (OCRI) futures award—student researcher of the year in 2004 and the
INTEL Best Student Paper Award at the Electrical Performance of Electronic
Packaging Conference in 2003. He also received the Carleton University Medal
for outstanding graduate work at the M.Sc. and Ph.D. levels in 2000 and 2004,
respectively. He was the recipient of the University Student Council Teaching
Honour Roll Award at the University of Western Ontario in 2009 to 2010.
Amir Beygi (S’08) received the B.S. degree in
electrical engineering from K.N. Toosi University of
Technology, Tehran, Iran, in 2004, the M.S. degree
in electrical engineering from Iran University of
Science and Technology, Tehran, Iran, in 2007, and
the Ph.D. in electrical and computer engineering
from The University of Western Ontario, London,
ON, Canada, in 2011.
His research interests include simulation and modeling algorithms for electromagnetic compatibility
and signal integrity of high-speed interconnects.