Equivalence Verification of Large Galois Field Arithmetic Circuits using Word-Level Abstraction via Gröbner Bases∗
Tim Pruss, ECE, University of Utah, [email protected]
Priyank Kalla, ECE, University of Utah, [email protected]
Florian Enescu, Math & Stats, Georgia State University, [email protected]
ABSTRACT
Custom arithmetic circuits designed over Galois fields $\mathbb{F}_{2^k}$ are prevalent in cryptography, where the field size $k$ is very large (e.g. $k = 571$ bits). Equivalence checking of such large custom arithmetic circuits against baseline golden models is beyond the capabilities of contemporary techniques. This paper addresses the problem by deriving word-level canonical polynomial representations from gate-level circuits as $Z = \mathcal{F}(A)$ over $\mathbb{F}_{2^k}$, where $Z$ and $A$ represent the output and input bit-vectors of the circuit, respectively. Using algebraic geometry, we show that the canonical polynomial abstraction can be derived by computing a Gröbner basis of a set of polynomials extracted from the circuit, using a specific elimination (abstraction) term order. By efficiently applying these concepts, we can derive the canonical abstraction in hierarchically designed, custom arithmetic circuits with up to 571-bit datapaths, whereas contemporary techniques can verify only up to 163-bit circuits.
Categories and Subject Descriptors
B.6.3 [Logic Design]: design aids – verification
General Terms
Verification, Arithmetic Circuits
Keywords
Hardware Verification, Word-Level Abstraction, Gröbner Bases
1. INTRODUCTION
Arithmetic circuits designed over Galois fields of the type $\mathbb{F}_{2^k}$ find application in areas such as hardware security, cryptography, error-correction codes, and VLSI testing, among others. In such applications, the field size, and thus the circuit datapath size ($k$), can be very large. For example, the US National Institute of Standards and Technology (NIST) recommends fields $\mathbb{F}_{2^k}$ corresponding to $k = 163, 233, 283, 409$, and $571$ for Elliptic Curve Cryptography (ECC). The large size and high complexity of such architectures necessitate custom hierarchical design [1] [2]. Custom design raises the potential for bugs in the implementation. As bugs can compromise the security of cryptosystems [3], formal verification of Galois field circuits becomes imperative.
∗This research is funded in part by NSF grants CCF-1320335 and CCF-1320385.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
DAC ’14 June 01 - 05 2014, San Francisco, CA, USA
ACM 978-1-4503-2730-5/14/06 ...$15.00.
http://dx.doi.org/10.1145/2593069.2593134.
Verification of these circuits is very challenging, as custom architectures are usually structurally very dissimilar from the baseline specification (golden) models. Contemporary verification techniques [4] (including the recent approaches targeted at Galois field circuits [5]) are unable to prove equivalence of such large circuits.
This paper presents an automatic combinational equivalence verification approach for very large Galois field arithmetic circuits. At the core of our approach is a symbolic method to derive the word-level, canonical, polynomial representation from a given combinational circuit. It employs concepts from commutative algebra and algebraic geometry, notably Gröbner basis theory [6], to derive the word-level abstraction. The approach is well-suited to arithmetic circuits that are hierarchically designed and to cases where the verification instances are structurally dissimilar.
Verification Problem: Given: i) a Galois field $\mathbb{F}_{2^k}$, along with the primitive polynomial $P(X)$ used for its construction; ii) the golden model circuit $C_1$ (called Spec); iii) the custom implementation $C_2$ (called Impl), along with any available design hierarchy. Prove or disprove the functional equivalence $C_1 \equiv C_2$; i.e. prove whether or not $C_1, C_2$ implement the same (polynomial) function over $\mathbb{F}_{2^k}$.
To solve this problem, we analyze the circuits $C_1, C_2$ separately, and derive unique canonical polynomial representations $\mathcal{F}_1, \mathcal{F}_2$, respectively. The equivalence test is then performed by simply matching the coefficients of $\mathcal{F}_1, \mathcal{F}_2$. The polynomial extraction approach is based on the following novel mathematical insights:
The mathematical framework: A combinational circuit $C$ with $k$-bit inputs and $k$-bit outputs implements Boolean functions that are mappings between $k$-dimensional Boolean spaces: $f : \mathbb{B}^k \to \mathbb{B}^k$, where $\mathbb{B} = \{0, 1\}$. The function $f$, which is a mapping among $2^k$ elements, can also be construed as a function over a Galois field of $2^k$ elements, $f : \mathbb{F}_{2^k} \to \mathbb{F}_{2^k}$. A well-known textbook result [7] states that: i) over a Galois field ($\mathbb{F}_q$) of $q$ elements, every function $f : \mathbb{F}_q \to \mathbb{F}_q$ is a polynomial function; and ii) there exists a unique canonical polynomial $\mathcal{F}$ that describes $f$. Motivated by this fundamental result, we devise an approach to derive a word-level, canonical, polynomial abstraction of the function as $Z = \mathcal{F}(A)$ over $\mathbb{F}_{2^k}$, where $Z = \{z_0, \dots, z_{k-1}\}$ and $A = \{a_0, \dots, a_{k-1}\}$ are, respectively, the output and input bit-vectors (words) of the circuit, and $\mathcal{F}$ denotes a polynomial representation of the circuit's functionality. The approach easily generalizes to circuits with an arbitrary number of word-level inputs, i.e. (multivariate) functions $f : \mathbb{F}_{2^k}^n \to \mathbb{F}_{2^k}$ represented by a polynomial $Z = \mathcal{F}(A_1, \dots, A_n)$.
The polynomial $\mathcal{F}$ can be derived by means of the Lagrange interpolation formula [7] [8]. However, this requires analyzing $f$ over the entire field $\mathbb{F}_{2^k}$, which is exhaustive and infeasible. To make this approach practical, we propose a symbolic method based on computer algebra and algebraic geometry to derive the canonical polynomial abstraction and employ it for design verification.
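To make the cost of the exhaustive route concrete, the following minimal sketch interpolates a function over the toy field $\mathbb{F}_4$ from its full truth table, using the textbook indicator form $\mathcal{F}(X) = \sum_{a \in \mathbb{F}_q} f(a)\,(1 - (X - a)^{q-1})$. The choice of $\mathbb{F}_4$ with $P(x) = x^2 + x + 1$, the integer encoding of field elements, and all helper names are illustrative assumptions and do not come from the paper.

```python
# Toy field F_4 = F_2[x] / (x^2 + x + 1); an element c0 + c1*alpha is the integer c0 + 2*c1.
K = 2
POLY = 0b111
Q = 1 << K

def gf_mul(a, b):
    """Multiply two field elements: shift-and-add product reduced modulo P(x)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if a & Q:                # degree reached K, so subtract (XOR) P(x)
            a ^= POLY
    return result

def poly_mul(p, q):
    """Multiply polynomials over the field (coefficient lists, lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] ^= gf_mul(pi, qj)
    return out

def interpolate(f):
    """Return the q coefficients of the canonical polynomial of f, visiting every point."""
    coeffs = [0] * Q
    for a in range(Q):
        indicator = [1]
        for _ in range(Q - 1):
            indicator = poly_mul(indicator, [a, 1])   # builds (X + a)^(q-1)
        indicator[0] ^= 1                             # 1 + (X + a)^(q-1): 1 at X = a, else 0
        fa = f(a)
        for i, c in enumerate(indicator):
            coeffs[i] ^= gf_mul(fa, c)
    return coeffs

def poly_eval(coeffs, x):
    """Evaluate a polynomial at x with Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = gf_mul(acc, x) ^ c
    return acc

square = lambda a: gf_mul(a, a)
print(interpolate(square))
```

For the squaring function the recovered coefficient list corresponds to $X^2$. The loop over all $q = 2^k$ points is what makes this route impractical for the field sizes considered in the paper.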
Contributions: Using polynomial abstractions, we analyze the given circuits and model the gate-level Boolean operators as elements of a multivariate polynomial ring with coefficients in $\mathbb{F}_{2^k}$. By exploiting the concepts of Nullstellensatz, Gröbner bases, elimination ideals and projections of varieties [6], we formulate the polynomial abstraction problem as one of computing a Gröbner basis of this set of polynomials, using a specific elimination term order, termed the abstraction term order $>$. Computing Gröbner bases using elimination orders is infeasible for large circuits. To overcome this limitation, we refine the term order based on a topological analysis of the circuit. Using this refinement, we guide the S-polynomial computations in Buchberger's algorithm [9] to derive the polynomial representation of the circuit's functionality.
This approach identifies the function implemented by the given Galois field arithmetic circuits for verification. We experiment with different architectures of Galois field multipliers and show that: i) when the circuits are given as flattened netlists, we can abstract the polynomial for up to 409-bit NIST-specified fields; and ii) when the design hierarchy is available, our approach can identify the polynomial for up to 571 bits, i.e. for all NIST-specified Galois fields $\mathbb{F}_{2^k}$ used in ECC. Our approach scales well for practical verification, whereas other techniques [5] fail beyond 163-bit circuits.
2. RELATED PREVIOUS WORK
Canonical Representations: The Reduced Ordered Binary Decision Diagram (ROBDD) [10], and its variants FDDs, ADDs, BMDs, etc., are canonical DAG representations of functions that are employed in design verification. The various decomposition principles behind these diagrams are based on point-wise, binary decomposition w.r.t. each (Boolean) variable. As such, these do not fully provide word-level abstraction capabilities from bit-level representations. Taylor Expansion Diagrams (TEDs) [11] are a word-level canonical representation of a polynomial expression, but they do not represent a polynomial function canonically.
MODDs [12] are a DAG representation of the characteristic function of a circuit over Galois fields $\mathbb{F}_{2^k}$. MODDs come close to satisfying our requirements as a canonical word-level representation that can be employed over Galois fields. However, MODDs do not scale well w.r.t. the circuit size: they are known to be infeasible for representing functions over vectors larger than 32 bits [12].
Equivalence Checking: Modern equivalence checkers employ techniques based on AIG-based reductions [4] and circuit-SAT solvers [13]. Such techniques are able to identify internal structural equivalences between the Spec and Impl circuits and reduce the instances for verification. However, when the arithmetic circuits are structurally very dissimilar, these techniques fail to prove equivalence (Tables I and II in [5] depict such experiments).
Word-Level Verification of Galois field circuits: In [14], the authors present the BLUEVERI tool for verification of Galois field circuits for error correcting codes against an algorithmic spec. The
implementation consists of a set of (pre-designed and verified) circuit blocks that are interconnected to form the system. Their objective is to prove the equivalence of the implementation against
a “check file” (spec), for which they employ a Nullstellensatz and
Gröbner basis formulation. In their setting, the polynomial function
representation of the sub-circuit blocks is already available.
In [5], Lv et al. present computer algebra techniques for formal
verification of Galois field arithmetic circuits. Given a specification polynomial F , and a circuit C, they formulate the verification
problem as an ideal membership test using the Gröbner basis theory. Verification is performed by a sequence of divisions modulo
the polynomials of the circuit. This approach moves the verification
complexity solely to that of polynomial division — which results
in the size-explosion of intermediate remainders in the division. As
a result, their approach does not scale beyond 163-bit circuits.
In contrast to [14] [5], we are not given the specification polynomial $\mathcal{F}$. Given the circuit $C$, we have to derive (extract) the word-level specification $\mathcal{F}$. Moreover, we perform a Gröbner basis computation on a subset of polynomials to derive the abstraction polynomial, which is the reason behind the success of our approach.
Polynomial Interpolation: Interpolation can be used to derive a polynomial representation for a function over $\mathbb{F}_{2^k}$. However, Newton's dense interpolation techniques exhibit very high complexity. While such techniques have been investigated by the logic synthesis and testing communities [8], they are feasible only over small fields, e.g. for computing Reed-Muller forms for multi-valued logic.
3. PRELIMINARIES
Galois fields and Polynomial functions: A Galois field ($\mathbb{F}_q$) is a field with a finite number ($q$) of elements, where $q$ is a power of a prime integer, i.e. $q = p^k$, where $p$ is a prime integer and $k \geq 1$. We consider fields where $p = 2$ and $k > 1$, i.e. binary Galois extension fields $\mathbb{F}_{2^k}$, as they are employed in hardware implementations of cryptography primitives.
To construct $\mathbb{F}_{2^k}$, we take the polynomial ring $\mathbb{F}_2[x]$, where $\mathbb{F}_2 = \{0, 1\}$, and an irreducible polynomial $P(x) \in \mathbb{F}_2[x]$ of degree $k$, and construct $\mathbb{F}_{2^k}$ as $\mathbb{F}_2[x] \pmod{P(x)}$. As a result, all field operations are performed modulo the irreducible polynomial $P(x)$ and the coefficients are reduced modulo $p = 2$. Any element $A \in \mathbb{F}_{2^k}$ can be represented in polynomial form as $A = a_0 + a_1\alpha + \cdots + a_{k-1}\alpha^{k-1}$, where $a_i \in \mathbb{F}_2$, $i = 0, \dots, k-1$, and $\alpha$ is a root of the irreducible polynomial, i.e. $P(\alpha) = 0$. Note that $A$ is essentially represented as a $k$-bit vector. The field $\mathbb{F}_{2^k}$ can therefore be construed as a $k$-dimensional vector space over $\mathbb{F}_2$, and $\mathbb{F}_2 \subset \mathbb{F}_{2^k}$.
Polynomial Functions $f : \mathbb{F}_{2^k} \to \mathbb{F}_{2^k}$: Arbitrary mappings among $k$-bit vectors can be constructed; each such mapping generates a function $f : \mathbb{B}^k \to \mathbb{B}^k$. Every such function is also a polynomial function over Galois fields: $f : \mathbb{F}_{2^k} \to \mathbb{F}_{2^k}$.
THEOREM 3.1. From [7]: Any function $f : \mathbb{F}_q \to \mathbb{F}_q$ is a polynomial function over $\mathbb{F}_q$; that is, there exists a polynomial $\mathcal{F} \in \mathbb{F}_q[x]$ such that $f(a) = \mathcal{F}(a)$ for all $a \in \mathbb{F}_q$.
An important property of Galois fields is that for all elements $A \in \mathbb{F}_q$, $A^q = A$, and hence $A^q - A = 0$. Therefore, the polynomial $X^q - X$ vanishes on all points in $\mathbb{F}_q$. Consequently, any polynomial $\mathcal{F}(X)$ can be reduced $\pmod{X^q - X}$ to obtain a canonical representation $(\mathcal{F}(X) \pmod{X^q - X})$ with degree at most $q - 1$.
DEFINITION 3.1. Any function $f : \mathbb{F}_q^d \to \mathbb{F}_q$ has a unique canonical representation (UCR) as a polynomial $\mathcal{F} \in \mathbb{F}_q[x_1, \dots, x_d]$ such that all its nonzero monomials are of the form $x_1^{i_1} \cdots x_d^{i_d}$ where $0 \leq i_j \leq q - 1$, for all $j = 1, \dots, d$.
Modulo-multipliers over $\mathbb{F}_{2^k}$: Over Galois fields $\mathbb{F}_{2^k}$, multiplication is performed as $Z = A \times B \pmod{P(x)}$, where $A, B \in \mathbb{F}_{2^k}$ are $k$-bit inputs and $P(x)$ is the given irreducible polynomial. The multiplier circuit takes bit-level inputs $\{a_0, \dots, a_{k-1}, b_0, \dots, b_{k-1}\}$ and produces output $Z = \{z_0, \dots, z_{k-1}\}$, such that $A = \sum_{i=0}^{k-1} a_i \alpha^i$, $B = \sum_{i=0}^{k-1} b_i \alpha^i$ and $Z = \sum_{i=0}^{k-1} z_i \alpha^i$. First, the bit-wise multiplication $S = A \times B$ is computed using an array multiplier architecture, and then the result $S$ is reduced $\pmod{P(x)}$ to obtain $Z = S \pmod{P(x)}$. Such architectures are termed Mastrovito multipliers [15].
Mastrovito multipliers are inefficient, especially for cryptosystems where multiplication is often performed repeatedly. For such applications, Montgomery reduction operations are proposed [1] [2]. Montgomery reduction (MR) computes $MR(A, B) = A \cdot B \cdot R^{-1} \pmod{P(x)}$, where $A, B$ are $k$-bit inputs, $R$ is suitably chosen as $R = \alpha^k$, $R^{-1}$ is the multiplicative inverse of $R$ in $\mathbb{F}_{2^k}$, and $P(x)$ is the irreducible polynomial. Since Montgomery reduction cannot directly compute $A \cdot B \pmod{P(x)}$, we need to pre-compute $A \cdot R$ and $B \cdot R$, as shown in Fig. 1.
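The relation between the two multiplier styles can be checked on a small field. The sketch below is a purely functional model, not a hardware algorithm: it assumes the toy field $\mathbb{F}_{2^4}$ with $P(x) = x^4 + x + 1$ (an illustrative choice), models each MR block directly as $A \cdot B \cdot R^{-1}$, and assumes the pre-computation is done as $MR(A, R^2)$ and $MR(B, R^2)$. All function names are hypothetical.

```python
# Toy field F_{2^4} with P(x) = x^4 + x + 1 (illustrative choice, not from the paper).
K = 4
POLY = 0b10011
Q = 1 << K

def gf_mul(a, b):
    """Product followed by reduction modulo P(x), computed shift-and-add style."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if a & Q:                # degree reached K, so subtract (XOR) P(x)
            a ^= POLY
    return result

def gf_pow(a, e):
    """Raise a to the e-th power by repeated multiplication."""
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result

ALPHA = 0b10                     # the root alpha of P(x)
R = gf_pow(ALPHA, K)             # R = alpha^K, reduced modulo P(x)
R_INV = gf_pow(R, Q - 2)         # inverse via a^(q-2), since a^(q-1) = 1 for a != 0
R2 = gf_mul(R, R)

def mr(a, b):
    """Functional model of one Montgomery reduction block: a * b * R^-1."""
    return gf_mul(gf_mul(a, b), R_INV)

def montgomery_multiply(a, b):
    """Compose four MR blocks to obtain a * b (mod P)."""
    ar = mr(a, R2)               # A * R
    br = mr(b, R2)               # B * R
    abr = mr(ar, br)             # A * B * R
    return mr(abr, 1)            # A * B

mismatches = [(a, b) for a in range(Q) for b in range(Q)
              if montgomery_multiply(a, b) != gf_mul(a, b)]
print("R =", bin(R), "mismatches:", len(mismatches))
```

Exhaustive comparison is only viable here because the field has 16 elements; for the NIST field sizes the same equivalence has to be established symbolically.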
Figure 1: Montgomery multiplication over $\mathbb{F}_{2^k}$ using four MRs. (Block diagram with inputs $A$, $B$, $R^2$ and "1", intermediate signals $AR$, $BR$, $A \cdot B \cdot R$, and output $G = A \cdot B \pmod{P}$.)
Clearly, Montgomery multipliers are hierarchically designed as an interconnection of MR blocks (Fig. 1). These circuits are structurally dissimilar from the baseline Mastrovito multipliers. In this paper, Mastrovito and Montgomery multipliers are used as Spec and Impl benchmarks, respectively, for equivalence verification.
3.1 Computer Algebra Preliminaries
Let $\mathbb{F}_q[x_1, \dots, x_d]$ be the polynomial ring with indeterminates $x_1, \dots, x_d$, where $q = 2^k$. A monomial is a power product $X = x_1^{\alpha_1} \cdot x_2^{\alpha_2} \cdots x_d^{\alpha_d}$, where $\alpha_i \geq 0$, $i \in \{1, \dots, d\}$. A polynomial $f \in \mathbb{F}_q[x_1, \dots, x_d]$, $f \neq 0$, is a finite sum of terms $f = c_1 X_1 + c_2 X_2 + \cdots + c_t X_t$. Here $c_1, \dots, c_t$ are coefficients and $X_1, \dots, X_t$ are monomials. A monomial ordering $>$ is imposed on the ring such that $X_1 > X_2 > \cdots > X_t$. Subject to such an ordering, $lt(f) = c_1 X_1$, $lm(f) = X_1$, $lc(f) = c_1$ are the leading term, leading monomial and leading coefficient of $f$, respectively. Similarly, $tail(f) = c_2 X_2 + \cdots + c_t X_t$. Division of a polynomial $f$ by polynomial $g$ gives remainder polynomial $r$, denoted $f \xrightarrow{g}_+ r$. Similarly, $f$ can be reduced (divided) w.r.t. a set of polynomials $F = \{f_1, \dots, f_s\}$ to obtain a remainder $r$, denoted $f \xrightarrow{F}_+ r$, such that no term in $r$ is divisible by the leading term of any polynomial in $F$.
An ideal $J$ generated by polynomials $f_1, \dots, f_s \in \mathbb{F}_q[x_1, \dots, x_d]$ is: $J = \langle f_1, \dots, f_s \rangle = \{\sum_{i=1}^{s} h_i \cdot f_i : h_i \in \mathbb{F}_q[x_1, \dots, x_d]\}$. The polynomials $f_1, \dots, f_s$ form the basis or generators of $J$.
Let $a = (a_1, \dots, a_d) \in \mathbb{F}_q^d$ be a point, and $f \in \mathbb{F}_q[x_1, \dots, x_d]$ be a polynomial. We say that $f$ vanishes on $a$ if $f(a) = 0$. For any ideal $J = \langle f_1, \dots, f_s \rangle \subseteq \mathbb{F}_q[x_1, \dots, x_d]$, the affine variety of $J$ over $\mathbb{F}_q$ is: $V(J) = \{a \in \mathbb{F}_q^d : \forall f \in J, f(a) = 0\}$. In other words, the variety corresponds to the set of all solutions to $f_1 = \cdots = f_s = 0$.
DEFINITION 3.2. For any subset $V$ of $\mathbb{F}_q^d$, the ideal of polynomials that vanish on $V$, called the vanishing ideal of $V$, is defined as: $I(V) = \{f \in \mathbb{F}_q[x_1, \dots, x_d] : \forall a \in V, f(a) = 0\}$. Therefore, if a polynomial $f$ vanishes on a variety $V$, then $f \in I(V)$.
THEOREM 3.2. Strong Nullstellensatz over $\mathbb{F}_q$ (from [16]): Let $J \subseteq \mathbb{F}_q[x_1, \dots, x_d]$ be an ideal, and let $J_0 = \langle x_1^q - x_1, \dots, x_d^q - x_d \rangle$ be the ideal of all vanishing polynomials. Let $V_{\mathbb{F}_q}(J)$ denote the variety of $J$ over $\mathbb{F}_q$. Then, $I(V_{\mathbb{F}_q}(J)) = J + J_0 = J + \langle x_1^q - x_1, \dots, x_d^q - x_d \rangle$.
Gröbner Bases: An ideal $J$ may have many different generators (representations): i.e. $F = \{f_1, \dots, f_s\}$ and $G = \{g_1, \dots, g_t\}$ such that $J = \langle f_1, \dots, f_s \rangle = \langle g_1, \dots, g_t \rangle$ and $V(J) = V(f_1, \dots, f_s) = V(g_1, \dots, g_t)$. A Gröbner basis is a representation of an ideal which allows one to solve many polynomial decision questions.
DEFINITION 3.3. [Gröbner Basis] For a monomial ordering $>$, a set of non-zero polynomials $G = \{g_1, g_2, \cdots, g_t\}$ contained in an ideal $J$ is called a Gröbner basis for $J$ $\iff$ $\forall f \in J$, $f \neq 0$, there exists $i \in \{1, \cdots, t\}$ such that $lm(g_i)$ divides $lm(f)$; i.e., $G = GB(J) \iff \forall f \in J : f \neq 0, \exists g_i \in G : lm(g_i) \mid lm(f)$.
As a consequence of Definition 3.3, the set $G$ is a Gröbner basis of ideal $J$ if and only if for all $f \in J$, dividing $f$ by polynomials of $G$ gives 0 remainder: $G = GB(J) \iff \forall f \in J, f \xrightarrow{G}_+ 0$.
Buchberger's algorithm [9], shown in Algorithm 1, computes a Gröbner basis over a field. $Spoly(f, g) = \frac{L}{lt(f)} \cdot f - \frac{L}{lt(g)} \cdot g$ where $L = LCM(lm(f), lm(g))$. Note that $Spoly(f, g)$ cancels the leading terms of $f, g$, and the remainder $r$ obtained in $Spoly(f, g) \xrightarrow{G'}_+ r$ gives a new leading term. A Gröbner basis is computed when all $Spoly(f, g) \xrightarrow{G'}_+ 0$. A Gröbner basis can be further reduced; a reduced Gröbner basis is a canonical representation of the ideal w.r.t. the set monomial order.
Algorithm 1: Buchberger's Algorithm
Input: $F = \{f_1, \dots, f_s\}$
Output: $G = \{g_1, \dots, g_t\}$
$G := F$;
repeat
    $G' := G$;
    for each pair $\{f, g\}$, $f \neq g$ in $G'$ do
        $Spoly(f, g) \xrightarrow{G'}_+ r$;
        if $r \neq 0$ then
            $G := G \cup \{r\}$;
        end
    end
until $G = G'$;
4. WORD-LEVEL ABSTRACTION USING GRÖBNER BASIS
We are given a circuit $C$ with $k$-bit inputs and outputs that performs a polynomial computation $Z = \mathcal{F}(A)$ over $\mathbb{F}_q = \mathbb{F}_{2^k}$. Let $P(x)$ be the given irreducible or primitive polynomial used for field construction, and let $\alpha$ be its root, i.e. $P(\alpha) = 0$. Note that we do not know the polynomial representation $\mathcal{F}(A)$ and our objective is to identify (the coefficients of) $\mathcal{F}(A)$. Let $\{a_0, \dots, a_{k-1}\}$ denote the primary inputs and let $\{z_0, \dots, z_{k-1}\}$ be the primary outputs of $C$. Then, the word-level and bit-level correspondences are:
$A = a_0 + a_1\alpha + \cdots + a_{k-1}\alpha^{k-1}; \quad Z = z_0 + z_1\alpha + \cdots + z_{k-1}\alpha^{k-1}; \qquad (1)$
We analyze the circuit and model all the gate-level Boolean operators as polynomials in $\mathbb{F}_2 \subset \mathbb{F}_{2^k}$. To this set of Boolean polynomials, append the polynomials of Eqn. (1) that relate the word-level and bit-level variables. Denote this set of polynomials as $F = \{f_1, \dots, f_s\}$ over the ring $R = \mathbb{F}_q[x_1, \dots, x_d, Z, A]$. Here $x_1, \dots, x_d$ denote, collectively, all the bit-level variables of the circuit, i.e. primary inputs, primary outputs and the intermediate circuit variables, and $Z, A$ are the word-level variables. Denote the generated ideal as $J = \langle F \rangle \subset R$. Also, denote the (unknown) "specification" of the circuit as a polynomial $f : Z - \mathcal{F}(A)$, or equivalently as $f : Z + \mathcal{F}(A)$, as $-1 = +1$ in $\mathbb{F}_{2^k}$.
As $Z = \mathcal{F}(A)$, clearly $f : Z + \mathcal{F}(A)$ agrees with the solutions to the circuit equations $f_1 = \cdots = f_s = 0$. This means that $f : Z + \mathcal{F}(A)$ vanishes on the variety $V_{\mathbb{F}_q}(J)$. If $f : Z + \mathcal{F}(A)$ vanishes on $V_{\mathbb{F}_q}(J)$, then due to Definition 3.2, $f : Z + \mathcal{F}(A)$ is a member of the ideal $I(V_{\mathbb{F}_q}(J))$. Strong Nullstellensatz over Galois fields (Theorem 3.2) tells us that $I(V_{\mathbb{F}_q}(J)) = J + J_0$, where $J_0 = \langle x_1^2 - x_1, \dots, x_d^2 - x_d, Z^q - Z, A^q - A \rangle$ is the ideal of all vanishing polynomials in $R$. From these results, we deduce that:
PROPOSITION 4.1. The (unknown) specification polynomial $f : Z + \mathcal{F}(A) \in (J + J_0)$.
The variety $V_{\mathbb{F}_q}(J)$ is the set of all consistent assignments to the nets (signals) in the circuit $C$. If we project this variety on the word-level input and output variables of the circuit $C$, we essentially generate the function $f$ implemented by the circuit. Projection of varieties from $d$-dimensional space $\mathbb{F}_q^d$ onto a lower dimensional subspace $\mathbb{F}_q^{d-l}$ is equivalent to eliminating $l$ variables from the corresponding ideal.
DEFINITION 4.1. (Elimination Ideal) From [6]: Given $J = \langle f_1, \dots, f_s \rangle \subset \mathbb{F}_q[x_1, \dots, x_d]$, the $l$th elimination ideal $J_l$ is the ideal of $\mathbb{F}_q[x_{l+1}, \dots, x_d]$ defined by $J_l = J \cap \mathbb{F}_q[x_{l+1}, \dots, x_d]$.
In other words, the $l$th elimination ideal does not contain variables $x_1, \dots, x_l$, nor do the generators of it. Moreover, Gröbner bases may be used to generate an elimination ideal by using an "elimination term order." One such ordering is a pure lexicographic ordering, which features in the following theorem:
THEOREM 4.1. (Elimination Theorem) From [6]: Let $J \subset \mathbb{F}_q[x_1, \dots, x_d]$ be an ideal and let $G$ be a Gröbner basis of $J$ with respect to a lex ordering where $x_1 > x_2 > \cdots > x_d$. Then for every $0 \leq l \leq d$, the set $G_l = G \cap \mathbb{F}_q[x_{l+1}, \dots, x_d]$ is a Gröbner basis of the $l$th elimination ideal $J_l$.
EXAMPLE 4.1. Consider polynomials $f_1 : x^2 - y - z - 1$, $f_2 : x - y^2 - z - 1$, $f_3 : x - y - z^2 - 1$ and ideal $J = \langle f_1, f_2, f_3 \rangle \subset \mathbb{C}[x, y, z]$. Let us compute a Gröbner basis $G$ of $J$ w.r.t. lex term order with $x > y > z$. Then $G = \{g_1, \dots, g_4\}$ is obtained as: $g_1 : x - y - z^2 - 1$; $g_2 : y^2 - y - z^2 - z$; $g_3 : 2yz^2 - z^4 - z^2$; $g_4 : z^6 - 4z^4 - 4z^3 - z^2$. Notice that the polynomial $g_4$ contains only the variable $z$, and it eliminates variables $x, y$. Similarly, polynomials $g_2, g_3, g_4$ contain variables $y, z$ and eliminate $x$. According to Theorem 4.1, $G_1 = G \cap \mathbb{C}[y, z] = \{g_2, g_3, g_4\}$ and $G_2 = G \cap \mathbb{C}[z] = \{g_4\}$ are the Gröbner bases of the 1st and 2nd elimination ideals of $J$, respectively.
The above example motivates our approach: since we want to derive a polynomial representation from a circuit in variables $Z, A$, we can compute a Gröbner basis of $J + J_0$ w.r.t. an elimination order that eliminates all the ($d$) bit-level variables of the circuit. The Gröbner basis $G_d = G \cap \mathbb{F}_q[Z, A]$ of the $d$th elimination ideal of $(J + J_0)$ will contain polynomials in only $Z, A$.
PROBLEM SETUP 4.1. Given a circuit $C$ with $k$-bit inputs and outputs which computes a polynomial function $f : \mathbb{F}_{2^k} \to \mathbb{F}_{2^k}$. Let $A = \{a_0, \dots, a_{k-1}\}$ and $Z = \{z_0, \dots, z_{k-1}\}$ be the inputs and outputs of the circuit, respectively, such that $A = a_0 + a_1\alpha + \cdots + a_{k-1}\alpha^{k-1}$ and $Z = z_0 + \cdots + z_{k-1}\alpha^{k-1}$, where $P(\alpha) = 0$. Let $Z = \mathcal{F}(A)$ be the unknown polynomial function implemented by the circuit. Denote by $x_i$, $i = 1, \dots, d$ all the Boolean variables of the circuit. Let $R = \mathbb{F}_{2^k}[x_i, Z, A : i = 1, \dots, d]$ denote the corresponding polynomial ring and let ideal $J \subset \mathbb{F}_{2^k}[x_i, Z, A : i = 1, \dots, d]$ be generated by the bit-level and word-level polynomials of the circuit. Let $J_0 = \langle x_i^2 - x_i, Z^{2^k} - Z, A^{2^k} - A : i = 1, \dots, d \rangle$ denote the ideal of vanishing polynomials in $R$.
DEFINITION 4.2. Abstraction Term Order $>$: Using the variable order $x_1 > x_2 > \cdots > x_d > Z > A$, impose a lex term order $>$ on the polynomial ring $R = \mathbb{F}_q[x_1, \dots, x_d, Z, A]$. This elimination term order $>$ is defined as the Abstraction Term Order. The relative ordering among $x_1, \dots, x_d$ can be chosen arbitrarily.
THEOREM 4.2. Abstraction Theorem: Using the setup and notations from Problem Setup 4.1 above, compute a Gröbner basis $G$ of ideal $(J + J_0)$ using the abstraction term order $>$. Then:
(i) $G$ must contain a polynomial of the form $Z + \mathcal{G}(A)$; and
(ii) $Z + \mathcal{G}(A)$ is such that $\mathcal{F}(A) = \mathcal{G}(A)$, $\forall A \in \mathbb{F}_q$. In other words, $\mathcal{G}(A)$ and $\mathcal{F}(A)$ are equal as polynomial functions over $\mathbb{F}_q$.
PROOF. (i) Since $f : Z + \mathcal{F}(A)$ is a polynomial representation of the circuit, $Z + \mathcal{F}(A) \in J + J_0$, due to Proposition 4.1. Therefore, according to the definition of a Gröbner basis (Definition 3.3), the leading term of $Z + \mathcal{F}(A)$ (which is $Z$) should be divisible by the leading term of some polynomial $g_i \in G$. The only way $lt(g_i)$ can divide $Z$ is when $lt(g_i) = Z$ itself. Moreover, due to our abstraction (lex) term order, $Z > A$, so this polynomial must be of the form $Z + \mathcal{G}(A)$.
(ii) As $Z = \mathcal{F}(A)$ represents the function of the circuit, $Z + \mathcal{F}(A) \in J + J_0$. Moreover, $V(J + J_0) \subset V(Z + \mathcal{F}(A))$. Project this variety $V(J + J_0)$ onto the co-ordinates corresponding to $(A, Z)$. What we obtain is the graph of the function $A \mapsto \mathcal{F}(A)$ over $\mathbb{F}_{2^k}$. Since $Z + \mathcal{G}(A)$ is an element of the Gröbner basis of $J + J_0$, $V(J + J_0) \subset V(Z + \mathcal{G}(A))$ too. Due to this inclusion of varieties, the points that satisfy $(J + J_0)$ also satisfy $Z + \mathcal{G}(A) = 0$ and $Z + \mathcal{F}(A) = 0$. Therefore, $Z = \mathcal{G}(A)$ gives the same function as $Z = \mathcal{F}(A)$, for all $A \in \mathbb{F}_{2^k}$. ✷
COROLLARY 4.1. Computing a reduced Gröbner basis $G_r$ of $J + J_0$, we will obtain one and only one polynomial in $G_r$ of the form $Z + \mathcal{G}(A)$, such that $Z = \mathcal{G}(A)$ is the unique, minimal, canonical representation of the function $f$ implemented by the circuit.
As a consequence of Theorem 4.2 and Corollary 4.1, if we compute a reduced Gröbner basis $G$ of $J + J_0$ using the abstraction term order, we will always find the one and only polynomial of the form $Z + \mathcal{G}(A)$ in the Gröbner basis, such that $Z = \mathcal{G}(A)$ is the unique canonical polynomial representation of the circuit.
The above results trivially extend to circuits with multiple word-level input variables $A_1, \dots, A_n$, and the canonical polynomial representation obtained by computing a reduced Gröbner basis $G_r$ of $J + J_0$ using $>$ is of the form $Z = \mathcal{F}(A_1, \dots, A_n)$.
Figure 2: A 2-bit Multiplier over $\mathbb{F}_{2^2}$. The gate ⊗ corresponds to an AND gate, i.e. bit-level multiplication modulo 2. The gate ⊕ corresponds to an XOR gate, i.e. addition modulo 2.
EXAMPLE 4.2. Demonstration of our approach: Consider the 2-bit multiplier circuit over $\mathbb{F}_{2^2}$ given in Fig. 2, which implements a polynomial function: $Z = A \times B$, $Z, A, B \in \mathbb{F}_4$. Here, $A = a_0 + a_1\alpha$, $B = b_0 + b_1\alpha$ are the word-level inputs and $Z = z_0 + z_1\alpha$ is the output in $\mathbb{F}_4$, and $P(x) = x^2 + x + 1$ (given) where $P(\alpha) = 0$. The functionality of the circuit is described using the following polynomials derived from the Boolean gate-level operators: $f_1 : z_0 + z_1\alpha + Z$; $f_2 : b_0 + b_1\alpha + B$; $f_3 : a_0 + a_1\alpha + A$; $f_4 : s_0 + a_0 \cdot b_0$; $f_5 : s_1 + a_0 \cdot b_1$; $f_6 : s_2 + a_1 \cdot b_0$; $f_7 : s_3 + a_1 \cdot b_1$; $f_8 : r_0 + s_1 + s_2$; $f_9 : z_0 + s_0 + s_3$; $f_{10} : z_1 + r_0 + s_3$. Ideal $J = \langle f_1, \dots, f_{10} \rangle$. Generate $J_0$ as the ideal of vanishing polynomials. Impose the following abstraction term order, i.e. a lex order with "circuit variables" > "Output Z" > "Inputs A, B", and compute a Gröbner basis $G$ of $J + J_0$. We find the following polynomials in the basis: $g_1 : z_0 + z_1\alpha + Z$; $g_2 : b_0 + b_1\alpha + B$; $g_3 : a_0 + a_1\alpha + A$; $g_4 : s_3 + r_0 + z_1$; $g_5 : s_1 + s_2 + r_0$; $g_6 : s_0 + s_3 + z_0$; $g_7 : Z + AB$; $g_8 : a_1 b_1 + a_1 B + b_1 A + z_1$; $g_9 : r_0 + a_1 b_1 + z_1$; $g_{10} : s_2 + a_1 b_0$, and the polynomials of $J_0$. The polynomial $g_7 : Z + AB$ describes $Z = AB$ as the (canonical) polynomial function implemented by the circuit.
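Because every basis element must vanish on the variety of the circuit, the claim about $g_7$ (and, as a second sample, $g_8$) can be sanity-checked by simulation. The sketch below enumerates all 16 consistent assignments of the 2-bit multiplier and evaluates both polynomials; it is a brute-force check, not a Gröbner basis computation, and the integer encoding of $\mathbb{F}_4$ and the function names are illustrative assumptions.

```python
from itertools import product

POLY = 0b111      # P(x) = x^2 + x + 1
ALPHA = 0b10      # alpha, the root of P(x); F_4 = {0, 1, alpha, alpha + 1} as 2-bit integers

def gf4_mul(a, b):
    """Multiply two elements of F_4 modulo P(x)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b100:
            a ^= POLY
    return result

def simulate(a0, a1, b0, b1):
    """Evaluate the gate equations f4..f10 and the word-level relations f1..f3."""
    s0, s1, s2, s3 = a0 & b0, a0 & b1, a1 & b0, a1 & b1   # AND gates
    r0 = s1 ^ s2                                          # XOR gates
    z0 = s0 ^ s3
    z1 = r0 ^ s3
    A = a0 ^ gf4_mul(a1, ALPHA)
    B = b0 ^ gf4_mul(b1, ALPHA)
    Z = z0 ^ gf4_mul(z1, ALPHA)
    return {"a1": a1, "b1": b1, "z1": z1, "A": A, "B": B, "Z": Z}

def g7(v):
    """g7 : Z + A*B."""
    return v["Z"] ^ gf4_mul(v["A"], v["B"])

def g8(v):
    """g8 : a1*b1 + a1*B + b1*A + z1."""
    return (v["a1"] & v["b1"]) ^ gf4_mul(v["a1"], v["B"]) ^ gf4_mul(v["b1"], v["A"]) ^ v["z1"]

points = [simulate(*bits) for bits in product((0, 1), repeat=4)]
print(all(g7(v) == 0 for v in points), all(g8(v) == 0 for v in points))
```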
5. IMPROVING OUR APPROACH
Computing Gröbner bases w.r.t. elimination orders is infeasible for large circuits. The worst-case complexity of computing $GB(J + J_0)$ in $\mathbb{F}_q[x_1, \dots, x_d]$ is known to be bounded by $q^{O(d)}$ [16], which is prohibitive over large fields. Therefore, we need to improve our approach to overcome this complexity. Notice that our approach "searches" for only one polynomial ($Z + \mathcal{G}(A)$) in the Gröbner basis, and it does so by computing the entire Gröbner basis. This motivates us to investigate whether it is possible to guide a sequence of $Spoly(f, g) \xrightarrow{J + J_0}_+ r$ computations to arrive at the desired word-level polynomial, without considering other polynomials in the generating set. For this purpose, we exploit the well-known product criterion:
LEMMA 5.1. [Product Criterion [17]] Let $f, g \in \mathbb{F}[x_1, \cdots, x_d]$ be polynomials. If the equality $lm(f) \cdot lm(g) = LCM(lm(f), lm(g))$ holds, then $Spoly(f, g) \xrightarrow{G}_+ 0$.
The above result states that when the leading monomials of $f, g$ are relatively prime, then $Spoly(f, g)$ always reduces to 0 modulo $G$. Thus $Spoly(f, g)$ need not be considered in Buchberger's algorithm. Recall that in the Abstraction Term Order (Definition 4.2), we have "circuit variables $x_1, \dots, x_d$" $> Z > A$, where the relative ordering among $x_1, \dots, x_d$ is not important. We will now further refine the abstraction term order while exploiting the product criterion.
DEFINITION 5.1. Refined Abstraction Term Order $>_r$: Starting from the primary outputs of the circuit $C$, perform a reverse topological traversal toward the primary inputs. Order each variable of the circuit according to its reverse topological level: i.e. $x_i > x_j$ if $x_i$ appears earlier in the reverse topological order. Impose a lex term order $>_r$ on $\mathbb{F}_q[x_1, \dots, x_d, Z, A]$ with "circuit variables ordered reverse topologically" $> Z > A$. This term order $>_r$ is called the refined abstraction term order (RATO).
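A variable order of this kind can be derived mechanically from a netlist. The following sketch uses the 2-bit multiplier of Fig. 2 as input and makes two assumptions that the definition leaves open: a variable's level is its longest distance from any primary output, and ties within a level are broken alphabetically. The netlist encoding, the function names, and the placement of the word-level input $B$ after $A$ are likewise illustrative.

```python
# Netlist of the 2-bit multiplier: gate output -> gate inputs.
NETLIST = {
    "s0": ["a0", "b0"], "s1": ["a0", "b1"], "s2": ["a1", "b0"], "s3": ["a1", "b1"],
    "r0": ["s1", "s2"], "z0": ["s0", "s3"], "z1": ["r0", "s3"],
}
PRIMARY_OUTPUTS = ["z0", "z1"]

def reverse_topological_levels(netlist, outputs):
    """Assign each variable its longest distance from a primary output."""
    level = {}

    def visit(var, depth):
        if level.get(var, -1) >= depth:
            return                       # already placed at this depth or deeper
        level[var] = depth
        for source in netlist.get(var, ()):
            visit(source, depth + 1)

    for out in outputs:
        visit(out, 0)
    return level

def rato(netlist, outputs, word_level):
    """Circuit variables by reverse topological level, then the word-level variables."""
    level = reverse_topological_levels(netlist, outputs)
    circuit_vars = sorted(level, key=lambda v: (level[v], v))
    return circuit_vars + list(word_level)

print(" > ".join(rato(NETLIST, PRIMARY_OUTPUTS, ["Z", "A", "B"])))
```

With the longest-distance convention the primary inputs all land in the last circuit level, after every gate output that depends on them, which is what the acyclicity argument for the leading terms relies on.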
When RATO is imposed on the set of polynomials $F = \{f_1, \dots, f_s\}$, $J = \langle F \rangle$, it is easy to see that each polynomial in $F$ is of the form $f_i = x_i + P_i$, where $x_i$ is a gate output and $P_i = tail(f_i)$ represents the function implemented by that gate. Moreover, each indeterminate $x_j$ that appears in $P_i$ satisfies $x_i > x_j$ (acyclic circuit). Furthermore, each gate output is a leading term of some polynomial in $F$. Since each gate output is a unique signal, $f_i = x_i + P_i$ and $f_j = x_j + P_j$ have relatively prime leading terms ($x_i \neq x_j$). So, $Spoly(f_i, f_j)$ need not be considered in the Gröbner basis computation.
However, there is one (and only one) pair of polynomials ( fw , fg )
∈ F with leading terms that are not relatively prime: i) the wordlevel polynomial ( fw ) corresponding the outputs: fw = z0 + z1 α +
· · · + zk−1 αk−1 + Z, with gate output z0 as the leading term; and ii)
the polynomial fg that models the function at the gate z0 . Due to
J+J
0
RATO, Spoly( fw , fg ) −−−→
+ r is the only candidate critical pair to
be evaluated at the start of Buchberger’s algorithm. Based on these
concepts, we devise the following approach to efficiently search for
the polynomial function:
1. Impose RATO on the ring. Select the only critical pair ( fw , fg )
that does not have relatively prime leading terms, and comF,F0
pute Spoly( fw , fg ) −−→+ r.
2. Then $r$ will contain only the following variables: i) the bit-level
primary input variables of the circuit; ii) the word-level
output $Z$; and iii) the word-level input $A$. The remainder $r$
will not contain any bit-level variable corresponding to the
output of any gate in the design; i.e. primary output bits and
intermediate variables of the circuit do not appear in $r$. To
prove this, assume that a non-primary-input variable $x_j$ appears
in a monomial term $m_j$ in $r$. Since there always exists
a polynomial $f_j \in F$ such that $f_j = x_j + \mathrm{tail}(f_j)$, $\mathrm{lt}(f_j)$
divides monomial $m_j$ and $m_j$ can be canceled. Therefore, all
such terms $m_j$ with non-primary-input bit-level variables can
be eliminated.
3. Two cases need to be considered:
(a) (Case 1:) Remainder $r$ does not contain any primary input bits.
Then $r$ contains only the word-level variables $Z, A$. Since RATO is
lex with $Z > A$, the remainder $r$ corresponds to the desired
canonical polynomial representation: $r : Z + \mathcal{G}(A)$.
(b) (Case 2:) Remainder $r$ contains both the bit-level primary input
variables (call this set $X_{PI}$) and the word-level variables.
Then, due to Lemma 5.1, we only need to consider the set
$F' = \{r, f_{w_i}\}$ and
$F_0' = \{x_i^2 - x_i, Z^q - Z, A^q - A : x_i \in X_{PI}\}$, where
$f_{w_i} = a_0 + a_1\alpha + \cdots + a_{k-1}\alpha^{k-1} + A$ is the polynomial that relates
the word-level input ($A$) and the bit-level inputs $\{a_0, \dots, a_{k-1}\}$.
Compute the reduced Gröbner basis $G'$ of $F' \cup F_0'$, which
is a much simpler computation. Then $G'$ is guaranteed to contain a
polynomial of the form $Z + \mathcal{G}(A)$, which is the canonical
polynomial representation of the function of the circuit.
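One way to build intuition for Case 2 is that, over a finite field, each primary input bit is itself a polynomial function of the word it belongs to, so the bit-level variables left in $r$ can always be rewritten in terms of $A$. The sketch below checks this for $k = 2$. It assumes $\alpha^2 = \alpha + 1$ (the only irreducible quadratic over $\mathbb{F}_2$ is $x^2 + x + 1$) and stores the element $a_0 + a_1\alpha$ as the integer `a0 | (a1 << 1)`; the helper names are illustrative and do not come from the paper's tool.

```python
ALPHA = 0b10  # the element alpha; a0 + a1*alpha is stored as a0 | (a1 << 1)


def gf4_mul(x, y):
    """Multiply two elements of F_4 = F_2[alpha] / (alpha^2 + alpha + 1)."""
    p = 0
    for i in range(2):           # carry-less (XOR) multiplication
        if (y >> i) & 1:
            p ^= x << i
    if p & 0b100:                # replace alpha^2 by alpha + 1
        p ^= 0b111
    return p


def bits_from_word(A):
    """Recover (a0, a1) from A using only field operations on A."""
    A2 = gf4_mul(A, A)
    a1 = A2 ^ A                                      # a1 = A^2 + A
    a0 = gf4_mul(ALPHA, A2) ^ gf4_mul(ALPHA ^ 1, A)  # a0 = alpha*A^2 + (alpha+1)*A
    return a0, a1


for A in range(4):
    print(A, bits_from_word(A), (A & 1, A >> 1))
```

Each printed row shows the bits computed from the polynomials next to the bits read directly from the encoding of $A$; the two agree for every element of $\mathbb{F}_4$.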
EXAMPLE 5.1. Consider, again, the example shown in Example 4.2,
corresponding to the multiplier circuit of Fig. 2. Impose
RATO: $\{z_0 > z_1\} > \{r_0 > s_0 > s_3\} > \{s_1 > s_2\} > \{a_0 > a_1 > b_0 > b_1\} > Z > A$.
Then, the polynomials $f_1, \dots, f_{10}$ shown in Example 4.2 are already
represented in RATO. Assume that the circuit is
correct and has no bugs. Then $f_1$ and $f_9$ are the only two polynomials
whose leading terms are not relatively prime. Computing
$\mathrm{Spoly}(f_1, f_9) \xrightarrow{F, F_0}_{+} r$, we find that $r = Z + A \cdot B$, which is the
word-level polynomial representation of the circuit.
Now, let us introduce a bug in the design. Replace the polynomial
$f_8 : r_0 + s_1 + s_2$ in $F$ with $f_8 : r_0 + s_0 + s_2$ (bug introduced).
Computing $\mathrm{Spoly}(f_1, f_9) \xrightarrow{F, F_0}_{+} r$, we find that
$r = \alpha a_1 b_1 + (\alpha + 1) a_1 B + b_1 A + Z + (\alpha + 1) A B$.
Note that in addition to the word-level
variables $Z, A, B$, we also have the bit-level primary inputs $a_1, b_1$ in $r$.
Moreover, all other polynomials in $F$ have leading terms that are
relatively prime w.r.t. $\mathrm{lt}(r)$.
Now we take $F' = \{r, a_0 + a_1\alpha + A, b_0 + b_1\alpha + B\}$ and
$F_0' = \{a_0^2 - a_0, a_1^2 - a_1, b_0^2 - b_0, b_1^2 - b_1, A^4 - A, B^4 - B, Z^4 - Z\}$
and compute the reduced Gröbner basis $G'$ of $F' \cup F_0'$. We find the polynomial
$Z + \alpha A^2 B^2 + A^2 B + (\alpha + 1) A B^2 + (\alpha + 1) A B$ in $G'$,
which is indeed the polynomial representation of the buggy circuit!
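The two word-level polynomials of this example can be cross-checked by exhaustive simulation over $\mathbb{F}_4$. The sketch below is not the Gröbner basis computation itself. It assumes a gate-level netlist consistent with the RATO levels and with $f_8$ as quoted above (four AND gates $s_0 = a_0 b_0$, $s_1 = a_0 b_1$, $s_2 = a_1 b_0$, $s_3 = a_1 b_1$, followed by $r_0 = s_1 + s_2$, $z_0 = s_0 + s_3$, $z_1 = r_0 + s_3$), since Fig. 2 is not reproduced here. It also assumes $\alpha^2 = \alpha + 1$ and the integer encoding `a0 | (a1 << 1)`.

```python
ALPHA = 0b10  # the element alpha; a0 + a1*alpha is stored as a0 | (a1 << 1)


def gf4_mul(x, y):
    """Multiply two elements of F_4 = F_2[alpha] / (alpha^2 + alpha + 1)."""
    p = 0
    for i in range(2):
        if (y >> i) & 1:
            p ^= x << i
    if p & 0b100:                # replace alpha^2 by alpha + 1
        p ^= 0b111
    return p


def simulate(a0, a1, b0, b1, buggy=False):
    """Gate-level simulation of the assumed netlist; returns Z = z0 + z1*alpha."""
    s0, s1, s2, s3 = a0 & b0, a0 & b1, a1 & b0, a1 & b1
    r0 = (s0 ^ s2) if buggy else (s1 ^ s2)   # the bug swaps s1 for s0
    z0 = s0 ^ s3
    z1 = r0 ^ s3
    return z0 | (z1 << 1)


def buggy_poly(A, B):
    """alpha*A^2*B^2 + A^2*B + (alpha+1)*A*B^2 + (alpha+1)*A*B over F_4."""
    A2, B2 = gf4_mul(A, A), gf4_mul(B, B)
    return (gf4_mul(ALPHA, gf4_mul(A2, B2))
            ^ gf4_mul(A2, B)
            ^ gf4_mul(ALPHA ^ 1, gf4_mul(A, B2))
            ^ gf4_mul(ALPHA ^ 1, gf4_mul(A, B)))


def matches_everywhere(buggy):
    """Compare the circuit with its word-level polynomial on all 16 inputs."""
    for bits in range(16):
        a0, a1, b0, b1 = [(bits >> i) & 1 for i in range(4)]
        A, B = a0 | (a1 << 1), b0 | (b1 << 1)
        expected = buggy_poly(A, B) if buggy else gf4_mul(A, B)
        if simulate(a0, a1, b0, b1, buggy) != expected:
            return False
    return True


print(matches_everywhere(buggy=False), matches_everywhere(buggy=True))
```

Under these assumptions the correct netlist agrees with $Z = A \cdot B$ and the buggy netlist agrees with the polynomial found in $G'$ on every input. Exhaustive simulation only scales to tiny fields, which is why the symbolic derivation is needed for large $k$.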
6. EXPERIMENTAL RESULTS
Using the approach described in Section 5, we have performed
experiments to prove equivalence between Mastrovito ($C_1$) and
Montgomery ($C_2$) multiplier circuits. The Mastrovito multiplier, the
baseline golden model (Spec), is provided as a bit-blasted/flattened
gate-level netlist. The implementation (Impl) is given as the hierarchically designed
Montgomery multiplier, as shown in Fig. 1; i.e. each MR block
is given as a flattened gate-level netlist, and these MR blocks are
interconnected to construct the multiplier circuit.
For equivalence checking using AIG and SAT-based methods, a
miter is constructed between Spec and Impl, and the ABC tool [4]
and CSAT solver [13] are used. These tools cannot prove equivalence
beyond 16-bit multiplier circuits within 24 hours; none of
the NIST-specified ECC circuits can be verified. This is exactly the
same observation made by the authors of [5] (cf. Tables I & II in [5]).
When we apply the approach of [5], we are able to prove equivalence
only up to 163-bit multipliers, beyond which the verification
tool of [5] runs into a memory explosion.
We apply our abstraction-based approach to derive the canonical
word-level polynomials $F_1, F_2$ from circuits $C_1, C_2$ and then
prove equivalence by checking if $F_1 = F_2$ (coefficient matching).
First, we use the SINGULAR computer algebra tool [18] to derive
the polynomial abstraction by computing a full Gröbner basis of
$J + J_0$ (using the slimgb command), and find that the technique
is infeasible (memory explosion) beyond 32-bit circuits, as
the full Gröbner basis using elimination orders is extremely large.
Finally, we apply the approach presented in Section 5 to specifically
guide the search for the abstraction polynomial. Since this
approach constitutes only a sequence of polynomial divisions, we
exploit an F4-style reduction approach, described in [5] (Section 7),
for which we built a custom tool. All experiments are conducted on
an Intel Xeon 6-core CPU running Scientific Linux 6.2 x86_64 with
96 GB RAM. The timeout limit for all experiments, for all tools, was
24 hours.
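Coefficient matching on canonical polynomials can be illustrated as follows. This is a minimal sketch, assuming a hypothetical representation in which a word-level polynomial in $A, B$ is a dictionary from exponent pairs to coefficients, with coefficients of $\mathbb{F}_{2^k}$ encoded as integers so that addition is XOR. Exponents are first reduced using $A^q = A$ (with $q = 2^k$) so that two polynomials describing the same function have identical dictionaries. None of these names come from the paper's tool.

```python
def reduce_exponent(e, q):
    """Reduce an exponent using X^q = X; exponents below q are kept as-is."""
    return e if e < q else (e - 1) % (q - 1) + 1


def canonicalize(poly, q):
    """poly maps (deg_A, deg_B) -> coefficient; returns the reduced form."""
    out = {}
    for (ea, eb), coeff in poly.items():
        key = (reduce_exponent(ea, q), reduce_exponent(eb, q))
        out[key] = out.get(key, 0) ^ coeff     # addition in F_{2^k} is XOR
    return {m: c for m, c in out.items() if c != 0}


def equivalent(p1, p2, q):
    """Coefficient matching on the canonical forms."""
    return canonicalize(p1, q) == canonicalize(p2, q)


Q = 4                                          # F_4, as in Example 5.1
spec = {(1, 1): 1}                             # Z = A*B
same = {(4, 1): 1, (2, 0): 3, (5, 0): 3}       # A^4*B + 3*A^2 + 3*A^5
buggy = {(2, 2): 2, (2, 1): 1, (1, 2): 3, (1, 1): 3}

print(equivalent(spec, same, Q))
print(equivalent(spec, buggy, Q))
```

Here `same` collapses to $A \cdot B$ because $A^4 = A$ and $A^5 = A^2$ over $\mathbb{F}_4$, so the two extra terms cancel, while `buggy` (the polynomial of the buggy circuit in Example 5.1, with $\alpha$ encoded as 2 and $\alpha + 1$ as 3) does not match the specification.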
Table 1 depicts the time required to derive the polynomial abstraction
from Mastrovito circuits. The tool takes the circuit as
input, performs a reverse topological traversal to determine RATO,
applies the approach presented in Section 5 and derives the polynomial
representation $Z = A \cdot B$. For up to 409-bit multipliers,
with 508K gates, our approach is successful; the 571-bit Mastrovito
multiplier (1.6M gates) times out. Table 2 depicts the
results for Montgomery multipliers. In the table, 'Blk A' and
'Blk B' denote the input MR blocks, 'Blk Mid' denotes the middle
block and 'Blk Out' is the output block. While each block is an
MR block, some have been simplified by constant propagation (recall,
$R = \alpha^k$), hence they have different sizes. First, a polynomial
is extracted for each MR block (gate-level to word-level abstraction),
and then the approach is re-applied at word level to derive the
input-output relation (solved trivially in < 1 second). Our approach
can extract the word-level polynomial for up to 571-bit circuits!
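The word-level composition of the MR blocks can be sanity-checked by evaluation on a small field. The sketch below assumes the standard four-block arrangement, in which each block computes $MR(X, Y) = X \cdot Y \cdot R^{-1}$, Blk A and Blk B multiply their inputs by the constant $R^2$, Blk Mid combines the two results, and Blk Out multiplies by the constant 1. Fig. 1 is not reproduced here, so this wiring is an assumption. The field $\mathbb{F}_{2^3}$ with $P(x) = x^3 + x + 1$ and all function names are illustrative choices.

```python
K = 3
POLY = 0b1011             # x^3 + x + 1, irreducible over F_2 (illustrative)
R = POLY ^ (1 << K)       # R = alpha^K reduced modulo POLY, i.e. alpha + 1


def gf_mul(x, y):
    """Multiply in F_{2^K}: carry-less product, then reduction modulo POLY."""
    p = 0
    for i in range(K):
        if (y >> i) & 1:
            p ^= x << i
    for i in range(2 * K - 2, K - 1, -1):   # clear bits 2K-2 down to K
        if (p >> i) & 1:
            p ^= POLY << (i - K)
    return p


def gf_inv(x):
    """Brute-force inverse; adequate only for tiny illustrative fields."""
    return next(y for y in range(1, 1 << K) if gf_mul(x, y) == 1)


R_INV = gf_inv(R)


def mr(x, y):
    """Word-level function of one MR block: x * y * R^{-1}."""
    return gf_mul(gf_mul(x, y), R_INV)


def montgomery(a, b):
    """Compose the four blocks at word level (assumed wiring)."""
    r2 = gf_mul(R, R)
    blk_a = mr(a, r2)            # A * R
    blk_b = mr(b, r2)            # B * R
    blk_mid = mr(blk_a, blk_b)   # A * B * R
    return mr(blk_mid, 1)        # A * B


print(all(montgomery(a, b) == gf_mul(a, b)
          for a in range(1 << K) for b in range(1 << K)))
```

With this wiring the factors of $R$ cancel and the composition equals $A \cdot B$. The paper performs the corresponding step symbolically on the extracted block polynomials rather than by evaluation.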
Table 1: Abstraction of Mastrovito multipliers. Time given in seconds, memory given in MB. TO = 24 hours.

Size (k)                163     233     283     409     571
# of Gates              153K    167K    399K    508K    1.6M
Our tool   Time         4,351   5,777   40,114  72,708  TO
           Max Mem      162     168     381     509     -
Table 2: Abstraction of Montgomery blocks. Time given in seconds, memory given in MB.

Circuit Size (k)              163     233     283     409     571
# of Gates   Blk A            33K     55K     82K     168K    330K
             Blk B            33K     55K     82K     168K    330K
             Blk Mid          85K     163K    241K    502K    980K
             Blk Out          32K     54K     81K     168K    328K
Our Tool     Time Blk A       145     322     1,011   5,084   14,288
             Time Blk B       101     306     1,058   5,381   12,298
             Time Blk Mid     264     1,014   5,085   20,294  47,364
             Time Blk Out     126     267     1,032   3,243   13,508
             Total Time       636     1,909   8,186   34,002  87,458
             Max Mem          34      71      104     224     477

7. CONCLUSION
This paper has presented a technique to derive a word-level, canonical
polynomial representation from a circuit by modeling the function
over the Galois field $\mathbb{F}_{2^k}$. We show that this can be achieved by
computing a Gröbner basis of the ideal generated by the constraints
derived from the circuit using an elimination term order. To overcome
the complexity of computing the Gröbner basis, we have proposed
a refinement of the abstraction term order, using which we
can more efficiently guide the search for the word-level polynomial
abstraction. Using our approach, we can identify the polynomial
function and thus prove the correctness of Galois field multiplier
circuits with up to 571-bit data-path size.
8. REFERENCES
[1] C. K. Koc and T. Acar, “Montgomery Multiplication in GF(2^k)”,
Designs, Codes and Cryptography, vol. 14, pp. 57–69, 1998.
[2] Huapeng Wu, “Montgomery Multiplier and Squarer for a Class of
Finite Fields”, IEEE Transactions On Computers, vol. 51, May 2002.
[3] E. Biham, Y. Carmeli, and A. Shamir, “Bug Attacks”, in Proceedings
on Advances in Cryptology, pp. 221–240, 2008.
[4] A. Mishchenko, S. Chatterjee, R. Brayton, and N. Een,
“Improvements to Combinational Equivalence Checking”, in Proc.
Intl. Conf. on CAD (ICCAD), pp. 836–843, 2006.
[5] J. Lv, P. Kalla, and F. Enescu, “Efficient Gröbner Basis Reductions for
Formal Verification of Galois Field Arithmetic Circuits”, in IEEE
Trans. on CAD, vol. 32, pp. 1409–1420, 2013.
[6] D. Cox, J. Little, and D. O’Shea, Ideals, Varieties and Algorithms,
Springer-Verlag, 1997.
[7] Rudolf Lidl and Harald Niederreiter, Finite Fields, Cambridge
University Press, 1997.
[8] Z. Zilic and Z. Vranesic, “A deterministic multivariate interpolation
algorithm for small finite fields”, IEEE Trans. Comp., vol. 51, 2002.
[9] B. Buchberger, Ein Algorithmus zum Auffinden der Basiselemente
des Restklassenringes nach einem nulldimensionalen Polynomideal,
PhD thesis, Philosophische Fakultät an der
Leopold-Franzens-Universität, Austria, 1965.
[10] R. E. Bryant, “Graph Based Algorithms for Boolean Function
Manipulation”, IEEE Trans. on Comp., vol. C-35, pp. 677–691, 1986.
[11] M. Ciesielski, P. Kalla, and S. Askar, “Taylor Expansion Diagrams:
A Canonical Representation for Verification of Data-Flow Designs”,
IEEE Transactions on Computers, vol. 55, pp. 1188–1201, 2006.
[12] A. Jabir et al., “A Technique for Representing Multiple Output
Binary Functions with Applications to Verification and Simulation”,
IEEE Trans. on Comp., vol. 56, pp. 1133–1145, 2007.
[13] F. Lu, L. Wang, K. Cheng, and R. Huang, “A Circuit SAT Solver
With Signal Correlation Guided Learning”, in IEEE Design,
Automation and Test in Europe, pp. 892–897, 2003.
[14] A. Lvov, L. Lastras-Montaño, V. Paruthi, R. Shadowen, and
A. El-Zein, “Formal Verification of Error Correcting Circuits using
Computational Algebraic Geometry”, in Proc. Formal Methods in
Computer-Aided Design (FMCAD), pp. 141–148, 2012.
[15] E. Mastrovito, “VLSI Designs for Multiplication Over Finite Fields
GF(2^m)”, Lecture Notes in CS, vol. 357, pp. 297–309, 1989.
[16] S. Gao, “Counting Zeros over Finite Fields with Gröbner Bases”,
Master’s thesis, Carnegie Mellon University, 2009.
[17] B. Buchberger, “A criterion for detecting unnecessary reductions in
the construction of Gröbner bases”, in EUROSAM, 1979.
[18] W. Decker, G.-M. Greuel, G. Pfister, and H. Schönemann,
“SINGULAR 3-1-3: A computer algebra system for polynomial
computations”, 2011, http://www.singular.uni-kl.de.