Download Reduced Number Theoretic Transforms(RNTT)

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Georg Cantor's first set theory article wikipedia , lookup

List of important publications in mathematics wikipedia , lookup

Infinity wikipedia , lookup

Series (mathematics) wikipedia , lookup

Arithmetic wikipedia , lookup

Proofs of Fermat's little theorem wikipedia , lookup

Factorization of polynomials over finite fields wikipedia , lookup

Volume and displacement indicators for an architectural structure wikipedia , lookup

Hyperreal number wikipedia , lookup

Fundamental theorem of algebra wikipedia , lookup

Addition wikipedia , lookup

Mathematics of radio engineering wikipedia , lookup

Transcript
INFOTEH-JAHORINA Vol. 7, Ref. B-I-13, p. 107-111, March 2008.
Extended Number Theoretic Transforms
Milan Vukoslavcevic, Vojin Senk, Dragana Bajic*, Djordje Baralic**, Mathias Richter ***
***Infineon Technologies AG; *University of Novi Sad, Department of Telecommunication;
**University of Kragujevac, Department of Mathematic
contact: [email protected]
Sadržaj – U ovome radu opisane su eXtended Number Theoretic Transforms (xNTT) koje predstavljaju
proširenje Number Theoretic Transforms (NTT). xNTT uvode nove module, elemente generatore i generalno
veće dužine transformacija umesto GCD(( p1  1), ( p2  1),..., ( pq  1)) , predložene xNTT transformacije
imaju maksimalnu dužinu ( pq  1) pri čemu je p1  p2      pq ;
m  p1e1  p2e2    pqq .
e
Abstract – In this paper an eXtended Number Theoretic Transforms (xNTT) is presented, giving an extension of
Number Theoretic Transforms (NTT). It introduces new modulus, generating elements and in general longer
length of transforms instead of old definition of NTT being GCD(( p1  1), ( p2  1),..., ( pq  1)) , the
proposed xNTT transforms have maximal transformation length ( pq  1) where p1  p2      pq ;
m  p1e1  p2e2    pqq .
e
1. Introduction
(or
Transforms possessing the cyclic convolution
property (CCP) can be used to compute
convolutions of two long discrete integer sequences
efficiently. Most famous algorithm for fast
implementation of DFT-like transforms is FFT.
FFT supports fast computation of DFT over finite
and infinite algebraic structures and posses CCP.
DFT over finite algebraic structures Z / mZ like
fields and rings [1] are known as
Number
Theoretic Transforms or shortly NTT. Basic
developments in this area were done by Agarwal
and Burrus[2][3][4]. NTT attracted very fast a lot of
popularity because of two reasons. First some types
of NTTs do not need multiplications at all and they
can compute exactly, i.e. without any rounding
error. Still there are some areas where exact
calculation are mandatory like multiplication of big
numbers[1],
cryptography[17],
digital
watermarking[16] and image/audio processing[5].
From the point of arithmetical complexity the
fastest NTTs are the ones based on FFT (most
= 5N log 2
N real operations
In case of NTT (FFT implementation) operation
count roughly is:
N log 2 N - additions/subtractions (modulus m)
N
log 2 N - real multiplication (modulus m)
2
= 1.5N log 2 N real operation (modulus m)
Such a difference in the operation count was quite
attractive and Agarwal/Burrus [2] implemented an
optimized NTT algorithm on a 32bit IBM 370/155.
For convolutions of real sequences, they found NTT
to be 3 to 5 times faster and more accurate than
FFT.
It is known that [4][5] that an NTT of length N in
the ring Z / mZ posses the CCP when:
N | GCD(( p1  1), ( p2  1),..., ( pq  1))
k
efficient for lengths of 2 ). A drawback of using
FFT over infinite algebraic structures (further
referred as ‘real FFT’ because of real coefficients
involved in computation) for correlation of real
sequences is the need to use complex arithmetic due
to the appearance of complex so called twiddle
factors. The number of operations needed for real
FFT transforms over infinite algebraic field roughly
is:
N log 2 N - complex additions/subtractions –
(or
3N log 2 N - equivalent real operations)
(1)
m  p1e1  p2e2    pqq , p1  p2  ...  pq
e
In this paper we present an eXtended NTT, called
shortly xNTT, which posses the CCP even for
transform lengths:
N xNTT  pq  1
(2)
The main drawback of NTTs is the need to
implement arithmetic modulus some specific
number. For this reason in the past the most popular
transforms were Fermat Number Transform (FNT)
[2] and Mersenne transform(MNT) [18] which had
the simplest modulus arithmetic and modulus
2 N log 2 N - equivalent real operations)
N
log 2 N - complex multiplications 2
107
U (Z / nZ )  U (Z / 2 e1 Z ) U (Z / p 2e2 Z )     U (Z / p qq Z )
reduction could be implemented in a couple of
clocks [7].
Although faster methods for convolutions over
rational numbers like Walsh transform have been
discovered, NTT stays one of the fastest methods
for convolution over finite algebraic structures.
More references on the history of NTT theory can
be found in [8]. The HW (hardware)
implementation is consider in [9] where is proven
that savings in comparison with real real FFT do
exist. NTT can be also defined over complex
sequences [10].
e

N r . max  LCM (| U (Z / p1ek Z ) |,..., | U (Z / pqq Z ) |) (4)
e
where
cardinal
number,
U ( Z / piee Z ) - units in Z / piei Z .
Observing xNTT behaviour over the base structures,
according to Chinese Remainder Theorem-CRT,
Z / mi Z , mi  piei , i  1,..., q , provides us
with a better insight of parameter relationships.
New existence conditions instead of calculation of
different exponential sums, Greatest Common
Divisor
are based on knowing orders
N i ( N  N i  ni ) of generator  in Z / mi Z and
2. Analysis of cyclic groups in Z/mZ
The analysis of NTT is based on the finite
multiplicative cyclic groups with neutral element
for multiplication ‘1’. A goal of this paper is to
define xNTT over every finite ring Z / mZ and
over extended subset of cyclic groups over which
NTT could not be defined. The structure of those
cyclic groups is defined with so called structural
theorem (3) (or Fundamental theorem of finitely
generated Abelian groups):
e
U (Z / nZ )  U (Z / 2 e1 Z ) U (Z / p 2e2 Z )     U (Z / p qq Z ) (3)
on the small number of simple operations, as shown
in the following two chapters.
3. Existence of classic NTT and
correlation core sum analysis
Classic NTT (further denoted as cNTT) belongs to
the group of DFT-like transformation and is defined
as:
which also shows what are the expected orders of
cyclic groups in some ring Z / mZ . Cyclic groups
existing in some ring Z / mZ can be visually
presented by so called cycle graphs [11] based on
decompositions stated by structural theorem. An
example of the construction of an xNTT will be
shown in the cycle graph for Z/15Z (Fig. 1)
X k m
N 1
 x(n)
nk
(5)
n 0
inverse NTT, noted INTT is defined as:
N 1
xn  m N 1  X (k ) nk
(6)
k 0
where
14
| U ( Z / piee Z ) | is
11
 m is operation ‘congruent modulus m’,
m  p1 1 p2 2    pk k , p1  p2      pk .
e
1
13
8
e
NTT
2
4
e
The convolution property of two sequences x(n),
y(n) exists for any DFT-like transforms under
following conditions:
7
X (k )  m
xNTT
Figure 1 Cycle Graph for Z/15Z
N 1
 x(m)
mk
(7)
m 0
N 1
 y(l )
l k
It is obvious that while the Euler totient function
gives us  (15)  8 we do not necessary have an
Y (k )  m
element of order 8 in Z/15Z in fact in this case the
maximal order is 4 . According to the existence
condition for NTT from this cycle graph, the
maximal transform length is 2 . In the following
chapters it is shown that xNTT can be constructed in
this very case with a transform length of 4. The
maximal possible cyclic group length N r . max can
1 N 1
 W (k ) nk
N k 0
W (k )  m X (k )  Y (k )
(8)
l 0
r ( n)  m
r ( n)  m
be seen by applying least common multiple
function over Cartesian factors (4), got by applying
the structural theorem (3):
1
N
N 1 N 1 N 1
  x(m) y(l )
(10)
( m l  n ) k
(11)
k 0 l 0 m 0
introducing substitutes for
occuring:
m  l  n two cases are
a ) m  l  n  f  N ; f  [0,1]
108
(9)
b ) m  l  n  t  q  N ; q  [0,1]
1
N
a)
b)
1
N
N 1
Nevertheless
reasonable
tradeoffs
between
transform length and output range can be achieve.
N 1
N 1
l 0
k 0
N 1
 x(m) y( fN  n  m)  kfN
 x(m) y(t  qN  n  m) 
l 0
4. Construction of LNTT
k ( t  qN )
stands for ‘reduced Length NTT’. They are
defined with the following existence conditions:
LNTT
k 0
Underlined parts are irrelevant due to periodicity of
N . Therefore, the following condition is necessary
for the convolution property:
 N ; s  fN
;
 0; s  fN
N 1
 ks  m 
k 0
1. order ( )  N in ring
2. GCD( N , m)  1
3.
(12)
The first condition is a standard condition needed
for the existence of a cyclic group, the second is
used to be distinguished from cNTT, and the third
condition is to have sum (12) satisfied.
Since the second condition is different for cNTT
and LNTT they are not overlapping, although they
may exist over the same m . In order to get an
equivalent set of existence conditions and to get
better insight in the set of the conditions necessary
for the existence of LNTT one should observe the
behaviour of transforms in the base structures:
Classic set of existence conditions for cNTT which
are additionally used for searching for parameters
of cNTT are [1][12]:
1. order ( )  N in ring Z / mZ
2. GCD( N , m)  1
GCD( s  1, m)  1; s  [1, N )
N 1
mi  piei , |   ks |m i  0 , s  [1, N ) , i  1,., q
k 0
The maximal transform length is N . Correlation
sum (12) have the interesting property of having,
for every m , at least one transforms of length
This will hold under the following two conditions
(can be proved by the binomial formula):
N xNTT max  pq  1 . This statement can be
1. At least in one base structures order of element is
not N [3].
2.
confirmed by observing according to the CRT in
each of the rings Z / mi Z , i  1,..., q there exists a
cNTT
of length
a.) if mi  pi i  ( (
e
N i  pi  1 which makes
the only condition to be checked is:
mq  p . So starting from the original
eq
q
mi | N
sum (12) one can get its reduced version:
1
||
N
|
1
Nq
N 1
1
 | |mq = |

N
k 0
m
ks
N q 1
  ks |
k 0
b.) if
ks
k 0
|  |m1  m1  1
|mq =
1; s  fNq
0; s  fNq
From the existence condition 2 follows that LNTT
can not be defined for prime numbers, so m for
LNTT is always composite. Also from the same
condition since a common factor between N and
m exists, the output range will not be as for cNTT
but reduced to:
=
mq
One can notice that the output range is reduced to
mq but on the other side xNTT can be computed
over a much wider set of modulus than only mq , in
this case over every moduli
m1  2 e1  ord ( ) m1  2
the only condition to be checked is:
N 1

N i  pi  1 )  ni  1 ) ,
pi  2
possible instead observing sum (12) over modulus
m to observe it over the biggest prime factor
exponent
 N  d m 1 ;
 s  1  d  Tmax , Tmax  {T }max ,
 s  1  d  T  GCD(T , m)  1
This sum will be referred to as ‘correlation core
sum’ since by observing it one can derive other
types of NTTs. This identity is not only important
for the convolution property but for the existence of
inverse NTT transforms [8].
3.
Z / mZ
m
dynmax =
where mq | m .
109
m
d
Interesting is that LNTT exists even for even
numbers for which cNTT does not exist. A
disadvantage is that the output range is no longer
equal to m , nevertheless a reasonable trade-off can
be achieved.
The HW architecture of LNTT is the same as for the
cNTT with one additional dividing (at output) with
d in the case of LNTT.
long
as
m'
 1 the convolution
GCD( N , m' )
property holds.
The output range of mNTT is:
dynmax =
m'
GCD( N , m' )
5. Construction of mNTT
The benefits from mNTT are that all computations
can be done over the modulus m , and only at the
very last stage needs to be applied reduction
modulus m' .
mNTT stands for ‘reduced Modulus NTT’. mNTT
are based on the observation that if the convolution
property does not originally holds for m
potentially (12) holds for some value m'  m , so
mNTT exists under following conditions:
6. Conclusion
1. order ( )  N in ring
Z / mZ
N 1
| N |m ' ; s  fN
ks
2.    m ' 
k 0
 0; s  fN
where s  [1, N ) and 1  m'  m .
m'
3.
>1
GCD( N , m' )
The reduced modulo
by means of CRT :
The Number Theoretic Transforms (cNTT) are
explained and their extension xNTTs are introduced
and discussed. cNTT have a maximal transform
length of:
N max  GCD(( p1  1), ( p2  1),..., ( pq  1))
while the newly introduced xNTT have a maximal
length of:
N max  pq  1 ; p1  p2  ...  pq
m' can be computed simplier
Those bigger transform lengths have a drawback of
reduction in output range. Nevertheless reasonable
tradeoffs can be achieved. From the point of HW
implementations new transforms exist over every
modulus m which enables the use of composite
Mersenne/Fermat numbers (Table 1). Highly
composite (hc.) transforms lengths are also
available and are presented in the same table. With
the progress in the cryptology [13][14][15] new
favorable moduli were invented and xNTT could as
well be defined over them successfully.
The simplicity of the existence conditions observed
over base structures enable us predicting the
behaviors of xNTT over Z / mZ .
m'  m c1  | N | m1 sw1  ...  cq  | N |mq swq
where
1; (( N i  pi  1)  ni  1)  mi | N
swi  
0; others

, pi  2 , c i -coeficients of CRT decomposition
Example:
m  3 5 , c1  10 , c2  6 ,   2 , N  4 
m'  10
This leads to a reduction in the final output range
but up to the last stage the computation are done
over the implementation simpler modulus m - like
in Pseudo Fermat/Mersenne transforms[20], being a
special case of this transfor.
7. Acknowledgments
Special thanks to Infineon AG Software Defined
Radio department for support and sponsorship for
this research. Also special thanks to Ivan Stanojevic
Department of telecommunication of Novi Sad and
Catherine Thiallier Siemens AG Munich for a lot of
patience and useful comments.
1
Differing from LNTT, some mNTT have N .
There are cases when GCD( N , m' )  1 but as
n
Mersenne number
cNTT
Factor decomposition
N max
2n  1
Fermat number
cNTT
xNTT
N xNTT max
110
Factor decomposition
N max
2n  1
xNTT
N xNTT max
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
3
7
3*5
31
3*3*7
127
3*5*17
7*73
3*11*31
23*89
3*3*5*7*13
8191
3*43*127
7*31*151
3*5*17*257
131071
3*3*3*7*19*73
524287
3*5*5*11*31*41
7*7*127*337
3*23*89*683
47*178481
3*3*5*7*13*17*241
31*601*1801
3*2731*8191
7*73*262657
3*5*29*43*113*127
233*1103*2089
3*3*7*11*31*151*331
2147483647
3*5*17*257*65537
2 (hc.2)
6 (hc.2)
2 (hc.2)
30 (hc.2)
2 (hc.2)
126 (hc.2)
2 (hc.2)
6 (hc.2)
2 (hc.2)
22 (hc.2)
2 (hc.2)
8190 (hc.2)
2 (hc.2)
6 (hc.2)
2 (hc.2)
131070 (hc.2)
2 (hc.2)
524286 (hc.2)
2 (hc.2)
6 (hc.2)
2 (hc.2)
46 (hc.2)
2 (hc.2)
30 (hc.2)
2 (hc.2)
6 (hc.2)
2 (hc.2)
58 (hc.2)
2 (hc.2)
2147483646 (hc.2)
2 (hc.2)
2 (hc.2)
6 (hc.2)
4 (hc.4)
30 (hc.2)
6 (hc.2)
126 (hc.2)
16 (hc.16)
72 (hc.8)
30 (hc.2)
88 (hc.8)
12 (hc.4)
8190 (hc.2)
126 (hc.2)
150 (hc.2)
256 (hc.256)
131070 (hc.2)
72 (hc.8)
524286 (hc.2)
40 (hc.8)
336 (hc.16)
682 (hc.8)
178480 (hc.16)
240 (hc.16)
1800 (hc.8)
8190 (hc.2)
262656 (hc.512)
126 (hc.16)
2088 (hc.8)
330 (hc.2)
2147483646(hc.2)
65536 (hc. 65536)
5
3*3
17
3*11
5*13
3*43
257
3*3*3*19
5*5*41
3*683
17*241
3*2731
5*29*113
3*3*11*331
65537
3*43691
5*13*37*109
3*174763
17*61681
3*3*43*5419
5*397*2113
3*2796203
97*257*673
3*11*251*4051
5*53*157*1613
3*3*3*3*19*87211
17*15790321
3*59*3033169
5*5*13*41*61*1321
3*715827883
641*6700417
4 (hc.4)
2 (hc.2)
16 (hc.16)
2 (hc.2)
4 (hc.4)
2 (hc.2)
256 (hc.256)
2 (hc.2)
4 (hc.4)
2 (hc.2)
16 (hc.16)
2 (hc.2)
4 (hc.4)
2 (hc.2)
65536 (hc.65536)
2 (hc.2)
4 (hc.4)
2 (hc.2)
16 (hc.16)
2 (hc.2)
4 (hc.4)
2 (hc.2)
32 (hc.32)
2 (hc.2)
4 (hc.4)
2 (hc.2)
16 (hc.16)
2 (hc.2)
4 (hc.2)
2 (hc.2)
128 (hc.128)
4 (hc.4)
2 (hc.2)
16 (hc.16)
10 (hc.2)
12 (hc.4)
42 (hc.2)
256 (hc.256)
18 (hc.2)
40 (hc.8)
682 (hc.2)
240 (hc.16)
2730 (hc.2)
112 (hc.16)
330 (hc.2)
65536 (hc.65536)
43690 (hc.2)
108 (hc.4)
174762 (hc.2)
61680 (hc.16)
5418 (hc.2)
2112 (hc.64)
?2796202 (hc.2)
672 (hc.256)
4050 (hc.2)
1612 (hc.4)
87210 (hc.2)
15790320 (hc.16)
3033168 (hc.16)
1320 (hc.8)
715827882 (hc.2)
6700416 (hc.128)
Table 1 Overview of maximal transforms lengths, in case of cNTT and xNTT
8. Literature
[1] J. Pollard, “The fast fourier transform in a finite
filed” Math.of Comp.,vol.25, pp.365-374, April
1971.
[2] R.C. Agarwal and C.S. Burrus, “Fast OneDimensional
Digital
Convolution
by
Multidimensional Techniques”, IEEE Trans.
Acoustics, Speech and Signal Processing ASSP-22,
1 (1974)
[3] R. C. Agarwal and C. S. Burrus, “Fast
Convolution Using Fermat Number Transform with
Applications to Digital Filtering”, IEEE
Transaction on acoustic, speech and signal
processing, vol. ASSP-22, NO.2, April 1974.
[4] H.J. Nussbaumer, “Fast Fourier Transformation
and Convolution Algorithms”, Springer Series in
Information Sciences
[5] S.Gudvangen, Hogskulen i Buskerud, “Practical
Application of Number Theoretic Transforms”
NORSIG-99, september 1999.
[6] Douglas G.Mayers, “Digital signal processing“,
1990 Prentice Hall.
[7] E. Vegh and L. M. Leibowitz, “Fast Complex
Convolution in Finite Rings”, IEEE Transactions
on acoustic, speech and signal processing, August
1976
[8] M. Bhattacharya, R.Creutzburg, J. Astola,
„Some Historical Notes on Number Theoretic
Transform“, The 2004 International TICSP
Workshop on Spectral Methods and Multirate
Signal Processing Proceedings, SMMSP2004.
[9] James H.McClellan, „Hardware Realization of a
Fermat Number Transform“, IEEE Transactions on
acoustic, speech, and signal processing, vol.ASSP24, NO.3, June 1976.
[10] Irving S.Reed, T.K.Truong, „The use of Finite
Fields to Compute Convolutions“, IEEE
Transactions on Information Theory, VOL.IT-21,
NO.2.March 1975.
[11] Daniel Shanks, “Solved and Unsolved
Problems in Number Theory”, Chelsea Publishing
Co, NY.
[12] P.J.Erdelski, “Exact convolutions by NumberTheoretic Transforms”, Naval Undersea Center,
May 2,1975
[13] J. Solinas, "Generalized Mersenne numbers",
Technical report CORR-39, Dept. of C&O,
University of Waterloo, 1999
[14] Knuth, Donald E., Seminumerical Algorithms,
Addison-Wesley, 1981
[15] Handbook of Applied Cryptography, by A.
Menezes, P. van Oorschot, and S. Vanstone, CRC
Press, 1996.
[16] T.Hideaki; “Detection of Image Alterations
Using Fragile Watermarks Based on Number
Theoretic Transform”, IEICE General Conference
2002
[17 ] C. Porkodi, “Public Key Cryptosystem based
on Number Theoretic Transforms”, International
Journal of Math. Volume 2 Number 1
[18] C.M. Rader, “Discrete Convolutions via
Mersenne Transforms” IEEE Trans.Computers C21, 1269 (1972).
111