INFOTEH-JAHORINA Vol. 7, Ref. B-I-13, p. 107-111, March 2008.

Extended Number Theoretic Transforms

Milan Vukoslavcevic, Vojin Senk, Dragana Bajic*, Djordje Baralic**, Mathias Richter***
***Infineon Technologies AG; *University of Novi Sad, Department of Telecommunication; **University of Kragujevac, Department of Mathematic
contact: [email protected]

Abstract - This paper presents eXtended Number Theoretic Transforms (xNTT), an extension of Number Theoretic Transforms (NTT). xNTT introduces new moduli, new generating elements and, in general, longer transform lengths. Whereas the old definition of NTT limits the transform length to $\mathrm{GCD}(p_1 - 1, p_2 - 1, \ldots, p_q - 1)$, the proposed xNTT have maximal transform length $p_q - 1$, where $p_1 < p_2 < \cdots < p_q$ and $m = p_1^{e_1} p_2^{e_2} \cdots p_q^{e_q}$.

1. Introduction

Transforms possessing the cyclic convolution property (CCP) can be used to compute convolutions of two long discrete integer sequences efficiently. The best-known algorithm for fast implementation of DFT-like transforms is the FFT. The FFT supports fast computation of the DFT over finite and infinite algebraic structures and possesses the CCP. DFTs over finite algebraic structures $\mathbb{Z}/m\mathbb{Z}$, such as fields and rings [1], are known as Number Theoretic Transforms, or NTT for short. Basic developments in this area were made by Agarwal and Burrus [2][3][4]. NTT quickly became popular for two reasons: some types of NTT need no multiplications at all, and they compute exactly, i.e. without any rounding error.
Exact calculation is still mandatory in some areas, such as multiplication of big numbers [1], cryptography [17], digital watermarking [16] and image/audio processing [5]. In terms of arithmetic complexity, the fastest NTTs are the ones based on the FFT (most efficient for lengths of $2^k$). A drawback of using the FFT over infinite algebraic structures (further referred to as 'real FFT' because of the real coefficients involved in the computation) for correlation of real sequences is the need for complex arithmetic, due to the appearance of complex so-called twiddle factors. The number of operations needed for a real FFT over an infinite algebraic field is roughly:

$N \log_2 N$ complex additions/subtractions (or $2N \log_2 N$ equivalent real operations)
$\frac{N}{2} \log_2 N$ complex multiplications (or $3N \log_2 N$ equivalent real operations)
$= 5N \log_2 N$ real operations

In the case of NTT (FFT implementation) the operation count is roughly:

$N \log_2 N$ additions/subtractions (modulo $m$)
$\frac{N}{2} \log_2 N$ real multiplications (modulo $m$)
$= 1.5N \log_2 N$ real operations (modulo $m$)

Such a difference in the operation count was quite attractive, and Agarwal and Burrus [2] implemented an optimized NTT algorithm on a 32-bit IBM 370/155. For convolutions of real sequences, they found NTT to be 3 to 5 times faster and more accurate than the FFT. The main drawback of NTTs is the need to implement arithmetic modulo some specific number.

It is known [4][5] that an NTT of length $N$ in the ring $\mathbb{Z}/m\mathbb{Z}$ possesses the CCP when:

$$N \mid \mathrm{GCD}(p_1 - 1, p_2 - 1, \ldots, p_q - 1), \quad m = p_1^{e_1} p_2^{e_2} \cdots p_q^{e_q}, \quad p_1 < p_2 < \cdots < p_q \tag{1}$$

In this paper we present an eXtended NTT, xNTT for short, which possesses the CCP even for transform lengths:

$$N_{xNTT} = p_q - 1 \tag{2}$$
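The two length bounds (1) and (2) can be compared numerically. The following is a minimal sketch, assuming small moduli that can be factored by trial division; the function names `prime_factors`, `cntt_max_length` and `xntt_max_length` are hypothetical helpers introduced only for this illustration.

```python
from functools import reduce
from math import gcd


def prime_factors(m):
    """Distinct prime factors of m > 1 by trial division (small m only)."""
    factors = []
    p = 2
    while p * p <= m:
        if m % p == 0:
            factors.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        factors.append(m)
    return factors


def cntt_max_length(m):
    """Bound (1): GCD of (p_i - 1) over all prime factors p_i of m."""
    return reduce(gcd, (p - 1 for p in prime_factors(m)))


def xntt_max_length(m):
    """Bound (2): p_q - 1 for the largest prime factor p_q of m."""
    return max(prime_factors(m)) - 1


for m in (15, 2**8 - 1, 2**5 + 1):
    print(m, prime_factors(m), cntt_max_length(m), xntt_max_length(m))
```

For $m = 15 = 3 \cdot 5$ the sketch gives a classical bound of 2 and an xNTT bound of 4, which agrees with the corresponding row of Table 1.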
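Because NTTs require arithmetic modulo a specific number, the most popular transforms in the past were the Fermat Number Transform (FNT) [2] and the Mersenne transform (MNT) [18], which have the simplest modular arithmetic: modular reduction can be implemented in a couple of clock cycles [7]. Although faster methods for convolution over rational numbers, such as the Walsh transform, have been discovered, NTT remains one of the fastest methods for convolution over finite algebraic structures. More references on the history of NTT theory can be found in [8]. Hardware (HW) implementation is considered in [9], where it is shown that savings in comparison with the real FFT do exist. NTT can also be defined over complex sequences [10].

2. Analysis of cyclic groups in Z/mZ

The analysis of NTT is based on finite multiplicative cyclic groups with the neutral element for multiplication '1'. A goal of this paper is to define xNTT over every finite ring $\mathbb{Z}/m\mathbb{Z}$ and over an extended subset of cyclic groups over which NTT could not be defined. The structure of those cyclic groups is given by the so-called structural theorem (or fundamental theorem of finitely generated Abelian groups):

$$U(\mathbb{Z}/n\mathbb{Z}) \cong U(\mathbb{Z}/2^{e_1}\mathbb{Z}) \times U(\mathbb{Z}/p_2^{e_2}\mathbb{Z}) \times \cdots \times U(\mathbb{Z}/p_q^{e_q}\mathbb{Z}) \tag{3}$$

which also shows the expected orders of cyclic groups in a ring $\mathbb{Z}/m\mathbb{Z}$. Cyclic groups existing in a ring $\mathbb{Z}/m\mathbb{Z}$ can be presented visually by so-called cycle graphs [11], based on the decompositions stated by the structural theorem. An example of the construction of an xNTT is shown in the cycle graph for Z/15Z (Fig. 1).

Figure 1. Cycle graph for Z/15Z (nodes 1, 2, 4, 7, 8, 11, 13, 14; cycles labelled NTT and xNTT)

While the Euler totient function gives $\varphi(15) = 8$, there is not necessarily an element of order 8 in Z/15Z; in fact, in this case the maximal order is 4. According to the existence condition for NTT, the maximal transform length read from this cycle graph is 2. The following chapters show that an xNTT with a transform length of 4 can be constructed in this very case. The maximal possible cyclic group length $N_{r.max}$ is obtained by applying the least common multiple to the Cartesian factors given by the structural theorem (3):

$$N_{r.max} = \mathrm{LCM}\left(|U(\mathbb{Z}/p_1^{e_1}\mathbb{Z})|, \ldots, |U(\mathbb{Z}/p_q^{e_q}\mathbb{Z})|\right) \tag{4}$$

where $|U(\mathbb{Z}/p_i^{e_i}\mathbb{Z})|$ is the cardinal number of $U(\mathbb{Z}/p_i^{e_i}\mathbb{Z})$, the units in $\mathbb{Z}/p_i^{e_i}\mathbb{Z}$.

Observing xNTT behaviour over the base structures $\mathbb{Z}/m_i\mathbb{Z}$, $m_i = p_i^{e_i}$, $i = 1, \ldots, q$, according to the Chinese Remainder Theorem (CRT), provides better insight into the relationships between the parameters. Instead of calculating different exponential sums and greatest common divisors, the new existence conditions are based on knowing the orders $N_i$ ($N = N_i \cdot n_i$) of the generator $\alpha$ in $\mathbb{Z}/m_i\mathbb{Z}$ and on a small number of simple operations, as shown in the following two chapters.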
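The Z/15Z observations can be checked directly by enumerating the units and their multiplicative orders. This is a minimal sketch; the helpers `units` and `order` are hypothetical names introduced for the illustration, and brute force is only practical for small moduli.

```python
from math import gcd


def units(m):
    """Elements of Z/mZ that are invertible under multiplication."""
    return [a for a in range(1, m) if gcd(a, m) == 1]


def order(a, m):
    """Multiplicative order of the unit a modulo m."""
    k, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        k += 1
    return k


m = 15
orders = {a: order(a, m) for a in units(m)}
print("units:", units(m))            # eight units, matching phi(15) = 8
print("orders:", orders)
print("maximal order:", max(orders.values()))
```

The eight units split into elements of order 1, 2 and 4 only, so no element generates all of $U(\mathbb{Z}/15\mathbb{Z})$, and the longest cycle has length 4.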
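3. Existence of classic NTT and correlation core sum analysis

Classic NTT (further denoted as cNTT) belongs to the group of DFT-like transforms and is defined as:

$$X(k) \equiv_m \sum_{n=0}^{N-1} x(n)\,\alpha^{nk} \tag{5}$$

The inverse NTT, denoted INTT, is defined as:

$$x(n) \equiv_m N^{-1} \sum_{k=0}^{N-1} X(k)\,\alpha^{-nk} \tag{6}$$

where $\equiv_m$ is the operation 'congruent modulo $m$', $m = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, $p_1 < p_2 < \cdots < p_k$.

The convolution property of two sequences $x(n)$, $y(n)$ holds for any DFT-like transform under the following conditions (in the sums below the index $m$ is a summation index, distinct from the modulus $m$ in $\equiv_m$):

$$X(k) \equiv_m \sum_{m=0}^{N-1} x(m)\,\alpha^{mk} \tag{7}$$

$$Y(k) \equiv_m \sum_{l=0}^{N-1} y(l)\,\alpha^{lk} \tag{8}$$

$$W(k) \equiv_m X(k) \cdot Y(k) \tag{9}$$

$$r(n) \equiv_m \frac{1}{N} \sum_{k=0}^{N-1} W(k)\,\alpha^{-nk} \tag{10}$$

$$r(n) \equiv_m \frac{1}{N} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} \sum_{m=0}^{N-1} x(m)\,y(l)\,\alpha^{(m+l-n)k} \tag{11}$$

Introducing substitutes for $m + l - n$, two cases occur:

a) $m + l - n = f \cdot N$; $f \in [0, 1]$
b) $m + l - n = t + q \cdot N$; $q \in [0, 1]$

a) $\frac{1}{N} \sum_{m=0}^{N-1} x(m)\,y(fN + n - m) \sum_{k=0}^{N-1} \underline{\alpha^{kfN}}$

b) $\frac{1}{N} \sum_{m=0}^{N-1} x(m)\,y(t + qN + n - m) \sum_{k=0}^{N-1} \alpha^{k(t + \underline{qN})}$

The underlined parts are irrelevant due to the periodicity $\alpha^N \equiv_m 1$. Therefore, the following condition is necessary for the convolution property:

$$\sum_{k=0}^{N-1} \alpha^{ks} \equiv_m \begin{cases} N; & s = fN \\ 0; & s \neq fN \end{cases} \tag{12}$$

This sum will be referred to as the 'correlation core sum', since other types of NTT can be derived by observing it. This identity is important not only for the convolution property but also for the existence of the inverse NTT [8].

The classic set of existence conditions for cNTT, which are additionally used when searching for cNTT parameters, is [1][12]:

1. $\mathrm{order}(\alpha) = N$ in the ring $\mathbb{Z}/m\mathbb{Z}$
2. $\mathrm{GCD}(N, m) = 1$
3. $\mathrm{GCD}(\alpha^s - 1, m) = 1$; $s \in [1, N)$

The maximal transform length is $N_{max} = \mathrm{GCD}(p_1 - 1, p_2 - 1, \ldots, p_q - 1)$.

The correlation sum (12) has the interesting property that, for every $m$, there is at least one transform of length $N_{xNTT\,max} = p_q - 1$. This can be confirmed by observing that, according to the CRT, in each of the rings $\mathbb{Z}/m_i\mathbb{Z}$, $i = 1, \ldots, q$, there exists a cNTT of length $N_i = p_i - 1$. This makes it possible to observe sum (12) over the biggest prime factor power $m_q = p_q^{e_q}$ instead of over the modulus $m$. Starting from the original sum (12), one obtains its reduced version:

$$\left| \frac{1}{N_q} \sum_{k=0}^{N_q - 1} \alpha^{ks} \right|_{m_q} = \begin{cases} 1; & s = fN_q \\ 0; & s \neq fN_q \end{cases}$$

where $m_q \mid m$. The output range is reduced to $m_q$, but on the other hand xNTT can be computed over a much wider set of moduli than only $m_q$, in this case over every modulus $m$ with $m_q \mid m$. Nevertheless, reasonable trade-offs between transform length and output range can be achieved.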
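The correlation core sum (12) is easy to evaluate numerically. The sketch below checks it for the running example $m = 15$, $\alpha = 2$, $N = 4$ and for the reduced modulus $m_q = 5$; `core_sum` and `has_core_sum_property` are hypothetical helper names, not part of the original paper.

```python
def core_sum(alpha, N, s, m):
    """Sum over k = 0..N-1 of alpha^(k*s), reduced modulo m."""
    return sum(pow(alpha, k * s, m) for k in range(N)) % m


def has_core_sum_property(alpha, N, m):
    """True if the sum is N (mod m) for s = 0 and 0 for every s in [1, N)."""
    if core_sum(alpha, N, 0, m) != N % m:
        return False
    return all(core_sum(alpha, N, s, m) == 0 for s in range(1, N))


alpha, N = 2, 4
print([core_sum(alpha, N, s, 15) for s in range(N)])  # residues modulo m = 15
print([core_sum(alpha, N, s, 5) for s in range(N)])   # residues modulo m_q = 5
print(has_core_sum_property(alpha, N, 15), has_core_sum_property(alpha, N, 5))
```

Modulo 15 the sum for $s = 2$ leaves a non-zero residue, so the property fails; modulo 5 every sum with $s \in [1, N)$ vanishes, which is the reduced form discussed above.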
4. Construction of LNTT

LNTT stands for 'reduced Length NTT'. LNTT are defined by the following existence conditions:

1. $\mathrm{order}(\alpha) = N$ in the ring $\mathbb{Z}/m\mathbb{Z}$
2. $\mathrm{GCD}(N, m) \neq 1$
3. $\alpha^s - 1 = d \cdot T_{max}$, $T_{max} = \{T\}_{max}$, where $\alpha^s - 1 = d \cdot T \wedge \mathrm{GCD}(T, m) = 1$

Here $d$ denotes the common factor of $N$ and $m$. The first condition is the standard condition needed for the existence of a cyclic group, the second distinguishes LNTT from cNTT, and the third ensures that sum (12) is satisfied. Since the second condition differs between cNTT and LNTT, the two do not overlap, although they may exist over the same $m$. To obtain an equivalent set of existence conditions and better insight into the conditions necessary for the existence of LNTT, one should observe the behaviour of the transforms in the base structures:

$$m_i = p_i^{e_i}, \quad \left| \sum_{k=0}^{N-1} \alpha^{ks} \right|_{m_i} = 0, \quad s \in [1, N), \quad i = 1, \ldots, q$$

This holds under the following two conditions (which can be proved by the binomial formula):

1. In at least one base structure the order of the element $\alpha$ is not $N$ [3].
2. a) if $m_i = p_i^{e_i} \wedge ((N_i = p_i - 1) \wedge (n_i \neq 1))$, $p_i \neq 2$, the only condition to be checked is: $m_i \mid N$
   b) if $m_1 = 2^{e_1} \wedge \mathrm{ord}(\alpha)_{m_1} = 2$, the only condition to be checked is: $|\alpha|_{m_1} = m_1 - 1$

From existence condition 2 it follows that LNTT cannot be defined for prime moduli, so $m$ for LNTT is always composite. Also from the same condition, since a common factor between $N$ and $m$ exists, the output range is not the same as for cNTT but is reduced to:

$$dyn_{max} = \frac{m}{d}$$

Interestingly, LNTT exists even for even moduli, for which cNTT does not exist. A disadvantage is that the output range is no longer equal to $m$; nevertheless a reasonable trade-off can be achieved. The HW architecture of LNTT is the same as for cNTT, with one additional division by $d$ at the output in the case of LNTT.
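The output-stage division can be illustrated with an even modulus. The following is a sketch under stated assumptions rather than the authors' implementation: it takes $m = 10$, $\alpha = 3$, $N = 4$, assumes $d = \mathrm{GCD}(N, m)$, and assumes $N/d$ is invertible modulo $m/d$. All function names are hypothetical, and `pow` with a negative exponent requires Python 3.8 or later.

```python
from math import gcd


def transform(x, alpha, m):
    """DFT-like transform X(k) = sum_n x(n) * alpha^(n*k) mod m."""
    N = len(x)
    return [sum(x[n] * pow(alpha, n * k, m) for n in range(N)) % m
            for k in range(N)]


def lntt_cyclic_convolution(x, y, alpha, m):
    """Cyclic convolution of x and y, valid modulo m // gcd(N, m)."""
    N = len(x)
    d = gcd(N, m)                       # assumed meaning of d in this sketch
    X, Y = transform(x, alpha, m), transform(y, alpha, m)
    W = [(a * b) % m for a, b in zip(X, Y)]
    # Un-normalised inverse transform: S(n) = N * r(n) mod m
    S = transform(W, pow(alpha, -1, m), m)
    m_red = m // d                      # reduced output range m / d
    scale = pow(N // d, -1, m_red)      # assumes gcd(N // d, m_red) == 1
    return [((s // d) * scale) % m_red for s in S]


def cyclic_convolution(x, y):
    """Reference cyclic convolution over the integers."""
    N = len(x)
    return [sum(x[i] * y[(n - i) % N] for i in range(N)) for n in range(N)]


x, y = [1, 2, 3, 4], [1, 0, 2, 1]
print(lntt_cyclic_convolution(x, y, alpha=3, m=10))
print([c % 5 for c in cyclic_convolution(x, y)])
```

Both printed lists agree, so the convolution is recovered modulo $m/d = 5$ even though $N = 4$ has no inverse modulo 10.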
5. Construction of mNTT

mNTT stands for 'reduced Modulus NTT'. mNTT are based on the observation that if the convolution property does not originally hold for $m$, (12) potentially holds for some value $m' < m$, so mNTT exists under the following conditions:

1. $\mathrm{order}(\alpha) = N$ in the ring $\mathbb{Z}/m\mathbb{Z}$
2. $\left| \sum_{k=0}^{N-1} \alpha^{ks} \right|_{m'} = \begin{cases} |N|_{m'}; & s = fN \\ 0; & s \neq fN \end{cases}$ where $s \in [1, N)$ and $1 < m' < m$
3. $\frac{m'}{\mathrm{GCD}(N, m')} > 1$

The reduced modulus $m'$ can be computed more simply by means of the CRT:

$$m' \equiv_m c_1 \cdot |N|_{m_1} \cdot sw_1 + \cdots + c_q \cdot |N|_{m_q} \cdot sw_q$$

where

$$sw_i = \begin{cases} 1; & ((N_i = p_i - 1) \wedge (n_i \neq 1)) \wedge m_i \nmid N \\ 0; & \text{otherwise} \end{cases}, \quad p_i \neq 2$$

and $c_i$ are the coefficients of the CRT decomposition.

Example: $m = 3 \cdot 5$, $c_1 = 10$, $c_2 = 6$, $\alpha = 2$, $N = 4 \Rightarrow m' = 10$

This leads to a reduction in the final output range, but up to the last stage the computations are done over the modulus $m$, which is simpler to implement, as in Pseudo Fermat/Mersenne transforms [20], which are a special case of this transform.

The benefit of mNTT is that all computations can be done over the modulus $m$, and only at the very last stage does the reduction modulo $m'$ need to be applied.

Differing from LNTT, some mNTT have $N^{-1}$. There are cases when $\mathrm{GCD}(N, m') \neq 1$, but as long as $\frac{m'}{\mathrm{GCD}(N, m')} > 1$ the convolution property holds. The output range of mNTT is:

$$dyn_{max} = \frac{m'}{\mathrm{GCD}(N, m')}$$
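The idea of computing over $m$ and reducing only at the last stage can be sketched for $m = 15$, $\alpha = 2$, $N = 4$. The reduced modulus used here, $m' = 5$, is an assumption of this sketch: it is the largest prime-power factor $m_q$ of 15, for which the correlation core sum (12) vanishes for every $s \in [1, N)$. The sketch also assumes $\mathrm{GCD}(N, m') = 1$; the function names are hypothetical, and `pow` with a negative exponent requires Python 3.8 or later.

```python
def transform(x, alpha, m):
    """DFT-like transform X(k) = sum_n x(n) * alpha^(n*k) mod m."""
    N = len(x)
    return [sum(x[n] * pow(alpha, n * k, m) for n in range(N)) % m
            for k in range(N)]


def reduced_modulus_convolution(x, y, alpha, m, m_reduced):
    """Transforms run modulo m; only the final step reduces modulo m_reduced."""
    N = len(x)
    X, Y = transform(x, alpha, m), transform(y, alpha, m)
    W = [(a * b) % m for a, b in zip(X, Y)]
    S = transform(W, pow(alpha, -1, m), m)   # un-normalised inverse, still mod m
    n_inv = pow(N, -1, m_reduced)            # assumes gcd(N, m_reduced) == 1
    return [(s % m_reduced) * n_inv % m_reduced for s in S]


def cyclic_convolution(x, y):
    """Reference cyclic convolution over the integers."""
    N = len(x)
    return [sum(x[i] * y[(n - i) % N] for i in range(N)) for n in range(N)]


x, y = [1, 2, 3, 4], [1, 0, 2, 1]
print(reduced_modulus_convolution(x, y, alpha=2, m=15, m_reduced=5))
print([c % 5 for c in cyclic_convolution(x, y)])
```

The two printed lists agree: a length-4 transform over Z/15Z yields the cyclic convolution with output range 5, whereas the classical bound for $m = 15$ allows only length 2.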
6. Conclusion

The classic Number Theoretic Transforms (cNTT) are explained, and their extension, xNTT, is introduced and discussed. cNTT have a maximal transform length of:

$$N_{max} = \mathrm{GCD}(p_1 - 1, p_2 - 1, \ldots, p_q - 1)$$

while the newly introduced xNTT have a maximal length of:

$$N_{max} = p_q - 1; \quad p_1 < p_2 < \cdots < p_q$$

These bigger transform lengths have the drawback of a reduction in output range. Nevertheless, reasonable trade-offs can be achieved. From the point of view of HW implementation, the new transforms exist over every modulus $m$, which enables the use of composite Mersenne/Fermat numbers (Table 1). Highly composite (hc.) transform lengths are also available and are presented in the same table. With the progress in cryptology [13][14][15], new favorable moduli were invented, and xNTT can be defined over them as well. The simplicity of the existence conditions observed over the base structures enables us to predict the behaviour of xNTT over $\mathbb{Z}/m\mathbb{Z}$.

7. Acknowledgments

Special thanks to the Infineon AG Software Defined Radio department for support and sponsorship of this research. Special thanks also to Ivan Stanojevic, Department of Telecommunication, Novi Sad, and Catherine Thiallier, Siemens AG Munich, for a lot of patience and useful comments.
Table 1. Overview of maximal transform lengths for cNTT and xNTT

| n | Mersenne number 2^n - 1: factor decomposition | cNTT N_max | xNTT N_xNTTmax | Fermat number 2^n + 1: factor decomposition | cNTT N_max | xNTT N_xNTTmax |
|---|---|---|---|---|---|---|
| 2 | 3 | 2 (hc.2) | 2 (hc.2) | 5 | 4 (hc.4) | 4 (hc.4) |
| 3 | 7 | 6 (hc.2) | 6 (hc.2) | 3*3 | 2 (hc.2) | 2 (hc.2) |
| 4 | 3*5 | 2 (hc.2) | 4 (hc.4) | 17 | 16 (hc.16) | 16 (hc.16) |
| 5 | 31 | 30 (hc.2) | 30 (hc.2) | 3*11 | 2 (hc.2) | 10 (hc.2) |
| 6 | 3*3*7 | 2 (hc.2) | 6 (hc.2) | 5*13 | 4 (hc.4) | 12 (hc.4) |
| 7 | 127 | 126 (hc.2) | 126 (hc.2) | 3*43 | 2 (hc.2) | 42 (hc.2) |
| 8 | 3*5*17 | 2 (hc.2) | 16 (hc.16) | 257 | 256 (hc.256) | 256 (hc.256) |
| 9 | 7*73 | 6 (hc.2) | 72 (hc.8) | 3*3*3*19 | 2 (hc.2) | 18 (hc.2) |
| 10 | 3*11*31 | 2 (hc.2) | 30 (hc.2) | 5*5*41 | 4 (hc.4) | 40 (hc.8) |
| 11 | 23*89 | 22 (hc.2) | 88 (hc.8) | 3*683 | 2 (hc.2) | 682 (hc.2) |
| 12 | 3*3*5*7*13 | 2 (hc.2) | 12 (hc.4) | 17*241 | 16 (hc.16) | 240 (hc.16) |
| 13 | 8191 | 8190 (hc.2) | 8190 (hc.2) | 3*2731 | 2 (hc.2) | 2730 (hc.2) |
| 14 | 3*43*127 | 2 (hc.2) | 126 (hc.2) | 5*29*113 | 4 (hc.4) | 112 (hc.16) |
| 15 | 7*31*151 | 6 (hc.2) | 150 (hc.2) | 3*3*11*331 | 2 (hc.2) | 330 (hc.2) |
| 16 | 3*5*17*257 | 2 (hc.2) | 256 (hc.256) | 65537 | 65536 (hc.65536) | 65536 (hc.65536) |
| 17 | 131071 | 131070 (hc.2) | 131070 (hc.2) | 3*43691 | 2 (hc.2) | 43690 (hc.2) |
| 18 | 3*3*3*7*19*73 | 2 (hc.2) | 72 (hc.8) | 5*13*37*109 | 4 (hc.4) | 108 (hc.4) |
| 19 | 524287 | 524286 (hc.2) | 524286 (hc.2) | 3*174763 | 2 (hc.2) | 174762 (hc.2) |
| 20 | 3*5*5*11*31*41 | 2 (hc.2) | 40 (hc.8) | 17*61681 | 16 (hc.16) | 61680 (hc.16) |
| 21 | 7*7*127*337 | 6 (hc.2) | 336 (hc.16) | 3*3*43*5419 | 2 (hc.2) | 5418 (hc.2) |
| 22 | 3*23*89*683 | 2 (hc.2) | 682 (hc.8) | 5*397*2113 | 4 (hc.4) | 2112 (hc.64) |
| 23 | 47*178481 | 46 (hc.2) | 178480 (hc.16) | 3*2796203 | 2 (hc.2) | 2796202 (hc.2) |
| 24 | 3*3*5*7*13*17*241 | 2 (hc.2) | 240 (hc.16) | 97*257*673 | 32 (hc.32) | 672 (hc.256) |
| 25 | 31*601*1801 | 30 (hc.2) | 1800 (hc.8) | 3*11*251*4051 | 2 (hc.2) | 4050 (hc.2) |
| 26 | 3*2731*8191 | 2 (hc.2) | 8190 (hc.2) | 5*53*157*1613 | 4 (hc.4) | 1612 (hc.4) |
| 27 | 7*73*262657 | 6 (hc.2) | 262656 (hc.512) | 3*3*3*3*19*87211 | 2 (hc.2) | 87210 (hc.2) |
| 28 | 3*5*29*43*113*127 | 2 (hc.2) | 126 (hc.16) | 17*15790321 | 16 (hc.16) | 15790320 (hc.16) |
| 29 | 233*1103*2089 | 58 (hc.2) | 2088 (hc.8) | 3*59*3033169 | 2 (hc.2) | 3033168 (hc.16) |
| 30 | 3*3*7*11*31*151*331 | 2 (hc.2) | 330 (hc.2) | 5*5*13*41*61*1321 | 4 (hc.2) | 1320 (hc.8) |
| 31 | 2147483647 | 2147483646 (hc.2) | 2147483646 (hc.2) | 3*715827883 | 2 (hc.2) | 715827882 (hc.2) |
| 32 | 3*5*17*257*65537 | 2 (hc.2) | 65536 (hc.65536) | 641*6700417 | 128 (hc.128) | 6700416 (hc.128) |

8. Literature

[1] J. Pollard, "The fast Fourier transform in a finite field", Math. of Comp., vol. 25, pp. 365-374, April 1971.
[2] R. C. Agarwal and C. S. Burrus, "Fast One-Dimensional Digital Convolution by Multidimensional Techniques", IEEE Trans. Acoustics, Speech and Signal Processing, ASSP-22, 1 (1974).
[3] R. C. Agarwal and C. S. Burrus, "Fast Convolution Using Fermat Number Transform with Applications to Digital Filtering", IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-22, no. 2, April 1974.
[4] H. J. Nussbaumer, "Fast Fourier Transformation and Convolution Algorithms", Springer Series in Information Sciences.
[5] S. Gudvangen, Hogskulen i Buskerud, "Practical Application of Number Theoretic Transforms", NORSIG-99, September 1999.
[6] Douglas G. Mayers, "Digital signal processing", Prentice Hall, 1990.
[7] E. Vegh and L. M. Leibowitz, "Fast Complex Convolution in Finite Rings", IEEE Transactions on Acoustics, Speech and Signal Processing, August 1976.
[8] M. Bhattacharya, R. Creutzburg, J. Astola, "Some Historical Notes on Number Theoretic Transform", The 2004 International TICSP Workshop on Spectral Methods and Multirate Signal Processing Proceedings, SMMSP2004.
[9] James H. McClellan, "Hardware Realization of a Fermat Number Transform", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-24, no. 3, June 1976.
[10] Irving S. Reed, T. K. Truong, "The use of Finite Fields to Compute Convolutions", IEEE Transactions on Information Theory, vol. IT-21, no. 2, March 1975.
[11] Daniel Shanks, "Solved and Unsolved Problems in Number Theory", Chelsea Publishing Co, NY.
[12] P. J. Erdelski, "Exact convolutions by Number-Theoretic Transforms", Naval Undersea Center, May 2, 1975.
[13] J. Solinas, "Generalized Mersenne numbers", Technical report CORR-39, Dept. of C&O, University of Waterloo, 1999.
[14] Donald E. Knuth, "Seminumerical Algorithms", Addison-Wesley, 1981.
[15] A. Menezes, P. van Oorschot, and S. Vanstone, "Handbook of Applied Cryptography", CRC Press, 1996.
[16] T. Hideaki, "Detection of Image Alterations Using Fragile Watermarks Based on Number Theoretic Transform", IEICE General Conference 2002.
[17] C. Porkodi, "Public Key Cryptosystem based on Number Theoretic Transforms", International Journal of Math., Volume 2, Number 1.
[18] C. M. Rader, "Discrete Convolutions via Mersenne Transforms", IEEE Trans. Computers C-21, 1269 (1972).