NK-Algorithm to Store Sparse Matrices

Dr. Nidhal Khdhair El Abbadi
Al-Mustansiriyah University, College of Education
[email protected]

Abstract

Many efficient algorithms for mapping sparse matrices have been developed, all of which concentrate on reducing the stored size by keeping each non-zero value together with its location in the matrix. This paper presents a novel algorithm (NK-Algorithm) that maps, or compresses, a sparse matrix in a different way: the output size exceeds the size of the non-zero values by only a small, fixed percentage. It is applicable to all sparse matrices regardless of the number of non-zero values. The algorithm's performance varies with the proportion of non-zero values in the matrix and with the type of matrix. Analysis and experimental results show that the algorithm performs better than the traditional algorithms.

Key Words: Matrix, Sparse Matrix, Compression, Mapping, Reduce Size, NK-Algorithm.

1. Introduction

A sparse matrix is a matrix that contains enough zero elements to make special processing worthwhile. In computing, a sparse matrix is an array where most of the elements have the same value (called the default value, usually 0 or null) and only a few elements have a non-default value. Arrays are data structures used in programming languages, databases, and other computer systems; they use a value, called a subscript or index, that indicates a position to locate a particular value. A naive implementation may allocate space for the entire array, but this is inefficient when there are only a few non-default values. If the sparsity is known in advance, more space-efficient implementations can be used that allocate space only for the non-default values. The natural idea of taking advantage of the zeros of a matrix and their locations was initiated by engineers in various disciplines. In the simplest case, involving banded matrices, special techniques are straightforward to develop.
Electrical engineers dealing with electrical networks in the 1960s were the first to exploit sparsity to solve general sparse linear systems for matrices with irregular structure. The main issue, and the first addressed by sparse matrix technology, was to devise direct solution methods for linear systems that were economical in both storage and computational effort. Sparse direct solvers can handle very large problems that cannot be tackled by the usual "dense" solvers. Essentially, there are two broad types of sparse matrices: structured and unstructured. A structured matrix is one whose non-zero entries form a regular pattern, often along a small number of diagonals; the positions of the non-zero elements are easily encoded and evaluated. For example, a diagonal matrix has its non-zero elements in positions (i, i), 1 ≤ i ≤ n. Alternatively, the non-zero elements may lie in blocks (dense submatrices) of the same size, which form a regular pattern, typically along a small number of (block) diagonals. A matrix with irregularly located entries is said to be irregularly structured; the positions of its non-zero elements are not easily encoded and evaluated. As a result, row, column, and block indices are usually stored explicitly in a data structure that facilitates the basic operations and access patterns of the algorithm using the sparse matrix. The best example of a regularly structured matrix is one that consists of only a few diagonals. Finite difference matrices on rectangular grids are typical examples of matrices with regular structure, while most finite element or finite volume techniques applied to complex geometries lead to irregularly structured matrices.
When storing and manipulating sparse matrices on a computer, it is beneficial and often necessary to use specialized algorithms and data structures that take advantage of the sparse structure of the matrix. Operations using standard matrix structures and algorithms are slow and consume large amounts of memory when applied to large sparse matrices. Sparse data is by nature easily compressed, and this compression almost always results in significantly less memory usage. Indeed, some very large sparse matrices are impossible to manipulate with the standard algorithms. It is therefore desirable to develop techniques that can access the data in compressed form and perform logical operations directly on the compressed data. Such techniques (see [3]) usually provide two mappings: a forward mapping, which computes the location in the compressed dataset given a position in the original dataset, and a backward mapping, which computes the position in the original dataset given a location in the compressed dataset. A compression method is called mapping-complete if it provides both forward mapping and backward mapping. Many compression techniques are mapping-complete; the algorithm proposed in this paper is among them.

2. RELATED WORK

Many numerical problems in real-life applications such as engineering, scientific computing, and economics involve huge matrices with very few non-zero elements, referred to as sparse matrices. As there is no reason to store and operate on a huge number of zeros, it is often necessary to modify existing algorithms to take advantage of the sparse structure of the matrix. Sparse matrices can be easily compressed, yielding significant savings in memory usage [4]. Several sparse matrix formats exist; each format takes advantage of a specific property of the sparse matrix and therefore achieves a different degree of space efficiency.
Some of these sparse matrix formats are:

2.1 Coordinate Storage (COO) [6]

A sparse matrix format stores only the non-zero elements to save space. The simplest sparse matrix storage structure is COO, in which the index structure is stored in three sparse vectors. The first vector (the non-zero vector) stores the non-zero elements of the sparse matrix; in an information retrieval system these correspond to the distinct terms that appear in each document. The second vector in COO is the column vector: each of its elements stores the column index of the corresponding term in the non-zero vector. The third vector is the row vector, which stores the row index of each term in the non-zero vector [10]. Figure (2) shows the COO storage structure for the sample collection shown in figure (1).

0.44  0.8   0     0     0     0     0     0     0
0     0.8   1.4   0     0     0     0     0     0
0.22  0     0     0.7   0.4   0     0     0     0
0.22  0     0     0     0     0.7   0.7   0.7   0
0     0     0     0     0.4   0     0     0     0.7

Fig ( 1 ): Sample of sparse matrix

non_zero_vector = (0.44, 0.8, 0.8, 1.4, 0.22, 0.7, 0.4, 0.22, 0.7, 0.7, 0.7, 0.4, 0.7)
column_vector   = (0, 1, 1, 2, 0, 3, 4, 0, 5, 6, 7, 4, 8)
row_vector      = (0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4)

Fig ( 2 ): COO storage structure

2.2 Compressed Sparse Row (CSR) [3]

CSR permits indexed access to rows. Like COO, the CSR storage structure also consists of three sparse vectors: a non-zero vector, a column vector, and a row vector. The index structure differs in the formation of the row vector: in CSR, the row vector consists of pointers to each row of the matrix, one element per row, whose value is the position in the non-zero vector of the first non-zero element of that row. Figure (3) shows the CSR storage structure for the sample shown in figure (1).
non_zero_vector = (0.44, 0.8, 0.8, 1.4, 0.22, 0.7, 0.4, 0.22, 0.7, 0.7, 0.7, 0.4, 0.7)
column_vector   = (0, 1, 1, 2, 0, 3, 4, 0, 5, 6, 7, 4, 8)
row_vector      = (0, 2, 4, 7, 11, 13)

Fig ( 3 ): CSR storage structure

2.3 Compressed Sparse Column (CSC) [5]

CSC, in contrast to CSR and COO, permits indexed access to the columns of the matrix. Like COO and CSR, the CSC storage structure also consists of three sparse vectors: a non-zero vector, a column vector, and a row vector. The non-zero vector stores the non-zero elements of each column of the matrix, the column vector stores a pointer to the first non-zero element of each column, and the row vector stores the row index associated with each non-zero element. Figure (4) shows the CSC storage structure for the sample shown in figure (1).

non_zero_vector = (0.44, 0.22, 0.22, 0.8, 0.8, 1.4, 0.7, 0.4, 0.4, 0.7, 0.7, 0.7, 0.7)
column_vector   = (0, 3, 5, 6, 7, 9, 10, 11, 12, 13)
row_vector      = (0, 2, 3, 0, 1, 1, 2, 2, 4, 3, 3, 3, 4)

Fig ( 4 ): CSC storage structure

2.4 Block Sparse Row (BSR)

BSR differs from the three formats discussed so far. Each element of the non-zero vector in BSR is mapped to a non-zero square block of n dimensions. The block row algorithm assumes that the number of non-zero elements in each row is a multiple of the block size; additional zeros are stored in a block to satisfy this condition. In BSR, the non-zero vector is a rectangular array that stores the non-zero blocks in row fashion, the column vector stores the column index of the first element of each non-zero block, and the row vector stores a pointer to each block row in the matrix. Figure (5) shows the BSR storage structure for the sample shown in figure (1), taking a block size of (2). As additional zeros can be added to complete the blocks, a dummy document is added as the last row and a dummy term as the last column, with all zero elements, to the initial matrix shown in figure (1). The modified matrix is presented in table (1), and the storage structure is shown in figure (5).
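The COO and CSR layouts described above can be sketched in a few lines of C++. This is our own illustration (the names `to_coo` and `to_csr` are ours, not the paper's), building both structures from a dense row-major input, with 0-based indices as in figures (2) and (3):

```cpp
#include <cstddef>
#include <vector>

// Sketch of COO and CSR construction for a dense row-major input matrix.
struct COO { std::vector<double> val; std::vector<std::size_t> col, row; };
struct CSR { std::vector<double> val; std::vector<std::size_t> col, rowptr; };

COO to_coo(const std::vector<std::vector<double>>& a) {
    COO m;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < a[i].size(); ++j)
            if (a[i][j] != 0.0) {
                m.val.push_back(a[i][j]);   // non_zero_vector
                m.col.push_back(j);         // column_vector
                m.row.push_back(i);         // row_vector
            }
    return m;
}

CSR to_csr(const std::vector<std::vector<double>>& a) {
    CSR m;
    for (std::size_t i = 0; i < a.size(); ++i) {
        m.rowptr.push_back(m.val.size());   // position of row i's first non-zero
        for (std::size_t j = 0; j < a[i].size(); ++j)
            if (a[i][j] != 0.0) {
                m.val.push_back(a[i][j]);
                m.col.push_back(j);
            }
    }
    m.rowptr.push_back(m.val.size());       // one past the last non-zero
    return m;
}
```

Run on the 5 x 9 sample of figure (1), `to_coo` reproduces the vectors of figure (2) and `to_csr` the row pointers (0, 2, 4, 7, 11, 13) of figure (3).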
Non_zero_vector (2 x 2 blocks, stored row by row):

[0.44 0.8]  [0   0  ]  [0.22 0]  [0 0.7]  [0.4 0  ]  [0   0  ]  [0.4 0]  [0.7 0]
[0    0.8]  [1.4 0  ]  [0.22 0]  [0 0  ]  [0   0.7]  [0.7 0.7]  [0   0]  [0   0]

Column_vector = (0, 2, 0, 2, 4, 6, 4, 8)
Row_vector    = (0, 2, 6, 8)

Fig ( 5 ): BSR storage structure

Table ( 1 ): Modified matrix generated for the sample collection for BSR

0.44  0.8   0     0     0     0     0     0     0     0
0     0.8   1.4   0     0     0     0     0     0     0
0.22  0     0     0.7   0.4   0     0     0     0     0
0.22  0     0     0     0     0.7   0.7   0.7   0     0
0     0     0     0     0.4   0     0     0     0.7   0
0     0     0     0     0     0     0     0     0     0

2.5 Fixed-Size Blocking [2]

In this approach, the matrix is written as the sum of several matrices, some of which contain the dense blocks of the matrix with a prespecified size. For instance, given a matrix (A), you can decompose it into two matrices (A12) and (A11) such that:

A = A12 + A11

where (A12) contains the (1x2) dense blocks of the matrix and (A11) contains the remainder. An example is illustrated in figure (6). A simple greedy algorithm is sufficient to extract the maximum number of (1 x L) blocks, where 1 < L ≤ n. However, the problem is more difficult for (k x L) blocks with k > 1. A similar problem has been studied in the context of vector processors [1], but those efforts concentrate on finding fewer blocks of larger size.

Fig ( 6 ): Fixed-size blocking with 1x2 blocks (A decomposed as A12 + A11)

2.6 Blocked Compressed Row Storage (BCRS) [2]

The idea of this scheme is to exploit the non-zeros in contiguous locations by packing them. Unlike fixed-size blocking, the blocks have variable lengths, which enables longer non-zero strings to be packed into a block. As in fixed-size blocking, if you know the column index of the first non-zero in a block, then you also know the column indices of all its other non-zeros. In other words, only one memory indirection (extra load operation) is required for each block.
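The greedy 1x2 extraction described above can be sketched as follows. This is our own sketch under the paper's description (the name `split_1x2` is hypothetical): adjacent horizontal non-zero pairs go into A12, every remaining non-zero into A11, so that A = A12 + A11.

```cpp
#include <cstddef>
#include <vector>

// Greedy 1x2 blocking: split a into a12 (1x2 dense blocks) and a11 (remainder).
void split_1x2(const std::vector<std::vector<double>>& a,
               std::vector<std::vector<double>>& a12,
               std::vector<std::vector<double>>& a11) {
    const std::size_t rows = a.size(), cols = a.empty() ? 0 : a[0].size();
    a12.assign(rows, std::vector<double>(cols, 0.0));
    a11.assign(rows, std::vector<double>(cols, 0.0));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j) {
            if (a[i][j] == 0.0) continue;
            if (j + 1 < cols && a[i][j + 1] != 0.0) {
                a12[i][j] = a[i][j];          // leftmost pair wins (greedy)
                a12[i][j + 1] = a[i][j + 1];
                ++j;                          // skip the second element of the pair
            } else {
                a11[i][j] = a[i][j];          // unpaired non-zero stays in a11
            }
        }
}
```

Scanning each row left to right and always taking the leftmost available pair is what makes the 1 x L case easy; as the text notes, the k x L case with k > 1 has no such simple greedy solution.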
This storage scheme requires an array of length the number of blocks, Nzptr, in addition to the other three arrays used in CRS: a floating-point array Af (of length the number of non-zeros) to store the non-zero values, a column-index array Colind (of length the number of blocks) holding the column of the first non-zero of each block, and an array Rowptr (of length the number of rows plus one) to point to the position where the blocks of each row start. Nzptr stores the location in Af of the first non-zero of each block. We refer to this storage scheme as blocked compressed row storage (BCRS). Figure (7) presents an example of BCRS.

5.  1.  7.  0   0
0   1.  0   2.  3.
0   2.  4.  0   0
0   0   1.  3.  0
0   6.  0   0   3.

Af     = (5., 1., 7., 1., 2., 3., 2., 4., 1., 3., 6., 3.)
Colind = (1, 2, 4, 2, 3, 2, 5)
Rowptr = (1, 2, 4, 5, 6, 8)
Nzptr  = (1, 4, 5, 7, 9, 11, 12, 13)

Fig ( 7 ): Example of blocked compressed row storage

2.7 Jagged Diagonal Storage (JDS) [7]

The Jagged Diagonal Storage format can be useful for the implementation of iterative methods on parallel and vector processors. Like the Compressed Diagonal Storage (CDS) format, it gives a vector length essentially of the size of the matrix, and it is more space-efficient than CDS at the cost of a gather/scatter operation. A simplified form of JDS, called ITPACK storage or Purdue storage, can be described as follows. In the matrix of figure (8), all elements are shifted left:

Fig ( 8 ): Matrix (A) and the shifted matrix

after which the columns are stored consecutively. All rows are padded with zeros on the right to give them equal length. Corresponding to the array of matrix elements val(:,:), an array of column indices, col_ind(:,:), is also stored. The JDS format for the above matrix, using the linear arrays {perm, jdiag, col_ind, jd_ptr}, is given in figure (9) (jagged diagonals are separated by semicolons).

Fig ( 9 ): JDS format for matrix (A)

2.8 Double-Linked Structure

When a large matrix is sparse, with a high proportion of its entries zero (or some other fixed value), it is convenient to store only the non-zero entries of the matrix.
The representation of such a sparse matrix is a doubly-linked structure: each non-zero element belongs to two lists, a list of the non-zero elements of its column and one of its row. Each list is ordered according to the appearance of the elements in the left-to-right or top-to-bottom traversal of the row or column, respectively.

3. PROPOSED ALGORITHM

3.1 NK-Algorithm

Computing a compressed sparse matrix is a big challenge, since most large sparse matrices must be compressed for storage. It is more efficient to store only the non-zeros of a sparse matrix. This assumes that the sparsity is large, i.e., that the number of non-zero entries is a small percentage of the total number of entries. If there is only an occasional zero entry, the cost of exploiting the sparsity actually slows down the computation compared to simply treating the matrix as dense, meaning that all the values, zero and non-zero, are used in the computation. There are a number of common storage schemes used for sparse matrices, but most of them employ the same basic technique: compress all of the non-zero elements of the matrix into a linear array, and then provide some number of auxiliary arrays to describe the locations of the non-zeros in the original matrix.

The goal of this paper is to develop an efficient algorithm to compress, or map, a sparse matrix, called the NK-Algorithm, which concentrates on two-dimensional sparse matrices. The NK-Algorithm transforms a two-dimensional sparse matrix (call it matrix (A), figure (11)) into two vectors (one-dimensional arrays each), named (AV) and (AI). The first vector, (AV), holds all the non-zero values of matrix (A) in their original sequence (row by row, and from left to right within each row); it has the same element type as matrix (A).

The second vector is generated by the following steps:
1. Replace each non-zero value in matrix (A) with the value (1); the new matrix (RA) thus generated consists of zeros and ones (0, 1), figure (12).
2. Store the new matrix (RA) in the second vector, named (AI), in the same sequence (row by row, and from left to right within each row). Vector (AI) is either a bit vector (each cell represented by one bit), which is the simplest way to access the matrix data, or a byte vector (each cell represented by one byte, obtained by grouping each 8 contiguous bits into one byte and storing the resulting value in (AI)).

The first cell in vector (AV) holds the number of columns of matrix (A). In this case the size of vector (AV) equals:

number of non-zero values + 1

while the size of vector (AI), at one bit per element, equals (in bytes):

12.5% of the number of elements in matrix (A)

The size needed to represent matrix (A) using the NK-Algorithm is therefore AV + AI. It is clear that the size needed to represent any sparse matrix (A), regardless of its original size, type, and number of non-zero values, equals:

(number of non-zero values in matrix (A) + 1) elements + 12.5% of the element count of matrix (A) in bytes

Figure (10) expresses the relation between the number of elements in matrix (A) (the size of matrix (A)) and the compression percentage of matrix (A) for different element types when using the NK-Algorithm. The figure shows that the compression percentage for each element type is constant regardless of changes in the size of matrix (A).
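The size bookkeeping above can be sketched as follows; this is our reading of the formula (the name `nk_bytes` is ours): AV holds the non-zeros plus one header cell, each in the matrix's element width, and AI holds one bit per matrix element.

```cpp
#include <cstddef>

// Compressed size of the NK representation, in bytes:
// AV = (nnz + 1) cells of bytes_per_elem each; AI = one bit per matrix element.
std::size_t nk_bytes(std::size_t total_elems, std::size_t nnz,
                     std::size_t bytes_per_elem) {
    const std::size_t av = (nnz + 1) * bytes_per_elem;  // values + column count
    const std::size_t ai = (total_elems + 7) / 8;       // bitmap, rounded up to bytes
    return av + ai;
}
```

For the worked example below (a 10 x 12 matrix of 2-byte integers with 24 non-zeros), this gives (24 + 1) * 2 + 120 / 8 = 65 bytes.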
Fig ( 10 ): Relation between compression percentage and the size of matrix (A) for different types of matrices (the proportion of non-zero values for this chart is 40%)

Now, to explain how matrix (A) is replaced by the two vectors (AV) and (AI), take the matrix (A) shown in figure (11):

0   0   0   5   0   0   0   0   3   0   0   0
9   0   4   0   0   0   0   8   0   0   0   0
0   0   0   0   2   0   0   0   0   0   0   0
0   0   0   6   7   0   0   0   0   3   0   0
4   0   3   0   0   0   0   0   0   0   0   0
0   12  0   0   0   4   0   0   0   0   5   0
0   0   0   13  0   0   0   6   0   2   0   0
0   0   0   0   0   0   0   0   18  0   0   0
0   0   0   6   0   0   0   16  0   0   15  0
0   0   4   0   0   20  0   0   0   0   0   7

Fig ( 11 ): Matrix (A)

Matrix (A) is a two-dimensional integer matrix (10 x 12), which means it contains (120) integer elements. If each integer value is represented with two bytes, then (240 bytes) are needed to store matrix (A). Matrix (A) is a sparse matrix with 20% non-zero values (24 of 120 elements).

Vector (AV), of integer type:

AV = [12, 5, 3, 9, 4, 8, 2, 6, 7, 3, 4, 3, 12, 4, 5, 13, 6, 2, 18, 6, 16, 15, 4, 20, 7]

The size of vector (AV) is (25 integer elements), equivalent to (50 bytes) in memory.

To generate vector (AI), first replace each non-zero value in matrix (A) with one, to get the matrix shown in figure (12):

0 0 0 1 0 0 0 0 1 0 0 0
1 0 1 0 0 0 0 1 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0
0 0 0 1 1 0 0 0 0 1 0 0
1 0 1 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 0 0 0 1 0
0 0 0 1 0 0 0 1 0 1 0 0
0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 1 0 0 1 0
0 0 1 0 0 1 0 0 0 0 0 1

Fig ( 12 ): New generated matrix (RA) (after replacing each non-zero value by one)

Then generate vector (AI). Vector (AI), of bit type:

AI = [0,0,0,1,0,0,0,0,1,0,0,0, 1,0,1,0,0,0,0,1,0,0,0,0, 0,0,0,0,1,0,0,0,0,0,0,0, 0,0,0,1,1,0,0,0,0,1,0,0, 1,0,1,0,0,0,0,0,0,0,0,0, 0,1,0,0,0,1,0,0,0,0,1,0, 0,0,0,1,0,0,0,1,0,1,0,0, 0,0,0,0,0,0,0,0,1,0,0,0, 0,0,0,1,0,0,0,1,0,0,1,0, 0,0,1,0,0,1,0,0,0,0,0,1]
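The whole transform, including the byte packing used next, can be sketched as follows. This is our own sketch (the name `nk_encode` is ours): AV gets the column count followed by the non-zeros in row-major order, and AI is the bitmap of matrix (RA) packed 8 cells per byte, most significant bit first (an assumed bit order, chosen because it reproduces the byte values listed for the example).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// NK transform sketch: av = [column count, non-zeros in row-major order],
// ai = one bit per matrix element (1 where non-zero), packed MSB-first.
void nk_encode(const std::vector<std::vector<int>>& a,
               std::vector<int>& av, std::vector<std::uint8_t>& ai) {
    const std::size_t cols = a.empty() ? 0 : a[0].size();
    av.assign(1, static_cast<int>(cols));      // AV's first cell: column count
    ai.assign((a.size() * cols + 7) / 8, 0);
    std::size_t p = 0;                         // running element position
    for (const auto& row : a)
        for (int v : row) {
            if (v != 0) {
                av.push_back(v);                                          // next non-zero
                ai[p / 8] |= static_cast<std::uint8_t>(0x80u >> (p % 8)); // set its bit
            }
            ++p;
        }
}
```

Applied to the 10 x 12 matrix (A) of figure (11), this yields the 25-element AV above and the byte vector [16, 138, 16, 8, 1, 132, 160, 4, 66, 17, 64, 8, 17, 34, 65].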
Or, vector (AI) of byte type:

AI = [16, 138, 16, 8, 1, 132, 160, 4, 66, 17, 64, 8, 17, 34, 65]

The size of vector (AI) is (120 bit elements), equivalent to (15) bytes.

The total size needed to represent matrix (A) after using the NK-Algorithm is:

50 + 15 = 65 bytes

which is equivalent to 27% of matrix (A); the compression percentage for matrix (A) is therefore 73%.

The relation between the compression percentage and the proportion of non-zero values in the matrix is inverse: decreasing the proportion of non-zero values increases the compression percentage, and vice versa. Figure (13) shows this relation for different types of matrix elements.

Fig ( 13 ): Relation between the percentage of non-zero values in the matrix and the compression ratio, for elements of 2, 4, and 8 bytes

Note: the relation between the compression percentage and the type of matrix when using the NK-Algorithm is shown in figure (14): increasing the number of bytes representing each element of matrix (A) increases the compression percentage. Figure (14) shows this relation for different proportions of non-zero values in matrix (A).

Fig ( 14 ): Relation between the number of bytes representing each element in the matrix and the compression ratio, for 25, 30, 35, and 40% non-zero values

3.2 How to Access the Matrix Data

This algorithm presents a simple and efficient way to access the elements of matrix (A) from the resulting vectors (AV) and (AI), either for reading or for updating data. There are two ways to retrieve data from (AV) and (AI):

1. Given a coordinate (x, y), the following part of a program finds the corresponding data.
    z = (x - 1) * AV[1] + y;      /* z is the position of the needed data in AI;
                                     AV[1] holds the number of columns */
    t = AI[z];                    /* value at position z in AI */
    j = 0;
    if (t != 0) {
        for (i = 1; i <= z; i++)  /* count the ones up to and including z */
            if (AI[i] == 1) j++;
        cout << AV[j + 1];        /* skip AV[1], the column count */
    }
    else
        cout << "Null Value";

2. The second case is to search for a specific value and retrieve the corresponding coordinate; the following part of a program finds the coordinate. Suppose (z) is the value to be found in the matrix.

    j = 1;
    for (i = 2; i <= n; i++)      /* n is the size of AV */
        if (AV[i] == z) { j = i; break; }
    if (j != 1) {
        c = 0;
        for (i = 1; i <= k; i++)  /* k is the size of AI */
            if (AI[i] == 1) {
                c++;
                if (c == j - 1) break;  /* AV[j] is the (j-1)-th non-zero */
            }
        row = (i - 1) / AV[1] + 1;
        col = (i - 1) % AV[1] + 1;
        cout << "Position = (" << row << ", " << col << ")";
    }

4. CONCLUSION

The NK-Algorithm gives a higher compression percentage than the known algorithms, and the percentage increases with increasing sparsity. The NK-Algorithm can be used with different proportions of non-zero values in matrix (A) and still gives an efficient compression percentage, while the traditional algorithms cease to be useful as the percentage of non-zero values grows (for many algorithms, the size of the resulting structures may exceed the size of matrix (A) when more than (25%) of the values are non-zero). The algorithm also gives easy access to the data, for both reading and updating.

References

[1] R. C. Agarwal, F. G. Gustavson, and M. Zubair, "A high performance algorithm using pre-processing for sparse matrix vector multiplication", Proceedings of Supercomputing '92, pp. 32-41.
[2] A. Pınar and M. T. Heath, "Improving performance of sparse matrix-vector multiplication". Research supported by the Center for Simulation of Advanced Rockets, funded by the U.S. Department of Energy through the University of California under subcontract number B341494.
[3] M. A. Bassiouni, "Data compression in scientific and statistical databases", IEEE Transactions on Software Engineering, Vol. SE-11, No. 10, 1985.
[4] F. Smailbegovic, G. N. Gaydadjiev, and S. Vassiliadis, "Sparse Matrix Storage Format", Computer Engineering Laboratory, Electrical Engineering, Mathematics and Computer Science, TU Delft, Mekelweg 4, 2628 CD Delft.
[5] N. Goharian, A. Jain, and Q. Sun, "Comparative Analysis of Sparse Matrix Algorithms for Information Retrieval", Information Retrieval Laboratory, Illinois Institute of Technology, Chicago, Illinois.
[6] S. Pissanetzky, "Sparse Matrix Technology", Academic Press, London, 1984.
[7] S. Hossain, "On Efficient Storage of Sparse Matrices", Department of Mathematics and Computer Science, University of Lethbridge, Canada, Matheon Workshop 2006.
[8] S. Stein and N. Goharian, "On the Mapping of Index Compression Techniques on CSR Information Retrieval", IEEE International Conference on Information Technology: Coding and Computing (ITCC'03), April 2003.