Data Structures in Java for
Matrix Computations
Geir Gundersen
Department of Informatics
University of Bergen
Norway
Joint work with Trond Steihaug
Overview
We will show how to utilize Java's native arrays for matrix computations.
 How to use Java arrays as a 2D array for efficient dense matrix computations.
 How to create efficient sparse matrix data structures using Java arrays.
 Object-oriented programming has been favored in the last decade(s):
 An easy-to-understand paradigm.
 Straightforward to build large-scale applications.
 Java will be used for (limited) numerical computations.
 Java has already been introduced as the programming language in introductory courses in scientific computation.
 Its impact on computing will lead new fields to use Java.
A “mathematical” 2D array
A 2D Java Array
 Array elements that refer to other arrays create a multidimensional array.
A true 2D Java Array
Java Arrays
 Java arrays are true objects.
 Thus creating an array is object creation.
 The objects in an array of objects are not necessarily stored contiguously.
 An array of objects stores references to the actual objects.
 The primitive elements of an array are most likely stored contiguously.
 An array of primitive elements holds the actual values of those elements.
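These points can be seen in a small sketch (the class and variable names are illustrative):

```java
// Demonstrates that a 2D Java array is an array of independent row objects.
public class JaggedArrayDemo {
    public static void main(String[] args) {
        double[][] a = new double[3][4];  // 3 row objects, each of length 4

        double[] row = a[0];   // keep a reference to the first row object
        a[0] = new double[7];  // rows are independent: one can be replaced,
                               // and rows may even have different lengths

        System.out.println(a[0].length);  // 7 (the new row)
        System.out.println(a[1].length);  // 4 (an untouched row)
        System.out.println(row.length);   // 4 (the old row object still exists)
    }
}
```

Because each `a[i]` is a separate object, nothing forces the rows to lie next to each other in memory.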
Frobenius Norm Example
The mathematical definition of the operation is:

s = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} A_{ij}

The next two code examples show the implementation of this mathematical definition in Java:
Loop-order (i,j):

double s = 0;
double[][] array = new double[m][n];
for (int i = 0; i < m; i++) {
    for (int j = 0; j < n; j++) {
        s += array[i][j];
    }
}

Loop-order (j,i):

double s = 0;
double[][] array = new double[m][n];
for (int j = 0; j < n; j++) {
    for (int i = 0; i < m; i++) {
        s += array[i][j];
    }
}
Frobenius Norm Example
 Basic observation: accessing consecutive elements in a row will be faster than accessing consecutive elements in a column.
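A minimal way to reproduce the observation is to time the two loop orders on the same array (the sizes and names are illustrative, and exact timings will vary by machine):

```java
// Compares row-order and column-order traversal of a 2D Java array.
public class TraversalOrder {
    public static double sumRowOrder(double[][] a) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            double[] ai = a[i];  // one row object, traversed contiguously
            for (int j = 0; j < ai.length; j++) s += ai[j];
        }
        return s;
    }

    public static double sumColumnOrder(double[][] a, int n) {
        double s = 0;
        for (int j = 0; j < n; j++)             // every step of the inner loop
            for (int i = 0; i < a.length; i++)  // touches a different row object
                s += a[i][j];
        return s;
    }

    public static void main(String[] args) {
        double[][] a = new double[2000][2000];
        for (double[] row : a) java.util.Arrays.fill(row, 1.0);

        long t0 = System.nanoTime();
        double s1 = sumRowOrder(a);
        long t1 = System.nanoTime();
        double s2 = sumColumnOrder(a, 2000);
        long t2 = System.nanoTime();

        System.out.println("row order:    " + (t1 - t0) / 1e6 + " ms, s = " + s1);
        System.out.println("column order: " + (t2 - t1) / 1e6 + " ms, s = " + s2);
    }
}
```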
Matrix Multiplication Algorithms
 The efficiency of the matrix multiplication operation depends on the details of the underlying data structure, in both hardware and software.
 We discuss several different implementations using Java arrays as the data structure:
 A straightforward matrix multiplication algorithm.
 A package implementation that is highly optimized.
 An algorithm that takes the row-wise layout fully into consideration and uses the same optimizing techniques as the package implementation.
Matrix Multiplication Algorithms
C_{ij} = \sum_{k=0}^{n-1} A_{ik} B_{kj},   i = 0,1,2,...,m-1,   j = 0,1,2,...,p-1
A straightforward matrix multiplication operation.
for (int i = 0; i < m; i++) {
    for (int j = 0; j < p; j++) {
        for (int k = 0; k < n; k++) {
            C[i][j] += A[i][k] * B[k][j];
        }
    }
}
Interchanging the three for loops gives six distinct loop orders (pure row, pure column, and partial row/column).
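As an illustration, the (i,k,j) order traverses A, B and C row by row; caching the row references is the kind of optimization a row-aware implementation can use (a sketch, with illustrative names, not the authors' exact code):

```java
// Pure row-oriented loop order (i,k,j): all three matrices are accessed
// row-wise, so each inner loop runs over one contiguous row object.
public class PureRowMultiply {
    public static double[][] multiply(double[][] A, double[][] B,
                                      int m, int n, int p) {
        double[][] C = new double[m][p];
        for (int i = 0; i < m; i++) {
            double[] Ai = A[i];      // row i of A, fetched once
            double[] Ci = C[i];      // row i of C, fetched once
            for (int k = 0; k < n; k++) {
                double aik = Ai[k];
                double[] Bk = B[k];  // row k of B, fetched once
                for (int j = 0; j < p; j++) {
                    Ci[j] += aik * Bk[j];
                }
            }
        }
        return C;
    }
}
```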
Matrix Multiplication Algorithms
Matrix Multiplication
Pure row: (k,i,j), (i,k,j).  Partial: (i,j,k), (j,i,k).  Pure column: (j,k,i), (k,j,i).

m=n=p    (k,i,j)    (i,k,j)    (i,j,k)    (j,i,k)    (j,k,i)    (k,j,i)
80            66         63         66         72         99        100
115          178        174        208        233        299        295
138          298        257        331        341        474        468
240         1630       1538       2491       2617       4457       4458
468        13690      13175      27655      28804      58351      56805
The loop orders tell us how the matrices involved are traversed in the course of the matrix multiplication operation.
We see the same time differences with pure row versus pure column as we did with the Frobenius norm example; this is the same effect.
Matrix Multiplication Algorithms
 The time differences are due to accessing a different object array for each element when traversing a column, as opposed to accessing the same object array several times when traversing a row.
 For a rectangular array of primitive elements, the elements of a row will be stored contiguously, but the rows may be scattered.
 The difference between row and column traversal is also an issue in FORTRAN, C and C++, but there the differences are not as significant.
JAMA
 A basic linear algebra package implemented in Java.
 It provides user-level classes for constructing and manipulating real dense matrices.
 It is intended to serve as the standard matrix class for Java.
 JAMA comprises six Java classes:
 Matrix:
 Matrix multiplication: A.times(B)
 CholeskyDecomposition
 LUDecomposition
 QRDecomposition
 SingularValueDecomposition
 EigenvalueDecomposition
Matrix Multiplication Operations
JAMA versus Pure-Row
 A comparison on input AB is shown for square matrices.
 The pure row-oriented algorithm has on average 30% better performance than JAMA's algorithm.
JAMA versus Pure-Row
 JAMA's algorithm is more efficient than the pure row-oriented algorithm on input Ab, by an average factor of two.
JAMA versus Pure-Row
 There is a significant difference between JAMA's algorithm and the pure row-oriented algorithm on bᵀA, by an average factor of 7.
 In this case JAMA is less efficient.
 The break-even results.
Sparse Matrices
 A sparse matrix is usually defined as a matrix where "many" of its elements are equal to zero.
 We benefit in both time and space by working only on the nonzero data structure.
 Currently there are no packages implemented in Java for matrix computation on sparse matrices that are as complete as JAMA (for dense matrices).
Sparse Matrix Concept
 The Sparse Matrix Concept (SMC) is a general object-oriented structure.
 The Rows objects store the arrays for the nonzero values and indices.
Java Sparse Array
 The Java Sparse Array (JSA) format is a new concept for storing sparse matrices, made possible with Java.
 One array stores the references to the value arrays, and one stores the references to the index arrays.
 Java's native arrays can store object references, so the extra Rows object layer of SMC is unnecessary in Java.
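A minimal sketch of the JSA layout (the class and method names are illustrative, not the authors' implementation):

```java
// Java Sparse Array: one array of value arrays and one array of index
// arrays, one pair per matrix row.
public class JavaSparseArray {
    private final double[][] value; // value[i]: nonzero values of row i
    private final int[][] index;    // index[i]: column indices of those values

    public JavaSparseArray(int rows) {
        value = new double[rows][];
        index = new int[rows][];
    }

    // Replacing a row touches only that row; the rest of the
    // structure is neither updated nor traversed.
    public void setRow(int i, int[] columns, double[] values) {
        index[i] = columns;
        value[i] = values;
    }

    public double get(int i, int j) {
        int[] idx = index[i];
        if (idx != null)
            for (int k = 0; k < idx.length; k++)
                if (idx[k] == j) return value[i][k];
        return 0.0; // element not stored: it is zero
    }
}
```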
Compressed Row Storage
 The most commonly used storage schemes for large sparse matrices:
 Compressed Row/Column Storage.
 These storage schemes have enjoyed several decades of research.
 The compressed storage schemes have minimal memory requirements.
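For comparison, a minimal sketch of Compressed Row Storage (names are illustrative): all nonzeros live in three flat arrays, which is compact, but means that changing the size of one row shifts the data of every following row.

```java
// Compressed Row Storage: values and column indices stored row by row in
// flat arrays; rowPointer[i] .. rowPointer[i+1]-1 are the entries of row i.
public class CompressedRowStorage {
    private final double[] value;
    private final int[] columnIndex;
    private final int[] rowPointer;

    public CompressedRowStorage(double[] value, int[] columnIndex,
                                int[] rowPointer) {
        this.value = value;
        this.columnIndex = columnIndex;
        this.rowPointer = rowPointer;
    }

    public double get(int i, int j) {
        for (int k = rowPointer[i]; k < rowPointer[i + 1]; k++)
            if (columnIndex[k] == j) return value[k];
        return 0.0; // element not stored: it is zero
    }
}
```

For example, the matrix [[1,0,2],[0,0,3],[4,5,0]] would be stored as value {1,2,3,4,5}, columnIndex {0,2,2,0,1}, rowPointer {0,2,3,5}.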
Numerical Results
Sparse Matrix Multiplication

m=n=p     nnz(A)     nnz(C)     CRS    JSA    SMC
115          421       1027       1      2      2
468         2820       8920      19     17     17
2205       14133      46199      21     38     36
4884      147631     473734     185    169    165
10974     219512     620957     207    228    278
17282     553956    2525937     829    642    628

These numerical results show that CRS, SMC and JSA have approximately the same performance.
Sparse Matrix Update
 Consider the outer product abᵀ of two vectors a, b where many of the elements are 0.
 The outer product will be a sparse matrix with some rows where all elements are 0, and the corresponding sparse data structure will have rows without any elements.
 A typical operation is a rank-one update of an n × n matrix A:

A_{ij} \leftarrow A_{ij} + a_i b_j,   i = 0,1,2,...,n-1,   j = 0,1,2,...,n-1

 where a_i is element i of a and b_j is element j of b. Thus only those rows of A where a_i is different from 0 need to be updated.
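The row-skipping idea can be sketched on a plain 2D array (a sparse structure would instead replace whole rows; the names are illustrative):

```java
// Rank-one update A <- A + a b^T that visits only the rows where a[i] != 0,
// since every other row of the outer product a b^T is entirely zero.
public class RankOneUpdate {
    public static void update(double[][] A, double[] a, double[] b) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == 0.0) continue; // row i of a b^T is all zeros: skip
            double ai = a[i];
            double[] Ai = A[i];
            for (int j = 0; j < b.length; j++)
                Ai[j] += ai * b[j];
        }
    }
}
```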
Numerical Results
Sparse Matrix Update

m=n=p     nnz(A)    nnz(B)    nnz(new A)     CRS    JSA
115          421         7           426      11      0
468         2820       148          2963      13      1
2205       14133       449         14557      44      8
4884      147631      2365        149942     183      8
10974     219512      1350        220104     753      8
17282     553956       324        554138    1806     11

These numerical results show that JSA is more efficient than CRS by an average factor of 78, which is significant.
Concluding Remarks
 Using Java arrays as a 2D array for dense matrices, we need to consider that the rows are independent objects.
 Other suggestions to eliminate the row versus column "problem":
 Cluster the row objects together in memory.
 Create a Java array class, avoiding arrays of arrays.
 Java Sparse Array:
 Rows of the structure can be manipulated without updating or traversing the rest of the structure, unlike Compressed Row Storage.
 More efficient, lower memory requirements, and a more natural notation than SMC.
 People will use Java for numerical computations, so it may be worthwhile to invest time and resources in finding out how to use Java for numerical computation.
 This work has given ideas of how some constructions in Java restrict natural development (rows versus columns).
 Java has flexibility that is not yet fully explored (Java Sparse Array).