Lazy Evaluation in Numeric Computing

20 October 2006
Igor Nikolaevich Skopin

Agenda

• Elementary introduction to functional programming: reduction, composition and mapping functions
• Lazy evaluation:
  – Scheme
  – Principles
  – Imperative examples
  – Analysis
  – Necessity in procedure calls
• Dependencies of lazy evaluation on the programming style
• Newton-Raphson square roots — a functional program
• Numerical differentiation
• FC++ library
  – "Lazy list" data structure in FC++
• Memoization
• Lazy evaluation in Boost/uBLAS: vectors/matrices expressions
  – Vectors/matrices expressions and memoization
  – Style of programming
• The myths about lazy evaluation
• Q&A

Gluing Functions Together: reduce

Lists are built from nil and cons:

  list of x = nil | cons x (list of x)

  []       ≡  nil
  [1]      ≡  cons 1 nil
  [1,2,3]  ≡  cons 1 (cons 2 (cons 3 nil))

A recursive sum — only 0 and addition are specific to summing:

  sum nil             = 0
  sum (cons num list) = num + sum list

Factoring the specific parts out gives reduce:

  reduce f x nil        = x
  reduce f x (cons a l) = f a (reduce f x l)

  sum = reduce add 0         where add x y = x + y
  reduce add 0 [1,2,3]       ⇒  1+2+3+0
  product = reduce multiply 1
  reduce multiply 1 [1,2,3]  ⇒  1*2*3*1
  reduce cons nil α          ⇒  copy of α      (foldr (:) [] in Haskell)

reduce is a function of 3 arguments, but here it is applied to 2 only ⇒ the result is a function!

  append a b = reduce cons b a

  append [1,2] [3,4] = reduce cons [3,4] [1,2]
    = reduce cons [3,4] (cons 1 (cons 2 nil))
    = cons 1 (reduce cons [3,4] (cons 2 nil))
    = cons 1 (cons 2 (reduce cons [3,4] nil))
    = cons 1 (cons 2 [3,4])

Gluing Functions Together: Composition and Map

A function to double all the elements of a list (reduce f nil expands f over the list):

  doubleall = reduce doubleandcons nil
    where doubleandcons num list = cons (2*num) list

Only doubling is specific to this definition. Further:

  doubleandcons = fandcons double
    where double n = 2*n                        (an arbitrary function)
          fandcons f el list = cons (f el) list

Function composition — the standard operator ".":

  (f . g) h = f (g h)

This definition is correct:

  fandcons f el = (cons . f) el = cons (f el),   so  fandcons f = cons . f

Next version of doubleall:

  doubleall = reduce (cons . double) nil

Function map (apply f to all the elements of a list):

  map f = reduce (cons . f) nil

Final version of doubleall, and one more example:

  doubleall = map double
  summatrix = sum . map sum

Lazy Evaluation: Scheme

F and G are programs, and we want their composition (G . F): G (F input). It is possible to use a temporary file — F input → tF; G tF — but that is not good! The attractive approach is to make requests for computation. More precisely: G is started first; when G needs data produced by F, G is held up and F runs; when the data are ready, F is held up and G resumes with the needed data; and so on.

Lazy Evaluation: Principles

Postulates:
• Any computation is activated if and only if it is necessary for one or more other computations. This situation is named the necessity of a computation.
• An active computation is stopped if and only if its necessity vanishes (for example, it has been satisfied).
• The computation as a whole is activated forcibly, as a request (necessity) to obtain the results of the computational system's execution.

Consequences:
• The activation of all computational units is driven by a dataflow started when the necessity of the computation as a whole arises.
• The control flow of the computation is not considered an a priori defined process. It is formed dynamically, by the necessities of computations only.

Lazy Evaluation: Imperative Examples

Boolean expression ℜ ≡ α&β&γ:

  (α == false)                ⇒  ℜ == false
  (α == true) & (β == false)  ⇒  ℜ == false

  if ( (precond) && (init) && (run) && (close) ) { printf ("OK!"); } else …

Arithmetic expression:

  x = ( … ) * 0;  ⇒  x = 0;   without computation of (…)!

The necessity of a computation may not appear at all!
Lazy and Eager Evaluation of File Handling

  cat File_F | grep WWW | head -1

(analyzed on a following slide)

Vector/Matrix Expressions

When we write

  vector a(n), b(n), c(n), d(n);
  a = b + c + d;

the compiler does the following (the same for matrices):

  Vector* _t1 = new Vector(n);
  for (int i = 0; i < n; i++) (*_t1)(i) = b(i) + c(i);
  Vector* _t2 = new Vector(n);
  for (int i = 0; i < n; i++) (*_t2)(i) = (*_t1)(i) + d(i);
  for (int i = 0; i < n; i++) a(i) = (*_t2)(i);
  delete _t2;
  delete _t1;

So we have created and deleted two temporaries! What we would like is a single loop with no temporaries:

  for (int i = 0; i < n; i++) a(i) = b(i) + c(i) + d(i);

Not everything is so good! We'll discuss this later.

Analysis of cat File_F | grep WWW | head -1: Lazy and Eager Evaluation

Eager variant of execution: cat writes all F strings of File_F to stdout; grep reads them all and writes the strings containing 'WWW'; head -1 takes one string. (Suppose the needed string is not detected at the first steps; we omit the intermediate buffering here.)

Lazy variant of execution: head -1 requests one string from grep; grep requests strings from cat until one contains 'WWW'; cat reads File_F only as far as is needed.

A nice question: is this version more correct? Note: the resulting string is the same object in both cases! The UNIX pipeline may be considered an optimization of lazy evaluation (in this case)!

Analysis of the Vector/Matrix Example

Consider the example more closely:

  a = b + c + d;   ⇒   t1 = b + c;  t2 = t1 + d;  a = t2;

In the traditional scheme the order of calculations is I (t1 = b + c), then II (t2 = t1 + d), then III (a = t2): I ⇒ II ⇒ III (or I ⇒ II ∪ III). With lazy evaluation the necessity of computation appears only when a[i] is needed: each element a(i) = b(i) + c(i) + d(i) is computed directly, without temporaries.

Necessity in Procedure Calls

  procedure P (in a, out b)
  { … a … b = …; … }
  …
  P (6*8, x);
  …
  r = x*5;

When does the necessity of computations appear?
  Q ( in a )
  { …
    a = 7; …
    r = 9; …
    t = 2 + a; …
  }
  …
  r = 1;
  Q ( r + 5 );

• in parameters and out parameters
• Real and forced necessity
• There are many details
• Ingerman's thunks (Algol 60)
• Only forced necessity is possible in imperative languages

What hampers the real necessity?
• Dependences on the context
• The possibility of reassignment for variables

SISAL and other single-assignment languages exist, but this palliative does not seem good.

Dependencies of Lazy Evaluation on the Programming Style

Functional programming:
• Exactly needed necessities
• Automatic, dataflow-driven necessities
• Function-combining and composition oriented approach
• Declarative programming

Operational programming:
• Forcibly arising necessities
• Manual control flow; agreement-driven necessities
• Control-flow and data-transforming oriented approach
• Imperative programming

Nevertheless, there are useful possibilities for applying lazy evaluation in both cases! The key notion is the definition of necessities. The functional style may be characterized as a style that allows automatic, dataflow-driven necessities.

Newton-Raphson Square Roots

Functional programs are inefficient. Is it true?

Algorithm:
– start from an initial approximation a0;
– compute a better approximation by the rule a(n+1) = (a(n) + N/a(n)) / 2.

If the approximations converge to some limit a, then a = (a + N/a) / 2, so 2a = a + N/a, a = N/a, a*a = N ⇒ a = squareroot(N).

Imperative program (monolithic):

      X = A0
      Y = A0 + 2.*EPS
  100 IF (ABS(X-Y).LE.EPS) GOTO 200
      Y = X
      X = (X + N/X) / 2.
      GOTO 100
  200 CONTINUE

This program is indivisible in conventional languages. We want to show that it is possible to obtain:
• a simple functional program,
• a technique for improving it.
The result is a very expressive program!

Newton-Raphson Square Roots: Functional Program

• First version:

  next N x = (x + N/x) / 2

Repeated application of f = next N gives the sequence [a0, f a0, f(f a0), f(f(f a0)), ..]:
  repeat f a = cons a (repeat f (f a))
  repeat (next N) a0

  within eps (cons a (cons b rest))
    = b                           if abs(a-b) <= eps
    = within eps (cons b rest)    otherwise

  sqrt a0 eps N = within eps (repeat (next N) a0)

• Improvement — a relative tolerance:

  relative eps (cons a (cons b rest))
    = b                             if abs(a-b) <= eps*abs(b)
    = relative eps (cons b rest)    otherwise

  relativesqrt a0 eps N = relative eps (repeat (next N) a0)

Numerical Differentiation

  easydiff f x h = (f (x + h) - f x) / h

A problem: small h ⇒ small (f (x + h) - f x) ⇒ rounding error.

  differentiate h0 f x = map (easydiff f x) (repeat halve h0)
  halve x = x/2

  within eps (differentiate h0 f x)     (1)

But this sequence of approximations converges fairly slowly. Let A be the right answer and B*(h**n) the error term of a method of order n. Using the halving property of the sequence of step sizes:

  a(i)   = A + B*(2**n)*(h**n)
  a(i+1) = A + B*(h**n)

Eliminating the error term gives

  A = (a(i+1)*(2**n) - a(i)) / (2**n - 1)

  elimerror n (cons a (cons b rest)) =
    cons ((b*(2**n)-a)/(2**n-1)) (elimerror n (cons b rest))

But n is unknown. It can be estimated from any three successive approximations, n ≈ log2 ((a(i) - a(i+2)) / (a(i+1) - a(i+2)) - 1):

  order (cons a (cons b (cons c rest))) = round (log2 ((a-c)/(b-c) - 1))

So a general function to improve a sequence of approximations is

  improve s = elimerror (order s) s

More efficient variants:

  within eps (improve (differentiate h0 f x))                       (2)
  within eps (improve (improve (improve (differentiate h0 f x))))   (3)

Variant (3) is a fourth-order method. Going further:

  super s = map second (repeat improve s)
  second (cons a (cons b rest)) = b

  within eps (super (differentiate h0 f x))   (4)

Here repeat improve is used to get a sequence of more and more improved sequences. So we obtain a very sophisticated algorithm in an easy manner.

FC++ Library

• Higher-order functions — functions with functional arguments
• The FC++ library is a general framework for functional programming
• Polymorphic functions — passing them as arguments to other functions and returning them as results
• Support for higher-order polymorphic operators like compose(): a function that takes two functions as arguments and returns a (possibly polymorphic) result
• A large part of Haskell
• Support for lazy evaluation
• Transformation of FC++ data structures ⇔ data structures of the C++ Standard Template Library (STL)
• Operators for promoting normal functions into FC++ functoids
• Finally, the library supplies "indirect functoids": run-time variables that can refer to any functoid with a given monomorphic type signature

"Lazy List" Data Structure in FC++

  List<int> integers = enumFrom(1);          // infinite list of all the integers 1, 2, …
  List<int> evens = filter(even, integers);  // infinite list of all the even integers 2, 4, …

  bool prime( int x ) { ... }                // simple ordinary algorithm
  filter( ptr_to_fun(&prime), integers );    // ptr_to_fun transforms a normal function into a functoid

Currying: plus (x, y) ⇒ x+y, while plus (2) ⇒ a function mapping y to 2+y.

Limitations:
– Lambda functions
– Dependences on the context (?)
– The possibility of reassignment for variables (?)
– The template technique is insufficient (blitz++)

Memoization

Fibonacci example: F(n) = F(n-1) + F(n-2). It is a classical bad case for imperative computations. Why? The previous functional programs are easier to develop and to understand. Why?

Imperative scheme:
1. We need to invoke every computation explicitly
2. Procedure calls depend on the context
3. Strong sequencing of computation units (hard for parallelization)
4. If we want to memoize previous results, we must do this explicitly
5. The memoization process is controlled by the programmer
6. Only manual transformation to a suitable scheme of data representation is possible
7. The loop head and the loop body are joined monolithically
8. Control-flow centric approach
9. Requires a difficult technique for def-use chains analysis

Functional scheme:
1. Expressions do not depend on the context
2. Context-independent procedure calls
3. The sequence of computation units is chosen by the execution system (more flexible for parallelization)
4. Automatic memoization
5. The memoization process is not controlled, but filters are allowed (indirect control)
6. The stack technique of program representation and execution is not appropriate
7. Constructions like reduce and composition of functions allow considering the loop head and the loop body independently
8. Dataflow centric approach
9. Suitable for def-use chains analysis

Lazy Evaluation in Boost/uBLAS: Vectors/Matrices Expressions

Example: A + prod (B, V)
– Assignment activates evaluation
– Indexes define the evaluation of the expression tree: writing the example initiates the following computation:

  for all i { A[i] + ∑k (B[i,k] * V[k]) }     (1)

• "for all" means compatibility of the computations (the order of computations is chosen by the execution system)
• It is possible (but not necessary) to have the representation A + [ ∑k (B[i,k] * V[k]) ], where "[ ]" denotes a vector constructor
• The necessity of computation is defined by this fragment for each i (possibly dynamically)
• Postulate that the "=" operator always leads to the appearance of the necessity of computations: the right-hand side should be computed and assigned to the left-hand side:

  D = A + prod (B, V);

• A common expression tree is used for the computations for each i in (1)
– Type coordination holds: a correct vectors/matrices expression includes constituents with types allowed by the operators of the expression (including prod and others)

Boost/uBLAS Vectors/Matrices Expressions: Temporary and Memoization Problems

• Consider the expression x = A*x (an error from the functional-style viewpoint, but correct in operational C++).
– The naive implementation

  for all i { x[i] = ∑k (A[i,k] * x[k]); }

is not correct: elements of x are overwritten while the old values are still needed.
– The suitable implementation uses a temporary t:

  for all i { t[i] = ∑k (A[i,k] * x[k]); }
  x = t;

The last assignment should not copy the value elementwise; it should copy a reference and delete the previous value of x.
• Now consider A * (B * x).
– If we do not use lazy evaluation, we obtain an n² complexity (C1*n² for B * x plus C2*n² for the other multiplication).
– But in a straightforward lazy case, where (B * x)[k] is recomputed for every use, the obtained complexity would be n³.
• It is not a problem for a real functional language implementation, because of the value-propagation technique (automatic memoization).
• Instead of the value-propagation technique, in a C++ implementation we can provide a temporary. Its assignment breaks the expression.
• We use temporaries in both cases. But what part of the information should really be saved?

Boost/uBLAS: Style of Programming

• Object-oriented style
  – Standard techniques and patterns should be used
• C++ style (in addition to the previous)
  – Standard template technique
• Vectors/matrices expressions (in addition to the previous)
  – Tendency to use matrix and vector objects instead of variables with indexes
  – Tendency to write expressions instead of simple statements
  – Use uBLAS primitives as specializations of general templates
  – Don't use direct class extensions by multilevel inheritance

Boost/uBLAS. Example Task: the Jacobi Method

• Consider the Jacobi method of solving the linear system

  ∑ j=1..n  a(i,j) * x(j) = b(i)

– We are able to write it using such formulas as

  x(i)^(k) = ( b(i) - ∑ j≠i  a(i,j) * x(j)^(k-1) ) / a(i,i)

– Instead of this, we should use it in matrix terms. Splitting A = L + D + U into the strictly lower triangular, diagonal and strictly upper triangular parts:

  x^(k) = -D^(-1) (L + U) x^(k-1) + D^(-1) b

– As the result, we obtain the following program (next slide).

Boost/uBLAS. Example Program: the Jacobi Method

  matrix<double> A (n, n);   // input data: A*X = B
  vector<double> B (n);      // input data
  vector<double> X (n);      // output data
  matrix<double> D (n, n);   // the diagonal part
  matrix<double> D_1 (n, n); // D^(-1)
  triangular_adaptor<matrix<double>, unit_lower> L (A);  // elements of the
  triangular_adaptor<matrix<double>, unit_upper> U (A);  //   uBLAS data structure
  identity_matrix<double> I (n, n);
  …
  D = A - L - U + 2*I;         // receiving the diagonal (L and U have unit diagonals)
  D_1 = inverse_matrix(D);     // we need to write a special function!
  for (k = 0; k < count; ++k)
      X = -prod (prod (D_1, L+U-2*I), X) + prod (D_1, B);

Boost/uBLAS Style of Programming Compared with the Use of Indexes

  for (k = 0; k < count; ++k) {
      for (i = 0; i < A.size1 (); ++i) {
          T (i) = 0;                          // we are forced to use a temporary
          for (j = 0; j < i; ++j)             // we are forced to split the loop
              T (i) += A (i, j) * Y (j);      //   to single out the diagonal
          for (j = i+1; j < A.size1 (); ++j)
              T (i) += A (i, j) * Y (j);
          T (i) = ( B (i) - T (i) ) / A (i, i);  // using D^(-1) directly: this may be
      }                                          //   better than the vector/matrix expression
      Y = T;
  }

• Obviousness is lost
• The sequence of computations is fixed rigidly (possibilities for the compiler's optimizations are lost)
• Program development is harder than in the alternative case
• The possibility of using indexes is not lost in vectors/matrices expressions
• Vectors/matrices expressions are more suitable for finding the patterns needed to use special optimized external libraries

Boost/uBLAS: the Gauss-Seidel Method

The Jacobi method:

  x^(k) = Ã x^(k-1) + b̃     (where Ã = -D^(-1)(L + U), b̃ = D^(-1) b),
  i.e.  x^(k) = F (x^(k-1))

The Gauss-Seidel method uses the components of x^(k) already computed at the current iteration, x^(k) = F (x^(k-1), x^(k)):

  x(i)^(k) = ( b(i) - ∑ j<i  a(i,j) * x(j)^(k) - ∑ j>i  a(i,j) * x(j)^(k-1) ) / a(i,i)

or, in matrix terms,

  x^(k) = (D + L)^(-1) (-U x^(k-1) + b)

The Jacobi method:

  D_1 = inverse_matrix(D);
  for (k = 0; k < count; ++k)
      X = -prod (prod (D_1, L+U-2*I), X) + prod (D_1, B);

The Gauss-Seidel method:

  D_1 = inverse_matrix(D+L);
  for (k = 0; k < count; ++k)
      X = prod (D_1, B - prod (U, X));

Recall the x = A*x problem: it is solved by a temporary (t = A*x; x = t). But the Gauss-Seidel method may be considered as the Jacobi method with the temporary avoided!
It is a new approach to the organization of computations!

Boost/uBLAS: Style of Programming (Limitations)

• Frequently a rejection of the vectors/matrices style is required
• There is a problem of compatibility of styles
• The presented set of operators on matrices and vectors is not closed:
  – Instead of a vectors/matrices operator "*" one should use the generic function prod (mv1, mv2), provided for all needed cases
  – uBLAS vector and matrix operators are not presented as an algebraic system (for instance, inversion "^(-1)" and norms "|| ||" are not offered, because of the problems of lazy evaluation of vectors/matrices expressions)

An acceptable approach may be proposed as follows:
  – The results of computing "^(-1)", "|| ||" and so on are considered attributes of the matrix and vector classes
  – These attributes are computed outside of the expression computation, under outside control, when the necessity arises
  – The expression's constituents with these operators are replaced by the extraction of the needed items from the corresponding attributes

The direction towards vector and matrix expressions is very promising.

The Myths about Lazy Evaluation

1. Lazy evaluation is possible only in functional languages.
   The examples above indicate that this is not right.
2. Lazy evaluation and functional languages may be applied only in the artificial intelligence area.
   As we have seen, this is not right: the use of higher-order functions is very promising in many cases.
3. We are able to realize lazy evaluation in an arbitrary programming language.
   In fact, the language imposes a lot of limitations on the use of lazy evaluation.*
4. The use of lazy evaluation decreases performance.
   This depends only on the quality of the programming of the algorithms.

* C++ is criticized from this point of view by the blitz++ project developers.

Q&A