Lazy Functional Parser Combinators in Java
Picking the best of two worlds

Atze Dijkstra & Doaitse Swierstra
[email protected], [email protected]
http://www.cs.uu.nl/research/techreps/UU-CS-2001-18.html
Jan 2003, Utrecht University, ICS


About...

• Laziness + functions in Java: Jazy
  - usage + implementation
• Used for parser combinators
  - usage (+ implementation)
  - 'technology transfer' to Java
• Motivated by
  - flexibility and elegance of combinators
  - desire for 'easy' functional programming in Java


History

• First attempt at Java parser combinators
  - list based backtracking via arrays
  - very inefficient: all solutions calculated
• Second attempt
  - only lazy lists
  - inefficient: backtracking inherently inefficient
• Final attempt
  - lazy evaluation engine + translation of (error correcting) Haskell parser combinators


End product of this talk

Haskell:

    pNat = foldl (\a b -> a*10 + b) 0 <$> pList1 pDigit

Java:

    Object pNat =
      p.pApp.apply2
        ( Prelude.foldl.apply2
            ( new Function2()
                { public Object eval2( Object a, Object b )
                  { return Int.valueOf( Int.evalToI(a) * 10 + Int.evalToI(b) ) ; }
                }
            , Int.Zero
            )
        , p.pList1.apply1( pDigit )
        ) ;


Preferred background

• Familiarity with
  - functional programming, Haskell
  - OO programming, Java
  - grammars
• Less familiarity with
  - functional language implementation, graph reduction
  - (parser) combinators


Content

• Laziness and functions in Java
  - mapping functional language to Java
• Parser combinators
  - as an example
• A larger example
  - small calculator
• And a demo


Jazy: Lazy Java

• Issues
  - mapping functions to something equivalent in Java
  - lazy evaluation
  - mixing Java & Jazy


Describing a calculation

Haskell:

    fac n = if n > 0 then n * fac (n-1) else 1

  Function is first class citizen:

    map fac [1,2,3,4]

Java:

    int fac( int n )
    {
      if ( n > 0 )
        return n * fac( n-1 ) ;
      else
        return 1 ;
    }

  As part of a Java class. Object is first class citizen: ??


Functions in Java

• How to enable usage of functions as first class citizens?
  - mold into Java's first class citizen: Object

    fac = new Function1() { ... } ;


Function

• Function (without side effect) definition
  - defines description of calculation
• Function invocation
  - 1) application to (evaluated) parameters +
  - 2) evaluation of calculation description +
  - 3) return & passing around of result
  - 4) using 'real' value of result
• Which are all combined in Java

    ... xx.fac( 4 ) ...


Laziness

• Laziness requires uncoupling of
  - use of application of function to (possibly unevaluated) parameters (1 & 3)
  - use of evaluation of an application (2 & 4)

    fac4 = fac.apply1( ... ) ;
    ... = eval( fac4 ) ;

• Proxy (pattern) for lazily calculated values


Haskell, Java, and Java + lazy functions compared

Haskell:

    fac n = if n > 0
            then n * fac (n-1)
            else 1

    putStr (show (fac 4))

Java:

    int fac( int n )
    {
      if ( n > 0 )
        return n * fac( n-1 ) ;
      else
        return 1 ;
    }

    ...println( xx.fac( 4 ) ) ;

Java + lazy functions:

    Eval fac = new Function1() {
      public Object eval1( Object n ) {
        return Prelude.ifThenElse.apply3
          ( Prelude.gt.apply2( n, Int.Zero )
          , Prelude.mul.apply2
              ( n, fac.apply1( Prelude.sub.apply2( n, Int.One ) ) )
          , Int.One
          ) ;
      }
    } ;

    System.out.println
      ( ((Int)Eval.eval( fac.apply1( Int.valueOf( 4 ) ) )).intValue() ) ;


Implementation

• Computation represented by graph of objects

[Class diagram. Recoverable labels: Object; Eval with applyX(..); Apply with evalSet(..); Function with nParams and evalX(..); List, Nil, Cons; Int; associations funcOrVal, boundParams 0..n, head, tail.]


Evaluation and usage

• Programmer
  - uses applyX() variants to construct calculation description (i.e. function body) +
  - defines calculation in evalX() variants
  - calls eval() when result is needed
• Underlying machinery
  - evaluator eval() calls evalSet() to evaluate an Apply graph node and replace it by the (shared) result (WHNF, graph reduction)
  - which calls evalX(...) of a Function to perform the evaluation


Laziness

• Does laziness really matter?
  - not for fac (fac is strict in its arguments)
• But for structures & applications
  - structure elements or application parameters may not be needed
  - evaluation can be avoided if not needed

Add 1 to a list of integers:

    addOne :: [Int] -> [Int]
    addOne []     = []
    addOne (l:ls) = l+1 : addOne ls

    main = print (take 5 (addOne (repeat 1)))


Strict Java addOne

    List addOne( List l ) {
      if ( l.isEmpty() )
        return List.Nil ;
      else
        return List.Cons
          ( Prelude.add( l.head(), Int.One )
          , addOne( l.tail() )
          ) ;
    }

    List lInfinite = ... ;
    IO.showln( ...take...( ..., addOne( lInfinite ) ) ) ;

Nonterminating recursive invocation of addOne.


Lazy Java addOne

    static Eval addOne = new Function1() {
      public Object eval1( Object ll ) {
        List l = List.evalToL( ll ) ;
        if ( l.isEmpty() )
          return List.Nil ;
        else
          return List.Cons
            ( Prelude.add.apply2( l.head(), Int.One )
            , addOne.apply1( l.tail() )
            ) ;
      }
    } ;

    Object lInfinite = Prelude.repeat.apply1( Int.One ) ;
    IO.showln( Prelude.take.apply2( Int.Five, addOne.apply1( lInfinite ) ) ) ;

Recursive invocation only when necessary (lazy). Writing the tail as eval( addOne.apply1( l.tail() ) ) instead leads back to the strict variant.


Programming lazily in Java

• Lack of typing (everything is an Object)
  - programming directly is error prone
• So
  - start with type correct functional program
  - manually (but systematically) translate to Java + Jazy
• As an example
  - parser combinator library + usage
  - where laziness is essential


Parser combinators

• Parsing problem
  - checking if input string is described by a grammar
  - do some calculation directed by input string and grammar
• Haskell library of functions
  - no parser generator needed
  - allows abstractions


Grammar

• Grammar for a language resembles parser combinator
• E.g. grammar describing single character 'a'
  - as BNF: G ::= 'a'
  - as combinator: pG = pSym 'a'
• Combinators can be executed


Examples

    Demo> pG `on` "a"
    Result: 'a'
    Demo> pG `on` "aa"
    Errors: no correct parses: [('a',"a")]
    Demo> pG `on` "b"
    Errors: no correct parses: []

• p `on` i: parse input i with parser p, return result
• (error shows bit of implementation)


EBNF equivalents

                        EBNF     Combinator    Result
    symbol, terminal    's'      pSym 's'      's'
    alternative         x | y    x <|> y       result of x or y
    sequence            x y      x <*> y       x applied to y
    repetition          x*       pList x       list of results
    empty               ε        pSucceed x    x
    apply                        f <$> x       f (result of x)


Examples

    Parser                                       Correct input        Example    Incorrect example    Result of correct input
    pList (pSym 'a')                             list of 'a'          "aaaa"     "abaa"               "aaaa"
    toUpper <$> pSym 'a'                         single 'a'           "a"        "b"                  "A"
    (\x y -> [y,x]) <$> pSym 'a' <*> pSym 'b'    'a' followed by 'b'  "ab"       "ba"                 "ba"
    pList (pSym 'a' <|> pSym 'b')                list of 'a' or 'b'   "abbab"    "abcab"              "abbab"


Implementation

• Many varieties, simplest here

    type Parser s a = [s] -> [(a,[s])]

• Basic idea
  - calculate list of all possible solutions/parses
  - laziness avoids unnecessary calculation
• A Parser
  - is a function accepting an input list of symbols of type s and returning a list of possible correct parses; each parse consists of a result of type a and the remaining input


Implementation

• E.g. parsing a symbol

    pSym :: Eq s => s -> Parser s s
    pSym a (b:rest) = if a == b then [(b,rest)] else []
    pSym a []       = []

• pSym
  - if no input symbols left, no correct parse
  - if input available, and if the first symbol b is equal to the expected symbol a, return list with one correct parse


Implementation

• Basic combinators

    infixl 3 <|>
    infixl 4 <*>

    pSucceed :: a -> Parser s a
    (<|>)    :: Parser s a -> Parser s a -> Parser s a
    (<*>)    :: Parser s (b -> a) -> Parser s b -> Parser s a

    pSucceed v input = [ (v, input) ]
    (p <|> q)  input = p input ++ q input
    (p <*> q)  input = [ (pv qv, rest)
                       | (pv, qinput) <- p input
                       , (qv, rest)   <- q qinput
                       ]


Example

• Expression (what else :-))

    <expr>   ::= <term> (('+' | '-') <term>)* .
    <term>   ::= <factor> (('*' | '/') <factor>)* .
    <factor> ::= ('0'..'9')+ | '(' <expr> ')' .

• Factor

    pFact =   pNat
          <|> pSucceed (\_ e _ -> e) <*> pSym '(' <*> pExpr <*> pSym ')'

• Result: Int value represented by expression


Abstractions

• Higher level (order) abstractions for
  - application of function to parse result: <$>
  - throwing away result: <*, *>
  - embracing by (delimiters): pPacked, pParens

    f <$> p       = pSucceed f <*> p
    p <*  q       = (\ x _ -> x) <$> p <*> q
    p  *> q       = (\ _ x -> x) <$> p <*> q
    pPacked l r x = l *> (x <* r)
    pParens       = pPacked (pSym '(') (pSym ')')

    pFact = pNat <|> pParens pExpr


Expression example

    <expr>   ::= <term> (('+' | '-') <term>)* .
    <term>   ::= <factor> (('*' | '/') <factor>)* .
    <factor> ::= ('0'..'9')+ | '(' <expr> ')' .

    pDigit = (\d -> ord d - ord '0') <$> pAnySym ['0'..'9']
    pNat   = foldl (\a b -> a*10 + b) 0 <$> pList1 pDigit
    pFact  = pNat <|> pPacked (pSym '(') (pSym ')') pExpr
    pTerm  = pChainl ( (*) <$ pSym '*' <|> div <$ pSym '/' ) pFact
    pExpr  = pChainl ( (+) <$ pSym '+' <|> (-) <$ pSym '-' ) pTerm


Mapping to Java

• Compilation scheme for
  - Application (of function to expression(s)): f a1 ... an
  - Abstraction (function definition): f a1 ... an = e
  - Choice (if, case, patterns): if c then t else e
• Plus convenience functions


Application

• f a1 ... an, 0 < n ≤ 5

    f.applyn( a1, ..., an )

• f a1 ... an, 5 < n

    f.applyN( new Object[] { a1, ..., an } )


Function definition

• f a1 ... an = e, 0 < n ≤ 5

    Function f = new Functionn() {
      Object evaln( Object a1, ..., Object an ) {
        return e ;
      }
    } ;

• f a1 ... an = e, 5 < n

    Function f = new FunctionN() {
      Object evalN( Object[] an ) {
        return e ;
      }
    } ;


Choice

• if c then t else e

    ( ( ((Bool)Eval.eval( c )).boolValue() ) ? t : e )

• In practice
  - via convenience functions and/or optimised (using e.g. strictness)
  - but always involves eval()


Example Java translation

Haskell:

    pSym a (b:rest) = if a == b then [(b,rest)] else []
    pSym a []       = []

Java:

    public class ... {
      ...
      public static final Eval pSym = new Function2( "pSym" ) {
        public Object eval2( Object sym, Object inp ) {
          List s_ss = (List)eval( inp ) ;
          if ( s_ss.isEmpty() )
            return List.Nil ;
          else if ( Bool.evalToB( Prelude.eq.apply2( sym, s_ss.head() ) ) )
            return List.one( new Tuple( s_ss.head(), s_ss.tail() ) ) ;
          else
            return List.Nil ;
        }
      } ;
      ...
    }


Example Java translation

Haskell:

    pFact = pNat <|> pPacked (pSym '(') (pSym ')') pExpr
    pTerm = pChainl ( (*) <$ pSym '*' <|> div <$ pSym '/' ) pFact

Java:

    ParserPrelude p = ... ;

    Object pFact =
      p.pOr.apply2
        ( pNat
        , p.pPacked.apply3
            ( p.pSym( '(' )
            , p.pSym( ')' )
            , pExpr
            )
        ) ;

    Object pTerm =
      p.pChainl.apply2
        ( p.pOr.apply2
            ( p.pAppL( Int.mul, p.pSym( '*' ) )
            , p.pAppL( Int.div, p.pSym( '/' ) )
            )
        , pFact
        ) ;


Example Java translation

Haskell:

    pNat = foldl (\a b -> a*10 + b) 0 <$> pList1 pDigit

Java:

    Object pNat =
      p.pApp.apply2
        ( Prelude.foldl.apply2
            ( new Function2()
                { public Object eval2( Object a, Object b )
                  { return Int.valueOf( Int.evalToI(a) * 10 + Int.evalToI(b) ) ; }
                }
            , Int.Zero
            )
        , p.pList1.apply1( pDigit )
        ) ;


Demo

• calculator (with expression parser)
• addOne


Other implementation issues

• Evaluation engine
  - graph reduction
• (Mutually) recursive definitions
  - e.g.: repeat x = xs where xs = x:xs (requires indirection Object)
• Absence of typing
  - everything is Object: error horror
  - class system: cannot use type info
• Optimisations
  - using OO mechanisms efficiently by optimised implementations tailored for nr of parameters given/available


Performance

• Speed of evaluation engine
  - per evaluation: 3-10 µsec (500 MHz G3 PowerPC, Java 1.1.8 (JIT), MacOS 9.1)
  - unoptimised C variant: 1.5 µsec
  - parser combinator uses ± 50 evaluations per symbol (optimised non-list variant)
  - comparable to Hugs on same platform (bit slower)
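
Appendix: a self-contained sketch of the laziness idea

The uncoupling of application from evaluation, the proxy for lazily calculated values, and the addOne example from the talk can be tried out in plain Java without the Jazy library. The sketch below is only an illustration under stated assumptions: Thunk, Cons, repeat, addOne, take and the evaluations counter are hypothetical names and not part of Jazy's API, lists are assumed to be infinite (there is no Nil case), and typed generics stand in for Jazy's untyped Object graph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names are taken from Jazy. */
final class LazySketch {

    /** Proxy for a lazily calculated value: evaluated at most once, result shared. */
    static final class Thunk<T> {
        private Supplier<T> computation; // description of the calculation; null once evaluated
        private T value;

        Thunk(Supplier<T> computation) { this.computation = computation; }

        T eval() {
            if (computation != null) {
                value = computation.get();
                computation = null;      // replace the description by its result
            }
            return value;
        }
    }

    /** Lazy list cell; head and tail stay unevaluated until someone asks. */
    static final class Cons {
        final Thunk<Integer> head;
        final Thunk<Cons> tail;

        Cons(Thunk<Integer> head, Thunk<Cons> tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    /** Counts the additions that were actually performed. */
    static int evaluations = 0;

    /** repeat x = xs where xs = x:xs; the array is the indirection the cell needs to refer to itself. */
    static Cons repeat(int x) {
        Cons[] xs = new Cons[1];
        xs[0] = new Cons(new Thunk<Integer>(() -> x), new Thunk<Cons>(() -> xs[0]));
        return xs[0];
    }

    /** addOne (l:ls) = l+1 : addOne ls; only builds descriptions, performs no addition yet. */
    static Cons addOne(Cons l) {
        return new Cons(
            new Thunk<Integer>(() -> { evaluations++; return l.head.eval() + 1; }),
            new Thunk<Cons>(() -> addOne(l.tail.eval())));
    }

    /** take n: forces exactly the first n heads of the list. */
    static List<Integer> take(int n, Cons l) {
        List<Integer> out = new ArrayList<>();
        Cons c = l;
        for (int i = 0; i < n; i++) {
            out.add(c.head.eval());
            c = c.tail.eval();
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> firstFive = take(5, addOne(repeat(1)));
        System.out.println(firstFive + " after " + evaluations + " additions");
    }
}
```

Running main prints "[2, 2, 2, 2, 2] after 5 additions": the list is conceptually infinite, yet only the five requested elements are computed. Calling l.tail.eval() eagerly inside addOne, instead of wrapping the recursion in a Thunk, would reproduce the nonterminating strict variant.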