Tools for Refactoring Functional Programs
Simon Thompson
with Huiqing Li, Claus Reinke
www.cs.kent.ac.uk/projects/refactor-fp
LIL 2006

Design
Models
Prototypes
Design documents
Visible artifacts

All in the code …
Functional programs embody their design in their code.
This is enabled by their high-level nature: constructs, types …

    data Message = Message Head Body
    data Head = Head Metadata Title
    data Metadata = Metadata [Tags]
    type Title = String
    …

Evolution
Successful systems are long lived …
… and evolve continuously.
Supporting evolution of code and design?

Soft-Ware
There's no single correct design …
… different options for different situations.
Maintain flexibility as the system evolves.

Refactoring
Refactoring means changing the design or structure of a program … without changing its behaviour.
[Diagram labels: Modify, Refactor]

Not just programming
Paper or presentation: moving sections about; amalgamate sections; move inline code to a figure; animation; …
Proof: add lemma; remove, amalgamate hypotheses, …
Program: the topic of the lecture

Splitting a function in two

Splitting a function

    module Split where

    f :: [String] -> String
    f ys = foldr (++) [] [ y++"\n" | y <- ys ]

Splitting a function

    module Split where

    f :: [String] -> String
    f ys = join [y ++ "\n" | y <- ys]
      where
        join = foldr (++) []

Splitting a function

    module Split where

    f :: [String] -> String
    f ys = join addNL
      where
        join zs = foldr (++) [] zs
        addNL = [y ++ "\n" | y <- ys]

Splitting a function

    module Split where

    f :: [String] -> String
    f ys = join (addNL ys)
      where
        join zs = foldr (++) [] zs
        addNL ys = [y ++ "\n" | y <- ys]

Splitting a function

    module Split where

    f :: [String] -> String
    f ys = join (addNL ys)

    join zs = foldr (++) [] zs

    addNL ys = [y ++ "\n" | y <- ys]

Overview
Example refactorings: what they involve.
Building the HaRe tool.
Design rationale.
Infrastructure.
Haskell and Erlang.
The Wrangler tool.
Conclusions.

Haskell 98
Standard, lazy, strongly typed, functional programming language.
Layout is significant … "offside rule" … and idiosyncratic.

    doSwap pnt = applyTP (full_buTP (idTP `adhocTP` inMatch
                                          `adhocTP` inExp
                                          `adhocTP` inDecl))
      where
        inMatch ((HsMatch loc fun pats rhs ds)::HsMatchP)
          | fun == pnt
            = case pats of
                (p1:p2:ps) -> do pats' <- swap p1 p2 pats
                                 return (HsMatch loc fun pats' rhs ds)
                _          -> error "Insufficient arguments to swap."
        inMatch m = return m

        inExp exp@((Exp (HsApp (Exp (HsApp e e1)) e2))::HsExpP)
          | expToPNT e == pnt
            = swap e1 e2 exp
        inExp e = return e

Why refactor Haskell?
The only design artefact is (in) the code.
Semantics of functional languages support large-scale transformations (?)
Building real tools to support functional programming … heavy lifting.
Platform for research and experimentation.

Lift / demote

Demoted form:

    f x y = … h …
      where
        h = …

Lifted form:

    f x y = … (h y) …

    h y = …

Demote: hide a function which is clearly subsidiary to f; clear up the namespace.
Lift: makes h accessible to the other functions in the module and beyond.
Free variables: which parameters of f are used in h?
Need h not to be defined at the top level, … ,
Type of h will generally change … .

Algebraic or abstract type?
Visible interface: Tr, Leaf, Node

    data Tr a
      = Leaf a
      | Node a (Tr a) (Tr a)

    flatten :: Tr a -> [a]
    flatten (Leaf x) = [x]
    flatten (Node _ s t) = flatten s ++ flatten t

Algebraic or abstract type?
Visible interface: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode

    data Tr a
      = Leaf a
      | Node a (Tr a) (Tr a)

    isLeaf = …
    isNode = …
    …

    flatten :: Tr a -> [a]
    flatten t
      | isLeaf t = [leaf t]
      | isNode t = flatten (left t) ++ flatten (right t)

Information required
Lexical structure of programs, abstract syntax, binding structure, type system and module system.

Program transformations
Program optimisation: source-to-source transformations to get more efficient code.
Program derivation: calculating efficient code from obviously correct specifications.
Refactoring: transforming code structure; usually bidirectional and conditional.
Refactoring = Transformation + Condition

Conditions: renaming f to g
"No change to the binding structure"
1. No two definitions of g at the same level.
2. No capture of g.
3. No capture by g.

Capture of renamed identifier

Before renaming:

    h x = … h … f … g …
      where
        g y = …

    f x = …

After renaming:

    h x = … h … g … g …
      where
        g y = …

    g x = …

Capture by renamed identifier

Before renaming:

    h x = … h … f … g …
      where
        f y = … f … g …

    g x = …

After renaming:

    h x = … h … g … g …
      where
        g y = … g … g …

    g x = …

Refactoring by hand?
By hand = in a text editor
Tedious
Error-prone
• Implementing the transformation …
• … and the conditions.
Depends on compiler for type checking, …
… plus extensive testing.

Machine support invaluable
Reliable
Low cost of do / undo, even for large refactorings.
Increased effectiveness and creativity.

Demonstration of HaRe, hosted in vim.
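
The capture conditions for renaming f to g, listed earlier, can be made concrete with a small Haskell sketch. It is not taken from HaRe: the names hBefore and hAfter are hypothetical, and the two versions of h are given different names only so that both fit in one file. The sketch assumes a naive textual renaming of every f to g, with no condition checking.

```haskell
-- Top-level function that is about to be renamed from f to g.
f :: Int -> Int
f x = x + 1

-- Before renaming: h calls the top-level f and its own local g.
hBefore :: Int -> Int
hBefore x = f x + g x
  where
    g y = y * 10

-- The top-level definition under its new name (same body as f).
g :: Int -> Int
g x = x + 1

-- After the naive renaming inside h: the call that used to reach the
-- top-level function is now captured by the local g.
hAfter :: Int -> Int
hAfter x = g x + g x
  where
    g y = y * 10
```

Loaded into GHCi, hBefore 1 gives 12 and hAfter 1 gives 20, so the binding structure and the behaviour have both changed. A tool that checks the "no capture of g" condition could refuse this renaming rather than apply it.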
The refactorings in HaRe
Rename
Delete
Lift / Demote
Introduce definition
Remove definition
Unfold
Generalise
Add/remove parameters
Move def between modules
Delete/add to exports
Clean imports
Make imports explicit
Data type to ADT
Short-cut, warm fusion
All module aware

HaRe design rationale
Integrate with existing development tools.
Work with the complete language: Haskell 98
Preserve comments and the formatting style.
Reuse existing libraries and systems.
Extensibility and scriptability.

Information required
Lexical structure of programs, abstract syntax, binding structure, type system and module system.

The Implementation of HaRe
Information gathering
Pre-condition checking
Program transformation
Program rendering
[Diagram label: Strafunski]

Finding free variables 'by hand'

    instance FreeVbls HsExp where
      freeVbls (HsVar v) = [v]
      freeVbls (HsApp f e) = freeVbls f ++ freeVbls e
      freeVbls (HsLambda ps e) = freeVbls e \\ concatMap paramNames ps
      freeVbls (HsCase exp cases) = freeVbls exp ++ concatMap freeVbls cases
      freeVbls (HsTuple _ es) = concatMap freeVbls es
      …

Boilerplate code: 1000 noise : 100 significant.

Strafunski
Strafunski allows a user to write general (read generic), type safe, tree traversing programs, with ad hoc behaviour at particular points.
Top-down / bottom up, type preserving / unifying, full / stop / one

Strafunski in use
Traverse the tree accumulating free variables from components, except in the case of lambda abstraction, local scopes, …
Strafunski allows us to work within Haskell …
Other options? Generic Haskell, Template Haskell, AG, Scrap Your Boilerplate, …

Rename an identifier

    rename :: (Term t) => PName -> HsName -> t -> Maybe t
    rename oldName newName = applyTP worker
      where
        worker = full_tdTP (idTP `adhocTP` idSite)

        idSite :: PName -> Maybe PName
        idSite v@(PN name orig)
          | v == oldName = return (PN newName orig)
        idSite pn = return pn

The coding effort
Transformations: straightforward in Strafunski …
… the chore is implementing conditions that the transformation preserves meaning.
This is where much of our code lies.

Program rendering example

Source:

    -- This is an example
    module Main where

    sumSquares x y = sq x + sq y
      where sq :: Int->Int
            sq x = x ^ pow
            pow = 2 :: Int

    main = sumSquares 10 20

After lifting sq:

    -- This is an example
    module Main where

    sumSquares x y = sq pow x + sq pow y
      where pow = 2 :: Int

    sq :: Int->Int->Int
    sq pow x = x ^ pow

    main = sumSquares 10 20

The same result without the comment and the original layout:

    module Main where
    sumSquares x y = sq pow x + sq pow y where pow = 2 :: Int
    sq :: Int->Int->Int
    sq pow x = x ^ pow
    main = sumSquares 10 20

Token stream and AST
White space + comments only in token stream.
Modification of the AST guides the modification of the token stream.
After a refactoring, the program source is recovered from the token stream not the AST.
Heuristics associate comments with program entities.

Work in progress
'Fold' against definitions … find duplicate code.
All, some or one? Effect on the interface …

    f x = … e … e …

Symbolic evaluation
Data refactorings
Interfaces … 'bad smell' detection.

API and DSL
[Layer diagram, labels in slide order: Combining forms; ???; Refactorings; Refactoring utilities; Strafunski; Library functions; Grammar as data; Strafunski; Haskell]

What have we learned?
Efficiency and robustness of libraries in question.
• type checking large systems,
• linking,
• editor script languages (vim, emacs).
The cost of infrastructure in building practical tools.
Reflections on Haskell itself.
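
The free-variable traversal from the Strafunski slides (accumulate from every component, with ad hoc treatment of lambda abstraction) can be sketched using only the Data.Data support in GHC's base library, in the Scrap Your Boilerplate style named among the other options. This is neither Strafunski nor HaRe code: the Exp type and the freeVars function are hypothetical stand-ins for the Haskell 98 syntax types, chosen to keep the example small.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, gmapQ)
import Data.List (nub)
import Data.Typeable (cast)

-- A toy expression type (hypothetical; far smaller than real Haskell syntax).
data Exp
  = Var String
  | App Exp Exp
  | Lam String Exp
  | Tuple [Exp]
  deriving (Show, Data)

-- Generic traversal: only variables and lambdas need their own cases.
-- Every other constructor, and the list inside Tuple, is handled by the
-- default case, which collects the results from the immediate children.
freeVars :: Data a => a -> [String]
freeVars x = case cast x :: Maybe Exp of
  Just (Var v)   -> [v]
  Just (Lam p e) -> filter (/= p) (freeVars e)
  _              -> nub (concat (gmapQ freeVars x))
```

For instance, freeVars (Lam "x" (App (Var "x") (Var "y"))) yields ["y"]. Adding a new constructor to Exp that binds no variables would need no new case in freeVars, which is the saving over the 'by hand' instance.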
Reflections on Haskell
Cannot hide items in an export list (cf import).
Field names for prelude types?
Scoped class instances not supported.
'Ambiguity' vs. name clash.
'Tab' is a nightmare!
Correspondence principle fails …

Correspondence
Operations on definitions and operations on expressions can be placed in one to one correspondence
(R.D.Tennent, 1980)

Correspondence

    Definitions            Expressions
    where                  let
    f x y = e              \x y -> e
    f x | g1 = e1          f x = if g1 then e1
        | g2 = e2                else if g2 …
        …

Function clauses

    f x
      | g1 = e1
    f x
      | g2 = e2

    f x = if g1 then e1
          else if g2 …

Can 'fall through' a function clause … no direct correspondence in the expression language.
No clauses for anonymous functions … no reason to omit them.

Haskell 98 vs. Erlang: generalities
Haskell 98: a lazy, statically typed, purely functional programming language featuring higher-order functions, polymorphism, type classes and monadic effects.
Erlang: a strict, dynamically typed functional programming language with support for concurrency, communication, distribution and fault tolerance.

Haskell 98 vs. Erlang: example

    -- Factorial in Haskell.
    module Fact(fac) where

    fac :: Int -> Int
    fac 0 = 1
    fac n
      | n>0 = n * fac(n-1)

    %% Factorial in Erlang.
    -module (fact).
    -export ([fac/1]).

    fac(0) -> 1;
    fac(N) when N > 0 -> N * fac(N-1).

Haskell 98 vs. Erlang: pragmatics
Haskell 98:
Type system makes implementation complex.
Layout and comment preservation.
Types also affect the refactorings themselves.
Clearer semantics for refactorings, but more complex infrastructure.
Erlang:
Untyped traversals much simpler.
Use the layout given by emacs.
Use cases which cannot be understood statically.
Dynamic semantics of Erlang makes refactorings harder to pin down.

Challenges of Erlang refactoring
Multiple binding occurrences of variables.
Indirect function call or function spawn: apply (lists, rev, [[a,b,c]])
Multiple arities … multiple functions: rev/1
Concurrency
Refactoring within a design library: OTP.
Side-effects.

Generalisation and side-effects

Original:

    -module (test).
    -export([f/0]).

    repeat(0) -> ok;
    repeat(N) ->
      io:format ("hello\n"),
      repeat(N-1).

    f( ) -> repeat(5).

Generalised over the expression io:format ("hello\n"):

    -module (test).
    -export([f/0]).

    repeat(A, 0) -> ok;
    repeat(A, N) ->
      A,
      repeat(A,N-1).

    f( ) -> repeat (io:format ("hello\n"), 5).

Erlang is strict, so here the argument is evaluated once, before repeat is called, and "hello" is printed once rather than five times.

Generalisation and side-effects

Generalised with the side-effecting expression wrapped in a fun:

    -module (test).
    -export([f/0]).

    repeat(A, 0) -> ok;
    repeat(A, N) ->
      A(),
      repeat(A,N-1).

    f( ) -> repeat (fun( ) -> io:format ("hello\n") end, 5).

The Wrangler
Program source
Scanner/Parser
Parse Tree
Syntax tools
AST annotated with comments
Refactorer
AST + comments + binding structure
Program analysis and transformation by the refactorer
Transformed AST
Pretty printer
Program source

Teaching and learning design
Exciting prospect of using a refactoring tool as an integral part of an elementary programming course.
Learning a language: learn how you could modify the programs that you have written …
… appreciate the design space, and
… the features of the language.

Conclusions
Refactoring + functional programming: good fit.
Real win from available libraries … with work.
Substantial effort in infrastructure.
De facto vs de jure: GHC vs Haskell 98.
Correctness and verification …
Language independence …
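
For contrast with the Erlang generalisation example, here is a minimal Haskell sketch of the same refactoring. In Haskell a monadic action is an ordinary value that runs only when it is sequenced, so the generalised function can take the action itself as its parameter without wrapping it in a function. The names repeatHello and repeatAction are hypothetical, and the guard on n <= 0 is an added assumption to make the functions total.

```haskell
-- Original: the effect is fixed inside the function.
repeatHello :: Int -> IO ()
repeatHello n
  | n <= 0    = return ()
  | otherwise = putStr "hello\n" >> repeatHello (n - 1)

-- Generalised over the action, which is run once per iteration.
-- It is written for any monad so that it can also be checked without IO.
repeatAction :: Monad m => m () -> Int -> m ()
repeatAction act n
  | n <= 0    = return ()
  | otherwise = act >> repeatAction act (n - 1)

-- Counterpart of the Erlang f/0: prints "hello" five times, as repeatHello 5 does.
f :: IO ()
f = repeatAction (putStr "hello\n") 5
```

The difference from the Erlang version comes from the evaluation model rather than from the refactoring itself: passing putStr "hello\n" passes a description of the effect, not its result.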