Short Notes on Haskell and Functional Programming Languages

Bob Dickerson
January 1994 (revised since)

1 Functional programming languages

A functional or applicative language is one in which the computational model is that of expression evaluation. This is not like a procedural language (such as C, Basic, Cobol or Ada) where the computational principle is changing the contents of stored values (called “variables”). Functional languages have no storage, no statements. In functional languages no value can ever be changed, new values can only be built from existing ones without changing them, this is what happens when expressions are evaluated. If a procedural language like Modula-2 or Ada was stripped of all its statements, its variables, its arrays it would be close to a functional language, but a very weak one. Useful functional languages must have much richer expressions and data constructors to allow them to describe non-trivial computations.

So a functional program only consists of declarations, definitions of functions, data types, constants and just one expression to be evaluated calling the defined functions with actual arguments (known as applying functions to arguments), its result is the result of the program (see footnote 1). This might appear very limiting but complicated tasks can be programmed, just as:

    a = 3
    b = 4
    a * a + b * b

is a functional program, so is:

    codegen(typecheck(parse(scan sourcecode)))

a functional program given suitable definitions of the functions codegen etc. and a value for sourcecode.

There are various languages: Hope: see [RMB80], SML: see the definition [RHM86] or how to program in it in Åke Wikström’s book [Wik87], Miranda: see [Tur86], Haskell: see the report [PH90], and for FP see the paper by Backus [Bac78]. With the exception of FP all of them have quite a lot in common. Sorry these references are old, to find newer ones visit http://www.haskell.org.

The example language used in these notes is Haskell.
It was designed by a group of people from Europe and America. It was intended to contain the best features of functional languages, to be a vehicle for language and programming research and to have high quality compilers. It is also publicly and freely available. Haskell owes a lot to Miranda, which was designed by D. Turner.

Footnote 1: Actually functional programs often interact with files and other parts of their environment but how this is fitted into the simple functional model is quite problematic.

2 Basic components

2.1 Primitive, built-in data types

The type system, the primitive data types and the names of their operators have a fairly complicated underlying structure, some of it will be explained later. But at a simple level Haskell provides the following basic types:

Int
    The constant values can be written as: 47, -2, etc. There is the normal complement of operators: +, - (unary and binary), *, and, giving integer results, mod and div. The division operator / belongs to the fractional types such as Float; for Int use div.

Float
    The constant values can be written as: 0.712, -10.1, or 1.34e-4. There is the normal complement of operators: ^ (power of), +, - (unary and binary), *, and /.

Bool
    The boolean type, called Bool, has the values: True and False (note the initial capitals). There are the standard operations: && logical and, || logical or, and not negation.

Char
    Character values. Constants are: 'a', 'A', '*' and '\n'. The C convention for escaping special characters is used, so '\n' means newline. Characters are ordered by their underlying ASCII representation order and can be compared.

There are loads of other numeric types and type classes: Integral, Double, Fractional, and Integer (arbitrary precision Int).

2.2 Expressions

Expressions are a way of describing a new value as the result of combining other values using operators and function application (calling). All values have a type, operators yield results of a given type, so expressions have a type.

    3 * 3 + 2 * 2

is a simple expression of type Int.
And the expression:

    1.0 + sqrt 10.89

has value 4.3 of type Float. It is the addition operator applied to 1.0 and the result of the function application of the function sqrt to the number 10.89.

In most functional languages function calling (or application) is written by juxtaposing the function and its argument, this causes the function to be evaluated with the argument value producing a result. Parentheses (round brackets) are not needed to indicate application, they are only used to control order of evaluation. Also note that application is more binding than any operator. So:

    sqrt (10.89) + 1.0
    (sqrt 10.89) + 1.0
    sqrt 10.89 + 1.0

all mean the same: apply sqrt to 10.89 and then add 1.0. But the following means something different:

    sqrt (10.89 + 1.0)

first evaluate the addition yielding 11.89 and then apply sqrt to the result.

2.2.1 if expressions

An expression can be selected on the basis of the evaluation of a conditional expression, the format is:

    if expression1 then expression2 else expression3

expression1 is evaluated, if it yields True then expression2 is evaluated to give the result, otherwise expression3 is evaluated.

2.2.2 let expressions

A let expression contains local definitions and one expression which is evaluated in the scope of the new definitions. After the let expression the local definitions are forgotten. So:

    100 + let a = 4
              b = a + a
          in a + b

should produce 112. For more about the form of the definitions see the next section.

2.3 Definitions and declarations

Definitions introduce new types or values into a program and declarations just constrain the meaning of a name. Simple value definitions just give names to values (see footnote 2):

    y = x + a
    message = "hello world\n"
    x = 100
    a = x * x
    year = 1991

Footnote 2: These are in fact a simplified case of conformal definitions, see the summary, see a fuller language description.

the name y is given the value 10100 etc.
These definitions are just namings, they are like the CONST declarations of, for example, Modula-2, or Ada. They are not changeable storage variables. They are like equations that the compiler solves, therefore the construct x=x+1, which is meaningful if it is an assignment statement, is meaningless if it is just defining a value for x: it is not soluble. Note that Haskell uses a recursive scope, i.e. the definitions are not just elaborated in sequence but names can be used that are defined later. However some definitions might have no solution:

    x = y + 1
    y = x + 1

and will produce errors.

2.3.1 Simple function definitions

Function values for names can be defined, here the name on the left of the = is followed by one or more formal argument patterns, on the right is the expression to be evaluated when the function is applied:

    double n = n * 2
    nonzero a = a /= 0
    sqr :: Int -> Int
    sqr x = x * x
    sumsq a b = sqr a + sqr b

This defines four functions. Functions are values like any other value and so they have types, the type of a function is given as the type of its domain (its source), an arrow -> and the type of its codomain (its target). So the type of double is: Int -> Int and the type of nonzero is Int -> Bool.

Interlude on expression evaluation

Now a tiny interlude about a way of thinking about the evaluation of functional programs. It is an informal model to help understand how they work. Think of the evaluation of an expression (i.e. a whole functional program) as the repeated replacement of names by their defined values and the replacement of built-in operations by their results until a non-replaceable value is left, this is the result.
So given the above definitions, the expression:

    x

rewrites to:

    100

and the expression:

    year + x

rewrites to:

    1991 + 100

which rewrites, applying built-in operators, to:

    2091

Now rewriting with functions. The only difference is that when a function and the actual arguments it is applied to are replaced by the function's meaning (the right hand side) there is a substitution of the actual arguments for the formals:

    sumsq 4 5

where a is 4 and b is 5, rewrites to:

    sqr 4 + sqr 5

which, taking the replacements one at a time, rewrites to:

    4 * 4 + sqr 5

and evaluating the built-in * first gives:

    16 + sqr 5

and now the last name:

    16 + 5 * 5

then:

    16 + 25

and finally:

    41

That is the end of the interlude.

Function definitions can, in fact, be a lot more complicated than the simple single equation ones shown here. The full forms are described in section 3.

2.3.2 Type constraint declarations

Names can also be “typed”, i.e. constrained to be of a certain type, this information can be given in a type declaration, name :: type:

    flag :: Bool
    count :: Int
    nonzero :: Int -> Bool

Names can only be given one typing, and any subsequent definition of the name must match the typing, so that later:

    flag = 77

will produce a static type error. The functional languages Haskell, SML, Hope and Miranda are all statically strongly typed, which means all functions and operators can only be applied to arguments or operands of the appropriate type, that’s the strong bit, and that this can be checked before the program is evaluated, that’s the static bit.

2.3.3 Type inference

However, programs don’t have to have any type declarations, the compiler will use type inference to work out the correct types of all names and expressions. If there were no declaration of the type of nonzero it will be inferred from the comparison of a with 0 so the source is Int and the result of the not-equals comparison is Bool so this is the type of the target.
If the function is applied in an expression then this information can be used to check for type correctness.

2.4 Composite types: lists

Composite types are types of values with other types as components, i.e. not primitive, they are sometimes called data structures. User defined composite types will be dealt with later. The most useful is a list. A list is a sequence of values, all the same type. It can be written as:

    [1,2,3,5,7,11,13,17]
    [True, False, True, True]
    ['h', 'e', 'l', 'l', 'o']
    []

the first is a list of numbers, its type is [Int], the element type enclosed in square brackets. The empty list “[]” is also a list. Lists of numbers are not the same type as lists of characters, so given that ++ joins (or concatenates) lists:

    [1,2,3] ++ ['a','b','b','a']

is a static type error, but:

    [1,2,3] ++ [4,5]

is all right and will produce the Int list:

    [1,2,3,4,5]

notice that lists of any length are compatible types so long as the elements are the same.

The constructor operator for lists is “:”, it takes a value of any type and a list with elements of the same type and constructs a new list like its second argument but with the new value at the front. This operation is known as “prefix” or “cons” (after the original Lisp function that constructed lists). The expression:

    False : [False, True, False]

will give the result:

    [False, False, True, False]

The prefix operator, “:”, can only create a new list with an additional element at the front. To understand why it only does this it might help to understand how lists are usually implemented. This is very naughty, looking below a high level language shouldn’t be done but sometimes it helps to relate it to other constructs like, for example, chained records and pointers in Modula-2. So now a diagram: The list is represented as a sequence of cells pointing to an element and the rest of the list.
Prefixing just creates one cell making the first field point to the new head value and the second field point to the second argument, the list. The pointer to the new cell is returned as the result which is the new list. From this it can be seen that the square bracket notation is a shorthand for repeated use of “:”; any list can be built by repeatedly using “:” to join elements onto an empty list: []. The list:

    [10,20,30]

is equivalent to:

    10 : (20 : (30 : []))

All list values can be built from the type’s two constructors: the constant [] and the binary constructor “:”.

[Figure: the list [2, 3, 4] drawn as a chain of cells holding 2, 3 and 4 and ending in [], and the list 1 : [2, 3, 4] drawn as the same chain with a new cell holding 1 at the front.]

Accessing elements of a list

Lists are NOT arrays: they are not stored as a linear sequence where elements can be easily accessed with an index, instead the only easy access is to the front of a list. The function head gets the first element from a list, the function tail gives the whole list without the first element. For example:

    head [1,2,3]    gives: 1
    tail [1,2,3]    gives: [2,3]

Notice that head and tail “take apart” a list made by the operator “:”, head yields the first argument to : and tail gives the second argument:

    head ( 10 : [20, 30] )    gives: 10
    tail ( 10 : [20, 30] )    gives: [20, 30]

Strings or Char lists

When a list of characters is printed by the Haskell compiler they are joined up and come out as a string, so:

    ['h', 'e', 'l', 'l', 'o']

will be printed as:

    hello

Also the notation "..." can be used in expressions. There is no separate “string” type in Haskell, the quoted string of characters is an abbreviation for a list of characters. The type is [Char] (the name String is just a synonym for it). So a list of strings is in fact a list of lists of characters:

    ["apple", "orange", "pear"]

has type [[Char]].

2.5 Evaluating Haskell programs

The interpreter to use is called hugs.
All definitions and declarations must be in a file (sometimes called a script), there must be no expressions. When the interpreter is run it must be provided with the file name as a command line argument, the script is compiled and any error messages are displayed, then the interpreter prompts for input. The only Haskell that can be entered interactively is an expression to be evaluated in the context of the definitions provided in the file. No definitions can be entered interactively. So: the script is for definitions, the interactive prompt is for expressions. As well as expressions one can interactively ask for the type of expressions, whether inferred or declared, these are expressions preceded by the word “type”. Apart from expressions and typings, system commands or directives can be entered, these are a “:” followed by a name:

    :?      for the list of directives
    :info   describes what some named value is:
                ? :info sqrt
                sqrt :: Float -> Float -- primitive
    :quit   leave hugs, typing ^d also works

So given the following very simple script file called first.hs containing:

    foo = 77 + bar
    sumsq a b = sqr a + sqr b
    sqr n = n * n
    snarf = 4
    bar = 23
    name = "Jo Soap"

(Note: in a Haskell script everything after -- is ignored, it’s a comment). The following is a sample interactive session with the Haskell interpreter and compiler (the banner logo is omitted):

    freckles(780)$ hugs first.hs
    Hugs 98: Based on the Haskell 98 standard
    Copyright (c) 1994-2005
    World Wide Web: http://haskell.org/hugs
    Report bugs to: [email protected]
    Version: 20050308

    Haskell 98 mode: Restart with command line option -98 to enable extensions

    Type :? for help
    Main> 4 + 5
    9
    Main> 1 + foo
    101
    Main> sumsq 3 snarf
    25
    Main> sumsq 3 "hello"
    ERROR - Cannot infer instance
    *** Instance   : Num [Char]
    *** Expression : sumsq 3 "hello"

3 More about functions

3.1 Recursion interlude

This is not about functional programming languages, it is about functional programming. It is included because not knowing how to write functions will make it harder to understand the language constructs. Skip this if you know how to write recursive functions.

The only way to define a function is by clauses of patterns and corresponding answers. So how can there be a non-trivial function? How can any problem requiring any repeated operation be solved? Even a problem as simple as computing the factorial of a number “n” would seem, in a procedural language, to require some repetition: i.e. 1 * 2 * 3 ... * n. The answer is to think of the solution to the problem in terms of clauses. All computably soluble problems have solutions that can be expressed in cases or patterns of actual arguments. Use the following approach:

    Find simple immediate answers for the simple cases, and for all other cases express the answer as some operation on the result of an application of the same function on a simpler argument (i.e. one tending towards the simple cases).

Using the function to define itself is recursion, if you are not confident about it still do it – it will work – do it as an act of faith, just believe!

To take that trivial problem of the factorial, observe that the factorial of 0 is defined to be 1, that’s the simple case, and for all other values (see footnote 3) (other case) the factorial of any number “n” is “n” multiplied by the factorial of the previous number.
In Haskell:

    factorial n = if n == 0 then 1 else n * factorial (n-1)

notice that the second case where n is not equal to 0 is an operation on the result of an application of the function being defined but with an argument tending towards the simple case, using factorial to define factorial is the act of faith. The previous interlude about thinking of functional language evaluation in terms of rewriting names by their definitions can be used to assist understanding recursion:

    factorial 3
        n==3, if evaluates the else part and rewrites to:
    3 * factorial (3-1)
        just replacing factorial 2 rewrite to:
    3 * 2 * factorial (2-1)
        rewriting factorial 1 and replacing by else branch:
    3 * 2 * 1 * factorial (1-1)
        factorial 0, n==0 so rewrite to 1:
    3 * 2 * 1 * 1
        which by arithmetic gives:
    6
        done!

Recursion works over lists, following the magic recipe: first identify the simple answers then the cases for which the result is an operation on a recursive application with a simpler argument. But this time the simple case is likely to be [] not 0 and the simpler argument will be the tail of the list not a smaller number. For example to find the length of a list, observe that the empty list is 0 elements long and all other lists are one longer than lists one element shorter, so:

    len x = if x==[] then 0 else 1 + len (tail x)

Now rewrite len [4,5,6], which is equivalent to len (4:(5:(6:[]))):

    len [4,5,6]
        x /= [] so use else expression giving:
    1 + len [5,6]
        again x /= [] so rewrite len [5,6] to 1 + len [6]:
    1 + 1 + len [6]
        again x /= [] so rewrite len [6] to 1 + len []:
    1 + 1 + 1 + len []
        this time x == [] so rewrite to 0, giving:
    1 + 1 + 1 + 0
        which by arithmetic gives:
    3

3.2 More example functions

3.2.1 The function nth

The function nth returns the nth element from a list, the list is the first argument and the index of the element to return is the second argument. It counts from 1.
    nth lst n = if n==1 then head lst else nth (tail lst) (n-1)

The function works by identifying the simple, immediate answer case: if the position is 1 then the answer is the head (first item) of the list. In all other cases (assuming a positive index) the answer will be the (n-1)th element from the list that is one element shorter. So

    nth [2,4,6,8] 1 == (head [2,4,6,8]) == 2

and

    nth [2,4,6,8] 3 == nth [4,6,8] 2 == nth [6,8] 1 == 6

Footnote 3: Please assume non-negative arguments only for now.

3.2.2 The function dropfront

This function returns a list, given as the second argument, with the first n items removed, the count n is the first argument:

    dropfront n lst = if n==0 then lst else dropfront (n-1) (tail lst)

The pattern of recursion is similar in many ways to the previous problem: recurse to the right position and then return the result, however in this case the answer is the remains of the list rather than the head of the remains. So, the simple case with an immediate answer:

    dropfront 0 [2,4,6,8] == [2,4,6,8]

don’t drop anything, and

    dropfront 2 [2,4,6,8] == dropfront 1 [4,6,8] == dropfront 0 [6,8] == [6,8]

3.2.3 The function occurs

This function counts the occurrences of an item in a list. The first argument is the list and the second argument is the element to search for and count. What is the simple case? If the list is empty then there cannot be any occurrences no matter what value you are counting. The recursive cases? If the list is not empty then there are two alternatives: either the list starts with (its head is) the element we are counting in which case the answer is one more than the number of occurrences of the item in the rest of the list, or if the head is not what we are looking for then the answer is just the number of occurrences of that item in the rest (tail) of the list.
    occurs list e = if list==[] then 0
                    else if (head list)==e then 1 + occurs (tail list) e
                    else occurs (tail list) e

For example:

    occurs [] 77 == 0

with a non-empty list that starts with the target item e:

    occurs [6,4,6,8] 6 == 1 + occurs [4,6,8] 6 == 2

and when the first item is not what is being counted:

    occurs [1,2,3,2,3,2] 2 == occurs [2,3,2,3,2] 2 == 3

3.2.4 The function isin

This is a very famous function, it is a common example in applicative (functional) programming tutorials, it is more often called member but has been called isin here to avoid any clash with any pre-defined function in Haskell. The function takes two arguments, a list and a value, and gives True if the value occurs at least once in the list, and gives False otherwise.

Here is a “case analysis”. The simple case: if the list is empty the value e cannot be there, it gives False. If the list is not empty then there are two cases: firstly the list starts with the value, in which case give True, or it doesn’t start with the value so check to see if the value is in the remains of the list (its tail).

    isin s e = if s==[] then False
               else if head s == e then True
               else isin (tail s) e

or, given the prior definition of the function occurs:

    isin s e = occurs s e > 0

i.e. True if the count of element e in s is greater than zero.

3.2.5 The function takefirst

One further example of a function on lists, this time to illustrate that values are not changed and if necessary new values (lists) must be constructed. Remember in functional programming there are no stored variables to alter, nothing changes, all results must be built out of other values. The function takefirst n s will take the first n elements of the list s and make a list of them, so:

    takefirst 4 [10,9,8,7,6,5,4]

will give: [10,9,8,7].
The first direct answer case is: if the count of elements is 0 then the answer is the empty list, otherwise if the count, n, is greater than 0, then the result is given by sticking the first item back onto the result of finding the first n-1 items from the tail of the list. Now the function:

    takefirst n s = if n==0 then [] else (head s) : takefirst (n-1) (tail s)

So:

       takefirst 3 [10,20,30,40,50]
    == 10 : (takefirst 2 [20,30,40,50])
    == 10 : (20 : (takefirst 1 [30,40,50]))
    == 10 : (20 : (30 : (takefirst 0 [40,50])))
    == 10 : (20 : (30 : [] ))
    == [10,20,30]

3.3 Patterns

Function definitions can be written as a sequence of clauses (or equations) each of which provides the answer for one pattern of arguments:

    fname patterns1 = rhs1
    fname patterns2 = rhs2
    ···
    fname patternsn = rhsn

Patterns consist of constants, variables and data constructors (e.g. : or (,..)). A constant only matches itself, a variable matches anything (and gets bound to the value it matched) and a constructor only matches values built from that constructor if the component patterns match. This is an alternative way of writing functions, any function can still be written using the if..then..else.. expression to test for different values of parameters.

When a function is applied to an argument the patterns of each clause of the definition are matched, in turn, with the argument. When a match occurs the expression on the right hand side (rhs) is evaluated giving the result of the function. For example:

    iszero 0 = True
    iszero n = False

if iszero is applied to an argument as in:

    iszero 37

it fails to match the first clause because 0 does not equal 37 but the variable “n” matches anything so the second clause is used and False is the result.
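The three kinds of pattern described above can be tried out in one small script. This is only a sketch: the function names describe and startsWithZero are made up for illustration, and the main action is added so that the file can also be run with a compiler instead of being loaded into hugs.

```haskell
module Main where

-- Constant patterns: 0 and 1 match only themselves. The variable
-- pattern n matches any other argument and is bound to it.
describe :: Int -> String
describe 0 = "zero"
describe 1 = "one"
describe n = "many: " ++ show n

-- A constructor pattern with a constant component: (0:t) matches only
-- lists built with ":" whose first element is 0. The variable pattern
-- in the second clause catches every other list, including [].
startsWithZero :: [Int] -> Bool
startsWithZero (0:t) = True
startsWithZero other = False

main :: IO ()
main = do
  putStrLn (describe 0)              -- first clause matches
  putStrLn (describe 37)             -- falls through to the variable clause
  print (startsWithZero [0, 5, 6])   -- constructor and constant both match
  print (startsWithZero [4, 0])      -- head is 4, so the second clause is used
  print (startsWithZero [])          -- [] was not built with ":"
```

The clauses are tried from top to bottom, so the order matters: if the clause describe n came first it would match every argument and the other two clauses would never be used.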
The simple function to exchange the components of a pair tuple:

    exchange (a,b) = (b,a)

The pattern (a,b) only matches pairs, and a and b will match any values, they are bound to the two components of the actual value and then the right hand side reconstructs a pair with the elements reversed.

The next example is a function called front that will return the first element of a list (see footnote 4). Remember that the empty list, [], is a list so it must be matched but it has no first element:

    front [] = error "empty list to front"
    front (h:t) = h

the pattern [] will only match an empty list and the corresponding right hand side has no result, instead it aborts the program, error is not just an error message. If the list is non-empty it must have been made with “:” and the pattern (h:t) will match it binding h to the first element and t to the rest of the list:

    front [1,2,3,4]

is equivalent to:

    front (1 : (2 : (3 : (4 : []))))

and the pattern matches setting:

    h = 1
    t = (2 : (3 : (4 : [])))

Footnote 4: The function front isn’t really necessary as there is a pre-defined Haskell function called head which does the same.

3.4 Guarded expressions

For most functions simple clauses and patterns are not enough to select all the different cases for which different answers are required. For example a function to return the larger of two numbers cannot be done with patterns alone, selection of the result depends on a numeric comparison of the arguments and patterns are only constants, constructors or variables. Haskell provides alternative answers for each pattern’s right hand side depending on the value of a boolean expression called a guard.

    fname pats1 | guard11 = exp11
                | guard12 = exp12
                ...
                | guard1m = exp1m
    ···
    fname patsn | guardn1 = expn1
                | guardn2 = expn2
                ...
                | guardnm = expnm

The last guard for one pattern can be the word “otherwise”.
For example to return the larger of two numbers:

    larger a b | a>=b = a
               | a<b = b

note that a and b match any arguments, the two cases are selected by evaluating the guards in order until one gives True and then evaluating the following expression as the result. Relying on the order and using “otherwise”:

    larger a b | a >= b = a
               | otherwise = b

though perhaps it is safer to use mutually exclusive guard conditions.

Multiple pattern clauses and guards can be combined. The following function isin tests to see whether a list contains a particular value:

    isin [] e = False                -- nothing in empty list
    isin (h:t) e | h == e = True     -- is head the goal?
                 | h /= e = isin t e

Each clause is a separate equation and the bindings of the variables in the patterns only hold throughout the corresponding right hand side. The first rule has no bindings for h and t, they are in a different equation, but the second equation has two parts and h and t are bound through both expressions and both guards.

3.5 Where definitions and scope

The right hand side of a clause (just one clause and all its guarded expressions) can have local definitions. These definitions are within the scope of the pattern match for the clause so they can use the argument names and the values they define can be used throughout the guarded expressions. They can be used to avoid writing repeated expressions and to make the code clearer if it is long. The following function uses an auxiliary definition in its third equation:

    largest [] = error "no largest in empty list"
    largest (h:[]) = h
    largest (h:t) | h>l = h
                  | h<=l = l
                  where l = largest t

If the where is not used the recursive call, largest t, to find the largest item in the tail of the list will be needed by the guard to compare it with the head item and again, maybe, as the result.
The function would be as follows:

    largest [] = error "no largest in empty list"
    largest (h:[]) = h
    largest (h:t) | h > largest t = h
                  | otherwise = largest t

Note that the otherwise is to avoid using largest t three times!

4 Polymorphic functions

In section 2.3.3 the idea of type inference for expressions and functions was introduced. The typing system of functional languages is more powerful and general than has so far been described. Consider the type of the function sumlist that takes a list of numbers and adds them up. Using the magic recipe for recursive functions this must produce an answer for the empty list, the value is zero, and then for a list made by “:”, which will be to sum the tail of the list recursively and then add the value of the head element:

    sumlist [] = 0
    sumlist (h:t) = h + sumlist t

the type of this function is: [Int]->Int, because the argument is a list the elements of which are added and the sum is the result. This function can only be applied to lists of numbers but what sort of lists can len be applied to? It does not perform any operation on the elements of the lists, it just takes the list apart and counts its way along. In fact len can be applied to any list, it is said to be polymorphic (see footnote 5) which means the same operation applied to different types of value. The way Haskell types a polymorphic function is to use a type variable where the “any” type would be, so len is of type:

    len :: [a] -> Int

Type variables are “a”, “b”, “c” etc. and they can only be used in type descriptions. A letter type variable in a type means the function has a range of types, a polytype, those which can be obtained by substituting a type for the variable uniformly throughout the signature. So len has types:

    [Int] -> Int
    [Char] -> Int
    [Bool] -> Int
    [[Char]] -> Int
    [[Int]] -> Int
    ...

and infinitely many more.
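The range of types listed above can be checked with a short script. It is a sketch under two assumptions: len is written here with patterns rather than with the earlier test x==[], because comparing lists with == would additionally require the elements to support equality, and the main action is added only so that the file can be run directly.

```haskell
module Main where

-- len never looks at the elements, so the element type is a type variable.
len :: [a] -> Int
len []    = 0
len (h:t) = 1 + len t

-- sumlist adds the elements, so in these notes it is fixed to lists of Int.
sumlist :: [Int] -> Int
sumlist []    = 0
sumlist (h:t) = h + sumlist t

main :: IO ()
main = do
  print (len [1, 2, 3 :: Int])          -- len used at type [Int] -> Int
  print (len "hello")                   -- len used at type [Char] -> Int
  print (len [[True], [False, True]])   -- len used at type [[Bool]] -> Int
  print (sumlist [1, 2, 3])
  -- print (sumlist "hello")            -- rejected statically: Char is not Int
```

Run as a program this prints 3, 5, 2 and 6. Removing the comment marker from the last line makes the whole script fail to compile, which is the static checking at work.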
This form of typing is not a weakening of the strong typing system, full type checking is still possible; all that has happened is to allow the safe general use of a general function which is something denied the programmer in languages like Modula-2 where the function would be rewritten for each type of argument. To show that checking is strong consider the function last which returns the last element of any list:

    last [] = error "last: got empty list"
    last (x:[]) = x
    last (h:t) = last t

Notice there is no operation applied to the elements of the list just the matching and taking apart of the list. The type of last is:

    [a] -> a

when applied to a list of numbers its type is:

    [Int] -> Int

The substitution of the type for the type variable must be uniform. This allows complete checking of the use of last in expressions, as in:

    3 * last (last l)

What is the type of “l”? Type inference will determine that the result of the outer last must be Int because of the multiplication therefore the argument must be a list of Ints: [Int], but this is the result of another application of last so by substitution its argument, “l”, must be a list of lists of Ints: [[Int]]. Strong checking is preserved without the need for explicit type declarations and more simplicity and generality have been given to the programmer.

Footnote 5: This type of polymorphism is parametric universal polymorphism, there are other forms.

5 Higher order functions

In applicative languages functions are treated as first class values, in other words they are data values like tuples, lists, or numbers; they can be treated in the same way as any other value. They can be:

• the value of an expression,
• the elements of a list (or other structure),
• passed as parameters,
• and they can be named.

This is also nearly true in other languages but some still tend to treat a function as an assembly language subroutine not as a manipulatable value.
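The four points in the list above can each be shown in a line or two. The following is a minimal sketch; every name in it (bump, steps, applyTwice, choose) is invented for illustration, and the main action is there only so the file can be run directly.

```haskell
module Main where

inc :: Int -> Int
inc x = x + 1

double :: Int -> Int
double n = n * 2

-- Named: bump is just another name for the function value inc.
bump :: Int -> Int
bump = inc

-- Elements of a list: a list whose elements are functions.
steps :: [Int -> Int]
steps = [inc, double]

-- Passed as a parameter: applyTwice receives a function f and applies it twice.
applyTwice :: (Int -> Int) -> Int -> Int
applyTwice f x = f (f x)

-- The value of an expression: the if expression evaluates to a function.
choose :: Bool -> (Int -> Int)
choose flag = if flag then inc else double

main :: IO ()
main = do
  print (bump 1)                -- 2
  print ((head steps) 10)       -- 11
  print (applyTwice double 5)   -- 20
  print ((choose False) 7)      -- 14
```

In each case the function is handled exactly like a number would be: it is given a name, stored, passed along or returned, and only applied when it is followed by an argument.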
Programming by combining functions to produce other functions is known as using higher order functions and the functions that build new functions from existing ones are called combinators, they are no different from other functions, it’s just the way they are used that is different.

5.1 Functions as values

A function by itself, not applied to an argument, is a value. The value is the function. If you type the name of a function as an expression to the interpreter you get the function:

    ? fact
    fact
    (1 reduction, 7 cells)

Functions can therefore be returned as the results of other functions. The following function select is rather pointless, if given the argument “1” it returns the function front, if given “2” it gives last:

    select 1 = front
    select 2 = last
    select n = error "only 1 or 2"

This can be applied to “1” or “2”, it will give a function. What can be done with a function? Apply it. How is it applied? By following it by an argument:

    ? select 1
    head
    (3 reductions, 8 cells)
    ? (select 1) [10,20,30]
    10
    (3 reductions, 13 cells)
    ? (select 2) [10,20,30]
    30
    (6 reductions, 13 cells)

So the expression (select 1) returns front and following this by [10,20,30] causes front to be applied to it yielding 10. Notice that this idea is even more simply shown by using the idea of expression evaluation by rewriting:

    (select 1) [10,20,30]
        rewrite select 1 first, this matches the first clause and rewrites to:
    front [10,20,30]
        this is now a simple function application and can be rewritten using the equations for front giving:
    10

Finally notice that function application is left associative (see footnote 6) so the expression can be rewritten as:

    ? select 2 [10,20,30]
    30
    ?

Footnote 6: Associativity concerns what happens if there is a sequence of similar operations without parentheses, which operation is done first? A left associative operator is one which groups to the left, e.g. 4-2-1 means (4-2)-1.

Since a function is a value it can be named in script definitions:

    first = front
    end = select 2

the first definition just gives another name to front, it can be thought of as (a) evaluating front, getting a function value, and then naming that as first, or (b) it is just a rewriting rule that replaces first by front, both views amount to the same thing. The second definition can be thought of the same way, by either view end will evaluate to last.

Functions as elements of structures

Since a function is a value it can be an element of a data structure such as a list. Given the definitions of some functions:

    sqr :: Int->Int
    double n = n * 2
    inc x = x + 1
    sqr n = n * n

(notice that the types of double, inc and sqr are the same, i.e. Int->Int) then the definition:

    funlist = [sqr, double, inc]

names a list of type [Int->Int]. It can be manipulated like any other, i.e. elements can be accessed and used, and since they are function values they can be applied:

    (first funlist) 10

should yield the value 100.

5.2 Functions as arguments

Obviously if a function is a value it can be passed as an argument to another function. Passing function arguments will be discussed in order to show how this gives a different way of programming.

The function map

The first example is perhaps the best known of all higher order functions, it is called map (or sometimes mapcar). It takes a single argument function and applies it to all the elements of a list building a list of the results as its result, so given that inc n=n+1 then:

    map inc [10,20,30]

gives:

    [11,21,31]

The function is simple:

    map f [] = []
    map f (h:t) = (f h) : map f t

the function argument, i.e. inc, matches the pattern “f”, in the second rule this is applied to the head element of the list and the result is prefixed to the result of the natural recursion. One motivation for this is that given the existence of functions like map one can write other functions more easily.
For example to write a function to get the square roots of all the numbers in a list just write:

    roots l = map sqrt l

or even more simply, remembering rewriting:

    roots = map sqrt

instead of:

    roots [] = []
    roots (h:t) = (sqrt h) : roots t

This is less useful than some other higher order functions but it was dealt with first because it is one of the easier ones.

The function filter

Another, called filter, is more useful. filter is a function that takes 2 arguments: “p”, a function of one argument that returns a boolean, and “l”, a list (the elements must be the same type as the argument to “p”). It returns a list containing only those elements of the input list that, when given as argument to “p”, gave the value True. That was a tortuous description; an example will make it clearer:

    filter even [1,2,3,4,5,6]

will give: [2, 4, 6], if even is a function giving True for even numbers only. The function:

    filter p [] = []
    filter p (h:t) | p h = h : filter p t
                   | otherwise = filter p t

The function foldr

This function is a bit more complicated. It takes 3 arguments: a two-argument function “f”, an initial value “b”, and a list “l”. It applies the function “f” to the first element of the list “l” and the result of applying the function to the second element of the list and the result of applying the function to the third element of the list and ... to the result of applying the function to the last element of the list and the base value “b”. To make this clearer consider an example of its use, where plus is a function that adds its two arguments:

    foldr plus 0 [1,2,3,4]

which adds up all elements of the list and gives 10. What has actually happened is that it has worked as follows:

    (plus 1 (plus 2 (plus 3 (plus 4 0))))

It has “folded” its argument function into the list. The definition of the function is a lot simpler than the description of how it works!

    foldr f b [] = b
    foldr f b (h:t) = f h (foldr f b t)

(Note: foldr is pre-defined. You cannot re-define it, so if you want to experiment and define your own version use a different name.) To produce a function to sum all the elements of a list just name the above example, as in:

    sumlist l = foldr (+) 0 l

don’t write a new recursive function. In the above definition a binary operator has been enclosed in parentheses and given as an argument; this is called a section in Miranda.

sections

A section is the use of an operator as a prefix function. Enclosing the operator in parentheses causes it to be considered as a prefix function; it doesn’t change the meaning, just the way it is written: (*) 3 5 is the same as 3 * 5 and gives 15. Sections also permit the operator to be given one of its arguments in the section: inc n = (+1) n. Here (+1) is the addition operator taking one of its arguments in the section, so that when it is used as a prefix function it only needs one argument: inc 10 will rewrite to (+1) 10, which is the same as 10 + 1 because the section just allows prefixing instead of infixing. Similarly prefix functions can be infixed:

infix functions

In Haskell the application of any two-argument function can be written in infix form by enclosing the function name in backquotes: “`fub`”. So given the two-argument function map it could be applied in either of the following ways:

    map inc [10, 100, 1000]
    inc `map` [10, 100, 1000]

When functions are used as infix, and no other precedence has been declared for them, they bind as tightly as the tightest infix operators but less tightly than prefix application.

Back to foldr. The function to find the product of all the elements in a list is:

    prodlist a = foldr (*) 1 a

When applied it rewrites as follows:

    prodlist [1, 2, 3, 4]
    ((*) 1 ((*) 2 ((*) 3 ((*) 4 1))))

A function to determine if any value in a boolean list is True would be:

    orlist l = foldr (||) False l

it “folds” the disjunction operator, “||”, between all the elements of the list.
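The folds described above can be tried out with a privately named copy of foldr. This is a minimal sketch: myFoldr and incAll are hypothetical names, chosen so that nothing clashes with the pre-defined foldr.

```haskell
-- myFoldr is a hypothetical name; it mirrors the definition of foldr in the text.
myFoldr :: (a -> b -> b) -> b -> [a] -> b
myFoldr f b []    = b
myFoldr f b (h:t) = f h (myFoldr f b t)

-- Operators in parentheses are passed as ordinary two-argument functions.
sumlist, prodlist :: [Int] -> Int
sumlist l  = myFoldr (+) 0 l
prodlist l = myFoldr (*) 1 l

orlist :: [Bool] -> Bool
orlist l = myFoldr (||) False l

-- A section with one argument supplied, used with map written in infix form.
incAll :: [Int] -> [Int]
incAll l = (+1) `map` l
```

With these definitions sumlist [1,2,3,4] should give 10 and prodlist [1,2,3,4] should give 24; folding over the empty list simply returns the base value.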
Using orlist, the member or isin function could be defined. This is the function that gives True if a value is in a list and False otherwise:

    isin l e = orlist (map (==e) l)

so that the application

    isin ["apple", "plum", "pear"] "plum"

rewrites to:

    orlist (map (=="plum") ["apple", "plum", "pear"])

which rewrites to:

    orlist ([False,True,False])

which rewrites to:

    foldr (||) False ([False,True,False])

which rewrites to:

    ((||) False ((||) True ((||) False False)))

which eventually gives True, thank goodness.

5.3 Partial application and nameless functions

Using just the rewriting view of expression evaluation the following definition is very simple to understand:

    isvowel = isin ['a','e','i','o','u']

it is just a name given to the isin (member) function and one of its arguments. When given the following application:

    isvowel 'x'

the name isvowel is replaced by its definition so the expression will simply rewrite to:

    isin ['a','e','i','o','u'] 'x'

However at a deeper level how can the Haskell compiler determine the type of isvowel or even isin [...] for type checking? The system can be thought of as if it does simple rewriting but in fact it does more. So what is the meaning of a function only applied to some of its arguments?

In Haskell there is a construct called a lambda expression or nameless function. It is the value that function names name. A (simplified) lambda is:

    \ pat -> exp

this is a function, it can be applied, so:

    (\ x -> x * 2) 15

is the application of a lambda to 15. The way the application works is to match the actual argument to the formal argument pattern in the lambda and rewrite the whole application to be just the body (i.e. the expression after the ->) of the lambda with all occurrences of the formal name replaced by the actual value. So the above application rewrites to x * 2 where x=15, which is 15 * 2. Function definitions of the form:

    fname pat = rhs-exp

can be thought of as abbreviations for the simple naming of a lambda.
So the definition of the function double:

    double x = x * 2

is just an abbreviation for:

    double = (\ x -> x * 2)

So all named functions can be thought of as if they are the names of lambda expressions.

5.3.1 Curried functions

All this is a bit pointless and might be of theoretical interest only, except that it explains how Haskell and other functional languages give a meaning to functions only applied to some of their arguments. Functions of more than one argument are actually regarded as functions of one argument that return as their result a nameless function (lambda) of one argument that, when applied to the second argument, gives the intended result. So:

    sumsq a b = a*a + b*b

is treated as if it is the name of a lambda:

    sumsq = (\ a -> (\ b -> a*a + b*b))

the outer lambda takes an argument and then returns its body as result:

    (\ a -> (\ b -> a*a + b*b)) 4 5

rewrites to:

    (\ b -> 4*4 + b*b) 5

where the lambda is a function of one argument that, when applied to another number, will square it and add in the square of 4. The meaning of the expression f x y is really: apply “f” to “x” and then apply the result, which is a function, to “y”. So f x y means (f x) y. This explains why the type of multi-argument functions is given as it is. The function sumsq has type:

    sumsq :: Int -> Int -> Int

or equivalently

    sumsq :: Int -> (Int -> Int)

and the following expressions have the types:

    sumsq      :: Int -> Int -> Int
    sumsq x    :: Int -> Int
    sumsq x y  :: Int

which means sumsq is a function of one number that returns as result a function of one number that gives a number result. This idea of treating multi-argument functions as functions returning single-argument functions is called currying, not because it is culinary but because it is based on the work of the mathematician Haskell Curry (whose first name was also used for the name of this language). It is why partial application has a defined meaning, and that meaning is a lambda.
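The equivalence between a multi-argument definition, nested lambdas and partial application can be checked directly. In this sketch sumsq' and addSixteen are hypothetical names, and isin is given a concrete type (and uses the pre-defined or in place of orlist) only to keep the example self-contained.

```haskell
-- The ordinary two-argument definition.
sumsq :: Int -> Int -> Int
sumsq a b = a*a + b*b

-- The same function written as nested one-argument lambdas (hypothetical name).
sumsq' :: Int -> (Int -> Int)
sumsq' = \a -> (\b -> a*a + b*b)

-- Partial application: supplying one argument leaves a one-argument function
-- that behaves like \b -> 4*4 + b*b.
addSixteen :: Int -> Int
addSixteen = sumsq 4

-- isin applied to only its first argument gives a function on characters.
isin :: [Char] -> Char -> Bool
isin l e = or (map (== e) l)

isvowel :: Char -> Bool
isvowel = isin ['a','e','i','o','u']
```

Here sumsq 4 5, (sumsq 4) 5, sumsq' 4 5 and addSixteen 5 should all give 41.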
6 Lazy evaluation

No matter whether a functional language is implemented using a rewriting interpreter or a native code compiler there is a choice about the evaluation strategy. Given:

    f (g x)

should the application of “f” to the sub-expression in the parentheses be rewritten (or evaluated) first or should “g x” be rewritten first? The first choice, reducing the left-most outer application, is called normal order or lazy evaluation. The second choice, reducing the left-most inner expression, is called eager evaluation. Eager evaluation is what is done in most programming languages: evaluate the actual argument expressions before evaluating the function. In nearly all cases the results will be the same for functional programs with either strategy. But if the value of an expression is non-terminating or can produce an error then the lazy strategy might produce a result where eager will not. Consider the following function:

    second a b = b

it takes two arguments and discards the first. If the following expression is lazily evaluated:

    second (100/0) 42

then the result will be 42. But if it is eagerly evaluated the outcome is a runtime error of division by zero. The lazy evaluation rewrites using the function definition before reducing the actual argument expression; in this case the rewrite matches (100/0) to “a” and 42 to “b” and rewrites to 42. Notice lazy evaluation will not evaluate an argument expression until it is used in the function. No expressions are evaluated until their values are needed. This applies to whole programs as well: one can think of the result of a lazily evaluated program just sitting there like a fat parcel waiting to be kicked to start producing results. The “kick” in a Haskell interpreter is the need to display the result; this forces evaluation.
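The behaviour of second can be checked with a small sketch. One assumption differs from the text: integer division (`div`) is used instead of “/”, because in current Haskell implementations such as GHC a floating point 100/0 evaluates to Infinity rather than raising an error. The names lazyResult, loop and alsoFine are hypothetical.

```haskell
-- Takes two arguments and discards the first.
second :: a -> b -> b
second a b = b

-- Integer division by zero raises a runtime error, but only if it is evaluated.
-- Here it never is, because second ignores its first argument.
lazyResult :: Int
lazyResult = second (100 `div` 0) 42

-- A non-terminating expression is equally harmless when it is discarded.
loop :: Int
loop = loop

alsoFine :: Int
alsoFine = second loop 42
```

Both lazyResult and alsoFine should evaluate to 42; asking the interpreter for 100 `div` 0 or loop on its own would fail or never finish.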
In a lazy evaluating language data constructors are lazy as well, so given the expression 3 : (f x), nothing happens unless a surrounding expression actually wants to access a component of this “potential” list. Given the definition:

    squaresfrom n = (n * n) : squaresfrom (n + 1)

then the application squaresfrom 1, if evaluated at all, will yield:

    1 : squaresfrom 2

another expression needing to be “kicked” again. If the above expression was given to the Haskell interpreter the built-in “result-printing” would attempt to display the result, “kicking” the expression until the end of the list is found:

    [1, 4, 9, 16, 25, 36, 49, 64, 81, 100 ...

on and on for ever because there is no end, it is infinite. But if the expression squaresfrom 1 is used in another expression that only requires one element, say the third, then the function is only kicked enough times to produce as much of the list as needed. The lazy evaluating list constructor “:” can be used to build all sorts of structures, for example here is the infinite list of all the Fibonacci numbers:

    fibs = fibx 1 1
           where fibx p n = n : fibx n (p + n)

And here is the output from taking the first 12 followed by the output from carelessly asking for all of fibs:

    Main> take 12 fibs
    [1,2,3,5,8,13,21,34,55,89,144,233]
    Main> fibs
    [1,2,3,5,8,13,21,34,55,89,144,233,377,610,987,1597,2584,4181,6765,
    10946,17711,28657,46368,75025,121393,196418,317811,514229,832040,
    1346269,2178309,3524578,5702887,9227465,14930352,24157817,39088169,
    63245986,102334155,165580141,267914296,433494437,701408733,1134903170,
    1836311903,-1323752223,512559680,-811192543,-298632863,-1109825406, ...

6.1 Left-most outer (lazy) rewriting

Lazy expression rewriting can be shown line by line in the same way that simple expression evaluation was. Given the previous squaresfrom 2 infinite list then the expression:

    head (tail (tail (squaresfrom 2)))

has a simple defined result.
In the following rewrite table squaresfrom is called sqs:

    sqs n = (n*n) : sqs (n+1)

The expression reduced:

    head (tail (tail (sqs 2)))          leftmost redex is sqs 2
    head (tail (tail (4 : (sqs 3))))    leftmost redex is (tail (4:(sqs 3)))
    head (tail (sqs 3))                 leftmost redex is sqs 3
    head (tail (9 : (sqs 4)))           leftmost redex is (tail (9:(sqs 4)))
    head (sqs 4)                        leftmost redex is sqs 4
    head (16 : (sqs 5))                 leftmost redex is (head (16:(..)))
    16                                  irreducible

Done! And the final inner expression (sqs 5) in the last reduction is never used and therefore not evaluated further.

7 Dot-dot lists and list comprehensions

It is a desirable goal that functional programs should be written at as high a level as possible. Instead of always having to write recursive functions for every small job, programmers should use libraries of higher order functions that capture common patterns of recursion to define functions. This simplifies programming and makes programs clearer (if the programmer understands the library functions). In addition Haskell provides higher level list constructors to remove the need for recursive functions to build lists.

7.1 Dot-dot lists

The “dot-dot” lists generate lists of numbers. The following are the four different expression forms:

    [ start .. end ]
    [ start .. ]
    [ start , next .. end ]
    [ start , next .. ]

The first creates a list of all the numbers from start to end inclusive, the second creates the infinite list of all the numbers from start (remember with lazy evaluation this is no problem), the third creates the list of all numbers from start to end in steps of next − start, and the fourth is the infinite version of the third. So [0..] is the list of all the non-negative integers, and [1,3..99] is the list of all odd integers from 1 to 99 inclusive.

7.2 List comprehensions

List comprehensions are also known as ZF expressions because the idea and notation are based on Zermelo-Fraenkel set notation, but remember these things are lists not sets! The general form is:

    [ expression | qualifiers ]

The qualifiers are either generators or filters.
The generators are of the form:

    pattern <- list-expression

The generator matches each element of the list, in turn, with the pattern (which must be a pattern for elements of the list NOT the list itself) and then the expression before the “|” is used to generate a new list element; any names bound in the pattern are in scope in the list-forming expression. For example the list comprehension:

    [ 1/x | x <- [1..8]]

matches “x” with each element of [1..8] in turn and for each “x” evaluates “1/x” to produce the next list item, giving the result:

    [1.0,0.5,0.33333,0.25,0.2,0.166667,0.142857,0.125]

There can be any number of qualifiers, separated by commas (“,”). The qualifiers are evaluated from left to right, the later ones being evaluated for each set of pattern values bound by the ones to the left of them. The expression:

    [(a,b) | a <- [1..3], b <- [1..a]]

generates a list of pairs; examine the sequence of values to see the order and scope of the generators:

    [(1,1),(2,1),(2,2),(3,1),(3,2),(3,3)]

A qualifier can also be a filter, which is an expression yielding a boolean. When a generator has produced a value the filter is evaluated, and if it yields True then any other filters and generators are evaluated and the value is used to construct a new list element; if it yields False the value is ignored. The following function produces all the factors of a number (excluding the number itself):

    factors n = [ f | f <- [1..n `div` 2], n `mod` f == 0]

With list comprehensions and higher order functions there are lots of ways to define functions. Consider a new version of the factorial function:

    factorial n = foldr (*) 1 [1..n]

This might look like cheating because the list only starts from 1, but factorial 0 still works: [1..0] is the empty list, so foldr simply returns the base value 1.
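The comprehensions above can be tried as written below. This is a sketch in standard Haskell syntax, where qualifiers are separated by commas, div and mod are put in backquotes to use them infix, and equality is tested with “==”. The name pairs is hypothetical, and the type signatures are assumptions added so the example stands alone.

```haskell
-- Two generators: b ranges over [1..a] for each value of a in turn.
pairs :: [(Int, Int)]
pairs = [ (a, b) | a <- [1..3], b <- [1..a] ]

-- One generator and one filter: keep only the f that divide n exactly.
factors :: Int -> [Int]
factors n = [ f | f <- [1 .. n `div` 2], n `mod` f == 0 ]

-- A dot-dot list folded with multiplication; [1..0] is empty, so the
-- result for 0 is the base value 1.
factorial :: Integer -> Integer
factorial n = foldr (*) 1 [1..n]
```

For example factors 12 should give [1,2,3,4,6] and factorial 5 should give 120.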
Or without cheating, the list of Fibonacci numbers:

    fib = 1 : 1 : [ a+b | (a,b) <- zip fib (tail fib)]

where zip takes two lists and builds one list of pairs of the corresponding items from its argument lists:

    zip [1, 1, 2, 3, 5, 8] [1, 2, 3, 5, 8, 13]

gives:

    [(1,1), (1,2), (2,3), (3,5), (5,8), (8,13)]

8 Datatype definitions

There is a very simple form of definition, the type synonym declaration. It only gives a simpler name to a type expression; it does not introduce a new type, both the new name and the type expression denote the same type. For example, this is how the pre-defined type String is declared (a type name must start with a capital letter):

    type String = [Char]

8.1 Algebraic types

There is one construct for defining algebraic data types. It must be used for all programmer-defined data types; it is, however, very flexible and gives many different sorts of data structure. The construct is sometimes called a sum of products type and it is available, in different forms, in the functional languages Miranda, Haskell, SML and Hope. In Haskell the general form of the definition is:

    data tname tyvars = Constr1 arg11 ... arg1n1
                      | Constr2 arg21 ... arg2n2
                      ...
                      | Constrm argm1 ... argmnm

In Haskell each constructor name, Constri, must start with a capital letter. Each constructor Constri when applied to the correct arguments (it is used like a function) produces a value of the type tname; the values produced by any of the constructors are of type tname and no other values are of type tname. An example of a datatype definition would be a tree structure:

    data Ntree = Null | Node Ntree Int Ntree

this can be read as defining an Ntree value as: either Null, or a node containing a number and two trees, a left and a right. The definition has introduced two constructors: Null and Node. The constructor Null has no arguments; it is a constant.
A simple value can be constructed and named:

    t = Node (Node Null 10 Null) 20 (Node Null 30 Null)

will produce a value of type Ntree for the structure:

          20
         /  \
       10    30

The constructors introduced by a datatype definition can be used in patterns (just like the constructors for lists: [] and “:”), so a function can be written to count the non-empty nodes in an Ntree:

    countn Null = 0
    countn (Node l v r) = (countn l) + 1 + (countn r)

and, if the entries in the tree are to be kept sorted so that all numbers less than or equal to a node’s value are in the left tree and all the greater numbers are in the right, a function to add a new value to a tree might be:

    addordered Null e = Node Null e Null
    addordered (Node l v r) e | e <= v = Node (addordered l e) v r
                              | e > v = Node l v (addordered r e)

so that if 15 is added to the Ntree constructed above the following sort of structure will be produced:

          20
         /  \
       10    30
         \
          15

Another example of a type definition might be a type to represent simple well-formed formulae of Propositional Logic (in Haskell the type name, like the constructor names, must start with a capital letter):

    data Wff = P | Q | R
             | Conj Wff Wff
             | Neg Wff
             | Disj Wff Wff
             | Impl Wff Wff

It might be better to have arbitrary variable names by using a constructor like ..| Var [Char].. using strings for names, but this is simpler. Given that type, the wff:

    P ∧ Q → R

would be represented as:

    Impl (Conj P Q) R

Now it is possible to define functions over the new type. If it was necessary to simplify formulae and put them in some normal form then a function to replace implication might be useful. Remember P → Q is equivalent to: ∼P ∨ Q.

    remimp :: Wff -> Wff
    remimp P = P
    remimp Q = Q
    remimp R = R
    remimp (Neg w) = Neg (remimp w)
    remimp (Conj a b) = Conj (remimp a) (remimp b)
    remimp (Disj a b) = Disj (remimp a) (remimp b)
    remimp (Impl a b) = Disj (Neg aa) bb
                        where aa = remimp a
                              bb = remimp b

8.1.1 Parameterised types

There is no reason why trees should not contain values of other types. As with lists, most of the structure and operations on trees are independent of the actual type of the element values. So the Tree type and operations on it can be polymorphic:

    data Tree a = Null | Node (Tree a) a (Tree a)

this is a parameterised type. The functions addordered and countn don’t need to be changed, but the type inferred for addordered has changed to:

    Ord a => Tree a -> a -> Tree a

where the context Ord a records that the element type must support the comparisons <= and > (this is part of the type class system, which these notes do not cover).

9 Conclusion

These notes have had various objectives:
• primarily to present characteristics of a functional programming language and how they are different from those of a procedural language,
• to give a very sketchy idea of functional programming to aid understanding of functional languages,
• to show some of the features of Haskell.
Perhaps trying for all three has meant that none of them have been achieved successfully. I hope not. Most of the concepts and ideas presented are similar in SML, Hope, Haskell and Miranda: recursive functions, expressions, nested scopes, application, algebraic datatypes, type polymorphism, lists and pattern matching. They do, of course, differ in detail. The ideas of lazy evaluation, list comprehensions, only one number type and the separate compilation system are not common to all other functional languages. The major omissions are the type class operator overloading features, the I/O system and the module and separate compilation facilities of Haskell. What has not been attempted has been to examine alternatives or evaluate different approaches to functional language design. This requires knowledge of other functional languages and their features.

References

[Bac78] J. Backus.
Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. Communications of the ACM, 21(8):613–641, August 1978.
[PH90] Paul Hudak, Philip Wadler et al. Report on the programming language Haskell. Technical report, Yale, Glasgow etc., April 1990.
[RHM86] Robert Harper, David MacQueen and Robin Milner. Standard ML. LFCS Report ECS-LFCS-86-2, Edinburgh University, March 1986.
[RMB80] R. M. Burstall, D. B. MacQueen and D. T. Sannella. HOPE: An experimental applicative language. In The 1980 Lisp Conference, pages 136–143, Stanford University, August 1980. ACM.
[Tur86] D. A. Turner. An overview of Miranda. SIGPLAN Notices, December 1986.
[Wik87] Åke Wikström. Functional Programming Using Standard ML. Prentice-Hall, 1987.