Haskell
Dan Johnson
CSC 415
11.8.2011

A History of Haskell

Haskell is a lazy, committee-designed, pure functional programming language whose development began in the late 1980s. Before the Functional Programming Languages and Computer Architecture conference (FPCA) in 1987, Simon Peyton Jones and Paul Hudak met to discuss the fragmentation of functional programming languages. From this meeting, the two decided that an initiative was needed to create a new, common functional language. The meeting that followed at FPCA signaled the beginning of the Haskell language's development.

Haskell derives its name from Haskell Curry, a mathematician and logician. The first meeting of the Haskell Committee occurred at Yale in 1988. At this meeting, the goals of the Haskell language were formally defined as follows:

1. It should be suitable for teaching, research, and applications, including building large systems.
2. It should be completely described via the publication of a formal syntax and semantics.
3. It should be freely available.
4. It should be usable as a basis for further language research.
5. It should be based on ideas that enjoy a wide consensus.
6. It should reduce unnecessary diversity in functional programming languages.

(Hudak, Hughes, Peyton Jones & Wadler, 2007)

Development of the language began shortly after the 1988 meeting, but it was not until April 1, 1990 that the first version of Haskell, Haskell 1.0, was released. From 1990 to 1999 many incremental versions of Haskell were released, and in 1999 the Haskell Committee decided that a formalized and stable version of the language needed to be released. They decided to call this standard version of the language Haskell 98. The reasoning behind creating a Haskell standard was that a stable language could be used in teaching environments and books could be written about it. Haskell 98 did not fully stabilize until 2002.
A new standardized version of the language, Haskell 2010, was released recently. This update includes all the changes in the language since 2002, repackaged with up-to-date documentation in the newest version of the Haskell Report. Since the release of Haskell 2010, versions of the language are named for their date of release (Marlow, 2010).

Basics

Haskell programs can be compiled or run interactively in an interpreter. Two common Haskell interpreters are Hugs and GHCi, the interactive environment of the Glasgow Haskell Compiler (GHC). For the purposes of this paper, GHCi is assumed. The interpreter starts with a module called Prelude already loaded. Prelude contains a large selection of common functions for list processing, numeric operations, and basic I/O. The interpreter has a set of built-in commands for loading external Haskell files and various other operations. It behaves in a similar fashion to the interpreter in Python: functions can be declared and evaluated completely from inside the interpreter, which makes it a convenient place to test small portions of code.

An external Haskell file is called a Haskell script and is denoted by the extension .hs. To import other modules into a Haskell script, Haskell uses the import statement followed by the name of the module to be imported. To load a file into the interpreter, use the :load command followed by the path to the script that needs to be opened.

Data Types and Typeclasses

In Haskell, types are collections of related values. The type of every expression is calculated automatically at compile time. This is known as type inference. Because the type of every expression is known at compile time, type errors are found at compile time, and an application will not compile if type errors exist. For this reason the code is considered to be safer. From this we can conclude that Haskell is statically typed (Lipovača, 2011). All type names in Haskell begin with a capital letter. Haskell has several basic types.
The Bool type consists of the logical values True and False. The Char type is the single-character type. The String type is a string of characters, which is just an alias for a list of Char. The Int type is a fixed-precision integer type. The Integer type holds integers of arbitrary precision. The Float type holds single-precision floating point numbers, and the Double type holds double-precision floating point numbers.

Lists in Haskell are homogeneous data structures: sequences of values which all have the same type. The length of a list is arbitrary, meaning it is not fixed by the list's type. List elements are written inside square brackets, with each element separated by a comma. Each element of a list must be of the same type; a list with elements of different types results in a type error. List indexes start at zero. Haskell has a plethora of built-in functions for list concatenation and manipulation. Because of the power of lists, most applications in Haskell use them extensively. The following is an example of a list declaration: let a = [1, 2, 3, 4].

Tuples differ from lists in that they may contain elements of multiple types, but the length of a tuple is determined by its type. Tuple elements are written inside parentheses, with each element separated by a comma. The important difference between lists and tuples is that lists of different lengths share the same type, while a tuple's length is fixed. The empty tuple type () has exactly one value, also written (). Lists and tuples can be combined: you can have lists of tuples and tuples with lists inside of them. By combining these two data types, Haskell can express numerous complex data structures. The following is an example of a tuple: let a = (1, "John").

Functions are defined as mappings of values of a given type to values of another type. These types may be the same.
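To make the basic types, lists, tuples, and function mappings above concrete, here is a minimal sketch that could be saved as a script and loaded into the interpreter. All of the names (numbers, greeting, person, roster, secondNumber, isSmall) are illustrative assumptions and do not come from any library.

```haskell
-- A list is homogeneous: every element has the same type.
numbers :: [Int]
numbers = [1, 2, 3, 4]

-- String is an alias for a list of Char, so this value equals "hi".
greeting :: String
greeting = ['h', 'i']

-- A tuple may mix types, but its length is fixed by its type.
person :: (Int, String)
person = (1, "John")

-- Lists and tuples nest freely.
roster :: [(Int, String)]
roster = [person, (2, "Mary")]

-- List indexes start at zero, so index 1 selects the second element.
secondNumber :: Int
secondNumber = numbers !! 1

-- A function maps values of one type (Int) to values of another (Bool).
isSmall :: Int -> Bool
isSmall n = n < 3
```

Writing something like [1, "John"] instead would be rejected at compile time, because the elements of a list must share one type.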
While functions are standalone entities, they also have types of their own. This is important because in Haskell, functions can be passed as parameters to other functions and also be returned as results. It is perfectly acceptable to have a function that accepts a function as a parameter and returns a function as its result. A function's type records both the type of its parameters and the type of its result. For example, the declaration ascii :: Char -> Int gives ascii the type Char -> Int: it takes a Char and returns an Int.

Typeclasses are groupings of types. A type that belongs to a typeclass must implement all the operations of that typeclass, similar to how a class in Java must implement all the methods of an interface it implements. Typeclass constraints are denoted with the "=>" symbol. Everything before the "=>" is considered to be a class constraint. Consider the declaration of a plus function, plus :: (Num a) => a -> a -> a. The parameters for this function must be of a numeric type. Because Float, Double, Int, and Integer are all instances of the same typeclass, Num, a function that uses the Num typeclass is valid for any of these numeric types. This feature allows polymorphic functions to be created easily in Haskell (Lipovača, 2011).

In the previous paragraph, Num was associated with the type variable "a". A type variable is a placeholder that begins with a lowercase character. Type variables are not concrete types; they stand for a parameter whose type is left open, either on their own or in conjunction with a typeclass. In this way, they are used for generic parameters of functions, because they are not bound to a specific type or typeclass unless explicitly constrained. Haskell uses capitalization to tell the two apart: Int represents the type Int, while int is just a type variable.
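A minimal sketch of the class constraint described above. The plus signature is the one given in the text; its body and the example values (intSum, doubleSum, integerSum) are assumptions added for illustration.

```haskell
-- 'a' is a type variable; the constraint (Num a) restricts it to numeric types.
plus :: (Num a) => a -> a -> a
plus x y = x + y

-- The same definition is used at three different numeric types.
intSum :: Int
intSum = plus 2 3

doubleSum :: Double
doubleSum = plus 1.5 2.25

-- Integer has arbitrary precision, so this does not overflow.
integerSum :: Integer
integerSum = plus (2 ^ 64) 1
```

An application such as plus 'a' 'b' would be rejected by the type checker, since Char is not an instance of Num.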
Because type variables are used in parameter declarations, they are most commonly simple, one-character names.

In Haskell, the programmer has the flexibility to create and define new data types and typeclasses. To create a new type, the data keyword is used, followed by the name of the new type, an equals sign, and one or more constructors separated by the "|" character. The name of the new type and the names of its constructors must be capitalized. The following is a definition for a calendar event: data Event = Event Int Int Int. In this declaration, the Event data type has a single constructor, also named Event, that holds three separate integers: one each for the month, day, and year.

Functions

Functions in Haskell are called by writing the function name followed by each of the function's arguments, separated by spaces. It is not necessary to declare the parameter types or the return type of a function, because of Haskell's type inference. However, it is generally considered good practice to declare them explicitly. In a type declaration, the parameter types are listed one at a time with "->" between them. The last type listed is the return type of the function, and there is no other way to represent the return type. Function application associates to the left, so f x y is read as (f x) y. Functions are considered to be prefix functions (functions whose arguments are listed after the function name) unless they are declared to be infix functions. Any function in Haskell that takes two parameters can be used as an infix function by surrounding its name with backticks (`) when the function is called.

Function definitions are written in much the same way as their type declarations. A type declaration starts with the function name, then two colons, then each parameter type and the return type. In the definition, the function name is written again, followed by a listing of its parameters as variables or patterns, followed by an equals sign and the expression or conditional expressions for the function.
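The pieces described above (a data declaration, a type declaration followed by a definition, and prefix versus infix application) can be sketched as follows. Note that a data declaration needs a constructor name in front of its fields and that infix application uses backticks; the constructor name Event and the names sameYear, newYear, and deadline are assumptions made for illustration.

```haskell
-- A new type with one constructor holding a month, a day, and a year.
data Event = Event Int Int Int

-- Type declaration: name, two colons, each parameter type, then the return type.
-- The pattern (Event _ _ y) unpacks the three fields and keeps only the year.
sameYear :: Event -> Event -> Bool
sameYear (Event _ _ y1) (Event _ _ y2) = y1 == y2

newYear, deadline :: Event
newYear  = Event 1 1 2011
deadline = Event 11 8 2011

-- Prefix application: the function name, then its arguments separated by spaces.
prefixResult :: Bool
prefixResult = sameYear newYear deadline

-- Infix application: the same two-parameter function wrapped in backticks.
infixResult :: Bool
infixResult = newYear `sameYear` deadline
```

Both prefixResult and infixResult evaluate to True, since the two forms are the same call written differently.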
The value of the expression to the right of the equals sign is the return value of the function, so its type must match the return type in the function's type declaration. From this, we can see that the structure of a definition directly mirrors the structure of its declared type. Consider the following function that doubles a given integer:

    double :: Integer -> Integer
    double n = 2 * n

The first line declares that the function named "double" takes an Integer and returns an Integer. The next line restates the name of the function, followed by a listing of all its input parameters. In this case, double takes a single parameter, so a single parameter is listed; if more parameters were required, they would be listed separated by spaces. Next comes the equals sign, followed by the doubling operation. In this particular example, the function's type declaration is not strictly necessary, because Haskell will infer from the expression 2 * n that n must be of a numeric type.

Infix operators are functions too, and they can be defined and overloaded by writing the operator in parentheses. By allowing infix operator overloading, Haskell gives the programmer the ability to create infix functions for user-defined abstract data types, which leads to more readable and writable code. In this manner, we treat the infix operation as its own function.

A powerful tool in Haskell for writing functions is pattern matching. In pattern matching, a sequence of patterns is used to select between a sequence of results of the same type. Pattern matching is not limited to functions of a single argument. In the case of multiple arguments, the patterns are matched in order from left to right for each argument. The underscore character denotes a "wildcard" pattern, which matches any value (Hutton, 2007).

Polymorphism and Overloading

Polymorphic functions are functions whose type contains one or more type variables.
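As a small sketch, the functions below have type variables in their types and are written with the pattern matching and operator definition described above. The names swapPair, headOr, both, and the operator (|+|) are hypothetical and chosen only for this example.

```haskell
-- 'a' and 'b' are type variables, so swapPair works for any pair of types.
swapPair :: (a, b) -> (b, a)
swapPair (x, y) = (y, x)

-- Patterns are tried from top to bottom; '_' matches anything.
headOr :: a -> [a] -> a
headOr def []    = def
headOr _   (x:_) = x

-- With several arguments, the patterns are matched left to right.
both :: Bool -> Bool -> Bool
both True True = True
both _    _    = False

-- An infix operator is declared by writing its name in parentheses.
(|+|) :: (Int, Int) -> (Int, Int) -> (Int, Int)
(a, b) |+| (c, d) = (a + c, b + d)
```

For example, headOr 0 [4, 5, 6] evaluates to 4, headOr 'z' "" falls back to 'z', and (1, 2) |+| (3, 4) evaluates to (4, 6).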
So any function whose type uses a typeclass constraint or a type variable is a polymorphic function. Most built-in functions in Haskell are polymorphic. When a function's type uses an unconstrained type variable, the polymorphism is called parametric polymorphism: the function does not care about the actual type of its parameters and simply treats that type as completely abstract (O'Sullivan, Stewart & Goerzen, 2009).

In addition to parametric polymorphism, Haskell also supports overloading. In parametric polymorphism, the type of the parameters does not matter. There are instances, however, where the type should matter: a function should be generic enough to accept several parameter types while still being very specific about the exact operation to perform for each type. For example, the equality operator (==) should be able to compare two lists to verify that each element of one list is equal to its corresponding element in the other. However, a list of Integer and a list of Char are very different, and the same comparison method cannot be used for each. Haskell allows us to overload the equality operator for each type as needed. Haskell will also ensure that we cannot compare lists of two different types unless we provide an overloaded function to do so (Marlow, 2010).

Higher Order Functions

In Haskell, higher order functions are functions that take a function as a parameter, return a function as a result, or both. Higher order functions make up the majority of functions in Haskell, because function passing is at Haskell's core. To declare that a parameter of a function is itself a function, the parameter function's type must be written inside parentheses, in the position where the function will be passed in.
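A minimal sketch of a function parameter written in parentheses. The name applyTwice and the helper evensDoubled are assumptions for illustration; double is the doubling function used elsewhere in this paper, and map, filter, and even come from the Prelude.

```haskell
double :: Integer -> Integer
double n = 2 * n

-- The first parameter is itself a function, so its type (a -> a)
-- is written in parentheses in the first position.
applyTwice :: (a -> a) -> a -> a
applyTwice f x = f (f x)

-- map and filter are higher order functions too:
-- each takes a function as its first argument.
evensDoubled :: [Integer] -> [Integer]
evensDoubled xs = map double (filter even xs)
```

Here applyTwice double 5 evaluates to 20, and evensDoubled [1..6] evaluates to [4,8,12].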
Because curried functions are defined as functions that return functions as a result, the term "higher order" is typically reserved for functions that take functions as arguments.

Technically, every function in Haskell takes a single parameter. A function that appears to take multiple parameters is a curried function: it takes its arguments one at a time. Most functions in Haskell are curried functions. This is possible because functions in Haskell can themselves return functions. So the declaration addAlot :: Int -> Int -> Int -> Int really means Int -> (Int -> (Int -> Int)). That is, addAlot takes in an integer and returns a function; that function takes in an integer and returns a function; and that function takes in an integer and returns the result. Curried functions present unique advantages over programming with tuples, because new functions can be created by partially applying curried functions. Curried functions do not have to be declared with the parentheses shown above, because the -> operator in Haskell types associates to the right (Hutton, 2007).

In Haskell, functions can be created without assigning them a name or explicitly declaring their type by using lambda expressions. The use of lambda expressions, and the symbol for the nameless function (λ), comes from lambda calculus. Lambda expressions are useful for defining and evaluating expressions inside other expressions, and they avoid the need to give single-use functions a name. The "\" symbol replaces the lambda symbol in the declaration of a lambda function. Lambda expressions are also useful for functions that return functions as their result. For example, we can use a lambda expression in a function that returns a list of the first n odd numbers:

    odds :: Int -> [Int]
    odds n = map (\x -> x*2 + 1) [0..n-1]

The lambda expression literally takes in a variable x and returns 2 times x, plus 1.
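The currying described above can be sketched as follows. The text gives only the type of addAlot, so its body here is an assumption, as are the names addToTen and addAlot'.

```haskell
-- Assumed body: the text specifies only the type of addAlot.
addAlot :: Int -> Int -> Int -> Int
addAlot x y z = x + y + z

-- Partial application: supplying one argument yields a function
-- that is still waiting for the remaining two.
addToTen :: Int -> Int -> Int
addToTen = addAlot 10

-- The same function with the parentheses written out and the
-- arguments taken one at a time by nested lambda expressions.
addAlot' :: Int -> (Int -> (Int -> Int))
addAlot' = \x -> (\y -> (\z -> x + y + z))
```

Both addAlot 1 2 3 and addAlot' 1 2 3 evaluate to 6, and addToTen 20 30 evaluates to 60.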
While a lambda expression is not necessary to write odds, the example shows that lambda expressions can be defined and used in the middle of other functions, in instances where defining a separate function is not needed.

Conditional Expressions and Control Structures

Haskell provides conditional expressions in several different ways. The first and most basic is the "if-then" expression. It is very similar to the "if-then" statement found in most languages, with a few distinct differences. Each "if-then" expression must include an "else" branch. This helps to eliminate ambiguity when nesting conditional expressions. There is also no separate "else if" construct; instead, a further "if-then" expression is nested inside the "else" branch. Consider this function that calculates the nth Fibonacci number:

    fib :: Int -> Int
    fib n = if n == 0 then 1
            else if n == 1 then 1
            else fib (n-2) + fib (n-1)

This example shows how nested if-then conditional expressions are written in Haskell. Also, no "return" statement is necessary.

Because a primary goal of Haskell is to produce concise and readable code, Haskell also provides another system for conditional expressions called guarded equations. Guarded equations can be used in place of conditional expressions and make nested conditionals easier to read and write. To define each case, the "|" character is used, followed by the condition, an equals sign, and then the value for that condition. Each guard must be indented, and by convention the guards of an equation are aligned with one another. The "otherwise" condition is the catch-all for any case not covered by the earlier guards. The guarded equation system closely mimics the conditionals you might see in mathematics. It is also important to note that these conditions are evaluated top to bottom and left to right.
(O'Sullivan, Stewart & Goerzen, 2009) The previous Fibonacci function can be rewritten in a more compact form by using guards:

    fib :: Int -> Int
    fib n
      | n == 0    = 1
      | n == 1    = 1
      | otherwise = fib (n-2) + fib (n-1)

Each line following the list of function parameters has its own condition and result expression. The same function could be written with multiple nested if-then expressions, but by using guards, the function is less verbose and more readable. Note that each result of the function must satisfy the return type in the function's original type declaration.

IO

Haskell uses a special abstract datatype called IO to handle input and output. Values of this type are actions: they return a value and may have the additional side effect of printing some result to the screen or reading some input from the keyboard. To accomplish this, Haskell uses a monad to separate ordinary values from the actions typical of I/O and imperative programming. I/O actions must be performed in a well-defined order to produce meaningful output. Haskell's I/O monad gives the programmer a way to express that sequential order while the rest of the program stays free of side effects (Marlow, 2010).

Exception Handling

Instead of returning a result, I/O operations may raise an exception. I/O exceptions are of the IOError type. The Haskell 2010 Prelude includes a function called catch that handles exceptions. The catch function is not selective about which exceptions it catches. It can be used multiple times throughout an application, and calls to it can be nested; an exception is propagated from one catch to an enclosing one if the handler re-raises it. Because Haskell's type errors are caught at compilation, and Haskell is a pure language with no side effects other than in I/O, typically only I/O exceptions are used.

Lazy Evaluation

Haskell is a lazy language, as mentioned previously. When we say Haskell is lazy, we mean that Haskell implements an evaluation strategy called lazy evaluation. In lazy evaluation, parameters to a function are only evaluated if and when they are needed in that function.
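A minimal sketch of arguments that are never evaluated. The names firstOf, safe, and spineOnly are hypothetical; error is the Prelude function that aborts the program if its result is ever demanded.

```haskell
-- The second parameter is never used, so it is never evaluated.
firstOf :: a -> b -> a
firstOf x _ = x

-- The call to error is passed as an argument but never demanded,
-- so this value is simply 42.
safe :: Int
safe = firstOf 42 (error "this argument is never evaluated")

-- length needs only the structure of the list, not its elements,
-- so the two failing elements are never evaluated either.
spineOnly :: Int
spineOnly = length [error "unused", error "unused"]
```

In a language that evaluates arguments before the call, both definitions would abort; under lazy evaluation safe is 42 and spineOnly is 2.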
When parameters are evaluated, they are evaluated only just enough to satisfy the needs of the function in the context in which they are used. If a parameter is passed to a function and never used in that function, it is simply never evaluated. If a parameter is used multiple times in the same function, its value is shared among all the references to it without being reevaluated each time. Graham Hutton makes an observation about the benefits of sharing and lazy evaluation by stating that "lazy evaluation has the property that it ensures that evaluation terminates as often as possible. Moreover, using sharing ensures that lazy evaluation never requires more steps than call-by-value evaluation." (O'Sullivan, Stewart & Goerzen, 2009)

An infinite list of Integers in Haskell can be defined as [1..]. Because Haskell uses lazy evaluation, we can use this infinite list as a parameter to a function without needing to worry that it will try to evaluate the entire list. For example, the "take" function returns the first n items of a list, so if we execute the statement take 10 [1..], the output will be [1,2,3,4,5,6,7,8,9,10]. While an infinite list of integers is used as a parameter, only the first ten items in that list are evaluated. This is a classic example of lazy evaluation (Lipovača, 2011).

Scoping

Haskell scoping is different than what you would define as strictly static or dynamic scoping, since Haskell does not use variables in the same sense as imperative languages. Functions in Haskell can see the values of their parameters, any other values bound within the function, and the definitions that are in scope around them, but there is no mutable outside state for them to read or change. This encourages the programmer to make sure that all the values a given function depends on are passed to that function.

Object Oriented Programming

Haskell does not support object oriented programming.

Evaluation

The views in the remaining sections are purely the author's own opinion.

Readability

From an outsider's perspective, Haskell is a difficult language to read. However, after understanding the nature of the Haskell language and the reasoning behind its structure and layout, Haskell is actually easy to read. Haskell promotes concise expressions and declarations. Alignment and spacing matter, so the code tends to be very structured in its layout. Also, Haskell allows most expressions to be simplified into more compact code, which means there is less code in general.

Writability

From a novice's perspective, Haskell is not necessarily difficult to write so much as it is different to write. The layout of functions, specifically those with complex parameter lists, can be challenging to set up the way you intend. Expressions and conditional statements are very easy to write and are logically straightforward. Once you understand how to create functions and data types, Haskell can be written fairly quickly. Because each function can be run independently, it is easy to test the components of an application. This really speeds up the troubleshooting process. Also, because Haskell is not as verbose a language as C++ or Java, the code that must be written to accomplish similar tasks is comparatively much shorter.

Reliability

Functional languages are typically very reliable languages, and Haskell is no exception. The ability to break functions apart from the parent program and test each element independently makes verifying function correctness incredibly simple. Also, because Haskell is state independent, you know that the same input to a given function will return the same result, so there are no surprises in function output. Haskell's type checking also catches type errors at compilation, so the issue of incorrect parameter types is virtually nonexistent.
These key features, along with the ability to transparently handle infinite lists with lazy evaluation and the fact that division by zero does not break functions, demonstrate Haskell's reliability.

Cost

Haskell is an open source language with no licensing fees or other costs, so the direct cost (the cost of software and licenses) is nothing. However, Haskell is not an extremely popular language. This lack of popularity makes it more difficult to find trained programmers, and the programmers who are well versed in Haskell can charge more. At the same time, the demand for Haskell programmers is relatively low in comparison to more popular languages like Java and C++. Because most programmers are trained in imperative languages, Haskell is inherently more difficult to learn, because the paradigm is so dramatically different. It is reasonable to suspect that a programmer could learn a language like Python or Ruby considerably faster than Haskell.

Issues

Monads and monadic I/O were relatively late additions in the grand scheme of Haskell, since I/O directly creates the side effects that Haskell seeks to avoid. Because they were not native to the original implementations of the language, they really feel like they are tacked on. Code that deals with monads and I/O behaves and is controlled in ways that feel distinctly different from programming and performing other operations in the language.

Overall

Overall, Haskell is an incredibly powerful functional language. Haskell's real strengths are its purity, its support for abstract data types, and its support for concurrency and parallelism. The Haskell community continues to grow at a steady pace. As the need for easy parallelization increases, so will the popularity of the language. Haskell has far outlived many of its sister functional languages developed at around the same time, and it should continue to live on as more programmers and companies discover the power of the functional paradigm.

Bibliography

Biancuzzi, F., & Warden, S. (2009). Masterminds of programming.
(1 ed., pp. 177-196). Sebastopol, CA: O'Reilly Media.

Hudak, P., Hughes, J., Peyton Jones, S., & Wadler, P. (2007). A history of Haskell: Being lazy with class. Paper presented at ACM SIGPLAN History of Programming Languages Conference III, San Diego. Retrieved from http://research.microsoft.com/en-us/um/people/simonpj/papers/history-ofhaskell/history.pdf

Hutton, G. (2007). Programming in Haskell. New York, NY: Cambridge University Press.

Lipovača, M. (2011). Learn you a Haskell for great good!: A beginner's guide. (1 ed.). San Francisco: No Starch Press, Inc. Retrieved from http://learnyouahaskell.com/chapters

Marlow, S. (Ed.). (2010). Haskell 2010 language report. Retrieved from http://haskell.org/definition/haskell2010.pdf

O'Sullivan, B., Stewart, D., & Goerzen, J. (2009). Real world Haskell. O'Reilly Media.