Haskell
Dan Johnson
CSC 415
11.8.2011
A History of Haskell
Haskell is a lazy, committee-designed, purely functional programming language whose
development began in the late 1980s. Before the Functional Programming Languages and Computer
Architecture Conference (FPCA) in 1987, Simon Peyton Jones and Paul Hudak met to discuss the
fragmentation of functional programming languages. From this meeting, the two decided that there
needed to be an initiative to create a new, common functional language. The meeting that followed
at the FPCA signaled the beginning of the Haskell language development.
The first meeting of the Haskell Committee occurred at Yale in 1988. Haskell derives its
name from Haskell Curry, a mathematician and logician. At this meeting, the goals of the Haskell
Language were formally defined and are listed as follows:
1. It should be suitable for teaching, research, and applications, including building large
systems.
2. It should be completely described via the publication of a formal syntax and
semantics.
3. It should be freely available.
4. It should be usable as a basis for further language research.
5. It should be based on ideas that enjoy a wide consensus.
6. It should reduce unnecessary diversity in functional programming languages.
(Hudak, Hughes, Peyton Jones & Wadler, 2007)
Development of the language began shortly after the 1988 meeting, but it was not until April 1,
1990 that the first version of the language, Haskell 1.0, was released. From 1990 to 1999 many
incremental versions of Haskell were released. The Haskell Committee decided that a formalized and
stable version of the language was needed, and in 1999 it published this standard version as
Haskell 98. The reasoning behind creating a Haskell standard was so the language could be used in
teaching environments and books could be written about it. Haskell 98 did not fully stabilize until
2002.
A new standardized version of the language, Haskell 2010, was released recently. This update
collects the changes made to the language since 2002 and repackages them with up-to-date
documentation in the newest version of the Haskell Report. Since the release of Haskell 2010,
versions of the standard are named by their year of release (Marlow, 2010).
Basics
Haskell can be either compiled or interpreted. Two common interactive environments are the Hugs
interpreter and GHCi, the interactive mode of the Glasgow Haskell Compiler (GHC). For the purposes
of this paper, GHCi is assumed to be used. The interpreter starts with a module called Prelude
already loaded. Prelude contains a large selection of common functions for list processing, common
numeric operations, and basic I/O. The interpreter has a set of built-in commands for loading
external Haskell files and various other operations. The interpreter behaves in a similar fashion
to the interpreter in Python. Functions can be declared and evaluated completely from inside the
interpreter, which makes it a convenient place to test small portions of code.
An external Haskell file is called a Haskell script and is denoted by the extension .hs. To
import other modules or scripts into a Haskell script, Haskell uses the import statement followed by
the name of the module to be imported. To load a file into the interpreter, use the :load command
followed by the path to the script that needs to be opened.
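As a minimal sketch of a Haskell script, the file below imports one function from the standard
Data.Char module and defines a function with it. The file name Shout.hs and the function name shout
are hypothetical, chosen only for illustration.

```haskell
-- Shout.hs: a hypothetical script name used only for this sketch.
import Data.Char (toUpper)

-- Convert every character of a string to upper case.
shout :: String -> String
shout text = map toUpper text
```

Assuming the file is saved as Shout.hs in the current directory, entering :load Shout.hs in the
interpreter makes shout available, so that shout "hello" evaluates to "HELLO".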
Data Types and Typeclasses
In Haskell, types are collections of related values. The types of expressions are calculated
automatically at compile time. This is known as type inference. Because the type of every
expression is determined at compile time, type errors are also found at compile time, and the
application will not compile if any exist. For this reason the code is considered to be safer, and
we say that Haskell is statically typed (Lipovača, 2011). All type names in Haskell begin with a
capital letter.
Haskell has several basic types. The Bool type consists of the logical values True and False.
The Char type is the single character type. The String type is a string of characters, which is
just an alias for a list of Char values. The Int type is a fixed precision integer type. The
Integer type is an integer type of arbitrary precision. The Float type is a real floating point
type with single precision, and the Double type is a real floating point type with double precision.
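The basic types above can be sketched with one binding per type. All of the names below (flag,
initial, greeting, and so on) are hypothetical and exist only for illustration. The last binding
has no type declaration, so its type is left to type inference.

```haskell
-- Illustrative bindings; all names here are hypothetical.
flag :: Bool
flag = True

initial :: Char
initial = 'H'

-- String is an alias for [Char], so the next two values are equal.
greeting :: String
greeting = "Haskell"

greetingChars :: [Char]
greetingChars = ['H', 'a', 's', 'k', 'e', 'l', 'l']

-- Int is fixed precision; Integer is arbitrary precision.
small :: Int
small = 42

big :: Integer
big = 2 ^ 100

single :: Float
single = 3.14

precise :: Double
precise = 3.141592653589793

-- No type declaration: the compiler infers the type Bool for this expression.
bothTrue = flag && (small > 0)
```

In the interpreter, the command :type bothTrue reports the inferred type.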
Lists in Haskell are homogeneous data structures, or sequences of values which all have the
same type. The length of a list is arbitrary, meaning it is not part of the list's type and does
not have to be declared. List elements are contained inside of square brackets with each element
separated by a comma. Each element of a list must be of the same type; a list with elements that
are not of the same type will result in a type error. List indexes start at zero. The Haskell
Prelude contains a plethora of functions for list concatenation and manipulation. Because of the
power of lists, most applications in Haskell use lists extensively. The following is an example of
a list declaration: let a = [1, 2, 3, 4].
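A few of the Prelude's list operations are sketched below on the example list. The binding names
are hypothetical; the operators and functions used ((++), (:), (!!), length, sum, reverse) are all
part of the Prelude.

```haskell
-- A list of Integers, as in the example above.
numbers :: [Integer]
numbers = [1, 2, 3, 4]

-- (++) concatenates two lists of the same type.
extended :: [Integer]
extended = numbers ++ [5, 6]

-- (:) adds one element to the front of a list.
withZero :: [Integer]
withZero = 0 : numbers

-- (!!) indexes into a list, starting at zero.
firstElement :: Integer
firstElement = numbers !! 0

-- length, sum, and reverse are a few of the Prelude's list functions.
summary :: (Int, Integer, [Integer])
summary = (length numbers, sum numbers, reverse numbers)

-- This would be rejected at compile time because the elements differ in type:
-- mixed = [1, 'a', True]
```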
Tuples differ from lists in that they may contain elements of multiple types, but the length of
a tuple is determined by its type. Tuple elements are contained in parentheses with each element
separated by a comma. The important difference between lists and tuples is that a list type covers
lists of any length, whereas a tuple type fixes the number of elements. The empty tuple type ( )
has only one value, which is also written ( ). Lists and tuples can be combined: you can have lists
of tuples and tuples with lists inside of them. By combining the two data types, Haskell can
support numerous complex data types. The following is an example of a tuple: let a = (1, "John").
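The sketch below shows the example tuple together with a list of tuples and a tuple of lists. Note
that tuple elements are separated by commas. The binding names are hypothetical; fst, snd, and
unzip are Prelude functions.

```haskell
-- A pair holding two different types, as in the example above.
person :: (Integer, String)
person = (1, "John")

-- fst and snd extract the components of a pair.
personId :: Integer
personId = fst person

personName :: String
personName = snd person

-- A list of tuples: every element has the same tuple type.
roster :: [(Integer, String)]
roster = [(1, "John"), (2, "Mary")]

-- A tuple containing lists, built by splitting the roster into columns.
columns :: ([Integer], [String])
columns = unzip roster
```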
Functions are defined as mappings from values of a given type to values of another type. These
types may be the same. While functions are stand-alone entities, they are also values with types of
their own. This is important because in Haskell, functions can be passed as parameters to other
functions and also be returned as results. In Haskell, it is perfectly acceptable to have a function
that accepts a function as a parameter and returns a function as its result. A function's type
includes both the type of its parameter and the type of its result. For example, the following
declaration gives ascii the type Char -> Int, a function that takes a Char and returns an Int:
ascii :: Char -> Int.
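One possible definition of ascii is sketched below, along with a function that accepts a function
and one that returns a function. The full type of ascii is Char -> Int. The body of ascii (using
the Prelude's fromEnum) and the names applyToA and offsetBy are assumptions made for illustration;
the source gives only the type declaration.

```haskell
-- ascii has type Char -> Int: it maps a Char to its character code.
-- Using fromEnum for the conversion is an assumption of this sketch.
ascii :: Char -> Int
ascii c = fromEnum c

-- A function that takes a function as a parameter.
applyToA :: (Char -> Int) -> Int
applyToA f = f 'A'

-- A function that returns a function as its result.
offsetBy :: Int -> (Char -> Int)
offsetBy k = \c -> ascii c + k
```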
Typeclasses are groupings of types. If a type belongs to a typeclass, it must implement all of
the operations of that typeclass, similar to how a class in Java must implement all the methods of
an interface it uses. Typeclass constraints are denoted with the '=>' symbol. Everything before the
'=>' is considered to be a class constraint. Consider the declaration of a plus function,
“plus :: (Num a) => a -> a -> a”. The parameters for this function must be of a numeric type.
Because Float, Double, Int, and Integer are all instances of the same typeclass Num, a function
that uses the Num typeclass is valid for any numeric data type. This feature allows polymorphic
functions to be created easily in Haskell (Lipovača, 2011).
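The plus declaration can be completed as in the sketch below and then used at several numeric
types. The body of plus and the names intSum, integerSum, and doubleSum are assumptions made for
illustration.

```haskell
-- The constraint (Num a) restricts the type variable a to numeric types.
plus :: (Num a) => a -> a -> a
plus x y = x + y

-- The same definition works at several numeric types.
intSum :: Int
intSum = plus 2 3

integerSum :: Integer
integerSum = plus (2 ^ 70) 1

doubleSum :: Double
doubleSum = plus 1.5 2.25

-- This would be rejected at compile time because Bool is not an instance of Num:
-- badSum = plus True False
```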
In the previous paragraph, Num was associated with the type variable “a”. A type variable is
a placeholder variable that begins with a lowercase character. Type variables are not true types,
but are instead used to represent parameters with no type attached to them or in conjunction with a
typeclass. In this way, they are used for generic parameters for functions, because they are not
bound to a specific type or type class without explicitly being assigned to one. Haskell uses the
capitalization of the variable to know that Int represents the type Int, and int is just a type variable.
Because type variables are used in parameter declaration, they are most commonly simple, one
character variables.
In Haskell, the programmer has the flexibility to create and define new data types and
typeclasses. To create a new type, the data keyword is used, followed by the name of the new type,
an equals sign, and then one or more value constructors separated by the “|” character. The names
of the new type and of its constructors must be capitalized. The following is a definition for a
calendar event: data Event = Event Int Int Int. In this type declaration, the Event data type has a
single constructor, also named Event, that holds three separate integers, one each for the month,
day, and year.
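A data declaration needs a value constructor name in front of its fields, so a compilable form of
the calendar event is sketched below, followed by a type with two alternatives separated by “|”.
The Shape type, the functions yearOf and area, the sample value deadline, and the deriving clauses
are assumptions added for illustration.

```haskell
-- A calendar event with month, day, and year fields.
-- The first Event names the type; the second is its value constructor.
data Event = Event Int Int Int
  deriving (Show, Eq)

-- A type with two alternatives separated by |.
-- Shape and its constructors are hypothetical.
data Shape
  = Circle Double
  | Rectangle Double Double
  deriving (Show, Eq)

-- Constructing a value and taking it apart again by pattern matching.
deadline :: Event
deadline = Event 11 8 2011

yearOf :: Event -> Int
yearOf (Event _ _ year) = year

area :: Shape -> Double
area (Circle r)      = pi * r * r
area (Rectangle w h) = w * h
```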
Functions
Functions in Haskell are called by writing the function name followed by each of the function's
arguments, separated by spaces. It is not necessary to declare the parameter types or the return
type of a function because of Haskell's type inference. However, it is generally considered good
practice to declare them explicitly. In a type declaration, the parameter types are listed one at a
time with '->' in between each. The last type in the list denotes the return type of the function,
and there is no other way to represent the return type. Because Haskell is lazy, arguments are not
evaluated in a fixed left to right order; each is evaluated only when its value is needed.
Functions are considered to be prefix functions (functions whose arguments are listed after the
function name) unless they are used as infix functions. Any function in Haskell that takes two
parameters can be used as an infix function by surrounding its name with backquotes (`) when the
function is called.
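The sketch below calls the same two-parameter function in prefix and in infix form. The quoting
character for infix use is the backquote (`). The function name divides and the other binding names
are hypothetical.

```haskell
-- A two-parameter function with an explicit type declaration.
-- The name divides is hypothetical.
divides :: Int -> Int -> Bool
divides d n = n `mod` d == 0

-- Prefix application: the function name comes first.
prefixCall :: Bool
prefixCall = divides 3 12

-- Infix application: the same function wrapped in backquotes.
infixCall :: Bool
infixCall = 3 `divides` 12

-- An operator such as + is infix by default; parentheses make it prefix.
prefixPlus :: Int
prefixPlus = (+) 2 3
```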
Function definitions are written in much the same way as their type declarations. The type
declaration for a function starts with the function name, then two colons, then each parameter type
and the return type. In the definition, the function name is written, followed by a listing of its
parameters as variables or patterns, followed by an equals sign and the expression or conditional
expressions for the function. The value of the expression to the right of the equals sign is the
return value of the function, so it must match the return type in the function's type declaration.
From this, we can see that the structure of the definition directly mirrors the structure of its
declared type. Consider the following function that doubles a given integer:
double :: Integer -> Integer
double n = 2 * n
The first line declares that the function named “double” takes a parameter of type Integer and
returns a value of type Integer. The next line restates the name of the function, followed by a
listing of all the input parameters to the function. In this case, double takes a single parameter
so a single parameter is listed; if more parameters were required they would be listed separated by
spaces. Next comes the equals sign, followed by the doubling operation. In this particular example,
the function's type declaration is not necessary, because Haskell will infer from the expression
2 * n that the input variable n must be of a numeric type.
Infix operators are functions too, and can be defined and overloaded by placing the operator in
parentheses. By allowing infix operator overloading, Haskell gives the programmer the ability to
create infix functions for user-defined abstract data types, which leads to more readable and
writable code. In this manner, we treat the infix operator as its own function.
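As a small sketch of defining an infix operator for a user-defined type, the code below introduces
a vector type and an addition operator for it. The type Vec and the operator |+| are hypothetical
names invented for this example.

```haskell
-- A hypothetical two-dimensional vector type.
data Vec = Vec Int Int
  deriving (Show, Eq)

-- A new infix operator for Vec, declared with its name in parentheses.
(|+|) :: Vec -> Vec -> Vec
(|+|) (Vec x1 y1) (Vec x2 y2) = Vec (x1 + x2) (y1 + y2)

-- The operator is then used infix, like the built-in ones.
total :: Vec
total = Vec 1 2 |+| Vec 3 4
```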
A powerful tool in Haskell for writing functions is pattern matching. In pattern matching, a
sequence of patterns is used to select between a sequence of results of the same type. The patterns
are tried in order from top to bottom, and the first one that matches selects the result. Pattern
matching is not limited to functions of a single argument. In the case of multiple arguments, the
patterns are matched in order from left to right for each argument. The underscore character is
used to denote a “wildcard” pattern, which always matches (Hutton, 2007).
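The following sketch shows pattern matching on one argument, on two arguments, and on a list. The
function names isZero, both, and firstOr are hypothetical.

```haskell
-- Patterns are tried from top to bottom; the first match is used.
isZero :: Int -> Bool
isZero 0 = True
isZero _ = False

-- With several arguments, each clause gives one pattern per argument.
-- This is logical conjunction, written with the wildcard pattern.
both :: Bool -> Bool -> Bool
both True True = True
both _    _    = False

-- Patterns can also take apart lists: [] is the empty list and
-- (x : _) is a list whose first element is x.
firstOr :: Int -> [Int] -> Int
firstOr fallback []      = fallback
firstOr _        (x : _) = x
```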
Polymorphism and Overloading
Polymorphic functions are functions whose type contains one or more type variables. So any
function whose type uses a type variable, with or without a typeclass constraint, is a polymorphic
function. Most built-in functions in Haskell are polymorphic. When a type variable carries no
constraint at all, the polymorphism is called parametric polymorphism. A parametrically polymorphic
function does not care about the type of its parameters; it simply treats that type as completely
abstract (O'Sullivan, Stewart & Goerzen, 2009).
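Two parametrically polymorphic functions are sketched below. Neither inspects the values of the
abstract types a and b; they only rearrange or count them. The names swapPair and count are
hypothetical.

```haskell
-- The type variables a and b stand for any types at all.
swapPair :: (a, b) -> (b, a)
swapPair (x, y) = (y, x)

-- A function that works on lists of any element type.
count :: [a] -> Int
count []       = 0
count (_ : xs) = 1 + count xs
```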
In addition to parametric polymorphism, Haskell also supports the concept of overloading. In
parametric polymorphism, the type of the parameters does not matter. There are some instances,
however, where a function should be generic enough to accept multiple parameter types while still
being very specific about the exact operation to be performed with each type. For example, the
equality operator (==) should be able to compare two lists to verify that each element of the list
is equal to its corresponding element in another list. However, a list of Integers and a list of
Chars are very different, and the same comparison method cannot be used for each. Haskell allows us
to overload the equality operator for each type where it is needed. Haskell will also ensure that
we cannot compare lists of two different types, because both arguments of (==) must have the same
type (Marlow, 2010).
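Overloading (==) for a new type is done by making the type an instance of the Eq typeclass, as in
the sketch below. The Color type and the binding names are hypothetical.

```haskell
-- A hypothetical type for which we overload equality ourselves.
data Color = Red | Green | Blue

-- Making Color an instance of Eq overloads (==) for this type.
instance Eq Color where
  Red   == Red   = True
  Green == Green = True
  Blue  == Blue  = True
  _     == _     = False

-- Lists are compared element by element, using the element type's (==).
sameColors :: Bool
sameColors = [Red, Green] == [Red, Green]

sameChars :: Bool
sameChars = "abc" == ['a', 'b', 'c']

-- This would be rejected at compile time, since the two lists differ in type:
-- mismatch = [Red, Green] == "abc"
```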
Higher Order Functions
In Haskell, higher order functions are functions that either take a function as a parameter,
return a function as a result, or both. Higher order functions make up a large share of the
functions in Haskell because function passing is at Haskell's core. To declare a function as a
parameter to another function, the type of the function parameter must be enclosed in parentheses
in the position where the function will be passed in. Because curried functions already return
functions as results, the term “higher order” is typically just used for functions that take
functions as arguments.
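The sketch below declares a function parameter in parentheses and also uses two higher order
functions from the Prelude, map and filter. The names applyTwice, square, squares, evens, and
quadrupled are hypothetical.

```haskell
-- The parenthesized (a -> a) marks the position where a function is passed.
applyTwice :: (a -> a) -> a -> a
applyTwice f x = f (f x)

square :: Int -> Int
square x = x * x

-- map applies a function to every element of a list.
squares :: [Int]
squares = map square [1, 2, 3, 4]

-- filter keeps the elements for which the given function returns True.
evens :: [Int]
evens = filter even [1 .. 10]

-- square (square 3)
quadrupled :: Int
quadrupled = applyTwice square 3
```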
Technically, every function in Haskell takes a single parameter. Functions that appear to take
multiple parameters are curried functions, which take their arguments one at a time. Most functions
in Haskell are curried functions. This is possible because functions in Haskell can return
functions themselves. So the declaration addAlot :: Int -> Int -> Int -> Int really means
Int -> (Int -> (Int -> Int)). This means that addAlot takes in an integer and returns a function,
that function takes in an integer and returns a function, and that function takes in an integer and
returns the result. Curried functions present unique advantages over programming with tuples
because new functions can be created by partially applying curried functions. Curried functions do
not have to be declared with the parentheses shown above because the -> arrow in Haskell associates
to the right (Hutton, 2007).
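Partial application of addAlot can be sketched as follows. The body of addAlot and the names
addToTen, addToThirty, and addTuple are assumptions made for illustration; the source gives only
the type of addAlot.

```haskell
-- addAlot takes its three arguments one at a time.
addAlot :: Int -> Int -> Int -> Int
addAlot x y z = x + y + z

-- Supplying only the first argument yields a function of two arguments.
addToTen :: Int -> Int -> Int
addToTen = addAlot 10

-- Supplying two arguments yields a function of one argument.
addToThirty :: Int -> Int
addToThirty = addAlot 10 20

-- For comparison, a tupled version must receive all arguments at once.
addTuple :: (Int, Int, Int) -> Int
addTuple (x, y, z) = x + y + z
```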
In Haskell, functions can be created without assigning them a name or explicitly declaring
their type by using lambda expressions. The use of lambda expressions, and the symbol for the
nameless function (λ), comes from lambda calculus. Lambda expressions are useful for defining and
evaluating functions inside of other expressions, and they avoid the necessity of giving single-use
functions a name. The “\” symbol replaces the lambda symbol in the declaration of a lambda
function. Lambda expressions are also useful for functions that return functions as their result.
For example, we can use a lambda expression in a function that returns a list of the first n odd
numbers:
odds :: Int -> [Int]
odds n = map (\x -> x*2 + 1) [0..n-1]
The lambda expression takes in a variable x and returns 2 times x, plus 1. While a lambda
expression is not necessary to perform this task, the example shows that lambda expressions can be
defined and used in the middle of other functions in instances where defining a separate function
is not needed.
Conditional Expressions and Control Structures
Haskell provides conditional expressions in several different ways. The first and most basic
conditional is the “if-then-else” expression. It is very similar to the “if-then” statement found in
most languages, with a few distinct differences. Each “if-then” expression must include an “else”
branch. This helps to eliminate ambiguity when nesting conditional expressions. There is also no
dedicated “else if” form; instead, a further “if-then” expression is placed in the “else” branch.
Consider this function that calculates the nth Fibonacci number:
fib :: Int -> Int
fib n = if n == 0 then
            1
        else
            if n == 1 then
                1
            else
                fib(n-2) + fib(n-1)
This example shows how nested if-then conditional expressions need to be formatted in Haskell.
Also, no “return” statement is necessary.
Because a primary goal of Haskell is to create concise and readable code, Haskell also
implements another system for conditional expressions called guarded equations. Guarded equations
can be used in place of conditional expressions and make nested conditionals easier to read and
write. To define each case, the “|” character is used, followed by the condition, an equals sign,
and then the value for that condition. The guards must be indented under the function name, and by
convention they are aligned with one another. The “otherwise” condition is the catch-all for any
case not covered by the earlier guards. The guarded equation system closely mimics the conditional
definitions you might see in mathematics. It is also important to note that the guards are
evaluated from top to bottom, and the first one that holds is used (O'Sullivan, Stewart & Goerzen,
2009).
The previous Fibonacci function can be rewritten in a more compact form by using guards:
fib :: Int -> Int
fib n
    | n == 0    = 1
    | n == 1    = 1
    | otherwise = fib(n-2) + fib(n-1)
Each line following the list of function parameters has its own condition and result
expression. The same function could be written with multiple nested if-then expressions, but by
using guards, the function is less verbose and more readable. Note that each result expression must
satisfy the original type declaration of the function.
IO
Haskell uses a special abstract datatype called IO to handle input and output. Values of
this type are actions: when performed, an action returns a value and may have the additional side
effect of printing some result to the screen or reading some input from the keyboard. To
accomplish this, Haskell uses a monad to separate pure values from the actions typical of I/O and
imperative programming. I/O actions must be performed in a well-defined order to produce meaningful
output. Haskell's I/O monad lets the programmer express this sequential order while the rest of the
program remains free of side effects (Marlow, 2010).
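The separation between pure values and I/O actions can be sketched as below. The do notation used
to sequence the actions is described in the Haskell Report rather than in this paper, and the names
greet, askName, and lineLength are hypothetical.

```haskell
-- A pure function: no I/O is involved in building the greeting.
greet :: String -> String
greet name = "Hello, " ++ name ++ "!"

-- An I/O action: do notation sequences the steps in the order written.
askName :: IO ()
askName = do
  putStrLn "What is your name?"
  name <- getLine
  putStrLn (greet name)

-- An I/O action that also returns a value, here the length of a line read.
lineLength :: IO Int
lineLength = do
  line <- getLine
  return (length line)
```

The types keep the two worlds apart: greet can be used anywhere, while askName and lineLength can
only be performed from within another I/O action or at the interpreter prompt.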
Exception Handling
Instead of returning a result, I/O operations may raise an exception. I/O exceptions are of
the IOError type. Haskell provides a function called catch that handles exceptions. The catch
function is not selective about exception catching: it catches every IOError raised by the action
it wraps. Calls to catch can be nested throughout an application, and an exception that a handler
raises again is propagated to the next enclosing catch. Because Haskell's type errors are caught at
compilation and Haskell is a pure language with no side effects other than in I/O, typically only
I/O exceptions are used.
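A minimal sketch of catching an I/O exception follows. It assumes a recent GHC, where the
IOError-specific form of catch is exported from System.IO.Error under the name catchIOError. The
names failing and recovered, and the error message, are hypothetical.

```haskell
import System.IO.Error (catchIOError, ioeGetErrorString)

-- An action that always fails by raising an IOError.
-- ioError and userError come from the Prelude.
failing :: IO String
failing = ioError (userError "something went wrong")

-- catchIOError runs an action and, if it raises an IOError,
-- passes that error to the handler instead.
recovered :: IO String
recovered = failing `catchIOError` handler
  where
    handler :: IOError -> IO String
    handler e = return ("recovered from: " ++ ioeGetErrorString e)
```

Performing recovered never raises an exception, because the handler turns the error into an
ordinary String result.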
Lazy Evaluation
Haskell is a lazy language, as mentioned previously. When we say Haskell is lazy, we mean
that Haskell implements an evaluation strategy called lazy evaluation. In lazy evaluation,
arguments to a function are only evaluated if and when they are needed in that function. When
arguments are evaluated, they are evaluated only as far as needed to satisfy the function in the
context where they are used. If an argument is passed to a function and never used in that
function, it is simply never evaluated. If an argument is used multiple times in the same function,
its value is shared among the references to it without being reevaluated each time. Graham Hutton
makes an observation about the benefits of sharing and lazy evaluation by stating that “lazy
evaluation has the property that it ensures that evaluation terminates as often as possible.
Moreover, using sharing ensures that lazy evaluation never requires more steps than call-by-value
evaluation.” (Hutton, 2007)
An infinite list of Integers in Haskell can be defined as [1..]. Because Haskell uses lazy
evaluation, we can use this infinite list as an argument to a function without needing to worry
that it will try to evaluate the entire list. For example, the “take” function returns the first n
items of a list, so if we execute the statement take 10 [1..], the output will be
[1,2,3,4,5,6,7,8,9,10]. While an infinite list of integers is used as a parameter, only the first
ten items in that list are evaluated. This is a classic example of lazy evaluation (Lipovača,
2011).
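The sketch below collects the take example with two further illustrations of lazy evaluation: an
argument that is never used, and an infinite list that is consumed only until a condition fails.
The names firstTen, ignoreSecond, safe, and smallSquares are hypothetical.

```haskell
-- Only the first ten elements of the infinite list are ever computed.
firstTen :: [Integer]
firstTen = take 10 [1 ..]

-- The second parameter is never used, so it is never evaluated.
ignoreSecond :: Int -> Int -> Int
ignoreSecond x _ = x

-- undefined raises an error if evaluated; here it is never evaluated.
safe :: Int
safe = ignoreSecond 1 undefined

-- takeWhile stops at the first element that fails the test,
-- so the rest of the infinite list of squares is never computed.
smallSquares :: [Integer]
smallSquares = takeWhile (< 50) (map (^ 2) [1 ..])
```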
Scoping
Haskell is statically (lexically) scoped, but scoping plays a different role than in imperative
languages, since Haskell does not use variables in the same sense. A name is bound to a value once
and is never reassigned. Functions in Haskell can see the values of their parameters, any other
values defined within the function, and the unchanging definitions of the enclosing scope. They
cannot depend on mutable state outside of themselves. This forces the programmer to make sure that
every value that can vary for a given function is passed to that function.
Object Oriented Programming
Haskell does not support object oriented programming.
Evaluation
The views in the remaining sections are purely the author's own opinion.
Readability
From an outsider's perspective, Haskell is a difficult language to read. However, after
understanding the nature of the Haskell language and the methodologies and reasoning for its
structure and layout, Haskell is actually an easy language to read. Haskell promotes concise
expressions and declarations. Alignment and spacing matter, so the code tends to be very structured
in its layout. Also, Haskell allows most expressions to be simplified into more compact code, which
means there is less code in general.
Writability
From a novice's perspective, Haskell is not necessarily difficult to write so much as it is
different to write. The layout of functions, specifically those with complex parameter assignments,
can be challenging to set up the way you intend them to be. Expressions and conditional expressions
are very easy to write and are logically straightforward. Once you understand how to create
functions and data types, Haskell can be written fairly quickly. Because each function can be run
independently, it is easy to test the components of an application. This really speeds up the
troubleshooting process. Also, because Haskell is not as verbose a language as C++ or Java, the
code that must be written to accomplish similar tasks is comparatively much shorter.
Reliability
Functional languages are typically very reliable languages and Haskell is no exception. The
ability to break functions apart from the parent program and test each element independently makes
verifying function correctness much simpler. Also, because pure Haskell functions do not depend on
state, you know that the same input to a given function will return the same result, so there are
no surprises in function output. Haskell's type checking also catches type errors at compilation,
so the issue of incorrect parameter types is virtually nonexistent. These key features, along with
the ability of lazy evaluation to handle infinite lists transparently and to leave an erroneous
computation such as a division by zero unevaluated when its result is never needed, demonstrate
Haskell's reliability.
Cost
Haskell is an open source language with no licensing fees or other costs, so the direct
cost, the cost of software and licenses, is nothing. However, Haskell is not an extremely popular
language. This lack of popularity makes it more difficult to find trained programmers, and the
programmers who are well versed in Haskell can charge more. At the same time, the demand for
Haskell programmers is relatively low in comparison to more popular languages like Java and C++.
Because most programmers are trained in imperative languages, Haskell is inherently more difficult
for them to learn because the paradigm is so dramatically different. It is reasonable to suspect
that a programmer could learn a language like Python or Ruby considerably faster than Haskell.
Issues
Monads and I/O are relatively new in the grand scheme of Haskell, since they directly involve
the side effects that Haskell seeks to avoid. Because they were not native to the original
implementations of the language, they really feel like they are tacked on to the language. Code
that deals with monads and I/O simply behaves and is controlled in ways that feel distinctly
different from programming and performing other operations in the language.
Overall
Overall, Haskell is an incredibly powerful functional language. Haskell's real strengths are its
purity, support for abstract data types, and support for concurrency and parallelism. The Haskell
community continues to grow at a steady pace. As the need for easy parallelization increases, so
will the popularity of the language. Haskell has far outlived many of its sister functional
languages developed at around the same time and should continue to live as more programmers and
companies discover the power of the functional paradigm.
Bibliography
Biancuzzi, F., & Warden, S. (2009). Masterminds of programming (1st ed., pp. 177-196). Sebastopol,
CA: O'Reilly Media.
Hudak, P., Hughes, J., Peyton Jones, S., & Wadler, P. (2007). A history of Haskell: Being lazy with
class. Paper presented at the ACM SIGPLAN History of Programming Languages Conference III, San Diego.
Retrieved from http://research.microsoft.com/en-us/um/people/simonpj/papers/history-of-haskell/history.pdf
Hutton, G. (2007). Programming in Haskell. New York, NY: Cambridge University Press.
Lipovača, M. (2011). Learn you a Haskell for great good!: A beginner's guide (1st ed.). San Francisco:
No Starch Press. Retrieved from http://learnyouahaskell.com/chapters
Marlow, S. (Ed.). (2010). Haskell 2010 language report. Retrieved from
http://haskell.org/definition/haskell2010.pdf
O'Sullivan, B., Stewart, D., & Goerzen, J. (2009). Real world Haskell. Sebastopol, CA: O'Reilly Media.