ICL LAB
ZERO
Introduction to Python
James Curran
Contact Info
Ben Hachey
job: Contract Research Staff, Institute for Communicating and Collaborative Systems
office: B02 BP6 (Basement, Room 02, 6 Buccleuch Place)
office phone number: 0131 650 4656
email: [email protected]
James Curran
job: 3rd Year PhD student, Institute for Communicating and Collaborative Systems
office: 3R14 BP2 (3rd Floor Right, Room 14, 2 Buccleuch Place)
office phone number: 0131 650 4431
web: http://www.cogsci.ed.ac.uk/~jamesc/
email: [email protected]
It is best to contact the tutors by email first if you have questions.
What is Python?
Python is a scripting language developed by Guido van Rossum in the early 1990s, first at CWI in the Netherlands and later at CNRI (Corporation for National Research Initiatives). The language is named after the BBC show “Monty Python’s Flying Circus”. You
will find there are frequent (and gratuitous) references to Monty Python skits in the Python documentation (but
thankfully not in these notes).
The Python Language Reference Manual sums up the language features nicely:
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.
Its high-level built in data structures, combined with dynamic typing and dynamic binding, make it
very attractive for rapid application development, as well as for use as a scripting or glue language to
connect existing components together. Python’s simple, easy to learn syntax emphasises readability
and therefore reduces the cost of program maintenance. Python supports modules and packages,
which encourages program modularity and code reuse. The Python interpreter and the extensive
standard library are available in source or binary form without charge for all major platforms, and can
be freely distributed.
I will not explain what all this means here, but hopefully by the end of the course you will understand many of the
important properties of Python that are alluded to in this description.
Why use Python?
Python makes the development of short to medium length programs easy. Python syntax is very simple (in fact
many people call it executable pseudo-code), which helps make the programs easy to read, debug and maintain.
Variables don’t require declaration in Python which makes the code shorter and often clearer.
Since the scripts are interpreted, they don’t take time to be compiled (which for a large system can be very time
consuming) and the code itself can be manipulated as the program runs. Further, in Python experimentation is
made easy by the fact that code can be typed directly into the interpreter and run as it is entered. All types of
values in Python can be printed which makes debugging and tracing easier. Thus Python is great for the newbie,
the curious and the experimenter.
Python scripts can be run without change on any system which has a Python interpreter installed, which makes the
scripts fully portable. The Python interpreter and the extensive standard library are freely available in source or
binary form for all major platforms from the Python web site, http://www.python.org, and can be freely distributed.
Python’s free documentation has long been considered to be excellent (particularly for a free programming language), and there is a large, active and very helpful community of Python programmers on the web. There are also
many Python tutorials available on the web.
Python has built-in support for many standard data structures, such as lists and dictionaries, that compiled languages typically do not provide directly. This means more convenient syntax can be used for common operations. Python also has a comprehensive standard library supporting text and HTML/XML processing, network access and operating system
services.
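As a minimal sketch of what this built-in support looks like, the following creates a list and a dictionary using Python's own syntax. The variable names and values are purely illustrative and do not come from the course material; print is written with parentheses only so that the sketch also runs unchanged on later Python versions.

```python
# A list and a dictionary, written directly with built-in syntax.
words = ['spam', 'eggs', 'spam']    # list: an ordered sequence of values
prices = {'spam': 2, 'eggs': 3}     # dictionary: maps keys to values

words.append('ham')                 # add an item to the end of the list
prices['ham'] = 5                   # add a new key/value pair

print(len(words))                   # number of items in the list
print(prices['eggs'])               # look up a value by its key
```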
Members of the Python user community often distribute their own work in the form of Python modules that collect
common functionality together in one place. The Python website contains pointers to many free third party Python
modules, programs and tools, and additional documentation. Examples of these modules include Graphical User
Interface (GUI) components, matrix/vector support and the Natural Language Toolkit (NLTK), which we will be
using for this course.
Starting Python
One of the nicest things about Python (which is reminiscent of the glory days of BASIC on the Commodore
64, VIC 20 or BBC Micro) is the ability to type a program straight into the Python interpreter and have it run
as you go. This is one of the reasons that Python is so great to learn. If you want to try something just run the
Python interpreter, by typing python, and then pressing Enter at the command (or shell) prompt in the terminal
window.
Example 1 These notes will use my DICE shell prompt, which consists of the machine I am running on (tarski)
and my user id (s0090160). Your shell prompt will have a different machine name in square brackets followed by
s and then your matriculation number.
[tarski]s0090160: python2.2
Python 2.2 (#1, Aug 23 2002, 15:36:47)
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-85)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print 'Hello World'
Hello World
>>>
In examples where you have to type things into the shell or the interpreter directly, what you are required to type
is in bold teletype font, and the shell’s or interpreter’s responses are in normal teletype.
The Python interpreter prints its version and copyright information when it is first started for interactive
use. The >>> is the prompt, which means Python is waiting for a statement to be entered. Typing in
print 'Hello World' and pressing Enter causes the interpreter to read and execute the statement. The
result is Hello World printed on the next line. After executing the statement, Python returns to waiting for
another statement, and hence there is another >>> prompt.
And before you know it, you have written your first Python program, the ubiquitous 'Hello World'.
In order to exit from the Python interpreter, enter the key sequence Ctrl-D (i.e., while holding the control key
down, press the key d).
Most of the time I will provide each example in a separate file, because it saves typing them out every time you
want to run them and I need to type them out to check them anyway. These files are named after the example
number in the notes. However, for this week we will do everything interactively so you get the idea of typing things
directly into Python.
NB: On many Unix systems, Python version 2.2 (which we will require to run NLTK properly) is not the
default version of Python, so typing python rather than python2.2 (or on some systems python2) may
give you the wrong version. For instance, on the DICE machines and almost all current versions of Linux,
python will give you version 1.5.2, which will not run NLTK.
Python as a Calculator
To further stress the value of typing code directly into the Python interpreter for experimentation purposes, you
should have a go at using Python as a calculator. I fire up the Python interpreter all the time, since it is more
convenient than having to run the standard graphical calculator program:
>>> 123*34 + 3
4185
>>> 1834.34/34.5 - 4
49.169275362318835
>>> 1/2
0
>>> 1/2.0
0.5
Notice that Python prints the result of each expression on the following line and then waits for more
statements. The second thing to notice is that multiplication uses * rather than ×. The final, and perhaps most
mysterious, thing to notice is the difference between the results of 1/2 and 1/2.0. We will get back to why this
is the case in the next lab session.
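Until then, the two kinds of division can be told apart explicitly. Python 2.2 also provides a floor-division operator, //, which always rounds the quotient down, while dividing by a floating-point number keeps the fractional part. The sketch below is only an illustration, with made-up variable names, and uses a parenthesised print so that it also runs on later Python versions.

```python
whole = 1 // 2       # floor division: the quotient is rounded down to 0
fraction = 1 / 2.0   # one operand is a float, so the result keeps its fraction

print(whole)
print(fraction)
```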
Hello, Who are you?
The previous ‘Hello World’ program is a bit too simple (and impersonal). Programs that do not accept information
from the user (or the outside world) can only do so many interesting things. As a first step, we will ask the user
their name and greet them personally. Also, this program is quite a bit bigger, so it is worth typing it into a file
and making it run like an independent program.
Example 2 This example asks the user for their name and then greets them personally. It shows two different
ways of achieving this in Python. The first step is to open your favourite text editor (probably Emacs), type in
the program below, and then save it as example1.py.
#!/usr/bin/python2.2
# tute0/example1.py

name = raw_input('Enter your name? ')   # prompt and read user's name
print 'Hello ' + name                   # string concatenation
print 'Hello %s' % name                 # format string substitution

import sys
print 'Enter your name?'                # prompt for user's name
name = sys.stdin.readline()             # read user's name
print 'Hello %s' % name
print 'Hello "%s"' % name               # double quotes within single quotes
name = name[:-1]                        # remove the newline character
print "Hello '%s'" % name               # single quotes within double quotes
Running this example now gives the following output (remember to answer the questions (the bits in bold) otherwise it will just sit there):
[tarski]s0090160: python2.2 example1.py
Enter your name? James
Hello James
Hello James
Enter your name?
James
Hello James
Hello "James
"
Hello 'James'
This script can also be run without having to invoke the Python interpreter:
[tarski]s0090160: ./example1.py
This example shows one way of making a Python script appear stand-alone (that is, you don’t have to type python
before the name of the program). To make this happen, there are two steps to the process:
1. Make the first line of the script tell Unix how to run the script by specifying which program to run to interpret
the rest of the file. The #!, pronounced hash-bang, tells the operating system (Linux and Unix only)
that the file is a script that must be interpreted by the program that follows. The /usr/bin/python2.2 is
the full path (location in the directory hierarchy) of the interpreter, in our case the Python 2.2 interpreter.
2. Mark the file as executable using the chmod(1) (change mode) command (see footnote 1). The command chmod +rx example1.py marks the file example1.py as readable and executable for all users on the system. Doing this is called
changing the file permissions to executable and needs to be done only once per file.
This program shows two different ways of reading information from the user. The simplest and most direct approach
is to call the raw_input function, which needs to be given a string with which it prompts the user. The next
chapter fully describes creating and using functions, but for now, a function is a piece of code with a name that
does a particular task. The raw_input function waits for the user to type in a string and press Enter. Once we
have retrieved the name from the user (which is returned to us in a string), we need to store it somewhere.
This ‘somewhere’ is called a variable. A variable is like a mailbox with a name: you can store things in the
mailbox and look at the contents of the mailbox. Internally the variable is just a piece of memory with a number,
but Python makes it easier for us to remember the piece of memory by giving it a name rather than a number.
Another way of thinking about variables is that they are like pronumerals in algebra: they are designed to hold
and give a name to changeable bits of information.
Footnote 1: The bracketed number following a Unix command is customary. It refers to the section of the manual pages that describes the command. To see
this, try typing man chmod at the shell prompt: the manual entry for chmod comes from section 1 of the manual pages (the section is shown
in the top left and right corners of the manual page).
Variables are created in Python by assignment, which is the process of setting the value of a variable (or putting
something in the mailbox). Assignment is signified by the = operator. Be careful not to confuse
assignment with equality in mathematics. To avoid confusion, always think about assignment as taking the value calculated
on the right of the equals sign and placing it into the variable on the left.
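A short sketch may make the difference from mathematical equality concrete. The second line would be nonsense as an equation, but as an assignment it simply computes the right-hand side and stores the result. The variable names count and total are invented for this illustration, and print is parenthesised only so the sketch also runs on later Python versions.

```python
count = 3            # put the value 3 in the variable count
count = count + 1    # compute 3 + 1 on the right, then store 4 back in count
total = count * 2    # total receives 8; count itself is unchanged

print(count)
print(total)
```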
The next two lines print out the message with the contents of the variable, first using string concatenation (the
+ operator joins two strings) and then using a format string. A format string is like a template for making new strings by substituting values
into the template. The %s in the format string 'Hello %s' is replaced by a string value (the s in %s stands for
string), which must be placed after the % operator that follows the format string. More will be said about this below when
we describe strings in detail.
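The two approaches can be compared side by side in a small sketch. The values are illustrative only, and the last line goes slightly beyond the example program by assuming that a format string with two %s placeholders is given its values as a tuple in parentheses.

```python
name = 'James'                                # illustrative value

joined = 'Hello ' + name                      # concatenation: glue two strings together
formatted = 'Hello %s' % name                 # format string: %s is replaced by name
pair = 'Hello %s and %s' % ('James', 'Ben')   # two placeholders take a tuple of values

print(joined)
print(formatted)
print(pair)
```

Concatenation and the format string produce the same greeting here. The format string tends to be easier to read once the surrounding text gets longer.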
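The second chunk of Python shows a more general way of performing the same task. It reads the name
from the user using file operations on sys.stdin, the file that represents the user's input. Unlike raw_input, the readline call keeps the newline character that the user typed by pressing Enter, which is why the closing quote in the output appears on a line of its own until the newline is removed with name[:-1].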
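The effect of the trailing newline can be sketched without any user input. Here a literal string stands in for the value that sys.stdin.readline() would return (an assumption made so the sketch runs on its own), and the newline is removed only if it is actually present, a slightly more cautious variant of the name[:-1] used in the example.

```python
line = 'James\n'          # stand-in for what readline() returns after typing James

if line[-1:] == '\n':     # strip the last character only if it is a newline
    name = line[:-1]
else:
    name = line

print("Hello '%s'" % name)
```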
Python in Action
The following example gives you a taste of what you can do very simply with Python. This program will extract
all of the text from the main Informatics web page:
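#!/usr/bin/python2.2
import urllib
import re

URL = 'http://www.informatics.ed.ac.uk'
TAGS = re.compile('<[^>]+>')    # matches any HTML tag
WS = re.compile('\w+')          # matches a run of word characters
url = urllib.urlopen(URL)       # open the web page
html = url.read()               # read the whole page into a string
text = TAGS.sub('', html)       # delete the tags, leaving only the text
words = WS.findall(text)        # list every word in the text
for word in words:
    print word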
Type this program in, save it as example2.py and then make it executable. Here are some things you can try:
make the URL selectable by the user
convert the words to lowercase
count the number of times each word appears
extract the URLs from the page rather than the words
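As a starting point for the second and third suggestions, here is one possible sketch, not the only way to do it. It assumes a small hard-coded string in place of the downloaded page so that it runs without network access, and it uses a dictionary to hold the counts. The names WORD and counts are invented for this illustration, and print is parenthesised only so the sketch also runs on later Python versions.

```python
import re

TAGS = re.compile(r'<[^>]+>')   # same tag pattern as in the example program
WORD = re.compile(r'\w+')       # runs of word characters

# Stand-in for the HTML read from the web page (an assumption for this sketch).
html = '<h1>Spam</h1> <p>spam and eggs and SPAM</p>'

text = TAGS.sub('', html)       # delete the tags, leaving only the text
counts = {}                     # maps each lowercase word to its count
for word in WORD.findall(text):
    word = word.lower()                      # treat Spam, spam and SPAM alike
    counts[word] = counts.get(word, 0) + 1   # start at 0 for a new word

keys = list(counts.keys())
keys.sort()                     # print the words in alphabetical order
for word in keys:
    print('%s %d' % (word, counts[word]))
```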
Python Resources
http://www.python.org/ — main Python portal
Python binaries i.e. the interpreter, tutorials, reference manuals, and interesting Python modules. I recommend reading the tutorial, but it isn’t really designed for total beginners.
http://greenteapress.com/thinkpython.html — How to think like a Computer Scientist
free online book that teaches programming (Python version). Hard copy of this is available from the ITO.
http://www.onlamp.com/python/ - O’Reilly Python portal
contains parts of the Learning Python and Programming Python reference books online, and source code,
discussions etc.
http://www.cogsci.ed.ac.uk/~jamesc/icl/ccss2002.pdf - James Curran’s CCSS notes
http://www.cogsci.ed.ac.uk/~jamesc/icl/ccss2002.tgz - James Curran’s CCSS examples