COSC 121
Introduction to Programming
Richard Lobb, Erskine Building room 211
Email: [email protected]
COSC 121S1: Intro to Programming. ©Richard Lobb, 2012
Slide # 1
1. Administrative bumph
See Initial Course Handout
Fire regulations and exit locations
Student reps picked second week
People
Locations: lectures and labs
Items of assessment
(for the three items above, see the next few slides)
Important dates
Textbook
Learn site + forums
Slide # 2
People
Andy Cockburn
– Course Supervisor
– A real staff member
Phil Holland
– King of the Labs.
– Room 112.
Yalini Sundralingam
– Tutor coordinator.
– Room 332
Me
– Richard Lobb.
– See next slide.
Marina Filipovic
– Tutor in charge of 121 labs. Room 321
Slide # 3
Locations
Lectures (all here, we hope)
– Monday, Thursday 10 – 10:50am
  o NB: No lecture this first Monday.
– Wednesday 11 – 11:50am
Labs – all in Lab 2 aka Room 133, Erskine
– 6 streams – see Uni course page for COSC121
– You should have been allocated to a lab stream
  o Check out your UCStudentWeb / MyTimetable page.
– You can use other lab times if there are spare machines
Labs start next week
Slide # 4
Who am I?
Richard Lobb
– Room 211, Erskine Building (in 333 until end of March)
– Adjunct Senior Fellow
– “Retired” from full-time academia
  o Was in CS dept at Auckland from 1978 to 2003
  o Computer graphics was my area
– Passionate about programming
– This year teaching:
  o COSC 121 Introduction to Programming (Python),
  o ENCN305 Computer Programming & Stochastic Modelling (Matlab)
  o ENCE 260 Computer Systems (C),
  o COSC 365 Web Computing (PHP, C#, JavaScript)
Slide # 5
Assessment
The first quiz is THIS WEEK
What?                  | Worth | When?
Lab quizzes            | 10%   | Every week (10 @ 1%)
Mid-course quiz/test   | 15%   | Week of April 23rd (TBA)
Programming Assignment | 20%   | Due: 5pm, 29 May
Examination            | 55%   | To Be Announced
NOTE: To achieve a full pass (C or better) that will allow you to advance in Computer Science you must achieve:
(a) at least 45% over the two invigilated items combined (test + exam)
(b) at least 55% overall.
Slide # 6
Textbook
Available from:
– Bookshop (178 copies @ 7/2)
  o $61 (less 10% = ~$55)
– E-book from http://pragprog.com/titles/gwpy
  o $US22
Highly recommended
Course is built around it
We will assume you have a copy
Slide # 7
Other resources
"How to Think Like a Computer Scientist: Interactive Edition"
– http://thinkcspy.appspot.com/build/index.html
– A great interactive text, but discovered too late for this year
– BUT: it uses Python 3, not Python 2.
Online Python Tutor:
– Visualise Python execution, stepping forwards and backwards
– http://people.csail.mit.edu/pgbovine/python/
– BUT: its visualisations of data structures differ from my notes
Python exercises: codingbat.com/python
If you're already a programmer:
– The Python tutorial: docs.python.org/tutorial/
– Dive into Python: www.diveintopython.net/
Slide # 8
Mainly a programming course
You have to acquire new cognitive skills
– It’s NOT about using Microsoft Office or any other package
– It’s NOT about learning lecture notes by rote
– It’s NOT about hacking at code downloaded from the web
Labs and assignments are where you learn to program
Lectures provide the context, e.g.:
– Overview
– Motivation
– Expectations
– Focus on specific difficulties
– Demonstrations
– Program “style”
Slide # 9
How not to run a Marathon
Richard rants and raves.
Slide # 10
How to succeed in 121
Do all the labs, on time.
Do the assignment thoroughly, getting started early.
Don’t give up when the going gets tough
Try to solve problems by yourself
– Read the book
– Experiment with code
– Google
Don’t just “hack” at code until it works
– Work out what’s wrong before continuing
Slide # 11
Why do all the labs?
[Figure: COSC 121S1, 2010. Average final exam mark versus number of labs attempted. Vertical axis: average final exam mark, 0 to 70. Horizontal axis: number of labs attempted, 0 to 10.]
Slide # 12
Demo
The Learn website
Slide # 13
Lecture notes
All lecture notes are on Learn
– After this week you must print your own copies
Slide # 14
Timetable (tentative)
Note: The due date for each lab is at the end of the week in which it’s timetabled.
Late submission (of the associated quiz) is permitted for at most one further week.
Slide # 15
Programming Challenges Workshop
Just for fun!
Goals:
– to provide extra challenges for top students
  o But 121 students will need prior programming experience, sorry
– to prepare selected students for programming contests
  o e.g. ANZAC, NZ Programming Contest, ACM ICPC
Wednesday evenings, 7pm, starting 29 February (?)
– Staff and student tutors will act as mentors
– First round of ANZAC contest is 31 March
  o See http://www.cse.unsw.edu.au/~elgindyh/anzac12/home.htm
Contact [email protected] for info/details
Slide # 16
1. Getting Started
Read textbook, chapters 1 and 2
What’s 121 about anyway?
The What and Why of Python
Getting started, here and at home
Expressions
Assignment statements
Functions
Slide # 17
What’s 121 about anyway?
“Introduction to programming”
Programming underpins Computer Science
– But Computer Science is not (just) programming
  o Theory (e.g. computability, algorithmic complexity)
  o Algorithms and data structures
  o Languages and Operating Systems
  o Databases
  o Software Engineering
  o Artificial Intelligence
  o Data Communications & Networks
  o Graphics and Human-Computer Interaction
  o Web computing
  o Etc, etc
Slide # 18
The What and Why of Python
We teach programming in the Python language
– Prior to 2010 we used Java, but it was hard for beginners to get into
– Basic programming skills are language independent
– In later courses you’ll learn C, Java, C#, JavaScript, …
Python is:
– Free
– “Elegant”
– Powerful
– Relevant
Slide # 19
XKCD’s view (xkcd.com)
Slide # 20
Getting started, here and at home
Do the week 1 quiz, “Welcome to COSC 121”:
– Log in to website learn.canterbury.ac.nz
– Select COSC 121
– Follow the link on the front page.
At home, to get ready for the rest of the labs:
– Download and install
  o Python 2.6 or 2.7 from python.org/download [NOT 3.n!]
  o Wing 101 from www.wingware.com/downloads/wingide-101
  o Python Imaging Library from www.pythonware.com/products/pil/
Slide # 21
Starting a lab exercise
(Looking ahead to lab 1 in week 2)
Log in to learn.canterbury.ac.nz and
– Select 121
– Select Lab Material
– Download the zip archive for the required lab
– Unzip it
– Click the associated quiz link to start taking the quiz
Launch Wing 101 and start doing the lab ☺
Slide # 22
Wing 101
DEMO
[Screenshot of Wing 101 with two labelled areas: the program editing area and the Python Shell pane (NB!).]
Slide # 23
The Python Shell
DEMO
Bottom right pane in Wing
A “terminal” interface to the Python engine
– The Python Engine is the program that executes (“interprets”) Python instructions (“programs”)
  o A “virtual machine” or “scripting engine”
[Figure: the Shell connected to the Python engine]
Slide # 24
Expressions
DEMO
Python shell prints the value of any expression entered
An expression: something that can be evaluated to yield a value
– Typically a sequence of operands and operators
– E.g. (25 * 3 - 5) / 7
  o Operands here are: 25, 3, 5, 7
  o Operators: *, -, /
  o Evaluates to 10
Arithmetic operators (in lab 1):
– +, -, *, /, **, %
– Last two are exponentiation and modulus operators
Slide # 25
Expressions (cont’d)
Exponentiation: 2**3 is 8 (i.e. 2³)
Modulus: 26 % 3 is 2
– the remainder after dividing 26 by 3
Operator precedence determines order of evaluation
– ** highest then *, /, % then +, - [but more operators later]
  o Left-to-right (usually) if operators have same precedence
– Parentheses used to change default order
  o 2 + 3 * 5 is 17, (2 + 3) * 5 is 25
Slide # 26
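The precedence rules on the last two slides can be checked directly. This is a minimal sketch, not part of the original slides; the variable names are illustrative only, and it sticks to syntax that behaves the same under Python 2.7 and Python 3.

```python
# ** binds tighter than *, which binds tighter than +;
# parentheses override the default order.
no_parens = 2 + 3 * 5        # multiplication happens first
with_parens = (2 + 3) * 5    # parentheses force the addition first
power = 2 * 3 ** 2           # ** happens first: 2 * 9
remainder = 26 % 3           # 26 = 8 * 3 + 2

print(no_parens)      # 17
print(with_parens)    # 25
print(power)          # 18
print(remainder)      # 2
```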
Expressions (cont'd)
Operand types (in lab 1):
– int (normal and long variants): whole numbers, e.g. 28196
  o Exact
  o Any arbitrary size/accuracy
– float: numbers with fractional bit, e.g. 3.1415926
  o Approximate: ~16 digits accuracy
  o Stored in binary representation so even numbers like 1.1 are approximate
  o But 1.5, 1.25, 1.125 etc have exact representations!
Slide # 27
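To see the difference between exact and approximate values, the following sketch (an illustration added here, with made-up variable names) compares a sum of exactly representable fractions with a sum that is not exactly representable. It should behave the same under Python 2.7 and Python 3.

```python
# Fractions built from powers of two (1/2, 1/4, ...) are stored exactly.
exact = (1.5 + 1.25 == 2.75)

# 0.1 and 0.2 have no exact binary representation, so their sum
# differs very slightly from the stored value of 0.3.
inexact = (0.1 + 0.2 == 0.3)

# ints, by contrast, are exact at any size.
big = 2 ** 100

print(exact)      # True
print(inexact)    # False
print(big)        # 1267650600228229401496703205376
```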
Warning: integer division
5 / 10 evaluates to 0
– “How many times does 10 go into 5?”
– Answer: “No times”!
5.0 / 10.0 evaluates to 0.5
– So does 5.0 / 10 and 5 / 10.0
  o Because ints get converted to floats when doing mixed-type division
THIS WILL GET YOU TIME AND TIME AGAIN!
Changed in Python 3, but we’re not using that.
Slide # 28
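In Python 2, / gives a whole-number result when both operands are ints. The sketch below (an added illustration) uses the // operator instead, which performs that same whole-number division in both Python 2 and Python 3, so the results shown do not depend on the version.

```python
whole = 5 // 10             # whole-number division: 10 goes into 5 "no times"
mixed = 5.0 / 10            # one float operand is enough to get a float result
converted = float(5) / 10   # or convert an int explicitly before dividing

print(whole)        # 0
print(mixed)        # 0.5
print(converted)    # 0.5
```

If a division unexpectedly yields 0, checking whether both operands are ints is a sensible first step.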
Assignment statements
DEMO
Python shell executes any statements you enter
Our first type of statement is the assignment statement
– E.g. my_age = 200 / 2 - 1
Of form
    variable_name = expression
A variable name must be a letter (or underscore) followed by any number of alphanumeric characters (letters, underscores or digits)
Slide # 29
What Python does
1. Works out the value of the Right Hand Side (RHS)
2. Creates a new object (in the “object store” or heap) to hold that value
– In this case an int (i.e., “Integer”) object with the value 99
3. Adds the variable name to the current “dictionary” of variables (unless it’s already there)
4. Sets the dictionary entry to point to the new object
– We call this a reference to the object
Warning! This is a simplification. See “aliasing” later.
[Figure: the dictionary (also an object) has an entry my_age that points to a new int object, 99, in the object store (a.k.a. heap), which also holds other objects such as “Fred” and 37.5.]
Slide # 30
What a reference actually is
The computer has lots of random access memory (RAM)
– e.g. 4 gigabytes (GB) where a byte is 8 bits, e.g. 01101101
Bytes are numbered 0, 1, 2, 3, 4, ... up to 4 GB
– The number of each byte is called its address
A reference to an object is the address in memory at which the object is stored (i.e., where it starts)
– In Python it's called the object’s identity
Thus a dictionary entry consists of a variable name (called a “string”) together with the identity of the object it references
– Shown as an arrow in the figures
Slide # 31
Using variables
When a variable name appears in an expression, the associated object’s value is used in its place, e.g.
    his_age = my_age - 1
    my_age = my_age + 1    # '=' is not "equals"!!!
'=' means “is assigned the value”
[Figure: the dictionary entries my_age and his_age now point to new objects, 100 and 98, in the object store (a.k.a. heap). The old object 99 is defunct; other objects such as “Fred” and 37.5 are unchanged.]
Slide # 32
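The reference model above can be observed with the built-in id function and the is operator, which compares identities. This is a sketch added for illustration; old_reference is a hypothetical extra name, introduced only so that the original object stays reachable. In the standard (CPython) engine the identity happens to be the object's memory address.

```python
my_age = 99
old_reference = my_age              # a second name for the same object

# Both names reference one object, so the identities match.
shared = (id(my_age) == id(old_reference))

his_age = my_age - 1                # new object holding 98
my_age = my_age + 1                 # new object holding 100; my_age now references it

print(shared)                       # True
print(his_age)                      # 98
print(my_age)                       # 100
print(old_reference)                # 99: the original object is unchanged
print(my_age is old_reference)      # False
```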
Combined Operators
We often use operations like
    count = count + 1    # Increment the count
    size = 2 * size      # Double the size
So Python provides short-cut “combined operators”, e.g.
    count += 1
    size *= 2
Slide # 33
Functions
The key to programming is abstraction
– “Abstraction: the process of formulating generalised ideas or concepts by extracting the common qualities from specific examples” – Collins English Dictionary
– Naming a concept is a key part of abstraction
Example: “Hey, I often need to multiply a number by itself. I know, let’s call that squaring a number”
In Python, functions are used for abstracting common procedures (i.e., sequences of operations)
– We’ll see other abstraction methods – modules and classes – later.
Slide # 34
An example function
def square(x):           # x is called a "parameter"
    return x * x         # the "body" is indented
Used by “calling” or “invoking” it, e.g.
square(3)                # 3 is called the "argument"
square(37.5)             # Here 37.5 is the argument
square(2 + 3 * 5)        # The argument is an expression
The parameter is set to the value of the argument and then the body of the function is executed
In this case (but not always) it returns a value
– The value of the function
Slide # 35
What type is the parameter?
In many languages we have to specify the parameter type
– e.g. specify whether we are squaring ints or floats
– That restricts the allowable argument types
Python has “Duck Typing”
– “If it walks like a duck and quacks like a duck, it’s a duck”
– In this case: if the argument allows x * x it’s OK
  o If not, it crashes when we run it
So you can square ints and floats
– And complex objects, but we don't do them in 121
  o Well, maybe a quick demo?
– And any other objects we might define that allow ‘*’
  o We probably won't do that in 121 either
Slide # 36
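Duck typing can be tried out with the square function from the earlier slide. The sketch below is an added illustration: the same function accepts an int, a float and a complex number, and fails only at run time when the argument does not support x * x. It should run unchanged under Python 2.7 and Python 3.

```python
def square(x):
    return x * x

print(square(3))          # 9
print(square(37.5))       # 1406.25
print(square(2 + 3j))     # (-5+12j)

try:
    square("Fred")        # a str cannot be multiplied by a str
except TypeError:
    print("Cannot square a string")
```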
Running programs in Wing
Typing functions directly into the Python shell is clumsy
Instead we enter them into a program file
Then we can:
– Run the file
– Edit it easily
– Come back to it days later
– Re-use the functions in other programs
We are now programming ☺
DEMO
Slide # 37
Another example
def fahrenheit(degrees_c):
    degrees_f = (9.0 / 5.0) * degrees_c + 32.0
    return degrees_f

print fahrenheit(0)         # What answer do we get?
print fahrenheit(100)       # What answer here?
print fahrenheit(451.0)     # And here?
print fahrenheit("Fred")    # What does this do?
Note multiline body. All lines indented by same amount.
Also note local variable.
• The above is a program in a separate file
• Now we can’t just write expressions and have them printed
• We have to use a print statement. Covered in detail later.
Slide # 38
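For readers who want to check their answers to the questions on this slide, here is a sketch of the same function with the calls worked through. It is an added illustration that uses print(...) with a single argument so that it runs under both Python 2.7 and Python 3; the exact digits printed for 451.0 may vary slightly because floats are approximate.

```python
def fahrenheit(degrees_c):
    degrees_f = (9.0 / 5.0) * degrees_c + 32.0
    return degrees_f

print(fahrenheit(0))        # 32.0
print(fahrenheit(100))      # 212.0
print(fahrenheit(451.0))    # approximately 843.8

try:
    fahrenheit("Fred")      # a float cannot be multiplied by a str
except TypeError:
    print("fahrenheit needs a number")
```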
Local variables
degrees_f is a “local variable” of the fahrenheit function
Goes in a new dictionary belonging to that function
– That dictionary exists only while the function is running
  o So variable disappears when function returns
We say the scope of a local variable is the body of the function in which it is used
– Scope is where a variable can be “seen” from
Slide # 39
When do I use functions?
Always!
(Don't expect to understand all this properly yet!)
Programming is the art of breaking a problem into small “obviously correct” functions
– “Divide and conquer”
Each can be separately debugged
– To “debug” is to remove the “bugs”, i.e., errors, from a program
Most functions should be less than 10 lines
No functions may be longer than 40 lines in COSC121
– Break big functions into smaller functions
Slide # 40
Built-in functions
Python has lots
– Though most of its library functions are methods not simple functions – see later.
Some early samplers:
– round(x) returns the float value x rounded to the nearest whole number (in Python 2 the result is still a float, e.g. 3.0)
– int(x) converts x into an int
  o If x is a float, it truncates.
  o Later we'll see x can also be a string.
– abs(x) returns the absolute value of x
You’ll meet lots more in due course
Slide # 41
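A short sketch of the three built-in functions above, added for illustration. One version difference is worth knowing: round with a single argument returns a float in Python 2 (3.0) and an int in Python 3 (3), so the example compares the result with 3 rather than printing it.

```python
print(int(3.9))           # 3: truncation, not rounding
print(int(-3.9))          # -3: truncates towards zero
print(abs(-7.5))          # 7.5
print(round(2.7) == 3)    # True in both Python 2 and Python 3
```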
Week 2: Strings and Modules
Slide # 42
2. Strings
So far we have met just int and float objects.
To process text, we use string objects.
A string is a sequence of characters
In COSC121 we use only “normal” Python strings
– Can represent only standard western keyboard characters (“Latin1”)
– We ignore Unicode strings, which can represent vastly more characters, including e.g. Chinese
  o But note that Python 3 uses only Unicode strings
Slide # 43
String literals
A literal is just a constant
– e.g. 10 is an int literal, 23.5 is a float literal
String literals in Python are text enclosed by matching delimiters, which can be:
1. single quote (') characters, e.g.
  o s = 'Hi there class 121!'
2. or double quote (") characters, e.g.
  o s = "Hi there class 121!"
3. or triple single or double quote characters, e.g.
  o s = '''Hi there class 121!'''
  o s = """Hi there class 121!"""
Slide # 44
Pictorially
[Figure: the dictionary entry s points to a str object, “Hi there class 121!”, in the object store (a.k.a. heap), which also holds other objects such as 37.5 and 98.]
s is a variable of type str, a simple Latin1 (usually) string
Note for C and Java programmers: Python does not have a data type for representing single characters. You won’t miss it though – just use 1-character strings.
Slide # 45
Interlude: character encoding
Computer memory is just a sequence of bytes
– And an object occupies a chunk of memory
A byte is 8 bits e.g. 00110011
– A bit is a binary digit: either 0 or 1
Each byte can have 2⁸ (= 256) possible states
– So can represent the numbers 0 through 255 inclusive
Mapping from byte values to characters is via a character encoding table
– e.g. 48 is the character ‘0’, 49 is ‘1’, 65 is ‘A’, 66 is ‘B’ etc.
Slide # 46
Latin1 encoding (ISO 8859-1)
The encoding typically used within Python’s str objects
A sample bit is shown on the right
See htmlhelp.com/reference/charset for full table
There are some non-printing characters, e.g.:
– 9 is “horizontal tab”
  o ... by an unspecified amount!
– 10 is “line feed”
  o Used to start a new line
Interlude ends
Slide # 47
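The mapping between characters and byte values can be explored with the built-in functions ord (character to number) and chr (number to character). This sketch is an added illustration, not part of the original slides, and uses only values below 128, where Python 2 and Python 3 agree.

```python
print(ord("0"))     # 48
print(ord("A"))     # 65
print(chr(66))      # B
print(ord("\t"))    # 9  (horizontal tab)
print(ord("\n"))    # 10 (line feed)
```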
But before we continue with strings ...
The print statement
Syntax:
– print expression [, expression] ... [,]
  i.e. the word “print” followed by one or more expressions, separated by commas, whose values are to be printed
  o Square brackets in syntax specifications denote optional elements
  o “...” denotes zero or more repetitions of the preceding syntax element
Prints the values of the expressions on the screen
– Spaces separate the output expressions
– The optional comma at the end suppresses the final newline
Used to generate output in programs (or in the shell)
– An expression on a line by itself in a program doesn’t generate output as it does when typed into the shell.
Slide # 48
Example
DEMO
The following program:
n = 10
s = "n is"
print n
print s, n
print s,
print n
Outputs:
Evaluating printExample.py
10
n is 10
n is 10
>>>
Slide # 49
Which delimiters should I use?
Use double quote character (") as string delimiter unless string contains a double quote character. Then use single quote delimiters, e.g.
>>> print '"Hi", he said'
"Hi", he said
(Or vice-versa!)
Use the triple-delimiters (''' or """) when the string includes newline characters, e.g.
s = '''One fish
Two fish
Red fish
Blue fish'''
print s
In particular, see docstrings later
Slide # 50
Note on statement termination
Python statements end at newline except:
– Inside triple-quoted strings (as above)
– Inside bracketed expressions (i.e. ( ... ) or [ ... ]), e.g.
    cost = (23.5 *
            36)       # This is valid (but ugly)
– When newline preceded by backslash (\), e.g.
    cost = 23.5 * \
           36         # Also valid (and also ugly)
Slide # 51
Special characters
Can embed special characters like tabs, newline and quote characters in a string with special “escape sequences” e.g.
s = "\"One fish\nTwo fish\nRed fish\nBlue fish\""
print s
Outputs
"One fish
Two fish
Red fish
Blue fish"
\n is newline
\t is tab
\' is single quote
\" is double quote
\xhh is the character with hexadecimal value hh
\\ is a backslash
See Language Reference Manual section 2.4.1 for complete syntax
Slide # 52
String operations
Three of the arithmetic operators are overloaded to work on strings:
1. '+' performs string concatenation
  o Both operands must be strings
2. '*' performs string repetition
  o One operand a string, the other an int
3. '%' performs string formatting
  o Left operand must be a string, right operand a value or a "tuple"
  o This is deprecated
  o Textbook uses it but we won’t! [You don’t need to understand it.]
Other string operations need indexing and method calls
– See later
Slide # 53
String concatenation examples
The program
DEMO
s = "Hi"
print s + "there"
print s + " " + "there"
print "Hi " "there" # Two literals, no operator
n = 10
print "n is " + str(n) # What happens without 'str'?
Outputs
Hithere
Hi there
Hi there
n is 10
Slide # 54
String repetition examples
DEMO
The program
s = "Hi"
print 3 * s
print s * 3
n = 5
print n * s
print s * n
Outputs
HiHiHi
HiHiHi
HiHiHiHiHi
HiHiHiHiHi
Slide # 55
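The concatenation and repetition rules from the last three slides can be combined in one small sketch. It is an added illustration with made-up variable names, and it also answers the earlier question of what happens without str: adding a str and an int raises a TypeError. The code runs under both Python 2.7 and Python 3.

```python
s = "Hi"
joined = s + " " + "there"       # both operands of + must be strings
repeated = 3 * s                 # one operand a string, the other an int
labelled = "n is " + str(10)     # convert the int before concatenating

try:
    "n is " + 10                 # str + int is not allowed
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(joined)      # Hi there
print(repeated)    # HiHiHi
print(labelled)    # n is 10
print(mixed_ok)    # False
```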
Formatting
Different from textbook: we don’t use ‘%’ operator
Function str converts values to strings, e.g.
– str(23) yields the string '23'
– str(2.0 / 3.0) yields the string '0.666666666667'
– But no control over formatting
Function format(value, format_spec) formats the value as a string according to the given format specifier
– e.g. '6.2f' means “format a floating point number in a field of at least 6 characters with 2 digits after the decimal point”
– format(value) ≡ format(value, "") ≡ str(value)
Best to explain with examples ...
Slide # 56
Examples
DEMO
name = "Richard"
n = 20
print "Hi " + format(name, "s") + ", pleased to meet you"
print "n = " + format(n, "d")
# Above 2 lines could have used just 'str' equivalently
from math import pi
print pi
print "pi is " + format(pi, "5.3f")
print "pi is " + format(pi, "8.3f")
print "pi is " + format(pi, ".3f")
print "pi is " + format(pi, ".3g")
print "Tiny num is " + format(pi/100000, ".3g")
print format(name, "20")
print format(name, ">20")
print format(name, "^20")
Annotations: the “s” and “d” are conversion types; in "5.3f" the 5 is the minimum field width and the 3 is the precision.
Note: Conversion type can usually be omitted for strings and ints
Slide # 57
Example output
Hi Richard, pleased to meet you
n = 20
3.14159265359
pi is 3.142
pi is    3.142
pi is 3.142
pi is 3.14
Tiny num is 3.14e-05
Richard
             Richard
      Richard
• Many other capabilities e.g. binary, octal, hexadecimal, percentages, commas for thousands, arbitrary fill characters
• See http://docs.python.org/library/string.html#formatspec
Slide # 58
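Because leading and trailing spaces are hard to see in printed output, the sketch below wraps each formatted value in square brackets. It is an added illustration with made-up variable names, using the same format function and specifiers as the slides, and runs under both Python 2.7 and Python 3.

```python
from math import pi

fixed = format(pi, "8.3f")            # minimum width 8, 3 digits after the point
general = format(pi, ".3g")           # 3 significant digits
tiny = format(pi / 100000, ".3g")     # switches to exponent notation
right = format("Richard", ">20")      # right-justified in 20 characters
centred = format("Richard", "^20")    # centred in 20 characters

print("[" + fixed + "]")      # [   3.142]
print("[" + general + "]")    # [3.14]
print("[" + tiny + "]")       # [3.14e-05]
print("[" + right + "]")      # [             Richard]
print(len(centred))           # 20
```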
raw_input
One more useful function: raw_input([prompt])
DEMO
Displays prompt (if given) and reads a line from keyboard
Returns the line as a string, e.g.
– name = raw_input("What is your name? ")
If you want to read numbers, convert the string using int or float as appropriate, e.g.
– age_as_string = raw_input("How old are you? ")
– age = int(age_as_string)
Or
– weight = float(raw_input("What's your weight in kg? "))
Slide # 59
In class exercise
Write a program that prompts for a person’s weight in kg
and height in metres and prints their Body Mass Index
– BMI = weight / height² (in kg/m²)
Slide # 60
3. Modules
To control complexity, large programs are always broken into smaller modules
– Collections of functions (plus perhaps data)
One module imports the code and data from another
The Python library is a large collection of modules
Slide # 61
Importing the math module
DEMO
Can import the entire module and its namespace
    import math
    print math.pi              # An imported data value
    print math.sqrt(23.456)    # An imported function
Or import selected data/functions into current namespace
    from math import pi, sqrt
    print pi
    print sqrt(23.456)
Slide # 62
Finding what’s in a module
DEMO
1. Read the Python Standard Library documentation
– Accessed via Wing’s Help menu
– Or via http://www.python.org/doc/
2. In the shell window, import the module and type help(moduleName), e.g. help(math) or, for its directory, dir(math)
3. Google, e.g., python math module
4. For details on a particular function, can use the on-line help’s index or type help(moduleName.functionName)
Slide # 63
Interlude
A bit about namespaces
A namespace is just a dictionary of names
Consider:
my_age = 100
import math
We then have:
[Figure: the “Global” dictionary contains the names math and my_age. my_age points to the int object 100 in the object store (a.k.a. heap); math points to the math dictionary, whose names (sqrt, pi, etc.) point to the code for the sqrt function, the float 3.14159..., and so on. The global dictionary and the math dictionary are two different namespaces.]
Slide # 64
Local namespaces
DEMO
Consider the following program
x = 10

def blah():
    x = 20
    print "Within blah, x =", x

print "Initially, x =", x
blah()
print "Post-blah, x =", x
Function has its own local namespace.
Output is:
Initially, x = 10
Within blah, x = 20
Post-blah, x = 10
Slide # 65
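The same behaviour can be checked without print statements by having the function return its local value. This is an added sketch (before, inside and after are illustrative names) that runs under both Python 2.7 and Python 3.

```python
x = 10

def blah():
    x = 20          # creates a local x; the global x is untouched
    return x

before = x
inside = blah()
after = x

print(before)    # 10
print(inside)    # 20
print(after)     # 10
```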
Using globals
A function can “see” variables in the global namespace, e.g.
x = 10

def blah():
    print "Within blah, x =", x    # Prints 10!
But new variables created by assignment inside a function are added to the local namespace.
When evaluating expressions, Python looks first in local namespace, then in global namespace if name not found.
Slide # 66
Using globals (cont'd)
It’s illegal to reference a global variable from within a function and then to create a local one of the same name.
e.g., the following gives a runtime error:
x = 10

def blah():
    print x
    x = 20
Python interprets this as a local variable being used before it is defined.
– We say the scope of a variable is the entire function
Slide # 67
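The runtime error in question is an UnboundLocalError. The following sketch, added for illustration, catches it so that the program can carry on and show that the global x is unaffected; the try/except statement used to do so is not covered at this point in the course, and value is a hypothetical local name.

```python
x = 10

def blah():
    try:
        value = x       # x is local throughout blah because of the assignment below
    except UnboundLocalError:
        value = None    # the local x had not been given a value yet
    x = 20
    return value

print(blah())    # None
print(x)         # 10
```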
Assigning to a global
BUT can assign to a global variable using a global statement:
x = 10

def blah():
    global x
    print "In blah, x =", x
    x = 20

print "Initially, x =", x
blah()
print "Post-blah, x =", x
Output is:
Initially, x = 10
In blah, x = 10
Post-blah, x = 20
Slide # 68
Use of global variables in 121
A very simple rule: DON’T
i.e., don’t read from or write to global variables from within a function body
Interlude Ends
Slide # 69
Writing your own modules
A module is nothing more than a file of Python code, e.g. a file circle.py:
import math

def area(radius):
    return math.pi * radius**2

def circumference(radius):
    return 2 * math.pi * radius
Slide # 70
Using the circle module
Just import it and use it!
import circle
r = 5.0
area = circle.area(r)
circum = circle.circumference(r)
...
import x causes Python to load and execute the file x.py
– Must be in the current directory or on the Python search path
  o Don’t worry about the latter for now
Slide # 71
Documenting your modules
Triple-quoted strings at the start of modules and function bodies are docstrings, e.g. for circle.py
'''A module of functions related to circles.'''
import math

def area(radius):
    '''Returns the area of a circle given its radius.'''
    return math.pi * radius**2

def circumference(radius):
    '''Returns the circumference of a circle given its radius.'''
    return 2 * math.pi * radius
Slide # 72
Output of help(circle) is now:
Help on module circle:

NAME
    circle - A module of functions related to circles.

FILE
    h:\work\2011\121s1\lectures\circle.py

FUNCTIONS
    area(radius)
        Returns the area of a circle given its radius.

    circumference(radius)
        Returns the circumference of a circle given its radius.
Slide # 73
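A docstring is stored on the object it documents, which is where help finds it. As a small added sketch (not from the original slides), the text can be read back through the function's __doc__ attribute:

```python
import math

def area(radius):
    '''Returns the area of a circle given its radius.'''
    return math.pi * radius ** 2

print(area.__doc__)    # Returns the area of a circle given its radius.
```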
Style in COSC121
All modules should have:
– A docstring for the module
  o At the very start of the file, before any imports
– A docstring for each function
  o Immediately after the def line
– A blank line between the docstring and the function body
– Three blank lines between functions
Also, lines shouldn't be more than 80 characters long
– i.e., don’t cross that red line in Wing!
Slide # 74
More on import
When a module is imported, it is executed
– That’s when the function objects get defined.
– Also, any module globals (like math.pi) get defined then.
If a module has already been imported, nothing happens
When a module is imported, its __name__ variable is set to the name of the module
When a module is run, its __name__ is set to “__main__”
Slide # 75
Using “__main__”
It’s standard practice to include module test code at the end of a module
Execute it only if __name__ is “__main__”
But we haven’t done if statements yet!
– Textbook however introduces if statements at this point without explanation
– We'll have a quick advance peek
DEMO
Slide # 76
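As an advance peek, the pattern might look like the sketch below, based on the circle module from the earlier slides. The run_tests function is a hypothetical helper invented for this illustration; it is not part of the course material.

```python
'''A module of functions related to circles.'''

import math


def area(radius):
    '''Returns the area of a circle given its radius.'''

    return math.pi * radius ** 2



def run_tests():
    '''Hypothetical helper: runs a couple of quick checks on area.'''

    assert area(0) == 0
    assert abs(area(1.0) - math.pi) < 1e-12
    return True



if __name__ == "__main__":
    # Reached only when this file is run directly, not when it is imported.
    print(run_tests())
```

Importing this file from another program defines area and run_tests but skips the final block, because __name__ is then the module's name rather than "__main__".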
Week 3: Objects, methods and lists
Slide # 77
Objects
We’ve seen that everything in Python is an object
– int objects, float objects, str objects, function objects ...
An object contains data
– The value, for int and float objects
– The sequence of characters, for str objects
– The Python code, for function objects
But wait, there’s more ....
Slide # 78
Methods
Each type or class of objects has a set of functions that operate on objects of that type
These are called methods.
We call a method with the syntax
    objectName.methodName([argument]...)
This is roughly equivalent to
    functionName(objectName, [argument]...)
– i.e., calling a method of a particular object is like calling a (roughly equivalent) function that takes the object as its first parameter
– This will make sense much later in the course ☺
Slide # 79
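For str objects the "roughly equivalent" function can be reached through the type itself, which makes the correspondence visible. This is an added sketch with illustrative variable names; it runs under both Python 2.7 and Python 3.

```python
name = "fred"

via_method = name.upper()         # objectName.methodName()
via_function = str.upper(name)    # the same method called as a function of the type

print(via_method)                     # FRED
print(via_method == via_function)     # True
```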
Some string methods
DEMO
capitalize()
find(substring [, begin [, end]])
lower()
upper()
strip([chars_to_strip])
startswith(prefix [, start [, end]])    # Returns a boolean. See later.
endswith(suffix [, start [, end]])      # Returns a boolean. See later.
split([delimiter])                      # Returns a list – see later
format(value [, value]…)                # Format using a template
In this slide, square brackets denote optional parameters – you don’t actually type them.
Slide # 80
String method example program
• A program to read a full name like “natalie ng”, break it into its two components, correctly capitalise each one, and print a "Hi" message. Handles mixed case, e.g. “nATAlie nG”.
full_name = raw_input("Enter your full name: ")
pos_of_space = full_name.find(' ')
# Extract appropriate "slices" of the string. See later.
# [There are better ways, but this is probably the easiest at this stage.]
first_name = full_name[0:pos_of_space]
last_name = full_name[pos_of_space+1:]
corrected_first_name = first_name.capitalize()
corrected_last_name = last_name.capitalize()
print "Hi", corrected_first_name, corrected_last_name
Slide # 81
Formatting from a template
We've seen lots of string-formatting expressions like
s = "(" + format(x, ".3f") + ", " + format(y, ".3f") + ")"
Cumbersome!
The format method of a string achieves this much more easily:
template = "({0:.3f}, {1:.3f})"
s = template.format(x, y)
or just
s = "({0:.3f}, {1:.3f})".format(x, y)
DEMO
– The result is the template string with the replacement fields (in braces) replaced by the formatted argument values
– A replacement field is an argument index followed by an (optional) colon and a format specifier (as in the format function).
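A minimal worked example of the template idea. The values of x and y are made up, and print is called as a function so the snippet also runs under Python 3:

```python
x = 3.14159265
y = 2.0 / 3.0

# {0:.3f} means "argument 0, formatted as a float with 3 decimal places".
s = "({0:.3f}, {1:.3f})".format(x, y)
print(s)      # (3.142, 0.667)

# An argument index may be used more than once, and in any order.
echo = "{1} then {0} then {1}".format("a", "b")
print(echo)   # b then a then b
```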
Slide # 82
More formatting examples
Reference: http://docs.python.org/library/string.html#formatstrings
DEMO
import math
first_name = "Lucy"
last_name = "Languid"
third = 1.0 / 3.0
print "{0} {1}".format(first_name, last_name)
print "{} {}".format(first_name, last_name)
print "One third to 4 dec places is {:.4f}".format(third)
print "  n   sqrt(n)"
for i in range(100): # Laying out a table in columns
    print "{0:3}{1:10.5f}".format(i, math.sqrt(i))
We have all the same options as before for formatting each argument
– Binary, octal, hex, left/centre/right justification, general numeric format, etc.
Slide # 83
Methods of other object types
All objects have methods
– The object’s type (or “class”) determines the set of methods
Even int and float objects
– But mostly their methods are for use internally by the Python engine, e.g. if i and j are ints:
  o i.__add__(j) is exactly equivalent to i + j
  o __add__ is the “add this int to another” method
Slide # 84
Finding the available methods
Read the documentation (e.g. via help in Wing), or
Use dir and/or help in the shell, e.g.
s = "blah"      # Create an object of the type we're interested in
dir(s)          # A "directory listing" of the methods of s
help(s.index)   # Help on a particular method
help(type(s))   # Help on the entire string type (aka "class")
Slide # 85
Another example: images
DEMO
The Python Imaging Library (PIL) does image processing
It contains a set of submodules
– Must explicitly import the one(s) we want
Image submodule does reading of images, resizing, cropping, rotating, pixel editing, various transformations, etc. For example:
from PIL import Image                 # Get the Image submodule
my_image = Image.open("photo.jpg")
new_image = my_image.rotate(90)       # Rotate 90 degrees
new_image.save("rotated_photo.jpg")   # Save new image
Slide # 86
Lists
Many/most computer programs handle collections of data
– a list of students, a sequence of temperature samples, an array of image pixels, a set of university courses, a table of measurements ...
Most such collections can be represented in Python by its list data type.
A list is a sequence of objects that can be processed sequentially.
The Python list also allows immediate access to any element by subscripting, e.g. marks[i] for the ith mark
– In maths notation, we’d write this as marks with subscript i, i.e. $marks_i$
– So a Python list is both a list and an array
Slide # 87
Some examples of lists
days_in_month = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
guys = ["Freddie", "Brian", "Roger", "John"]
colours = ["Red", "Green", "Blue"]
squares = [0, 1, 4, 9, 16, 25, 36, 49]
square_roots = [1.0, 1.4142135623, 1.732050807, 2.0, 2.236067977]
great_thoughts_of_george_bush = [] # The empty list
personal_details = ["Erika Mandelbrot", 27, "5 Nowhere St, Christchurch"]
– personal_details is a list of objects of different types. Legal in Python but bad style.
– We'll see better ways of representing such "records" later.
Slide # 88
Indexing into lists
To use lists we need to be able to get at the individual
elements
Do by indexing, e.g.
print days_in_month[0]            # Prints 31
print colours[2]                  # Prints "Blue"
NB: subscripts start at 0!!
print squares[len(squares) - 1]   # Prints 49
  o len function returns the number of items in a list
print squares[-1]                 # Also prints 49
  o If subscript is negative, Python adds len(list) to it
print squares[-2]                 # Prints 36
Slide # 89
How lists are represented
names = ["Fred", "Mary", "ChinMay"] results in:
[Diagram: the dictionary entry names refers to a list object, whose three elements refer to the string objects "Fred", "Mary" and "ChinMay" in the object store.]
The list object itself is just a list of references to the objects in the list.
– This is important – see aliasing slide shortly
Slide # 90
Changing list elements
names = ["Fred", "Mary", "ChinMay"]
names[1] = "Alex"
results in:
[Diagram: names still refers to the same list object. Its element 1 now refers to a new string object "Alex" in the object store; the old "Mary" object is defunct.]
The list element is changed – we don't get a new list
We say list objects are mutable (= "changeable")
Slide # 91
List slicing
Often want sublists rather than individual items
Done by extended indexing of the form "start:end+1", i.e. the second subscript is one more than the index of the last item wanted
– Missing first subscript defaults to 0
– Missing second subscript defaults to len(list)
Called "slicing"
Examples:
– print squares[2:4] # Prints "[4, 9]"
  o Note that slice is up to but not including the second subscript
– print squares[:4] # Prints "[0, 1, 4, 9]"
– print squares[3:] # Prints "[9, 16, 25, 36, 49]"
Slide # 92
Assigning to slices
my_list[start:end] = another_list
replaces the elements my_list[start] up to but not including my_list[end] with the elements from another_list
Example:
my_list = [1, 3, 5, 7, 9, 11]
my_list[2:4] = [-3, -9, -11, -13]
print my_list
prints [1, 3, -3, -9, -11, -13, 9, 11]
Can do insertion too (but insert method easier to read?):
my_list = [1, 3, 5]
my_list[1:1] = [-3, -9] # my_list is now [1, -3, -9, 3, 5]
Slide # 93
List operators
list1 + list2 is a list of all the elements from list1 followed by all the elements from list2
– Called concatenation
– e.g. [1, 2, 3] + [7, 8] is [1, 2, 3, 7, 8]
my_list * n or n * my_list, where n is an int, is a new list containing n repetitions of the sequence of items in my_list
– e.g. 3 * ['Max', 'Amy'] is ['Max', 'Amy', 'Max', 'Amy', 'Max', 'Amy']
object in list evaluates to True if the object is in the list
– e.g. 3 in [1,3,5] is True, 2 in [1,3,5] is False
  o You're not meant to understand this yet. It's here for completeness!
Slide # 94
List functions
len(my_list) is the length of my_list
– e.g., print len([1,2,3]) prints 3
sum(my_list) sums the elements of my_list
– e.g., print sum([1,2,3]) prints 6
– List items must be numeric
  o Can’t do string concatenation this way
min(my_list) and max(my_list) return min and max elements in a numeric list
– e.g. max([-3, 13, 5]) is 13
Slide # 95
List methods
If L is a list:
L.append(object)          # Adds object to end of L. Returns None
  o None is a special object used to signify “no answer”
L.count(value)            # Returns count of items in L equal to value
L.extend(L2)              # Appends all the items from L2 onto L. Returns None
L.index(value)            # Returns the index of the first occurrence of value in L
  o Gives an error if value not found
L.insert(index, object)   # Insert object into L before index. Returns None
L.pop([index])            # Remove and return object at index (defaults to last)
L.remove(value)           # Remove first occurrence of value. Returns None
L.reverse()               # Reverse list L. Returns None
L.sort()                  # Sorts L in ascending order. Returns None
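The list methods above can be traced on a small made-up list. This is a sketch only, with print called as a function so it also runs under Python 3. The last lines show why "Returns None" matters: writing marks = marks.reverse() would throw the list away.

```python
marks = [55, 72, 38]
marks.append(91)          # [55, 72, 38, 91]
marks.extend([72, 60])    # [55, 72, 38, 91, 72, 60]
print(marks.count(72))    # 2
print(marks.index(38))    # 2
marks.insert(1, 100)      # [55, 100, 72, 38, 91, 72, 60]
last = marks.pop()        # Removes and returns 60
marks.remove(72)          # Removes the first 72: [55, 100, 38, 91, 72]
marks.sort()              # [38, 55, 72, 91, 100]

# Methods that change the list in place return None, not the list.
result = marks.reverse()
print(result)             # None
print(marks)              # [100, 91, 72, 55, 38]
```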
Slide # 96
A trap!
What will the following output?
names = ["Fred", "Mary", "ChinMay"]
other_names = names
names.append("Angus")
print "Names: ", names
print "Other names:", other_names
Answer:
Names: ['Fred', 'Mary', 'ChinMay', 'Angus']
Other names: ['Fred', 'Mary', 'ChinMay', 'Angus']
Both lists were altered!
Slide # 97
Why it happened: aliasing
Assignment of one object to another just copies the
reference.
So after other_names = names we have:
[Diagram: the dictionary entries names and other_names both refer to the same list object, whose elements refer to "Fred", "Mary" and "ChinMay" in the object store.]
So names and other_names are just aliases for the same object. Whenever one changes, the other changes too.
– Also see http://people.csail.mit.edu/pgbovine/python/tutor.html#mode=visualize
Slide # 98
Avoiding aliasing problems
Be wary of assignments of the form a = b when b is a mutable object, i.e., one whose value can be changed, as any changes will apply to all aliases.
– Not a problem with ints, floats, strings, tuples as they’re all immutable.
If you want to make a copy of a list, use slicing, e.g.,
other_names = names[:]
– This constructs a new list containing copies of all the references. Called a shallow copy.
  o There can still be aliasing problems if the referenced objects are mutable but we won't worry about that for now!
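A short sketch contrasting an alias with a shallow copy, and showing the remaining pitfall when the elements are themselves mutable. The variable names alias, copy_of_names, table and shallow are made up for illustration, and print is called as a function so the snippet also runs under Python 3:

```python
names = ["Fred", "Mary", "ChinMay"]
alias = names             # A second name for the SAME list object
copy_of_names = names[:]  # A new list holding copies of the references

names.append("Angus")
print(alias)              # ['Fred', 'Mary', 'ChinMay', 'Angus']
print(copy_of_names)      # ['Fred', 'Mary', 'ChinMay']
print(alias is names)           # True: same object
print(copy_of_names is names)   # False: different object

# A shallow copy still shares any mutable elements, e.g. nested lists.
table = [[1, 2], [3, 4]]
shallow = table[:]
table[0].append(99)
print(shallow[0])         # [1, 2, 99]
```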
Slide # 99
List processing: the for statement
for variable in list:
    block
Sequentially performs the given statement block for each element in list, with variable bound to that element.
– Also called a for loop or a for each loop
  o "for each" emphasises that the loop executes “for each” value in the list
Example:
for i in [0, 1, 2, 3, 4, 5, 6]:
    i_sqr = i * i
    print "{} squared = {:2}".format(i, i_sqr)
Slide # 100
The for loop: example 2
Often we want to accumulate new data as we traverse the
list, e.g. totalling items or building a new list.
numbers = [0, 1, 2, 3, 4, 5, 6]
squares = []
sum_squares = 0
for i in numbers:
    i_sqr = i * i
    squares.append(i_sqr)
    sum_squares += i_sqr
print "List of squares: ", squares
print "Sum of squares = ", sum_squares
Slide # 101
Nested lists
A list item can be any object, including another list
Useful when representing tabular data, e.g. 6 hourly rainfall totals (mm) for a given week

Day number    6 am    noon    6pm    Midnight
    0           0       9      3        7
    1          11       9      0        0
    2           0      10     12       20
    3           0       0      0        0
    4           1       3      4        1
    5           2       8     10        0
    6           0       0      0        0
Slide # 102
Python representation
Use a list of rows, each row being a list of rainfall values:
rainfalls = [ [0, 9, 3, 7], [11, 9, 0, 0], [0, 10, 12, 20],
              [0, 0, 0, 0], [1, 3, 4, 1], [2, 8, 10, 0], [0, 0, 0, 0] ]
[Diagram: the dictionary entry rainfalls refers to a list object whose elements refer to the row lists [0, 9, 3, 7], [11, 9, 0, 0], etc.]
rainfalls[day_num][column_num] selects a particular int value
Slide # 103
Nested loops
Nested lists are often processed with nested loops, e.g. to
print the previous table:
print "Rainfall data"
for day in [0, 1, 2, 3, 4, 5, 6]:
    print "Day", day, ": ",
    for column in [0, 1, 2, 3]:
        print rainfalls[day][column],
    print # Prints just a newline
Tip: the range function would help here
– range(low, high) is the list [low, low+1, low+2, ... high-1]
– range(high) is the list [0, 1, 2, ... high-1]
Slide # 104
Sequences
Lists are an example of a more basic Python type: the sequence
Sequences can be:
– Indexed
– Iterated over with for loops
Strings are also sequences. Thus:
name = "Fred"
print name[1]       # Prints 'r'
print name[1:3]     # Prints "re"
for char in name:
    print char      # Prints characters in name, one per line
But unlike lists, strings are immutable
– So can’t assign to elements or slices
Slide # 105
Another sequence type: the tuple
Tuples are like lists, but use parentheses instead of brackets
point = (10, 20)
person = ("Fred", "Bloggs", 27, "15 Memorial Drive", 8053)
empty = () # The empty tuple
singleton = ("Fred",) # Note weird syntax – that comma is essential
Differences:
– Tuples are immutable
– Have only two methods: count and index
Slide # 106
Should I use a tuple or a list?
Use lists when:
– Elements are all of the same type, and
– It makes sense to think of the elements as a list
Use tuples in any of the following situations:
1. You’re dealing with inhomogeneous data
   o e.g. person has a name, address, age, postcode
2. You think of the elements as being parts of a single object
   o But for non-trivial objects, classes are usually better
     – Covered towards end of course
3. You need or want immutability
   o e.g., as keys for a dictionary – see later.
Slide # 107
The list constructor
Can make a list of the elements in any sequence using the built-in list function, e.g.
– list("Angus") returns ['A', 'n', 'g', 'u', 's']
– list((1,2,3)) returns [1, 2, 3]
list is actually a special function called a constructor.
– See Object Oriented Programming much later
It’s best to avoid using the name list for one of your own variables as you then can’t use the list function any more.
– Names like str, len, list etc are not reserved or protected.
  o i.e., there’s nothing to stop you from clobbering them!
Slide # 108
Parallel assignment
Although we can index tuples, it’s not normally intuitive
– They’re used when we don't think of the elements as being in a list
Better to extract their components using parallel assignment
person = ("Fred", "Bloggs", 27, "15 Memorial Drive", 8053)
...
(first_name, last_name, age, address, post_code) = person
Useful e.g. in a function that takes a tuple as a parameter
– Say a point or a person
– RHS can be any sequence of the same length as LHS
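A sketch of parallel assignment inside a function that takes a tuple. The function name describe and its output layout are invented for illustration, and print is called as a function so the snippet also runs under Python 3:

```python
def describe(person):
    '''Return a one-line description of a (first, last, age, address, postcode) tuple.'''
    (first_name, last_name, age, address, post_code) = person
    return "{} {} ({}), {} {}".format(first_name, last_name, age, address, post_code)

person = ("Fred", "Bloggs", 27, "15 Memorial Drive", 8053)
print(describe(person))   # Fred Bloggs (27), 15 Memorial Drive 8053

# The right-hand side can be any sequence of the right length, e.g. a list.
(x, y) = [10, 20]
print(x + y)              # 30
```

The named variables make the body of describe much easier to read than person[0], person[1], and so on.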
Slide # 109
Processing image pixels
An image is a 2D array of pixels
– Each pixel is a (red, green, blue) 3-tuple -- an RGB value
PIL.Image.getdata() delivers all the pixels as a list of tuples
– Actually it’s a sequence, but you can mostly treat it like a list
Slide # 110
Example: redden.py
DEMO
'''Replicate the functionality of sunset.py from the textbook with PIL'''
from PIL import Image
def redder(pixel):
    '''Return a redder version of the given pixel'''
    (red, green, blue) = pixel
    return (red, int(0.7 * green), int(0.7 * blue))
imageFile = open('pic207.jpg', 'rb')
pic = Image.open(imageFile)
pixels = pic.getdata()
print type(pixels)
new_pixels = []
for pixel in pixels:
    new_pixels.append(redder(pixel))
pic.putdata(new_pixels)   # Replace pixel sequence with our new one
pic.save('redder.jpg')    # Save to a new file
Slide # 111
Files as sequences
Built-in function open(path [,mode]) opens a file
– path is a filepath string, e.g. "H:/121/junk.txt"
  o Python allows forward slashes instead of Windows' backslashes
  o If you want backslashes, you must escape them, e.g. "H:\\121\\junk.txt"
– without slashes it’s the name of a file in the current directory
  o The one the running program was saved in
– mode is "r" (default), "w" or "a" to read, write or append resp.
A file object, opened for reading, is a sequence of lines.
data = open("junk.txt")   # Default is open for reading
for line in data:         # Processes file line by line
    print line[0:-1]      # Print the line without its final \n char
Slide # 112
Files as lists of lines
Previous program could also be written
data = open("junk.txt")
lines = data.readlines()   # Get a list of all the lines in the file
for line in lines:         # Processes file line by line
    print line[0:-1]       # Print the line without its final \n char
Advantages:
– More explicit about what it’s doing (?)
– Can access lines in arbitrary order, back up in file, etc
Disadvantages:
– Need to have enough memory to hold the whole file
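A self-contained sketch of both approaches. So that it can run anywhere, it first writes a small file into a temporary directory; the tempfile and os calls, the file contents and the variable names are additions for illustration, not part of the slides. print is called as a function so the snippet also runs under Python 3.

```python
import os
import tempfile

# Create a small two-line file to read back (illustrative content only).
path = os.path.join(tempfile.mkdtemp(), "junk.txt")
out = open(path, "w")
out.write("first line\nsecond line\n")
out.close()

# Approach 1: treat the file object as a sequence of lines.
stripped_lines = []
data = open(path)
for line in data:
    stripped_lines.append(line[0:-1])   # Drop the final \n char
data.close()
print(stripped_lines)    # ['first line', 'second line']

# Approach 2: read all the lines into a list first.
data = open(path)
lines = data.readlines()
data.close()
print(lines[1])          # Any line can now be accessed by index
```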
Slide # 113
Splitting and stripping lines
line.strip([chars_to_strip]) removes characters in chars_to_strip from the front and end of line
– Default is to strip white space (newlines, tabs, spaces)
– e.g. " hi there mate! \n".strip() is "hi there mate!"
line.split([separator]) returns a list of strings obtained by splitting the line into substrings around the given separator string or around whitespace by default
– e.g. " hi there mate! \n".split() is ["hi", "there", "mate!"]
– Useful when breaking data into numbers, words, etc or processing tabular data like ".csv" (comma-separated values) files.
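A sketch of strip and split working together on one line of comma-separated data. The line contents and the names town and values are made up, and print is called as a function so the snippet also runs under Python 3:

```python
line = "Christchurch, 12.5, 3.0, 0.0\n"

fields = line.strip().split(",")
print(fields)     # ['Christchurch', ' 12.5', ' 3.0', ' 0.0']

town = fields[0]
values = []
for field in fields[1:]:
    values.append(float(field.strip()))   # Strip the leading space, then convert

print(town)       # Christchurch
print(values)     # [12.5, 3.0, 0.0]
```

Note that splitting on "," leaves the spaces after the commas in place, which is why each field is stripped again before conversion.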
Slide # 114
List comprehensions
A quick preview. Not in the text and not examined but good to know ☺
Often want a new list in which each element is computed from the elements of an existing list.
– e.g. the squares of a sequence of integers
– the lengths of all the words in a file
Syntax: [expression for variable in sequence]
Yields a list of the values of the given expression (which usually involves variable) for each value of the variable in the sequence.
Slide # 115
List comprehensions (cont'd)
Examples:
– [x * x for x in range(10)]
  o [0, 1, 4, 9, 16, ... 81]
– [len(word) for word in "this is a sentence".split()]
  o [4, 2, 1, 8]
– max([len(line) for line in open("poem.txt")])
  o The length of the longest line in the file poem.txt (including the newline char)
Can also have an if clause to “filter” in/out wanted/unwanted elements
– [x * x for x in range(n) if x % 2 == 1]
  o The squares of all the odd integers less than n
  o Don't worry about this until we've done if statements and even then don’t worry about it! It won’t be examined.
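To connect comprehensions with the accumulating for loops seen earlier, here is a sketch showing that both build the same list. The variable names are made up, and print is called as a function so the snippet also runs under Python 3:

```python
numbers = list(range(10))

# The accumulate-in-a-loop version...
squares_loop = []
for x in numbers:
    squares_loop.append(x * x)

# ...and the equivalent list comprehension.
squares_comp = [x * x for x in numbers]
print(squares_loop == squares_comp)   # True

# With an if clause acting as a filter.
odd_squares = [x * x for x in numbers if x % 2 == 1]
print(odd_squares)                    # [1, 9, 25, 49, 81]

word_lengths = [len(word) for word in "this is a sentence".split()]
print(word_lengths)                   # [4, 2, 1, 8]
```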
Slide # 116
Week 4: Conditionals
Slide # 117
The boolean type (‘bool’)
A boolean expression evaluates to either True or False.
A boolean variable is either True or False
Get booleans by:
1. Testing relationships using relational operators
   <, <=, >, >=, ==, != (or <>), is [not], [not] in
   o NB: Equality testing is done with "==", not "=" (which is assignment).
   o is [not] tests equality of identity. Rarely useful.
2. Calling functions or methods that return booleans, e.g. startswith, endswith
3. Combining booleans with logical operators, e.g. and, or, not
Slide # 118
Examples
(in practice, at least one operand would be a variable)

Expression                 Result   Comment
10 > 5                     True
10 <= 5                    False    <= is “less than or equal to”
2 + 3 == 5                 True     Note ‘==’ rather than ‘=’
"a" < "apples"             True     Strings are compared char-by-char, ...
"ABC" != "abc"             True     ... using their numeric encoding. Upper case
"Zack" < "alan"            True     ... chars come before lower case chars
10 < "Fred"                True     Meaningless! Don’t do this.
"red" in "Fred"            True
"x" not in "Fred"          True     Same as not ("x" in "Fred")
not (2 > 5 or 3 < 5)       False
"Gonk".startswith("Go")    True
"Gonk".endswith("NK")      False    Lower-case != upper case
Slide # 119
Operator precedence
High precedence
**
*, /, %
+, -
Shifts and bitwise operations (not in 121)
All relational operators (all same precedence)
not
and
or
Low precedence
Slide # 120
Chaining relational operators
Suppose we want to check if an int i is in the range low to high inclusive
In most languages we’d write (i >= low) and (i <= high)
Python allows the shorthand: low <= i <= high
But use this operator chaining only in the usual mathematical ways or it might surprise you, e.g.:
– 5 < 10 == False evaluates to False
– 5 < 10 == True also evaluates to False!
– Reason:
  o they’re shorthands for (5 < 10) and (10 == False), and (5 < 10) and (10 == True), respectively
  o Objects of different types (int and bool) usually test unequal (but don’t do it!)
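The chaining rule can be checked directly. This is a sketch with made-up values for low, high and i, and print is called as a function so it also runs under Python 3:

```python
low = 1
high = 10
i = 5

in_range_chain = low <= i <= high             # The Python shorthand
in_range_long = (i >= low) and (i <= high)    # The longhand form
print(in_range_chain == in_range_long)        # True

# Chaining applies to ALL relational operators, including ==.
surprise = 5 < 10 == True      # Means (5 < 10) and (10 == True)
explicit = (5 < 10) == True    # Parentheses force the intended meaning
print(surprise)                # False
print(explicit)                # True
```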
Slide # 121
Boolean operators on non-bools
Python allows non-bool operands with Boolean operators
– I wish it didn’t!
All non-bool objects are treated as True except the following, which are all treated as False:
– Numeric zeroes: 0, 0.0, (0 + 0j)
– None
– Empty strings
– Empty containers (lists, tuples, dictionaries etc – see later)
PLEASE DON'T USE THIS FACT!
– Use boolean operators on bools ONLY
Slide # 122
Lazy (“short circuit”) evaluation
a or b returns True if a is True. Evaluates b only if a is False.
a and b returns False if a is False. Evaluates b only if a is True.
Called lazy evaluation.
Can be important if b has side effects or might generate an error, e.g.
names = ["Fred", "Alice"]
...
len(names) > 2 and names[2] == "Alan"   # Is False
names[2] == "Alan" and len(names) > 2   # Throws an error
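A runnable sketch of the two orderings. The try/except statement has not been introduced at this point in the course; it is used here only so that the snippet can record the error and carry on. print is called as a function so it also runs under Python 3.

```python
names = ["Fred", "Alice"]

# The length test comes first, so names[2] is never evaluated.
safe = len(names) > 2 and names[2] == "Alan"
print(safe)       # False

# With the operands swapped, names[2] is evaluated first and fails.
try:
    unsafe = names[2] == "Alan" and len(names) > 2
except IndexError:
    unsafe = "IndexError"
print(unsafe)     # IndexError
```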
Slide # 123
if statements
Syntax (the most general form):
if bool_expression:
    block
elif bool_expression:    # elif parts are optional: zero or more
    block
elif bool_expression:
    block
...
else:                    # else part is optional
    block
A block is an indented sequence of statements (as in function defs and for loops)
Slide # 124
A basic if statement
substance = raw_input("What substance? ")
ph = float(raw_input("Enter the measured pH: "))
if ph < 7.0:
    print substance + " is acidic"
[Flow chart: Input ph, then test ph < 7? Yes: print message. No: skip the message.]
Slide # 125
Multi-line blocks
As in function definitions, blocks can have multiple lines:
substance = raw_input("What substance? ")
ph = float(raw_input("Enter the measured pH: "))
if ph < 7.0:
    print substance + " is acidic"
    print "Be careful with that!"
[Flow chart: Input ph, then test ph < 7? Yes: code for acidic case. No: skip it.]
Slide # 126
The else part
substance = raw_input("What substance? ")
ph = float(raw_input("Enter the measured pH: "))
if ph < 7.0:
    print substance + " is acidic"
    print "Be careful with that!"
else:
    print substance + " is not acidic"
    print "But that doesn't mean it's safe!"
[Flow chart: Input ph, then test ph < 7? Yes: code for acidic case. No: code for non-acidic case.]
Slide # 127
Using elifs
substance = raw_input("What substance? ")
ph = float(raw_input("Enter the measured pH: "))
if ph < 7.0:
    print substance + " is acidic"
    print "Be careful with that!"
elif ph == 7.0:
    print substance + " is neutral"
else:
    print substance + " is basic"
    print "It might be caustic!"
Slide # 128
Flow chart
[Flow chart: Input ph, then test ph < 7? Yes: code for acidic case. No: test ph == 7? Yes: code for neutral case. No: code for basic case.]
It is clear that only one of the three blocks can be executed.
– i.e., cases are all mutually exclusive.
Slide # 129
A variant using only basic ifs
Could write the preceding program as:
substance = raw_input("What substance? ")
ph = float(raw_input("Enter the measured pH: "))
if ph < 7.0:
    print substance + " is acidic"
    print "Be careful with that!"
if ph == 7.0:
    print substance + " is neutral"
if ph > 7.0:
    print substance + " is basic"
    print "It might be caustic!"
Slide # 130
Flow chart
[Flow chart: test ph < 7? Yes: code for acidic case. Then test ph == 7? Yes: code for neutral case. Then test ph > 7? Yes: code for basic case. Each test is made in turn, whatever the result of the previous one.]
Advantages of this version
– More flexible
– Explicit about when each block executes
Disadvantages
– Doesn’t make the mutual exclusion obvious
– Slightly less efficient
  o Rarely relevant
Slide # 131
The perils of floating point equality
What will the following code print?
a = 3.0 * 0.1 / 3.0
if a == 0.1:
    print "I should think so too"
else:
    print "Huh?"
Or, generalising that:
print [i for i in range(1,100) if i * 0.1 / i != 0.1]
Rule of thumb:
– Never compare floats for (in)equality.
Floats are approximate so equality cannot be guaranteed.
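One common alternative is to test whether two floats are "close enough". The helper nearly_equal and its default tolerance of 1e-9 are assumptions made for this sketch, not something from the slides; a sensible tolerance depends on the problem. print is called as a function so the snippet also runs under Python 3.

```python
def nearly_equal(x, y, tolerance=1e-9):
    '''Return True if x and y differ by no more than tolerance.'''
    return abs(x - y) <= tolerance

a = 3.0 * 0.1 / 3.0
exact_match = (a == 0.1)            # Exact comparison: unreliable
close_match = nearly_equal(a, 0.1)  # Comparison within a tolerance

print(exact_match)    # False with the usual 64-bit floats
print(close_match)    # True
```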
Slide # 132
Nested ifs
The conditional code blocks can contain any type of statement including if statements
– Called nested ifs
Example:
– Write a program to solve a quadratic equation $ax^2 + bx + c = 0$
– Solution is: $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
– Program should prompt for a, b and c
– Print “Not a quadratic” if a is 0, otherwise:
  o Print “Roots are ... , ...” (4 digit accuracy) if roots are real
  o Print “Roots are imaginary” otherwise.
Slide # 133
Pseudocode
When writing programs we often draft the general algorithm we’ll use in pseudocode
– Shows the main code blocks, if statements and loops
– Omits much of the coding details
Input a, b and c
if a is 0:
    print "Not a quadratic"
else:
    Compute the discriminant $b^2 - 4ac$
    if discriminant is non-negative:    # A nested if. Note increased indentation level.
        Compute and print roots
    else:
        print "Roots are imaginary"
Slide # 134
Actual code
In-class exercise.
Slide # 135
Simplifying complex logic
Programs must be as simple and readable as possible.
Avoid complex logic by:
1. Simplifying logical expressions
   o e.g. de Morgan’s theorem
2. Flattening nested code
3. Introducing temporary boolean variables
4. Writing boolean-valued functions
Slide # 136
de Morgan’s Theorem
A basic theorem of boolean algebra, useful in computing
not (a and b) = (not a) or (not b)
not (a or b) = (not a) and (not b)
Examples
– “I’m not going if it’s raining or I’m feeling tired” = “I’m going if it’s not raining and I’m not feeling tired”
if not(is_raining or is_tired):
    go()
≡
if is_not_raining and is_not_tired:
    go()
Slide # 137
de Morgan’s Theorem (cont'd)
“Unless a is zero or the discriminant is negative, print the roots” = “if a is non-zero and the discriminant is non-negative, print the roots”.
if not(a == 0 or discriminant < 0):
    print_roots(a, b, c)
≡
if a != 0 and discriminant >= 0:    # Much more readable
    print_roots(a, b, c)
Slide # 138
Interlude: truth tables
Since boolean values are either True or False, it’s easy to make a table of the value of some boolean expression for all possible parameter values.
– Called a truth table
For example, the truth table for not (a or b) is

a       b       a or b    not (a or b)
False   False
False   True
True    False
True    True

(Last two columns: do in lectures)
Slide # 139
Truth tables (cont'd)
Similarly for (not a) and (not b)

a       b       not a    not b    not a and not b
False   False
False   True
True    False
True    True

(Last three columns: do in lectures)
Last column same as previous table. This proves one of the two de Morgan’s laws.
You do the other one.
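Truth tables can also be generated by program. The following sketch loops over every combination of a and b and checks both of de Morgan's laws. The variable names and the column widths are choices made for this illustration, and print is called as a function so the snippet also runs under Python 3.

```python
rows = []
for a in [False, True]:
    for b in [False, True]:
        law1_left = not (a or b)
        law1_right = (not a) and (not b)
        law2_left = not (a and b)
        law2_right = (not a) or (not b)
        # Record whether the two sides of each law agree for this row.
        rows.append((a, b, law1_left == law1_right, law2_left == law2_right))
        # One table row: a, b, not (a or b), not (a and b)
        print("{!s:6}{!s:6}{!s:6}{!s:6}".format(a, b, law1_left, law2_left))

laws_hold = True
for (a, b, law1_ok, law2_ok) in rows:
    if not (law1_ok and law2_ok):
        laws_hold = False
print(laws_hold)   # True
```

Checking all four rows is a complete proof here, because a and b have no other possible values.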
Slide # 140
Exercise (text 6.5 Q)
You want an automatic wildlife camera to switch on if the light level is less than 0.01 or if the temperature is above freezing, but not if both conditions are true. Your first attempt to write this is:
if (light < 0.01) or (temperature > 0.0):
    if (light < 0.01) and (temperature > 0.0):
        pass
    else:
        camera.on()
A friend says that this is an exclusive or and that you could write it more simply as:
if (light < 0.01) != (temperature > 0.0):
    camera.on()
Prove whether your friend is right or wrong (hint: truth table).
Slide # 141
Flattening nested code
The previous exercise shows an example where nested ifs can be replaced with a single if.
Avoiding nesting usually makes code more readable.
Example, the quadratic pseudocode earlier could be rewritten as:
Input a, b and c
Compute the discriminant $b^2 - 4ac$
if a is 0:
    print "Not a quadratic"
elif discriminant is negative:
    print "Roots are imaginary"
else:
    compute and print roots
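One possible translation of the flattened pseudocode into Python, offered as a sketch rather than as the model answer to the in-class exercise. To keep it testable it is written as a function, solve_quadratic (a made-up name), that returns the message instead of prompting and printing. "4 digit accuracy" is interpreted here as four decimal places, which is an assumption.

```python
import math

def solve_quadratic(a, b, c):
    '''Return a message describing the roots of ax^2 + bx + c = 0.'''
    discriminant = b * b - 4 * a * c
    if a == 0:
        return "Not a quadratic"
    elif discriminant < 0:
        return "Roots are imaginary"
    else:
        root_of_disc = math.sqrt(discriminant)
        root1 = (-b + root_of_disc) / (2.0 * a)
        root2 = (-b - root_of_disc) / (2.0 * a)
        return "Roots are {:.4f}, {:.4f}".format(root1, root2)

print(solve_quadratic(1, -3, 2))   # Roots are 2.0000, 1.0000
print(solve_quadratic(0, 2, 1))    # Not a quadratic
print(solve_quadratic(1, 0, 1))    # Roots are imaginary
```

A full program would read a, b and c with raw_input and print the returned message.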
Slide # 142
Temporary boolean variables
Often code is more readable if we give names to boolean conditions, e.g.
is_low_light = light < 0.01
is_high_temp = temperature > 0.0
if (is_low_light or is_high_temp) and not (is_low_light and is_high_temp):
    camera.on()
We’ve avoided nesting without the “tricky” (?) exclusive-or code.
Style convention: use names beginning with is_ for such boolean “flag” values.
Slide # 143
Boolean valued functions
Another way of giving names to boolean conditions is to wrap them in functions, e.g.:
def either_but_not_both(condition1, condition2):
    '''Return true if either of condition1 or condition2 is true but not both.'''
    return (condition1 or condition2) and not (condition1 and condition2)
if either_but_not_both(light < 0.01, temperature > 0.0):
    camera.on()
either_but_not_both is a verbose way to provide exclusive or (cf condition1 != condition2) but more readable to most people (?)
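A brute-force check that either_but_not_both agrees with != for every pair of bool arguments. This is a sketch; the list name mismatches is made up, and print is called as a function so the snippet also runs under Python 3.

```python
def either_but_not_both(condition1, condition2):
    '''Return true if either of condition1 or condition2 is true but not both.'''
    return (condition1 or condition2) and not (condition1 and condition2)

# Collect any pair of bools on which the two forms disagree.
mismatches = []
for c1 in [False, True]:
    for c2 in [False, True]:
        if either_but_not_both(c1, c2) != (c1 != c2):
            mismatches.append((c1, c2))

print(mismatches)   # []: the two forms agree on all four combinations
```

The agreement is only guaranteed when both arguments really are bools, which is one more reason to use boolean operators on bools only.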
Slide # 144
Week 5: Repetition
Slide # 145
"for each" loops (reviewed)
The basic for statement (aka "for each" loop) iterates over elements in a sequence
– List, string, tuple, file ...
E.g.:
for num in [10, 20, 30, 40]:
    print num
The loop control variable (num in this example) is bound to each list element in turn
[Flow chart: test "Done all elements?" Yes: leave the loop. No: num = next list element, print num, then repeat the test.]
Slide # 146
List → List
Often need to modify list information, e.g., scale up all the marks in a list of marks by 10%.
Two approaches:
– Generate a new list with a for each loop
– Modify the existing list "in place"
Slide # 147
A new list via "for each"
marks = [55.3, 37.4, 85.2, ...] # (or more likely read from file)
scale_factor = 1.1
scaled_marks = []
for mark in marks:
    scaled_marks.append(mark * scale_factor)
print scaled_marks
Slide # 148
Modifying the existing list
Try the "obvious" approach:
marks = [55.3, 37.4, 85.2, ...]
scale_factor = 1.1
for mark in marks:
    mark = mark * scale_factor
print marks # UNCHANGED!
Slide # 149
Why it fails
The assignment mark = mark * scale_factor binds mark to a new object
Doesn't alter the existing object
e.g., on first time through the loop:
[Diagram: before the assignment, mark refers to the same 55.3 object as marks[0]. After the assignment, mark refers to a new object holding the scaled value, while marks[0] still refers to 55.3.]
Slide # 150
What we actually want:
[Diagram: before the assignment, marks[0] refers to the 55.3 object. After the assignment, marks[0] itself refers to a new object holding the scaled value.]
i.e., we want to assign new values to marks[0], marks[1], ...
Slide # 151
Modifying the existing list: take 2
This time use a loop control variable i = 0, 1, 2, ...
marks = [55.3, 37.4, 85.2, ...]
scale_factor = 1.1
for i in range(0, len(marks)):           # i takes values 0, 1, 2, ... len(marks)-1
    marks[i] = marks[i] * scale_factor   # Or marks[i] *= scale_factor
print marks
This version works ☺
Slide # 152
The enumerate function
The enumerate function takes a sequence as a parameter
and returns a new sequence (an enumerate sequence) of
(index, value) pairs.
e.g. list(enumerate(["Joe", "Alice", "Anne"])) is
[(0, "Joe"), (1, "Alice"), (2, "Anne")]
Can rewrite the previous example as:
marks = [55.3, 37.4, 85.2, ...]
scale_factor = 1.1
for (i, value) in enumerate(marks): # Using parallel assignment
marks[i] = value * scale_factor
print marks
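To see the difference between rebinding the loop variable and assigning through an index, the two loops can be put side by side. This is only a sketch: the function names and the sample marks are made up for illustration, and print is written with parentheses so that it should run under Python 3 as well as the Python 2 used in these slides.

```python
def scale_by_rebinding(marks, scale_factor):
    """The 'obvious' loop: only the loop variable is rebound."""
    for mark in marks:
        mark = mark * scale_factor  # new object; the list is untouched
    return marks


def scale_in_place(marks, scale_factor):
    """Assign through the index, so the list itself is changed."""
    for (i, value) in enumerate(marks):
        marks[i] = value * scale_factor
    return marks


unchanged = scale_by_rebinding([10.0, 20.0, 30.0], 2)
changed = scale_in_place([10.0, 20.0, 30.0], 2)
print(unchanged)  # still the original values
print(changed)    # every value doubled
```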
Slide # 153
Nested lists
(Skipped slide from Week 3)
A list item can be any object, including another list
Useful when representing tabular data, e.g. 6 hourly
rainfall totals (mm) for a given week:

Day number   6 am   noon   6 pm   Midnight
    0          0      9      3       7
    1         11      9      0       0
    2          0     10     12      20
    3          0      0      0       0
    4          1      3      4       1
    5          2      8     10       0
    6          0      0      0       0
Slide # 154
Python representation
(Skipped slide from Week 3)
Use a list of rows, each row being a list of rainfall values:
rainfalls = [ [0, 9, 3, 7], [11, 9, 0, 0], [0, 10, 12, 20],
              [0, 0, 0, 0], [1, 3, 4, 1], [2, 8, 10, 0], [0, 0, 0, 0] ]
[Diagram: the name rainfalls refers to a list object whose items each refer to a row list, e.g. [0, 9, 3, 7], [11, 9, 0, 0], etc.]
rainfalls[day_num][column_num] selects a particular int value
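A short sketch of what single and double indexing give on this data. The variable names are made up for illustration; column 1 is taken to be the noon reading, following the column order of the rainfall table.

```python
rainfalls = [[0, 9, 3, 7], [11, 9, 0, 0], [0, 10, 12, 20],
             [0, 0, 0, 0], [1, 3, 4, 1], [2, 8, 10, 0], [0, 0, 0, 0]]

day_two = rainfalls[2]          # one index: the whole row (a list) for day 2
noon_day_two = rainfalls[2][1]  # two indexes: a single int from that row
wettest_day_total = max(sum(row) for row in rainfalls)

print(day_two)
print(noon_day_two)
print(wettest_day_total)
```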
Slide # 155
Nested loops
(Skipped slide from Week 3)
Nested lists are often processed with nested loops, e.g.
rainfalls = [ [0, 9, 3, 7], [11, 9, 0, 0], [0, 10, 12, 20],
              [0, 0, 0, 0], [1, 3, 4, 1], [2, 8, 10, 0], [0, 0, 0, 0] ]
print "Rainfall data"
for day in [0, 1, 2, 3, 4, 5, 6]:
    print "Day {}:".format(day),
    for column in [0, 1, 2, 3]:
        print format(rainfalls[day][column], "5"),  # Note final comma!
    print  # Prints just a newline
Tip: the range function would help here:
range(low, high) is the list [low, low+1, low+2, ... high-1]
range(high) is the list [0, 1, 2, ... high-1]
Slide # 156
An improved version
We can refine that to:
rainfalls = [ [0, 9, 3, 7], [11, 9, 0, 0], [0, 10, 12, 20],
[0, 0, 0, 0], [1, 3, 4, 1], [2, 8, 10, 0], [0, 0, 0, 0] ]
print "Rainfall data"
for (day, days_rain) in enumerate(rainfalls):
    print "Day {}:".format(day),
    for rain in days_rain:
        print format(rain, "5"),
    print  # Prints just a newline
Now it works for any number of days and any number of
rainfall samples per day.
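To check the claim about "any number of days and any number of samples", the same nested loop can be tried on rows of different lengths. This is a sketch only: format_rows and the ragged sample data are invented for illustration, and the rows are built as strings (rather than printed with trailing commas) so that it should run under Python 3 as well as Python 2, which makes the spacing slightly tighter than the slide's output.

```python
def format_rows(rainfalls):
    """Return one formatted text line per day, whatever the row lengths."""
    lines = []
    for (day, days_rain) in enumerate(rainfalls):
        cells = ""
        for rain in days_rain:
            cells += format(rain, "5")  # right-aligned in a field of width 5
        lines.append("Day {}:".format(day) + cells)
    return lines


ragged = [[0, 9, 3, 7], [11, 9], []]  # 4, 2 and 0 samples
for line in format_rows(ragged):
    print(line)
```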
Slide # 157
Loops and ifs
Often need to use an if in the body of a loop
Example: suppose rainfall data uses -1 to denote missing
data and you need to print daily averages, excluding
missing data.
Pseudocode:
Get rainfall data
Print heading
for each day:
    clear sample counter and total
    for each rainfall sample in the day's rain:
        if sample non-negative:
            Add sample to daily total and increment sample counter
    Print average for day
Slide # 158
The code
rainfalls = [ [0, 9, 3, 7], [11, 9, 0, 0], [-1, 10, 12, 20],
              [0, -1, 0, 0], [-1, 3, 4, 1], [2, 8, 10, 0], [-1, 0, 0, 0] ]
print "Rainfall data"
print "Day Readings Total Average"
for (day, days_rain) in enumerate(rainfalls):
    daily_total = 0
    num_readings = 0
    for rain in days_rain:
        if rain >= 0:
            daily_total += rain
            num_readings += 1
    average = daily_total / num_readings
    print "{:2}{:5}{:8}{:9.2f}".format(day, num_readings, daily_total, average)
BUT: there are two bugs for you to find and fix: (1) Averages are wrong! (2)
Fails if data missing for the whole day; it should then print "*" for the average.
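Try the exercise before reading on. One possible fix, offered as a sketch rather than the official answer: in Python 2, dividing two ints discards the fraction, so the total is converted to a float first, and a day with no valid readings gets "*" instead of a division by zero. The helper daily_summary, the shortened sample data and the column widths are all made up for illustration; the code should run under Python 2.7 or 3.

```python
def daily_summary(days_rain):
    """Return (num_readings, daily_total, average_text) for one day.

    Negative values denote missing data and are skipped.
    """
    daily_total = 0
    num_readings = 0
    for rain in days_rain:
        if rain >= 0:
            daily_total += rain
            num_readings += 1
    if num_readings == 0:
        average_text = "*"  # whole day missing: nothing to average
    else:
        # float() avoids integer division under Python 2
        average_text = "{:.2f}".format(float(daily_total) / num_readings)
    return (num_readings, daily_total, average_text)


rainfalls = [[0, 9, 3, 7], [-1, 10, 12, 20], [-1, -1, -1, -1]]
print("Day Readings Total Average")
for (day, days_rain) in enumerate(rainfalls):
    (num_readings, daily_total, average_text) = daily_summary(days_rain)
    print("{:2}{:9}{:6}{:>8}".format(day, num_readings, daily_total, average_text))
```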
Slide # 159
while loops
The for loop iterates over a known sequence.
Sometimes we don't have a known sequence, e.g.
 – "Stir until boiling"
 – "Keep reading input until the user types quit"
 – "Keep refining the answer until it's good enough"
Situations like this need a different sort of loop:
 – The while loop
Syntax:
while condition:
    statement_block
Slide # 160
Example 1: bacteria
If a population of bacteria increases in size by 21% every minute, how
long does it take for the population size to double?
minutes = 0
population = 1000
growth_rate = 0.21
while population < 2000:
    population *= (1 + growth_rate)
    minutes += 1
print minutes, "minutes required."
print "Population =", int(population)
[Flowchart: initialise all variables; test "population < 2000?"; if yes, update population and increment minutes, then test again; if no, leave the loop.]
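The same loop can be wrapped in a function so it can be reused with other numbers. A sketch only: the name minutes_to_reach and its parameters are invented here, and it assumes growth_rate is positive (otherwise the loop would never end). It should run under Python 2.7 or 3.

```python
def minutes_to_reach(population, target, growth_rate):
    """Count the whole minutes needed for population to reach target."""
    minutes = 0
    while population < target:
        population *= (1 + growth_rate)
        minutes += 1
    return (minutes, population)


(minutes, final_population) = minutes_to_reach(1000, 2000, 0.21)
print(minutes)
print(int(final_population))
```

The population goes 1210, 1464.1, 1771.6, 2143.6, so the loop stops after the fourth pass with the population already past 2000, not exactly at it.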
Slide # 161
Infinite loops
Need to be quite sure that the loop condition will become false
at some point
Otherwise the program loops forever – an infinite loop
A very common bug.
Example: if the previous program were
while population != 2000:  # Continue until population doubles
    ...                    # ... but equality never occurs!
Have to use Options > Restart in shell window to kill it
Remember: don't compare floats for (in)equality!
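A small demonstration of why float equality is a trap. The ten-step limit and the tolerance 1e-9 are arbitrary choices made for this sketch, which should run under Python 2.7 or 3.

```python
population = 1000
growth_rate = 0.21
seen_exactly_2000 = False
for minute in range(10):  # bounded loop, so it cannot run forever
    population *= (1 + growth_rate)
    if population == 2000:
        seen_exactly_2000 = True
print(seen_exactly_2000)  # the population jumps past 2000 without hitting it

total = 0.1 + 0.2
print(total == 0.3)             # exact comparison of floats
print(abs(total - 0.3) < 1e-9)  # comparison within a small tolerance
```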
Slide # 162
Example 2: input loops
Often want to loop, consuming some input data sequence,
until some condition occurs, e.g.
 – Loop until user enters a quit command
 – Loop until user gives a valid response
 – Search through a sequence for the first occurrence of something
There are three common idioms for this:
 – a "one-and-a-half" loop
 – use of a boolean variable like "is_done"
 – use of a break statement
Slide # 163
The three idioms
"One-and-a-half loop"
item = getItem()
while item needs action:
    Process item
    item = getItem()
Use of a boolean "flag"
is_done = False
while not is_done:
    item = getItem()
    if item needs action:
        Process item
    else:
        is_done = True
Either of these two is fine. Choice depends on
situation + personal preference.
Use of break
while True:
    item = getItem()
    if item needs action:
        Process item
    else:
        break
But I don't like this one! See later.
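The first two idioms as runnable code, under some assumptions made purely for illustration: "getItem" is played by next() on a list of numbers, an item "needs action" while it is non-negative, and the data is assumed to contain a negative sentinel. Should run under Python 2.7 or 3.

```python
def total_until_sentinel_v1(items):
    """Idiom 1: get an item before the loop and again at the end of the body."""
    source = iter(items)
    total = 0
    item = next(source)
    while item >= 0:
        total += item
        item = next(source)
    return total


def total_until_sentinel_v2(items):
    """Idiom 2: a single get inside the loop, controlled by a boolean flag."""
    source = iter(items)
    total = 0
    is_done = False
    while not is_done:
        item = next(source)
        if item >= 0:
            total += item
        else:
            is_done = True
    return total


readings = [4, 7, 1, -1, 99]  # -1 is the sentinel; 99 is never reached
print(total_until_sentinel_v1(readings))
print(total_until_sentinel_v2(readings))
```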
Slide # 164
Example 2(a): Looping until "quit"
Idiom 1:
prompt = "Enter command or 'q' to quit"
command = raw_input(prompt).lower()
while command != 'q':
    if command == 'jump':
        ...
    elif command == 'move':
        ...
    etc
    command = raw_input(prompt).lower()
Idiom 2:
is_quitting = False
prompt = "Enter command or 'q' to quit"
while not is_quitting:
    command = raw_input(prompt).lower()
    if command == 'q':
        is_quitting = True
    elif command == 'jump':
        ...
    etc
Slide # 165
Example 2(b): Looping for valid input
Idiom 1:
prompt = "What suit (spades, hearts, diamonds or clubs)? "
response = raw_input(prompt).lower()
while response not in ["spades", "hearts", "diamonds", "clubs"]:
    print "Invalid suit. Try again."
    response = raw_input(prompt).lower()
Idiom 2:
prompt = "What suit (spades, hearts, diamonds or clubs)? "
is_valid = False
while not is_valid:
    response = raw_input(prompt).lower()
    if response in ["spades", "hearts", "diamonds", "clubs"]:
        is_valid = True
    else:
        print "Invalid suit. Try again."
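Input loops are awkward to test by hand, so one option is to pass in the function that does the reading. In this sketch ask_for_suit, fake_input and the canned responses are all hypothetical; in a real Python 2 program you would pass raw_input instead. Should run under Python 2.7 or 3.

```python
VALID_SUITS = ["spades", "hearts", "diamonds", "clubs"]


def ask_for_suit(read_line):
    """Idiom 1: keep calling read_line(prompt) until a valid suit comes back."""
    prompt = "What suit (spades, hearts, diamonds or clubs)? "
    attempts = 1
    response = read_line(prompt).lower()
    while response not in VALID_SUITS:
        attempts += 1
        response = read_line(prompt).lower()
    return (response, attempts)


canned = iter(["Spade", "HEARTS", "clubs"])  # pretend user input


def fake_input(prompt):
    """Stand-in for raw_input: ignores the prompt, returns the next canned reply."""
    return next(canned)


(suit, attempts) = ask_for_suit(fake_input)
print(suit)
print(attempts)
```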
Slide # 166
Break statements
Style guideline: don't use breaks!
break forces an immediate exit from the loop
 – Goes straight to the first statement after the loop
Easy to use but:
 – Loop termination condition is no longer explicit
    o Loop terminates either because loop condition is false or a break
      statement was executed
 – Makes it harder to reason about the program, prove it correct, etc
 – It encourages a lazy style of programming
    o "I'm not quite sure what the exact loop termination condition is, so I'll
      just use break when I think I've hit it"
 – Creates maintenance problems
    o Loop body is not fully executed, so extra statements (e.g. for debugging)
      added at the end of the loop body don't get executed
Slide # 167
Interlude: the Wing101 debugger
It can be hard to debug a program that loops forever
 – Particularly in Wing101, which buffers print output until
   execution is finished or input is requested, e.g.
       while True:
           print "Looping"
   doesn't output anything!
    o This doesn't happen in a normal Python shell, only in Wing
Instead of running programs, may need to debug them
 – Click Debug instead of Run
 – Print output now appears continuously in Debug I/O window,
   not Python Shell window.
Slide # 168
The Wing101 debugger (cont'd)
DEMO
Also, in this debug mode, you can:
 – Set breakpoints to stop the program at a given line
 – Single-step the program one line at a time or run it until the
   next breakpoint
 – Inspect all the current variables in the Stack Data pane
    o See next few slides for explanation
Can be very useful, but it's no substitute for thinking!
 – And usually, thoughtfully chosen print statements provide
   better information.
Slide # 169
"Stacks"
A Stack is a data type in Computer Science
 – Provides just three operations:
    o push(item) to add the given item to the "top" of the stack
    o item = pop() to take an item from the top of the stack
    o is_empty() to test whether the stack has no items in it
 – Last item In is First item Out ("LIFO")
Python's list can be used as a stack:
 – append ≡ push
 – pop ≡ pop
 – len(...) == 0 ≡ is_empty()
[Diagram: a stack of items, with push and pop both acting on the top.]
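A minimal sketch of a list used as a stack, with made-up items, showing the last-in-first-out order. Should run under Python 2.7 or 3.

```python
stack = []
stack.append("a")  # push
stack.append("b")  # push
stack.append("c")  # push: "c" is now on top

popped = []
while len(stack) != 0:          # i.e. while not is_empty()
    popped.append(stack.pop())  # pop takes the most recently pushed item
print(popped)
```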
Slide # 170
"Stack frames"
During execution, the local data belonging to a Python
function is kept in a stack frame
A stack frame is a block of data that's pushed onto the
call stack when the function is called
Stack frame contains function's dictionary of variables
and the "return address", i.e., where it came from in the
program
When function returns, its stack frame is popped from the
stack
 – Execution resumes at the saved return address.
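You can peek at the call stack from inside a running program with the standard traceback module. This sketch, with the made-up functions outer and inner, lists the names of the functions whose frames are currently on the stack, oldest first. Should run under Python 2.7 or 3.

```python
import traceback


def inner():
    # extract_stack() describes every active frame, oldest first
    frames = traceback.extract_stack()
    return [frame[2] for frame in frames]  # item 2 of each entry is the function name


def outer():
    return inner()  # outer's frame stays on the stack while inner runs


names = outer()
print(names[-2:])  # the two newest frames: outer, then inner on top
```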
Slide # 171
The call stack: example
DEMO
Slide # 172
Week 6: File processing
Textbook, Chapter 8.
Slide # 173
Types of file-processing tasks
HUGE range, e.g.
 – Numerical/scientific data processing (e.g. rainfall data)  [Our focus]
 – Commercial data processing (e.g. files of account transactions)
 – Document processing (e.g. MS Word documents)
 – Programming language compilation (e.g. a Fortran program)
 – Image processing (e.g. green screening)
 – Internet data harvesting (e.g. web-crawling for email addresses)
 – ...
Slide # 174
Steps in processing numerical data
1. Open the file
2. Extract the data from the file
    – May be as simple as splitting each line in a .csv file or as
      complex as parsing an XML file
3. Process the data
4. Output/display the results
May have to interleave these steps for large files
(can’t fit all data in memory)
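A sketch of the four steps on a tiny csv file, with extraction and processing interleaved line by line so that the whole file never has to be in memory. The file name rain.csv and its contents are made up, and the first few lines exist only to create something to read. Should run under Python 2.7 or 3.

```python
import os
import tempfile

# Set-up only: create a small csv file to work on (made-up values).
path = os.path.join(tempfile.mkdtemp(), "rain.csv")
out = open(path, "w")
out.write("0,9,3,7\n11,9,0,0\n")
out.close()

infile = open(path)                    # 1. Open the file
totals = []
line = infile.readline()
while line != "":                      # readline returns "" at end of file
    pieces = line.strip().split(",")   # 2. Extract the data
    values = [int(piece) for piece in pieces]
    totals.append(sum(values))         # 3. Process the data
    line = infile.readline()
infile.close()
print(totals)                          # 4. Output the results
```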
Slide # 175
Opening files
We’ve already seen the usual opening of a local file:
 – data_file = open("data/resources/blah.txt", "r")
    o File “name” is most generally a file path, i.e. a path to the file within the
      directory tree
But we can also open Internet resources as files, e.g.:
import urllib  # The URL (Uniform Resource Locator) library
url = "http://www.cosc.canterbury.ac.nz/open/teaching/"
web_page = urllib.urlopen(url)
for line in web_page:
    print line,
web_page.close()  # Should always do this – earlier code was lazy
Slide # 176
Extracting data
Data files given to you so far have been “sanitised”
Real data files usually have lots of extraneous info
 – Headers, footers, irrelevant data, etc
 – For example, see next slide
    o The result of querying for sunshine data at Christchurch from
      http://cliflo.niwa.co.nz
Need an algorithm to extract just the required data
 – e.g. month, day, sunshine from following slide
Slide # 177
cliflo.niwa.co.nz query result (csv)
Station information:
Name,Agent Number,Network Number,Latitude (dec.deg),Longitude (dec.deg),Height (m),...
Christchurch Aero,4843,H32451,-43.493,172.537,37,G,N/A
Note: Position precision types are: "W" = based on whole minutes, "T" = estimated to tenth minute,
"G" = derived from gridref , "E" = error cases derived from gridref,
"H" = based on GPS readings (NZGD49), "D" = by definition i.e. grid points.
Sunshine: Daily
Station,Date(NZST),Time(NZST),Amount(Hrs),Period(Hrs),Freq
Christchurch Aero,20100101,2259,9.9,24,D
Christchurch Aero,20100102,2259,7.1,24,D
Christchurch Aero,20100103,2259,1.8,24,D
Christchurch Aero,20100104,2259,9.7,24,D
...
Christchurch Aero,20100320,2259,1.8,24,D
Christchurch Aero,20100321,2259,0.3,24,D
Christchurch Aero,20100322,2259,5.3,24,D
Christchurch Aero,20100323,2259,9.6,24,D
Christchurch Aero,20100324,2259,1.0,24,D
[Slide callout: the "Christchurch Aero,2010..." rows above are the wanted data]
UserName is = angusmcgurkinshaw
Total number of rows output = 83
Number of rows remaining in subscription = 1999917
Copyright NIWA 2010 Subject to NIWA's Terms and Conditions
See: http://cliflo.niwa.co.nz/pls/niwp/doc/terms.html
Comments to: [email protected]
Slide # 178
Algorithm #1 for extracting data
Many possibilities. One is:
skip lines until we get an empty line  # More robust against changes in file format than “skip 9 lines”
skip two more lines
read a line
while line not empty:  # A blank line terminates actual data rows (“Idiom 1” from last chapter)
    split line into pieces separated by comma
    date = piece[1]
    get month and day from date
    sunshine = float(piece[3])
    process month, day, sunshine data point (e.g. write to another file)
    read a line
Question: what happens if the data file doesn’t contain the expected two blank lines?
Slide # 179
Code
Direct translation from pseudocode
infile = open("sunshine.txt")
line = infile.readline()
while line != "\n":
    line = infile.readline()
infile.readline()
infile.readline()
line = infile.readline()
while line != "\n":
    pieces = line.split(",")
    date = pieces[1]
    month = int(date[4:6])
    day = int(date[6:8])
    sunshine = float(pieces[3])
    print month, day, sunshine
    line = infile.readline()
infile.close()
Variant using a function for data line processing
def process_data_line(line):
    pieces = line.split(",")
    date = pieces[1]
    month = int(date[4:6])
    day = int(date[6:8])
    sunshine = float(pieces[3])
    print month, day, sunshine

infile = open("sunshine.txt")
line = infile.readline()
while line != "\n":
    line = infile.readline()
infile.readline()
infile.readline()
line = infile.readline()
while line != "\n":
    process_data_line(line)
    line = infile.readline()
infile.close()
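On the question of missing blank lines: readline returns "" at end of file, and "" never equals "\n", so the loops above would spin forever. One possible guard, as a sketch only, is to stop on either. The function read_sunshine and the miniature sample file are made up for illustration, and the position of the blank line in the sample is an assumption about the real file layout. Should run under Python 2.7 or 3.

```python
import os
import tempfile

SAMPLE = (
    "Station information:\n"
    "Name,Agent Number\n"
    "Christchurch Aero,4843\n"
    "\n"
    "Sunshine: Daily\n"
    "Station,Date(NZST),Time(NZST),Amount(Hrs),Period(Hrs),Freq\n"
    "Christchurch Aero,20100101,2259,9.9,24,D\n"
    "Christchurch Aero,20100102,2259,7.1,24,D\n"
)  # deliberately no blank line after the data rows


def read_sunshine(path):
    """Return a list of (month, day, sunshine) tuples from the file."""
    rows = []
    infile = open(path)
    line = infile.readline()
    while line != "\n" and line != "":  # "" means end of file
        line = infile.readline()
    infile.readline()                   # skip two more lines
    infile.readline()
    line = infile.readline()
    while line != "\n" and line != "":
        pieces = line.split(",")
        date = pieces[1]
        rows.append((int(date[4:6]), int(date[6:8]), float(pieces[3])))
        line = infile.readline()
    infile.close()
    return rows


path = os.path.join(tempfile.mkdtemp(), "sunshine.txt")
outfile = open(path, "w")
outfile.write(SAMPLE)
outfile.close()
rows = read_sunshine(path)
print(rows)
```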
Slide # 180
Algorithm #2 for extracting data
Another is:
get a list of all lines in file
make a list of all those lines (after line 3) beginning "Christchurch Aero"
for each of those lines:
    split line into pieces separated by comma
    date = piece[1]
    get month and day from date
    sunshine = float(piece[3])
    process month, day, sunshine data point (e.g. write to another file)
Simpler (?) but only works for this one base station.
Also, can’t handle huge files.
Slide # 181
Algorithm #2b
An improvement (in terms of code reusability) is to get
the station name from line 3
get a list of all lines in file
station_name = start of line 3, up until ","
make a list of all lines (after line 3) beginning with station_name
for each of those lines:
    split line into pieces separated by comma
    date = piece[1]
    get month and day from date
    sunshine = float(piece[3])
    process month, day, sunshine data point (e.g. write to another file)
OK for any base station.
Still can’t handle huge files.
Slide # 182
Code
infile = open("sunshine.txt")
lines = infile.readlines()
infile.close()
station_name = lines[2].split(",")[0]  # Not 3! [Remember: zero origin]
data = [line for line in lines[3:] if line.startswith(station_name)]
for line in data:
    pieces = line.split(",")
    date = pieces[1]
    month = int(date[4:6])
    day = int(date[6:8])
    sunshine = float(pieces[3])
    print month, day, sunshine
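To check that this approach really is independent of the station, the same logic can be run on a list of lines for a different one. A sketch only: extract_rows, the station "Somewhere" and its values are all invented, and the list of strings stands in for what readlines() would return. Should run under Python 2.7 or 3.

```python
def extract_rows(lines):
    """Algorithm #2b on a list of lines; returns (month, day, sunshine) tuples."""
    station_name = lines[2].split(",")[0]  # third line, first field
    data = [line for line in lines[3:] if line.startswith(station_name)]
    rows = []
    for line in data:
        pieces = line.split(",")
        date = pieces[1]
        rows.append((int(date[4:6]), int(date[6:8]), float(pieces[3])))
    return rows


# Made-up miniature file for a hypothetical station called "Somewhere".
lines = [
    "Station information:\n",
    "Name,Agent Number\n",
    "Somewhere,1234\n",
    "Sunshine: Daily\n",
    "Station,Date(NZST),Time(NZST),Amount(Hrs),Period(Hrs),Freq\n",
    "Somewhere,20100320,2259,1.8,24,D\n",
    "Somewhere,20100321,2259,0.3,24,D\n",
    "Total number of rows output = 2\n",
]
rows = extract_rows(lines)
print(rows)
```

Header and footer lines are skipped only because they happen not to start with the station name, which is exactly the kind of inferred-format fragility to keep in mind.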
Slide # 183
Which algorithm?
Those are just 2 algorithms.
 – How many more can you find?
Which is better?
 – Actually they’re both pretty bad!
 – They both fail if the file format is significantly changed
 – The problem is that we’ve inferred the data format from the data
    o We really need a specification of the data format from the supplier
    o Acts as a contract ensuring (hopefully) our program continues to work in
      the future
Slide # 184
Outputting the results
Many possibilities, e.g.
 – Display textual output with print
 – Write a new file, e.g.
    o Pure text (e.g. csv)
    o Markup language output (e.g. HTML)
 – Graphical output
    o Maybe in a GUI
    o Maybe with matplotlib (installed on Linux in labs but not on Windows)
 – ...
Slide # 185
Writing output files
Open file for writing, prepare data, e.g.
out_file = open("myoutput.txt", "w")
data = "{},{},{:.3f}".format(month, day, sunshine)
Write data to file
 – Can use print chevron technique (obsolescent: not in Python 3), e.g.
   print >>out_file, data
 – Or directly output byte stream (usually a string), e.g.
   out_file.write(data)
    o NB: must explicitly include newline character when using write method
Close file
out_file.close()
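A sketch of the write method in use, including the explicit newline, followed by reading the file back to check what was written. The readings are made-up values and the file goes into a temporary folder. Should run under Python 2.7 or 3.

```python
import os
import tempfile

readings = [(1, 1, 9.9), (1, 2, 7.1)]  # made-up (month, day, sunshine) values
path = os.path.join(tempfile.mkdtemp(), "myoutput.txt")

out_file = open(path, "w")
for (month, day, sunshine) in readings:
    data = "{},{},{:.3f}".format(month, day, sunshine)
    out_file.write(data + "\n")  # write() adds no newline of its own
out_file.close()

in_file = open(path)
written = in_file.read()
in_file.close()
print(written)
```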
Slide # 186
Getting graphical output
Outside official curriculum, except for GUI section later
BUT ... scientists and engineers should at least be aware
of matplotlib
 – See matplotlib.sourceforge.net
 – A plotting package modelled on the one in MATLAB
 – Multiplatform
 – Publication-quality output
 – Extremely flexible
    o But using this flexibility isn’t trivial!
 – Installed only under Linux on lab machines (?)
Slide # 187
Example program
import matplotlib.pyplot as plt
from datetime import date

def get_date(line):
    '''Extract the date info from the given
       line and return a Date object'''
    pieces = line.split(",")
    date_string = pieces[1]
    year = int(date_string[0:4])
    month = int(date_string[4:6])
    day = int(date_string[6:8])
    return date(year, month, day)

def get_sunshine(line):
    '''Return the float sunshine value from
       the given line'''
    return float(line.split(',')[3])

infile = open("sunshine.txt")
lines = infile.readlines()
infile.close()
station_name = lines[2].split(",")[0]
data = [line for line in lines[3:]
        if line.startswith(station_name)]
dates = [get_date(line) for line in data]
sunshine_data = [get_sunshine(line)
                 for line in data]
plt.plot(sunshine_data,
         linestyle="dotted", marker='o')
n = len(sunshine_data)
tick_positions = range(0, n, 10)
tick_labels = [dates[i].strftime("%d %b")
               for i in tick_positions]
plt.xticks(tick_positions, tick_labels,
           rotation=90)
plt.ylabel("Sunshine (hrs)")
plt.title("Sunshine hours, Christchurch aero, 2010")
plt.show()
Slide # 188
Program’s output
Slide # 189