Introduction to Programming
Day 3: Data Containers & File Handling
Nick Efford
Twitter: @python33r
Email: [email protected]
G+: http://plus.google.com/+NickEfford

What Did We Learn Last Time?
● Strings can be joined to each other using + and can be repeated multiple times using * and a number
● The string data type provides a large set of methods for manipulating strings in various ways
● Comparison operators like == and < can be used in Boolean expressions, yielding a True or False result
● These expressions can be used as tests in if statements to decide which path to follow in a program

Example
    # Program to display an exam grade
    # (NDE, 2013-06-10)

    mark = int(input("Enter exam mark: "))

    if mark >= 70:
        print("Distinction")
    elif mark >= 40:
        print("Pass")
    else:
        print("Fail")
Note the use of indentation!

What Did We Learn Last Time?
● while loops use Boolean expressions to decide whether a block of statements should run again or not
● for loops execute a block of statements once for every item in a sequence of some kind
● Sequences of numbers can be generated using range
    for num in range(10):
        print(num)
● num is the variable that holds each integer from the sequence
● range(10) generates integers from 0 to 9, one at a time
● Loop body often uses loop variable, but doesn’t have to

Today’s Objectives
● To understand why we need ways of containing multiple values and referencing them as a single entity
● To explore three kinds of container built into Python: lists, tuples and dictionaries
● To introduce a useful, non-standard container type: the NumPy array
● To see how we read from and write to files

Why Do We Need Containers For Multiple Values?
Having variables to represent individual values is useful, but what if we need to manipulate lots of data?
    num1 = float(input("Enter first number: "))
    num2 = float(input("Enter second number: "))
    num3 = float(input("Enter third number: "))
    num4 = float(input("Enter fourth number: "))
    num5 = float(input("Enter fifth number: "))
    total = num1 + num2 + num3 + num4 + num5
    ...
Does this approach scale?
What if there were 50 numbers instead of 5?...
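To illustrate the scaling problem, here is a minimal sketch of the same total computed with one list and one loop. The function name total_of and the list typed are hypothetical; typed stands in for 50 values that would normally come from input.

```python
def total_of(entries):
    """Convert each text entry to a float, store it in a list, and return the list and its sum."""
    numbers = []
    for entry in entries:
        numbers.append(float(entry))
    return numbers, sum(numbers)


# Hypothetical stand-in for 50 calls to input(): the strings "1" to "50"
typed = [str(n) for n in range(1, 51)]

numbers, total = total_of(typed)
print(len(numbers), total)
```

The code is the same length whether there are 5 values or 50, which is the main reason for holding the values in a single container.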

Lists
● Created using [] or list()
● Can be empty or contain comma-separated items
● len function gives you list length (i.e., number of items)
● Items can be added using the append method
Try these lines in the Python shell:
    x = []
    x
    type(x)
    len(x)
    x.append(7)
    x.append(4)
    x
    len(x)

Useful Functions For Lists of Numbers
● min and max give minimum and maximum values
● sum gives the total of all values in the list
● Try these functions in the Python shell, for this list:
    data = [1, 4, 11, -2, 7]

Exercise 1
Modify mean.py from yesterday’s exercises so that it
● Reads user input into a list
● Computes the mean of the values in the list
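As a quick check of the list functions described above, this sketch applies them to the sample list from the slide. The variable names on the left are only illustrative.

```python
data = [1, 4, 11, -2, 7]

smallest = min(data)   # smallest value in the list
largest = max(data)    # largest value in the list
total = sum(data)      # all values added together
count = len(data)      # number of items

print(smallest, largest, total, count)
```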

Indexing of Lists
● We can access an individual item using an index representing its position in the list
● Indices are integers and start from zero!
● Index goes inside square brackets, following the name of the list variable; thus data[0] is first item in list data, data[1] is second item, and so on
● If a list contains N items, index of last item is N-1
● Many languages use zero-based indexing - notable exceptions include FORTRAN, Lua, MATLAB

Indexing Examples
Try the following in the Python shell:
    fruit = ["apple", "orange", "banana", "kiwi"]
    fruit[0]
    fruit[3]
    fruit[4]
    fruit[-1]
    fruit[-2]

Negative Indices
● Provide us with a nice way of indexing items backwards, starting from end of the list
  ○ -1 gives last item
  ○ -2 gives last-but-one item...
● Very useful for accessing last item of a list without ever having to know how big the list is
● Perl, Ruby & Mathematica also support this feature, but the vast majority of programming languages don’t
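The indexing rules above can be checked with the fruit list from the examples. This sketch also shows what happens for fruit[4]: the list has four items, so the last valid index is 3 and Python raises an IndexError, which is caught here so the script keeps running.

```python
fruit = ["apple", "orange", "banana", "kiwi"]

first = fruit[0]          # indices start from zero
last = fruit[3]           # four items, so the last index is 4 - 1 = 3
also_last = fruit[-1]     # -1 counts back from the end
last_but_one = fruit[-2]

try:
    fruit[4]              # one past the end of the list
except IndexError as error:
    message = str(error)

print(first, last, also_last, last_but_one)
print("fruit[4] failed:", message)
```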

Slicing
● Two indices separated by a colon is a slice, giving us a range of items from the list
● Item at second index is not included in that range!
● Slice is assumed to include start of list if the first index is omitted, end of list if the last index is omitted
● Try these examples, using previously-defined lists:
    data[0:3]
    data[2:4]
    fruit[0:2]
    fruit[1:3]

Why Isn’t Element at Second Index Included in a Slice?
● Means that difference between the two indices = number of items in the slice
● Allows for a ‘nice’ interpretation of slices in cases where one of the indices is missing:
    data[:2] = first two items
    data[2:] = everything after first two items
    data[-2:] = last two items
    data[:-2] = everything before last two items
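A short sketch of the slicing rules, using the data and fruit lists defined earlier. It checks that the item at the second index is excluded, so the number of items in a slice equals the difference between the two indices, and that a list splits cleanly at any index.

```python
data = [1, 4, 11, -2, 7]
fruit = ["apple", "orange", "banana", "kiwi"]

a = data[0:3]     # items at indices 0, 1 and 2
b = data[2:4]     # items at indices 2 and 3
c = fruit[0:2]
d = fruit[1:3]

first_two = data[:2]       # start of list assumed
rest = data[2:]            # end of list assumed
last_two = data[-2:]
all_but_last_two = data[:-2]

# Because the second index is excluded, the two halves rebuild the list
rebuilt = data[:2] + data[2:]

print(a, b, c, d)
print(first_two, rest, last_two, all_but_last_two)
```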

Generality of Indexing & Slicing
● Indexing & slicing operations work on many different kinds of sequence, not just on lists
● This means we can index and slice strings!
● Try the following in the Python shell:
    s = "Hello World!"
    s[0]
    s[6]
    s[-1]
    s[1:4]
    s[:5]
    s[5:]
    s[-6:]
Try these out now in the Python shell
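For comparison with what the shell shows, this sketch collects the results of the string examples above. The names on the left are illustrative only.

```python
s = "Hello World!"

first_char = s[0]
seventh_char = s[6]    # index 5 is the space, so index 6 is "W"
last_char = s[-1]
middle = s[1:4]
start = s[:5]
end = s[5:]            # begins with the space at index 5
last_six = s[-6:]

print(first_char, seventh_char, last_char)
print(repr(middle), repr(start), repr(end), repr(last_six))
```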

Replacing Indexed Items
● You can assign to an indexed list item to replace it
● Same thing won’t work for characters in a string, because strings are an immutable type
● Try the following in the Python shell:
    fruit[0] = "pear"
    s[0] = "J"
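The difference between the two assignments can be sketched as follows. The list assignment succeeds, while the string assignment raises a TypeError, caught here so the script can continue. Building a new string from a slice is one common workaround; it is shown as a suggestion rather than something from the slides.

```python
fruit = ["apple", "orange", "banana", "kiwi"]
fruit[0] = "pear"          # lists are mutable, so this replaces the first item

s = "Hello World!"
try:
    s[0] = "J"             # strings are immutable, so this fails
except TypeError as error:
    reason = str(error)

# Workaround: build a new string instead of changing the old one
j = "J" + s[1:]

print(fruit)
print("String assignment failed:", reason)
print(j)
```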

List Methods
count     Returns number of occurrences of the given item
index     Returns position of first occurrence of the given item
append    Adds the given item to the end of the list
extend    Extends list by appending items from another sequence
insert    Inserts the given item before the given position
pop       Removes and returns the item at the specified position
remove    Removes the first occurrence of the given item
reverse   Reverses the ordering of list items, in place
sort      Sorts the list of items, in place

Exercise 2: Lottery Simulation
● Six numbers from the range 1 - 49, selected randomly, sorted into order and displayed
● A seventh randomly selected number acts as a ‘bonus ball’

Command-Line Arguments
● The sys module provides sys.argv, a list containing the program’s name and command line arguments
● CLAs are the items typed after the program name when you run it from an OS command prompt
● CLAs are useful for supplying things like filenames
    python statistics.py input.txt output.txt
● sys.argv[0] is statistics.py (program name)
● sys.argv[1] is input.txt (program argument)
● sys.argv[2] is output.txt (program argument)

Exercise 3
● Modify an existing program to acquire data via command line arguments instead of the input function
● Program must terminate with a suitable usage hint if required arguments are not supplied
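One possible way of handling command line arguments is sketched below. The functions get_filenames and main are hypothetical names, and the lists complete and incomplete simulate what sys.argv would contain; a real script would call main(sys.argv) instead.

```python
import sys


def get_filenames(argv):
    """Return (input_name, output_name) from an argv-style list, or None if either is missing."""
    if len(argv) < 3:
        return None
    return argv[1], argv[2]


def main(argv):
    """Hypothetical entry point; a real script would call main(sys.argv)."""
    names = get_filenames(argv)
    if names is None:
        # Passing a string to sys.exit prints it as an error message and ends the program
        sys.exit("Usage: python " + argv[0] + " <input file> <output file>")
    print("Reading from", names[0], "and writing to", names[1])


# Simulated argument lists, standing in for sys.argv
complete = ["statistics.py", "input.txt", "output.txt"]
incomplete = ["statistics.py"]

print(get_filenames(complete))
print(get_filenames(incomplete))
```

Keeping the checking in a function that takes the list as a parameter makes it possible to try the logic in the shell without running the program from a command prompt.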

Tuples
● Similar to lists, but defined with () instead of []
● Can be indexed and sliced just like lists
● More limited: a tuple’s size is fixed after it is defined and you cannot replace existing items with new items
● Useful for representing things that don’t change in size e.g., (x, y) coordinates always contain two values
    shape = [(0,0), (0,75), (100,0)]
(Figure: shape with vertices labelled (0,0), (0,75) and (100,0))

Iteration Over Strings, Lists & Tuples
Standard for loop fetches items from the sequence:
    for character in message:
        print(character)
    for value in data:
        print(value)

String, List & Tuple Comparisons
● Use == and != to test for equality or inequality, just as you would for numeric types
● Operators <, <=, >, >= work for strings, lists and tuples and operate ‘lexicographically’
  ○ First pair of items are compared and determine result unless they are equal - in which case, next pair of items are compared, etc...
    "abc" == "abc"
    "abc" < "abcd"
    "abd" > "abc"
    [1,2,3] == [1,2,3]
    [1,2] < [1,2,3]
    (1,2,4) > (1,2,3)
Try these out now in the Python shell
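The sketch below checks two of the points above: that a tuple's items cannot be replaced, and that each of the listed comparisons evaluates to True. The variable point is a hypothetical (x, y) coordinate.

```python
point = (0, 75)            # hypothetical (x, y) coordinate
x = point[0]               # tuples are indexed like lists
y = point[1]

try:
    point[0] = 10          # items of a tuple cannot be replaced
except TypeError as error:
    reason = str(error)

results = [
    "abc" == "abc",
    "abc" < "abcd",        # equal up to "abc", and the shorter string comes first
    "abd" > "abc",         # decided by the third pair, "d" > "c"
    [1, 2, 3] == [1, 2, 3],
    [1, 2] < [1, 2, 3],
    (1, 2, 4) > (1, 2, 3),
]

print(x, y)
print("Tuple assignment failed:", reason)
print(results)
```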

Testing For Membership of Strings, Lists & Tuples
in operator can be used to test whether a string, list or tuple contains a given item:
    if "x" in word:
        print("Letter 'x' found in word")
    if 0 in data:
        print("Dataset contains a zero")
Try the second example in the Python shell
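A small sketch of membership testing. The value of word is a made-up example, and data is the sample list used earlier.

```python
word = "example"              # hypothetical word to search
data = [1, 4, 11, -2, 7]

has_x = "x" in word           # single character inside a string
has_zero = 0 in data          # item inside a list
has_seven = 7 in data
in_tuple = 75 in (0, 75)      # works for tuples too

if has_x:
    print("Letter 'x' found in word")
if not has_zero:
    print("Dataset does not contain a zero")
```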

Representing a Phone Book
How could we implement a phone book in Python?
Idea 1: List of names and a parallel list of numbers
    ["nick", "tony", "jane"]
    ["01234 567890", "09876 543210", "01357 246802"]
Idea 2: List of (name, number) tuples
    [ ("nick", "01234 567890"),
      ("tony", "09876 543210"),
      ("jane", "01357 246802") ]

An Associative Container: The Dictionary
Using a dictionary is a better solution:
    phonebook = { "nick": "01234 567890",
                  "tony": "09876 543210",
                  "jane": "01357 246802" }
Each entry pairs a key (e.g., "nick") with a value (e.g., "01234 567890")
● Dictionary is defined using {}
● We can look up values using [] and the key:
    print(phonebook["nick"])

Using Dictionaries
Try the following in the Python shell:
    fruit = {"apples": 5, "pears": 2}
    type(fruit)
    len(fruit)
    fruit
    fruit["apples"]
    fruit["oranges"]
    fruit["oranges"] = 3
    fruit
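To compare the three phone book ideas, this sketch looks up the same number with each representation. The lookup code for the two list-based ideas is an assumption about how they might be used; the slides only show the data. The last part shows that reading a missing key raises a KeyError, while assigning to it adds a new entry.

```python
# Idea 1: parallel lists, linked only by position
names = ["nick", "tony", "jane"]
numbers = ["01234 567890", "09876 543210", "01357 246802"]
from_lists = numbers[names.index("tony")]

# Idea 2: list of (name, number) tuples, searched with a loop
pairs = [("nick", "01234 567890"),
         ("tony", "09876 543210"),
         ("jane", "01357 246802")]
from_pairs = None
for name, number in pairs:
    if name == "tony":
        from_pairs = number

# Dictionary: direct lookup by key
phonebook = {"nick": "01234 567890",
             "tony": "09876 543210",
             "jane": "01357 246802"}
from_dict = phonebook["tony"]

# Reading a key that is not present fails; assigning to it adds the key
fruit = {"apples": 5, "pears": 2}
try:
    fruit["oranges"]
except KeyError as error:
    missing = error.args[0]
fruit["oranges"] = 3

print(from_lists, from_pairs, from_dict)
print("Missing key:", missing, "now stored as", fruit["oranges"])
```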

Dictionary Methods
clear     Empties the dictionary of all items
get       Returns value associated with the given key, or a default
items     Returns a view of (key, value) pairs in the dictionary
keys      Returns a view of the dictionary’s keys
pop       Removes a key-value pair, given the key
update    Updates the dictionary with data from another dictionary
values    Returns a view of the dictionary’s values
Using Dictionary Methods
Try these in the Python shell:
fruit.keys()
fruit.values()
fruit.items()
fruit.get("oranges", 0)
fruit.get("grapes", 0)
fruit.pop("pears")
fruit
fruit.clear()
fruit
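The sketch below runs the same dictionary methods and stores the results so they can be compared with the shell output. The views returned by keys, values and items are converted to sorted lists purely to make them easy to print and compare.

```python
fruit = {"apples": 5, "pears": 2, "oranges": 3}

keys = sorted(fruit.keys())
values = sorted(fruit.values())
items = sorted(fruit.items())

oranges = fruit.get("oranges", 0)   # key present, so its value is returned
grapes = fruit.get("grapes", 0)     # key absent, so the default is returned

removed = fruit.pop("pears")        # removes the pair and returns its value
after_pop = dict(fruit)             # copy taken before the dictionary is emptied

fruit.clear()

print(keys, values, items)
print(oranges, grapes, removed, after_pop, fruit)
```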

Iteration & Membership Tests For Dictionaries
● for loop operating on a dictionary will iterate over keys
● Dictionary’s values and items methods can be used to supply values or key-value pairs to a for loop
    for f, n in fruit.items():
        print(f, n)
● in operator can be used to test whether a dictionary contains a particular key
    fruit = {"apples": 5, "pears": 2}
    "apples" in fruit
    "oranges" in fruit

A Useful Non-Standard Type: The NumPy Array
● Provided by NumPy, http://www.numpy.org
● Typically used with homogeneous numerical data
● Can be indexed and sliced like lists & tuples
● Much faster than using a list or tuple; many operations can be carried out without the need for Python loops
● Well-suited to representing multidimensional data
We will explore arrays in more detail on the final day of the course...

Exercise 4
● Create a sequence of x values as a NumPy array
● Generate a sequence of y values from x
● Plot y against x

File Types
Text files
● Consist of normal, printable characters
● Encoding dictates how characters are stored
● Input operations in Python yield strings
● Examples: .txt files, .html files, .py files
Binary files
● Consist of arbitrary bytes
● Can’t be manipulated in a text editor
● Input operations in Python yield bytes objects
● Examples: images, MP3 files, MS Office docs

Opening Files For Input
    with open("input.txt", "rt") as infile:
        Code that reads from infile
    Rest of program
● "input.txt" is the name of the file
● "rt" is the mode string: rt for reading text files, rb for reading binary files
● infile variable represents the opened file
● File will be closed automatically on leaving the with block and resuming rest of program

Reading From a Text File
● Call read method to get entire file as one string
● Call readline method to get a single line
● Call readlines to get all lines, as a list of strings
● Use for to iterate over lines one at a time
Note: lines read via these methods include the newline character (\n) at the end!

Example 1: Reading Lines of Text Into a List
    lines = []
    with open("input.txt", "rt") as infile:
        for line in infile:
            lines.append(line[:-1])
● Empty list needed to hold the text
● for loop used here; could use readlines() instead if \n at end of line doesn’t need to be removed
● Slice gives us all of string except for final character (thereby removing \n)
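The reading methods can be compared side by side with a small file. In this sketch the file is created in a temporary folder so the code is self-contained; the file contents are made up. It uses rstrip("\n") rather than the [:-1] slice, a swapped-in alternative that behaves the same when every line ends in \n but does not remove a real character if the last line of a file has no trailing newline.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "input.txt")

    # Create a small sample file (made-up contents)
    with open(path, "wt") as outfile:
        outfile.write("first\nsecond\nthird\n")

    with open(path, "rt") as infile:
        whole = infile.read()            # entire file as one string

    with open(path, "rt") as infile:
        first_line = infile.readline()   # a single line, newline included

    with open(path, "rt") as infile:
        all_lines = infile.readlines()   # list of lines, newlines included

    lines = []
    with open(path, "rt") as infile:
        for line in infile:              # one line at a time
            lines.append(line.rstrip("\n"))

print(repr(whole))
print(repr(first_line))
print(all_lines)
print(lines)
```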

Example 2: Reading a Series of Numbers Into a List
    data = []
    with open("input.txt", "rt") as infile:
        for line in infile:
            data.append(float(line))
● Assumes each line holds only a single number

Example 3: Extracting a Value From Multicolumn Data
Need to split each line on whatever separates the columns (this example assumes separation by spaces or tabs)
    data = []
    with open("input.txt", "rt") as infile:
        for line in infile:
            record = line.split()
            data.append(float(record[2]))
● Selects values from third column in dataset

Exercise 5: Data File Handling
● Open a data file and read its contents
● Extract an item of interest

Opening Files For Output
    with open("output.txt", "wt") as outfile:
        Code that writes to outfile
    Rest of program
● "output.txt" is the name of the file
● "wt" is the mode string: wt for writing text files, at for appending to text files, wb for writing binary files
● outfile variable represents the opened file
● File will be closed automatically on leaving the with block and resuming rest of program (this is important)
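As a round trip through both kinds of file access, this sketch writes a small multicolumn file, appends to it, then reads it back and extracts the third column. The file lives in a temporary folder and its contents are made up. It relies on the standard behaviour of the mode strings: "wt" starts a new text file, replacing any existing contents, while "at" adds to the end of an existing one.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "output.txt")

    # "wt" creates the file, or empties it if it already exists
    with open(path, "wt") as outfile:
        print("a 1 2.5", file=outfile)      # print adds the newline itself
        outfile.write("b 2 3.5\n")          # write needs an explicit \n

    # "at" keeps the existing contents and adds to the end
    with open(path, "at") as outfile:
        outfile.writelines(["c 3 4.5\n"])

    # Read the file back and pick out the third column of each line
    data = []
    with open(path, "rt") as infile:
        for line in infile:
            record = line.split()           # split on spaces or tabs
            data.append(float(record[2]))

print(data)
```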

Writing To a Text File
High-level approaches:
● Call print function and specify destination file object using the file keyword argument
Low-level approaches:
● Call write method with a string as an argument (add \n if you want to write a whole line)
● Call writelines method with sequence of strings as an argument (each string ending in \n)

Simple Example
    ...
    with open("output.txt", "wt") as outfile:
        for item in data:
            print(item, file=outfile)
● Without file=outfile, file will be empty and output will appear on the screen!

Summary
We have
● Seen how lists and tuples can be used to store and manipulate sequences of values
● Seen how items and ranges can be extracted from lists, tuples and strings by indexing and slicing
● Seen how values can be stored and retrieved by key rather than position, using a dictionary
● Considered the array - a non-standard container type provided by the NumPy package
● Explored how file-based I/O is done in Python