Introduction to Programming
Day 3: Data Containers & File Handling

Nick Efford
Twitter: @python33r
Email: [email protected]
G+: http://plus.google.com/+NickEfford

What Did We Learn Last Time?
● Strings can be joined to each other using + and can be repeated multiple times using * and a number
● The string data type provides a large set of methods for manipulating strings in various ways
● Comparison operators like == and < can be used in Boolean expressions, yielding a True or False result
● These expressions can be used as tests in if statements to decide which path to follow in a program

Example
    # Program to display an exam grade
    # (NDE, 2013-06-10)

    mark = int(input("Enter exam mark: "))

    if mark >= 70:
        print("Distinction")
    elif mark >= 40:
        print("Pass")
    else:
        print("Fail")

Note the use of indentation!

What Did We Learn Last Time? (continued)
● while loops use Boolean expressions to decide whether a block of statements should run again or not
● for loops execute a block of statements once for every item in a sequence of some kind
● Sequences of numbers can be generated using range

    for num in range(10):
        print(num)

    ○ num: variable to hold each integer from sequence
    ○ range(10): generates integers from 0 to 9, one at a time
    ○ Loop body often uses loop variable, but doesn’t have to

Today’s Objectives
● To understand why we need ways of containing multiple values and referencing them as a single entity
● To explore three kinds of container built into Python: lists, tuples and dictionaries
● To introduce a useful, non-standard container type: the NumPy array
● To see how we read from and write to files

Why Do We Need Containers For Multiple Values?
Having variables to represent individual values is useful, but what if we need to manipulate lots of data?
    num1 = float(input("Enter first number: "))
    num2 = float(input("Enter second number: "))
    num3 = float(input("Enter third number: "))
    num4 = float(input("Enter fourth number: "))
    num5 = float(input("Enter fifth number: "))

    total = num1 + num2 + num3 + num4 + num5
    ...

Does this approach scale? What if there were 50 numbers instead of 5?...

Lists
● Created using [] or list()
● Can be empty or contain comma-separated items
● len function gives you list length (i.e., number of items)
● Items can be added using the append method

Try these lines in the Python shell:
    x = []
    x
    type(x)
    len(x)
    x.append(7)
    x.append(4)
    x
    len(x)

Useful Functions For Lists of Numbers
● min and max give minimum and maximum values
● sum gives the total of all values in the list
● Try these functions in the Python shell, for this list:
    data = [1, 4, 11, -2, 7]

Exercise 1
Modify mean.py from yesterday’s exercises so that it
● Reads user input into a list
● Computes the mean of the values in the list

Indexing of Lists
● We can access an individual item using an index representing its position in the list
● Indices are integers and start from zero!
● Index goes inside square brackets, following the name of the list variable; thus data[0] is first item in list data, data[1] is second item, and so on
● If a list contains N items, index of last item is N-1
● Many languages use zero-based indexing - notable exceptions include FORTRAN, Lua, MATLAB

Indexing Examples
Try the following in the Python shell:
    fruit = ["apple", "orange", "banana", "kiwi"]
    fruit[0]
    fruit[3]
    fruit[4]
    fruit[-1]
    fruit[-2]

Negative Indices
● Provide us with a nice way of indexing items backwards, starting from end of the list
    ○ -1 gives last item
    ○ -2 gives last-but-one item...
● Very useful for accessing last item of a list without ever having to know how big the list is
● Perl, Ruby & Mathematica also support this feature, but the vast majority of programming languages don’t

Slicing
● Two indices separated by a colon is a slice, giving us a range of items from the list
● Item at second index is not included in that range!
● Slice is assumed to include start of list if the first index is omitted, end of list if the last index is omitted

Try these examples, using previously-defined lists:
    data[0:3]
    data[2:4]
    fruit[0:2]
    fruit[1:3]

Why Isn’t Element at Second Index Included in a Slice?
● Means that difference between the two indices = number of items in the slice
● Allows for a ‘nice’ interpretation of slices in cases where one of the indices is missing:
    data[:2]   = first two items
    data[2:]   = everything after first two items
    data[-2:]  = last two items
    data[:-2]  = everything before last two items

Generality of Indexing & Slicing
● Indexing & slicing operations work on many different kinds of sequence, not just on lists
● This means we can index and slice strings!
● Try the following in the Python shell:
    s = "Hello World!"
    s[0]
    s[6]
    s[-1]
    s[1:4]
    s[:5]
    s[5:]
    s[-6:]

Replacing Indexed Items
● You can assign to an indexed list item to replace it
● Same thing won’t work for characters in a string, because strings are an immutable type
● Try the following in the Python shell:
    fruit[0] = "pear"
    s[0] = "J"

List Methods
    count     Returns number of occurrences of the given item
    index     Returns position of first occurrence of the given item
    append    Adds the given item to the end of the list
    extend    Extends list by appending items from another sequence
    insert    Inserts the given item before the given position
    pop       Removes and returns the item at the specified position
    remove    Removes the first occurrence of the given item
    reverse   Reverses the ordering of list items, in place
    sort      Sorts the list of items, in place

Exercise 2: Lottery Simulation
● Six numbers from the range 1 - 49, selected randomly, sorted into order and displayed
● A seventh randomly selected number acts as a ‘bonus ball’

Command-Line Arguments
● The sys module provides sys.argv, a list containing the program’s name and command line arguments
● CLAs are the items typed after the program name when you run it from an OS command prompt
● CLAs are useful for supplying things like filenames

    python statistics.py input.txt output.txt

    sys.argv[0]   statistics.py   (program name)
    sys.argv[1]   input.txt       (program argument)
    sys.argv[2]   output.txt      (program argument)

Exercise 3
● Modify an existing program to acquire data via command line arguments instead of the input function
● Program must terminate with a suitable usage hint if required arguments are not supplied

Tuples
● Similar to lists, but defined with () instead of []
● Can be indexed and sliced just like lists
● More limited: a tuple’s size is fixed after it is defined and you cannot replace existing items with new items
● Useful for representing things that don’t change in size e.g., (x, y) coordinates always contain two values

    shape = [(0,0), (0,75), (100,0)]

(Diagram: the three points (0,75), (0,0) and (100,0) held in shape.)
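A minimal sketch of the tuple behaviour described above, assuming the shape list from the slide. The extra vertex (100, 75) and the names first_vertex and replaced are made up for illustration. It shows that a tuple can be indexed and unpacked, that trying to replace one of its items fails with a TypeError, and that the list holding the tuples can still grow.

```python
# Vertices of a shape stored as (x, y) tuples, as on the slide.
shape = [(0, 0), (0, 75), (100, 0)]

# Tuples can be indexed just like lists.
first_vertex = shape[0]

# A two-item tuple can be unpacked into two variables.
x, y = shape[1]

# Trying to replace an item inside a tuple raises TypeError,
# because tuples cannot be changed after they are defined.
try:
    first_vertex[0] = 5
    replaced = True
except TypeError:
    replaced = False

# The list that holds the tuples is still mutable
# (this extra vertex is a made-up value).
shape.append((100, 75))

print(first_vertex, x, y, replaced, len(shape))
```

If an item really does need to change, one option is to build a new tuple and store it in place of the old one in the list.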
Iteration Over Strings, Lists & Tuples
Standard for loop fetches items from the sequence:

    for character in message:
        print(character)

    for value in data:
        print(value)

String, List & Tuple Comparisons
● Use == and != to test for equality or inequality, just as you would for numeric types
● Operators <, <=, >, >= work for strings, lists and tuples and operate ‘lexicographically’
    ○ First pair of items are compared and determine result unless they are equal - in which case, next pair of items are compared, etc...

Try these out now in the Python shell:
    "abc" == "abc"
    "abc" < "abcd"
    "abd" > "abc"
    [1,2,3] == [1,2,3]
    [1,2] < [1,2,3]
    (1,2,4) > (1,2,3)

Testing For Membership of Strings, Lists & Tuples
in operator can be used to test whether a string, list or tuple contains a given item:

    if "x" in word:
        print("Letter 'x' found in word")

    if 0 in data:
        print("Dataset contains a zero")

Try the second example in the Python shell.

An Associative Container: The Dictionary

Representing a Phone Book
How could we implement a phone book in Python?
Idea 1: List of names and a parallel list of numbers
    ["nick", "tony", "jane"]
    ["01234 567890", "09876 543210", "01357 246802"]

Idea 2: List of (name, number) tuples
    [ ("nick", "01234 567890"),
      ("tony", "09876 543210"),
      ("jane", "01357 246802") ]

Using a dictionary is a better solution:
    phonebook = {
        "nick": "01234 567890",
        "tony": "09876 543210",
        "jane": "01357 246802"
    }

Each entry pairs a key (e.g., "nick") with a value (e.g., "01234 567890").
● Dictionary is defined using {}
● We can look up values using [] and the key:
    print(phonebook["nick"])

Using Dictionaries
Try the following in the Python shell:
    fruit = {"apples": 5, "pears": 2}
    type(fruit)
    len(fruit)
    fruit
    fruit["apples"]
    fruit["oranges"]
    fruit["oranges"] = 3
    fruit

Dictionary Methods
    clear     Empties the dictionary of all items
    get       Returns value associated with the given key, or a default
    items     Returns a view of (key, value) pairs in the dictionary
    keys      Returns a view of the dictionary’s keys
    pop       Removes a key-value pair, given the key
    update    Updates the dictionary with data from another dictionary
    values    Returns a view of the dictionary’s values

Using Dictionary Methods
Try these in the Python shell:
    fruit.keys()
    fruit.values()
    fruit.items()
    fruit.get("oranges", 0)
    fruit.get("grapes", 0)
    fruit.pop("pears")
    fruit
    fruit.clear()
    fruit

Iteration & Membership Tests For Dictionaries
● for loop operating on a dictionary will iterate over keys
● Dictionary’s values and items methods can be used to supply values or key-value pairs to a for loop
    for f, n in fruit.items():
        print(f, n)
● in operator can be used to test whether a dictionary contains a particular key
    fruit = {"apples": 5, "pears": 2}
    "apples" in fruit
    "oranges" in fruit

A Useful Non-Standard Type: The NumPy Array
● Provided by NumPy, http://www.numpy.org
● Typically used with homogeneous numerical data
● Can be indexed and sliced like lists & tuples
● Much faster than using a list or tuple; many operations can be carried out without the need for Python loops
● Well-suited to representing multidimensional data

We will explore arrays in more detail on the final day of the course...

Exercise 4
● Create a sequence of x values as a NumPy array
● Generate a sequence of y values from x
● Plot y against x

Opening Files For Input
    with open("input.txt", "rt") as infile:
        Code that reads from infile
    Rest of program

    ○ "input.txt": name of file
    ○ "rt": mode string: rt for reading text files, rb for reading binary files
    ○ infile variable represents the opened file
    ○ File will be closed automatically on leaving the with block and resuming rest of program

File Types
Text files
● Consist of normal, printable characters
● Encoding dictates how characters are stored
● Input operations in Python yield strings
● Examples: .txt files, .html files, .py files
Binary files
● Consist of arbitrary bytes
● Can’t be manipulated in a text editor
● Input operations in Python yield bytes objects
● Examples: images, MP3 files, MS Office docs

Reading From a Text File
● Call read method to get entire file as one string
● Call readline method to get a single line
● Call readlines to get all lines, as a list of strings
● Use for to iterate over lines one at a time
Note: lines read via these methods include the newline character (\n) at the end!

Example 1: Reading Lines of Text Into a List
    lines = []
    with open("input.txt", "rt") as infile:
        for line in infile:
            lines.append(line[:-1])

    ○ lines = []: empty list needed to hold the text
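The reading methods and the trailing-newline behaviour noted above can be checked with a short sketch. It assumes a throwaway file whose name and three lines are made up for illustration, with no newline after the last line. It compares the [:-1] slice with the string method rstrip("\n"), which is one alternative way of removing the newline.

```python
import os
import tempfile

# Work in a throwaway directory so the sketch is self-contained.
# The file name and its three lines are made up for illustration.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "input.txt")
    with open(path, "wt") as outfile:
        # Deliberately no \n after the last line.
        outfile.write("first line\nsecond line\nthird line")

    with open(path, "rt") as infile:
        whole = infile.read()            # entire file as one string

    with open(path, "rt") as infile:
        first = infile.readline()        # a single line, newline included

    with open(path, "rt") as infile:
        raw_lines = infile.readlines()   # all lines, as a list of strings

# Slicing with [:-1] always drops the final character, newline or not.
sliced = []
for line in raw_lines:
    sliced.append(line[:-1])

# rstrip("\n") removes a trailing newline only when one is present.
stripped = []
for line in raw_lines:
    stripped.append(line.rstrip("\n"))

print(repr(whole))
print(repr(first))
print(raw_lines)
print(sliced)
print(stripped)
```

Comparing the last items of sliced and stripped shows why the slice is only safe when every line, including the final one, ends in \n.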
Notes on Example 1:
● for loop used here; could use readlines() instead if \n at end of line doesn’t need to be removed
● Slice line[:-1] gives us all of string except for final character (thereby removing \n)

Example 2: Reading a Series of Numbers Into a List
    data = []
    with open("input.txt", "rt") as infile:
        for line in infile:
            data.append(float(line))

    ○ Assumes each line holds only a single number

Example 3: Extracting a Value From Multicolumn Data
Need to split each line on whatever separates the columns (this example assumes separation by spaces or tabs)
    data = []
    with open("input.txt", "rt") as infile:
        for line in infile:
            record = line.split()
            data.append(float(record[2]))

    ○ record[2] selects values from third column in dataset

Exercise 5: Data File Handling
● Open a data file and read its contents
● Extract an item of interest

Opening Files For Output
    with open("output.txt", "wt") as outfile:
        Code that writes to outfile
    Rest of program

    ○ "output.txt": name of file
    ○ "wt": mode string: wt for writing text files, at for appending to text files, wb for writing binary files
    ○ outfile variable represents the opened file
    ○ File will be closed automatically on leaving the with block and resuming rest of program (this is important)

Writing To a Text File
High-level approaches:
● Call print function and specify destination file object using the file keyword argument
Low-level approaches:
● Call write method with a string as an argument (add \n if you want to write a whole line)
● Call writelines method with sequence of strings as an argument (each string ending in \n)

Summary
We have
● Seen how lists and tuples can be used to store and manipulate sequences of values
● Seen how items and ranges can be extracted from lists, tuples and strings by indexing and slicing
● Seen how values can be stored and retrieved by key rather than position, using a dictionary
● Considered the array - a non-standard container type provided by the NumPy package
● Explored how file-based I/O is done in Python

Simple Example
    ...
    with open("output.txt", "wt") as outfile:
        for item in data:
            print(item, file=outfile)

    ○ Without file=outfile, file will be empty and output will appear on the screen!
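The high-level and low-level ways of writing to a text file can be put side by side in one short sketch. The values in data and the use of a temporary directory are assumptions made so that the example is self-contained. Mode "at" is used for the second and third blocks so that each one appends to what the first block wrote.

```python
import os
import tempfile

data = [1.5, 2.0, 3.25]   # made-up values for illustration

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "output.txt")

    # High-level: print converts each item to a string and adds \n itself.
    with open(path, "wt") as outfile:
        for item in data:
            print(item, file=outfile)

    # Low-level: write needs a string, and \n must be added explicitly.
    with open(path, "at") as outfile:      # "at" appends to the existing file
        for item in data:
            outfile.write(str(item) + "\n")

    # Low-level: writelines takes a sequence of strings, each ending in \n.
    text_lines = []
    for item in data:
        text_lines.append(str(item) + "\n")
    with open(path, "at") as outfile:
        outfile.writelines(text_lines)

    # Read everything back to see what ended up in the file.
    with open(path, "rt") as infile:
        contents = infile.read()

print(contents)
```

All three approaches leave the same three lines in the file; print is usually the most convenient because it handles the string conversion and the newline.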