COSC 121 Introduction to Programming
Richard Lobb, Erskine Building room 211
Email: [email protected]

COSC 121S1: Intro to Programming. ©Richard Lobb, 2012

1. Administrative bumph
• See Initial Course Handout
• Fire regulations and exit locations
• Student reps picked second week
• People
• Locations: lectures and labs
• Items of assessment (next few slides)
• Important dates
• Textbook
• Learn site + forums

People
• Andy Cockburn
  – Course Supervisor
  – A real staff member
• Phil Holland
  – King of the Labs
  – Room 112
• Yalini Sundralingam
  – Tutor coordinator
  – Room 332
• Me
  – Richard Lobb
  – See next slide
• Marina Filipovic
  – Tutor in charge of 121 labs
  – Room 321

Locations
• Lectures (all here, we hope)
  – Monday, Thursday 10 – 10:50am
  – Wednesday 11 – 11:50am
    o NB: No lecture this first Monday.
• Labs – all in Lab 2 aka Room 133, Erskine
  – 6 streams – see Uni course page for COSC121
  – You should have been allocated to a lab stream
    o Check out your UCStudentWeb / MyTimetable page.
  – You can use other lab times if there are spare machines
• Labs start next week

Who am I?
• Richard Lobb
  – Room 211, Erskine Building (in 333 until end of March)
  – Adjunct Senior Fellow
  – “Retired” from full-time academia
    o Was in CS dept at Auckland from 1978 to 2003
    o Computer graphics was my area
  – Passionate about programming
  – This year teaching:
    o COSC 121 Introduction to Programming (Python)
    o ENCN305 Computer Programming & Stochastic Modelling (Matlab)
    o ENCE 260 Computer Systems (C)
    o COSC 365 Web Computing (PHP, C#, JavaScript)

Assessment
The first quiz is THIS WEEK

What?                     Worth   When?
Lab quizzes               10%     Every week (10 @ 1%)
Mid-course quiz/test      15%     Week of April 23rd (TBA)
Programming Assignment    20%     Due: 5pm, 29 May
Examination               55%     To Be Announced

NOTE: To achieve a full pass (C or better) that will allow you to advance in Computer Science you must achieve:
(a) at least 45% over the two invigilated items combined (test + exam)
(b) at least 55% overall.

Textbook
• Available from:
  – Bookshop (178 copies @ 7/2)
    o $61 (less 10% = ~$55)
  – E-book from http://pragprog.com/titles/gwpy
    o $US22
• Highly recommended
• Course is built around it
• We will assume you have a copy

Other resources
• "How to Think Like a Computer Scientist: Interactive Edition"
  – http://thinkcspy.appspot.com/build/index.html
  – A great interactive text, but discovered too late for this year
  – BUT: it uses Python 3, not Python 2.
• Online Python Tutor:
  – Visualise python execution, stepping forwards and backwards
  – http://people.csail.mit.edu/pgbovine/python/
  – BUT: visualisations of data structures different from my notes
• Python exercises: codingbat.com/python
• If you're already a programmer:
  – The Python tutorial: docs.python.org/tutorial/
  – Dive into Python: www.diveintopython.net/

Mainly a programming course
• You have to acquire new cognitive skills
  – It’s NOT about using Microsoft Office or any other package
  – It’s NOT about learning lecture notes by rote
  – It’s NOT about hacking at code downloaded from the web
• Labs and assignments are where you learn to program
• Lectures provide the context, e.g.:
  – Overview
  – Motivation
  – Expectations
  – Focus on specific difficulties
  – Demonstrations
  – Program “style”

How not to run a Marathon
• Richard rants and raves.

How to succeed in 121
• Do all the labs, on time.
• Do the assignment thoroughly, getting started early.
• Don’t give up when the going gets tough
• Try to solve problems by yourself
  – Read the book
  – Experiment with code
  – Google
• Don’t just “hack” at code until it works
  – Work out what’s wrong before continuing

Why do all the labs?
[Chart: COSC 121S1, 2010. Average final exam mark (axis 0 to 70) versus number of labs attempted (axis 0 to 10).]

Demo
• The Learn website

Lecture notes
• All lecture notes are on Learn
  – After this week you must print your own copies

Timetable (tentative)
Note: The due date for each lab is at the end of the week in which it’s timetabled. Late submission (of the associated quiz) is permitted for at most one further week.

Programming Challenges Workshop
• Just for fun!
• Goals:
  – to provide extra challenges for top students
    o But 121 students will need prior programming experience, sorry
  – to prepare selected students for programming contests
    o e.g. ANZAC, NZ Programming Contest, ACM ICPC
• Wednesday evenings, 7pm, starting 29 February (?)
  – Staff and student tutors will act as mentors
  – First round of ANZAC contest is 31 March
    o See http://www.cse.unsw.edu.au/~elgindyh/anzac12/home.htm
• Contact [email protected] for info/details

1. Getting Started
• Read textbook, chapters 1 and 2
• What’s 121 about anyway?
• The What and Why of Python
• Getting started, here and at home
• Expressions
• Assignment statements
• Functions

What’s 121 about anyway?
• “Introduction to programming”
• Programming underpins Computer Science
  – But Computer Science is not (just) programming
    o Theory (e.g. computability, algorithmic complexity)
    o Algorithms and data structures
    o Languages and Operating Systems
    o Databases
    o Software Engineering
    o Artificial Intelligence
    o Data Communications & Networks
    o Graphics and Human-Computer Interaction
    o Web computing
    o Etc, etc

The What and Why of Python
• We teach programming in the Python language
  – Prior to 2010 we used Java, but hard for beginners to get into
  – Basic programming skills are language independent
  – In later courses you’ll learn C, Java, C#, JavaScript, …
• Python is:
  – Free
  – “Elegant”
  – Powerful
  – Relevant

XKCD’s view (xkcd.com)
[Cartoon not reproduced]

Getting started, here and at home
• Do the week 1 quiz: “Welcome to COSC 121”:
  – Log in to website learn.canterbury.ac.nz
  – Select COSC 121
  – Follow the link on the front page.
• At home, to get ready for the rest of the labs:
  – Download and install
    o Python 2.6 or 2.7 from python.org/download [NOT 3.n!]
    o Wing 101 from www.wingware.com/downloads/wingide-101
    o Python Imaging Library from www.pythonware.com/products/pil/

Starting a lab exercise
(Looking ahead to lab 1 in week 2)
• Log in to learn.canterbury.ac.nz and
  – Select 121
  – Select Lab Material
  – Download the zip archive for the required lab
  – Unzip it
  – Click the associated quiz link to start taking the quiz
• Launch Wing101 and start doing the lab ☺

Wing 101 (DEMO)
[Screenshot of Wing 101, labelling the program editing area and (NB!) the Python Shell pane]

The Python Shell (DEMO)
• Bottom right pane in Wing
• A “terminal” interface to the Python engine
  – The Python engine is the program that executes (“interprets”) Python instructions (“programs”)
    o A “virtual machine” or “scripting engine”
[Figure: the Shell connected to the Python engine]

Expressions (DEMO)
• Python shell prints the value of any expression entered
• An expression: something that can be evaluated to yield a value
  – Typically a sequence of operands and operators
  – E.g. (25 * 3 - 5) / 7
    o Operands here are: 25, 3, 5, 7
    o Operators: *, -, /
    o Evaluates to 10
• Arithmetic operators (in lab 1):
  – +, -, *, /, **, %
  – Last two are exponentiation and modulus operators

Expressions (cont’d)
• Exponentiation: 2**3 is 8 (i.e. 2³)
• Modulus: 26 % 3 is 2 – the remainder after dividing 26 by 3
• Operator precedence determines order of evaluation
  – ** highest, then *, /, %, then +, - [but more operators later]
    o Left-to-right (usually) if operators have same precedence
  – Parentheses used to change default order
    o 2 + 3 * 5 is 17, (2 + 3) * 5 is 25

Expressions (cont'd)
• Operand types (in lab 1):
  – int (normal and long variants): whole numbers, e.g. 28196
    o Exact
    o Any arbitrary size/accuracy
  – float: numbers with fractional bit, e.g. 3.1415926
    o Approximate: ~16 digits accuracy
    o Stored in binary representation so even numbers like 1.1 are approximate
      – But 1.5, 1.25, 1.125 etc have exact representations!

Warning: integer division
• 5 / 10 evaluates to 0
  – “How many times does 10 go into 5?”
  – Answer: “No times”!
• 5.0 / 10.0 evaluates to 0.5
  – So does 5.0 / 10 and 5 / 10.0
    o Because ints get converted to floats when doing mixed-type division
• THIS WILL GET YOU TIME AND TIME AGAIN!
• Changed in Python 3, but we’re not using that.
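
The integer-division trap above is easy to check for yourself. The following is a minimal sketch, not part of the original slides. It uses the // operator, which the slides do not introduce, because // always performs the whole-number division that Python 2's / performs on two ints, so the sketch gives the same results under Python 2 and Python 3.

```python
# '//' is whole-number (floor) division: what Python 2's '/' does when
# both operands are ints. It is used here (an assumption beyond the
# slides) so the results match in Python 2 and Python 3.
whole = 5 // 10          # "How many times does 10 go into 5?"
mixed = 5.0 / 10         # the int 10 is converted to a float first
forced = float(5) / 10   # converting one operand yourself avoids the trap

print(whole)
print(mixed)
print(forced)
```

If a calculation unexpectedly produces 0, a division between two ints is the first thing to look for.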

Assignment statements (DEMO)
• Python shell executes any statements you enter
• Our first type of statement is the assignment statement
  – E.g. my_age = 200 / 2 - 1
• Of form variable_name = expression
• A variable name must be a letter (or underscore) followed by any number of alphanumeric characters (letters, underscores or digits)

What Python does
1. Works out the value of the Right Hand Side (RHS)
2. Creates a new object (in the “object store” or heap) to hold that value
   – In this case an int (i.e., “Integer”) object with the value 99
3. Adds the variable name to the current “dictionary” of variables (unless it’s already there)
4. Sets the dictionary entry to point to the new object
   – We call this a reference to the object
Warning! This is a simplification. See “aliasing” later.
[Figure: the dictionary (also an object) has an entry my_age pointing to the new int object 99 in the object store (a.k.a. heap), which also holds other objects such as “Fred” and 37.5]

What a reference actually is
• The computer has lots of random access memory (RAM)
  – e.g. 4 gigabytes (GB) where a byte is 8 bits, e.g. 01101101
• Bytes are numbered 0, 1, 2, 3, 4, .... 4 GB
  – The number of each byte is called its address
• A reference to an object is the address in memory at which the object is stored (i.e., where it starts)
  – In Python it's called the object’s identity
• Thus a dictionary entry consists of a variable name (called a “string”) together with the identity of the object it references
  – Shown as an arrow in the figures

Using variables
• When a variable name appears in an expression, the associated object’s value is used in its place, e.g.
    his_age = my_age - 1
    my_age = my_age + 1    # '=' is not "equals"!!!
• '=' means “is assigned the value”
[Figure: the dictionary entries my_age and his_age now point to new objects 100 and 98 in the object store (a.k.a. heap); the old object 99 is defunct; other objects such as “Fred” and 37.5 are unchanged]

Combined Operators
• We often use operations like
    count = count + 1    # Increment the count
    size = 2 * size      # Double the size
• So Python provides short-cut “combined operators”, e.g.
    count += 1
    size *= 2

Functions
• The key to programming is abstraction
  – “Abstraction: the process of formulating generalised ideas or concepts by extracting the common qualities from specific examples” – Collins English Dictionary
  – Naming a concept is a key part of abstraction
• Example: “Hey, I often need to multiply a number by itself. I know, let’s call that squaring a number”
• In Python, functions are used for abstracting common procedures (i.e., sequences of operations)
  – We’ll see other abstraction methods – modules and classes – later.

An example function
    def square(x):        # x is called a "parameter"
        return x * x      # the "body" is indented
• Used by “calling” or “invoking” it, e.g.
    square(3)             # 3 is called the "argument"
    square(37.5)          # Here 37.5 is the argument
    square(2 + 3 * 5)     # The argument is an expression
• The parameter is set to the value of the argument and then the body of the function is executed
• In this case (but not always) it returns a value
  – The value of the function

What type is the parameter?
• In many languages we have to specify the parameter type
  – e.g. specify whether we are squaring ints or floats
  – That restricts the allowable argument types
• Python has “Duck Typing”
  – “If it walks like a duck and quacks like a duck, it’s a duck”
  – In this case: if the argument allows x * x it’s OK
    o If not, it crashes when we run it
• So you can square ints and floats
  – And complex objects, but we don't do them in 121
    o Well, maybe a quick demo?
  – And any other objects we might define that allow ‘*’
    o We probably won't do that in 121 either

Running programs in Wing (DEMO)
• Typing functions directly into the Python shell is clumsy
• Instead we enter them into a program file
• Then we can:
  – Run the file
  – Edit it easily
  – Come back to it days later
  – Re-use the functions in other programs
• We are now programming ☺

Another example
    def fahrenheit(degrees_c):
        degrees_f = (9.0 / 5.0) * degrees_c + 32.0
        return degrees_f

    print fahrenheit(0)         # What answer do we get?
    print fahrenheit(100)       # What answer here?
    print fahrenheit(451.0)     # And here?
    print fahrenheit("Fred")    # What does this do?
Note multiline body. All lines indented by same amount. Also note local variable.
• The above is a program in a separate file
• Now we can’t just write expressions and have them printed
• We have to use a print statement. Covered in detail later.

Local variables
• degrees_f is a “local variable” of the fahrenheit function
• Goes in a new dictionary belonging to that function
  – That dictionary exists only while the function is running
    o So variable disappears when function returns
• We say the scope of a local variable is the body of the function in which it is used
  – Scope is where a variable can be “seen” from

When do I use functions?
Don't expect to understand all this properly yet!
• Always!
• Programming is the art of breaking a problem into small “obviously correct” functions
  – “Divide and conquer”
• Each can be separately debugged
  – To “debug” is to remove the “bugs”, i.e., errors, from a program
• Most functions should be less than 10 lines
• No functions may be longer than 40 lines in COSC121
  – Break big functions into smaller functions

Built-in functions
• Python has lots
  – Though most of its library functions are methods not simple functions – see later.
• Some early samplers:
  – round(x) returns the float value x rounded to the nearest whole number (in Python 2 the result is still a float, e.g. 3.0)
  – int(x) converts x into an int
    o If x is a float, it truncates.
    o Later we'll see x can also be a string.
  – abs(x) returns the absolute value of x
• You’ll meet lots more in due course

Week 2: Strings and Modules

2. Strings
• So far we have met just int and float objects.
• To process text, we use string objects.
• A string is a sequence of characters
• In COSC121 we use only “normal” Python strings
  – Can represent only standard western keyboard characters (“Latin1”)
  – We ignore unicode strings, which can represent vastly more characters, including e.g. Chinese
    o But note that Python 3 uses only unicode strings

String literals
• A literal is just a constant
  – e.g. 10 is an int literal, 23.5 is a float literal
• String literals in Python are text enclosed by matching delimiters, which can be:
  1. single quote (') characters, e.g.
     o s = 'Hi there class 121!'
  2. or double quote (") characters, e.g.
     o s = "Hi there class 121!"
  3. or triple single or double quote characters, e.g.
     o s = '''Hi there class 121!'''
     o s = """Hi there class 121!"""

Pictorially
[Figure: the dictionary entry s points to the str object “Hi there class 121!” in the object store (a.k.a. heap), which also holds other objects such as 37.5 and 98]
• s is a variable of type str, a simple Latin1 (usually) string
• Note for C and Java programmers: Python does not have a data type for representing single characters. You won’t miss it though – just use 1-character strings.

Interlude: character encoding
• Computer memory is just a sequence of bytes
  – And an object occupies a chunk of memory
• A byte is 8 bits e.g. 00110011
  – A bit is a binary digit: either 0 or 1
• Each byte can have 2⁸ (= 256) possible states
  – So can represent the numbers 0 through 255 inclusive
• Mapping from byte values to characters is via a character encoding table
  – e.g. 48 is the character ‘0’, 49 is ‘1’, 65 is ‘A’, 66 is ‘B’ etc.

Latin1 encoding (ISO 8859-1)
• The encoding typically used within Python’s str objects
• A sample bit is shown on the right [table not reproduced]
• See htmlhelp.com/reference/charset for full table
• There are some non-printing characters, e.g.:
  – 9 is “horizontal tab”
    o ... by an unspecified amount!
  – 10 is “line feed”
    o Used to start a new line
Interlude ends

But before we continue with strings ... The print statement
• Syntax:
  – print expression [, expression] ... [,]
    i.e. the word “print” followed by one or more expressions, separated by commas, whose values are to be printed
    o Square brackets in syntax specifications denote optional elements
    o “...” denotes zero or more repetitions of the preceding syntax element
• Prints the values of the expressions on the screen
  – Spaces separate the output expressions
  – The optional comma at the end suppresses the final newline
• Used to generate output in programs (or in the shell)
  – An expression on a line by itself in a program doesn’t generate output as it does when typed into the shell.
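
Looking back at the character-encoding interlude, the mapping between byte values and characters can be explored with the built-in functions ord and chr. These two functions are not introduced on the slides and appear here only as an illustration. The sketch is restricted to values below 128, where Python 2 and Python 3 agree.

```python
# ord gives the code of a one-character string; chr goes the other way.
code_for_A = ord("A")
char_for_66 = chr(66)
tab = chr(9)           # horizontal tab, a non-printing character
newline = chr(10)      # line feed, the same character as "\n"

print(code_for_A)
print(char_for_66)
print(newline == "\n")
```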

Example (DEMO)
• The following program:
    n = 10
    s = "n is"
    print n
    print s, n
    print s,
    print n
• Outputs:
    Evaluating printExample.py
    10
    n is 10
    n is 10
    >>>

Which delimiters should I use?
• Use double quote character (") as string delimiter unless string contains a double quote character. Then use single quote delimiters, e.g.
    >>> print '"Hi", he said'
    "Hi", he said
  Or vice-versa!
• Use the triple-delimiters (''' or """) when the string includes newline characters, e.g.
    s = '''One fish
    Two fish
    Red fish
    Blue fish'''
    print s
  – In particular, see docstrings later

Note on statement termination
• Python statements end at newline except:
  – Inside triple-quoted strings (as above)
  – Inside bracketed expressions (i.e. ( ... ) or [ ... ]), e.g.
        cost = (23.5 *
                36)         # This is valid (but ugly)
  – When newline preceded by backslash (\), e.g.
        cost = 23.5 * \
               36           # Also valid (and also ugly)

Special characters
• Can embed special characters like tabs, newline and quote characters in a string with special “escape sequences” e.g.
    s = "\"One fish\nTwo fish\nRed fish\nBlue fish\""
    print s
• Outputs
    "One fish
    Two fish
    Red fish
    Blue fish"
• Escape sequences:
    \n      is newline
    \t      is tab
    \'      is single quote
    \"      is double quote
    \xhh    is the character with hexadecimal value hh
    \\      is a backslash
• See Language Reference Manual section 2.4.1 for complete syntax

String operations
• Three of the arithmetic operators are overloaded to work on strings:
  1. '+' performs string concatenation
     o Both operands must be strings
  2. '*' performs string repetition
     o One operand a string, the other an int
  3. '%' performs string formatting
     o Left operand must be a string, right operand a value or a "tuple"
     o This is deprecated
     o Textbook uses it but we won’t! [You don’t need to understand it.]
• Other string operations need indexing and method calls
  – See later

String concatenation examples (DEMO)
• The program
    s = "Hi"
    print s + "there"
    print s + " " + "there"
    print "Hi " "there"        # Two literals, no operator
    n = 10
    print "n is " + str(n)     # What happens without 'str'?
• Outputs
    Hithere
    Hi there
    Hi there
    n is 10

String repetition examples (DEMO)
• The program
    s = "Hi"
    print 3 * s
    print s * 3
    n = 5
    print n * s
    print s * n
• Outputs
    HiHiHi
    HiHiHi
    HiHiHiHiHi
    HiHiHiHiHi

Formatting
• Different from textbook: we don’t use ‘%’ operator
• Function str converts values to strings, e.g.
  – str(23) yields the string '23'
  – str(2.0 / 3.0) yields the string '0.666666666667'
  – But no control over formatting
• Function format(value, format_spec) formats the value as a string according to the given format specifier
  – e.g. '6.2f' means “format a floating point number in a field of 6 characters with 2 digits after the decimal point”
  – format(value) ≡ format(value, "") ≡ str(value)
• Best to explain with examples ...

Examples (DEMO)
    name = "Richard"
    n = 20
    print "Hi " + format(name, "s") + ", pleased to meet you"    # Conversion type s
    print "n = " + format(n, "d")                                # Conversion type d
    # Above 2 lines could have used just 'str' equivalently

    from math import pi
    print pi
    print "pi is " + format(pi, "5.3f")    # 5 is the minimum field width, 3 the precision
    print "pi is " + format(pi, "8.3f")
    print "pi is " + format(pi, ".3f")
    print "pi is " + format(pi, ".3g")
    print "Tiny num is " + format(pi/100000, ".3g")
    print format(name, "20")
    print format(name, ">20")
    print format(name, "^20")
Note: Conversion type can usually be omitted for strings and ints

Example output
    Hi Richard, pleased to meet you
    n = 20
    3.14159265359
    pi is 3.142
    pi is    3.142
    pi is 3.142
    pi is 3.14
    Tiny num is 3.14e-05
    Richard
                 Richard
          Richard
• Many other capabilities e.g. binary, octal, hexadecimal, percentages, commas for thousands, arbitrary fill characters
• See http://docs.python.org/library/string.html#formatspec

raw_input (DEMO)
• One more useful function: raw_input([prompt])
• Displays prompt (if given) and reads a line from keyboard
• Returns the line as a string, e.g.
    name = raw_input("What is your name? ")
• If you want to read numbers, convert the string using int or float as appropriate, e.g.
    age_as_string = raw_input("How old are you? ")
    age = int(age_as_string)
  Or
    weight = float(raw_input("What's your weight in kg? "))

In class exercise
• Write a program that prompts for a person’s weight in kg and height in metres and prints their Body Mass Index
  – BMI = weight / height² (in kg/m²)

3. Modules
• To control complexity, large programs are always broken into smaller modules
  – Collections of functions (plus perhaps data)
• One module imports the code and data from another
• The Python library is a large collection of modules

Importing the math module (DEMO)
• Can import the entire module and its namespace
    import math
    print math.pi               # An imported data value
    print math.sqrt(23.456)     # An imported function
• Or import selected data/functions into current namespace
    from math import pi, sqrt
    print pi
    print sqrt(23.456)

Finding what’s in a module (DEMO)
1. Read the Python Standard Library documentation
   – Accessed via Wing’s Help menu
   – Or via http://www.python.org/doc/
2. In the shell window, import the module and type help(moduleName), e.g. help(math) or, for its directory, dir(math)
3. Google, e.g., python math module
4. For details on a particular function, can use the on-line help’s index or type help(moduleName.functionName)

Interlude: A bit about namespaces
• A namespace is just a dictionary of names
• Consider:
    my_age = 100
    import math
• We then have:
[Figure: the “Global” dictionary holds the names math and my_age. my_age refers to the int object 100; math refers to the math dictionary, whose names such as sqrt and pi refer to the code for the sqrt function and to 3.14159... in the object store (a.k.a. heap). These are two different namespaces.]

Local namespaces (DEMO)
• Consider the following program
    x = 10

    def blah():
        x = 20
        print "Within blah, x =", x

    print "Initially, x =", x
    blah()
    print "Post-blah, x =", x
• Function has its own local namespace.
• Output is:
    Initially, x = 10
    Within blah, x = 20
    Post-blah, x = 10

Using globals
• A function can “see” variables in the global namespace, e.g.
    x = 10

    def blah():
        print "Within blah, x =", x    # Prints 10!
• But new variables created by assignment inside a function are added to the local name space.
• When evaluating expressions, Python looks first in local namespace, then in global namespace if name not found.

Using globals (cont'd)
• It’s illegal to reference a global variable from within a function and then to create a local one of the same name. e.g., the following gives a runtime error when blah is called:
    x = 10

    def blah():
        print x
        x = 20
• Python interprets this as a local variable being used before it is defined.
  – We say the scope of a variable is the entire function
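
The scope rule above can be seen in action with the following minimal sketch, which is not from the original slides. It uses try/except, which has not been covered at this point in the notes, purely so that the program can report the error and carry on. The name outcome is just an illustrative variable.

```python
x = 10

def blah():
    # Because x is assigned somewhere in this function, Python treats x
    # as local throughout the whole body, so this read happens before
    # the local x has been given a value.
    y = x
    x = 20
    return y

try:
    blah()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)
```

The error is raised when blah is called, not when it is defined, which is why the def itself is accepted without complaint.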

Assigning to a global
• BUT can assign to a global variable using a global statement:
    x = 10

    def blah():
        global x
        print "In blah, x =", x
        x = 20

    print "Initially, x =", x
    blah()
    print "Post-blah, x =", x
• Output is:
    Initially, x = 10
    In blah, x = 10
    Post-blah, x = 20

Use of global variables in 121
• A very simple rule: DON’T
  – i.e., don’t read from or write to global variables from within a function body
Interlude Ends

Writing your own modules
• A module is nothing more than a file of Python code, e.g. a file circle.py:
    import math

    def area(radius):
        return math.pi * radius**2

    def circumference(radius):
        return 2 * math.pi * radius

Using the circle module
• Just import it and use it!
    import circle
    r = 5.0
    area = circle.area(r)
    circum = circle.circumference(r)
    ...
• import x causes Python to load and execute the file x.py
  – Must be in the current directory or on the Python search path
    o Don’t worry about the latter for now

Documenting your modules
• Triple-quoted strings at the start of modules and function bodies are docstrings, e.g. for circle.py
    '''A module of functions related to circles.'''
    import math

    def area(radius):
        '''Returns the area of a circle given its radius.'''
        return math.pi * radius**2

    def circumference(radius):
        '''Returns the circumference of a circle given its radius.'''
        return 2 * math.pi * radius

Output of help(circle) is now:
    Help on module circle:

    NAME
        circle - A module of functions related to circles.

    FILE
        h:\work\2011\121s1\lectures\circle.py

    FUNCTIONS
        area(radius)
            Returns the area of a circle given its radius.

        circumference(radius)
            Returns the circumference of a circle given its radius.
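
The text that help displays is taken from the docstrings themselves, which Python stores on each function object in an attribute named __doc__. That attribute is not mentioned on the slides; the sketch below uses it only to show that a docstring is ordinary data attached to the function. It repeats one function from circle.py so that it runs on its own.

```python
import math

def area(radius):
    '''Returns the area of a circle given its radius.'''
    return math.pi * radius**2

# The docstring is stored on the function object itself.
doc = area.__doc__
print(doc)
print(area(1.0))
```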
COSC 121S1: Intro to Programming. ©Richard Lobb, 2012 Slide # 73 Style in COSC121 All modules should have: – A docstring for the module o – A docstring for each function o – – At the very start of the file, before any imports Immediately after the def line A blank line between the docstring and the function body Three blank lines between functions Also, lines shouldn't be more than 80 characters long – i.e., don’t cross that red line in Wing! COSC 121S1: Intro to Programming. ©Richard Lobb, 2012 Slide # 74 More on import When a module is imported, it is executed – That’s when the function objects get defined. – Also, any module globals (like math.pi) get defined then. If a module has already been imported, nothing happens When a module is imported, its __name__ variable is set to the name of the module When a module is run, its __name__ is set to “__main__” COSC 121S1: Intro to Programming. ©Richard Lobb, 2012 Slide # 75 Using “__main__” It’s standard practice to include module test code at the end of a module Execute it only if name is __main__ But we haven’t done if statements yet! – Textbook however introduces if statements at this point without explanation – We'll have a quick advance peek DEMO COSC 121S1: Intro to Programming. ©Richard Lobb, 2012 Slide # 76 Week3: Objects, methods and lists COSC 121S1: Intro to Programming. ©Richard Lobb, 2012 Slide # 77 Objects We’ve seen that everything in Python is an object – int objects, float objects, str objects, function objects ... An object contains data – The value, for int and float objects – The sequence of characters, for str objects – The Python code, for function objects But wait, there’s more .... COSC 121S1: Intro to Programming. ©Richard Lobb, 2012 Slide # 78 Methods Each type or class of objects has a set of functions that operate on objects of that type These are called methods. We call a method with the syntax objectName.methodName([argument]...) 
This is roughly equivalent to functionName(objectName, [argument]...) – i.e., calling a method of a particular object is like calling a (roughly equivalent) function that takes the object as its first parameter – This will make sense much later in the course ☺ COSC 121S1: Intro to Programming. ©Richard Lobb, 2012 Slide # 79 Some string methods DEMO capitalize() find(substring [, begin [, end]]) In this slide, square brackets denote optional parameters – you don’t actually type them. lower() upper() strip([chars_to_strip]) startswith(prefix [, start [, end]]) endswith(suffix [, start [, end]]) Return a boolean. See later. split([delimiter]) # Returns a list – see later format(value [,value]…) # Format using a template COSC 121S1: Intro to Programming. ©Richard Lobb, 2012 Slide # 80 String method example program • A program to read a full name like “natalie ng”, break it into its two components, correctly capitalise each one, and print a "Hi" message. Handles mixed case, e.g. “nATAlie nG”. full_name = raw_input("Enter your full name: ") pos_of_space = full_name.find(' ') first_name = full_name[0:pos_of_space] last_name = full_name[pos_of_space+1:] Extract appropriate "slices" of the string. See later. [There are better ways, but this is probably the easiest at this stage.] corrected_first_name = first_name.capitalize() corrected_last_name = last_name.capitalize() print "Hi", corrected_first_name, corrected_last_name COSC 121S1: Intro to Programming. ©Richard Lobb, 2012 Slide # 81 Formatting from a template We've seen lots of string-formatting expressions like s = "(" + format(x, ".3f") + ", " + format(y, ".3f") + ")" Cumbersome! 
The format method of a string achieves this much more easily:
    template = "({0:.3f}, {1:.3f})"
    s = template.format(x, y)
or just
    s = "({0:.3f}, {1:.3f})".format(x, y)
DEMO
– The result is the template string with the replacement fields (in braces) replaced by the formatted argument values
– A replacement field is an argument index followed by an (optional) colon and a format specifier (as in the format function).

More formatting examples
Reference: http://docs.python.org/library/string.html#formatstrings
DEMO

    import math

    first_name = "Lucy"
    last_name = "Languid"
    third = 1.0 / 3.0
    print "{0} {1}".format(first_name, last_name)
    print "{} {}".format(first_name, last_name)
    print "One third to 4 dec places is {:.4f}".format(third)
    print "  n   sqrt(n)"
    for i in range(100):    # Laying out a table in columns
        print "{0:3}{1:10.5f}".format(i, math.sqrt(i))

We have all the same options as before for formatting each argument
– Binary, octal, hex, left/centre/right justification, general numeric format, etc.

Methods of other object types
All objects have methods
– The object’s type (or “class”) determines the set of methods
Even int and float objects
– But mostly their methods are for use internally by the Python engine, e.g. if i and j are ints:
   o i.__add__(j) is exactly equivalent to i + j
   o __add__ is the “add this int to another” method

Finding the available methods
Read the documentation (e.g. via help in Wing), or
Use dir and/or help in the shell, e.g.

    s = "blah"       # Create an object of the type we're interested in
    dir(s)           # A "directory listing" of the methods of s
    help(s.index)    # Help on a particular method
    help(type(s))    # Help on the entire string type (aka "class")
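A minimal sketch pulling together the format method, __add__ and dir from the slides above. It is written so that it runs under either Python 2.7 or Python 3, which is why it builds strings rather than printing them. The helper name sqrt_table and the sample values are assumptions made for illustration only.

```python
import math


def sqrt_table(n):
    """Return formatted rows: i in a 3-wide column, sqrt(i) to 5 decimal places."""
    rows = []
    for i in range(n):
        rows.append("{0:3}{1:10.5f}".format(i, math.sqrt(i)))
    return rows


# Replacement fields are an argument index, a colon and a format specifier.
point = "({0:.3f}, {1:.3f})".format(1.0 / 3.0, 2.0)

table = sqrt_table(3)

# The + operator on ints is carried out by the int method __add__.
same_sum = (3).__add__(4) == 3 + 4

# dir lists the names of an object's methods as strings.
has_upper = "upper" in dir("blah")
```

Typing point or table at the shell prompt after running this shows the formatted results.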
Another example: images
DEMO
The Python Imaging Library (PIL) does image processing
It contains a set of submodules
– Must explicitly import the one(s) we want
Image submodule does reading of images, resizing, cropping, rotating, pixel editing, various transformations, etc.
For example:

    from PIL import Image                  # Get the Image submodule
    my_image = Image.open("photo.jpg")
    new_image = my_image.rotate(90)        # Rotate 90 degrees
    new_image.save("rotated_photo.jpg")    # Save new image

Lists
Many/most computer programs handle collections of data
– a list of students, a sequence of temperature samples, an array of image pixels, a set of university courses, a table of measurements ...
Most such collections can be represented in Python by its list data type.
A list is a sequence of objects that can be processed sequentially.
The Python list also allows immediate access to any element by subscripting, e.g. marks[i] for the ith mark
– In maths notation, we’d write this as $marks_i$
– So a Python list is both a list and an array

Some examples of lists

    days_in_month = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    guys = ["Freddie", "Brian", "Roger", "John"]
    colours = ["Red", "Green", "Blue"]
    squares = [0, 1, 4, 9, 16, 25, 36, 49]
    square_roots = [1.0, 1.4142135623, 1.732050807, 2.0, 2.236067977]
    great_thoughts_of_george_bush = []    # The empty list
    # A list of objects of different types. Legal in Python but bad style.
    # We'll see better ways of representing such "records" later.
    personal_details = ["Erika Mandelbrot", 27, "5 Nowhere St, Christchurch"]

Indexing into lists
To use lists we need to be able to get at the individual elements
Do by indexing, e.g.
    print days_in_month[0]    # Prints 31
    print colours[2]          # Prints "Blue"
NB: subscripts start at 0!!
    print squares[len(squares) - 1]    # Prints 49
   o len function returns the number of items in a list
    print squares[-1]    # Also prints 49
   o If subscript is negative, Python adds len(list) to it
    print squares[-2]    # Prints 36

How lists are represented
    names = ["Fred", "Mary", "ChinMay"]
results in:
[Diagram: the dictionary entry names refers to a list object; the list object's elements refer to the string objects "Fred", "Mary" and "ChinMay" in the object store.]
The list object itself is just a list of references to the objects in the list.
– This is important – see aliasing slide shortly

Changing list elements
    names = ["Fred", "Mary", "ChinMay"]
    names[1] = "Alex"
results in:
[Diagram: element 1 of the same list object now refers to a new string object "Alex"; the old "Mary" object is defunct.]
The list element is changed – we don't get a new list
We say list objects are mutable (= "changeable")

List slicing
Often want sublists rather than individual items
Done by extended indexing of the form "start:end+1"
– Missing first subscript defaults to 0
– Missing second subscript defaults to len(list)
Called "slicing"
Examples:
– print squares[2:4]    # Prints "[4, 9]"
   o Note that slice is up to but not including the second subscript
– print squares[:4]     # Prints "[0, 1, 4, 9]"
– print squares[3:]     # Prints "[9, 16, 25, 36, 49]"

Assigning to slices
    my_list[start:end] = another_list
replaces the elements my_list[start] up to but not including my_list[end] with the elements from another_list
Example:
    my_list = [1, 3, 5, 7, 9, 11]
    my_list[2:4] = [-3, -9, -11, -13]
    print my_list
prints [1, 3, -3, -9, -11, -13, 9, 11]
Can do insertion too (but insert method easier to read?):
    my_list = [1, 3, 5]
    my_list[1:1] = [-3, -9]    # my_list is now [1, -3, -9, 3, 5]
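The indexing, slicing and slice-assignment rules above can be checked with a short sketch like the following. It reuses the lists from the slides and stores each result in a variable instead of printing it, so that it runs unchanged under Python 2.7 or Python 3; the variable names last, middle, head and tail are chosen here for illustration.

```python
squares = [0, 1, 4, 9, 16, 25, 36, 49]

last = squares[-1]       # Same element as squares[len(squares) - 1]
middle = squares[2:4]    # Elements 2 and 3 only: the end subscript is excluded
head = squares[:4]       # Missing first subscript defaults to 0
tail = squares[3:]       # Missing second subscript defaults to len(squares)

my_list = [1, 3, 5, 7, 9, 11]
my_list[2:4] = [-3, -9, -11, -13]    # Replaces 5 and 7 with four new items

names = ["Fred", "Mary", "ChinMay"]
names[1] = "Alex"        # Changes the existing list object in place
```

Note that a slice assignment may change the length of the list, whereas assigning to a single element never does.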
List operators
list1 + list2 is a list of all the elements from list1 followed by all the elements from list2
– Called concatenation
– e.g. [1, 2, 3] + [7, 8] is [1, 2, 3, 7, 8]
my_list * n or n * my_list, where n is an int, is a new list containing n repetitions of the sequence of items in my_list
– e.g. 3 * ['Max', 'Amy'] is ['Max', 'Amy', 'Max', 'Amy', 'Max', 'Amy']
object in list evaluates to True if the object is in the list
– e.g. 3 in [1,3,5] is True, 2 in [1,3,5] is False
   o You're not meant to understand this yet. It's here for completeness!

List functions
len(my_list) is the length of my_list
– e.g., print len([1,2,3]) prints 3
sum(my_list) sums the elements of my_list
– e.g., print sum([1,2,3]) prints 6
– List items must be numeric
   o Can’t do string concatenation this way
min(my_list) and max(my_list) return min and max elements in a numeric list
– e.g. max([-3, 13, 5]) is 13

List methods
If L is a list:
    L.append(object)          # Adds object to end of L. Returns None
   o None is a special object used to signify “no answer”
    L.count(value)            # Returns count of items in L equal to value
    L.extend(L2)              # Appends all the items from L2 onto L. Returns None
    L.index(value)            # Returns the index of the first occurrence of value in L
   o Gives an error if value not found
    L.insert(index, object)   # Insert object into L before index. Returns None
    L.pop([index])            # Remove and return object at index (defaults to last)
    L.remove(value)           # Remove first occurrence of value. Returns None
    L.reverse()               # Reverse list L. Returns None
    L.sort()                  # Sorts L in ascending order. Returns None

A trap!
What will the following output?
    names = ["Fred", "Mary", "ChinMay"]
    other_names = names
    names.append("Angus")
    print "Names: ", names
    print "Other names:", other_names
Answer:
    Names:  ['Fred', 'Mary', 'ChinMay', 'Angus']
    Other names: ['Fred', 'Mary', 'ChinMay', 'Angus']
Both lists were altered!

Why it happened: aliasing
Assignment of one object to another just copies the reference.
So after other_names = names we have:
[Diagram: the dictionary entries names and other_names both refer to the same list object, whose elements refer to "Fred", "Mary" and "ChinMay" in the object store.]
So names and other_names are just aliases for the same object. Whenever one changes, the other changes too.
– Also see http://people.csail.mit.edu/pgbovine/python/tutor.html#mode=visualize

Avoiding aliasing problems
Be wary of assignments of the form a = b when b is a mutable object, i.e., one whose value can be changed, as any changes will apply to all aliases.
– Not a problem with ints, floats, strings, tuples as they’re all immutable.
If you want to make a copy of a list, use slicing, e.g., other_names = names[:]
– This constructs a new list containing copies of all the references. Called a shallow copy.
   o There can still be aliasing problems if the referenced objects are mutable but we won't worry about that for now!

List processing: the for statement
    for variable in list:
        block
Sequentially performs the given statement block for each element in list, with variable bound to that element.
– Also called a for loop or a for each loop
   o "for each" emphasises that the loop executes “for each” value in the list
Example:
    for i in [0, 1, 2, 3, 4, 5, 6]:
        i_sqr = i * i
        print "{} squared = {:2}".format(i, i_sqr)

The for loop: example 2
Often we want to accumulate new data as we traverse the list, e.g. totalling items or building a new list.
    numbers = [0, 1, 2, 3, 4, 5, 6]
    squares = []
    sum_squares = 0
    for i in numbers:
        i_sqr = i * i
        squares.append(i_sqr)
        sum_squares += i_sqr
    print "List of squares: ", squares
    print "Sum of squares = ", sum_squares

Nested lists
A list item can be any object, including another list
Useful when representing tabular data, e.g. 6-hourly rainfall totals (mm) for a given week:

    Day number   6 am   noon   6 pm   Midnight
        0          0      9      3       7
        1         11      9      0       0
        2          0     10     12      20
        3          0      0      0       0
        4          1      3      4       1
        5          2      8     10       0
        6          0      0      0       0

Python representation
Use a list of rows, each row being a list of rainfall values:

    rainfalls = [ [0, 9, 3, 7],
                  [11, 9, 0, 0],
                  [0, 10, 12, 20],
                  [0, 0, 0, 0],
                  [1, 3, 4, 1],
                  [2, 8, 10, 0],
                  [0, 0, 0, 0] ]

[Diagram: the dictionary entry rainfalls refers to a list object whose elements each refer to a row list; each row list in turn refers to int objects such as 0, 9, 3, 7 and 11, 9, 0.]
rainfalls[day_num][column_num] selects a particular int value

Nested loops
Nested lists are often processed with nested loops, e.g. to print the previous table:

    print "Rainfall data"
    for day in [0, 1, 2, 3, 4, 5, 6]:
        print "Day", day, ": ",
        for column in [0, 1, 2, 3]:
            print rainfalls[day][column],
        print    # Prints just a newline

Tip: the range function would help here
– range(low, high) is the list [low, low+1, low+2, ... high-1]
– range(high) is the list [0, 1, 2, ... high-1]

Sequences
Lists are an example of a more basic Python type: the sequence
Sequences can be:
– Indexed
– Iterated over with for loops
Strings are also sequences. Thus:

    name = "Fred"
    print name[1]      # Prints 'r'
    print name[1:3]    # Prints "re"
    for char in name:
        print char     # Prints characters in name, one per line

But unlike lists, strings are immutable
– So can’t assign to elements or slices
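As a small sketch of the range tip and of iterating over sequences, the nested loops below total each day's rainfall rather than printing the table. The function name daily_totals is an assumption made for this illustration, and the code avoids print so that it behaves the same under Python 2.7 and Python 3.

```python
rainfalls = [[0, 9, 3, 7],
             [11, 9, 0, 0],
             [0, 10, 12, 20],
             [0, 0, 0, 0],
             [1, 3, 4, 1],
             [2, 8, 10, 0],
             [0, 0, 0, 0]]


def daily_totals(table):
    """Return a list holding the sum of each row of the given nested list."""
    totals = []
    for day in range(len(table)):                # day takes values 0, 1, ... len(table)-1
        total = 0
        for column in range(len(table[day])):    # One pass per reading in that row
            total += table[day][column]
        totals.append(total)
    return totals


totals = daily_totals(rainfalls)

# A string is a sequence too, so a for loop visits its characters in order.
letters = []
for char in "Fred":
    letters.append(char)
```

Because range is driven by len, the same function works for any number of days and any number of readings per day.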
Another sequence type: the tuple
Tuples are like lists, but use parentheses instead of brackets

    point = (10, 20)
    person = ("Fred", "Bloggs", 27, "15 Memorial Drive", 8053)
    empty = ()               # The empty tuple
    singleton = ("Fred",)    # Note weird syntax – that comma is essential

Differences:
– Tuples are immutable
– Have only two methods: count and index

Should I use a tuple or a list?
Use lists when:
– Elements are all of the same type, and
– It makes sense to think of the elements as a list
Use tuples in any of the following situations:
1. You’re dealing with inhomogeneous data
   o e.g. person has a name, address, age, postcode
2. You think of the elements as being parts of a single object
   o But for non-trivial objects, classes are usually better
      – Covered towards end of course
3. You need or want immutability
   o e.g., as keys for a dictionary – see later.

The list constructor
Can make a list of the elements in any sequence using the built-in list function, e.g.
– list("Angus") returns ['A', 'n', 'g', 'u', 's']
– list((1,2,3)) returns [1, 2, 3]
list is actually a special function called a constructor.
– See Object Oriented Programming much later
It’s best to avoid using the name list for one of your own variables as you then can’t use the list function any more.
– Names like str, len, list etc are not reserved or protected.
   o i.e., there’s nothing to stop you from clobbering them!

Parallel assignment
Although we can index tuples, it’s not normally intuitive
– They’re used when we don't think of the elements as being in a list
Better to extract their components using parallel assignment

    person = ("Fred", "Bloggs", 27, "15 Memorial Drive", 8053)
    ...
    (first_name, last_name, age, address, post_code) = person

Useful e.g.
in a function that takes a tuple as a parameter
– Say a point or a person
– RHS can be any sequence of the same length as LHS

Processing image pixels
An image is a 2D array of pixels
– Each pixel is a (red, green, blue) 3-tuple -- an RGB value
PIL.Image.getdata() delivers all the pixels as a list of tuples
– Actually it’s a sequence, but you can mostly treat it like a list

Example: redden.py
DEMO

    '''Replicate the functionality of sunset.py from the textbook with PIL'''
    from PIL import Image


    def redder(pixel):
        '''Return a redder version of the given pixel'''

        (red, green, blue) = pixel
        return (red, int(0.7 * green), int(0.7 * blue))


    imageFile = open('pic207.jpg', 'rb')
    pic = Image.open(imageFile)
    pixels = pic.getdata()
    print type(pixels)
    new_pixels = []
    for pixel in pixels:
        new_pixels.append(redder(pixel))
    pic.putdata(new_pixels)    # Replace pixel sequence with our new one
    pic.save('redder.jpg')     # Save to a new file

Files as sequences
Built-in function open(path [,mode]) opens a file
– path is a filepath string, e.g. "H:/121/junk.txt"
   o Python allows forward slashes instead of Windows' backslashes
   o If you want backslashes, you must escape them, e.g. "H:\\121\\junk.txt"
   o Without slashes it’s the name of a file in the current directory
      – The one the running program was saved in
– mode is "r" (default), "w" or "a" to read, write or append respectively.
A file object, opened for reading, is a sequence of lines.

    data = open("junk.txt")    # Default is open for reading
    for line in data:          # Processes file line by line
        print line[0:-1]       # Print the line without its final \n char
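The file-as-sequence loop above can be tried without an existing junk.txt. The sketch below first writes a small throwaway file and then reads it back line by line; the temporary file, its contents and the helper name read_lines are all assumptions made for illustration. It runs under Python 2.7 or Python 3.

```python
import os
import tempfile


def read_lines(path):
    """Return the lines of the file at path, each without its final newline char."""
    lines = []
    data = open(path)               # Default mode is "r" (read)
    for line in data:               # A file opened for reading is a sequence of lines
        lines.append(line[0:-1])    # Drop the trailing \n, as on the slide
    data.close()
    return lines


# Create a throwaway file to read back (hypothetical contents).
handle, path = tempfile.mkstemp(suffix=".txt")
os.close(handle)
out = open(path, "w")               # Mode "w" opens the file for writing
out.write("first line\nsecond line\n")
out.close()

lines = read_lines(path)
os.remove(path)                     # Tidy up the throwaway file
```

The slice line[0:-1] assumes every line ends with a newline; a final line without one would lose its last character.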
Files as lists of lines
Previous program could also be written

    data = open("junk.txt")
    lines = data.readlines()    # Get a list of all the lines in the file
    for line in lines:          # Processes file line by line
        print line[0:-1]        # Print the line without its final \n char

Advantages:
– More explicit about what it’s doing (?)
– Can access lines in arbitrary order, back up in file, etc
Disadvantages:
– Need to have enough memory to hold the whole file

Splitting and stripping lines
line.strip([chars_to_strip]) removes characters in chars_to_strip from the front and end of line
– Default is to strip white space (newlines, tabs, spaces)
– e.g. " hi there mate! \n".strip() is "hi there mate!"
line.split([separator]) returns a list of strings obtained by splitting the line into substrings around the given separator string or around whitespace by default
– e.g. " hi there mate! \n".split() is ["hi", "there", "mate!"]
– Useful when breaking data into numbers, words, etc or processing tabular data like ".csv" (comma-separated values) files.

List comprehensions
A quick preview. Not in the text and not examined but good to know ☺
Often want a new list in which each element is computed from the elements of an existing list.
– e.g. the squares of a sequence of integers
– the lengths of all the words in a file
Syntax:
    [ expression for variable in sequence ]
Yields a list of the values of the given expression (which usually involves variable) for each value of the variable in the sequence.

List comprehensions (cont'd)
Examples:
– [x * x for x in range(10)]
   o [0, 1, 4, 9, 16, ... 81]
– [len(word) for word in "this is a sentence".split()]
   o [4, 2, 1, 8]
– max([len(line) for line in open("poem.txt")])
   o The length of the longest line in the file poem.txt (including the newline char)
Can also have an if clause to “filter” in/out wanted/unwanted elements
– [x * x for x in range(n) if x % 2 == 1]
   o The squares of all the odd integers less than n
   o Don't worry about this until we've done if statements and even then don’t worry about it! It won’t be examined.

Week 4: Conditionals

The boolean type (‘bool’)
A boolean expression evaluates to either True or False.
A boolean variable is either True or False
Get booleans by:
1. Testing relationships using relational operators
      <, <=, >, >=, ==, != (or <>), is [not], [not] in
   – NB: Equality testing is done with "==", not "=" (which is assignment).
   – is tests equality of identity. Rarely useful.
2. Calling functions or methods that return booleans, e.g. startswith, endswith
3. Combining booleans with logical operators, e.g. and, or, not

Examples
(in practice, at least one operand would be a variable)

    Expression                 Result   Comment
    10 > 5                     True
    10 <= 5                    False    <= is “less than or equal to”
    2 + 3 == 5                 True     Note ‘==’ rather than ‘=’
    "a" < "apples"             True     Strings are compared char-by-char, ...
    "ABC" != "abc"             True     ... using their numeric encoding. Upper case
    "Zack" < "alan"            True     ... chars come before lower case chars
    10 < "Fred"                True     Meaningless! Don’t do this.
    "red" in "Fred"            True
    "x" not in "Fred"          True     Same as not ("x" in "Fred")
    not (2 > 5 or 3 < 5)       False
    "Gonk".startswith("Go")    True
    "Gonk".endswith("NK")      False    Lower-case != upper case

Operator precedence
From high precedence to low precedence:
    **
    *, /, %
    +, -
    Shifts and bitwise operations (not in 121)
    All relational operators (all same precedence)
    not
    and
    or
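A few of the comparison and precedence rules above, written as a sketch that stores each result in a variable (names a to f are arbitrary). It leaves out the 10 < "Fred" row, since that comparison is only accepted by Python 2; everything here runs under Python 2.7 or Python 3.

```python
# not binds more tightly than or, so this is (not (2 > 5)) or (3 < 5)
a = not 2 > 5 or 3 < 5

# Parentheses change the grouping: not ((2 > 5) or (3 < 5))
b = not (2 > 5 or 3 < 5)

# Arithmetic is done before the relational operator: (2 + (3 * 4)) == 14
c = 2 + 3 * 4 == 14

# Strings compare char-by-char; upper case chars come before lower case chars
d = "Zack" < "alan"

# Methods that return booleans are case sensitive
e = "Gonk".endswith("NK")

# not in is the negation of in
f = "x" not in "Fred"
```

When in doubt about precedence, adding parentheses as in b costs nothing and makes the intent explicit.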
Chaining relational operators
Suppose we want to check if an int i is in the range low to high inclusive
In most languages we’d write (i >= low) and (i <= high)
Python allows the shorthand: low <= i <= high
But use this operator chaining only in the usual mathematical ways or it might surprise you, e.g.:
– 5 < 10 == False evaluates to False
– 5 < 10 == True also evaluates to False!
– Reason: they’re shorthands for (5 < 10) and (10 == False), and (5 < 10) and (10 == True), respectively
   o Objects of different types (int and bool) usually test unequal (but don’t do it!)

Boolean operators on non-bools
Python allows non-bool operands with Boolean operators
– I wish it didn’t!
All non-bool objects are treated as True except:
– Numeric zeroes: 0, 0.0, (0 + 0j)
– None
– Empty strings
– Empty containers (lists, tuples, dictionaries etc – see later)
These are all False
PLEASE DON'T USE THIS FACT!
– Use boolean operators on bools ONLY

Lazy (“short circuit”) evaluation
a or b returns True if a is True. Evaluates b only if a is False.
a and b returns False if a is False. Evaluates b only if a is True.
Called lazy evaluation. Can be important if b has side effects or might generate an error, e.g.

    names = ["Fred", "Alice"]
    ...
    len(names) > 2 and names[2] == "Alan"    # Is False
    names[2] == "Alan" and len(names) > 2    # Throws an error

if statements
Syntax (the most general form):

    if bool_expression:
        block
    elif bool_expression:    # optional: zero or more elif parts
        block
    elif bool_expression:
        block
    ...
    else:                    # optional
        block

A block is an indented sequence of statements (as in function defs and for loops)

A basic if statement

    substance = raw_input("What substance? ")
    ph = float(raw_input("Enter the measured pH: "))
    if ph < 7.0:
        print substance + " is acidic"

[Flow chart: Input ph, then test "ph < 7 ?". If yes, print message; if no, skip it.]

Multi-line blocks
As in function definitions, blocks can have multiple lines:

    substance = raw_input("What substance? ")
    ph = float(raw_input("Enter the measured pH: "))
    if ph < 7.0:
        print substance + " is acidic"
        print "Be careful with that!"

[Flow chart: Input ph, then test "ph < 7 ?". If yes, run the code for the acidic case; if no, skip it.]

The else part

    substance = raw_input("What substance? ")
    ph = float(raw_input("Enter the measured pH: "))
    if ph < 7.0:
        print substance + " is acidic"
        print "Be careful with that!"
    else:
        print substance + " is not acidic"
        print "But that doesn't mean it's safe!"

[Flow chart: Input ph, then test "ph < 7 ?". If yes, run the code for the acidic case; if no, run the code for the non-acidic case.]

Using elifs

    substance = raw_input("What substance? ")
    ph = float(raw_input("Enter the measured pH: "))
    if ph < 7.0:
        print substance + " is acidic"
        print "Be careful with that!"
    elif ph == 7.0:
        print substance + " is neutral"
    else:
        print substance + " is basic"
        print "It might be caustic!"

Flow chart
[Flow chart: Input ph, then test "ph < 7 ?". If yes, run the code for the acidic case. If no, test "ph == 7 ?": if yes, run the code for the neutral case; if no, run the code for the basic case.]
It is clear that only one of the three blocks can be executed.
– i.e., cases are all mutually exclusive.
Flow chart
[Flow chart: three separate tests in sequence. "ph < 7 ?": if yes, run the code for the acidic case. Then "ph == 7 ?": if yes, run the code for the neutral case. Then "ph > 7 ?": if yes, run the code for the basic case.]
Advantages of this version
– More flexible
– Explicit about when each block executes
Disadvantages
– Doesn’t make the mutual exclusion obvious
– Slightly less efficient
   o Rarely relevant

The perils of floating point equality
What will the following code print?

    a = 3.0 * 0.1 / 3.0
    if a == 0.1:
        print "I should think so too"
    else:
        print "Huh?"

Or, generalising that:

    print [i for i in range(1,100) if i * 0.1 / i != 0.1]

Rule of thumb:
– Never compare floats for (in)equality. Floats are approximate so equality cannot be guaranteed.

Nested ifs
The conditional code blocks can contain any type of statement including if statements
– Called nested ifs
Example:
– Write a program to solve a quadratic equation $ax^2 + bx + c = 0$
  Solution is: $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
– Program should prompt for a, b and c
– Print “Not a quadratic” if a is 0, otherwise:
   o Print “Roots are ... , ...” (4 digit accuracy) if roots are real
   o Print “Roots are imaginary” otherwise.

Pseudocode
When writing programs we often draft the general algorithm we’ll use in pseudocode
– Shows the main code blocks, if statements and loops
– Omits much of the coding details

    Input a, b and c
    if a is 0:
        print "Not a quadratic"
    else:
        Compute the discriminant $b^2 - 4ac$
        if discriminant is non-negative:    # A nested if. Note increased indentation level.
            Compute and print roots
        else:
            print "Roots are imaginary"

Actual code
In-class exercise.

Simplifying complex logic
Programs must be as simple and readable as possible.
Avoid complex logic by:
1. Simplifying logical expressions
   o e.g. de Morgan’s theorem
2. Flattening nested code
3. Introducing temporary boolean variables
4. Writing boolean-valued functions

de Morgan’s Theorem
A basic theorem of boolean algebra, useful in computing
    not (a and b) = (not a) or (not b)
    not (a or b) = (not a) and (not b)
Examples
– “I’m not going if it’s raining or I’m feeling tired” = “I’m going if it’s not raining and I’m not feeling tired”

    if not(is_raining or is_tired):
        go()
  ≡
    if is_not_raining and is_not_tired:
        go()

de Morgan’s Theorem (cont'd)
“Unless a is zero or the discriminant is negative, print the roots” = “if a is non-zero and the discriminant is nonnegative, print the roots”.

    if not(a == 0 or discriminant < 0):
        print_roots(a, b, c)
  ≡
    if a != 0 and discriminant >= 0:    # Much more readable
        print_roots(a, b, c)

Interlude: truth tables
Since boolean values are either True or False, it’s easy to make a table of the value of some boolean expression for all possible parameter values.
– Called a truth table
For example, the truth table for not (a or b) is

    a       b       a or b    not (a or b)
    False   False
    False   True
    True    False
    True    True

(Last two columns: do in lectures)

Truth tables (cont'd)
Similarly for (not a) and (not b)

    a       b       not a    not b    not a and not b
    False   False
    False   True
    True    False
    True    True

(Last three columns: do in lectures)
Last column same as previous table. This proves one of the two de Morgan’s laws. You do the other one.

Exercise (text 6.5 Q)
You want an automatic wildlife camera to switch on if the light level is less than 0.01 or if the temperature is above freezing, but not if both conditions are true.
Your first attempt to write this is:

    if (light < 0.01) or (temperature > 0.0):
        if (light < 0.01) and (temperature > 0.0):
            pass
        else:
            camera.on()

A friend says that this is an exclusive or and that you could write it more simply as:

    if (light < 0.01) != (temperature > 0.0):
        camera.on()

Prove whether your friend is right or wrong.
Truth table:

Flattening nested code
The previous exercise shows an example where nested ifs can be replaced with a single if.
Avoiding nesting usually makes code more readable.
Example: the quadratic pseudocode earlier could be rewritten as:

    Input a, b and c
    Compute the discriminant $b^2 - 4ac$
    if a is 0:
        print "Not a quadratic"
    elif discriminant is negative:
        print "Roots are imaginary"
    else:
        compute and print roots

Temporary boolean variables
Often code is more readable if we give names to boolean conditions, e.g.

    is_low_light = light < 0.01
    is_high_temp = temperature > 0.0
    if (is_low_light or is_high_temp) and not (is_low_light and is_high_temp):
        camera.on()

We’ve avoided nesting without the “tricky” (?) exclusive-or code.
Style convention: use names beginning with is_ for such boolean “flag” values.

Boolean valued functions
Another way of giving names to boolean conditions is to wrap them in functions, e.g.:

    def either_but_not_both(condition1, condition2):
        '''Return true if either of condition1 or condition2
           is true but not both.'''

        return (condition1 or condition2) and not (condition1 and condition2)


    if either_but_not_both(light < 0.01, temperature > 0.0):
        camera.on()

either_but_not_both is a verbose way to provide exclusive or (cf condition1 != condition2) but more readable to most people (?)

Week 5: Repetition
"for each" loops (reviewed)
The basic for statement (aka "for each" loop) iterates over elements in a sequence
– List, string, tuple, file ...
E.g.:
    for num in [10, 20, 30, 40]:
        print num
[Flow chart: test "Done all elements?". If no, num = next list element, print num, and test again; if yes, leave the loop.]
The loop control variable (num in this example) is bound to each list element in turn

List → List
Often need to modify list information, e.g., scale up all the marks in a list of marks by 10%.
Two approaches:
– Generate a new list with a for each loop
– Modify the existing list "in place"

A new list via "for each"

    marks = [55.3, 37.4, 85.2, ...]    # (or more likely read from file)
    scale_factor = 1.1
    scaled_marks = []
    for mark in marks:
        scaled_marks.append(mark * scale_factor)
    print scaled_marks

Modifying the existing list
Try the "obvious" approach:

    marks = [55.3, 37.4, 85.2, ...]
    scale_factor = 1.1
    for mark in marks:
        mark = mark * scale_factor
    print marks    # UNCHANGED!

Why it fails
The assignment mark = mark * scale_factor binds mark to a new object
Doesn't alter the existing object
e.g., on first time through the loop:
[Diagram: before the assignment, mark refers to the same 55.3 object as element 0 of the list that marks refers to. After the assignment, mark refers to a new object holding the scaled value, while the list still refers to 55.3, 37.4 and 85.2.]

What we actually want:
[Diagram: after the assignment, element 0 of the list object itself refers to the new object holding the scaled value, instead of to 55.3.]
i.e., we want to assign new values to marks[0], marks[1], ...

Modifying the existing list: take 2
This time use a loop control variable i = 0, 1, 2, ...

    marks = [55.3, 37.4, 85.2, ...]
    scale_factor = 1.1
    for i in range(0, len(marks)):    # i takes values 0, 1, 2, ... len(marks)-1
        marks[i] = marks[i] * scale_factor    # Or marks[i] *= scale_factor
    print marks

This version works ☺

The enumerate function
The enumerate function takes a sequence as a parameter and returns a new sequence (an enumerate sequence) of (index, value) pairs.
e.g. list(enumerate(["Joe", "Alice", "Anne"])) is [(0, "Joe"), (1, "Alice"), (2, "Anne")]
Can rewrite the previous example as:

    marks = [55.3, 37.4, 85.2, ...]
    scale_factor = 1.1
    for (i, value) in enumerate(marks):    # Using parallel assignment
        marks[i] = value * scale_factor
    print marks

Nested lists (skipped slide from Week 3)
A list item can be any object, including another list
Useful when representing tabular data, e.g. 6-hourly rainfall totals (mm) for a given week:

    Day number   6 am   noon   6 pm   Midnight
        0          0      9      3       7
        1         11      9      0       0
        2          0     10     12      20
        3          0      0      0       0
        4          1      3      4       1
        5          2      8     10       0
        6          0      0      0       0

Python representation (skipped slide from Week 3)
Use a list of rows, each row being a list of rainfall values:

    rainfalls = [ [0, 9, 3, 7],
                  [11, 9, 0, 0],
                  [0, 10, 12, 20],
                  [0, 0, 0, 0],
                  [1, 3, 4, 1],
                  [2, 8, 10, 0],
                  [0, 0, 0, 0] ]

[Diagram: the dictionary entry rainfalls refers to a list object whose elements each refer to a row list; each row list in turn refers to int objects such as 0, 9, 3, 7 and 11, 9, 0.]
rainfalls[day_num][column_num] selects a particular int value

Nested loops (skipped slide from Week 3)
Nested lists are often processed with nested loops, e.g.

    rainfalls = [ [0, 9, 3, 7],
                  [11, 9, 0, 0],
                  [0, 10, 12, 20],
                  [0, 0, 0, 0],
                  [1, 3, 4, 1],
                  [2, 8, 10, 0],
                  [0, 0, 0, 0] ]
    print "Rainfall data"
    for day in [0, 1, 2, 3, 4, 5, 6]:
        print "Day {}:".format(day),
        for column in [0, 1, 2, 3]:
            print format(rainfalls[day][column], "5"),    # Note final comma!
        print  # Prints just a newline
Tip: the range function would help here:
  range(low, high) is the list [low, low+1, low+2, ... high-1]
  range(high) is the list [0, 1, 2, ... high-1]
Slide # 156

An improved version
We can refine that to:
    rainfalls = [
        [0, 9, 3, 7],
        [11, 9, 0, 0],
        [0, 10, 12, 20],
        [0, 0, 0, 0],
        [1, 3, 4, 1],
        [2, 8, 10, 0],
        [0, 0, 0, 0] ]
    print "Rainfall data"
    for (day, days_rain) in enumerate(rainfalls):
        print "Day {}:".format(day),
        for rain in days_rain:
            print format(rain, "5"),
        print  # Prints just a newline
Now it works for any number of days and any number of rainfall samples per day.
Slide # 157

Loops and ifs
Often need to use an if in the body of a loop
Example: suppose rainfall data uses -1 to denote missing data and you need to print daily averages, excluding missing data.
Pseudocode:
    Get rainfall data
    Print heading
    for each day:
        clear sample counter and total
        for each rainfall sample in the day's rain:
            if sample non-negative:
                Add sample to daily total and increment sample counter
        Print average for day
Slide # 158

The code
    rainfalls = [
        [0, 9, 3, 7],
        [11, 9, 0, 0],
        [-1, 10, 12, 20],
        [0, -1, 0, 0],
        [-1, 3, 4, 1],
        [2, 8, 10, 0],
        [-1, 0, 0, 0] ]
    print "Rainfall data"
    print "Day Readings Total Average"
    for (day, days_rain) in enumerate(rainfalls):
        daily_total = 0
        num_readings = 0
        for rain in days_rain:
            if rain >= 0:
                daily_total += rain
                num_readings += 1
        average = daily_total / num_readings
        print "{:2}{:5}{:8}{:9.2f}".format(day, num_readings, daily_total, average)
BUT: there are two bugs for you to find and fix:
  (1) Averages are wrong!
  (2) Fails if data missing for the whole day; it should then print "*" for the average.
Slide # 159

while loops
The for loop iterates over a known sequence.
Sometimes we don't have a known sequence, e.g.
  – "Stir until boiling"
  – "Keep reading input until the user types quit"
  – "Keep refining the answer until it's good enough"
Situations like this need a different sort of loop:
  – The while loop
Syntax:
    while condition:
        statement_block
Slide # 160

Example 1: bacteria
If a population of bacteria increases in size by 21% every minute, how long does it take for the population size to double?
    minutes = 0
    population = 1000
    growth_rate = 0.21
    while population < 2000:
        population *= (1 + growth_rate)
        minutes += 1
    print minutes, " minutes required."
    print "Population =", int(population)
[Flowchart: initialise all variables; test "Population < 2000 ?"; if yes, update population and increment minutes, then test again; if no, leave the loop.]
Slide # 161

Infinite loops
Need to be quite sure that the loop condition will become false at some point
Otherwise program loops forever – an infinite loop
A very common bug. Example: if previous program were
    while population != 2000:  # Continue until population doubles ...
        # ... but equality never occurs!
Have to use Options > Restart in shell window to kill it
Remember: don't compare floats for (in)equality!
Slide # 162

Example 2: input loops
Often want to loop, consuming some input data sequence, until some condition occurs, e.g.
  – Loop until user enters a quit command
  – Loop until user gives a valid response
  – Search through a sequence for the first occurrence of something
There are three common idioms for this:
  – a "one-and-a-half" loop
  – use of a boolean variable like "is_done"
  – use of a break statement
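The third use case in the list above, searching a sequence for the first occurrence of something, can be sketched with the boolean-flag idiom. This is only an illustrative sketch: the function name find_first_negative and the sample readings are made up for the example, and print is called with parentheses and a single argument so that the code runs unchanged under both Python 2 and Python 3.

```python
def find_first_negative(values):
    """Return the index of the first negative value, or -1 if there is none."""
    index = 0
    found_at = -1
    is_done = False
    while not is_done:
        if index >= len(values):
            is_done = True      # Ran out of data without finding one
        elif values[index] < 0:
            found_at = index    # Found it: remember where and stop
            is_done = True
        else:
            index += 1          # Keep looking
    return found_at

readings = [0, 9, 3, -1, 7, -1]   # hypothetical data, -1 marks a missing reading
print(find_first_negative(readings))   # prints 3
```

Both ways of leaving the loop (data exhausted, item found) are spelled out in the body, and the loop condition itself stays a single explicit test.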
Slide # 163

The three idioms

"One-and-a-half loop"
    item = getItem()
    while item needs action:
        Process item
        item = getItem()

Use of a boolean "flag"
    is_done = False
    while not is_done:
        item = getItem()
        if item needs action:
            Process item
        else:
            is_done = True

Either of these two is fine. Choice depends on situation + personal preference.

Use of break
    while True:
        item = getItem()
        if item needs action:
            Process item
        else:
            break

But I don't like this one! See later.
Slide # 164

Example 2(a): Looping until "quit"

idiom 1
    prompt = "Enter command or 'q' to quit"
    command = raw_input(prompt).lower()
    while command != 'q':
        if command == 'jump':
            ...
        elif command == 'move':
            ...
        etc
        command = raw_input(prompt).lower()

idiom 2
    is_quitting = False
    prompt = "Enter command or 'q' to quit"
    while not is_quitting:
        command = raw_input(prompt).lower()
        if command == 'q':
            is_quitting = True
        elif command == 'jump':
            ...
        etc
Slide # 165

Example 2(b): Looping for valid input

idiom 1
    prompt = "What suit (spades, hearts, diamonds or clubs)? "
    response = raw_input(prompt).lower()
    while response not in ["spades", "hearts", "diamonds", "clubs"]:
        print "Invalid suit. Try again."
        response = raw_input(prompt).lower()

idiom 2
    prompt = "What suit (spades, hearts, diamonds or clubs)? "
    is_valid = False
    while not is_valid:
        response = raw_input(prompt).lower()
        if response in ["spades", "hearts", "diamonds", "clubs"]:
            is_valid = True
        else:
            print "Invalid suit. Try again."
Slide # 166

Break statements
Style guideline: don't use breaks!
break forces an immediate exit from the loop
  – Goes straight to the first statement after the loop
Easy to use but:
  – Loop termination condition is no longer explicit
      o Loop terminates either because loop condition is false or a break statement was executed
  – Makes it harder to reason about the program, prove it correct, etc
  – It encourages a lazy style of programming
      o "I'm not quite sure what the exact loop termination condition is, so I'll just use break when I think I've hit it"
  – Creates maintenance problems
      o Loop body is not fully executed, so extra statements (e.g. for debugging) added at the end of the loop body don't get executed
Slide # 167

Interlude: the Wing101 debugger
It can be hard to debug a program that loops forever
  – Particularly in Wing101, which buffers print output until execution is finished or input is requested, e.g.
        while True:
            print "Looping"
    doesn't output anything!
      o This doesn't happen in a normal Python shell, only in Wing
Instead of running programs, may need to debug them
  – Click Debug instead of Run
  – Print output now appears continuously in Debug I/O window, not Python Shell window.
Slide # 168

The Wing101 debugger (cont'd)
DEMO
Also, in this debug mode, you can:
  – Set breakpoints to stop the program at a given line
  – Single-step the program one line at a time or run it until the next breakpoint
  – Inspect all the current variables in the Stack Data pane
      o See next few slides for explanation
Can be very useful, but it's no substitute for thinking!
  – And usually, thoughtfully chosen print statements provide better information.
Slide # 169

"Stacks"
[Diagram: a stack, with push and pop acting on its top]
A Stack is a data type in Computer Science
  – Provides just three operations:
      o push(item) to add the given item to the "top" of the stack
      o item = pop() to take an item from the top of the stack
      o is_empty() to test if the stack has any items in it
  – Last item In is First item Out ("LIFO")
Python's list can be used as a stack:
  – append ≡ push
  – pop ≡ pop
  – len(...) == 0 ≡ is_empty()
Slide # 170

"Stack frames"
During execution, the local data belonging to a Python function is kept in a stack frame
A stack frame is a block of data that's pushed onto the call stack when the function is called
Stack frame contains function's dictionary of variables and the "return address", i.e., where it came from in the program
When function returns, its stack frame is popped from the stack
  – Execution resumes at the saved return address.
Slide # 171

The call stack: example
DEMO
Slide # 172

Week 6: File processing
Textbook, Chapter 8.
Slide # 173

Types of file-processing tasks
HUGE range, e.g.
  – Numerical/scientific data processing (e.g. rainfall data)  [Our focus]
  – Commercial data processing (e.g. files of account transactions)
  – Document processing (e.g. MS Word documents)
  – Programming language compilation (e.g. a Fortran program)
  – Image processing (e.g. green screening)
  – Internet data harvesting (e.g. web-crawling for email addresses)
  – ...
Slide # 174

Steps in processing numerical data
  1. Open the file
  2. Extract the data from the file
       – May be as simple as splitting each line in a .csv file or as complex as parsing an XML file
  3. Process the data
  4. Output/display the results
May have to interleave these steps for large files (can’t fit all data in memory)
Slide # 175

Opening files
We’ve already seen the usual opening of a local file:
  – data_file = open("data/resources/blah.txt", "r")
      o File “name” is most generally a file path, i.e. a path to the file within the directory tree
But we can also open Internet resources as files, e.g.:
    import urllib  # The URL (Uniform Resource Locator) library
    url = "http://www.cosc.canterbury.ac.nz/open/teaching/"
    web_page = urllib.urlopen(url)
    for line in web_page:
        print line,
    web_page.close()  # Should always do this – earlier code was lazy
Slide # 176

Extracting data
Data files given to you so far have been “sanitised”
Real data files usually have lots of extraneous info
  – Headers, footers, irrelevant data, etc
  – For example, see next slide
      o The result of querying for sunshine data at Christchurch from http://cliflo.niwa.co.nz
Need an algorithm to extract just the required data
  – e.g. month, day, sunshine from following slide
Slide # 177

cliflo.niwa.co.nz query result (csv)
    Station information:
    Name,Agent Number,Network Number,Latitude (dec.deg),Longitude (dec.deg),Height (m),...
    Christchurch Aero,4843,H32451,-43.493,172.537,37,G,N/A
    Note: Position precision types are: "W" = based on whole minutes, "T" = estimated to tenth minute,
    "G" = derived from gridref , "E" = error cases derived from gridref,
    "H" = based on GPS readings (NZGD49), "D" = by definition i.e. grid points.

    Sunshine: Daily
    Station,Date(NZST),Time(NZST),Amount(Hrs),Period(Hrs),Freq
    Christchurch Aero,20100101,2259,9.9,24,D
    Christchurch Aero,20100102,2259,7.1,24,D
    Christchurch Aero,20100103,2259,1.8,24,D
    Christchurch Aero,20100104,2259,9.7,24,D
    ...
    Christchurch Aero,20100320,2259,1.8,24,D
    Christchurch Aero,20100321,2259,0.3,24,D
    Christchurch Aero,20100322,2259,5.3,24,D
    Christchurch Aero,20100323,2259,9.6,24,D
    Christchurch Aero,20100324,2259,1.0,24,D

    UserName is = angusmcgurkinshaw
    Total number of rows output = 83
    Number of rows remaining in subscription = 1999917
    Copyright NIWA 2010 Subject to NIWA's Terms and Conditions
    See: http://cliflo.niwa.co.nz/pls/niwp/doc/terms.html
    Comments to: [email protected]
[Annotation: the "Christchurch Aero,2010..." rows are the wanted data]
Slide # 178

Algorithm #1 for extracting data
Many possibilities. One is:
    skip lines until we get an empty line   [More robust against changes in file format than “skip 9 lines”]
    skip two more lines
    read a line                             [“Idiom 1” from last chapter]
    while line not empty:  # A blank line terminates actual data rows
        split line into pieces separated by comma
        date = piece[1]
        get month and day from date
        sunshine = float(piece[3])
        process month, day, sunshine data point (e.g. write to another file)
        read a line
Question: what happens if the data file doesn’t contain the expected two blank lines?
Slide # 179

Code
Direct translation from pseudocode
    infile = open("sunshine.txt")
    line = infile.readline()
    while line != "\n":
        line = infile.readline()
    infile.readline()
    infile.readline()
    line = infile.readline()
    while line != "\n":
        pieces = line.split(",")
        date = pieces[1]
        month = int(date[4:6])
        day = int(date[6:8])
        sunshine = float(pieces[3])
        print month, day, sunshine
        line = infile.readline()
    infile.close()
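One possible answer to the question about missing blank lines: readline() returns the empty string "" at end of file, which never equals "\n", so the direct translation would loop forever. The sketch below is one hedged way to guard against that; the function name extract_sunshine, the use of io.StringIO in place of a real file, and the abbreviated sample text are all assumptions made for illustration. It collects the data points in a list instead of printing them so that it runs under both Python 2 and Python 3.

```python
import io

def extract_sunshine(infile):
    """Return a list of (month, day, sunshine) tuples read from an open file."""
    points = []
    line = infile.readline()
    # strip() turns both a blank line ("\n") and end of file ("") into "",
    # so each loop also stops if the file runs out early.
    while line.strip() != "":
        line = infile.readline()
    infile.readline()            # Skip the "Sunshine: Daily" line
    infile.readline()            # Skip the column headings
    line = infile.readline()
    while line.strip() != "":
        pieces = line.split(",")
        date = pieces[1]
        points.append((int(date[4:6]), int(date[6:8]), float(pieces[3])))
        line = infile.readline()
    return points

# Abbreviated stand-in for sunshine.txt (hypothetical, for illustration only)
sample = io.StringIO(
    u"Station information:\n"
    u"Name,Agent Number\n"
    u"Christchurch Aero,4843\n"
    u"\n"
    u"Sunshine: Daily\n"
    u"Station,Date(NZST),Time(NZST),Amount(Hrs),Period(Hrs),Freq\n"
    u"Christchurch Aero,20100101,2259,9.9,24,D\n"
    u"Christchurch Aero,20100102,2259,7.1,24,D\n"
    u"\n"
    u"UserName is = someone\n")
points = extract_sunshine(sample)
```

With a file that has no blank line at all, this version returns an empty list instead of hanging. Whether silently returning nothing is the right response, or whether the program should report an error, is a design decision the slides leave open.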
Variant using a function for data line processing
    def process_data_line(line):
        pieces = line.split(",")
        date = pieces[1]
        month = int(date[4:6])
        day = int(date[6:8])
        sunshine = float(pieces[3])
        print month, day, sunshine

    infile = open("sunshine.txt")
    line = infile.readline()
    while line != "\n":
        line = infile.readline()
    infile.readline()
    infile.readline()
    line = infile.readline()
    while line != "\n":
        process_data_line(line)
        line = infile.readline()
    infile.close()
Slide # 180

Algorithm #2 for extracting data
Another is:
    get a list of all lines in file
    make a list of all those lines (after line 3) beginning "Christchurch Aero"
    for each of those lines:
        split line into pieces separated by comma
        date = piece[1]
        get month and day from date
        sunshine = float(piece[3])
        process month, day, sunshine data point (e.g. write to another file)
Simpler (?) but only works for this one base station. Also, can’t handle huge files.
Slide # 181

Algorithm #2b
An improvement (in terms of code reusability) is to get the station name from line 3
    get a list of all lines in file
    station_name = start of line 3, up until ","
    make a list of all lines (after line 3) beginning with station_name
    for each of those lines:
        split line into pieces separated by comma
        date = piece[1]
        get month and day from date
        sunshine = float(piece[3])
        process month, day, sunshine data point (e.g. write to another file)
OK for any base station. Still can’t handle huge files.
Slide # 182

Code
    infile = open("sunshine.txt")
    lines = infile.readlines()
    infile.close()
    station_name = lines[2].split(",")[0]  # Not 3! [Remember: zero origin]
    data = [line for line in lines[3:] if line.startswith(station_name)]
    for line in data:
        pieces = line.split(",")
        date = pieces[1]
        month = int(date[4:6])
        day = int(date[6:8])
        sunshine = float(pieces[3])
        print month, day, sunshine
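Algorithm #2b reads every line into memory with readlines(), which is why it can't handle huge files. A possible workaround, sketched here under the same file-layout assumptions, is to loop over the file object itself, which delivers one line at a time, and keep only running totals. The function name summarise_sunshine and the abbreviated sample text are hypothetical, chosen just for this example.

```python
import io

def summarise_sunshine(infile):
    """Return (number of data rows, total sunshine hours), reading line by line."""
    station_name = None
    num_days = 0
    total_hours = 0.0
    for (line_number, line) in enumerate(infile):
        if line_number == 2:
            # Third line of the file (zero origin!) starts with the station name
            station_name = line.split(",")[0]
        elif line_number > 2 and line.startswith(station_name):
            num_days += 1
            total_hours += float(line.split(",")[3])
    return (num_days, total_hours)

# Abbreviated stand-in for sunshine.txt (hypothetical, for illustration only)
sample = io.StringIO(
    u"Station information:\n"
    u"Name,Agent Number\n"
    u"Christchurch Aero,4843\n"
    u"\n"
    u"Sunshine: Daily\n"
    u"Station,Date(NZST),Time(NZST),Amount(Hrs),Period(Hrs),Freq\n"
    u"Christchurch Aero,20100101,2259,9.9,24,D\n"
    u"Christchurch Aero,20100102,2259,7.1,24,D\n")
(num_days, total_hours) = summarise_sunshine(sample)
```

Only one line is held at a time, so memory use no longer grows with the file size. The cost is that the data points are not kept, so this suits tasks such as totals and averages but not tasks that need all the points at once.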
Slide # 183

Which algorithm?
Those are just 2 algorithms.
  – How many more can you find?
Which is better?
  – Actually they’re both pretty bad!
  – They both fail if the file format is significantly changed
  – The problem is that we’ve inferred the data format from the data
      o We really need a specification of the data format from the supplier
      o Acts as a contract ensuring (hopefully) our program continues to work in the future
Slide # 184

Outputting the results
Many possibilities, e.g.
  – Display textual output with print
  – Write a new file, e.g.
      o Pure text (e.g. csv)
      o Markup language output (e.g. HTML)
  – Graphical output
      o Maybe in a GUI
      o Maybe with matplotlib
          – Installed on Linux in labs but not on Windows.
  – ...
Slide # 185

Writing output files
Open file for writing, prepare data, e.g.
    out_file = open("myoutput.txt", "w")
    data = "{},{},{:.3f}".format(month, day, sunshine)
Write data to file
  – Can use print chevron technique, e.g.  [Obsolescent: not in Python 3]
        print >>out_file, data
  – Or directly output byte stream (usually a string), e.g.
        out_file.write(data)
      o NB: must explicitly include newline character when using write method
Close file
    out_file.close()
Slide # 186

Getting graphical output
Outside official curriculum, except for GUI section later
BUT ... scientists and engineers should at least be aware of matplotlib
  – See matplotlib.sourceforge.net
  – A plotting package modelled on the one in matlab
  – Multiplatform
  – Publication-quality output
  – Extremely flexible
      o But using this flexibility isn’t trivial!
  – Installed only under Linux on lab machines (?)
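Before moving on to plotting, here is a small sketch that pulls the three file-writing steps (open, write, close) together using the write method, which works in both Python 2 and Python 3. The function name write_points, the sample data points and the temporary directory are assumptions for illustration only.

```python
import os
import tempfile

def write_points(path, points):
    """Write (month, day, sunshine) tuples to the file at path, one csv line each."""
    out_file = open(path, "w")
    for (month, day, sunshine) in points:
        data = "{},{},{:.3f}".format(month, day, sunshine)
        out_file.write(data + "\n")   # write() does not add a newline itself
    out_file.close()

# Hypothetical usage: write two data points to a file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "myoutput.txt")
write_points(path, [(1, 1, 9.9), (1, 2, 7.1)])
```

Leaving out the "\n" would run all the data points together on a single line, which is the pitfall the NB on the earlier slide warns about.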
Slide # 187

Example program
    import matplotlib.pyplot as plt
    from datetime import date

    def get_date(line):
        '''Extract the date info from the given line and return a Date object'''
        pieces = line.split(",")
        date_string = pieces[1]
        year = int(date_string[0:4])
        month = int(date_string[4:6])
        day = int(date_string[6:8])
        return date(year, month, day)

    def get_sunshine(line):
        '''Return the float sunshine value from the given line'''
        return float(line.split(',')[3])

    infile = open("sunshine.txt")
    lines = infile.readlines()
    infile.close()
    station_name = lines[2].split(",")[0]
    data = [line for line in lines[3:] if line.startswith(station_name)]
    dates = [get_date(line) for line in data]
    sunshine_data = [get_sunshine(line) for line in data]
    plt.plot(sunshine_data, linestyle="dotted", marker='o')
    n = len(sunshine_data)
    tick_positions = range(0, n, 10)
    tick_labels = [dates[i].strftime("%d %b") for i in tick_positions]
    plt.xticks(tick_positions, tick_labels, rotation=90)
    plt.ylabel("Sunshine (hrs)")
    plt.title("Sunshine hours, Christchurch aero, 2010")
    plt.show()
Slide # 188

Program’s output
[Figure: the plot produced by the example program]
Slide # 189