Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Introduction to Python 1 Chang Y. Chung Office of Population Research 01/14/2014 Why Python I Popular I Easy to learn and use I Open-source I General-Purpose I Multi-paradigm (procedureal, object-oriented, functional) I Hettinger, “What makes Python Awesome?”[2] http://tinyurl.com/mndd4er I Did I mention popular? 1 / 46 Organization I First two hours for learning basics of Python language. I Third hour for a guided tour of Python “ecosystem”. 2 / 46 Running Python Read-Eval-Print Loop I For those who are using UNIX-like systems, including the Ubuntu image running on a Virtual Box. I Type "python" at the terminal shell prompt. 3 / 46 Running a Python Script File (.py) I Type "python" followed by the script file name. 4 / 46 Running Python REPL on Windows I For those who are using Windows sytem with Python (and IDLE) installed. I Run the IDLE application. 5 / 46 Running a Python Script File (.py) on Windows I File > New (or OPEN) brings up a script window. I Run > Run Module (Output will show in IDLE shell). 6 / 46 Conceptual Hierarchy by Mark Lutz[3] I Programs are composed of modules. I Modules contain statements. 7 / 46 Conceptual Hierarchy by Mark Lutz[3] I Programs are composed of modules. I Modules contain statements. I Statements contain expressions. I Expressions create and process objects. 7 / 46 Script File (.py) I A script file is a module. I A script is a sequence of statements, delimited by a newline (or the end of line character). I Python executes one statement at a time, from the top of a script file to the bottom. 8 / 46 Script File (.py) I A script file is a module. I A script is a sequence of statements, delimited by a newline (or the end of line character). I Python executes one statement at a time, from the top of a script file to the bottom. I Execution happens in namespaces (modules, classes, functions all have their own). I Everything is runtime (even def, class, and import) 8 / 46 Executable Python Script File I On a Unix-like system, we can change the mode of the script file and make it executable: 1 2 $chmod +x hello . py $ . / hello . py I Just add the first line with a hashbang (#!): 1 #!/usr/bin/python 2 3 4 5 # say hello to the world def main ( ) : p r i n t " Hello , World ! " 6 7 8 i f __name__ == " __main__ " : main ( ) 9 / 46 Comments I Comments start with a hash (#) and end with a newline. 1 # this whole line is a comment 2 3 4 5 6 # add 10 integers. total = 0 f o r i in range ( 1 0 ) : # i goes 0, 1, 2, ..., 9 t o t a l += i # shortcut for total = total + i 7 8 9 p r i n t " t o t a l=" , t o t a l # total= 45 10 / 46 Variables I Variables are created when first assigned a value. 1 2 my_var = 3 answer_to_everything = 42 3 4 5 6 # also works are: x = y = 0 a, b, c = 1, 2, 3 I Variable names start with a letter or an underscore(_) and can have letters, underscores, or digits. I Variable names are case-sensitive. 11 / 46 Assignment Semantics According to David Godger[1] I Variables in many other languages are a container that stores a value. 1 i n t a = 1; 12 / 46 Assignment Semantics According to David Godger[1] I Variables in many other languages are a container that stores a value. 1 i n t a = 1; I In Python, an assignment creates an object, and labels it with the variable. 1 a = 1 12 / 46 Assignment Semantics According to David Godger[1] I Variables in many other languages are a container that stores a value. 1 i n t a = 1; I In Python, an assignment creates an object, and labels it with the variable. 1 a = 1 I If you assign another value to a, then the variable labels the new value (2). 1 a = 2 12 / 46 Assignment Semantics According to David Godger[1] I Variables in many other languages are a container that stores a value. 1 i n t a = 1; I In Python, an assignment creates an object, and labels it with the variable. 1 a = 1 I If you assign another value to a, then the variable labels the new value (2). 1 a = 2 I This is what happens if you assign a to a new variable b: 1 b = a 12 / 46 Numbers I Integers 1 2 3 x = 0 age = 20 size_of_household = 5 4 5 6 p r i n t type (age) # <type ’int’> 7 8 9 # can handle arbitrarily large numbers huge = 10 ** 100 + 1 I Floating-point Numbers 1 2 g = 0.1 f = 6.67384 3 4 5 6 7 8 velocity = 1. # it is the dot (.) that makes it a float p r i n t velocity # 1.0 type ( velocity ) # <type ’float’> 13 / 46 Numeric Expressions I Most of the arithmetic operators behave as expected. 1 2 3 4 a = 10 b = 20 p r i n t a - (b ** 2) + 23 # -367 5 6 7 8 x = 2.0 p r i n t x / 0.1 # 20.0 I Watch out for integer divisions. In Python 2, it truncates down to an integer. 1 2 3 4 p r i n t 10 / 3 # 3 (in Python 2) p r i n t -10 / 3 # -4 (in Python 2) 3.3333333333333335 (in Python 3) -3.3333333333333335 (in Python 3) 5 6 7 8 # a solution: use floating point numbers p r i n t 10.0 / 3.0 # 3.3333333333333335 (in both Python 2 and 3) 14 / 46 String Literals I A string is a sequence of characters. 1 2 3 # either double (") or single(’) quotes for creating string literals. name = " Changarilla Dingdong" file_name = ' workshop . tex ' 4 5 6 7 # triple-quoted string starwars = """ A long time ago is a galaxy far, far away... 8 9 10 11 12 It is a period of civil war. Rebel spaceships, striking from a hidden base, have won their first victory ... 13 14 15 What is the last character of this string? """ 16 17 18 19 last_char = starwars [ -1 ] p r i n t ord ( last_char ) , ord ( " \ n" ) # 10 10 15 / 46 Working with strings I Strings are immutable. 1 2 3 s = "abcde" s [0] = "x" # trying to change the first char to "x" # TypeError: ’str’ object does not support item assignment 4 5 6 7 t = "x" + s [ 1 : ] # creating a new string print t # xbcde I Many functions and methods are available. 1 2 3 s = "abcde" p r i n t s + s # concatenation # abcdeabcde 4 5 6 p r i n t len ( s ) #5 7 8 9 p r i n t s . find ( "c" ) # index is 0-based. returns -1 if not found #2 16 / 46 String Manipulation I A few more string methods. 1 2 3 s = "abcde" p r i n t s . upper ( ) , "XYZ" . lower ( ) # ABCDE, xyz 4 5 6 print " # xxx yy xxx yy " . strip () 7 8 9 p r i n t "a , bb , ccc " . s p l i t ( " , " ) # [’a’, ’bb’, ’ccc’] 17 / 46 String Manipulation I A few more string methods. 1 2 3 s = "abcde" p r i n t s . upper ( ) , "XYZ" . lower ( ) # ABCDE, xyz 4 5 6 print " # xxx yy xxx yy " . strip () 7 8 9 p r i n t "a , bb , ccc " . s p l i t ( " , " ) # [’a’, ’bb’, ’ccc’] I What are "methods"? 17 / 46 String Manipulation I A few more string methods. 1 2 3 s = "abcde" p r i n t s . upper ( ) , "XYZ" . lower ( ) # ABCDE, xyz 4 5 6 print " # xxx yy xxx yy " . strip () 7 8 9 p r i n t "a , bb , ccc " . s p l i t ( " , " ) # [’a’, ’bb’, ’ccc’] I What are "methods"? B Functions which are a member of a type (or class). 17 / 46 String Manipulation I A few more string methods. 1 2 3 s = "abcde" p r i n t s . upper ( ) , "XYZ" . lower ( ) # ABCDE, xyz 4 5 6 print " # xxx yy xxx yy " . strip () 7 8 9 p r i n t "a , bb , ccc " . s p l i t ( " , " ) # [’a’, ’bb’, ’ccc’] I What are "methods"? B Functions which are a member of a type (or class). B int, str, Word Document are types (or classes). B 2, "abcde", diary.docx are an instance (or object) of the respective type. B Types have members: properties (data) and methods (functions). 17 / 46 Two Ways to Format I format() 1 p r i n t "The answer i s {0}" . format (21) # The answer is 21 p r i n t "The real answer i s {0:6.4 f }" . format(21.2345678) # The answer is 21.2346 2 3 4 method. I Formatting operator. 1 2 3 4 p r i n t "The answer i s %d" % 21 # The answer is 21 p r i n t "The real answer i s %6.4f " % 21.2345678 # The answer is 21.2346 18 / 46 Raw Strings I Within a string literal, escape sequences start with a backslash (\) 1 2 3 4 5 a_string = ' I t \ ' s a great day \ nto learn \ \ Python \ \ . \ n 1\ t 2\ t 3 ' p r i n t a_string # It’s a great day # to learn \Python\. #1 2 3 I A raw string literal starts with the prefix r. In a raw string, the backslash (\) is not special. 1 import re 2 3 4 5 6 7 8 # raw strings are great for writing regular expression patterns. p = re . compile ( r" \ d \ d \ d−\d \ d \ d \ d" ) m = p . match( '123−4567 ' ) i f m i s not None: p r i n t m. group ( ) # 123-4567 19 / 46 Unicode Strings I You can create Unicode strings using the u prefix and "\u" escape sequences. 1 2 3 a_string = u"Euro \u20AC" # \u followed by 16-bit hex value xxxx p r i n t a_string , len ( a_string ) # Euro e 6 20 / 46 Unicode Strings I You can create Unicode strings using the u prefix and "\u" escape sequences. 1 2 3 a_string = u"Euro \u20AC" # \u followed by 16-bit hex value xxxx p r i n t a_string , len ( a_string ) # Euro e 6 I Unicode strings are sequences of code points. I Code points are numbers, each representing a “character”. e.g., U+0061 is ‘Latin small letter a’. I Unicode text strings are encoded into bytes. UTF-8 is one of many Unicode encodings, using one to four bytes to store a Unicode “character”. I Once you have Unicode strings in Python, all the string functions and properties work as expected. 20 / 46 Best Practices According to Thomas Wouters[5] I Never mix unicode and bytecode [i.e. ordinary] strings. I Decode bytecode strings on input. I Encode unicode strings on output. I Try automatic conversion (codecs.open()) I Pay attention to exceptions, UnicodeDecodeError I An example 1 ustr = u"Euro \u20AC" # Euro e 2 3 4 5 6 7 # Python’s default encoding codec is ’ascii’ ustr . encode ( ) # UnicodeEncodeError: ’ascii’ codec can’t encode characater ’u\u20ac’ utf_8 = ustr . encode( "UTF−8" ) # encoding to UTF-8 works fine # this takes 8 bytes. five one-byte’s and one three-byte 8 9 10 11 # now we want to decode to ascii, ignoring non-ascii chars p r i n t utf_8 . decode( " a s c i i " , " ignore " ) # Euro (no euro symbol character) 21 / 46 None I Is a place-holder, like NULL in other languages. 1 x = None I Is a universal object, i.e., there is only one None. 1 2 p r i n t None i s None # True I Is evaluated as False. 1 2 3 x = None i f x: p r i n t " t h i s w i l l never p r i n t " I Is, however, distinct from others which are False. 1 2 p r i n t None i s 0 , None i s False , None i s [ ] # False, False, False 22 / 46 Core Data Types I Basic core data types: int, float, and str. I "Python is dynamically, but strongly typed." 1 2 3 n = 1.0 p r i n t n + "99" # TypeError: unsupported operand types(s) for +: ’int’ and ’str’ I Use int() or float() to go from str to numeric 2 p r i n t n + f l o a t ( "99" ) # 100.0 I str() 1 n = 1.0 p r i n t s t r (n) + "99" # 1.099 1 2 3 returns the string representation of the given object. 23 / 46 Operators I Python supports following types of operators: B B B B B B B Arithmetic (+ − * / % ** // ) Comparison (== != > < >= <=) Assignment (= += −= *= /= %= **= //=) Logical (and or not) Bitwise (& | ^ ~ << >>) Membership (in not in) Identity ( is is not) 24 / 46 A Few Surprises I Power operator (**) binds more tightly than unary operators on the left. 1 2 3 4 5 6 a = -2**2 print a # -4 # solution: parenthesize the base p r i n t ( -2) ** 2 #4 I Comparisons can be chained. 1 x < y <= z # y is evaluated only once here is equivalent to 1 x < y and y <= z I Logical operators (and, or) short-circuit evaluation and return an operand. 1 2 p r i n t 3 or 2 #3 25 / 46 Quiz I Demographic and Health Surveys (DHS) Century Month Code (CMC)[4, p.5] provides an easy way working with year and month. The CMC is an integer representing a month, taking the value of 1 in January 1900, 2 in February 1900, . . . , 13 in January 1901, etc. The CMC in February 2011 is 1333. What is the CMC for this month, i.e., January, 2014? 26 / 46 Quiz I Demographic and Health Surveys (DHS) Century Month Code (CMC)[4, p.5] provides an easy way working with year and month. The CMC is an integer representing a month, taking the value of 1 in January 1900, 2 in February 1900, . . . , 13 in January 1901, etc. The CMC in February 2011 is 1333. What is the CMC for this month, i.e., January, 2014? I An answer: 1 2 3 4 year = 2014 month = 1 p r i n t "CMC" , ( year - 1900) * 12 + month # CMC 1369 26 / 46 Quiz I Demographic and Health Surveys (DHS) Century Month Code (CMC)[4, p.5] provides an easy way working with year and month. The CMC is an integer representing a month, taking the value of 1 in January 1900, 2 in February 1900, . . . , 13 in January 1901, etc. The CMC in February 2011 is 1333. What is the CMC for this month, i.e., January, 2014? I An answer: 1 2 3 4 year = 2014 month = 1 p r i n t "CMC" , ( year - 1900) * 12 + month # CMC 1369 I What is the month (and year) of CMC 1000? 26 / 46 Quiz I According to U.S. National Debt Clock, the current outstanding public debt, as of a day last month, was a big number: 1 debt = 17234623718339.96 Count how many times the digit 3 appears in the number. (Hint: create a string variable and use the count() method of the string type.) 27 / 46 Quiz I According to U.S. National Debt Clock, the current outstanding public debt, as of a day last month, was a big number: 1 debt = 17234623718339.96 Count how many times the digit 3 appears in the number. (Hint: create a string variable and use the count() method of the string type.) I An answer: 1 2 3 sdebt = "17234623718339.96" p r i n t sdebt . count ( "3" ) #4 27 / 46 Quiz I According to U.S. National Debt Clock, the current outstanding public debt, as of a day last month, was a big number: 1 debt = 17234623718339.96 Count how many times the digit 3 appears in the number. (Hint: create a string variable and use the count() method of the string type.) I An answer: 1 2 3 sdebt = "17234623718339.96" p r i n t sdebt . count ( "3" ) #4 I (tricky) It feels rather silly to rewrite the value as a string. Can you think of a way to convert the number into a string? 27 / 46 Flow Control I Conditional Execution ( if ) I Iteration B B loop loop while for 28 / 46 IF I IF statement is used for conditional execution. 1 2 3 4 i f x > 0: p r i n t "x i s positive " else : p r i n t "x i s zero or negative " I Only one suite (block of statements) under a True conditional expression is executed. 1 2 me = " rock " wins = 0 3 4 5 6 7 8 9 10 i f you == "paper" : p r i n t "You win ! " e l i f you == " scissors " : p r i n t " I win ! " wins += 1 else : p r i n t "draw" 29 / 46 Compound statements I if , while, and for are compound statements, which have one or more clauses. A clause, in turn, consists of a header that ends with a colon ( : ) and a suite. I A suite, a block of statements, is identified by indentation. 1 2 3 4 a = 1 i f a > 0: desc = "a i s positive " p r i n t a , desc 5 6 p r i n t "done" 7 # these two lines form a suite # # (blank line ignored) # This line being started "dedented" # signals the end of the block 8 9 10 i f a > 0: desc = "a i s positive " # indentation error # indentation error 11 12 13 14 15 16 i f a > 0: # OK desc = "a i s positive " # OK p r i n t a , desc # OK else : # OK print a # OK 30 / 46 Compound statements I The amount of indentation does not matter (as long as the same within a level). Four spaces (per level) and no tabs are the convention. I Use editor’s python mode, which prevents/converts <TAB> to (four) spaces. I Why indentation? A good answer at: http://tinyurl.com/kxv9vts. I For an empty suite, use pass statement, which does nothing. 31 / 46 WHILE I Repeats a block of statements as long as the condition remains True. 1 2 3 4 5 6 7 8 9 10 11 total = 0 n = 0 while n < 10: print n, total n += 1 t o t a l += n #0 0 #1 1 #2 3 # ... # 9 45 I break I continue terminates the loop immediately. skips the remainder of the block and goes back to the test condition at the top. 32 / 46 FOR I for 1 days = [ "Sunday" , "Monday" , "Tuesday" , "Wednesday" , "Thursday" , " Friday " , " Saturday " ] 2 is used to iterate over a sequence. 3 4 5 6 7 8 f o r day in days : i f day == " Friday " : p r i n t " I am outta here . " break p r i n t "Happy" + " " + day + " ! " 9 10 11 12 13 14 # Happy Sunday! # Happy Monday! # ... # Happy Thusday! # I am outta here. 33 / 46 FOR I Another example. 1 2 3 numbers = range (5) p r i n t numbers # [0, 1, 2, 3, 4] 4 5 6 7 8 9 10 f o r n in numbers : print n, i f n % 2 == 0: p r i n t " i s even" else : p r i n t " i s odd" 11 12 13 14 15 16 #0 #1 #2 #3 #4 is is is is is even odd even odd even 34 / 46 File I/O I The built-in function, open(), returns a file type object, unless there is an error opening the file. 1 2 i n _ f i l e = open( " y o u r f i l e . t x t " , " r " ) # for reading o u t _ f i l e = open( " myfile . t x t " , "w" ) # for writing I Once we get the 1 2 file type object, then use its methods. # read all the contents from the input file content = i n _ f i l e . read ( ) 3 4 5 # write out a line to the output file o u t _ f i l e . write ( " hello ? \n" ) 6 7 8 9 # close it when done i n _ f i l e . close ( ) o u t _ f i l e . close ( ) 35 / 46 Reading a file one line at a time I with 1 with open( "lorem . t x t " , " r " ) as f : f o r l i n e in f : print line 2 3 ensures that the file is closed when done. I Another example. 1 2 3 4 5 # creating a file and writing three lines with open( " small . t x t " , "w" ) as f : f . write ( " f i r s t \ n" ) f . write ( "second \ n" ) f . write ( " t h i r d " ) 6 7 8 9 10 11 12 13 14 with open( " small . t x t " , " r " ) as f : f o r l i n e in f : print line # line includes the newline char # first # # second # # third 36 / 46 Defining and Calling a Function I Defined with def and called by name followed by () . 1 2 def the_answer ( ) : return 42 3 4 5 p r i n t the_answer ( ) # 42 I Argument(s) can be passed. 1 2 def shout (what ) : p r i n t what . upper ( ) + " ! " 3 4 5 shout ( " hello " ) # HELLO! I If the function returns nothing, then it returns None. 1 2 3 4 r = shout ( " hi " ) # HI! print r # None 37 / 46 Another example I CMC again 1 2 3 4 5 6 7 8 9 10 def cmc( year , month ) : ' ' ' returns DHS Century Month Code ' ' ' # doc string i f year < 1900 or year > 2099: p r i n t " year out of range" return i f month < 1 or month > 12: p r i n t "month out of range" return value = ( year - 1900) * 12 + month return value 11 12 13 14 15 16 p r i n t cmc(2014, 1) # 1369 p r i n t cmc(2014, 15) # month out of range # None 38 / 46 Local and Global Variables I Within your function: B A new variable is local, and independent of the global var with the same name, if any. 1 x = 1 # global (or module) 2 3 4 5 6 def my_func ( ) : x = 2 my_func ( ) print x # local #1 B Both local and global variables can be read. B Global variables can be writen to once declared so. 1 x = 1 2 3 4 5 6 7 def my_func ( ) : global x x = 2 my_func ( ) print x # global #2 39 / 46 Quiz I Write a function that returns Body Mass Index (BMI) of an adult given weight in kilograms and height in meters. (Hint: BMI = weight(kg) / (height(m) squared). For instance, if a person is 70kg and 1.80m, then BMI is about 21.6.) 40 / 46 Quiz I Write a function that returns Body Mass Index (BMI) of an adult given weight in kilograms and height in meters. (Hint: BMI = weight(kg) / (height(m) squared). For instance, if a person is 70kg and 1.80m, then BMI is about 21.6.) I An answer. 1 2 def bmi( kg , m) : return f l o a t (kg) / (m ** 2) 40 / 46 Quiz I Write a function that returns Body Mass Index (BMI) of an adult given weight in kilograms and height in meters. (Hint: BMI = weight(kg) / (height(m) squared). For instance, if a person is 70kg and 1.80m, then BMI is about 21.6.) I An answer. 1 2 def bmi( kg , m) : return f l o a t (kg) / (m ** 2) I Re-write the bmi function so that it accepts height in feet and inches, and the weight in pounds. (Hint. Make pound, foot, and inch arguments. Convert them into local variables, kg and m, before calculating bmi to return.) 40 / 46 Importing a module I reads in a module, runs it (top-to-bottom) to create the module object. import I Via the module object, you get access to its variables, functions, classes, . . . I We’ve already seen an example of importing a standard regular expression module: 1 import re 2 3 # compile() is a function defined within the imported re module. 4 5 6 7 8 9 p = re . compile ( r" \ d \ d \ d−\d \ d \ d \ d" ) m = p . match( '123−4567 ' ) i f m i s not None: p r i n t m. group ( ) # 123-4567 41 / 46 Another example I There are many standard modules that come already installed, and be ready to be imported. 1 import math 2 3 4 5 s = math. sqrt ( 4 . 0 ) p r i n t " 4.0 squared i s {0:.2 f }" . format ( s ) # 4.0 squared is 2.00 I You can selectively import as well. 1 from math import sqrt # import sqrt() alone 2 3 4 p r i n t sqrt ( 9 . 0 ) # 3.0 # no "math." I You can import your own Python script file (.py) the same way. The default import path includes the current working directory and its sub-directories. 1 import hello # suppose that hello.py defines a main() function 2 3 4 hello . main ( ) # "hello, World!" 42 / 46 Quiz I Write a function such that, given a BMI value, returns the BMI category as a string. Recall the categories are: Underweight Normal weight Overweight Obesity less than 18.5 18.5 upto but not including 25 25 upto but not including 30 30 or greater For instance, the function should return a string "Normal weight", when it is called with an argument of, say 20. (Hint: use conditional statements i.e., if ... elif ... ) I Print out a BMI table showing several lines of a pair: a BMI value and its category. BMI value may start at 15 and go up by 3 up to 36. (Hint: use a loop) 43 / 46 Quiz (cont.) I Create a comma-separated values (.csv) file of the BMI table. (Hint: You may start with below, and modify it as neccessary.) 1 import csv 2 3 4 5 6 7 8 with open( " test . csv " , "wb" ) as f : my_writer = csv . writer ( f ) i = 15 while i < 30: my_writer . writerow ( [ i , "yahoo! " ] ) i += 3 44 / 46 Summary I Using Python interactively or by running a script. I Comments. I Variables and assignment semantics. I Core data types (int, float, str, None). I Operators. I Conditionals and Looping. I Defining and using functions. I Basic File I/O. I Importing a module. 45 / 46 References Godger, D. Code like a pythonista: Idiomatic python. http://tinyurl.com/2cv9kg. Hettinger, R. What makes Python Awesome? http://tinyurl.com/mndd4er. Lutz, M. Learning Python, fifth ed. O’Reilly Media, Sebastopol, CA, 2013. MEASURE DHS Plus. Description of the Dempgraphic and Health Surveys Individual Recode Data File, version 1.0 ed., March 2008. Wouters, T. Advanced python: (or understanding python) google tech talks. feb 21, 2007. http://www.youtube.com/watch?v=uOzdG3lwcB4. 46 / 46