STA312
Python Introduction
Craig Burkett, Dan Zingaro
January 6, 2015
Python History
- Late 1970s: programming language called ABC
  - High-level, intended for teaching
  - Only five data types
  - Programs are supposedly one-quarter the size of the equivalent BASIC or Pascal program
  - Not a successful project
  - More ABC information: http://homepages.cwi.nl/~steven/abc/
Python History...
- 1983: Guido van Rossum joined the ABC team
- Late 1980s: Guido started working on a new project, in which a scripting language would be helpful
- Based Python on ABC, removed warts (e.g. ABC wasn’t extensible)
- Python, after Monty Python
- Guido: Benevolent Dictator for Life (BDFL)... but he’s retiring!
- http://www.artima.com/intv/ (search for Guido)
Why Python for Big Data?
- Readable, uniform code structure
- No compilation step; Python is interpreted
- Supports object-oriented programming (OOP) features
- Batteries included: Python’s standard library comes with tools for a variety of problem domains
- Additional modules are available for download: data mining, language processing...
Dynamic Typing
- Biggest conceptual change compared to C, Java etc.
- Variables do not have types. Objects have types
>>> a = 5
>>> type(a)
<type 'int'>
>>> a = 'hello'
>>> type(a)
<type 'str'>
>>> a = [4, 1, 6]
>>> type(a)
<type 'list'>
Built-in Types
- We’ll look at the core five object types that are built-in to Python
  - Numbers
  - Strings
  - Lists
  - Dictionaries
  - Files
- They’re extremely powerful and save us from writing tons of low-level code
Built-in Types: Numbers
- Create numbers by using numeric literals
- If you include no fractional component, it’s an integer; otherwise it’s a float
- We have all of the standard mathematical operators, and even ** for exponent
- Make integers as small or large as you like; they can’t go out of bounds
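A minimal sketch of the points above, assuming Python 3; the variable names are illustrative only and do not come from the slides:

```python
# Numeric literals: no fractional part gives an int, otherwise a float.
whole = 42
fractional = 42.0
print(type(whole).__name__, type(fractional).__name__)

# The usual arithmetic operators are available, plus ** for exponentiation.
cube = 2 ** 3

# Integers grow as needed, so a very large result does not overflow.
huge = 2 ** 100
print(huge)
```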
Built-in Types: Strings
- A string is a sequence of characters
- To indicate that something is a string, we place single- or double-quotes around it
- We can use + to concatenate strings
- This is an example of overloading: + is used to add numbers too; it knows what to do based on context
- What happens if we try to use + with a string and a number?
  - Error: + doesn’t know what to do!
  - e.g. is '3' + 4 supposed to be the string '34' or the number 7?
  - Design philosophy: Python tries never to guess at what you mean
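The behaviour of + described above can be sketched as follows (Python 3 assumed; the variable names are made up for illustration). Because Python refuses to guess, the last two lines state explicitly which meaning is wanted:

```python
greeting = 'hello' + ' ' + "world"   # + concatenates strings
total = 3 + 4                        # + adds numbers

# Mixing a string and a number raises TypeError instead of guessing.
try:
    result = '3' + 4
except TypeError as err:
    result = None
    print('TypeError:', err)

# Convert explicitly to pick one interpretation.
as_text = '3' + str(4)     # string concatenation
as_number = int('3') + 4   # numeric addition
print(as_text, as_number)
```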
Strings...
- The * operator is overloaded, too
  - Applied to a string and an integer i, it duplicates the string i times
  - If i ≤ 0, the result is the empty string
- Can also use relational operators such as < or > to compare strings alphabetically (character by character, by character code)
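A short sketch of string repetition and comparison, assuming Python 3; the example strings are arbitrary. Note that comparison uses character codes, so uppercase letters sort before lowercase ones:

```python
echo = 'ab' * 3        # the string repeated three times
nothing = 'ab' * 0     # zero repetitions: empty string
negative = 'ab' * -2   # negative count: also the empty string

# Relational operators compare strings character by character.
apple_first = 'apple' < 'banana'
upper_first = 'Zebra' < 'apple'   # 'Z' has a smaller code than 'a'
print(echo, apple_first, upper_first)
```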
Looping Through Strings
for char in s:
    <do something with char>
- We’ll see this pattern again and again for each Python type
- It’s like PHP’s foreach or Java’s for-with-the-colon
- Let’s write a function that counts the number of vowels in a string
- A function is a named piece of code that carries out some task
Possible Solution: How Many Vowels? (num_vowels.py)
def num_vowels(s):
    '''Return the number of vowels in string s.
    The letter "y" is not treated as a vowel.'''
    count = 0
    for char in s:
        if char in "aAeEiIoOuU":
            count += 1
    return count
String Methods
- Strings are objects and have tons of methods
- Use dot-notation to access methods
- Use dir(str) to get a list of methods, and help(str.methodname) for help on any method
- Useful ones: find, lower, count, replace...
- Strings are immutable (cannot be modified): all we can do is create new strings
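The methods named above can be tried on a sample string. This is only a sketch, assuming Python 3; the string and variable names are illustrative:

```python
title = 'Monty Python'

where = title.find('Python')     # index of the first match
missing = title.find('Java')     # -1 when the substring is absent
lowered = title.lower()          # returns a NEW string
n_count = title.count('n')
swapped = title.replace('Monty', 'Full Monty')

# The original string is untouched, because strings are immutable.
print(title, lowered, swapped)

# Trying to modify a string in place raises TypeError.
try:
    title[0] = 'm'
    modified = True
except TypeError:
    modified = False
```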
Indexing and Slicing Strings
- Assume s is a string
- Then, s[i] for i ≥ 0 extracts character i from the left (0 is the leftmost character)
- We can also use a negative index i to extract a character beginning from the right (-1 is the rightmost character)
- Slice notation: s[i:j] extracts characters beginning at s[i] and ending at the character one to the left of s[j]
  - If we leave out the first index, Python defaults to using index 0 to begin the slice
  - Similarly, if we leave out the second index, Python defaults to using index len(s) to end the slice
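The indexing and slicing rules above, sketched on an arbitrary sample string (Python 3 assumed):

```python
s = 'python'

first = s[0]       # leftmost character
last = s[-1]       # rightmost character
middle = s[1:4]    # characters at indices 1, 2, 3 (index 4 is excluded)
prefix = s[:2]     # first index omitted: starts at 0
suffix = s[2:]     # second index omitted: runs to len(s)

print(first, last, middle, prefix, suffix)
```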
Built-in Types: Lists
Lists are like arrays in other languages, but much more powerful.

                        Strings       Lists
Sequences of?           Characters    Any object types
Immutable?              Yes           No
Can be heterogeneous?   No            Yes
Can index and slice?    Yes           Yes
Can use for-loop?       Yes           Yes
Created like?           'hi'          [4, 1, 6]
List Methods
- As with strings, there are lots of methods; use dir(list) or help(list.method) for help
- append is used to add an object to the end of a list
- extend is used to append the objects of another list
- insert(index, object) inserts object before index
- sort() sorts a list
- remove(value) removes the first occurrence of value from the list
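A quick walk through the list methods above, assuming Python 3; the values are arbitrary. Unlike string methods, these modify the list in place:

```python
nums = [4, 1, 6]

nums.append(9)         # [4, 1, 6, 9]
nums.extend([2, 1])    # [4, 1, 6, 9, 2, 1]
nums.insert(1, 7)      # [4, 7, 1, 6, 9, 2, 1]
nums.remove(1)         # removes only the first 1: [4, 7, 6, 9, 2, 1]
nums.sort()            # sorts in place
print(nums)
```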
Exercise: Length of Strings
- Write a function that takes a list of strings, and prints out the length of each string in the list
- e.g. if the list is ['abc', 'q', ''], the output would be as follows
3
1
0
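One possible solution to the exercise, offered as a sketch rather than the course's official answer; the function name print_lengths is an assumption:

```python
def print_lengths(strings):
    '''Print the length of each string in the list strings, one per line.'''
    for s in strings:
        print(len(s))

print_lengths(['abc', 'q', ''])
```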
Built-in Types: Dictionaries
Dictionaries are like associative arrays or maps in other languages.

                        Lists                   Dictionaries
Stores?                 Sequences of objects    Key-value pairs
Immutable?              No                      No
Can be heterogeneous?   Yes                     Yes
Can index and slice?    Yes                     No
Can use for-loop?       Yes                     Yes
Created like?           [4, 1]                  {'a': 1, 'b': 2}
Dictionaries vs. Lists
- Compared to using “parallel lists”, dictionaries make an explicit connection between a key and a value
- But unlike lists, dictionaries do not guarantee any ordering of the elements (Python 3.7 and later do preserve insertion order)
  - If you use for k in d, for a dictionary d, you get the keys back; in older versions the order is arbitrary
bird_dict = {
    'peregrine falcon': 1, 'harrier falcon': 5,
    'red-tailed hawk': 2, 'osprey': 11}
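A sketch of looping over the dictionary above, assuming Python 3. Sorting the keys is one way to get a predictable order regardless of Python version:

```python
bird_dict = {
    'peregrine falcon': 1, 'harrier falcon': 5,
    'red-tailed hawk': 2, 'osprey': 11}

# Looping over a dictionary yields its keys; index with a key to get its value.
total = 0
for bird in bird_dict:
    total += bird_dict[bird]
print(total)

# For a predictable order, sort the keys explicitly.
for bird in sorted(bird_dict):
    print(bird, bird_dict[bird])
```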
Adding to Dictionaries
- Dictionary keys must be of immutable types (no lists!), but values can be anything
- We can use d[k] = v to add key k with value v to dictionary d
- We can use the update method to dump another dictionary’s key-value pairs into our dictionary
- We can use d[k] to obtain the value associated with key k of dictionary d
  - If k does not exist, we get an error (a KeyError)
  - The get method is similar, except it returns None instead of giving an error when the key does not exist
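The operations above can be sketched as follows (Python 3 assumed; the keys and values are invented for illustration):

```python
d = {'osprey': 11}

d['kestrel'] = 3                        # add a new key-value pair
d.update({'merlin': 2, 'osprey': 12})   # existing keys are overwritten

current = d['osprey']
missing = d.get('eagle')         # None, no error
fallback = d.get('eagle', 0)     # optional default instead of None

# Indexing with a missing key raises KeyError.
try:
    d['eagle']
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Keys must be immutable: a tuple works, a list raises TypeError.
d[('red-tailed', 'hawk')] = 2
try:
    d[['red-tailed', 'hawk']] = 2
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

print(current, missing, fallback, lookup_failed, list_key_rejected)
```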
Built-in Types: Files
- We’ll use files whenever we read external data (websites, spreadsheets, etc.)
- To open a file in Python, we use the open function
- Syntax: open(filename, mode)
- mode is the string 'r' to open the file for reading, 'w' to open the file for writing, or 'a' to open the file for appending. If mode is omitted, it defaults to 'r'
- open gives us a file object that we can use to read or write the file
Reading Files with Methods
To read the next line from a file:
- readline: reads and returns next line; returns empty string at end-of-file
There are other methods, but try not to use these because they read the entire file into memory:
- read: reads the entire file into one string
- readlines: reads the entire file into a list of strings
All of these leave a trailing '\n' character at the end of each line.
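A self-contained sketch of open and readline, assuming Python 3. The temporary directory and the file contents are made up so that the example can run anywhere:

```python
import os
import tempfile

# Hypothetical sample file, created in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'songs.txt')

out = open(path, 'w')            # 'w' creates (or overwrites) the file
out.write('first line\nsecond line\n')
out.close()

out = open(path, 'a')            # 'a' adds to the end
out.write('third line\n')
out.close()

f = open(path)                   # no mode given, so the file is opened for reading
first = f.readline()             # includes the trailing newline
second = f.readline()
third = f.readline()
at_end = f.readline()            # empty string signals end-of-file
f.close()

print(repr(first), repr(at_end))
print(first.strip())             # strip removes the trailing newline
```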
Reading Files with Loops
A file is a sequence of lines:
f = open('songs.txt')
for line in f:
    print(line.strip())
. . . or using a while-loop:
f = open('songs.txt')
line = f.readline()
while line:
    print(line.strip())
    line = f.readline()
Skipping Headers
Suppose we have a file of this format:
header
# comment text
# comment text
# ...
... actual data
...
Let’s write a function that skips the header of such a file and
returns the first line of actual data.
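One way such a function might look, assuming Python 3, exactly one header line, and comment lines that start with '#'. The name skip_header and the sample data are assumptions; io.StringIO stands in for a real file so the sketch is self-contained:

```python
import io

def skip_header(f):
    '''Skip the header line and any '#' comment lines of open file f.
    Return the first line of actual data ('' if there is none).'''
    f.readline()                      # discard the header line
    line = f.readline()
    while line.startswith('#'):       # '' at end-of-file ends the loop too
        line = f.readline()
    return line

# Hypothetical file contents in the format described above.
sample = io.StringIO('header\n# comment text\n# comment text\n12.5\n13.1\n')
first_data = skip_header(sample)
print(first_data.strip())
```

After the call, the file object is positioned just past the first data line, so a following for-loop over it would continue with the remaining data.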
Multi-Field Records
- So far, we have been reading entire lines from our file
- But, our lines are actually records containing three fields: game name, song name, and rating
- Let’s write a function to read this data into three lists
- The critical string method here is split
  - With no parameters, it splits around any whitespace
  - With a string parameter, it splits around that string
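A sketch of split and of reading records into three lists, assuming Python 3. The slides do not show the file's separator, so the comma used here, the function name read_records, the sample records, and the conversion of ratings to integers are all assumptions:

```python
import io

# split with no argument splits around runs of whitespace.
words = 'a b   c'.split()
# split with a string argument splits around exactly that string.
cells = 'a,b,,c'.split(',')

def read_records(f, sep=','):
    '''Read records from open file f into three parallel lists:
    game names, song names, and ratings (assumed to be integers).'''
    games, songs, ratings = [], [], []
    for line in f:
        fields = line.strip().split(sep)
        games.append(fields[0])
        songs.append(fields[1])
        ratings.append(int(fields[2]))
    return games, songs, ratings

# Hypothetical data standing in for the real file.
sample = io.StringIO('Game A,Song One,4\nGame B,Song Two,5\n')
games, songs, ratings = read_records(sample)
print(games, songs, ratings)
```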
Further Python Resources
- http://www.rmi.net/~lutz
  - Mark Lutz’ Python books
  - Constantly-updated to keep up with Python releases
- http://docs.python.org/tutorial
  - Free online Python tutorial
- https://mcs.utm.utoronto.ca/~108
  - Dan’s intro CS course in Python