String Operations & Formatting
string methods
modify: lower, upper, title, strip, replace
display: center, ljust, rjust
search: startswith, endswith, find, index, count
construct: join
deconstruct: splitlines, split
For more info, see the official documentation and this handout.
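The method categories above can be illustrated with a short sketch. The sample strings are made up for illustration; every method returns a new value and leaves the original string unchanged.

```python
# Sample text is hypothetical; it is not taken from the slides.
text = '  Hello, World  '

# modify: each call returns a new string
stripped = text.strip()                          # 'Hello, World'
lowered = stripped.lower()                       # 'hello, world'
replaced = stripped.replace('World', 'Python')   # 'Hello, Python'

# display: pad to a fixed width
centered = 'hi'.center(6)                        # '  hi  '
left = 'hi'.ljust(6, '.')                        # 'hi....'

# search
starts = stripped.startswith('Hello')            # True
position = stripped.find('World')                # 7
missing = stripped.find('xyz')                   # -1 (index() would raise ValueError)
o_count = stripped.count('o')                    # 2

# construct / deconstruct
words = 'a,b,c'.split(',')                       # ['a', 'b', 'c']
joined = '-'.join(words)                         # 'a-b-c'
lines = 'one\ntwo'.splitlines()                  # ['one', 'two']
```

Note the difference between find and index: both report a position, but only index raises an exception when the substring is absent.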
string formatting
%[<width>][.<precision>]<character>
character   value
d           integer
s           string
f           float
x           hexadecimal
%           literal %
>>> 'a b c'.split()
['a', 'b', 'c']
>>> ' '.join(_)
'a b c'
>>> '%d + %d = %d' % (1, 2, 1+2)
'1 + 2 = 3'
>>> '%10s' % ('python',)
'    python'
CS105 Python
Spring 2009
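The width and precision fields in the template above can be seen in a few more cases. This is a minimal sketch with made-up values; the variable names are only for illustration.

```python
# width pads on the left; precision sets the digits after the decimal point
padded = '%5d' % 42            # '   42'
rounded = '%.2f' % 3.14159     # '3.14'
both = '%8.3f' % 3.14159       # '   3.142'
hexed = '%x' % 255             # 'ff'
percent = '%d%%' % 50          # '50%'
```

A single value can be passed directly after the % operator; two or more values must be wrapped in a tuple, as in the '%d + %d = %d' example.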
Files
Reading From Files
(iterator technique)
sonnet = open('lxxi.txt')    # acquire resource

for line in sonnet:          # iterate over lines (each line ends with '\n')
    print line,

sonnet.close()               # release resource
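The acquire / iterate / release pattern can also be written with a with statement, which closes the file automatically. This variant is not from the slides; it is a sketch that first creates a throwaway file (the name example.txt and its contents are assumptions) so that it is self-contained, and it should run under Python 2.6+ as well as Python 3.

```python
import os
import tempfile

# Create a small scratch file so the example does not depend on lxxi.txt.
path = os.path.join(tempfile.mkdtemp(), 'example.txt')
with open(path, 'w') as out:
    out.write('first line\nsecond line\n')

# The with block releases the file when it ends, even if the loop raises.
collected = []
with open(path) as infile:
    for line in infile:
        collected.append(line.rstrip('\n'))   # drop the trailing newline

print(collected)
```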
file Is A Type
>>> sonnet = open('lxxi.txt')
>>> type(sonnet)
<type 'file'>
>>> sonnet.read()
"Shakespeare's Sonnet LXXI\n\nNo longer mourn for me when ...
Name (type): File (file)
Modes:
mode   description
'r'    read mode, default
'w'    write mode, erases file
'a'    append mode, write to end of file
Methods: read, readlines, write, writelines, close
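The three modes in the table can be exercised in sequence. The sketch below uses a scratch file in a temporary directory (the name modes.txt is an assumption) and shows read and readlines side by side.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')  # hypothetical scratch file

f = open(path, 'w')       # 'w' creates the file, or erases an existing one
f.write('one\n')
f.close()

f = open(path, 'a')       # 'a' keeps existing contents and writes at the end
f.write('two\n')
f.close()

f = open(path)            # 'r' is the default mode
whole = f.read()          # entire file as one string
f.close()

f = open(path, 'r')
as_list = f.readlines()   # list of lines, each still ending in '\n'
f.close()
```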
Writing To Files
output = open('output.txt', 'w')

output.write(data)

output.close()
Careful! Opening in 'w' mode erases the file's contents.
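Two details of writing are easy to miss: write and writelines add no newlines of their own, and re-opening a file in 'w' mode empties it immediately. A minimal sketch, assuming a scratch file in a temporary directory and a hypothetical helper read_all:

```python
import os
import tempfile

def read_all(p):
    """Return the full contents of the file at p (helper for this sketch)."""
    f = open(p)
    try:
        return f.read()
    finally:
        f.close()

path = os.path.join(tempfile.mkdtemp(), 'output.txt')  # scratch location

output = open(path, 'w')
output.write('total: %d\n' % 42)       # write() takes a string; newline is explicit
output.writelines(['a\n', 'b\n'])      # writelines() adds no newlines either
output.close()
first_contents = read_all(path)

# Opening in 'w' mode again truncates the file before anything is written.
output = open(path, 'w')
output.close()
second_contents = read_all(path)
```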
Regular Expressions
aa
matches: aa
ab*a
matches: aa, aba, abba, abbba, abbbba, ...
ab+a
matches: aba, abba, abbba, abbbba, ...
ab?a
matches: aa, aba
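The three quantifiers (* for zero or more, + for one or more, ? for zero or one) can be checked against the same candidate strings. The helper matches_fully is hypothetical, introduced here so that a pattern must cover the whole string, which is how the slides list their matches.

```python
import re

def matches_fully(pattern, text):
    """Return True if pattern matches all of text (helper for this sketch)."""
    return re.match('(?:' + pattern + r')\Z', text) is not None

candidates = ['aa', 'aba', 'abba', 'abbba']

star = [s for s in candidates if matches_fully('ab*a', s)]
plus = [s for s in candidates if matches_fully('ab+a', s)]
optional = [s for s in candidates if matches_fully('ab?a', s)]

print(star)       # ['aa', 'aba', 'abba', 'abbba']
print(plus)       # ['aba', 'abba', 'abbba']
print(optional)   # ['aa', 'aba']
```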
q|r|z
matches: q, r, z
does not match: a, !, zz, quijibo, ...
[qrz]
matches: q, r, z
does not match: a, !, zz, quijibo, ...
[qrz]
matches: q, r, z
[^qrz]
matches: a, !
matches neither: zz, quijibo, ...
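Alternation, a character set, and a negated set can be compared on the sample strings from these slides. As before, matches_fully is a hypothetical helper that requires the pattern to cover the entire string.

```python
import re

def matches_fully(pattern, text):
    """Return True if pattern matches all of text (helper for this sketch)."""
    return re.match('(?:' + pattern + r')\Z', text) is not None

samples = ['q', 'r', 'z', 'a', '!', 'zz', 'quijibo']

alternation = [s for s in samples if matches_fully('q|r|z', s)]
char_set = [s for s in samples if matches_fully('[qrz]', s)]
negated = [s for s in samples if matches_fully('[^qrz]', s)]

print(alternation)   # ['q', 'r', 'z']
print(char_set)      # ['q', 'r', 'z']
print(negated)       # ['a', '!']
```

Both zz and quijibo fail every pattern here because each pattern describes exactly one character.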
[0-9]
matches: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
[^0-9]
matches: any single character that is not one of these digits
\d
matches: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
\D
matches: any single character that is not one of these digits
Character Classes
Decimals        \d   [0-9]
                \D   [^0-9]
Whitespace      \s   [ \t\n\r\f\v]
                \S   [^ \t\n\r\f\v]
Alpha-Numeric   \w   [a-zA-Z0-9_]
                \W   [^a-zA-Z0-9_]
Backslash       \\   [\\]
Any Character   .    (everything except a newline, by default)
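The shorthand classes in the table can be tried with re.findall, which returns every non-overlapping match in a string. re.findall is a standard library function that the slides do not cover, and the sample text is made up.

```python
import re

text = 'Room 42, floor_3\tEast'   # hypothetical sample; \t is a tab

digits = re.findall(r'\d', text)     # single decimal characters
numbers = re.findall(r'\d+', text)   # runs of decimal characters
words = re.findall(r'\w+', text)     # runs of letters, digits, underscores
spaces = re.findall(r'\s', text)     # single whitespace characters

print(digits)    # ['4', '2', '3']
print(numbers)   # ['42', '3']
print(words)     # ['Room', '42', 'floor_3', 'East']
```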
Using Regular Expressions
>>> import re
>>> regex = re.compile(r'\d\d/\d\d/\d\d\d\d')
>>> match = regex.match('hello')
>>> print match
None
Using Regular Expressions
>>> import re
>>> regex = re.compile(r'\d\d/\d\d/\d\d\d\d')
>>> match = regex.match('01/31/2008')
>>> print match
<_sre.SRE_Match object at 0x6dde8>
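One point worth spelling out is that match only tries the pattern at the start of the string, while search scans for it anywhere. Because both return None on failure, the result should be checked before calling methods on it. A minimal sketch with made-up input strings:

```python
import re

regex = re.compile(r'\d\d/\d\d/\d\d\d\d')

at_start = regex.match('01/31/2008 is the due date')   # date begins at position 0
not_at_start = regex.match('Due on 01/31/2008')        # None: match() starts at 0
anywhere = regex.search('Due on 01/31/2008')           # search() scans the string

# Guard against None before using the match object.
if anywhere is not None:
    found = anywhere.group()
else:
    found = None

print(found)   # 01/31/2008
```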
Match Objects
>>> import re
>>> regex = re.compile(r'\d\d/\d\d/\d\d\d\d')
>>> match = regex.match('01/31/2008')
>>> match.group()
'01/31/2008'
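Besides group, a match object also records where the match was found. The sketch below uses start and end, which are standard match-object methods not shown on the slide; the input text is made up.

```python
import re

regex = re.compile(r'\d\d/\d\d/\d\d\d\d')
text = 'Filed 01/31/2008.'
match = regex.search(text)

whole = match.group()         # the matched text
begin = match.start()         # index of the first matched character
finish = match.end()          # index just past the last matched character
same = text[begin:finish]     # slicing with these indices recovers the match

print(begin, finish)
```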
Grouping
>>> import re
>>> regex = re.compile(r'(\d\d)/(\d\d)/(\d\d\d\d)')
>>> match = regex.match('01/31/2008')
>>> match.groups()
('01', '31', '2008')
>>> month, day, year = match.groups()
>>> year
'2008'
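Groups can also be fetched one at a time by number, converted to integers, or given names. The named-group syntax (?P<name>...) is part of the re module but goes beyond the slide, so treat this as an optional extension rather than the course's method.

```python
import re

regex = re.compile(r'(\d\d)/(\d\d)/(\d\d\d\d)')
match = regex.match('01/31/2008')

month = match.group(1)                               # groups are numbered from 1
year = match.group(3)
as_numbers = tuple(int(g) for g in match.groups())   # groups are strings until converted

# Named groups make the pattern self-describing.
named = re.compile(r'(?P<month>\d\d)/(?P<day>\d\d)/(?P<year>\d\d\d\d)')
day = named.match('01/31/2008').group('day')

print(as_numbers)   # (1, 31, 2008)
```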
Watch Out: Greediness
>>> s = 'Ken Griffey, Ken Griffey, Jr.'
>>> regex = re.compile(r'.*,')
>>> regex.match(s).group()
'Ken Griffey, Ken Griffey,'
Watch Out: Greediness
>>> s = 'Ken Griffey, Ken Griffey, Jr.'
>>> regex = re.compile(r'.*?,')
>>> regex.match(s).group()
'Ken Griffey,'
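The greedy and non-greedy forms can be compared directly, along with a third option that avoids the question by excluding the delimiter from the repeated set. The negated-class variant is an alternative technique, not one shown on the slides, and the sample string is made up.

```python
import re

s = 'alpha, beta, gamma'   # hypothetical sample string

greedy = re.match(r'.*,', s).group()       # .* takes as much as it can
lazy = re.match(r'.*?,', s).group()        # .*? takes as little as it can
negated = re.match(r'[^,]*,', s).group()   # cannot run past the first comma

print(greedy)    # alpha, beta,
print(lazy)      # alpha,
print(negated)   # alpha,
```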