Introduction to Python for Biologists
Katerina Taškova1, Jean-Fred Fontaine1,2
1 Faculty of Biology, Johannes Gutenberg-Universität Mainz, Mainz, Germany
2 Genomics and Computational Biology, Kernel Press, Mainz, Germany
https://cbdm.uni-mainz.de/mb17
March 21, 2017
Table of Contents
Introduction
Running code
Literals and variables
Numeric types
Strings
– Exercise –
Lists, tuples and ranges
Sets and dictionaries
Convert and copy
Loops
– Exercise –
Functions
Branching
– Exercise –
Regular Expressions
– Exercise –
Annexes
Introduction to Python for Biologists – Introduction
What is Python?
Python is a general-purpose programming language
  created by Guido van Rossum (1991)
  high-level (abstraction from the details of the computer)
  interpreted (needs an interpreter software)
Python design philosophy
  code readability
  syntax brevity
Python is widely used for Biology
  rich built-in features
  powerful scientific extensions
  plotting capabilities
Structured programming I
Instructions are executed sequentially, one per line
Conditional statements allow selective execution of code blocks
Loops allow repeated execution of code blocks
Functions allow on-demand execution of code blocks
Structured programming II
instruction 1            # 1st instruction (hashtag # starts comments)
                         # blank line
repeat 20 times          # 2nd instruction (loop starts a block)
    instruction a        # block defined by indentation (spaces or tabs)
    instruction b        # 2nd instruction in block
                         # blank line
if n>10                  # 3rd instruction (conditional statement)
    instruction a        # 1st instruction in block
    instruction b        # 2nd instruction in block
                         # blank line
                         # blank line
# backslashes join lines
# instruction 3, part 1 (nothing may follow the backslash)
instruction 3 \
instruction 3            # instruction 3, part 2
                         # blank line
# Expressions in (), {}, or [] can span multiple lines
instruction 4 (1, 2, 3,  # instruction 4, part 1
               4, 5, 6)  # instruction 4, part 2
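The pseudo-code above can be turned into a runnable sketch. The variable names (total, last, status, long_sum, values) are illustrative assumptions and are not part of the original slides.

```python
# Runnable counterpart of the pseudo-code; all names are illustrative.
total = 0                       # a single instruction on one line

for i in range(20):             # the loop header starts a block
    total += i                  # block defined by indentation
    last = i                    # 2nd instruction in the block

if total > 10:                  # conditional statement
    status = "large"
else:
    status = "small"

long_sum = 1 + 2 + 3 + \
           4 + 5 + 6            # the backslash joined the two lines

values = [1, 2, 3,              # brackets may span several lines
          4, 5, 6]

print(total, last, status, long_sum, values)
```

Note that the backslash is the very last character on its line; a comment placed after it would be a syntax error.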
Namespace
Variables are names associated with data
  e.g. a=2 assigns value 2 to variable a
Functions are names associated with specific code blocks
  built-in functions are available (see list on slide 100)
  e.g. print(a) will display 2 on the screen
The user namespace is the set of names available to the user
  users can define new names of variables and functions in their namespace
  imported modules can add names of variables and functions in the user namespace
Object-oriented programming
Data is organized in classes and objects
  a class is a template defining what objects can store and do
  an object is an instance of a class
  objects have attributes to store data and methods to do actions
  object namespaces are different from user namespace
Example class "Human" is defined as:
  has a name (an attribute "name")
  has an age (an attribute "age")
  can introduce itself (a method "who")
  example with 1 existing Human object P1:
P1.name = "Mary"   # assigns value to attribute name
P1.age = 26        # assigns value to attribute age
P1.who()           # displays "My name is Mary I am 26!"
who()              # error! not in the user namespace
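The slide uses an existing object P1 without showing its class. One possible definition is sketched below; the constructor arguments and the returned message are assumptions made for illustration, and class definitions themselves are not covered in this part of the course.

```python
class Human:
    """Minimal sketch of the Human class described on the slide."""

    def __init__(self, name, age):
        self.name = name   # attribute "name"
        self.age = age     # attribute "age"

    def who(self):
        """Print and return the introduction sentence."""
        message = "My name is {} I am {}!".format(self.name, self.age)
        print(message)
        return message


P1 = Human("Mary", 26)   # create one Human object
P1.age = 26              # attributes can be reassigned later
intro = P1.who()         # the method is looked up in the object namespace
```

Calling who() alone would raise a NameError, because the name who only exists inside Human objects.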
Modules
Modules can add functionalities to Python
  e.g. classes and functions
Example of available modules:
  NumPy for scientific computing
  Matplotlib for plotting
  BioPython for Biology
Modules have to be imported into the code
# import datetime module in its own namespace
import datetime
datetime.date.today()   # 2017-03-16
today()                 # error!

# import functions log2 and log10 from module math
# in current namespace
from math import log2, log10
log10(1)   # 0.0
Introduction to Python for Biologists – Running code
Running code I
From a terminal by using the interactive Python shell
$ python3   # opens Python shell
a=2         # assigns 2 to a
b=3         # assigns 3 to b
exit()      # closes Python shell
From a terminal by running a script file
  e.g. let's say myscript.py is a script file (simple text file)
  and it contains: print("hello world!")
$ python3 myscript.py   # runs python3 and the script
hello world!            # result of the script on the terminal
Running code II
From Jupyter Notebook
  web-based graphical interface
  manage cells of code or text
  see execution results on the same notebook
  save/open notebooks
Documentation and messages I
Documentation and help:
  https://docs.python.org/3
  use the built-in help() function
    e.g. help(print) to display help for function print()
  see help menu or Google it
Examples of error messages
# Forgetting quotes
print(Hello world)
#   File "<stdin>", line 2
#     print(Hello world)
#                     ^
# SyntaxError: invalid syntax
Documentation and messages II
# Spelling mistakes
prin("Hello world")
# Traceback (most recent call last):
#   File "<stdin>", line 2, in <module>
# NameError: name 'prin' is not defined

# Wrong line break within a string
print("Hello
World")
#   File "<stdin>", line 2
#     print("Hello
#                 ^
# SyntaxError: EOL while scanning string literal
Introduction to Python for Biologists – Literals and variables
Numeric and string literals I
# Numeric literals
12
-123
1.6E3   # means 1600

# String literals
'A string'                   # A string
'A "string"'                 # A "string"
"A 'string'"                 # A 'string'
'''Three single quotes'''    # Three single quotes
"""Three double quotes"""    # Three double quotes
'A \'string\''               # A 'string' (backslash escape sequence)
r'A \'string\''              # A \'string\' (raw string)
Python stores literals in objects of corresponding classes (class int for integers, float for floating point, and str for strings)
Numeric and string literals II
Printing numeric and string literals
print(12)    # 12
print(1+2)   # 3

print('Hello World')   # Hello World

print('Hello World', 1+2)             # Hello World 3
print('Hello World', 1+2, sep='-')    # Hello World-3
print('Hello World', 1+2, sep='\t')   # Hello World    3
                                      # (\t: tab, \n: newline)

print('AB', end='')   # AB (avoid newline at the end)
print('CD')           # ABCD

print('Max is', 12, 'and Min is', 3)   # Max is 12 and Min is 3
Variables I
Variables are names used to access objects
  first character is a letter or an underscore (not a digit)
  no space characters allowed
  case-sensitive (variable name var is not Var)
  prefer alphanumeric characters (e.g. abc123)
  avoid accents, non-alphanumeric, non English
  underscores may be used (e.g. abc_123)
The following Python 3 keywords can not be used as variable names
  False, None, True, and, as, assert, break, class, continue
  def, del, elif, else, except, finally, for, from
  global, if, import, in, is, lambda, nonlocal, not, or, pass
  raise, return, try, while, with, yield
  async, await (since Python 3.7)
In Python 3, print and exec are built-in functions rather than keywords; reusing them as variable names is allowed but hides the functions, so avoid it
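The naming rules above can be checked from Python itself. The sketch below uses the standard keyword module and the string method isidentifier(); the helper name is_valid_variable_name and the list of candidates are assumptions made for illustration.

```python
import keyword

def is_valid_variable_name(name):
    """Return True if name is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["abc123", "abc_123", "1abc", "my name", "for", "Var", "print"]
results = {}
for name in candidates:
    results[name] = is_valid_variable_name(name)
    print(name, results[name])

print(keyword.kwlist)   # the keywords of the running Python version
```

The name print passes the check because it is not a keyword in Python 3, which is exactly why it can be overwritten by accident.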
Variables II
# Numeric types
a=2        # a is assigned an int object of value 2
print(a)   # prints the object assigned to a (2)
b=a        # b is assigned the same object as a (2)
print(b)   # 2
a=5        # a is assigned a new object of value 5
print(a)   # 5
print(b)   # 2 (b is still assigned to object of value 2)

# Strings
c1='a'
print(c1)   # a
myName125 = 'abc'
Introduction to Python for Biologists – Numeric types
Numeric types I
type(7)         # <class 'int'> (integer number)
type(8.25)      # <class 'float'> (floating point)
type(4.52e-3)   # <class 'float'> (floating point)

# Operators (special built-in functions)
1 + 3    # 4 (addition)
4 - 1    # 3 (subtraction)
3 * 2    # 6 (multiplication)
9 / 2    # 4.5 (division)
9 // 2   # 4 (integer division)
9 % 2    # 1 (integer division remainder)
2**3     # 8 (exponent)

# Lowest to highest operators precedence (equal if on same line)
+, -          # Addition, Subtraction
*, /, //, %   # Multiplication, Divisions, Remainder
+x, -x        # Positive, Negative
**            # Exponentiation
Numeric types II
# Built-in functions
abs(-2.58)   # 2.58 (absolute value of x)
round(2.5)   # 2 (round to closest integer; ties go to the even integer)

# With variables
a = 1         # 1
b = 1 + 1     # 2
c = a + b     # 3
d = a+c*b     # 7 (precedence of * over +)
d = (a+c)*b   # 8 (use parentheses to break precedence)

# Short notations (valid for +, -, *, /, ...)
a += 1   # a = a + 1
a *= 5   # a = a * 5

# Special float values
float('NaN')   # nan (Not a Number)
float('Inf')   # inf: Infinite positive; -inf: Infinite negative
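A few consequences of the precedence table, of round() and of the special float values are easy to get wrong, so a short sketch may help. The use of math.isnan() is an addition that does not appear on the slides.

```python
import math

neg_power = -2 ** 2          # ** binds tighter than the unary minus
pos_power = (-2) ** 2        # parentheses change the order
print(neg_power, pos_power)

ties = [round(0.5), round(1.5), round(2.5)]   # ties go to the even integer
print(ties)

nan = float('NaN')
print(nan == nan)            # NaN never compares equal, even to itself
print(math.isnan(nan))       # so test for it with math.isnan()

inf = float('Inf')
print(inf > 1e308, -inf < -1e308)
```
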
Introduction to Python for Biologists – Strings
Sequence types
Text sequence type:
  Strings: immutable sequences of characters
Basic sequence types:
  Lists: mutable sequences
  Tuples: immutable sequences
  Ranges: immutable sequence of numbers
Sequence operations:
  All sequence types support common sequence operations (slide 98)
  Mutable sequence types support specific operations (slide 99)
Strings I
# Quotes
'A string'                  # A string
'A "string"'                # A "string"
"A 'string'"                # A 'string'
'''Three single quotes'''   # Three single quotes
"""Three double quotes"""   # Three double quotes

# Escape sequences (see annexes)
"A single quote '"     # A single quote '
'A single quote \''    # A single quote '
"A tabulation \t"
"A newline \n"
See other escape sequences in slide 97
Triple quoted strings may span multiple lines - all associated whitespace will be included in the string literal
Strings II
# Operators
'pipe' + 'tte'   # = 'pipette' (concatenation)
'A'*7            # = 'AAAAAAA' (replication)
'A'*3 + 'C'*2    # = 'AAACC'
'A' + str(2.0)   # = 'A2.0' (convert number then concatenate)

# Built-in functions
len('A string of characters')   # 22 (length in characters)
type('a')                       # <class 'str'> (string)

# Slices [start:end:step] (0 is index of first character)
"ABCDEFG"[2:5]     # 'CDE' (F at index 5 excluded)
"ABCDEFG"[:5]      # 'ABCDE' (from beginning)
"ABCDEFG"[5:]      # 'FG' (to the end)
"ABCDEFG"[-2:]     # 'FG' (-2 from the end: to the end)
"ABCDEFG"[0:5:2]   # 'ACE' (every second letter with step=2)
Strings methods I
Strings are immutable: new objects are created for changes
seq = "ACGtCCAgTnAGaaGT"

# Case
seq.capitalize()   # 'Acgtccagtnagaagt'
seq.casefold()     # 'acgtccagtnagaagt' (eszett => "ss")
seq.lower()        # 'acgtccagtnagaagt' (eszett => eszett)
seq.swapcase()     # 'acgTccaGtNagAAgt'
seq.upper()        # 'ACGTCCAGTNAGAAGT'

# Search and replace
seq.count('a')             # 2 (case sensitive)
seq.count('G', 0, 4)       # 1 (slice start and end indexes)
seq.endswith('GT')         # True
seq.endswith('G', 0, 4)    # False (slice start and end indexes)
seq.find('GtC')            # 2 (1st hit index, -1 otherwise)
seq.replace("aa", "tt")    # 'ACGtCCAgTnAGttGT' (case sensitive)
seq.replace("A", "x", 2)   # 'xCGtCCxgTnAGaaGT' (2 first hits only)
Strings methods II
seq = "ACGtCCAgTnAGaaGT"

# Is functions
seq.isalnum()     # True (Are all characters alphanumeric?)
seq.isalpha()     # True (Are all characters alphabetic?)
seq.islower()     # False (Are all characters lowercase?)
seq.isnumeric()   # False (Are all characters numeric?)
seq.isspace()     # False (Are all characters whitespace?)
seq.isupper()     # False (Are all characters uppercase?)

# Join and split
"-".join(["A", "B"])   # 'A-B'
"-".join(seq)          # 'A-C-G-t-C-C-A-g-T-n-A-G-a-a-G-T'
seq.partition("aa")    # ('ACGtCCAgTnAG', 'aa', 'GT'): a tuple
seq.split("aa")        # ['ACGtCCAgTnAG', 'GT']: a list
'1\n2'.splitlines()    # ['1', '2'] (split at line boundaries \r, \n)
Strings methods III
seq = "ACGtCCAgTnAGaaGT"

# Deleting
seq.lstrip()   # remove leading whitespace characters
seq.rstrip()   # remove trailing whitespace characters
seq.strip()    # remove whitespace characters from both ends

seq.lstrip("AC")   # 'GtCCAgTnAGaaGT' (remove C's or A's)
seq.lstrip("CA")   # 'GtCCAgTnAGaaGT' (remove C's or A's)
seq.lstrip("C")    # 'ACGtCCAgTnAGaaGT' (no impact)
# same for rstrip but from the right and strip from both ends

# Simple parsing of text lines from CSV files
line.strip().split(',')   # remove newline and split CSV (\t if TSV)

# translate (case sensitive)
table = seq.maketrans('atcg', 'tagc')   # map characters by index
seq.lower().translate(table)            # 'tgcaggtcantcttca'
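The translate example above yields the complement of a lowercased sequence. Combined with a slice using a negative step, it can be extended into a reverse complement that keeps the original case. The function name reverse_complement and the extended mapping table are assumptions made for this sketch.

```python
def reverse_complement(dna):
    """Return the reverse complement of a DNA string, preserving case."""
    table = str.maketrans('atcgATCG', 'tagcTAGC')   # map characters by index
    return dna.translate(table)[::-1]               # complement, then reverse

seq = "ACGtCCAgTnAGaaGT"
rc = reverse_complement(seq)
print(rc)                        # ACttCTnAcTGGaCGT
print(reverse_complement(rc))    # applying it twice gives back seq
```

Characters missing from the table, such as n, are left unchanged by translate().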
Introduction to Python for Biologists – Exercise
Exercise
Create the following directory structure
  Dokumente
  python
  notebooks
  data
Jupyter Notebook
  File: Literals.ipynb
  URL: https://cbdm.uni-mainz.de/mb17
  Download the file into the notebooks folder
Introduction to Python for Biologists – Lists, tuples and ranges
Lists I
A List is a mutable ordered collection of objects
List1 = []   # an empty list

List1 = ['b', 'a', 1, 'cat', 'K', 'dog', 'F']
List1[0]    # 'b' (access item of index 0)
List1[1]    # 'a' (access item of index 1)
List1[-1]   # 'F' (access the last item)
List1[-2]   # 'dog' (access the second last item)

# Slices [start:end:step]
List1[2:5]     # [1, 'cat', 'K'] (index 5 excluded)
List1[:5]      # ['b', 'a', 1, 'cat', 'K']
List1[5:]      # ['dog', 'F']
List1[-2:]     # ['dog', 'F']
List1[0:5:2]   # ['b', 1, 'K']
Lists II
# Built-in functions
List2 = [1, 2, 3, 4, 5]
len(List2)   # 5 (length = 5 items)
max(List2)   # 5
min(List2)   # 1
sum(List2)   # 15

# List methods
List2 = []               # empty list
List2.append(1)          # [1]
List2.append('A')        # [1, 'A']
List2.extend(['B', 2])   # [1, 'A', 'B', 2]
List2.pop(2)             # [1, 'A', 2]
List2.insert(3, 'A')     # [1, 'A', 2, 'A'] (insert 'A' at index 3)
List2.index('A')         # 1 (index of the 1st 'A')
List2.count('A')         # 2 (number of 'A')
List2.reverse()          # ['A', 2, 'A', 1]
Lists III
# sorting
List3 = [5, 3, 4, 1, 2]
sorted(List3)   # [1, 2, 3, 4, 5] (build a new sorted list)
List3           # [5, 3, 4, 1, 2] (List3 not changed)
List3.sort()    # modifies the list in-place
List3           # [1, 2, 3, 4, 5] (.sort() did modify List3!)

# nested list / 2D lists / tables
myList = [['b', 'a'],
          [1, 'cat']]   # a list of 2 lists
myList[0]               # returns the first list ['b', 'a']
myList[0][0]            # 'b' (1st item of the 1st list)
myList[0][1]            # 'a' (2nd item of the 1st list)
myList[1]               # returns the 2nd list [1, 'cat']
myList[1][0] = 10       # [['b', 'a'], [10, 'cat']]
Lists IV
myList = [['b', 'a'],
          [10, 'cat']]

for sublist in myList:      # loop over sublists
    for value in sublist:   # loop over values
        print(value)        # print 1 value per line
# b
# a
# 10
# cat

for sublist in myList:                # loop over sublists
    new_sublist = map(str, sublist)   # convert each item to string
    print('\t'.join(new_sublist))     # print as TSV table
# b     a
# 10    cat
Tuples and ranges
A Tuple is an immutable ordered collection of objects
Tuple1 = ()                                      # empty tuple
Tuple1 = ('b', 'a', 1, 'cat', 'K', 'dog', 'F')   # defined tuple

Tuple1[0]     # 'b'
Tuple1[1:3]   # ('a', 1) (index 3 excluded)
Ranges
# range(start, stop[, step])
range(10)                   # range(0, 10) => no nice print method
list(range(10))             # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
list(range(0, 30, 5))       # [0, 5, 10, 15, 20, 25]
list(range(0, -5, -1))      # [0, -1, -2, -3, -4]
list(range(0))              # []
list(range(1, 0))           # []
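The practical difference between mutable lists and immutable tuples or strings shows up as soon as an item is assigned. The sketch below is illustrative only: the variable names are assumptions, and the try/except construct it relies on is not covered in this part of the course.

```python
items_list = ['b', 'a', 1]
items_tuple = ('b', 'a', 1)
name = 'cat'

items_list[0] = 'z'            # allowed: lists are mutable

rejected = []
for immutable in (items_tuple, name):
    try:
        immutable[0] = 'z'     # not allowed for tuples and strings
    except TypeError:
        rejected.append(type(immutable).__name__)

print(items_list)
print(rejected)
print(list(range(2, 11, 4)))   # ranges are read-only too, but cheap to expand
```
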
Introduction to Python for Biologists – Sets and dictionaries
Sets I
A Set is a mutable unordered collection of unique objects
S0 = set()                # an empty set
S0 = {'a', 1}             # a new set of 2 items
S1 = {'a', 1, 'b', 'R'}   # a new set of 4 items
S2 = {'a', 1, 'b', 'S'}   # a new set of 4 items
len(S0)                   # 2

# Operators
'R' in S1           # True
'R' not in S2       # True
S1 - S2             # in S1 but not in S2 => {'R'}
S1 | S2             # in S1 or in S2 => {1, 'a', 'S', 'R', 'b'}
S1 & S2             # in S1 and in S2 => {1, 'b', 'a'}
S1 ^ S2             # in S1 or in S2 but not in both => {'R', 'S'}
S0 <= S1            # S0 is subset of S1 => True
S1 >= S2            # S1 is superset of S2 => False
S1 >= S0            # True
S0.isdisjoint(S1)   # False
Sets II
# Methods
S0.copy()          # return a new set with a shallow copy of S0
S0.add(item)       # add element item to the set
S0.remove(item)    # remove element item from the set (KeyError if absent)
S0.discard(item)   # remove element item from the set if present
S0.pop()           # remove and return an arbitrary element
S0.clear()         # remove all elements from the set
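Set operators map naturally onto comparisons of gene lists. The sketch below uses two hypothetical samples (the gene names are placeholders chosen for illustration) and also shows how remove() and discard() differ when the item is missing.

```python
sample_a = {'BRCA1', 'TP53', 'EGFR', 'MYC'}   # hypothetical gene lists
sample_b = {'TP53', 'MYC', 'KRAS'}

shared = sample_a & sample_b    # genes found in both samples
only_a = sample_a - sample_b    # genes found in sample_a only
either = sample_a | sample_b    # genes found in at least one sample

sample_b.discard('PTEN')        # absent item: nothing happens
try:
    sample_b.remove('PTEN')     # absent item: raises KeyError
    removed = True
except KeyError:
    removed = False

print(sorted(shared), sorted(only_a), len(either), removed)
```

sorted() is used for printing because a set has no defined order.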
Dictionaries I
A Dictionary is a mutable indexed collection of objects (indexed by unique keys)
d = {}                         # empty dictionary
d = {'A': "ALA", 'C': "CYS"}   # dictionary with 2 items
d['A']                         # 'ALA'
d['C']                         # 'CYS'
d['H'] = "HIS"                 # add new item
d                              # {'A': 'ALA', 'C': 'CYS', 'H': 'HIS'} (insertion order since Python 3.7, arbitrary order before)
del d['A']                     # {'C': 'CYS', 'H': 'HIS'}

'C' in d       # True (key 'C' is in d)
'A' not in d   # True (key 'A' is not in d anymore)
Dictionaries II
d[key]                     get value by key
d[key] = val               set value by key
del d[key]                 delete item by key
d.clear()                  delete all items
len(d)                     number of items
d.copy()                   make a shallow copy
d.keys()                   return a view of all keys
d.values()                 return a view of all values
d.items()                  return a view of all items (key, value)
d.update(d2)               add all items from dictionary d2
d.get(key[, val])          get value by key if it exists, otherwise val
d.setdefault(key[, val])   like d.get(key, val), also set d[key]=val if key not in d
d.pop(key[, default])      remove key and return its value, return default otherwise
d.popitem()                remove an item and return it as a (key, value) tuple (the last inserted item since Python 3.7, an arbitrary one before)
Table: Functions for dictionaries
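Two of the functions in the table, get() and setdefault(), are typically used to build a dictionary while reading data. The sketch below counts bases and records their positions in the sequence used on the string slides; it relies on a for loop and enumerate(), both introduced later in the course, and the variable names are assumptions.

```python
seq = "ACGtCCAgTnAGaaGT".upper()

counts = {}
for base in seq:
    counts[base] = counts.get(base, 0) + 1     # 0 if base was not seen yet

positions = {}
for i, base in enumerate(seq):
    positions.setdefault(base, []).append(i)   # create the list on first use

for base, n in sorted(counts.items()):
    print(base, n, positions[base])

missing = counts.pop('X', 0)                   # default returned, no KeyError
```
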
Introduction to Python for Biologists – Convert and copy
Converting types I
Many Python functions are sensitive to the type of data. For example, you cannot concatenate a string with an integer:
sign = 'You are ' + 21 + '-years-old'        # error!!
sign = 'You are ' + str(21) + '-years-old'   # OK
sign                                         # 'You are 21-years-old'

# convert to int (from str or float)
int('2014')     # 2014 (from a string)
int(3.141592)   # 3 (from a float; the decimal part is dropped)

# convert to float (from str or int)
float('1.99')   # 1.99 (from a string)
float(5)        # 5.0 (from an integer)
Converting types II
# convert to str (from int, float, list, tuple, dict and set)
str(3.141592)    # '3.141592'
str([1,2,3,4])   # '[1, 2, 3, 4]'

# convert a sequence type to another
# (str, list, tuple, and set functions)
new_set   = set(old_list)     # list to set
new_tuple = tuple(old_list)   # list to tuple
new_set   = set("Hello")      # string to set {'H', 'o', 'e', 'l'}
new_list  = list("Hello")     # string to list ['H', 'e', 'l', 'l', 'o']
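Type conversion matters most when reading text files, where every field arrives as a string. The sketch below parses one tab-separated line; the record itself (gene, chromosome, position, score) is a made-up example and the variable names are assumptions.

```python
line = "BRCA1\t17\t43044295\t0.82\n"   # hypothetical TSV record

fields = line.strip().split('\t')      # every field is still a string
gene = fields[0]
chromosome = int(fields[1])            # str -> int
position = int(fields[2])              # str -> int
score = float(fields[3])               # str -> float

next_position = position + 1           # arithmetic now works
report = (gene + ' on chr' + str(chromosome) +
          ', score ' + str(round(score * 100)) + '%')
print(report)
```

Without the int() call, position + 1 would fail with a TypeError, just like the string concatenation error shown above.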
Copy I
Assignments (=) do not copy objects, they create bindings between a target and an object.
# Numeric types (immutable)
a = 1       # a binds the object 1
b = a       # b binds the object 1
b = b + 1   # b binds a new object created by the sum
a           # 1
b           # 2

# Strings (immutable)
a = "Hello"                      # a binds the object "Hello"
b = a                            # b binds the object "Hello"
a = a.replace('o', 'o World!')   # a binds a new object
a                                # 'Hello World!'
b                                # 'Hello'
Copy II
For collections that are mutable or contain mutable items, a shallow copy is sometimes needed so one can change one copy without changing the other.
# Dictionary (mutable)
d1 = {'A': "ALA", 'C': "CYS"}   # d1 binds the object
d2 = d1                         # d2 binds the same object
d2['H'] = "HIS"                 # add item to the object
d1                              # {'A': 'ALA', 'C': 'CYS', 'H': 'HIS'}
d2                              # {'A': 'ALA', 'C': 'CYS', 'H': 'HIS'}

d2 = d1.copy()    # d2 binds a shallow copy of the object
d2['P'] = "PRO"   # add item to the copied object
d1                # {'A': 'ALA', 'C': 'CYS', 'H': 'HIS'}
d2                # {'A': 'ALA', 'C': 'CYS', 'H': 'HIS', 'P': 'PRO'}
Copy III
# List (mutable)
l1 = ['A', 'H', 'C']
l2 = l1
l2.append('P')
l1               # ['A', 'H', 'C', 'P']
l2               # ['A', 'H', 'C', 'P']

l2 = l1[:]       # shallow copy by assigning a slice of the whole list
l2.append('V')
l1               # ['A', 'H', 'C', 'P']
l2               # ['A', 'H', 'C', 'P', 'V']
Introduction to Python for Biologists – Convert and copy
Copy IV
1
2
3
4
1
2
3
Convert types to get copies
new
new
new
new
list
dict
set
tuple
=
=
=
=
list ( oldlist )
dict ( olddict )
set ( o l d l i s t )
tuple ( o l d l i s t )
#
#
#
#
s h a l l o w copy
s h a l l o w copy
copy l i s t as a s e t
copy l i s t a t u p l e
The copy module
i m p o r t copy
x . copy ( )
# s h a l l o w copy o f x
x . deepcopy ( ) # deep copy o f x , i n c l u d i n g embedded o b j e c t s
Introduction to Python for Biologists – Loops
For loop I
# For items in a list
for person in ['Isabel', 'Kate', 'Michael']:
    print("Hi", person)
# Hi Isabel
# Hi Kate
# Hi Michael

# For items in a dictionary
seq = ''                         # an empty string
d = {'A': "ALA", 'C': "CYS"}     # a dictionary with 2 keys
for k in d.keys():               # loop over the keys
    seq += d[k]                  # append value to seq
print(seq)                       # 'ALACYS' (keys keep insertion order since Python 3.7; older versions may give 'CYSALA')
For loop II
# For items in a string
for c in 'abc':
    print(c)
# a
# b
# c

# For items in a range
for n in range(3):
    print(n)
# 0
# 1
# 2

# For items from any iterator
for n in iterator:
    print(n)
Enumerate
# loop getting index and value
RNAs = ['miRNA', 'tRNA', 'mRNA']
for i, rna in enumerate(RNAs):
    print(i, rna)
# 0 miRNA
# 1 tRNA
# 2 mRNA

# loop over 2 lists
RNAtypes = ['micro', 'transfer', 'messenger']
for i, t in enumerate(RNAtypes):
    r = RNAs[i]
    print(i, t, r)
# 0 micro miRNA
# 1 transfer tRNA
# 2 messenger mRNA
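When two lists are walked in parallel, the built-in zip() can pair the items directly, so no indexing is needed. A short sketch reusing the two lists of this slide (redefined here so that the block runs on its own; the list pairs is only added to collect the results):

```python
RNAs = ['miRNA', 'tRNA', 'mRNA']
RNAtypes = ['micro', 'transfer', 'messenger']

pairs = []
# zip() yields one tuple per position; enumerate() adds the index
for i, (t, r) in enumerate(zip(RNAtypes, RNAs)):
    print(i, t, r)
    pairs.append((i, t, r))
# 0 micro miRNA
# 1 transfer tRNA
# 2 messenger mRNA
```

Note that zip() stops at the end of the shortest list, whereas indexing with RNAs[i] raises an IndexError if RNAs is shorter than RNAtypes.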
While loop
i = 0
value = 1
while value < 200:
    i += 1
    value *= i
    print(i, value)
# 1 1
# 2 2
# 3 6
# 4 24
# 5 120
# 6 720
Introduction to Python for Biologists – – Exercise –
Exercise
URL
https://cbdm.uni-mainz.de/mb17
Jupyter Notebook
File: Sequences.ipynb
Download the file into the notebooks folder
Data file
File: shrub dimensions.csv
Download the file into the data folder
Introduction to Python for Biologists – Functions
Functions I
from random import choice        # import function 'choice'

# Simple function
def kmerFixed():                 # define function kmerFixed
    print("ACGTAGACGC")          # print predefined string

kmerFixed()                      # display 'ACGTAGACGC'

# Returning a value
def kmer10():                    # define function kmer10
    seq = ""                     # define an empty string
    for count in range(10):      # repeat 10 times
        seq += choice("CGTA")    # add 1 random nt to string
    return seq                   # return string

newKmer = kmer10()               # call the function, get result into variable
print(newKmer)                   # display the result, e.g. 'ACGGATACGC'
Functions II
# One parameter
def kmer(k):                     # define kmer with 1 param. k
    seq = ""
    for count in range(k):       # k is used to define the range
        seq += choice("CGTA")
    return seq

print(kmer(k=4))    # e.g. 'TACC'
print(kmer(20))     # e.g. 'CACAATGGGTACCCCGGACC'
print(kmer(0))      # '' (empty string)
print(kmer())       # TypeError: kmer() missing 1 required positional argument: 'k'
Functions III
# Function with more parameters and default values
def generic_kmer(alphabet="ACGT", k=10):
    seq = ""
    for count in range(k):
        seq += choice(alphabet)
    return seq

generic_kmer("AB12", 15)             # e.g. '112AA1A12AA1121'
generic_kmer("AB12")                 # e.g. '1AA1B1BA2A'
generic_kmer(k=20)                   # e.g. 'GTGGGCTTGTGCCCTGCACT'
generic_kmer()                       # e.g. 'CTTGCCGGGA'
generic_kmer(k=8, alphabet="#$%&")   # e.g. '$$#&%$%$'
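Because choice() is random, the outputs above change on every run. A minimal sketch of one way to check such a function anyway: fix the random seed with random.seed() and test properties that hold for any output, such as length and alphabet. The seed value 42 and the variable names first and second are arbitrary assumptions, not part of the original slides.

```python
import random
from random import choice

def generic_kmer(alphabet="ACGT", k=10):
    """Return a random string of length k drawn from alphabet."""
    seq = ""
    for count in range(k):
        seq += choice(alphabet)
    return seq

random.seed(42)                  # arbitrary seed, makes the run repeatable
first = generic_kmer(k=8)
random.seed(42)                  # same seed again
second = generic_kmer(k=8)

print(first == second)                      # True: same seed, same k-mer
print(len(generic_kmer("AB12", 15)))        # 15
print(set(generic_kmer()) <= set("ACGT"))   # True: only letters of the alphabet
```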
Name spaces I
Variable and function names defined globally can be seen in functions: this is the global namespace.

a = 10               # global variable

def my_function():
    print(a)         # will use the global variable

my_function()        # 10 (the global a)
print(a)             # 10 (the global a)
Name spaces II
Names defined within a function cannot be seen outside: the function has its own namespace.

a = 10               # global variable

def my_function():
    a = 1            # local variable defined by assignment
    b = 2            # local variable defined by assignment
    print(a)

my_function()        # 1 (the local a)
print(a)             # 10 (the global a)
print(b)             # NameError: name 'b' is not defined
Name spaces III
Use parameters and returned values to get and set variables outside the namespace.

a = 10                   # global variable

def my_function(val):    # local variable val
    b = 2
    val = val + b
    return val

print(a)                 # 10 (the global a)
print(my_function(a))    # 12
print(a)                 # 10 (the global a unchanged)

c = my_function(a)       # set val to 10 and assign 10+2 to c
print(c)                 # 12 (the returned value stored in c)
print(a)                 # 10 (global a was unchanged)

a = my_function(a)       # change global a with value 10+2
Introduction to Python for Biologists – Branching
Truth Value Testing I
Any object can be tested for truth value. The following values are considered false (all other values are considered true):
None
False
zero of any numeric type: e.g. 0 or 0.0
an empty sequence or mapping: e.g. '', (), [], {}
Operations and built-in functions that have a Boolean result return False or True, which compare equal to 0 and 1. Exception: "and" and "or" return one of their operands.
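The rules above can be checked with the built-in bool(), which applies the truth-value test explicitly. A small sketch; the lists falsy and truthy and the variable seq are illustrative names, not from the original slides:

```python
# bool() applies the truth-value test explicitly
falsy = [None, False, 0, 0.0, '', (), [], {}]
truthy = [1, -2.5, 'A', (0,), [0], {'A': 'ALA'}]

print([bool(v) for v in falsy])    # eight times False
print([bool(v) for v in truthy])   # six times True

seq = ''
if not seq:                        # an empty string tests as false
    print("empty sequence")

print(True + True)                 # 2: True and False behave like 1 and 0
```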
Boolean Operations I
A Boolean is equal to True or False
a and b (true if both a and b are true, false otherwise)
a or b (true if at least one of a and b is true, false otherwise)
a ^ b (true if exactly one of a and b is true, false otherwise)
not b (true if b is false, false otherwise)
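The four operations can be summarised as a truth table. The sketch below prints one row per combination of a and b; the list rows is only there to collect the results. Note that ^ is Python's bitwise XOR operator, which returns a Boolean only when both operands are Booleans.

```python
# Print the truth table: a, b, a and b, a or b, a ^ b, not b
rows = []
for a in (True, False):
    for b in (True, False):
        row = (a, b, a and b, a or b, a ^ b, not b)
        rows.append(row)
        print(row)
# (True, True, True, True, False, False)
# (True, False, False, True, True, True)
# (False, True, False, True, True, False)
# (False, False, False, False, False, True)
```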
Boolean Operations II
All example tests below return True unless otherwise specified.

# let's set the values of 3 variables (single "=" symbol)
a = True
b = False
c = True

# simple tests using two "=" symbols (==)
a == True
b == False
c == True
Boolean Operations III
# let's set the values of 3 variables (one "=" symbol)
a = True
b = False
c = True

# order is irrelevant
(a or b) == (b or a)
(a and b) == (b and a)

# neutral (whatever value of a)
(a or False) == a
(a and True) == a

# always the same (whatever value of a)
(a and False) == False
(a or True) == True
Boolean Operations IV
# let's set the values of 3 variables (one "=" symbol)
a = True
b = False
c = True

# precedence "==" > "not" > "and" > "or"
(a and b or c) == ((a and b) or c)
(not a == b) == (not (a == b))

# equivalent expressions
((a or b) or c) == (a or (b or c)) == (a or b or c)
(a or a or a) == a
(b and b and b) == b

b and b and b == b   # False and False and True => False !!

# distributivity (outer parentheses needed because "==" binds tighter than "and" and "or")
(a and (b or c)) == ((a and b) or (a and c))
(a or (b and c)) == ((a or b) and (a or c))
Comparisons
Operations
<                     # strictly less than
<=                    # less than or equal
>                     # strictly greater than
>=                    # greater than or equal
==                    # equal (two "=" symbols)
math.isclose(a, b)    # approximately equal, for floating-point numbers a and b
!=                    # not equal
is                    # object identity
is not                # negated object identity
x < y <= z            # is equivalent to "x < y and y <= z"

Comparisons between objects of the same class are supported if the operator is defined for the class.
Different numerical types can be compared: e.g. 2 < 4.56
Floating-point numbers should not be tested for exact equality: limited precision cannot exactly represent numbers with an infinite expansion, such as 1/3 = 0.33333...
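A short sketch of why math.isclose() is preferred over == for floating-point numbers. The variable x and the values 0.1, 0.2 and 0.3 are illustrative choices, not from the original slides:

```python
import math

x = 0.1 + 0.2
print(x == 0.3)               # False: x is stored as 0.30000000000000004
print(math.isclose(x, 0.3))   # True: equal within a relative tolerance

# Chained comparison, mixing int and float
print(2 < 4.56 <= 5)          # True
```

math.isclose() uses a default relative tolerance of 1e-09; an absolute tolerance (abs_tol) is useful when comparing against zero.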
Conditionals
IF-ELIF-ELSE

seq = 'ATGAnnATG'
if 'n' in seq:
    print("sequence contains undefined bases (n)")
elif 'x' in seq:
    print("sequence contains unknown bases x but not n")
else:
    print("no undefined bases in sequence")
# sequence contains undefined bases (n)

ELIF and ELSE are optional
multiple ELIF are possible
Introduction to Python for Biologists – – Exercise –
Exercise
URL
https://cbdm.uni-mainz.de/mb17
Jupyter Notebook
File: Conditionals.ipynb
Download the file into the notebooks folder
Introduction to Python for Biologists – Regular Expressions
RE: Regular Expressions I
Regular expressions (called REs, regexes, or regex patterns) are a powerful language for matching text patterns (re module).
In Python, a regular expression search is typically written as:

match = re.search(expression, string)

The re.search() function takes a regular expression pattern and a string and searches for that pattern within the string.
If the search is successful, re.search() returns a Match object (class '_sre.SRE_Match' in older Python 3 versions, 're.Match' in recent ones); otherwise it returns None.
RE: Regular Expressions II
import re                                   # import re module
str = 'an example word:cat!!'               # Example string
match = re.search(r'word:\w\w\w', str)      # Search a pattern
if match:
    print('found', match.group())           # 'found word:cat'
else:
    print('did not find')

In the pattern string, \w matches one "word" character (letter, digit or underscore).
The 'r' at the start of the pattern string designates a Python "raw" string, which passes backslashes through without change.
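The effect of the raw-string prefix can be seen by comparing string lengths and by using \b, which means "backspace" in a normal string but "word boundary" in a regular expression. A small sketch; the string text is a made-up example:

```python
import re

print(len('\n'))    # 1: the escape sequence is a single newline character
print(len(r'\n'))   # 2: a backslash followed by the letter n

text = 'a cat sat'
# '\b' in a normal string is a backspace character, not a word boundary
print(re.search('\bcat\b', text))            # None
print(re.search(r'\bcat\b', text).group())   # cat
```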
RE: Basic Patterns
Pattern       Match
a, X, 9, <    ordinary characters match themselves exactly
.             a period matches any single character except newline
\w            matches a "word" character: a letter or digit or underbar [a-zA-Z0-9_]
\W            matches any non-word character
\b            boundary between word and non-word
\s            a single whitespace character: space, newline, return, tab, form feed [ \n\r\t\f]
\S            matches any non-whitespace character
\t            tab
\n            newline
\r            return
\d            decimal digit [0-9]
^             circumflex (top hat) matches the start of a string
$             dollar matches the end of a string
\             inhibits the "specialness" of a character. So, for example, use \. to match a period

Table: Regular expressions: basic patterns
RE: Basic examples I
The basic rules of RE search for a pattern within a string are:
The search proceeds through the string from start to end,
stopping at the first match found
All of the pattern must be matched, but not all of the string
If match = re.search(pat, str) is successful, match is not
None and in particular match.group() is the matching text
RE: Basic examples II
match = re.search(r'iii', 'piiig')         # found
match.group() == "iii"                     # True

match = re.search(r'igs', 'piiig')         # not found
match is None                              # True

match = re.search(r'..g', 'piiig')         # found
match.group() == "iig"                     # True

match = re.search(r'\d\d\d', 'p123g')      # found
match.group() == "123"                     # True

match = re.search(r'\w\w\w', '@@abcd!!')   # found
match.group() == "abc"                     # True
RE: Repetitions I
Repetitions are defined using +, *, ? and { }
+ means 1 or more occurrences of the pattern to its left
e.g. i+ = one or more i’s
* means 0 or more occurrences of the pattern to its left
? means match 0 or 1 occurrences of the pattern to its left
curly brackets are used to specify exact number of repetitions
e.g. A{5} for 5 A letters
A{6,10} for 6 to 10 A letters
Leftmost and Largest:
First the search finds the leftmost match for the pattern, and
second it tries to use up as much of the string as possible
i.e. + and * go as far as possible (they are said to be
”greedy”).
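The curly-bracket forms and the "leftmost and largest" rule can be illustrated on a sequence with two runs of A. A minimal sketch; the sequence seq is hypothetical and is built from pieces so that the run lengths (7 and 5) are explicit:

```python
import re

# Hypothetical sequence: GG, 7 As, TT, 5 As, CC
seq = 'GG' + 'A' * 7 + 'TT' + 'A' * 5 + 'CC'

# A{5}: exactly 5 As; the leftmost run is used, and only 5 of its 7 As
m = re.search(r'A{5}', seq)
print(m.span(), m.group())             # (2, 7) AAAAA

# A{6,10}: 6 to 10 As, as many as possible (greedy)
m = re.search(r'A{6,10}', seq)
print(m.span(), m.group())             # (2, 9) AAAAAAA

# A+ is greedy too: it takes the whole leftmost run
print(re.search(r'A+', seq).group())   # AAAAAAA
```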
RE: Repetitions II
# simple repetitions
re.search(r'pi+', 'piiig').group()        # piii
re.search(r'pi?', 'ap').group()           # p
re.search(r'pi?', 'apii').group()         # pi
re.search(r'pi*', 'ap').group()           # p
re.search(r'pi*', 'apii').group()         # pii
re.search(r'pi{3}', 'apiiiii').group()    # piii
re.search(r'i+', 'piigiiii').group()      # ii (1st hit only)

# 3 digits possibly separated by whitespaces (\s*)
re.search(r'\d\s*\d\s*\d', 'xx1 2   3xx').group()   # "1 2   3"
re.search(r'\d\s*\d\s*\d', 'xx12  3xx').group()     # "12  3"
re.search(r'\d\s*\d\s*\d', 'xx123xx').group()       # "123"
RE: Sets of characters I
Square brackets indicate a set of characters
[ABC] matches 'A' or 'B' or 'C'.
The codes \w, \s etc. work inside square brackets too, with the one exception that dot (.) just means a literal dot
A dash indicates a range, or itself if put at the end
[a-z] for lowercase alphabetic characters
[a-zA-Z] for alphabetic characters
[AB-] for A, B or dash
Circumflex (^) at the start inverts the set
[^AB] for any character except A or B.
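A short sketch of character sets applied to nucleotide strings. The input strings are invented for illustration:

```python
import re

# [ACGT]+ : one or more characters from the set
print(re.search(r'[ACGT]+', 'xxACGTNNAC').group())   # ACGT

# [^ACGT] : any character that is NOT in the set
print(re.findall(r'[^ACGT]', 'ACGTNNAC-T'))          # ['N', 'N', '-']

# A dot inside brackets is a literal dot; a dash at the end is a literal dash
print(re.findall(r'[.-]', 'a.b-c'))                  # ['.', '-']
```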
RE: Sets of characters II
str = 'purple alice-b@google.com monkey dishwasher'
match = re.search(r'\w+@\w+', str)
if match:
    print(match.group())   ## 'b@google'

match = re.search(r'[\w.-]+@[\w.-]+', str)
if match:
    print(match.group())   ## 'alice-b@google.com'
RE: Functions I
RE module functions:
re.match() returns a Match object if an occurrence is found at the beginning
of the string, None otherwise
re.search() returns a Match object for 1st occurrence, None if not
found
re.findall() returns a list of matched sub strings, an empty list if not
found
re.finditer() returns an iterator on Match objects of the
occurrences, an empty iterator if not found
Match object methods:
match.start() returns start index
match.end() returns end index
match.span() returns start and end index in a tuple
match.group() returns matched string
RE: Functions II
import re
seq = "RPAPPDRAPDQX"   # A sequence
expr = 'A.{1,2}D'      # A and D separated by 1 or 2 characters

match = re.search(expr, seq)
if match:
    print(
        match.start(),                      # start index
        match.end(),                        # end index
        match.span(),                       # start and end index
        match.group(),                      # the matched string
        seq[match.start():match.end()],     # the matched string
        sep=' - '
    )
# 2 - 6 - (2, 6) - APPD - APPD
RE: Functions III
import re
seq = "RPAPPDRAPDQX"   # A sequence
expr = 'A.{1,2}D'      # A and D separated by 1 or 2 characters

match = re.match(expr, seq)          # Not found at beginning
print(match)                         # None

matches = re.findall(expr, seq)      # Found 2 occurrences
print(matches)                       # ['APPD', 'APD']

matches = re.finditer(expr, seq)     # Found 2 occurrences
for m in matches:                    # Iterate over Match objects
    print(m.span(), m.group())       # Use each Match object
# (2, 6) APPD
# (7, 10) APD
RE: Group Extraction
Groups are defined with parentheses
On a successful search:
match.group(): the whole match text
match.group(1): match text of the group opened by the 1st left parenthesis
match.group(2): match text of the group opened by the 2nd left parenthesis
...

import re
str = 'purple alice-b@google.com monkey dishwasher'
match = re.search(r'([\w.-]+)@([\w.-]+)', str)
if match:
    print(match.group())    ## 'alice-b@google.com'
    print(match.group(1))   ## 'alice-b'
    print(match.group(2))   ## 'google.com'
RE: Group Extraction and Findall
If the pattern includes a single pair of parentheses, then findall() returns a list of strings corresponding to that single group.
If the pattern includes 2 or more parenthesis groups, then instead of returning a list of strings, findall() returns a list of tuples. Each tuple represents one match of the pattern, and inside the tuple is the group(1), group(2) ... data.

str = 'alice@google.com, monkey bob@abc.com dishwasher'
tuples = re.findall(r'([\w\.-]+)@([\w\.-]+)', str)
print(tuples)
# [('alice', 'google.com'), ('bob', 'abc.com')]

for t in tuples:
    print(t[0], t[1], sep=' | ')
# alice | google.com
# bob | abc.com
RE: Options
The re functions take options to modify the behavior of the pattern
match. The option flag is added as an extra argument to the
search() or findall() etc., e.g. re.search(pat, str,
re.IGNORECASE).
IGNORECASE ignores upper/lowercase differences for
matching
DOTALL allows dot (.) to match newline – normally it matches
anything but newline.
Note that \s (whitespace) includes newlines
MULTILINE allows ^ and $ to match the start and end of each
line within a string made of many lines. Normally they just
match the start and end of the whole string.
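The three options can be compared on a small multi-line string. A minimal sketch; the string text is a hypothetical example chosen so that each flag changes the result:

```python
import re

text = 'atg\nTAA\nATG'   # hypothetical 3-line string

# IGNORECASE: upper/lowercase differences are ignored
print(re.findall(r'atg', text))                  # ['atg']
print(re.findall(r'atg', text, re.IGNORECASE))   # ['atg', 'ATG']

# DOTALL: the dot also matches a newline
print(re.search(r'g.T', text))                                # None
print(re.search(r'g.T', text, re.DOTALL).group() == 'g\nT')   # True

# MULTILINE: ^ and $ match at the start and end of each line
print(re.findall(r'^[A-Z]+$', text))                 # []
print(re.findall(r'^[A-Z]+$', text, re.MULTILINE))   # ['TAA', 'ATG']
```

Several options can be combined with the | operator, e.g. re.IGNORECASE | re.MULTILINE.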
Greedy vs. Non-Greedy
.* or .+ return the largest match (they are said to be "greedy")
to get the shortest possible matches instead, use .*? or .+?

string = '<b>foo</b> and <i>so on</i>'    # string with xml tags

matches = re.findall(r'<.*>', string)     # <.*>
print(matches)   # ['<b>foo</b> and <i>so on</i>']     # got the whole string

matches = re.findall(r'<.*?>', string)    # <.*?>
print(matches)   # ['<b>', '</b>', '<i>', '</i>']      # got each tag
Substitution
re.sub(expression, replacement, string)

text1 = 'alice@google.com and bob@abc.net'
text2 = re.sub(r'\.\w+', r'.de', text1)
print(text2)
# alice@google.de and bob@abc.de

\1, \2 ... in replacement refer to match group(1), group(2) ...

text1 = 'alice@google.com and bob@abc.com'
text2 = re.sub(
    r'([\w\.-]+)@([\w\.-]+)',   # Expression
    r'\2@\1',                   # Replacement string
    text1)                      # Input string
print(text2)
## google.com@alice and abc.com@bob
Introduction to Python for Biologists – – Exercise –
Exercise
URL
https://cbdm.uni-mainz.de/mb17
Jupyter Notebook
File: Regex.ipynb
Download the file into the notebooks folder
Data file
File: sequences.tsv
Download the file into the data folder
Introduction to Python for Biologists – Annexes
References
Python documentation
https://docs.python.org
Online tutorials (Python 2 or 3)
Google's Python Class
ProgrammingForBiologists.org
Escape sequences
Escape Sequence    Meaning
\newline           Backslash and newline ignored
\\                 Backslash (\)
\'                 Single quote (')
\"                 Double quote (")
\a                 ASCII Bell (BEL)
\b                 ASCII Backspace (BS)
\f                 ASCII Formfeed (FF)
\n                 ASCII Linefeed (LF)
\r                 ASCII Carriage Return (CR)
\t                 ASCII Horizontal Tab (TAB)
\v                 ASCII Vertical Tab (VT)
\ooo               Character with octal value ooo
\xhh               Character with hex value hh

Table: Escape sequences
Common Sequence Operations
Operation               Result
x in s                  True if an item of s is equal to x, else False
x not in s              False if an item of s is equal to x, else True
s + t                   the concatenation of s and t
s * n or n * s          equivalent to adding s to itself n times
s[i]                    ith item of s, origin 0
s[i:j]                  slice of s from i to j
s[i:j:k]                slice of s from i to j with step k
len(s)                  length of s
min(s)                  smallest item of s
max(s)                  largest item of s
s.index(x[, i[, j]])    index of the first occurrence of x in s (at or after index i and before index j)
s.count(x)              total number of occurrences of x in s

Table: Sequence operations sorted in ascending priority. s and t are sequences of the same type, n, i, j and k are integers and x is an arbitrary object that meets any type and value restrictions imposed by s.
Operations on mutable sequence types
Operation                Result
s[i] = x                 item i of s is replaced by x
s[i:j] = t               slice of s from i to j is replaced by the contents of the iterable t
del s[i:j]               same as s[i:j] = []
s[i:j:k] = t             the elements of s[i:j:k] are replaced by those of t
del s[i:j:k]             removes the elements of s[i:j:k] from the list
s.append(x)              appends x to the end of the sequence (same as s[len(s):len(s)] = [x])
s.clear()                removes all items from s (same as del s[:])
s.copy()                 creates a shallow copy of s (same as s[:])
s.extend(t) or s += t    extends s with the contents of t (for the most part the same as s[len(s):len(s)] = t)
s *= n                   updates s with its contents repeated n times
s.insert(i, x)           inserts x into s at the index given by i (same as s[i:i] = [x])
s.pop([i])               retrieves the item at i and also removes it from s
s.remove(x)              removes the first item from s where s[i] == x
s.reverse()              reverses the items of s in place

Table: s is an instance of a mutable sequence type, t is any iterable object and x is an arbitrary object that meets any type and value restrictions imposed by s
Built-in functions
abs()       Return the absolute value of a number.
all()       Return True if all elements of the iterable are true (or if the iterable is empty).
any()       Return True if any element of the iterable is true. If the iterable is empty, return False.
ascii()     Return a string containing a printable representation of an object (escape non-ASCII characters).
bin()       Convert an integer number to a binary string.
bool()      Convert a value to a Boolean.
chr()       Return the string representing a character.
dict()      Create a new dictionary.
dir()       Return the list of names in the current local scope.
float()     Convert a string or a number to floating point.
format()    Convert a value to a "formatted" representation.
help()      Invoke the built-in help system.
hex()       Convert an integer number to a hexadecimal string.

Table: Python built-in functions