Module 4 – Python and Regular Expressions
• Module 4 contains only an individual assignment
• Due on Monday, February 27th
• Do not wait until the last minute to start on this module
• Read the WIKI before starting, along with a few Python tutorials
• Portions of today's slides came from
  – Marc Conrad
    • University of Luton
  – Paul Prescod
    • Vancouver Python Users' Group
  – James Casey
    • Opscode
  – Tim Finin
    • University of Maryland

Extensible Networking Platform - CSE 330 – Creative Programming and Rapid Prototyping

What is Python?
• Python is an easy to learn, powerful programming language
  – Efficient high-level data structures
  – Simple approach to object-oriented programming
  – Elegant syntax and dynamic typing
  – Up-and-coming language in the open source world
• We are using Python version 2.x (not 3.x)

Usability Features
• Very clear syntax
• Obvious way to do most things
• Huge amount of free code and libraries
• Interactive
• Only innovative where innovation is really necessary
  – Better to steal a good idea than invent a bad one!

Python "Hello world"
    print "Hello, World"

Python Interpreter
• Just type:
    Todds-MacBook-Air:~ todd$ python
    Python 2.7.10 (default, Sep 23 2015, 04:34:21)
    [GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.72)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.

Features of the Interpreter
• Lines start with ">>>". You can recognize Python interpreter transcripts anywhere you see them.
• Expressions that return a value display the value.
    >>> 5+3*4
    17
• This saves you from excessive "print"ing

Interactive Interpreters
• Windows command line
• OS X
• Linux/Unix
• Graphical command lines: "IDLE", "PythonWin", "MacPython", …
• Jython
• And many more…

Python scripts
• Sometimes you want to run the same program more than once!
• Make a file with Python statements in it:
    foo.py:
        print "hello world"

    todd$ python foo.py
    hello world
    todd$ python foo.py
    hello world

Python is dynamically typed
    width = 20
    print width
    height = 5 * 9
    print height
    print width * height
    width = "really wide"
    print width

Experiment in the Interpreter
• Any Python variable can hold any value.
    >>> width = 20
    >>> height = 5 * 9
    >>> print width * height
    900
    >>> width = "really wide"
    >>> print width
    really wide

Dynamic Type Checking
    test_sqrt.py:
        import math

        def square_root(num):
            return math.sqrt(num)

        def goodfunc():
            print square_root(10)

        def badfunc():
            # The type error is only detected when this line actually runs
            print square_root("10")

        goodfunc()
        badfunc()

Multiple statements on a line
• You can combine multiple simple statements on a line:
    >>> a = 5; print a; a = 6; print a
    5
    6

Indentation
• Python uses indentation for scoping:
    if this_function(that_variable):
        do_something()
    else:
        do_something_else()

Indentation
• Tabs and spaces look the same in most editors.
• If your editor uses a different conversion rate between tabs and spaces than "standard", your Python code may not parse properly.
• Three easy solutions:
  1. Only use tabs or spaces in a file: don't mix them.
  2. Use an editor that knows about Python.
  3. Configure editor to use the same tab/space rules as Python, vi, emacs, notepad, edit, etc.: 8 spaces per tab

Compared to PHP/Javascript
• PHP and Javascript are excellent for Web apps (PHP on server, Javascript on client) but not much else.
• Python can be used for your Web apps, your complicated algorithms, your GUIs, your COM components, and as an extension language for Java programs
• Even in Web apps, Python handles complexity better.

Compared to Java
• Java is more difficult for amateur programmers.
• Static type checking can be inconvenient and inflexible.
• Bottom line: Java can make projects harder than they need to be.

Python Limitations
• Not the fastest executing programming language:
  – C/C++ is naturally fast
  – Perl's regular expressions and IO are a little faster
  – Some Java implementations have good JITs
  – But Python also has some speed advantages:
    • Fast implementations of built-in data structures
    • Pyrex compiles Python code to C
• Dynamic type checking requires more care in testing.
• Language changes (relatively) quickly: this is a strength and a weakness.

Objects All the Way Down
• Everything in Python is an object
• Integers are objects.
• Characters are objects.
• Complex numbers are objects.
• Booleans are objects.
• Functions are objects.
• Methods are objects.
• Modules are objects

Object Type and Identity
• You can find out the type of any object:
    >>> print type(1)
    <type 'int'>
    >>> print type(1.0)
    <type 'float'>
• Every object also has a unique identifier (usually only for debugging purposes)
    >>> print id(1)
    7629640
    >>> print id("1")
    7910560

None
• "None" represents the lack of a value.
• Like "NULL" in some languages or in databases.
• For instance:
    >>> if y!=0:
    ...     fraction = x/y
    ... else:
    ...     fraction = None

File Objects
• Represent opened files:
    >>> infile = file( "catalog.txt", "r" )
    >>> data = infile.read()
    >>> infile.close()
    >>> outfile = file( "catalog2.txt", "w" )
    >>> data = data+ "more data"
    >>> outfile.write( data )
    >>> outfile.close()
• You may sometimes see the name "open" used to create files.

Basic Flow Control
• if/elif/else (test condition)
• while (loop until condition changes)
• for (iterate over iterable object)

if Statement
    if j=="Hello":
        doSomething()
    elif j=="World":
        doSomethingElse()
    else:
        doTheRightThing()

while Statement
    str=""
    while str!="quit":
        str=raw_input()
        print str
    print "Done"

for Statement
    myList = ["a", "b", "c", "d", "e"]
    for i in myList:
        print i
    for i in range( 10 ):
        print i
    for i in range( len( myList ) ):
        if myList[i]=="c":
            myList[i]=None
• Can "break" out of for-loops.
• Can "continue" to next iteration.

Python Modules

What is a Module?
- A file containing some Python code
OR
- A .dll (.so on Unix) containing compiled code which follows some guidelines
- A namespace

A Python Module
    def hello_world():
        print "Hello world"
• Save this as "myModule.py". Now we can use it:
    >>> import myModule
    >>> myModule.hello_world()
• Or:
    >>> from myModule import hello_world
    >>> hello_world()

Web Client Access - Example
    >>> import urllib2
    >>> url = 'http://research.engineering.wustl.edu/~todd/date.php'
    >>> data = urllib2.urlopen(url)
    >>> for line in data:
    ...     if 'Today' in line:
    ...         print line
    ...
    <BR>Today is 02-24-2016

Other Built-in Protocols
• FTP
• XML-RPC
• Telnet
• POP
• IMAP
• MIME
• NNTP
• HTTP
• SSL
• Sockets
• CGI
• Gopher
• URL Parsing
• Plus downloadable modules for every other protocol in the universe!
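
The web-client slide above reads a page and prints the lines that contain 'Today'. Below is a minimal sketch of that filtering step, run against hard-coded sample lines rather than the network so it works offline. It also touches the "URL Parsing" support listed among the built-in protocols. The names `lines_containing` and `sample_page` are hypothetical and introduced only for illustration, and the `try`/`except` import is an assumption added so the snippet also runs on Python 3 (the course itself uses 2.x).

```python
try:
    from urlparse import urlparse        # Python 2.x location
except ImportError:
    from urllib.parse import urlparse    # Python 3.x location (assumption: not covered in the slides)


def lines_containing(lines, word):
    """Return the lines that contain `word` (hypothetical helper, not from the slides)."""
    return [line for line in lines if word in line]


# URL from the web-client slide; it is only parsed here, never fetched.
url = 'http://research.engineering.wustl.edu/~todd/date.php'
parts = urlparse(url)
print(parts.scheme)   # protocol part of the URL
print(parts.netloc)   # host name
print(parts.path)     # path on the server

# Stand-in for the lines that urllib2.urlopen(url) would yield. Made-up sample
# data, except the 'Today' line, which is the output shown on the slide.
sample_page = ['<HTML>', '<BR>Today is 02-24-2016', '</HTML>']
for line in lines_containing(sample_page, 'Today'):
    print(line)
```

Keeping the filter in its own function means the same code can later be handed the object returned by `urllib2.urlopen(url)`, since that object can be iterated line by line just like the list used here.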

Regular Expressions

Regular Expressions
• Regular expressions are a powerful string manipulation tool
• All modern languages have similar library packages for regular expressions
• Use regular expressions to:
  – Search a string (search and match)
  – Replace parts of a string (sub)
  – Break strings into smaller pieces (split)

Regular Expression Syntax
• Most characters match themselves
    The regular expression "test" matches the string 'test', and only that string
• [x] matches any one of a list of characters
    "[abc]" matches 'a', 'b', or 'c'
• [^x] matches any one character that is not included in x
    "[^abc]" matches any single character except 'a', 'b', or 'c'

Regular Expression Syntax
• "." matches any single character (except a newline, by default)
• Parentheses can be used for grouping
    "(abc)+" matches 'abc', 'abcabc', 'abcabcabc', etc.
• x|y matches x or y
    "this|that" matches 'this' and 'that', but not 'thisthat'.

Regular Expression Syntax
• x* matches zero or more x's
    "a*" matches '', 'a', 'aa', etc.
• x+ matches one or more x's
    "a+" matches 'a', 'aa', 'aaa', etc.
• x? matches zero or one x's
    "a?" matches '' or 'a'
• x{m,n} matches i x's, where m ≤ i ≤ n
    "a{2,3}" matches 'aa' or 'aaa'

Regular Expression Syntax
• "\d" matches any digit; "\D" any non-digit
• "\s" matches any whitespace character; "\S" any non-whitespace character
• "\w" matches any alphanumeric character (or underscore); "\W" any other character
• "^" matches the beginning of the string; "$" the end of the string

Debuggex Example

Search and Match in Python RegEx
• The two basic functions are re.search and re.match
  – Search looks for a pattern anywhere in a string
  – Match looks for a match starting at the beginning
• Both return None (logical false) if the pattern isn't found and a "match object" instance if it is
    >>> import re
    >>> pat = "a*b"
    >>> re.search(pat,"fooaaabcde")
    <_sre.SRE_Match object at 0x809c0>
    >>> re.match(pat,"fooaaabcde")
    >>>

What's a match object?
• An instance of the match class with the details of the match result
    >>> r1 = re.search("a*b","fooaaabcde")
    >>> r1.group()    # group returns string matched
    'aaab'
    >>> r1.start()    # index of the match start
    3
    >>> r1.end()      # index of the match end
    7
    >>> r1.span()     # tuple of (start, end)
    (3, 7)

What got matched?
• Here's a pattern to match simple email addresses
    \w+@(\w+\.)+(com|org|net|edu)
    >>> pat1 = "\w+@(\w+\.)+(com|org|net|edu)"
    >>> r1 = re.match(pat1,"[email protected]")
    >>> r1.group()
    '[email protected]'
• We might want to extract the pattern parts, like the email name and host

What got matched?
• We can put parentheses around groups we want to be able to reference
    >>> pat2 = "(\w+)@((\w+\.)+(com|org|net|edu))"
    >>> r2 = re.match(pat2,"[email protected]")
    >>> r2.group(1)
    'todd'
    >>> r2.group(2)
    'arl.wustl.edu'
    >>> r2.groups()
    ('todd', 'arl.wustl.edu', 'wustl.', 'edu')
• Note that the 'groups' are numbered in a preorder traversal

What got matched?
• We can 'label' the groups as well…
    >>> pat3 = "(?P<name>\w+)@(?P<host>(\w+\.)+(com|org|net|edu))"
    >>> r3 = re.match(pat3,"[email protected]")
    >>> r3.group('name')
    'todd'
    >>> r3.group('host')
    'arl.wustl.edu'
• And reference the matching parts by the labels

More re functions
• re.split() is like split but can use patterns
    >>> re.split("\W+", "This... is a test, short and sweet, of split().")
    ['This', 'is', 'a', 'test', 'short', 'and', 'sweet', 'of', 'split', '']
• re.sub substitutes one string for a pattern
    >>> re.sub('(blue|white|red)', 'black', 'blue socks and red shoes')
    'black socks and black shoes'
• re.findall() finds all matches
    >>> re.findall("\d+","12 dogs,11 cats, 1 egg")
    ['12', '11', '1']

Compiling regular expressions
• If you plan to use a re pattern more than once, compile it to a re object
• Python produces a special data structure that speeds up matching
    >>> cpat3 = re.compile(pat3)
    >>> cpat3
    <_sre.SRE_Pattern object at 0x2d9c0>
    >>> r3 = cpat3.search("[email protected]")
    >>> r3
    <_sre.SRE_Match object at 0x895a0>
    >>> r3.group()
    '[email protected]'

Module 4 Assignment
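
The regular-expression slides above cover compiling a pattern, labeled groups, the difference between search and match, and findall. Here is a minimal sketch that puts those pieces together. The helper `split_email` and the sample address are hypothetical, made up for illustration rather than taken from the slides. The pattern has the same shape as the labeled pattern in the slides, but is written as a raw string (an addition here) so the backslashes reach the `re` module unchanged. The `print(...)` calls use a single argument so the snippet runs on Python 2.x and, as an assumption beyond the course setup, on Python 3.

```python
import re

# Compile once, reuse many times (see "Compiling regular expressions").
EMAIL = re.compile(r"(?P<name>\w+)@(?P<host>(\w+\.)+(com|org|net|edu))")


def split_email(text):
    """Return (name, host) for the first address found in `text`, or None.

    Hypothetical helper, not from the slides.
    """
    m = EMAIL.search(text)   # search scans the whole string; match only tries index 0
    if m is None:            # None means the pattern was not found
        return None
    return m.group('name'), m.group('host')


# Made-up address on a reserved example domain.
print(split_email("contact: student@cs.example.edu today"))  # ('student', 'cs.example.edu')
print(split_email("no address here"))                        # None

# match() fails here because the address does not start at index 0.
print(EMAIL.match("contact: student@cs.example.edu"))        # None

# With no groups in the pattern, findall returns the matched substrings.
print(re.findall(r"\d+", "12 dogs,11 cats, 1 egg"))          # ['12', '11', '1']
```

Checking for `None` before calling `group()` matters: calling a method on a failed match raises an `AttributeError`, which is the kind of run-time-only error the "Dynamic Type Checking" slide warns about.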