About This eBook

ePUB is an open, industry-standard format for eBooks. However, support of ePUB and its many features varies across reading devices and applications. Use your device or app settings to customize the presentation to your liking. Settings that you can customize often include font, font size, single or double column, landscape or portrait mode, and figures that you can click or tap to enlarge. For additional information about the settings and features on your reading device or app, visit the device manufacturer’s Web site.

Many titles include programming code or configuration examples. To optimize the presentation of these elements, view the eBook in single-column, landscape mode and adjust the font size to the smallest setting. In addition to presenting code and configurations in the reflowable text format, we have included images of the code that mimic the presentation found in the print book; therefore, where the reflowable format may compromise the presentation of the code listing, you will see a “Click here to view code image” link. Click the link to view the print-fidelity code image. To return to the previous page viewed, click the Back button on your device or app.

Effective Python
59 SPECIFIC WAYS TO WRITE BETTER PYTHON

Brett Slatkin

Upper Saddle River, NJ • Boston • Indianapolis • San Francisco
New York • Toronto • Montreal • London • Munich • Paris • Madrid
Capetown • Sydney • Tokyo • Singapore • Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.
For information about buying this title in bulk quantities, or for special sales opportunities (which may include electronic versions; custom cover designs; and content particular to your business, training goals, marketing focus, or branding interests), please contact our [email protected] (800) 382-3419.

For government sales inquiries, [email protected].

For questions about sales outside the United States, please contact [email protected].

Visit us on the Web: informit.com/aw

Library of Congress Cataloging-in-Publication Data
Slatkin, Brett, author.
Effective Python: 59 specific ways to write better Python / Brett Slatkin.
pages cm
Includes index.
ISBN 978-0-13-403428-7 (pbk.: alk. paper); ISBN 0-13-403428-7 (pbk.: alk. paper)
1. Python (Computer program language) 2. Computer programming. I. Title.
QA76.73.P98S57 2015
005.13’3--dc23
2014048305

Copyright © 2015 Pearson Education, Inc.

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, One Lake Street, Upper Saddle River, New Jersey 07458, or you may fax your request to (201) 236-3290.

ISBN-13: 978-0-13-403428-7
ISBN-10: 0-13-403428-7

Text printed in the United States on recycled paper at RR Donnelley in Crawfordsville, Indiana.
First printing, March 2015

Editor-in-Chief: Mark L. Taub
Senior Acquisitions Editor: Trina MacDonald
Managing Editor: John Fuller
Full-Service Production Manager: Julie B. Nahil
Copy Editor: Stephanie Geels
Indexer: Jack Lewis
Proofreader: Melissa Panagos
Technical Reviewers: Brett Cannon, Tavis Rudd, Mike Taylor
Editorial Assistant: Olivia Basegio
Cover Designer: Chuti Prasertsith
Compositor: LaurelTech

Praise for Effective Python

“Each item in Slatkin’s Effective Python teaches a self-contained lesson with its own source code. This makes the book random-access: Items are easy to browse and study in whatever order the reader needs. I will be recommending Effective Python to students as an admirably compact source of mainstream advice on a very broad range of topics for the intermediate Python programmer.”
- Brandon Rhodes, software engineer at Dropbox and chair of PyCon 2016-2017

“I’ve been programming in Python for years and thought I knew it pretty well. Thanks to this treasure trove of tips and techniques, I realize there’s so much more I could be doing with my Python code to make it faster (e.g., using built-in data structures), easier to read (e.g., enforcing keyword-only arguments), and much more Pythonic (e.g., using zip to iterate over lists in parallel).”
- Pamela Fox, educationeer, Khan Academy

“If I had this book when I first switched from Java to Python, it would have saved me many months of repeated code rewrites, which happened each time I realized I was doing particular things ‘non-Pythonically.’ This book collects the vast majority of basic Python ‘must-knows’ into one place, eliminating the need to stumble upon them one-by-one over the course of months or years. The scope of the book is impressive, starting with the importance of PEP 8 as well as that of major Python idioms, then reaching through function, method and class design, effective standard library use, quality API design, testing, and performance measurement; this book really has it all. A fantastic introduction to what it really means to be a Python programmer for both the novice and the experienced developer.”
- Mike Bayer, creator of SQLAlchemy

“Effective Python will take your Python skills to the next level with clear guidelines for improving Python code style and function.”
- Leah Culver, developer advocate, Dropbox

“This book is an exceptionally great resource for seasoned developers in other languages who are looking to quickly pick up Python and move beyond the basic language constructs into more Pythonic code. The organization of the book is clear, concise, and easy to digest, and each item and chapter can stand on its own as a meditation on a particular topic. The book covers the breadth of language constructs in pure Python without confusing the reader with the complexities of the broader Python ecosystem. For more seasoned developers the book provides in-depth examples of language constructs they may not have previously encountered, and provides examples of less commonly used language features. It is clear that the author is exceptionally facile with Python, and he uses his professional experience to alert the reader to common subtle bugs and common failure modes. Furthermore, the book does an excellent job of pointing out subtleties between Python 2.X and Python 3.X and could serve as a refresher course as one transitions between variants of Python.”
- Katherine Scott, software lead, Tempo Automation

“This is a great book for both novice and experienced programmers. The code examples and explanations are well thought out and explained concisely and thoroughly.”
- C. Titus Brown, associate professor, UC Davis

“This is an immensely useful resource for advanced Python usage and building cleaner, more maintainable software. Anyone looking to take their Python skills to the next level would benefit from putting the book’s advice into practice.”
- Wes McKinney, creator of pandas; author of Python for Data Analysis; and software engineer at Cloudera

To our family, loved and lost

Contents

Preface
Acknowledgments
About the Author

Chapter 1: Pythonic Thinking
Item 1: Know Which Version of Python You’re Using
Item 2: Follow the PEP 8 Style Guide
Item 3: Know the Differences Between bytes, str, and unicode
Item 4: Write Helper Functions Instead of Complex Expressions
Item 5: Know How to Slice Sequences
Item 6: Avoid Using start, end, and stride in a Single Slice
Item 7: Use List Comprehensions Instead of map and filter
Item 8: Avoid More Than Two Expressions in List Comprehensions
Item 9: Consider Generator Expressions for Large Comprehensions
Item 10: Prefer enumerate Over range
Item 11: Use zip to Process Iterators in Parallel
Item 12: Avoid else Blocks After for and while Loops
Item 13: Take Advantage of Each Block in try/except/else/finally

Chapter 2: Functions
Item 14: Prefer Exceptions to Returning None
Item 15: Know How Closures Interact with Variable Scope
Item 16: Consider Generators Instead of Returning Lists
Item 17: Be Defensive When Iterating Over Arguments
Item 18: Reduce Visual Noise with Variable Positional Arguments
Item 19: Provide Optional Behavior with Keyword Arguments
Item 20: Use None and Docstrings to Specify Dynamic Default Arguments
Item 21: Enforce Clarity with Keyword-Only Arguments

Chapter 3: Classes and Inheritance
Item 22: Prefer Helper Classes Over Bookkeeping with Dictionaries and Tuples
Item 23: Accept Functions for Simple Interfaces Instead of Classes
Item 24: Use @classmethod Polymorphism to Construct Objects Generically
Item 25: Initialize Parent Classes with super
Item 26: Use Multiple Inheritance Only for Mix-in Utility Classes
Item 27: Prefer Public Attributes Over Private Ones
Item 28: Inherit from collections.abc for Custom Container Types

Chapter 4: Metaclasses and Attributes
Item 29: Use Plain Attributes Instead of Get and Set Methods
Item 30: Consider @property Instead of Refactoring Attributes
Item 31: Use Descriptors for Reusable @property Methods
Item 32: Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes
Item 33: Validate Subclasses with Metaclasses
Item 34: Register Class Existence with Metaclasses
Item 35: Annotate Class Attributes with Metaclasses

Chapter 5: Concurrency and Parallelism
Item 36: Use subprocess to Manage Child Processes
Item 37: Use Threads for Blocking I/O, Avoid for Parallelism
Item 38: Use Lock to Prevent Data Races in Threads
Item 39: Use Queue to Coordinate Work Between Threads
Item 40: Consider Coroutines to Run Many Functions Concurrently
Item 41: Consider concurrent.futures for True Parallelism

Chapter 6: Built-in Modules
Item 42: Define Function Decorators with functools.wraps
Item 43: Consider contextlib and with Statements for Reusable try/finally Behavior
Item 44: Make pickle Reliable with copyreg
Item 45: Use datetime Instead of time for Local Clocks
Item 46: Use Built-in Algorithms and Data Structures
Item 47: Use decimal When Precision Is Paramount
Item 48: Know Where to Find Community-Built Modules

Chapter 7: Collaboration
Item 49: Write Docstrings for Every Function, Class, and Module
Item 50: Use Packages to Organize Modules and Provide Stable APIs
Item 51: Define a Root Exception to Insulate Callers from APIs
Item 52: Know How to Break Circular Dependencies
Item 53: Use Virtual Environments for Isolated and Reproducible Dependencies

Chapter 8: Production
Item 54: Consider Module-Scoped Code to Configure Deployment Environments
Item 55: Use repr Strings for Debugging Output
Item 56: Test Everything with unittest
Item 57: Consider Interactive Debugging with pdb
Item 58: Profile Before Optimizing
Item 59: Use tracemalloc to Understand Memory Usage and Leaks

Index

Preface

The Python programming language has unique strengths and charms that can be hard to grasp. Many programmers familiar with other languages often approach Python from a limited mindset instead of embracing its full expressivity. Some programmers go too far in the other direction, overusing Python features that can cause big problems later.

This book provides insight into the Pythonic way of writing programs: the best way to use Python. It builds on a fundamental understanding of the language that I assume you already have. Novice programmers will learn the best practices of Python’s capabilities. Experienced programmers will learn how to embrace the strangeness of a new tool with confidence.

My goal is to prepare you to make a big impact with Python.

What This Book Covers

Each chapter in this book contains a broad but related set of items. Feel free to jump between items and follow your interest. Each item contains concise and specific guidance explaining how you can write Python programs more effectively. Items include advice on what to do, what to avoid, how to strike the right balance, and why this is the best choice.

The items in this book are for Python 3 and Python 2 programmers alike (see Item 1: “Know Which Version of Python You’re Using”). Programmers using alternative runtimes like Jython, IronPython, or PyPy should also find the majority of items to be applicable.
Chapter 1: Pythonic Thinking

The Python community has come to use the adjective Pythonic to describe code that follows a particular style. The idioms of Python have emerged over time through experience using the language and working with others. This chapter covers the best way to do the most common things in Python.

Chapter 2: Functions

Functions in Python have a variety of extra features that make a programmer’s life easier. Some are similar to capabilities in other programming languages, but many are unique to Python. This chapter covers how to use functions to clarify intention, promote reuse, and reduce bugs.

Chapter 3: Classes and Inheritance

Python is an object-oriented language. Getting things done in Python often requires writing new classes and defining how they interact through their interfaces and hierarchies. This chapter covers how to use classes and inheritance to express your intended behaviors with objects.

Chapter 4: Metaclasses and Attributes

Metaclasses and dynamic attributes are powerful Python features. However, they also enable you to implement extremely bizarre and unexpected behaviors. This chapter covers the common idioms for using these mechanisms to ensure that you follow the rule of least surprise.

Chapter 5: Concurrency and Parallelism

Python makes it easy to write concurrent programs that do many different things seemingly at the same time. Python can also be used to do parallel work through system calls, subprocesses, and C-extensions. This chapter covers how to best utilize Python in these subtly different situations.

Chapter 6: Built-in Modules

Python is installed with many of the important modules that you’ll need to write programs. These standard packages are so closely intertwined with idiomatic Python that they may as well be part of the language specification. This chapter covers the essential built-in modules.

Chapter 7: Collaboration

Collaborating on Python programs requires you to be deliberate about how you write your code. Even if you’re working alone, you’ll want to understand how to use modules written by others. This chapter covers the standard tools and best practices that enable people to work together on Python programs.
Chapter 8: Production

Python has facilities for adapting to multiple deployment environments. It also has built-in modules that aid in hardening your programs and making them bulletproof. This chapter covers how to use Python to debug, optimize, and test your programs to maximize quality and performance at runtime.

Conventions Used in This Book

Python code snippets in this book are in monospace font and have syntax highlighting. I take some artistic license with the Python style guide to make the code examples better fit the format of a book or to highlight the most important parts. When lines are long, I use characters to indicate that they wrap. I truncate snippets with ellipses comments (# …) to indicate regions where code exists that isn’t essential for expressing the point. I’ve also left out embedded documentation to reduce the size of code examples. I strongly suggest that you don’t do this in your projects; instead, you should follow the style guide (see Item 2: “Follow the PEP 8 Style Guide”) and write documentation (see Item 49: “Write Docstrings for Every Function, Class, and Module”).

Most code snippets in this book are accompanied by the corresponding output from running the code. When I say “output,” I mean console or terminal output: what you see when running the Python program in an interactive interpreter. Output sections are in monospace font and are preceded by a >>> line (the Python interactive prompt). The idea is that you could type the code snippets into a Python shell and reproduce the expected output.

Finally, there are some other sections in monospace font that are not preceded by a >>> line. These represent the output of running programs besides the Python interpreter. These examples often begin with $ characters to indicate that I’m running programs from a command-line shell like Bash.

Where to Get the Code and Errata

It’s useful to view some of the examples in this book as whole programs without interleaved prose. This also gives you a chance to tinker with the code yourself and understand why the program works as described. You can find the source code for all code snippets in this book on the book’s website (http://www.effectivepython.com). Any errors found in the book will have corrections posted on the website.
Acknowledgments

This book would not have been possible without the guidance, support, and encouragement from many people in my life.

Thanks to Scott Meyers for the Effective Software Development series. I first read Effective C++ when I was 15 years old and fell in love with the language. There’s no doubt that Scott’s books led to my academic experience and first job at Google. I’m thrilled to have had the opportunity to write this book.

Thanks to my core technical reviewers for the depth and thoroughness of their feedback: Brett Cannon, Tavis Rudd, and Mike Taylor. Thanks to Leah Culver and Adrian Holovaty for thinking this book would be a good idea. Thanks to my friends who patiently read earlier versions of this book: Michael Levine, Marzia Niccolai, Ade Oshineye, and Katrina Sostek. Thanks to my colleagues at Google for their review. Without all of your help, this book would have been inscrutable.

Thanks to everyone involved in making this book a reality. Thanks to my editor Trina MacDonald for kicking off the process and being supportive throughout. Thanks to the team who were instrumental: development editors Tom Cirtin and Chris Zahn, editorial assistant Olivia Basegio, marketing manager Stephane Nakib, copy editor Stephanie Geels, and production editor Julie Nahil.

Thanks to the wonderful Python programmers I’ve known and worked with: Anthony Baxter, Brett Cannon, Wesley Chun, Jeremy Hylton, Alex Martelli, Neal Norwitz, Guido van Rossum, Andy Smith, Greg Stein, and Ka-Ping Yee. I appreciate your tutelage and leadership. Python has an excellent community and I feel lucky to be a part of it.

Thanks to my teammates over the years for letting me be the worst player in the band. Thanks to Kevin Gibbs for helping me take risks. Thanks to Ken Ashcraft, Ryan Barrett, and Jon McAlister for showing me how it’s done. Thanks to Brad Fitzpatrick for taking it to the next level. Thanks to Paul McDonald for co-founding our crazy project. Thanks to Jeremy Ginsberg and Jack Hebert for making it a reality.

Thanks to the inspiring programming teachers I’ve had: Ben Chelf, Vince Hugo, Russ Lewin, Jon Stemmle, Derek Thomson, and Daniel Wang. Without your instruction, I would never have pursued our craft or gained the perspective required to teach others.
Thanks to my mother for giving me a sense of purpose and encouraging me to become a programmer. Thanks to my brother, my grandparents, and the rest of my family and childhood friends for being role models as I grew up and found my passion.

Finally, thanks to my wife, Colleen, for her love, support, and laughter through the journey of life.

About the Author

Brett Slatkin is a senior staff software engineer at Google. He is the engineering lead and co-founder of Google Consumer Surveys. He formerly worked on Google App Engine’s Python infrastructure. He is the co-creator of the PubSubHubbub protocol. Nine years ago he cut his teeth using Python to manage Google’s enormous fleet of servers.

Outside of his day job, he works on open source tools and writes about software, bicycles, and other topics on his personal website (http://onebigfluke.com). He earned his B.S. in computer engineering from Columbia University in the City of New York. He lives in San Francisco.

1. Pythonic Thinking

The idioms of a programming language are defined by its users. Over the years, the Python community has come to use the adjective Pythonic to describe code that follows a particular style. The Pythonic style isn’t regimented or enforced by the compiler. It has emerged over time through experience using the language and working with others. Python programmers prefer to be explicit, to choose simple over complex, and to maximize readability (type import this).

Programmers familiar with other languages may try to write Python as if it’s C++, Java, or whatever they know best. New programmers may still be getting comfortable with the vast range of concepts expressible in Python. It’s important for everyone to know the best (the Pythonic) way to do the most common things in Python. These patterns will affect every program you write.

Item 1: Know Which Version of Python You’re Using

Throughout this book, the majority of example code is in the syntax of Python 3.4 (released March 17, 2014). This book also provides some examples in the syntax of Python 2.7 (released July 3, 2010) to highlight important differences. Most of my advice applies to all of the popular Python runtimes: CPython, Jython, IronPython, PyPy, etc.
Many computers come with multiple versions of the standard CPython runtime preinstalled. However, the default meaning of python on the command-line may not be clear. python is usually an alias for python2.7, but it can sometimes be an alias for older versions like python2.6 or python2.5. To find out exactly which version of Python you’re using, you can use the --version flag.

    $ python --version
    Python 2.7.8

Python 3 is usually available under the name python3.

    $ python3 --version
    Python 3.4.2

You can also figure out the version of Python you’re using at runtime by inspecting values in the sys built-in module.

    import sys
    print(sys.version_info)
    print(sys.version)

    >>>
    sys.version_info(major=3, minor=4, micro=2, releaselevel='final', serial=0)
    3.4.2 (default, Oct 19 2014, 17:52:17)
    [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.51)]

Python 2 and Python 3 are both actively maintained by the Python community. Development on Python 2 is frozen beyond bug fixes, security improvements, and backports to ease the transition from Python 2 to Python 3. Helpful tools like 2to3 and six exist to make it easier to adopt Python 3 going forward.

Python 3 is constantly getting new features and improvements that will never be added to Python 2. As of the writing of this book, the majority of Python’s most common open source libraries are compatible with Python 3. I strongly encourage you to use Python 3 for your next Python project.

Things to Remember

• There are two major versions of Python still in active use: Python 2 and Python 3.
• There are multiple popular runtimes for Python: CPython, Jython, IronPython, PyPy, etc.
• Be sure that the command-line for running Python on your system is the version you expect it to be.
• Prefer Python 3 for your next project because that is the primary focus of the Python community.
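The runtime check described in Item 1 can be turned into a small guard at program startup. The following is only a sketch: the names MINIMUM_VERSION and is_supported are hypothetical, and the minimum of 3.4 is an assumption chosen to match the version used for most examples in this book.

```python
import sys

# Hypothetical minimum version, chosen only for illustration.
MINIMUM_VERSION = (3, 4)


def is_supported(version_info=None, minimum=MINIMUM_VERSION):
    """Return True when the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the first two fields: major and minor.
    return tuple(version_info[:2]) >= tuple(minimum)


if not is_supported():
    # sys.version starts with the dotted version number, e.g. '3.4.2 (...)'.
    print('Warning: expected Python %d.%d or newer, found %s'
          % (MINIMUM_VERSION + (sys.version.split()[0],)))
```

Comparing tuples avoids parsing the sys.version string, whose exact layout differs between runtimes.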
Item 2: Follow the PEP 8 Style Guide

Python Enhancement Proposal #8, otherwise known as PEP 8, is the style guide for how to format Python code. You are welcome to write Python code however you want, as long as it has valid syntax. However, using a consistent style makes your code more approachable and easier to read. Sharing a common style with other Python programmers in the larger community facilitates collaboration on projects. But even if you are the only one who will ever read your code, following the style guide will make it easier to change things later.

PEP 8 has a wealth of details about how to write clear Python code. It continues to be updated as the Python language evolves. It’s worth reading the whole guide online (http://www.python.org/dev/peps/pep-0008/). Here are a few rules you should be sure to follow:

Whitespace: In Python, whitespace is syntactically significant. Python programmers are especially sensitive to the effects of whitespace on code clarity.

• Use spaces instead of tabs for indentation.
• Use four spaces for each level of syntactically significant indenting.
• Lines should be 79 characters in length or less.
• Continuations of long expressions onto additional lines should be indented by four extra spaces from their normal indentation level.
• In a file, functions and classes should be separated by two blank lines.
• In a class, methods should be separated by one blank line.
• Don’t put spaces around list indexes, function calls, or keyword argument assignments.
• Put one (and only one) space before and after variable assignments.

Naming: PEP 8 suggests unique styles of naming for different parts in the language. This makes it easy to distinguish which type corresponds to each name when reading code.

• Functions, variables, and attributes should be in lowercase_underscore format.
• Protected instance attributes should be in _leading_underscore format.
• Private instance attributes should be in __double_leading_underscore format.
• Classes and exceptions should be in CapitalizedWord format.
• Module-level constants should be in ALL_CAPS format.
• Instance methods in classes should use self as the name of the first parameter (which refers to the object).
• Class methods should use cls as the name of the first parameter (which refers to the class).
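The whitespace and naming rules above can be seen together in one short module. This is only an illustrative sketch: DEFAULT_RADIUS, ShapeError, and CircleShape are hypothetical names invented for the example, not anything defined by PEP 8 or elsewhere in this book.

```python
import math

# Module-level constant: ALL_CAPS format.
DEFAULT_RADIUS = 1.0


class ShapeError(Exception):
    """Exceptions use CapitalizedWord format."""


class CircleShape:
    """Classes use CapitalizedWord format."""

    def __init__(self, radius=DEFAULT_RADIUS):
        if radius <= 0:
            raise ShapeError('radius must be positive')
        self.radius = radius        # Public: lowercase_underscore
        self._cached_area = None    # Protected: _leading_underscore
        self.__serial = id(self)    # Private: __double_leading_underscore

    def compute_area(self):
        # Instance methods take self as the first parameter.
        if self._cached_area is None:
            self._cached_area = math.pi * self.radius ** 2
        return self._cached_area

    @classmethod
    def unit_circle(cls):
        # Class methods take cls as the first parameter.
        return cls(radius=1.0)
```

Note the two blank lines between top-level definitions, the single blank line between methods, and the absence of spaces around the keyword argument assignment in cls(radius=1.0).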
Expressions and Statements: The Zen of Python states: “There should be one-- and preferably only one --obvious way to do it.” PEP 8 attempts to codify this style in its guidance for expressions and statements.

• Use inline negation (if a is not b) instead of negation of positive expressions (if not a is b).
• Don’t check for empty values (like [] or '') by checking the length (if len(somelist) == 0). Use if not somelist and assume empty values implicitly evaluate to False.
• The same thing goes for non-empty values (like [1] or 'hi'). The statement if somelist is implicitly True for non-empty values.
• Avoid single-line if statements, for and while loops, and except compound statements. Spread these over multiple lines for clarity.
• Always put import statements at the top of a file.
• Always use absolute names for modules when importing them, not names relative to the current module’s own path. For example, to import the foo module from the bar package, you should do from bar import foo, not just import foo.
• If you must do relative imports, use the explicit syntax from . import foo.
• Imports should be in sections in the following order: standard library modules, third-party modules, your own modules. Each subsection should have imports in alphabetical order.

Note
The Pylint tool (http://www.pylint.org/) is a popular static analyzer for Python source code. Pylint provides automated enforcement of the PEP 8 style guide and detects many other types of common errors in Python programs.

Things to Remember

• Always follow the PEP 8 style guide when writing Python code.
• Sharing a common style with the larger Python community facilitates collaboration with others.
• Using a consistent style makes it easier to modify your own code later.

Item 3: Know the Differences Between bytes, str, and unicode

In Python 3, there are two types that represent sequences of characters: bytes and str. Instances of bytes contain raw 8-bit values. Instances of str contain Unicode characters.

In Python 2, there are two types that represent sequences of characters: str and unicode. In contrast to Python 3, instances of str contain raw 8-bit values. Instances of unicode contain Unicode characters.

There are many ways to represent Unicode characters as binary data (raw 8-bit values).
The most common encoding is UTF-8. Importantly, str instances in Python 3 and unicode instances in Python 2 do not have an associated binary encoding. To convert Unicode characters to binary data, you must use the encode method. To convert binary data to Unicode characters, you must use the decode method.

When you’re writing Python programs, it’s important to do encoding and decoding of Unicode at the furthest boundary of your interfaces. The core of your program should use Unicode character types (str in Python 3, unicode in Python 2) and should not assume anything about character encodings. This approach allows you to be very accepting of alternative text encodings (such as Latin-1, Shift JIS, and Big5) while being strict about your output text encoding (ideally, UTF-8).

The split between character types leads to two common situations in Python code:

• You want to operate on raw 8-bit values that are UTF-8-encoded characters (or some other encoding).
• You want to operate on Unicode characters that have no specific encoding.

You’ll often need two helper functions to convert between these two cases and to ensure that the type of input values matches your code’s expectations.

In Python 3, you’ll need one method that takes a str or bytes and always returns a str.

    def to_str(bytes_or_str):
        if isinstance(bytes_or_str, bytes):
            value = bytes_or_str.decode('utf-8')
        else:
            value = bytes_or_str
        return value  # Instance of str

You’ll need another method that takes a str or bytes and always returns a bytes.

    def to_bytes(bytes_or_str):
        if isinstance(bytes_or_str, str):
            value = bytes_or_str.encode('utf-8')
        else:
            value = bytes_or_str
        return value  # Instance of bytes

In Python 2, you’ll need one method that takes a str or unicode and always returns a unicode.

    # Python 2
    def to_unicode(unicode_or_str):
        if isinstance(unicode_or_str, str):
            value = unicode_or_str.decode('utf-8')
        else:
            value = unicode_or_str
        return value  # Instance of unicode

You’ll need another method that takes str or unicode and always returns a str.

    # Python 2
    def to_str(unicode_or_str):
        if isinstance(unicode_or_str, unicode):
            value = unicode_or_str.encode('utf-8')
        else:
            value = unicode_or_str
        return value  # Instance of str

There are two big gotchas when dealing with raw 8-bit values and Unicode characters in Python.

The first issue is that in Python 2, unicode and str instances seem to be the same type when a str only contains 7-bit ASCII characters.

• You can combine such a str and unicode together using the + operator.
• You can compare such str and unicode instances using equality and inequality operators.
• You can use unicode instances for format strings like '%s'.

All of this behavior means that you can often pass a str or unicode instance to a function expecting one or the other and things will just work (as long as you’re only dealing with 7-bit ASCII). In Python 3, bytes and str instances are never equivalent, not even the empty string, so you must be more deliberate about the types of character sequences that you’re passing around.

The second issue is that in Python 3, operations involving file handles (returned by the open built-in function) default to text mode, which encodes and decodes str data using a text encoding. In Python 2, file operations default to binary encoding. This causes surprising failures, especially for programmers accustomed to Python 2.

For example, say you want to write some random binary data to a file. In Python 2, this works. In Python 3, this breaks.

    import os

    with open('/tmp/random.bin', 'w') as f:
        f.write(os.urandom(10))

    >>>
    TypeError: must be str, not bytes

The cause of this exception is the new encoding argument for open that was added in Python 3. When this parameter is not supplied, text mode uses the platform’s preferred encoding, which is commonly UTF-8. That makes read and write operations on file handles expect str instances containing Unicode characters instead of bytes instances containing binary data.

To make this work properly, you must indicate that the data is being opened in write binary mode ('wb') instead of write character mode ('w'). Here, I use open in a way that works correctly in Python 2 and Python 3:

    with open('/tmp/random.bin', 'wb') as f:
        f.write(os.urandom(10))

This problem also exists for reading data from files. The solution is the same: Indicate binary mode by using 'rb' instead of 'r' when opening a file.
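The read side can be sketched the same way as the write side. The example below is a minimal round trip under Python 3; it uses the tempfile module (an assumption made here so the sketch does not depend on /tmp existing) and the illustrative variable names payload and data.

```python
import os
import tempfile

payload = os.urandom(10)  # Ten random bytes, as in the example above

# A temporary directory keeps the sketch independent of /tmp.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'random.bin')

    # 'wb' accepts bytes; 'w' would raise TypeError in Python 3.
    with open(path, 'wb') as f:
        f.write(payload)

    # 'rb' returns bytes without attempting to decode them as text.
    with open(path, 'rb') as f:
        data = f.read()

assert isinstance(data, bytes)
assert data == payload
```

Opening the same file with 'r' would try to decode the random bytes as text, which may raise UnicodeDecodeError or silently produce characters that were never intended.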
Things to Remember

• In Python 3, bytes contains sequences of 8-bit values, str contains sequences of Unicode characters. bytes and str instances can’t be used together with operators (like > or +).
• In Python 2, str contains sequences of 8-bit values, unicode contains sequences of Unicode characters. str and unicode can be used together with operators if the str only contains 7-bit ASCII characters.
• Use helper functions to ensure that the inputs you operate on are the type of character sequence you expect (8-bit values, UTF-8 encoded characters, Unicode characters, etc.).
• If you want to read or write binary data to/from a file, always open the file using a binary mode (like 'rb' or 'wb').

Item 4: Write Helper Functions Instead of Complex Expressions

Python’s pithy syntax makes it easy to write single-line expressions that implement a lot of logic. For example, say you want to decode the query string from a URL. Here, each query string parameter represents an integer value:

    from urllib.parse import parse_qs
    my_values = parse_qs('red=5&blue=0&green=',
                         keep_blank_values=True)
    print(repr(my_values))

    >>>
    {'red': ['5'], 'green': [''], 'blue': ['0']}

Some query string parameters may have multiple values, some may have single values, some may be present but have blank values, and some may be missing entirely. Using the get method on the result dictionary will return different values in each circumstance.

    print('Red:     ', my_values.get('red'))
    print('Green:   ', my_values.get('green'))
    print('Opacity: ', my_values.get('opacity'))

    >>>
    Red:      ['5']
    Green:    ['']
    Opacity:  None

It’d be nice if a default value of 0 was assigned when a parameter isn’t supplied or is blank. You might choose to do this with Boolean expressions because it feels like this logic doesn’t merit a whole if statement or helper function quite yet.

Python’s syntax makes this choice all too easy. The trick here is that the empty string, the empty list, and zero all evaluate to False implicitly. Thus, the expressions below will evaluate to the subexpression after the or operator when the first subexpression is False.

    # For query string 'red=5&blue=0&green='
    red = my_values.get('red', [''])[0] or 0
    green = my_values.get('green', [''])[0] or 0
    opacity = my_values.get('opacity', [''])[0] or 0
    print('Red:     %r' % red)
    print('Green:   %r' % green)
    print('Opacity: %r' % opacity)

    >>>
    Red:     '5'
    Green:   0
    Opacity: 0

The red case works because the key is present in the my_values dictionary. The value is a list with one member: the string '5'. This string implicitly evaluates to True, so red is assigned to the first part of the or expression.

The green case works because the value in the my_values dictionary is a list with one member: an empty string. The empty string implicitly evaluates to False, causing the or expression to evaluate to 0.

The opacity case works because the value in the my_values dictionary is missing altogether. The behavior of the get method is to return its second argument if the key doesn’t exist in the dictionary. The default value in this case is a list with one member, an empty string. When opacity isn’t found in the dictionary, this code does exactly the same thing as the green case.

However, this expression is difficult to read and it still doesn’t do everything you need. You’d also want to ensure that all the parameter values are integers so you can use them in mathematical expressions. To do that, you’d wrap each expression with the int built-in function to parse the string as an integer.

    red = int(my_values.get('red', [''])[0] or 0)

This is now extremely hard to read. There’s so much visual noise. The code isn’t approachable. A new reader of the code would have to spend too much time picking apart the expression to figure out what it actually does. Even though it’s nice to keep things short, it’s not worth trying to fit this all on one line.

Python 2.5 added if/else conditional (or ternary) expressions to make cases like this clearer while keeping the code short.

    red = my_values.get('red', [''])
    red = int(red[0]) if red[0] else 0

This is better. For less complicated situations, if/else conditional expressions can make things very clear. But the example above is still not as clear as the alternative of a full if/else statement over multiple lines. Seeing all of the logic spread out like this makes the dense version seem even more complex.

    green = my_values.get('green', [''])
    if green[0]:
        green = int(green[0])
    else:
        green = 0

Writing a helper function is the way to go, especially if you need to use this logic repeatedly.

    def get_first_int(values, key, default=0):
        found = values.get(key, [''])
        if found[0]:
            found = int(found[0])
        else:
            found = default
        return found

The calling code is much clearer than the complex expression using or and the two-line version using the if/else expression.

    green = get_first_int(my_values, 'green')

As soon as your expressions get complicated, it’s time to consider splitting them into smaller pieces and moving logic into helper functions. What you gain in readability always outweighs what brevity may have afforded you. Don’t let Python’s pithy syntax for complex expressions get you into a mess like this.

Things to Remember

• Python’s syntax makes it all too easy to write single-line expressions that are overly complicated and difficult to read.
• Move complex expressions into helper functions, especially if you need to use the same logic repeatedly.
• The if/else expression provides a more readable alternative to using Boolean operators like or and and in expressions.

Item 5: Know How to Slice Sequences

Python includes syntax for slicing sequences into pieces. Slicing lets you access a subset of a sequence’s items with minimal effort. The simplest uses for slicing are the built-in types list, str, and bytes. Slicing can be extended to any Python class that implements the __getitem__ and __setitem__ special methods (see Item 28: “Inherit from collections.abc for Custom Container Types”).

The basic form of the slicing syntax is somelist[start:end], where start is inclusive and end is exclusive.

    a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
    print('First four:', a[:4])
    print('Last four: ', a[-4:])
    print('Middle two:', a[3:-3])

    >>>
    First four: ['a', 'b', 'c', 'd']
    Last four:  ['e', 'f', 'g', 'h']
    Middle two: ['d', 'e']

When slicing from the start of a list, you should leave out the zero index to reduce visual noise.

    assert a[:5] == a[0:5]

When slicing to the end of a list, you should leave out the final index because it's redundant.

    assert a[5:] == a[5:len(a)]

Using negative numbers for slicing is helpful for doing offsets relative to the end of a list. All of these forms of slicing would be clear to a new reader of your code. There are no surprises, and I encourage you to use these variations.

    a[:]      # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
    a[:5]     # ['a', 'b', 'c', 'd', 'e']
    a[:-1]    # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
    a[4:]     # ['e', 'f', 'g', 'h']
    a[-3:]    # ['f', 'g', 'h']
    a[2:5]    # ['c', 'd', 'e']
    a[2:-1]   # ['c', 'd', 'e', 'f', 'g']
    a[-3:-1]  # ['f', 'g']

Slicing deals properly with start and end indexes that are beyond the boundaries of the list. That makes it easy for your code to establish a maximum length to consider for an input sequence.

    first_twenty_items = a[:20]
    last_twenty_items = a[-20:]

In contrast, accessing the same index directly causes an exception.

    a[20]

    >>>
    IndexError: list index out of range

Note

Beware that indexing a list by a negated variable is one of the few situations in which you can get surprising results from slicing. For example, the expression somelist[-n:] will work fine when n is greater than zero (e.g., somelist[-3:]). However, when n is zero, the expression somelist[-0:] is the same as somelist[0:] and will result in a copy of the original list.

The result of slicing a list is a whole new list. References to the objects from the original list are maintained. Modifying the result of slicing won't affect the original list.

    b = a[4:]
    print('Before:   ', b)
    b[1] = 99
    print('After:    ', b)
    print('No change:', a)

    >>>
    Before:    ['e', 'f', 'g', 'h']
    After:     ['e', 99, 'g', 'h']
    No change: ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

When used in assignments, slices will replace the specified range in the original list.
Unlike tuple assignments (like a, b = c[:2]), the two sides of a slice assignment don't need to be the same length. The values before and after the assigned slice will be preserved. The list will grow or shrink to accommodate the new values.

    print('Before', a)
    a[2:7] = [99, 22, 14]
    print('After ', a)

    >>>
    Before ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
    After  ['a', 'b', 99, 22, 14, 'h']

If you leave out both the start and the end indexes when slicing, you'll end up with a copy of the original list.

    b = a[:]
    assert b == a and b is not a

If you assign a slice with no start or end indexes, you'll replace its entire contents with a copy of what's referenced (instead of allocating a new list).

    b = a
    print('Before', a)
    a[:] = [101, 102, 103]
    assert a is b          # Still the same list object
    print('After ', a)     # Now has different contents

    >>>
    Before ['a', 'b', 99, 22, 14, 'h']
    After  [101, 102, 103]

Things to Remember

* Avoid being verbose: Don't supply 0 for the start index or the length of the sequence for the end index.
* Slicing is forgiving of start or end indexes that are out of bounds, making it easy to express slices on the front or back boundaries of a sequence (like a[:20] or a[-20:]).
* Assigning to a list slice will replace that range in the original sequence with what's referenced even if their lengths are different.

Item 6: Avoid Using start, end, and stride in a Single Slice

In addition to basic slicing (see Item 5: "Know How to Slice Sequences"), Python has special syntax for the stride of a slice in the form somelist[start:end:stride]. This lets you take every nth item when slicing a sequence. For example, the stride makes it easy to group by even and odd indexes in a list.

    a = ['red', 'orange', 'yellow', 'green', 'blue', 'purple']
    odds = a[::2]
    evens = a[1::2]
    print(odds)
    print(evens)

    >>>
    ['red', 'yellow', 'blue']
    ['orange', 'green', 'purple']

The problem is that the stride syntax often causes unexpected behavior that can introduce bugs. For example, a common Python trick for reversing a byte string is to slice the string with a stride of -1.

    x = b'mongoose'
    y = x[::-1]
    print(y)

    >>>
    b'esoognom'

That works well for byte strings and ASCII characters, but it will break for Unicode characters encoded as UTF-8 byte strings.

    w = '謝謝'
    x = w.encode('utf-8')
    y = x[::-1]
    z = y.decode('utf-8')

    >>>
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9d in position 0: invalid start byte

Are negative strides besides -1 useful? Consider the following examples.

    a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
    a[::2]   # ['a', 'c', 'e', 'g']
    a[::-2]  # ['h', 'f', 'd', 'b']

Here, ::2 means select every second item starting at the beginning. Trickier, ::-2 means select every second item starting at the end and moving backwards.

What do you think 2::2 means? What about -2::-2 vs. -2:2:-2 vs. 2:2:-2?

    a[2::2]     # ['c', 'e', 'g']
    a[-2::-2]   # ['g', 'e', 'c', 'a']
    a[-2:2:-2]  # ['g', 'e']
    a[2:2:-2]   # []

The point is that the stride part of the slicing syntax can be extremely confusing. Having three numbers within the brackets is hard enough to read because of its density. Then it's not obvious when the start and end indexes come into effect relative to the stride value, especially when stride is negative.

To prevent problems, avoid using stride along with start and end indexes. If you must use a stride, prefer making it a positive value and omit start and end indexes. If you must use stride with start or end indexes, consider using one assignment to stride and another to slice.

    b = a[::2]   # ['a', 'c', 'e', 'g']
    c = b[1:-1]  # ['c', 'e']

Slicing and then striding will create an extra shallow copy of the data. The first operation should try to reduce the size of the resulting slice by as much as possible. If your program can't afford the time or memory required for two steps, consider using the islice function from the itertools built-in module (see Item 46: "Use Built-in Algorithms and Data Structures"), which doesn't permit negative values for start, end, or stride.

Things to Remember

* Specifying start, end, and stride in a slice can be extremely confusing.
* Prefer using positive stride values in slices without start or end indexes. Avoid negative stride values if possible.
* Avoid using start, end, and stride together in a single slice. If you need all three parameters, consider doing two assignments (one to slice, another to stride) or using islice from the itertools built-in module.
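
As a rough sketch of the islice alternative mentioned in this item, the snippet below compares the two-step stride-then-slice approach with a single lazy pass. The variable names two_step and lazy are made up for this illustration, and the stop index 6 is chosen by hand because islice rejects negative indexes.

```python
from itertools import islice

a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Two-step approach from the text: stride first, then slice.
# Each step builds a new list.
two_step = a[::2][1:-1]

# islice(iterable, start, stop, step) walks the input lazily and
# builds no intermediate list. The stop index is written out
# explicitly because negative values are not permitted.
lazy = list(islice(a, 2, 6, 2))

print(two_step)
print(lazy)
```

Both expressions select the items at indexes 2 and 4. The islice form trades the compact bracket syntax for explicit, non-negative arguments, which is arguably easier to reason about when all three parameters are needed.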
Item 7: Use List Comprehensions Instead of map and filter

Python provides compact syntax for deriving one list from another. These expressions are called list comprehensions. For example, say you want to compute the square of each number in a list. You can do this by providing the expression for your computation and the input sequence to loop over.

    a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    squares = [x**2 for x in a]
    print(squares)

    >>>
    [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Unless you're applying a single-argument function, list comprehensions are clearer than the map built-in function for simple cases. map requires creating a lambda function for the computation, which is visually noisy.

    squares = map(lambda x: x ** 2, a)

Unlike map, list comprehensions let you easily filter items from the input list, removing corresponding outputs from the result. For example, say you only want to compute the squares of the numbers that are divisible by 2. Here, I do this by adding a conditional expression to the list comprehension after the loop:

    even_squares = [x**2 for x in a if x % 2 == 0]
    print(even_squares)

    >>>
    [4, 16, 36, 64, 100]

The filter built-in function can be used along with map to achieve the same outcome, but it is much harder to read.

    alt = map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))
    assert even_squares == list(alt)

Dictionaries and sets have their own equivalents of list comprehensions. These make it easy to create derivative data structures when writing algorithms.

    chile_ranks = {'ghost': 1, 'habanero': 2, 'cayenne': 3}
    rank_dict = {rank: name for name, rank in chile_ranks.items()}
    chile_len_set = {len(name) for name in rank_dict.values()}
    print(rank_dict)
    print(chile_len_set)

    >>>
    {1: 'ghost', 2: 'habanero', 3: 'cayenne'}
    {8, 5, 7}

Things to Remember

* List comprehensions are clearer than the map and filter built-in functions because they don't require extra lambda expressions.
* List comprehensions allow you to easily skip items from the input list, a behavior map doesn't support without help from filter.
* Dictionaries and sets also support comprehension expressions.
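
The single-argument exception mentioned in this item can be sketched as follows. This is only an illustration under the assumption of Python 3, where map returns a lazy iterator; the names as_text, as_text_comp, and squares_iter are invented for the example.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# With an existing one-argument function, map needs no lambda and
# reads about as well as the comprehension.
as_text = list(map(str, a))
as_text_comp = [str(x) for x in a]

# In Python 3, map returns a lazy iterator rather than a list, so it
# has to be materialized before it can be indexed or reused.
squares_iter = map(lambda x: x ** 2, a)
squares = list(squares_iter)
leftover = list(squares_iter)  # The iterator is already exhausted

print(as_text == as_text_comp)
print(squares[:3], leftover)
```

The comprehension produces a list directly, which is one more reason it tends to be the less surprising choice when a list is what the caller needs.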
Item 8: Avoid More Than Two Expressions in List Comprehensions

Beyond basic usage (see Item 7: "Use List Comprehensions Instead of map and filter"), list comprehensions also support multiple levels of looping. For example, say you want to simplify a matrix (a list containing other lists) into one flat list of all cells. Here, I do this with a list comprehension by including two for expressions. These expressions run in the order provided from left to right.

    matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    flat = [x for row in matrix for x in row]
    print(flat)

    >>>
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

The example above is simple, readable, and a reasonable usage of multiple loops. Another reasonable usage of multiple loops is replicating the two-level deep layout of the input list. For example, say you want to square the value in each cell of a two-dimensional matrix. This expression is noisier because of the extra [] characters, but it's still easy to read.

    squared = [[x**2 for x in row] for row in matrix]
    print(squared)

    >>>
    [[1, 4, 9], [16, 25, 36], [49, 64, 81]]

If this expression included another loop, the list comprehension would get so long that you'd have to split it over multiple lines.

    my_lists = [
        [[1, 2, 3], [4, 5, 6]],
        # …
    ]
    flat = [x for sublist1 in my_lists
            for sublist2 in sublist1
            for x in sublist2]

At this point, the multiline comprehension isn't much shorter than the alternative. Here, I produce the same result using normal loop statements. The indentation of this version makes the looping clearer than the list comprehension.

    flat = []
    for sublist1 in my_lists:
        for sublist2 in sublist1:
            flat.extend(sublist2)

List comprehensions also support multiple if conditions. Multiple conditions at the same loop level are an implicit and expression. For example, say you want to filter a list of numbers to only even values greater than four. These two list comprehensions are equivalent.

    a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    b = [x for x in a if x > 4 if x % 2 == 0]
    c = [x for x in a if x > 4 and x % 2 == 0]

Conditions can be specified at each level of looping after the for expression. For example, say you want to filter a matrix so the only cells remaining are those divisible by 3 in rows that sum to 10 or higher. Expressing this with list comprehensions is short, but extremely difficult to read.

    matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    filtered = [[x for x in row if x % 3 == 0]
                for row in matrix if sum(row) >= 10]
    print(filtered)

    >>>
    [[6], [9]]

Though this example is a bit convoluted, in practice you'll see situations arise where such expressions seem like a good fit. I strongly encourage you to avoid using list comprehensions that look like this. The resulting code is very difficult for others to comprehend. What you save in the number of lines doesn't outweigh the difficulties it could cause later.

The rule of thumb is to avoid using more than two expressions in a list comprehension. This could be two conditions, two loops, or one condition and one loop. As soon as it gets more complicated than that, you should use normal if and for statements and write a helper function (see Item 16: "Consider Generators Instead of Returning Lists").

Things to Remember

* List comprehensions support multiple levels of loops and multiple conditions per loop level.
* List comprehensions with more than two expressions are very difficult to read and should be avoided.

Item 9: Consider Generator Expressions for Large Comprehensions

The problem with list comprehensions (see Item 7: "Use List Comprehensions Instead of map and filter") is that they may create a whole new list containing one item for each value in the input sequence. This is fine for small inputs, but for large inputs this could consume significant amounts of memory and cause your program to crash.

For example, say you want to read a file and return the number of characters on each line. Doing this with a list comprehension would require holding the length of every line of the file in memory. If the file is absolutely enormous or perhaps a never-ending network socket, list comprehensions are problematic. Here, I use a list comprehension in a way that can only handle small input values.

    value = [len(x) for x in open('/tmp/my_file.txt')]
    print(value)

    >>>
    [100, 57, 15, 1, 12, 75, 5, 86, 89, 11]

To solve this, Python provides generator expressions, a generalization of list comprehensions and generators. Generator expressions don't materialize the whole output sequence when they're run. Instead, generator expressions evaluate to an iterator that yields one item at a time from the expression.

A generator expression is created by putting list-comprehension-like syntax between () characters. Here, I use a generator expression that is equivalent to the code above. However, the generator expression immediately evaluates to an iterator and doesn't make any forward progress.

    it = (len(x) for x in open('/tmp/my_file.txt'))
    print(it)

    >>>
    <generator object <genexpr> at 0x101b81480>

The returned iterator can be advanced one step at a time to produce the next output from the generator expression as needed (using the next built-in function). Your code can consume as much of the generator expression as you want without risking a blowup in memory usage.

    print(next(it))
    print(next(it))

    >>>
    100
    57

Another powerful outcome of generator expressions is that they can be composed together. Here, I take the iterator returned by the generator expression above and use it as the input for another generator expression.

    roots = ((x, x**0.5) for x in it)

Each time I advance this iterator, it will also advance the interior iterator, creating a domino effect of looping, evaluating conditional expressions, and passing around inputs and outputs.

    print(next(roots))

    >>>
    (15, 3.872983346207417)

Chaining generators like this executes very quickly in Python. When you're looking for a way to compose functionality that's operating on a large stream of input, generator expressions are the best tool for the job. The only gotcha is that the iterators returned by generator expressions are stateful, so you must be careful not to use them more than once (see Item 17: "Be Defensive When Iterating Over Arguments").

Things to Remember

* List comprehensions can cause problems for large inputs by using too much memory.
* Generator expressions avoid memory issues by producing outputs one at a time as an iterator.
* Generator expressions can be composed by passing the iterator from one generator expression into the for subexpression of another.
* Generator expressions execute very quickly when chained together.

Item 10: Prefer enumerate Over range

The range built-in function is useful for loops that iterate over a set of integers.

    from random import randint

    random_bits = 0
    for i in range(64):
        if randint(0, 1):
            random_bits |= 1 << i

When you have a data structure to iterate over, like a list of strings, you can loop directly over the sequence.

    flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']
    for flavor in flavor_list:
        print('%s is delicious' % flavor)

Often, you'll want to iterate over a list and also know the index of the current item in the list. For example, say you want to print the ranking of your favorite ice cream flavors. One way to do it is using range.

    for i in range(len(flavor_list)):
        flavor = flavor_list[i]
        print('%d: %s' % (i + 1, flavor))

This looks clumsy compared with the other examples of iterating over flavor_list or range. You have to get the length of the list. You have to index into the array. It's harder to read.

Python provides the enumerate built-in function for addressing this situation. enumerate wraps any iterator with a lazy generator. This generator yields pairs of the loop index and the next value from the iterator. The resulting code is much clearer.

    for i, flavor in enumerate(flavor_list):
        print('%d: %s' % (i + 1, flavor))

    >>>
    1: vanilla
    2: chocolate
    3: pecan
    4: strawberry

You can make this even shorter by specifying the number from which enumerate should begin counting (1 in this case).

    for i, flavor in enumerate(flavor_list, 1):
        print('%d: %s' % (i, flavor))

Things to Remember

* enumerate provides concise syntax for looping over an iterator and getting the index of each item from the iterator as you go.
* Prefer enumerate instead of looping over a range and indexing into a sequence.
* You can supply a second parameter to enumerate to specify the number from which to begin counting (zero is the default).
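
Because enumerate wraps any iterator, it also works where range(len(...)) cannot be used at all, such as with a generator that has no length. The sketch below assumes a hypothetical read_flavors generator standing in for any lazy data source; it is not part of the original example.

```python
def read_flavors():
    """Hypothetical generator standing in for any lazy data source."""
    yield 'vanilla'
    yield 'chocolate'
    yield 'pecan'

# len(read_flavors()) would raise TypeError, so the range-based loop
# is not an option here. enumerate only needs to pull items one at a
# time, and the start keyword sets the first index.
ranked = list(enumerate(read_flavors(), start=1))

for rank, flavor in ranked:
    print('%d: %s' % (rank, flavor))
```

Passing start as a keyword argument is equivalent to the positional form enumerate(flavor_list, 1) used above and can make the intent a little more explicit.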
Item 11: Use zip to Process Iterators in Parallel

Often in Python you find yourself with many lists of related objects. List comprehensions make it easy to take a source list and get a derived list by applying an expression (see Item 7: "Use List Comprehensions Instead of map and filter").

    names = ['Cecilia', 'Lise', 'Marie']
    letters = [len(n) for n in names]

The items in the derived list are related to the items in the source list by their indexes. To iterate over both lists in parallel, you can iterate over the length of the names source list.

    longest_name = None
    max_letters = 0

    for i in range(len(names)):
        count = letters[i]
        if count > max_letters:
            longest_name = names[i]
            max_letters = count

    print(longest_name)

    >>>
    Cecilia

The problem is that this whole loop statement is visually noisy. The indexes into names and letters make the code hard to read. Indexing into the arrays by the loop index i happens twice. Using enumerate (see Item 10: "Prefer enumerate Over range") improves this slightly, but it's still not ideal.

    for i, name in enumerate(names):
        count = letters[i]
        if count > max_letters:
            longest_name = name
            max_letters = count

To make this code clearer, Python provides the zip built-in function. In Python 3, zip wraps two or more iterators with a lazy generator. The zip generator yields tuples containing the next value from each iterator. The resulting code is much cleaner than indexing into multiple lists.

    for name, count in zip(names, letters):
        if count > max_letters:
            longest_name = name
            max_letters = count

There are two problems with the zip built-in.

The first issue is that in Python 2 zip is not a generator; it will fully exhaust the supplied iterators and return a list of all the tuples it creates. This could potentially use a lot of memory and cause your program to crash. If you want to zip very large iterators in Python 2, you should use izip from the itertools built-in module (see Item 46: "Use Built-in Algorithms and Data Structures").
The second issue is that zip behaves strangely if the input iterators are of different lengths. For example, say you add another name to the list above but forget to update the letter counts. Running zip on the two input lists will have an unexpected result.

    names.append('Rosalind')
    for name, count in zip(names, letters):
        print(name)

    >>>
    Cecilia
    Lise
    Marie

The new item for 'Rosalind' isn't there. This is just how zip works. It keeps yielding tuples until a wrapped iterator is exhausted. This approach works fine when you know that the iterators are of the same length, which is often the case for derived lists created by list comprehensions. In many other cases, the truncating behavior of zip is surprising and bad. If you aren't confident that the lengths of the lists you want to zip are equal, consider using the zip_longest function from the itertools built-in module instead (also called izip_longest in Python 2).

Things to Remember

* The zip built-in function can be used to iterate over multiple iterators in parallel.
* In Python 3, zip is a lazy generator that produces tuples. In Python 2, zip returns the full result as a list of tuples.
* zip truncates its output silently if you supply it with iterators of different lengths.
* The zip_longest function from the itertools built-in module lets you iterate over multiple iterators in parallel regardless of their lengths (see Item 46: "Use Built-in Algorithms and Data Structures").

Item 12: Avoid else Blocks After for and while Loops

Python loops have an extra feature that is not available in most other programming languages: you can put an else block immediately after a loop's repeated interior block.

    for i in range(3):
        print('Loop %d' % i)
    else:
        print('Else block!')

    >>>
    Loop 0
    Loop 1
    Loop 2
    Else block!
Surprisingly, the else block runs immediately after the loop finishes. Why is the clause called "else"? Why not "and"? In an if/else statement, else means, "Do this if the block before this doesn't happen." In a try/except statement, except has the same definition: "Do this if trying the block before this failed."

Similarly, else from try/except/else follows this pattern (see Item 13: "Take Advantage of Each Block in try/except/else/finally") because it means, "Do this if the block before did not fail." try/finally is also intuitive because it means, "Always do what is final after trying the block before."

Given all of the uses of else, except, and finally in Python, a new programmer might assume that the else part of for/else means, "Do this if the loop wasn't completed." In reality, it does exactly the opposite. Using a break statement in a loop will actually skip the else block.

    for i in range(3):
        print('Loop %d' % i)
        if i == 1:
            break
    else:
        print('Else block!')

    >>>
    Loop 0
    Loop 1

Another surprise is that the else block will run immediately if you loop over an empty sequence.

    for x in []:
        print('Never runs')
    else:
        print('For Else block!')

    >>>
    For Else block!

The else block also runs when while loops are initially false.

    while False:
        print('Never runs')
    else:
        print('While Else block!')

    >>>
    While Else block!

The rationale for these behaviors is that else blocks after loops are useful when you're using loops to search for something. For example, say you want to determine whether two numbers are coprime (their only common divisor is 1). Here, I iterate through every possible common divisor and test the numbers. After every option has been tried, the loop ends. The else block runs when the numbers are coprime because the loop doesn't encounter a break.

    a = 4
    b = 9
    for i in range(2, min(a, b) + 1):
        print('Testing', i)
        if a % i == 0 and b % i == 0:
            print('Not coprime')
            break
    else:
        print('Coprime')

    >>>
    Testing 2
    Testing 3
    Testing 4
    Coprime

In practice, you wouldn't write the code this way. Instead, you'd write a helper function to do the calculation. Such a helper function is written in two common styles.
The first approach is to return early when you find the condition you're looking for. You return the default outcome if you fall through the loop.

    def coprime(a, b):
        for i in range(2, min(a, b) + 1):
            if a % i == 0 and b % i == 0:
                return False
        return True

The second way is to have a result variable that indicates whether you've found what you're looking for in the loop. You break out of the loop as soon as you find something.

    def coprime2(a, b):
        is_coprime = True
        for i in range(2, min(a, b) + 1):
            if a % i == 0 and b % i == 0:
                is_coprime = False
                break
        return is_coprime

Both of these approaches are so much clearer to readers of unfamiliar code. The expressivity you gain from the else block doesn't outweigh the burden you put on people (including yourself) who want to understand your code in the future. Simple constructs like loops should be self-evident in Python. You should avoid using else blocks after loops entirely.

Things to Remember

* Python has special syntax that allows else blocks to immediately follow for and while loop interior blocks.
* The else block after a loop only runs if the loop body did not encounter a break statement.
* Avoid using else blocks after loops because their behavior isn't intuitive and can be confusing.

Item 13: Take Advantage of Each Block in try/except/else/finally

There are four distinct times that you may want to take action during exception handling in Python. These are captured in the functionality of try, except, else, and finally blocks. Each block serves a unique purpose in the compound statement, and their various combinations are useful (see Item 51: "Define a Root Exception to Insulate Callers from APIs" for another example).

Finally Blocks

Use try/finally when you want exceptions to propagate up, but you also want to run cleanup code even when exceptions occur. One common usage of try/finally is for reliably closing file handles (see Item 43: "Consider contextlib and with Statements for Reusable try/finally Behavior" for another approach).

    handle = open('/tmp/random_data.txt')  # May raise IOError
    try:
        data = handle.read()  # May raise UnicodeDecodeError
    finally:
        handle.close()        # Always runs after try:

Any exception raised by the read method will always propagate up to the calling code, yet the close method of handle is also guaranteed to run in the finally block. You must call open before the try block because exceptions that occur when opening the file (like IOError if the file does not exist) should skip the finally block.

Else Blocks

Use try/except/else to make it clear which exceptions will be handled by your code and which exceptions will propagate up. When the try block doesn't raise an exception, the else block will run. The else block helps you minimize the amount of code in the try block and improves readability. For example, say you want to load JSON dictionary data from a string and return the value of a key it contains.

    import json

    def load_json_key(data, key):
        try:
            result_dict = json.loads(data)  # May raise ValueError
        except ValueError as e:
            raise KeyError from e
        else:
            return result_dict[key]         # May raise KeyError

If the data isn't valid JSON, then decoding with json.loads will raise a ValueError. The exception is caught by the except block and handled. If decoding is successful, then the key lookup will occur in the else block. If the key lookup raises any exceptions, they will propagate up to the caller because they are outside the try block. The else clause ensures that what follows the try/except is visually distinguished from the except block. This makes the exception propagation behavior clear.

Everything Together

Use try/except/else/finally when you want to do it all in one compound statement. For example, say you want to read a description of work to do from a file, process it, and then update the file in place. Here, the try block is used to read the file and process it. The except block is used to handle exceptions from the try block that are expected. The else block is used to update the file in place and to allow related exceptions to propagate up. The finally block cleans up the file handle.

    import json

    UNDEFINED = object()

    def divide_json(path):
        handle = open(path, 'r+')   # May raise IOError
        try:
            data = handle.read()    # May raise UnicodeDecodeError
            op = json.loads(data)   # May raise ValueError
            value = (
                op['numerator'] /
                op['denominator'])  # May raise ZeroDivisionError
        except ZeroDivisionError as e:
            return UNDEFINED
        else:
            op['result'] = value
            result = json.dumps(op)
            handle.seek(0)
            handle.write(result)    # May raise IOError
            return value
        finally:
            handle.close()          # Always runs

This layout is especially useful because all of the blocks work together in intuitive ways. For example, if an exception gets raised in the else block while rewriting the result data, the finally block will still run and close the file handle.

Things to Remember

* The try/finally compound statement lets you run cleanup code regardless of whether exceptions were raised in the try block.
* The else block helps you minimize the amount of code in try blocks and visually distinguish the success case from the try/except blocks.
* An else block can be used to perform additional actions after a successful try block but before common cleanup in a finally block.

2. Functions

The first organizational tool programmers use in Python is the function. As in other programming languages, functions enable you to break large programs into smaller, simpler pieces. They improve readability and make code more approachable. They allow for reuse and refactoring.

Functions in Python have a variety of extra features that make the programmer's life easier. Some are similar to capabilities in other programming languages, but many are unique to Python. These extras can make a function's purpose more obvious. They can eliminate noise and clarify the intention of callers. They can significantly reduce subtle bugs that are difficult to find.

Item 14: Prefer Exceptions to Returning None

When writing utility functions, there's a draw for Python programmers to give special meaning to the return value of None. It seems to make sense in some cases. For example, say you want a helper function that divides one number by another. In the case of dividing by zero, returning None seems natural because the result is undefined.

    def divide(a, b):
        try:
            return a / b
        except ZeroDivisionError:
            return None

Code using this function can interpret the return value accordingly.

    result = divide(x, y)
    if result is None:
        print('Invalid inputs')

What happens when the numerator is zero? That will cause the return value to also be zero (if the denominator is non-zero). This can cause problems when you evaluate the result in a condition like an if statement. You may accidentally look for any False equivalent value to indicate errors instead of only looking for None (see Item 4: "Write Helper Functions Instead of Complex Expressions" for a similar situation).

    x, y = 0, 5
    result = divide(x, y)
    if not result:
        print('Invalid inputs')  # This is wrong!

This is a common mistake in Python code when None has special meaning. This is why returning None from a function is error prone. There are two ways to reduce the chance of such errors.

The first way is to split the return value into a two-tuple. The first part of the tuple indicates that the operation was a success or failure. The second part is the actual result that was computed.

    def divide(a, b):
        try:
            return True, a / b
        except ZeroDivisionError:
            return False, None

Callers of this function have to unpack the tuple. That forces them to consider the status part of the tuple instead of just looking at the result of division.

    success, result = divide(x, y)
    if not success:
        print('Invalid inputs')

The problem is that callers can easily ignore the first part of the tuple (using the underscore variable name, a Python convention for unused variables). The resulting code doesn't look wrong at first glance. This is as bad as just returning None.

    _, result = divide(x, y)
    if not result:
        print('Invalid inputs')

The second, better way to reduce these errors is to never return None at all. Instead, raise an exception up to the caller and make them deal with it. Here, I turn a ZeroDivisionError into a ValueError to indicate to the caller that the input values are bad:

    def divide(a, b):
        try:
            return a / b
        except ZeroDivisionError as e:
            raise ValueError('Invalid inputs') from e

Now the caller should handle the exception for the invalid input case (this behavior should be documented; see Item 49: "Write Docstrings for Every Function, Class, and Module"). The caller no longer requires a condition on the return value of the function. If the function didn't raise an exception, then the return value must be good. The outcome of exception handling is clear.

    x, y = 5, 2
    try:
        result = divide(x, y)
    except ValueError:
        print('Invalid inputs')
    else:
        print('Result is %.1f' % result)

    >>>
    Result is 2.5

Things to Remember

* Functions that return None to indicate special meaning are error prone because None and other values (e.g., zero, the empty string) all evaluate to False in conditional expressions.
* Raise exceptions to indicate special situations instead of returning None. Expect the calling code to handle exceptions properly when they're documented.

Item 15: Know How Closures Interact with Variable Scope

Say you want to sort a list of numbers but prioritize one group of numbers to come first. This pattern is useful when you're rendering a user interface and want important messages or exceptional events to be displayed before everything else.

A common way to do this is to pass a helper function as the key argument to a list's sort method. The helper's return value will be used as the value for sorting each item in the list. The helper can check whether the given item is in the important group and can vary the sort key accordingly.

    def sort_priority(values, group):
        def helper(x):
            if x in group:
                return (0, x)
            return (1, x)
        values.sort(key=helper)

This function works for simple inputs.

    numbers = [8, 3, 1, 2, 5, 4, 7, 6]
    group = {2, 3, 5, 7}
    sort_priority(numbers, group)
    print(numbers)

    >>>
    [2, 3, 5, 7, 1, 4, 6, 8]

There are three reasons why this function operates as expected:

* Python supports closures: functions that refer to variables from the scope in which they were defined. This is why the helper function is able to access the group argument to sort_priority.
* Functions are first-class objects in Python, meaning you can refer to them directly, assign them to variables, pass them as arguments to other functions, compare them in expressions and if statements, etc. This is how the sort method can accept a closure function as the key argument.
* Python has specific rules for comparing tuples. It first compares items in index zero, then index one, then index two, and so on. This is why the return value from the helper closure causes the sort order to have two distinct groups.

It'd be nice if this function returned whether higher-priority items were seen at all so the user interface code can act accordingly. Adding such behavior seems straightforward. There's already a closure function for deciding which group each number is in. Why not also use the closure to flip a flag when high-priority items are seen? Then the function can return the flag value after it's been modified by the closure.

Here, I try to do that in a seemingly obvious way:

    def sort_priority2(numbers, group):
        found = False
        def helper(x):
            if x in group:
                found = True  # Seems simple
                return (0, x)
            return (1, x)
        numbers.sort(key=helper)
        return found

I can run the function on the same inputs as before.

    found = sort_priority2(numbers, group)
    print('Found:', found)
    print(numbers)

    >>>
    Found: False
    [2, 3, 5, 7, 1, 4, 6, 8]

The sorted results are correct, but the found result is wrong. Items from group were definitely found in numbers, but the function returned False. How could this happen?
When you reference a variable in an expression, the Python interpreter will traverse the scope to resolve the reference in this order:

1. The current function's scope
2. Any enclosing scopes (like other containing functions)
3. The scope of the module that contains the code (also called the global scope)
4. The built-in scope (that contains functions like len and str)

If none of these places have a defined variable with the referenced name, then a NameError exception is raised.

Assigning a value to a variable works differently. If the variable is already defined in the current scope, then it will just take on the new value. If the variable doesn't exist in the current scope, then Python treats the assignment as a variable definition. The scope of the newly defined variable is the function that contains the assignment.

This assignment behavior explains the wrong return value of the sort_priority2 function. The found variable is assigned to True in the helper closure. The closure's assignment is treated as a new variable definition within helper, not as an assignment within sort_priority2.

    def sort_priority2(numbers, group):
        found = False         # Scope: 'sort_priority2'
        def helper(x):
            if x in group:
                found = True  # Scope: 'helper' (bad!)
                return (0, x)
            return (1, x)
        numbers.sort(key=helper)
        return found

Encountering this problem is sometimes called the scoping bug because it can be so surprising to newbies. But this is the intended result. This behavior prevents local variables in a function from polluting the containing module. Otherwise, every assignment within a function would put garbage into the global module scope. Not only would that be noise, but the interplay of the resulting global variables could cause obscure bugs.

Getting Data Out

In Python 3, there is special syntax for getting data out of a closure. The nonlocal statement is used to indicate that scope traversal should happen upon assignment for a specific variable name. The only limit is that nonlocal won't traverse up to the module-level scope (to avoid polluting globals).
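
The module-level limit can be observed directly. The following is a small sketch, assuming Python 3: it compiles a hypothetical snippet in which a function tries to declare a module-level name as nonlocal. The names source, count, bump, and error are invented for this illustration, and compile is used only so the failure can be caught and inspected.

```python
# Hypothetical module source: 'count' lives at module level, so there
# is no enclosing function scope for nonlocal to bind to.
source = '''
count = 0

def bump():
    nonlocal count
    count += 1
'''

try:
    compile(source, '<sketch>', 'exec')
except SyntaxError as e:
    # Python rejects the code at compile time, before anything runs.
    error = str(e)
else:
    error = None

print(error)
```

The rejection happens when the code is compiled rather than when bump is called, so this mistake surfaces as soon as the module is imported.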
Here, I define the same function again using nonlocal:

    def sort_priority3(numbers, group):
        found = False
        def helper(x):
            nonlocal found
            if x in group:
                found = True
                return (0, x)
            return (1, x)
        numbers.sort(key=helper)
        return found

The nonlocal statement makes it clear when data is being assigned out of a closure into another scope. It’s complementary to the global statement, which indicates that a variable’s assignment should go directly into the module scope.

However, much like the anti-pattern of global variables, I’d caution against using nonlocal for anything beyond simple functions. The side effects of nonlocal can be hard to follow. It’s especially hard to understand in long functions where the nonlocal statements and assignments to associated variables are far apart.

When your usage of nonlocal starts getting complicated, it’s better to wrap your state in a helper class. Here, I define a class that achieves the same result as the nonlocal approach. It’s a little longer, but is much easier to read (see Item 23: “Accept Functions for Simple Interfaces Instead of Classes” for details on the __call__ special method).

    class Sorter(object):
        def __init__(self, group):
            self.group = group
            self.found = False

        def __call__(self, x):
            if x in self.group:
                self.found = True
                return (0, x)
            return (1, x)

    sorter = Sorter(group)
    numbers.sort(key=sorter)
    assert sorter.found is True

Scope in Python 2

Unfortunately, Python 2 doesn’t support the nonlocal keyword. In order to get similar behavior, you need to use a work-around that takes advantage of Python’s scoping rules. This approach isn’t pretty, but it’s the common Python idiom.

    # Python 2
    def sort_priority(numbers, group):
        found = [False]
        def helper(x):
            if x in group:
                found[0] = True
                return (0, x)
            return (1, x)
        numbers.sort(key=helper)
        return found[0]

As explained above, Python will traverse up the scope where the found variable is referenced to resolve its current value. The trick is that the value for found is a list, which is mutable. This means that once retrieved, the closure can modify the state of found to send data out of the inner scope (with found[0] = True).
This approach also works when the variable used to traverse the scope is a dictionary, a set, or an instance of a class you’ve defined.

Things to Remember

* Closure functions can refer to variables from any of the scopes in which they were defined.
* By default, closures can’t affect enclosing scopes by assigning variables.
* In Python 3, use the nonlocal statement to indicate when a closure can modify a variable in its enclosing scopes.
* In Python 2, use a mutable value (like a single-item list) to work around the lack of the nonlocal statement.
* Avoid using nonlocal statements for anything beyond simple functions.

Item 16: Consider Generators Instead of Returning Lists

The simplest choice for functions that produce a sequence of results is to return a list of items. For example, say you want to find the index of every word in a string. Here, I accumulate results in a list using the append method and return it at the end of the function:

    def index_words(text):
        result = []
        if text:
            result.append(0)
        for index, letter in enumerate(text):
            if letter == ' ':
                result.append(index + 1)
        return result

This works as expected for some sample input.

    address = 'Four score and seven years ago…'
    result = index_words(address)
    print(result[:3])

    >>>
    [0, 5, 11]

There are two problems with the index_words function.

The first problem is that the code is a bit dense and noisy. Each time a new result is found, I call the append method. The method call’s bulk (result.append) deemphasizes the value being added to the list (index + 1). There is one line for creating the result list and another for returning it. While the function body contains ~130 characters (without whitespace), only ~75 characters are important.

A better way to write this function is using a generator. Generators are functions that use yield expressions. When called, generator functions do not actually run but instead immediately return an iterator. With each call to the next built-in function, the iterator will advance the generator to its next yield expression. Each value passed to yield by the generator will be returned by the iterator to the caller.
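The stepping behavior described above can be seen in a minimal sketch. The generator count_up_to is a hypothetical example introduced only for this illustration.

```python
def count_up_to(limit):
    # Hypothetical generator, used only to show how next() drives execution
    current = 1
    while current <= limit:
        yield current
        current += 1


it = count_up_to(3)   # Nothing in the function body has run yet
first = next(it)      # Runs the body up to the first yield
second = next(it)     # Resumes after the yield and runs to the next one
rest = list(it)       # Consumes whatever values remain
print(first, second, rest)
```

Each call to next resumes the function exactly where it last paused, so the local variable current keeps its value between calls.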
Here, I define a generator function that produces the same results as before:

    def index_words_iter(text):
        if text:
            yield 0
        for index, letter in enumerate(text):
            if letter == ' ':
                yield index + 1

It’s significantly easier to read because all interactions with the result list have been eliminated. Results are passed to yield expressions instead. The iterator returned by the generator call can easily be converted to a list by passing it to the list built-in function (see Item 9: “Consider Generator Expressions for Large Comprehensions” for how this works).

    result = list(index_words_iter(address))

The second problem with index_words is that it requires all results to be stored in the list before being returned. For huge inputs, this can cause your program to run out of memory and crash. In contrast, a generator version of this function can easily be adapted to take inputs of arbitrary length.

Here, I define a generator that streams input from a file one line at a time and yields outputs one word at a time. The working memory for this function is bounded to the maximum length of one line of input.

    def index_file(handle):
        offset = 0
        for line in handle:
            if line:
                yield offset
            for letter in line:
                offset += 1
                if letter == ' ':
                    yield offset

Running the generator produces the same results.

    from itertools import islice

    with open('/tmp/address.txt', 'r') as f:
        it = index_file(f)
        results = islice(it, 0, 3)
        print(list(results))

    >>>
    [0, 5, 11]

The only gotcha of defining generators like this is that the callers must be aware that the iterators returned are stateful and can’t be reused (see Item 17: “Be Defensive When Iterating Over Arguments”).

Things to Remember

* Using generators can be clearer than the alternative of returning lists of accumulated results.
* The iterator returned by a generator produces the set of values passed to yield expressions within the generator function’s body.
* Generators can produce a sequence of outputs for arbitrarily large inputs because their working memory doesn’t include all inputs and outputs.
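As a sketch of how the streaming version can be exercised without a file on disk, the snippet below feeds index_file from an in-memory io.StringIO object. The use of io.StringIO and the sample string are assumptions made for illustration; the generator body is repeated from the text so the example is self-contained.

```python
import io
from itertools import islice


def index_file(handle):
    # Same generator as in the text, repeated here for a runnable example
    offset = 0
    for line in handle:
        if line:
            yield offset
        for letter in line:
            offset += 1
            if letter == ' ':
                yield offset


# Assumption: an in-memory text stream stands in for a real file handle
handle = io.StringIO('Four score and seven years ago')
first_three = list(islice(index_file(handle), 0, 3))
print(first_three)
```

Because islice stops after three values, the generator never scans past the third word, which is the property that keeps memory use bounded for large inputs.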
Item 17: Be Defensive When Iterating Over Arguments

When a function takes a list of objects as a parameter, it’s often important to iterate over that list multiple times. For example, say you want to analyze tourism numbers for the U.S. state of Texas. Imagine the data set is the number of visitors to each city (in millions per year). You’d like to figure out what percentage of overall tourism each city receives.

To do this you need a normalization function. It sums the inputs to determine the total number of tourists per year. Then it divides each city’s individual visitor count by the total to find that city’s contribution to the whole.

    def normalize(numbers):
        total = sum(numbers)
        result = []
        for value in numbers:
            percent = 100 * value / total
            result.append(percent)
        return result

This function works when given a list of visits.

    visits = [15, 35, 80]
    percentages = normalize(visits)
    print(percentages)

    >>>
    [11.538461538461538, 26.923076923076923, 61.53846153846154]

To scale this up, I need to read the data from a file that contains every city in all of Texas. I define a generator to do this because then I can reuse the same function later when I want to compute tourism numbers for the whole world, a much larger data set (see Item 16: “Consider Generators Instead of Returning Lists”).

    def read_visits(data_path):
        with open(data_path) as f:
            for line in f:
                yield int(line)

Surprisingly, calling normalize on the generator’s return value produces no results.

    it = read_visits('/tmp/my_numbers.txt')
    percentages = normalize(it)
    print(percentages)

    >>>
    []

The cause of this behavior is that an iterator only produces its results a single time. If you iterate over an iterator or generator that has already raised a StopIteration exception, you won’t get any results the second time around.
    it = read_visits('/tmp/my_numbers.txt')
    print(list(it))
    print(list(it))  # Already exhausted

    >>>
    [15, 35, 80]
    []

What’s confusing is that you also won’t get any errors when you iterate over an already exhausted iterator. for loops, the list constructor, and many other functions throughout the Python standard library expect the StopIteration exception to be raised during normal operation. These functions can’t tell the difference between an iterator that has no output and an iterator that had output and is now exhausted.

To solve this problem, you can explicitly exhaust an input iterator and keep a copy of its entire contents in a list. You can then iterate over the list version of the data as many times as you need to. Here’s the same function as before, but it defensively copies the input iterator:

    def normalize_copy(numbers):
        numbers = list(numbers)  # Copy the iterator
        total = sum(numbers)
        result = []
        for value in numbers:
            percent = 100 * value / total
            result.append(percent)
        return result

Now the function works correctly on a generator’s return value.

    it = read_visits('/tmp/my_numbers.txt')
    percentages = normalize_copy(it)
    print(percentages)

    >>>
    [11.538461538461538, 26.923076923076923, 61.53846153846154]

The problem with this approach is the copy of the input iterator’s contents could be large. Copying the iterator could cause your program to run out of memory and crash. One way around this is to accept a function that returns a new iterator each time it’s called.

    def normalize_func(get_iter):
        total = sum(get_iter())   # New iterator
        result = []
        for value in get_iter():  # New iterator
            percent = 100 * value / total
            result.append(percent)
        return result

To use normalize_func, you can pass in a lambda expression that calls the generator and produces a new iterator each time.

    percentages = normalize_func(lambda: read_visits(path))

Though it works, having to pass a lambda function like this is clumsy. The better way to achieve the same result is to provide a new container class that implements the iterator protocol.
The iterator protocol is how Python for loops and related expressions traverse the contents of a container type. When Python sees a statement like for x in foo it will actually call iter(foo). The iter built-in function calls the foo.__iter__ special method in turn. The __iter__ method must return an iterator object (which itself implements the __next__ special method). Then the for loop repeatedly calls the next built-in function on the iterator object until it’s exhausted (and raises a StopIteration exception).

It sounds complicated, but practically speaking you can achieve all of this behavior for your classes by implementing the __iter__ method as a generator. Here, I define an iterable container class that reads the files containing tourism data:

    class ReadVisits(object):
        def __init__(self, data_path):
            self.data_path = data_path

        def __iter__(self):
            with open(self.data_path) as f:
                for line in f:
                    yield int(line)

This new container type works correctly when passed to the original function without any modifications.

    visits = ReadVisits(path)
    percentages = normalize(visits)
    print(percentages)

    >>>
    [11.538461538461538, 26.923076923076923, 61.53846153846154]

This works because the sum function in normalize will call ReadVisits.__iter__ to allocate a new iterator object. The for loop to normalize the numbers will also call __iter__ to allocate a second iterator object. Each of those iterators will be advanced and exhausted independently, ensuring that each unique iteration sees all of the input data values. The only downside of this approach is that it reads the input data multiple times.

Now that you know how containers like ReadVisits work, you can write your functions to ensure that parameters aren’t just iterators. The protocol states that when an iterator is passed to the iter built-in function, iter will return the iterator itself. In contrast, when a container type is passed to iter, a new iterator object will be returned each time. Thus, you can test an input value for this behavior and raise a TypeError to reject iterators.

    def normalize_defensive(numbers):
        if iter(numbers) is iter(numbers):  # An iterator (bad!)
            raise TypeError('Must supply a container')
        total = sum(numbers)
        result = []
        for value in numbers:
            percent = 100 * value / total
            result.append(percent)
        return result

This is ideal if you don’t want to copy the full input iterator like normalize_copy above, but you also need to iterate over the input data multiple times. This function works as expected for list and ReadVisits inputs because they are containers. It will work for any type of container that follows the iterator protocol.

    visits = [15, 35, 80]
    normalize_defensive(visits)  # No error
    visits = ReadVisits(path)
    normalize_defensive(visits)  # No error

The function will raise an exception if the input is iterable but not a container.

    it = iter(visits)
    normalize_defensive(it)

    >>>
    TypeError: Must supply a container

Things to Remember

* Beware of functions that iterate over input arguments multiple times. If these arguments are iterators, you may see strange behavior and missing values.
* Python’s iterator protocol defines how containers and iterators interact with the iter and next built-in functions, for loops, and related expressions.
* You can easily define your own iterable container type by implementing the __iter__ method as a generator.
* You can detect that a value is an iterator (instead of a container) if calling iter on it twice produces the same result, which can then be progressed with the next built-in function.

Item 18: Reduce Visual Noise with Variable Positional Arguments

Accepting optional positional arguments (often called star args in reference to the conventional name for the parameter, *args) can make a function call more clear and remove visual noise.

For example, say you want to log some debug information. With a fixed number of arguments, you would need a function that takes a message and a list of values.

    def log(message, values):
        if not values:
            print(message)
        else:
            values_str = ', '.join(str(x) for x in values)
            print('%s: %s' % (message, values_str))

    log('My numbers are', [1, 2])
    log('Hi there', [])

    >>>
    My numbers are: 1, 2
    Hi there

Having to pass an empty list when you have no values to log is cumbersome and noisy.
It’d be better to leave out the second argument entirely. You can do this in Python by prefixing the last positional parameter name with *. The first parameter for the log message is required, whereas any number of subsequent positional arguments are optional. The function body doesn’t need to change, only the callers do.

    def log(message, *values):  # The only difference
        if not values:
            print(message)
        else:
            values_str = ', '.join(str(x) for x in values)
            print('%s: %s' % (message, values_str))

    log('My numbers are', 1, 2)
    log('Hi there')  # Much better

    >>>
    My numbers are: 1, 2
    Hi there

If you already have a list and want to call a variable argument function like log, you can do this by using the * operator. This instructs Python to pass items from the sequence as positional arguments.

    favorites = [7, 33, 99]
    log('Favorite colors', *favorites)

    >>>
    Favorite colors: 7, 33, 99

There are two problems with accepting a variable number of positional arguments.

The first issue is that the variable arguments are always turned into a tuple before they are passed to your function. This means that if the caller of your function uses the * operator on a generator, it will be iterated until it’s exhausted. The resulting tuple will include every value from the generator, which could consume a lot of memory and cause your program to crash.

    def my_generator():
        for i in range(10):
            yield i

    def my_func(*args):
        print(args)

    it = my_generator()
    my_func(*it)

    >>>
    (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

Functions that accept *args are best for situations where you know the number of inputs in the argument list will be reasonably small. It’s ideal for function calls that pass many literals or variable names together. It’s primarily for the convenience of the programmer and the readability of the code.

The second issue with *args is that you can’t add new positional arguments to your function in the future without migrating every caller. If you try to add a positional argument in the front of the argument list, existing callers will subtly break if they aren’t updated.
    def log(sequence, message, *values):
        if not values:
            print('%s: %s' % (sequence, message))
        else:
            values_str = ', '.join(str(x) for x in values)
            print('%s: %s: %s' % (sequence, message, values_str))

    log(1, 'Favorites', 7, 33)      # New usage is OK
    log('Favorite numbers', 7, 33)  # Old usage breaks

    >>>
    1: Favorites: 7, 33
    Favorite numbers: 7: 33

The problem here is that the second call to log used 7 as the message parameter because a sequence argument wasn’t given. Bugs like this are hard to track down because the code still runs without raising any exceptions. To avoid this possibility entirely, you should use keyword-only arguments when you want to extend functions that accept *args (see Item 21: “Enforce Clarity with Keyword-Only Arguments”).

Things to Remember

* Functions can accept a variable number of positional arguments by using *args in the def statement.
* You can use the items from a sequence as the positional arguments for a function with the * operator.
* Using the * operator with a generator may cause your program to run out of memory and crash.
* Adding new positional parameters to functions that accept *args can introduce hard-to-find bugs.

Item 19: Provide Optional Behavior with Keyword Arguments

Like most other programming languages, calling a function in Python allows for passing arguments by position.

    def remainder(number, divisor):
        return number % divisor

    assert remainder(20, 7) == 6

All positional arguments to Python functions can also be passed by keyword, where the name of the argument is used in an assignment within the parentheses of a function call. The keyword arguments can be passed in any order as long as all of the required positional arguments are specified. You can mix and match keyword and positional arguments. These calls are equivalent:

    remainder(20, 7)
    remainder(20, divisor=7)
    remainder(number=20, divisor=7)
    remainder(divisor=7, number=20)

Positional arguments must be specified before keyword arguments.

    remainder(number=20, 7)

    >>>
    SyntaxError: non-keyword arg after keyword arg

Each argument can only be specified once.
    remainder(20, number=7)

    >>>
    TypeError: remainder() got multiple values for argument 'number'

The flexibility of keyword arguments provides three significant benefits.

The first advantage is that keyword arguments make the function call clearer to new readers of the code. With the call remainder(20, 7), it’s not evident which argument is the number and which is the divisor without looking at the implementation of the remainder function. In the call with keyword arguments, number=20 and divisor=7 make it immediately obvious which parameter is being used for each purpose.

The second impact of keyword arguments is that they can have default values specified in the function definition. This allows a function to provide additional capabilities when you need them but lets you accept the default behavior most of the time. This can eliminate repetitive code and reduce noise.

For example, say you want to compute the rate of fluid flowing into a vat. If the vat is also on a scale, then you could use the difference between two weight measurements at two different times to determine the flow rate.

    def flow_rate(weight_diff, time_diff):
        return weight_diff / time_diff

    weight_diff = 0.5
    time_diff = 3
    flow = flow_rate(weight_diff, time_diff)
    print('%.3f kg per second' % flow)

    >>>
    0.167 kg per second

In the typical case, it’s useful to know the flow rate in kilograms per second. Other times, it’d be helpful to use the last sensor measurements to approximate larger time scales, like hours or days. You can provide this behavior in the same function by adding an argument for the time period scaling factor.

    def flow_rate(weight_diff, time_diff, period):
        return (weight_diff / time_diff) * period

The problem is that now you need to specify the period argument every time you call the function, even in the common case of flow rate per second (where the period is 1).

    flow_per_second = flow_rate(weight_diff, time_diff, 1)

To make this less noisy, I can give the period argument a default value.

    def flow_rate(weight_diff, time_diff, period=1):
        return (weight_diff / time_diff) * period

The period argument is now optional.
    flow_per_second = flow_rate(weight_diff, time_diff)
    flow_per_hour = flow_rate(weight_diff, time_diff, period=3600)

This works well for simple default values (it gets tricky for complex default values; see Item 20: “Use None and Docstrings to Specify Dynamic Default Arguments”).

The third reason to use keyword arguments is that they provide a powerful way to extend a function’s parameters while remaining backwards compatible with existing callers. This lets you provide additional functionality without having to migrate a lot of code, reducing the chance of introducing bugs.

For example, say you want to extend the flow_rate function above to calculate flow rates in weight units besides kilograms. You can do this by adding a new optional parameter that provides a conversion rate to your preferred measurement units.

    def flow_rate(weight_diff, time_diff,
                  period=1, units_per_kg=1):
        return ((weight_diff / units_per_kg) / time_diff) * period

The default argument value for units_per_kg is 1, which makes the returned weight units remain as kilograms. This means that all existing callers will see no change in behavior. New callers to flow_rate can specify the new keyword argument to see the new behavior.

    pounds_per_hour = flow_rate(weight_diff, time_diff,
                                period=3600, units_per_kg=2.2)

The only problem with this approach is that optional keyword arguments like period and units_per_kg may still be specified as positional arguments.

    pounds_per_hour = flow_rate(weight_diff, time_diff, 3600, 2.2)

Supplying optional arguments positionally can be confusing because it isn’t clear what the values 3600 and 2.2 correspond to. The best practice is to always specify optional arguments using the keyword names and never pass them as positional arguments.

Note

Backwards compatibility using optional keyword arguments like this is crucial for functions that accept *args (see Item 18: “Reduce Visual Noise with Variable Positional Arguments”). But an even better practice is to use keyword-only arguments (see Item 21: “Enforce Clarity with Keyword-Only Arguments”).

Things to Remember

* Function arguments can be specified by position or by keyword.
* Keywords make it clear what the purpose of each argument is when it would be confusing with only positional arguments.
* Keyword arguments with default values make it easy to add new behaviors to a function, especially when the function has existing callers.
* Optional keyword arguments should always be passed by keyword instead of by position.

Item 20: Use None and Docstrings to Specify Dynamic Default Arguments

Sometimes you need to use a non-static type as a keyword argument’s default value. For example, say you want to print logging messages that are marked with the time of the logged event. In the default case, you want the message to include the time when the function was called. You might try the following approach, assuming the default arguments are reevaluated each time the function is called.

    from datetime import datetime
    from time import sleep

    def log(message, when=datetime.now()):
        print('%s: %s' % (when, message))

    log('Hi there!')
    sleep(0.1)
    log('Hi again!')

    >>>
    2014-11-15 21:10:10.371432: Hi there!
    2014-11-15 21:10:10.371432: Hi again!

The timestamps are the same because datetime.now is only executed a single time: when the function is defined. Default argument values are evaluated only once per module load, which usually happens when a program starts up. After the module containing this code is loaded, the datetime.now default argument will never be evaluated again.

The convention for achieving the desired result in Python is to provide a default value of None and to document the actual behavior in the docstring (see Item 49: “Write Docstrings for Every Function, Class, and Module”). When your code sees an argument value of None, you allocate the default value accordingly.

    def log(message, when=None):
        """Log a message with a timestamp.

        Args:
            message: Message to print.
            when: datetime of when the message occurred.
                Defaults to the present time.
        """
        when = datetime.now() if when is None else when
        print('%s: %s' % (when, message))

Now the timestamps will be different.

    log('Hi there!')
    sleep(0.1)
    log('Hi again!')

    >>>
    2014-11-15 21:10:10.472303: Hi there!
    2014-11-15 21:10:10.573395: Hi again!
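The once-only evaluation of default values can also be observed without depending on the clock. In this sketch, the tag function and the counter object are hypothetical names chosen for illustration. The __defaults__ attribute is where CPython stores a function’s evaluated default values.

```python
import itertools

counter = itertools.count()  # Hypothetical source of "fresh" values


def tag(message, number=next(counter)):  # next() runs once, at definition
    return '%d: %s' % (number, message)


first = tag('a')
second = tag('b')
stored_default = tag.__defaults__  # The single value captured at definition
print(first, second, stored_default)
```

Both calls report the same number because the default expression ran when the def statement executed, not when tag was called.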
Using None for default argument values is especially important when the arguments are mutable. For example, say you want to load a value encoded as JSON data. If decoding the data fails, you want an empty dictionary to be returned by default. You might try this approach.

    import json

    def decode(data, default={}):
        try:
            return json.loads(data)
        except ValueError:
            return default

The problem here is the same as the datetime.now example above. The dictionary specified for default will be shared by all calls to decode because default argument values are only evaluated once (at module load time). This can cause extremely surprising behavior.

    foo = decode('bad data')
    foo['stuff'] = 5
    bar = decode('also bad')
    bar['meep'] = 1
    print('Foo:', foo)
    print('Bar:', bar)

    >>>
    Foo: {'stuff': 5, 'meep': 1}
    Bar: {'stuff': 5, 'meep': 1}

You’d expect two different dictionaries, each with a single key and value. But modifying one seems to also modify the other. The culprit is that foo and bar are both equal to the default parameter. They are the same dictionary object.

    assert foo is bar

The fix is to set the keyword argument default value to None and then document the behavior in the function’s docstring.

    def decode(data, default=None):
        """Load JSON data from a string.

        Args:
            data: JSON data to decode.
            default: Value to return if decoding fails.
                Defaults to an empty dictionary.
        """
        if default is None:
            default = {}
        try:
            return json.loads(data)
        except ValueError:
            return default

Now, running the same test code as before produces the expected result.

    foo = decode('bad data')
    foo['stuff'] = 5
    bar = decode('also bad')
    bar['meep'] = 1
    print('Foo:', foo)
    print('Bar:', bar)

    >>>
    Foo: {'stuff': 5}
    Bar: {'meep': 1}

Things to Remember

* Default arguments are only evaluated once: during function definition at module load time. This can cause odd behaviors for dynamic values (like {} or []).
* Use None as the default value for keyword arguments that have a dynamic value. Document the actual default behavior in the function’s docstring.
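The same None convention applies to list defaults. The following is a minimal sketch, assuming a hypothetical append_to helper that is not part of the original text.

```python
def append_to(item, target=None):
    """Append item to target and return the list.

    Args:
        item: Value to append.
        target: List to append to.
            Defaults to a new empty list on every call.
    """
    if target is None:
        target = []  # A fresh list is allocated for each call
    target.append(item)
    return target


a = append_to(1)
b = append_to(2)
shared = [0]
c = append_to(3, shared)  # An explicit list is still modified in place
print(a, b, c)
```

Calls that omit target never share state with each other, while a caller who passes a list explicitly still gets that same list back.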
Item 21: Enforce Clarity with Keyword-Only Arguments

Passing arguments by keyword is a powerful feature of Python functions (see Item 19: “Provide Optional Behavior with Keyword Arguments”). The flexibility of keyword arguments enables you to write code that will be clear for your use cases.

For example, say you want to divide one number by another but be very careful about special cases. Sometimes you want to ignore ZeroDivisionError exceptions and return infinity instead. Other times, you want to ignore OverflowError exceptions and return zero instead.

    def safe_division(number, divisor, ignore_overflow,
                      ignore_zero_division):
        try:
            return number / divisor
        except OverflowError:
            if ignore_overflow:
                return 0
            else:
                raise
        except ZeroDivisionError:
            if ignore_zero_division:
                return float('inf')
            else:
                raise

Using this function is straightforward. This call will ignore the float overflow from division and will return zero.

    result = safe_division(1, 10**500, True, False)
    print(result)

    >>>
    0.0

This call will ignore the error from dividing by zero and will return infinity.

    result = safe_division(1, 0, False, True)
    print(result)

    >>>
    inf

The problem is that it’s easy to confuse the position of the two Boolean arguments that control the exception-ignoring behavior. This can easily cause bugs that are hard to track down. One way to improve the readability of this code is to use keyword arguments. By default, the function can be overly cautious and can always re-raise exceptions.

    def safe_division_b(number, divisor,
                        ignore_overflow=False,
                        ignore_zero_division=False):
        # ...

Then callers can use keyword arguments to specify which of the ignore flags they want to flip for specific operations, overriding the default behavior.

    safe_division_b(1, 10**500, ignore_overflow=True)
    safe_division_b(1, 0, ignore_zero_division=True)

The problem is, since these keyword arguments are optional behavior, there’s nothing forcing callers of your functions to use keyword arguments for clarity. Even with the new definition of safe_division_b, you can still call it the old way with positional arguments.
    safe_division_b(1, 10**500, True, False)

With complex functions like this, it’s better to require that callers are clear about their intentions. In Python 3, you can demand clarity by defining your functions with keyword-only arguments. These arguments can only be supplied by keyword, never by position.

Here, I redefine the safe_division function to accept keyword-only arguments. The * symbol in the argument list indicates the end of positional arguments and the beginning of keyword-only arguments.

    def safe_division_c(number, divisor, *,
                        ignore_overflow=False,
                        ignore_zero_division=False):
        # ...

Now, calling the function with positional arguments for the keyword arguments won’t work.

    safe_division_c(1, 10**500, True, False)

    >>>
    TypeError: safe_division_c() takes 2 positional arguments but 4 were given

Keyword arguments and their default values work as expected.

    safe_division_c(1, 0, ignore_zero_division=True)  # OK

    try:
        safe_division_c(1, 0)
    except ZeroDivisionError:
        pass  # Expected

Keyword-Only Arguments in Python 2

Unfortunately, Python 2 doesn’t have explicit syntax for specifying keyword-only arguments like Python 3. But you can achieve the same behavior of raising TypeErrors for invalid function calls by using the ** operator in argument lists. The ** operator is similar to the * operator (see Item 18: “Reduce Visual Noise with Variable Positional Arguments”), except that instead of accepting a variable number of positional arguments, it accepts any number of keyword arguments, even when they’re not defined.
    # Python 2
    def print_args(*args, **kwargs):
        print 'Positional:', args
        print 'Keyword:   ', kwargs

    print_args(1, 2, foo='bar', stuff='meep')

    >>>
    Positional: (1, 2)
    Keyword:    {'foo': 'bar', 'stuff': 'meep'}

To make safe_division take keyword-only arguments in Python 2, you have the function accept **kwargs. Then you pop keyword arguments that you expect out of the kwargs dictionary, using the pop method’s second argument to specify the default value when the key is missing. Finally, you make sure there are no more keyword arguments left in kwargs to prevent callers from supplying arguments that are invalid.

    # Python 2
    def safe_division_d(number, divisor, **kwargs):
        ignore_overflow = kwargs.pop('ignore_overflow', False)
        ignore_zero_div = kwargs.pop('ignore_zero_division', False)
        if kwargs:
            raise TypeError('Unexpected **kwargs: %r' % kwargs)
        # ...

Now, you can call the function with or without keyword arguments.

    safe_division_d(1, 10)
    safe_division_d(1, 0, ignore_zero_division=True)
    safe_division_d(1, 10**500, ignore_overflow=True)

Trying to pass keyword-only arguments by position won’t work, just like in Python 3.

    safe_division_d(1, 0, False, True)

    >>>
    TypeError: safe_division_d() takes 2 positional arguments but 4 were given

Trying to pass unexpected keyword arguments also won’t work.

    safe_division_d(0, 0, unexpected=True)

    >>>
    TypeError: Unexpected **kwargs: {'unexpected': True}

Things to Remember

* Keyword arguments make the intention of a function call more clear.
* Use keyword-only arguments to force callers to supply keyword arguments for potentially confusing functions, especially those that accept multiple Boolean flags.
* Python 3 supports explicit syntax for keyword-only arguments in functions.
* Python 2 can emulate keyword-only arguments for functions by using **kwargs and manually raising TypeError exceptions.
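The Python 3 listing for safe_division_c elides its body. As a runnable sketch, the version below assumes the body is the same try/except logic as the original safe_division; that assumption and the variable names inf_result and positional_rejected are mine, not the text’s.

```python
def safe_division_c(number, divisor, *,
                    ignore_overflow=False,
                    ignore_zero_division=False):
    # Assumed body: same handling as the original safe_division
    try:
        return number / divisor
    except OverflowError:
        if ignore_overflow:
            return 0
        raise
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        raise


# Keyword usage is accepted
inf_result = safe_division_c(1, 0, ignore_zero_division=True)

# Positional usage for the flags is rejected before the body runs
try:
    safe_division_c(1, 0, False, True)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False

print(inf_result, positional_rejected)
```

The TypeError comes from argument binding itself, so no code inside the function is needed to enforce the rule, unlike the Python 2 emulation with **kwargs.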
3. Classes and Inheritance

As an object-oriented programming language, Python supports a full range of features, such as inheritance, polymorphism, and encapsulation. Getting things done in Python often requires writing new classes and defining how they interact through their interfaces and hierarchies.

Python’s classes and inheritance make it easy to express your program’s intended behaviors with objects. They allow you to improve and expand functionality over time. They provide flexibility in an environment of changing requirements. Knowing how to use them well enables you to write maintainable code.

Item 22: Prefer Helper Classes Over Bookkeeping with Dictionaries and Tuples

Python’s built-in dictionary type is wonderful for maintaining dynamic internal state over the lifetime of an object. By dynamic, I mean situations in which you need to do bookkeeping for an unexpected set of identifiers. For example, say you want to record the grades of a set of students whose names aren’t known in advance. You can define a class to store the names in a dictionary instead of using a predefined attribute for each student.

    class SimpleGradebook(object):
        def __init__(self):
            self._grades = {}

        def add_student(self, name):
            self._grades[name] = []

        def report_grade(self, name, score):
            self._grades[name].append(score)

        def average_grade(self, name):
            grades = self._grades[name]
            return sum(grades) / len(grades)

Using the class is simple.

    book = SimpleGradebook()
    book.add_student('Isaac Newton')
    book.report_grade('Isaac Newton', 90)
    # ...
    print(book.average_grade('Isaac Newton'))

    >>>
    90.0

Dictionaries are so easy to use that there’s a danger of overextending them to write brittle code. For example, say you want to extend the SimpleGradebook class to keep a list of grades by subject, not just overall. You can do this by changing the _grades dictionary to map student names (the keys) to yet another dictionary (the values). The innermost dictionary will map subjects (the keys) to grades (the values).
    class BySubjectGradebook(object):
        def __init__(self):
            self._grades = {}

        def add_student(self, name):
            self._grades[name] = {}

This seems straightforward enough. The report_grade and average_grade methods will gain quite a bit of complexity to deal with the multilevel dictionary, but it’s manageable.

        def report_grade(self, name, subject, grade):
            by_subject = self._grades[name]
            grade_list = by_subject.setdefault(subject, [])
            grade_list.append(grade)

        def average_grade(self, name):
            by_subject = self._grades[name]
            total, count = 0, 0
            for grades in by_subject.values():
                total += sum(grades)
                count += len(grades)
            return total / count

Using the class remains simple.

    book = BySubjectGradebook()
    book.add_student('Albert Einstein')
    book.report_grade('Albert Einstein', 'Math', 75)
    book.report_grade('Albert Einstein', 'Math', 65)
    book.report_grade('Albert Einstein', 'Gym', 90)
    book.report_grade('Albert Einstein', 'Gym', 95)

Now, imagine your requirements change again. You also want to track the weight of each score toward the overall grade in the class so midterms and finals are more important than pop quizzes. One way to implement this feature is to change the innermost dictionary; instead of mapping subjects (the keys) to grades (the values), I can use the tuple (score, weight) as values.

    class WeightedGradebook(object):
        # ...
        def report_grade(self, name, subject, score, weight):
            by_subject = self._grades[name]
            grade_list = by_subject.setdefault(subject, [])
            grade_list.append((score, weight))

Although the changes to report_grade seem simple (just make the value a tuple), the average_grade method now has a loop within a loop and is difficult to read.

        def average_grade(self, name):
            by_subject = self._grades[name]
            score_sum, score_count = 0, 0
            for subject, scores in by_subject.items():
                subject_avg, total_weight = 0, 0
                for score, weight in scores:
                    # ...
            return score_sum / score_count

Using the class has also gotten more difficult. It’s unclear what all of the numbers in the positional arguments mean.
book.report_grade('Albert Einstein', 'Math', 80, 0.10)

When you see complexity like this happen, it’s time to make the leap from dictionaries and tuples to a hierarchy of classes.

At first, you didn’t know you’d need to support weighted grades, so the complexity of additional helper classes seemed unwarranted. Python’s built-in dictionary and tuple types made it easy to keep going, adding layer after layer to the internal bookkeeping. But you should avoid doing this for more than one level of nesting (i.e., avoid dictionaries that contain dictionaries). It makes your code hard to read by other programmers and sets you up for a maintenance nightmare.

As soon as you realize the bookkeeping is getting complicated, break it all out into classes. This lets you provide well-defined interfaces that better encapsulate your data. This also enables you to create a layer of abstraction between your interfaces and your concrete implementations.

Refactoring to Classes

You can start moving to classes at the bottom of the dependency tree: a single grade. A class seems too heavyweight for such simple information. A tuple, though, seems appropriate because grades are immutable. Here, I use the tuple (score, weight) to track grades in a list:

grades = []
grades.append((95, 0.45))
# …
total = sum(score * weight for score, weight in grades)
total_weight = sum(weight for _, weight in grades)
average_grade = total / total_weight

The problem is that plain tuples are positional. When you want to associate more information with a grade, like a set of notes from the teacher, you’ll need to rewrite every usage of the two-tuple to be aware that there are now three items present instead of two. Here, I use _ (the underscore variable name, a Python convention for unused variables) to capture the third entry in the tuple and just ignore it:

grades = []
grades.append((95, 0.45, 'Great job'))
# …
total = sum(score * weight for score, weight, _ in grades)
total_weight = sum(weight for _, weight, _ in grades)
average_grade = total / total_weight

This pattern of extending tuples longer and longer is similar to deepening layers of dictionaries. As soon as you find yourself going longer than a two-tuple, it’s time to consider another approach.
The namedtuple type in the collections module does exactly what you need. It lets you easily define tiny, immutable data classes.

import collections
Grade = collections.namedtuple('Grade', ('score', 'weight'))

These classes can be constructed with positional or keyword arguments. The fields are accessible with named attributes. Having named attributes makes it easy to move from a namedtuple to your own class later if your requirements change again and you need to add behaviors to the simple data containers.

Limitations of namedtuple

Although namedtuple is useful in many circumstances, it’s important to understand when it can cause more harm than good.

* Before Python 3.7, you can’t specify default argument values for namedtuple classes (later versions accept a defaults argument). This makes them unwieldy when your data may have many optional properties. If you find yourself using more than a handful of attributes, defining your own class may be a better choice.
* The attribute values of namedtuple instances are still accessible using numerical indexes and iteration. Especially in externalized APIs, this can lead to unintentional usage that makes it harder to move to a real class later. If you’re not in control of all of the usage of your namedtuple instances, it’s better to define your own class.

Next, you can write a class to represent a single subject that contains a set of grades.

class Subject(object):
    def __init__(self):
        self._grades = []

    def report_grade(self, score, weight):
        self._grades.append(Grade(score, weight))

    def average_grade(self):
        total, total_weight = 0, 0
        for grade in self._grades:
            total += grade.score * grade.weight
            total_weight += grade.weight
        return total / total_weight

Then you would write a class to represent a set of subjects that are being studied by a single student.

class Student(object):
    def __init__(self):
        self._subjects = {}

    def subject(self, name):
        if name not in self._subjects:
            self._subjects[name] = Subject()
        return self._subjects[name]

    def average_grade(self):
        total, count = 0, 0
        for subject in self._subjects.values():
            total += subject.average_grade()
            count += 1
        return total / count

Finally, you’d write a container for all of the students keyed dynamically by their names.
class Gradebook(object):
    def __init__(self):
        self._students = {}

    def student(self, name):
        if name not in self._students:
            self._students[name] = Student()
        return self._students[name]

The line count of these classes is almost double the previous implementation’s size. But this code is much easier to read. The example driving the classes is also more clear and extensible.

book = Gradebook()
albert = book.student('Albert Einstein')
math = albert.subject('Math')
math.report_grade(80, 0.10)
# …
print(albert.average_grade())

>>>
81.5

If necessary, you can write backwards-compatible methods to help migrate usage of the old API style to the new hierarchy of objects.

Things to Remember

* Avoid making dictionaries with values that are other dictionaries or long tuples.
* Use namedtuple for lightweight, immutable data containers before you need the flexibility of a full class.
* Move your bookkeeping code to use multiple helper classes when your internal state dictionaries get complicated.

Item 23: Accept Functions for Simple Interfaces Instead of Classes

Many of Python’s built-in APIs allow you to customize behavior by passing in a function. These hooks are used by APIs to call back your code while they execute. For example, the list type’s sort method takes an optional key argument that’s used to determine each index’s value for sorting. Here, I sort a list of names based on their lengths by providing a lambda expression as the key hook:

names = ['Socrates', 'Archimedes', 'Plato', 'Aristotle']
names.sort(key=lambda x: len(x))
print(names)

>>>
['Plato', 'Socrates', 'Aristotle', 'Archimedes']

In other languages, you might expect hooks to be defined by an abstract class. In Python, many hooks are just stateless functions with well-defined arguments and return values. Functions are ideal for hooks because they are easier to describe and simpler to define than classes. Functions work as hooks because Python has first-class functions: Functions and methods can be passed around and referenced like any other value in the language.
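As a minimal sketch of what first-class functions make possible, the snippet below passes both a plain function and a bound method as hooks. The names by_length and Greeter are hypothetical and exist only for this illustration.

```python
def by_length(name):
    # Hypothetical key hook: rank each name by its length.
    return len(name)


class Greeter:
    # Hypothetical helper class used only to obtain a bound method.
    def __init__(self, greeting):
        self.greeting = greeting

    def greet(self, name):
        return f'{self.greeting}, {name}'


names = ['Socrates', 'Archimedes', 'Plato', 'Aristotle']

# A plain function is passed by reference, without calling it.
sorted_names = sorted(names, key=by_length)

# A bound method is a value too, so it can be handed to map() as a hook.
hook = Greeter('Hello').greet
greetings = list(map(hook, sorted_names[:2]))
```

Neither sorted nor map needs to know whether the hook is a function, a lambda, or a method; each only requires something it can call with one argument.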
For example, say you want to customize the behavior of the defaultdict class (see Item 46: “Use Built-in Algorithms and Data Structures” for details). This data structure allows you to supply a function that will be called each time a missing key is accessed. The function must return the default value the missing key should have in the dictionary. Here, I define a hook that logs each time a key is missing and returns 0 for the default value:

def log_missing():
    print('Key added')
    return 0

Given an initial dictionary and a set of desired increments, I can cause the log_missing function to run and print twice (for 'red' and 'orange').

from collections import defaultdict

current = {'green': 12, 'blue': 3}
increments = [
    ('red', 5),
    ('blue', 17),
    ('orange', 9),
]
result = defaultdict(log_missing, current)
print('Before:', dict(result))
for key, amount in increments:
    result[key] += amount
print('After: ', dict(result))

>>>
Before: {'green': 12, 'blue': 3}
Key added
Key added
After:  {'orange': 9, 'green': 12, 'blue': 20, 'red': 5}

Supplying functions like log_missing makes APIs easy to build and test because it separates side effects from deterministic behavior. For example, say you now want the default value hook passed to defaultdict to count the total number of keys that were missing. One way to achieve this is using a stateful closure (see Item 15: “Know How Closures Interact with Variable Scope” for details). Here, I define a helper function that uses such a closure as the default value hook:

def increment_with_report(current, increments):
    added_count = 0

    def missing():
        nonlocal added_count  # Stateful closure
        added_count += 1
        return 0

    result = defaultdict(missing, current)
    for key, amount in increments:
        result[key] += amount

    return result, added_count

Running this function produces the expected result (2), even though the defaultdict has no idea that the missing hook maintains state. This is another benefit of accepting simple functions for interfaces. It’s easy to add functionality later by hiding state in a closure.
result, count = increment_with_report(current, increments)
assert count == 2

The problem with defining a closure for stateful hooks is that it’s harder to read than the stateless function example. Another approach is to define a small class that encapsulates the state you want to track.

class CountMissing(object):
    def __init__(self):
        self.added = 0

    def missing(self):
        self.added += 1
        return 0

In other languages, you might expect that now defaultdict would have to be modified to accommodate the interface of CountMissing. But in Python, thanks to first-class functions, you can reference the CountMissing.missing method directly on an object and pass it to defaultdict as the default value hook. It’s trivial to have a method satisfy a function interface.

counter = CountMissing()
result = defaultdict(counter.missing, current)  # Method ref

for key, amount in increments:
    result[key] += amount
assert counter.added == 2

Using a helper class like this to provide the behavior of a stateful closure is clearer than the increment_with_report function above. However, in isolation it’s still not immediately obvious what the purpose of the CountMissing class is. Who constructs a CountMissing object? Who calls the missing method? Will the class need other public methods to be added in the future? Until you see its usage with defaultdict, the class is a mystery.

To clarify this situation, Python allows classes to define the __call__ special method. __call__ allows an object to be called just like a function. It also causes the callable built-in function to return True for such an instance.
class BetterCountMissing(object):
    def __init__(self):
        self.added = 0

    def __call__(self):
        self.added += 1
        return 0

counter = BetterCountMissing()
counter()
assert callable(counter)

Here, I use a BetterCountMissing instance as the default value hook for a defaultdict to track the number of missing keys that were added:

counter = BetterCountMissing()
result = defaultdict(counter, current)  # Relies on __call__
for key, amount in increments:
    result[key] += amount
assert counter.added == 2

This is much clearer than the CountMissing.missing example. The __call__ method indicates that a class’s instances will be used somewhere a function argument would also be suitable (like API hooks). It directs new readers of the code to the entry point that’s responsible for the class’s primary behavior. It provides a strong hint that the goal of the class is to act as a stateful closure.

Best of all, defaultdict still has no view into what’s going on when you use __call__. All that defaultdict requires is a function for the default value hook. Python provides many different ways to satisfy a simple function interface depending on what you need to accomplish.

Things to Remember

* Instead of defining and instantiating classes, functions are often all you need for simple interfaces between components in Python.
* References to functions and methods in Python are first class, meaning they can be used in expressions like any other type.
* The __call__ special method enables instances of a class to be called like plain Python functions.
* When you need a function to maintain state, consider defining a class that provides the __call__ method instead of defining a stateful closure (see Item 15: “Know How Closures Interact with Variable Scope”).

Item 24: Use @classmethod Polymorphism to Construct Objects Generically

In Python, not only do the objects support polymorphism, but the classes do as well. What does that mean, and what is it good for?

Polymorphism is a way for multiple classes in a hierarchy to implement their own unique versions of a method. This allows many classes to fulfill the same interface or abstract base class while providing different functionality (see Item 28: “Inherit from collections.abc for Custom Container Types” for an example).
For example, say you’re writing a MapReduce implementation and you want a common class to represent the input data. Here, I define such a class with a read method that must be defined by subclasses:

class InputData(object):
    def read(self):
        raise NotImplementedError

Here, I have a concrete subclass of InputData that reads data from a file on disk:

class PathInputData(InputData):
    def __init__(self, path):
        super().__init__()
        self.path = path

    def read(self):
        return open(self.path).read()

You could have any number of InputData subclasses like PathInputData and each of them could implement the standard interface for read to return the bytes of data to process. Other InputData subclasses could read from the network, decompress data transparently, etc.

You’d want a similar abstract interface for the MapReduce worker that consumes the input data in a standard way.

class Worker(object):
    def __init__(self, input_data):
        self.input_data = input_data
        self.result = None

    def map(self):
        raise NotImplementedError

    def reduce(self, other):
        raise NotImplementedError

Here, I define a concrete subclass of Worker to implement the specific MapReduce function I want to apply: a simple newline counter:

class LineCountWorker(Worker):
    def map(self):
        data = self.input_data.read()
        self.result = data.count('\n')

    def reduce(self, other):
        self.result += other.result

It may look like this implementation is going great, but I’ve reached the biggest hurdle in all of this. What connects all of these pieces? I have a nice set of classes with reasonable interfaces and abstractions, but that’s only useful once the objects are constructed. What’s responsible for building the objects and orchestrating the MapReduce?

The simplest approach is to manually build and connect the objects with some helper functions. Here, I list the contents of a directory and construct a PathInputData instance for each file it contains:

import os

def generate_inputs(data_dir):
    for name in os.listdir(data_dir):
        yield PathInputData(os.path.join(data_dir, name))

Next, I create the LineCountWorker instances using the InputData instances returned by generate_inputs.
def create_workers(input_list):
    workers = []
    for input_data in input_list:
        workers.append(LineCountWorker(input_data))
    return workers

I execute these Worker instances by fanning out the map step to multiple threads (see Item 37: “Use Threads for Blocking I/O, Avoid for Parallelism”). Then, I call reduce repeatedly to combine the results into one final value.

from threading import Thread

def execute(workers):
    threads = [Thread(target=w.map) for w in workers]
    for thread in threads: thread.start()
    for thread in threads: thread.join()

    first, rest = workers[0], workers[1:]
    for worker in rest:
        first.reduce(worker)
    return first.result

Finally, I connect all of the pieces together in a function to run each step.

def mapreduce(data_dir):
    inputs = generate_inputs(data_dir)
    workers = create_workers(inputs)
    return execute(workers)

Running this function on a set of test input files works great.

from tempfile import TemporaryDirectory

def write_test_files(tmpdir):
    # …

with TemporaryDirectory() as tmpdir:
    write_test_files(tmpdir)
    result = mapreduce(tmpdir)

print('There are', result, 'lines')

>>>
There are 4360 lines

What’s the problem? The huge issue is the mapreduce function is not generic at all. If you want to write another InputData or Worker subclass, you would also have to rewrite the generate_inputs, create_workers, and mapreduce functions to match.

This problem boils down to needing a generic way to construct objects. In other languages, you’d solve this problem with constructor polymorphism, requiring that each InputData subclass provides a special constructor that can be used generically by the helper methods that orchestrate the MapReduce. The trouble is that Python only allows for the single constructor method __init__. It’s unreasonable to require every InputData subclass to have a compatible constructor.

The best way to solve this problem is with @classmethod polymorphism. This is exactly like the instance method polymorphism I used for InputData.read, except that it applies to whole classes instead of their constructed objects.
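Before the idea is applied to the MapReduce classes, here is a minimal, self-contained sketch of @classmethod polymorphism. The Shape, Square, and Cube classes, the from_config method, and the build helper are all hypothetical names chosen for this illustration; they are not part of the MapReduce example.

```python
class Shape:
    def __init__(self, size):
        self.size = size

    @classmethod
    def from_config(cls, config):
        # cls is whichever class the method was called on,
        # so subclasses are constructed without naming them here.
        return cls(config['size'])

    def area(self):
        raise NotImplementedError


class Square(Shape):
    def area(self):
        return self.size ** 2


class Cube(Shape):
    @classmethod
    def from_config(cls, config):
        # A subclass may interpret the config in its own way.
        return cls(config['edge'])

    def area(self):
        return 6 * self.size ** 2


def build(shape_class, config):
    # Generic glue code: it never refers to a concrete subclass.
    return shape_class.from_config(config)


square = build(Square, {'size': 3})
cube = build(Cube, {'edge': 2})
```

The build function plays the role that generic orchestration code would play: it receives a class as an argument and relies on the class method, not on __init__, as the common construction interface.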
Let me apply this idea to the MapReduce classes. Here, I extend the InputData class with a generic class method that’s responsible for creating new InputData instances using a common interface:

class GenericInputData(object):
    def read(self):
        raise NotImplementedError

    @classmethod
    def generate_inputs(cls, config):
        raise NotImplementedError

I have generate_inputs take a dictionary with a set of configuration parameters that are up to the InputData concrete subclass to interpret. Here, I use the config to find the directory to list for input files:

class PathInputData(GenericInputData):
    # …
    def read(self):
        return open(self.path).read()

    @classmethod
    def generate_inputs(cls, config):
        data_dir = config['data_dir']
        for name in os.listdir(data_dir):
            yield cls(os.path.join(data_dir, name))

Similarly, I can make the create_workers helper part of the GenericWorker class. Here, I use the input_class parameter, which must be a subclass of GenericInputData, to generate the necessary inputs. I construct instances of the GenericWorker concrete subclass using cls() as a generic constructor.

class GenericWorker(object):
    # …
    def map(self):
        raise NotImplementedError

    def reduce(self, other):
        raise NotImplementedError

    @classmethod
    def create_workers(cls, input_class, config):
        workers = []
        for input_data in input_class.generate_inputs(config):
            workers.append(cls(input_data))
        return workers

Note that the call to input_class.generate_inputs above is the class polymorphism I’m trying to show. You can also see how create_workers calling cls provides an alternate way to construct GenericWorker objects besides using the __init__ method directly.

The effect on my concrete GenericWorker subclass is nothing more than changing its parent class.

class LineCountWorker(GenericWorker):
    # …

And finally, I can rewrite the mapreduce function to be completely generic.
def mapreduce(worker_class, input_class, config):
    workers = worker_class.create_workers(input_class, config)
    return execute(workers)

Running the new worker on a set of test files produces the same result as the old implementation. The difference is that the mapreduce function requires more parameters so that it can operate generically.

with TemporaryDirectory() as tmpdir:
    write_test_files(tmpdir)
    config = {'data_dir': tmpdir}
    result = mapreduce(LineCountWorker, PathInputData, config)

Now you can write other GenericInputData and GenericWorker classes as you wish and not have to rewrite any of the glue code.

Things to Remember

* Python only supports a single constructor per class, the __init__ method.
* Use @classmethod to define alternative constructors for your classes.
* Use class method polymorphism to provide generic ways to build and connect concrete subclasses.

Item 25: Initialize Parent Classes with super

The old way to initialize a parent class from a child class is to directly call the parent class’s __init__ method with the child instance.

class MyBaseClass(object):
    def __init__(self, value):
        self.value = value

class MyChildClass(MyBaseClass):
    def __init__(self):
        MyBaseClass.__init__(self, 5)

This approach works fine for simple hierarchies but breaks down in many cases.

If your class is affected by multiple inheritance (something to avoid in general; see Item 26: “Use Multiple Inheritance Only for Mix-in Utility Classes”), calling the superclasses’ __init__ methods directly can lead to unpredictable behavior.

One problem is that the __init__ call order isn’t specified across all subclasses. For example, here I define two parent classes that operate on the instance’s value field:

class TimesTwo(object):
    def __init__(self):
        self.value *= 2

class PlusFive(object):
    def __init__(self):
        self.value += 5

This class defines its parent classes in one ordering.

class OneWay(MyBaseClass, TimesTwo, PlusFive):
    def __init__(self, value):
        MyBaseClass.__init__(self, value)
        TimesTwo.__init__(self)
        PlusFive.__init__(self)

And constructing it produces a result that matches the parent class ordering.
foo = OneWay(5)
print('First ordering is (5 * 2) + 5 =', foo.value)

>>>
First ordering is (5 * 2) + 5 = 15

Here’s another class that defines the same parent classes but in a different ordering:

class AnotherWay(MyBaseClass, PlusFive, TimesTwo):
    def __init__(self, value):
        MyBaseClass.__init__(self, value)
        TimesTwo.__init__(self)
        PlusFive.__init__(self)

However, I left the calls to the parent class constructors PlusFive.__init__ and TimesTwo.__init__ in the same order as before, causing this class’s behavior not to match the order of the parent classes in its definition.

bar = AnotherWay(5)
print('Second ordering still is', bar.value)

>>>
Second ordering still is 15

Another problem occurs with diamond inheritance. Diamond inheritance happens when a subclass inherits from two separate classes that have the same superclass somewhere in the hierarchy. Diamond inheritance causes the common superclass’s __init__ method to run multiple times, causing unexpected behavior. For example, here I define two child classes that inherit from MyBaseClass.

class TimesFive(MyBaseClass):
    def __init__(self, value):
        MyBaseClass.__init__(self, value)
        self.value *= 5

class PlusTwo(MyBaseClass):
    def __init__(self, value):
        MyBaseClass.__init__(self, value)
        self.value += 2

Then, I define a child class that inherits from both of these classes, making MyBaseClass the top of the diamond.

class ThisWay(TimesFive, PlusTwo):
    def __init__(self, value):
        TimesFive.__init__(self, value)
        PlusTwo.__init__(self, value)

foo = ThisWay(5)
print('Should be (5 * 5) + 2 = 27 but is', foo.value)

>>>
Should be (5 * 5) + 2 = 27 but is 7

The output should be 27 because (5 * 5) + 2 = 27. But the call to the second parent class’s constructor, PlusTwo.__init__, causes self.value to be reset back to 5 when MyBaseClass.__init__ gets called a second time.

To solve these problems, Python 2.2 added the super built-in function and defined the method resolution order (MRO). The MRO standardizes which superclasses are initialized before others (e.g., depth-first, left-to-right). It also ensures that common superclasses in diamond hierarchies are only run once.
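The effect of the MRO can be observed directly with a small diamond. The following is a minimal sketch in Python 3 syntax; the Base, Left, Right, and Bottom classes and the log attribute are hypothetical names introduced only to record which initializers ran and in what order.

```python
class Base:
    def __init__(self):
        # Starts the log; it would be reset if Base ran twice.
        self.log = ['Base']


class Left(Base):
    def __init__(self):
        super().__init__()
        self.log.append('Left')


class Right(Base):
    def __init__(self):
        super().__init__()
        self.log.append('Right')


class Bottom(Left, Right):
    def __init__(self):
        super().__init__()
        self.log.append('Bottom')


obj = Bottom()

# Names of the classes in method resolution order.
mro_names = [klass.__name__ for klass in Bottom.mro()]
```

Each super().__init__() call hands control to the next class in Bottom's MRO rather than to the textual parent, which is why Base appears once in the log and why Right finishes its work before Left.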
Here, I create a diamond-shaped class hierarchy again, but this time I use super (in the Python 2 style) to initialize the parent class:

# Python 2
class TimesFiveCorrect(MyBaseClass):
    def __init__(self, value):
        super(TimesFiveCorrect, self).__init__(value)
        self.value *= 5

class PlusTwoCorrect(MyBaseClass):
    def __init__(self, value):
        super(PlusTwoCorrect, self).__init__(value)
        self.value += 2

Now the top part of the diamond, MyBaseClass.__init__, is only run a single time. The other parent classes are run in the order specified in the class statement.

# Python 2
class GoodWay(TimesFiveCorrect, PlusTwoCorrect):
    def __init__(self, value):
        super(GoodWay, self).__init__(value)

foo = GoodWay(5)
print 'Should be 5 * (5 + 2) = 35 and is', foo.value

>>>
Should be 5 * (5 + 2) = 35 and is 35

This order may seem backwards at first. Shouldn’t TimesFiveCorrect.__init__ have run first? Shouldn’t the result be (5 * 5) + 2 = 27? The answer is no. This ordering matches what the MRO defines for this class. The MRO ordering is available on a class method called mro.

from pprint import pprint
pprint(GoodWay.mro())

>>>
[<class '__main__.GoodWay'>,
 <class '__main__.TimesFiveCorrect'>,
 <class '__main__.PlusTwoCorrect'>,
 <class '__main__.MyBaseClass'>,
 <class 'object'>]

When I call GoodWay(5), it in turn calls TimesFiveCorrect.__init__, which calls PlusTwoCorrect.__init__, which calls MyBaseClass.__init__. Once this reaches the top of the diamond, then all of the initialization methods actually do their work in the opposite order from how their __init__ functions were called. MyBaseClass.__init__ assigns the value to 5. PlusTwoCorrect.__init__ adds 2 to make value equal 7. TimesFiveCorrect.__init__ multiplies it by 5 to make value equal 35.

The super built-in function works well, but it still has two noticeable problems in Python 2:

* Its syntax is a bit verbose. You have to specify the class you’re in, the self object, the method name (usually __init__), and all the arguments. This construction can be confusing to new Python programmers.
* You have to specify the current class by name in the call to super. If you ever change the class’s name (a very common activity when improving a class hierarchy), you also need to update every call to super.

Thankfully, Python 3 fixes these issues by making calls to super with no arguments equivalent to calling super with __class__ and self specified. In Python 3, you should always use super because it’s clear, concise, and always does the right thing.

class Explicit(MyBaseClass):
    def __init__(self, value):
        super(__class__, self).__init__(value * 2)

class Implicit(MyBaseClass):
    def __init__(self, value):
        super().__init__(value * 2)

assert Explicit(10).value == Implicit(10).value

This works because Python 3 lets you reliably reference the current class in methods using the __class__ variable. This doesn’t work in Python 2 because __class__ isn’t defined. You may guess that you could use self.__class__ as an argument to super, but this breaks because of the way super is implemented in Python 2.

Things to Remember

* Python’s standard method resolution order (MRO) solves the problems of superclass initialization order and diamond inheritance.
* Always use the super built-in function to initialize parent classes.

Item 26: Use Multiple Inheritance Only for Mix-in Utility Classes

Python is an object-oriented language with built-in facilities for making multiple inheritance tractable (see Item 25: “Initialize Parent Classes with super”). However, it’s better to avoid multiple inheritance altogether.

If you find yourself desiring the convenience and encapsulation that comes with multiple inheritance, consider writing a mix-in instead. A mix-in is a small class that only defines a set of additional methods that a class should provide. Mix-in classes don’t define their own instance attributes nor require their __init__ constructor to be called.

Writing mix-ins is easy because Python makes it trivial to inspect the current state of any object regardless of its type. Dynamic inspection lets you write generic functionality a single time, in a mix-in, that can be applied to many other classes. Mix-ins can be composed and layered to minimize repetitive code and maximize reuse.
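As a small illustration of that definition, the sketch below shows a mix-in that adds one method, defines no instance attributes of its own, and needs no __init__ call. ReprMixin, Point, and User are hypothetical names used only here.

```python
class ReprMixin:
    # Hypothetical mix-in: builds a readable repr by inspecting
    # whatever attributes the host instance happens to have.
    def __repr__(self):
        fields = ', '.join(
            f'{name}={value!r}'
            for name, value in sorted(vars(self).items()))
        return f'{type(self).__name__}({fields})'


class Point(ReprMixin):
    def __init__(self, x, y):
        self.x = x
        self.y = y


class User(ReprMixin):
    def __init__(self, name):
        self.name = name


point_text = repr(Point(1, 2))
user_text = repr(User('ada'))
```

The same method body serves two unrelated classes because it relies only on dynamic inspection (vars and type), not on any knowledge of the host class's fields.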
For example, say you want the ability to convert a Python object from its in-memory representation to a dictionary that’s ready for serialization. Why not write this functionality generically so you can use it with all of your classes?

Here, I define an example mix-in that accomplishes this with a new public method that’s added to any class that inherits from it:

class ToDictMixin(object):
    def to_dict(self):
        return self._traverse_dict(self.__dict__)

The implementation details are straightforward and rely on dynamic attribute access using hasattr, dynamic type inspection with isinstance, and accessing the instance dictionary __dict__.

    def _traverse_dict(self, instance_dict):
        output = {}
        for key, value in instance_dict.items():
            output[key] = self._traverse(key, value)
        return output

    def _traverse(self, key, value):
        if isinstance(value, ToDictMixin):
            return value.to_dict()
        elif isinstance(value, dict):
            return self._traverse_dict(value)
        elif isinstance(value, list):
            return [self._traverse(key, i) for i in value]
        elif hasattr(value, '__dict__'):
            return self._traverse_dict(value.__dict__)
        else:
            return value

Here, I define an example class that uses the mix-in to make a dictionary representation of a binary tree:

class BinaryTree(ToDictMixin):
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

Translating a large number of related Python objects into a dictionary becomes easy.

tree = BinaryTree(10,
    left=BinaryTree(7, right=BinaryTree(9)),
    right=BinaryTree(13, left=BinaryTree(11)))
print(tree.to_dict())

>>>
{'left': {'left': None,
          'right': {'left': None, 'right': None, 'value': 9},
          'value': 7},
 'right': {'left': {'left': None, 'right': None, 'value': 11},
           'right': None,
           'value': 13},
 'value': 10}

The best part about mix-ins is that you can make their generic functionality pluggable so behaviors can be overridden when required. For example, here I define a subclass of BinaryTree that holds a reference to its parent. This circular reference would cause the default implementation of ToDictMixin.to_dict to loop forever.
class BinaryTreeWithParent(BinaryTree):
    def __init__(self, value, left=None,
                 right=None, parent=None):
        super().__init__(value, left=left, right=right)
        self.parent = parent

The solution is to override the ToDictMixin._traverse method in the BinaryTreeWithParent class to only process values that matter, preventing cycles encountered by the mix-in. Here, I override the _traverse method to not traverse the parent and just insert its numerical value:

    def _traverse(self, key, value):
        if (isinstance(value, BinaryTreeWithParent) and
                key == 'parent'):
            return value.value  # Prevent cycles
        else:
            return super()._traverse(key, value)

Calling BinaryTreeWithParent.to_dict will work without issue because the circular referencing properties aren’t followed.

root = BinaryTreeWithParent(10)
root.left = BinaryTreeWithParent(7, parent=root)
root.left.right = BinaryTreeWithParent(9, parent=root.left)
print(root.to_dict())

>>>
{'left': {'left': None,
          'parent': 10,
          'right': {'left': None,
                    'parent': 7,
                    'right': None,
                    'value': 9},
          'value': 7},
 'parent': None,
 'right': None,
 'value': 10}

By defining BinaryTreeWithParent._traverse, I’ve also enabled any class that has an attribute of type BinaryTreeWithParent to automatically work with ToDictMixin.

class NamedSubTree(ToDictMixin):
    def __init__(self, name, tree_with_parent):
        self.name = name
        self.tree_with_parent = tree_with_parent

my_tree = NamedSubTree('foobar', root.left.right)
print(my_tree.to_dict())  # No infinite loop

>>>
{'name': 'foobar',
 'tree_with_parent': {'left': None,
                      'parent': 7,
                      'right': None,
                      'value': 9}}

Mix-ins can also be composed together. For example, say you want a mix-in that provides generic JSON serialization for any class. You can do this by assuming that a class provides a to_dict method (which may or may not be provided by the ToDictMixin class).
import json

class JsonMixin(object):
    @classmethod
    def from_json(cls, data):
        kwargs = json.loads(data)
        return cls(**kwargs)

    def to_json(self):
        return json.dumps(self.to_dict())

Note how the JsonMixin class defines both instance methods and class methods. Mix-ins let you add either kind of behavior. In this example, the only requirements of the JsonMixin are that the class has a to_dict method and its __init__ method takes keyword arguments (see Item 19: “Provide Optional Behavior with Keyword Arguments”).

This mix-in makes it simple to create hierarchies of utility classes that can be serialized to and from JSON with little boilerplate. For example, here I have a hierarchy of data classes representing parts of a datacenter topology:

class DatacenterRack(ToDictMixin, JsonMixin):
    def __init__(self, switch=None, machines=None):
        self.switch = Switch(**switch)
        self.machines = [
            Machine(**kwargs) for kwargs in machines]

class Switch(ToDictMixin, JsonMixin):
    # …

class Machine(ToDictMixin, JsonMixin):
    # …

Serializing these classes to and from JSON is simple. Here, I verify that the data is able to be sent round-trip through serializing and deserializing:

serialized = """{
    "switch": {"ports": 5, "speed": 1e9},
    "machines": [
        {"cores": 8, "ram": 32e9, "disk": 5e12},
        {"cores": 4, "ram": 16e9, "disk": 1e12},
        {"cores": 2, "ram": 4e9, "disk": 500e9}
    ]
}"""

deserialized = DatacenterRack.from_json(serialized)
roundtrip = deserialized.to_json()
assert json.loads(serialized) == json.loads(roundtrip)

When you use mix-ins like this, it’s also fine if the class already inherits from JsonMixin higher up in the object hierarchy. The resulting class will behave the same way.

Things to Remember

* Avoid using multiple inheritance if mix-in classes can achieve the same outcome.
* Use pluggable behaviors at the instance level to provide per-class customization when mix-in classes may require it.
* Compose mix-ins to create complex functionality from simple behaviors.

Item 27: Prefer Public Attributes Over Private Ones

In Python, there are only two types of attribute visibility for a class’s attributes: public and private.
class MyObject(object):
    def __init__(self):
        self.public_field = 5
        self.__private_field = 10

    def get_private_field(self):
        return self.__private_field

Public attributes can be accessed by anyone using the dot operator on the object.

foo = MyObject()
assert foo.public_field == 5

Private fields are specified by prefixing an attribute’s name with a double underscore. They can be accessed directly by methods of the containing class.

assert foo.get_private_field() == 10

Directly accessing private fields from outside the class raises an exception.

foo.__private_field

>>>
AttributeError: 'MyObject' object has no attribute '__private_field'

Class methods also have access to private attributes because they are declared within the surrounding class block.

class MyOtherObject(object):
    def __init__(self):
        self.__private_field = 71

    @classmethod
    def get_private_field_of_instance(cls, instance):
        return instance.__private_field

bar = MyOtherObject()
assert MyOtherObject.get_private_field_of_instance(bar) == 71

As you’d expect with private fields, a subclass can’t access its parent class’s private fields.

class MyParentObject(object):
    def __init__(self):
        self.__private_field = 71

class MyChildObject(MyParentObject):
    def get_private_field(self):
        return self.__private_field

baz = MyChildObject()
baz.get_private_field()

>>>
AttributeError: 'MyChildObject' object has no attribute '_MyChildObject__private_field'

The private attribute behavior is implemented with a simple transformation of the attribute name. When the Python compiler sees private attribute access in methods like MyChildObject.get_private_field, it translates __private_field to access _MyChildObject__private_field instead. In this example, __private_field was only defined in MyParentObject.__init__, meaning the private attribute’s real name is _MyParentObject__private_field. Accessing the parent’s private attribute from the child class fails simply because the transformed attribute name doesn’t match.
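Because the renaming happens when the method is compiled, it can be seen in the compiled code itself. The sketch below assumes CPython, where a function's code object lists the attribute names it uses in co_names; Parent, Child, and read_secret are hypothetical names for this illustration.

```python
class Parent:
    def __init__(self):
        self.__secret = 71  # compiled as self._Parent__secret


class Child(Parent):
    def read_secret(self):
        return self.__secret  # compiled as self._Child__secret


# Attribute names referenced by each compiled method (CPython detail).
parent_names = Parent.__init__.__code__.co_names
child_names = Child.read_secret.__code__.co_names

child = Child()
try:
    child.read_secret()
    lookup_failed = False
except AttributeError:
    # The instance only has _Parent__secret, not _Child__secret.
    lookup_failed = True
```

The two methods refer to two different attribute names even though the source text in both reads __secret, which is exactly why the lookup from the child class fails.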
Knowing this scheme, you can easily access the private attributes of any class, from a subclass or externally, without asking for permission.

    assert baz._MyParentObject__private_field == 71

If you look in the object’s attribute dictionary, you’ll see that private attributes are actually stored with the names as they appear after the transformation.

    print(baz.__dict__)

>>>
{'_MyParentObject__private_field': 71}

Why doesn’t the syntax for private attributes actually enforce strict visibility? The simplest answer is one often-quoted motto of Python: “We are all consenting adults here.” Python programmers believe that the benefits of being open outweigh the downsides of being closed.

Beyond that, having the ability to hook language features like attribute access (see Item 32: “Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes”) enables you to mess around with the internals of objects whenever you wish. If you can do that, what is the value of Python trying to prevent private attribute access otherwise?

To minimize the damage of accessing internals unknowingly, Python programmers follow a naming convention defined in the style guide (see Item 2: “Follow the PEP 8 Style Guide”). Fields prefixed by a single underscore (like _protected_field) are protected, meaning external users of the class should proceed with caution.

However, many programmers who are new to Python use private fields to indicate an internal API that shouldn’t be accessed by subclasses or externally.

    class MyClass(object):
        def __init__(self, value):
            self.__value = value

        def get_value(self):
            return str(self.__value)

    foo = MyClass(5)
    assert foo.get_value() == '5'

This is the wrong approach. Inevitably someone, including you, will want to subclass your class to add new behavior or to work around deficiencies in existing methods (like above, how MyClass.get_value always returns a string). By choosing private attributes, you’re only making subclass overrides and extensions cumbersome and brittle. Your potential subclassers will still access the private fields when they absolutely need to do so.
    class MyIntegerSubclass(MyClass):
        def get_value(self):
            return int(self._MyClass__value)

    foo = MyIntegerSubclass(5)
    assert foo.get_value() == 5

But if the class hierarchy changes beneath you, these classes will break because the private references are no longer valid. Here, the MyIntegerSubclass class’s immediate parent, MyClass, has had another parent class added called MyBaseClass:

    class MyBaseClass(object):
        def __init__(self, value):
            self.__value = value
        # ...

    class MyClass(MyBaseClass):
        # ...

    class MyIntegerSubclass(MyClass):
        def get_value(self):
            return int(self._MyClass__value)

The __value attribute is now assigned in the MyBaseClass parent class, not the MyClass parent. That causes the private variable reference self._MyClass__value to break in MyIntegerSubclass.

    foo = MyIntegerSubclass(5)
    foo.get_value()

>>>
AttributeError: 'MyIntegerSubclass' object has no attribute '_MyClass__value'

In general, it’s better to err on the side of allowing subclasses to do more by using protected attributes. Document each protected field and explain which are internal APIs available to subclasses and which should be left alone entirely. This is as much advice to other programmers as it is guidance for your future self on how to extend your own code safely.

    class MyClass(object):
        def __init__(self, value):
            # This stores the user-supplied value for the object.
            # It should be coercible to a string. Once assigned for
            # the object it should be treated as immutable.
            self._value = value

The only time to seriously consider using private attributes is when you’re worried about naming conflicts with subclasses. This problem occurs when a child class unwittingly defines an attribute that was already defined by its parent class.
    class ApiClass(object):
        def __init__(self):
            self._value = 5

        def get(self):
            return self._value

    class Child(ApiClass):
        def __init__(self):
            super().__init__()
            self._value = 'hello'  # Conflicts

    a = Child()
    print(a.get(), 'and', a._value, 'should be different')

>>>
hello and hello should be different

This is primarily a concern with classes that are part of a public API; the subclasses are out of your control, so you can’t refactor to fix the problem. Such a conflict is especially possible with attribute names that are very common (like value). To reduce the risk of this happening, you can use a private attribute in the parent class to ensure that there are no attribute names that overlap with child classes.

    class ApiClass(object):
        def __init__(self):
            self.__value = 5

        def get(self):
            return self.__value

    class Child(ApiClass):
        def __init__(self):
            super().__init__()
            self._value = 'hello'  # OK!

    a = Child()
    print(a.get(), 'and', a._value, 'are different')

>>>
5 and hello are different

Things to Remember

* Private attributes aren’t rigorously enforced by the Python compiler.
* Plan from the beginning to allow subclasses to do more with your internal APIs and attributes instead of locking them out by default.
* Use documentation of protected fields to guide subclasses instead of trying to force access control with private attributes.
* Only consider using private attributes to avoid naming conflicts with subclasses that are out of your control.

Item 28: Inherit from collections.abc for Custom Container Types

Much of programming in Python is defining classes that contain data and describing how such objects relate to each other. Every Python class is a container of some kind, encapsulating attributes and functionality together. Python also provides built-in container types for managing data: lists, tuples, sets, and dictionaries.

When you’re designing classes for simple use cases like sequences, it’s natural that you’d want to subclass Python’s built-in list type directly. For example, say you want to create your own custom list type that has additional methods for counting the frequency of its members.
    class FrequencyList(list):
        def __init__(self, members):
            super().__init__(members)

        def frequency(self):
            counts = {}
            for item in self:
                counts.setdefault(item, 0)
                counts[item] += 1
            return counts

By subclassing list, you get all of list’s standard functionality and preserve the semantics familiar to all Python programmers. Your additional methods can add any custom behaviors you need.

    foo = FrequencyList(['a', 'b', 'a', 'c', 'b', 'a', 'd'])
    print('Length is', len(foo))
    foo.pop()
    print('After pop:', repr(foo))
    print('Frequency:', foo.frequency())

>>>
Length is 7
After pop: ['a', 'b', 'a', 'c', 'b', 'a']
Frequency: {'a': 3, 'c': 1, 'b': 2}

Now, imagine you want to provide an object that feels like a list, allowing indexing, but isn’t a list subclass. For example, say you want to provide sequence semantics (like list or tuple) for a binary tree class.

    class BinaryNode(object):
        def __init__(self, value, left=None, right=None):
            self.value = value
            self.left = left
            self.right = right

How do you make this act like a sequence type? Python implements its container behaviors with instance methods that have special names. When you access a sequence item by index:

    bar = [1, 2, 3]
    bar[0]

it will be interpreted as:

    bar.__getitem__(0)

To make the BinaryNode class act like a sequence, you can provide a custom implementation of __getitem__ that traverses the object tree depth first.

    class IndexableNode(BinaryNode):
        def _search(self, count, index):
            # ...
            # Returns (found, count)

        def __getitem__(self, index):
            found, _ = self._search(0, index)
            if not found:
                raise IndexError('Index out of range')
            return found.value

You can construct your binary tree as usual.

    tree = IndexableNode(
        10,
        left=IndexableNode(
            5,
            left=IndexableNode(2),
            right=IndexableNode(
                6, right=IndexableNode(7))),
        right=IndexableNode(
            15, left=IndexableNode(11)))

But you can also access it like a list in addition to tree traversal.
    print('LRR =', tree.left.right.right.value)
    print('Index 0 =', tree[0])
    print('Index 1 =', tree[1])
    print('11 in the tree?', 11 in tree)
    print('17 in the tree?', 17 in tree)
    print('Tree is', list(tree))

>>>
LRR = 7
Index 0 = 2
Index 1 = 5
11 in the tree? True
17 in the tree? False
Tree is [2, 5, 6, 7, 10, 11, 15]

The problem is that implementing __getitem__ isn’t enough to provide all of the sequence semantics you’d expect.

    len(tree)

>>>
TypeError: object of type 'IndexableNode' has no len()

The len built-in function requires another special method named __len__ that must have an implementation for your custom sequence type.

    class SequenceNode(IndexableNode):
        def __len__(self):
            _, count = self._search(0, None)
            return count

    tree = SequenceNode(
        # ...
    )

    print('Tree has %d nodes' % len(tree))

>>>
Tree has 7 nodes

Unfortunately, this still isn’t enough. Also missing are the count and index methods that a Python programmer would expect to see on a sequence like list or tuple. Defining your own container types is much harder than it looks.

To avoid this difficulty throughout the Python universe, the built-in collections.abc module defines a set of abstract base classes that provide all of the typical methods for each container type. When you subclass from these abstract base classes and forget to implement required methods, the module will tell you something is wrong.

    from collections.abc import Sequence

    class BadType(Sequence):
        pass

    foo = BadType()

>>>
TypeError: Can't instantiate abstract class BadType with abstract methods __getitem__, __len__

When you do implement all of the methods required by an abstract base class, as I did above with SequenceNode, it will provide all of the additional methods like index and count for free.

    class BetterNode(SequenceNode, Sequence):
        pass

    tree = BetterNode(
        # ...
    )

    print('Index of 7 is', tree.index(7))
    print('Count of 10 is', tree.count(10))

>>>
Index of 7 is 3
Count of 10 is 1

The benefit of using these abstract base classes is even greater for more complex types like Set and MutableMapping, which have a large number of special methods that need to be implemented to match Python conventions.
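Because the tree example elides its _search method, here is a smaller, self-contained sketch of the same idea. The Countdown class is hypothetical and not part of the original example; it only shows that implementing __getitem__ and __len__ on a Sequence subclass is enough to inherit the remaining sequence methods.

```python
from collections.abc import Sequence


class Countdown(Sequence):
    """Hypothetical read-only sequence: start, start - 1, ..., 1."""

    def __init__(self, start):
        self._start = start

    def __len__(self):
        return self._start

    def __getitem__(self, index):
        # Slices are deliberately unsupported to keep the sketch short
        if not isinstance(index, int):
            raise TypeError('Index must be an integer')
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError('Index out of range')
        return self._start - index


numbers = Countdown(5)

# None of the methods used below are defined on Countdown itself;
# they are mixin methods inherited from Sequence.
as_list = list(numbers)
position = numbers.index(2)
occurrences = numbers.count(4)
contains_three = 3 in numbers
backwards = list(reversed(numbers))

print(as_list, position, occurrences, contains_three, backwards)
```

The inherited methods are built entirely on __getitem__ and __len__, so they are only as fast as those two methods allow; a container with slow indexing may still want to override some of them.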
Things to Remember

* Inherit directly from Python’s container types (like list or dict) for simple use cases.
* Beware of the large number of methods required to implement custom container types correctly.
* Have your custom container types inherit from the interfaces defined in collections.abc to ensure that your classes match required interfaces and behaviors.

4. Metaclasses and Attributes

Metaclasses are often mentioned in lists of Python’s features, but few understand what they accomplish in practice. The name metaclass vaguely implies a concept above and beyond a class. Simply put, metaclasses let you intercept Python’s class statement and provide special behavior each time a class is defined.

Similarly mysterious and powerful are Python’s built-in features for dynamically customizing attribute accesses. Along with Python’s object-oriented constructs, these facilities provide wonderful tools to ease the transition from simple classes to complex ones.

However, with these powers come many pitfalls. Dynamic attributes enable you to override objects and cause unexpected side effects. Metaclasses can create extremely bizarre behaviors that are unapproachable to newcomers. It’s important that you follow the rule of least surprise and only use these mechanisms to implement well-understood idioms.

Item 29: Use Plain Attributes Instead of Get and Set Methods

Programmers coming to Python from other languages may naturally try to implement explicit getter and setter methods in their classes.

    class OldResistor(object):
        def __init__(self, ohms):
            self._ohms = ohms

        def get_ohms(self):
            return self._ohms

        def set_ohms(self, ohms):
            self._ohms = ohms

Using these setters and getters is simple, but it’s not Pythonic.

    r0 = OldResistor(50e3)
    print('Before: %5r' % r0.get_ohms())
    r0.set_ohms(10e3)
    print('After: %5r' % r0.get_ohms())

>>>
Before: 50000.0
After: 10000.0

Such methods are especially clumsy for operations like incrementing in place.
    r0.set_ohms(r0.get_ohms() + 5e3)

These utility methods do help define the interface for your class, making it easier to encapsulate functionality, validate usage, and define boundaries. Those are important goals when designing a class to ensure you don’t break callers as your class evolves over time.

In Python, however, you almost never need to implement explicit setter or getter methods. Instead, you should always start your implementations with simple public attributes.

    class Resistor(object):
        def __init__(self, ohms):
            self.ohms = ohms
            self.voltage = 0
            self.current = 0

    r1 = Resistor(50e3)
    r1.ohms = 10e3

These make operations like incrementing in place natural and clear.

    r1.ohms += 5e3

Later, if you decide you need special behavior when an attribute is set, you can migrate to the @property decorator and its corresponding setter attribute. Here, I define a new subclass of Resistor that lets me vary the current by assigning the voltage property. Note that in order to work properly the name of both the setter and getter methods must match the intended property name.

    class VoltageResistance(Resistor):
        def __init__(self, ohms):
            super().__init__(ohms)
            self._voltage = 0

        @property
        def voltage(self):
            return self._voltage

        @voltage.setter
        def voltage(self, voltage):
            self._voltage = voltage
            self.current = self._voltage / self.ohms

Now, assigning the voltage property will run the voltage setter method, updating the current property of the object to match.

    r2 = VoltageResistance(1e3)
    print('Before: %5r amps' % r2.current)
    r2.voltage = 10
    print('After: %5r amps' % r2.current)

>>>
Before:     0 amps
After:  0.01 amps

Specifying a setter on a property also lets you perform type checking and validation on values passed to your class. Here, I define a class that ensures all resistance values are above zero ohms:

    class BoundedResistance(Resistor):
        def __init__(self, ohms):
            super().__init__(ohms)

        @property
        def ohms(self):
            return self._ohms

        @ohms.setter
        def ohms(self, ohms):
            if ohms <= 0:
                raise ValueError('%f ohms must be > 0' % ohms)
            self._ohms = ohms

Assigning an invalid resistance to the attribute raises an exception.
    r3 = BoundedResistance(1e3)
    r3.ohms = 0

>>>
ValueError: 0.000000 ohms must be > 0

An exception will also be raised if you pass an invalid value to the constructor.

    BoundedResistance(-5)

>>>
ValueError: -5.000000 ohms must be > 0

This happens because BoundedResistance.__init__ calls Resistor.__init__, which assigns self.ohms = -5. That assignment causes the @ohms.setter method from BoundedResistance to be called, immediately running the validation code before object construction has completed.

You can even use @property to make attributes from parent classes immutable.

    class FixedResistance(Resistor):
        # ...
        @property
        def ohms(self):
            return self._ohms

        @ohms.setter
        def ohms(self, ohms):
            if hasattr(self, '_ohms'):
                raise AttributeError("Can't set attribute")
            self._ohms = ohms

Trying to assign to the property after construction raises an exception.

    r4 = FixedResistance(1e3)
    r4.ohms = 2e3

>>>
AttributeError: Can't set attribute

The biggest shortcoming of @property is that the methods for an attribute can only be shared by subclasses. Unrelated classes can’t share the same implementation. However, Python also supports descriptors (see Item 31: “Use Descriptors for Reusable @property Methods”) that enable reusable property logic and many other use cases.

Finally, when you use @property methods to implement setters and getters, be sure that the behavior you implement is not surprising. For example, don’t set other attributes in getter property methods.

    class MysteriousResistor(Resistor):
        @property
        def ohms(self):
            self.voltage = self._ohms * self.current
            return self._ohms
        # ...

This leads to extremely bizarre behavior.

    r7 = MysteriousResistor(10)
    r7.current = 0.01
    print('Before: %5r' % r7.voltage)
    r7.ohms
    print('After: %5r' % r7.voltage)

>>>
Before:     0
After:   0.1

The best policy is to only modify related object state in @property.setter methods.
Be sure to avoid any other side effects the caller may not expect beyond the object, such as importing modules dynamically, running slow helper functions, or making expensive database queries. Users of your class will expect its attributes to be like any other Python object: quick and easy. Use normal methods to do anything more complex or slow.

Things to Remember

* Define new class interfaces using simple public attributes, and avoid set and get methods.
* Use @property to define special behavior when attributes are accessed on your objects, if necessary.
* Follow the rule of least surprise and avoid weird side effects in your @property methods.
* Ensure that @property methods are fast; do slow or complex work using normal methods.

Item 30: Consider @property Instead of Refactoring Attributes

The built-in @property decorator makes it easy for simple accesses of an instance’s attributes to act smarter (see Item 29: “Use Plain Attributes Instead of Get and Set Methods”). One advanced but common use of @property is transitioning what was once a simple numerical attribute into an on-the-fly calculation. This is extremely helpful because it lets you migrate all existing usage of a class to have new behaviors without rewriting any of the call sites. It also provides an important stopgap for improving your interfaces over time.

For example, say you want to implement a leaky bucket quota using plain Python objects. Here, the Bucket class represents how much quota remains and the duration for which the quota will be available:

    from datetime import datetime, timedelta

    class Bucket(object):
        def __init__(self, period):
            self.period_delta = timedelta(seconds=period)
            self.reset_time = datetime.now()
            self.quota = 0

        def __repr__(self):
            return 'Bucket(quota=%d)' % self.quota

The leaky bucket algorithm works by ensuring that, whenever the bucket is filled, the amount of quota does not carry over from one period to the next.

    def fill(bucket, amount):
        now = datetime.now()
        if now - bucket.reset_time > bucket.period_delta:
            bucket.quota = 0
            bucket.reset_time = now
        bucket.quota += amount

Each time a quota consumer wants to do something, it first must ensure that it can deduct the amount of quota it needs to use.
    def deduct(bucket, amount):
        now = datetime.now()
        if now - bucket.reset_time > bucket.period_delta:
            return False
        if bucket.quota - amount < 0:
            return False
        bucket.quota -= amount
        return True

To use this class, first I fill the bucket.

    bucket = Bucket(60)
    fill(bucket, 100)
    print(bucket)

>>>
Bucket(quota=100)

Then, I deduct the quota that I need.

    if deduct(bucket, 99):
        print('Had 99 quota')
    else:
        print('Not enough for 99 quota')
    print(bucket)

>>>
Had 99 quota
Bucket(quota=1)

Eventually, I’m prevented from making progress because I try to deduct more quota than is available. In this case, the bucket’s quota level remains unchanged.

    if deduct(bucket, 3):
        print('Had 3 quota')
    else:
        print('Not enough for 3 quota')
    print(bucket)

>>>
Not enough for 3 quota
Bucket(quota=1)

The problem with this implementation is that I never know what quota level the bucket started with. The quota is deducted over the course of the period until it reaches zero. At that point, deduct will always return False. When that happens, it would be useful to know whether callers to deduct are being blocked because the Bucket ran out of quota or because the Bucket never had quota in the first place.

To fix this, I can change the class to keep track of the max_quota issued in the period and the quota_consumed in the period.

    class Bucket(object):
        def __init__(self, period):
            self.period_delta = timedelta(seconds=period)
            self.reset_time = datetime.now()
            self.max_quota = 0
            self.quota_consumed = 0

        def __repr__(self):
            return ('Bucket(max_quota=%d, quota_consumed=%d)' %
                    (self.max_quota, self.quota_consumed))

I use a @property method to compute the current level of quota on-the-fly using these new attributes.

        @property
        def quota(self):
            return self.max_quota - self.quota_consumed

When the quota attribute is assigned, I take special action matching the current interface of the class used by fill and deduct.
        @quota.setter
        def quota(self, amount):
            delta = self.max_quota - amount
            if amount == 0:
                # Quota being reset for a new period
                self.quota_consumed = 0
                self.max_quota = 0
            elif delta < 0:
                # Quota being filled for the new period
                assert self.quota_consumed == 0
                self.max_quota = amount
            else:
                # Quota being consumed during the period.
                # delta is the total consumed so far (max_quota minus
                # the remaining amount), so it is assigned, not added.
                assert self.max_quota >= self.quota_consumed
                self.quota_consumed = delta

Rerunning the demo code from above produces the same results.

    bucket = Bucket(60)
    print('Initial', bucket)
    fill(bucket, 100)
    print('Filled', bucket)

    if deduct(bucket, 99):
        print('Had 99 quota')
    else:
        print('Not enough for 99 quota')

    print('Now', bucket)

    if deduct(bucket, 3):
        print('Had 3 quota')
    else:
        print('Not enough for 3 quota')

    print('Still', bucket)

>>>
Initial Bucket(max_quota=0, quota_consumed=0)
Filled Bucket(max_quota=100, quota_consumed=0)
Had 99 quota
Now Bucket(max_quota=100, quota_consumed=99)
Not enough for 3 quota
Still Bucket(max_quota=100, quota_consumed=99)

The best part is that the code using Bucket.quota doesn’t have to change or know that the class has changed. New usage of Bucket can do the right thing and access max_quota and quota_consumed directly.

I especially like @property because it lets you make incremental progress toward a better data model over time. Reading the Bucket example above, you may have thought to yourself, “fill and deduct should have been implemented as instance methods in the first place.” Although you’re probably right (see Item 22: “Prefer Helper Classes Over Bookkeeping with Dictionaries and Tuples”), in practice there are many situations in which objects start with poorly defined interfaces or act as dumb data containers. This happens when code grows over time, scope increases, multiple authors contribute without anyone considering long-term hygiene, etc.

@property is a tool to help you address problems you’ll come across in real-world code. Don’t overuse it. When you find yourself repeatedly extending @property methods, it’s probably time to refactor your class instead of further paving over your code’s poor design.

Things to Remember

* Use @property to give existing instance attributes new functionality.
* Make incremental progress toward better data models by using @property.
* Consider refactoring a class and all call sites when you find yourself using @property too heavily.

Item 31: Use Descriptors for Reusable @property Methods

The big problem with the @property built-in (see Item 29: “Use Plain Attributes Instead of Get and Set Methods” and Item 30: “Consider @property Instead of Refactoring Attributes”) is reuse. The methods it decorates can’t be reused for multiple attributes of the same class. They also can’t be reused by unrelated classes.

For example, say you want a class to validate that the grade received by a student on a homework assignment is a percentage.

    class Homework(object):
        def __init__(self):
            self._grade = 0

        @property
        def grade(self):
            return self._grade

        @grade.setter
        def grade(self, value):
            if not (0 <= value <= 100):
                raise ValueError('Grade must be between 0 and 100')
            self._grade = value

Using an @property makes this class easy to use.

    galileo = Homework()
    galileo.grade = 95

Say you also want to give the student a grade for an exam, where the exam has multiple subjects, each with a separate grade.

    class Exam(object):
        def __init__(self):
            self._writing_grade = 0
            self._math_grade = 0

        @staticmethod
        def _check_grade(value):
            if not (0 <= value <= 100):
                raise ValueError('Grade must be between 0 and 100')

This quickly gets tedious. Each section of the exam requires adding a new @property and related validation.

        @property
        def writing_grade(self):
            return self._writing_grade

        @writing_grade.setter
        def writing_grade(self, value):
            self._check_grade(value)
            self._writing_grade = value

        @property
        def math_grade(self):
            return self._math_grade

        @math_grade.setter
        def math_grade(self, value):
            self._check_grade(value)
            self._math_grade = value

Also, this approach is not general. If you want to reuse this percentage validation beyond homework and exams, you’d need to write the @property boilerplate and _check_grade repeatedly.
The better way to do this in Python is to use a descriptor. The descriptor protocol defines how attribute access is interpreted by the language. A descriptor class can provide __get__ and __set__ methods that let you reuse the grade validation behavior without any boilerplate. For this purpose, descriptors are also better than mix-ins (see Item 26: “Use Multiple Inheritance Only for Mix-in Utility Classes”) because they let you reuse the same logic for many different attributes in a single class.

Here, I define a new class called Exam with class attributes that are Grade instances. The Grade class implements the descriptor protocol. Before I explain how the Grade class works, it’s important to understand what Python will do when your code accesses such descriptor attributes on an Exam instance.

    class Grade(object):
        def __get__(*args, **kwargs):
            # ...

        def __set__(*args, **kwargs):
            # ...

    class Exam(object):
        # Class attributes
        math_grade = Grade()
        writing_grade = Grade()
        science_grade = Grade()

When you assign a property:

    exam = Exam()
    exam.writing_grade = 40

it will be interpreted as:

    Exam.__dict__['writing_grade'].__set__(exam, 40)

When you retrieve a property:

    print(exam.writing_grade)

it will be interpreted as:

    print(Exam.__dict__['writing_grade'].__get__(exam, Exam))

What drives this behavior is the __getattribute__ method of object (see Item 32: “Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes”). In short, when an Exam instance doesn’t have an attribute named writing_grade, Python will fall back to the Exam class’s attribute instead. If this class attribute is an object that has __get__ and __set__ methods, Python will assume you want to follow the descriptor protocol.

Knowing this behavior and how I used @property for grade validation in the Homework class, here’s a reasonable first attempt at implementing the Grade descriptor.
    class Grade(object):
        def __init__(self):
            self._value = 0

        def __get__(self, instance, instance_type):
            return self._value

        def __set__(self, instance, value):
            if not (0 <= value <= 100):
                raise ValueError('Grade must be between 0 and 100')
            self._value = value

Unfortunately, this is wrong and will result in broken behavior. Accessing multiple attributes on a single Exam instance works as expected.

    first_exam = Exam()
    first_exam.writing_grade = 82
    first_exam.science_grade = 99
    print('Writing', first_exam.writing_grade)
    print('Science', first_exam.science_grade)

>>>
Writing 82
Science 99

But accessing these attributes on multiple Exam instances will have unexpected behavior.

    second_exam = Exam()
    second_exam.writing_grade = 75
    print('Second', second_exam.writing_grade, 'is right')
    print('First ', first_exam.writing_grade, 'is wrong')

>>>
Second 75 is right
First  75 is wrong

The problem is that a single Grade instance is shared across all Exam instances for the class attribute writing_grade. The Grade instance for this attribute is constructed once in the program lifetime when the Exam class is first defined, not each time an Exam instance is created.

To solve this, I need the Grade class to keep track of its value for each unique Exam instance. I can do this by saving the per-instance state in a dictionary.

    class Grade(object):
        def __init__(self):
            self._values = {}

        def __get__(self, instance, instance_type):
            if instance is None:
                return self
            return self._values.get(instance, 0)

        def __set__(self, instance, value):
            if not (0 <= value <= 100):
                raise ValueError('Grade must be between 0 and 100')
            self._values[instance] = value

This implementation is simple and works well, but there’s still one gotcha: It leaks memory. The _values dictionary will hold a reference to every instance of Exam ever passed to __set__ over the lifetime of the program. This causes instances to never have their reference count go to zero, preventing cleanup by the garbage collector.
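The leak can be made visible with a short experiment. This is only a sketch: LeakyGrade and LeakyExam are hypothetical copies of the dictionary-based descriptor above, and weakref.ref is used purely as a probe that observes an object without keeping it alive.

```python
import gc
import weakref


class LeakyGrade(object):
    """Hypothetical copy of the dictionary-based descriptor."""

    def __init__(self):
        self._values = {}

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return self._values.get(instance, 0)

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        # The instance becomes a dictionary key, which is a strong reference
        self._values[instance] = value


class LeakyExam(object):
    writing_grade = LeakyGrade()


exam = LeakyExam()
exam.writing_grade = 82
probe = weakref.ref(exam)  # Does not keep the exam alive by itself

del exam      # Drop the only reference the caller holds
gc.collect()  # Give the collector a chance to reclaim it

# The descriptor's dictionary still references the instance
still_alive = probe() is not None
tracked = len(LeakyExam.__dict__['writing_grade']._values)
print(still_alive, tracked)
```

Even though the calling code no longer holds the exam, the probe still resolves to a live object, because the descriptor's dictionary is keeping it reachable.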
To fix this, I can use Python’s weakref built-in module. This module provides a special class called WeakKeyDictionary that can take the place of the simple dictionary used for _values. The unique behavior of WeakKeyDictionary is that it will remove Exam instances from its set of keys when the runtime knows it’s holding the instance’s last remaining reference in the program. Python will do the bookkeeping for you and ensure that the _values dictionary will be empty when all Exam instances are no longer in use.

    from weakref import WeakKeyDictionary

    class Grade(object):
        def __init__(self):
            self._values = WeakKeyDictionary()
        # ...

Using this implementation of the Grade descriptor, everything works as expected.

    class Exam(object):
        math_grade = Grade()
        writing_grade = Grade()
        science_grade = Grade()

    first_exam = Exam()
    first_exam.writing_grade = 82
    second_exam = Exam()
    second_exam.writing_grade = 75
    print('First ', first_exam.writing_grade, 'is right')
    print('Second', second_exam.writing_grade, 'is right')

>>>
First  82 is right
Second 75 is right

Things to Remember

* Reuse the behavior and validation of @property methods by defining your own descriptor classes.
* Use WeakKeyDictionary to ensure that your descriptor classes don’t cause memory leaks.
* Don’t get bogged down trying to understand exactly how __getattribute__ uses the descriptor protocol for getting and setting attributes.

Item 32: Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes

Python’s language hooks make it easy to write generic code for gluing systems together. For example, say you want to represent the rows of your database as Python objects. Your database has its schema set. Your code that uses objects corresponding to those rows must also know what your database looks like. However, in Python, the code that connects your Python objects to the database doesn’t need to know the schema of your rows; it can be generic.
How is that possible? Plain instance attributes, @property methods, and descriptors can’t do this because they all need to be defined in advance. Python makes this dynamic behavior possible with the __getattr__ special method. If your class defines __getattr__, that method is called every time an attribute can’t be found in an object’s instance dictionary.

    class LazyDB(object):
        def __init__(self):
            self.exists = 5

        def __getattr__(self, name):
            value = 'Value for %s' % name
            setattr(self, name, value)
            return value

Here, I access the missing property foo. This causes Python to call the __getattr__ method above, which mutates the instance dictionary __dict__:

    data = LazyDB()
    print('Before:', data.__dict__)
    print('foo:', data.foo)
    print('After: ', data.__dict__)

>>>
Before: {'exists': 5}
foo: Value for foo
After:  {'exists': 5, 'foo': 'Value for foo'}

Here, I add logging to LazyDB to show when __getattr__ is actually called. Note that I use super().__getattr__() to get the real property value in order to avoid infinite recursion.

    class LoggingLazyDB(LazyDB):
        def __getattr__(self, name):
            print('Called __getattr__(%s)' % name)
            return super().__getattr__(name)

    data = LoggingLazyDB()
    print('exists:', data.exists)
    print('foo:', data.foo)
    print('foo:', data.foo)

>>>
exists: 5
Called __getattr__(foo)
foo: Value for foo
foo: Value for foo

The exists attribute is present in the instance dictionary, so __getattr__ is never called for it. The foo attribute is not in the instance dictionary initially, so __getattr__ is called the first time. But the call to __getattr__ for foo also does a setattr, which populates foo in the instance dictionary. This is why the second time I access foo there isn’t a call to __getattr__.

This behavior is especially helpful for use cases like lazily accessing schemaless data. __getattr__ runs once to do the hard work of loading a property; all subsequent accesses retrieve the existing result.
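As a rough sketch of that load-once pattern, the class below caches whatever a loader function returns. LazyRecord, its loader argument, load_count, and the fake_table dictionary are all hypothetical stand-ins; a real loader would presumably issue a database query.

```python
class LazyRecord(object):
    """Hypothetical row object that loads each field at most once."""

    def __init__(self, loader):
        self._loader = loader  # Callable taking a field name
        self.load_count = 0    # How many times the loader actually ran

    def __getattr__(self, name):
        # Only reached when normal attribute lookup has already failed.
        # Refusing private names avoids recursing if _loader is missing.
        if name.startswith('_'):
            raise AttributeError(name)
        try:
            value = self._loader(name)
        except KeyError:
            # Report unknown fields the way Python reports missing attributes
            raise AttributeError('%s is missing' % name)
        self.load_count += 1
        setattr(self, name, value)  # Cache in the instance dictionary
        return value


# A dictionary stands in for the database in this sketch
fake_table = {'name': 'rack-7', 'ports': 48}
record = LazyRecord(fake_table.__getitem__)

first = record.ports   # Triggers __getattr__ and the loader
second = record.ports  # Served from the instance dictionary
has_missing = hasattr(record, 'no_such_field')

print(first, second, record.load_count, has_missing)
```

Translating the loader's KeyError into AttributeError keeps hasattr and getattr with a default working as callers expect.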
Say you also want transactions in this database system. The next time the user accesses a property, you want to know whether the corresponding row in the database is still valid and whether the transaction is still open. The __getattr__ hook won’t let you do this reliably because it will use the object’s instance dictionary as the fast path for existing attributes.

To enable this use case, Python has another language hook called __getattribute__. This special method is called every time an attribute is accessed on an object, even in cases where it does exist in the attribute dictionary. This enables you to do things like check global transaction state on every property access. Here, I define ValidatingDB to log each time __getattribute__ is called:

    class ValidatingDB(object):
        def __init__(self):
            self.exists = 5

        def __getattribute__(self, name):
            print('Called __getattribute__(%s)' % name)
            try:
                return super().__getattribute__(name)
            except AttributeError:
                value = 'Value for %s' % name
                setattr(self, name, value)
                return value

    data = ValidatingDB()
    print('exists:', data.exists)
    print('foo:', data.foo)
    print('foo:', data.foo)

>>>
Called __getattribute__(exists)
exists: 5
Called __getattribute__(foo)
foo: Value for foo
Called __getattribute__(foo)
foo: Value for foo

In the event that a dynamically accessed property shouldn’t exist, you can raise an AttributeError to cause Python’s standard missing property behavior for both __getattr__ and __getattribute__.

    class MissingPropertyDB(object):
        def __getattr__(self, name):
            if name == 'bad_name':
                raise AttributeError('%s is missing' % name)
            # ...

    data = MissingPropertyDB()
    data.bad_name

>>>
AttributeError: bad_name is missing

Python code implementing generic functionality often relies on the hasattr built-in function to determine when properties exist, and the getattr built-in function to retrieve property values. These functions also look in the instance dictionary for an attribute name before calling __getattr__.
    data = LoggingLazyDB()
    print('Before:', data.__dict__)
    print('foo exists: ', hasattr(data, 'foo'))
    print('After:', data.__dict__)
    print('foo exists: ', hasattr(data, 'foo'))

>>>
Before: {'exists': 5}
Called __getattr__(foo)
foo exists:  True
After: {'exists': 5, 'foo': 'Value for foo'}
foo exists:  True

In the example above, __getattr__ is only called once. In contrast, classes that implement __getattribute__ will have that method called each time hasattr or getattr is run on an object.

    data = ValidatingDB()
    print('foo exists: ', hasattr(data, 'foo'))
    print('foo exists: ', hasattr(data, 'foo'))

>>>
Called __getattribute__(foo)
foo exists:  True
Called __getattribute__(foo)
foo exists:  True

Now, say you want to lazily push data back to the database when values are assigned to your Python object. You can do this with __setattr__, a similar language hook that lets you intercept arbitrary attribute assignments. Unlike retrieving an attribute with __getattr__ and __getattribute__, there’s no need for two separate methods. The __setattr__ method is always called every time an attribute is assigned on an instance (either directly or through the setattr built-in function).

    class SavingDB(object):
        def __setattr__(self, name, value):
            # Save some data to the DB log
            # ...
            super().__setattr__(name, value)

Here, I define a logging subclass of SavingDB. Its __setattr__ method is always called on each attribute assignment:

    class LoggingSavingDB(SavingDB):
        def __setattr__(self, name, value):
            print('Called __setattr__(%s, %r)' % (name, value))
            super().__setattr__(name, value)

    data = LoggingSavingDB()
    print('Before: ', data.__dict__)
    data.foo = 5
    print('After:', data.__dict__)
    data.foo = 7
    print('Finally:', data.__dict__)

>>>
Before:  {}
Called __setattr__(foo, 5)
After: {'foo': 5}
Called __setattr__(foo, 7)
Finally: {'foo': 7}

The problem with __getattribute__ and __setattr__ is that they’re called on every attribute access for an object, even when you may not want that to happen. For example, say you want attribute accesses on your object to actually look up keys in an associated dictionary.
    class BrokenDictionaryDB(object):
        def __init__(self, data):
            self._data = data

        def __getattribute__(self, name):
            print('Called __getattribute__(%s)' % name)
            return self._data[name]

This requires accessing self._data from the __getattribute__ method. However, if you actually try to do that, Python will recurse until it reaches its stack limit, and then it’ll die.

    data = BrokenDictionaryDB({'foo': 3})
    data.foo

    >>>
    Called __getattribute__(foo)
    Called __getattribute__(_data)
    Called __getattribute__(_data)
    …
    Traceback …
    RuntimeError: maximum recursion depth exceeded

The problem is that __getattribute__ accesses self._data, which causes __getattribute__ to run again, which accesses self._data again, and so on. The solution is to use the super().__getattribute__ method on your instance to fetch values from the instance attribute dictionary. This avoids the recursion.

    class DictionaryDB(object):
        def __init__(self, data):
            self._data = data

        def __getattribute__(self, name):
            data_dict = super().__getattribute__('_data')
            return data_dict[name]

Similarly, you’ll need __setattr__ methods that modify attributes on an object to use super().__setattr__.

Things to Remember
* Use __getattr__ and __setattr__ to lazily load and save attributes for an object.
* Understand that __getattr__ only gets called once when accessing a missing attribute, whereas __getattribute__ gets called every time an attribute is accessed.
* Avoid infinite recursion in __getattribute__ and __setattr__ by using methods from super() (i.e., the object class) to access instance attributes directly.

Item 33: Validate Subclasses with Metaclasses

One of the simplest applications of metaclasses is verifying that a class was defined correctly. When you’re building a complex class hierarchy, you may want to enforce style, require overriding methods, or have strict relationships between class attributes. Metaclasses enable these use cases by providing a reliable way to run your validation code each time a new subclass is defined.
Often a class’s validation code runs in the __init__ method, when an object of the class’s type is constructed (see Item 28: “Inherit from collections.abc for Custom Container Types” for an example). Using metaclasses for validation can raise errors much earlier.

Before I get into how to define a metaclass for validating subclasses, it’s important to understand the metaclass action for standard objects. A metaclass is defined by inheriting from type. In the default case, a metaclass receives the contents of associated class statements in its __new__ method. Here, you can modify the class information before the type is actually constructed:

    class Meta(type):
        def __new__(meta, name, bases, class_dict):
            print((meta, name, bases, class_dict))
            return type.__new__(meta, name, bases, class_dict)

    class MyClass(object, metaclass=Meta):
        stuff = 123

        def foo(self):
            pass

The metaclass has access to the name of the class, the parent classes it inherits from, and all of the class attributes that were defined in the class’s body.

    >>>
    (<class '__main__.Meta'>,
     'MyClass',
     (<class 'object'>,),
     {'__module__': '__main__',
      '__qualname__': 'MyClass',
      'foo': <function MyClass.foo at 0x102c7dd08>,
      'stuff': 123})

Python 2 has slightly different syntax and specifies a metaclass using the __metaclass__ class attribute. The Meta.__new__ interface is the same.

    # Python 2
    class Meta(type):
        def __new__(meta, name, bases, class_dict):
            # …

    class MyClassInPython2(object):
        __metaclass__ = Meta
        # …

You can add functionality to the Meta.__new__ method in order to validate all of the parameters of a class before it’s defined. For example, say you want to represent any type of multisided polygon. You can do this by defining a special validating metaclass and using it in the base class of your polygon class hierarchy. Note that it’s important not to apply the same validation to the base class.
    class ValidatePolygon(type):
        def __new__(meta, name, bases, class_dict):
            # Don't validate the abstract Polygon class
            if bases != (object,):
                if class_dict['sides'] < 3:
                    raise ValueError('Polygons need 3+ sides')
            return type.__new__(meta, name, bases, class_dict)

    class Polygon(object, metaclass=ValidatePolygon):
        sides = None  # Specified by subclasses

        @classmethod
        def interior_angles(cls):
            return (cls.sides - 2) * 180

    class Triangle(Polygon):
        sides = 3

If you try to define a polygon with fewer than three sides, the validation will cause the class statement to fail immediately after the class statement body. This means your program will not even be able to start running when you define such a class.

    print('Before class')
    class Line(Polygon):
        print('Before sides')
        sides = 1
        print('After sides')
    print('After class')

    >>>
    Before class
    Before sides
    After sides
    Traceback …
    ValueError: Polygons need 3+ sides

Things to Remember
* Use metaclasses to ensure that subclasses are well formed at the time they are defined, before objects of their type are constructed.
* Metaclasses have slightly different syntax in Python 2 vs. Python 3.
* The __new__ method of metaclasses is run after the class statement’s entire body has been processed.

Item 34: Register Class Existence with Metaclasses

Another common use of metaclasses is to automatically register types in your program. Registration is useful for doing reverse lookups, where you need to map a simple identifier back to a corresponding class.

For example, say you want to implement your own serialized representation of a Python object using JSON. You need a way to take an object and turn it into a JSON string. Here, I do this generically by defining a base class that records the constructor parameters and turns them into a JSON dictionary:

    import json

    class Serializable(object):
        def __init__(self, *args):
            self.args = args

        def serialize(self):
            return json.dumps({'args': self.args})

This class makes it easy to serialize simple, immutable data structures like Point2D to a string.
    class Point2D(Serializable):
        def __init__(self, x, y):
            super().__init__(x, y)
            self.x = x
            self.y = y

        def __repr__(self):
            return 'Point2D(%d, %d)' % (self.x, self.y)

    point = Point2D(5, 3)
    print('Object:', point)
    print('Serialized:', point.serialize())

    >>>
    Object: Point2D(5, 3)
    Serialized: {"args": [5, 3]}

Now, I need to deserialize this JSON string and construct the Point2D object it represents. Here, I define another class that can deserialize the data from its Serializable parent class:

    class Deserializable(Serializable):
        @classmethod
        def deserialize(cls, json_data):
            params = json.loads(json_data)
            return cls(*params['args'])

Using Deserializable makes it easy to serialize and deserialize simple, immutable objects in a generic way.

    class BetterPoint2D(Deserializable):
        # …

    point = BetterPoint2D(5, 3)
    print('Before:', point)
    data = point.serialize()
    print('Serialized:', data)
    after = BetterPoint2D.deserialize(data)
    print('After:', after)

    >>>
    Before: BetterPoint2D(5, 3)
    Serialized: {"args": [5, 3]}
    After: BetterPoint2D(5, 3)

The problem with this approach is that it only works if you know the intended type of the serialized data ahead of time (e.g., Point2D, BetterPoint2D). Ideally, you’d have a large number of classes serializing to JSON and one common function that could deserialize any of them back to a corresponding Python object.

To do this, I can include the serialized object’s class name in the JSON data.

    class BetterSerializable(object):
        def __init__(self, *args):
            self.args = args

        def serialize(self):
            return json.dumps({
                'class': self.__class__.__name__,
                'args': self.args,
            })

        def __repr__(self):
            # …

Then, I can maintain a mapping of class names back to constructors for those objects. The general deserialize function will work for any classes passed to register_class.
    registry = {}

    def register_class(target_class):
        registry[target_class.__name__] = target_class

    def deserialize(data):
        params = json.loads(data)
        name = params['class']
        target_class = registry[name]
        return target_class(*params['args'])

To ensure that deserialize always works properly, I must call register_class for every class I may want to deserialize in the future.

    class EvenBetterPoint2D(BetterSerializable):
        def __init__(self, x, y):
            super().__init__(x, y)
            self.x = x
            self.y = y

    register_class(EvenBetterPoint2D)

Now, I can deserialize an arbitrary JSON string without having to know which class it contains.

    point = EvenBetterPoint2D(5, 3)
    print('Before:', point)
    data = point.serialize()
    print('Serialized:', data)
    after = deserialize(data)
    print('After:', after)

    >>>
    Before: EvenBetterPoint2D(5, 3)
    Serialized: {"class": "EvenBetterPoint2D", "args": [5, 3]}
    After: EvenBetterPoint2D(5, 3)

The problem with this approach is that you can forget to call register_class.

    class Point3D(BetterSerializable):
        def __init__(self, x, y, z):
            super().__init__(x, y, z)
            self.x = x
            self.y = y
            self.z = z

    # Forgot to call register_class! Whoops!

This will cause your code to break at runtime, when you finally try to deserialize an object of a class you forgot to register.

    point = Point3D(5, 9, -4)
    data = point.serialize()
    deserialize(data)

    >>>
    KeyError: 'Point3D'

Even though you chose to subclass BetterSerializable, you won’t actually get all of its features if you forget to call register_class after your class statement body. This approach is error prone and especially challenging for beginners. The same omission can happen with class decorators in Python 3.

What if you could somehow act on the programmer’s intent to use BetterSerializable and ensure that register_class is called in all cases? Metaclasses enable this by intercepting the class statement when subclasses are defined (see Item 33: “Validate Subclasses with Metaclasses”). This lets you register the new type immediately after the class’s body.
    class Meta(type):
        def __new__(meta, name, bases, class_dict):
            cls = type.__new__(meta, name, bases, class_dict)
            register_class(cls)
            return cls

    class RegisteredSerializable(BetterSerializable,
                                 metaclass=Meta):
        pass

When I define a subclass of RegisteredSerializable, I can be confident that the call to register_class happened and deserialize will always work as expected.

    class Vector3D(RegisteredSerializable):
        def __init__(self, x, y, z):
            super().__init__(x, y, z)
            self.x, self.y, self.z = x, y, z

    v3 = Vector3D(10, -7, 3)
    print('Before:', v3)
    data = v3.serialize()
    print('Serialized:', data)
    print('After:', deserialize(data))

    >>>
    Before: Vector3D(10, -7, 3)
    Serialized: {"class": "Vector3D", "args": [10, -7, 3]}
    After: Vector3D(10, -7, 3)

Using metaclasses for class registration ensures that you’ll never miss a class as long as the inheritance tree is right. This works well for serialization, as I’ve shown, and also applies to database object-relational mappings (ORMs), plug-in systems, and system hooks.

Things to Remember
* Class registration is a helpful pattern for building modular Python programs.
* Metaclasses let you run registration code automatically each time your base class is subclassed in a program.
* Using metaclasses for class registration avoids errors by ensuring that you never miss a registration call.

Item 35: Annotate Class Attributes with Metaclasses

One more useful feature enabled by metaclasses is the ability to modify or annotate properties after a class is defined but before the class is actually used. This approach is commonly used with descriptors (see Item 31: “Use Descriptors for Reusable @property Methods”) to give them more introspection into how they’re being used within their containing class.

For example, say you want to define a new class that represents a row in your customer database. You’d like a corresponding property on the class for each column in the database table. To do this, here I define a descriptor class to connect attributes to column names.
    class Field(object):
        def __init__(self, name):
            self.name = name
            self.internal_name = '_' + self.name

        def __get__(self, instance, instance_type):
            if instance is None:
                return self
            return getattr(instance, self.internal_name, '')

        def __set__(self, instance, value):
            setattr(instance, self.internal_name, value)

With the column name stored in the Field descriptor, I can save all of the per-instance state directly in the instance dictionary as protected fields using the setattr and getattr built-in functions. At first, this seems to be much more convenient than building descriptors with weakref to avoid memory leaks.

Defining the class representing a row requires supplying the column name for each class attribute.

    class Customer(object):
        # Class attributes
        first_name = Field('first_name')
        last_name = Field('last_name')
        prefix = Field('prefix')
        suffix = Field('suffix')

Using the class is simple. Here, you can see how the Field descriptors modify the instance dictionary __dict__ as expected:

    foo = Customer()
    print('Before:', repr(foo.first_name), foo.__dict__)
    foo.first_name = 'Euclid'
    print('After:', repr(foo.first_name), foo.__dict__)

    >>>
    Before: '' {}
    After: 'Euclid' {'_first_name': 'Euclid'}

But it seems redundant. I already declared the name of the field when I assigned the constructed Field object to Customer.first_name in the class statement body. Why do I also have to pass the field name ('first_name' in this case) to the Field constructor?

The problem is that the order of operations in the Customer class definition is the opposite of how it reads from left to right. First, the Field constructor is called as Field('first_name'). Then, the return value of that is assigned to Customer.first_name. There’s no way for the Field to know up front which class attribute it will be assigned to.

To eliminate the redundancy, I can use a metaclass. Metaclasses let you hook the class statement directly and take action as soon as a class body is finished. In this case, I can use the metaclass to assign Field.name and Field.internal_name on the descriptor automatically instead of manually specifying the field name multiple times.
    class Meta(type):
        def __new__(meta, name, bases, class_dict):
            for key, value in class_dict.items():
                if isinstance(value, Field):
                    value.name = key
                    value.internal_name = '_' + key
            cls = type.__new__(meta, name, bases, class_dict)
            return cls

Here, I define a base class that uses the metaclass. All classes representing database rows should inherit from this class to ensure that they use the metaclass:

    class DatabaseRow(object, metaclass=Meta):
        pass

To work with the metaclass, the field descriptor is largely unchanged. The only difference is that it no longer requires any arguments to be passed to its constructor. Instead, its attributes are set by the Meta.__new__ method above.

    class Field(object):
        def __init__(self):
            # These will be assigned by the metaclass.
            self.name = None
            self.internal_name = None
        # …

By using the metaclass, the new DatabaseRow base class, and the new Field descriptor, the class definition for a database row no longer has the redundancy from before.

    class BetterCustomer(DatabaseRow):
        first_name = Field()
        last_name = Field()
        prefix = Field()
        suffix = Field()

The behavior of the new class is identical to the old one.

    foo = BetterCustomer()
    print('Before:', repr(foo.first_name), foo.__dict__)
    foo.first_name = 'Euler'
    print('After:', repr(foo.first_name), foo.__dict__)

    >>>
    Before: '' {}
    After: 'Euler' {'_first_name': 'Euler'}

Things to Remember
* Metaclasses enable you to modify a class’s attributes before the class is fully defined.
* Descriptors and metaclasses make a powerful combination for declarative behavior and runtime introspection.
* You can avoid both memory leaks and the weakref module by using metaclasses along with descriptors.

5. Concurrency and Parallelism

Concurrency is when a computer does many different things seemingly at the same time. For example, on a computer with one CPU core, the operating system will rapidly change which program is running on the single processor. This interleaves execution of the programs, providing the illusion that the programs are running simultaneously.
Parallelism is actually doing many different things at the same time. Computers with multiple CPU cores can execute multiple programs simultaneously. Each CPU core runs the instructions of a separate program, allowing each program to make forward progress during the same instant.

Within a single program, concurrency is a tool that makes it easier for programmers to solve certain types of problems. Concurrent programs enable many distinct paths of execution to make forward progress in a way that seems to be both simultaneous and independent.

The key difference between parallelism and concurrency is speedup. When two distinct paths of execution in a program make forward progress in parallel, the time it takes to do the total work is cut in half; the speed of execution is faster by a factor of two. In contrast, concurrent programs may run thousands of separate paths of execution seemingly in parallel but provide no speedup for the total work.

Python makes it easy to write concurrent programs. Python can also be used to do parallel work through system calls, subprocesses, and C-extensions. But it can be very difficult to make concurrent Python code truly run in parallel. It’s important to understand how to best utilize Python in these subtly different situations.

Item 36: Use subprocess to Manage Child Processes

Python has battle-hardened libraries for running and managing child processes. This makes Python a great language for gluing other tools together, such as command-line utilities. When existing shell scripts get complicated, as they often do over time, graduating them to a rewrite in Python is a natural choice for the sake of readability and maintainability.

Child processes started by Python are able to run in parallel, enabling you to use Python to consume all of the CPU cores of your machine and maximize the throughput of your programs. Although Python itself may be CPU bound (see Item 37: “Use Threads for Blocking I/O, Avoid for Parallelism”), it’s easy to use Python to drive and coordinate CPU-intensive workloads.

Python has had many ways to run subprocesses over the years, including popen, popen2, and os.exec*. With the Python of today, the best and simplest choice for managing child processes is to use the subprocess built-in module.
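As a quick illustration of the module in use, here is a minimal sketch. It assumes Python 3.5 or later, where the higher-level subprocess.run helper is available, and it uses sys.executable as the child command only so that the sketch does not depend on any particular shell utility being installed. Neither choice comes from the text above.

```python
import subprocess
import sys

# Run a child Python interpreter and capture what it writes to stdout.
# Using sys.executable is an assumption made for portability; any
# command list would work the same way.
result = subprocess.run(
    [sys.executable, '-c', 'print("Hello from the child!")'],
    stdout=subprocess.PIPE,
    check=True)  # Raise CalledProcessError on a nonzero exit status

output = result.stdout.decode('utf-8').strip()
print(output)
```

The run helper waits for the child to exit before returning, so it suits simple one-shot commands. The Popen interface gives finer control when the parent needs to keep working while the child runs.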
Running a child process with subprocess is simple. Here, the Popen constructor starts the process. The communicate method reads the child process’s output and waits for termination.

    import subprocess

    proc = subprocess.Popen(
        ['echo', 'Hello from the child!'],
        stdout=subprocess.PIPE)
    out, err = proc.communicate()
    print(out.decode('utf-8'))

    >>>
    Hello from the child!

Child processes will run independently from their parent process, the Python interpreter. Their status can be polled periodically while Python does other work.

    proc = subprocess.Popen(['sleep', '0.3'])
    while proc.poll() is None:
        print('Working…')
        # Some time-consuming work here
        # …

    print('Exit status', proc.poll())

    >>>
    Working…
    Working…
    Exit status 0

Decoupling the child process from the parent means that the parent process is free to run many child processes in parallel. You can do this by starting all the child processes together up front.

    from time import time

    def run_sleep(period):
        proc = subprocess.Popen(['sleep', str(period)])
        return proc

    start = time()
    procs = []
    for _ in range(10):
        proc = run_sleep(0.1)
        procs.append(proc)

Later, you can wait for them to finish their I/O and terminate with the communicate method.

    for proc in procs:
        proc.communicate()
    end = time()
    print('Finished in %.3f seconds' % (end - start))

    >>>
    Finished in 0.117 seconds

Note
If these processes ran in sequence, the total delay would be 1 second, not the ~0.1 second I measured.

You can also pipe data from your Python program into a subprocess and retrieve its output. This allows you to utilize other programs to do work in parallel. For example, say you want to use the openssl command-line tool to encrypt some data. Starting the child process with command-line arguments and I/O pipes is easy.
    import os

    def run_openssl(data):
        env = os.environ.copy()
        env['password'] = b'\xe24U\n\xd0Ql3S\x11'
        proc = subprocess.Popen(
            ['openssl', 'enc', '-des3', '-pass', 'env:password'],
            env=env,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE)
        proc.stdin.write(data)
        proc.stdin.flush()  # Ensure the child gets input
        return proc

Here, I pipe random bytes into the encryption function, but in practice this would be user input, a file handle, a network socket, etc.:

    procs = []
    for _ in range(3):
        data = os.urandom(10)
        proc = run_openssl(data)
        procs.append(proc)

The child processes will run in parallel and consume their input. Here, I wait for them to finish and then retrieve their final output:

    for proc in procs:
        out, err = proc.communicate()
        print(out[-10:])

    >>>
    b'o4,G\x91\x95\xfe\xa0\xaa\xb7'
    b'\x0b\x01\\xb1\xb7\xfb\xb2C\xe1b'
    b'ds\xc5\xf4;j\x1f\xd0c-'

You can also create chains of parallel processes just like UNIX pipes, connecting the output of one child process into the input of another, and so on. Here’s a function that starts a child process that will cause the md5 command-line tool to consume an input stream:

    def run_md5(input_stdin):
        proc = subprocess.Popen(
            ['md5'],
            stdin=input_stdin,
            stdout=subprocess.PIPE)
        return proc

Note
Python’s hashlib built-in module provides the md5 function, so running a subprocess like this isn’t always necessary. The goal here is to demonstrate how subprocesses can pipe inputs and outputs.

Now, I can kick off a set of openssl processes to encrypt some data and another set of processes to md5 hash the encrypted output.

    input_procs = []
    hash_procs = []
    for _ in range(3):
        data = os.urandom(10)
        proc = run_openssl(data)
        input_procs.append(proc)
        hash_proc = run_md5(proc.stdout)
        hash_procs.append(hash_proc)

The I/O between the child processes will happen automatically once you get them started. All you need to do is wait for them to finish and print the final output.
    for proc in input_procs:
        proc.communicate()
    for proc in hash_procs:
        out, err = proc.communicate()
        print(out.strip())

    >>>
    b'7a1822875dcf9650a5a71e5e41e77bf3'
    b'd41d8cd98f00b204e9800998ecf8427e'
    b'1720f581cfdc448b6273048d42621100'

If you’re worried about the child processes never finishing or somehow blocking on input or output pipes, then be sure to pass the timeout parameter to the communicate method. This will cause an exception to be raised if the child process hasn’t responded within a time period, giving you a chance to terminate the misbehaving child.

    proc = run_sleep(10)
    try:
        proc.communicate(timeout=0.1)
    except subprocess.TimeoutExpired:
        proc.terminate()
        proc.wait()

    print('Exit status', proc.poll())

    >>>
    Exit status -15

Unfortunately, the timeout parameter is only available in Python 3.3 and later. In earlier versions of Python, you’ll need to use the select built-in module on proc.stdin, proc.stdout, and proc.stderr in order to enforce timeouts on I/O.

Things to Remember
* Use the subprocess module to run child processes and manage their input and output streams.
* Child processes run in parallel with the Python interpreter, enabling you to maximize your CPU usage.
* Use the timeout parameter with communicate to avoid deadlocks and hanging child processes.

Item 37: Use Threads for Blocking I/O, Avoid for Parallelism

The standard implementation of Python is called CPython. CPython runs a Python program in two steps. First, it parses and compiles the source text into bytecode. Then, it runs the bytecode using a stack-based interpreter. The bytecode interpreter has state that must be maintained and coherent while the Python program executes. Python enforces coherence with a mechanism called the global interpreter lock (GIL).

Essentially, the GIL is a mutual-exclusion lock (mutex) that prevents CPython from being affected by preemptive multithreading, where one thread takes control of a program by interrupting another thread. Such an interruption could corrupt the interpreter state if it comes at an unexpected time. The GIL prevents these interruptions and ensures that every bytecode instruction works correctly with the CPython implementation and its C-extension modules.
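The two steps described above (compile to bytecode, then interpret it) can be observed from within Python itself. The following is only an illustrative sketch using the compile and exec built-in functions; the source string and the '<sketch>' filename are made up for the example.

```python
source = 'total = sum(range(5))'

# Step 1: parse and compile the source text into a code object.
code = compile(source, '<sketch>', 'exec')
print(type(code.co_code))  # The raw bytecode is stored as bytes

# Step 2: have the interpreter execute the compiled bytecode.
namespace = {}
exec(code, namespace)
print(namespace['total'])
```

It is the execution in the second step that the GIL serializes across threads in CPython.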
The GIL has an important negative side effect. With programs written in languages like C++ or Java, having multiple threads of execution means your program could utilize multiple CPU cores at the same time. Although Python supports multiple threads of execution, the GIL causes only one of them to make forward progress at a time. This means that when you reach for threads to do parallel computation and speed up your Python programs, you will be sorely disappointed.

For example, say you want to do something computationally intensive with Python. I’ll use a naive number factorization algorithm as a proxy.

    def factorize(number):
        for i in range(1, number + 1):
            if number % i == 0:
                yield i

Factoring a set of numbers in serial takes quite a long time.

    numbers = [2139079, 1214759, 1516637, 1852285]
    start = time()
    for number in numbers:
        list(factorize(number))
    end = time()
    print('Took %.3f seconds' % (end - start))

    >>>
    Took 1.040 seconds

Using multiple threads to do this computation would make sense in other languages because you could take advantage of all of the CPU cores of your computer. Let me try that in Python. Here, I define a Python thread for doing the same computation as before:

    from threading import Thread

    class FactorizeThread(Thread):
        def __init__(self, number):
            super().__init__()
            self.number = number

        def run(self):
            self.factors = list(factorize(self.number))

Then, I start a thread for factorizing each number in parallel.

    start = time()
    threads = []
    for number in numbers:
        thread = FactorizeThread(number)
        thread.start()
        threads.append(thread)

Finally, I wait for all of the threads to finish.
    for thread in threads:
        thread.join()
    end = time()
    print('Took %.3f seconds' % (end - start))

    >>>
    Took 1.061 seconds

What’s surprising is that this takes even longer than running factorize in serial. With one thread per number, you may expect less than a 4× speedup in other languages due to the overhead of creating threads and coordinating with them. You may expect only a 2× speedup on the dual-core machine I used to run this code. But you would never expect the performance of these threads to be worse when you have multiple CPUs to utilize. This demonstrates the effect of the GIL on programs running in the standard CPython interpreter.

There are ways to get CPython to utilize multiple cores, but it doesn’t work with the standard Thread class (see Item 41: “Consider concurrent.futures for True Parallelism”) and it can require substantial effort. Knowing these limitations you may wonder, why does Python support threads at all? There are two good reasons.

First, multiple threads make it easy for your program to seem like it’s doing multiple things at the same time. Managing the juggling act of simultaneous tasks is difficult to implement yourself (see Item 40: “Consider Coroutines to Run Many Functions Concurrently” for an example). With threads, you can leave it to Python to run your functions seemingly in parallel. This works because CPython ensures a level of fairness between Python threads of execution, even though only one of them makes forward progress at a time due to the GIL.

The second reason Python supports threads is to deal with blocking I/O, which happens when Python does certain types of system calls. System calls are how your Python program asks your computer’s operating system to interact with the external environment on your behalf. Blocking I/O includes things like reading and writing files, interacting with networks, communicating with devices like displays, etc. Threads help you handle blocking I/O by insulating your program from the time it takes for the operating system to respond to your requests.
For example, say you want to send a signal to a remote-controlled helicopter through a serial port. I’ll use a slow system call (select) as a proxy for this activity. This function asks the operating system to block for 0.1 second and then return control to my program, similar to what would happen when using a synchronous serial port.

    import select

    def slow_systemcall():
        select.select([], [], [], 0.1)

Running this system call in serial requires a linearly increasing amount of time.

    start = time()
    for _ in range(5):
        slow_systemcall()
    end = time()
    print('Took %.3f seconds' % (end - start))

    >>>
    Took 0.503 seconds

The problem is that while the slow_systemcall function is running, my program can’t make any other progress. My program’s main thread of execution is blocked on the select system call. This situation is awful in practice. You need to be able to compute your helicopter’s next move while you’re sending it a signal, otherwise it’ll crash. When you find yourself needing to do blocking I/O and computation simultaneously, it’s time to consider moving your system calls to threads.

Here, I run multiple invocations of the slow_systemcall function in separate threads. This would allow you to communicate with multiple serial ports (and helicopters) at the same time, while leaving the main thread to do whatever computation is required.

    start = time()
    threads = []
    for _ in range(5):
        thread = Thread(target=slow_systemcall)
        thread.start()
        threads.append(thread)

With the threads started, here I do some work to calculate the next helicopter move before waiting for the system call threads to finish.
    def compute_helicopter_location(index):
        # …

    for i in range(5):
        compute_helicopter_location(i)
    for thread in threads:
        thread.join()
    end = time()
    print('Took %.3f seconds' % (end - start))

    >>>
    Took 0.102 seconds

The parallel time is 5× less than the serial time. This shows that the system calls will all run in parallel from multiple Python threads even though they’re limited by the GIL. The GIL prevents my Python code from running in parallel, but it has no negative effect on system calls. This works because Python threads release the GIL just before they make system calls and reacquire the GIL as soon as the system calls are done.

There are many other ways to deal with blocking I/O besides threads, such as the asyncio built-in module, and these alternatives have important benefits. But these options also require extra work in refactoring your code to fit a different model of execution (see Item 40: “Consider Coroutines to Run Many Functions Concurrently”). Using threads is the simplest way to do blocking I/O in parallel with minimal changes to your program.

Things to Remember
* Python threads can’t run bytecode in parallel on multiple CPU cores because of the global interpreter lock (GIL).
* Python threads are still useful despite the GIL because they provide an easy way to do multiple things at seemingly the same time.
* Use Python threads to make multiple system calls in parallel. This allows you to do blocking I/O at the same time as computation.

Item 38: Use Lock to Prevent Data Races in Threads

After learning about the global interpreter lock (GIL) (see Item 37: “Use Threads for Blocking I/O, Avoid for Parallelism”), many new Python programmers assume they can forgo using mutual-exclusion locks (mutexes) in their code altogether. If the GIL is already preventing Python threads from running on multiple CPU cores in parallel, it must also act as a lock for a program’s data structures, right? Some testing on types like lists and dictionaries may even show that this assumption appears to hold.
But beware, this is truly not the case. The GIL will not protect you. Although only one Python thread runs at a time, a thread’s operations on data structures can be interrupted between any two bytecode instructions in the Python interpreter. This is dangerous if you access the same objects from multiple threads simultaneously. The invariants of your data structures could be violated at practically any time because of these interruptions, leaving your program in a corrupted state.

For example, say you want to write a program that counts many things in parallel, like sampling light levels from a whole network of sensors. If you want to determine the total number of light samples over time, you can aggregate them with a new class.

    class Counter(object):
        def __init__(self):
            self.count = 0

        def increment(self, offset):
            self.count += offset

Imagine that each sensor has its own worker thread because reading from the sensor requires blocking I/O. After each sensor measurement, the worker thread increments the counter up to a maximum number of desired readings.

    def worker(sensor_index, how_many, counter):
        for _ in range(how_many):
            # Read from the sensor
            # …
            counter.increment(1)

Here, I define a function that starts a worker thread for each sensor and waits for them all to finish their readings:

    def run_threads(func, how_many, counter):
        threads = []
        for i in range(5):
            args = (i, how_many, counter)
            thread = Thread(target=func, args=args)
            threads.append(thread)
            thread.start()
        for thread in threads:
            thread.join()

Running five threads in parallel seems simple, and the outcome should be obvious.

    how_many = 10**5
    counter = Counter()
    run_threads(worker, how_many, counter)
    print('Counter should be %d, found %d' %
          (5 * how_many, counter.count))

    >>>
    Counter should be 500000, found 278328

But this result is way off! What happened here? How could something so simple go so wrong, especially since only one Python interpreter thread can run at a time?
The Python interpreter enforces fairness between all of the threads that are executing to ensure they get a roughly equal amount of processing time. To do this, Python will suspend a thread as it’s running and will resume another thread in turn. The problem is that you don’t know exactly when Python will suspend your threads. A thread can even be paused seemingly halfway through what looks like an atomic operation. That’s what happened in this case.

The Counter object’s increment method looks simple.

    counter.count += offset

But the += operator used on an object attribute actually instructs Python to do three separate operations behind the scenes. The statement above is equivalent to this:

    value = getattr(counter, 'count')
    result = value + offset
    setattr(counter, 'count', result)

Python threads incrementing the counter can be suspended between any two of these operations. This is problematic if the way the operations interleave causes old versions of value to be assigned to the counter. Here’s an example of bad interaction between two threads, A and B:

    # Running in Thread A
    value_a = getattr(counter, 'count')
    # Context switch to Thread B
    value_b = getattr(counter, 'count')
    result_b = value_b + 1
    setattr(counter, 'count', result_b)
    # Context switch back to Thread A
    result_a = value_a + 1
    setattr(counter, 'count', result_a)

Thread A stomped on thread B, erasing all of its progress incrementing the counter. This is exactly what happened in the light sensor example above.

To prevent data races like these and other forms of data structure corruption, Python includes a robust set of tools in the threading built-in module. The simplest and most useful of them is the Lock class, a mutual-exclusion lock (mutex).
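The claim that += on an attribute is several separate steps can be checked with the dis built-in module. The sketch below redefines the same Counter class so that it is self-contained, and lists the bytecode instructions of its increment method. The exact instruction names vary between CPython versions, so the sketch only looks for the attribute read (LOAD_ATTR) and the attribute write (STORE_ATTR) and counts what sits between them.

```python
import dis

class Counter(object):
    def __init__(self):
        self.count = 0

    def increment(self, offset):
        self.count += offset

# Collect the names of the bytecode instructions in increment.
opnames = [ins.opname for ins in dis.get_instructions(Counter.increment)]
load_index = opnames.index('LOAD_ATTR')    # Reads self.count
store_index = opnames.index('STORE_ATTR')  # Writes self.count back

print(opnames)
print('Instructions between the read and the write:',
      store_index - load_index - 1)
```

Because the read and the write are distinct instructions with other instructions between them, a thread switch can land in that gap, which is the window a Lock closes.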
By using a lock, I can have the Counter class protect its current value against simultaneous access from multiple threads. Only one thread will be able to acquire the lock at a time. Here, I use a with statement to acquire and release the lock; this makes it easier to see which code is executing while the lock is held (see Item 43: “Consider contextlib and with Statements for Reusable try/finally Behavior” for details):

    from threading import Lock

    class LockingCounter(object):
        def __init__(self):
            self.lock = Lock()
            self.count = 0

        def increment(self, offset):
            with self.lock:
                self.count += offset

Now I run the worker threads as before, but use a LockingCounter instead.

    counter = LockingCounter()
    run_threads(worker, how_many, counter)
    print('Counter should be %d, found %d' %
          (5 * how_many, counter.count))

    >>>
    Counter should be 500000, found 500000

The result is exactly what I expect. The Lock solved the problem.

Things to Remember
* Even though Python has a global interpreter lock, you’re still responsible for protecting against data races between the threads in your programs.
* Your programs will corrupt their data structures if you allow multiple threads to modify the same objects without locks.
* The Lock class in the threading built-in module is Python’s standard mutual exclusion lock implementation.

Item 39: Use Queue to Coordinate Work Between Threads

Python programs that do many things concurrently often need to coordinate their work. One of the most useful arrangements for concurrent work is a pipeline of functions.

A pipeline works like an assembly line used in manufacturing. Pipelines have many phases in serial with a specific function for each phase. New pieces of work are constantly added to the beginning of the pipeline. Each function can operate concurrently on the piece of work in its phase. The work moves forward as each function completes until there are no phases remaining. This approach is especially good for work that includes blocking I/O or subprocesses, activities that can easily be parallelized using Python (see Item 37: “Use Threads for Blocking I/O, Avoid for Parallelism”).
For example, say you want to build a system that will take a constant stream of images from your digital camera, resize them, and then add them to a photo gallery online. Such a program could be split into three phases of a pipeline. New images are retrieved in the first phase. The downloaded images are passed through the resize function in the second phase. The resized images are consumed by the upload function in the final phase.

Imagine you had already written Python functions that execute the phases: download, resize, upload. How do you assemble a pipeline to do the work concurrently?

The first thing you need is a way to hand off work between the pipeline phases. This can be modeled as a thread-safe producer-consumer queue (see Item 38: "Use Lock to Prevent Data Races in Threads" to understand the importance of thread safety in Python; see Item 46: "Use Built-in Algorithms and Data Structures" for the deque class).

    class MyQueue(object):
        def __init__(self):
            self.items = deque()
            self.lock = Lock()

The producer, your digital camera, adds new images to the end of the list of pending items.

        def put(self, item):
            with self.lock:
                self.items.append(item)

The consumer, the first phase of your processing pipeline, removes images from the front of the list of pending items.

        def get(self):
            with self.lock:
                return self.items.popleft()

Here, I represent each phase of the pipeline as a Python thread that takes work from one queue like this, runs a function on it, and puts the result on another queue. I also track how many times the worker has checked for new input and how much work it's completed.

    class Worker(Thread):
        def __init__(self, func, in_queue, out_queue):
            super().__init__()
            self.func = func
            self.in_queue = in_queue
            self.out_queue = out_queue
            self.polled_count = 0
            self.work_done = 0

The trickiest part is that the worker thread must properly handle the case where the input queue is empty because the previous phase hasn't completed its work yet. This happens where I catch the IndexError exception below. You can think of this as a holdup in the assembly line.
        def run(self):
            while True:
                self.polled_count += 1
                try:
                    item = self.in_queue.get()
                except IndexError:
                    sleep(0.01)  # No work to do
                else:
                    result = self.func(item)
                    self.out_queue.put(result)
                    self.work_done += 1

Now I can connect the three phases together by creating the queues for their coordination points and the corresponding worker threads.

    download_queue = MyQueue()
    resize_queue = MyQueue()
    upload_queue = MyQueue()
    done_queue = MyQueue()
    threads = [
        Worker(download, download_queue, resize_queue),
        Worker(resize, resize_queue, upload_queue),
        Worker(upload, upload_queue, done_queue),
    ]

I can start the threads and then inject a bunch of work into the first phase of the pipeline. Here, I use a plain object instance as a proxy for the real data required by the download function:

    for thread in threads:
        thread.start()
    for _ in range(1000):
        download_queue.put(object())

Now I wait for all of the items to be processed by the pipeline and end up in the done_queue.

    while len(done_queue.items) < 1000:
        # Do something useful while waiting
        # ...

This runs properly, but there's an interesting side effect caused by the threads polling their input queues for new work. The tricky part, where I catch IndexError exceptions in the run method, executes a large number of times.

    processed = len(done_queue.items)
    polled = sum(t.polled_count for t in threads)
    print('Processed', processed, 'items after polling',
          polled, 'times')

    >>>
    Processed 1000 items after polling 3030 times

When the worker functions vary in speeds, an earlier phase can prevent progress in later phases, backing up the pipeline. This causes later phases to starve and constantly check their input queues for new work in a tight loop. The outcome is that worker threads waste CPU time doing nothing useful (they're constantly raising and catching IndexError exceptions).
But that's just the beginning of what's wrong with this implementation. There are three more problems that you should also avoid. First, determining that all of the input work is complete requires yet another busy wait on the done_queue. Second, in Worker the run method will execute forever in its busy loop. There's no way to signal to a worker thread that it's time to exit.

Third, and worst of all, a backup in the pipeline can cause the program to crash arbitrarily. If the first phase makes rapid progress but the second phase makes slow progress, then the queue connecting the first phase to the second phase will constantly increase in size. The second phase won't be able to keep up. Given enough time and input data, the program will eventually run out of memory and die.

The lesson here isn't that pipelines are bad; it's that it's hard to build a good producer-consumer queue yourself.

Queue to the Rescue

The Queue class from the queue built-in module provides all of the functionality you need to solve these problems.

Queue eliminates the busy waiting in the worker by making the get method block until new data is available. For example, here I start a thread that waits for some input data on a queue:

    from queue import Queue
    queue = Queue()

    def consumer():
        print('Consumer waiting')
        queue.get()               # Runs after put() below
        print('Consumer done')

    thread = Thread(target=consumer)
    thread.start()

Even though the thread is running first, it won't finish until an item is put on the Queue instance and the get method has something to return.
    print('Producer putting')
    queue.put(object())           # Runs before get() above
    thread.join()
    print('Producer done')

    >>>
    Consumer waiting
    Producer putting
    Consumer done
    Producer done

To solve the pipeline backup issue, the Queue class lets you specify the maximum amount of pending work you'll allow between two phases. This buffer size causes calls to put to block when the queue is already full. For example, here I define a thread that waits for a while before consuming a queue:

    queue = Queue(1)              # Buffer size of 1

    def consumer():
        time.sleep(0.1)           # Wait
        queue.get()               # Runs second
        print('Consumer got 1')
        queue.get()               # Runs fourth
        print('Consumer got 2')

    thread = Thread(target=consumer)
    thread.start()

The wait should allow the producer thread to put both objects on the queue before the consumer thread ever calls get. But the Queue size is one. That means the producer adding items to the queue will have to wait for the consumer thread to call get at least once before the second call to put will stop blocking and add the second item to the queue.

    queue.put(object())           # Runs first
    print('Producer put 1')
    queue.put(object())           # Runs third
    print('Producer put 2')
    thread.join()
    print('Producer done')

    >>>
    Producer put 1
    Consumer got 1
    Producer put 2
    Consumer got 2
    Producer done

The Queue class can also track the progress of work using the task_done method. This lets you wait for a phase's input queue to drain and eliminates the need for polling the done_queue at the end of your pipeline. For example, here I define a consumer thread that calls task_done when it finishes working on an item.

    in_queue = Queue()

    def consumer():
        print('Consumer waiting')
        work = in_queue.get()     # Done second
        print('Consumer working')
        # Doing work
        # ...
        print('Consumer done')
        in_queue.task_done()      # Done third

    Thread(target=consumer).start()

Now, the producer code doesn't have to join the consumer thread or poll. The producer can just wait for the in_queue to finish by calling join on the Queue instance. Even once it's empty, the in_queue won't be joinable until after task_done is called for every item that was ever enqueued.
    in_queue.put(object())        # Done first
    print('Producer waiting')
    in_queue.join()               # Done fourth
    print('Producer done')

    >>>
    Consumer waiting
    Producer waiting
    Consumer working
    Consumer done
    Producer done

I can put all of these behaviors together into a Queue subclass that also tells the worker thread when it should stop processing. Here, I define a close method that adds a special item to the queue that indicates there will be no more input items after it:

    class ClosableQueue(Queue):
        SENTINEL = object()

        def close(self):
            self.put(self.SENTINEL)

Then, I define an iterator for the queue that looks for this special object and stops iteration when it's found. This __iter__ method also calls task_done at appropriate times, letting me track the progress of work on the queue.

        def __iter__(self):
            while True:
                item = self.get()
                try:
                    if item is self.SENTINEL:
                        return  # Cause the thread to exit
                    yield item
                finally:
                    self.task_done()

Now, I can redefine my worker thread to rely on the behavior of the ClosableQueue class. The thread will exit once the for loop is exhausted.

    class StoppableWorker(Thread):
        def __init__(self, func, in_queue, out_queue):
            # ...

        def run(self):
            for item in self.in_queue:
                result = self.func(item)
                self.out_queue.put(result)

Here, I re-create the set of worker threads using the new worker class:

    download_queue = ClosableQueue()
    # ...
    threads = [
        StoppableWorker(download, download_queue, resize_queue),
        # ...
    ]

After running the worker threads like before, I also send the stop signal once all the input work has been injected by closing the input queue of the first phase.

    for thread in threads:
        thread.start()
    for _ in range(1000):
        download_queue.put(object())
    download_queue.close()

Finally, I wait for the work to finish by joining each queue that connects the phases. Each time one phase is done, I signal the next phase to stop by closing its input queue. At the end, the done_queue contains all of the output objects as expected.
    download_queue.join()
    resize_queue.close()
    resize_queue.join()
    upload_queue.close()
    upload_queue.join()
    print(done_queue.qsize(), 'items finished')

    >>>
    1000 items finished

Things to Remember
* Pipelines are a great way to organize sequences of work that run concurrently using multiple Python threads.
* Be aware of the many problems in building concurrent pipelines: busy waiting, stopping workers, and memory explosion.
* The Queue class has all of the facilities you need to build robust pipelines: blocking operations, buffer sizes, and joining.

Item 40: Consider Coroutines to Run Many Functions Concurrently

Threads give Python programmers a way to run multiple functions seemingly at the same time (see Item 37: "Use Threads for Blocking I/O, Avoid for Parallelism"). But there are three big problems with threads:

* They require special tools to coordinate with each other safely (see Item 38: "Use Lock to Prevent Data Races in Threads" and Item 39: "Use Queue to Coordinate Work Between Threads"). This makes code that uses threads harder to reason about than procedural, single-threaded code. This complexity makes threaded code more difficult to extend and maintain over time.
* Threads require a lot of memory, about 8 MB per executing thread. On many computers, that amount of memory doesn't matter for a dozen threads or so. But what if you want your program to run tens of thousands of functions "simultaneously"? These functions may correspond to user requests to a server, pixels on a screen, particles in a simulation, etc. Running a thread per unique activity just won't work.
* Threads are costly to start. If you want to constantly be creating new concurrent functions and finishing them, the overhead of using threads becomes large and slows everything down.

Python can work around all these issues with coroutines. Coroutines let you have many seemingly simultaneous functions in your Python programs. They're implemented as an extension to generators (see Item 16: "Consider Generators Instead of Returning Lists"). The cost of starting a generator coroutine is a function call. Once active, they each use less than 1 KB of memory until they're exhausted.
Coroutines work by enabling the code consuming a generator to send a value back into the generator function after each yield expression. The generator function receives the value passed to the send function as the result of the corresponding yield expression.

    def my_coroutine():
        while True:
            received = yield
            print('Received:', received)

    it = my_coroutine()
    next(it)             # Prime the coroutine
    it.send('First')
    it.send('Second')

    >>>
    Received: First
    Received: Second

The initial call to next is required to prepare the generator for receiving the first send by advancing it to the first yield expression. Together, yield and send provide generators with a standard way to vary their next yielded value in response to external input.

For example, say you want to implement a generator coroutine that yields the minimum value it's been sent so far. Here, the bare yield prepares the coroutine with the initial minimum value sent in from the outside. Then the generator repeatedly yields the new minimum in exchange for the next value to consider.

    def minimize():
        current = yield
        while True:
            value = yield current
            current = min(value, current)

The code consuming the generator can run one step at a time and will output the minimum value seen after each input.

    it = minimize()
    next(it)            # Prime the generator
    print(it.send(10))
    print(it.send(4))
    print(it.send(22))
    print(it.send(-1))

    >>>
    10
    4
    4
    -1

The generator function will seemingly run forever, making forward progress with each new call to send. Like threads, coroutines are independent functions that can consume inputs from their environment and produce resulting outputs. The difference is that coroutines pause at each yield expression in the generator function and resume after each call to send from the outside. This is the magical mechanism of coroutines.
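To make the pause-and-resume behavior concrete, here is a small sketch in the same style as minimize. The running_average coroutine is a hypothetical example, not one from this item; it shows two independent coroutine instances, each keeping its own state between calls to send while the consuming code alternates between them.

```python
def running_average():
    """Hypothetical coroutine: yields the mean of all values sent so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # Pause here until the next send()
        total += value
        count += 1
        average = total / count


a = running_average()
b = running_average()
next(a)                         # Prime each coroutine
next(b)

# Alternate between the two coroutines; each resumes where it paused.
outputs = [a.send(10), b.send(100), a.send(20), b.send(300)]
print(outputs)
```

Each call to send resumes only the coroutine it targets, so the averages for a and b never mix even though their steps are interleaved.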
This behavior allows the code consuming the generator to take action after each yield expression in the coroutine. The consuming code can use the generator's output values to call other functions and update data structures. Most importantly, it can advance other generator functions until their next yield expressions. By advancing many separate generators in lockstep, they will all seem to be running simultaneously, mimicking the concurrent behavior of Python threads.

The Game of Life

Let me demonstrate the simultaneous behavior of coroutines with an example. Say you want to use coroutines to implement Conway's Game of Life. The rules of the game are simple. You have a two-dimensional grid of an arbitrary size. Each cell in the grid can either be alive or empty.

    ALIVE = '*'
    EMPTY = '-'

The game progresses one tick of the clock at a time. At each tick, each cell counts how many of its neighboring eight cells are still alive. Based on its neighbor count, each cell decides if it will keep living, die, or regenerate. Here's an example of a 5×5 Game of Life grid after four generations with time going to the right. I'll explain the specific rules further below.

      0   |   1   |   2   |   3   |   4
    ----- | ----- | ----- | ----- | -----
    -*--- | --*-- | --**- | --*-- | -----
    --**- | --**- | -*--- | -*--- | -**--
    ---*- | --**- | --**- | --*-- | -----
    ----- | ----- | ----- | ----- | -----

I can model this game by representing each cell as a generator coroutine running in lockstep with all the others.

To implement this, first I need a way to retrieve the status of neighboring cells. I can do this with a coroutine named count_neighbors that works by yielding Query objects. The Query class I define myself. Its purpose is to provide the generator coroutine with a way to ask its surrounding environment for information.

    Query = namedtuple('Query', ('y', 'x'))

The coroutine yields a Query for each neighbor. The result of each yield expression will be the value ALIVE or EMPTY. That's the interface contract I've defined between the coroutine and its consuming code. The count_neighbors generator sees the neighbors' states and returns the count of living neighbors.
    def count_neighbors(y, x):
        n_ = yield Query(y + 1, x + 0)  # North
        ne = yield Query(y + 1, x + 1)  # Northeast
        # Define e_, se, s_, sw, w_, nw ...
        # ...
        neighbor_states = [n_, ne, e_, se, s_, sw, w_, nw]
        count = 0
        for state in neighbor_states:
            if state == ALIVE:
                count += 1
        return count

I can drive the count_neighbors coroutine with fake data to test it. Here, I show how Query objects will be yielded for each neighbor. count_neighbors expects to receive cell states corresponding to each Query through the coroutine's send method. The final count is returned in the StopIteration exception that is raised when the generator is exhausted by the return statement.

    it = count_neighbors(10, 5)
    q1 = next(it)                  # Get the first query
    print('First yield: ', q1)
    q2 = it.send(ALIVE)            # Send q1 state, get q2
    print('Second yield:', q2)
    q3 = it.send(ALIVE)            # Send q2 state, get q3
    # ...
    try:
        count = it.send(EMPTY)     # Send q8 state, retrieve count
    except StopIteration as e:
        print('Count: ', e.value)  # Value from return statement

    >>>
    First yield:  Query(y=11, x=5)
    Second yield: Query(y=11, x=6)
    ...
    Count:  2

Now I need the ability to indicate that a cell will transition to a new state in response to the neighbor count that it found from count_neighbors. To do this, I define another coroutine called step_cell. This generator will indicate transitions in a cell's state by yielding Transition objects. This is another class that I define, just like the Query class.

    Transition = namedtuple('Transition', ('y', 'x', 'state'))

The step_cell coroutine receives its coordinates in the grid as arguments. It yields a Query to get the initial state of those coordinates. It runs count_neighbors to inspect the cells around it. It runs the game logic to determine what state the cell should have for the next clock tick. Finally, it yields a Transition object to tell the environment the cell's next state.
    def game_logic(state, neighbors):
        # ...

    def step_cell(y, x):
        state = yield Query(y, x)
        neighbors = yield from count_neighbors(y, x)
        next_state = game_logic(state, neighbors)
        yield Transition(y, x, next_state)

Importantly, the call to count_neighbors uses the yield from expression. This expression allows Python to compose generator coroutines together, making it easy to reuse smaller pieces of functionality and build complex coroutines from simpler ones. When count_neighbors is exhausted, the final value it returns (with the return statement) will be passed to step_cell as the result of the yield from expression.

Now, I can finally define the simple game logic for Conway's Game of Life. There are only three rules.

    def game_logic(state, neighbors):
        if state == ALIVE:
            if neighbors < 2:
                return EMPTY     # Die: Too few
            elif neighbors > 3:
                return EMPTY     # Die: Too many
        else:
            if neighbors == 3:
                return ALIVE     # Regenerate
        return state

I can drive the step_cell coroutine with fake data to test it.

    it = step_cell(10, 5)
    q0 = next(it)           # Initial location query
    print('Me:      ', q0)
    q1 = it.send(ALIVE)     # Send my status, get neighbor query
    print('Q1:      ', q1)
    # ...
    t1 = it.send(EMPTY)     # Send for q8, get game decision
    print('Outcome: ', t1)

    >>>
    Me:       Query(y=10, x=5)
    Q1:       Query(y=11, x=5)
    ...
    Outcome:  Transition(y=10, x=5, state='-')

The goal of the game is to run this logic for a whole grid of cells in lockstep. To do this, I can further compose the step_cell coroutine into a simulate coroutine. This coroutine progresses the grid of cells forward by yielding from step_cell many times. After progressing every coordinate, it yields a TICK object to indicate that the current generation of cells have all transitioned.
    TICK = object()

    def simulate(height, width):
        while True:
            for y in range(height):
                for x in range(width):
                    yield from step_cell(y, x)
            yield TICK

What's impressive about simulate is that it's completely disconnected from the surrounding environment. I still haven't defined how the grid is represented in Python objects, how Query, Transition, and TICK values are handled on the outside, nor how the game gets its initial state. But the logic is clear. Each cell will transition by running step_cell. Then the game clock will tick. This will continue forever, as long as the simulate coroutine is advanced.

This is the beauty of coroutines. They help you focus on the logic of what you're trying to accomplish. They decouple your code's instructions for the environment from the implementation that carries out your wishes. This enables you to run coroutines seemingly in parallel. This also allows you to improve the implementation of following those instructions over time without changing the coroutines.

Now, I want to run simulate in a real environment. To do that, I need to represent the state of each cell in the grid. Here, I define a class to contain the grid:

    class Grid(object):
        def __init__(self, height, width):
            self.height = height
            self.width = width
            self.rows = []
            for _ in range(self.height):
                self.rows.append([EMPTY] * self.width)

        def __str__(self):
            # ...

The grid allows you to get and set the value of any coordinate. Coordinates that are out of bounds will wrap around, making the grid act like infinite looping space.

        def query(self, y, x):
            return self.rows[y % self.height][x % self.width]

        def assign(self, y, x, state):
            self.rows[y % self.height][x % self.width] = state

At last, I can define the function that interprets the values yielded from simulate and all of its interior coroutines. This function turns the instructions from the coroutines into interactions with the surrounding environment. It progresses the whole grid of cells forward a single step and then returns a new grid containing the next state.
    def live_a_generation(grid, sim):
        progeny = Grid(grid.height, grid.width)
        item = next(sim)
        while item is not TICK:
            if isinstance(item, Query):
                state = grid.query(item.y, item.x)
                item = sim.send(state)
            else:  # Must be a Transition
                progeny.assign(item.y, item.x, item.state)
                item = next(sim)
        return progeny

To see this function in action, I need to create a grid and set its initial state. Here, I make a classic shape called a glider.

    grid = Grid(5, 9)
    grid.assign(0, 3, ALIVE)
    # ...
    print(grid)

    >>>
    ---*-----
    ----*----
    --***----
    ---------
    ---------

Now I can progress this grid forward one generation at a time. You can see how the glider moves down and to the right on the grid based on the simple rules from the game_logic function.

    class ColumnPrinter(object):
        # ...

    columns = ColumnPrinter()
    sim = simulate(grid.height, grid.width)
    for i in range(5):
        columns.append(str(grid))
        grid = live_a_generation(grid, sim)

    print(columns)

    >>>
        0     |     1     |     2     |     3     |     4
    ---*----- | --------- | --------- | --------- | ---------
    ----*---- | --*-*---- | ----*---- | ---*----- | ----*----
    --***---- | ---**---- | --*-*---- | ----**--- | -----*---
    --------- | ---*----- | ---**---- | ---**---- | ---***---
    --------- | --------- | --------- | --------- | ---------

The best part about this approach is that I can change the game_logic function without having to update the code that surrounds it. I can change the rules or add larger spheres of influence with the existing mechanics of Query, Transition, and TICK. This demonstrates how coroutines enable the separation of concerns, which is an important design principle.

Coroutines in Python 2

Unfortunately, Python 2 is missing some of the syntactical sugar that makes coroutines so elegant in Python 3. There are two limitations. First, there is no yield from expression. That means that when you want to compose generator coroutines in Python 2, you need to include an additional loop at the delegation point.

    # Python 2
    def delegated():
        yield 1
        yield 2

    def composed():
        yield 'A'
        for value in delegated():  # yield from in Python 3
            yield value
        yield 'B'

    print list(composed())

    >>>
    ['A', 1, 2, 'B']

The second limitation is that there is no support for the return statement in Python 2 generators. To get the same behavior that interacts correctly with try/except/finally blocks, you need to define your own exception type and raise it when you want to return a value.
    # Python 2
    class MyReturn(Exception):
        def __init__(self, value):
            self.value = value

    def delegated():
        yield 1
        raise MyReturn(2)  # return 2 in Python 3
        yield 'Not reached'

    def composed():
        try:
            for value in delegated():
                yield value
        except MyReturn as e:
            output = e.value
        yield output * 4

    print list(composed())

    >>>
    [1, 8]

Things to Remember
* Coroutines provide an efficient way to run tens of thousands of functions seemingly at the same time.
* Within a generator, the value of the yield expression will be whatever value was passed to the generator's send method from the exterior code.
* Coroutines give you a powerful tool for separating the core logic of your program from its interaction with the surrounding environment.
* Python 2 doesn't support yield from or returning values from generators.

Item 41: Consider concurrent.futures for True Parallelism

At some point in writing Python programs, you may hit the performance wall. Even after optimizing your code (see Item 58: "Profile Before Optimizing"), your program's execution may still be too slow for your needs. On modern computers that have an increasing number of CPU cores, it's reasonable to assume that one solution would be parallelism. What if you could split your code's computation into independent pieces of work that run simultaneously across multiple CPU cores?

Unfortunately, Python's global interpreter lock (GIL) prevents true parallelism in threads (see Item 37: "Use Threads for Blocking I/O, Avoid for Parallelism"), so that option is out. Another common suggestion is to rewrite your most performance-critical code as an extension module using the C language. C gets you closer to the bare metal and can run faster than Python, eliminating the need for parallelism. C-extensions can also start native threads that run in parallel and utilize multiple CPU cores. Python's API for C-extensions is well documented and a good choice for an escape hatch.
But rewriting your code in C has a high cost. Code that is short and understandable in Python can become verbose and complicated in C. Such a port requires extensive testing to ensure that the functionality is equivalent to the original Python code and that no bugs have been introduced. Sometimes it's worth it, which explains the large ecosystem of C-extension modules in the Python community that speed up things like text parsing, image compositing, and matrix math. There are even open source tools such as Cython (http://cython.org/) and Numba (http://numba.pydata.org/) that can ease the transition to C.

The problem is that moving one piece of your program to C isn't sufficient most of the time. Optimized Python programs usually don't have one major source of slowness, but rather, there are often many significant contributors. To get the benefits of C's bare metal and threads, you'd need to port large parts of your program, drastically increasing testing needs and risk. There must be a better way to preserve your investment in Python to solve difficult computational problems.

The multiprocessing built-in module, easily accessed via the concurrent.futures built-in module, may be exactly what you need. It enables Python to utilize multiple CPU cores in parallel by running additional interpreters as child processes. These child processes are separate from the main interpreter, so their global interpreter locks are also separate. Each child can fully utilize one CPU core. Each child has a link to the main process where it receives instructions to do computation and returns results.

For example, say you want to do something computationally intensive with Python and utilize multiple CPU cores. I'll use an implementation of finding the greatest common divisor of two numbers as a proxy for a more computationally intense algorithm, like simulating fluid dynamics with the Navier-Stokes equation.

    def gcd(pair):
        a, b = pair
        low = min(a, b)
        for i in range(low, 0, -1):
            if a % i == 0 and b % i == 0:
                return i

Running this function in serial takes a linearly increasing amount of time because there is no parallelism.
    numbers = [(1963309, 2265973), (2030677, 3814172),
               (1551645, 2229620), (2039045, 2020802)]
    start = time()
    results = list(map(gcd, numbers))
    end = time()
    print('Took %.3f seconds' % (end - start))

    >>>
    Took 1.170 seconds

Running this code on multiple Python threads will yield no speed improvement because the GIL prevents Python from using multiple CPU cores in parallel. Here, I do the same computation as above using the concurrent.futures module with its ThreadPoolExecutor class and two worker threads (to match the number of CPU cores on my computer):

    start = time()
    pool = ThreadPoolExecutor(max_workers=2)
    results = list(pool.map(gcd, numbers))
    end = time()
    print('Took %.3f seconds' % (end - start))

    >>>
    Took 1.199 seconds

It's even slower this time because of the overhead of starting and communicating with the pool of threads.

Now for the surprising part: By changing a single line of code, something magical happens. If I replace the ThreadPoolExecutor with the ProcessPoolExecutor from the concurrent.futures module, everything speeds up.

    start = time()
    pool = ProcessPoolExecutor(max_workers=2)  # The one change
    results = list(pool.map(gcd, numbers))
    end = time()
    print('Took %.3f seconds' % (end - start))

    >>>
    Took 0.663 seconds

Running on my dual-core machine, it's significantly faster! How is this possible? Here's what the ProcessPoolExecutor class actually does (via the low-level constructs provided by the multiprocessing module):

1. It takes each item from the numbers input data to map.
2. It serializes it into binary data using the pickle module (see Item 44: "Make pickle Reliable with copyreg").
3. It copies the serialized data from the main interpreter process to a child interpreter process over a local socket.
4. Next, it deserializes the data back into Python objects using pickle in the child process.
5. It then imports the Python module containing the gcd function.
6. It runs the function on the input data in parallel with other child processes.
7. It serializes the result back into bytes.
8. It copies those bytes back through the socket.
9. It deserializes the bytes back into Python objects in the parent process.
10. Finally, it merges the results from multiple children into a single list to return.

Although it looks simple to the programmer, the multiprocessing module and ProcessPoolExecutor class do a huge amount of work to make parallelism possible. In most other languages, the only touch point you need to coordinate two threads is a single lock or atomic operation. The overhead of using multiprocessing is high because of all of the serialization and deserialization that must happen between the parent and child processes.

This scheme is well suited to certain types of isolated, high-leverage tasks. By isolated, I mean functions that don't need to share state with other parts of the program. By high-leverage, I mean situations in which only a small amount of data must be transferred between the parent and child processes to enable a large amount of computation. The greatest common divisor algorithm is one example of this, but many other mathematical algorithms work similarly.

If your computation doesn't have these characteristics, then the overhead of multiprocessing may prevent it from speeding up your program through parallelization. When that happens, multiprocessing provides more advanced facilities for shared memory, cross-process locks, queues, and proxies. But all of these features are very complex. It's hard enough to reason about such tools in the memory space of a single process shared between Python threads. Extending that complexity to other processes and involving sockets makes this much more difficult to understand.

I suggest avoiding all parts of multiprocessing and using these features via the simpler concurrent.futures module. You can start by using the ThreadPoolExecutor class to run isolated, high-leverage functions in threads. Later, you can move to the ProcessPoolExecutor to get a speedup. Finally, once you've completely exhausted the other options, you can consider using the multiprocessing module directly.

Things to Remember
* Moving CPU bottlenecks to C-extension modules can be an effective way to improve performance while maximizing your investment in Python code. However, the cost of doing so is high and may introduce bugs.
* The multiprocessing module provides powerful tools that can parallelize certain types of Python computation with minimal effort.
* The power of multiprocessing is best accessed through the concurrent.futures built-in module and its simple ProcessPoolExecutor class.
* The advanced parts of the multiprocessing module should be avoided because they are so complex.

6. Built-in Modules

Python takes a "batteries included" approach to the standard library. Many other languages ship with a small number of common packages and require you to look elsewhere for important functionality. Although Python also has an impressive repository of community-built modules, it strives to provide, in its default installation, the most important modules for common uses of the language.

The full set of standard modules is too large to cover in this book. But some of these built-in packages are so closely intertwined with idiomatic Python that they may as well be part of the language specification. These essential built-in modules are especially important when writing the intricate, error-prone parts of programs.

Item 42: Define Function Decorators with functools.wraps

Python has special syntax for decorators that can be applied to functions. Decorators have the ability to run additional code before and after any calls to the functions they wrap. This allows them to access and modify input arguments and return values. This functionality can be useful for enforcing semantics, debugging, registering functions, and more.

For example, say you want to print the arguments and return value of a function call. This is especially helpful when debugging a stack of function calls from a recursive function. Here, I define such a decorator:

    def trace(func):
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            print('%s(%r, %r) -> %r' %
                  (func.__name__, args, kwargs, result))
            return result
        return wrapper

I can apply this to a function using the @ symbol.

    @trace
    def fibonacci(n):
        """Return the n-th Fibonacci number"""
        if n in (0, 1):
            return n
        return (fibonacci(n - 2) + fibonacci(n - 1))

The @ symbol is equivalent to calling the decorator on the function it wraps and assigning the return value to the original name in the same scope.
    fibonacci = trace(fibonacci)

Calling this decorated function will run the wrapper code before and after fibonacci runs, printing the arguments and return value at each level in the recursive stack.

    fibonacci(3)

    >>>
    fibonacci((1,), {}) -> 1
    fibonacci((0,), {}) -> 0
    fibonacci((1,), {}) -> 1
    fibonacci((2,), {}) -> 1
    fibonacci((3,), {}) -> 2

This works well, but it has an unintended side effect. The value returned by the decorator (the function that's called above) doesn't think it's named fibonacci.

    print(fibonacci)

    >>>
    <function trace.<locals>.wrapper at 0x107f7ed08>

The cause of this isn't hard to see. The trace function returns the wrapper it defines. The wrapper function is what's assigned to the fibonacci name in the containing module because of the decorator. This behavior is problematic because it undermines tools that do introspection, such as debuggers (see Item 57: "Consider Interactive Debugging with pdb") and object serializers (see Item 44: "Make pickle Reliable with copyreg").

For example, the help built-in function is useless on the decorated fibonacci function.

    help(fibonacci)

    >>>
    Help on function wrapper in module __main__:

    wrapper(*args, **kwargs)

The solution is to use the wraps helper function from the functools built-in module. This is a decorator that helps you write decorators. Applying it to the wrapper function will copy all of the important metadata about the inner function to the outer function.

    def trace(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # ...
        return wrapper

    @trace
    def fibonacci(n):
        # ...

Now, running the help function produces the expected result, even though the function is decorated.

    help(fibonacci)

    >>>
    Help on function fibonacci in module __main__:

    fibonacci(n)
        Return the n-th Fibonacci number

Calling help is just one example of how decorators can subtly cause problems. Python functions have many other standard attributes (e.g., __name__, __module__) that must be preserved to maintain the interface of functions in the language. Using wraps ensures that you'll always get the correct behavior.
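As a self-contained sketch of the wraps-based decorator described above, the following combines the trace and fibonacci definitions from this item with the import they need, then inspects the preserved metadata directly. The check of the __wrapped__ attribute is an addition not discussed in the text; it relies on functools.wraps storing a reference to the original function in Python 3.

```python
from functools import wraps


def trace(func):
    @wraps(func)  # Copy __name__, __doc__, __module__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print('%s(%r, %r) -> %r' %
              (func.__name__, args, kwargs, result))
        return result
    return wrapper


@trace
def fibonacci(n):
    """Return the n-th Fibonacci number"""
    if n in (0, 1):
        return n
    return fibonacci(n - 2) + fibonacci(n - 1)


# The decorated function now reports the original metadata.
print(fibonacci.__name__)
print(fibonacci.__doc__)
# wraps also keeps a reference to the undecorated function.
print(hasattr(fibonacci, '__wrapped__'))
```

With wraps applied, the name and docstring printed here are those of fibonacci rather than wrapper, which is what introspection tools such as help rely on.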
Things to Remember

• Decorators are Python syntax for allowing one function to modify another function at runtime.
• Using decorators can cause strange behaviors in tools that do introspection, such as debuggers.
• Use the wraps decorator from the functools built-in module when you define your own decorators to avoid any issues.

Item 43: Consider contextlib and with Statements for Reusable try/finally Behavior

The with statement in Python is used to indicate when code is running in a special context. For example, mutual exclusion locks (see Item 38: “Use Lock to Prevent Data Races in Threads”) can be used in with statements to indicate that the indented code only runs while the lock is held.

    from threading import Lock

    lock = Lock()
    with lock:
        print('Lock is held')

The example above is equivalent to this try/finally construction because the Lock class properly enables the with statement.

    lock.acquire()
    try:
        print('Lock is held')
    finally:
        lock.release()

The with statement version of this is better because it eliminates the need to write the repetitive code of the try/finally construction. It's easy to make your objects and functions capable of use in with statements by using the contextlib built-in module. This module contains the contextmanager decorator, which lets a simple function be used in with statements. This is much easier than defining a new class with the special methods __enter__ and __exit__ (the standard way).

For example, say you want a region of your code to have more debug logging sometimes. Here, I define a function that does logging at two severity levels:

    import logging

    def my_function():
        logging.debug('Some debug data')
        logging.error('Error log here')
        logging.debug('More debug data')

The default log level for my program is WARNING, so only the error message will print to screen when I run the function.

    my_function()

    >>>
    Error log here

I can elevate the log level of this function temporarily by defining a context manager. This helper function boosts the logging severity level before running the code in the with block and reduces the logging severity level afterward.
    from contextlib import contextmanager

    @contextmanager
    def debug_logging(level):
        logger = logging.getLogger()
        old_level = logger.getEffectiveLevel()
        logger.setLevel(level)
        try:
            yield
        finally:
            logger.setLevel(old_level)

The yield expression is the point at which the with block's contents will execute. Any exceptions that happen in the with block will be re-raised by the yield expression for you to catch in the helper function (see Item 40: “Consider Coroutines to Run Many Functions Concurrently” for an explanation of how that works).

Now, I can call the same logging function again, but in the debug_logging context. This time, all of the debug messages are printed to the screen during the with block. The same function running outside the with block won't print debug messages.

    with debug_logging(logging.DEBUG):
        print('Inside:')
        my_function()
    print('After:')
    my_function()

    >>>
    Inside:
    Some debug data
    Error log here
    More debug data
    After:
    Error log here

Using with Targets

The context manager passed to a with statement may also return an object. This object is assigned to a local variable in the as part of the compound statement. This gives the code running in the with block the ability to directly interact with its context.

For example, say you want to write a file and ensure that it's always closed correctly. You can do this by passing open to the with statement. open returns a file handle for the as target of with and will close the handle when the with block exits.

    with open('/tmp/my_output.txt', 'w') as handle:
        handle.write('This is some data!')

This approach is preferable to manually opening and closing the file handle every time. It gives you confidence that the file is eventually closed when execution leaves the with statement. It also encourages you to reduce the amount of code that executes while the file handle is open, which is good practice in general.

To enable your own functions to supply values for as targets, all you need to do is yield a value from your context manager. For example, here I define a context manager to fetch a Logger instance, set its level, and then yield it for the as target.
    @contextmanager
    def log_level(level, name):
        logger = logging.getLogger(name)
        old_level = logger.getEffectiveLevel()
        logger.setLevel(level)
        try:
            yield logger
        finally:
            logger.setLevel(old_level)

Calling logging methods like debug on the as target will produce output because the logging severity level is set low enough in the with block. Using the logging module directly won't print anything because the default logging severity level for the default program logger is WARNING.

    with log_level(logging.DEBUG, 'my-log') as logger:
        logger.debug('This is my message!')
        logging.debug('This will not print')

    >>>
    This is my message!

After the with statement exits, calling debug logging methods on the Logger named 'my-log' will not print anything because the default logging severity level has been restored. Error log messages will always print.

    logger = logging.getLogger('my-log')
    logger.debug('Debug will not print')
    logger.error('Error will print')

    >>>
    Error will print

Things to Remember

• The with statement allows you to reuse logic from try/finally blocks and reduce visual noise.
• The contextlib built-in module provides a contextmanager decorator that makes it easy to use your own functions in with statements.
• The value yielded by context managers is supplied to the as part of the with statement. It's useful for letting your code directly access the cause of the special context.

Item 44: Make pickle Reliable with copyreg

The pickle built-in module can serialize Python objects into a stream of bytes and deserialize bytes back into objects. Pickled byte streams shouldn't be used to communicate between untrusted parties. The purpose of pickle is to let you pass Python objects between programs that you control over binary channels.

Note

The pickle module's serialization format is unsafe by design. The serialized data contains what is essentially a program that describes how to reconstruct the original Python object. This means a malicious pickle payload could be used to compromise any part of the Python program that attempts to deserialize it.
In contrast, the json module is safe by design. Serialized JSON data contains a simple description of an object hierarchy. Deserializing JSON data does not expose a Python program to any additional risk. Formats like JSON should be used for communication between programs or people that don't trust each other.

For example, say you want to use a Python object to represent the state of a player's progress in a game. The game state includes the level the player is on and the number of lives he or she has remaining.

    class GameState(object):
        def __init__(self):
            self.level = 0
            self.lives = 4

The program modifies this object as the game runs.

    state = GameState()
    state.level += 1  # Player beat a level
    state.lives -= 1  # Player had to try again

When the user quits playing, the program can save the state of the game to a file so it can be resumed at a later time. The pickle module makes it easy to do this. Here, I dump the GameState object directly to a file:

    import pickle

    state_path = '/tmp/game_state.bin'
    with open(state_path, 'wb') as f:
        pickle.dump(state, f)

Later, I can load the file and get back the GameState object as if it had never been serialized.

    with open(state_path, 'rb') as f:
        state_after = pickle.load(f)
    print(state_after.__dict__)

    >>>
    {'lives': 3, 'level': 1}

The problem with this approach is what happens as the game's features expand over time. Imagine you want the player to earn points towards a high score. To track the player's points, you'd add a new field to the GameState class.

    class GameState(object):
        def __init__(self):
            # ...
            self.points = 0

Serializing the new version of the GameState class using pickle will work exactly as before. Here, I simulate the round-trip through a file by serializing to a string with dumps and back to an object with loads:

    state = GameState()
    serialized = pickle.dumps(state)
    state_after = pickle.loads(serialized)
    print(state_after.__dict__)

    >>>
    {'lives': 4, 'level': 0, 'points': 0}

But what happens to older saved GameState objects that the user may want to resume?
Here, I unpickle an old game file using a program with the new definition of the GameState class:

    with open(state_path, 'rb') as f:
        state_after = pickle.load(f)
    print(state_after.__dict__)

    >>>
    {'lives': 3, 'level': 1}

The points attribute is missing! This is especially confusing because the returned object is an instance of the new GameState class.

    assert isinstance(state_after, GameState)

This behavior is a byproduct of the way the pickle module works. Its primary use case is making it easy to serialize objects. As soon as your use of pickle expands beyond trivial usage, the module's functionality starts to break down in surprising ways.

Fixing these problems is straightforward using the copyreg built-in module. The copyreg module lets you register the functions responsible for serializing Python objects, allowing you to control the behavior of pickle and make it more reliable.

Default Attribute Values

In the simplest case, you can use a constructor with default arguments (see Item 19: “Provide Optional Behavior with Keyword Arguments”) to ensure that GameState objects will always have all attributes after unpickling. Here, I redefine the constructor this way:

    class GameState(object):
        def __init__(self, level=0, lives=4, points=0):
            self.level = level
            self.lives = lives
            self.points = points

To use this constructor for pickling, I define a helper function that takes a GameState object and turns it into a tuple of parameters for the copyreg module. The returned tuple contains the function to use for unpickling and the parameters to pass to the unpickling function.

    def pickle_game_state(game_state):
        kwargs = game_state.__dict__
        return unpickle_game_state, (kwargs,)

Now, I need to define the unpickle_game_state helper. This function takes serialized data and parameters from pickle_game_state and returns the corresponding GameState object. It's a tiny wrapper around the constructor.

    def unpickle_game_state(kwargs):
        return GameState(**kwargs)

Now, I register these with the copyreg built-in module.
    import copyreg

    copyreg.pickle(GameState, pickle_game_state)

Serializing and deserializing works as before.

    state = GameState()
    state.points += 1000
    serialized = pickle.dumps(state)
    state_after = pickle.loads(serialized)
    print(state_after.__dict__)

    >>>
    {'lives': 4, 'level': 0, 'points': 1000}

With this registration done, now I can change the definition of GameState to give the player a count of magic spells to use. This change is similar to when I added the points field to GameState.

    class GameState(object):
        def __init__(self, level=0, lives=4, points=0, magic=5):
            # ...

But unlike before, deserializing an old GameState object will result in valid game data instead of missing attributes. This works because unpickle_game_state calls the GameState constructor directly. The constructor's keyword arguments have default values when parameters are missing. This causes old game state files to receive the default value for the new magic field when they are deserialized.

    state_after = pickle.loads(serialized)
    print(state_after.__dict__)

    >>>
    {'level': 0, 'points': 1000, 'magic': 5, 'lives': 4}

Versioning Classes

Sometimes you'll need to make backwards-incompatible changes to your Python objects by removing fields. This prevents the default argument approach to serialization from working.

For example, say you realize that a limited number of lives is a bad idea, and you want to remove the concept of lives from the game. Here, I redefine the GameState to no longer have a lives field:

    class GameState(object):
        def __init__(self, level=0, points=0, magic=5):
            # ...

The problem is that this breaks deserializing old game data. All fields from the old data, even ones removed from the class, will be passed to the GameState constructor by the unpickle_game_state function.

    pickle.loads(serialized)

    >>>
    TypeError: __init__() got an unexpected keyword argument 'lives'

The solution is to add a version parameter to the functions supplied to copyreg. New serialized data will have a version of 2 specified when pickling a new GameState object.
    def pickle_game_state(game_state):
        kwargs = game_state.__dict__
        kwargs['version'] = 2
        return unpickle_game_state, (kwargs,)

Old versions of the data will not have a version argument present, allowing you to manipulate the arguments passed to the GameState constructor accordingly.

    def unpickle_game_state(kwargs):
        version = kwargs.pop('version', 1)
        if version == 1:
            kwargs.pop('lives')
        return GameState(**kwargs)

Now, deserializing an old object works properly.

    copyreg.pickle(GameState, pickle_game_state)
    state_after = pickle.loads(serialized)
    print(state_after.__dict__)

    >>>
    {'magic': 5, 'level': 0, 'points': 1000}

You can continue this approach to handle changes between future versions of the same class. Any logic you need to adapt an old version of the class to a new version of the class can go in the unpickle_game_state function.

Stable Import Paths

One other issue you may encounter with pickle is breakage from renaming a class. Often over the life cycle of a program, you'll refactor your code by renaming classes and moving them to other modules. Unfortunately, this will break the pickle module unless you're careful.

Here, I rename the GameState class to BetterGameState, removing the old class from the program entirely:

    class BetterGameState(object):
        def __init__(self, level=0, points=0, magic=5):
            # ...

Attempting to deserialize an old GameState object will now fail because the class can't be found.

    pickle.loads(serialized)

    >>>
    AttributeError: Can't get attribute 'GameState' on <module '__main__' from 'my_code.py'>

The cause of this exception is that the import path of the serialized object's class is encoded in the pickled data.

    print(serialized[:25])

    >>>
    b'\x80\x03c__main__\nGameState\nq\x00)'

The solution is to use copyreg again. You can specify a stable identifier for the function to use for unpickling an object. This allows you to transition pickled data to different classes with different names when it's deserialized. It gives you a level of indirection.
    copyreg.pickle(BetterGameState, pickle_game_state)

After using copyreg, you can see that the import path to unpickle_game_state is encoded in the serialized data instead of BetterGameState.

    state = BetterGameState()
    serialized = pickle.dumps(state)
    print(serialized[:35])

    >>>
    b'\x80\x03c__main__\nunpickle_game_state\nq\x00}'

The only gotcha is that you can't change the path of the module in which the unpickle_game_state function is present. Once you serialize data with a function, it must remain available on that import path for deserializing in the future.

Things to Remember

• The pickle built-in module is only useful for serializing and deserializing objects between trusted programs.
• The pickle module may break down when used for more than trivial use cases.
• Use the copyreg built-in module with pickle to add missing attribute values, allow versioning of classes, and provide stable import paths.

Item 45: Use datetime Instead of time for Local Clocks

Coordinated Universal Time (UTC) is the standard, time-zone-independent representation of time. UTC works great for computers that represent time as seconds since the UNIX epoch. But UTC isn't ideal for humans. Humans reference time relative to where they're currently located. People say “noon” or “8 am” instead of “UTC 15:00 minus 7 hours.” If your program handles time, you'll probably find yourself converting time between UTC and local clocks to make it easier for humans to understand.

Python provides two ways of accomplishing time zone conversions. The old way, using the time built-in module, is disastrously error prone. The new way, using the datetime built-in module, works great with some help from the community-built package named pytz.

You should be acquainted with both time and datetime to thoroughly understand why datetime is the best choice and time should be avoided.

The time Module

The localtime function from the time built-in module lets you convert a UNIX timestamp (seconds since the UNIX epoch in UTC) to a local time that matches the host computer's time zone (Pacific Daylight Time, in my case).
    from time import localtime, strftime

    now = 1407694710
    local_tuple = localtime(now)
    time_format = '%Y-%m-%d %H:%M:%S'
    time_str = strftime(time_format, local_tuple)
    print(time_str)

    >>>
    2014-08-10 11:18:30

You'll often need to go the other way as well, starting with user input in local time and converting it to UTC time. You can do this by using the strptime function to parse the time string, then call mktime to convert local time to a UNIX timestamp.

    from time import mktime, strptime

    time_tuple = strptime(time_str, time_format)
    utc_now = mktime(time_tuple)
    print(utc_now)

    >>>
    1407694710.0

How do you convert local time in one time zone to local time in another? For example, say you are taking a flight between San Francisco and New York, and want to know what time it will be in San Francisco once you've arrived in New York.

Directly manipulating the return values from the time, localtime, and strptime functions to do time zone conversions is a bad idea. Time zones change all the time due to local laws. It's too complicated to manage yourself, especially if you want to handle every global city for flight departure and arrival.

Many operating systems have configuration files that keep up with the time zone changes automatically. Python lets you use these time zones through the time module. For example, here I parse the departure time from the San Francisco time zone of Pacific Daylight Time:

    parse_format = '%Y-%m-%d %H:%M:%S %Z'
    depart_sfo = '2014-05-01 15:45:16 PDT'
    time_tuple = strptime(depart_sfo, parse_format)
    time_str = strftime(time_format, time_tuple)
    print(time_str)

    >>>
    2014-05-01 15:45:16

After seeing that PDT works with the strptime function, you might also assume that other time zones known to my computer will also work. Unfortunately, this isn't the case. Instead, strptime raises an exception when it sees Eastern Daylight Time (the time zone for New York).
    arrival_nyc = '2014-05-01 23:33:24 EDT'
    time_tuple = strptime(arrival_nyc, time_format)

    >>>
    ValueError: unconverted data remains:  EDT

The problem here is the platform-dependent nature of the time module. Its actual behavior is determined by how the underlying C functions work with the host operating system. This makes the functionality of the time module unreliable in Python. The time module fails to consistently work properly for multiple local times. Thus, you should avoid the time module for this purpose. If you must use time, only use it to convert between UTC and the host computer's local time. For all other types of conversions, use the datetime module.

The datetime Module

The second option for representing times in Python is the datetime class from the datetime built-in module. Like the time module, datetime can be used to convert from the current time in UTC to local time.

Here, I take the present time in UTC and convert it to my computer's local time (Pacific Daylight Time):

    from datetime import datetime, timezone

    now = datetime(2014, 8, 10, 18, 18, 30)
    now_utc = now.replace(tzinfo=timezone.utc)
    now_local = now_utc.astimezone()
    print(now_local)

    >>>
    2014-08-10 11:18:30-07:00

The datetime module can also easily convert a local time back to a UNIX timestamp in UTC.

    time_str = '2014-08-10 11:18:30'
    now = datetime.strptime(time_str, time_format)
    time_tuple = now.timetuple()
    utc_now = mktime(time_tuple)
    print(utc_now)

    >>>
    1407694710.0

Unlike the time module, the datetime module has facilities for reliably converting from one local time to another local time. However, datetime only provides the machinery for time zone operations with its tzinfo class and related methods. What's missing are the time zone definitions besides UTC.

Luckily, the Python community has addressed this gap with the pytz module that's available for download from the Python Package Index (https://pypi.python.org/pypi/pytz/). pytz contains a full database of every time zone definition you might need.

To use pytz effectively, you should always convert local times to UTC first. Perform any datetime operations you need on the UTC values (such as offsetting). Then, convert to local times as a final step.
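The three-step workflow (to UTC, operate, back to local) can be sketched with the standard library alone. This is only an illustration under a loud assumption: the EDT and PDT objects below are fixed UTC offsets that I define by hand, so they know nothing about daylight saving transitions or changes in local law, which is exactly the gap pytz fills. The 45-minute offset is likewise a made-up value.

```python
from datetime import datetime, timedelta, timezone

# Assumption: hand-written fixed offsets stand in for real time zone
# definitions. They are only valid while daylight saving is in effect.
EDT = timezone(timedelta(hours=-4), 'EDT')
PDT = timezone(timedelta(hours=-7), 'PDT')

arrival_nyc = datetime(2014, 5, 1, 23, 33, 24, tzinfo=EDT)

# Step 1: convert the local time to UTC first.
arrival_utc = arrival_nyc.astimezone(timezone.utc)

# Step 2: do any arithmetic on the UTC value.
pickup_utc = arrival_utc + timedelta(minutes=45)

# Step 3: convert to a local time only as the final step.
pickup_sf = pickup_utc.astimezone(PDT)
print(pickup_sf)
```

Keeping the arithmetic in UTC means the intermediate values never depend on which local zone happens to be in use.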
For example, here I convert an NYC flight arrival time to a UTC datetime. Although some of these calls seem redundant, all of them are necessary when using pytz.

    import pytz

    arrival_nyc = '2014-05-01 23:33:24'
    nyc_dt_naive = datetime.strptime(arrival_nyc, time_format)
    eastern = pytz.timezone('US/Eastern')
    nyc_dt = eastern.localize(nyc_dt_naive)
    utc_dt = pytz.utc.normalize(nyc_dt.astimezone(pytz.utc))
    print(utc_dt)

    >>>
    2014-05-02 03:33:24+00:00

Once I have a UTC datetime, I can convert it to San Francisco local time.

    pacific = pytz.timezone('US/Pacific')
    sf_dt = pacific.normalize(utc_dt.astimezone(pacific))
    print(sf_dt)

    >>>
    2014-05-01 20:33:24-07:00

Just as easily, I can convert it to the local time in Nepal.

    nepal = pytz.timezone('Asia/Katmandu')
    nepal_dt = nepal.normalize(utc_dt.astimezone(nepal))
    print(nepal_dt)

    >>>
    2014-05-02 09:18:24+05:45

With datetime and pytz, these conversions are consistent across all environments regardless of what operating system the host computer is running.

Things to Remember

• Avoid using the time module for translating between different time zones.
• Use the datetime built-in module along with the pytz module to reliably convert between times in different time zones.
• Always represent time in UTC and do conversions to local time as the final step before presentation.

Item 46: Use Built-in Algorithms and Data Structures

When you're implementing Python programs that handle a non-trivial amount of data, you'll eventually see slowdowns caused by the algorithmic complexity of your code. This usually isn't the result of Python's speed as a language (see Item 41: “Consider concurrent.futures for True Parallelism” if it is). The issue, more likely, is that you aren't using the best algorithms and data structures for your problem.

Luckily, the Python standard library has many of the algorithms and data structures you'll need to use built in. Besides speed, using these common algorithms and data structures can make your life easier. Some of the most valuable tools you may want to use are tricky to implement correctly. Avoiding reimplementation of common functionality will save you time and headaches.
Double-ended Queue

The deque class from the collections module is a double-ended queue. It provides constant time operations for inserting or removing items from its beginning or end. This makes it ideal for first-in-first-out (FIFO) queues.

    from collections import deque

    fifo = deque()
    fifo.append(1)      # Producer
    x = fifo.popleft()  # Consumer

The list built-in type also contains an ordered sequence of items like a queue. You can insert or remove items from the end of a list in constant time. But inserting or removing items from the head of a list takes linear time, which is much slower than the constant time of a deque.

Ordered Dictionary

Standard dictionaries are unordered. That means a dict with the same keys and values can result in different orders of iteration. This behavior is a surprising byproduct of the way the dictionary's fast hash table is implemented. (This describes Python 3.5 and earlier. Since Python 3.7, dict preserves insertion order as a language guarantee, so the loop below never terminates on a modern interpreter.)

    from random import randint

    a = {}
    a['foo'] = 1
    a['bar'] = 2

    # Randomly populate 'b' to cause hash conflicts
    while True:
        z = randint(99, 1013)
        b = {}
        for i in range(z):
            b[i] = i
        b['foo'] = 1
        b['bar'] = 2
        for i in range(z):
            del b[i]
        if str(b) != str(a):
            break

    print(a)
    print(b)
    print('Equal?', a == b)

    >>>
    {'foo': 1, 'bar': 2}
    {'bar': 2, 'foo': 1}
    Equal? True

The OrderedDict class from the collections module is a special type of dictionary that keeps track of the order in which its keys were inserted. Iterating the keys of an OrderedDict has predictable behavior. This can vastly simplify testing and debugging by making all code deterministic.

    from collections import OrderedDict

    a = OrderedDict()
    a['foo'] = 1
    a['bar'] = 2

    b = OrderedDict()
    b['foo'] = 'red'
    b['bar'] = 'blue'

    for value1, value2 in zip(a.values(), b.values()):
        print(value1, value2)

    >>>
    1 red
    2 blue

Default Dictionary

Dictionaries are useful for bookkeeping and tracking statistics. One problem with dictionaries is that you can't assume any keys are already present. That makes it clumsy to do simple things like increment a counter stored in a dictionary.
    stats = {}
    key = 'my_counter'
    if key not in stats:
        stats[key] = 0
    stats[key] += 1

The defaultdict class from the collections module simplifies this by automatically storing a default value when a key doesn't exist. All you have to do is provide a function that will return the default value each time a key is missing. In this example, the int built-in function returns 0 (see Item 23: “Accept Functions for Simple Interfaces Instead of Classes” for another example). Now, incrementing a counter is simple.

    from collections import defaultdict

    stats = defaultdict(int)
    stats['my_counter'] += 1

Heap Queue

Heaps are useful data structures for maintaining a priority queue. The heapq module provides functions for creating heaps in standard list types with functions like heappush, heappop, and nsmallest.

Items of any priority can be inserted into the heap in any order.

    from heapq import heappush, heappop, nsmallest

    a = []
    heappush(a, 5)
    heappush(a, 3)
    heappush(a, 7)
    heappush(a, 4)

Items are always removed by highest priority (lowest number) first.

    print(heappop(a), heappop(a), heappop(a), heappop(a))

    >>>
    3 4 5 7

The resulting list is easy to use outside of heapq. Accessing the 0 index of the heap will always return the smallest item.

    a = []
    heappush(a, 5)
    heappush(a, 3)
    heappush(a, 7)
    heappush(a, 4)
    assert a[0] == nsmallest(1, a)[0] == 3

Calling the sort method on the list maintains the heap invariant.

    print('Before:', a)
    a.sort()
    print('After: ', a)

    >>>
    Before: [3, 4, 7, 5]
    After:  [3, 4, 5, 7]

Each of these heapq operations takes logarithmic time in proportion to the length of the list. Doing the same work with a standard Python list would scale linearly.

Bisection

Searching for an item in a list takes linear time proportional to its length when you call the index method.

    x = list(range(10**6))
    i = x.index(991234)

The bisect module's functions, such as bisect_left, provide an efficient binary search through a sequence of sorted items. The index it returns is the insertion point of the value into the sequence.

    from bisect import bisect_left

    i = bisect_left(x, 991234)

The complexity of a binary search is logarithmic. That means using bisect to search a list of 1 million items takes roughly the same amount of time as using index to linearly search a list of 14 items. It's way faster!
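Because bisect_left returns an insertion point rather than a match, using it as a replacement for index needs one extra check. The helper below is a minimal sketch of that check; the name index_sorted is hypothetical (it is not part of the bisect module), and it assumes the input list is already sorted.

```python
from bisect import bisect_left


def index_sorted(items, value):
    """Return the index of value in the sorted list items, or -1 if absent.

    Hypothetical helper for illustration; assumes items is sorted ascending.
    """
    i = bisect_left(items, value)
    # The insertion point is only a match if it is inside the list
    # and the item stored there equals the value being searched for.
    if i < len(items) and items[i] == value:
        return i
    return -1


x = list(range(10**6))
print(index_sorted(x, 991234))   # found at its own position
print(index_sorted(x, 10**6))    # past the end, so not present
```

Returning -1 is one design choice among several; raising ValueError would mirror the behavior of list.index more closely.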
Iterator Tools

The itertools built-in module contains a large number of functions that are useful for organizing and interacting with iterators (see Item 16: “Consider Generators Instead of Returning Lists” and Item 17: “Be Defensive When Iterating Over Arguments” for background). Not all of these are available in Python 2, but they can easily be built using simple recipes documented in the module. See help(itertools) in an interactive Python session for more details.

The itertools functions fall into three main categories:

Linking iterators together
• chain: Combines multiple iterators into a single sequential iterator.
• cycle: Repeats an iterator's items forever.
• tee: Splits a single iterator into multiple parallel iterators.
• zip_longest: A variant of the zip built-in function that works well with iterators of different lengths.

Filtering items from an iterator
• islice: Slices an iterator by numerical indexes without copying.
• takewhile: Returns items from an iterator while a predicate function returns True.
• dropwhile: Returns items from an iterator once the predicate function returns False for the first time.
• filterfalse: Returns all items from an iterator where a predicate function returns False. The opposite of the filter built-in function.

Combinations of items from iterators
• product: Returns the Cartesian product of items from an iterator, which is a nice alternative to deeply nested list comprehensions.
• permutations: Returns ordered permutations of length N with items from an iterator.
• combinations: Returns the unordered combinations of length N with unrepeated items from an iterator.

There are even more functions and recipes available in the itertools module that I don't mention here. Whenever you find yourself dealing with some tricky iteration code, it's worth looking at the itertools documentation again to see whether there's anything there for you to use.

Things to Remember

• Use Python's built-in modules for algorithms and data structures.
• Don't reimplement this functionality yourself. It's hard to get right.

Item 47: Use decimal When Precision Is Paramount

Python is an excellent language for writing code that interacts with numerical data.
Python's integer type can represent values of any practical size. Its double-precision floating point type complies with the IEEE 754 standard. The language also provides a standard complex number type for imaginary values. However, these aren't enough for every situation.

For example, say you want to compute the amount to charge a customer for an international phone call. You know the time in minutes and seconds that the customer was on the phone (say, 3 minutes 42 seconds). You also have a set rate for the cost of calling Antarctica from the United States ($1.45/minute). What should the charge be?

With floating point math, the computed charge seems reasonable.

    rate = 1.45
    seconds = 3*60 + 42
    cost = rate * seconds / 60
    print(cost)

    >>>
    5.364999999999999

But rounding it to the nearest whole cent rounds down when you want it to round up to properly cover all costs incurred by the customer.

    print(round(cost, 2))

    >>>
    5.36

Say you also want to support very short phone calls between places that are much cheaper to connect. Here, I compute the charge for a phone call that was 5 seconds long with a rate of $0.05/minute:

    rate = 0.05
    seconds = 5
    cost = rate * seconds / 60
    print(cost)

    >>>
    0.004166666666666667

The resulting float is so low that it rounds down to zero. This won't do!

    print(round(cost, 2))

    >>>
    0.0

The solution is to use the Decimal class from the decimal built-in module. The Decimal class provides fixed point math of 28 decimal points by default. It can go even higher if required. This works around the precision issues in IEEE 754 floating point numbers. The class also gives you more control over rounding behaviors.

For example, redoing the Antarctica calculation with Decimal results in an exact charge instead of an approximation.

    from decimal import Decimal

    rate = Decimal('1.45')
    seconds = Decimal('222')  # 3*60 + 42
    cost = rate * seconds / Decimal('60')
    print(cost)

    >>>
    5.365

The Decimal class has a built-in function for rounding to exactly the decimal place you need with the rounding behavior you want.
    from decimal import ROUND_UP

    rounded = cost.quantize(Decimal('0.01'), rounding=ROUND_UP)
    print(rounded)

    >>>
    5.37

Using the quantize method this way also properly handles the small usage case for short, cheap phone calls. Here, you can see the Decimal cost is still less than 1 cent for the call:

    rate = Decimal('0.05')
    seconds = Decimal('5')
    cost = rate * seconds / Decimal('60')
    print(cost)

    >>>
    0.004166666666666666666666666667

But the quantize behavior ensures that this is rounded up to one whole cent.

    rounded = cost.quantize(Decimal('0.01'), rounding=ROUND_UP)
    print(rounded)

    >>>
    0.01

While Decimal works great for fixed point numbers, it still has limitations in its precision (e.g., 1/3 will be an approximation). For representing rational numbers with no limit to precision, consider using the Fraction class from the fractions built-in module.

Things to Remember

• Python has built-in types and classes in modules that can represent practically every type of numerical value.
• The Decimal class is ideal for situations that require high precision and exact rounding behavior, such as computations of monetary values.

Item 48: Know Where to Find Community-Built Modules

Python has a central repository of modules (https://pypi.python.org) for you to install and use in your programs. These modules are built and maintained by people like you: the Python community. When you find yourself facing an unfamiliar challenge, the Python Package Index (PyPI) is a great place to look for code that will get you closer to your goal.

To use the Package Index, you'll need to use a command-line tool named pip. pip is installed by default in Python 3.4 and above (it's also accessible with python -m pip). For earlier versions, you can find instructions for installing pip on the Python Packaging website (https://packaging.python.org).
Once installed, using pip to install a new module is simple. For example, here I install the pytz module that I used in another item in this chapter (see Item 45: “Use datetime Instead of time for Local Clocks”):

    $ pip3 install pytz
    Downloading/unpacking pytz
      Downloading pytz-2014.4.tar.bz2 (159kB): 159kB downloaded
      Running setup.py (…) egg_info for package pytz

    Installing collected packages: pytz
      Running setup.py install for pytz

    Successfully installed pytz
    Cleaning up…

In the example above, I used the pip3 command-line to install the Python 3 version of the package. The pip command-line (without the 3) is also available for installing packages for Python 2. The majority of popular packages are now available for either version of Python (see Item 1: “Know Which Version of Python You're Using”). pip can also be used with pyvenv to track sets of packages to install for your projects (see Item 53: “Use Virtual Environments for Isolated and Reproducible Dependencies”).

Each module in the PyPI has its own software license. Most of the packages, especially the popular ones, have free or open source licenses (see http://opensource.org for details). In most cases, these licenses allow you to include a copy of the module with your program (when in doubt, talk to a lawyer).

Things to Remember

• The Python Package Index (PyPI) contains a wealth of common packages that are built and maintained by the Python community.
• pip is the command-line tool to use for installing packages from PyPI.
• pip is installed by default in Python 3.4 and above; you must install it yourself for older versions.
• The majority of PyPI modules are free and open source software.

7. Collaboration

There are language features in Python to help you construct well-defined APIs with clear interface boundaries. The Python community has established best practices that maximize the maintainability of code over time. There are also standard tools that ship with Python that enable large teams to work together across disparate environments.
Collaborating with others on Python programs requires being deliberate about how you write your code. Even if you're working on your own, chances are you'll be using code written by someone else via the standard library or open source packages. It's important to understand the mechanisms that make it easy to collaborate with other Python programmers.

Item 49: Write Docstrings for Every Function, Class, and Module

Documentation in Python is extremely important because of the dynamic nature of the language. Python provides built-in support for attaching documentation to blocks of code. Unlike many other languages, the documentation from a program's source code is directly accessible as the program runs.

For example, you can add documentation by providing a docstring immediately after the def statement of a function.

    def palindrome(word):
        """Return True if the given word is a palindrome."""
        return word == word[::-1]

You can retrieve the docstring from within the Python program itself by accessing the function's __doc__ special attribute.

    print(repr(palindrome.__doc__))

    >>>
    'Return True if the given word is a palindrome.'

Docstrings can be attached to functions, classes, and modules. This connection is part of the process of compiling and running a Python program. Support for docstrings and the __doc__ attribute has three consequences:

• The accessibility of documentation makes interactive development easier. You can inspect functions, classes, and modules to see their documentation by using the help built-in function. This makes the Python interactive interpreter (the Python “shell”) and tools like IPython Notebook (http://ipython.org) a joy to use while you're developing algorithms, testing APIs, and writing code snippets.
• A standard way of defining documentation makes it easy to build tools that convert the text into more appealing formats (like HTML). This has led to excellent documentation-generation tools for the Python community, such as Sphinx (http://sphinx-doc.org). It's also enabled community-funded sites like Read the Docs (https://readthedocs.org) that provide free hosting of beautiful-looking documentation for open source Python projects.
- Python's first-class, accessible, and good-looking documentation encourages people to write more documentation. The members of the Python community have a strong belief in the importance of documentation. There's an assumption that "good code" also means well-documented code. This means that you can expect most open source Python libraries to have decent documentation.

To participate in this excellent culture of documentation, you need to follow a few guidelines when you write docstrings. The full details are discussed online in PEP 257 (http://www.python.org/dev/peps/pep-0257/). There are a few best practices you should be sure to follow.

Documenting Modules

Each module should have a top-level docstring. This is a string literal that is the first statement in a source file. It should use three double quotes ("""). The goal of this docstring is to introduce the module and its contents.

The first line of the docstring should be a single sentence describing the module's purpose. The paragraphs that follow should contain the details that all users of the module should know about its operation. The module docstring is also a jumping-off point where you can highlight important classes and functions found in the module.

Here's an example of a module docstring:

    # words.py
    #!/usr/bin/env python3
    """Library for testing words for various linguistic patterns.

    Testing how words relate to each other can be tricky sometimes!
    This module provides easy ways to determine when words you've
    found have special properties.

    Available functions:
    - palindrome: Determine if a word is a palindrome.
    - check_anagram: Determine if two words are anagrams.
    ...
    """

    # ...

If the module is a command-line utility, the module docstring is also a great place to put usage information for running the tool from the command-line.

Documenting Classes

Each class should have a class-level docstring. This largely follows the same pattern as the module-level docstring. The first line is the single-sentence purpose of the class. Paragraphs that follow discuss important details of the class's operation.
Important public attributes and methods of the class should be highlighted in the class-level docstring. It should also provide guidance to subclasses on how to properly interact with protected attributes (see Item 27: "Prefer Public Attributes Over Private Ones") and the superclass's methods.

Here's an example of a class docstring:

    class Player(object):
        """Represents a player of the game.

        Subclasses may override the 'tick' method to provide
        custom animations for the player's movement depending
        on their power level, etc.

        Public attributes:
        - power: Unused power-ups (float between 0 and 1).
        - coins: Coins found during the level (integer).
        """

        # ...

Documenting Functions

Each public function and method should have a docstring. This follows the same pattern as modules and classes. The first line is the single-sentence description of what the function does. The paragraphs that follow should describe any specific behaviors and the arguments for the function. Any return values should be mentioned. Any exceptions that callers must handle as part of the function's interface should be explained.

Here's an example of a function docstring:

    def find_anagrams(word, dictionary):
        """Find all anagrams for a word.

        This function only runs as fast as the test for
        membership in the 'dictionary' container. It will
        be slow if the dictionary is a list and fast if
        it's a set.

        Args:
            word: String of the target word.
            dictionary: Container with all strings that
                are known to be actual words.

        Returns:
            List of anagrams that were found. Empty if
            none were found.
        """

        # ...

There are also some special cases in writing docstrings for functions that are important to know.

- If your function has no arguments and a simple return value, a single sentence description is probably good enough.
- If your function doesn't return anything, it's better to leave out any mention of the return value instead of saying "returns None."
- If you don't expect your function to raise an exception during normal operation, don't mention that fact.
- If your function accepts a variable number of arguments (see Item 18: "Reduce Visual Noise with Variable Positional Arguments") or keyword-arguments (see Item 19: "Provide Optional Behavior with Keyword Arguments"), use *args and **kwargs in the documented list of arguments to describe their purpose.
- If your function has arguments with default values, those defaults should be mentioned (see Item 20: "Use None and Docstrings to Specify Dynamic Default Arguments").
- If your function is a generator (see Item 16: "Consider Generators Instead of Returning Lists"), then your docstring should describe what the generator yields when it's iterated.
- If your function is a coroutine (see Item 40: "Consider Coroutines to Run Many Functions Concurrently"), then your docstring should contain what the coroutine yields, what it expects to receive from yield expressions, and when it will stop iteration.

Note
Once you've written docstrings for your modules, it's important to keep the documentation up to date. The doctest built-in module makes it easy to exercise usage examples embedded in docstrings to ensure that your source code and its documentation don't diverge over time.

Things to Remember

- Write documentation for every module, class, and function using docstrings. Keep them up to date as your code changes.
- For modules: Introduce the contents of the module and any important classes or functions all users should know about.
- For classes: Document behavior, important attributes, and subclass behavior in the docstring following the class statement.
- For functions and methods: Document every argument, returned value, raised exception, and other behaviors in the docstring following the def statement.

Item 50: Use Packages to Organize Modules and Provide Stable APIs

As the size of a program's codebase grows, it's natural for you to reorganize its structure. You split larger functions into smaller functions. You refactor data structures into helper classes (see Item 22: "Prefer Helper Classes Over Bookkeeping with Dictionaries and Tuples"). You separate functionality into various modules that depend on each other.

At some point, you'll find yourself with so many modules that you need another layer in your program to make it understandable. For this purpose, Python provides packages.
Packages are modules that contain other modules.

In most cases, packages are defined by putting an empty file named __init__.py into a directory. Once __init__.py is present, any other Python files in that directory will be available for import using a path relative to the directory. For example, imagine that you have the following directory structure in your program.

    main.py
    mypackage/__init__.py
    mypackage/models.py
    mypackage/utils.py

To import the utils module, you use the absolute module name that includes the package directory's name.

    # main.py
    from mypackage import utils

This pattern continues when you have package directories present within other packages (like mypackage.foo.bar).

Note
Python 3.3 introduced namespace packages, a more flexible way to define packages. Namespace packages can be composed of modules from completely separate directories, zip archives, or even remote systems. For details on how to use the advanced features of namespace packages, see PEP 420 (http://www.python.org/dev/peps/pep-0420/).

The functionality provided by packages has two primary purposes in Python programs.

Namespaces

The first use of packages is to help divide your modules into separate namespaces. This allows you to have many modules with the same filename but different absolute paths that are unique. For example, here's a program that imports attributes from two modules with the same name, utils.py. This works because the modules can be addressed by their absolute paths.

    # main.py
    from analysis.utils import log_base2_bucket
    from frontend.utils import stringify

    bucket = stringify(log_base2_bucket(33))

This approach breaks down when the functions, classes, or submodules defined in packages have the same names. For example, say you want to use the inspect function from both the analysis.utils and frontend.utils modules. Importing the attributes directly won't work because the second import statement will overwrite the value of inspect in the current scope.

    # main2.py
    from analysis.utils import inspect
    from frontend.utils import inspect  # Overwrites!

The solution is to use the as clause of the import statement to rename whatever you've imported for the current scope.
    # main3.py
    from analysis.utils import inspect as analysis_inspect
    from frontend.utils import inspect as frontend_inspect

    value = 33
    if analysis_inspect(value) == frontend_inspect(value):
        print('Inspection equal!')

The as clause can be used to rename anything you retrieve with the import statement, including entire modules. This makes it easy to access namespaced code and make its identity clear when you use it.

Note
Another approach for avoiding imported name conflicts is to always access names by their highest unique module name.
For the example above, you'd first import analysis.utils and import frontend.utils. Then, you'd access the inspect functions with the full paths of analysis.utils.inspect and frontend.utils.inspect.
This approach allows you to avoid the as clause altogether. It also makes it abundantly clear to new readers of the code where each function is defined.

Stable APIs

The second use of packages in Python is to provide strict, stable APIs for external consumers.

When you're writing an API for wider consumption, like an open source package (see Item 48: "Know Where to Find Community-Built Modules"), you'll want to provide stable functionality that doesn't change between releases. To ensure that happens, it's important to hide your internal code organization from external users. This enables you to refactor and improve your package's internal modules without breaking existing users.

Python can limit the surface area exposed to API consumers by using the __all__ special attribute of a module or package. The value of __all__ is a list of every name to export from the module as part of its public API. When consuming code does from foo import *, only the attributes in foo.__all__ will be imported from foo. If __all__ isn't present in foo, then only public attributes, those without a leading underscore, are imported (see Item 27: "Prefer Public Attributes Over Private Ones").
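The two wildcard-import rules described above can be checked with a small sketch. It builds two throwaway modules in memory and reports which names a wildcard import brings in. The module names with_all and without_all and the helpers make_module and star_import are hypothetical; they exist only for this illustration and are not part of any library.

```python
import sys
import types

# Hypothetical module body shared by both throwaway modules.
SOURCE = '''
def public():
    return 1

def other():
    return 2

def _hidden():
    return 3
'''


def make_module(name, source):
    """Create an in-memory module and register it so import can find it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


def star_import(module_name):
    """Return the names that 'from <module_name> import *' defines."""
    namespace = {}
    exec('from %s import *' % module_name, namespace)
    # exec adds __builtins__ to the namespace; skip it.
    return sorted(n for n in namespace if n != '__builtins__')


# One module declares __all__, the other relies on the underscore rule.
make_module('with_all', "__all__ = ['public']\n" + SOURCE)
make_module('without_all', SOURCE)

print(star_import('with_all'))     # ['public']
print(star_import('without_all'))  # ['other', 'public']
```

With __all__ present, only the listed name is exported. Without it, every name that lacks a leading underscore is exported, which is why other appears in the second result.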
For example, say you want to provide a package for calculating collisions between moving projectiles. Here, I define the models module of mypackage to contain the representation of projectiles:

    # models.py
    __all__ = ['Projectile']

    class Projectile(object):
        def __init__(self, mass, velocity):
            self.mass = mass
            self.velocity = velocity

I also define a utils module in mypackage to perform operations on the Projectile instances, such as simulating collisions between them.

    # utils.py
    from .models import Projectile

    __all__ = ['simulate_collision']

    def _dot_product(a, b):
        # ...

    def simulate_collision(a, b):
        # ...

Now, I'd like to provide all of the public parts of this API as a set of attributes that are available on the mypackage module. This will allow downstream consumers to always import directly from mypackage instead of importing from mypackage.models or mypackage.utils. This ensures that the API consumer's code will continue to work even if the internal organization of mypackage changes (e.g., models.py is deleted).

To do this with Python packages, you need to modify the __init__.py file in the mypackage directory. This file actually becomes the contents of the mypackage module when it's imported. Thus, you can specify an explicit API for mypackage by limiting what you import into __init__.py. Since all of my internal modules already specify __all__, I can expose the public interface of mypackage by simply importing everything from the internal modules and updating __all__ accordingly.

    # __init__.py
    __all__ = []
    from .models import *
    __all__ += models.__all__
    from .utils import *
    __all__ += utils.__all__

Here's a consumer of the API that directly imports from mypackage instead of accessing the inner modules:

    # api_consumer.py
    from mypackage import *

    a = Projectile(1.5, 3)
    b = Projectile(4, 1.7)
    after_a, after_b = simulate_collision(a, b)

Notably, internal-only functions like mypackage.utils._dot_product will not be available to the API consumer on mypackage because they weren't present in __all__. Being omitted from __all__ means they weren't imported by the from mypackage import * statement. The internal-only names are effectively hidden.
This whole approach works great when it's important to provide an explicit, stable API. However, if you're building an API for use between your own modules, the functionality of __all__ is probably unnecessary and should be avoided. The namespacing provided by packages is usually enough for a team of programmers to collaborate on large amounts of code they control while maintaining reasonable interface boundaries.

Beware of import *

Import statements like from x import y are clear because the source of y is explicitly the x package or module. Wildcard imports like from foo import * can also be useful, especially in interactive Python sessions. However, wildcards make code more difficult to understand.

- from foo import * hides the source of names from new readers of the code. If a module has multiple import * statements, you'll need to check all of the referenced modules to figure out where a name was defined.
- Names from import * statements will overwrite any conflicting names within the containing module. This can lead to strange bugs caused by accidental interactions between your code and overlapping names from multiple import * statements.

The safest approach is to avoid import * in your code and explicitly import names with the from x import y style.

Things to Remember

- Packages in Python are modules that contain other modules. Packages allow you to organize your code into separate, non-conflicting namespaces with unique absolute module names.
- Simple packages are defined by adding an __init__.py file to a directory that contains other source files. These files become the child modules of the directory's package. Package directories may also contain other packages.
- You can provide an explicit API for a module by listing its publicly visible names in its __all__ special attribute.
- You can hide a package's internal implementation by only importing public names in the package's __init__.py file or by naming internal-only members with a leading underscore.
- When collaborating within a single team or on a single codebase, using __all__ for explicit APIs is probably unnecessary.
Item 51: Define a Root Exception to Insulate Callers from APIs

When you're defining a module's API, the exceptions you throw are just as much a part of your interface as the functions and classes you define (see Item 14: "Prefer Exceptions to Returning None").

Python has a built-in hierarchy of exceptions for the language and standard library. There's a draw to using the built-in exception types for reporting errors instead of defining your own new types. For example, you could raise a ValueError exception whenever an invalid parameter is passed to your function.

    def determine_weight(volume, density):
        if density <= 0:
            raise ValueError('Density must be positive')
        # ...

In some cases, using ValueError makes sense, but for APIs it's much more powerful to define your own hierarchy of exceptions. You can do this by providing a root Exception in your module. Then, have all other exceptions raised by that module inherit from the root exception.

    # my_module.py
    class Error(Exception):
        """Base-class for all exceptions raised by this module."""

    class InvalidDensityError(Error):
        """There was a problem with a provided density value."""

Having a root exception in a module makes it easy for consumers of your API to catch all of the exceptions that you raise on purpose. For example, here a consumer of your API makes a function call with a try/except statement that catches your root exception:

    try:
        weight = my_module.determine_weight(1, -1)
    except my_module.Error as e:
        logging.error('Unexpected error: %s', e)

This try/except prevents your API's exceptions from propagating too far upward and breaking the calling program. It insulates the calling code from your API. This insulation has three helpful effects.

First, root exceptions let callers understand when there's a problem with their usage of your API. If callers are using your API properly, they should catch the various exceptions that you deliberately raise. If they don't handle such an exception, it will propagate all the way up to the insulating except block that catches your module's root exception. That block can bring the exception to the attention of the API consumer, giving them a chance to add proper handling of the exception type.
    try:
        weight = my_module.determine_weight(1, -1)
    except my_module.InvalidDensityError:
        weight = 0
    except my_module.Error as e:
        logging.error('Bug in the calling code: %s', e)

The second advantage of using root exceptions is that they can help find bugs in your API module's code. If your code only deliberately raises exceptions that you define within your module's hierarchy, then all other types of exceptions raised by your module must be the ones that you didn't intend to raise. These are bugs in your API's code.

Using the try/except statement above will not insulate API consumers from bugs in your API module's code. To do that, the caller needs to add another except block that catches Python's base Exception class. This allows the API consumer to detect when there's a bug in the API module's implementation that needs to be fixed.

    try:
        weight = my_module.determine_weight(1, -1)
    except my_module.InvalidDensityError:
        weight = 0
    except my_module.Error as e:
        logging.error('Bug in the calling code: %s', e)
    except Exception as e:
        logging.error('Bug in the API code: %s', e)
        raise

The third impact of using root exceptions is future-proofing your API. Over time, you may want to expand your API to provide more specific exceptions in certain situations. For example, you could add an Exception subclass that indicates the error condition of supplying negative densities.

    # my_module.py
    class NegativeDensityError(InvalidDensityError):
        """A provided density value was negative."""

    def determine_weight(volume, density):
        if density < 0:
            raise NegativeDensityError

The calling code will continue to work exactly as before because it already catches InvalidDensityError exceptions (the parent class of NegativeDensityError). In the future, the caller could decide to special-case the new type of exception and change its behavior accordingly.
    try:
        weight = my_module.determine_weight(1, -1)
    except my_module.NegativeDensityError as e:
        raise ValueError('Must supply non-negative density') from e
    except my_module.InvalidDensityError:
        weight = 0
    except my_module.Error as e:
        logging.error('Bug in the calling code: %s', e)
    except Exception as e:
        logging.error('Bug in the API code: %s', e)
        raise

You can take API future-proofing further by providing a broader set of exceptions directly below the root exception. For example, imagine you had one set of errors related to calculating weights, another related to calculating volume, and a third related to calculating density.

    # my_module.py
    class WeightError(Error):
        """Base-class for weight calculation errors."""

    class VolumeError(Error):
        """Base-class for volume calculation errors."""

    class DensityError(Error):
        """Base-class for density calculation errors."""

Specific exceptions would inherit from these general exceptions. Each intermediate exception acts as its own kind of root exception. This makes it easier to insulate layers of calling code from API code based on broad functionality. This is much better than having all callers catch a long list of very specific Exception subclasses.

Things to Remember

- Defining root exceptions for your modules allows API consumers to insulate themselves from your API.
- Catching root exceptions can help you find bugs in code that consumes an API.
- Catching the Python Exception base class can help you find bugs in API implementations.
- Intermediate root exceptions let you add more specific types of exceptions in the future without breaking your API consumers.

Item 52: Know How to Break Circular Dependencies

Inevitably, while you're collaborating with others, you'll find a mutual interdependency between modules. It can even happen while you work by yourself on the various parts of a single program.

For example, say you want your GUI application to show a dialog box for choosing where to save a document. The data displayed by the dialog could be specified through arguments to your event handlers. But the dialog also needs to read global state, like user preferences, to know how to render properly.
Here, I define a dialog that retrieves the default document save location from global preferences:

    # dialog.py
    import app

    class Dialog(object):
        def __init__(self, save_dir):
            self.save_dir = save_dir
        # ...

    save_dialog = Dialog(app.prefs.get('save_dir'))

    def show():
        # ...

The problem is that the app module that contains the prefs object also imports the dialog class in order to show the dialog on program start.

    # app.py
    import dialog

    class Prefs(object):
        # ...
        def get(self, name):
            # ...

    prefs = Prefs()
    dialog.show()

It's a circular dependency. If you try to use the app module from your main program, you'll get an exception when you import it.

    Traceback (most recent call last):
      File "main.py", line 4, in <module>
        import app
      File "app.py", line 4, in <module>
        import dialog
      File "dialog.py", line 16, in <module>
        save_dialog = Dialog(app.prefs.get('save_dir'))
    AttributeError: 'module' object has no attribute 'prefs'

To understand what's happening here, you need to know the details of Python's import machinery. When a module is imported, here's what Python actually does in depth-first order:

1. Searches for your module in locations from sys.path
2. Loads the code from the module and ensures that it compiles
3. Creates a corresponding empty module object
4. Inserts the module into sys.modules
5. Runs the code in the module object to define its contents

The problem with a circular dependency is that the attributes of a module aren't defined until the code for those attributes has executed (after step #5). But the module can be loaded with the import statement immediately after it's inserted into sys.modules (after step #4).

In the example above, the app module imports dialog before defining anything. Then, the dialog module imports app. Since app still hasn't finished running (it's currently importing dialog), the app module is just an empty shell (from step #4). The AttributeError is raised (during step #5 for dialog) because the code that defines prefs hasn't run yet (step #5 for app isn't complete).
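The failure described above can be reproduced in isolation with a short sketch. It writes two tiny modules to a temporary directory and imports one of them. The module names circ_app and circ_dialog and the helper import_circular_pair are hypothetical stand-ins for app and dialog, chosen so they cannot clash with real modules, and prefs is simplified to a plain dictionary.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-ins for app.py and dialog.py.
APP_SOURCE = (
    "import circ_dialog\n"              # Runs circ_dialog before prefs exists
    "prefs = {'save_dir': '/tmp'}\n"
)
DIALOG_SOURCE = (
    "import circ_app\n"                 # Gets the half-initialized module
    "save_dir = circ_app.prefs['save_dir']\n"
)


def import_circular_pair():
    """Import circ_app from a temp directory; return the error name, if any."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in [('circ_app', APP_SOURCE),
                             ('circ_dialog', DIALOG_SOURCE)]:
            with open(os.path.join(tmp, name + '.py'), 'w') as f:
                f.write(source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module('circ_app')
            return None
        except AttributeError as e:
            return type(e).__name__
        finally:
            # Undo the changes so the sketch leaves no trace behind.
            sys.path.remove(tmp)
            sys.modules.pop('circ_app', None)
            sys.modules.pop('circ_dialog', None)


print(import_circular_pair())  # AttributeError
```

The exact wording of the error message differs between Python versions, so the sketch reports only the exception type.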
The best solution to this problem is to refactor your code so that the prefs data structure is at the bottom of the dependency tree. Then, both app and dialog can import the same utility module and avoid any circular dependencies. But such a clear division isn't always possible or could require too much refactoring to be worth the effort.

There are three other ways to break circular dependencies.

Reordering Imports

The first approach is to change the order of imports. For example, if you import the dialog module toward the bottom of the app module, after its contents have run, the AttributeError goes away.

    # app.py
    class Prefs(object):
        # ...

    prefs = Prefs()

    import dialog  # Moved
    dialog.show()

This works because, when the dialog module is loaded late, its recursive import of app will find that app.prefs has already been defined (step #5 is mostly done for app).

Although this avoids the AttributeError, it goes against the PEP 8 style guide (see Item 2: "Follow the PEP 8 Style Guide"). The style guide suggests that you always put imports at the top of your Python files. This makes your module's dependencies clear to new readers of the code. It also ensures that any module you depend on is in scope and available to all the code in your module.

Having imports later in a file can be brittle and can cause small changes in the ordering of your code to break the module entirely. Thus, you should avoid import reordering to solve your circular dependency issues.

Import, Configure, Run

A second solution to the circular imports problem is to have your modules minimize side effects at import time. You have your modules only define functions, classes, and constants. You avoid actually running any functions at import time. Then, you have each module provide a configure function that you call once all other modules have finished importing. The purpose of configure is to prepare each module's state by accessing the attributes of other modules. You run configure after all modules have been imported (step #5 is complete), so all attributes must be defined.
Here, I redefine the dialog module to only access the prefs object when configure is called:

    # dialog.py
    import app

    class Dialog(object):
        # ...

    save_dialog = Dialog()

    def show():
        # ...

    def configure():
        save_dialog.save_dir = app.prefs.get('save_dir')

I also redefine the app module to not run any activities on import.

    # app.py
    import dialog

    class Prefs(object):
        # ...

    prefs = Prefs()

    def configure():
        # ...

Finally, the main module has three distinct phases of execution: import everything, configure everything, and run the first activity.

    # main.py
    import app
    import dialog

    app.configure()
    dialog.configure()

    dialog.show()

This works well in many situations and enables patterns like dependency injection. But sometimes it can be difficult to structure your code so that an explicit configure step is possible. Having two distinct phases within a module can also make your code harder to read because it separates the definition of objects from their configuration.

Dynamic Import

The third (and often simplest) solution to the circular imports problem is to use an import statement within a function or method. This is called a dynamic import because the module import happens while the program is running, not while the program is first starting up and initializing its modules.

Here, I redefine the dialog module to use a dynamic import. The dialog.show function imports the app module at runtime instead of the dialog module importing app at initialization time.

    # dialog.py
    class Dialog(object):
        # ...

    save_dialog = Dialog()

    def show():
        import app  # Dynamic import
        save_dialog.save_dir = app.prefs.get('save_dir')
        # ...

The app module can now be the same as it was in the original example. It imports dialog at the top and calls dialog.show at the bottom.

    # app.py
    import dialog

    class Prefs(object):
        # ...

    prefs = Prefs()
    dialog.show()

This approach has a similar effect to the import, configure, and run steps from before. The difference is that this requires no structural changes to the way the modules are defined and imported. You're simply delaying the circular import until the moment you must access the other module. At that point, you can be pretty sure that all other modules have already been initialized (step #5 is complete for everything).
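As a rough check of the dynamic import approach, the sketch below writes two small modules to a temporary directory and imports the first one. The module names dyn_app and dyn_dialog and the helper import_with_dynamic_import are hypothetical stand-ins for app and dialog, and prefs is simplified to a plain dictionary so that the example is self-contained.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-ins for app.py and dialog.py.
APP_SOURCE = (
    "import dyn_dialog\n"
    "prefs = {'save_dir': '/tmp/docs'}\n"
    "shown = dyn_dialog.show()\n"       # Called after prefs is defined
)
DIALOG_SOURCE = (
    "def show():\n"
    "    import dyn_app  # Dynamic import, runs only when show is called\n"
    "    return dyn_app.prefs['save_dir']\n"
)


def import_with_dynamic_import():
    """Import dyn_app from a temp directory and return what show produced."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in [('dyn_app', APP_SOURCE),
                             ('dyn_dialog', DIALOG_SOURCE)]:
            with open(os.path.join(tmp, name + '.py'), 'w') as f:
                f.write(source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            app = importlib.import_module('dyn_app')
            return app.shown
        finally:
            # Undo the changes so the sketch leaves no trace behind.
            sys.path.remove(tmp)
            sys.modules.pop('dyn_app', None)
            sys.modules.pop('dyn_dialog', None)


print(import_with_dynamic_import())  # /tmp/docs
```

The import succeeds because dyn_dialog no longer touches dyn_app while it is being initialized. By the time show runs, prefs already exists on the partially initialized dyn_app module.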
In general, it's good to avoid dynamic imports like this. The cost of the import statement is not negligible and can be especially bad in tight loops. By delaying execution, dynamic imports also set you up for surprising failures at runtime, such as SyntaxError exceptions long after your program has started running (see Item 56: "Test Everything with unittest" for how to avoid that). However, these downsides are often better than the alternative of restructuring your entire program.

Things to Remember

- Circular dependencies happen when two modules must call into each other at import time. They can cause your program to crash at startup.
- The best way to break a circular dependency is refactoring mutual dependencies into a separate module at the bottom of the dependency tree.
- Dynamic imports are the simplest solution for breaking a circular dependency between modules while minimizing refactoring and complexity.

Item 53: Use Virtual Environments for Isolated and Reproducible Dependencies

Building larger and more complex programs often leads you to rely on various packages from the Python community (see Item 48: "Know Where to Find Community-Built Modules"). You'll find yourself running pip to install packages like pytz, numpy, and many others.

The problem is that, by default, pip installs new packages in a global location. That causes all Python programs on your system to be affected by these installed modules. In theory, this shouldn't be an issue. If you install a package and never import it, how could it affect your programs?

The trouble comes from transitive dependencies: the packages that the packages you install depend on. For example, you can see what the Sphinx package depends on after installing it by asking pip.

    $ pip3 show Sphinx
    ---
    Name: Sphinx
    Version: 1.2.2
    Location: /usr/local/lib/python3.4/site-packages
    Requires: docutils, Jinja2, Pygments

If you install another package like flask, you can see that it, too, depends on the Jinja2 package.
    $ pip3 show flask
    ---
    Name: Flask
    Version: 0.10.1
    Location: /usr/local/lib/python3.4/site-packages
    Requires: Werkzeug, Jinja2, itsdangerous

The conflict arises as Sphinx and flask diverge over time. Perhaps right now they both require the same version of Jinja2 and everything is fine. But six months or a year from now, Jinja2 may release a new version that makes breaking changes to users of the library. If you update your global version of Jinja2 with pip install --upgrade, you may find that Sphinx breaks while flask keeps working.

The cause of this breakage is that Python can only have a single global version of a module installed at a time. If one of your installed packages must use the new version and another package must use the old version, your system isn't going to work properly.

Such breakage can even happen when package maintainers try their best to preserve API compatibility between releases (see Item 50: "Use Packages to Organize Modules and Provide Stable APIs"). New versions of a library can subtly change behaviors that API-consuming code relies on. Users on a system may upgrade one package to a new version but not others, which could break dependencies. There's a constant risk of the ground moving beneath your feet.

These difficulties are magnified when you collaborate with other developers who do their work on separate computers. It's reasonable to assume that the versions of Python and global packages they have installed on their machines will be slightly different than your own. This can cause frustrating situations where a codebase works perfectly on one programmer's machine and is completely broken on another's.

The solution to all of these problems is a tool called pyvenv, which provides virtual environments. Since Python 3.4, the pyvenv command-line tool is available by default along with the Python installation (it's also accessible with python -m venv). Prior versions of Python require installing a separate package (with pip install virtualenv) and using a command-line tool called virtualenv.
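The python -m venv spelling mentioned above is backed by the venv module in the standard library, which can also be called from Python code. The sketch below is a minimal illustration under stated assumptions: the helper create_environment and the directory name myproject are hypothetical, and with_pip=False is used only to keep the example quick, so the resulting environment has no pip installed. Note that recent Python releases no longer ship the pyvenv script, so on those versions python3 -m venv is the command to rely on.

```python
import os
import tempfile
import venv


def create_environment(parent_dir, name='myproject'):
    """Create a virtual environment under parent_dir and return its path."""
    env_dir = os.path.join(parent_dir, name)
    # with_pip=False skips bootstrapping pip, which keeps the sketch fast.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Build the environment in a throwaway directory and inspect the result.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_environment(tmp)
    has_config = os.path.isfile(os.path.join(env_dir, 'pyvenv.cfg'))
    print(has_config)
```

The pyvenv.cfg file checked at the end is the marker file that identifies a directory as a virtual environment.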
pyvenv allows you to create isolated versions of the Python environment. Using pyvenv, you can have many different versions of the same package installed on the same system at the same time without conflicts. This lets you work on many different projects and use many different tools on the same computer.

pyvenv does this by installing explicit versions of packages and their dependencies into completely separate directory structures. This makes it possible to reproduce a Python environment that you know will work with your code. It's a reliable way to avoid surprising breakages.

The pyvenv Command

Here's a quick tutorial on how to use pyvenv effectively. Before using the tool, it's important to note the meaning of the python3 command-line on your system. On my computer, python3 is located in the /usr/local/bin directory and evaluates to version 3.4.2 (see Item 1: "Know Which Version of Python You're Using").

    $ which python3
    /usr/local/bin/python3
    $ python3 --version
    Python 3.4.2

To demonstrate the setup of my environment, I can test that running a command to import the pytz module doesn't cause an error. This works because I already have the pytz package installed as a global module.

    $ python3 -c 'import pytz'
    $

Now, I use pyvenv to create a new virtual environment called myproject. Each virtual environment must live in its own unique directory. The result of the command is a tree of directories and files.

    $ pyvenv /tmp/myproject
    $ cd /tmp/myproject
    $ ls
    bin include lib pyvenv.cfg

To start using the virtual environment, I use the source command from my shell on the bin/activate script. activate modifies all of my environment variables to match the virtual environment. It also updates my command-line prompt to include the virtual environment name ('myproject') to make it extremely clear what I'm working on.

    $ source bin/activate
    (myproject)$

After activation, you can see that the path to the python3 command-line tool has moved to within the virtual environment directory.
    (myproject)$ which python3
    /tmp/myproject/bin/python3
    (myproject)$ ls -l /tmp/myproject/bin/python3
    ... -> /tmp/myproject/bin/python3.4
    (myproject)$ ls -l /tmp/myproject/bin/python3.4
    ... -> /usr/local/bin/python3.4

This ensures that changes to the outside system will not affect the virtual environment. Even if the outer system upgrades its default python3 to version 3.5, my virtual environment will still explicitly point to version 3.4.

The virtual environment I created with pyvenv starts with no packages installed except for pip and setuptools. Trying to use the pytz package that was installed as a global module in the outside system will fail because it's unknown to the virtual environment.

    (myproject)$ python3 -c 'import pytz'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ImportError: No module named 'pytz'

I can use pip to install the pytz module into my virtual environment.

    (myproject)$ pip3 install pytz

Once it's installed, I can verify that it's working with the same test import command.

    (myproject)$ python3 -c 'import pytz'
    (myproject)$

When you're done with a virtual environment and want to go back to your default system, you use the deactivate command. This restores your environment to the system defaults, including the location of the python3 command-line tool.

    (myproject)$ deactivate
    $ which python3
    /usr/local/bin/python3

If you ever want to work in the myproject environment again, you can just run source bin/activate in the directory like before.

Reproducing Dependencies

Once you have a virtual environment, you can continue installing packages with pip as you need them. Eventually, you may want to copy your environment somewhere else. For example, say you want to reproduce your development environment on a production server. Or maybe you want to clone someone else's environment on your own machine so you can run their code.

pyvenv makes these situations easy. You can use the pip freeze command to save all of your explicit package dependencies into a file. By convention, this file is named requirements.txt.
    (myproject)$ pip3 freeze > requirements.txt
    (myproject)$ cat requirements.txt
    numpy==1.8.2
    pytz==2014.4
    requests==2.3.0

Now, imagine that you'd like to have another virtual environment that matches the myproject environment. You can create a new directory like before using pyvenv and activate it.

    $ pyvenv /tmp/otherproject
    $ cd /tmp/otherproject
    $ source bin/activate
    (otherproject)$

The new environment will have no extra packages installed.

    (otherproject)$ pip3 list
    pip (1.5.6)
    setuptools (2.1)

You can install all of the packages from the first environment by running pip install on the requirements.txt that you generated with the pip freeze command.

    (otherproject)$ pip3 install -r /tmp/myproject/requirements.txt

This command will crank along for a little while as it retrieves and installs all of the packages required to reproduce the first environment. Once it's done, listing the set of installed packages in the second virtual environment will produce the same list of dependencies found in the first virtual environment.

    (otherproject)$ pip list
    numpy (1.8.2)
    pip (1.5.6)
    pytz (2014.4)
    requests (2.3.0)
    setuptools (2.1)

Using a requirements.txt file is ideal for collaborating with others through a revision control system. You can commit changes to your code at the same time you update your list of package dependencies, ensuring that they move in lockstep.

The gotcha with virtual environments is that moving them breaks everything because all of the paths, like python3, are hard-coded to the environment's install directory. But that doesn't matter. The whole purpose of virtual environments is to make it easy to reproduce the same setup. Instead of moving a virtual environment directory, just freeze the old one, create a new one somewhere else, and reinstall everything from the requirements.txt file.

Things to Remember

- Virtual environments allow you to use pip to install many different versions of the same package on the same machine without conflicts.
- Virtual environments are created with pyvenv, enabled with source bin/activate, and disabled with deactivate.
* You can dump all of the requirements of an environment with pip freeze. You can reproduce the environment by supplying the requirements.txt file to pip install -r.
* In versions of Python before 3.4, the pyvenv tool must be downloaded and installed separately. The command-line tool is called virtualenv instead of pyvenv.

8. Production

Putting a Python program to use requires moving it from a development environment to a production environment. Supporting disparate configurations like this can be a challenge. Making programs that are dependable in multiple situations is just as important as making programs with correct functionality.

The goal is to productionize your Python programs and make them bulletproof while they’re in use. Python has built-in modules that aid in hardening your programs. It provides facilities for debugging, optimizing, and testing to maximize the quality and performance of your programs at runtime.

Item 54: Consider Module-Scoped Code to Configure Deployment Environments

A deployment environment is a configuration in which your program runs. Every program has at least one deployment environment, the production environment. The goal of writing a program in the first place is to put it to work in the production environment and achieve some kind of outcome.

Writing or modifying a program requires being able to run it on the computer you use for developing. The configuration of your development environment may be much different from your production environment. For example, you may be writing a program for supercomputers using a Linux workstation.

Tools like pyvenv (see Item 53: “Use Virtual Environments for Isolated and Reproducible Dependencies”) make it easy to ensure that all environments have the same Python packages installed. The trouble is that production environments often require many external assumptions that are hard to reproduce in development environments.

For example, say you want to run your program in a web server container and give it access to a database. This means that every time you want to modify your program’s code, you need to run a server container, the database must be set up properly, and your program needs the password for access. That’s a very high cost if all you’re trying to do is verify that a one-line change to your program works correctly.
The best way to work around these issues is to override parts of your program at startup time to provide different functionality depending on the deployment environment. For example, you could have two different __main__ files, one for production and one for development.

# dev_main.py
TESTING = True
import db_connection
db = db_connection.Database()

# prod_main.py
TESTING = False
import db_connection
db = db_connection.Database()

The only difference between the two files is the value of the TESTING constant. Other modules in your program can then import the __main__ module and use the value of TESTING to decide how they define their own attributes.

# db_connection.py
import __main__

class TestingDatabase(object):
    # …

class RealDatabase(object):
    # …

if __main__.TESTING:
    Database = TestingDatabase
else:
    Database = RealDatabase

The key behavior to notice here is that code running in module scope (not inside any function or method) is just normal Python code. You can use an if statement at the module level to decide how the module will define names. This makes it easy to tailor modules to your various deployment environments. You can avoid having to reproduce costly assumptions like database configurations when they aren’t needed. You can inject fake or mock implementations that ease interactive development and testing (see Item 56: “Test Everything with unittest”).

Note
Once your deployment environments get complicated, you should consider moving them out of Python constants (like TESTING) and into dedicated configuration files. Tools like the configparser built-in module let you maintain production configurations separate from code, a distinction that’s crucial for collaborating with an operations team.

This approach can be used for more than working around external assumptions. For example, if you know that your program must work differently based on its host platform, you can inspect the sys module before defining top-level constructs in a module.
# db_connection.py
import sys

class Win32Database(object):
    # …

class PosixDatabase(object):
    # …

if sys.platform.startswith('win32'):
    Database = Win32Database
else:
    Database = PosixDatabase

Similarly, you can use environment variables from os.environ to guide your module definitions.

Things to Remember
* Programs often need to run in multiple deployment environments that each have unique assumptions and configurations.
* You can tailor a module’s contents to different deployment environments by using normal Python statements in module scope.
* Module contents can be the product of any external condition, including host introspection through the sys and os modules.

Item 55: Use repr Strings for Debugging Output

When debugging a Python program, the print function (or output via the logging built-in module) will get you surprisingly far. Python internals are often easy to access via plain attributes (see Item 27: “Prefer Public Attributes Over Private Ones”). All you need to do is print how the state of your program changes while it runs and see where it goes wrong.

The print function outputs a human-readable string version of whatever you supply it. For example, printing a basic string will print the contents of the string without the surrounding quote characters.

print('foo bar')
>>>
foo bar

This is equivalent to using the '%s' format string and the % operator.

print('%s' % 'foo bar')
>>>
foo bar

The problem is that the human-readable string for a value doesn’t make it clear what the actual type of the value is. For example, notice how in the default output of print you can’t distinguish between the types of the number 5 and the string '5'.

print(5)
print('5')
>>>
5
5

If you’re debugging a program with print, these type differences matter. What you almost always want while debugging is to see the repr version of an object. The repr built-in function returns the printable representation of an object, which should be its most clearly understandable string representation. For built-in types, the string returned by repr is a valid Python expression.
a = '\x07'
print(repr(a))
>>>
'\x07'

Passing the value from repr to the eval built-in function should result in the same Python object you started with (of course, in practice, you should only use eval with extreme caution).

b = eval(repr(a))
assert a == b

When you’re debugging with print, you should repr the value before printing to ensure that any difference in types is clear.

print(repr(5))
print(repr('5'))
>>>
5
'5'

This is equivalent to using the '%r' format string and the % operator.

print('%r' % 5)
print('%r' % '5')
>>>
5
'5'

For dynamic Python objects, the default human-readable string value is the same as the repr value. This means that passing a dynamic object to print will do the right thing, and you don’t need to explicitly call repr on it. Unfortunately, the default value of repr for object instances isn’t especially helpful. For example, here I define a simple class and then print its value:

class OpaqueClass(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

obj = OpaqueClass(1, 2)
print(obj)
>>>
<__main__.OpaqueClass object at 0x107880ba8>

This output can’t be passed to the eval function, and it says nothing about the instance fields of the object.

There are two solutions to this problem. If you have control of the class, you can define your own __repr__ special method that returns a string containing the Python expression that recreates the object. Here, I define that function for the class above:

class BetterClass(object):
    def __init__(self, x, y):
        # …
    def __repr__(self):
        return 'BetterClass(%d, %d)' % (self.x, self.y)

Now, the repr value is much more useful.

obj = BetterClass(1, 2)
print(obj)
>>>
BetterClass(1, 2)

When you don’t have control over the class definition, you can reach into the object’s instance dictionary, which is stored in the __dict__ attribute. Here, I print out the contents of an OpaqueClass instance:

obj = OpaqueClass(4, 5)
print(obj.__dict__)
>>>
{'y': 5, 'x': 4}

Things to Remember
* Calling print on built-in Python types will produce the human-readable string version of a value, which hides type information.
* Calling repr on built-in Python types will produce the printable string version of a value. These repr strings could be passed to the eval built-in function to get back the original value.
* %s in format strings will produce human-readable strings like str. %r will produce printable strings like repr.
* You can define the __repr__ method to customize the printable representation of a class and provide more detailed debugging information.
* You can reach into any object’s __dict__ attribute to view its internals.

Item 56: Test Everything with unittest

Python doesn’t have static type checking. There’s nothing in the compiler that will ensure that your program will work when you run it. With Python you don’t know whether the functions your program calls will be defined at runtime, even when their existence is evident in the source code. This dynamic behavior is a blessing and a curse.

The large numbers of Python programmers out there say it’s worth it because of the productivity gained from the resulting brevity and simplicity. But most people have heard at least one horror story about Python in which a program encountered a boneheaded error at runtime.

One of the worst examples I’ve heard is when a SyntaxError was raised in production as a side effect of a dynamic import (see Item 52: “Know How to Break Circular Dependencies”). The programmer I know who was hit by this surprising occurrence has since ruled out using Python ever again.

But I have to wonder, why wasn’t the code tested before the program was deployed to production? Type safety isn’t everything. You should always test your code, regardless of what language it’s written in. However, I’ll admit that the big difference between Python and many other languages is that the only way to have any confidence in a Python program is by writing tests. There is no veil of static type checking to make you feel safe.

Luckily, the same dynamic features that prevent static type checking in Python also make it extremely easy to write tests for your code. You can use Python’s dynamic nature and easily overridable behaviors to implement tests and ensure that your programs work as expected.
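As a small illustration of why tests matter in a dynamic language, the sketch below defines a function that calls a helper that doesn’t exist. The names total_price and apply_discount are hypothetical and not from the text above. The code loads without complaint; the mistake only surfaces when the function actually runs, which is what a test forces to happen.

```python
def total_price(prices):
    # Hypothetical function: apply_discount is intentionally never defined.
    # Python compiles this body without checking that the name exists.
    return apply_discount(sum(prices))


def check_total_price():
    # Calling the function is the only way to discover the missing name.
    try:
        total_price([1, 2, 3])
    except NameError as e:
        return 'caught: %s' % e
    return 'no error'


result = check_total_price()
print(result)
```

A test that simply calls total_price would have exposed the problem long before deployment.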
You should think of tests as an insurance policy on your code. Good tests give you confidence that your code is correct. If you refactor or expand your code, tests make it easy to identify how behaviors have changed. It sounds counter-intuitive, but having good tests actually makes it easier to modify Python code, not harder.

The simplest way to write tests is to use the unittest built-in module. For example, say you have the following utility function defined in utils.py:

# utils.py
def to_str(data):
    if isinstance(data, str):
        return data
    elif isinstance(data, bytes):
        return data.decode('utf-8')
    else:
        raise TypeError('Must supply str or bytes, '
                        'found: %r' % data)

To define tests, I create a second file named test_utils.py or utils_test.py that contains tests for each behavior I expect.

# utils_test.py
from unittest import TestCase, main
from utils import to_str

class UtilsTestCase(TestCase):
    def test_to_str_bytes(self):
        self.assertEqual('hello', to_str(b'hello'))

    def test_to_str_str(self):
        self.assertEqual('hello', to_str('hello'))

    def test_to_str_bad(self):
        self.assertRaises(TypeError, to_str, object())

if __name__ == '__main__':
    main()

Tests are organized into TestCase classes. Each test is a method beginning with the word test. If a test method runs without raising any kind of Exception (including AssertionError from assert statements), then the test is considered to have passed successfully.

The TestCase class provides helper methods for making assertions in your tests, such as assertEqual for verifying equality, assertTrue for verifying Boolean expressions, and assertRaises for verifying that exceptions are raised when appropriate (see help(TestCase) for more). You can define your own helper methods in TestCase subclasses to make your tests more readable; just ensure that your method names don’t begin with the word test.

Note
Another common practice when writing tests is to use mock functions and classes to stub out certain behaviors. For this purpose, Python 3 provides the unittest.mock built-in module, which is also available for Python 2 as an open source package.
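A minimal sketch of stubbing with unittest.mock might look like the following. The fetch_greeting function and its get_name collaborator are hypothetical, invented for illustration; only Mock, return_value, and assert_called_once_with come from the standard library. The tests are run through a TextTestRunner here so the example works outside of a dedicated test file.

```python
from unittest import TestCase, TestLoader, TextTestRunner
from unittest.mock import Mock


def fetch_greeting(get_name, user_id):
    # Hypothetical function under test: get_name stands in for a slow
    # dependency such as a database lookup.
    return 'Hello, %s!' % get_name(user_id)


class GreetingTestCase(TestCase):
    def test_uses_lookup_result(self):
        fake_get_name = Mock(return_value='Ada')  # Stub, no real lookup
        self.assertEqual('Hello, Ada!', fetch_greeting(fake_get_name, 42))
        # The mock records how it was called, so the test can verify it.
        fake_get_name.assert_called_once_with(42)


suite = TestLoader().loadTestsFromTestCase(GreetingTestCase)
result = TextTestRunner(verbosity=0).run(suite)
```

Passing the dependency in as an argument is one design choice among several; it keeps the stub local to the test and avoids patching module globals.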
Sometimes, your TestCase classes need to set up the test environment before running test methods. To do this, you can override the setUp and tearDown methods. These methods are called before and after each test method, respectively, and they let you ensure that each test runs in isolation (an important best practice of proper testing). For example, here I define a TestCase that creates a temporary directory before each test and deletes its contents after each test finishes:

from tempfile import TemporaryDirectory

class MyTest(TestCase):
    def setUp(self):
        self.test_dir = TemporaryDirectory()

    def tearDown(self):
        self.test_dir.cleanup()

    # Test methods follow
    # …

I usually define one TestCase for each set of related tests. Sometimes I have one TestCase for each function that has many edge cases. Other times, a TestCase spans all functions in a single module. I’ll also create one TestCase for testing a single class and all of its methods.

When programs get complicated, you’ll want additional tests for verifying the interactions between your modules, instead of only testing code in isolation. This is the difference between unit tests and integration tests. In Python, it’s important to write both types of tests for exactly the same reason: You have no guarantee that your modules will actually work together unless you prove it.

Note
Depending on your project, it can also be useful to define data-driven tests or organize tests into different suites of related functionality. For these purposes, code coverage reports, and other advanced use cases, the nose (http://nose.readthedocs.org/) and pytest (http://pytest.org/) open source packages can be especially helpful.

Things to Remember
* The only way to have confidence in a Python program is to write tests.
* The unittest built-in module provides most of the facilities you’ll need to write good tests.
* You can define tests by subclassing TestCase and defining one method per behavior you’d like to test. Test methods on TestCase classes must start with the word test.
* It’s important to write both unit tests (for isolated functionality) and integration tests (for modules that interact).
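The setUp and tearDown pattern from this item can be made fully runnable with the tempfile built-in module. This is only a sketch: the class name TempDirTestCase, the file name data.txt, and the test bodies are assumptions made for illustration.

```python
import os
from tempfile import TemporaryDirectory
from unittest import TestCase, TestLoader, TextTestRunner


class TempDirTestCase(TestCase):
    def setUp(self):
        # Runs before every test method: each test gets a fresh directory.
        self.test_dir = TemporaryDirectory()

    def tearDown(self):
        # Runs after every test method, even if the test failed.
        self.test_dir.cleanup()

    def test_write_file(self):
        path = os.path.join(self.test_dir.name, 'data.txt')
        with open(path, 'w') as f:
            f.write('hello')
        self.assertTrue(os.path.exists(path))

    def test_starts_empty(self):
        # Passes regardless of test order because of the isolation above.
        self.assertEqual([], os.listdir(self.test_dir.name))


suite = TestLoader().loadTestsFromTestCase(TempDirTestCase)
result = TextTestRunner(verbosity=0).run(suite)
```

Because each test method receives its own directory, the file written by one test can never leak into the other.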
Item 57: Consider Interactive Debugging with pdb

Everyone encounters bugs in their code while developing programs. Using the print function can help you track down the source of many issues (see Item 55: “Use repr Strings for Debugging Output”). Writing tests for specific cases that cause trouble is another great way to isolate problems (see Item 56: “Test Everything with unittest”).

But these tools aren’t enough to find every root cause. When you need something more powerful, it’s time to try Python’s built-in interactive debugger. The debugger lets you inspect program state, print local variables, and step through a Python program one statement at a time.

In most other programming languages, you use a debugger by specifying what line of a source file you’d like to stop on, then execute the program. In contrast, with Python the easiest way to use the debugger is by modifying your program to directly initiate the debugger just before you think you’ll have an issue worth investigating. There is no difference between running a Python program under a debugger and running it normally.

To initiate the debugger, all you have to do is import the pdb built-in module and run its set_trace function. You’ll often see this done in a single line so programmers can comment it out with a single # character.

def complex_func(a, b, c):
    # …
    import pdb; pdb.set_trace()

As soon as this statement runs, the program will pause its execution. The terminal that started your program will turn into an interactive Python shell.

-> import pdb; pdb.set_trace()
(Pdb)

At the (Pdb) prompt, you can type in the name of local variables to see their values printed out. You can see a list of all local variables by calling the locals built-in function. You can import modules, inspect global state, construct new objects, run the help built-in function, and even modify parts of the program, whatever you need to do to aid in your debugging. In addition, the debugger has three commands that make inspecting the running program easier.

* bt: Print the traceback of the current execution call stack. This lets you figure out where you are in your program and how you arrived at the pdb.set_trace trigger point.
* up: Move your scope up the function call stack to the caller of the current function.
  This allows you to inspect the local variables in higher levels of the call stack.
* down: Move your scope back down the function call stack one level.

Once you’re done inspecting the current state, you can use debugger commands to resume the program’s execution under precise control.

* step: Run the program until the next line of execution in the program, then return control back to the debugger. If the next line of execution includes calling a function, the debugger will stop in the function that was called.
* next: Run the program until the next line of execution in the current function, then return control back to the debugger. If the next line of execution includes calling a function, the debugger will not stop until the called function has returned.
* return: Run the program until the current function returns, then return control back to the debugger.
* continue: Continue running the program until the next breakpoint (or set_trace is called again).

Things to Remember
* You can initiate the Python interactive debugger at a point of interest directly in your program with the import pdb; pdb.set_trace() statements.
* The Python debugger prompt is a full Python shell that lets you inspect and modify the state of a running program.
* pdb shell commands let you precisely control program execution, allowing you to alternate between inspecting program state and progressing program execution.

Item 58: Profile Before Optimizing

The dynamic nature of Python causes surprising behaviors in its runtime performance. Operations you might assume are slow are actually very fast (string manipulation, generators). Language features you might assume are fast are actually very slow (attribute access, function calls). The true source of slowdowns in a Python program can be obscure.

The best approach is to ignore your intuition and directly measure the performance of a program before you try to optimize it. Python provides a built-in profiler for determining which parts of a program are responsible for its execution time. This lets you focus your optimization efforts on the biggest sources of trouble and ignore parts of the program that don’t impact speed.
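As a minimal, self-contained sketch of measuring before optimizing, the following profiles a deliberately wasteful function with the cProfile and pstats built-in modules. The slow_sum function and its workload size are hypothetical; timings differ from machine to machine, so none are shown.

```python
import io
from cProfile import Profile
from pstats import Stats


def slow_sum(n):
    # Hypothetical workload: builds a throwaway list just to add it up.
    values = []
    for i in range(n):
        values.append(i * i)
    return sum(values)


profiler = Profile()
total = profiler.runcall(slow_sum, 10000)  # Returns slow_sum's result

# Send the report to a string buffer instead of stdout so it can be
# inspected or logged.
buffer = io.StringIO()
stats = Stats(profiler, stream=buffer)
stats.strip_dirs()
stats.sort_stats('cumulative')
stats.print_stats()
report = buffer.getvalue()
print(report)
```

The report lists slow_sum alongside the built-in calls it makes, which is usually enough to see where the time goes before changing any code.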
For example, say you want to determine why an algorithm in your program is slow. Here, I define a function that sorts a list of data using an insertion sort:

def insertion_sort(data):
    result = []
    for value in data:
        insert_value(result, value)
    return result

The core mechanism of the insertion sort is the function that finds the insertion point for each piece of data. Here, I define an extremely inefficient version of the insert_value function that does a linear scan over the input array:

def insert_value(array, value):
    for i, existing in enumerate(array):
        if existing > value:
            array.insert(i, value)
            return
    array.append(value)

To profile insertion_sort and insert_value, I create a data set of random numbers and define a test function to pass to the profiler.

from random import randint

max_size = 10**4
data = [randint(0, max_size) for _ in range(max_size)]
test = lambda: insertion_sort(data)

Python provides two built-in profilers, one that is pure Python (profile) and another that is a C-extension module (cProfile). The cProfile built-in module is better because of its minimal impact on the performance of your program while it’s being profiled. The pure-Python alternative imposes a high overhead that will skew the results.

Note
When profiling a Python program, be sure that what you’re measuring is the code itself and not any external systems. Beware of functions that access the network or resources on disk. These may appear to have a large impact on your program’s execution time because of the slowness of the underlying systems. If your program uses a cache to mask the latency of slow resources like these, you should also ensure that it’s properly warmed up before you start profiling.

Here, I instantiate a Profile object from the cProfile module and run the test function through it using the runcall method:

from cProfile import Profile

profiler = Profile()
profiler.runcall(test)

Once the test function has finished running, I can extract statistics about its performance using the pstats built-in module and its Stats class. Various methods on a Stats object adjust how to select and sort the profiling information to show only the things you care about.
from pstats import Stats

stats = Stats(profiler)
stats.strip_dirs()
stats.sort_stats('cumulative')
stats.print_stats()

The output is a table of information organized by function. The data sample is taken only from the time the profiler was active, during the runcall method above.

>>>
         20003 function calls in 1.812 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.812    1.812 main.py:34(<lambda>)
        1    0.003    0.003    1.812    1.812 main.py:10(insertion_sort)
    10000    1.797    0.000    1.810    0.000 main.py:20(insert_value)
     9992    0.013    0.000    0.013    0.000 {method 'insert' of 'list' objects}
        8    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Here’s a quick guide to what the profiler statistics columns mean:

* ncalls: The number of calls to the function during the profiling period.
* tottime: The number of seconds spent executing the function, excluding time spent executing other functions it calls.
* tottime percall: The average number of seconds spent in the function each time it was called, excluding time spent executing other functions it calls. This is tottime divided by ncalls.
* cumtime: The cumulative number of seconds spent executing the function, including time spent in all other functions it calls.
* cumtime percall: The average number of seconds spent in the function each time it was called, including time spent in all other functions it calls. This is cumtime divided by ncalls.

Looking at the profiler statistics table above, I can see that the biggest use of CPU in my test is the cumulative time spent in the insert_value function. Here, I redefine that function to use the bisect built-in module (see Item 46: “Use Built-in Algorithms and Data Structures”):

from bisect import bisect_left

def insert_value(array, value):
    i = bisect_left(array, value)
    array.insert(i, value)

I can run the profiler again and generate a new table of profiler statistics. The new function is much faster, with a cumulative time spent that is nearly 100× smaller than the previous insert_value function.
>>>
         30003 function calls in 0.028 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.028    0.028 main.py:34(<lambda>)
        1    0.002    0.002    0.028    0.028 main.py:10(insertion_sort)
    10000    0.005    0.000    0.026    0.000 main.py:112(insert_value)
    10000    0.014    0.000    0.014    0.000 {method 'insert' of 'list' objects}
    10000    0.007    0.000    0.007    0.000 {built-in method bisect_left}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Sometimes, when you’re profiling an entire program, you’ll find that a common utility function is responsible for the majority of execution time. The default output from the profiler makes this situation difficult to understand because it doesn’t show how the utility function is called by many different parts of your program.

For example, here the my_utility function is called repeatedly by two different functions in the program:

def my_utility(a, b):
    # …

def first_func():
    for _ in range(1000):
        my_utility(4, 5)

def second_func():
    for _ in range(10):
        my_utility(1, 3)

def my_program():
    for _ in range(20):
        first_func()
        second_func()

Profiling this code and using the default print_stats output will generate output statistics that are confusing.

>>>
         20242 function calls in 0.208 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.208    0.208 main.py:176(my_program)
       20    0.005    0.000    0.206    0.010 main.py:168(first_func)
    20200    0.203    0.000    0.203    0.000 main.py:161(my_utility)
       20    0.000    0.000    0.002    0.000 main.py:172(second_func)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

The my_utility function is clearly the source of most execution time, but it’s not immediately obvious why that function is called so much. If you search through the program’s code, you’ll find multiple call sites for my_utility and still be confused.

To deal with this, the Python profiler provides a way of seeing which callers contributed to the profiling information of each function.
stats.print_callers()

This profiler statistics table shows functions called on the left and who was responsible for making the call on the right. Here, it’s clear that my_utility is most used by first_func:

>>>
   Ordered by: cumulative time

Function                    was called by…
                                ncalls  tottime  cumtime
main.py:176(my_program)     <-
main.py:168(first_func)     <-      20    0.005    0.206  main.py:176(my
main.py:161(my_utility)     <-   20000    0.202    0.202  main.py:168(fi
                                   200    0.002    0.002  main.py:172(se
main.py:172(second_func)    <-      20    0.000    0.002  main.py:176(my

Things to Remember
* It’s important to profile Python programs before optimizing because the source of slowdowns is often obscure.
* Use the cProfile module instead of the profile module because it provides more accurate profiling information.
* The Profile object’s runcall method provides everything you need to profile a tree of function calls in isolation.
* The Stats object lets you select and print the subset of profiling information you need to see to understand your program’s performance.

Item 59: Use tracemalloc to Understand Memory Usage and Leaks

Memory management in the default implementation of Python, CPython, uses reference counting. This ensures that as soon as all references to an object have expired, the referenced object is also cleared. CPython also has a built-in cycle detector to ensure that self-referencing objects are eventually garbage collected.

In theory, this means that most Python programmers don’t have to worry about allocating or deallocating memory in their programs. It’s taken care of automatically by the language and the CPython runtime. However, in practice, programs eventually do run out of memory due to held references. Figuring out where your Python programs are using or leaking memory proves to be a challenge.

The first way to debug memory usage is to ask the gc built-in module to list every object currently known by the garbage collector. Although it’s quite a blunt tool, this approach does let you quickly get a sense of where your program’s memory is being used.

Here, I run a program that wastes memory by keeping references. It prints out how many objects were created during execution and a small sample of allocated objects.
# using_gc.py
import gc
found_objects = gc.get_objects()
print('%d objects before' % len(found_objects))

import waste_memory
x = waste_memory.run()
found_objects = gc.get_objects()
print('%d objects after' % len(found_objects))
for obj in found_objects[:3]:
    print(repr(obj)[:100])

>>>
4756 objects before
14873 objects after
<waste_memory.MyObject object at 0x1063f6940>
<waste_memory.MyObject object at 0x1063f6978>
<waste_memory.MyObject object at 0x1063f69b0>

The problem with gc.get_objects is that it doesn’t tell you anything about how the objects were allocated. In complicated programs, a specific class of object could be allocated many different ways. The overall number of objects isn’t nearly as important as identifying the code responsible for allocating the objects that are leaking memory.

Python 3.4 introduces a new tracemalloc built-in module for solving this problem. tracemalloc makes it possible to connect an object back to where it was allocated. Here, I print out the top three memory usage offenders in a program using tracemalloc:

# top_n.py
import tracemalloc
tracemalloc.start(10)  # Save up to 10 stack frames

time1 = tracemalloc.take_snapshot()
import waste_memory
x = waste_memory.run()
time2 = tracemalloc.take_snapshot()

stats = time2.compare_to(time1, 'lineno')
for stat in stats[:3]:
    print(stat)

>>>
waste_memory.py:6: size=2235 KiB (+2235 KiB), count=29981 (+29981), average=76 B
waste_memory.py:7: size=869 KiB (+869 KiB), count=10000 (+10000), average=89 B
waste_memory.py:12: size=547 KiB (+547 KiB), count=10000 (+10000), average=56 B

It’s immediately clear which objects are dominating my program’s memory usage and where in the source code they were allocated.
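Because the waste_memory module isn’t shown, here is a self-contained sketch of the same snapshot comparison. The hold_references function is a hypothetical stand-in for waste_memory.run; sizes and counts vary by platform and Python version, so no output is shown.

```python
import os
import tracemalloc


def hold_references(count):
    # Hypothetical leak: every allocated bytes object stays referenced.
    return [os.urandom(100) for _ in range(count)]


tracemalloc.start(10)  # Save up to 10 stack frames per allocation
before = tracemalloc.take_snapshot()
kept = hold_references(10000)
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group allocations by source line; entries with the biggest change in
# size come first.
stats = after.compare_to(before, 'lineno')
for stat in stats[:3]:
    print(stat)
```

Each printed entry names a file and line number, so the line inside hold_references that calls os.urandom should stand out as the main source of growth.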
The tracemalloc module can also print out the full stack trace of each allocation (up to the number of frames passed to the start method). Here, I print out the stack trace of the biggest source of memory usage in the program:

# with_trace.py
# …
stats = time2.compare_to(time1, 'traceback')
top = stats[0]
print('\n'.join(top.traceback.format()))

>>>
File "waste_memory.py", line 6
  self.x = os.urandom(100)
File "waste_memory.py", line 12
  obj = MyObject()
File "waste_memory.py", line 19
  deep_values.append(get_data())
File "with_trace.py", line 10
  x = waste_memory.run()

A stack trace like this is most valuable for figuring out which particular usage of a common function is responsible for memory consumption in a program.

Unfortunately, Python 2 doesn’t provide the tracemalloc built-in module. There are open source packages for tracking memory usage in Python 2 (such as heapy), though they do not fully replicate the functionality of tracemalloc.

Things to Remember
* It can be difficult to understand how Python programs use and leak memory.
* The gc module can help you understand which objects exist, but it has no information about how they were allocated.
* The tracemalloc built-in module provides powerful tools for understanding the source of memory usage.
* tracemalloc is only available in Python 3.4 and above.
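Although gc.get_objects is a blunt tool, summarizing its result by type can make the output easier to scan. The following sketch is one possible way to do that with collections.Counter, not a technique taken from the text; the Node class and count_by_type helper are hypothetical.

```python
import gc
from collections import Counter


class Node(object):
    # Hypothetical class whose instances are deliberately kept alive.
    def __init__(self, value):
        self.value = value


def count_by_type():
    # Tally every object the garbage collector tracks by its type name.
    return Counter(type(obj).__name__ for obj in gc.get_objects())


before = count_by_type()
kept = [Node(i) for i in range(1000)]
after = count_by_type()

# Subtracting Counters keeps only the types whose counts grew.
growth = after - before
for name, delta in growth.most_common(3):
    print('%-20s +%d' % (name, delta))
```

This still says nothing about where the objects were allocated, which is the gap that tracemalloc fills.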
Index Symbols %r,forprintablestrings,203 %s,forhuman-readablestrings,202 *operator,liabilityof,44–45 *symbol,forkeyword-onlyarguments,52–53 *args optionalkeywordargumentsand,48 variablepositionalargumentsand,43–45 **kwargs,forkeyword-onlyarguments,53–54 A __all__specialattribute avoiding,183 listingallpublicnames,181–183 ALL_CAPSformat,3 Allocationofmemory,tracemallocmoduleand,214–216 APIs(applicationprogramminginterfaces) future-proofing,186–187 internal,allowingsubclassaccessto,80–82 packagesprovidingstable,181–184 rootexceptionsand,184–186 usingfunctionsfor,61–64 appendmethod,36–37 Arguments defensivelyiteratingover,38–42 keyword,45–48 keyword-only,51–54 optionalpositional,43–45 asclauses,inrenamingmodules,181 astargets,withstatementsand,155–156 assertEqualhelpermethod,verifyingequality,206 assertRaiseshelpermethod,verifyingexceptions,206 assertTruehelpermethod,forBooleanexpressions,206 asynciobuilt-inmodule,vs.blockingI/O,125 AttributeErrorexceptionraising,102–103 Attribute(s).SeealsoPrivateattributes;Publicattributes addingmissingdefaultvalues,159–160 lazilyloading/saving,100–105 metaclassesannotating,112–115 names,conflictsover,81–82 B Binarymode,forreading/writingdata,7 Binarytreeclass,inheritingfromcollections.abc,84–86 bisectmodule,forbinarysearches,169 Blockingoperations,inQueueclass,132–136 Bookkeeping withdictionaries,55–58 helperclassesfor,58–60 btcommand,ofinteractivedebugger,208 Buffersizes,inQueueclass,132–136 Bytecode,interpreterstatefor,122 bytesinstances,forcharactersequences,5–7 C __call__specialmethod,withinstances,63–64 callablebuilt-infunction,63 CapitalizedWordformat,3 Centralprocessingunit.SeeCPU(centralprocessingunit) C-extensionmodules forCPUbottlenecks,145 problemswith,146 chainfunction,ofitertoolsmodule,170 Childclasses,initializingparentclassesfrom,69–73 Childprocesses,subprocessmanaging,118–121 Circulardependencies dynamicimportsresolving,191–192 importreorderingfor,189–190 import/configure/runstepsfor,190–191 
inimportingmodules,187–188 refactoringcodefor,189 Clarity,withkeywordarguments,51–54 Classinterfaces @propertymethodimproving,91–94 usepublicattributesfordefining,87–88 classstatements,metaclassesreceiving,106–107 __class__variable registeringclassesand,108–112 superbuilt_infunctionwith,73 Classes.SeealsoMetaclasses;Subclasses annotatingpropertiesof,112–115 forbookkeeping,58–60 docstringsfor,177 initializingparent,69–73 metaclassesregistering,108–112 mix-in,73–78 versioning,160–161 @classmethod inaccessingprivateattributes,78–79 polymorphism,forconstructingobjectsgenerically,67–69 Closures,interactingwithvariablescope,31–36 collectionsmodule defaultdictclassfrom,168 dequeclassfrom,166–167 OrderedDictclassfrom,167–168 collections.abcmodule,customcontainersinheritingfrom,84–86 combinationfunction,ofitertoolsmodule,170 Command-lines correctPythonversion,1,2 startingchildprocesses,119–120 communicatemethod readingchildprocessoutput,118–119 timeoutparameterwith,121 Community-builtmodules,PythonPackageIndexfor,173–174 Complexexpressions,helperfunctionsand,8–10 Concurrency coroutinesand,137–138 defined,117 inpipelines,129–132 Queueclassand,132–136 concurrent.futuresbuilt-inmodule,enablingparallelism,146–148 configparserbuilt-inmodule,forproductionconfiguration,201 Containers inheritingfromcollections.abc,84–86 iterable,41–42 contextlibbuilt-inmodule,enablingwithstatements,154–155 contextmanagerdecorator purposeof,154 astargetsand,155–156 continuecommand,ofinteractivedebugger,209 Conway’sGameofLife,coroutinesand,138–143 CoordinatedUniversalTime(UTC),intimeconversions,162–165 copyregbuilt-inmodule addingmissingattributevalues,159–160 controllingpicklebehavior,158 providingstableimportpaths,161–162 versioningclasseswith,160–161 Coroutines inConway’sGameofLife,138–143 purposeof,137–138 inPython2,143–145 countmethod,forcustomcontainertypes,85–86 cProfilemodule,foraccurateprofiling,210–213 CPU(centralprocessingunit) bottleneckdifficulties,145–146 time,threadswasting,131–132 
usage,childprocessesand,118–121 CPythoninterpreter,effectofGILon,122–123 CPythonruntime memorymanagementwith,214 cumtimecolumn,inprofilerstatistics,211 cumtimepercallcolumn,inprofilerstatistics,211 cyclefunction,ofitertoolsmodule,170 D Datamodels,@propertyimproving,91–95 Dataraces,Lockpreventing,126–129 datetimebuilt-inmodule,fortimeconversions,164–166 deactivatecommand,disablingpyvenvtool,195–196 Deadlocks,timeoutparameteravoiding,121 Deallocationofmemory,tracemallocmanaging,214–216 Debuggers,decoratorproblemswith,151,153 Debugging interactive,withpdbmodule,208–209 memoryusage,214–216 printfunctionand,202 reprstringsfor,202–204 rootexceptionsfor,185–186 Decimalclass,fornumericalprecision,171–173 Decorators,functionalityof,151–153 functools,151–153 Defaultarguments approachtoserialization,159–160 namedtupleclassesand,59 usingdynamicvaluesfor,48–51 Defaultvaluehooks,62–64 defaultdictusing,62–64 Defaultvalues copyregbuilt-inmoduleand,159–160 ofkeywordarguments,46–47 defaultdictclass,fordictionaries,168 Dependencies circular,187–192 reproducing,196–197 transitive,192–194 Dependencyinjection,191 Deploymentenvironments,module-scopedcodefor,199–201 dequeclass,asdouble-endedqueue,166–167 Descriptors enablingreusablepropertylogic,90 inmodifyingclassproperties,112–115 forreusable@propertymethods,97–100 Deserializingobjects defaultattributevaluesand,159–160 picklebuilt-inmodulefor,157–158 stableimportpathsand,161–162 Developmentenvironment,uniqueconfigurations/assumptionsfor,199–201 Diamondinheritance,initializingparentclassesand,70–71 __dict__attribute,viewingobjectinternals,204 Dictionaries bookkeepingwith,55–58 comprehensionexpressionsin,16 default,168 ordered,167–168 translatingrelatedobjectsinto,74–75 __doc__specialattribute,retrievingdocstrings,175–176 Docstrings class-level,177 documentingdefaultbehaviorin,48–51 forfunctions,178–179 importance/placementof,175–176 module,176–177 doctestbuilt-inmodule,179 Documentation docstringsfor.SeeDocstrings importanceof,175 
Documentation-generation tools, 176
Double-ended queues, deque classes as, 166–167
__double_leading_underscore format, 3
down command, of interactive debugger, 209
dropwhile function, of itertools module, 170
Dynamic imports
  avoiding, 192
  resolving circular dependencies, 191–192
Dynamic state, defined, 55

E
else blocks
  after for/while loops, 23–25
  during exception handling, 26–27
end indexes, in slicing sequences, 10–13
__enter__ method, in defining new classes, 154
enumerate built-in function, preferred features of, 20–21
environ dictionary, tailoring modules with, 201
eval built-in function, for re-creating original values, 203
Exceptions
  raising, 29–31
  root, 184–187
  try/finally blocks and, 26–28
Execution time, optimization of, 209–213
__exit__ method, in defining new classes, 154
Expressions
  in list comprehensions, 16–18
  PEP 8 guidance for, 4

F
filter built-in function, list comprehensions vs., 15–16
filterfalse function, of itertools module, 170
finally blocks, during exception handling, 26–27
First-in-first-out queues, deque class for, 166–167
for loops
  else blocks after, 23–25
  iterator protocol and, 40–42
Fraction class, for numerical precision, 172
Functions
  closure/variable scope interaction, 31–36
  decorated, 151–153
  docstrings for, 178–179
  exceptions vs. return None, 29–31
  as first-class objects, 32, 63–64
  generator vs. returning lists, 36–38
  iterating over arguments, 38–42
  keyword arguments for, 45–48
  keyword-only arguments for, 51–54
  optional positional arguments for, 43–45
  for simple interfaces, 61–64
  simultaneous, coroutines for, 137–138
functools built-in module, for defining decorators, 152–153

G
Game of Life, coroutines in, 138–143
Garbage collector, cleanup by, 99
gc built-in module, debugging memory usage, 214–215
Generator(s)
  coroutine extensions of, 137–138
  expressions, for large comprehensions, 18–20
  returning lists vs., 36–38
Generic class method, for constructing objects, 67–69
Generic functionality, with mix-in classes, 74–78
__get__ method, for descriptor protocol, 97–100
__getattr__ special method, to lazily load attributes, 100–103
__getattribute__ method, accessing instance variables in, 104–105
__getattribute__ method, descriptor protocol and, 98–100
__getattribute__ special method, for repeated access, 102–105
__getitem__ special method
  custom implementation of, 84–86
  in slicing sequences, 10
Getter methods
  descriptor protocol for, 98–100
  problems with using, 87–88
  providing with @property, 88–89
GIL (global interpreter lock)
  corruption of data structures and, 126–127
  defined, 122
  preventing parallelism in threads, 122–125, 145, 146–147
Global scope, 33

H
hasattr built-in function, determining existence of properties, 103
hashlib built-in module, 120
heappop function, for priority queues, 168–169
heappush function, for priority queues, 168–169
heapq module, for priority queues, 168–169
help function
  decorator problems with, 152–153
  in interactive debugger, 208
Helper classes
  for bookkeeping, 58–60
  providing stateful closure behavior, 62–63
Helper functions, complex expressions into, 8–10
Hooks
  to access missing attributes, 100–105
  default value, 62–64
  functions acting as, 61–62
  in modifying class properties, 113

I
IEEE 754 (IEEE Standard for Floating-Point Arithmetic), 171–172
if/else expressions, for simplification, 9–10
import * statements
  avoiding, 183–184
  in providing stable APIs, 182–183
Import paths, stable, copyreg providing, 161–162
Import reordering, for circular dependencies, 189–190
import statements
  as dynamic imports, 191–192
  with packages, 180–181
Incrementing in place, public attributes for, 88
index method, for custom container types, 85–86
Infinite recursion, super() function avoiding, 101–105
Inheritance
  from collections.abc, 84–86
  method resolution order (MRO) and, 71
  multiple, for mix-in utility classes, 77–78
__init__ method
  as single constructor per class, 67, 69
  initializing parent class, 69–71
__init__.py
  defining packages, 180
  modifying, 182
Initializing parent classes
  __init__ method for, 69–71
  method resolution order (MRO) and, 71
  super built-in function for, 70–73
Integration tests, 207
Interactive debugging, with pdb, 208–209
Intermediate root exceptions, future-proofing APIs, 186–187
I/O (input/output)
  between child processes, 118–121
  threads for blocking I/O, 124–125
IOError, except blocks and, 26–27
IronPython runtime, 1, 2
isinstance
  bytes/str/unicode and, 5–6
  with coroutines, 142
  dynamic type inspection with, 74–75
  metaclasses and, 114
  pickle module and, 158
  testing and, 205
islice function, of itertools module, 170
iter built-in function, 41–42
__iter__ method
  as generator, 41–42
  iterable container class, defined, 41–42
Iterator protocol, 40–42
Iterators
  as function arguments, 39
  generators returning, 37–38
  zip function processing, 21–23
itertools built-in module
  functions of, 169–170
izip_longest function, for iterating in parallel, 23

J
join method, of Queue class, 132–136
Jython runtime, 1, 2

K
Keyword arguments
  constructing classes with, 58
  dynamic default argument values, 48–51
  providing optional behavior, 45–48
Keyword-only arguments
  for clarity, 51–53
  in Python 2, 53–54

L
lambda expression
  as key hook, 61
  vs. list comprehensions, 15–16
  producing iterators and, 40
  in profiling, 210–212
Language hooks, for missing attributes, 100–105
Lazy attributes, __getattr__/__setattr__/__getattribute__ for, 100–105
_leading_underscore format, 3
Leaky bucket quota, implementing, 92–95
len built-in function, for custom sequence types, 85
__len__ special method, for custom sequence types, 85
list built-in type, performance as FIFO queue, 166–167
List comprehensions
  generator expressions for, 18–20
  instead of map/filter, 15–16
  number of expressions in, 16–18
list type, subclassing, 83–84
Lists, slicing, 10–13
locals built-in function, 152, 208
localtime function, from time module, 163–164
Lock class
  preventing data races, 126–129
  in with statements, 153–154
Logging
  debug function for, 154–156
  severity levels, 154–155
Loops
  else blocks after, 23–25
  in list comprehensions, 16–18
  range/enumerate functions, 20–21
lowercase_underscore format, 3

M
map built-in function, list comprehensions vs., 15–16
Memory
  coroutine use of, 137
  threads requiring, 136
Memory leaks
  by descriptor classes, 99–100
  identifying, 214–216
Memory management, with tracemalloc module, 214–216
Meta.__new__ method
  in metaclasses, 107
  setting class attributes, 114
__metaclass__ attribute, in Python 2, 106–107
Metaclasses
  annotating attributes with, 112–115
  for class registration, 108–112
  defined, 87, 106
  validating subclasses, 105–108
method resolution order (MRO), for superclass initialization order, 70–73
Mix-in classes
  composing from simple behaviors, 74–75
  defined, 73–74
  pluggable behaviors for, 75–76
  utility, creating hierarchies of, 77–78
mktime, for time conversion, 163, 165
Mock functions and classes
  unittest.mock built-in module, 206
__module__ attribute, 106, 153
Modules
  breaking circular dependencies in, 187–192
  community-built, 173–174
  docstrings, 176–177
  packages for organizing, 179–184
  providing stable APIs from, 181–184
  tailoring for deployment environment, 199–201
Module-scoped code, for deployment environments, 199–201
MRO (method resolution order), for superclass initialization order, 70–73
Multiple conditions, in list comprehensions, 16–18
Multiple inheritance, for mix-in utility classes, 73–78
Multiple iterators, zip built-in function and, 21–23
Multiple loops, in list comprehensions, 16–18
multiprocessing built-in module, enabling parallelism, 146–148
Mutual-exclusion locks (mutex)
  GIL as, 122
  Lock class as, 126–129
  in with statements, 153–154

N
__name__ attribute
  in defining decorators, 151, 153
  in registering classes, 109–110
  testing and, 206
namedtuple type
  defining classes, 58
  limitations of, 59
NameError exception, 33
Namespace packages, with Python 3.4, 180
Naming conflicts, private attributes to avoid, 81–82
Naming styles, 3–4
ncalls column, in profiler statistics, 211
__new__ method, of metaclasses, 106–108
next built-in function, 41–42
next command, of interactive debugger, 209
__next__ special method, iterator object implementing, 41
Noise reduction, keyword arguments and, 45–48
None value
  functions returning, 29–31
  specifying dynamic default values, 48–51
nonlocal statement, in closures modifying variables, 34–35
nsmallest function, for priority queues, 168–169
Numerical precision, with Decimal class, 171–173

O
Objects, accessing missing attributes in, 100–105
On-the-fly calculations, using @property for, 91–95
Optimization, profiling prior to, 209–213
Optional arguments
  keyword, 47–48
  positional, 43–45
OrderedDict class, for dictionaries, 167–168
OverflowError exceptions, 51

P
Packages
  dividing modules into namespaces, 180–181
  as modules containing modules, 179–180
  providing stable APIs with, 181–184
Parallelism
  avoiding threads for, 122–123
  child processes and, 118–121
  concurrent.futures for true, 146–148
  corruption of data structures and, 126–128
  defined, 117
  need for, 145–146
Parent classes
  accessing private attributes of, 79–81
  initializing, 70–73
pdb built-in module, for interactive debugging, 208–209
pdb.set_trace() statements, 208–209
PEP 8 (Python Enhancement Proposal #8) style guide
  expression/statement rules, 4
  naming styles in, 3–4, 80
  overview of, 2–3
  whitespace rules, 3
permutations function, of itertools module, 170
pickle built-in module
  adding missing attribute values, 159–160
  providing stable import paths for, 161–162
  serializing/deserializing objects, 157–158
  versioning classes for, 160–161
pip command-line tool
  reproducing environments, 196–197
  transitive dependencies and, 192–193
  for utilizing Package Index, 173
pip freeze command, saving package dependencies, 196
Pipelines
  concurrency in, 129–131
  problems with, 132
  Queue class building, 132–136
Polymorphism
  @classmethods utilizing, 65–69
  defined, 64
Popen constructor, starting child processes, 118
Positional arguments
  constructing classes with, 58
  keyword arguments and, 45–48
  reducing visual noise, 43–45
print function, for debugging output, 202–203, 208
print_stats output, for profiling, 213
Printable representation, repr function for, 202–204
Private attributes
  accessing, 78–80
  allowing subclass access to, 81–83
  indicating internal APIs, 80
ProcessPoolExecutor class, enabling parallelism, 147–148
product function, of itertools module, 170
Production environment, unique configurations for, 199–201
profile module, liabilities of, 210
@property method
  defining special behavior with, 88–89
  descriptors for reusing, 97–100
  giving attributes new functionality, 91–94
  improving data models with, 95
  numerical attributes, into on-the-fly calculations, 91–95
  problems with overusing, 95–96
  unexpected side effects in, 90–91
@property.setter, modifying object state in, 91
pstats built-in module, extracting statistics, 211
Public attributes
  accessing, 78
  defining new class interfaces with, 87–88
  giving new functionality to, 91–94
  preferred features of, 80–82
Pylint tool, for Python source code, 4
PyPI (Python Package Index), for community-built modules, 173–174
PyPy runtime, 1, 2
Python 2
  coroutines in, 143–145
  determining use of, 2
  keyword-only arguments in, 53–54
  metaclass syntax in, 106–107
  mutating closure variables in, 35
  str and unicode in, 5–7
  zip built-in function in, 22
Python 3
  class decorators in, 111
  determining use of, 2
  closures and nonlocal statements in, 34–35
  keyword-only arguments in, 51–53
  metaclass syntax in, 106
  str and bytes in, 5–7
Python Enhancement Proposal #8. See PEP 8 (Python Enhancement Proposal #8) style guide
Python Package Index (PyPI), for community-built modules, 173–174
Python threads. See Threads
pytz module
  installing, 173
  pyvenv tool and, 194
  for time conversions, 165–166
pyvenv command-line tool
  purpose of, 194
  reproducing environments, 196–197
  for virtual environments, 194–196

Q
quantize method, of Decimal class, for numerical data, 172
Queue class, coordinating work between threads, 132–136

R
range built-in function, in loops, 20
Read the Docs community-funded site, 176
Refactoring attributes, @property instead of, 91–95
Refactoring code, for circular dependencies, 189
Registering classes, metaclasses for, 108–112
Repetitive code
  composing mix-ins to minimize, 74
  keyword arguments eliminating, 45–48
__repr__ special method, customizing class printable representation, 203–204
repr strings, for debugging output, 202–204
requirements.txt file, for installing packages, 197
return command, of interactive debugger, 209
return statements
  in generators, 140
  not allowed in Python 2 generators, 144
Root exceptions
  finding bugs in code with, 185–186
  future-proofing APIs, 186–187
  insulating callers from APIs, 184–185
Rule of least surprise, 87, 90, 91
runcall method, for profiling, 211–213

S
Scopes, variable, closure interaction with, 31–36
Scoping bug, in closures, 34
select built-in module, blocking I/O, 121, 124
Serializing, data structures, 109
Serializing objects, pickle and
  default argument approach to, 159–160
  default attribute values and, 159–160
  pickle built-in module for, 157–158
  stable import paths and, 161–162
__set__ method, for descriptor protocol, 97–100
set_trace function, pdb module running, 208–209
setattr built-in function
  annotating class attributes and, 113
  in bad thread interactions, 127–128
  lazy attributes and, 101–102, 104
__setattr__ special method, to lazily set attributes, 103–105
__setitem__ special method, in slicing sequences, 10
Sets, comprehension expressions in, 16
setter attribute, for @property method, 88–89
Setter methods
  descriptor protocol for, 98–100
  liability of using, 87–88
  providing with @property, 88–89
setuptools, in virtual environments, 195–197
Single constructor per class, 67, 69
Single-line expressions, difficulties with, 8–10
six tool, in adopting Python 3, 2
Slicing sequences
  basic functions of, 10–13
  stride syntax in, 13–15
Sort, key argument, closure functions as, 31–32
source bin/activate command, enabling pyvenv tool, 195
Speedup, concurrency vs. parallelism for, 117
Star args (*args), 43
start indexes, in slicing sequences, 10–13
Statements, PEP 8 guidance for, 4
Static type checking, lack of, 204–205
Stats object, for profiling information, 211–213
step command, of interactive debugger, 209
StopIteration exception, 39, 41
str instances, for character sequences, 5–7
stride syntax, in slicing sequences, 13–15
strptime functions, conversion to/from local time, 163–164
Subclasses
  allowing access to private fields, 81–83
  constructing/connecting generically, 65–69
  list type, 83–84
  TestCase, 206–207
  validating with metaclasses, 105–108
subprocess built-in module, for child processes, 118–121
super built-in function, initializing parent classes, 71–73
super method, avoiding infinite recursion, 101–105
Superclass initialization order, MRO resolving, 70–73
Syntax
  decorators, 151–153
  for closures mutating variables, 34–35
  for keyword-only arguments, 52–53
  loops with else blocks, 23
  list comprehensions, 15
  metaclasses, 106
  slicing, 10–13
SyntaxError exceptions, dynamic imports and, 192
sys module, guiding module definitions, 201
System calls, blocking I/O and, 124–125

T
takewhile function, of itertools module, 170
task_done call, method of the Queue class, in building pipelines, 134
tee function, of itertools module, 170
Test methods, 206–207
TestCase classes, subclassing, 206–207
threading built-in module, Lock class in, 126–129
ThreadPoolExecutor class, not enabling parallelism, 147–148
Threads
  blocking I/O and, 124–125
  coordinating work between, 132–136
  parallelism prevented by, 122–123, 145, 146–147
  preventing data races between, 126–129
  problems with, 136
  usefulness of multiple, 124
time built-in module, limitations of, 163–164
Time zone conversion methods, 162–166
timeout parameter, in child process I/O, 121
tottime column, in profiler statistics, 211
tottime percall column, in profiler statistics, 211
tracemalloc built-in module, for memory optimization, 214–216
Transitive dependencies, 192–194
try/except statements, root exceptions and, 185
try/except/else/finally blocks, during exception handling, 27–28
try/finally blocks
  during exception handling, 26–27
  with statements providing reusable, 154–155
Tuples
  extending, 58
  rules for comparing, 32
  as values, 57
  variable arguments becoming, 44
  zip function producing, 21–23
TypeError
  exceptions, for keyword-only arguments, 53–54
  rejecting iterators, 41–42
tzinfo class, for time zone operations, 164–165

U
unicode instances, for character sequences, 5–7
Unit tests, 207
unittest built-in module, for writing tests, 205–207
UNIX timestamp, in time conversions, 163–165
Unordered dictionaries, 167
up command, of interactive debugger, 209
UTC (Coordinated Universal Time), in time conversions, 162–165
Utility classes, mix-in, creating hierarchies of, 77–78

V
Validation code, metaclasses running, 105–108
ValueError exceptions, 30–31, 184
Values
  from iterators, 40–42
  tuples as, 57
  validating assignments to, 89
Variable positional arguments
  keyword arguments and, 47–48
  reducing visual noise, 43–45
Variable scopes, closure interaction with, 31–36
--version flag, determining version of Python, 1–2
Virtual environments
  pyvenv tool creating, 194–196
  reproducing, 196–197
virtualenv command-line tool, 194
Visual noise, positional arguments reducing, 43–45

W
WeakKeyDictionary, purpose of, 99
weakref module, building descriptors, 113
while loops, else blocks following, 23–25
Whitespace, importance of, 3
Wildcard imports, 183
with statements
  mutual-exclusion locks with, 153–154
  for reusable try/finally blocks, 154–155
  as target values and, 155–156
wraps helper function, from functools, for defining decorators, 152–153

Y
yield expression
  in coroutines, 137–138
  in generator functions, 37
  use in contextlib, 155
yield from expression, unsupported in Python 2, 144

Z
ZeroDivisionError exceptions, 30–31, 51
zip built-in function
  for iterators of different lengths, 170
  processing iterators in parallel, 21–23
zip_longest function, for iterators of different length, 22–23, 170

Code Snippets