Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CMSSW Session Bendedikt Hegner, Danilo Piparo 10-9-08 GridKa School of Computing Outline • • • • Framework concepts CmsRun configuration syntax Questions CmsRun example • CmsRun configuration and runtime debugging • Some tools that help you with your daily work • Questions 10-9-08 GridKa School of Computing Basic concepts • One executable: cmsRun • Event Data Model based (EDM) on the Event: – Single entity in memory: the edm::event container – Modular content • Modular architecture – Module: component to be plugged in cmsRun – Unit of clearly defined event-processing functionality 10-9-08 GridKa School of Computing Flow of the data • • • • • Source EDProducer EDFilter EDAnalyzer OutputModule Communication among modules only via the event 10-9-08 GridKa School of Computing Framework modules • EDProducer - add objects into the event • EDAnalyzer - analyse objects from the event • EDFilter - can stop an execution path and put objects into the event • EDLooper - For multi-pass looping over events • OutputModule - Write events to a file. Can use filter decisions 10-9-08 GridKa School of Computing Framework services Examples of Framework Services are - geometry, calibration, MessageLogger • Types are: - ESSource Provides data which have an IOV (Interval of Validity) - ESProducer Creates Products when IOV changes Can be not IOV dependent. E.g. MessageLogger 10-9-08 GridKa School of Computing The data format • • Events physically stored in rootfiles Different file contents types: – – – – – – 10-9-08 FEVT: Full EVnT RECO: RECOnstruction RECOSIM: RECOnstruction + selected simulation information AOD: Analysis Object Data (compact subset of RECO) AODSIM: AOD + generator information …. GridKa School of Computing Files: acces with ROOT Root.exe [] TFile f (“myfile.root”) [] TBrowser b 10-9-08 GridKa School of Computing Files: access with FWLite Automatic library loading [] gSystem->Load(“libFWCoreFWLite”) [] AutoLibraryLoader::enable() [] new TBrowser() 10-9-08 GridKa School of Computing Accessing Event Data Inside the event, data are called “Product” moduleLabel:productInstanceLabel:processName // by module and default product label Handle<TrackVector> trackPtr; iEvent.getByLabel("tracker", trackPtr ); // by module and product label Handle<SimHitVector> simPtr; iEvent.getByLabel("detsim", "pixel" ,simPtr ); // by type vector<Handle<SimHitVector> > allPtr; iEvent.getByType( allPtr ); // by Selector ParameterSelector<int> coneSel("coneSize",5); Handle<JetVector> jetPtr; iEvent.get( coneSel, jetPtr ); 10-9-08 GridKa School of Computing Python Interlude • • • • • • Widely used Interpreted Object oriented (“everything is an object”) Intuitive and nice syntax – Easy to learn Extensive std library – lots of extensions Call by refernce: – Except integers and short strings (immutable types) – Be careful about this • Control flow based on indentation • Lot of flexibility for the user! • Only supported config since CMSSW_2_1_0 Python site: www.python.org Yet another Python Course: www-ekp.physik.uni-karlsruhe.de/~piparo/cgi-bin/python_seminar.py 10-9-08 GridKa School of Computing A general Philosophy • As in C++ / C everithing is a pointer ... • As in Linux everything is a file (cat /proc/meminfo) ... In Python everything is an object 10-9-08 GridKa School of Computing Configuration Files - 1/2 Definition of terms: Python module • A python file that is meant to be included by other files • Placed in Subsystem/Package/python/ or a subdirectory of it • CMS file name convention: – Definition of a single framework object: – A fragment of configuration commands: – A full process definition: _cfi.py _cff.py _cfg.py • To make your module visible to other python modules: – Be sure your SCRAM environment is set up – Go to your package and do scram b or scram b python – Needed only once • Correctness of python config files is checked on a basic level every time scram is used. 10-9-08 GridKa School of Computing Configuration Files - 2/2 Definition of terms: configuration file • • • • Controls the final job to be run Contains a cms.Process object named process Usually placed in a package’s python/ or test/ Can be checked for completeness doing python myExample_cfg.py (Python interpreter) • Can be run using cmsRun cmsRun myExample_cfg.py Process Object: “the protagonist” of the configuration 10-9-08 GridKa School of Computing The process object An usual process definition • Process process = cms.Process(„RECO‟) • An input source process.source = cms.Source(„PoolSource‟, ...) • Some modules process.jets = cms.EDProducer(„JetProducer‟) • Some services process.tracer = cms.Service(„Tracer‟) • Some execution paths process.p1 = cms.Path(process.a * process.b) • Maybe output modules process.out = cms.OutputModule(... • Even full processes can be imported from elsewhere 10-9-08 GridKa School of Computing CMS Python Config Types – 1/2 • Most of the objects you’ll create will be of a CMS-specific type. To make them known to the interpreter you do: import FWCore.ParameterSet.Config as cms • Objects are then created with a syntax like: jets = cms.EDProducer(„JetReco‟, coneSize = cms.double(0.4), debug = cms.untracked.bool(True) ) Warning: The comma between the parameters is very important! Forgetting the comma after line n results in a syntax error reported for line n+1. So not straightforward to track down! 10-9-08 GridKa School of Computing CMS Python Config Types – 2/2 CMS Python Types: 10-9-08 GridKa School of Computing How to import objects – 1/2 • To fetch all modules from some other module into local namespace from Subsystem.Package.Foo_cff import * (looks into Subsystem/Package/python/Foo_cff.py) • To fetch only single objects you can do from Subsystem.Package.Foo_cff import a,b • To use objects without putting them in the local namespace import Subsystem.Package.Foo_cff as foo • With foo.a and foo.b you can now access the objects • Don’t forget that all imports create references, not copies: changing an object at one place changes the object at other places 10-9-08 GridKa School of Computing Cloning • Sometimes you need to add a module which has almost the same parameter as another one • You can copy the module and change the parameters that need to be modified import ElectroWeakAnalysis.ZReco.zToMuMu_cfi as zmumu zToMuMuGolden = zmumu.zToMuMu.clone( massMin = cms.double(40) ) • Changing while cloning should be preferred wrt clone + later replace as it is a much safer practice. 10-9-08 GridKa School of Computing How to import objects – 2/2 • To load everything from a python module into your process object you can say: process.load(„Subsystem.Package.Foo_cff‟) • Technical detail. This is identical to import Subsystem.Package.Foo_cff process.extend(Subsystem.Package.Foo_cff) 10-9-08 GridKa School of Computing Copying all the parameters from a PSET • Give the PSet name directly after module type • Has to happen before the named parameters KtJetParameters = cms.PSet( strategy = cms.string(“Best”) ) ktCaloJets = cms.EDProducer(“KtCaloJetProducer”, KtJetParameters, coneSize = cms.double(0.7) … As parameters get copied a later change of KtJetParameters will not get picked up 10-9-08 GridKa School of Computing Sequences, Paths and Schedules Sequence: • Defines an execution order and acts as building block for more complex configurations and contains modules or other sequences. trDigi = cms.Sequence(siPixelDigis + siStripDigis) Path: • Defines which modules and sequences to run. p1 = cms.Path(pdigi * reconstruction) EndPath: • A list of analyzers or output modules to be run after all paths have been run. outpath = cms.EndPath(myOutput) Schedule: • Defines the execution order of paths. If not given first all paths, then all endpaths. process.schedule = cms.Schedule(process.p1, process.outpath) 10-9-08 GridKa School of Computing Sequence operators “+” as „follows‟: • Use if the input of the previous module/sequence is not required trDigi = cms.Sequence(siPixelDigis + siStripDigis) “*” as „depends on‟: • If module depends on previously created products p1 = cms.Path(pdigi * reconstruction) • Enforced and checked by scheduler Combining: • By using () grouping is possible (ecalRecHits + hcalRecHits) * caloTowers 10-9-08 GridKa School of Computing Filters in paths • When an EDFilter is in a path, returning False will cause the path to terminate • Two operators ~ and - can modify this. 1. ~ means not. The filter will only continue if the filter returns False. 2. - means to ignore the result of the filter and proceed regardless jet500_1000 = cms.Path( ~jet1000filter + jet500filter + jetAnalysis ) 10-9-08 GridKa School of Computing Tracked / Untracked Tracked parameters: • Parameters that change the content of the event • Will be saved in the event data provenance • Cannot be optional – if asked for and not found, exception will be thrown – constructs to circumvent this policy are highly discouraged a = cms.double(3.0) Untracked parameters: • Parameters that don’t affect the results, e.g. debug level • Can have default values – They’re used for optional parameters, but not really considered safe practice. a = cms.untracked.double(3.0) 10-9-08 GridKa School of Computing Modifying parameters • You are free to reach inside any parameter and change it from Subsystem.Package.foo_cff import * foo.threshold = 4.0 • Or via the process object process.foo.threshold = 4.0 • If the right hand side of the expression is a CMS type, a new parameter can be created # this line will fail because g4.SimHits.useMagneticField = # this line will create a new # with wrong name g4.SimHits.useMagneticField = 10-9-08 of typos False parameter cms.bool(False) GridKa School of Computing Support and Documentation • There is a twiki page in the SWGuide: https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideAboutPythonConfigFile • In case of questions and comments send an e-mail to hn-cms-edmFramework 10-9-08 GridKa School of Computing Backup 10-9-08 GridKa School of Computing Yet another helloworld! #! /usr/bin/env python import os # DP import the module for Operating System routines operating_system = os.name print "Hello World! I am running on a %s system!" %operating_system $ ./helloworld.py Hello World! I am running on a posix system! Observation: os.py is physically a file: do not confuse CMSSW and Python modules More on the “import” statement to come! 10-9-08 GridKa School of Computing Command line interpreter - 1/2 • Python has a very powerful command line interpreter • Debugging • Developing Try your snippets in the command line interpreter (incremental developing) $ python Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32) [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import os >>> operating_system = os.name >>> print "Hello World! I am running on a %s system!" %operating_system Hello World! I am running on a posix system! >>> • Help the user in several ways: the dir, help and type functions - dir: >>> dir (os) ['EX_CANTCREAT', 'EX_CONFIG‘… I CUT A BIT … 'mkdir', 'mkfifo', 'mknod', 'name', 'nice', 'open', 'openpty', 'pardir', 'path', 'pathconf', 'pathconf_names', 'pathsep', 'pipe', 'popen', 'popen2', 'popen3', 'popen4', 'putenv', 'read', 'readlink', … I CUT AGAIN …, 'walk', 'write'] 10-9-08 GridKa School of Computing Command line interpreter - 2/2 - help: >>> help (os) Help on module os: NAME os - OS routines for Mac, NT, or Posix depending on what system we're on. FILE /usr/lib/python2.5/os.py MODULE DOCS http://www.python.org/doc/current/lib/module-os.html DESCRIPTION This exports: - all functions from posix, nt, os2, mac, or ce, e.g. unlink, stat, etc. - os.path is one of the modules posixpath, ntpath, or macpath - os.name is 'posix', 'nt', 'os2', 'mac', 'ce' or 'riscos' - os.curdir is a string representing the current directory ('.' or ':') - os.pardir is a string representing the parent directory ('..' or '::') - os.sep is the (or a most common) pathname separator ('/' or ':' or '\\') - os.extsep is the extension separator ('.' or '/') - os.altsep is the alternate pathname separator (None or '/') - os.pathsep is the component separator used in $PATH etc - os.linesep is the line separator in text files ('\r' or '\n' or '\r\n') Programs that import and use 'os' stand a better chance of being - type: >>> type(1), type ("Hello, I am a string") (<type 'int'>, <type 'str’> 10-9-08 GridKa School of Computing [MORE…] An example of flexibility class human: def __init__(self,name,height): self.name=name self.height=height Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32) [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> class human: ... def __init__(self,name,height): ... self.name=name ... self.height=height >>> person = human (“Fabio",180) >>> dir(person) ['__doc__', '__init__', '__module__', 'height', 'name'] >>> person.tail = True >>> dir(person) ['__doc__', '__init__', '__module__', 'height', 'name', 'tail'] Despite the class definition, we can add attributes/members to the single instance 10-9-08 GridKa School of Computing Python Enthusiasm But let‟s forget about Programming Python and see how the configurations look like! 10-9-08 GridKa School of Computing Sample import FWCore.ParameterSet.Config as cms process = cms.Process(“SIM”) # Input source process.source = cms.Source("PoolSource", fileNames = cms.untracked.vstring('file:gen.root') ) # Modules process.load(“Configuration.StandardSequences.VtxSmearedGauss_cff”) process.load(“SimG4Core.Application.g4SimHits_cfi”) process.g4SimHits.UseMagneticField = False # Output process.load(“Configuration.EventContent.EventContent_cff”) process.FEVT = cms.OutputModule("PoolOutputModule", process.FEVTSIMEventContent, fileName = cms.untracked.string('sim.root') ) # Execution paths process.p1 = cms.Path(process.VtxSmeared+process.g4SimHits) 10-9-08 GridKa School of Computing Tools – 1/3 EdmPythonSearch • A “grep”-like syntax to search for identifiers within imported files > edmPythonSearch minPt Reconstruction_cff ... RecoMuon.MuonIdentification.muons_cfi (line: 19) : minPt = cms.double(1.5), ... 10-9-08 GridKa School of Computing Tools – 2/3 EdmPythonTree • Gives an indended dump of which files are included by which files (initial version for the old configs by Karoly Banicz and Sue Anne Koay) > edmPythonTree Reconstruction_cff + Simulation_cff + Configuration.StandardSequences.Digi_cff + SimCalorimetry.Configuration.SimCalorimetry_cff ... 10-9-08 GridKa School of Computing Tools – 2/3 Python • The Python interpreter helps you inspecting your configs > python -i Reconstruction_cff • Ctrl-D shuts it down 10-9-08 GridKa School of Computing