CMSSW Session
Bendedikt Hegner, Danilo Piparo
10-9-08
GridKa School of Computing
Outline
• Framework concepts
• cmsRun configuration syntax
• Questions
• cmsRun example
• cmsRun configuration and runtime debugging
• Some tools that help you with your daily work
• Questions
Basic concepts
• One executable: cmsRun
• Event Data Model (EDM) based on the Event:
– Single entity in memory: the edm::Event container
– Modular content
• Modular architecture
– Module: component to be plugged in cmsRun
– Unit of clearly defined event-processing functionality
Flow of the data
• Source
• EDProducer
• EDFilter
• EDAnalyzer
• OutputModule
Communication among modules only via the event
Framework modules
• EDProducer
- add objects into the event
• EDAnalyzer
- analyse objects from the event
• EDFilter
- can stop an execution path and put objects into the event
• EDLooper
- For multi-pass looping over events
• OutputModule
- Write events to a file. Can use filter decisions
Framework services
Examples of Framework Services are
- geometry, calibration, MessageLogger
• Types are:
- ESSource
Provides data which have an IOV (Interval of Validity)
- ESProducer
Creates products when the IOV changes
Services need not depend on an IOV, e.g. MessageLogger
The data format
• Events are physically stored in ROOT files
• Different file content types:
– FEVT: Full EVenT
– RECO: RECOnstruction
– RECOSIM: RECOnstruction + selected simulation information
– AOD: Analysis Object Data (compact subset of RECO)
– AODSIM: AOD + generator information
– ….
Files: access with ROOT
root.exe
[] TFile f("myfile.root")
[] TBrowser b
Files: access with FWLite
Automatic library loading
[] gSystem->Load("libFWCoreFWLite")
[] AutoLibraryLoader::enable()
[] new TBrowser()
Accessing Event Data
Inside the event, each piece of data is called a “product” and is identified by
moduleLabel:productInstanceLabel:processName
// by module and default product label
Handle<TrackVector> trackPtr;
iEvent.getByLabel("tracker", trackPtr );
// by module and product label
Handle<SimHitVector> simPtr;
iEvent.getByLabel("detsim", "pixel" ,simPtr );
// by type
vector<Handle<SimHitVector> > allPtr;
iEvent.getByType( allPtr );
// by Selector
ParameterSelector<int> coneSel("coneSize",5);
Handle<JetVector> jetPtr;
iEvent.get( coneSel, jetPtr );
Python Interlude
• Widely used
• Interpreted
• Object oriented (“everything is an object”)
• Intuitive and nice syntax – Easy to learn
• Extensive standard library – lots of extensions
• Call by reference:
– Except that immutable types (e.g. integers and strings) cannot be changed in place
– Be careful about this
• Control flow based on indentation
• Lot of flexibility for the user!
• The only supported configuration language since CMSSW_2_1_0
Python site: www.python.org
Yet another Python Course: www-ekp.physik.uni-karlsruhe.de/~piparo/cgi-bin/python_seminar.py
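A minimal sketch of the "call by reference" caveat, in plain Python and independent of CMSSW. The function and variable names are made up for illustration.

```python
def append_item(items):
    # Mutates the caller's list in place: the change is visible outside.
    items.append("tracks")


def increment(counter):
    # Rebinds the local name only: the caller's integer is unchanged.
    counter = counter + 1
    return counter


collections = ["jets"]
append_item(collections)

n_events = 10
result = increment(n_events)

print(collections)
print(n_events)
print(result)
```

The list changes for the caller because it is mutable; the integer does not, because the function can only bind its local name to a new object.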
A general Philosophy
• As in C++ / C everything is a pointer ...
• As in Linux everything is a file (cat /proc/meminfo) ...
In Python everything is an object
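A small illustration of "everything is an object", assuming nothing beyond the standard library. The function name and the attribute attached to it are hypothetical.

```python
import os


def greet():
    return "hello"


# Numbers, functions and modules are all objects with a type.
things = [42, greet, os]
for thing in things:
    print(type(thing).__name__)

# Because a function is an object, it can carry attributes
# and be bound to a second name.
greet.author = "example"
alias = greet
print(alias())
print(alias.author)
```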
Configuration Files - 1/2
Definition of terms: Python module
• A python file that is meant to be included by other files
• Placed in Subsystem/Package/python/ or a subdirectory of it
• CMS file name convention:
– Definition of a single framework object: _cfi.py
– A fragment of configuration commands: _cff.py
– A full process definition: _cfg.py
• To make your module visible to other python modules:
– Be sure your SCRAM environment is set up
– Go to your package and do scram b or scram b python
– Needed only once
• Correctness of python config files is checked on a basic level every
time scram is used.
Configuration Files - 2/2
Definition of terms: configuration file
• Controls the final job to be run
• Contains a cms.Process object named process
• Usually placed in a package’s python/ or test/
• Can be checked for completeness doing
python myExample_cfg.py (Python interpreter)
• Can be run using cmsRun
cmsRun myExample_cfg.py
Process Object:
“the protagonist” of the configuration
The process object
A typical process definition
• Process
process = cms.Process('RECO')
• An input source
process.source = cms.Source('PoolSource', ...)
• Some modules
process.jets = cms.EDProducer('JetProducer')
• Some services
process.tracer = cms.Service('Tracer')
• Some execution paths
process.p1 = cms.Path(process.a * process.b)
• Maybe output modules
process.out = cms.OutputModule(...
• Even full processes can be imported from elsewhere
CMS Python Config Types – 1/2
• Most of the objects you’ll create will be of a CMS-specific type. To
make them known to the interpreter you do:
import FWCore.ParameterSet.Config as cms
• Objects are then created with a syntax like:
jets = cms.EDProducer('JetReco',
    coneSize = cms.double(0.4),
    debug = cms.untracked.bool(True)
)
Warning:
The comma between the parameters is very important!
Forgetting the comma after line n results in a syntax error reported for
line n+1, so it is not straightforward to track down!
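The effect of a missing comma can be reproduced without CMSSW. The sketch below uses a plain dict(...) call as a stand-in for a cms.EDProducer(...) call (an assumption made so that it runs anywhere) and asks the interpreter where it reports the error. Which line is reported depends on the Python version.

```python
# Hypothetical config snippet with the comma missing after the first parameter.
broken_source = """jets = dict(
    coneSize = 0.4
    debug = True
)
"""


def find_syntax_error(source):
    # Return the SyntaxError raised when compiling source, or None if it compiles.
    try:
        compile(source, "<config>", "exec")
    except SyntaxError as error:
        return error
    return None


error = find_syntax_error(broken_source)
print("syntax error reported at line %s" % error.lineno)
```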
CMS Python Config Types – 2/2
CMS Python Types:
How to import objects – 1/2
• To fetch all objects from another Python module into the local namespace
from Subsystem.Package.Foo_cff import *
(looks into Subsystem/Package/python/Foo_cff.py)
• To fetch only single objects you can do
from Subsystem.Package.Foo_cff import a,b
• To use objects without putting them in the local namespace
import Subsystem.Package.Foo_cff as foo
• With foo.a and foo.b you can now access the objects
• Don’t forget that all imports create references, not copies:
changing an object at one place
changes the object at other places
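The reference behaviour of imports can be tried out in plain Python. In this sketch a module is built in memory so that no file is needed; the module name Foo_cff and its threshold list are hypothetical.

```python
import sys
import types

# Build a stand-in for Subsystem/Package/python/Foo_cff.py in memory.
fake_module = types.ModuleType("Foo_cff")
fake_module.threshold = [4.0]
sys.modules["Foo_cff"] = fake_module

import Foo_cff as first
import Foo_cff as second

# Changing the object through one name ...
first.threshold[0] = 9.0

# ... is visible through the other, because both names refer to the same object.
print(second.threshold[0])
print(first is second)
```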
Cloning
• Sometimes you need to add a module which has almost the same
parameters as another one
• You can copy the module and change the parameters that need to be
modified
import ElectroWeakAnalysis.ZReco.zToMuMu_cfi as zmumu
zToMuMuGolden = zmumu.zToMuMu.clone(
massMin = cms.double(40)
)
• Changing parameters while cloning should be preferred over clone + later
replace, as it is a much safer practice.
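What "clone and change" means can be sketched with a toy class. This is not the CMSSW implementation: ToyModule, the type name and the parameter values are assumptions for illustration only.

```python
import copy


class ToyModule(object):
    """Toy stand-in for a configured module; not the CMSSW implementation."""

    def __init__(self, type_name, **parameters):
        self.type_name = type_name
        self.parameters = dict(parameters)

    def clone(self, **overrides):
        # Deep-copy first so the original is never touched, then apply overrides.
        new_module = copy.deepcopy(self)
        new_module.parameters.update(overrides)
        return new_module


zToMuMu = ToyModule("ZToMuMuProducer", massMin=20.0, massMax=200.0)
zToMuMuGolden = zToMuMu.clone(massMin=40.0)

print(zToMuMu.parameters["massMin"])
print(zToMuMuGolden.parameters["massMin"])
```

The original keeps its value; only the clone sees the override, and both changes happen in one statement.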
How to import objects – 2/2
• To load everything from a python module into your process object
you can say:
process.load('Subsystem.Package.Foo_cff')
• Technical detail. This is identical to
import Subsystem.Package.Foo_cff
process.extend(Subsystem.Package.Foo_cff)
Copying all the parameters from a PSET
• Give the PSet name directly after module type
• Has to happen before the named parameters
KtJetParameters = cms.PSet(
    strategy = cms.string("Best")
)
ktCaloJets = cms.EDProducer("KtCaloJetProducer",
    KtJetParameters,
    coneSize = cms.double(0.7)
…
As the parameters get copied, a later change of KtJetParameters will
not get picked up
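The copy semantics can be mimicked with plain dictionaries. The helper make_module and the value "Fast" are hypothetical; the sketch only illustrates "copied at construction", not how CMSSW stores parameters.

```python
def make_module(type_name, *psets, **parameters):
    # Copy every entry of the given parameter sets, then add the named parameters.
    merged = {"type": type_name}
    for pset in psets:
        merged.update(pset)
    merged.update(parameters)
    return merged


KtJetParameters = {"strategy": "Best"}
ktCaloJets = make_module("KtCaloJetProducer", KtJetParameters, coneSize=0.7)

# A later change of the parameter set is not picked up by the module.
KtJetParameters["strategy"] = "Fast"
print(ktCaloJets["strategy"])
```

The toy signature also mirrors the ordering rule: in Python, positional arguments (the parameter set) must come before keyword arguments (the named parameters).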
Sequences, Paths and Schedules
Sequence:
• Defines an execution order and acts as a building block for more complex
configurations. Contains modules or other sequences.
trDigi = cms.Sequence(siPixelDigis + siStripDigis)
Path:
• Defines which modules and sequences to run.
p1 = cms.Path(pdigi * reconstruction)
EndPath:
• A list of analyzers or output modules to be run after all paths have been run.
outpath = cms.EndPath(myOutput)
Schedule:
• Defines the execution order of paths. If not given, all paths run first, then all
endpaths.
process.schedule = cms.Schedule(process.p1, process.outpath)
Sequence operators
“+” as 'follows':
• Use if the output of the previous module/sequence is not required as input
trDigi = cms.Sequence(siPixelDigis + siStripDigis)
“*” as 'depends on':
• If module depends on previously created products
p1 = cms.Path(pdigi * reconstruction)
• Enforced and checked by scheduler
Combining:
• By using () grouping is possible
(ecalRecHits + hcalRecHits) * caloTowers
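How such expressions can be written in Python at all is shown by the toy class below, which overloads "+" and "*" to build an ordered list of names. It is a sketch under that assumption only: the class Step is hypothetical and performs none of the dependency checking mentioned above.

```python
class Step(object):
    """Toy stand-in for a module or sequence; it only records an ordered list of names."""

    def __init__(self, *names):
        self.names = list(names)

    def __add__(self, other):
        # "+": run other after self.
        return Step(*(self.names + other.names))

    # In this toy model "*" yields the same order as "+";
    # no dependency between the steps is checked.
    __mul__ = __add__


ecalRecHits = Step("ecalRecHits")
hcalRecHits = Step("hcalRecHits")
caloTowers = Step("caloTowers")

sequence = (ecalRecHits + hcalRecHits) * caloTowers
print(sequence.names)
```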
Filters in paths
• When an EDFilter is in a path, returning False will cause the path to
terminate
• Two operators, ~ and -, can modify this:
1. ~ means not. The path will only continue if the filter returns False.
2. - means to ignore the result of the filter and proceed regardless.
jet500_1000 = cms.Path(
~jet1000filter + jet500filter + jetAnalysis
)
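The three cases (plain filter, ~ and -) can be played through with a toy model. ToyFilter, run_path and the fixed filter decisions are hypothetical and only mimic the behaviour described above; a Python list stands in for the path.

```python
class ToyFilter(object):
    """Toy stand-in for an EDFilter in a path; not the CMSSW implementation."""

    def __init__(self, name, decision, mode="normal"):
        self.name = name
        self.decision = decision  # hypothetical, fixed filter result
        self.mode = mode

    def __invert__(self):
        # "~filter": continue only if the filter returns False.
        return ToyFilter(self.name, self.decision, "not")

    def __neg__(self):
        # "-filter": run the filter but ignore its result.
        return ToyFilter(self.name, self.decision, "ignore")

    def passes(self):
        if self.mode == "ignore":
            return True
        if self.mode == "not":
            return not self.decision
        return self.decision


def run_path(steps):
    # Return the names of the steps that ran before the path stopped.
    executed = []
    for step in steps:
        executed.append(step.name)
        if not step.passes():
            break
    return executed


jet1000filter = ToyFilter("jet1000filter", False)
jet500filter = ToyFilter("jet500filter", True)

print(run_path([jet1000filter, jet500filter]))
print(run_path([~jet1000filter, jet500filter]))
print(run_path([-jet1000filter, jet500filter]))
```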
Tracked / Untracked
Tracked parameters:
• Parameters that change the content of the event
• Will be saved in the event data provenance
• Cannot be optional
– if asked for and not found, exception will be thrown
– constructs to circumvent this policy are highly discouraged
a = cms.double(3.0)
Untracked parameters:
• Parameters that don’t affect the results, e.g. debug level
• Can have default values
– They’re used for optional parameters, but not really considered safe practice.
a = cms.untracked.double(3.0)
Modifying parameters
• You are free to reach inside any parameter and change it
from Subsystem.Package.foo_cff import *
foo.threshold = 4.0
• Or via the process object
process.foo.threshold = 4.0
• If the right hand side of the expression is a CMS type, a new parameter
can be created
# this line will fail because of typos
g4.SimHits.useMagneticField = False
# this line will create a new parameter
# with wrong name
g4.SimHits.useMagneticField = cms.bool(False)
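The rule "a plain value may only modify, a CMS type may create" can be sketched in plain Python with __setattr__. TypedValue and ToyParameterSet are hypothetical stand-ins and not the CMSSW classes.

```python
class TypedValue(object):
    """Toy stand-in for a CMS type such as cms.bool."""

    def __init__(self, value):
        self.value = value


class ToyParameterSet(object):
    """Toy model of the rule described above; not the CMSSW implementation."""

    def __init__(self, **parameters):
        for name, value in parameters.items():
            object.__setattr__(self, name, TypedValue(value))

    def __setattr__(self, name, value):
        if isinstance(value, TypedValue):
            # A typed right-hand side may create a new parameter.
            object.__setattr__(self, name, value)
        elif name in self.__dict__:
            # A plain value may only modify an existing parameter.
            self.__dict__[name].value = value
        else:
            raise AttributeError("no parameter named %r" % name)


simHits = ToyParameterSet(UseMagneticField=True)

try:
    simHits.useMagneticField = False  # typo: lower-case "u"
    typo_caught = False
except AttributeError:
    typo_caught = True

simHits.UseMagneticField = False  # correct name: the value is modified
simHits.useMagneticField = TypedValue(False)  # typed value: a new parameter appears

print(typo_caught)
print(sorted(simHits.__dict__))
```

Assigning plain values therefore gives some protection against typos, while assigning a typed value under a misspelled name silently adds a parameter that nothing reads.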
Support and Documentation
• There is a twiki page in the SWGuide:
https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideAboutPythonConfigFile
• In case of questions and comments send an e-mail to
hn-cms-edmFramework
Backup
Yet another helloworld!
#! /usr/bin/env python
import os
# DP import the module for Operating System routines
operating_system = os.name
print "Hello World! I am running on a %s system!" %operating_system
$ ./helloworld.py
Hello World! I am running on a posix system!
Observation: os.py is physically a file: do not confuse CMSSW and Python
modules
More on the “import” statement to come!
Command line interpreter - 1/2
• Python has a very powerful command line interpreter
• Debugging
• Developing
Try your snippets in the command line interpreter
(incremental developing)
$ python
Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> operating_system = os.name
>>> print "Hello World! I am running on a %s system!" %operating_system
Hello World! I am running on a posix system!
>>>
• It helps the user in several ways: the dir, help and type functions
- dir:
>>> dir (os)
['EX_CANTCREAT', 'EX_CONFIG', … I CUT A BIT …, 'mkdir', 'mkfifo', 'mknod', 'name',
'nice', 'open', 'openpty', 'pardir', 'path', 'pathconf', 'pathconf_names',
'pathsep', 'pipe', 'popen', 'popen2', 'popen3', 'popen4', 'putenv', 'read',
'readlink', … I CUT AGAIN …, 'walk', 'write']
Command line interpreter - 2/2
- help:
>>> help (os)
Help on module os:
NAME
os - OS routines for Mac, NT, or Posix depending on what system we're on.
FILE
/usr/lib/python2.5/os.py
MODULE DOCS
http://www.python.org/doc/current/lib/module-os.html
DESCRIPTION
This exports:
- all functions from posix, nt, os2, mac, or ce, e.g. unlink, stat, etc.
- os.path is one of the modules posixpath, ntpath, or macpath
- os.name is 'posix', 'nt', 'os2', 'mac', 'ce' or 'riscos'
- os.curdir is a string representing the current directory ('.' or ':')
- os.pardir is a string representing the parent directory ('..' or '::')
- os.sep is the (or a most common) pathname separator ('/' or ':' or '\\')
- os.extsep is the extension separator ('.' or '/')
- os.altsep is the alternate pathname separator (None or '/')
- os.pathsep is the component separator used in $PATH etc
- os.linesep is the line separator in text files ('\r' or '\n' or '\r\n')
Programs that import and use 'os' stand a better chance of being
[MORE…]
- type:
>>> type(1), type ("Hello, I am a string")
(<type 'int'>, <type 'str'>)
An example of flexibility
class human:
    def __init__(self,name,height):
        self.name=name
        self.height=height
Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class human:
...     def __init__(self,name,height):
...         self.name=name
...         self.height=height
...
>>> person = human ("Fabio",180)
>>> dir(person)
['__doc__', '__init__', '__module__', 'height', 'name']
>>> person.tail = True
>>> dir(person)
['__doc__', '__init__', '__module__', 'height', 'name', 'tail']
Despite the class definition, we can add attributes/members to the single instance
Python Enthusiasm
But let's forget about programming Python and see what the configurations look like!
Sample
import FWCore.ParameterSet.Config as cms
process = cms.Process("SIM")
# Input source
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring('file:gen.root')
)
# Modules
process.load("Configuration.StandardSequences.VtxSmearedGauss_cff")
process.load("SimG4Core.Application.g4SimHits_cfi")
process.g4SimHits.UseMagneticField = False
# Output
process.load("Configuration.EventContent.EventContent_cff")
process.FEVT = cms.OutputModule("PoolOutputModule",
    process.FEVTSIMEventContent,
    fileName = cms.untracked.string('sim.root')
)
# Execution paths
process.p1 = cms.Path(process.VtxSmeared+process.g4SimHits)
Tools – 1/3
edmPythonSearch
• A “grep”-like syntax to search for identifiers within imported files
> edmPythonSearch minPt Reconstruction_cff
...
RecoMuon.MuonIdentification.muons_cfi (line: 19) : minPt = cms.double(1.5),
...
Tools – 2/3
edmPythonTree
• Gives an indented dump of which files are included by which files
(initial version for the old configs by Karoly Banicz and Sue Anne Koay)
> edmPythonTree Reconstruction_cff
+ Simulation_cff
+ Configuration.StandardSequences.Digi_cff
+ SimCalorimetry.Configuration.SimCalorimetry_cff
...
Tools – 3/3
Python
• The Python interpreter helps you inspect your configs
> python -i Reconstruction_cff
• Ctrl-D shuts it down