CMSSW Session
Benedikt Hegner, Danilo Piparo
10-9-08
GridKa School of Computing
Outline
• Framework concepts
• CmsRun configuration syntax
• Questions
• CmsRun example
• CmsRun configuration and runtime debugging
• Some tools that help you with your daily work
• Questions
Basic concepts
• One executable: cmsRun
• Event Data Model (EDM) based on the Event:
– Single entity in memory: the edm::Event container
– Modular content
• Modular architecture
– Module: component to be plugged in cmsRun
– Unit of clearly defined event-processing functionality
Flow of the data
• Source
• EDProducer
• EDFilter
• EDAnalyzer
• OutputModule
Communication among modules only via the event
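The rule that modules talk to each other only through the event can be illustrated with a toy model in plain Python. This is only a sketch: the functions and the dictionary used as the event are hypothetical stand-ins, not the CMSSW framework classes.

```python
# Toy model only: plain Python stand-ins, not the CMSSW framework classes.

def source():
    """Yield events; each event is a dict acting as the event container."""
    for number in range(3):
        yield {"id": number}

def producer(event):
    """EDProducer role: add a new object to the event."""
    event["energy"] = 10.0 * event["id"]
    return True

def energy_filter(event):
    """EDFilter role: returning False stops the path for this event."""
    return event["energy"] > 5.0

analysed = []

def analyzer(event):
    """EDAnalyzer role: read an object from the event."""
    analysed.append(event["energy"])
    return True

# The path fixes the order; no module calls another module directly.
path = [producer, energy_filter, analyzer]

for event in source():
    for module in path:
        if not module(event):
            break

print(analysed)
```

Each function only reads from or writes to the event it is handed, so modules can be swapped or reordered without touching the others.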
Framework modules
• EDProducer
- add objects into the event
• EDAnalyzer
- analyse objects from the event
• EDFilter
- can stop an execution path and put objects into the event
• EDLooper
- For multi-pass looping over events
• OutputModule
- Write events to a file. Can use filter decisions
Framework services
• Examples of framework services: geometry, calibration, MessageLogger
• Types are:
- ESSource: provides data which have an IOV (Interval of Validity)
- ESProducer: creates products when the IOV changes
• Services need not be IOV dependent, e.g. MessageLogger
The data format
• Events are physically stored in ROOT files
• Different file content types:
– FEVT: Full EVenT
– RECO: RECOnstruction
– RECOSIM: RECOnstruction + selected simulation information
– AOD: Analysis Object Data (compact subset of RECO)
– AODSIM: AOD + generator information
– …
Files: access with ROOT
root.exe
[] TFile f("myfile.root")
[] TBrowser b
Files: access with FWLite
Automatic library loading
[] gSystem->Load("libFWCoreFWLite")
[] AutoLibraryLoader::enable()
[] new TBrowser()
Accessing Event Data
Inside the event, each piece of data is called a “product” and is identified by
moduleLabel:productInstanceLabel:processName
// by module and default product label
Handle<TrackVector> trackPtr;
iEvent.getByLabel("tracker", trackPtr );
// by module and product label
Handle<SimHitVector> simPtr;
iEvent.getByLabel("detsim", "pixel" ,simPtr );
// by type
vector<Handle<SimHitVector> > allPtr;
iEvent.getByType( allPtr );
// by Selector
ParameterSelector<int> coneSel("coneSize",5);
Handle<JetVector> jetPtr;
iEvent.get( coneSel, jetPtr );
Python Interlude
• Widely used
• Interpreted
• Object oriented (“everything is an object”)
• Intuitive and nice syntax – easy to learn
• Extensive standard library – lots of extensions
• Call by reference:
– Immutable types (integers, strings) cannot be changed in place, so they behave like values
– Be careful about this
• Control flow based on indentation
• Lots of flexibility for the user!
• The only supported configuration language since CMSSW_2_1_0
Python site: www.python.org
Yet another Python Course: www-ekp.physik.uni-karlsruhe.de/~piparo/cgi-bin/python_seminar.py
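The call-by-reference point above can be tried out with a short snippet. It is a minimal sketch in plain Python; the function and variable names are made up for illustration.

```python
def rescale(values, factor):
    """Double a list in place and rebind an integer locally."""
    for i in range(len(values)):
        values[i] *= 2      # mutates the caller's list
    factor += 1             # rebinds the local name only
    return factor

cuts = [1.0, 2.0]
scale = 5
new_scale = rescale(cuts, scale)

# The list changed for the caller, the integer did not.
print(cuts, scale, new_scale)
```

The list is shared between caller and function, while the integer cannot be modified in place, so the caller's `scale` keeps its value.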
A general Philosophy
• As in C++ / C everything is a pointer ...
• As in Linux everything is a file (cat /proc/meminfo) ...
In Python everything is an object
Configuration Files - 1/2
Definition of terms: Python module
• A python file that is meant to be included by other files
• Placed in Subsystem/Package/python/ or a subdirectory of it
• CMS file name convention:
– Definition of a single framework object: _cfi.py
– A fragment of configuration commands: _cff.py
– A full process definition: _cfg.py
• To make your module visible to other python modules:
– Be sure your SCRAM environment is set up
– Go to your package and do scram b or scram b python
– Needed only once
• Correctness of python config files is checked on a basic level every
time scram is used.
Configuration Files - 2/2
Definition of terms: configuration file
• Controls the final job to be run
• Contains a cms.Process object named process
• Usually placed in a package’s python/ or test/
• Can be checked for completeness with the Python interpreter:
python myExample_cfg.py
• Can be run using cmsRun:
cmsRun myExample_cfg.py
Process object: “the protagonist” of the configuration
The process object
A typical process definition
• Process
process = cms.Process('RECO')
• An input source
process.source = cms.Source('PoolSource', ...)
• Some modules
process.jets = cms.EDProducer('JetProducer')
• Some services
process.tracer = cms.Service('Tracer')
• Some execution paths
process.p1 = cms.Path(process.a * process.b)
• Maybe output modules
process.out = cms.OutputModule(...)
• Even full processes can be imported from elsewhere
CMS Python Config Types – 1/2
• Most of the objects you’ll create will be of a CMS-specific type. To
make them known to the interpreter you do:
import FWCore.ParameterSet.Config as cms
• Objects are then created with a syntax like:
jets = cms.EDProducer('JetReco',
    coneSize = cms.double(0.4),
    debug = cms.untracked.bool(True)
)
Warning:
The comma between the parameters is very important!
Forgetting the comma after line n results in a syntax error reported for
line n+1. So not straightforward to track down!
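The missing-comma pitfall can be reproduced without CMSSW by letting Python compile two small text fragments. This is only a sketch: `module` is a hypothetical name standing in for a call such as cms.EDProducer, and the exact line number reported depends on the Python version.

```python
# Plain Python stand-in: `module` is hypothetical, not cms.EDProducer.
good = """module('JetReco',
    coneSize = 0.4,
    debug = True
)"""

# Same text, but the comma after the coneSize line is missing.
bad = """module('JetReco',
    coneSize = 0.4
    debug = True
)"""

def first_error_line(source):
    """Return the line number of the syntax error, or None if the text parses."""
    try:
        compile(source, "<config>", "exec")
    except SyntaxError as err:
        return err.lineno
    return None

good_line = first_error_line(good)
bad_line = first_error_line(bad)
print(good_line, bad_line)
```

Compiling only checks the syntax, so nothing is executed and the undefined name `module` does not matter here.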
CMS Python Config Types – 2/2
CMS Python Types:
How to import objects – 1/2
• To fetch all modules from some other module into local namespace
from Subsystem.Package.Foo_cff import *
(looks into Subsystem/Package/python/Foo_cff.py)
• To fetch only single objects you can do
from Subsystem.Package.Foo_cff import a,b
• To use objects without putting them in the local namespace
import Subsystem.Package.Foo_cff as foo
• With foo.a and foo.b you can now access the objects
• Don’t forget that all imports create references, not copies: changing an object in one place changes it everywhere else
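That imports hand out references rather than copies can be checked in plain Python. The sketch below registers a hypothetical in-memory module called Foo_cff so that it runs without a CMSSW area; the dictionary `a` merely stands in for a configuration object.

```python
import sys
import types

# Hypothetical stand-in for Subsystem/Package/python/Foo_cff.py
foo_cff = types.ModuleType("Foo_cff")
foo_cff.a = {"threshold": 3.0}
sys.modules["Foo_cff"] = foo_cff

from Foo_cff import a       # first way of importing
import Foo_cff as foo       # second way of importing

a["threshold"] = 4.0        # change the object through one name

shared = foo.a is a         # both names refer to the same object
print(shared, foo.a["threshold"])
```

Both import styles give access to the very same object, which is why a change made in one file is visible in every other file that imported it.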
Cloning
• Sometimes you need to add a module which has almost the same
parameters as another one
• You can copy the module and change only the parameters that need to be
modified
import ElectroWeakAnalysis.ZReco.zToMuMu_cfi as zmumu
zToMuMuGolden = zmumu.zToMuMu.clone(
massMin = cms.double(40)
)
• Changing parameters while cloning is preferred over cloning and replacing
them later, as it is a much safer practice.
How to import objects – 2/2
• To load everything from a python module into your process object
you can say:
process.load('Subsystem.Package.Foo_cff')
• Technical detail. This is identical to
import Subsystem.Package.Foo_cff
process.extend(Subsystem.Package.Foo_cff)
Copying all the parameters from a PSet
• Give the PSet name directly after the module type
• Has to happen before the named parameters
KtJetParameters = cms.PSet(
    strategy = cms.string("Best")
)
ktCaloJets = cms.EDProducer("KtCaloJetProducer",
    KtJetParameters,
    coneSize = cms.double(0.7)
…
As the parameters are copied, a later change to KtJetParameters will
not be picked up
Sequences, Paths and Schedules
Sequence:
• Defines an execution order, contains modules or other sequences, and acts
as a building block for more complex configurations.
trDigi = cms.Sequence(siPixelDigis + siStripDigis)
Path:
• Defines which modules and sequences to run.
p1 = cms.Path(pdigi * reconstruction)
EndPath:
• A list of analyzers or output modules to be run after all paths have been run.
outpath = cms.EndPath(myOutput)
Schedule:
• Defines the execution order of paths. If no schedule is given, all paths run
first, then all endpaths.
process.schedule = cms.Schedule(process.p1, process.outpath)
Sequence operators
“+” as ‘follows’:
• Use if the output of the previous module/sequence is not required
trDigi = cms.Sequence(siPixelDigis + siStripDigis)
“*” as ‘depends on’:
• Use if the module depends on previously created products
p1 = cms.Path(pdigi * reconstruction)
• Enforced and checked by the scheduler
Combining:
• Grouping is possible by using ()
(ecalRecHits + hcalRecHits) * caloTowers
Filters in paths
• When an EDFilter is in a path, returning False will cause the path to
terminate
• Two operators, ~ and -, can modify this:
1. ~ means not: the path will only continue if the filter returns False.
2. - means ignore: the result of the filter is ignored and the path proceeds regardless.
jet500_1000 = cms.Path(
~jet1000filter + jet500filter + jetAnalysis
)
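The effect of the filter modifiers on a path can be illustrated with a toy model in plain Python. This is only a sketch of the semantics described above: run_path, the mode strings and the jet_pt key are hypothetical, and none of this is the CMSSW API.

```python
# Toy model of a path; these helpers are hypothetical, not the CMSSW API.

def run_path(steps, event):
    """Run the steps in order and return the names of the modules that ran."""
    ran = []
    for name, module, mode in steps:
        ran.append(name)
        passed = module(event)
        if mode == "not":        # plays the role of ~filter
            passed = not passed
        elif mode == "ignore":   # plays the role of -filter
            passed = True
        if not passed:
            break                # the path terminates here
    return ran

def jet1000filter(event):
    return event["jet_pt"] > 1000.0

def jet500filter(event):
    return event["jet_pt"] > 500.0

def jetAnalysis(event):
    return True

# Same order as in the jet500_1000 path: ~jet1000filter + jet500filter + jetAnalysis
jet500_1000 = [
    ("jet1000filter", jet1000filter, "not"),
    ("jet500filter", jet500filter, "plain"),
    ("jetAnalysis", jetAnalysis, "plain"),
]

mid = run_path(jet500_1000, {"jet_pt": 700.0})
high = run_path(jet500_1000, {"jet_pt": 1200.0})
low = run_path(jet500_1000, {"jet_pt": 300.0})
print(mid, high, low)
```

Only the event with a jet between the two thresholds reaches the analysis step; the other two stop at the first filter that rejects them.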
Tracked / Untracked
Tracked parameters:
• Parameters that change the content of the event
• Will be saved in the event data provenance
• Cannot be optional
– if asked for and not found, exception will be thrown
– constructs to circumvent this policy are highly discouraged
a = cms.double(3.0)
Untracked parameters:
• Parameters that don’t affect the results, e.g. debug level
• Can have default values
– They’re used for optional parameters, but not really considered safe practice.
a = cms.untracked.double(3.0)
Modifying parameters
• You are free to reach inside any parameter and change it
from Subsystem.Package.foo_cff import *
foo.threshold = 4.0
• Or via the process object
process.foo.threshold = 4.0
• If the right hand side of the expression is a CMS type, a new parameter
can be created
# this line will fail because of typos
g4.SimHits.useMagneticField = False
# this line will create a new parameter
# with wrong name
g4.SimHits.useMagneticField = cms.bool(False)
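The rule behind these two cases can be modelled in a few lines of plain Python. This is a minimal sketch and not how CMSSW implements it: the classes Typed and ParameterSet are hypothetical stand-ins for a CMS type such as cms.bool and for a module's parameter container.

```python
# Toy model of the rule; Typed and ParameterSet are hypothetical stand-ins,
# not the classes from FWCore.ParameterSet.Config.

class Typed:
    """Plays the role of a CMS type such as cms.bool."""
    def __init__(self, value):
        self.value = value

class ParameterSet:
    def __init__(self, **parameters):
        for name, value in parameters.items():
            object.__setattr__(self, name, Typed(value))

    def __setattr__(self, name, value):
        if isinstance(value, Typed):
            # A typed value may create a new parameter, even a misspelled one.
            object.__setattr__(self, name, value)
        elif name in self.__dict__:
            # A plain value may only update an existing parameter.
            self.__dict__[name].value = value
        else:
            raise AttributeError("no parameter named %r" % name)

sim_hits = ParameterSet(UseMagneticField=True)

try:
    sim_hits.useMagneticField = False       # typo with a plain value: rejected
    typo_rejected = False
except AttributeError:
    typo_rejected = True

sim_hits.useMagneticField = Typed(False)    # typo with a typed value: accepted
sim_hits.UseMagneticField = False           # correct name: updated

print(typo_rejected, sorted(vars(sim_hits)))
```

Assigning plain Python values to existing parameters therefore gives some protection against typos, whereas wrapping the value in a type silently adds a second, unused parameter.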
Support and Documentation
• There is a twiki page in the SWGuide:
https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideAboutPythonConfigFile
• In case of questions and comments send an e-mail to
hn-cms-edmFramework
Backup
Yet another helloworld!
#! /usr/bin/env python
# Import the module for operating system routines
import os
operating_system = os.name
print("Hello World! I am running on a %s system!" % operating_system)
$ ./helloworld.py
Hello World! I am running on a posix system!
Observation: os.py is physically a file: do not confuse CMSSW and Python
modules
More on the “import” statement to come!
Command line interpreter - 1/2
• Python has a very powerful command line interpreter
• Debugging
• Developing
Try your snippets in the command line interpreter
(incremental developing)
$ python
Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> operating_system = os.name
>>> print("Hello World! I am running on a %s system!" % operating_system)
Hello World! I am running on a posix system!
>>>
• It helps the user in several ways: the dir, help and type functions
- dir:
>>> dir(os)
['EX_CANTCREAT', 'EX_CONFIG', … I CUT A BIT … 'mkdir', 'mkfifo', 'mknod', 'name',
'nice', 'open', 'openpty', 'pardir', 'path', 'pathconf', 'pathconf_names',
'pathsep', 'pipe', 'popen', 'popen2', 'popen3', 'popen4', 'putenv', 'read',
'readlink', … I CUT AGAIN …, 'walk', 'write']
Command line interpreter - 2/2
- help:
>>> help (os)
Help on module os:
NAME
os - OS routines for Mac, NT, or Posix depending on what system we're on.
FILE
/usr/lib/python2.5/os.py
MODULE DOCS
http://www.python.org/doc/current/lib/module-os.html
DESCRIPTION
This exports:
- all functions from posix, nt, os2, mac, or ce, e.g. unlink, stat, etc.
- os.path is one of the modules posixpath, ntpath, or macpath
- os.name is 'posix', 'nt', 'os2', 'mac', 'ce' or 'riscos'
- os.curdir is a string representing the current directory ('.' or ':')
- os.pardir is a string representing the parent directory ('..' or '::')
- os.sep is the (or a most common) pathname separator ('/' or ':' or '\\')
- os.extsep is the extension separator ('.' or '/')
- os.altsep is the alternate pathname separator (None or '/')
- os.pathsep is the component separator used in $PATH etc
- os.linesep is the line separator in text files ('\r' or '\n' or '\r\n')
Programs that import and use 'os' stand a better chance of being
- type:
>>> type(1), type("Hello, I am a string")
(<type 'int'>, <type 'str'>)
An example of flexibility
class human:
    def __init__(self,name,height):
        self.name=name
        self.height=height
Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class human:
...     def __init__(self,name,height):
...         self.name=name
...         self.height=height
...
>>> person = human("Fabio",180)
>>> dir(person)
['__doc__', '__init__', '__module__', 'height', 'name']
>>> person.tail = True
>>> dir(person)
['__doc__', '__init__', '__module__', 'height', 'name', 'tail']
Despite the class definition, we can add attributes/members to the single instance
Python Enthusiasm
But let’s forget about programming Python and see what the configurations look like!
Sample
import FWCore.ParameterSet.Config as cms
process = cms.Process("SIM")
# Input source
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring('file:gen.root')
)
# Modules
process.load("Configuration.StandardSequences.VtxSmearedGauss_cff")
process.load("SimG4Core.Application.g4SimHits_cfi")
process.g4SimHits.UseMagneticField = False
# Output
process.load("Configuration.EventContent.EventContent_cff")
process.FEVT = cms.OutputModule("PoolOutputModule",
    process.FEVTSIMEventContent,
    fileName = cms.untracked.string('sim.root')
)
# Execution paths
process.p1 = cms.Path(process.VtxSmeared+process.g4SimHits)
Tools – 1/3
EdmPythonSearch
• A “grep”-like syntax to search for identifiers within imported files
> edmPythonSearch minPt Reconstruction_cff
...
RecoMuon.MuonIdentification.muons_cfi (line: 19) : minPt = cms.double(1.5),
...
Tools – 2/3
EdmPythonTree
• Gives an indented dump of which files are included by which files
(initial version for the old configs by Karoly Banicz and Sue Anne Koay)
> edmPythonTree Reconstruction_cff
+ Simulation_cff
+ Configuration.StandardSequences.Digi_cff
+ SimCalorimetry.Configuration.SimCalorimetry_cff
...
Tools – 3/3
Python
• The Python interpreter helps you inspect your configs
> python -i Reconstruction_cff
• Ctrl-D shuts it down