DAQ upgrades at SLHC
S. Cittolin, CERN/CMS, 22/03/07

DAQ architecture: TDR-2003, DAQ evolution and upgrades

DAQ baseline structure (TDR design implementation):
- Collision rate: 40 MHz
- Level-1 maximum trigger rate: 100 kHz
- Average event size: ~1 Mbyte
- Data production: ~1 Tbyte/day
- No. of In-Out units: 512
- Readout network bandwidth: ~1 Terabit/s
- Event filter computing power: ~10^6 SI95
- No. of PC motherboards: thousands (SLHC: x10)

Evolution of DAQ technologies and structures (detector readout, event building, on-line processing, off-line data store):
- PS, 1970-80: minicomputers; custom readout design; first standard: CAMAC; kByte/s
- LEP, 1980-90: microprocessors; HEP standards (Fastbus); embedded CPUs, industry standards (VME); MByte/s
- LHC, 200X: networks/grids; IT commodities, PCs, clusters, Internet, Web, etc.; GByte/s

DAQ-TDR layout
Architecture: distributed DAQ. An industry-based readout network with Tbit/s aggregate bandwidth between the readout and event-filter DAQ nodes (FED & FU) is achieved by two stages of switches interconnected by layers of intermediate data concentrators (FRL, RU, BU). Myrinet optical links driven by custom hardware are used as the FED Builder and to transport the data to the surface (D2S, 200 m). Gigabit Ethernet switches driven by standard PCs are used in the event-builder layer (DAQ slices and Terabit/s switches). The Filter Farm and mass storage are scalable, expandable clusters of services. Equipment replacements (M+O): FU PCs every 3 years, mass storage every 4 years, DAQ PCs & networks every 5 years.

LHC --> SLHC:
• Same Level-1 trigger rate; 20 MHz crossing; higher occupancy
• A factor of 10 (or more) in readout bandwidth, processing and storage
• The factor of 10 will come with technology; the architecture has to exploit it
• New digitizers are needed (and a new detector-DAQ interface)
• More flexibility is needed in handling the event data and operating the experiment
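As a sanity check, the baseline parameters above tie together: a 100 kHz Level-1 rate times ~1 Mbyte events reproduces the quoted ~1 Terabit/s readout-network figure. A minimal sketch, using only the TDR numbers quoted above:

```python
# Back-of-envelope check of the DAQ-TDR baseline parameters.
LV1_RATE_HZ = 100e3       # Level-1 maximum trigger rate: 100 kHz
EVENT_SIZE_B = 1e6        # average event size: ~1 Mbyte

# Aggregate readout-network bandwidth in bit/s:
readout_bw_bits = LV1_RATE_HZ * EVENT_SIZE_B * 8
print(f"~{readout_bw_bits / 1e12:.1f} Tbit/s")  # ~0.8 Tbit/s, i.e. the ~1 Terabit/s TDR figure
```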
Computing and communication trends
[Figure: storage and network density trends, from ~10 Mbit/in^2 toward ~1000 Gbit/in^2, extrapolated to 2015-2020 and the SLHC era]

Industry Tbit/s (2004), http://www.force10networks.com/products/reports.asp:
"Above is a photograph of the configuration used to test the performance and resiliency metrics of the Force10 TeraScale E-Series. During the tests, the Force10 TeraScale E-Series demonstrated 1 billion packets per second throughput, making it the world's first Terabit switch/router, and zero packet loss hitless fail-over at Terabit speeds. The configuration required 96 Gigabit Ethernet and eight 10 Gigabit Ethernet ports from Ixia, in addition to cabling for 672 Gigabit Ethernet and 56 Ten Gigabit Ethernet."

LHC-DAQ: trigger loop & data flow

Acronyms:
- BU: Builder Unit
- DCS: Detector Control System
- EVB: EVent Builder
- EVM: EVent Manager
- FU: Filter Unit
- FRL: Front-end Readout Link
- FB: FED Builder
- FES: Front End System
- FEC: Front End Controller
- FED: Front End Driver
- GTP: Global Trigger Processor
- LV1: Level-1 trigger
- RTP: Regional Trigger Processor
- TPG: Trigger Primitive Generator
- TTC: TIming, Trigger and Control
- TTS: Trigger Throttle System
- RCMS: Run Control and Monitor System
- RU: Readout Unit

Trigger loop:
- TTC: from the GTP to the FEDs/FES, distributed by an optical tree (~1000 leaves); carries the LV1 accept (20 MHz rep rate), resets and B commands (1 MHz rep rate) and control data (1 Mbit/s)
- sTTS: from the FEDs to the GTP (N to 1), collected by an FMM tree (~700 roots)
- aTTS: from DAQ nodes to the TTS via a service network (e.g. Ethernet, ~1000s of nodes); transitions within a few BXs
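To make the N-to-1 sTTS collection above concrete, here is a minimal sketch of the worst-state merging an FMM tree performs; the state names, their ordering and the 32-input stage width are illustrative assumptions, not the exact hardware encoding:

```python
# Sketch of the N-to-1 state merging an FMM tree performs for sTTS:
# each merge stage reports the "worst" (most severe) state of its inputs.
SEVERITY = {"Ready": 0, "Warning": 1, "Busy": 2, "OutOfSync": 3, "Error": 4}

def merge(states):
    """One FMM stage: return the most severe state among the inputs."""
    return max(states, key=SEVERITY.__getitem__)

# ~700 FED sources funnel through a tree of (hypothetical) 32-input stages:
feds = ["Ready"] * 700
feds[123] = "Busy"            # one FED asserts back-pressure
stage1 = [merge(feds[i:i + 32]) for i in range(0, len(feds), 32)]
print(merge(stage1))          # the GTP sees the merged state and can throttle LV1
```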
FRL & DAQ networks (NxN):
- Data to Surface (D2S), FRL-FB: 1000 Myrinet links and switches, up to 2 Tb/s
- TTC: 1 to N
- Readout Builders (RB), RU-BU/FU: 3000-port GbEthernet switches, 1 Tb/s sustained

LHC-DAQ: event description
- FED-FRL-RU: each FED data fragment carries the TTCrx Event No. (local) and Orbit/BX (local)
- Readout Unit: receives from the EVM an EVB token and the BU destination for its event fragments
- Filter Unit: receives from the GT the Event No. (local); builds the full event data and adds Time (absolute), Run Number, Event Type and HLT info
- Storage Manager: stores built events with Event # (central), Time (absolute), Run Number, Event Type, HLT info and the full event data

SLHC: event handling
• TTC: the full event info (Event # central, absolute time, Run Number, Event Type) is carried in each event fragment, distributed from the GTP-EVM via the TTC
• FED-FRL interface: e.g. PCI; the FED-FRL system must be able to handle high-level data communication protocols (e.g. TCP/IP)
• FRL-DAQ commercial interface: e.g. Ethernet (10 GbE or multi-GbE), to be used for data readout as well as FED configuration, control, local DAQ, etc. A new FRL design can be based on an embedded PC (e.g. ARM-like) or an advanced FPGA. The network interface can be used in place of the VME backplane to configure and control the new FEDs. Current FRL-RU systems can be used (with some limitations) for those detectors not upgrading their FED design.

SLHC DAQ upgrade (architecture evolution)
Another step toward a uniform network architecture:
• All DAQ sub-systems (FED-FRL included) are interfaced to a single multi-Terabit/s network, and the same network technology (e.g. Ethernet) is used to read out and control all DAQ nodes (FED, FRL, RU, BU, FU, eMS, DQM, ...)
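The fragment-to-full-event flow above can be sketched as a toy Builder Unit that collects one fragment per readout source for each event number and declares the event built when all sources have reported. The names (`BuilderUnit`, `add_fragment`) and the three-source setup are illustrative, not CMS code:

```python
# Toy event builder: fragments keyed by event number, one per source.
from collections import defaultdict

class BuilderUnit:
    def __init__(self, n_sources):
        self.n_sources = n_sources
        self.pending = defaultdict(dict)    # event_no -> {source: payload}

    def add_fragment(self, event_no, source, payload):
        """Store a fragment; return the full event once all sources reported."""
        self.pending[event_no][source] = payload
        if len(self.pending[event_no]) == self.n_sources:
            return self.pending.pop(event_no)   # complete: hand to a Filter Unit
        return None                             # still waiting for fragments

bu = BuilderUnit(n_sources=3)
bu.add_fragment(42, "FED0", b"aa")
bu.add_fragment(42, "FED1", b"bb")
event = bu.add_fragment(42, "FED2", b"cc")  # last fragment completes the event
print(sorted(event))
```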
• In addition to the Level-1 Accept and the other synchronous commands, the Trigger & Event Manager (TTC) transmits to the FEDs the complete event description: event number, event type, orbit, BX and the event destination address, i.e. the processing system (CPU, cluster, Tier, ...) where the event is to be built and analyzed
• Event-fragment delivery, and therefore event building, will be warranted by the network protocols and the (commercial) network's internal resources (buffers, multi-path routing, network processors, etc.); FED-FU push or pull protocols can be employed
• Real-time buffers of PBytes of temporary storage disk will cover a real-time interval of days, allowing the event-selection tasks to better exploit the available distributed processing power

SLHC: trigger info distribution

1. Trigger-synchronous data (20 MHz, ~100 bits/LV1):
- clock distribution & Level-1 accept
- fast commands (resets, orbit0, re-sync, switch context, etc.)
- fast parameters (e.g. trigger/readout type, mask, etc.)

2. Event-synchronous data packet (~1 Gb/s, a few thousand bits/LV1):
- Event Number: 64 bits (as before; local counters will be used to synchronize and check)
- Orbit/time and BX No.: 64 bits
- Run Number: 64 bits
- Event type: 256 bits
- Event destination(s): 256 bits (e.g. a list of IP indexes: main EVB + DQMs, ...)
- Event configuration: 256 bits (e.g. information for the FU event builder)
Each event fragment will contain the full event description; events can be built in push or pull mode. Items 1 and 2 will allow many TTC operations to be done independently per TTC partition (e.g. local re-sync, local reset, partial readout, etc.)

3. Asynchronous control information:
- configuration parameters (delays, registers, contexts, etc.)
- maintenance commands, auto-test

Additional features? A readout control signal collector (a la TTS)? An sTTS replacement?
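A sketch of packing the event-synchronous data packet listed above (three 64-bit words plus three 256-bit fields); the byte layout and field values are assumptions for illustration, not a defined SLHC format:

```python
# Pack the (hypothetical) event-synchronous record: 3 x 64-bit words
# followed by 3 x 256-bit (32-byte) fields, big-endian.
import struct

def pack_event_record(event_no, orbit_bx, run_no, ev_type, dest, config):
    return (struct.pack(">QQQ", event_no, orbit_bx, run_no)
            + ev_type.to_bytes(32, "big")
            + dest.to_bytes(32, "big")
            + config.to_bytes(32, "big"))

pkt = pack_event_record(event_no=12345,
                        orbit_bx=(678 << 12) | 2748,   # orbit in high bits, BX low
                        run_no=9001,
                        ev_type=1,
                        dest=0xC0A80001,               # e.g. an IP-like index
                        config=0)
print(len(pkt) * 8, "bits per LV1")  # 960 payload bits; framing/CRC overhead
                                     # brings it toward "a few thousand bits/LV1"
```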
Such an sTTS replacement could:
- collect FED status (at the LV1 rate, from >1000 sources?)
- collect local information (e.g. statistics) on demand
- handle maintenance commands, auto-test, statistics, and configuration parameters (delays, registers, contexts, etc.)

Summary

DAQ design:
- Architecture: will be upgraded, enhancing scalability and flexibility and exploiting the commercial network and computing technologies to the maximum (via M+O)
- Software: configuring, controlling and monitoring a large set of heterogeneous, distributed computing systems will continue to be a major issue

Hardware developments. Increased performance and additional functionality are required for event-data handling (use of standard protocols, selective actions, etc.):
- a new timing, trigger and event-info distribution system
- a general front-end DAQ interface (for the new detector readout electronics) handling high-level network protocols via commercial network cards

The best DAQ R&D programme for the next years is the implementation of the 2003 DAQ-TDR. Also required is permanent contact between SLHC developers and LHC-DAQ implementors, to understand and define the scope, requirements and specifications of the new systems.

DAQ data flow and computing model
[Figure: event flow at the Level-1 and HLT output rates through the multi-Terabit/s event builder network to the filter farms & analysis and the DQM & storage servers, with FRL direct network access]
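The SLHC scaling running through these slides can be put in rough numbers. The x10 factor is from the slides; the disk-buffer size and HLT output rate below are hypothetical examples, chosen only to show how a PByte-scale buffer reaches a coverage of days:

```python
# Rough SLHC dataflow scaling (a sketch; only the x10 factor is from
# the slides, the storage figures are assumed for illustration).
lhc_readout_tbps = 100e3 * 1e6 * 8 / 1e12   # TDR baseline: ~0.8 Tbit/s
slhc_readout_tbps = lhc_readout_tbps * 10   # the "multi-Terabit/s" EVB network
print(f"SLHC event-builder bandwidth: ~{slhc_readout_tbps:.0f} Tbit/s")

# A PByte real-time buffer covers days for the HLT-accepted stream,
# assuming (hypothetically) ~10 GByte/s written to storage:
buffer_pb, hlt_out_gBps = 3, 10
days = buffer_pb * 1e15 / (hlt_out_gBps * 1e9) / 86400
print(f"{buffer_pb} PByte buffer covers ~{days:.1f} days")
```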