FeeCom software during TPC commissioning (Benchmarks)
22-01-2007

Sebastian Bablok, Dag Toppe Larsen, Matthias Richter, Benjamin Schockert
Department of Physics and Technology, University of Bergen, Norway
Center for Telecommunication and Technology Transfer, University of Applied Science Worms, Germany

TOC
• TPC commissioning, DCS-FEE part
  – Setup overview
  – Observations
  – Conclusion
• Benchmarks during commissioning
  – Results
  – Remarks
• Future plans

Front-End-Electronics in DCS
[Diagram: control and monitor channels]
• Supervisory Layer: PVSS II (FED Client)
• Front-End Device Interface (FED)
• Control Layer: FED Server, InterComLayer, FEE Client; configuration data loaded from file OR database (Config. File, Config. DB)
• Front-End Electronics Interface (FEE)
• Field Layer: FeeServers, internal bus systems, hardware devices
• Channels: Cmd / ACK Channel, Service Channel, Message Channel

Schematic layout for commissioning
[Diagram]
• External network: PVSS (incl. FedClient), switch
• tpcfee01 (ICL), tpcfee02 (Test-FedClient)
• Internal network: switch, 6 DCS boards (FeeServer incl. TPC CE)
• Link speeds shown in the diagram: 100 MBit/s, 10 MBit/s, 100 MBit/s

DCS network setup
• Based on standard protocols/tools: DHCP, DNS, NFS
• DCS boards on private network 10.x.x.x
• .feenet used as local TLD
• Board number used for MAC and IP addresses (24 LSB) and hostname alias (dcs<board#>.feenet)
• Gateway running ICL provides communication with the outside world
• Hostname in format tpc-fee_x_yy_z.feenet, with dcs<board#>.feenet as alias
• FeeServer name set from hostname
• FeeServer stored on and run from external NFS share
• Logs written to NFS share

DCS bootup
• MAC address set to board number
• DCS board sends MAC address to DHCP server, requesting IP address and hostname
• DHCP server looks up IP address for MAC address, then queries Domain Name Server for hostname matching IP address
• DHCP server returns IP configuration and hostname to DCS board
• DCS board mounts two NFS shares, one RO and one RW
• Boot script run from RO shared directory
  – May start update scripts
  – Starts FeeServer with hostname as FeeServer name and logs output to RW share

Cables
DCS side:
• Uses non-standard connector without any locking
  – May easily fall out
  – Connectors are glued together, cable attached to cooling plate using cable ties
Switch side:
• Standard Ethernet connector
• Connectors not well made/attached, bad contact
  – Had to be re-crimped
  – Are still sensitive to twisting when plugged into switch/patch panel

Network problems during commissioning
• Some boards were unreachable via the network: 90% packet drop
• Switch indicated 100 Mb/s, not 10 as expected
• Most boards affected, but some always, some rarely
• However: a short power cycle seemed to help?
• It turned out there was a bug in the kernel driver: autonegotiation was not always enabled on boot
• Ethernet interface switched to 100 Mb/s operation
• The electronics between the Ethernet chip and the cable on the DCS board do not support this because of modifications due to the strong magnetic field
• Only a few packets got through
• After kernel update, problems gone

Temperature measurements
• All FECs have temperature sensors
  – If temperature too high, electronics may be damaged
  – The FeeServer will export temperatures to higher layers
  – High temperatures will cause electronics to be switched off
• During commissioning, temperature was written continuously to log files
  – A temperature cross section for each partition was plotted every 12 hours
  – No alarming temperatures were seen

Software
• Mostly OK
• InterComLayer/FeeServer interplay is working
• FeeServers sometimes “disappear” from DID, but not from ICL. It seems they are running, but not in a working state
• FeeServers sometimes do not publish services (registration timeout)
• FeeServers crash (and restart) when FECs are turned on and off via DDL
• The kernel update took care of most other problems (“impossible” to get all DCS boards running without “dirty tricks”)

Commissioning conclusion
• Network-based configuration worked as planned
• Some initial network problems, OK after kernel update
• No alarming electronics temperatures seen
• Some minor FeeServer issues
• Ethernet cables must be handled with care

Benchmarks during TPC commissioning
• Benchmark done with one patch and a complete slice of the TPC
• Benchmark test performed on TPC side 0 (a), slice 13 (singlecast on patch 0)
• Setup:
  – 6 FeeServers with TPC ControlEngine (CE)
  – Switch: NETGEAR 7300S Series Layer 3 Managed Switch
  – InterComLayer on P4 (3.4 GHz, dual core, 512 MB RAM, SLC 3)
  – FedClient implementation for testing purposes on a different machine

Setup during commissioning and benchmark tests
[Diagram]
• PVSS (incl. FedClient), switch
• tpcfee01 (ICL), tpcfee02 (Test-FedClient)
• Switch, 6 DCS boards (FeeServer incl. TPC CE)
• Link speeds shown in the diagram: 100 MBit/s, 10 MBit/s, 100 MBit/s

Components used during benchmark
[Diagram]
• Supervisory Layer: PVSS II (FED Client)
• Front-End Device Interface (FED)
• Control Layer: FED Server, InterComLayer, FEE Client; configuration data loaded from file (Config. File)
• Front-End Electronics Interface (FEE)
• Field Layer: FeeServer / CE (three shown)
• Channel: Cmd / ACK Channel

Benchmarks layout
• Issued command:
  – Switching on/off all Front-End Cards of the patch
  – Command size: 12 bytes (+ 12 bytes of FeePacket header = 24 bytes)
  – CE was emulating the execution of the “switch on/off FEC” command
• Sent as:
  – Singlecast, and
  – Broadcast for a complete slice
  – from Test-FedClient and from PVSS

Benchmark results during TPC commissioning

SingleCast ControlFero command:

Time period [sec]                         average    max       min
Command in FedServer – ACK in FeeClient   0.358162   1.092122  0.243506
SEND – ACK in FeeClient                   0.3574644  1.091613  0.243026
Process time in ICL                       0.000698   0.000999  0.00048
FeeServer computing                       0.1118     0.84      0.02

Annotations: command issued 100 times; no lost ACKs

BroadCast ControlFero command (FedServer – ACK in FeeClient):

[sec]    all       patch0    patch1    patch2    patch3    patch4    patch5
average  0.404874  0.267716  0.275715  0.303979  0.290279  0.313083  0.32129
max      1.012536  0.619624  0.847929  0.775591  1.011102  0.902006  0.848276
min      0.249206  0.235348  0.032372  0.236584  0.236367  0.064199  0.228168
count    96        84        92        91        95        92        90

Annotations: command issued 96 times; lost ACKs: 21 (no command had been issued to FeeServers that were already missing)

FeeServer/CE benchmark (receive command – send ACK):

                 patch0    patch1    patch2    patch3    patch4    patch5
average [sec]    0.028901  0.041023  0.031837  0.042708  0.028316  0.027245
max [sec]        0.22      1.11      0.61      0.62      0.66      0.44
min [sec]        0.02      0.02      0.02      0.02      0.02      0.02
seg faults       3         4         0         0         1         1
duplicated ACKs  6         15        4         8         10        4
counts           91        88        98        96        95        98

Annotations: command issued 100 times; duplicated ACKs may indicate temporarily lost links to ICL and/or DIM-DNS

Remarks on benchmark tests
• ACKs very delayed
  – A very few ACKs reached the FeeClient after the ACK of the following command had already been received
  – One ACK overtaking another is not possible in the FeeServer and DIM framework
  – Most likely a packet was temporarily stuck in the switch
• Duplicated ACKs
  – Most likely due to lost link to FeeServer, DIM-DNS
  – Should not disturb the system, filtered out by InterComLayer

Future Tests
• Extended tests with more slices: 2, 9, 18 (one side), 36 (whole TPC, both sides)
• Preparing a complete set of benchmark tests for when the TPC is available again in May 2007
• Tests with real commands, real configuration data and real execution in CE
• Benchmarks of the Service Channels (fast triggered update of temperature, etc.)
• (Usage of the CommandCoder during tests)
• Further investigation of delayed ACKs
• Verify that duplicated ACKs will not disturb the system
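
The delayed, lost and duplicated ACKs discussed in the remarks could be counted offline from timestamped logs. The following is only a minimal sketch under assumptions: the record layout (Ack), the function name summarize and the server names are invented for illustration and are not part of the FeeCom software, the InterComLayer or DIM. It keeps the first ACK per command and FeeServer, counts repeats as duplicates, counts an ACK as late when a newer command from the same server has already been acknowledged, and reports average, max and min latency in the same form as the benchmark tables.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Ack:
    """One received ACK (hypothetical record layout, not the FeeCom format)."""
    command_id: int   # sequence number of the acknowledged command
    server: str       # FeeServer name (made-up values in the demo below)
    received: float   # arrival time at the client, in seconds


def summarize(sent, acks):
    """Summarize ACK timing.

    sent maps (command_id, server) -> send time in seconds.
    acks is a list of Ack records in arrival order.
    """
    first = {}       # arrival time of the first ACK per (command_id, server)
    newest = {}      # highest command_id acknowledged so far, per server
    duplicates = 0   # repeated ACKs for an already acknowledged command
    late = 0         # ACKs arriving after the ACK of a newer command

    for ack in acks:
        key = (ack.command_id, ack.server)
        if key not in sent:
            continue  # ACK for a command that was never issued: ignore
        if key in first:
            duplicates += 1
            continue
        first[key] = ack.received
        seen = newest.get(ack.server, ack.command_id)
        if ack.command_id < seen:
            late += 1
        newest[ack.server] = max(seen, ack.command_id)

    latencies = [first[key] - sent[key] for key in first]
    return {
        "count": len(latencies),
        "lost": len(sent) - len(latencies),
        "duplicates": duplicates,
        "late": late,
        "average": mean(latencies) if latencies else None,
        "max": max(latencies, default=None),
        "min": min(latencies, default=None),
    }


if __name__ == "__main__":
    # Made-up timestamps for one server; command 3 is never acknowledged.
    sent = {(1, "dcs1.feenet"): 0.0, (2, "dcs1.feenet"): 1.0, (3, "dcs1.feenet"): 2.0}
    acks = [
        Ack(2, "dcs1.feenet", 1.25),
        Ack(1, "dcs1.feenet", 1.5),    # arrives after the ACK of command 2
        Ack(2, "dcs1.feenet", 1.75),   # duplicate of an earlier ACK
    ]
    print(summarize(sent, acks))
```

Dropping duplicates before computing latencies mirrors the filtering that the remarks attribute to the InterComLayer; whether the real filter works per command and per FeeServer in this way is an assumption of the sketch.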