Advanced Condor Mechanisms
CERN, Feb 14 2011
Condor Project
Computer Sciences Department
University of Wisconsin-Madison
a better title…
“Condor Potpourri”
› Igor’s feedback: “Could be useful to people, but not Monday”
› If not of interest, a new topic starts in 1 minute 
www.condorproject.org
2
Central Manager Failover
› The Condor Central Manager has two services:
 condor_collector: a list of collectors is now supported
 condor_negotiator (the matchmaker): if it fails, an election process lets another negotiator take over
Contributed technology from Technion
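As a sketch, this Technion-contributed high availability is driven by the condor_had and condor_replication daemons; the hostnames and port numbers below are hypothetical, and the exact knob set should be checked against the High Availability section of the Condor manual:

```
# On both central manager machines (cm1/cm2 are made-up names):
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, HAD, REPLICATION
HAD_LIST = cm1.example.org:51450, cm2.example.org:51450
REPLICATION_LIST = cm1.example.org:41450, cm2.example.org:41450
HAD_USE_PRIMARY = TRUE      # prefer cm1 when both are healthy

# On every machine in the pool: a list of collectors is supported
CONDOR_HOST = cm1.example.org, cm2.example.org
```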
Submit node robustness:
Job progress continues if the connection is interrupted
› Condor supports reestablishing the connection between the submitting and executing machines:
 If there is a network outage between the execute and submit machines
 If the submit machine restarts
› To take advantage of this feature, put the following line into the job’s submit description file:
job_lease_duration = <N seconds>
For example:
job_lease_duration = 1200
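In context, a minimal vanilla-universe submit file using this knob might look like the following sketch (the executable and file names are hypothetical):

```
universe   = vanilla
executable = cosmos
log        = cosmos.log
# If the submit machine restarts or the network drops, the job keeps
# running and both sides try to reconnect for up to 20 minutes
job_lease_duration = 1200
queue
```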
Submit node robustness:
Job progress continues if the submit machine fails
Automatic Schedd Failover
› Condor can support a submit-machine “hot spare”
› If your submit machine A is down for longer than N minutes, a second machine B can take over
› Requires a shared filesystem (or perhaps just DRBD*?) between machines A and B
*Distributed Replicated Block Device – www.drbd.org
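A sketch of the hot-spare setup, assuming A and B share SPOOL over the shared filesystem (or the DRBD device); the path is hypothetical, and the full recipe is in the HA schedd section of the Condor manual:

```
# On both submit machines A and B:
MASTER_HA_LIST = SCHEDD          # master manages the schedd as an HA service
SPOOL = /shared/condor/spool     # must live on the shared storage
HA_LOCK_URL = file:$(SPOOL)      # a lock file arbitrates which schedd runs
HA_POLL_PERIOD = 300             # B takes over once A's lock goes stale
```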
DRBD
Interactive Debugging
› Why is my job still running?
 Is it stuck accessing a file?
 Is it in an infinite loop?
› condor_ssh_to_job
 Interactive debugging on UNIX
 Use ps, top, gdb, strace, lsof, …
 Forward ports, X, transfer files, etc.
condor_ssh_to_job Example
% condor_q
-- Submitter: perdita.cs.wisc.edu : <128.105.165.34:1027> :
 ID      OWNER      SUBMITTED     RUN_TIME ST PRI SIZE  CMD
 1.0     einstein   4/15 06:52  1+12:10:05 R  0   10.0  cosmos
1 jobs; 0 idle, 1 running, 0 held
% condor_ssh_to_job 1.0
Welcome to [email protected]!
Your condor job is running with pid(s) 15603.
$ gdb -p 15603
…
How it works
› ssh keys are created for each invocation
› ssh
 Uses OpenSSH ProxyCommand to reuse the connection created by condor_ssh_to_job
› sshd
 Runs as the same user id as the job
 Receives the connection in inetd mode
• So nothing new is listening on the network
• Works with CCB and shared_port
What?? SSH to my worker nodes??
› Why would any sysadmin allow this?
› Because the process tree is managed:
 Cleanup at end of job
 Cleanup at logout
› Can be disabled by nonbelievers
Concurrency Limits
› Limit job execution based on admin-defined consumable resources
 E.g. licenses
› Can have many different limits
› Jobs say what resources they need
› The negotiator enforces limits pool-wide
Concurrency Example
› Negotiator config file:
MATLAB_LIMIT = 5
NFS_LIMIT = 20
› Job submit file:
concurrency_limits = matlab,nfs:3
 This requests 1 Matlab token and 3 NFS tokens
Green Computing
› The startd has the ability to place a machine into a low-power state (Standby, Hibernate, Soft-Off, etc.)
 Governed by HIBERNATE and HIBERNATE_CHECK_INTERVAL
 If all slots return non-zero, the machine can be powered down via the condor_power hook
 A final acknowledged ClassAd containing wake-up information is sent to the collector
› Machine ads enter the “Offline” state
 Stored persistently to disk
 Ad updated with “demand” information: if this machine were around, would it be matched?
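For illustration, a hibernation policy might look like this sketch; the idle-time expression is an assumption, with "RAM" standing for a suspend-to-RAM power level and "NONE" vetoing power-down:

```
HIBERNATE_CHECK_INTERVAL = 300   # evaluate the policy every 5 minutes
# A slot votes to suspend-to-RAM once it has been Unclaimed/Idle for
# an hour; returning "NONE" (zero) keeps the machine powered up
HIBERNATE = ifThenElse( (State == "Unclaimed") && (Activity == "Idle") \
                        && (CurrentTime - EnteredCurrentActivity > 3600), \
                        "RAM", "NONE" )
```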
Now what?
condor_rooster
› Periodically wakes machines based on a ClassAd expression (ROOSTER_UNHIBERNATE)
› Throttling controls
› Hook callouts make for interesting possibilities…
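A configuration sketch for the central manager; the knob names follow the ROOSTER_* pattern, but the throttle values here are arbitrary:

```
# Run condor_rooster alongside the collector/negotiator
DAEMON_LIST = $(DAEMON_LIST) ROOSTER
ROOSTER_INTERVAL = 300            # scan offline ads every 5 minutes
# Wake a machine whose offline ad says it would match waiting jobs
ROOSTER_UNHIBERNATE = Offline && Unhibernate
ROOSTER_MAX_UNHIBERNATE = 4       # throttle: at most 4 wake-ups per scan
```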
Dynamic Slot Partitioning
› Divide slots into chunks sized for matched jobs
› Re-advertise the remaining resources
› Partitionable resources are cpus, memory, and disk
› See Matt Farrellee’s talk
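As a sketch, a single partitionable slot on the execute side plus a submit file that requests a chunk of it; the resource amounts are arbitrary:

```
# startd config: one partitionable slot owning the whole machine
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=100%, memory=100%, disk=100%
SLOT_TYPE_1_PARTITIONABLE = TRUE

# submit file: the matched sub-slot is sized from these requests
request_cpus   = 1
request_memory = 1024
request_disk   = 100000
```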
Dynamic Partitioning
Caveats
› Cannot preempt the original slot or a group of sub-slots
 Potential starvation of jobs with large resource requirements
› Partitioning happens once per slot each negotiation cycle
 Scheduling of large slots may be slow
High Throughput Parallel Computing
› Parallel jobs that run on a single machine
 Today 8-16 cores, tomorrow 32+ cores
› Use whatever parallel software you want
 It ships with the job
 MPI, OpenMP, your own scripts
 Optimize for on-board memory access
Configuring Condor for
HTPC
› Two strategies:
 Suspend/drain jobs to open HTPC slots
 Hold empty cores until an HTPC slot is open
› We have a recipe for the former on the Condor Wiki: http://condor-wiki.cs.wisc.edu
› User accounting is enabled by Condor’s notion of “Slot Weights”
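Slot weights are what make an 8-core HTPC job cost eight times a single-core job in fair-share accounting; a one-line sketch of the execute-side config:

```
# charge users by cores consumed rather than one unit per slot
SLOT_WEIGHT = Cpus
```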
CPU Affinity
Four-core machine running four jobs w/o affinity
[Figure: jobs j1-j4, plus j3’s threads j3a-j3d, float freely across core1-core4]
CPU Affinity to the rescue
SLOT1_CPU_AFFINITY = 0
SLOT2_CPU_AFFINITY = 1
SLOT3_CPU_AFFINITY = 2
SLOT4_CPU_AFFINITY = 3
Four-core machine running four jobs w/ affinity
[Figure: each job is pinned to its own core: j1 on core1, j2 on core2, j3 with its threads j3a-j3d on core3, j4 on core4]
Condor + Hadoop FS (HDFS)
Condor + HDFS = 2 + 2 = 5 !!!
A synergy exists (next slide):
• Hadoop as a distributed storage system
• Condor as a cluster management system
 A large number of distributed disks in a compute cluster
 Managing disk as a resource
condor_hdfs daemon
› Main integration point of HDFS within Condor
› Configures the HDFS cluster based on existing condor_config files
› Runs under condor_master and can be controlled by existing Condor utilities
› Publishes interesting parameters to the Collector, e.g. IP address, node type, disk activity
› Currently deployed at UW-Madison
Condor + HDFS: Next Steps?
› Integrate with the File Transfer Mechanism
› NameNode failover
› Management of HDFS
› What about HDFS in a GlideIn environment??
› Online transparent access to HDFS??
Remote I/O Socket
› A job can request that the condor_starter process on the execute machine create a Remote I/O Socket
› Used for online access of files on the submit machine, without the Standard Universe
 Use in Vanilla, Java, …
› Libraries are provided for Java and for C, e.g.:
 Java: FileInputStream -> ChirpInputStream
 C: open() -> chirp_open()
 Or use Parrot!
Secure Remote I/O
[Diagram: at the Submission Site, the shadow runs an I/O server over the home file system; at the Execution Site, the starter forks the job, whose I/O library routes local system calls as local I/O (Chirp) through an I/O proxy back to the submit machine]
DMTCP
› Written at Northeastern U. and MIT
› User-level process checkpoint/restart library
› Fewer restrictions than Condor’s Standard Universe:
 Handles threads and multiple processes
 No re-linking of the executable
› DMTCP and Condor Vanilla Universe integration exists via a job wrapper script
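A sketch of a vanilla-universe submit file delegating to such a wrapper; the wrapper and program names are hypothetical stand-ins for the script distributed with the integration:

```
universe   = vanilla
# dmtcp_wrapper.sh (hypothetical name) starts the real program under
# DMTCP checkpoint control and resumes from the newest checkpoint image
executable = dmtcp_wrapper.sh
arguments  = ./my_app
transfer_input_files = my_app
when_to_transfer_output = ON_EXIT_OR_EVICT
queue
```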
Questions?
Thank You!