Advanced Condor Mechanisms
CERN, Feb 14, 2011
Condor Project, Computer Sciences Department, University of Wisconsin-Madison
www.condorproject.org

A better title: "Condor Potpourri"
› Igor's feedback: "Could be useful to people, but not Monday"
› If this is not of interest, a new topic starts in one minute

Central Manager Failover
› The Condor Central Manager runs two services:
  • condor_collector: a list of collectors is now supported
  • condor_negotiator (the matchmaker): if it fails, an election process lets another negotiator take over
› Contributed technology from the Technion

Submit Node Robustness: job progress continues if the connection is interrupted
› Condor supports re-establishment of the connection between the submitting and executing machines:
  • if there is a network outage between the execute and submit machine
  • if the submit machine restarts
› To take advantage of this feature, put the following line into the job's submit description file:
  job_lease_duration = <N seconds>
  For example: job_lease_duration = 1200

Submit Node Robustness: job progress continues if the submit machine fails
› Automatic schedd failover: Condor can support a submit machine "hot spare"
› If your submit machine A is down for longer than N minutes, a second machine B can take over
› Requires a shared filesystem (or just DRBD*?) between machines A and B
  *Distributed Replicated Block Device (www.drbd.org)

DRBD
[diagram slide]

Interactive Debugging
› Why is my job still running?
  • Is it stuck accessing a file?
  • Is it in an infinite loop?
› condor_ssh_to_job
  • Interactive debugging in UNIX
  • Use ps, top, gdb, strace, lsof, …
  • Forward ports, forward X, transfer files, etc.

condor_ssh_to_job Example
% condor_q
-- Submitter: perdita.cs.wisc.edu : <128.105.165.34:1027>
 ID      OWNER      SUBMITTED     RUN_TIME ST PRI SIZE  CMD
 1.0     einstein   4/15 06:52  1+12:10:05 R  0   10.0  cosmos
1 jobs; 0 idle, 1 running, 0 held
% condor_ssh_to_job 1.0
Welcome to [email protected]!
Your condor job is running with pid(s) 15603.
$ gdb -p 15603
…

How It Works
› ssh keys are created for each invocation
› ssh uses OpenSSH's ProxyCommand to reuse the connection created by condor_ssh_to_job
› sshd runs as the same user id as the job and receives the connection in inetd mode
  • So nothing new is listening on the network
  • Works with CCB and shared_port

What?? Ssh to my worker nodes??
› Why would any sysadmin allow this?
› Because the process tree is managed:
  • Cleanup at end of job
  • Cleanup at logout
› Can be disabled by nonbelievers

Concurrency Limits
› Limit job execution based on admin-defined consumable resources, e.g. licenses
› Can have many different limits
› Jobs say what resources they need
› The negotiator enforces the limits pool-wide

Concurrency Example
› Negotiator config file:
  MATLAB_LIMIT = 5
  NFS_LIMIT = 20
› Job submit file:
  concurrency_limits = matlab,nfs:3
  This requests 1 Matlab token and 3 NFS tokens

Green Computing
› The startd has the ability to place a machine into a low-power state (Standby, Hibernate, Soft-Off, etc.)
  • Controlled by HIBERNATE and HIBERNATE_CHECK_INTERVAL
  • If all slots return a non-NONE hibernation level, the machine can be powered down via the condor_power hook
  • A final acknowledged ClassAd containing wake-up information is sent to the collector
› Machine ads enter the "Offline" state
  • Stored persistently to disk
  • Ad is updated with "demand" information: if this machine were around, would it be matched?

Now what?
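The green-computing knobs named above can be sketched in configuration. This is a minimal sketch, not a recommendation: the one-hour idle threshold and the choice of the "S3" (suspend-to-RAM) level are illustrative assumptions, not values from the talk.

```
# Hedged sketch: hibernate an execute machine that has sat idle too long.
# Evaluated per slot; the machine powers down only when every slot agrees.
# "S3" is suspend-to-RAM; "NONE" means stay up. Threshold is illustrative.
HIBERNATE = ifThenElse( (State == "Unclaimed") && (Activity == "Idle") && \
                        (CurrentTime - EnteredCurrentActivity > 3600), \
                        "S3", "NONE" )

# How often (in seconds) the startd re-evaluates the HIBERNATE expression.
HIBERNATE_CHECK_INTERVAL = 300
```

The expression uses only standard machine-ad attributes (State, Activity, EnteredCurrentActivity), so it can be tightened or loosened per pool without touching the startd itself.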
condor_rooster
› Periodically wakes machines based on a ClassAd expression (ROOSTER_UNHIBERNATE)
› Throttling controls
› Hook callouts make for interesting possibilities…

Dynamic Slot Partitioning
› Divide slots into chunks sized for matched jobs
› Re-advertise the remaining resources
› Partitionable resources are cpus, memory, and disk
› See Matt Farrellee's talk

Dynamic Partitioning Caveats
› Cannot preempt the original slot or a group of sub-slots
  • Potential starvation of jobs with large resource requirements
› Partitioning happens once per slot each negotiation cycle
  • Scheduling of large slots may be slow

High Throughput Parallel Computing
› Parallel jobs that run on a single machine
  • Today 8-16 cores, tomorrow 32+ cores
› Use whatever parallel software you want
  • It ships with the job: MPI, OpenMP, your own scripts
  • Optimize for on-board memory access

Configuring Condor for HTPC
› Two strategies:
  • Suspend/drain jobs to open HTPC slots
  • Hold empty cores until an HTPC slot is open
› We have a recipe for the former on the Condor Wiki: http://condor-wiki.cs.wisc.edu
› User accounting is enabled by Condor's notion of "slot weights"

CPU Affinity
[diagram: a four-core machine running four jobs without affinity; jobs j1-j4 and the threads j3a-j3d of job j3 float freely across core1-core4]

CPU Affinity to the Rescue
  SLOT1_CPU_AFFINITY = 0
  SLOT2_CPU_AFFINITY = 1
  SLOT3_CPU_AFFINITY = 2
  SLOT4_CPU_AFFINITY = 3
[diagram: the same four-core machine running four jobs with affinity; j1, j2, and j4 are pinned to their own cores, while j3 and its threads j3a-j3d share core3]

Condor + Hadoop FS (HDFS)
› Condor + HDFS = 2 + 2 = 5 !!!
› A synergy exists (next slide):
  • Hadoop as the distributed storage system
  • Condor as the cluster management system
› A large number of distributed disks in a compute cluster
› Managing disk as a resource

condor_hdfs Daemon
› The main integration point of HDFS within Condor
› Configures the HDFS cluster based on existing condor_config files
› Runs under the condor_master and can be controlled by existing Condor utilities
› Publishes interesting parameters to the collector, e.g. IP address, node type, disk activity
› Currently deployed at UW-Madison

Condor + HDFS: Next Steps?
› Integrate with the file transfer mechanism
› NameNode failover
› Management of HDFS
› What about HDFS in a glidein environment??
› Online transparent access to HDFS??

Remote I/O Socket
› A job can request that the condor_starter process on the execute machine create a Remote I/O Socket
  • Used for online access of files on the submit machine, without the Standard Universe
  • Usable in the Vanilla, Java, … universes
› Libraries are provided for Java and for C, e.g.:
  • Java: FileInputStream -> ChirpInputStream
  • C: open() -> chirp_open()
  • Or use Parrot!

Secure Remote I/O
[diagram: on the submission site, the shadow runs an I/O server over the home file system; on the execution site, the starter forks the job, whose I/O library routes local system calls through an I/O proxy that speaks the Chirp local I/O protocol back to the server]

DMTCP
› Written at Northeastern U. and MIT
› A user-level process checkpoint/restart library
› Fewer restrictions than Condor's Standard Universe:
  • Handles threads and multiple processes
  • No re-link of the executable
› DMTCP and Condor Vanilla Universe integration exists via a job wrapper script

Questions? Thank You!
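To make the DMTCP-via-wrapper idea on the last technical slide concrete, a vanilla-universe submit description might look like the sketch below. This is not the project's actual recipe: the wrapper name dmtcp_wrapper.sh and the job binary cosmos are hypothetical, and the wrapper is assumed to invoke DMTCP's own command-line tools (dmtcp_checkpoint on a fresh start, the generated dmtcp_restart_script.sh when a checkpoint image is present).

```
# Hedged sketch of a submit description file for a DMTCP-wrapped job.
# "dmtcp_wrapper.sh" and "cosmos" are hypothetical names for illustration.
universe                = vanilla
executable              = dmtcp_wrapper.sh
arguments               = ./cosmos
transfer_input_files    = cosmos
should_transfer_files   = YES
# Bring checkpoint images back on eviction so the next match can resume:
when_to_transfer_output = ON_EXIT_OR_EVICT
queue
```

The key design point is that Condor sees only an ordinary vanilla job; all checkpoint/restart logic lives inside the wrapper, which is why no re-link of the executable is needed.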