Advanced Condor Mechanisms
CERN, Feb 14 2011
Condor Project
Computer Sciences Department
University of Wisconsin-Madison
a better title…
“Condor Potpourri”
› Igor’s feedback: “Could be useful to people, but not Monday”
› If not of interest, a new topic starts in 1 minute 
www.condorproject.org
2
Central Manager Failover
› The Condor Central Manager has two services:
 condor_collector: a list of collectors is now supported
 condor_negotiator (the matchmaker): if it fails, an election process lets another negotiator take over
Contributed technology from Technion
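As a sketch, this Technion-contributed high availability is driven by the condor_had and condor_replication daemons; the hostnames and port numbers below are hypothetical, and the exact knob set should be checked against the High Availability section of the Condor manual:

```
# On both central manager machines (cm1/cm2 are made-up names):
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, HAD, REPLICATION
HAD_LIST = cm1.example.org:51450, cm2.example.org:51450
REPLICATION_LIST = cm1.example.org:41450, cm2.example.org:41450
HAD_USE_PRIMARY = TRUE      # prefer cm1 when both are healthy

# On every machine in the pool: a list of collectors is supported
CONDOR_HOST = cm1.example.org, cm2.example.org
```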
Submit node robustness:
Job progress continues if the connection is interrupted
› Condor supports reestablishing the connection between the submitting and executing machines:
 If there is a network outage between the execute and submit machines
 If the submit machine restarts
› To take advantage of this feature, put the following line into the job’s submit description file:
job_lease_duration = <N seconds>
For example:
job_lease_duration = 1200
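In context, a minimal vanilla-universe submit file using this knob might look like the following sketch (the executable and file names are hypothetical):

```
universe   = vanilla
executable = cosmos
log        = cosmos.log
# If the submit machine restarts or the network drops, the job keeps
# running and both sides try to reconnect for up to 20 minutes
job_lease_duration = 1200
queue
```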
Submit node robustness:
Job progress continues if the submit machine fails
Automatic Schedd Failover
› Condor can support a submit-machine “hot spare”
› If your submit machine A is down for longer than N minutes, a second machine B can take over
› Requires a shared filesystem (or perhaps just DRBD*?) between machines A and B
*Distributed Replicated Block Device – www.drbd.org
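A sketch of the hot-spare setup, assuming A and B share SPOOL over the shared filesystem (or the DRBD device); the path is hypothetical, and the full recipe is in the HA schedd section of the Condor manual:

```
# On both submit machines A and B:
MASTER_HA_LIST = SCHEDD          # master manages the schedd as an HA service
SPOOL = /shared/condor/spool     # must live on the shared storage
HA_LOCK_URL = file:$(SPOOL)      # a lock file arbitrates which schedd runs
HA_POLL_PERIOD = 300             # B takes over once A's lock goes stale
```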
DRBD
Interactive Debugging
› Why is my job still running?
 Is it stuck accessing a file?
 Is it in an infinite loop?
› condor_ssh_to_job
 Interactive debugging on UNIX
 Use ps, top, gdb, strace, lsof, …
 Forward ports, X, transfer files, etc.
condor_ssh_to_job Example
% condor_q
-- Submitter: perdita.cs.wisc.edu : <128.105.165.34:1027> :
 ID      OWNER      SUBMITTED     RUN_TIME ST PRI SIZE  CMD
 1.0     einstein   4/15 06:52  1+12:10:05 R  0   10.0  cosmos
1 jobs; 0 idle, 1 running, 0 held
% condor_ssh_to_job 1.0
Welcome to [email protected]!
Your condor job is running with pid(s) 15603.
$ gdb -p 15603
…
How it works
› ssh keys are created for each invocation
› ssh
 Uses OpenSSH ProxyCommand to reuse the connection created by condor_ssh_to_job
› sshd
 Runs as the same user id as the job
 Receives the connection in inetd mode
• So nothing new is listening on the network
• Works with CCB and shared_port
What?? SSH to my worker nodes??
› Why would any sysadmin allow this?
› Because the process tree is managed:
 Cleanup at end of job
 Cleanup at logout
› Can be disabled by nonbelievers
Concurrency Limits
› Limit job execution based on admin-defined consumable resources
 E.g. licenses
› Can have many different limits
› Jobs say what resources they need
› The negotiator enforces limits pool-wide
Concurrency Example
› Negotiator config file:
MATLAB_LIMIT = 5
NFS_LIMIT = 20
› Job submit file:
concurrency_limits = matlab,nfs:3
 This requests 1 Matlab token and 3 NFS tokens
Green Computing
› The startd has the ability to place a machine into a low-power state (Standby, Hibernate, Soft-Off, etc.)
 Governed by HIBERNATE and HIBERNATE_CHECK_INTERVAL
 If all slots return non-zero, the machine can be powered down via the condor_power hook
 A final acknowledged ClassAd containing wake-up information is sent to the collector
› Machine ads enter the “Offline” state
 Stored persistently to disk
 Ad updated with “demand” information: if this machine were around, would it be matched?
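For illustration, a hibernation policy might look like this sketch; the idle-time expression is an assumption, with "RAM" standing for a suspend-to-RAM power level and "NONE" vetoing power-down:

```
HIBERNATE_CHECK_INTERVAL = 300   # evaluate the policy every 5 minutes
# A slot votes to suspend-to-RAM once it has been Unclaimed/Idle for
# an hour; returning "NONE" (zero) keeps the machine powered up
HIBERNATE = ifThenElse( (State == "Unclaimed") && (Activity == "Idle") \
                        && (CurrentTime - EnteredCurrentActivity > 3600), \
                        "RAM", "NONE" )
```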
Now what?
condor_rooster
› Periodically wakes machines based on a ClassAd expression (ROOSTER_UNHIBERNATE)
› Throttling controls
› Hook callouts make for interesting possibilities…
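A configuration sketch for the central manager; the knob names follow the ROOSTER_* pattern, but the throttle values here are arbitrary:

```
# Run condor_rooster alongside the collector/negotiator
DAEMON_LIST = $(DAEMON_LIST) ROOSTER
ROOSTER_INTERVAL = 300            # scan offline ads every 5 minutes
# Wake a machine whose offline ad says it would match waiting jobs
ROOSTER_UNHIBERNATE = Offline && Unhibernate
ROOSTER_MAX_UNHIBERNATE = 4       # throttle: at most 4 wake-ups per scan
```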
Dynamic Slot Partitioning
› Divide slots into chunks sized for matched jobs
› Re-advertise the remaining resources
› Partitionable resources are cpus, memory, and disk
› See Matt Farrellee’s talk
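As a sketch, a single partitionable slot on the execute side plus a submit file that requests a chunk of it; the resource amounts are arbitrary:

```
# startd config: one partitionable slot owning the whole machine
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=100%, memory=100%, disk=100%
SLOT_TYPE_1_PARTITIONABLE = TRUE

# submit file: the matched sub-slot is sized from these requests
request_cpus   = 1
request_memory = 1024
request_disk   = 100000
```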
Dynamic Partitioning
Caveats
› Cannot preempt the original slot or a group of sub-slots
 Potential starvation of jobs with large resource requirements
› Partitioning happens once per slot each negotiation cycle
 Scheduling of large slots may be slow
High Throughput Parallel Computing
› Parallel jobs that run on a single machine
 Today 8-16 cores, tomorrow 32+ cores
› Use whatever parallel software you want
 It ships with the job
 MPI, OpenMP, your own scripts
 Optimize for on-board memory access
Configuring Condor for
HTPC
› Two strategies:
 Suspend/drain jobs to open HTPC slots
 Hold empty cores until an HTPC slot is open
› We have a recipe for the former on the Condor Wiki: http://condor-wiki.cs.wisc.edu
› User accounting is enabled by Condor’s notion of “Slot Weights”
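Slot weights are what make an 8-core HTPC job cost eight times a single-core job in fair-share accounting; a one-line sketch of the execute-side config:

```
# charge users by cores consumed rather than one unit per slot
SLOT_WEIGHT = Cpus
```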
CPU Affinity
Four-core machine running four jobs w/o affinity
[Figure: jobs j1-j4, plus j3’s threads j3a-j3d, float freely across core1-core4]
CPU Affinity to the rescue
SLOT1_CPU_AFFINITY = 0
SLOT2_CPU_AFFINITY = 1
SLOT3_CPU_AFFINITY = 2
SLOT4_CPU_AFFINITY = 3
Four-core machine running four jobs w/ affinity
[Figure: each job is pinned to its own core: j1 on core1, j2 on core2, j3 with its threads j3a-j3d on core3, j4 on core4]
Condor + Hadoop FS (HDFS)
Condor + HDFS = 2 + 2 = 5 !!!
A synergy exists (next slide):
• Hadoop as a distributed storage system
• Condor as a cluster management system
 A large number of distributed disks in a compute cluster
 Managing disk as a resource
condor_hdfs daemon
› Main integration point of HDFS within Condor
› Configures the HDFS cluster based on existing condor_config files
› Runs under condor_master and can be controlled by existing Condor utilities
› Publishes interesting parameters to the Collector, e.g. IP address, node type, disk activity
› Currently deployed at UW-Madison
Condor + HDFS: Next Steps?
› Integrate with the File Transfer Mechanism
› NameNode failover
› Management of HDFS
› What about HDFS in a GlideIn environment??
› Online transparent access to HDFS??
Remote I/O Socket
› A job can request that the condor_starter process on the execute machine create a Remote I/O Socket
› Used for online access of files on the submit machine, without the Standard Universe
 Use in Vanilla, Java, …
› Libraries are provided for Java and for C, e.g.:
 Java: FileInputStream -> ChirpInputStream
 C: open() -> chirp_open()
 Or use Parrot!
Secure Remote I/O
[Diagram: at the Submission Site, the shadow runs an I/O server over the home file system; at the Execution Site, the starter forks the job, whose I/O library routes local system calls as local I/O (Chirp) through an I/O proxy back to the submit machine]
DMTCP
› Written at Northeastern U. and MIT
› User-level process checkpoint/restart library
› Fewer restrictions than Condor’s Standard Universe:
 Handles threads and multiple processes
 No re-linking of the executable
› DMTCP and Condor Vanilla Universe integration exists via a job wrapper script
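A sketch of a vanilla-universe submit file delegating to such a wrapper; the wrapper and program names are hypothetical stand-ins for the script distributed with the integration:

```
universe   = vanilla
# dmtcp_wrapper.sh (hypothetical name) starts the real program under
# DMTCP checkpoint control and resumes from the newest checkpoint image
executable = dmtcp_wrapper.sh
arguments  = ./my_app
transfer_input_files = my_app
when_to_transfer_output = ON_EXIT_OR_EVICT
queue
```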
Questions?
Thank You!