Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CS510 Advanced OS Seminar Class 10 A Methodology for Implementing Highly Concurrent Data Objects by Maurice Herlihy Motivation A concurrent object is a data structure shared by concurrent processes o o o It is blocking if delays in one process can cause delays in others (all processes might fail to complete) Non-blocking means some process will complete in finite number of steps Wait-free means it is free from starvation, ie. All processes will complete in a finite number of steps General methodology for constructing non-blocking and waitfree concurrent objects o o Automatic transformation from sequential implementation Using universal primitives such as LL/SC or CAS • CAS algorithms are more complex and less efficient than LL/SC ones CS533 - Concepts of Operating Systems 2 Overview Methodology o o o Start with sequential objects and operations Apply synchronization and memory management algorithms to transform sequential objects and operations into concurrent objects and operations Simple enough to be applied by a compiler CS533 - Concepts of Operating Systems 3 Small objects Slightly different approach for small and large objects Small objects o o can be copied efficiently object occupies a fixed size contiguous region of memory called a “block” Restrictions o Sequential operations must be “total” • ie. well-defined for all states of the object CS533 - Concepts of Operating Systems 4 Small object transformation The basic idea o o o o Load-linked pointer to current version Copy version to new (local) block of memory Apply the sequential operation to the copy Store-conditional pointer to current version to point to new version • o retry (to load-linked) on failure Reclaim memory of old version CS533 - Concepts of Operating Systems 5 Problem! What if you are slow and another process reclaims and reuses the memory while you are making your copy? o o o o You may generate an inconsistent copy During your modifications you may later dereference a corrupted pointer You may get a divide by zero error … All before you reach the store-conditional that would let you know to fail and retry CS533 - Concepts of Operating Systems 6 Solution Check the consistency of the copy before you use it But how can you know? o Use two checks • • • Updaters increment check 0 before updating and increment check 1 after updating – The two checks have the same value unless an update is in progress Copiers read check 1 before copying and check 0 after copying – If they are the same then the copy is consistent – If not retry the copy This is a classic application for memory barriers CS533 - Concepts of Operating Systems 7 Concurrent priority queue Load linked queue pointer CS533 - Concepts of Operating Systems 8 Concurrent priority queue Read first part of consistency check CS533 - Concepts of Operating Systems 9 Concurrent priority queue Copy object CS533 - Concepts of Operating Systems 10 Concurrent priority queue Read second part of consistency check CS533 - Concepts of Operating Systems 11 Concurrent priority queue If object is consistent CS533 - Concepts of Operating Systems 12 Concurrent priority queue Perform sequential operation on copy CS533 - Concepts of Operating Systems 13 Concurrent priority queue Commit changes by overwriting queue pointer with pointer to new version. Exit loop on success … retry on failure CS533 - Concepts of Operating Systems 14 Concurrent priority queue Take ownership of old queue’s memory to replace the memory given away for new version. CS533 - Concepts of Operating Systems 15 Performance evaluation How expensive is the copying overhead? What kind of contention characteristics does this approach have? CS533 - Concepts of Operating Systems 16 Benchmark: million enqueue/dequeue pairs CS533 - Concepts of Operating Systems 17 Performance results CS533 - Concepts of Operating Systems 18 Contention and fairness CS533 - Concepts of Operating Systems 19 Problem How to reduce contention? Introduce delay via exponential back-off o Like in spin-locks CS533 - Concepts of Operating Systems 20 Concurrent priority queue with backoff Set wait time Busy wait CS533 - Concepts of Operating Systems 21 Performance impact back-off CS533 - Concepts of Operating Systems 22 More problems Performance o Requires backoff to reduce contention Fairness o Non-blocking algorithm is open to starvation • • o Especially of enqueue’s which are slower (longer) and frequently have to retry If a short fast thing runs concurrently with a long slow thing, the short fast thing nearly always wins and the long slow thing nearly always has to retry – It may never complete! How can we make the approach wait-free? CS533 - Concepts of Operating Systems 23 Wait-free algorithm Processes register their attempted invocations in an n-element “announce” array o o o Processes register their results in an n-element “responses” array o o one element per process fields are <operation-name> <arguments> <toggle> toggle field is complemented on each invocation one element per process fields are <result> <value> <toggle> Processes help each other complete invocations by running the “apply” function before performing own operation CS533 - Concepts of Operating Systems 24 Apply function – basic idea Before performing an operation o o look to see if any other processes have operations in progress if so, try to perform those operations for them and place the result somewhere they can pick it up Guarantees that whoever finishes first will succeed in performing every process’ operations o o no processes operations can be starved sounds like lots of wasted work and lots of contention! CS533 - Concepts of Operating Systems 25 Apply function For all processes CS533 - Concepts of Operating Systems 26 Apply function Are any other operations active concurrently? CS533 - Concepts of Operating Systems 27 Apply function If so, complete the operation on behalf of the other process (in case its slower than you and would be forced to retry by the operation you are about to do) CS533 - Concepts of Operating Systems 28 Apply function Mark operation as having completed CS533 - Concepts of Operating Systems 29 Wait-free concurrent pqueue CS533 - Concepts of Operating Systems 30 Wait-free concurrent pqueue Announce your attempted operation CS533 - Concepts of Operating Systems 31 Wait-free concurrent pqueue Distinguish it from the last one, which has a result already registered (… is one bit enough??) CS533 - Concepts of Operating Systems 32 Wait-free concurrent pqueue Check (twice!) that someone else has not completed the operation for you?? (… hmm, why does it help to read this twice?) CS533 - Concepts of Operating Systems 33 Wait-free concurrent pqueue Apply pending operations ?? … is there code missing here??? Should include apply(announce, new_pqueue); CS533 - Concepts of Operating Systems 34 Issues with the wait-free approach Aside from the bugs in this tech report version of the paper …. Avoids starvation by having all processes attempt to complete any concurrent activity of all other processes o o One will succeed and commit all results All others will fail to commit their version, but should notice that their operation was completed by someone else • • How do they pick up the result? How is concurrency managed for the announce and result data structures? CS533 - Concepts of Operating Systems 35 Large objects Copying the entire object is too costly Programmers must construct logical versions o o o Memory is shared among versions New memory is allocated for unique parts of new versions Memory of old versions must be freed CS533 - Concepts of Operating Systems 36 Memory management for large objects Per-process pool of memory o 3 states: committed, allocated and freed Operations: o o o o o set_alloc moves block from committed (freed?) to allocated and returns address set_free moves block to freed set_prepare marks blocks in allocated as consistent set_commit sets committed to union of freed and committed set_abort sets freed and allocated to the empty set CS533 - Concepts of Operating Systems 37 Memory management for large objects May require a global memory pool Problem: o How to prevent this from becoming a synchronization bottleneck? CS533 - Concepts of Operating Systems 38 ACM version – corrected simply algorithm CS533 - Concepts of Operating Systems 39 ACM version – corrected delay algorithm CS533 - Concepts of Operating Systems 40 ACM version – corrected apply function CS533 - Concepts of Operating Systems 41 ACM version – corrected wait-free algorithm CS533 - Concepts of Operating Systems 42