An Adaptive, Region-based Allocator for Java
Feng Qian, Laurie Hendren
{fqian, hendren}@cs.mcgill.ca
Sable Research Group
School of Computer Science
McGill University

Motivation
● Reduce GC work by object stack allocation
● Drawbacks of previous approach for Java
  – whole-program escape analyses
  – restrictions on stackable objects:
    ● trivial finalize method
    ● limited sizes of arrays
    ● non-overlapping lifetime in a loop

Goals
● Reduce GC work by cheaply reclaiming non-escaping objects, but:
  – should not rely on expensive program analyses
  – overcome restrictions of stack allocation
● Preserve full semantics of Java Virtual Machines
● Explore runtime information of Java programs

Road Map
● Motivation & Introduction
● Region-based Allocator
● Experimental Results
● Conclusion & Future work

Proposal
● Use write barriers to dynamically categorize allocation sites as local or non-local
● Allocate objects in regions instead of stack frames
● Adaptively change allocation decisions

[Figure: heap layout showing From Space, To Space, the Global Region, and local regions R1 and R2 attached to the Thread Stack]

Definitions
● Escaping Object: An object escapes its allocation region if and only if it is referenced by an object in another region
● Non-local Allocation Site: An allocation site becomes non-local when an object created by that site escapes

Heap Organization
● Heaps are managed as regions consisting of a set of pages:
  – A Global region contains escaping objects and objects created from non-local sites
  – A Free list links free pages
  – Local regions act as extensions of stack frames and allocate space for objects created from local sites

Allocation Sites and Objects
● Each allocation site has one unique index, with one of two states:
  – local, creates objects in local regions
  – non-local, creates objects in the Global region
● An object header contains:
  – the index of its allocation site (sharing space with thin locks)
  – an escaping bit

a = new A();
  1: a = local_new A();     (site 1 in the local state)
  1: a = global_new A();    (site 1 in the non-local state)
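The per-site state and the object header from the Allocation Sites and Objects slide can be sketched roughly as follows. This is an illustration only, not the Jikes RVM prototype: the class name, the `SiteTable` helper, and the header layout (escaping bit in bit 0, site index in the remaining bits) are assumptions made for the example, and the sharing of header space with thin locks is left out.

```java
import java.util.Arrays;

public class AllocationSiteSketch {
    /** The two states an allocation site can be in, as named on the slide. */
    enum SiteState { LOCAL, NON_LOCAL }

    /** Hypothetical table with one entry per allocation site index. */
    static final class SiteTable {
        private final SiteState[] states;

        SiteTable(int siteCount) {
            states = new SiteState[siteCount];
            Arrays.fill(states, SiteState.LOCAL); // every site starts in the local state
        }

        SiteState stateOf(int siteIndex) { return states[siteIndex]; }

        void markNonLocal(int siteIndex) { states[siteIndex] = SiteState.NON_LOCAL; }
    }

    // Assumed header layout: bit 0 is the escaping bit, the other bits hold the site index.
    static int makeHeader(int siteIndex) { return siteIndex << 1; }
    static int siteIndexOf(int header) { return header >>> 1; }
    static boolean hasEscaped(int header) { return (header & 1) != 0; }
    static int withEscapingBit(int header) { return header | 1; }

    /** Stands in for the choice between local_new and global_new at one site. */
    static String chooseRegion(SiteTable table, int siteIndex) {
        return table.stateOf(siteIndex) == SiteState.LOCAL ? "local" : "global";
    }

    public static void main(String[] args) {
        SiteTable table = new SiteTable(4);
        System.out.println(chooseRegion(table, 1)); // prints "local"

        // An object from site 1 escapes: set its escaping bit and flip the site.
        int header = withEscapingBit(makeHeader(1));
        table.markNonLocal(siteIndexOf(header));
        System.out.println(chooseRegion(table, 1)); // prints "global"
    }
}
```

Storing the site index in the header is what lets a write barrier find the allocation site of an escaping object without any extra lookup structure.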
[Figure: the store b.f = a, showing object a (labeled 1) and object b]

Region Allocation
● Method prologue and epilogue have instructions allocating and releasing regions
● A region has one of two states:
  – clean: pages are reclaimed when the host stack frame is popped
  – dirty: pages are appended to the Global region, which is collected by GC

Write Barriers
● Objects may escape local regions by four types of bytecodes: putstatic, putfield, aastore, and areturn
● Write barriers capturing escaping objects have two purposes:
  – safety: marking regions as dirty
  – adaptation: marking allocation sites as non-local

Put Them Together
● Initially, all allocation sites in a method are in the local state
● As the program proceeds, some become non-local and will create future objects in the Global region
● The local regions of future activations are more likely to be clean
● Write barriers guarantee safety

Specific Issues for Java
● areturn instruction
● exceptions (and athrow instruction)
● finalize method

Road Map
● Motivation & Introduction
● Region-based Allocator
● Experimental Results
● Conclusion & Future work

Prototype Implementation
● Jikes RVM: we chose the baseline compiler and a semi-space copying collector
● Settings:
  – Fixed page size
  – Did not use large object heap
  – Objects straddling multiple pages

Experimental Results
● Behavior study of SPECjvm98 & soot-c:
  – Allocation behavior
  – Effect of regions and page sizes on collections and fragmentation
  – Behavior of write barriers
  – Effect of adaptation
  – Impact on thin locks

Allocation Behavior
[Figure: allocation distribution (80% to 100%) for javac, jess, mtrt, and soot-c at page sizes 256, 1K, and 4K; categories: firstpage, nextpages, searching]

Effect of Regions and Page Sizes
Dynamic measurement of:
● number of collections
● froth rate (unused bytes on pages)

              # collections                     froth rate
            BASE   256   1K         4K      256      1K       4K
compress      7     7     7 (13%)    7     0.03%    0.11%     0.47%
db            4     4     4 ( 0%)    4     0.05%    0.23%     1.05%
jack          9     7     8 (23%)    9     1.29%    5.97%    27.52%
javac        12    12    15 ( 9%)   25     4.96%   29.41%   130.42%
jess         12    11    11 ( 7%)   11     0.13%    0.53%     2.19%
mpeg          0     0     0 (28%)    0     0.62%    2.10%     9.05%
mtrt          7     1     1 (81%)    1     0.03%    0.09%     0.38%
soot-c       15    13    13 (19%)   15     1.09%    4.89%    23.49%

* 50M total heap space with ~25M in each semi-space

Behavior of Write Barriers
Write barriers for putfield, aastore:
[Figure: distribution (80% to 100%) of write-barrier outcomes for compress, db, jack, javac, jess, mpegaudio, mtrt, and soot-c; categories: quick, null, samepage, sameregion, escaped]

Region Allocation at Runtime
[Figure: percentage allocated on local regions (0% to 100%) versus bytes allocated (per 1M) for soot-c, javac, jess, and mtrt]

Effect of Adaptation
Javac: ratio of clean regions 498K/499K
[Figure: ratio of clean regions (80% to 100%) versus released regions (per 1,000)]

Effect of Adaptation (Cont.)
Javac: ratio of clean regions with/without adaptation
[Figure: ratio of clean regions (0% to 100%) versus released regions (per 10,000); series: adaptation, no adaptation]

More on Adaptation
● Current scheme predicts that future objects will escape after one object from that site escapes
● Without adaptation, future objects are always predicted to be non-escaping

                        javac    jess
with adaptation
  #collections            15      11
  froth                   29%      1%
without adaptation
  #collections            96       2
  froth                  589%      9%

Impact on Thin Locks
● Share space with thin locks in a two-word object header
● Less than 5% of thin locks require one additional check on the common path
● One additional check on the uncommon path (see the paper for details)

Related Work
● Escape analysis and stack allocation for Java programs
  – Gay et al. [CC’00], Choi et al. [OOPSLA’99], Blanchet [OOPSLA’99], Whaley et al. [OOPSLA’99], …
● Memory Management with Regions (Scoped memory regions)
  – Tofte et al. [IC’97], Gay et al. [PLDI’98], Deters et al. [ISMM’02], …

Conclusions
● We have presented the idea of using regions to reduce the work of GC in Java Virtual Machines
● We have implemented the prototype in a real virtual machine and proposed several techniques to reduce the overhead
● Our study of allocation behavior validates the idea

Future Work
● Relax the definition of escaping by using stack discipline and region hierarchy
● Look for better prediction schemes (calling context)
● Optimize write barriers with cheap analyses
● Combine the allocator with other types of GC

?
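As a recap of the Region Allocation, Write Barriers, and Put Them Together slides, the interplay between region release and the putfield barrier can be simulated in a few lines. This is a toy model under stated assumptions, not the prototype's code: `Region`, `Obj`, `activation`, the fixed table of 16 sites, and the counters are all hypothetical names introduced for the example. It only models putfield, and it skips the barrier when the stored object already lives in the Global region, which is a simplification.

```java
public class RegionBarrierSketch {
    /** A region is clean until an object allocated in it escapes. */
    static final class Region {
        boolean dirty;
    }

    /** A simulated object: its allocation site, its region, and one reference field. */
    static final class Obj {
        final int site;
        final Region region;
        boolean escaped;
        Obj field;

        Obj(int site, Region region) {
            this.site = site;
            this.region = region;
        }
    }

    final Region global = new Region();
    final boolean[] nonLocalSite = new boolean[16]; // hypothetical fixed number of sites
    int cleanReleases;
    int dirtyReleases;

    /** Local sites allocate in the given local region, non-local sites in the Global region. */
    Obj allocate(int site, Region local) {
        return new Obj(site, nonLocalSite[site] ? global : local);
    }

    /** putfield-style write barrier for holder.field = value. */
    void putField(Obj holder, Obj value) {
        if (value != null && value.region != global && value.region != holder.region) {
            value.escaped = true;            // the object is now referenced from another region
            value.region.dirty = true;       // safety: the region must not be freed on frame pop
            nonLocalSite[value.site] = true; // adaptation: future objects go to the Global region
        }
        holder.field = value;
    }

    /** Epilogue: a clean region is reclaimed at once, a dirty one is left to the GC. */
    void release(Region region) {
        if (region.dirty) dirtyReleases++; else cleanReleases++;
    }

    /** One activation of a hypothetical method whose site 1 leaks into a long-lived holder. */
    void activation(Obj holder) {
        Region local = new Region();   // prologue allocates the region
        Obj a = allocate(1, local);
        putField(holder, a);
        release(local);                // epilogue releases the region
    }

    public static void main(String[] args) {
        RegionBarrierSketch sim = new RegionBarrierSketch();
        Obj holder = sim.allocate(0, sim.global);
        sim.activation(holder); // site 1 is local: the object escapes, the region is dirty
        sim.activation(holder); // site 1 is now non-local: the local region stays clean
        System.out.println("dirty=" + sim.dirtyReleases + " clean=" + sim.cleanReleases);
        // prints "dirty=1 clean=1"
    }
}
```

The first activation pays for the misprediction with one dirty region; every later activation of the same method releases a clean region, which is the trend the Effect of Adaptation slides report for javac.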