Reducing Pause Times With Clustered Collection (ISMM 2015 - Research Papers)

Who

Cody Cutler, Robert Morris

Track

ISMM 2015 Research Papers

Time Zone

The program is currently displayed in (GMT-07:00) Tijuana, Baja California.

Use conference time zone: (GMT-07:00) Tijuana, Baja CaliforniaSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Sun 14 Jun 2015 16:25 - 16:50 at C123 - Optimizing Garbage Collection Chair(s): Richard Jones

Abstract

Each full garbage collection in a program with millions of objects can pause the program for multiple seconds. Much of this work is typically repeated, as the collector re-traces parts of the object graph that have not changed since the last collection. Clustered Collection reduces full collection pause times by eliminating much of this repeated work.

Clustered Collection identifies clusters: regions of the object graph that are reachable from a single “head” object, so that reachability of the head implies reachability of the whole cluster. As long as it is not written, a cluster need not be re-traced by successive full collections. The main design challenge is coping with program writes to clusters while ensuring safe, complete, and fast collections. In some cases program writes require clusters to be dissolved, but in most cases Clustered Collection can handle writes without having to re-trace the affected cluster. Clustered Collection chooses clusters likely to suffer few writes and to yield high savings from re-trace avoidance.

Clustered Collection is implemented as modifications to the Racket collector. Measurements of the code and data from the Hacker News web site (which suffers from significant garbage collection pauses) and a Twitter-like application show that Clustered Collection decreases full collection pause times by a factor of three and six respectively. This improvement is possible because both applications have gigabytes of live data, modify only a small fraction of it, and usually write in ways that do not result in cluster dissolution. Identifying clusters takes more time than a full collection, but happens much less frequently than full collection.

Link to Publication

http://dl.acm.org/authorize?N98628

Cody Cutler

MIT

Robert Morris

MIT