Garbage collection (by alaric)
The next cleverest technique is called semispace GC. In this technique, the system's memory is divided into two halves.
One big advantage of this system is that it also solves the aforementioned problem of deciding which bits of memory are in use and which aren't, because the system keeps track of which half of the memory it's currently using for new allocations. It just keeps a marker that starts off at the start of the active half; to allocate N bytes, it takes the N bytes starting from the marker, then moves the marker to just after the N bytes. Voila! Nice and fast!
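To make the advancing marker concrete, here is a minimal sketch in Python. The class name `BumpAllocator` and its fields are my own illustrative choices, and addresses are plain integers rather than real memory; it only shows the "take N bytes, then move the marker" step.

```python
class BumpAllocator:
    """Toy marker-advancing allocator over one semispace (illustrative names)."""

    def __init__(self, start, size):
        self.start = start        # first address of the active half
        self.end = start + size   # one past the last usable address
        self.marker = start       # next free address

    def allocate(self, n):
        """Return the address of n fresh bytes, or None if the half is full."""
        if self.marker + n > self.end:
            return None           # a real system would start a collection here
        address = self.marker
        self.marker += n          # move the marker to just after the n bytes
        return address


heap = BumpAllocator(start=0, size=64)
first = heap.allocate(16)    # takes addresses 0..15
second = heap.allocate(24)   # takes addresses 16..39
```

Allocation is one comparison and one addition, which is where the speed comes from.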
When the marker gets towards the end of the semispace, however, the system needs to run a collection cycle. It starts off with the memory regions it inherently knows are currently in use (the "roots", such as global variables and the stack), and copies them to the unused semispace, using a new advancing marker to allocate space there. It then examines those memory regions for references to other memory regions. The collector doesn't actually know when the program has finished with a memory block; instead, it makes sure it keeps all the blocks that are still referenced, and throws away blocks that nothing refers to, since they cannot possibly be used in future. Every time it copies a block to the new semispace, it scans it for references to other blocks still in the old semispace, and marks them for copying.
When there are no more marked blocks in the old semispace, it now knows that all the currently usable memory blocks are moved to the new semispace, and that the old semispace is now unused. So it now swaps the two spaces over, with subsequent allocations being handled by advancing the marker in the new semispace; and when the new semispace gets full, it will again run a collection cycle, this time moving live memory regions back to the old semispace.
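The whole cycle can be sketched in a few lines of Python. This is a toy model under some loud assumptions: blocks are Python objects rather than raw memory, the new semispace is just a list, and instead of keeping a separate set of "marked for copying" blocks it copies a block the moment a reference to it is found and scans the copy later (the arrangement usually called Cheney's algorithm). The names `Block`, `forward` and `collect` are mine.

```python
class Block:
    def __init__(self, name, refs=None):
        self.name = name
        self.refs = refs if refs is not None else []  # references to other blocks
        self.forward = None   # set to the copy once this block has been moved


def collect(roots):
    """Copy everything reachable from roots into a new semispace (a list)."""
    to_space = []

    def copy(block):
        # Copy each block at most once; later visits reuse the forwarding pointer.
        if block.forward is None:
            block.forward = Block(block.name, list(block.refs))
            to_space.append(block.forward)  # "advance the marker" in the new half
        return block.forward

    new_roots = [copy(root) for root in roots]
    scan = 0
    while scan < len(to_space):             # copied blocks not yet scanned
        block = to_space[scan]
        block.refs = [copy(ref) for ref in block.refs]
        scan += 1
    return new_roots, to_space


c = Block("c")
b = Block("b", [c])
a = Block("a", [b])
c.refs.append(a)                  # a cycle: a -> b -> c -> a
garbage = Block("garbage", [a])   # refers to a, but nothing refers to it

new_roots, to_space = collect([a])
```

After the call, `to_space` holds copies of `a`, `b` and `c` whose references point only at each other, and `garbage` was never copied, so it vanishes when the old half is reused.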
The disadvantages of this approach are clear - only half your memory is usable at any one time, and there are occasional long delays while every bit of memory in use is copied, which also causes problems for your cache.
However, the advantages are profound - memory allocation is very fast, since the collection cycle conveniently makes all of free memory into a single area, from which chunks of any desired size can be broken off at will like sections of a bar of chocolate. Also, the programmer has nearly no need to worry about memory management; they can just ask for memory when they need it, and when their program has stopped referring to the memory it's invisibly swept back into the free area for later reuse. In practice, most programs don't tend to keep references to memory after it's needed, with no extra effort from the programmer; here and there, however, the programmer can help the garbage collector along by making sure they don't keep references to memory blocks for longer than needed.
So semispace GC offered a refreshing glimpse of what's possible, but still had some showstopping issues. The search continued...
4 Comments
By Faré, Fri 18th Nov 2005 @ 3:21 am
So as to be able to move your object to the top of the list, yet preserve the invariant that no object points to a newer object, you need to also move up all the objects that point to your born-again object, and so on recursively. In practice, this is only affordable if your object has no one pointing to it except the current context -- in which case what you have is a linearity constraint on objects.
By Alaric Snell-Pym, Fri 18th Nov 2005 @ 10:31 am
Ah, but I do have a linearity constraint on objects 😉 That's why I was amused by the similarity to "copy on write" handling of tree structures on disk, which basically works the same way (to safely update a tree structure like a B-Tree, make copies of the changed leaf nodes into empty space, then work out what intermediate nodes would need changing to reflect the new locations of the leaf nodes and do the same with them, then continue bubbling the change up the tree until you have a new root pointer, etc).
I'm sure there's some Fundamental Truth in the fact that the same underlying technique turns out to be useful both on disk and in memory, yet in very different contexts (ACID properties on disk, fast garbage collection in RAM).
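As a small illustration of that "bubble the change up to a new root" idea, here is a sketch using a plain binary search tree rather than a B-Tree, in memory rather than on disk. The `Node` type and `insert` function are hypothetical names for this example only.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key):
    """Return a new root; nodes off the search path are shared, never modified."""
    if node is None:
        return Node(key)
    if key < node.key:
        return node._replace(left=insert(node.left, key))
    if key > node.key:
        return node._replace(right=insert(node.right, key))
    return node   # key already present, nothing to copy


old_root = insert(insert(insert(None, 5), 2), 8)
new_root = insert(old_root, 9)
```

Only the nodes on the path from the changed leaf to the root are copied; `old_root` is still a complete, valid tree, and the untouched left subtree is shared between the two versions.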
By Alaric Snell-Pym, Sun 20th Nov 2005 @ 2:33 pm
Oooh, while sawing up logs (a great time for thinking about abstract stuff) I was struck by a flaw...
When a memory block is modified and gets brought up to the head of the chain, IF the collector has not yet reached the block in this pass, then the blocks referenced by this block will not get marked since the collector will then not visit the block in question until the next pass. If nothing else refers to the same blocks, they'll be freed. Oops!
So we need to make sure that a block moved to the top of the chain still gets seen. My first thought was that the application could just quickly scan the block and mark all the referenced blocks, but that's wrong - it's the collector's job.
So my next thought was to have a stack of 'touched' objects (either shared and lock-free, or per-processor, to stop it from becoming a point of contention in SMP systems); when altering an object, the application would merely need to push a reference onto this stack (it wouldn't even need to do the move to the head of the chain). Now, the collector, whenever it's about to examine the next object in the chain, would first look on the stack(s) and go through any memory blocks on them, marking all the referenced blocks. That way, it will never be considering a block for freeing unless it has already 'scanned' all modified objects, so there's no chance of it mistakenly freeing something. Whenever the collector has scanned a block from the stack, it can then also do the chore of moving that block to the head of the chain, taking that task away from the application code.
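A rough single-threaded sketch of the 'touched' stack, in Python. It leaves out the chain itself and the lock-free or per-processor part entirely, and the names `write_ref` and `drain_touched` are made up for the example; it only shows the split of work between application and collector.

```python
class Block:
    def __init__(self, name):
        self.name = name
        self.refs = []        # references to other blocks
        self.marked = False   # set by the collector when the block is known live


touched = []   # blocks the application has modified since the last drain


def write_ref(block, target):
    """Application side: do the store and note that the block was touched."""
    block.refs.append(target)
    touched.append(block)     # no marking here; that is the collector's job


def drain_touched():
    """Collector side: run before examining the next block in the chain."""
    while touched:
        block = touched.pop()
        for ref in block.refs:
            ref.marked = True


parent = Block("parent")
child = Block("child")
write_ref(parent, child)
drain_touched()
```

After the drain, `child` is marked even though the collector never reached `parent` by walking the chain.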
However, there is a problem - an application that just sits there modifying the same large array of pointers over and over again would keep the collector forever rescanning that large array, never getting any real collection done. What we need is to only stack memory blocks for scanning if they've not yet been scanned anyway. This is easily resolved: have a 'scanned' flag in each block, that the collector sets whenever it scans a block, be it due to the block being on the stack or by traversing the chain. Newly allocated objects also have the 'scanned' flag set, since all the objects they refer to must have been reachable anyway, and thus will be marked - they don't need rescanning until the next pass. When the collector finishes scanning the chain and is about to start again, it has to clear all the 'scanned' flags; but rather than walking the chain doing this, it's easier to just reverse the interpretation of the 'scanned' flag. Then newly created blocks will need to be marked as 'unscanned' for the next scan.
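Here is how I read the flag-flipping trick, as a single-threaded Python sketch. Blocks are plain dicts, the class `Collector` and its method names are invented for the example, and `scanned_value` stands in for the global setting that says which flag value currently means 'scanned'.

```python
class Collector:
    def __init__(self):
        self.scanned_value = True   # flag value that means "scanned" this pass
        self.touched = []

    def new_block(self):
        # New blocks count as already scanned for the current pass.
        return {"refs": [], "flag": self.scanned_value, "marked": False}

    def is_scanned(self, block):
        return block["flag"] == self.scanned_value

    def note_write(self, block):
        # Application side: only stack blocks that still need scanning.
        if not self.is_scanned(block):
            self.touched.append(block)

    def scan(self, block):
        for ref in block["refs"]:
            ref["marked"] = True
        block["flag"] = self.scanned_value

    def drain(self):
        while self.touched:
            block = self.touched.pop()
            if not self.is_scanned(block):   # skip duplicates already handled
                self.scan(block)

    def start_next_pass(self):
        # Reverse the meaning of the flag instead of walking the chain.
        self.scanned_value = not self.scanned_value


gc = Collector()
old = gc.new_block()
gc.start_next_pass()          # every existing block is now "unscanned"
child = gc.new_block()
old["refs"].append(child)
gc.note_write(old)
gc.note_write(old)            # pushed twice, but only scanned once
gc.drain()
gc.note_write(old)            # already scanned this pass, so not stacked again
```

Flipping one boolean makes every existing block 'unscanned' at once, and a block that is modified repeatedly costs the collector at most one scan per pass.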
There's a potential race condition in that if the collector changes the global variable that says what newly created blocks should be marked as between the application reading the current setting and the application putting the newly created block at the head of the chain, it could end up with the wrong setting. Therefore, before doing the swap, the collector should take a copy of the pointer to the current head of the chain; then when it starts its walk of the chain, it should force the correct value into all the blocks it examines until it hits the point in the chain it marked, this way ensuring nothing gets missed.