The Garbage Man’s Blues
Arcueid 0.1.0 is now almost complete. The only stuff missing now before I can get it to run news.arc might be networking code, but that shouldn’t be difficult to do. I had, however, gotten into some trouble with garbage collection, which is a painful business. I’m starting off with a basic mark and sweep garbage collector, and now I sort of regret having removed the old write barrier code that used to be part of my original code, as it could have been used not just to support VCGC that we tried to use in the 0.0.1 series. It could have been used to do a lot of interesting things.
First of all, a write barrier would have enabled us to defer actually allocating an object until its reference was actually put somewhere, like in an environment, in a register, as part of a data structure, or on the stack, permitting us to implement essentially what amounts to a 1-bit reference count. Of course, that also means that the free list for BiBOP objects has to be implemented as a queue, or something of that nature. If a value is allocated from within a foreign function and is never stored in a variable, then the allocation technically uses no memory. Since we use __arc_mkaff everywhere to call foreign functions from C code, that produces a small allocation of something that almost always immediately turns into garbage. Many other similar scenarios arise in the interpreter for values that are essentially ephemeral in the same way, so many small objects may never need to be truly “allocated” at all, as they become garbage almost instantly.
The garbage collector also seems to be slowing the current code down significantly the way we use it now. Essentially, we perform a collection every time a cycle of all other threads has finished executing, with a default quantum of 1024 instructions, and as such it slows down the code noticeably. I’ve been playing around with the Ackermann function as a benchmark, and for up to (ack 3 4) it seems to be comparable in speed to reference Arc on Racket. However, for (ack 3 5) it already pauses noticeably on my machine, which I think may be due to the garbage collector, given that increasing the quantum to 65,536 produces a marked improvement, as GC is performed less often with such a high quantum. Profiler proof is something I’m going to try to get as well.
If I’m going to stick with the mark and sweep gc rather than going back to VCGC, I wonder under what situations it is best for the collector to run. As I said we do it now after a thread’s time slice completes, but this is not the only way it can be done. Ruby seems to invoke its GC every time the free lists become empty, and perhaps we can do something like that too: run the GC before creating a new BiBOP page or something like that. We could measure how much memory usage we have and set thresholds, and perform collections when such thresholds are reached.
I’ve tried to design the system so as to make garbage collection as abstracted as possible, so that we can swap out GC algorithms as easily as possible. The only thing missing for now is a write barrier mechanism, and that needs to be used whenever value writes are performed to anything that could potentially be reachable from the root set, viz.:
- Writes to registers that can contain Arcueid values, e.g. to the accumulator
- Writes to variables in an environment.
- scar or scdr operations
- Writes to vector indices (including writes to hash tables)
This should be simple enough to do without making too many mistakes, by changing the macros that have hitherto been used to do these things to stop producing C lvalues, so that we get an error on compilation like ‘lvalue required as left operand of assignment’. This tells us where to change them so we can put in a function or macro that incorporates the write barrier. This also suffices to provide hooks where we can put in a read barrier if a GC algorithm needs it.
As it is the compiler has at least 74 places where this needs to be updated… Now that I’ve gotten into progress making these changes it’s gradually becoming obvious that the code should not be written in C. While I’ve done many clever things with the C preprocessor to support all of this business I think I’ve just about reached the limits of what’s possible. The use of the AV(…) macro to access variables in the function’s local environment is already error-prone as it is, and now I have a WV(…) macro to write to variables like this. The internal use of an unsigned char type for these variables has gone some way to making this less error-prone (as the compiler may or may not complain about use of such a type in contexts where a value is required), but I really do think that some other language halfway between C and Arc is needed. Perhaps a PreArc analogous to Richard A. Kelsey’s PreScheme? Something to consider.
Maybe I’ll call it Shiki?