Rethinking the implementation of Continuations
Well, life has a way of catching up with you when you least expect it, and so I have once again left the Arcueid code base fallow for well over six months. Now, I hope, I’ll have a bit more time to work on it, and perhaps I’ll be able to do something more with it in this stretch, provided that the work I do which pays the bills doesn’t get in the way too much.
Running Conan Dalton’s testsuite on Arcueid has uncovered what appear to be some bugs in the continuation handling behaviour of the virtual machine core, especially that of ccc (call/cc). Someone rooting around in the random writings I have in the Arcueid wiki on Github might have noticed me mentioning something about a calling convention 4, and the variant continuations it generates. In retrospect, I now think that this idea was ill-conceived, relying as it does on Simon Tatham’s coroutine technique which is rather unorthodox to say the least.
To permit C extensions (and naturally functions within the Arcueid runtime itself of course) to call Arc code, we could of course simply create another virtual machine engine context and use that to execute the code, and when it terminates, the return value is returned to the calling C code. This is what Ruby and apparently Python do. However, we want to have full pre-emptive multitasking in the virtual machine core, and doing this precludes that unless Arcueid gets native threads, and that’s a whole other can of worms which I’m not yet ready to open at this time.
A variation of the above technique could also be done, using an Appel Trampoline the way Chicken Scheme does it, but then that would mean that I’d have to rip the guts out of the garbage collector and make fundamental changes in Arcueid’s memory allocation, and that is not something I think I can do just now.
The approach I’m using today makes use of Simon Tatham’s coroutines to generate what amount to continuations capable of restoring to arbitrary locations in C code. This imposes a number of limitations on C functions that want to be able to call Arcueid bytecode, the most notable being that all variables must be Arcueid value
s, and these C functions cannot in turn call other C functions of this type either. It seems that interactions between these types of continuations and the normal type of continuations are the source of the bugs we have, and they seem to be fundamental.
The fourth technique, and perhaps the one I am about to code, involves copying the stack. This is actually something that we already do: it is currently done to allow I/O calls that would otherwise block the thread scheduler completely to return to the thread scheduler and also be restored when I/O will no longer block. Apparently, the technique is used by Guile and a few other Lisp and Scheme implementations to provide continuations and call/cc, so we aren’t exactly treading in strange ground here.
Other techniques for implementing reified continuations seem to require that interfacing with C code become more complicated, and well, one of my design goals for Arcueid is to produce an Arc implementation that can interface with C easily, so those techniques, while they may make for continuations and call/cc that may be much more efficient than stack copying or these other techniques I mention, run counter to my design goals.
That said, I have to wonder just what kind of a performance hit we would be looking at if we had only one continuation type that consisted of a copy of the current stack. For pure Arc code running only on the virtual machine the C stack of the virtual machine would not be getting any deeper, as these saved stacks would include only the local variables of the virtual machine main loop (arc_vmengine
in vmengine.c
). It’s only when we have deeply nested C calls that the stack copies will ever get much deeper than this. Perhaps multitasking pre-emption could also be done in this way…?
Well, let me code this and see how it goes.