Tail Recursive Stack Disciplines, redux
After taking once again a nearly year-long hiatus from Arcueid development I find that I’ve been thrust back into it thanks to Alexander Artemenko pointing out a fairly serious bug in it. The nature of the bug is fairly interesting, and is an artefact of my attempts last year to get Arcueid to do tail recursion with a stack.
The sample code that triggers the bug Mr. Artemenko discovered is fairly simple:
(let i 0 (while t (prn "Still going: " i) (++ i)))
This triggers a stack overflow exception because of the way the Arcueid interpreter handles moving environments from the stack to the heap. Expanding the macros and simplifying this we get the following:
((fn (i) (((fn (self) (assign self (fn (x) ((fn () (prn "Still going: " i) (assign i (+ i 1)))) (self t)))) nil) t)) 0)))
The Arc compiler compiles the innermost function
(fn (x) ((fn () (prn "Still going: " i) (assign i (+ i 1)))) (self t))
into the following virtual machine instruction sequence:
0000 env 01,00,00 ; environment has one element (fn (x) ...) 0001 cont 07,43 ; prepare for call to anonymous function (go to 0005) 0002 ldl 00 ; load anonymous function code (fn () (prn "Still going: " i) (assign i (+ i 1))) 0003 cls ; turn into closure 0004 apply 00 ; apply (fn () (prn "Still going: " i) (assign i (+ i 1))) with no args 0005 true ; parameter for call to self (destination of continuation created at 0001) 0006 push 0007 lde 01,00 ; self (created by assignments above) 0008 menv 01 ; tail recursion prepare 0009 apply 01 ; apply self with t
The whole thing falls apart because the
cls instruction on line 0003, which creates a closure out of a code object and the current environment, has to then move the current environment from the stack to the heap. This is required because the closure could potentially be returned or assigned somewhere beyond its original scope, and the values in that environment might be needed by a subsequent invocation of the function (ah, the joys of static binding). But then what happens to the copy of the environment that’s still on the stack? It’s still there, but has become garbage. And so, as lines 0005 through 0009 are executed, and the application returns to line 0000 by the
apply instruction on line 0009, the garbage environments accumulate, and eventually the stack fills up. So much for our tail recursion!
There are several possible solutions that could solve the problem. One would be to make the
cls instruction destroy the environment on the stack that after transferring it to the heap. That involves moving everything on the stack below it back upwards. The only problem with this is that continuations on the stack are referenced as absolute offsets into the stack, so moving a continuation like that is problematic. Kartik Agaram also discovered this when he tried to hack Arcueid to make its stack grow dynamically.
Rereading Richard Kelsey’s paper on tail-recursive stack disciplines for an interpreter yields the following possibly simplest solution:
Another possibility is to wait until the stack is full and then reclaim any unused space. The stack can be treated as a constrained heap, containing only environments and continuations… The third possibility, copying the live data into a heap, reduces the possibility of thrashing… This is certainly the most attractive possibility for languages that already require a heap.
So what we can do is wait until all stack space is exhausted, and then move all environments and continuations that are still on the stack into the heap, and then drop everything on the stack except any scratch values that the program might have been computing that are still there. Let’s see how this all works.