Tail recursion

The current code in the Git repository today is actually capable of running srv.arc properly, and can now run the Arc challenge code more or less stably. There are a couple of serious problems though: 1. it’s dog slow, and 2. it uses too much memory. Using ab to make 100 accesses at a concurrency level of 10 produces the following dismal results:

$ ab -n 100 -c 10 http://localhost:8080/said
This is ApacheBench, Version 2.3 
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient).....done

Server Software:        
Server Hostname:        localhost
Server Port:            8080

Document Path:          /said
Document Length:        166 bytes

Concurrency Level:      10
Time taken for tests:   10.263 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      24000 bytes
HTML transferred:       16600 bytes
Requests per second:    9.74 [#/sec] (mean)
Time per request:       1026.313 [ms] (mean)
Time per request:       102.631 [ms] (mean, across all concurrent requests)
Transfer rate:          2.28 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       0
Processing:   943 1021  43.3   1015    1170
Waiting:      838  933  47.3    927    1102
Total:        943 1021  43.2   1015    1170

Percentage of the requests served within a certain time (ms)
  50%   1015
  66%   1033
  75%   1042
  80%   1048
  90%   1064
  95%   1125
  98%   1166
  99%   1170
 100%   1170 (longest request)

Reference Arc is at least an order of magnitude faster. And meanwhile the Arcueid process has balooned to 1.5 gigabytes RSS. I think I know part of the reason why this is happening: Arcueid does not yet do proper tail recursion. There are large numbers of loops inside of srv.arc that make use of tail recursion, and their repeated runs make useless copies of their environments accumulate over time.

I just ran across a paper by Richard A. Kelsey (the author of Scheme48 among other things), that addresses precisely the implementation issue here. There are three methods described in Kelsey’s paper to address this: 1. Place environments in the heap instead of a stack and trust the real garbage collector to take care of business, 2. Overwrite environments on tail recursion, and 3. stack garbage collection.

The first method described in Kelsey’s paper, based on an idea by Andrew Appel, is also the simplest to  implement. It is also the slowest, based on Kelsey’s benchmarks. The virtual machine need only stop making new continuations for every tail call in order to do this. There are some complications involved here though, because error handling depends on the existence of a current continuation (and a tail call may not have one), but that should not be hard to address.

The second method does not seem that hard to implement either. What has to be done in this case is just not to make new environments on tail recursive calls. The old environment will still be used, and a new one will not be allocated. This can be done by skipping over the env instruction that creates a new environment on a tail recursive call or modifying the env instruction to detect tail recursion.

The third method may be something we can consider implementing later on, but given the results in Kelsey’s paper it does not seem to be worth the additional complication, as it is not that much faster than the second method.

~ by stormwyrm on 2012-02-11.

One Response to “Tail recursion”

  1. […] the virtual size has gone down from over 400 megs to about 170. After using ab to do the same test as below, virtual size is 283 megs, from over 1.6 gigs before. It’s still a lot, but a lot less than […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

 
%d bloggers like this: