Big bag of pages

Implementing the big bag of pages memory allocator proved to be not all that difficult. Arcueid still uses malloc to allocate objects greater than 512 bytes in size, but smaller objects are allocated using the big bag of pages. A simple free list is maintained for each of the sizes from 1 to 512 bytes, and if the free list for a particular size is found empty, a new chunk of memory (marked immutable) is allocated that should be big enough to hold up to 64 (by default) objects of that size. This chunk is carved out into objects of the requested size and added to the free list.

This approach has shown a large performance increase. Running the ab test as below produces the following results:

$ ab -n 100 -c 10 http://localhost:8080/said
This is ApacheBench, Version 2.3
Copyright 1996 Adam Twiss, Zeus Technology Ltd,
Licensed to The Apache Software Foundation,

Benchmarking localhost (be patient).....done

Server Software:
Server Hostname:        localhost
Server Port:            8080

Document Path:          /said
Document Length:        166 bytes

Concurrency Level:      10
Time taken for tests:   5.282 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      24000 bytes
HTML transferred:       16600 bytes
Requests per second:    18.93 [#/sec] (mean)
Time per request:       528.151 [ms] (mean)
Time per request:       52.815 [ms] (mean, across all concurrent requests)
Transfer rate:          4.44 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:   440  522  99.9    497     857
Waiting:      408  489  95.8    464     808
Total:        440  523 100.0    497     857

Percentage of the requests served within a certain time (ms)
  50%    497
  66%    508
  75%    518
  80%    522
  90%    615
  95%    850
  98%    856
  99%    857
 100%    857 (longest request)

This is almost twice the speed , showing that memory allocations are a large factor dominating performance: obviously the simple free list that is now used for objects in our big bag of pages is much, much faster than malloc. This has also reduced total memory usage somewhat, though not that much, using 37 megs on initial load of arc.arc, 150 megs after loading libs.arc, and 253 megs after the above test, vs. 41, 170, and 283 respectively before implementing BiBOP.

One thing that the current  BiBOP implementation cannot do is release memory back to the OS. Trying to make a shot at doing this will in any case make the free list not such a simple free list any more, and hence slow down memory allocations as well. We’ll probably have to use mmap instead of malloc to get such memory blocks, and the block size should be made larger than it is today, and some algorithms need to be devised for managing the multiple free lists efficiently.

One other thing: we are getting very close to being able to run news.arc! Some bug fixes involving ssyntax (available in git head) now allow news.arc to finally load correctly. However, there still seem to be other bugs though that prevent it from running properly, that’s also being fixed.


~ by stormwyrm on 2012-02-17.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: