Memory fragmentation and malloc
The resident set size of the Arcueid process on my machine, after it loads arc.arc, clocks in at 99 megabytes, and uses 124 megabytes of virtual memory. This is higher than I expected, when attempting to traverse the root set and all allocated memory as kept track of by the garbage collector only amounts to less than 10 megabytes. Since at the moment we use malloc and free to do all our memory allocation and freeing, we are probably getting lots and lots of fragmentation, which became obvious after looking at the memory map of an Arcueid process. Well, now I learn the hard way why no real high-level language runtime does things that way! I used to have some block memory management code before where I would use malloc only to allocate large blocks of memory (memory pools), which I would then free only if they ever became empty, but debugging this was dreary work so I decided to just use malloc/free for the time being. I knew that fragmentation would become a problem but I had not realized just how badly it would affect an application such as this. In hindsight it must be obvious, given that we are allocating thousands of small objects like cons cells just to load and compile arc.arc, some of which become garbage and others persist indefinitely.
I will thus have to write some form of manual memory allocator that carves out large blocks of memory using malloc/mmap/etc., and then parcels portions of this out to programs as they need them, maintaining a free list to manage allocations. One thing I can do is maintain one set of memory pools for all cell structs (which are the fundamental building block for all of Arcueid’s data types), and another pool for everything else, e.g. immutable memory for things like hash tables, irregularly sized cell structs for things like vectors, etc. This is a simplified version of the big bag of pages that recognizes only two basic objects: cells and non-cells.
There seem to be many different memory allocation algorithms out there of varying degrees of complexity, but for the time being let’s just keep things simple, and maintain just a basic free list sorted by ascending addresses in a block. We’ll optimize the cell case by maintaining separate memory pools for regular cells as they are the most common form of memory allocation. Expect to see this in version 0.0.11.