Note: Only the rc and so regions are currently supported.
To allocate a new value, we ask a region to get it done:
imm nbrref = +rc-imm 5 // the 'rc' region counts references
There are many kinds of regions, each using a different strategy for allocating memory and then later safely reclaiming that memory after an object has no references that point to it. Variations in their strategies can deliver noticeably different behavior with regard to:
- Throughput. Strategies vary greatly in runtime cycles consumed by allocating, freeing, and reference bookkeeping.
- Responsiveness. Some strategies offer excellent responsiveness to realtime demands, while others can experience erratic "lag spikes" while bookkeeping activity takes place.
- Data flexibility. Some strategies support all kinds of data structures. Others limit the structures they handle well to non-cyclic or hierarchical graphs.
- Determinism. Some strategies promptly release unneeded memory and resources. Others may wait a long time between last use and release, causing memory and resources to be tied up longer than desired.
- Memory use. Some strategies use a limited amount of memory effectively. Others have higher ratios of memory allocated-to-used or fragment memory more.
- Runtime footprint. Some strategies are easily implemented with little code. Others may require the presence of large, complex runtime logic.
Cone offers a wide selection of library-implemented regions, because no one strategy handles all these criteria well. This power of choice ensures each allocated object can select the optimal region that best serves the way the program needs that object to behave.
This page does not list every possible region that can be used by a Cone program, as new ones can be defined within the language itself. Instead, this page summarizes the five popular region strategies used to allocate new objects pointed at by an owning reference. The focus is to describe the mechanism used by each of these region strategies and assess how they compare on the above criteria. In reality, each of these region strategies may be implemented by multiple concrete regions that vary somewhat in how they work (e.g., +rc might only work within a single thread, whereas +arc might allow multiple threads to share references to the same object).
Single Owner
The single-owner region strategy's mechanism, strengths and weaknesses derive from a very simple constraint: there is only ever one owning reference to an object it allocates. That reference may be passed around from function to function or from thread to thread, but it does so using move mechanics to ensure no copy of the reference is ever made. Because there is only one reference to the object, the memory for that value can be automatically finalized and freed when the reference expires.
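The same move mechanics can be seen in Rust, whose `Box` type behaves much like a single-owner owning reference (an analogy, not Cone's implementation): ownership transfers on assignment or call, the compiler forbids using the old name afterward, and memory is freed when the sole owner expires.

```rust
// Hedged analogy: Rust's Box models the single-owner strategy.
// There is exactly one owning reference; passing it moves ownership,
// and the heap allocation is freed when the owner goes out of scope.

fn consume(owned: Box<i32>) -> i32 {
    // `owned` is dropped (and its memory freed) when this scope ends.
    *owned
}

fn main() {
    let nbr = Box::new(5);    // allocate; `nbr` is the sole owner
    let value = consume(nbr); // ownership moves into `consume`
    // `nbr` can no longer be used here: the compiler rejects any alias.
    assert_eq!(value, 5);
}
```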
Here is the simple definition of a single-owner region:
region @move so:    // @move ensures move semantics on references, i.e. no aliasing
    fn alloc(size usize) Option[*u8]:
        malloc(size)    // May use any general-purpose memory allocator
    fn free(self &uni so):
        free()
Single owner is a good choice if you are looking for:
- Better throughput than garbage-collected or ref-counted region strategies (as there is no runtime bookkeeping cost). Throughput will likely still be worse than arenas.
- Determinism, as reclamation happens immediately after the last reference expires. Indeed, single owner is the only region strategy that allows the programmer to decide exactly when the object is to be explicitly destroyed and to support exception handling in the event of an unsuccessful destruction.
- Efficient use of memory.
- Negligible runtime footprint.
The key shortcomings are:
- Data structures are limited to hierarchical graphs. Single owner cannot be used to construct cyclic graphs or most directed acyclic graphs, as these require multiple references to the same node.
- Use of these references must comply with the more-constrained move mechanics. Where moves are forbidden, workarounds (such as swaps) must be employed to extract a reference.
- Multiple owning references cannot point to the same object. That said, multiple borrowed references to the object may be obtained. These are considered temporary, because the lifetime of a borrowed reference is constrained to the scope in which it was created.
Reference Counting
As the name suggests, the rc region keeps a running counter of the number of references to each allocated object. When the object is first allocated, its counter starts at 1. Every time a copy is made of an rc owning reference, the counter is incremented. Whenever an owning reference goes out of scope, the counter is decremented. When the counter reaches 0, the object is reclaimed.
Here is the simple definition of a reference counting region, where aliasing happens automatically:
region rc:
    cnt u32
    fn alloc(size usize) Option[*u8]:
        malloc(size)
    fn init(self &uni rc):
        cnt = 1
    fn alias(self &uni rc):
        ++cnt
    fn dealias(self &uni rc):
        if (--cnt == 0)
            free()
    fn free(self &uni rc):
        free()
Variations on this algorithm may be built to support multi-threaded use (with atomics) and leverage move semantics and explicit aliasing (to improve throughput).
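Rust's `Rc` type implements this same counting scheme and exposes the counter, so it can illustrate the alias/dealias lifecycle (an analogy only, not Cone's rc region):

```rust
use std::rc::Rc;

fn main() {
    // Hedged analogy: Rust's Rc performs the same alias/dealias counting.
    let obj = Rc::new(5);                 // counter starts at 1
    assert_eq!(Rc::strong_count(&obj), 1);

    let alias = Rc::clone(&obj);          // copying the reference increments it
    assert_eq!(Rc::strong_count(&obj), 2);

    drop(alias);                          // an expiring reference decrements it
    assert_eq!(Rc::strong_count(&obj), 1);
    // When the last reference drops, the count hits 0 and the value is freed.
}
```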
Reference counting is a good choice if you are looking for:
- Efficient memory use for non-cyclic data structures.
- Semi-determinism, as reclamation happens promptly once the last reference expires (though exactly when the count reaches zero may be hard to predict).
- Minimal runtime footprint, as the mechanism is very simple.
The biggest shortcomings are:
- Throughput is typically the worst of these strategies, due to expensive allocation and free algorithms and frequent cache-unfriendly reference-count updates.
- Memory leaks, as reference cycles can keep unused data islands alive forever.
To prevent cycle-based memory leaks, one can use a variation that supports weak references to make any cycle breakable. A weak reference may be obtained from any owning reference:
childnode.parent = parentnoderef.weak
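Rust's `Rc`/`Weak` pair shows how a weak back-reference breaks a parent–child cycle (a hedged analogy using a hypothetical `Node` type, not Cone's mechanism): the weak edge does not increment the strong count, so the cycle cannot keep the pair alive.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type: a child holds only a weak reference to its
// parent, so the parent<->child cycle cannot keep either node alive.
struct Node {
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn main() {
    let parent = Rc::new(Node {
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });
    let child = Rc::new(Node {
        parent: RefCell::new(Rc::downgrade(&parent)), // weak: no count bump
        children: RefCell::new(vec![]),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));

    // The weak back-edge does not contribute to the strong count,
    // so dropping `parent` (and its child list) frees everything.
    assert_eq!(Rc::strong_count(&parent), 1);
    assert!(child.parent.borrow().upgrade().is_some());
}
```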
Tracing Garbage Collection
A tracing GC region captures metadata about every object it allocates, including detailed information about its references to other allocated objects. Periodically, the region's runtime garbage collector will trace out all references reachable from a root set of objects (including execution stacks). Any allocated value that cannot be traced from the root is considered unreachable, and therefore is safe to automatically finalize and free.
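The heart of this tracing step is a reachability walk over the object graph. Here is a minimal sketch of a mark phase (an illustrative assumption, not Cone's collector), with objects as indices and references as edges:

```rust
// Minimal sketch of a tracing GC's mark phase: a reachability
// walk from a root set. Anything left unmarked is garbage.
fn mark(edges: &Vec<Vec<usize>>, roots: &[usize]) -> Vec<bool> {
    let mut reachable = vec![false; edges.len()];
    let mut stack: Vec<usize> = roots.to_vec();
    while let Some(obj) = stack.pop() {
        if !reachable[obj] {
            reachable[obj] = true;     // mark the object as live
            stack.extend(&edges[obj]); // trace its outgoing references
        }
    }
    reachable // objects still false are safe to finalize and free
}

fn main() {
    // Objects 0..4; object 0 is a root; 3 and 4 form an unreachable cycle.
    let edges = vec![vec![1], vec![2], vec![], vec![4], vec![3]];
    let reachable = mark(&edges, &[0]);
    assert_eq!(reachable, vec![true, true, true, false, false]);
}
```

Note that the unreachable cycle (objects 3 and 4) is collected here, which is exactly the case reference counting cannot handle on its own.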
Tracing GC is a good choice if you are looking for:
- Data structure flexibility, supporting all kinds of data in the same way.
- Better throughput than rc (though slower than other regions due to its regular mark-and-sweep bookkeeping costs).
- Stable memory use over long runtimes.
The biggest shortcomings are:
- Poor responsiveness, due to unpredictable stop-the-world lag spikes.
- Poor determinism, as there can be a sizeable lag between last use and free.
- Large and complex runtime footprint, resulting from the need for sophisticated and tunable incremental, generational, and multi-threading algorithms for optimizing performance.
- Memory greedy, as optimal performance can require 4-6x more memory than the amount used.
Arena
An arena region begins life by pre-allocating a large (and growable) slab of memory. This is the arena. Every new allocation takes a bite out of this arena using a fast bump pointer. Allocations are never individually freed. The entire arena is freed as a single event when the program ends.
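Bump-pointer allocation is simple enough to sketch in a few lines (a hedged illustration, not Cone's arena implementation): each allocation just advances an offset into the slab.

```rust
// Hedged sketch of bump-pointer allocation: each allocation
// advances an offset into a pre-allocated slab. Nothing is
// individually freed; the whole slab is released at once.
struct Arena {
    slab: Vec<u8>,
    next: usize, // bump pointer: start of the unused area
}

impl Arena {
    fn new(capacity: usize) -> Arena {
        Arena { slab: vec![0; capacity], next: 0 }
    }

    // Returns the offset of `size` fresh bytes, or None if the slab
    // is full (a real arena would grow by chaining another slab).
    fn alloc(&mut self, size: usize) -> Option<usize> {
        if self.next + size > self.slab.len() {
            return None;
        }
        let offset = self.next;
        self.next += size; // a single add: no free list, no bookkeeping
        Some(offset)
    }
}

fn main() {
    let mut arena = Arena::new(1024);
    assert_eq!(arena.alloc(16), Some(0));
    assert_eq!(arena.alloc(8), Some(16));
    assert_eq!(arena.alloc(2000), None); // exceeds the slab
}
```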
Arena region is a good choice if you are looking for:
- Blazing-fast throughput. Bump-pointer allocation is much faster than malloc. Even better, there are no free or runtime bookkeeping costs.
- Excellent realtime responsiveness, as memory management never causes lag spikes.
- Data structure flexibility, supporting all kinds of data in the same way.
- Minimal runtime footprint.
The biggest shortcomings are:
- Memory use grows without bounds, due to individual objects never being freed.
- Awful determinism for finalizers, as objects are only finalized when the program ends.
An arena region is a surprisingly good fit for a limited-lifetime program that allocates a lot of small objects but wants optimal performance. This applies well to many command-line tools, including compilers.
Pool
The mechanism for pools derives from this constraint: all objects in a pool have the same type and size. A pool region begins with a pre-allocated (and growable) memory area that is logically partitioned into identically-sized slots. Allocating space for a new object is efficient: it either reuses a slot found on a linked-list of free slots, or else uses a bump pointer to carve out a new slot from the unused area.
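This free-list-plus-bump mechanism can be sketched briefly (a hedged illustration with a hypothetical `Pool` type, not Cone's library code):

```rust
// Hedged sketch of a fixed-size-slot pool: freed slots go on a
// free list; otherwise a bump index carves a new slot from the
// never-used area.
struct Pool {
    slots: Vec<Option<u64>>, // every slot holds the same fixed-size type
    free_list: Vec<usize>,   // indices of reclaimed slots, ready for reuse
    next: usize,             // bump index into the never-used area
}

impl Pool {
    fn new(capacity: usize) -> Pool {
        Pool { slots: vec![None; capacity], free_list: vec![], next: 0 }
    }

    fn alloc(&mut self, value: u64) -> Option<usize> {
        // Prefer reusing a reclaimed slot; else bump out a new one.
        let slot = match self.free_list.pop() {
            Some(reused) => reused,
            None if self.next < self.slots.len() => {
                self.next += 1;
                self.next - 1
            }
            None => return None, // pool exhausted (a real pool would grow)
        };
        self.slots[slot] = Some(value);
        Some(slot)
    }

    fn free(&mut self, slot: usize) {
        self.slots[slot] = None;
        self.free_list.push(slot); // slot becomes reusable
    }
}

fn main() {
    let mut pool = Pool::new(4);
    let a = pool.alloc(10).unwrap(); // slot 0, via bump
    let b = pool.alloc(20).unwrap(); // slot 1, via bump
    pool.free(a);
    let c = pool.alloc(30).unwrap(); // reuses slot 0 from the free list
    assert_eq!((a, b, c), (0, 1, 0));
}
```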
A pool region must be created before using it to allocate values:
mut eventPool RcPool[Event]
This example creates a new region called eventPool. All its allocated objects have the type Event (which might be an enum type). This pool is an RcPool, which means references to pool-allocated objects are reference-counted, thereby determining when to free each object. An SoPool would use single-owner references, and so on.
Once a pool region is defined, the region may be specified to allocate a new value:
imm newevent = &eventPool mut Event::QuitEvent[]
Pools are a good choice if you are looking for:
- Decent throughput. Pool-based allocation and free are faster than any strategy other than arenas.
- Memory efficiency. Pools are very good at reusing reclaimed memory.
- Responsiveness and determinism, which vary according to the reference-freeing mechanism: SoPool and RcPool will do pretty well.
- Minimal runtime footprint.
The biggest shortcoming is:
- All values must have the same type.
When a program has a large number of objects of the same type being regularly allocated and freed, as is true in many games, pools can be a very useful strategy. This is particularly true when functions process all objects sequentially in a cache-friendly way.