Programming languages offer a way to dynamically allocate new values. And if a language enforces memory safety, as most languages do, it uses an automatic memory management mechanism (such as tracing garbage collection) to ensure that allocated values are automatically freed when no longer needed. Most languages bake in a single memory management strategy, take it or leave it.
In Cone, allocated references handle these memory management responsibilities. They do so in a versatile way: each allocated reference specifies which memory management strategy allocates its value and manages its lifetime. One reference might use reference counting, another reference garbage collection, and a third a Rust-like single-owner model (RAII with escape analysis). Each memory strategy has distinct advantages and disadvantages. Cone's versatility allows a program to select the optimal mix of strategies that best satisfies its requirements.
A reference in Cone is either an allocated reference or a borrowed reference. Using an allocated reference looks much like using a borrowed reference, but there are differences. An allocated reference owns the value it points to, whereas a borrowed reference borrows from an owner. Thus, an allocated reference may be coerced to a borrowed reference, but not vice versa. Handling allocated references sometimes carries runtime costs that borrowed references never incur. And unlike a borrowed reference, an allocated reference cannot point to a substructure of a value.
This page describes how allocated references are declared, shows how they are created, and explains the memory management mechanisms that allocators use to ensure memory safety.
Allocated Reference Declarations
As with borrowed references, the type declaration for an allocated reference begins with &. The reference mechanics follow:
- Lifetime. Allocated references never specify a lifetime, as it is inferred.
- Allocator. This is the memory management strategy used to allocate a value and manage its lifetime. The details are covered later on this page. The presence of a specified allocator is how you know it is an allocated reference.
- Permission. The uni permission is the default given when no permission is specified. uni grants a reference the ability to view or change the reference's value. This is the universal donor permission, allowing mutation and inter-thread movement, but restricting aliasing to a single live reference. Extensive information about uni and other permissions is covered on the permissions page.
- Value Type. This specifies the type of the value that the reference points to.
To summarize, allocated reference type declarations always need to specify the allocator and value type. A permission is specified when different from uni. For example:
imm aref1 &own i32
imm aref2 &rc mut i32
Creating an Allocated Reference
There is only one way to create an allocated reference: Specify the & operator, followed by the allocator, an optional permission, and then either specify its initial value or use a type initializer:
imm ownref = &own 45                  // Allocate a new boxed value, initialized with 45
imm gcref = &gc mut Point3(1f, 2f, 3f) // Allocate a new Point3 value using an initializer
Creating an allocated reference is quite different than creating a borrowed reference. With a borrowed reference, we are simply obtaining the already-known address of some value. With an allocated reference, we are actually allocating a new place in memory with enough space to hold a value of the desired type. That memory space is initialized using the provided value or initializer. The returned allocated reference points to this newly created value.
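The difference between borrowing and allocating can be illustrated with a rough Rust analogue. This is not Cone syntax: `Box<i32>` stands in for an owning allocated reference, and the function name `borrowed_and_allocated` is made up for this sketch.

```rust
// Rust analogue only; this is not Cone syntax.
fn borrowed_and_allocated() -> (i32, i32) {
    let local = 45;
    // Borrowing: take the already-known address of an existing value.
    let borrowed: &i32 = &local;
    // Allocating: reserve new heap space and initialize it with 45.
    let allocated: Box<i32> = Box::new(45);
    // An owning reference can be coerced to a borrowed one, not the reverse.
    let reborrowed: &i32 = &allocated;
    (*borrowed, *reborrowed)
}

fn main() {
    let (a, b) = borrowed_and_allocated();
    println!("{} {}", a, b);
}
```

Both references read the same number, but only the second one is backed by memory that was allocated for it.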
An initializer is a certain kind of method defined by the value's type. It is typically named init. Its purpose is to fully initialize a value of that type. A type may define multiple initializers.
Initializers are very useful when the use of a literal would be:
- Insufficient, because initialization needs to perform some logic that is more than simply assigning value(s).
- Impossible, because parts of the type are private and can only be accessed by methods of the type.
- Inconvenient, because the literal would be unnecessarily verbose in comparison to using an initializer.
The type's declaration of an initializer looks like any other method and can accept parameters. It typically returns no value. self is always a reference. When used by an allocated reference, the reference passed to self is a newly allocated area of memory that still needs to be initialized.
The initializer for the Point3 example above might look like this, effectively filling in the x, y, and z fields of a Point3 struct:
fn init(self &, xi f32, yi f32, zi f32)
    x = xi; y = yi; z = zi
We know that an allocated reference is using an initializer when it explicitly names a type or a type's method in what looks like a function call.
// Point3 is a type. Point3 is de-sugared to Point3::init
imm gcref = &gc mut Point3(1f, 2f, 3f)
A type may also define a single drop method. A program may never call this method explicitly. Instead, it is always called implicitly just before an allocated value is freed.
The purpose of the drop method is to give back any acquired resources or dependencies before the allocated value vanishes. It is useful for closing file or network handles, or unsubscribing from services.
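Rust's `Drop` trait behaves much like the drop method described here, so it can serve as an analogue. The sketch below is Rust, not Cone. The `Handle` type and its `log` field are hypothetical, added only so the order of events can be observed.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical resource type; `log` records events so their order is visible.
struct Handle {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Handle {
    // Called implicitly just before the value's memory is released.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("closed {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _h = Box::new(Handle { name: "file", log: Rc::clone(&log) });
        log.borrow_mut().push("using file".to_string());
    } // `_h` leaves scope here: drop runs, then the allocation is freed
    log.borrow_mut().push("after scope".to_string());
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{}", event);
    }
}
```

The program never calls `drop` itself. The cleanup entry lands between the two other log entries because the owning reference went out of scope at the end of the inner block.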
Cone supports several standard allocators, each corresponding to a specific memory management strategy. Allocators vary in how they allocate and free memory, as well as the lifetime algorithm they use to determine when it is safe to automatically free an allocated memory segment, knowing that no more usable references exist that point to it.
There is no one perfect approach to memory management. Each allocator has its specific advantages and disadvantages. The art lies in choosing the right mix of strategies that best meets the program's requirements for performance, memory utilization, and data structure complexity.
own - Single owner (RAII with escape analysis)
Only one allocated reference may point to any own-allocated value. That reference may be passed around from function to function or from thread to thread. The memory for that value is automatically dropped and freed when its single reference goes out of its last scope. 'own' is similar to Rust's Box<T> single-owner mechanism.
The key benefits are performance, deterministic drops, and memory efficiency. The performance benefit comes from avoiding the need for runtime bookkeeping, since the allocated value's lifetime can be determined at compile time. The biggest downside is its data structure inflexibility. It cannot safely handle data structures that require multiple allocated references to the same value, such as all cyclic and many directed acyclic graphs.
Note: An 'own' allocated reference only accepts the 'uni' or 'imm' permissions. It is not possible to alias an own allocated reference. Any attempt to do so simply moves the reference to a new binding.
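Since the text compares own to Rust's `Box<T>`, the move behavior can be sketched in Rust. This is an analogue rather than Cone code, and the function names `consume` and `demo_move` are invented for the example.

```rust
// Rust analogue of single ownership; not Cone syntax.
fn consume(b: Box<i32>) -> i32 {
    *b + 1
} // `b` goes out of its last scope here, so the heap value is freed

fn demo_move() -> i32 {
    let first = Box::new(41);
    // Assigning does not create an alias; it moves ownership to `second`.
    let second = first;
    // println!("{}", first); // would not compile: `first` was moved
    consume(second)
}

fn main() {
    println!("{}", demo_move());
}
```

No counter or collector is involved. The compiler decides where the value is freed by tracking the single owner.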
rc - Reference Counting
When a new rc reference is created, it is initialized with a counter that is set to 1. Every time an alias (copy) is made of an rc allocated reference, the counter is incremented. Whenever a reference goes out of scope, the counter is decremented. When the counter reaches 0, the value is dropped and freed.
The key benefits are memory efficiency, deterministic drops, a simple mechanism, and much better data structure flexibility than own. The biggest downsides are performance (due to cache-unfriendly runtime bookkeeping) and memory leakage when dealing with cyclic data structures. The latter weakness can sometimes be ameliorated through the use of weak references.
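Rust's `Rc` and `Weak` types follow the same counting scheme, so they can illustrate both the counter and the weak-reference workaround for cycles. The sketch is Rust, not Cone. The `Node` type and both function names are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Observe the counter as aliases are created and dropped.
fn counts() -> (usize, usize, usize) {
    let a = Rc::new(5);
    let at_creation = Rc::strong_count(&a);
    let b = Rc::clone(&a); // alias: counter is incremented
    let with_alias = Rc::strong_count(&a);
    drop(b); // alias goes away: counter is decremented
    let after_drop = Rc::strong_count(&a);
    (at_creation, with_alias, after_drop)
}

// Hypothetical tree node: strong link down to the child, weak link back up.
struct Node {
    parent: RefCell<Weak<Node>>,
    child: RefCell<Option<Rc<Node>>>,
}

fn weak_breaks_cycle() -> bool {
    let parent = Rc::new(Node {
        parent: RefCell::new(Weak::new()),
        child: RefCell::new(None),
    });
    let child = Rc::new(Node {
        parent: RefCell::new(Rc::downgrade(&parent)),
        child: RefCell::new(None),
    });
    *parent.child.borrow_mut() = Some(Rc::clone(&child));

    let weak_parent = Rc::downgrade(&parent);
    drop(parent); // the weak back-link does not keep the parent alive
    let parent_gone = weak_parent.upgrade().is_none();
    let child_sees_none = child.parent.borrow().upgrade().is_none();
    parent_gone && child_sees_none
}

fn main() {
    println!("{:?} {}", counts(), weak_breaks_cycle());
}
```

Had the back-link been a strong reference, the two counters would have kept each other above zero and neither node would ever be freed.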
gc - Tracing Garbage Collection
A record is kept of every new allocation. Periodically, the garbage collector traces out all references reachable from a required root collection (including the execution stack). Any allocated value that cannot be traced from the root is considered unreachable, and therefore is safe to automatically drop and free.
The primary benefit is its flexibility: it supports any kind of data structure, and usually does so with better performance than rc. The chief drawbacks are its runtime bookkeeping costs (regular tracing and sweep cycles), unpredictable stop-the-world lag spikes, non-deterministic (and possibly delayed) drop and free, and the complexity of its implementation and tuning (particularly for multi-threaded garbage collection).
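The trace-then-free cycle can be sketched with a toy mark-and-sweep collector. This is a simplified illustration in Rust under stated assumptions, not how Cone's gc allocator is implemented: the heap is a vector of slots, references are slot indices, and `Heap`, `Object`, and `demo` are hypothetical names.

```rust
// Toy heap: objects live in vector slots, references are slot indices.
struct Object {
    refs: Vec<usize>, // outgoing references to other slots
    marked: bool,
}

struct Heap {
    slots: Vec<Option<Object>>, // the record of every allocation
}

impl Heap {
    fn new() -> Self {
        Heap { slots: Vec::new() }
    }

    fn alloc(&mut self) -> usize {
        self.slots.push(Some(Object { refs: Vec::new(), marked: false }));
        self.slots.len() - 1
    }

    fn add_ref(&mut self, from: usize, to: usize) {
        if let Some(obj) = self.slots[from].as_mut() {
            obj.refs.push(to);
        }
    }

    // Mark everything reachable from the roots, then free the rest.
    // Returns how many objects were freed.
    fn collect(&mut self, roots: &[usize]) -> usize {
        let mut worklist: Vec<usize> = roots.to_vec();
        while let Some(i) = worklist.pop() {
            if let Some(obj) = self.slots[i].as_mut() {
                if !obj.marked {
                    obj.marked = true;
                    worklist.extend(obj.refs.iter().copied());
                }
            }
        }
        let mut freed = 0;
        for slot in self.slots.iter_mut() {
            let marked = slot.as_ref().map(|o| o.marked);
            match marked {
                // Reachable: keep it and reset the mark for the next cycle.
                Some(true) => slot.as_mut().unwrap().marked = false,
                // Unreachable: free it.
                Some(false) => {
                    *slot = None;
                    freed += 1;
                }
                None => {}
            }
        }
        freed
    }

    fn is_live(&self, i: usize) -> bool {
        self.slots[i].is_some()
    }
}

fn demo() -> (usize, bool, bool) {
    let mut heap = Heap::new();
    let root = heap.alloc();
    let kept = heap.alloc();
    heap.add_ref(root, kept);
    // Two objects that reference each other but are unreachable from the root.
    let a = heap.alloc();
    let b = heap.alloc();
    heap.add_ref(a, b);
    heap.add_ref(b, a);
    let freed = heap.collect(&[root]);
    (freed, heap.is_live(kept), heap.is_live(a))
}

fn main() {
    println!("{:?}", demo());
}
```

The unreachable cycle is reclaimed here, which is the case that plain reference counting cannot handle. The price is the extra pass over every allocation on each collection.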
arena - Regional
This strategy makes efficient use of a pre-allocated (and growable) arena of memory. Every new allocation takes a bite out of this arena using a fast bump pointer. Allocations are never individually freed. The entire slab is freed as a single event when the arena itself goes out of scope.
The primary benefit is speed: allocation and free are much faster than malloc or equivalent, and there are no run-time bookkeeping costs. The chief drawbacks are memory waste (because nothing is freed until the arena goes out of scope) and the inability to support drop finalizers.
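A bump-pointer arena can be sketched in a few lines. The Rust code below is a toy model, not Cone's arena allocator: it hands out byte offsets into one buffer rather than real addresses, and `Arena`, its methods, and `demo` are hypothetical names.

```rust
// Toy arena: hands out byte ranges from one buffer by bumping an offset.
struct Arena {
    buf: Vec<u8>,
    next: usize, // the bump pointer: first unused byte
}

impl Arena {
    fn with_capacity(cap: usize) -> Self {
        Arena { buf: vec![0; cap], next: 0 }
    }

    // Returns the start offset of a `size`-byte region whose offset is
    // aligned to `align` (assumed to be a power of two).
    fn alloc(&mut self, size: usize, align: usize) -> usize {
        let start = (self.next + align - 1) & !(align - 1);
        let end = start + size;
        if end > self.buf.len() {
            // Grow the arena; offsets already handed out stay valid.
            let new_len = end.max(self.buf.len() * 2);
            self.buf.resize(new_len, 0);
        }
        self.next = end;
        start
    }

    fn capacity(&self) -> usize {
        self.buf.len()
    }
}

fn demo() -> (Vec<usize>, usize) {
    let mut arena = Arena::with_capacity(16);
    let offsets = vec![
        arena.alloc(1, 1),
        arena.alloc(4, 4),
        arena.alloc(8, 8),
        arena.alloc(4, 4), // does not fit in 16 bytes, so the arena grows
    ];
    (offsets, arena.capacity())
    // No individual frees: the whole buffer is released when `arena` is dropped.
}

fn main() {
    println!("{:?}", demo());
}
```

Allocation is an add, a mask, and a bounds check. There is deliberately no per-allocation free and no hook for running a finalizer, which mirrors the drawbacks listed above.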
pool - Fixed-size Pool
Similar to arena, this strategy makes use of a pre-allocated (and growable) memory area that has been divided up into identically-sized slots. Each new allocation quickly grabs some unused slot. Reference counting is used to determine when an allocation is freed (marking its slot as reusable).
The primary benefits of pool are memory efficiency and much faster allocation and freeing of size-limited values than is possible using malloc. The chief drawbacks are the size limits on values and the runtime bookkeeping costs of its underlying lifetime algorithm.
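The slot reuse and reference counting described above can be modeled with a small sketch. It is written in Rust and is only an illustration under assumed details, not Cone's pool allocator: the 16-byte slot size, the free list, and the names `Pool`, `retain`, `release`, and `demo` are all hypothetical.

```rust
// Toy pool of identically sized slots with a per-slot reference count.
struct Pool {
    slots: Vec<[u8; 16]>, // every slot holds exactly 16 bytes
    counts: Vec<u32>,     // reference count per slot; 0 means free
    free: Vec<usize>,     // indices of reusable slots
}

impl Pool {
    fn new(n: usize) -> Self {
        Pool {
            slots: vec![[0; 16]; n],
            counts: vec![0; n],
            free: (0..n).rev().collect(),
        }
    }

    // Grab an unused slot, growing the pool by one slot if none is free.
    fn alloc(&mut self) -> usize {
        let i = match self.free.pop() {
            Some(i) => i,
            None => {
                self.slots.push([0; 16]);
                self.counts.push(0);
                self.slots.len() - 1
            }
        };
        self.counts[i] = 1;
        i
    }

    // A new alias to slot `i` was created.
    fn retain(&mut self, i: usize) {
        self.counts[i] += 1;
    }

    // An alias to slot `i` went away; the last one marks the slot reusable.
    fn release(&mut self, i: usize) {
        self.counts[i] -= 1;
        if self.counts[i] == 0 {
            self.free.push(i);
        }
    }
}

fn demo() -> (usize, usize, usize, usize) {
    let mut pool = Pool::new(2);
    let a = pool.alloc();
    let b = pool.alloc();
    pool.retain(a);
    pool.release(a); // one alias remains, slot stays in use
    pool.release(a); // count reaches 0, slot becomes reusable
    let c = pool.alloc(); // reuses the slot that `a` occupied
    let d = pool.alloc(); // no free slot left, so the pool grows
    (a, b, c, d)
}

fn main() {
    println!("{:?}", demo());
}
```

Allocation and free are a pop and a push on the free list, which is where the speed comes from. The fixed slot size is also why values larger than a slot cannot be stored.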