Type erasure
Type erasure is a generic concept in progamming, where different types can be hidden behind a single, generic interface.
You probably already did type erasure in C: passing void*
pointer around and casting them!
This is a simple case of type erasure, which is not type safe. A tagged union also counts, although it only provides erasure for a pre-defined set of types.
std::function
std::function<R(Args...)>
is a type-erased polymorphic function object that can wrap all kinds of callables with the specified argument types Args...
and a return type that is convertible to R
.
It stores the provided callable (with ownership) by (forwarding) copy/move. Dynamic allocation is required in the general case, however implementations may use SBO (Small Buffer optimization).
It’s suitable for use at API boundaries, e.g. when providing a callback, with owning/by-value semantics.
Implementation
The actual type erasure happens inside the constructor. Besides storing the callable type-erased (either by dynamic allocation or SBO), std::function
needs to record information about how to
do operator()
later. Since std::function
is copyable, it also needs to remember how to copy the callable!
There’s even more: If SBO is in effect, to move an std::function
would require invoking the callable’s move constructor. Otherwise we just move the pointer.
All of these has to be done inside the constructor of std::function
since the type information of the callable is only available there.
Here’s a limited implementation I wrote. It has no SBO, and doesn’t support all kinds of callables (e.g. member function pointers):
|
|
It’s a rough demo only, but you get the idea: “remember” how to do things later by storing function pointers.
ptr
is the stored callable.
func_ptrs
is a group of function pointers, for managing stored callable of type T
.
call_ptr
records how to call the type-erased callable.
Another implementation would be an abstract base functor, which defines the virtual operator()
with desired return type and arg types. In fact we’ve just implemented something similar to vtables!
Performance
The performance cost mainly stems from dynamic allocation and function pointer redirections.
Ownership
std::function
has ownership over the contained callable. Since C++26, there is also std::function_ref
which stores non-owning reference to the actual callable, to avoid additional allocation.
std::any
As its name suggests, std::any
is a container, that can wrap an object of any type. The type info is checked on access, so it provides type safety, compared to void*
.
Besides that, std::any
has clear owning semantics, while void*
pointers have no inherent ownership.
Since the size of the object can be arbitary, it generally requires dynamic allocation (so it’s more expensive compared to std::variant
).
The implementation of any_cast
requires no RTTI, since std::any
does not need to know the exact type of the contained object, it only needs to check if the requested type matches the actual type when doing any_cast
. This is typically achieved by storing a pointer to a static member function of a class template that is instantiated by the stored type. Comparing types is thus equivalent to comparing pointers to templated functions.
That being said, std::any::type()
is available for obtaining runtime typeinfo of the contained object (It does require RTTI, by storing type_info). This allows flexible visitation.
std::shared_ptr
What has std::shared_ptr
to do with type-erasure? Well, it stores the deleter and allocator type-erased, inside the shared control block, so it’s only templated by the object type.
In a typical implementation, shared_ptr
holds only two pointers: one to the shared object, one to the “control block” that is also shared by all replicated instances. The control block will hold the pointer to the managed object (or the object itself), the deletor and the allocator (both are type-erased), and the ref-counting numbers.
A generic deleter can be supplied during construction, then stored (type-erased) inside the control block.
std::variant
A type-safe union. An instance of std::variant
at any given time either holds a value of one of its alternative types, or in the case of error - no value.
As with unions, if a variant holds a value of some object type T
, the object representation of T
is directly within the object representation of the variant itself. Variant is not allowed to allocate additional (dynamic) memory.
It’s usually implemented with placement new, on a buffer that is large enough to hold all kinds of elements, and has appropriate alignment. Or a tagged union.
Visitation
std::visit
works on std::variant
by taking a generic callable that can be applied to any type held by the variant object. If the visitor function is not exhaustive, you get a compile time error.
This allows unified, type-safe operations on std::variant
objects.
Implementation
A typical implementation can generate (at compile time) a table of function pointers to the resolved/instantiated function for every value type of the variant. At runtime, the stored type index is used to select the right function.
Again, function pointers!
Other sum types
std::optional
, std::expected
, …
In general, sum types sits between actual type erasure that allows wrapping arbitary types at runtime, and fully static polymorphism.
Type-erased iterator
E.g. boost::any_range
, any_iter
. Such iterators wraps any iterator that satisfy some behaviors, such as being forward iterators. They’re especially useful at API boundaries, to provide support for abitary iterator/ranges, without compile-time monomorphization.