Recursive references
Recursive data structures are data types that refers to themselves. Chances are you’re already familiar with linked lists:
| |
A ListNode owns some data, and contains a (non-owning, nullable) reference to the next node.
Why does this work? Because C/C++ allows to define a reference to an incomplete type. If we leave out the pointer part of next, the compiler should complain about something like this:
error: field ‘next’ has incomplete type ‘ListNode’
note: definition of ‘struct ListNode’ is not complete until the closing brace
…, which is self-explanatory: During the definition of a struct, the struct itself is seen as an incomplete type. An incomplete type has unknown size, so the compiler couldn’t figure out how much storage it’ll use.
However, references/pointers to incomplete types are allowed here, because they have fixed size, in terms of implementations.
Ownership
For simple one-way linked lists, it is possible to replace the raw pointer with std::unique_ptr<ListNode>, so the first node automatically owns the entire list by recursion.
Recursive variant
std::variant is often employed to represent data with runtime-variable type. A recursive variant is a variant type that is able to store itself, which is useful for hierarchical data structures. For example, a property list structure could represent its node using using Node = std::variant<t1, t2, t3, ... , Self>, where Self is some type that contains another (child) Node. An instance of Node is a leaf node if it’s not of type Self.
We cannot simply write using Node = std::variant<t1, ..., Node>, since recursive using is simply not allowed. Some sort of indirection struct is required here. We could forward declare a struct that contains Node, and use it with the variant. What if:
| |
Again, this won’t work either. To determine sizeof(Node) would be an infinite recursion.
Simply wrap the forward declared struct inside a smart pointer:
| |
This way, the parent node has ownership over the child node (if it has one). The additional price we pay is dynamic memory allocation.
Let’s test this with some examples:
| |
This should print:
| |
Inside main(), v1 is constructed, then moved into v2 as a child. v2 is again transferred to v3. Since v1 contains a plain int, the first move is equivalent to a copy. Semantically, v3 owns all the data, and is automatically released when dropped.
The Visitor prints data contained in V, and if it has a child node, visit that child node recursively.
Removing the wrapper type
We had to wrap the Node inside a wrapper type, since forward declaration doesn’t make sense for a using alias. We could get around this via inheritance:
| |
Use with map
V isn’t particularly useful, it only serves as a demonstration. In practice, we’re more likely to place the variant in some container, e.g.:
| |
The intent is clear: Map should be able to store instances of itself as its values.
The above approach requires two dynamic-allocated indirections, one of which (the unique_ptr) is redundant.
We could go the other way around, by declaring the value type first:
| |
Or, to make our intent clearer:
| |
Application: A heterogeneous data structure
We could represent a JSON-like data structure that can hold heterogeneous data recursively, using the tricks demonstrated above.
The data representations:
| |
We could serialize Value into string, recursively:
| |
The visitor is a generic lambda that can be applied to all types represented by Value.
It recurses into Array or Map types.
To put things together:
| |
It prints:
| |
, as expected.