The Surprising Slowness of C++'s std::variant
I work on my own scripting language in my spare time. In this language, scripters do not specify the type of variables. Instead, variables change type depending on what scripters put into them.
This is what is commonly called a variant.
C++ comes with its own variant class, but it is intended for storing completely distinct
types. Since I do not really want to care what type a scripter used, and would rather
perform implicit conversions, I could actually implement this as a class hierarchy, with a
common base class like:
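#include <cstdint>
#include <string>
using namespace std;  // the snippets in this post use unqualified names

class Value {
public:
    virtual ~Value() = default;
    // Pure virtual: every subclass must provide a conversion to and from each type.
    virtual int64_t GetAsInt64() const = 0;
    virtual void SetAsInt64(int64_t) = 0;
    virtual string GetAsString() const = 0;
    virtual void SetAsString(const string&) = 0;
};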
and subclasses then look like this:
class Int64Value : public Value {
public:
    Int64Value(int64_t n = 0) : mValue(n) {}
    int64_t GetAsInt64() const override { return mValue; }
    void SetAsInt64(int64_t n) override { mValue = n; }
    string GetAsString() const override { return to_string(mValue); }
    // atoll() takes a C string, so pass c_str() rather than the string itself.
    void SetAsString(const string& n) override { mValue = atoll(n.c_str()); }
protected:
    int64_t mValue;
};
The problem is that this gives us a common interface that lets us treat all types the
same, but we still have to know at compile time what type a variable will be. We still
haven’t managed to make a variable change type. To do that, you’d have to use operator
new: you would delete the object and new it under a different type. But that would be bad,
as it would allocate a new block on the heap for each int/string you wish to use.
Performance would be ridiculously slow. Ideally, we’d want these to be held in-line (or on
the stack) like any other type.
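To make the cost concrete, here is a minimal sketch of that heap-based approach. The HeapVariable class and the cut-down one-method Value interface are hypothetical, made up just for this illustration; the point is only that every type change goes through the allocator.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Cut-down version of the interface, just enough for the illustration.
struct Value {
    virtual ~Value() = default;
    virtual std::string GetAsString() const = 0;
};

struct Int64Value : Value {
    explicit Int64Value(std::int64_t n) : mValue(n) {}
    std::string GetAsString() const override { return std::to_string(mValue); }
    std::int64_t mValue;
};

struct StringValue : Value {
    explicit StringValue(std::string s) : mValue(std::move(s)) {}
    std::string GetAsString() const override { return mValue; }
    std::string mValue;
};

// Hypothetical variable slot: changing the type deletes the old object
// and allocates a new one on the heap.
class HeapVariable {
public:
    void SetAsInt64(std::int64_t n) { mValue = std::make_unique<Int64Value>(n); }
    void SetAsString(const std::string& s) { mValue = std::make_unique<StringValue>(s); }
    std::string GetAsString() const {
        return mValue ? mValue->GetAsString() : std::string();
    }
private:
    std::unique_ptr<Value> mValue;  // one heap block per variable
};
```

A variable declared as HeapVariable can go from integer to string with two setter calls, but each call pays for an allocation and a deallocation, which is exactly what we want to avoid in an interpreter loop.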
Usually, you would use a union for that:
union ValueUnion {
int64_t mInteger;
string mString;
};
But unions don’t get along with types that have a constructor or destructor, like string:
a union doesn’t keep track of which member is currently active, so it wouldn’t know
whether it has to call ~string() or not. Theoretically you’d use variant for that (but
I’ll mention below why this doesn’t work here).
So what do we do? I didn’t know, until one day I remembered that C++ has what is called
placement new. Placement new lets you provide the storage, and will then call an object’s
constructor for you. It is intended for classes like vector, which allocates one large
block for all elements and then places the objects inside it one after the other. So I
added the following class:
union VariantUnion {
    VariantUnion() {}
    ~VariantUnion() {}
    Int64Value mInteger;
    StringValue mString;
};
class alignas(VariantUnion) VariantValue : public Value {
public:
    // Construct the initial underlying object directly inside mStorage.
    VariantValue(int64_t n = 0) { new (mStorage) Int64Value(n); }
    VariantValue(const string& n) { new (mStorage) StringValue(n); }
    // Destroy whichever Value subclass currently lives in mStorage.
    ~VariantValue() { ((Value*)mStorage)->~Value(); }
    int64_t GetAsInt64() const { return ((Value*)mStorage)->GetAsInt64(); }
    void SetAsInt64(int64_t n) { ((Value*)mStorage)->SetAsInt64(n); }
    ...
protected:
    // Sized via the union so it can hold any of the supported types.
    uint8_t mStorage[sizeof(VariantUnion)];
};
So basically its only job is to forward all calls to the underlying object. It also
creates the underlying object using placement new, and destroys it again by calling the
destructor directly, which is the counterpart to placement new. We have to provide the
storage ourselves (which we do by making sure that mStorage is large enough to hold either
of our possible types), but that is what we want, because we can just declare our storage
in-line as a fixed-size array of bytes and forgo the extra allocation.
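Stripped of everything else, the construct/destroy pairing described above looks roughly like the sketch below. Tracked is a hypothetical helper type that merely counts live instances, so you can see that the constructor and destructor both run even though nothing is allocated on the heap.

```cpp
#include <new>

// Hypothetical helper type that counts how many instances are currently alive.
struct Tracked {
    static inline int sAlive = 0;  // C++17 inline variable
    int mPayload;
    explicit Tracked(int payload) : mPayload(payload) { ++sAlive; }
    ~Tracked() { --sAlive; }
};

inline int PlacementNewDemo() {
    // We provide the storage: a suitably sized and aligned byte array on the stack.
    alignas(Tracked) unsigned char storage[sizeof(Tracked)];

    // Placement new only runs the constructor; it does not allocate anything.
    Tracked* t = new (storage) Tracked(42);
    int seen = t->mPayload;

    // There is no matching "delete" to call here. We run the destructor
    // by hand, and the storage goes away with the stack frame.
    t->~Tracked();
    return seen;
}
```

After PlacementNewDemo() returns, Tracked::sAlive is back to where it started, because the explicit destructor call balanced the placement new.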
So how do we make it possible to change a type? We need to destroy the current Value
subclass in mStorage and allocate a new one using placement new. I did that in my
subclasses by implementing the setters for all other types:
class Int64Value : public Value {
public:
    ...
    void SetAsString(const string& n) {
        // Destroy ourselves, then construct a StringValue in the same storage.
        ((Value*)this)->~Value();
        new (this) StringValue(n);
    }
    ...
};
This way, if a variable is already an int64_t, SetAsInt64() will just change
Int64Value::mValue, but if it is another type, the setter will re-create the object in
place under the new type.
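Putting the pieces together, a self-contained sketch of the whole idea could look like the following. It is not my actual interpreter code: StringValue's body, the Get() helper, the use of std::launder and putting alignas on the storage member instead of on the class are all choices made for this illustration.

```cpp
#include <cstdint>
#include <cstdlib>
#include <new>
#include <string>

class Value {
public:
    virtual ~Value() = default;
    virtual std::int64_t GetAsInt64() const = 0;
    virtual void SetAsInt64(std::int64_t n) = 0;
    virtual std::string GetAsString() const = 0;
    virtual void SetAsString(const std::string& s) = 0;
};

class Int64Value : public Value {
public:
    explicit Int64Value(std::int64_t n = 0) : mValue(n) {}
    std::int64_t GetAsInt64() const override { return mValue; }
    void SetAsInt64(std::int64_t n) override { mValue = n; }
    std::string GetAsString() const override { return std::to_string(mValue); }
    void SetAsString(const std::string& s) override;  // changes type, see below
private:
    std::int64_t mValue;
};

class StringValue : public Value {
public:
    explicit StringValue(const std::string& s) : mValue(s) {}
    std::int64_t GetAsInt64() const override { return std::atoll(mValue.c_str()); }
    void SetAsInt64(std::int64_t n) override;  // changes type, see below
    std::string GetAsString() const override { return mValue; }
    void SetAsString(const std::string& s) override { mValue = s; }
private:
    std::string mValue;
};

// The type-changing setters destroy *this and construct the other type in the
// same bytes. This is only safe while the object lives inside VariantValue's
// storage, which is big enough for either type. Nothing may touch `this`
// after the placement new.
inline void Int64Value::SetAsString(const std::string& s) {
    this->~Int64Value();
    new (this) StringValue(s);
}
inline void StringValue::SetAsInt64(std::int64_t n) {
    this->~StringValue();
    new (this) Int64Value(n);
}

union VariantUnion {  // only used to compute size and alignment
    VariantUnion() {}
    ~VariantUnion() {}
    Int64Value mInteger;
    StringValue mString;
};

class VariantValue : public Value {
public:
    explicit VariantValue(std::int64_t n = 0) { new (mStorage) Int64Value(n); }
    explicit VariantValue(const std::string& s) { new (mStorage) StringValue(s); }
    ~VariantValue() override { Get()->~Value(); }
    VariantValue(const VariantValue&) = delete;  // copying raw bytes would be wrong
    VariantValue& operator=(const VariantValue&) = delete;

    std::int64_t GetAsInt64() const override { return Get()->GetAsInt64(); }
    void SetAsInt64(std::int64_t n) override { Get()->SetAsInt64(n); }
    std::string GetAsString() const override { return Get()->GetAsString(); }
    void SetAsString(const std::string& s) override { Get()->SetAsString(s); }

private:
    // Assumes the Value base sits at offset 0 of both subclasses, which holds
    // for single inheritance on the common ABIs. std::launder tells the
    // compiler that the object behind the pointer may have been replaced.
    Value* Get() { return std::launder(reinterpret_cast<Value*>(mStorage)); }
    const Value* Get() const {
        return std::launder(reinterpret_cast<const Value*>(mStorage));
    }

    alignas(VariantUnion) unsigned char mStorage[sizeof(VariantUnion)];
};
```

With this, a VariantValue constructed from 41 can be given SetAsString("hello") and later SetAsInt64(7), and each getter goes through a single virtual call to whatever object currently occupies the storage.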
I know this code is scary: You need to make sure that VariantUnion contains all your
supported types, otherwise you might over-run your memory and cause hard-to-find bugs. You
also need to make sure you use the proper alignment, because an array of uint8_t is
usually aligned on 1-byte boundaries, which is invalid for e.g. an int64_t on many
platforms. And since we’re overriding the alignment, you can’t have any other member
variables in your VariantValue without thinking through whether that will mis-align
mStorage. It also involves Int64Value destructing itself while its method is running and
constructing a new object in its place. And finally, Int64Value makes assumptions about
how much storage its containing class has allocated for it. That’s not proper
encapsulation.
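Some of these hazards can at least be turned into compile errors. The sketch below is one possible way to do that, not what my code currently does: InlineStorage, Fits() and ValueStorage are hypothetical names, and plain int64_t and string stand in for the real Value subclasses.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: a byte buffer that is large enough and aligned
// strictly enough for every type in the list.
template <typename... Ts>
struct InlineStorage {
    static constexpr std::size_t kSize = std::max({sizeof(Ts)...});
    static constexpr std::size_t kAlign = std::max({alignof(Ts)...});

    // Putting alignas on the member (instead of on the containing class)
    // keeps the buffer aligned even if other members are added around it.
    alignas(kAlign) unsigned char mBytes[kSize];

    // True if an object of type T can safely be constructed in mBytes.
    template <typename T>
    static constexpr bool Fits() {
        return sizeof(T) <= kSize && alignof(T) <= kAlign;
    }
};

// Stand-in for the storage of our two supported types.
using ValueStorage = InlineStorage<std::int64_t, std::string>;

// A type that is constructed in place can check, at compile time, that the
// storage it assumes really has room for it.
static_assert(ValueStorage::Fits<std::int64_t>(), "storage too small for int64_t");
static_assert(ValueStorage::Fits<std::string>(), "storage too small for string");
```

A static_assert like this next to every placement new would not fix the encapsulation problem, but forgetting to add a new type to the list would then fail the build instead of silently over-running memory.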
On the plus side, though, usage of this class is beautifully straightforward. You just
call SetAsString() or GetAsString() and it will magically do the right thing, or throw an
exception if it can’t convert.
So What’s Performance Like?
In a quick test, I used an array<int64_t> as a baseline, running a loop of 1’000’000
iterations with my programming language. That took 10 ms.
Switching to the above approach, which of course replaces direct memory accesses with
function calls and a bit of virtual dispatch overhead for each GetAsInt64() and
SetAsInt64() call, doubled the runtime to 20 ms on my Mac, and to about 30 ms on Windows.
Then I tried using C++’s variant:
inline string GetAsString() const {
    return std::visit([](auto&& arg) {
        using T = std::decay_t<decltype(arg)>;
        if constexpr (std::is_same_v<T, int64_t>) {
            return to_string(arg);
        } else if constexpr (std::is_same_v<T, string>) {
            return arg;
        } else {
            // always_false_v is not part of the standard library; it is a
            // user-defined helper that is false for every T.
            static_assert(always_false_v<T>, "non-exhaustive visitor!");
        }
    }, mValue);
}
inline void SetAsString(const string& n) {
    mValue = n;
}
...
variant<int64_t, string> mValue;
...
This took a whopping 100 ms. Given that this involved a lot of fancy new constructs like
lambdas and generics, I tried going more old-school and finding the type via
variant::index() for comparison:
inline string GetAsString() const {
    switch (mValue.index()) {
        case 0:
            return to_string(get<int64_t>(mValue));
        case 1:
            return get<string>(mValue);
    }
    // Only reached if the variant holds no value at all.
    return string();
}
This actually got us down to 60 ms. But it also costs us the compile-time type safety: if
someone adds a type in the middle of the variant<int64_t, string>’s list of types instead
of at the end, we get a runtime exception. It also does not raise any errors if we forget
to handle a type, the way our pure virtual methods do. I also suspect that variant
performs an extra type check on every assignment, whereas our polymorphic approach knows
when a type doesn’t change and will just go through virtual dispatch, which likely is
better for caches.
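One way to get some of that compile-time safety back while keeping the index() switch is to pin the meaning of each index with static_assert. This is only a sketch, and I have not measured whether it affects the timings above; ScriptValue and the free function GetAsString() are hypothetical names for this example.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>

using ScriptValue = std::variant<std::int64_t, std::string>;  // hypothetical alias

// If someone reorders or extends the variant's type list, these fail to
// compile instead of throwing at runtime.
static_assert(std::variant_size_v<ScriptValue> == 2,
              "GetAsString() must be updated for the new alternative");
static_assert(std::is_same_v<std::variant_alternative_t<0, ScriptValue>, std::int64_t>,
              "index 0 is expected to be int64_t");
static_assert(std::is_same_v<std::variant_alternative_t<1, ScriptValue>, std::string>,
              "index 1 is expected to be string");

inline std::string GetAsString(const ScriptValue& v) {
    switch (v.index()) {
        case 0:
            // get_if by index returns a pointer and never throws.
            return std::to_string(*std::get_if<0>(&v));
        case 1:
            return *std::get_if<1>(&v);
    }
    return std::string();  // variant is valueless_by_exception
}
```

This still does not give the tidy conversion interface of the polymorphic version, but adding a third alternative to ScriptValue now breaks the build at the first static_assert.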
So we have a factor 3 slowdown at worst for the polymorphic approach, and a factor 6 to 10
slowdown for C++’s built-in variant. Plus, given that variant assumes wholly independent
types, the code for accessing a value as a certain type (and performing the appropriate
conversion) is a lot cleaner using polymorphism.
So is C++’s Standard Variant Bad?
I wouldn’t make that blanket statement. I haven’t got much experience with the class yet,
so I might just not be using the right call, and I have a very specific use case that
doesn’t quite match what variant was made to support. All I can say is that I encourage
you to not just stick variant into your core interpreter loop without comparing it to
other approaches. :-p