The Surprising Slowness of C++’s std::variant

I work on my own scripting language in my spare time. In this language, scripters do not specify the types of variables; a variable changes type depending on what the scripter puts into it.

This is what is commonly called a variant.

C++ comes with its own variant class, std::variant, but it is intended for storing completely distinct types. Since I do not really want to care what type a scripter used, and want to perform implicit conversions, I could instead implement this as a class hierarchy with a common base class like:

class Value {
public:
	virtual ~Value() = default;
	
	virtual int64_t GetAsInt64() const = 0;
	virtual void SetAsInt64(int64_t) = 0;
	
	virtual string GetAsString() const = 0;
	virtual void SetAsString(const string&) = 0;
};

and subclasses then look like this:

class Int64Value : public Value {
public:
	Int64Value(int64_t n = 0) : mValue(n) {}
	
	int64_t GetAsInt64() const override { return mValue; }
	void SetAsInt64(int64_t n) override { mValue = n; }
	
	string GetAsString() const override { return to_string(mValue); }
	void SetAsString(const string& n) override { mValue = atoll(n.c_str()); }
	
protected:
	int64_t mValue;
};

Now the problem is that this gives us a common interface that lets us treat all types the same, but we still have to know at compile time what type a variable will be. We haven’t managed to make a variable change type. To do that, you’d have to allocate the object with operator new: to change its type, you would delete it and new it again under the new type. But that would be bad, as it would allocate a new block on the heap for each int/string you wish to use, and one heap allocation per value makes for very slow code. Ideally, we’d want these to be held in-line (or on the stack) like any other type.

Usually, you would use a union for that:

union ValueUnion {
	int64_t mInteger;
	string  mString;
};

But unions don’t like types with a constructor or destructor, like string: the compiler doesn’t keep track of which member of a union is currently in use, so it wouldn’t know whether it has to call ~string() or not. Theoretically you’d use variant for that (but I’ll mention below why this doesn’t work here).
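To make that bookkeeping concrete, here is a minimal sketch of a union that tracks its current member by hand. The names TaggedValue and Kind are made up for illustration and are not part of my interpreter; copying is disabled only to keep the sketch short.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <string>

// Hypothetical hand-rolled tagged union: mKind records which member is
// alive, so the class knows whether the string's destructor has to run.
class TaggedValue {
public:
	enum class Kind { Integer, String };

	TaggedValue(int64_t n = 0) : mKind(Kind::Integer) { mData.mInteger = n; }
	TaggedValue(const std::string& s) : mKind(Kind::String) {
		new (&mData.mString) std::string(s);  // start the string's lifetime
	}
	~TaggedValue() { DestroyStringIfAlive(); }

	TaggedValue(const TaggedValue&) = delete;
	TaggedValue& operator=(const TaggedValue&) = delete;

	void SetInteger(int64_t n) {
		DestroyStringIfAlive();
		mData.mInteger = n;
		mKind = Kind::Integer;
	}

	void SetString(const std::string& s) {
		if (mKind == Kind::String) {
			mData.mString = s;  // already a string: plain assignment
		} else {
			new (&mData.mString) std::string(s);
			mKind = Kind::String;
		}
	}

	Kind GetKind() const { return mKind; }
	int64_t GetInteger() const {
		assert(mKind == Kind::Integer);
		return mData.mInteger;
	}
	const std::string& GetString() const {
		assert(mKind == Kind::String);
		return mData.mString;
	}

private:
	void DestroyStringIfAlive() {
		if (mKind == Kind::String) {
			mData.mString.~basic_string();
		}
	}

	union Data {
		Data() {}   // members are constructed by TaggedValue
		~Data() {}  // members are destroyed by TaggedValue
		int64_t mInteger;
		std::string mString;
	} mData;
	Kind mKind;
};
```

Every operation has to consult mKind first, and every new type adds a branch to each of them, which is the bookkeeping a ready-made variant class does for you.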

So what do we do? I didn’t know, until one day I remembered that C++ has what is called placement new. Placement new lets you provide the storage, and will then call an object’s constructor for you. It is intended for classes like vector, which allocates one large block for all elements, and then places objects inside the array one after the other. So I added the following class:

union VariantUnion {
	VariantUnion() {}
	~VariantUnion() {}
	Int64Value mInteger;
	StringValue mString;
};

class alignas(union VariantUnion) VariantValue : public Value {
public:
	VariantValue(int64_t n = 0) { new (mStorage) Int64Value(n); }
	VariantValue(const string& n) { new (mStorage) StringValue(n); }
	~VariantValue() { ((Value*)mStorage)->~Value(); }
	
	int64_t GetAsInt64() const { return ((Value*)mStorage)->GetAsInt64(); }
	void SetAsInt64(int64_t n) { ((Value*)mStorage)->SetAsInt64(n); }
	
	...

protected:
	uint8_t mStorage[sizeof(VariantUnion)];
};

So basically its only job is to forward all calls to the underlying object. It also creates the underlying object using placement new, and destroys it again by calling the destructor directly (C++ has no placement delete expression to match). We have to provide the storage ourselves, which we do by making sure that mStorage is large enough to hold any of our possible types. But that is what we want, because we can just declare our storage in-line as a fixed-size array of bytes and forgo the extra allocation.
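Placement new and the explicit destructor call are easier to see in isolation. The following is a minimal sketch using std::string in a stack buffer; the function name LengthViaPlacementNew is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>  // placement new
#include <string>

// Hypothetical demo function: builds a std::string inside a byte array on
// the stack, reads it, and tears it down again.
inline std::size_t LengthViaPlacementNew(const char* text) {
	assert(text != nullptr);

	// Raw bytes, sized and aligned for one std::string. Nothing has been
	// constructed in them yet.
	alignas(std::string) unsigned char storage[sizeof(std::string)];

	// Placement new: run the constructor at the address we pass in. The
	// string object itself is not heap-allocated (a long text may still
	// make the string allocate a character buffer internally).
	std::string* s = new (storage) std::string(text);
	const std::size_t length = s->size();

	// The counterpart is an explicit destructor call. The bytes themselves
	// simply go away with the stack frame.
	s->~basic_string();
	return length;
}
```

VariantValue does the same thing, except that the byte array is a member variable and the constructor and destructor calls are spread over its own constructor and destructor.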

So how do we make it possible to change a type? We need to destroy the current Value subclass in mStorage and construct a new one in its place using placement new. I did that in my subclasses by implementing the setters for all other types:

class Int64Value : public Value {
public:
	...
	
	void SetAsString(const string& n) {
		((Value*)this)->~Value();
		new (this) StringValue(n);
	}
	
	...
};

This way, if a variable already holds an int64_t, SetAsInt64() will just change Int64Value::mValue, but if it holds another type, the setter will re-create the object in place under the new type.
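Put together, the pieces above can be sketched as one self-contained file. StringValue is not shown in this article, so the version here is an assumption that simply mirrors Int64Value. The Current() helper, the use of std::launder, the deleted copy operations and the demo function are also additions for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <string>

class Value {
public:
	virtual ~Value() = default;
	virtual int64_t GetAsInt64() const = 0;
	virtual void SetAsInt64(int64_t n) = 0;
	virtual std::string GetAsString() const = 0;
	virtual void SetAsString(const std::string& s) = 0;
};

class Int64Value : public Value {
public:
	explicit Int64Value(int64_t n = 0) : mValue(n) {}
	int64_t GetAsInt64() const override { return mValue; }
	void SetAsInt64(int64_t n) override { mValue = n; }
	std::string GetAsString() const override { return std::to_string(mValue); }
	void SetAsString(const std::string& s) override;  // changes type, see below
protected:
	int64_t mValue;
};

// Assumption: mirrors Int64Value, since the article does not show it.
class StringValue : public Value {
public:
	explicit StringValue(const std::string& s) : mValue(s) {}
	int64_t GetAsInt64() const override { return std::stoll(mValue); }  // throws if not a number
	void SetAsInt64(int64_t n) override;  // changes type, see below
	std::string GetAsString() const override { return mValue; }
	void SetAsString(const std::string& s) override { mValue = s; }
protected:
	std::string mValue;
};

// Only used to compute a size and alignment that fit every subclass.
union VariantUnion {
	VariantUnion() {}
	~VariantUnion() {}
	Int64Value mInteger;
	StringValue mString;
};

// Both setters are only safe while *this lives in VariantValue::mStorage.
inline void Int64Value::SetAsString(const std::string& s) {
	this->~Int64Value();
	new (this) StringValue(s);
}

inline void StringValue::SetAsInt64(int64_t n) {
	this->~StringValue();  // also releases the old string's buffer
	new (this) Int64Value(n);
}

class VariantValue : public Value {
public:
	VariantValue(int64_t n = 0) { new (mStorage) Int64Value(n); }
	VariantValue(const std::string& s) { new (mStorage) StringValue(s); }
	~VariantValue() override { Current()->~Value(); }

	// Copying the raw bytes would be wrong for StringValue, so forbid it.
	VariantValue(const VariantValue&) = delete;
	VariantValue& operator=(const VariantValue&) = delete;

	int64_t GetAsInt64() const override { return Current()->GetAsInt64(); }
	void SetAsInt64(int64_t n) override { Current()->SetAsInt64(n); }
	std::string GetAsString() const override { return Current()->GetAsString(); }
	void SetAsString(const std::string& s) override { Current()->SetAsString(s); }

private:
	// std::launder tells the compiler that the object at this address may
	// have been replaced. Like the plain cast in the article, this assumes
	// the Value base sits at offset 0 of each subclass.
	Value* Current() { return std::launder(reinterpret_cast<Value*>(mStorage)); }
	const Value* Current() const { return std::launder(reinterpret_cast<const Value*>(mStorage)); }

	alignas(VariantUnion) unsigned char mStorage[sizeof(VariantUnion)];
};

// Usage: the same variable changes type in place.
inline void VariantValueDemo() {
	VariantValue v(42);
	assert(v.GetAsString() == "42");
	v.SetAsString("123");           // Int64Value replaces itself with a StringValue
	assert(v.GetAsInt64() == 123);  // StringValue converts on read
}
```

One thing this sketch does not handle: if the StringValue constructor throws after the old object has been destroyed, mStorage is left without a live object.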

I know this code is scary: You need to make sure that VariantUnion contains all your supported types, otherwise you might overrun your storage and cause hard-to-find bugs. You also need to make sure you use the proper alignment, because an array of uint8_t is usually aligned on 1-byte boundaries, which is invalid for e.g. an int64_t on many platforms. And since we’re overriding the alignment, you can’t add any other member variables to VariantValue without thinking through whether that will misalign mStorage. It also involves Int64Value destructing itself while one of its methods is running and constructing a new object in its place. And finally, Int64Value makes assumptions about how much storage its containing class has allocated for it. That’s not proper encapsulation.

On the plus side, though, usage of this class is beautifully straightforward. You just call SetAsString() or GetAsString() and it will magically do the right thing, or throw an exception if it can’t convert.
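Two of the risks listed above, forgetting a type in the union and getting the alignment wrong, can be turned into compile errors. The following is a minimal sketch of one way to do that, not what my interpreter does: InlineStorage and ScriptStorage are hypothetical names, and the helper derives size and alignment from a list of types instead of from a hand-maintained union.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: raw storage sized and aligned for the largest of Ts.
template <typename... Ts>
class InlineStorage {
public:
	static constexpr std::size_t kSize = std::max({sizeof(Ts)...});
	static constexpr std::size_t kAlign = std::max({alignof(Ts)...});

	// Constructs a T in the storage. Refuses to compile for a type that is
	// not in the list, so a forgotten type is a build error instead of a
	// buffer overrun. The caller must Destroy() the object again.
	template <typename T, typename... Args>
	T* Construct(Args&&... args) {
		static_assert((std::is_same_v<T, Ts> || ...), "T is not in the type list");
		assert(reinterpret_cast<std::uintptr_t>(mBytes) % alignof(T) == 0);
		return new (mBytes) T(std::forward<Args>(args)...);
	}

	// Ends the lifetime of an object created by Construct().
	template <typename T>
	void Destroy(T* object) { object->~T(); }

private:
	alignas(kAlign) unsigned char mBytes[kSize];
};

// Example instantiation for the two types used in this article.
using ScriptStorage = InlineStorage<int64_t, std::string>;
static_assert(ScriptStorage::kSize >= sizeof(std::string), "must fit a string");
static_assert(ScriptStorage::kAlign >= alignof(int64_t), "must align an int64_t");
```

This does nothing about an object replacing itself from inside its own method. It only removes the need to keep the union and the storage declaration in sync by hand.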

So What’s Performance Like?

In a quick test, I used an array<int64_t> as a baseline, running a loop of 1’000’000 iterations in my programming language. That took 10 ms.

Switching to the above approach, which of course replaces direct memory accesses with function calls and a bit of virtual dispatch overhead for each GetAsInt64() and SetAsInt64() call, doubled the runtime to 20 ms on my Mac, and tripled it to about 30 ms on Windows.

Then I tried using C++’s variant:

inline string GetAsString() const {
	return std::visit([](auto&& arg) {
		using T = std::decay_t<decltype(arg)>;
		if constexpr (std::is_same_v<T, int64_t>) {
			return to_string(arg);
		} else if constexpr (std::is_same_v<T, string>) {
			return arg;
		} else {
			static_assert(always_false_v<T>, "non-exhaustive visitor!");
		}
	}, mValue);
}

inline void SetAsString(const string& n) {
	mValue = n;
}

...

variant<int64_t, string> mValue;

...

This took a whopping 100 ms. Given that this involved a lot of fancy new constructs like lambdas and generics, I tried going more old-school and finding the type via variant::index() for comparison:

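inline string GetAsString() const {
	switch(mValue.index()) {
		case 0:
			return to_string(get<int64_t>(mValue));
		case 1:
			return get<string>(mValue);
	}
	// Only reached for an index not handled above; without this, control
	// would fall off the end of a non-void function.
	throw bad_variant_access();
}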

This actually got us down to 60 ms. But it also lost us all the compile-time type safety in favor of a runtime exception if someone adds a type in the middle of the variant<int64_t, string>’s list of types, instead of at the end. Unlike our pure virtual methods, it also does not raise a compile error if we forget to handle a type. I also suspect that variant performs an extra type check on every assignment, whereas our polymorphic approach knows when a type doesn’t change and will just go through virtual dispatch, which is likely better for caches.
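For reference, here is a self-contained sketch of a variant-backed value. It is not one of the variants I timed. The class name VariantBackedValue is made up, always_false_v is given its usual definition since the excerpt above does not show one, and the getter using std::get_if is a third option that looks alternatives up by type instead of by index.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>

// Dependent false: the static_assert in the visitor only fires if its else
// branch is instantiated, i.e. for an alternative nobody handled.
template <typename>
inline constexpr bool always_false_v = false;

class VariantBackedValue {  // hypothetical name
	using Storage = std::variant<int64_t, std::string>;

public:
	explicit VariantBackedValue(int64_t n = 0) : mValue(n) {}

	void SetAsInt64(int64_t n) { mValue = n; }
	void SetAsString(const std::string& s) { mValue = s; }

	// Visitor version: an unhandled alternative is a compile error.
	std::string GetAsString() const {
		return std::visit([](const auto& arg) -> std::string {
			using T = std::decay_t<decltype(arg)>;
			if constexpr (std::is_same_v<T, int64_t>) {
				return std::to_string(arg);
			} else if constexpr (std::is_same_v<T, std::string>) {
				return arg;
			} else {
				static_assert(always_false_v<T>, "non-exhaustive visitor!");
			}
		}, mValue);
	}

	// get_if version: alternatives are found by type, so reordering the
	// variant's type list cannot change which branch runs. The static_assert
	// makes adding an alternative a compile error until this is revisited.
	int64_t GetAsInt64() const {
		static_assert(std::variant_size_v<Storage> == 2, "new alternative: update GetAsInt64()");
		if (const int64_t* n = std::get_if<int64_t>(&mValue)) {
			return *n;
		}
		if (const std::string* s = std::get_if<std::string>(&mValue)) {
			return std::stoll(*s);  // throws if the text is not a number
		}
		assert(mValue.valueless_by_exception());  // the only case left
		throw std::bad_variant_access();
	}

private:
	Storage mValue;
};
```

I have not measured whether get_if performs any differently from switching on index(); the sketch only addresses the type-safety concern.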

So we have a slowdown of a factor of 3 at worst for the polymorphic approach, and of a factor of 6 to 10 for C++’s built-in variant. Plus, given that variant assumes wholly independent types, the code for accessing a value as a certain type (and performing the appropriate conversion) is a lot cleaner using polymorphism.

So is C++’s Standard Variant Bad?

I wouldn’t make that blanket statement. I haven’t got much experience with the class yet, so I might just not be using the right call, and I have a very specific use case that doesn’t quite match what variant was made to support. All I can say is that I encourage you to not just stick variant into your core interpreter loop without comparing it to other approaches. :-p