Sunday, April 11, 2021

Return value optimization and assignment

Return value optimization

 'Return Value Optimization' (RVO) is a compiler optimization that applies copy elision when a function returns an object by value. There are two flavors:

  • RVO: the function returns an unnamed temporary
  • NRVO: the function returns a named local variable

 In the following example a function returns an object of class X:

class X
{
};

X Get()
{
   return X{};
}

int main()
{
   X x1 = Get();  // 1 constructor call
}

 Without RVO, the case above would need two constructor calls: one to create the anonymous temporary and one to copy construct it into the variable at the call site.

 With RVO the compiler constructs the object directly in its call-site location and avoids the extra copy. Since C++17 this elision is guaranteed, but only when the function returns an unnamed temporary (a prvalue).

 The mechanism works as if the function 'Get' had a hidden argument that points to storage in which an object of type 'X' can be created directly. The MSDN article in the link below does a great job of explaining it.
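 The hidden-argument idea can be imitated by hand. The sketch below is only an illustration, not what any particular compiler emits: 'Get_lowered', 'CountLoweredConstructions' and the counter 'g_constructions' are hypothetical names introduced here, and X is given counting constructors so the effect is visible.

```cpp
#include <cassert>
#include <new>

int g_constructions = 0;  // hypothetical counter, not part of the original X

class X
{
public:
   X() { ++g_constructions; }
   X(const X&) { ++g_constructions; }
};

// What the programmer writes.
X Get()
{
   return X{};
}

// Hand-written imitation of the hidden argument: the caller passes the
// address of uninitialized storage and the object is built there.
X* Get_lowered(void* result_slot)
{
   return ::new (result_slot) X{};
}

// Returns how many X objects were constructed to obtain one result.
int CountLoweredConstructions()
{
   g_constructions = 0;
   alignas(X) unsigned char storage[sizeof(X)];  // caller-owned slot
   X* x1 = Get_lowered(storage);
   int n = g_constructions;
   x1->~X();  // placement new requires an explicit destructor call
   return n;
}
```

 Calling 'CountLoweredConstructions()' reports a single construction, which matches what 'X x1 = Get();' does under C++17 guaranteed elision.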

Named return value optimization

With 'Named Return Value Optimization' (NRVO) the returned variable has a name inside the function. The compiler is still allowed to elide the copy in this case:

X Get()
{
   X x;  // named value

   return x;
}

int main()
{
   X x1 = Get();
}

  Be aware that the compiler is allowed, but not required, to remove the extra copy. Observations with Visual Studio 2019:

  • in a Debug build (_DEBUG defined): two constructor calls, one inside the function and one copy construction into the final destination.
  • in a Release build: only one constructor call. NRVO was applied here.
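  Observations like these can be reproduced with a small tracing class. The sketch below is one possible way to do it; 'Traced', 'Counters', 'GetNamed' and 'ObserveNrvo' are hypothetical names. Note that when a class has a move constructor and NRVO is not applied, the fallback is a move rather than a copy.

```cpp
#include <cassert>

struct Counters
{
   int default_ctor = 0;
   int copy_ctor = 0;
   int move_ctor = 0;
};

Counters g_counters;  // hypothetical global tally

class Traced
{
public:
   Traced() { ++g_counters.default_ctor; }
   Traced(const Traced&) { ++g_counters.copy_ctor; }
   Traced(Traced&&) noexcept { ++g_counters.move_ctor; }
};

Traced GetNamed()
{
   Traced x;  // named value, NRVO candidate

   return x;  // elided if NRVO applies, otherwise moved
}

// Resets the tally, runs the experiment and returns what was counted.
Counters ObserveNrvo()
{
   g_counters = Counters{};
   Traced x1 = GetNamed();
   (void)x1;
   return g_counters;
}
```

  The result depends on compiler and build settings: 'move_ctor' is 0 when NRVO was applied and 1 when it was not, while 'default_ctor' is 1 in both cases.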

Assignment 

Things are less ideal when an object is assigned instead of constructed. Consider the following example:

int main()
{
   X x1;         // 1 constructor call
   x1 = Get();   // 1 constructor call and 1 (move) assignment call
}

The code needs two constructor calls and one assignment call. The 'Get' function again needs storage of type X for its hidden argument, so the compiler creates a hidden temporary object that is filled inside the function. This temporary is then assigned to the call-site variable 'x1'.

 From a performance perspective this is a less ideal way to fill an object, especially if the default constructor is relatively heavy. It is worth considering a switch back to the classic way of filling an object through an argument. This can be an attractive alternative when assignment is the dominant use case:

void Fill(X* p)
{
   // fill the existing object *p in place
}

int main()
{
   X x1;         // 1 constructor call
   Fill(&x1);
}
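
The difference between the two approaches can be counted. The sketch below is an illustration under assumptions: 'Widget', 'Stats', 'Make', 'ViaAssignment' and 'ViaFill' are hypothetical names, and 'Widget' declares only copy operations, so the assignment is a copy assignment.

```cpp
#include <cassert>

struct Stats
{
   int ctor = 0;
   int assign = 0;
};

Stats g_stats;  // hypothetical global tally

class Widget
{
public:
   int value;

   explicit Widget(int v = 0) : value(v) { ++g_stats.ctor; }
   Widget(const Widget& o) : value(o.value) { ++g_stats.ctor; }
   Widget& operator=(const Widget& o)
   {
      value = o.value;
      ++g_stats.assign;
      return *this;
   }
};

Widget Make()
{
   return Widget{42};
}

void Fill(Widget* p)
{
   p->value = 42;  // fill the existing object in place
}

Stats ViaAssignment()
{
   g_stats = Stats{};
   Widget w;    // constructor call 1
   w = Make();  // constructor call 2 (temporary) and 1 assignment
   return g_stats;
}

Stats ViaFill()
{
   g_stats = Stats{};
   Widget w;    // constructor call 1
   Fill(&w);    // no further constructor or assignment calls
   return g_stats;
}
```

Here 'ViaAssignment()' counts two constructor calls and one assignment, while 'ViaFill()' counts one constructor call and no assignment.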

Links
