C++ programming topics: Multiplication and the optimizer

Sunday, January 31, 2021

Multiplication and the optimizer

Test case

the other day I was profiling the difference between floating point and integer multiplication. To test the multiplication contribution with the least amount of overhead contribution the functions did a number of multiplications.

floating point

The test for floating point case was as follows:


double PrfMultiply(size_t nMax, double d, double dNumber)
{
   for (size_t n = 0; n != nMax; ++n)
   {
      d *= dNumber;
      d *= dNumber;
      d *= dNumber;
      d *= dNumber;

      d += (d < 100 ? 100.0 : 0);
   }

   return d;
}

For x64 the Visual Studio compiler uses SSE instructions for floating point for numbers so it's no surprise then to see four (scalar) multiplication instructions in the generated code:


00007FF7C0D41700  mulsd       xmm7,xmm1  
00007FF7C0D41704  mulsd       xmm7,xmm1  
00007FF7C0D41708  mulsd       xmm7,xmm1  
00007FF7C0D4170C  mulsd       xmm7,xmm1

Note: it's also remarkable that the compiler only uses the scalar SSE instruction and not invokes the packed variant (mulpd); From this table one can also see that (double) floating point multiplication lasts around 5 CPU cycles which makes it a fairly fast instruction

integer

For integer the function looks almost the same:


size_t PrfIntegerMultiply(size_t nMax, size_t n, size_t nNumber)
{
   for (size_t n2 = 0; n2 != nMax; ++n2)
   {
      n *= nNumber;
      n *= nNumber;
      n *= nNumber;
      n *= nNumber;

      n -= (n > 1000 ? 995 : 0);
   }

   return n;
}

It turns out that Visual Studio 2019 (in release mode) already optimized the four multiplication steps and coalesced them in one multiplication instruction (3^4 == 81 == 51h):


00007FF61C131840  imul        rbx,rbx,51h

Conclusion

When profiling it's always advisable to look at the generated assembly code. Often the optimizer is smarter than you think or may even completely removes function invocation. This is especially the case with fixed predefined numbers and (static) functions in translation units.

C++ programming topics

Sunday, January 31, 2021

Multiplication and the optimizer

Test case

floating point

integer

Conclusion

No comments:

Post a Comment

Watch out for an old VC++ runtime

Report Abuse

Labels