Thursday, December 31, 2020

OOP and performance

Encapsulation and information hiding

  Encapsulation and information hiding are major concepts of object-oriented programming (OOP). From a maintenance perspective it is beneficial not to depend on, or even know about, the implementation of an object. Ignoring the implementation altogether, though, can harm performance. The example given here may be obvious but serves to illustrate the point.

Example

  Consider the following sample class, which uses a std::vector to store its data elements. The sample has the following member functions:

  • Add to add a datum
  • Clear to clear all contained data
  • Sum (or something else) to extract the data

#include <numeric>
#include <vector>

class Sample
{
public:
   void Add(double d)
   {
      m_vecData.push_back(d);
   }

   void Clear()
   {
      m_vecData.clear();
   }

   double Sum() const
   {
      return std::accumulate(m_vecData.cbegin(), m_vecData.cend(), 0.0);
   }

private:
   std::vector<double>  m_vecData;
};

Suppose there is some recorder class which produces sample data. There are a number of ways samples can be retrieved from the recorder:

  1. a function GetSample which returns the sample.
  2. a function FillSample which clears and fills a supplied sample.

class Recorder
{
public:
   Sample GetSample() const
   {
      Sample s;
      s.Add(m_dValue); // in real life the data comes from e.g. a buffer, file or socket

      return s;
   }

   void FillSample(Sample* pSample) const
   {
      pSample->Clear();
      pSample->Add(m_dValue);
   }
   
private:
   double   m_dValue = 1.0;
};

A typical client extracts many samples from the recorder, e.g. in a loop:


// example 1:
double SampleGet(const Recorder& rRecorder, size_t nMax)
{
   double dTotal = 0.0;

   for (size_t n = 0; n != nMax; ++n)
   {
      const Sample s = rRecorder.GetSample();

      dTotal += s.Sum();
   }

   return dTotal;
}

// example 2:
double SampleGetAssign(const Recorder& rRecorder, size_t nMax)
{
   double dTotal = 0.0;

   Sample s;

   for (size_t n = 0; n != nMax; ++n)
   {
      s = rRecorder.GetSample();

      dTotal += s.Sum();
   }

   return dTotal;
}

// example 3:
double SampleFill(const Recorder& rRecorder, size_t nMax)
{
   double dTotal = 0.0;

   Sample s;

   for (size_t n = 0; n != nMax; ++n)
   {
      rRecorder.FillSample(&s);

      dTotal += s.Sum();
   }

   return dTotal;
}

std::vector only reallocates memory when its capacity is too small. This determines the performance to a large extent, since memory allocations are relatively expensive:

  1. using GetSample is the cleanest solution from a C++ perspective. Although NRVO may prevent superfluous sample copies, the sample (and thereby its vector) is still created anew inside the function on every call, so every call allocates memory, which may hamper performance.
  2. using GetSample with the sample outside the loop is not much better: the assignment replaces the sample's contents with those of the newly created sample, so there is still an allocation per call.
  3. using FillSample is not the cleanest interface but is favorable from a performance perspective in this case, since Clear keeps the vector's capacity and memory (re)allocation only occurs if the supplied Sample's vector cannot hold enough data elements.
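
The capacity behaviour behind point 3 can be checked directly. The following is a minimal sketch; the helper names CapacityAfterClear and CapacityOfNewVector are made up for this illustration. It assumes the usual behaviour that clear() leaves the capacity untouched, and that a default-constructed vector owns no buffer yet (true for the common standard library implementations, though not something to rely on blindly).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fill a vector with nCount elements, clear it,
// and report the capacity that is still available afterwards.
std::size_t CapacityAfterClear(std::size_t nCount)
{
   std::vector<double> vecData;

   for (std::size_t n = 0; n != nCount; ++n)
   {
      vecData.push_back(1.0);
   }

   vecData.clear();           // size drops to 0 ...
   assert(vecData.empty());

   return vecData.capacity(); // ... but the allocated buffer is kept
}

// Hypothetical helper: capacity of a freshly constructed vector,
// i.e. what every call of GetSample starts with.
std::size_t CapacityOfNewVector()
{
   const std::vector<double> vecData;

   return vecData.capacity();
}
```

A cleared vector that held 100 elements can take at least 100 new ones without touching the heap, whereas a new vector typically has to allocate on its first push_back.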

The above aspects arise from the implementation details of the 'Sample' class and matter only when performance considerations need to be weighed.
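
To weigh such considerations it helps to count allocations rather than guess. Below is a rough sketch that does so with a counting allocator. CountingAllocator, CountedSample, g_nAllocations and the two Count functions are hypothetical names introduced only for this illustration; CountedSample mimics Sample but is not the class shown above, and the exact counts depend on the standard library in use.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical global counter, incremented on every allocation.
static std::size_t g_nAllocations = 0;

// Hypothetical minimal allocator which forwards to std::allocator
// and counts each call of allocate.
template <class T>
struct CountingAllocator
{
   using value_type = T;

   CountingAllocator() = default;

   template <class U>
   CountingAllocator(const CountingAllocator<U>&) noexcept {}

   T* allocate(std::size_t n)
   {
      ++g_nAllocations;
      return std::allocator<T>{}.allocate(n);
   }

   void deallocate(T* p, std::size_t n) noexcept
   {
      std::allocator<T>{}.deallocate(p, n);
   }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }

template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

// Same shape as Sample, but its vector uses the counting allocator.
class CountedSample
{
public:
   void Add(double d) { m_vecData.push_back(d); }
   void Clear()       { m_vecData.clear(); }

private:
   std::vector<double, CountingAllocator<double>> m_vecData;
};

// Pattern of examples 1 and 2: a new sample is created per iteration.
std::size_t CountAllocationsGet(std::size_t nMax)
{
   g_nAllocations = 0;

   for (std::size_t n = 0; n != nMax; ++n)
   {
      CountedSample s;
      s.Add(1.0);
   }

   return g_nAllocations;
}

// Pattern of example 3: one sample is cleared and refilled.
std::size_t CountAllocationsFill(std::size_t nMax)
{
   g_nAllocations = 0;

   CountedSample s;

   for (std::size_t n = 0; n != nMax; ++n)
   {
      s.Clear();
      s.Add(1.0);
   }

   return g_nAllocations;
}
```

With the get pattern the count grows with the number of iterations; with the fill pattern it stays at a small constant, because the buffer is reused after the first Add.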

Counter argument

  The classical counter argument from an OO perspective would be that the interface should prevent bad client use. For this case one could delete the copy and move constructors and assignment operators. I am not sure that is a good direction. In itself it is not a bad thing that samples get copied; after all, they may be treated as value objects. Deleting these functions would also all but rule out storing samples in std::vector, the preferred STL container, because a vector must be able to move its elements when it grows.
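
What deleting these functions would look like can be sketched as follows. PinnedSample and FillPinned are hypothetical names for this illustration, not part of the code above. The static_asserts show what the type loses; the commented-out lines are the kind of client code that would stop compiling.

```cpp
#include <cassert>
#include <numeric>
#include <type_traits>
#include <vector>

// Hypothetical variant of Sample with copying and moving disabled.
class PinnedSample
{
public:
   PinnedSample() = default;

   PinnedSample(const PinnedSample&) = delete;
   PinnedSample& operator=(const PinnedSample&) = delete;
   PinnedSample(PinnedSample&&) = delete;
   PinnedSample& operator=(PinnedSample&&) = delete;

   void Add(double d) { m_vecData.push_back(d); }
   void Clear()       { m_vecData.clear(); }

   double Sum() const
   {
      return std::accumulate(m_vecData.cbegin(), m_vecData.cend(), 0.0);
   }

private:
   std::vector<double> m_vecData;
};

// The type is no longer a value object.
static_assert(!std::is_copy_constructible<PinnedSample>::value, "not copyable");
static_assert(!std::is_move_constructible<PinnedSample>::value, "not movable");
static_assert(!std::is_move_assignable<PinnedSample>::value, "not move assignable");

// The fill style still works, since it never copies or moves the sample.
void FillPinned(PinnedSample* pSample, double d)
{
   pSample->Clear();
   pSample->Add(d);
}

// The following would no longer compile:
//
//   PinnedSample s;
//   PinnedSample t = s;                  // copy construction
//
//   std::vector<PinnedSample> vecSamples;
//   vecSamples.push_back(PinnedSample{}); // push_back needs a movable element
//
// A GetSample which returns a named local sample by value would fail
// as well, since that requires an accessible move or copy constructor.
```

The interface now forces the efficient pattern, at the price of losing value semantics for every client, including those for which performance is not a concern.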

Conclusion

  As usual in engineering, aspects have to be judged and balanced. The OO principles are good guidelines, but it is good to know their limitations and to break them when necessary.

  'Effective C++' (third edition) mentions a similar case in Item 26 'Postpone variable definitions as long as possible'.
