Sunday, January 10, 2021

C++ solutions for some C issues

The C language

 The C language was co-developed with UNIX and played an important part in ICT history. C is a small yet powerful language with little overhead. Unfortunately, the use of C led to some recurring issues:

  • dangling pointers
  • memory leaks
  • buffer overruns

 C++

  C++'s original goal was to offer an object-oriented programming language that remained compatible with C. Later it also incorporated generic programming in the form of templates.

 The class model makes it possible to write resource wrappers that address the issues above. Dangling pointers can be avoided by using smart pointers, leaks are prevented because destructors release resources automatically, and access to buffers can be guarded by member functions. The standard library of modern C++ offers some of these solutions out of the box:

  • smart pointers like unique_ptr and shared_ptr own a memory resource. A unique_ptr releases the memory when it goes out of scope; a shared_ptr releases it when the last shared_ptr owning the object is destroyed. This addresses the problems of dangling pointers and memory leaks. Note that shared_ptr is not foolproof: it still has some sharp edges, such as the circular reference problem and the inability to use shared_from_this from a constructor
  • std::vector offers a safe way to manage a contiguous buffer. It provides iterators and the bounds-checked at() member for access, and it grows automatically when elements are added. The memory is released when the vector goes out of scope. Again it offers an alternative for all of the problems above.
  • std::string is comparable with std::vector but is specialized for strings. C uses character arrays, and they are prone to all of the problems mentioned above
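The points in the list above can be illustrated with a minimal sketch. The Node type and all function names below are hypothetical and exist only for this illustration; the sketch assumes a C++14 compiler.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical node type, used only for this illustration.
struct Node
{
   std::shared_ptr<Node> next;   // owning link
   std::weak_ptr<Node>   prev;   // non-owning back link, avoids a reference cycle
};

// unique_ptr: a single owner; the int is freed when 'p' goes out of scope.
int UniqueOwner()
{
   auto p = std::make_unique<int>(42);
   return *p;
}

// shared_ptr: the object lives until the last owner is destroyed.
long SharedOwners()
{
   auto a = std::make_shared<int>(1);
   auto b = a;                // second owner of the same int
   return a.use_count();      // number of owners at this point
}

// Returns true when both nodes have been destroyed at the end of the inner scope.
bool CycleIsReleased()
{
   std::weak_ptr<Node> observer;
   {
      auto first  = std::make_shared<Node>();
      auto second = std::make_shared<Node>();
      first->next  = second;
      second->prev = first;   // with a shared_ptr here, both nodes would leak
      observer = first;
      assert(!observer.expired());
   }
   return observer.expired();
}

// vector::at() throws instead of writing past the end of the buffer.
bool CheckedAccessThrows()
{
   std::vector<int> v(3);
   try
   {
      v.at(3) = 1;            // index 3 is out of range for 3 elements
   }
   catch (const std::out_of_range&)
   {
      return true;
   }
   return false;
}

// std::string manages its own storage; no manual buffer sizing is needed.
std::string Greeting(const std::string& name)
{
   return "Hello, " + name;
}
```

Note that operator[] on std::vector is not bounds-checked; at() trades a small run-time cost for the check.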

Good C++ code can be as fast as, or even faster than, the corresponding C code. As usual, one has to know the idioms and read the standard C++ books.

Example 

Suppose you have a function which fills a variable-length buffer and does some processing on it. It has multiple early-out paths.

 C case

In C this could be:

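#include <stdbool.h>
#include <stdlib.h>

/* GetBufferLength, GetBuffer and EncodeBuffer are assumed to be declared elsewhere. */
bool f(void)
{
   size_t nLen = GetBufferLength();

   int* p = malloc(nLen * sizeof(int));
   if (p == NULL)
   {
      return false;
   }

   /* Every early-out path below must remember to free the buffer. */
   if (!GetBuffer(p, nLen))
   {
      free(p);
      return false;
   }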

   if (!EncodeBuffer(p, nLen))
   {
      free(p);
      return false;
   }
   
   free(p);
   return true;
}

 C++ case

 In C++ one can use std::vector as a variable-length buffer:


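#include <cstddef>
#include <vector>

// GetBufferLength, GetBuffer and EncodeBuffer are assumed to be declared elsewhere.
bool f()
{
   const std::size_t nLen = GetBufferLength();

   // The vector owns the buffer and frees it on every return path.
   std::vector<int> vec(nLen);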
   
   if (!GetBuffer(vec.data(), vec.size()))
   {
      return false;
   }

   if (!EncodeBuffer(vec.data(), vec.size()))
   {
      return false;
   }
   
   return true;
}

There is only one small performance drawback in using std::vector here: its elements are value-initialized (zeroed, in the case of int), which may be an issue when a huge buffer is allocated.
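If the zeroing turns out to matter, one possible workaround is to let a std::unique_ptr<int[]> own an array created with new int[n], which leaves the elements uninitialized. The sketch below assumes that GetBuffer overwrites every element; the three helper functions are hypothetical stand-ins with made-up bodies so that the example compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-ins for the functions used in the example above.
static std::size_t GetBufferLength() { return 4; }

static bool GetBuffer(int* p, std::size_t n)
{
   assert(p != nullptr);
   for (std::size_t i = 0; i < n; ++i)
   {
      p[i] = static_cast<int>(i);   // overwrite every element
   }
   return true;
}

static bool EncodeBuffer(int* p, std::size_t n)
{
   for (std::size_t i = 0; i < n; ++i)
   {
      p[i] *= 2;
   }
   return true;
}

bool f()
{
   const std::size_t nLen = GetBufferLength();

   // new int[nLen] default-initializes: the ints are NOT zeroed.
   // The unique_ptr still frees the array on every return path.
   std::unique_ptr<int[]> buf(new int[nLen]);

   if (!GetBuffer(buf.get(), nLen))
   {
      return false;
   }

   if (!EncodeBuffer(buf.get(), nLen))
   {
      return false;
   }

   return true;
}
```

The trade-off is that the buffer no longer carries its own size or iterators, so the length has to be passed along separately. With C++20, std::make_unique_for_overwrite<int[]>(nLen) expresses the same intent without a naked new.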

  Except for small, trivial programs, C++ is almost always the better choice over C due to these facilities. It makes one wonder why the Linux kernel hasn't switched over. Instead, the kernel developers chose to support Rust, which admittedly offers stronger memory-safety guarantees than C++, but at the cost of learning a new syntax and a new programming paradigm, and it will probably affect productivity as well. Another example is the official Python interpreter, which is written in C. Part of its code deals with manual reference counting of Python objects. Much of that bookkeeping could be handled by a C++ smart pointer. When working with COM it was also sometimes a hassle to get the reference count right, but with CComPtr the problem is basically solved.
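To illustrate the reference-counting argument, here is a rough sketch of a smart pointer for objects that carry their own reference count. It is not the CComPtr or CPython API: Counted, RefPtr and the helper functions are hypothetical names invented for this example, and unlike CComPtr the raw-pointer constructor here adopts an existing reference instead of adding one.

```cpp
#include <cassert>
#include <utility>

// Hypothetical object with an intrusive reference count.
struct Counted
{
   static int live;        // number of live objects, for demonstration only
   int refs = 1;           // the creator holds the first reference

   Counted()  { ++live; }
   ~Counted() { --live; }

   void AddRef()  { ++refs; }
   void Release()
   {
      assert(refs > 0);
      if (--refs == 0)
      {
         delete this;
      }
   }
};
int Counted::live = 0;

// Hypothetical smart pointer: calls AddRef/Release so the caller never has to.
template <typename T>
class RefPtr
{
public:
   explicit RefPtr(T* p = nullptr) : p_(p) {}   // adopts an existing reference
   RefPtr(const RefPtr& o) : p_(o.p_) { if (p_) p_->AddRef(); }
   RefPtr(RefPtr&& o) noexcept : p_(std::exchange(o.p_, nullptr)) {}
   RefPtr& operator=(RefPtr o) noexcept { std::swap(p_, o.p_); return *this; }
   ~RefPtr() { if (p_) p_->Release(); }

   T* operator->() const { return p_; }
   T* get() const { return p_; }

private:
   T* p_;
};

// Reference count while two RefPtr instances share one object.
int RefsWhileShared()
{
   RefPtr<Counted> a(new Counted);
   RefPtr<Counted> b = a;   // copy calls AddRef
   return a->refs;
}

// Number of live objects after all RefPtr instances have gone out of scope.
int LiveAfterScope()
{
   {
      RefPtr<Counted> a(new Counted);
      RefPtr<Counted> b = a;
   }                         // both destructors call Release
   return Counted::live;
}
```

The copy constructor and destructor do the counting, so early returns and exceptions cannot leave a reference unbalanced.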
