Thursday, January 1, 2026

Careful with refactoring

Refactoring issue

This year we applied a small refactoring in a piece of code. The construct was a parent-child relationship with the child held by a unique_ptr. Simplified, the code was roughly as follows:

#include <memory>

struct Child; // forward declaration to break the parent <-> child cycle

struct Parent
{
   explicit Parent(bool b); // defined below, once Child is complete

   std::unique_ptr<Child> m_ptr;
   bool                   m_b;
};

struct Child
{
   explicit Child(Parent* pParent)
   : m_bShow(pParent->m_b)
   {
   }

   bool  m_bShow;
};

inline Parent::Parent(bool b)
: m_b(b)
{
   m_ptr = std::make_unique<Child>(this);
}

The code above has a bidirectional dependency between parent and child. The forward declaration and the out-of-line constructor break the cycle, just as splitting parent and child into separate header and source files does in the real code base.

The heap allocation for the child seemed redundant, so it was refactored to hold the child by value:

struct Parent
{
   explicit Parent(bool b)
   : m_child(this) // runs first: Child reads m_b, which is still uninitialized
   , m_b(b)
   {
   }

   Child   m_child; // members are initialized in declaration order
   bool    m_b;
};

This refactoring, however, led to a bug where things were sometimes shown and sometimes not. The bug appeared only in release mode under certain conditions. It turned out that 'm_bShow' was initialized from uninitialized memory. We had accidentally created our first memory safety issue in years!

The problem is that the child is created before the parent is fully constructed: members are initialized in declaration order, not in initializer-list order, so m_child is initialized before m_b and the Child constructor reads uninitialized memory. The fix is easy: reverse the declaration order:

struct Parent
{
   explicit Parent(bool b)
   : m_b(b)
   , m_child(this) // m_b is now initialized before Child reads it
   {
   }

   bool    m_b;     // declared (and thus initialized) first
   Child   m_child; // declared (and thus initialized) second
};

Using the 'this' pointer in a constructor should put you on alert, but it is not a code smell per se. As a general rule, declare all members of built-in types before members of class types. If that is not viable, one can defer creation of the child by using std::optional instead of std::unique_ptr; std::optional usually performs better, since it avoids the heap allocation and the pointer indirection.
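
A minimal sketch of the std::optional variant, assuming the same Parent and Child as above: the child starts out empty and is emplaced in the constructor body, after all members (including m_b) have been initialized.

#include <optional>

struct Parent
{
   explicit Parent(bool b)
   : m_b(b)
   {
      // all members are initialized at this point, so passing 'this' is safe
      m_child.emplace(this);
   }

   bool                 m_b;
   std::optional<Child> m_child; // in-place storage, no heap allocation
};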

I am not sure how Rust would have prevented this; probably by disallowing the construct in the first place. That would be a pity, since avoiding the extra heap allocation and pointer indirection is exactly what makes the final solution, where children are held by value, worthwhile.

Thursday, December 25, 2025

Watch out for std::vector::at()

 Aspects of operator[] vs at() 

To bump the default memory safety of C++, the committee has decided to harden the STL by adding bounds checking to operator[]. This is redundant, since bounds checking is already available through 'at()'. Instead, a safety profile could promote 'at()' and issue a warning for uses of operator[].

This decision also has consequences for performance. Comparing the current operator[], which has no bounds checking, with the bounds-checked 'at()', the latter is about five times slower in the benchmark below. Consider the following two functions:

int g_iTemp = 0;

void PrfStlVectorIteratorIndex(const std::vector<int>& rv)
{
   int nTemp = 0;
   
   const size_t nLoop = rv.size();
   	
   for (size_t n = 0; n != nLoop; ++n)
   {
      nTemp += rv[n];
   }
   
   g_iTemp = nTemp;
}
   
void PrfStlVectorIteratorIndexAt(const std::vector<int>& rv)
{
   int nTemp = 0;

   const size_t nLoop = rv.size();

   for (size_t n = 0; n != nLoop; ++n)
   {
      nTemp += rv.at(n);
   }

   g_iTemp = nTemp;
}

The results for a certain test with VS2022 17.14.23 with /O2: 

Function                      #   Total(s)
PrfStlVectorIteratorIndex     1   0.149972
PrfStlVectorIteratorIndexAt   1   0.727781

The function using 'at()' is about five times slower. Inspecting the assembly, it seems that MSVC uses SIMD instructions for the operator[] loop but cannot use them with 'at()', presumably because the potential throw on every iteration prevents vectorization.

Conclusion 

This is a significant difference. It makes one wonder why the C++ committee took the decision to tax every invocation of operator[] so lightly, especially since a major use case for operator[] is a loop like the one above, where there is no danger of going out of bounds. Their argument is that it cost only 0.3% extra performance, which clearly contradicts the numbers above. They also stated that on certain code bases it revealed a thousand extra bugs; I am not sure what those code bases are. For decades we have used Visual Studio with Microsoft's STL, which has the extra checking turned on in debug mode, and it never fires these asserts when testing debug builds (which is what programmers do all the time). When it does fire, you have found a bug and you repair it. Let users who value safety over performance use the 'at()' variants, but leave operator[] alone.

 

Tuesday, December 23, 2025

Thoughts on C++ 26

Sutter's video

The other day I watched Sutter's YouTube video about three cool things in C++26:

  1. Make C++ safer by replacing undefined behavior (UB) with erroneous behavior (EB)
  2. Reflection
  3. Yet another syntax for async

Safe C++ 

Sutter mentions two aspects:

  • Uninitialized local variables will get a well-defined "erroneous" value; the compiler may inject code to check whether uninitialized variables are accessed.
  • Hardening of the STL; most notably operator[].

According to studies the overhead is minimal (0.3%). This number is debatable: they can never know what applications are out there. In the past we had a bad experience with VS 2008, which turned on safe iterators in release builds; they killed all compiler optimizations right away when used.

I also question the first bullet: why not make it simpler and state that every variable is default or zero initialized? Then no EB or UB is necessary, and no hidden code has to be injected by the compiler.
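
A minimal sketch of how I understand the new rules; the exact diagnostics are up to the implementation:

int main()
{
   int i;    // C++26: i holds a well-defined "erroneous" value instead of garbage
   return i; // erroneous behavior: still a bug, but no longer undefined behavior,
             // so the compiler may inject a check that diagnoses the read
}

A variable can be marked [[indeterminate]] to opt back into the old uninitialized semantics.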

Some of the hardened STL functions are unnecessary. There are already 'at()' functions which bounds-check. A safety profile could warn on uses of operator[].

Reflection

It is nice that reflection is being added, but I wonder whether the C++ committee has the right priorities. The standard library still lacks a standard JSON or XML library, which would be an ideal candidate for automatic serialization through reflection.
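
As a rough illustration of what such automatic serialization could look like, here is a sketch combining C++26 reflection (P2996) with expansion statements; the std::meta names were still in flux at the time of writing, so treat it as a sketch rather than finished code:

#include <meta>   // C++26 reflection
#include <string>

struct Point { int x = 1; int y = 2; };

template <typename T>
std::string ToJson(const T& obj)
{
   std::string sOut = "{";
   bool bFirst = true;

   // expansion statement: iterate over the data members at compile time
   template for (constexpr std::meta::info member :
                 std::meta::nonstatic_data_members_of(^^T, std::meta::access_context::current()))
   {
      if (!bFirst) sOut += ", ";
      bFirst = false;
      sOut += "\"" + std::string(std::meta::identifier_of(member)) + "\": ";
      sOut += std::to_string(obj.[:member:]); // splice the reflected member
   }

   return sOut + "}";
}

ToJson(Point{}) would then yield {"x": 1, "y": 2}.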

Async

They added yet another, superfluous syntax. So much for consistency.

 Conclusion

Memory safety is an issue, but I believe more in safety profiles than in changing the language. Even so, I would then go for zero initialization instead of checks with hidden costs. Reflection is nice, but what C++ lacks most is standard libraries, not major language changes.

Saturday, November 8, 2025

Issues with Linux port

 

Linux port

The company that employs me has decided to port parts of our application to Linux. As a first shot we will use WSL and Visual Studio, but that leaves plenty of issues:

  • CMake is the lingua franca for generating cross-platform build scripts. CMake is a beast of its own, however, and I am not sure why it got so popular.
  • WSL sometimes keeps its old configuration and source files. It seems that when the source files are read-only, they are read-only on the target WSL system as well; if you then edit a file, updating it fails and you won't see your changes. Either make the source files writable beforehand, clean all files and directories on the WSL host and start fresh, or make the source files writable on Linux through chmod.
  • MSVC uses __declspec(dllexport) to export functions from DLLs; GCC doesn't have that and uses visibility attributes instead (see the first sketch after this list).
  • MSVC pushes the security-enhanced versions of the CRT through its code analyzer. According to cppreference these functions are standardized, albeit as an extension (Annex K of C11). Unfortunately glibc has not implemented them, partially for some dubious reasons: the API isn't perfect and there were pre-existing bounds-checking CRTs, but the functions are still standardized and some people (we) use them. So one ends up writing Windows- and Linux-specific code even in a layer which is supposed to be platform independent.
  • Much of the MSVC C API is Windows specific (e.g. _splitpath, _makepath, _tchdir). Using plain C on Windows may therefore still not be platform independent.
  • Warning suppressions in a pre-compiled header are ignored by GCC in the code that uses it. This is quite unhandy, especially since some suppressions should apply globally to all sources and are therefore prime candidates for the pre-compiled header. I have filed bug report 123287 and it is stated that this has been solved for GCC 15.x.
  • On Windows wchar_t is 2 bytes and represents UCS-2 or UTF-16; on Linux wchar_t is 4 bytes and represents UTF-32. To stay compatible with existing persistent storage we had to use char16_t in certain places in the code. Unfortunately some of the character code conversion facilities are deprecated, so this solution will not hold out for long.
  • std::basic_ifstream and std::basic_ofstream don't accept std::wstring as a file name on Linux; this turns out to be an MSVC extension. Change such code to use std::filesystem::path, which is a conformance improvement.
  • __FUNCTION__ is an extension which both MSVC and GCC understand. On GCC it is not a macro, so prepending it with 'L' to get the wide-character variant does not work. It also gives only the function name, without the class in case of a member function, which makes it unattractive. So MSVC- and GCC-specific code is needed. There is a standard alternative, __func__, but again it gives only the bare function name. std::source_location is yet another alternative, but it gives too much information beyond the function name (see the second sketch after this list).
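
For the export issue, the usual workaround is a small portability macro; MYLIB_EXPORT is a name made up for illustration:

#if defined(_WIN32)
   #define MYLIB_EXPORT __declspec(dllexport)
#else
   // GCC/clang: make the symbol visible when building with -fvisibility=hidden
   #define MYLIB_EXPORT __attribute__((visibility("default")))
#endif

MYLIB_EXPORT void Foo(); // exported on both platforms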
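
And a minimal sketch of the std::source_location (C++20) alternative for logging function names; on GCC function_name() yields the full signature, which is exactly the 'too much information' mentioned above:

#include <iostream>
#include <source_location>

void Log(const char* szMessage,
         const std::source_location loc = std::source_location::current())
{
   // on GCC this yields the full signature, e.g. "void Foo(int)"
   std::cout << loc.function_name() << ": " << szMessage << '\n';
}

void Foo(int)
{
   Log("entered"); // the default argument captures Foo's location, not Log's
}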

GCC might still contain some basic bugs: the warning about #pragma once in a main file (when building a pre-compiled header) was only solved in version 14.

Saturday, October 18, 2025

Using clang-cl in Visual Studio

clang-cl

clang-cl is the command-line tool in Visual Studio capable of invoking the clang compiler with the arguments of MSVC. In Visual Studio projects one can simply flip the toolset and the clang compiler will be used. clang has the following positive aspects:

  • better C++ conformance. For example, MSVC is lenient about a missing 'typename' for dependent types and a missing 'template' for nested templates; clang picks these up (see the sketch after this list). There are other issues as well.
  • offers some code improvements, like warning about the member initialization order in constructors and about virtuals which override a base-class function
  • detects some performance improvements, like advising to take a shared_ptr by reference in loops
  • more precise compilation warnings and errors
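
A small sketch of the conformance difference, assuming a made-up class template with a dependent type and a dependent member template; MSVC has historically accepted this code without the two keywords, clang rejects it:

template <typename T>
void Process()
{
   // 'typename' is required because T::value_type is a dependent type
   typename T::value_type value{};

   // 'template' is required to call the dependent static member template
   T::template Convert<int>(value);
}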

It also has some drawbacks:

  • some noisy warnings 
  • does not understand all MSVC code; for example, MSVC's #import extension is not understood

Despite our use of MSVC's code analysis, the clang compiler was still able to pick up additional issues. Some clang warnings are far-fetched and one can choose to disable them. This is especially needed for external libraries which one cannot easily patch. To disable warnings one can use the following:

#ifdef __clang__
#pragma clang diagnostic ignored "-Wimplicit-exception-spec-mismatch"
#pragma clang diagnostic ignored "-Wmissing-field-initializers"
#pragma clang diagnostic ignored "-W#pragma-messages"
#pragma clang diagnostic ignored "-Wunused-but-set-variable"
#pragma clang diagnostic ignored "-Wunused-local-typedef"
#endif

The first one, for example, is necessary to suppress warnings in MFC: operator 'delete' should be specified as 'noexcept', but the MFC delete lacks this.
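
When the noise comes from a single external header, a push/pop pair keeps the suppression local instead of global; the header name below is hypothetical:

#ifdef __clang__
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wmissing-field-initializers"
#endif

#include <externallib/noisy.h> // hypothetical third-party header

#ifdef __clang__
#pragma clang diagnostic pop   // warnings are active again for our own code
#endif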


Monday, August 25, 2025

Watch out for hypes in ICT

Hypes

ICT has a rich history of hypes, each of which people thought would be a panacea for all problems. These hypes lasted for some time, like paradigms in Thomas Kuhn's theory about the evolution of science. From the top of my head, we have had the following hypes in the past:

  • relational / SQL databases (70's)
  • structural design
  • object oriented design (80's)
  • component based development (90's)
  • design patterns (1995)
  • scrum / agile (2001)
  • AI (2022)

Many of these hypes were initially promising, but not to the extent of solving all problems. They are now part of the current solution domain, and we know now that there are still problems left to tackle.

Let's see what AI will bring us in the future. For now it is at the level of coding assistance, not at the level of designing whole systems, so in that respect it cannot replace programmers yet. There are already studies putting the effect of using AI into perspective. It also still makes mistakes: from personal experience, it can introduce errors in a code base if you let it run without cross-checking.

 Scrum has brought nothing to ICT except misery. The company I work for took a major loss after embracing it. 

Saturday, August 16, 2025

Careful with AI tooling

AI tooling

Some time ago I started working with AI tooling, mostly Gemini and Copilot inside Visual Studio. The experience leaves me with mixed feelings. Gemini had some good suggestions but also failed many times. Copilot has good code completion suggestions but misses the mark as well; its function name suggestions, though, are very welcome.

On the other hand, AI tooling still makes plenty of mistakes. Some examples:

  • I asked Gemini for a camera sharpness algorithm. It came up with a good algorithm, but the actual OpenCV function calls and parameters were incorrect.
  • I asked Gemini how to get the real sample time from an 'IMediaSample'. It suggested using the non-existent 'GetSampleTime'. There is, by the way, a 'GetMediaTime' function, but that returns the stream time, i.e. the time since the graph started running, not the time from the start of the video.
  • I recently asked Gemini about conversion from UCS-2 to UTF-16 and it wrongly suggested wstring_convert. However, wstring_convert is hard-bound to std::string as its byte string type.

Even worse, AI tooling can sometimes suggest plain bugs. I was implementing a swap of width and height, and Copilot's code completion came up with the following snippet:

// NOTE: incorrect 
Size sz = ...;
if (sz.GetWidth() < sz.GetHeight())
{
   sz.SetWidth(sz.GetHeight());
   sz.SetHeight(sz.GetWidth());
}

This doesn't swap: it sets both the width and the height to the old height value.
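
A corrected version, staying with the Size interface from the snippet above; the old width has to be saved before it is overwritten:

// NOTE: correct
Size sz = ...;
if (sz.GetWidth() < sz.GetHeight())
{
   const int nOldWidth = sz.GetWidth();
   sz.SetWidth(sz.GetHeight());
   sz.SetHeight(nOldWidth);
}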

AI tooling can be helpful, but it is still not at a level where it can be trusted blindly. For now it also helps only within a limited scope, e.g. code blocks, algorithms and functions. I am not aware whether it can help with refactoring and with extending architecture-spanning solutions.

 
