Compiling boost on QNX: a tale of why modules are needed in C++

Recently I had to compile Boost Program Options 1.65.1 for QNX 6.6 (QNX is a UNIX-like real time operating system). This should have worked:

b2 toolset=qcc target-os=qnx --with-program_options link=static

but I got this error message:

In file included from ./boost/bind/bind.hpp:29:0,
from ./boost/bind.hpp:22,
from libs\program_options\src\parsers.cpp:19:
./boost/bind/arg.hpp:37:15: error: expected '>' before '(' token
./boost/bind/arg.hpp:43:83: error: '_FCbuild' cannot be used as a function

If we look at arg.hpp:37 we see this template definition:

template < int I >
struct arg
{
    template< class T > BOOST_CONSTEXPR arg( T const & /* t */, typename _arg_eq< I == is_placeholder<T>::value >::type * = 0 )
    {
    }
};

Looking at the compiler output: what is _FCbuild? It must come from somewhere… a macro perhaps? Adding the option cxxflags="-P" to bjam to produce the preprocessed output sheds some light:

template< int _FCbuild(0.0F, 1.0F) > struct arg
{
    constexpr arg()
    {
    }

    template< class T > constexpr arg( T const & , typename _arg_eq< _FCbuild(0.0F, 1.0F) == is_placeholder<T>::value >::type * = 0 )
    {
    }
};

You can see that the template argument I got converted into "_FCbuild(0.0F, 1.0F)". So, is there a macro named I somewhere?

Indeed. QNX defines in complex.h:

 #if _HAS_C9X_IMAGINARY_TYPE
 #define imaginary	_Imaginary
 #define _Imaginary_I	((float _Imaginary)_Complex_I)
 #define I	_Imaginary_I

 #else /* _HAS_C9X_IMAGINARY_TYPE */
 #define I	_Complex_I
 #endif /* _HAS_C9X_IMAGINARY_TYPE */

and in either branch, I ends up expanding (via _Complex_I) to an expression built from _FCbuild.

So simply adding
#undef I
after all the #includes in arg.hpp solves the problem.

What we see here is pollution of interfaces: the innocent code in arg.hpp gets polluted by QNX's complex.h, causing I to be replaced and producing code that makes no sense to the compiler.

This is something that Bjarne Stroustrup has been pointing out as a benefit of modules. With modules, the code in arg.hpp would look like this:

import config; // #define I came from config.hpp
template < int I >
struct arg
{
    template< class T > BOOST_CONSTEXPR arg( T const & /* t */, typename _arg_eq< I == is_placeholder<T>::value >::type * = 0 )
    {
    }
};

import config; does not "leak" macros (or, in general, preprocessing state); modules allow interface declarations to be compiled in isolation. With modules a developer can use the template argument I without fear that the code will be broken by a macro defined in an OS standard header.


On the memory layout of objects and zero-cost abstractions of C++

Some years ago, when I was writing a financial application that stored millions of vectors, I was intrigued by the overhead of the Visual Studio 2008 implementation of std::vector. Recently I discovered an (undocumented) compiler switch that explains why that happened: /d1reportSingleClassLayoutXXX, where XXX is a class name. If we compile this file main.cpp:

#include <vector>

std::vector<int> v;

like this:

cl /c /EHsc /nologo /W4 /MT main.cpp /d1reportSingleClassLayout?$vector@HV?$allocator@H@std@@
class ?$vector@HV?$allocator@H@std@@    size(48):
        | +--- (base class ?$_Vector_val@HV?$allocator@H@std@@)
        | | +--- (base class ?$_Container_base_aux_alloc_real@V?$allocator@H@std@@)
        | | | +--- (base class _Container_base_aux)
 0      | | | | _Myownedaux
        | | | +---
 8      | | | ?$allocator@V_Aux_cont@std@@ _Alaux
        | | | <alignment member> (size=7)
        | | +---
16      | | ?$allocator@H _Alval
        | | <alignment member> (size=7)
        | +---
24      | _Myfirst
32      | _Mylast
40      | _Myend

we can see that std::vector carries at least 48 bytes of overhead over the raw data (in this case I used the mangled name of std::vector when invoking cl).

Fortunately, this changed with time and in VS 2017 you get this output:

class std::vector<int,class std::allocator<int> >       size(24):
 0      | +--- (base class std::_Vector_alloc<struct std::_Vec_base_types<int,class std::allocator<int> > >)
 0      | | ?$_Compressed_pair@V?$allocator@H@std@@V?$_Vector_val@U?$_Simple_types@H@std@@@2@$00 _Mypair
        | +---

So the size was cut in half: you just need a pointer to the start of the allocation (_Myfirst), a pointer to the end of the used portion of the allocation (_Mylast) and a pointer to the end of the allocation (_Myend). The VS implementation now truly follows the zero-overhead principle: it only stores those 3 pointers = 24 bytes.

Looking at the memory layout in VS 2008 we can see that the compiler was storing a pointer to the allocator, which is unnecessary and was removed in later versions. Similarly, std::string reduced its overhead from 48 bytes in VS 2008 to 32 bytes in VS 2017.

The compiler switch has a sister, /d1reportAllClassLayout, which outputs the memory layout of all the classes in the .obj. With this kind of option it is easy to see why it is usually suggested to declare data members by decreasing alignment. E.g.

struct MyClass {
  char a;
  int* b;
  char c;
};

class MyClass   size(24):
 0      | a
        | <alignment member> (size=7)
 8      | b
16      | c
        | <alignment member> (size=7)

but the chars are 1-byte aligned and the pointer is 8-byte aligned, so ordering them by decreasing alignment as the rule says:

struct MyClass {
  int* b;
  char a;
  char c;
};
class MyClass   size(16):
 0      | a
 1      | c
        | <alignment member> (size=6)
 8      | b

saves 8 bytes. It is easy to see why the compiler needs padding. If there were no padding in the last example we would have:

|acbbbbbb|bb      *

Accessing b would require fetching two words instead of one which would not be efficient. Padding is inserted so the memory access is aligned:

|ac      |bbbbbbbb*

The compiler does not perform any reordering of the data members by itself, as the C standard says that data members shall have increasing memory addresses. C was designed for direct memory access, and that rule allows programmers to predict memory layouts and store blocks of data read from devices directly.

But reducing memory consumption may not be the main issue at stake:
– On a 64-bit x86, a cache line is 64 bytes beginning on a self-aligned address, so you may want to store the data members that are frequently accessed together on the same line
– You may also want to prevent false sharing by separating concurrently accessed data members, so threads running on multiple processors do not invalidate each other's cache copies constantly
– In embedded systems, the offsets encoded inside instructions could be very small; e.g. some 16-bit ARM Thumb load/store instructions encode the offset in only 5 bits, so you may want to keep frequently used data members at the beginning of a structure's layout

So with this post I would like people to see the importance of measuring instead of assuming zero-cost abstractions in C++. This compiler option, /d1reportSingleClassLayout, is mentioned in an MSDN blog post on how to debug ODR violations and in one of Stephan T. Lavavej's great videos about the STL.

Migrating existing C++ code to use modules

Microsoft released experimental support for modules a while ago. But it was not until the recent launch of Visual Studio 2017 that it shipped modules for the standard library, which makes trying modules out much easier. They are only installed if the component "Standard Library Modules" is explicitly checked during installation.

There are several posts about modules in the Visual C++ blog, but all of them are little more than a hello world. In this blog post I would like to provide a bigger example, closer to the real world: we will migrate code implemented without modules to C++ with modules. The code simply parses this configuration file:

  <server address="" port="8080"/>

which configures the connection details of a TCP client. It will use tinyxml2 to perform the XML parsing.

This was the design of the library (without modules):


I posted all the code on github. The legacy code (without modules) will not be discussed. This will be the design of the library using modules:


Our goal is to create a static library, clientconfig.lib that will be consumed by its users using clientconfig.ifc (a module interface file, following the Internal Program Representation binary format). Note that for users to consume the library, several variants of the .ifc file may be created, i.e. depending on the configuration (Debug or Release) or the platform (Win32 or x64) we could have for instance 4 variants. In this case the user of the library will be a simple program, client.exe, that will only read the configuration.

clientconfig.lib will be implemented using a private module, tinyxml2. clientconfig.cpp will consume that module through the tinyxml2.ifc file and it will export the declarations of the whole library in clientconfig.ifc. Because exports are not transitive (unless explicitly requested with an “export import” statement), tinyxml2 symbols will not be seen outside the clientconfig module.

Before we start delving into the code, some important advice (that applies as of VS 2017 update 2). There are some issues when modules are used from the IDE, so do not use the IDE at all when testing modules. Also, use Release mode or you will not be able to link executables. All the commands used below are supposed to be executed in a VS 2017 native x64 console. The github repo contains nmake makefiles with the commands for reference.

The first step was to port tinyxml2 to use modules. I started from tinyxml2 5.0.1: I cloned the repository and checked out the tag 5.0.1. The things I changed:

  • Starting from tinyxml2.h, renaming it to tinyxml2_desc.ixx and specifying the symbols to export; we will export the whole tinyxml2 namespace:

module tinyxml2;

export namespace tinyxml2
{
// ... declarations
}

  • Replacing #includes of C++ headers with imports. E.g. cctype, cstdlib and cstring were simply replaced by import std.core. Note that some includes, like stdio.h, stay: even though their contents are in std.core, modules do not provide some macros needed to use the C Standard Library (e.g. SEEK_SET, INT_MAX…)
  • Getting rid of macros. Modules do not export macros. The TIXMLASSERT macro defined originally in tinyxml2.h is not exported by its counterpart tinyxml2.ifc so I replaced all of these macro invocations in tinyxml2.cpp with if ( !((void)0,(x))) { __debugbreak(); }. tinyxml2_desc.ixx does not need an include guard either.
  • Removing the omission of values for arguments with a default value. EDIT: I considered this a Visual Studio bug, but according to the C++ Modules TS issues list, this seems to be the way functions with default arguments are exported.
  • Making protected and private sections public, because otherwise trying to access them causes an undefined-symbol error. EDIT: I considered this a bug at the beginning, but after watching C++ Modules: The State of The Union it seems intentional.
  • Some changes unrelated with modules: adding #define _CRT_NO_VA_START_VALIDATION to tinyxml2.cpp and also:

#define TIXML_VSNPRINTF _vsnprintf
#define TIXML_SNPRINTF _snprintf

as VS 2017 comes with an implementation of vsnprintf and it is no longer necessary to define our own.

Please follow the makefile to see how to compile the library with the compiler switches needed to support modules.
It is worth noting that, after generating an .ifc file, for instance with the command:

cl /c /EHsc /experimental:module /MD /std:c++latest tinyxml2_desc.ixx

not only an .ifc file is created but also an .obj file containing the symbol definitions needed to create the library.

The main interface description file of the clientconfig library is as simple as:

module clientconfig;
import std.core;

export void readConfigFile(std::string& address, std::string& port);

And the code to implement it:

import clientconfig;
import tinyxml2;
import std.core;

using namespace std;

void readConfigFile(string& address, string& port)
{
    tinyxml2::XMLDocument doc;
    doc.LoadFile("config.xml"); // file name illustrative

    tinyxml2::XMLElement* configuration = doc.FirstChildElement("configuration");
    tinyxml2::XMLElement* node = configuration->FirstChildElement("server");

    address = node->Attribute("address", 0);
    port = node->Attribute("port", 0);
}

Error-checking code was omitted as it is not very relevant to the discussion. Also, this code is subject to change: the gcc implementation of modules suggests that the technical specification may change so that a file can be marked as the implementation translation unit of a module programmatically, but for now this works on Visual C++.

The main.cpp will consume the clientconfig module:

import std.core;
import clientconfig;

using namespace std;

int main()
{
    std::string address, port;
    readConfigFile(address, port);

    cout << "Address: " << address << "\n";
    cout << "Port: " << port;

    return 0;
}

This completes our program. I will be excited to see how the Microsoft implementation of modules scales when I apply it to large projects, as the build throughput should improve dramatically.

pct 1.0 now supports Visual Studio project files and is multithreaded

The initial version of pct (0.1.0) still required a lot of configuration to auto-generate precompiled headers for large codebases, and it was not as fast as some users would have liked (#4). Version 1.0 addresses both shortcomings.

pct can now parse Visual Studio project files to extract information like the include directories to search for headers or the macro values to use, so the new options --sln and --vcxproj save you from specifying a long command line. Users of qmake or cmake just need to generate Visual Studio project files from their build files to be able to use the tool in the same way (see Cross-platform development with C++).

Performance also greatly improved, by processing every .vcxproj in parallel with std::async(). It is not uncommon to have big codebases in C++, so this change was useful and easy to implement.

I recently realized that environment variables referenced in the Visual Studio project files were not being expanded. qmake expanded them for me before generating the Visual Studio project files, so I did not notice this until now; it was fixed in 8d43ad1. I also added an option --excluderegexp, which is useful for qmake users because it can be used to ignore moc_* files (which do not need to be parsed by pct, because all their headers are already referenced in the original moc'ed header). E.g. --excluderegexp "moc_.*".



Introducing pct, a tool to help reducing C/C++ compilation times

Until C++ modules become widely available (Microsoft released experimental support for the import statement last December), we still need to resort to precompiled headers as one way to reduce compilation times on Windows.

Today I released the first version of a tool that auto-generates precompiled headers (usually named stdafx.h on Windows). Auto-generating stdafx.h is not as simple as it may seem. One may think it is enough to grep for standard headers and include all those lines in the project's stdafx.h, but that does not take into account that some of those lines may be disabled depending on macro values, for instance. The tool uses the Boost Wave preprocessor to preprocess the source code of a project and generates a header to be precompiled, including all the standard or third-party library headers referenced in the code.

Using the tool I have been able to reduce compilation times on one of my Visual Studio projects by a factor of six. The source code and the binaries are at:

Cross-platform development with C++

I have been developing cross-platform C++ applications for some years now. In this post I would like to share my experience with several aspects of C++ cross-platform development, mainly focusing on Windows and Linux and using Qt.

Build tools

Keeping your build project files (makefiles on Unix, Visual Studio project files) in sync manually is a hassle and an error-prone task. To help with that I have tried several alternatives: cmake, qmake and bjam. I found that the easiest to use by far is qmake, part of the Qt toolkit.


qmake is my first choice when it comes to cross-platform development. It is not as powerful as cmake but its language is much simpler and concise. Imagine we would like to compile the helloworld program of my previous post. This qmake file will suffice as build script:
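A minimal version, assuming the source file is named helloworld.cpp, could be:

```
TEMPLATE = app
CONFIG += console
SOURCES += helloworld.cpp
```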


The template app tells qmake we would like to build an application. By default, on Windows this means creating an .exe file that runs on a console.

If we would like to use an IDE to build the project, we can just open the .pro file with Qt Creator; or, if we are using Visual Studio, we can generate the solution files from the command line.

Because Visual Studio has the concept of a "solution", a root file that points to different projects, we need to create an equivalent of the .sln file: in this case, a .pro file that points to the other .pro files. I created this file:

TEMPLATE = subdirs

helloworld.subdir = helloworld
SUBDIRS += helloworld

which looks for a .pro file called helloworld in the helloworld folder. Then I was able to generate the VS project files with:

%QTDIR%\bin\qmake -r -spec win32-msvc2013 -tp vc Q

The spec specifies the platform/compiler pair to use. In my case, because the spec was win32-msvc2013 and I was using an x64 native tools command prompt, this generated a 64-bit Visual Studio 2013 solution that could be opened with the file helloworld_global.sln.

Many of the features of qmake are very straightforward, including adding precompiled headers portably.

I do not recommend using the Visual Studio Qt wizards to manage the VS project files. I find it much more practical to use the .pro file as a single source of truth and then generate the VS project files from the command line.


cmake is probably the most popular and powerful toolkit for building cross-platform C++ code.

Its syntax is a bit cumbersome: for instance, it does not have compound statements, so sometimes you have to explicitly mark which statement is being ended (i.e. with nested ifs you may need to specify which of the opened ifs you are actually closing).

cmake has a GUI which is very handy for inspecting and changing the values of the cmake variables in a CMake build directory.


bjam is a nifty tool. It is a good idea to use it if you are building Boost.Python extensions, as it will take care that all the compiler options needed to interface C++ with Python are actually used. Unfortunately, it is not well documented (when I used it, I had to look in the Boost code to find out which options were available).

Processes and threads

C++11 added language support for multithreading to C++. If you are using a pre-C++11 compiler and still want portable threading, you can use boost (whose primitives are very similar to C++11's) or Qt.

Unfortunately, there is still no way in the C++ language to create processes, as process handling is very different between systems. Nevertheless, Qt includes QProcess, which allows you to manage processes in a portable way.

Portable GUIs

I developed a GUI in C++ using GTK around 10 years ago. I used glade to generate the code, and I really liked it at the time. The callback code needed a lot of macros to emulate a type hierarchy (GTK is C-based), but I thought it was mostly clean and neat anyway.

Unfortunately, as of 2016, GTK does not seem to be so well supported outside of Linux, especially compared with Qt. Many projects that used GTK have moved to Qt.

Qt is C++-based and has many features that are interesting for GUI development. For instance, it has internationalization support: its widgets take into account that some languages are written right to left, some have several accents, some have ligatures… It also allows you to create an XML repository of translations that you can query at runtime depending on the locale. E.g. this element in qt_es.ts (included in Qt itself):

<message>
    <location line="+1"/>
    <source>File</source>
    <translation>Fichero</translation>
</message>

will instruct Qt to translate "File" to "Fichero" when the Spanish/Spain locale is being used. E.g. the code:

QMenu *menu = new QMenu(tr("File"), menuBar);


will create a menu with the name "Fichero". tr() should be used for any string literal that could be displayed to the user, so the literal is translated before being displayed.

Qt also has Android, iPhone and Windows Phone support. For mobile development, Qt provides QML, a markup language which provides a good way to specify small user interfaces declaratively.

Compiler differences

Compilers implemented the C++11 standard at different speeds, and for some time this made the newest features in the standard not very portable. Fortunately, as of 2016, all the major compilers support C++11 almost in full.

There are still differences, though: the Visual Studio preprocessor is not standards-compliant and fails when passing variadic data as a whole directly into a non-variadic macro.

Nevertheless, Microsoft has promised to integrate the clang front-end on Windows as well, so in the future developing cross-platform C++ applications with Visual Studio should be completely hassle-free.

Subversion and end-of-line

Subversion is still probably the most popular centralized version control system. One problem it has in cross-platform development is that, by default, it does not pay attention to end-of-line markers. This means that, for example, if we commit a file on Linux and then check it out on a Windows machine, the editor may display the whole file on a single line, as it will not have the carriage returns that Windows programs expect.

The best solution to the problem is to use eol-style=native, which can be specified in Subversion's config file:

[miscellany]
enable-auto-props = yes

[auto-props]
*.cpp = svn:eol-style=native

This instructs svn to check out *.cpp files differently depending on the platform: LF on UNIX and CRLF on Windows, the native end-of-line character on each one. We would need to do the same for every file extension that could cause problems.



Experimenting with modules in C++

One of the most awaited features of C++ is modules. One of the problems modules address is that exactly the same header files are compiled again and again during C++ development (and some of them, like <iostream>, are extremely big). Caching those compilations speeds up compilation dramatically.

Modules, in the LLVM implementation, are a generalization of precompiled headers. Precompiled headers are implemented by all the major C++ compiler vendors and they are simply a way of storing the compiler state to disk, so if a .cpp file has an associated precompiled header, the compiler can use this stored state to “fast forward” its state to the one it had after compiling the precompiled header, and skip the compilation of some headers. Because they are just storing the whole compiler state, only one precompiled header can be used per .cpp file.

Although precompiled headers could reduce compilation times dramatically, especially on Windows (where I experienced one order of magnitude improvement on one of my projects when I enabled them), they usually required a lot of manual work and they increase the chances of name collisions as they are usually shared by many files.

The latest versions of clang feature modules, a much better solution for reducing build times. Instead of the whole compiler state, they store the Abstract Syntax Trees resulting from parsing a header, which can be added incrementally to the current compiler state. Thus, multiple modules can be loaded when compiling a translation unit, and a module's header is parsed once per language configuration rather than every time a dependent translation unit is recompiled.

Even if we do not use modules in our code, a module-enabled compiler like clang will still be faster than a traditional compiler, because it will still make use of one module, std, skipping the textual inclusion of standard headers and using their binary representation instead.

You will need the LLVM suite, the clang compiler, the libc++ standard library and the libcxxabi to be able to experiment with modules. Because I wanted to experiment with the latest version, I checked out the trunk of all those projects, but the binary bundles of LLVM 3.7 have modules as well, so you may use them instead. I compiled the whole thing out of the source tree with:

cmake -DLLVM_PATH=../llvm -DLIBCXX_CXX_ABI=libcxxabi -DLIBCXX_CXX_ABI_INCLUDE_PATHS=../llvm/projects/libcxxabi/include -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm/projects/libcxx


(Note: for some reason a parallel build of llvm with make -j crashed gcc on my machine, so I executed my build serially; but that is probably something wrong with my version of gcc, 4.8.4.)

Then I tried this simple example, which just makes use of the std module:

#include <iostream>
using namespace std;

int main()
{
    cout << "Hello, world!!!";
    return 0;
}
gerardo@GOLIATH:~/llvm_tests$ time clang++ -fmodules -fcxx-modules -fmodules-cache-path=./cache_path -stdlib=libc++ -lc++abi -L/home/gerardo/llvm.libcxxabi.release/lib helloworld.cpp

real    0m2.213s
user    0m2.148s
sys    0m0.064s

This is the first time I created the cache, which entails compiling the whole standard library and explains the long compilation time. If we go to ./cache_path:

total 4
drwxrwx--- 2 gerardo gerardo 4096 Sep 13 16:37 D9DWX79PA5LV
-rw-rw-r-- 1 gerardo gerardo    0 Sep 13 16:37 modules.timestamp

total 10144
-rw-rw-r-- 1 gerardo gerardo   225476 Sep 13 16:37 modules.idx
-rw-rw-r-- 1 gerardo gerardo 10154840 Sep 13 16:37 std-2WRNL6O46F2FW.pcm

we can see that a module std was compiled in the file std-2WRNL6O46F2FW.pcm.

(Note: on my machine I got this error:

/usr/local/bin/../include/c++/v1/future:515:23: error: a non-type template parameter cannot have type 'std::__1::future_errc'

when clang was generating the std module. I guess this is just a bug in the trunk that will be fixed soon. I disabled the future module in /usr/local/include/c++/v1/module.modulemap and the generation of the std module worked.)

Subsequent compilations are much faster, thanks to the module cache:

gerardo@GOLIATH:~/llvm_tests$ time clang++ -fmodules -fcxx-modules -fmodules-cache-path=./cache_path -stdlib=libc++ -lc++abi -L/home/gerardo/llvm.libcxxabi.release/lib helloworld.cpp

real    0m0.093s
user    0m0.080s
sys    0m0.014s

If we want to create our own modules, we have to use the module map language. For example, this module.modulemap file creates a module that exposes the C++ header "myheader.h":

module test {
requires cplusplus
header "myheader.h"