
Reduce C++ Build Times (Part 2) with the Pimpl Idiom

What is the Pimpl Idiom?

Pimpl stands for Pointer to IMPLementation. The idiom is also known as an opaque pointer, handle class, compiler firewall, d-pointer, or Cheshire Cat. It is useful because it minimizes coupling and separates the interface from the implementation: the implementation details of an interface are hidden from its clients. It is also important for keeping binary compatibility across different versions of a shared library. Finally, the Pimpl idiom simplifies the published interface, since the details can be hidden in another file.

What are the benefits of Pimpl?

Generally, whenever a header file changes, every file that includes it needs to be recompiled. This is true even if the changes only affect private members of a class that, by design, users of the class cannot access. The reason is the C++ build model: C++ assumes that callers know two main things about a class, private members included.

  1. Size and Layout: The code that calls the class must know the size and layout of the class, including its private data members. Seeing the implementation in this way couples callers and callees more tightly, but it is central to the C++ object model, because direct access to objects by default is what lets C++ generate heavily optimized code.
  2. Functions: The code that calls the class must be able to resolve calls to its member functions. This includes private functions, which are normally inaccessible but still overload the non-private functions. If a private function is the better match for a call, the calling code will fail to compile.

With the Pimpl idiom, you remove the compilation dependencies on the internal (private) parts of a class. This has two benefits. First, the system builds faster, because Pimpl can eliminate extra includes from the header. Second, it localizes the build impact of code changes, because the implementation (the parts inside the Pimpl) can be changed without recompiling the client code. [1]
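To make the "size and layout" point concrete, here is a minimal single-file sketch. The names DirectCow, PimplCow, and size_demo are hypothetical and only serve this illustration. It shows that a class with direct data members exposes its full size to callers, while the Pimpl version exposes only a small handle. The exact sizes depend on the compiler and standard library, so the check only compares them.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical class that keeps its data members directly in the header.
// Every caller has to know this layout, so changing it means recompiling callers.
class DirectCow
{
public:
	double get_weight() const { return weight; }
private:
	std::string name;
	std::string color;
	double weight = 0.0;
};

// Hypothetical Pimpl version: callers only see a pointer to an incomplete type.
class PimplCow
{
public:
	PimplCow();
	~PimplCow();
private:
	class Impl;                     // declared here, defined "in the .cpp" below
	std::unique_ptr<Impl> pimpl;
};

// ---- what would normally live in the .cpp file ----
class PimplCow::Impl
{
public:
	std::string name;
	std::string color;
	double weight = 0.0;
};

PimplCow::PimplCow() : pimpl{ std::make_unique<Impl>() } {}
PimplCow::~PimplCow() = default;

// On common implementations the handle is much smaller than the two strings
// plus the double; this is an observation, not a guarantee of the standard.
void size_demo()
{
	PimplCow cow;
	assert(sizeof(cow) < sizeof(DirectCow));
}
```

Adding another data member to PimplCow::Impl changes nothing that callers of PimplCow can see, which is why they do not need to be recompiled.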

Example: How to implement the Pimpl Idiom

In this section, I am going to use a simple example of a Cow class to show how you can update your code to include the Pimpl idiom. Here is a simple Cow class that has private data members:

#include <string>

class Cow
{
public:
	Cow();
	~Cow();

	Cow(Cow&&);
	Cow& operator=(Cow&&);
private:
	std::string name;
	std::string color;
	double weight;
};

To implement the Pimpl idiom, I will make two changes in the header file:

  1. Move all private members into a separate implementation class (or struct), which the header only declares
  2. In the class definition, declare a (smart) pointer to that implementation class as the only private member variable

#include <memory>

class Cow
{
public:
	Cow();
	~Cow();

	Cow(Cow&&);
	Cow& operator=(Cow&&);
private:
	class cowIMPL;                                     
	std::unique_ptr<cowIMPL> pimpl; 
};

In the source (.cpp) file:

  1. Put the definition of the implementation class (cowIMPL) in the .cpp file
  2. The constructors of Cow need to create the cowIMPL object
  3. The destructor of Cow is defaulted in the .cpp file, so that it can see the complete definition of cowIMPL
  4. The move constructor and move assignment operator are defaulted here for the same reason. If the class needs copy operations, they have to copy the cowIMPL object explicitly, because std::unique_ptr cannot be copied

#include "cow2.h"
#include <string>

class Cow::cowIMPL                
{
public:
	void do_setup()
	{
		name = "Betsy";
		color = "White";
		weight = 275;
	}
private:
	std::string name;
	std::string color;
	double weight;
};

Cow::Cow() : pimpl{ std::make_unique<cowIMPL>() } 
{
	pimpl->do_setup();
}

Cow::~Cow() = default;

Cow::Cow(Cow&&) = default;
Cow& Cow::operator=(Cow&&) = default;
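The Cow class above has no public member functions beyond its special members, so here is a hedged sketch of how a public function would forward to the implementation and how client code would use the class. The accessor name() and the function client_demo() are assumptions added for illustration; they are not part of the original example. Everything is shown in one file for brevity, with comments marking where each part would normally live.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- cow2.h (sketch) ----
class Cow
{
public:
	Cow();
	~Cow();

	Cow(Cow&&);
	Cow& operator=(Cow&&);

	const std::string& name() const;   // hypothetical accessor, not in the original
private:
	class cowIMPL;
	std::unique_ptr<cowIMPL> pimpl;
};

// ---- cow2.cpp (sketch) ----
class Cow::cowIMPL
{
public:
	std::string name = "Betsy";
	std::string color = "White";
	double weight = 275;
};

Cow::Cow() : pimpl{ std::make_unique<cowIMPL>() } {}
Cow::~Cow() = default;
Cow::Cow(Cow&&) = default;
Cow& Cow::operator=(Cow&&) = default;

// Public functions simply forward to the hidden implementation object.
const std::string& Cow::name() const { return pimpl->name; }

// ---- client code (only needs cow2.h) ----
void client_demo()
{
	Cow first;
	Cow second{ std::move(first) };    // ownership of the cowIMPL moves to second
	assert(second.name() == "Betsy");
	// first now holds a null pimpl and must not be used until it is reassigned
}
```

Note that the client code never sees name, color, or weight as members, so adding or reordering them in cowIMPL only requires recompiling cow2.cpp.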

Summary:

The Pimpl idiom is a great way to minimize coupling and break compile-time dependencies, which leads to faster build times. If you are looking for other ways to reduce compile times, read our blog post on header dependencies. If you are looking to reduce dependencies in general or would like to visualize your source code architecture, check out Lattix Architect.

1. Herb Sutter, GotW #100: Compilation Firewalls

Reduce C++ Build Times by Reducing Header Dependencies

Slow build times are a common problem in C++. The build speed is based on language complexity and code organization. While you may not be able to change C++ language complexity, you can improve code organization. As Herb Sutter said, “Managing dependencies well is an essential part of writing solid code.”

The more modular and less interdependent your code is, the less often you will have to recompile everything. It also reduces the amount of work the compiler has to do on any individual file, because the compiler has less to keep track of in memory.

Today we will talk about header dependencies and their effect on build times.

Direct includes not needed

One of the main problems affecting the speed of C++ build times is the unnecessary inclusion of header files. Header inclusion should be done only when needed. For example, if you are using only classes X and Y, then you only need to include x.h and y.h. Unfortunately, many programmers habitually include many more headers than necessary, like <iostream> or windows.h. This can seriously degrade build time.

The Chromium project’s C++ Dos and Don’ts guide recommends not including unneeded headers. It points out that after refactoring a file, symbols from a given header are often no longer used, which means the include of that header can be removed. With that in mind, when you are refactoring it is a good idea to track redundant includes, either manually or using an external tool like Lattix Architect.
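As one small illustration of including less, a header that only mentions stream types in function declarations can include the standard header <iosfwd> instead of <iostream>. The sketch below assumes a hypothetical Point type split across a point.h and point.cpp; the names Point, to_text, and include_demo are made up for this example, and the file boundaries are marked with comments.

```cpp
#include <cassert>
#include <iosfwd>    // point.h only needs this: it declares std::ostream without defining it
#include <sstream>   // point.cpp is where the full stream definitions are needed
#include <string>

// ---- point.h (hypothetical header) ----
struct Point
{
    int x;
    int y;
};

// A reference to std::ostream is enough here, so the declaration from <iosfwd> suffices.
std::ostream& operator<<(std::ostream& os, const Point& p);

// ---- point.cpp (hypothetical source file) ----
std::ostream& operator<<(std::ostream& os, const Point& p)
{
    return os << '(' << p.x << ", " << p.y << ')';
}

// Formats a Point through the operator above.
std::string to_text(const Point& p)
{
    std::ostringstream out;
    out << p;
    return out.str();
}

void include_demo()
{
    assert(to_text(Point{ 1, 2 }) == "(1, 2)");
}
```

Files that include point.h but never print a Point then avoid pulling in the full stream headers.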

Indirectly included files

Another way to reduce header dependencies, and therefore build times, is to avoid including headers inside other header files. In C++, you get the declaration of a function or class by including its header, and that #include can appear in either a .cpp file or another header. When it appears in a header, every file that includes the outer header also pays for the inner one, whether it needs it or not, which can slow down compilation.

The solution is a forward declaration. A forward declaration of a function or class simply introduces a name. According to Wikipedia, a forward declaration is a declaration of an identifier for which the programmer has not yet given a complete definition. It can be used in situations where you need to know that the name of a class is a type, but not its structure. In C++, a class can be forward-declared if you only use pointers or references to it, since the compiler knows the size of a pointer or reference, and the operations it supports, without seeing the class definition.

This is useful inside a class definition when the class contains a member that is a pointer (or reference) to another class. If, on the other hand, you need to create an object of that class in the header file, you can’t use a forward declaration, because it tells the compiler nothing about how big the object is, or about its member functions, constructors, or destructor. By avoiding unnecessary coupling, forward declarations reduce build times in two ways:

  1. By reducing the number of files that the compiler has to open and process
  2. By saving on unnecessary recompilation. If you include the header, you will be forced to recompile your code whenever that header changes, even if the change is unrelated to your code.
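The rules above can be sketched in a few lines. The classes Engine and Car, and the function forward_declaration_demo, are hypothetical names chosen for this illustration. The sketch shows what a forward declaration allows (pointer members, function declarations) and what it does not (members held by value, calls into the class).

```cpp
#include <cassert>

class Engine;                       // forward declaration: says only that Engine is a type

// ---- car.h (hypothetical header) ----
class Car
{
public:
    explicit Car(Engine* e) : engine(e) {}
    int horsepower() const;         // can be declared here, defined once Engine is complete
private:
    Engine* engine;                 // OK: a pointer to an incomplete type
    // Engine engine_by_value;      // would not compile: the size of Engine is unknown here
};

// ---- engine.h (hypothetical header, only included by the .cpp files that need it) ----
class Engine
{
public:
    explicit Engine(int hp) : hp_(hp) {}
    int horsepower() const { return hp_; }
private:
    int hp_;
};

// ---- car.cpp (hypothetical source file, which includes engine.h) ----
int Car::horsepower() const
{
    return engine->horsepower();    // calling a member requires the complete type
}

void forward_declaration_demo()
{
    Engine v8{ 300 };
    Car car{ &v8 };
    assert(car.horsepower() == 300);
}
```

Only car.cpp depends on the definition of Engine, so a change to engine.h does not force files that merely include car.h to be recompiled.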

It turns out that in my last refactoring blog tip, I had an indirectly included file (a header file included in another header file):

[Image: Lattix Architect find issues]

As you can see I included shareprice.h in stock30.h:

[Image: Lattix Architect dependency usage]

The next section will show you how I fixed this issue.

How to fix indirectly included files

Here is my original stock30.h file:

#include <string>
#include "shareprice.h"

class Stock
{
private:
    std::string company;
    int shares;
    SharePrice share_val;
public:
    Stock();                  // default constructor
    Stock(const std::string & co, long n = 0, double pr = 0.0);
    void buy(long num, double price);
    void sell(long num, double price);
    void update(double price);
};

In stock30.h, I included “shareprice.h”. One way to fix this issue is to forward-declare the SharePrice class. I do this by:

  1. Removing #include “shareprice.h”
  2. Adding the forward declaration class SharePrice;
  3. Changing SharePrice share_val to a pointer to SharePrice

You can see the updated code below:

#include <string>

class SharePrice;          // Forward declaration

class Stock
{
private:
    std::string company;
    int shares;
    SharePrice* share_val;       // changed to a pointer to SharePrice
public:
    Stock();                   // default constructor
    Stock(const std::string & co, long n = 0, double pr = 0.0);
    void buy(long num, double price);
    void sell(long num, double price);
    void update(double price);
};

You will also need to update the stock30.cpp file to reflect the change in the variable share_val, but I will leave that as an exercise for the reader.
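For readers who want a starting point, here is one possible sketch of the matching changes, not the original stock30.cpp. It makes several assumptions that are labeled in the comments: the SharePrice class is a stand-in (the real one is not shown in this post), the member is an owning std::unique_ptr rather than a raw pointer, and total_value() and stock_demo() are hypothetical helpers added so the example can be checked.

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- stock30.h (sketch) ----
class SharePrice;                   // forward declaration

class Stock
{
public:
    Stock(const std::string& co, long n = 0, double pr = 0.0);
    ~Stock();                       // defined in the .cpp, where SharePrice is complete
    void update(double price);
    double total_value() const;     // hypothetical helper, not in the original
private:
    std::string company;
    long shares;                    // long here to match the constructor parameter
    std::unique_ptr<SharePrice> share_val;   // assumption: owning smart pointer
};

// ---- shareprice.h (hypothetical stand-in for the real class) ----
class SharePrice
{
public:
    explicit SharePrice(double v) : value(v) {}
    double value;
};

// ---- stock30.cpp (sketch; this is the file that includes shareprice.h) ----
Stock::Stock(const std::string& co, long n, double pr)
    : company(co), shares(n), share_val(std::make_unique<SharePrice>(pr))
{
}

Stock::~Stock() = default;

void Stock::update(double price)
{
    share_val->value = price;       // member access now goes through the pointer
}

double Stock::total_value() const
{
    return shares * share_val->value;
}

void stock_demo()
{
    Stock s{ "NanoSmart", 10, 2.5 };
    s.update(3.0);
    assert(s.total_value() == 30.0);
}
```

If you keep the raw SharePrice* from the header above instead, the .cpp file also has to decide who allocates and deletes the object, which is why this sketch uses std::unique_ptr.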

Summary

Build times are a constant problem for larger C++ programs. But by thinking carefully about a C++ project’s design (especially for large projects, consisting of multiple modules), you can modify it so the compiler can produce output efficiently. This can be done manually or with an architectural analysis tool like Lattix Architect.

A new way to think about software design

This year’s Saturn Conference in San Diego reflected an evolving landscape as macro trends such as cloud-based architectures, the Internet of Things (IoT), and DevOps in an Agile world continue to reshape the industry. How do we think about design and architecture in this changing landscape?

Professor Daniel Jackson of MIT, in a keynote at the Saturn Conference, gave us a fresh look at how to think about design. The idea is simple and elegant, and one wonders why it took so long for somebody to come up with it. Simply put, Professor Jackson describes an application as a collection of coherent concepts that fulfill the purposes of the application. The beauty of this formulation is that it eliminates the clutter of implementation artifacts.


When we describe the design of a program in UML, we struggle to create structural and behavioral diagrams that accurately reflect program implementation. Sadly (and, perhaps, mercifully) we rarely succeed in this endeavor, and even if we did, those diagrams would likely be just as hard to understand as the code (think of creating interaction diagrams to represent various method call chains). And if our implementation language happens to be a non-object-oriented language, then we are plain out of luck. On the other hand, this new kind of thinking has the potential to transcend implementation language and, perhaps, even technology. It also has ramifications for the architect vs developer debates that rage in the world of software engineering today.

Conceptual Design vs Representational Design: Reducing the clutter

Professor Jackson provided several examples of applications and the concepts they embody. For instance, an email application embodies concepts such as Email Address, Message and Folder, while a word processor embodies concepts such as Paragraph, Format and Style. A considerable part of the presentation delved into details that illustrated the sophistication underlying these concepts and the confusion that arises when they are poorly defined.

So, how do we select concepts? Professor Jackson defines purposes that a concept fulfills. In a clean design, he said, a concept fulfills a single purpose. This has ramifications that I have yet to fully get my head around. It reminds me of the Single Responsibility Principle, which is also a difficult concept to understand. In any case, I suspect that defining a coherent set of concepts is difficult and takes repeated iterations of implementations to get right. In fact, the user of the software is likely to be a critical part of the process as concepts are pruned, split up or even eliminated to make them coherent and understandable.

And how do we implement concepts? Does a concept map to a single class or to multiple classes if implemented in an object-oriented language? I will eagerly wait to see further work on this approach.

Go look up the slides of this thought-provoking presentation: Rethinking Software Design.