April 2017

Reduce C++ Build Times (Part 2) with the Pimpl Idiom

What is the Pimpl Idiom?

Pimpl means Pointer to the IMPLementation. The Pimpl idiom is also referred to as an opaque pointer, handle class, compiler firewall idiom, d-pointer, or Cheshire cat. The idiom is useful because it minimizes coupling and separates the interface from the implementation: the implementation details of a class are hidden from its clients. It is also important for keeping binary compatibility between different versions of a shared library. Finally, the Pimpl idiom simplifies the interface in the header, since the details are hidden in another file.

What are the benefits of Pimpl?

Generally, whenever a header file changes, any file that includes that header needs to be recompiled. This is true even if the changes only apply to private members of a class that, by design, the users of the class cannot access. This is because of the C++ build model, and because C++ assumes that callers know two main things about a class, including its private members:

  1. Size and Layout: The calling code must be told the size and layout of the class, including its private data members. Having to see the implementation couples callers and callees more tightly, but it is central to the C++ object model: direct access to objects by default is part of how C++ achieves heavily optimized code.
  2. Functions: The calling code must be able to resolve calls to member functions of the class. This includes private functions, which are inaccessible to callers but still take part in overload resolution alongside the non-private functions. If a private function is the better match, the call will fail to compile.

With the Pimpl idiom, you remove the compilation dependencies on the internal (private) parts of a class. The big advantage is that it breaks compile-time dependencies. The system builds faster because the header no longer needs the includes that only the private members required. It also localizes the build impact of code changes, because the implementation (the parts inside the Pimpl) can be changed without recompiling the client code. [1]

Example: How to implement Pimpl Idiom

In this section, I am going to use a simple example of a Cow class to show how you can update your code to include the Pimpl idiom. Here is a simple Cow class that has private data members:

#include <string>

class Cow
{
public:
	Cow();
	~Cow();

	Cow(Cow&&);
	Cow& operator=(Cow&&);
private:
	std::string name;
	std::string color;
	double weight;
};

To implement the Pimpl idiom, I will:

  1. Move all private members into a separate implementation class (or struct), which the header only declares
  2. In the class definition, declare a (smart) pointer to that class (or struct) as the only private member variable

#include <memory>

class Cow
{
public:
	Cow();
	~Cow();

	Cow(Cow&&);
	Cow& operator=(Cow&&);
private:
	class cowIMPL;                                     
	std::unique_ptr<cowIMPL> pimpl; 
};

In the source (.cpp) file:

  1. Put the definition of the implementation class (cowIMPL) in the .cpp file
  2. The constructors of the outer class need to create the implementation object
  3. The destructor is declared in the header and defaulted in the .cpp file, where it can see the complete definition of cowIMPL
  4. The move constructor and move assignment operator are likewise defaulted in the .cpp file. If the class needed to be copyable, the copy operations would have to copy the implementation object explicitly

#include "cow2.h"
#include <string>

class Cow::cowIMPL                
{
public:
	void do_setup()
	{
		name = "Betsy";
		color = "White";
		weight = 275;
	}
private:
	std::string name;
	std::string color;
	double weight;
};

Cow::Cow() : pimpl{ std::make_unique<cowIMPL>() } 
{
	pimpl->do_setup();
}

Cow::~Cow() = default;

Cow::Cow(Cow&&) = default;
Cow& Cow::operator=(Cow&&) = default;

Summary:

The Pimpl idiom is a great way to minimize coupling and break compile-time dependencies, which leads to faster build times. If you are looking for other ways to reduce compile times, read our blog post on header dependencies. If you are looking to reduce dependencies in general or would like to visualize your source code architecture, check out Lattix Architect.

1. Herb Sutter GotW#100

What is Architectural Refactoring?

Most people are familiar with the term refactoring, but refactoring is not constrained to just code. It is applicable to software architecture as well. Architectural refactoring is improving the design of an existing software application. Architectural refactoring changes the structure but not the functionality.

The problem is that, over time, software architecture erodes as it evolves (see our blog post on architectural erosion). The original design may be lost, or may never have been intended to address new requirements. Architectural refactoring, like redesigning a city, needs to be done in pieces and be part of a controlled, iterative development process. This refactoring resolves architectural smells by, for example, breaking dependency cycles, splitting subsystems, and enforcing strict layering.

The goal is to improve a set of quality attributes such as performance, scalability, extensibility, testability, and robustness. As it is more involved than code refactoring, architectural refactoring is more than just a technical task - it’s about justifying your design, creating a case, and presenting your ideas in a well thought-out plan to your business.

Architectural Refactoring vs Code Refactoring

Architectural refactoring typically involves code refactoring as well, which leads to confusion between the two. As Martin Fowler says, code refactoring restructures code to make it more maintainable without changing its observable behavior. For code refactoring, you focus on software entities like packages, classes and methods.

Code refactoring is a bottom-up activity that preserves structure. Architectural refactoring is a top-down activity that improves structure. Architectural refactoring pertains to components, connectors, subsystems, and interfaces. It is a deliberate process to remove architectural smells without changing the scope or functionality of the code. When you refactor the architecture, you can revisit architectural decisions, now with more domain knowledge, and seek better alternatives.

How to Perform Architectural Refactoring

“Architecture is the stuff that’s hard to change” - Martin Fowler

Architectural refactoring should be part of an iterative development process. The goal is for incremental improvements in each development cycle or sprint. Here are steps on how to perform this refactoring:

  1. Understand and visualize the current architecture. A Dependency Structure Matrix (DSM) is a great way to visualize the current architecture (watch our How to Read a Dependency Structure Matrix video). This quickly gives you an understanding of the components in your software system and how they are connected to each other. With this knowledge, you might find that your systems or components are too tightly coupled or too dependent on each other.
  2. Determine what architectural changes are needed. For example, if you have a monolithic application, maybe splitting it into multiple units or services is the answer (microservices).
  3. Develop a high-level refactoring plan. Create a worklist of items that need to be fixed. If you are using a tool like Lattix Architect, you can set up rules to make sure future builds do not break your intended architecture (watch our How to Specify Architectural Rules video).
  4. Monitor the architecture as you refactor. Architectural refactoring should be done incrementally. The architecture needs to be monitored to make sure it is trending towards a clean state and new architectural violations are not being introduced.

Summary

Architectural refactoring is more than a technical task. It is about creating a proper case and selling your ideas. Always remember to identify the business value associated with the refactoring effort to justify the cost. This can be increased performance, reduced downtime, faster delivery and time to market, and improved quality. Finally, software architecture should be continuously monitored because as the software evolves the architecture decays. The longer the architecture is left unchecked the more expensive the architectural refactoring. A solution like Lattix Architect, which is designed for architectural refactoring, can simplify the process.

Reduce C++ Build Times by Reducing Header Dependencies

Slow build times are a common problem in C++. The build speed is based on language complexity and code organization. While you may not be able to change C++ language complexity, you can improve code organization. As Herb Sutter said, “Managing dependencies well is an essential part of writing solid code.”

The more modular and less interdependent your code is, the less often you will have to recompile everything. It also reduces the work the compiler has to do for any individual file, because there is less for it to process and keep track of in memory.

Today we will talk about header dependencies and their effect on build times.

Direct includes not needed

One of the main problems affecting the speed of C++ build times is the unnecessary inclusion of header files. Header inclusion should be done only when needed. For example, if you are using only classes X and Y, then you only need to include x.h and y.h. Unfortunately, many programmers habitually include many more headers than necessary, like <iostream> or windows.h. This can seriously degrade build time.

The Chromium Projects C++ Dos and Don’ts recommends not including unneeded headers. They mention that, after refactoring a file, the symbols from some included header are often no longer used, meaning you can remove that include. With that in mind, when you are refactoring it is a good idea to track redundant includes either manually or using an external tool like Lattix Architect.

Indirectly included files

Another way to reduce header dependencies, and therefore build times, in C++ is to avoid including headers inside other header files. In C++, you get the declaration of a function or class by including its header file, and that include can be placed in either a .cpp file or another header file. When you include a header in another header file, every file that includes the outer header also pulls in the inner one, whether it needs it or not, which can slow down compilation.

The solution is a forward declaration. A forward declaration of a function or class simply introduces a name. According to Wikipedia: "A forward declaration is a declaration of an identifier for which the programmer has not yet given a complete definition." This can be used in situations where you need to know that a name refers to a class type, but not the structure of that class. In C++, a class can be forward-declared if you only use pointers or references to it, since the compiler does not need the size or layout of the class to handle a pointer or reference.

This is useful inside a class definition if the class contains a member that is a pointer (or reference) to another class. If, on the other hand, you need to create an object of that class in the header file, you can’t use a forward declaration, because a forward declaration does not tell the compiler how big the object is or anything about its member functions, constructors, or destructor. Forward declarations can reduce build times in two ways:

  1. By reducing the number of files that the compiler has to open and process
  2. By saving on unnecessary recompilation. If you include the header, you will be forced to recompile the code whenever that header changes, even if the change is unrelated.

It turns out that in my last refactoring blog tip, I had an indirectly included file (a header file included in another header file):

[Image: Lattix Architect find issues]

As you can see, I included shareprice.h in stock30.h:

[Image: Lattix Architect dependency usage]

The next section will show you how I fixed this issue.

How to fix indirectly included files

Here is my original stock30.h file:

#include <string>
#include "shareprice.h"

class Stock
{
private:
    std::string company;
    int shares;
    SharePrice share_val;
public:
    Stock();                  // default constructor
    Stock(const std::string & co, long n = 0, double pr = 0.0);
    void buy(long num, double price);
    void sell(long num, double price);
    void update(double price);
};

In stock30.h, I included "shareprice.h". One way to fix this issue is to forward-declare the SharePrice class instead. I do this by:

  1. Removing #include "shareprice.h"
  2. Adding the forward declaration class SharePrice;
  3. Changing SharePrice share_val to a pointer to SharePrice

You can see the updated code below:

#include <string>

class SharePrice;          // Forward declaration

class Stock
{
private:
    std::string company;
    int shares;
    SharePrice* share_val;       // changed to a pointer to SharePrice
public:
    Stock();                   // default constructor
    Stock(const std::string & co, long n = 0, double pr = 0.0);
    void buy(long num, double price);
    void sell(long num, double price);
    void update(double price);
};

You will also need to update the stock30.cpp file to reflect the change in the variable share_val, but I will leave that as an exercise for the reader.

Summary

Build times are a constant problem for larger C++ programs. But by thinking carefully about a C++ project’s design (especially for large projects, consisting of multiple modules), you can modify it so the compiler can produce output efficiently. This can be done manually or with an architectural analysis tool like Lattix Architect.