Failing Fast: Test Impact Analysis and Software Architecture

Wolfgang Platz, Founder and Chief Strategy Officer of Tricentis, recently wrote an article on test impact analysis and how it benefits end-to-end testing. This article builds upon those themes and applies them to unit testing and software architecture.

Failing fast is a philosophy that values extensive testing and incremental software development. Agile is all about failing fast. Continuous integration testing helps you fail fast by finding issues as soon as they are introduced, which is the best time to fix them. This leads to more robust software, as Jim Shore and Martin Fowler put it in Fail Fast:

“...’failing immediately and visibly’ sounds like it would make your software more fragile, but it actually makes it more robust. Bugs are easier to find and fix, so fewer go into production.”

The longer it takes for bugs to appear, the harder and more costly it is to fix them. Bugs add expense and risk to projects; failing fast decreases debugging expense and improves quality.

The combination of digital transformation, DevOps, and agile software testing is changing developers’ expectations. They want immediate feedback on their code, but large code bases (especially legacy code bases) tend to have large unit test suites that can take hours (or days) to execute. This can lead to developers skipping unit testing or scaling it back to save time, and a slow CI pipeline slows the productivity of the entire team. One solution is to figure out which modules can be built and tested in parallel or distributed across machines. A second is to figure out which tests need to be run and in what order. The latter is typically referred to as test impact analysis.

What is test impact analysis?

Test impact analysis is a change-based testing practice that rapidly finds issues in new or modified code. To speed up testing and “fail fast,” we suggest two things:

  1. Map all unit tests to the code and only run tests that are mapped to code that has been added or modified. This way you are not running tests that are never going to fail.
  2. Prioritize the unit tests based on their likelihood of failure. Now you can fail fast and expose bugs as soon as possible.
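The two steps above can be sketched in a few lines of Python. Everything here (the test-to-file mapping, the stability scores, the file names) is invented for illustration; in practice the mapping would come from your coverage tooling and the stability scores from an architecture analysis tool:

```python
# Hypothetical sketch of test selection and prioritization. The
# mappings and scores below are invented; real values would come
# from coverage and architecture analysis tools.

def select_and_prioritize(tests_to_files, changed_files, stability):
    """Keep only tests touching changed files; run the riskiest first."""
    selected = []
    for test, files in tests_to_files.items():
        touched = files & changed_files
        if touched:
            # Lower stability means a change ripples further, so tests
            # covering low-stability files are more likely to fail.
            risk = min(stability[f] for f in touched)
            selected.append((risk, test))
    return [test for _, test in sorted(selected)]

tests_to_files = {
    "test_parser": {"parser.c", "lexer.c"},
    "test_ui":     {"ui.c"},
    "test_core":   {"core.c", "parser.c"},
}
stability = {"parser.c": 0.40, "lexer.c": 0.90, "ui.c": 0.95, "core.c": 0.30}
changed = {"parser.c", "core.c"}

print(select_and_prioritize(tests_to_files, changed, stability))
# → ['test_core', 'test_parser']  (test_ui is skipped entirely)
```

Here test_ui never runs because ui.c did not change, and test_core runs before test_parser because core.c is the least stable file touched.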

One way to prioritize is based on the “importance” of a file or module. By importance, we mean how much impact that file will have on the rest of the software if it is changed. For this we use the System Stability metric. System stability measures how sensitive the system is to change. When a change is made to the software, system stability will tell you how much of the rest of the software will be affected. In software with a layered architecture, the lowest layers tend to have the lowest system stability because any change to them affects the layers above. Therefore, the lower layers need much more testing.

Lower stability means the software is harder to maintain because every change affects a greater portion of the system and, therefore, requires more testing and validation. This is why less stable software tends to break easily (fragile) when even small changes are made. High stability means there is less change impact, so changes are localized. We consider this robust software. Also, because there are fewer unexpected consequences when making a change, developers have an easier time understanding the software.

Test impact analysis with system stability allows you to optimize your testing and fail fast. The feedback loop is tighter, so builds that are configured to fail on the first reported test failure terminate as soon as possible, and working builds are delivered faster. By reducing the time spent running the unit test suite, developers are more likely to test. This reduces the number of defects reported later in the process or in the field, which means less time wasted on bug fixes.

How to implement test impact analysis?

Test impact analysis can be run with any type of test, such as unit tests or integration tests. With this process, a subset of test cases is selected and executed in a particular order for each test run. Here are the steps to implement it:

  1. Use a static dependency analysis solution like Lattix Architect to identify which code has been added or modified since the last test run.
  2. Use an automated unit testing framework like Cantata to correlate unit tests with new or modified code in the build.
  3. Unit tests that are not affected by new or modified code are eliminated from the test run.
  4. The remaining tests are sorted by system stability so that the most “important” tests (i.e. the ones most likely to fail) are executed first.


Test impact analysis rapidly exposes defects in new and modified source code. When you add in prioritization using system stability, you are supercharging the process and making that feedback loop even tighter.

If you are interested in using test impact analysis in your testing practice, Lattix Architect can help by giving you the change sets (new and modified code) for each build, system stability (by module or file) and other impact analysis of your source code. Please contact us with any questions or to request an evaluation.

Measure your Software Architectural Health

How do you measure the “architectural health” of a software project?

Since every software project is different, it is hard to come up with a single number that represents the architectural health of an entire project. Lattix Architect, therefore, provides a variety of architectural metrics. These metrics were chosen based on academic research on system architecture as well as the practical experience of the Lattix development team. We will look at one of these architecture metrics today: system stability.

Architectural metrics are different from traditional code metrics (i.e. bugs per line of code, cyclomatic complexity, etc.) that focus on understanding the quality and complexity of sections of code. Architectural metrics focus on the big picture (the system as a whole) and examine its organizational structure.

What is system stability?

System stability measures how sensitive the system is to change. When a change is made to the software, system stability will tell you how much of the rest of the software will be affected. In software with a layered architecture, the lowest layers tend to have the lowest system stability because any change to them affects the layers above. Therefore, the lower layers need much more testing.

Lower stability means the software is harder to maintain because every change affects a greater amount of the system and therefore requires more testing and validation. This is why less stable software tends to break easily (fragile) when even small changes are made. High stability means there is less change impact, so changes are localized. We consider this robust software. Also, because there are fewer unexpected consequences when making a change, developers have an easier time understanding the software.

How is it calculated?

System stability is measured by analyzing the impact of change for every element of the system. For each element, its dependency information is examined, and the number of elements that are potentially affected when that element is changed is calculated through transitive closure. The overall stability number is the average of the stability numbers of all the elements.

If the system stability is 70%, this means that, on average, 30% of the elements are affected when any element is changed and 70% are unaffected. Stability is computed as a percentage of the size of the system, so it doesn’t necessarily change simply because the software project gets larger or smaller.
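As a concrete illustration, here is a toy Python version of this calculation on a four-element layered system. The element names and the exact averaging formula are assumptions for illustration; Lattix Architect’s own implementation may differ in detail:

```python
# Illustrative system stability calculation over a "depends on" graph.
# The graph and formula are assumptions for illustration only.

def transitive_dependents(deps, element):
    """All elements that could be affected by a change to `element`:
    everything that depends on it, directly or transitively."""
    # Invert the graph: who depends on whom.
    dependents = {e: set() for e in deps}
    for e, targets in deps.items():
        for t in targets:
            dependents[t].add(e)
    affected, stack = set(), [element]
    while stack:
        cur = stack.pop()
        for d in dependents[cur]:
            if d not in affected:
                affected.add(d)
                stack.append(d)
    return affected

def system_stability(deps):
    """Average, over all elements, of the fraction of the system
    NOT affected by a change to that element."""
    n = len(deps)
    total = sum(1 - len(transitive_dependents(deps, e)) / n for e in deps)
    return total / n

# A tiny layered system: apps -> framework -> util.
deps = {
    "app1": {"framework"},
    "app2": {"framework"},
    "framework": {"util"},
    "util": set(),
}
print(f"{system_stability(deps):.2%}")  # → 68.75%
```

Changing util affects three of the four elements, changing framework affects two, and changing either app affects nothing else, so the average stability works out to 68.75%. The lower layers are exactly where testing effort should be concentrated.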

How can you use it? And why?

The architecture of a software system has a profound impact on its stability. Therefore, by monitoring system stability you can measure the quality of your architecture. System stability can also be used to focus your testing efforts. As an example, if we create a set of applications on a common framework (see picture below) and then change the framework, we will affect all of the applications.

Software Architecture

If we change just one application, the impact will be much lower. Therefore, it is essential to understand not just the stability of the entire system, but also where that stability is coming from.

System Stability Breakdown

In the picture above, the total stability of Project is about 62%, but the Apps layer has a stability of 73%, while Frameworks and Util have lower stability. This means more components depend on the Frameworks and Util layers, so changes to these layers have a much larger impact and call for focused testing (i.e. changes to these layers are riskier).

There is also value in tracking the stability of the software over time.

System Stability Trend

We have talked in previous blog posts about architecture erosion and how that leads to software that is harder to maintain. The system stability metric gives you measurable and actionable insight into this phenomenon. If the stability goes down over time as in the picture above, there needs to be a corresponding increase in testing and verification.


System stability tends to decrease over time if not monitored. Every change to the software can erode the architecture and therefore the stability. This means the software becomes harder to maintain over time, resulting in longer testing cycles and reduced developer productivity.

Monitoring stability can help you maintain a clean and modular architecture. Lattix products make architecture management part of your continuous integration, keeping it easy and actionable throughout your development lifecycle.

The Smell of Rotting Software

Jack Reeves introduced the concept that the source code is the design and that programming is about designing software.1 As software grows, its design, or architecture, tends to grow large and complex, because software architecture is constantly evolving; this makes software maintenance difficult and error-prone. In this article, we will talk about the symptoms of bad architecture and how to fix them.

Poor Software Architecture

According to Robert Martin2, there are seven symptoms of poor architecture.

  1. Rigidity: this means the system is hard to change. Every change forces other changes to be made. The more modules that must be changed, the more rigid the architecture. This slows down development, as changes take longer than expected because the impact of a change cannot be forecast (impact analysis can help). System stability and average impact are good architecture metrics to monitor for rigidity. System stability measures the percentage of elements (on average) that would not be affected by a change to an element. Average impact for an element is the total number of elements that could be affected if a change is made to that element (i.e. the transitive closure of all elements that could be affected).
  2. Fragility: when a change is made to the system, bugs appear in places that have no relationship to the part that was changed. This leads to modules that get worse the more you try to fix them. In this case, these modules need to be redesigned or refactored. Cyclicality metrics can help find fragile modules. Cyclicality is useful in determining how many elements of a system are in cycles. See our blog post “Cyclicality and Bugs” for more information.
  3. Immobility: this is when a component cannot be easily extracted from a system, making it unable to be reused in other systems. If a module is found that would be useful in other systems, it cannot be reused because the effort and risk are too great. This is becoming a significant problem as companies move to microservices and cloud-ready applications. A useful metric in this case is coupling: the degree of interdependence between software modules, i.e. a measure of how closely connected two modules are and the strength of the relationship between them.
  4. Viscosity: this is when the architecture of the software is hard to preserve. Doing the right thing is harder than doing the wrong thing (breaking the architecture). The software architecture should be created so it is easy to preserve the design.
  5. Needless complexity: the architecture contains infrastructure that adds no direct benefit. It is tempting to try to prepare for any contingency, but preparing for too many contingencies makes the software more complex and harder to understand. Architectures shouldn’t contain elements that aren’t currently useful. Cyclomatic complexity metrics can help diagnose this problem.
  6. Needless repetition: this is when an architecture contains code structures that are repeated, usually by cut and paste, that instead should be unified under a single abstraction. When there is redundant code in software, the job of changing the software becomes complex. If a defect is found in code that has been repeated, the fix has to be implemented in every repetition. However, each repetition might be slightly different.
  7. Opacity: this is when the source code is hard to read and understand. If source code is the design, this is source code that does not express its intent very well. In this case, a concerted effort to refactor code must be made so that future readers can understand it. Code reviews can help in this situation.
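Several of these symptoms can be quantified directly from the dependency graph. As a sketch of the cyclicality metric mentioned under fragility, the following Python computes the fraction of modules caught in a dependency cycle using Tarjan’s strongly-connected-components algorithm (the module names are invented):

```python
# Sketch of a cyclicality metric: the fraction of modules that sit in
# a dependency cycle, found via strongly connected components.
# Module names are invented for illustration.

def strongly_connected_components(graph):
    """Tarjan's algorithm; returns a list of SCCs (sets of nodes)."""
    index, low, on_stack = {}, {}, set()
    stack, sccs = [], []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of an SCC: pop its members off the stack.
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs

def cyclicality(graph):
    """Fraction of elements that participate in some cycle."""
    in_cycle = sum(len(s) for s in strongly_connected_components(graph)
                   if len(s) > 1 or next(iter(s)) in graph[next(iter(s))])
    return in_cycle / len(graph)

# a <-> b form a cycle; c and d do not.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"d"}, "d": set()}
print(cyclicality(graph))  # → 0.5
```

Modules a and b depend on each other, so half of this four-module system is cyclical. Tracked over time, a rising fraction is an early warning sign of fragility.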


While the source code may be the design, trying to figure out the architecture from the source code can be a daunting experience. Architectural analysis tools like Lattix Architect can help by visualizing the dependencies, allowing you to refactor the architecture and prevent future architectural erosion, and by providing metrics like system stability, average impact, cyclicality, coupling, and cyclomatic complexity.

1. Jack Reeves, “What Is Software Design?”, C++ Journal
2. Robert C. Martin, Agile Software Development: Principles, Patterns, and Practices

Software Metrics: Trends Trump Goals

“I don’t set trends. I just find out what they are and exploit them.” – Dick Clark, New Year’s Rockin’ Eve software metric guru.

Management loves software metrics. They love to set goals and then measure how their employees are doing against those goals (system stability needs to be 95%, for example). Software metrics don’t have to be a bad thing, but unfortunately, they are often used inappropriately. A single software metric is a snapshot and, without context, means nothing. While we can all agree that a codebase with a system stability of 5% is significantly worse than one with a system stability of 95%, what about a codebase with 60% system stability versus one with 70%? It is easy to compare one number to another, but it is harder to see whether that number is relevant in the context of the larger software system.

In terms of “good” and “bad” codebases, a single metric is also not very helpful. You need a combination of software metrics. You could look at System Cyclicality, Intercomponent Cyclicality, System Stability and Coupling, for example, to get a better understanding of your codebase.

software metric: system stability trend

Then there is the question of the difference between a codebase with 94% system stability and one with 95%. If it requires a large amount of work to go from 94% to 95% just to hit that goal, is that final 1% really worth it? This is where trends come in. Trends are the true added value of software metrics. How do you prioritize what should be fixed in your code? You can look at the trend, or evolution, of a software metric over time.

The “magic insight” only comes from looking at a number of relevant software metrics and their trends over time. This is why trends are more important than the actual goal. A trend will show whether a team is moving in the right direction and the rate of that change, and trends create actionable insight for an organization. These insights into current performance, based on real data, trigger thinking about the deeper underlying forces at work in the development of the software. Anticipating and responding to trends means thinking through all the scenarios a trend could bring about and how you need to respond. The goal should be to make trends accelerate, decelerate, or reverse, depending on the needs of the project.

Trends also encourage experimentation. What happens if we adopt pair programming or switch to GitHub? Over a long project, trends can be motivating: you focus on moving in the right direction instead of the large gap between now and the end of the project. This is why trends are more important than the actual goal.

Learn how Lattix is Tracking Stability of a Software System or check out all of our metrics in our Lattix web demo.

Cyclicality and Bugs

Metrics have an obvious charm. If we could measure the quality of a system then we could track it and act as soon as it starts to degrade. But can we even hope to come up with a metric that works across something as complex as the software that runs the Mars Rover or something as simple as the software that plays the game of Tic-Tac-Toe?

Remember that the metric(s) we seek is not likely to be an individual metric such as file size, or the number of paths within a method, or even the count of bugs filed against each component. Useful as these individual metrics may be, what we want is something that is a predictor of the overall system quality. A metric that, if monitored, will help us manage the overall quality as the system evolves.

Indeed, there are a number of system metrics to consider. They are graph-theoretic, with names such as cohesion, coupling, cyclicality, normalized cumulative dependency, propagation cost, and stability. But how do we know if they are a good predictor of system quality? Research is beginning to catch up. New research shows a clear correlation between the cyclicality of your code and how buggy it is.

Cyclicality Matters

Interestingly, the evidence comes not from some dyed-in-the-wool computer science guru but from astute observers whose work is rooted in business and management. It’s a trio of business school professors collaborating across the Atlantic Ocean: Manuel Sosa, Tyson Browning, and Jürgen Mihm. They are skilled at statistics and expert at teasing apart and verifying correlations.

And their conclusion is: Cyclicality of your code has a bearing on how buggy it is.

This conclusion may not be a surprise to many software engineers; and yet, it is a big deal because now we have large-scale empirical evidence that demonstrates it. The researchers examined more than a hundred releases of various open source projects. They conclude that the cyclicality of code and the presence of bugs in it are correlated. Their research goes deeper into the nature of cyclicality as well: the size of the cycle, the centrality of a component within the cycle, and the lack of encapsulation of the cycle all have an impact on quality. They also present interesting results about “hubs”, which are generally good until they are “overdone”.

You can peruse a highly readable article that summarizes the results of the research. You can also delve deeper into articles [7] and [9] at this link for the original work.

And then there is the question of “why”. Why do the bugs in code increase as cyclicality increases? The answer is not, nor is it likely to be, a mathematical theorem. Instead, the answer lies in how our brains function and how we think. I believe that cycles, particularly large cycles, make it harder for us to think about abstractions in a coherent way. This is also why architecture is so valuable: the systems we design and maintain are less prone to errors when we can think about them in ways that make them understandable and maintainable.

Postscript: there is additional research that has arrived at largely the same conclusion. It comes from MIT, in a doctoral thesis by Dan Sturtevant. Dan is a seasoned software engineer with a PhD in Systems Engineering, and his pioneering work examined cyclicality using techniques that go well beyond traditional static code analysis. Dan’s work suggests that not just the bugginess of code but even employee turnover may have something to do with large-scale cyclicality! Companies struggling with woes related to their software systems might consider giving him a call (Dan is at Harvard these days).