The Problem with Legacy Software and Refactoring
For an architect who is new to a legacy software project, it is often very hard to understand the existing architecture, determine the extent of architectural decay, and identify architectural smells and metric violations. It’s almost impossible to perform refactoring without breaking the working code. Legacy applications are often critical to the business and have been in use for many years, sometimes decades. Since the business is always changing, there is constant pressure to support additional requirements and fix existing bugs. However, changing these applications is difficult and you end up spending an increasing amount of resources maintaining the software.
There are many reasons why maintaining legacy software is such a difficult problem. Often most, if not all, of the original developers are gone, and no one understands how the application is implemented. The technologies used in the application are no longer current, having been replaced by newer and more exciting technologies. Also, software grows more complex over time as new requirements are added.
Whether you are looking to replace a legacy application or gradually refactor it to support new requirements, the key to managing the lifecycle of the software is to understand how the application is implemented and how it is used. Renovating a legacy system involves considerable re-architecting.
Understand the Legacy Software
To understand an application, you need to look at it from the perspectives of its various stakeholders: how it is used, how it is deployed, how it is built, and how it is structured. These perspectives are ultimately reflected in the design of the software. This is the same process we use to understand the complexity of any system.
Understand how it is used
Understanding how an application is used is critical to understanding the design. As an application evolves, use cases that are unique to the business are often added, and they leave their mark on how the application is designed and implemented. Usage also drives the performance requirements. For example, a word processor has very different performance requirements than a high-frequency trading platform.
Understand how it is deployed
This is often one of the most neglected aspects of architectural analysis. One reason is that, historically, many applications were monoliths and there was not much to understand about how they were deployed. With the rise of microservices, an application can be distributed across multiple containers or services, which makes understanding how it is deployed more important than ever.
Understand how it is built
It is necessary to understand how each component is built. This is especially true for languages like C/C++, where a variety of compile-time options affect how object files are generated. These options are used to generate different variants (typically for different hardware platforms) from the same source code. Without understanding these options, it wouldn’t be possible to fully analyze the code.
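As a first step toward understanding build variants, it can help to inventory which macros control conditional compilation. The following is a minimal Python sketch, not a full preprocessor: it only scans for `#if`, `#ifdef`, `#ifndef`, and `#elif` lines and counts the identifiers they mention. The macro names in the sample (`TARGET_ARM`, `TARGET_X86`, `NO_SSE`) are made up for illustration.

```python
import re
from collections import Counter

# Conditional directives, capturing the condition text that follows them.
DIRECTIVE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|elif)\b(.*)$", re.MULTILINE)
# C identifiers; the leading \b avoids picking up the "x10" in "0x10".
IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*")


def conditional_macros(source: str) -> Counter:
    """Count macro names referenced in conditional-compilation directives."""
    counts = Counter()
    for match in DIRECTIVE.finditer(source):
        for name in IDENTIFIER.findall(match.group(2)):
            if name != "defined":  # 'defined' is an operator, not a macro
                counts[name] += 1
    return counts


# Hypothetical C source fragment used only to exercise the function.
sample = """
#ifdef TARGET_ARM
#include "arm_io.h"
#elif defined(TARGET_X86) && !defined(NO_SSE)
#include "x86_io.h"
#endif
#ifndef NDEBUG
void trace(void);
#endif
"""

macros = conditional_macros(sample)
for name in sorted(macros):
    print(name, macros[name])
```

Running this over a real code base, one file at a time, gives a rough list of the switches that produce different variants. The actual values passed to the compiler still have to come from the build scripts.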
Understand how it is structured
This is an area that developers typically care about a lot and where a large part of the complexity resides. The code could be organized into thousands of interdependent files. A key goal of architectural analysis is to organize these files and elements into modular groups, which requires architecture discovery.
Using a dependency structure matrix (DSM) representation together with its analysis techniques is an effective way to understand and analyze a legacy system.
This methodology can be applied to reduce complexity and make the software transparent. There are a number of techniques you can use to analyze legacy systems. Boeing uses a DSM approach for their knowledge-based engineering systems and Erik Philippis, founder and CEO of ImprovemenT BV, uses the horseshoe model.
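To make the DSM idea concrete, here is a minimal Python sketch that builds a matrix from a dependency map. It is an illustration only, not how Lattix or any other tool implements it. The component names are hypothetical, and the orientation (a mark in row i, column j means component i depends on component j) is an assumption, since DSM tools differ on this convention.

```python
def build_dsm(deps: dict[str, set[str]]) -> tuple[list[str], list[list[int]]]:
    """Return (names, matrix); matrix[i][j] == 1 if names[i] depends on names[j]."""
    names = sorted(set(deps) | {t for targets in deps.values() for t in targets})
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for source, targets in deps.items():
        for target in targets:
            matrix[index[source]][index[target]] = 1
    return names, matrix


def format_dsm(names: list[str], matrix: list[list[int]]) -> str:
    """Render the matrix as text, one numbered row per component."""
    width = max(len(name) for name in names)
    lines = []
    for i, (name, row) in enumerate(zip(names, matrix), start=1):
        cells = " ".join("x" if value else "." for value in row)
        lines.append(f"{i:>2} {name:<{width}} {cells}")
    return "\n".join(lines)


# Hypothetical components and their direct dependencies.
deps = {
    "ui": {"service"},
    "service": {"model", "util"},
    "model": {"util"},
    "util": set(),
}

names, matrix = build_dsm(deps)
print(format_dsm(names, matrix))
```

The value of the representation comes from reordering and grouping the rows and columns: in a cleanly layered system, the rows can be ordered so that all marks fall on one side of the diagonal.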
Here are some other techniques you can use:
- Examining the existing artifacts is a great starting point. For example, the file/directory structure or the package/namespace structure is already a guide to how the developers organized these code elements.
- Apply partitioning and clustering algorithms to discover the layers and independent components. Even if the architecture has significantly eroded, identifying the minimal set of dependencies that cause cycles will often lead to the discovery of the intended layers. Lattix’s DSM approach is very helpful with this.
- Experiment with what-if architectures. Create different logical modules and examine the dependencies associated with those modules. If you are looking to create a microservice or componentize, create logical components or services using the current set of files/classes/elements. If there are no dependencies between these components and they are supposed to be independent of each other, then you know these can be independent services. On the other hand, if there are dependencies between these components, you know what dependencies to eliminate.
- Ask the developers and architects who have been supporting the application. They will already have some understanding of the architecture, and they will have the knowledge to assist in experimenting with what-if architectures. Experimenting with what-if architectures is also a good exercise to sharpen your own understanding of the system.
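Two of the techniques above, finding the dependencies that form cycles and trying out a what-if grouping, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the file names and the module assignment are hypothetical, and the cycle search uses plain reachability rather than the partitioning algorithms a DSM tool would apply.

```python
from collections import defaultdict


def reachable(graph, start):
    """All nodes reachable from start by following one or more dependencies."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def cyclic_groups(graph):
    """Groups of two or more nodes that can all reach each other."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    reach = {node: reachable(graph, node) for node in nodes}
    groups = set()
    for node in nodes:
        group = {node} | {m for m in nodes if m in reach[node] and node in reach[m]}
        if len(group) > 1:
            groups.add(frozenset(group))
    return sorted(sorted(group) for group in groups)


def module_dependencies(file_deps, assignment):
    """Lift file-level dependencies to dependencies between proposed modules."""
    result = defaultdict(set)
    for source, targets in file_deps.items():
        for target in targets:
            src_mod, dst_mod = assignment[source], assignment[target]
            if src_mod != dst_mod:
                result[(src_mod, dst_mod)].add((source, target))
    return dict(result)


# Hypothetical file-level dependencies.
file_deps = {
    "orders/api.py": {"orders/db.py", "billing/invoice.py"},
    "billing/invoice.py": {"orders/api.py", "billing/tax.py"},
    "orders/db.py": set(),
    "billing/tax.py": set(),
}
# What-if grouping: one candidate service per top-level directory.
assignment = {path: path.split("/")[0] for path in file_deps}

cycles = cyclic_groups(file_deps)
cross = module_dependencies(file_deps, assignment)
print("cycles:", cycles)
for (src, dst), edges in sorted(cross.items()):
    print(f"{src} -> {dst}: {sorted(edges)}")
```

In this sample, the two proposed services depend on each other, so they are not yet independent. The file pairs listed for each direction are the dependencies to consider eliminating.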
The goal of architectural discovery is to understand the organization of the application. It is one of the most effective ways to start a refactoring process that makes the code more understandable and maintainable. A clear understanding of the architecture will also help prevent further architectural erosion.
Architecture is very important when dealing with legacy applications because it captures the knowledge of how to work with the software. Even if you decide to end-of-life a legacy application, the architectural knowledge left over from the project will be vital for the application that will replace it. If you are interested in seeing how Lattix can help with your legacy applications, see our whitepaper “Managing the Evolution of Legacy Applications” or sign up for a free trial.