Refactoring Legacy Software

The Problem with Legacy Software and Refactoring

For an architect who is new to a legacy software project, it is often very hard to understand the existing architecture, determine the extent of architectural decay, and identify architectural smells and metric violations. It’s almost impossible to perform refactoring without breaking the working code. Legacy applications are often critical to the business and have been in use for many years, sometimes decades. Since the business is always changing, there is constant pressure to support additional requirements and fix existing bugs. However, changing these applications is difficult and you end up spending an increasing amount of resources maintaining the software.

There are many reasons why maintaining legacy software is so difficult. Often most, if not all, of the original developers are gone, and no one understands how the application is implemented. The technologies used in the application are no longer current, having been replaced by newer and more exciting ones. The software also grows more complex over time as new requirements are added.

Whether you are looking to replace a legacy application or gradually refactor it to support new requirements, the key to managing the lifecycle of the software is to understand how the application is implemented and how it is used. Renovating a legacy system involves considerable re-architecting.

Understand the Legacy Software

To understand an application, you need to look at it from the perspectives of its various stakeholders: how it is used, deployed, built, and structured. These different perspectives are ultimately reflected in the design of the software. This is the same process we use to understand the complexity of any system.

Understand how it is used
Understanding how an application is used is critical to understanding the design. As an application evolves, special use cases are often added that are unique to the business and appear in the way the application was designed and implemented. Also, how it is used influences the performance requirements. As an example, a word processor has very different performance requirements than a high-frequency trading platform.

Understand how it is deployed
This is often one of the most neglected aspects of architectural analysis. One reason is that, historically, many applications were monoliths and there was not much to understand about how they were deployed. With the rise of microservices, an application can be distributed across multiple containers or services, which makes understanding how it is deployed more important than ever.

Understand how it is built
It is necessary to understand how each component is built. This is especially true for languages like C/C++ where there are a variety of compile time options when generating object files. These options are used for generating different variants (typically for different hardware platforms) from the same source code. Without understanding these options, it wouldn’t be possible to fully analyze the code.

Understand how it is structured
This is an area that developers typically care about a lot and where a large part of the complexity resides. The code could be organized into thousands of interdependent files. A key goal of architectural analysis is to organize these files and elements into modular groups, which requires discovering the architecture that is actually in the code.

A dependency structure matrix (DSM) representation, together with its analysis techniques, is a great way to understand and analyze a legacy system.
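As a rough illustration of what a DSM holds, here is a minimal Python sketch that turns a dependency list into a square matrix. The element names (`ui`, `logic`, `db`, `util`) and the functions `build_dsm` and `format_dsm` are hypothetical and are not taken from any particular tool. DSM conventions differ on whether rows or columns hold the dependent element; this sketch assumes that a mark in row i, column j means the row element depends on the column element.

```python
from typing import Dict, List, Set, Tuple


def build_dsm(dependencies: Dict[str, Set[str]]) -> Tuple[List[str], List[List[int]]]:
    """Return (names, matrix); matrix[i][j] == 1 means names[i] depends on names[j]."""
    # Collect every element that appears as a source or as a target.
    names = sorted(set(dependencies) | {t for ts in dependencies.values() for t in ts})
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for source, targets in dependencies.items():
        for target in targets:
            matrix[index[source]][index[target]] = 1
    return names, matrix


def format_dsm(names: List[str], matrix: List[List[int]]) -> str:
    """Render the matrix as text, one row per element, 'x' for a dependency."""
    width = max(len(name) for name in names)
    lines = []
    for name, row in zip(names, matrix):
        cells = " ".join("x" if cell else "." for cell in row)
        lines.append(f"{name.ljust(width)}  {cells}")
    return "\n".join(lines)


# Hypothetical example: a small layered application.
example_dependencies = {
    "ui": {"logic", "util"},
    "logic": {"db", "util"},
    "db": {"util"},
    "util": set(),
}

if __name__ == "__main__":
    names, matrix = build_dsm(example_dependencies)
    print(format_dsm(names, matrix))
```

In a real code base the dependency list would come from a parser or build tool rather than being written by hand, and the matrix would usually be reordered so that layers show up as a triangular pattern.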

Refactoring Legacy Software

This methodology can be applied to reduce complexity and make the software transparent. There are a number of techniques you can use to analyze legacy systems. Boeing uses a DSM approach for their knowledge-based engineering systems and Erik Philippis, founder and CEO of ImprovemenT BV, uses the horseshoe model.

Here is another technique you can use:

  1. Examine the existing artifacts, which are a great starting point. For example, the file/directory structure or the package/namespace structure is already a guide to how the developers organized these code elements.
  2. Apply partitioning and clustering algorithms to discover the layers and independent components. Even if the architecture has significantly eroded, identifying the minimal set of dependencies that cause cycles will often lead to the discovery of the intended layers. Lattix’s DSM approach is very helpful with this.
  3. Experiment with what-if architectures. Create different logical modules and examine the dependencies associated with those modules. If you are looking to create a microservice or componentize, create logical components or services using the current set of files/classes/elements. If there are no dependencies between these components and they are supposed to be independent of each other, then you know these can be independent services. On the other hand, if there are dependencies between these components, you know what dependencies to eliminate.
  4. Ask the developers and architects who have been supporting the application. They will already have some understanding of the architecture and the knowledge to assist in experimenting with what-if architectures, which is a good exercise to sharpen your understanding of the system.
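Steps 2 and 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Lattix or any other tool implements them: `find_cycles` uses Tarjan's strongly connected components algorithm (swapped in here as one common way to find groups of mutually dependent elements), and `cross_module_dependencies` lists the dependencies that would cross the boundaries of a proposed what-if grouping. The file names and module names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def find_cycles(graph: Dict[str, Set[str]]) -> List[Set[str]]:
    """Return groups of two or more elements that depend on each other in a cycle.

    Uses Tarjan's strongly connected components algorithm (recursive, so it
    suits small graphs). Self-dependencies are ignored in this sketch.
    """
    index_of: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    cycles: List[Set[str]] = []

    def visit(node: str) -> None:
        index_of[node] = lowlink[node] = len(index_of)
        stack.append(node)
        on_stack.add(node)
        for target in graph.get(node, ()):
            if target not in index_of:
                visit(target)
                lowlink[node] = min(lowlink[node], lowlink[target])
            elif target in on_stack:
                lowlink[node] = min(lowlink[node], index_of[target])
        if lowlink[node] == index_of[node]:
            # node is the root of a component: pop its members off the stack.
            component: Set[str] = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            if len(component) > 1:
                cycles.append(component)

    for node in sorted(graph):
        if node not in index_of:
            visit(node)
    return cycles


def cross_module_dependencies(
    graph: Dict[str, Set[str]], assignment: Dict[str, str]
) -> Dict[Tuple[str, str], List[Tuple[str, str]]]:
    """Map (from_module, to_module) to the element-level dependencies crossing that boundary."""
    crossing: Dict[Tuple[str, str], List[Tuple[str, str]]] = {}
    for source in sorted(graph):
        for target in sorted(graph[source]):
            src_mod, dst_mod = assignment[source], assignment[target]
            if src_mod != dst_mod:
                crossing.setdefault((src_mod, dst_mod), []).append((source, target))
    return crossing


# Hypothetical file-level dependencies and a proposed what-if grouping.
files = {
    "orders.py": {"billing.py"},
    "billing.py": {"orders.py", "tax.py"},
    "tax.py": set(),
    "report.py": {"orders.py"},
}
proposal = {
    "orders.py": "sales",
    "billing.py": "finance",
    "tax.py": "finance",
    "report.py": "reporting",
}

if __name__ == "__main__":
    print(find_cycles(files))
    for boundary, deps in cross_module_dependencies(files, proposal).items():
        print(boundary, deps)
```

In this example the proposed `sales` and `finance` modules depend on each other, so they could not become independent services until one of the two crossing dependencies is removed.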

The goal of architectural discovery is to understand the organization of the application. It is one of the most effective ways to start a refactoring process to make the code more understandable and maintainable. A clear understanding of the architecture will also help prevent further architectural erosion.

Summary

Architecture is very important when dealing with legacy applications. It captures the knowledge of how the software is organized and how to work with it. Even if you decide to end-of-life a legacy application, the architectural knowledge left over from the project will be vital for the application that will replace it. If you are interested in seeing how Lattix can help with your legacy applications, see our whitepaper “Managing the Evolution of Legacy Applications” or sign up for a free trial.

Reasons NOT to Refactor your code

Last week I wrote about the reasons to refactor code. Let us now look at some reasons why you shouldn’t refactor code. When dealing with legacy code, there will always be a temptation to refactor it to improve its understandability or performance. However, here are some reasons why it might be better to hold off:

1. You do not have the proper tests in place

Do not waste time refactoring your code when you do not have the proper tests in place to make sure the code you are refactoring is still working correctly. A refactoring exercise presupposes a good engineering environment, and testing is one of the key components of that environment. If you don’t have a good way to test what you changed, it is better to hold off making that change until you can fully test it. Our developers tell us it is impossible to write good code without thorough testing. I believe them.
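One common way to get tests in place before touching legacy code is to record what the code does today and compare against that record after each change (often called a characterization test, a name that is not used in this post). The following Python sketch assumes a hypothetical legacy function, `legacy_shipping_cost`, and hypothetical helpers `record_behavior` and `find_behavior_changes`; it is an illustration of the idea rather than a complete testing strategy.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def legacy_shipping_cost(weight_kg: float, express: bool) -> float:
    # Hypothetical stand-in for tangled legacy logic we want to refactor.
    cost = 5.0
    if weight_kg > 10:
        cost = cost + (weight_kg - 10) * 0.5
    if express:
        cost = cost * 2
    return cost


def refactored_shipping_cost(weight_kg: float, express: bool) -> float:
    # Hypothetical refactored version that must behave identically.
    surcharge = max(weight_kg - 10, 0) * 0.5
    multiplier = 2 if express else 1
    return (5.0 + surcharge) * multiplier


def record_behavior(func: Callable[..., Any], inputs: Iterable[Tuple]) -> Dict[Tuple, Any]:
    """Run func on each input tuple and save the result as the reference behavior."""
    return {args: func(*args) for args in inputs}


def find_behavior_changes(
    func: Callable[..., Any], recorded: Dict[Tuple, Any]
) -> List[Tuple[Tuple, Any, Any]]:
    """Return (args, expected, actual) for every recorded case whose result differs."""
    changes = []
    for args, expected in recorded.items():
        actual = func(*args)
        if actual != expected:
            changes.append((args, expected, actual))
    return changes


# Record the current behavior BEFORE refactoring, including boundary inputs.
sample_inputs = [(1, False), (10, True), (20, False), (20, True)]
recorded = record_behavior(legacy_shipping_cost, sample_inputs)

if __name__ == "__main__":
    # An empty list means the refactored code matches the recorded behavior.
    print(find_behavior_changes(refactored_shipping_cost, recorded))
```

A check like this only covers the inputs you recorded, so it is worth including boundary values and any special cases the business relies on.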

2. Allure of technology

Don’t make a refactoring change just because an exciting new technology has been released. Given the fast pace of change, there will always be something new and exciting, and today’s new and exciting technology will be legacy tomorrow. Instead, seek to understand the value of the new technology. If a Java backend is working fine, don’t jump to Node.js unless you know that your application needs its event-driven model. Too many legacy applications are hard to maintain because they have a mishmash of languages, frameworks, and technologies.

To learn more watch our webinar on reengineering legacy code.

3. The application doesn’t need to change

The primary purpose of changing an application is to satisfy new user requirements or usage conditions. As long as the users of the application are content with how it operates, there is less need to refactor the code. If there is no reason to change the application, there is no reason to refactor it. Even if your company is swimming in money and you don’t have anything else to do, don’t do it.

Four Reasons to Refactor your Code

1. Maintenance is easier

Legacy code architecture erodes over time and becomes difficult to maintain. Legacy code bugs are harder to find and fix. Testing any changes in legacy code takes longer. Even small changes can inadvertently break the application because over time the design has been extended to accommodate new features and the code has become increasingly coupled. Refactoring code allows you to improve the architecture, reduce the coupling, and help the development team understand the intended design of the system. A clean architecture makes the design understandable and easier to manage and change.

Read our other blog on Reasons NOT to Refactor.

2. Make the Design Modular

Split up large applications into components. For instance, monolithic applications can be split up into microservices. In embedded systems, interfaces are created to allow drivers to be written to support a variety of hardware devices. These drivers serve to encapsulate the logic for interacting with different hardware devices. Most large applications can also be separated into layers such as the business logic and the user interface, which can itself be split up into various pages, forms, dialogs, and other components. Modularity simplifies the design and is probably the most effective way to increase team productivity.
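The driver example above can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical temperature sensor interface; real embedded drivers are more commonly written in C or C++, and none of these class or function names come from an actual product.

```python
from abc import ABC, abstractmethod


class TemperatureSensorDriver(ABC):
    """Hypothetical interface that the rest of the application depends on."""

    @abstractmethod
    def read_celsius(self) -> float:
        """Return the current temperature in degrees Celsius."""


class TenthsOfDegreeSensor(TemperatureSensorDriver):
    """Hypothetical device that reports an integer in tenths of a degree Celsius."""

    def __init__(self, raw_reading: int) -> None:
        self._raw_reading = raw_reading

    def read_celsius(self) -> float:
        return self._raw_reading / 10


class FahrenheitSensor(TemperatureSensorDriver):
    """Hypothetical device that reports degrees Fahrenheit."""

    def __init__(self, fahrenheit: float) -> None:
        self._fahrenheit = fahrenheit

    def read_celsius(self) -> float:
        return (self._fahrenheit - 32) * 5 / 9


def is_overheating(driver: TemperatureSensorDriver, limit_celsius: float = 80.0) -> bool:
    # Application logic sees only the interface, never the device details.
    return driver.read_celsius() > limit_celsius


if __name__ == "__main__":
    for sensor in (TenthsOfDegreeSensor(250), FahrenheitSensor(212.0)):
        print(type(sensor).__name__, sensor.read_celsius(), is_overheating(sensor))
```

Because `is_overheating` depends only on the interface, supporting a new device means adding one driver class without touching the application logic.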

Check out our blog on a New Way to Think About Software Design.

3. Refactoring is often the cheaper option

When faced with new requirements that appear not to fit into the current design, it is often tempting to propose a complete rewrite. However, a rewrite can be expensive and highly risky. When a rewrite fails, it leaves in its wake a dispirited organization with no product to take to market. Before starting a rewrite, do a what-if exercise on the current application to see what would need to change to support the new requirements. Often large parts of an application can be salvaged while other parts are refactored, thereby reducing risk and saving considerable time and effort.

4. Your development team is happier

A design that is easy to understand reduces stress on the team. A modular design allows different team members to improve different components of the project at the same time without breaking each other’s code.