Steps to Follow when Reengineering Code

Developers know that a software system will become more complex and more highly coupled over time as additional changes are made. Often this calls for refactoring or reengineering the code to make it more maintainable. Reengineering will allow you to incorporate what has been learned about how the code should have been designed. This is the kind of learning that was the original basis for the term “technical debt.”

So how should we go about reengineering code that remains vital and useful? In real life we keep applying metaphorical Band-Aids as we make changes and incorporate new technologies. This leads to design erosion. Many progressive managers now understand the debilitating nature of this erosion and how it affects quality and productivity.


Even if we agree that reengineering is called for, how can we plan for it? Here are four key steps to take if you have decided to reengineer your software.

1. Understand the current structure of the code. Always resist the temptation to reengineer without a clear understanding of what you have. Identify the critical components and understand what their dependencies are. For example, if you are a Java or .NET programmer, understand the various JAR files or assemblies and how they are related to each other. For C/C++, understand the executables and libraries, as well as the code structure used to generate them. Now ask yourself: Are these the right components in my desired architecture? Sometimes you have only one component. If that is the case, ask yourself if you need to split it into smaller components.

Read our other blog on Reasons NOT to Refactor

2. Examine the internals of the components, particularly the larger ones and the more important ones. Look at the dependencies of the classes or files that constitute the component. Is there excessive coupling? Does this coupling make the code harder to maintain? As a team, decide what your desired architecture is. Consult senior developers. Ask the team members with different areas of expertise to validate your ideas. The testing team can be particularly helpful. A good architecture will make a huge difference in how easy and effective it is to test. You should be able to take the existing classes or files and build new components. Try various what-if architectures to arrive at the desired architecture for your existing code.

3. With the desired architecture in hand, you should now know what changes are needed and what the unwanted dependencies are. Prioritize the dependencies to fix based on your requirements. If you have areas of code that change frequently, you should think about componentizing them. Always take into account your current requirements. While reengineering has its own benefits, it is unlikely that you will stop making other improvements during this time. Any reengineering effort is likely to be in conjunction with other improvements. A good reengineering tool will allow you to perform reengineering work in conjunction with making continued enhancements to the product. Another benefit of this approach is that it will build management support for the reengineering effort.

To learn more watch our Webinar on Reengineering Legacy Code.

4. The last step is to make sure you communicate the reengineering plan to the entire team. With a prioritized scheme, make reengineering a part of continuous integration. You can create rules that prevent things from getting worse by continuously examining the changes against the desired architecture. Reengineering stories should be part of agile planning just like any other stories. Not only can you do reengineering, you can make it part of your normal development. The best reengineering exercises often minimize disruption and still allow you to migrate to a new architecture.
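To make the idea of rules that prevent things from getting worse more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the component names, the class names, the hand-written dependency list, and the baseline of already-known violations. In a real project the dependency list would come from whatever analysis tool you use. The sketch checks each dependency against the desired architecture and reports only the violations that are not already in the baseline, which is the kind of check a continuous integration job could run.

```python
# Hypothetical desired architecture: each component may depend only on the
# components listed for it. All names here are illustrative.
ALLOWED = {
    "ui": {"services"},
    "services": {"domain", "persistence"},
    "persistence": {"domain"},
    "domain": set(),
}

# Hypothetical assignment of classes (or files) to components.
COMPONENT_OF = {
    "LoginPage": "ui",
    "OrderService": "services",
    "OrderRepository": "persistence",
    "Order": "domain",
}

# Dependencies between classes, written by hand for this sketch.
DEPENDENCIES = [
    ("LoginPage", "OrderService"),
    ("OrderService", "OrderRepository"),
    ("OrderService", "Order"),
    ("OrderRepository", "Order"),
    ("Order", "OrderRepository"),      # unwanted: domain reaches into persistence
    ("LoginPage", "OrderRepository"),  # unwanted: ui bypasses services
]

# Violations the team already knows about and has scheduled as stories.
BASELINE = {("Order", "OrderRepository")}


def find_violations(dependencies, component_of, allowed):
    """Return dependencies that cross components in a disallowed direction."""
    violations = []
    for source, target in dependencies:
        src, dst = component_of[source], component_of[target]
        if src != dst and dst not in allowed[src]:
            violations.append((source, target))
    return violations


def new_violations(current, baseline):
    """Return violations that are not part of the accepted baseline."""
    return sorted(set(current) - set(baseline))


violations = find_violations(DEPENDENCIES, COMPONENT_OF, ALLOWED)
regressions = new_violations(violations, BASELINE)
for source, target in regressions:
    print(f"New unwanted dependency: {source} -> {target}")
```

With a check like this, the build can fail only when `regressions` is non-empty, so existing problems do not block day-to-day work while new ones are caught immediately. Shrinking the baseline over time is then one way to track progress on the reengineering stories.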

A Software Architect's Perspective on APIs and Copyrights


The topsy-turvy legal struggle between Oracle and Google over Java got me interested in the issues that have produced such divergent judgements and such passionate responses. Initially, it seemed that it was about patents, then it turned into a battle over copyrights, and then narrowed to fair use. And this battle isn’t over yet. As a programmer and an architect, I first tried to understand how the legal system thinks about copyright. So, I started thinking about books.

A book is protected by copyright. A typical book consists of chapters that consist of paragraphs that consist of sentences, and eventually we reach words. The same book may also contain illustrations, a table of contents, and a cover. A book may have a particular type of binding and the pages may have a certain thickness and feel. So, when a part of a work is used by another author, how does the legal system decide whether it is an infringement? Simply using the same words in different sentences isn’t enough; after all, we all use the same language. Simply repeating the sentence doesn’t do it either. You are allowed to quote parts of another book in your own work. This is called fair use. Are the titles protected? Are the events described in a fictional work protected? There is no one answer. Like all things in law, there is a large case history about what is an infringement and what isn’t. The underlying conclusion I came to is that the legal system treats the work as a whole, and then sees how much or what parts of the whole constitute sufficient grounds for infringement. Of course this approach does not provide a clear answer because it is impossible to enumerate all the possible ways that could constitute infringement.

And, yet, I like this approach. This is systems thinking. As an architect, I am used to thinking of systems as a whole. When we think of systems, we think of the parts of the system and how they interrelate, seen from different perspectives. The perspective is important because we may think of the same system in completely different ways depending on who we are. For instance, as the driver of a car I have different concerns from the passenger in my car, while the designer of the car thinks of it in an entirely different way.

So, let’s examine Java. This particular dispute is from the perspective of the developer. Java software consists of many parts. There is a Java language specification, there are many libraries, there are programs to compile the language, there are programs to package the compiled output, there are fonts, there is a virtual machine to run the compiled output, and there is a fair amount of documentation. There is even the steaming coffee cup logo to go with it. These are just a few components that come to mind.

To complete the notion of a system, we must also look at the interrelationships. The documentation of the libraries obviously depends on the API because that’s a key part of what the documentation is about. Of course, the API depends on the implementation. You cannot provide services that you don’t implement, much like a car with an automatic transmission doesn’t have a lever to shift gears. At an even more subtle level, the API for one part may refer to the API for another part, and even the implementation of one part may use the API or implementation for another part. So, these parts are all intertwined. As an architect, I want to tease them apart in a way that makes things understandable and manageable. Nevertheless, take it from me, not everything can be or should be separated. Indeed, once you accept that one part of a system is a creative expression subject to copyright, then it is not hard to accept that other parts also share that creative expression since they are all interrelated.
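For readers less familiar with the distinction this paragraph relies on, here is a minimal sketch in Python. All of the names are hypothetical and chosen only for illustration; this is not code from Java or from any real library. It shows an API declared separately from two interchangeable implementations, and a caller that depends only on the API.

```python
from abc import ABC, abstractmethod


class Comparisons(ABC):
    """The API: a name, parameters, and documented behavior, but no logic."""

    @abstractmethod
    def max(self, a: int, b: int) -> int:
        """Return the larger of a and b."""


class BranchingComparisons(Comparisons):
    """One implementation of the API, using a comparison."""

    def max(self, a: int, b: int) -> int:
        return a if a >= b else b


class ArithmeticComparisons(Comparisons):
    """A different implementation of the same API, using arithmetic."""

    def max(self, a: int, b: int) -> int:
        return (a + b + abs(a - b)) // 2


def largest(values, comparisons: Comparisons) -> int:
    """Caller code: written against the API, unaware of the implementation."""
    result = values[0]
    for value in values[1:]:
        result = comparisons.max(result, value)
    return result


print(largest([3, 9, 4], BranchingComparisons()))
print(largest([3, 9, 4], ArithmeticComparisons()))
```

The caller and the documentation are tied to the declaration in `Comparisons`, while the two implementations can differ freely underneath it. That is one small picture of how the parts stay intertwined even when they can be told apart.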

So this is what the legal system contends with when asked to adjudicate a copyright dispute. What is the entire work? How much was copied and does that constitute infringement? This is a daunting task with considerable subjectivity and, perhaps, that is why it gets handed over to a jury to decide on a case-by-case basis. The higher echelons of our legal system have not been amenable to absolute tests. For instance, you cannot look at the percentage of text that you may have copied to say whether that’s an infringement. And, in this case, the higher courts rejected the notion that copying an API is unconditionally permissible.

So how far does this analogy of examining software as a creative work like a book go? Java libraries are organized into packages. Can we think of the Java packages as chapters of a book or the API as the title of the chapters? Here the analogy appears to be fairly weak. The chapters are normally sequenced while there is no particular ordering constraint for Java packages. The title is merely informative or suggestive of what is to follow while the API describes in precise detail exactly how to make use of the underlying implementation.

The analogy between books and software begins to fray to a much greater extent when we consider how we interact with them. Typically, we read books. All other uses are marginal, including burning them to express disapproval or holding them up to hide our face. Software is different. We interact with software; we do things with software. When you stream a movie from Netflix, a complex series of interactions ensues. Netflix downloads software into your display device (e.g., a browser on your PC); that software then uses APIs to communicate with other software on Netflix servers to show you the list of movies that you might be interested in. Once you make your selection, the software then streams the movie into your device, which also contains software to display it on your monitor or TV. Indeed, the TV may contain software as well. So, software interacting with humans and software interacting with other software is pervasive.

So, here we come to the other aspect of software. Software is used for “doing things” and APIs are central to doing things with it. Indeed, the developer, much like the designer of a car, is intimately concerned about how the software will be used. The API design reflects it. So, when the district judge ruled that APIs couldn’t be copyrighted, he viewed them as essential for “doing things,” and when the Federal Circuit reversed him, it viewed APIs as an essential part of a creative work.

For me, as a software architect, all of these aspects are parts of the same system. The legal system tries to tease the system apart and apply different parts of the law to different parts of the system. It is a hard task. Some might consider it nearly impossible. And, yet, billions of dollars are at stake and the decision will have an impact on the entire industry. One way for us to think about it is to go back to the underlying constitutional rationale: Copyrights exist for the public good. Let that be the cornerstone for this decision.

Reasons NOT to Refactor your code


Last week I wrote about the reasons to refactor code. Let us now look at some reasons why you shouldn’t refactor code. When dealing with legacy code there will always be a temptation to refactor the code to improve its understandability or performance. However, here are some reasons why it might be better to hold off:

1. You do not have the proper tests in place

Do not waste time refactoring your code when you do not have the proper tests in place to make sure the code you are refactoring is still working correctly. A refactoring exercise presupposes a good engineering environment, and testing is one of the key components of that environment. If you don’t have a good way to test what you changed, it is better to hold off making that change until you can fully test it. Our developers tell us it is impossible to write good code without thorough testing. I believe them.
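One common way to get such tests in place for legacy code is to write tests that record what the code does today, before you touch it. Here is a minimal sketch in Python using the standard `unittest` module. The `legacy_shipping_cost` function and its pricing rules are invented for this example and stand in for whatever code you intend to refactor.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    # Hypothetical legacy function standing in for the code to be refactored.
    cost = 5.0
    if weight_kg > 10:
        cost = cost + (weight_kg - 10) * 0.5
    if express:
        cost = cost * 2
    return cost


class ShippingCostCharacterization(unittest.TestCase):
    """Pins down the current behavior before any refactoring begins."""

    def test_light_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, express=False), 5.0)

    def test_parcel_at_weight_threshold(self):
        self.assertEqual(legacy_shipping_cost(10, express=False), 5.0)

    def test_heavy_parcel(self):
        self.assertEqual(legacy_shipping_cost(14, express=False), 7.0)

    def test_heavy_express_parcel(self):
        self.assertEqual(legacy_shipping_cost(14, express=True), 14.0)
```

Saved in a file, these tests can be run with `python -m unittest`. If they pass before the refactoring and still pass after it, you have some evidence that the behavior you care about has not changed. If you cannot write tests like these yet, that is a sign to hold off.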

2. Allure of technology

Don’t make a refactoring change just because an exciting new technology has been released. Given the fast pace of change there will always be something new and exciting. Today’s new and exciting technology will be legacy tomorrow. Instead, seek to understand the value of the new technology. If a Java backend is working fine, don’t jump to Node.js unless you know that its event handling is necessary for your application. Too many legacy applications are hard to maintain because they have a mish-mash of languages, frameworks, and technologies.


3. The application doesn’t need to change

The primary purpose for changing an application is to satisfy new user requirements or usage conditions. So as long as the user of the application is content with the operation of the application there is less of a need to refactor the code. If there is no reason to change the application there is no reason to refactor it. Even if your company is swimming in money and you don’t have anything else to do, don’t do it.