
The Importance of Managing Complexity

Introduction

We write software to make our lives easier: to reduce complexity and chaos. The need to reduce complexity is at the very center of software development. Unfortunately, we tend to mirror the complexity of the real world in our software, and this leads to many problems. As Bruce Schneier said in his essay A Plea for Simplicity: You can’t secure what you don’t understand, “the worst enemy of software security is complexity.” In this blog post, we will talk about the importance of managing complexity and how to reduce it in software.

The Importance of Managing Complexity

Fred Brooks, in his paper “No Silver Bullet: Essence and Accidents of Software Engineering”, explains that two kinds of properties introduce difficulty: essential and accidental. Essential properties are the properties a thing must have to be that thing. Accidental properties are the properties a thing happens to have but does not need in order to be what it is; they could also be called incidental, discretionary, or optional properties. Brooks argues that the difficulties arising from accidental properties have largely been addressed with higher-level languages, the introduction of IDEs, hardware enhancements, and so on.

This leaves the essential difficulties, which are harder to solve. You are now interfacing with the complex, chaotic real world, where you must identify all the dependencies and edge cases of a problem. As software addresses larger and larger problems, interactions between entities become more complex, and this increases the difficulty of creating new software solutions. The root of all essential difficulties is complexity.

Conceptual errors (in specification, design, and testing) are much more damaging than syntax errors. C.A.R. Hoare, the inventor of the quicksort algorithm, stated in his Turing Award lecture The Emperor’s Old Clothes, “there are two ways of constructing a software design: one way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.” Steve McConnell, in Code Complete, states that software’s primary technical imperative is to manage complexity. When software projects fail for technical reasons, the reason is often uncontrolled complexity: things become so complex that no one really knows what is going on. Edsger Dijkstra pointed out that no one’s skull can contain a whole program at once; we should therefore organize our programs in such a way that we can safely focus on one part at a time.

Brooks paints a pessimistic picture of the future, stating that there is no “silver bullet” in either technology or management technique that promises even an order-of-magnitude improvement in productivity, reliability, or simplicity. While there is no silver bullet, there are things that can be done to reduce complexity.

How to Reduce Complexity

Inefficient, complicated designs generally fall into one of three categories:

  1. A complex solution to a simple problem
  2. A simple, incorrect solution to a complex problem
  3. An inappropriate, complex solution to a complex problem

As stated above, the most important technical goal is managing complexity. In terms of software architecture, complexity can be reduced by dividing a system into a series of subsystems. The goal is to break a complicated problem into simpler, easier-to-manage pieces. The less the subsystems depend on each other, the safer it is to focus on one area of complexity at a time. McConnell gives 14 recommendations on how to conquer complexity:

  1. Dividing a system into subsystems at the architecture level so your brain can focus on a smaller section of the system at one time
  2. Carefully defining class interfaces so you can ignore the internal workings of the class
  3. Preserving the abstraction represented by the class interface so your brain doesn’t have to remember arbitrary details
  4. Avoiding global data, because global data vastly increases the percentage of the code you need to juggle in your brain at any one time
  5. Avoiding deep inheritance hierarchies because they are intellectually demanding
  6. Avoiding deeply nested loops and conditionals because they can be replaced by simpler control structures (see the sketch after this list)
  7. Avoiding gotos because they introduce nonlinearity that has been found to be difficult for most people to follow
  8. Carefully defining your approach to error handling rather than using an arbitrary number of different error-handling techniques
  9. Being systematic about the use of built-in exception mechanisms, which can become nonlinear control structures that are about as hard to understand as gotos if not used with discipline
  10. Not allowing classes to grow into monster classes that amount to whole programs in themselves (Lattix Architect has a number of metrics that can help with this)
  11. Keeping routines short
  12. Using clear, self-explanatory variable names so your brain doesn’t have to waste cycles remembering details like “i stands for the account index, and j stands for the customer index, or was it the other way around?”
  13. Minimizing the number of parameters passed to a routine, or, more importantly, passing only the parameters needed to preserve the routine interface’s abstractions
  14. Using conventions to spare your brain the challenge of remembering arbitrary accidental differences between different sections of code
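
To make recommendation 6 concrete, here is a minimal Java sketch of replacing deeply nested conditionals with guard clauses. The Order and Inventory types are hypothetical, included only so the example is self-contained; the point is simply that the flat version is easier to hold in your head.

    // A minimal sketch of recommendation 6: replacing deeply nested conditionals
    // with guard clauses. The Order and Inventory types are hypothetical.
    public class OrderValidator {

        record Order(String itemId, boolean cancelled) {}

        interface Inventory {
            boolean hasStock(String itemId);
        }

        // Deeply nested version: every new rule adds another level of indentation.
        boolean canShipNested(Order order, Inventory inventory) {
            if (order != null) {
                if (!order.cancelled()) {
                    if (inventory.hasStock(order.itemId())) {
                        return true;
                    }
                }
            }
            return false;
        }

        // Guard-clause version: the same logic reads as a flat list of rules.
        boolean canShip(Order order, Inventory inventory) {
            if (order == null) return false;
            if (order.cancelled()) return false;
            return inventory.hasStock(order.itemId());
        }
    }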

Conclusion

The importance of managing complexity can be seen when we start looking at other people’s source code. Depending on the complexity of the software, this can be a daunting task. As stated in Code for the Maintainer, “always code and comment in such a way that if someone a few notches junior picks up the code, they will take pleasure in reading and learning from it.” Lattix Architect has helped hundreds of companies reduce the complexity of their source code by visualizing, optimizing, and controlling their software architecture.

Refactoring Legacy Software

The Problem with Legacy Software and Refactoring

For an architect who is new to a legacy software project, it is often very hard to understand the existing architecture, determine the extent of architectural decay, and identify architectural smells and metric violations. Without that understanding, it is almost impossible to refactor without breaking working code. Legacy applications are often critical to the business and have been in use for many years, sometimes decades. Since the business is always changing, there is constant pressure to support additional requirements and fix existing bugs. However, changing these applications is difficult, and you end up spending an increasing amount of resources maintaining the software.

There are many reasons why maintaining legacy software is such a difficult problem. Often most, if not all, of the original developers are gone, and no one understands how the application is implemented. The technologies used in the application are no longer current, having been replaced by newer and more exciting technologies. Also, the software’s complexity increases as it evolves and new requirements are added.

The key to managing the lifecycle of the software is to understand how the application is implemented and how it is used, whether you are looking to replace a legacy application or gradually refactor it to support new requirements. Renovation of a legacy system involves considerable re-architecting.

Understand the Legacy Software

To understand an application, you need to understand how it is used by its various stakeholders. These different perspectives are ultimately reflected in the design of the software. This is the same process we use to understand the complexity of any system.

Understand how it is used
Understanding how an application is used is critical to understanding its design. As an application evolves, special use cases are often added that are unique to the business, and these show up in the way the application was designed and implemented. How the application is used also influences its performance requirements. As an example, a word processor has very different performance requirements than a high-frequency trading platform.

Understand how it is deployed
This is often one of the most neglected aspects of architectural analysis. One reason is that, historically, many applications were monoliths and there was not much to understand about how they were deployed. With the rise of microservices, an application can be distributed across multiple containers or services, which makes understanding how it is deployed more important than ever.

Understand how it is built
It is necessary to understand how each component is built. This is especially true for languages like C/C++, where there are a variety of compile-time options when generating object files. These options are used to generate different variants (typically for different hardware platforms) from the same source code. Without understanding these options, it wouldn’t be possible to fully analyze the code.

Understand how it is structured
This is an area that developers typically care about a lot and where a large part of the complexity resides. The code could be organized into thousands of interdependent files. A key goal of architectural analysis is to organize these files and elements into modular groups. This calls for architecture discovery.

Using a dependency structure matrix (DSM) representation and analysis techniques is a great methodology for understanding and analyzing a legacy system.

Refactoring Legacy Software

This methodology can be applied to reduce complexity and make the software transparent. There are a number of techniques you can use to analyze legacy systems. Boeing, for example, uses a DSM approach for its knowledge-based engineering systems, and Erik Philippis, founder and CEO of ImprovemenT BV, uses the horseshoe model.

Here is another technique you can use:

  1. Examine the existing artifacts; they are a great starting point. For example, the file/directory structure or the package/namespace structure is already a guide to how the developers organized the code elements.
  2. Apply partitioning and clustering algorithms to discover the layers and independent components. Even if the architecture has significantly eroded, identifying the minimal set of dependencies that cause cycles will often lead to the discovery of the intended layers. Lattix’s DSM approach is very helpful with this (a toy sketch of the layering idea follows this list).
  3. Experiment with what-if architectures. Create different logical modules and examine the dependencies associated with those modules. If you are looking to create a microservice or componentize, create logical components or services using the current set of files/classes/elements. If there are no dependencies between these components and they are supposed to be independent of each other, then you know these can be independent services. On the other hand, if there are dependencies between these components, you know what dependencies to eliminate.
  4. Ask the developers and architects who have been supporting the application. They will already have some understanding of the architecture, and they will have the knowledge to assist in experimenting with what-if architectures. Experimenting with what-if architectures is a good exercise to sharpen your understanding of the system.
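
To make step 2 a little more concrete, here is a toy Java sketch of the layering idea (this is not Lattix’s actual algorithm): repeatedly peel off modules whose dependencies have already been assigned to lower levels; whatever cannot be peeled off is stuck in, or behind, a dependency cycle. The module names and dependencies are made up for illustration.

    import java.util.*;

    // Toy sketch of dependency leveling: peel off modules whose dependencies are
    // already placed; what remains is involved in (or depends on) a cycle.
    public class DependencyLeveler {

        public static void main(String[] args) {
            // Hypothetical module -> dependencies map, for illustration only.
            Map<String, Set<String>> deps = Map.of(
                    "ui",       Set.of("services"),
                    "services", Set.of("domain", "util"),
                    "domain",   Set.of("util"),
                    "util",     Set.of(),
                    "jobs",     Set.of("services", "ui"),
                    // An intentional cycle: reports <-> export
                    "reports",  Set.of("export", "domain"),
                    "export",   Set.of("reports"));

            Set<String> placed = new HashSet<>();
            int level = 0;
            boolean progress = true;
            while (progress) {
                progress = false;
                List<String> thisLevel = new ArrayList<>();
                for (var e : deps.entrySet()) {
                    if (!placed.contains(e.getKey()) && placed.containsAll(e.getValue())) {
                        thisLevel.add(e.getKey());
                    }
                }
                if (!thisLevel.isEmpty()) {
                    placed.addAll(thisLevel);
                    System.out.println("Level " + level++ + ": " + thisLevel);
                    progress = true;
                }
            }
            // Anything left over is in a cycle or depends on one.
            Set<String> leftover = new HashSet<>(deps.keySet());
            leftover.removeAll(placed);
            System.out.println("Stuck in or behind a cycle: " + leftover);
        }
    }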

The goal of architectural discovery is to understand the organization of the application. It is one of the most effective ways to start a refactoring process that makes the code more understandable and maintainable. A clear understanding of the architecture will also prevent further architectural erosion.

Summary

Architecture is very important when dealing with legacy applications. It contains the knowledge of how to handle the software. Even if you decide to end-of-life a legacy application, the architectural knowledge left over from the project will be vital for the application that will replace it. If you are interested in seeing how Lattix can help with your legacy applications, see our whitepaper “Managing the Evolution of Legacy Applications” or sign up for a free trial.

What is a Microservices Architecture?

“In short, the microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API. These services are built around business capabilities and independently deployable by fully automated deployment machinery. There is a bare minimum of centralized management of these services, which may be written in different programming languages and use different data storage technologies.”
- James Lewis and Martin Fowler

From this definition, we understand that microservices are small, independently deployable services that work together. These services are focused on doing one thing well (the Single Responsibility Principle). In this style, you break down your larger system into multiple microservices that interact with each other to accomplish the larger goal. There is no standard model for a microservices architecture, but most share some notable properties.


First, they are autonomous. Microservices are typically created by componentizing the software, a component being a unit of software that is independently replaceable and upgradeable. This is especially important as more applications are being deployed to the cloud where load demands can increase dramatically. All communication happens via lightweight networking calls (APIs). The aim is to be as decoupled and as cohesive as possible.
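
As a minimal illustration of what “communicating over a lightweight HTTP API” can look like, here is a self-contained Java sketch using only the JDK’s built-in com.sun.net.httpserver package. The service name, route, and payload are hypothetical; a real microservice would add proper serialization, logging, health checks, and so on.

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    // A minimal sketch of a service exposing a lightweight HTTP API using only
    // the JDK's built-in HTTP server. Service name, route, and payload are made up.
    public class InventoryService {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);
            server.createContext("/stock", exchange -> {
                byte[] body = "{\"itemId\":\"A-42\",\"available\":17}"
                        .getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().set("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();  // other services talk to this one only via the API
        }
    }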

Microservices do not require a standard technology stack. While large software applications traditionally standardize on a single technology stack, when you split your software into independent services you can choose the technology stack for each service. For example, you can use C/C++ for real-time services, Java for the GUI, and Node.js for reporting. Remember, though, that there is overhead in maintaining a different technology stack for each service.

You need to design your services to handle the failure of other services; they need to be resilient. This is a consequence of breaking the software into individual components. You need to consider how the failure of a single service will affect the overall user experience. Each service should therefore fail as quickly as possible and, where possible, restore itself automatically.
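
Here is a small Java sketch of the fail-fast-and-degrade-gracefully idea: calls to a dependent service get a short timeout, and a fallback value is returned when the call fails. The URL and fallback are hypothetical, and a production system would typically layer retries, circuit breakers, and monitoring on top of this.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.time.Duration;

    // Sketch of a resilient client: short timeouts so calls fail fast, and a
    // fallback so one failing service does not ruin the whole user experience.
    public class RecommendationClient {
        private final HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofMillis(500))   // fail fast on connect
                .build();

        public String recommendationsFor(String userId) {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://recommendations:8080/users/" + userId))
                    .timeout(Duration.ofSeconds(1))       // fail fast on slow responses
                    .build();
            try {
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                return response.body();
            } catch (Exception e) {
                // The recommendation service is down or slow: degrade gracefully
                // instead of letting the failure ripple through the experience.
                return "[]";
            }
        }
    }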

Scaling is one of the big advantages of microservices and one of the reasons the approach is so popular. Since each service (feature) does not depend on the other services, it can be deployed separately. You can now distribute the services across servers and replicate them as load demands increase. Compare this to traditional development, where the entire application must be scaled as demand increases.

Different microservices can be owned by different teams. Teams are cross-functional, which means that they contain the full range of skills needed to develop the service. The development team builds the software and owns the product for its lifetime. An example of this is Amazon’s philosophy "You build it, you run it." The advantage is that the development team now has special insight into how users are using their service and can tailor future development to their needs. Teams can also be organized around business capabilities, but you will need to watch out for Conway’s Law, where the design tends to mimic the organizational structure (see our blog post Overcoming Conway’s Law: Protecting Design from Organization).

Finally, because of the componentization of the software there is a lot of opportunity to reuse functionality. In a microservices architecture, as you break the software into individual components you should also design them so that many different programs can reuse the functionality. This takes significant effort and awareness to do correctly but can also lead to increases in quality and productivity (for more information see “Effects of Reuse on Quality, Productivity and Economics" by Wayne C. Lim).

To summarize, microservices: are autonomous, using services to componentize the software; use decentralized governance; are designed to handle service interruptions; are easily scalable; focus on products not projects; are typically organized around business capabilities; and promote software reuse. To learn more see our whitepaper "Developing a Microservices Architecture."

Steps to Follow when Reengineering Code

Developers know that a software system will become more complex and more highly coupled over time as additional changes are made. Often this calls for refactoring or reengineering the code to make it more maintainable. Reengineering will allow you to incorporate what has been learned about how the code should have been designed. This is the kind of learning that was the original basis for the term “technical debt.”

So how should we go about reengineering code that remains vital and useful? In real life we keep applying metaphorical Band-Aids as we make changes and incorporate new technologies. This leads to design erosion. Many progressive managers now understand the debilitating nature of this erosion and how it affects quality and productivity.


Even if we agree that reengineering is called for, how can we plan for it? Here are four key steps to take if you have decided to reengineer your software.

1. Understand the current structure of the code. Always resist the temptation to reengineer without a clear understanding of what you have. Understand and identify the critical components and what their dependencies are. For example, if you are a Java or .NET programmer, understand the various JAR files or assemblies and how they are related to each other. For C/C++, understand the executables and libraries, as well as the code structure used to generate them. Now ask yourself: are these the right components for my desired architecture? Sometimes you have only one component. If that is the case, ask yourself whether you need to split it up into smaller components.

Read our other blog on Reasons NOT to Refactor

2. Examine the internals of the components, particularly the larger ones and the more important ones. Look at the dependencies of the classes or files that constitute the component. Is there excessive coupling? Does this coupling make the code harder to maintain? As a team, decide what your desired architecture is. Consult senior developers. Ask the team members with different areas of expertise to validate your ideas. The testing team can be particularly helpful. A good architecture will make a huge difference in how easy and effective it is to test. You should be able to take the existing classes or files and build new components. Try various what-if architectures to arrive at the desired architecture for your existing code.

3. With the desired architecture in hand, you should now know what changes are needed and what the unwanted dependencies are. Prioritize the dependencies to fix based on your requirements. If you have areas of code that change frequently, you should think about componentizing them. Always take into account your current requirements. While reengineering has its own benefits, it is unlikely that you will stop making other improvements during this time. Any reengineering effort is likely to be in conjunction with other improvements. A good reengineering tool will allow you to perform reengineering work in conjunction with making continued enhancements to the product. Another benefit of this approach is that it will build management support for the reengineering effort.

To learn more watch our Webinar on Reengineering Legacy Code.

4. The last step is to make sure you communicate the reengineering plan to the entire team. With a prioritized scheme, make reengineering a part of continuous integration. You can create rules that prevent things from getting worse by continuously checking changes against the desired architecture. Reengineering stories should be part of agile planning just like any other stories. Not only can you do reengineering, you can make it part of your normal development. The best reengineering exercises minimize disruption while still allowing you to migrate to a new architecture.
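
As an example of the kind of rule that can run in continuous integration, here is a sketch using the open-source ArchUnit library for Java. The package names are hypothetical, and a tool like Lattix would express the same intent through its own rule definitions; the point is simply that an unwanted dependency fails the build instead of silently accumulating.

    import com.tngtech.archunit.core.domain.JavaClasses;
    import com.tngtech.archunit.core.importer.ClassFileImporter;
    import com.tngtech.archunit.lang.ArchRule;

    import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

    // Sketch of an architecture rule checked on every build so the structure
    // cannot silently get worse. Package names are hypothetical.
    public class ArchitectureRules {
        public static void main(String[] args) {
            JavaClasses classes = new ClassFileImporter().importPackages("com.example.app");

            // The domain layer must not reach "up" into the UI layer.
            ArchRule domainDoesNotDependOnUi = noClasses()
                    .that().resideInAPackage("..domain..")
                    .should().dependOnClassesThat().resideInAPackage("..ui..");

            domainDoesNotDependOnUi.check(classes);  // fails the build on violation
        }
    }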

Reasons NOT to Refactor your code

 

Last week I wrote about the reasons to refactor code. Let us now look at some reasons why you shouldn’t refactor code. When dealing with legacy code there will always be a temptation to refactor it to improve its understandability or performance. However, here are some reasons why it might be better to hold off:

1. You do not have the proper tests in place

Do not waste time refactoring your code when you do not have the proper tests in place to make sure the code you are refactoring still works correctly. A refactoring exercise presupposes a good engineering environment, and testing is one of the key components of that environment. If you don’t have a good way to test what you changed, it is better to hold off making that change until you can fully test it. Our developers tell us it is impossible to write good code without thorough testing. I believe them.

2. Allure of technology

Don’t make a refactoring change just because a new, exciting technology has been released. Given the fast pace of change there will always be something new and exciting, and today’s new and exciting technology will be legacy tomorrow. Instead, seek to understand the value of the new technology. If a Java backend is working fine, don’t jump to Node.js unless you know that its event-driven model is necessary for your application. Too many legacy applications are hard to maintain because they have a mishmash of languages, frameworks, and technologies.

To learn more watch our webinar on reengineering legacy code.

3. The application doesn’t need to change

The primary reason to change an application is to satisfy new user requirements or usage conditions. As long as the users of the application are content with how it operates, there is less need to refactor the code. If there is no reason to change the application, there is no reason to refactor it. Even if your company is swimming in money and you don’t have anything else to do, don’t do it.

Four Reasons to Refactor your Code

1. Maintenance is easier


Legacy code architecture erodes over time and becomes difficult to maintain. Legacy code bugs are harder to find and fix. Testing any changes in legacy code takes longer. Even small changes can inadvertently break the application because over time the design has been extended to accommodate new features and the code has become increasingly coupled. Refactoring code allows you to improve the architecture, reduce the coupling, and help the development team understand the intended design of the system. A clean architecture makes the design understandable and easier to manage and change.

Read our other blog on Reasons NOT to Refactor.

2. Make the Design Modular

Split up large applications into components. For instance, monolithic applications can be split up into microservices. In embedded systems, interfaces are created so that drivers can be written to support a variety of hardware devices; these drivers encapsulate the logic for interacting with each device. Most large applications can also be separated into layers, such as the business logic and the user interface, which can itself be split up into various pages, forms, dialogs, and other components. Modularity simplifies the design and is probably the most effective way to increase team productivity.
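
Here is a small Java sketch of the driver-interface idea: the application depends only on an abstraction, while each driver encapsulates the details of one device. All of the names are hypothetical.

    // Sketch of the driver-interface idea: the controller depends only on the
    // TemperatureSensor abstraction; each driver hides one device's details.
    interface TemperatureSensor {
        double readCelsius();
    }

    // One driver per hardware device; each can change or be replaced independently.
    class I2cThermometer implements TemperatureSensor {
        @Override public double readCelsius() {
            // ... talk to the device over I2C ...
            return 21.5;
        }
    }

    class SimulatedThermometer implements TemperatureSensor {
        @Override public double readCelsius() {
            return 20.0;  // handy for tests, no hardware required
        }
    }

    class ClimateController {
        private final TemperatureSensor sensor;

        ClimateController(TemperatureSensor sensor) {
            this.sensor = sensor;  // the controller never knows which driver it got
        }

        boolean shouldHeat() {
            return sensor.readCelsius() < 19.0;
        }
    }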

Check out our blog on a New Way to Think About Software Design.

3. Refactoring is often the cheaper option

When faced with new requirements that appear not to fit into the current design, it is often tempting to propose a complete rewrite. However, a rewrite can be expensive and highly risky. When a rewrite of a project fails it leaves in its wake a dispirited organization with no product to take to market. Before starting a rewrite, do a what-if exercise on the current application to see what would need to change to support the new requirements. Often large parts of an application can be salvaged while other parts are refactored, thereby reducing risk and saving considerable time and effort.

4. Your development team is happier

A design that is easy to understand reduces stress on the team. A modular design allows different team members to improve different components of the project at the same time without breaking each other’s code.

A new way to think about software design

This year’s Saturn Conference in San Diego reflected an evolving landscape, as macro trends such as cloud-based architectures, the Internet of Things (IoT), and DevOps in an Agile world continue to reshape the industry. How do we think about design and architecture in this changing landscape?

Professor Daniel Jackson of MIT, in a keynote at the Saturn Conference, gave us a fresh look at how to think about design. The idea is simple and elegant, and one wonders why it took so long for somebody to come up with it. Simply put, Professor Jackson describes an application as a collection of coherent concepts that fulfill the purposes of the application. The beauty of this formulation is that it eliminates the clutter of implementation artifacts.


When we describe the design of a program in UML, we struggle to create structural and behavioral diagrams that accurately reflect the program’s implementation. Sadly (and, perhaps, mercifully) we rarely succeed in this endeavor, and even if we did, those diagrams would likely be just as hard to understand as the code (think of creating interaction diagrams to represent various method call chains). And if our implementation language happens to be a non-object-oriented language, then we are plain out of luck. On the other hand, this new kind of thinking has the potential to transcend implementation language and, perhaps, even technology. It also has ramifications for the architect-versus-developer debates that rage in the world of software engineering today.

Conceptual Design vs Representational Design: Reducing the clutter

Professor Jackson provided several examples of applications and the concepts they embody. For instance, an email application embodies concepts such as Email Address, Message, and Folder, while a word processor embodies concepts such as Paragraph, Format, and Style. A considerable part of the presentation delved into the details that illustrate the sophistication underlying these concepts and the confusion that results when they are poorly defined.

So, how do we select concepts? Professor Jackson defines purposes that a concept fulfills. In a clean design, he said, a concept fulfills a single purpose. This has ramifications that I have yet to fully get my head around. It reminds me of the Single Responsibility Principle, which is also a difficult concept to understand. In any case, I suspect that defining a coherent set of concepts is difficult and takes repeated iterations of implementation to get right. In fact, the users of the software are likely to be a critical part of the process as concepts are pruned, split up, or even eliminated to make them coherent and understandable.

And how do we implement concepts? Does a concept map to a single class or to multiple classes if implemented in an object-oriented language? I will eagerly wait to see further work on this approach.
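
While we wait for that further work, here is a purely speculative Java sketch (not Professor Jackson’s proposal) of the email example, where each concept happens to map to a single type; richer concepts would likely need several collaborating classes.

    import java.util.ArrayList;
    import java.util.List;

    // Speculative illustration only: the email concepts Email Address, Message,
    // and Folder rendered as plain types. Not Professor Jackson's proposal.
    class EmailAddress {
        final String value;
        EmailAddress(String value) { this.value = value; }
    }

    class Message {
        final EmailAddress from;
        final String subject;
        Message(EmailAddress from, String subject) {
            this.from = from;
            this.subject = subject;
        }
    }

    // The Folder concept: a named container whose single purpose is organizing messages.
    class Folder {
        final String name;
        private final List<Message> messages = new ArrayList<>();
        Folder(String name) { this.name = name; }

        void file(Message m) { messages.add(m); }
        List<Message> contents() { return List.copyOf(messages); }
    }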

Go look up the slides of this thought-provoking presentation here: Rethinking Software Design.

Android Kernel: Lacking in Modularity


We decided to take a look at the architecture of the Android Kernel. We selected the panda configuration for no particular reason - any other configuration would have worked just as well. The kernel code is written in C and it is derived from the Linux kernel. So, our approach will work on any configuration of the generic Linux kernel, as well.

Now, we all know that C/C++ is a complex language, so we expected the analysis to be hard. But that difficulty is really about parsing. Armed with the Clang parser, we felt confident and were pleased that we didn't run into any issues. Our goal was to examine all the C/C++ source files that go into the panda configuration and to understand their inter-relationships. To do this, it was necessary to figure out which files are included in or excluded from the panda build. And then there were issues dealing with how all the files were compiled, included, and linked. That all took effort. The resulting picture showed how coupled the Linux kernel is.

First, let's acknowledge that the Linux kernel is well-written. What goes into it is tightly controlled. Given its importance in the IT infrastructure of the world, that is just what one would hope. Let us also remember that many of the modularity mechanisms in use today were invented in Unix. The notion of device drivers that plug into an Operating System was popularized by Unix and is commonplace today. Application pipes were pioneered by Unix. And yet, the Linux kernel itself has poor modularity.

Part of the problem is that, when the Unix/Linux kernels were developed, programming language support for modularity was poor. For instance, C does not have the notion of an interface, so dependency inversion is not naturally supported (it is possible, however). And Linux has no real mechanisms to verify or enforce modularity.

A few things become apparent after a partitioning algorithm is applied. The partitioning algorithm reorders the subsystems based on dependencies, revealing what is "lower" and what is "higher." In an ideal implementation, the developers of the higher layers need only understand the API of the lower layers, while the developers of the lower layers need to worry about the higher layers only when an interface is affected. In a coupled system, developers need to understand both layers, which makes understanding the impact of a change considerably harder. Indeed, in the Android kernel, where nearly all the layers are coupled, developers may sometimes have to understand thousands of files to feel confident about their changes.

This also means that the intent behind the decomposition has been lost. For instance, 'arch.arm' is so strongly coupled with 'kernel' that it is hard for developers to understand one without understanding the other. Notice how even the 'drivers' are coupled to the rest of the system. I experimented by creating a separate layer for the base layer of the drivers, and I even moved some of the basic drivers such as 'char' and 'tty', and yet the coupling remained. Sadly, some of the newer drivers are also coupled to the kernel.

All this goes to show that unless there is a focus on architecture definition and validation, even the best-managed software systems will experience architectural erosion over time.

If you would like to discuss the methodology of this study or if you would like to replicate the results on your own, please contact me (neeraj dot sangal at lattix dot com). You can peruse a Lattix white paper on the Android kernel for some more details.

Analyzing ArgoUML

Johan van den Muijsenberg just published (in Dutch) his analysis of ArgoUML in a magazine published by the Java User Group in the Netherlands. The brilliance of Johan's analysis is how logically straightforward it is and how it yields clearly identifiable architectural issues and fixes. It is yet another example of how easily architecture erodes from its intended design. If more teams were to focus on fixing "bugs" in their architecture, they would reap rich dividends in improved quality and productivity.

My main complaint is that Dutch readers shouldn’t be the only ones to benefit from this interesting and useful article. Here is a Google translation into English.

What Star Trek Can Teach Us About Software Architecture


The writers of Star Trek: Voyager envisioned a game worthy of challenging the superior mind and intellect of a Vulcan. They called it Kal-toh. To the human eye, Kal-toh looks to be a high-tech fusion of Jenga and chess. Lieutenant Tuvok of the starship Voyager would be quick to tell you, "Kal-toh is to chess as chess is to tic-tac-toe."

In Kal-toh, the players are presented with rod-like game pieces that are in total chaos. This could be compared to legacy code with no discernible architecture or documentation: spaghetti code, or a big ball of mud. The object of Kal-toh is to move the pieces (systems and subsystems) until a perfect structure, or architecture, is formed. The challenge in Kal-toh, as in software architecture design, is that if you move the wrong piece the entire structure can collapse.

Kal-toh and software architecture are based on very similar principles. An experienced Kal-toh player has the ability to visualize the complexity of the Kal-toh puzzle. Without fully understanding how the Kal-toh pieces interact with each other, there is no roadmap for creating stability in the structure.

Visualizing software systems is hard because there are a very large number of elements with dependencies on each other. Therefore, the ability to scale is critical. Furthermore, the purpose of visualization is to reveal the architecture. This means that it is important to not just draw the dependencies but to show where dependencies are missing and to highlight the problematic dependencies.

Lattix has pioneered an award-winning approach to large-scale visualization based on a Dependency Structure Matrix (DSM). Lattix incorporates hierarchy in the display, showing the architectural decomposition and the dependencies at the same time. Built-in algorithms help discover the architecture. What-if capabilities allow users to modify the current structure to reflect the desired architecture. Users get a big-picture view while still being able to identify problematic dependencies. Lattix also provides an intuitive Conceptual Architecture diagram that takes some of the lessons of a DSM and makes them accessible to a wider audience.
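
For readers who have never seen a DSM, here is a toy Java sketch of the information such a matrix records; it is not Lattix's implementation, and Lattix's on-screen conventions may differ. In this sketch, cell (row, column) counts references from the row's module to the column's module, the modules are ordered lowest layer first, and the numbers are made up.

    // Toy illustration of a dependency structure matrix. Cell (row, column)
    // counts references from the row's module to the column's module; the
    // module names and counts are made up.
    public class ToyDsm {
        public static void main(String[] args) {
            String[] modules = {"util", "domain", "services", "ui"};
            int[][] refs = {
                    { 0, 0, 0, 0},   // util depends on nothing
                    {12, 0, 0, 0},   // domain -> util
                    { 7, 9, 0, 0},   // services -> util, domain
                    { 3, 0, 5, 0},   // ui -> util, services
            };

            System.out.printf("%10s", "");
            for (String m : modules) System.out.printf("%10s", m);
            System.out.println();
            for (int r = 0; r < modules.length; r++) {
                System.out.printf("%10s", modules[r]);
                for (int c = 0; c < modules.length; c++) {
                    System.out.printf("%10s", refs[r][c] == 0 ? "." : String.valueOf(refs[r][c]));
                }
                System.out.println();
            }
            // With this ordering and convention, every mark sits below the diagonal,
            // which is what clean layering looks like; marks above the diagonal would
            // indicate dependencies that run against the layering, including cycles.
        }
    }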

Let me leave you with a Star Trek quote. Lieutenant Tuvok said, "Kal-toh is not about striving for balance. It is about finding the seeds of order, even in the midst of profound chaos." The same can be said about Lattix.