Discovering Software Rot

Since I have seen some announcements about talks targeting the German Corona Warn App, I was interested to see what kind of information I could extract with Sonargraph.

The project is still pretty young, but maybe there is already some structural erosion happening? (Spoiler: There is!)

This blog post is about the setup of the analysis pipeline consisting of Sonargraph-Build and Sonargraph-Enterprise that allowed me to analyze 24 versions (1.4.0 to 2.9.0) of the CWA-Server project and investigate metric trends.

The execution is implemented in Java and published as a new repo “Sonargraph-Build Batch”. If anybody is interested in analyzing past versions of a project with Sonargraph, “Sonargraph-Build Batch” can serve as a starting point.

Motivation

Software experts like Edward Yourdon (Computer Hall of Fame), Robert C. Martin (a.k.a. “Uncle Bob”), and many others have been saying for a long time that software erosion (a.k.a. “software rot”) is a severe problem in our industry.

It happens when code is written without enforcing a proper structure. These kinds of projects end up as a “big ball of mud”, where bug fixes and other functional changes get more and more costly to implement, because it is impossible to foresee the impact of the changes on the rest of the system. If you picture the system as Gulliver on Lilliput Island, each unnecessarily added dependency is another rope tying it down and making it more difficult to move.

One early indicator of a project heading in this direction is a negative trend in coupling, i.e. dependencies between individual elements (e.g. packages, classes) grow rapidly. A chaotic dependency structure with a lot of circular dependencies is shown in the following screenshot generated with Sonargraph’s Exploration view:

**Visualization of Dependency Structure (“Big Ball of Mud”)**

Kent Beck gave a brilliant talk, “The Beauty of Maintenance”, with a strong focus on coupling, in which he explains its impact on maintenance. He points out that coupling, cohesion, and the importance of an organized dependency structure were already highlighted in the “green book” (Structured Design), written by Edward Yourdon and Larry Constantine in 1979 (!!). The book is worth a read, especially if you still think that effort spent on software design is a waste of time and that the design should rather “emerge” by itself (an approach I have never seen go well).

If you agree with these experts, then you will be interested in monitoring coupling and detecting the creation of unwanted dependencies (things that Sonargraph was built for).

I wanted to see if I could identify some popular projects where structural erosion is a problem. I picked the German cwa-server project and hibernate-core.

Analysis

With Sonargraph-Build and the integrations to SonarQube and Jenkins, it has been possible for some years now to track metric trends. But that requires that you have used Sonargraph during development, i.e. during the time when the releases were built.

I now wanted to go back in time and run Sonargraph on a series of versions of the cwa-server project. To do so, I needed to implement the following:

Retrieve the available releases
For each release:
1. Check out the code for the release from GitHub.
2. Build the project or at least compile the sources.
3. Execute Sonargraph-Build and upload the resulting XML report to Sonargraph-Enterprise.
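
The steps above can be sketched in Java roughly as follows. This is only an illustration, not the code from the repository: the class and method names are made up, the Maven invocation assumes that every release compiles with a plain `mvn compile`, and the command that runs Sonargraph-Build and uploads the report is deliberately left to the caller, because its exact form depends on the local installation.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

/** Hypothetical sketch of the per-release loop; all names are illustrative. */
final class ReleaseBatch {

    /**
     * Commands for one release: check out the tag, compile, then run the
     * analysis command supplied by the caller (an assumption, not a real
     * Sonargraph-Build command line).
     */
    static List<List<String>> commandsFor(String tag, List<String> analysisCommand) {
        return List.of(
                List.of("git", "checkout", "--force", tag),
                List.of("mvn", "--batch-mode", "compile"),
                analysisCommand);
    }

    /** Runs a single command in the working copy and fails on a non-zero exit code. */
    static void run(File workDir, List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .directory(workDir)
                .inheritIO()
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command failed with exit code " + exitCode + ": " + command);
        }
    }

    /** Processes all releases in the given order, stopping at the first failure. */
    static void analyzeAll(File workDir, List<String> tags, List<String> analysisCommand)
            throws IOException, InterruptedException {
        for (String tag : tags) {
            for (List<String> command : commandsFor(tag, analysisCommand)) {
                run(workDir, command);
            }
        }
    }
}
```

The list of tags could come from `git tag --list` in the cloned repository. Stopping at the first failing command keeps a broken build of one release from silently producing an incomplete trend.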

For the cwa-server project, it was OK to implement this as a batch script. The project is pretty new and I did not notice a major change in the build itself. If you look at the results, you can see some signs of structural erosion, such as growing package cycles and an increasing amount of code involved in cycles. Generated and test code are excluded from the analysis.

**Code Involved in Cycles of CWA-Server**

**Number of Packages Involved in Cycles in CWA Server**

I have to admit, these numbers are pretty abstract. The biggest package cycle group, consisting of 10 packages, looks like this when visualized in Sonargraph, and hopefully you agree that this does not look good.

**Cycle Group of 10 Packages in CWA Server**

For one thing, circular dependencies make it more difficult to understand the code base. On the technical side, they make it impossible to split the code into different modules later. At least I don’t know of any module or build system (something like OSGi, Maven, etc.) that allows circular dependencies between modules.
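
To make the term “cycle group” more concrete: a cycle group can be understood as a strongly connected component with more than one element in the dependency graph. The following is a minimal sketch using Tarjan’s algorithm on a map from package name to the packages it depends on. The class name and the input format are assumptions for illustration, and this is not how Sonargraph is implemented internally.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch: finds groups of packages that depend on each other cyclically. */
final class CycleGroups {
    private final Map<String, Set<String>> deps;
    private final Map<String, Integer> index = new HashMap<>();
    private final Map<String, Integer> lowLink = new HashMap<>();
    private final Deque<String> stack = new ArrayDeque<>();
    private final Set<String> onStack = new HashSet<>();
    private final List<Set<String>> groups = new ArrayList<>();
    private int counter = 0;

    private CycleGroups(Map<String, Set<String>> deps) {
        this.deps = deps;
    }

    /** Returns every strongly connected component that contains more than one package. */
    static List<Set<String>> find(Map<String, Set<String>> deps) {
        CycleGroups finder = new CycleGroups(deps);
        // Sorted iteration only makes the order of the result reproducible.
        for (String node : new TreeSet<>(deps.keySet())) {
            if (!finder.index.containsKey(node)) {
                finder.visit(node);
            }
        }
        return finder.groups;
    }

    // Recursive depth-first search; fine for a sketch, very deep graphs would need an iterative variant.
    private void visit(String node) {
        index.put(node, counter);
        lowLink.put(node, counter);
        counter++;
        stack.push(node);
        onStack.add(node);

        for (String target : deps.getOrDefault(node, Set.of())) {
            if (!index.containsKey(target)) {
                visit(target);
                lowLink.put(node, Math.min(lowLink.get(node), lowLink.get(target)));
            } else if (onStack.contains(target)) {
                lowLink.put(node, Math.min(lowLink.get(node), index.get(target)));
            }
        }

        // The node is the root of a component: pop all members off the stack.
        if (lowLink.get(node).equals(index.get(node))) {
            Set<String> component = new TreeSet<>();
            String member;
            do {
                member = stack.pop();
                onStack.remove(member);
                component.add(member);
            } while (!member.equals(node));
            if (component.size() > 1) {
                groups.add(component);
            }
        }
    }
}
```

For example, if `app.api` depends on `app.service`, `app.service` on `app.persistence`, and `app.persistence` back on `app.api`, the three packages form one cycle group, and none of them can be moved into a separate module without breaking the cycle first.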

This kind of batch analysis stops working as soon as the build itself is changed, for example if Maven is replaced by Gradle, different targets need to be called, and maybe other Java versions are required for the build. So this kind of analysis is not suitable for longer timespans.

Since my experience with batch scripts is limited and I desperately missed debugging functionality, I fell back on Java.

For the analysis of hibernate-core, I implemented a different approach:

Retrieve the available releases from Maven.
For each version:
1. Download the classes.jar and sources.jar from Maven Central and use them in a Sonargraph system as class and source root directories.
2. Run Sonargraph-Build, generate XML and HTML reports and upload the results to Sonargraph-Enterprise.
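
As a rough illustration of the first two steps, the following sketch builds the download URLs according to the standard Maven Central repository layout and extracts the version list from the content of an artifact’s `maven-metadata.xml`. The class and method names are hypothetical and do not mirror the code in the repository; the actual download and the Sonargraph run are omitted.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Illustrative helper for locating artifacts on Maven Central; names are assumptions. */
final class MavenCentralArtifacts {
    static final String BASE = "https://repo1.maven.org/maven2/";

    private static String artifactPath(String groupId, String artifactId) {
        return BASE + groupId.replace('.', '/') + "/" + artifactId + "/";
    }

    /** URL of the metadata file that lists all published versions of an artifact. */
    static String metadataUrl(String groupId, String artifactId) {
        return artifactPath(groupId, artifactId) + "maven-metadata.xml";
    }

    /** URL of the binary jar, or of the sources jar if 'sources' is true. */
    static String jarUrl(String groupId, String artifactId, String version, boolean sources) {
        String suffix = sources ? "-sources.jar" : ".jar";
        return artifactPath(groupId, artifactId) + version + "/" + artifactId + "-" + version + suffix;
    }

    /** Extracts the entries of the versions element from the metadata XML. */
    static List<String> parseVersions(String metadataXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(metadataXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("version");
            List<String> versions = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                // Skip a possible top-level version element outside the versions list.
                if ("versions".equals(node.getParentNode().getNodeName())) {
                    versions.add(node.getTextContent().trim());
                }
            }
            return versions;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse Maven metadata", e);
        }
    }
}
```

For `org.hibernate:hibernate-core`, `metadataUrl` yields `https://repo1.maven.org/maven2/org/hibernate/hibernate-core/maven-metadata.xml`. Not every artifact publishes a sources jar, so a real implementation needs to handle a missing download.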

After repeating this for a couple of other Maven modules, I generalized and simplified the code. The analysis can now be started simply by specifying a couple of arguments for the file “AnalyzeMavenArtifact.java” (for more details, check the code on GitHub).

AnalyzeMavenArtifact <groupId> <artifactid> ./src/main/resources/maven-central.properties writeVersionsFile

This obviously speeds up the analysis dramatically, since nothing needs to be built locally and only the Sonargraph analysis is executed for each version.

Coupling in hibernate-core is very high, with very large cycle groups and an ACD (Average Component Dependency) of over 2000… The screenshot above about the “Big Ball of Mud” dependency structure shows only a part of hibernate-core and it is not possible to extract a higher-level abstraction from the dependencies in the code.
The next screenshot shows some data about coupling metrics of the hibernate-core project.

**Metric Values and Trends of Hibernate-Core**
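
For readers unfamiliar with the ACD mentioned above: as I understand the usual definition going back to John Lakos, each component is assigned the number of components it depends on directly or indirectly, counting itself, and ACD is the average of these numbers over all components. The sketch below computes this on a simple dependency map. The class name and input format are assumptions (every component is expected to appear as a key), and Sonargraph’s exact computation may differ in details.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch of the Average Component Dependency metric. */
final class AverageComponentDependency {

    /** Number of components reachable from 'start', including 'start' itself. */
    static int dependsUpon(String start, Map<String, Set<String>> deps) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        seen.add(start);
        todo.push(start);
        while (!todo.isEmpty()) {
            for (String target : deps.getOrDefault(todo.pop(), Set.of())) {
                if (seen.add(target)) {
                    todo.push(target);
                }
            }
        }
        return seen.size();
    }

    /** Sum of all 'depends upon' values divided by the number of components. */
    static double acd(Map<String, Set<String>> deps) {
        if (deps.isEmpty()) {
            return 0.0;
        }
        long cumulative = 0;
        for (String component : deps.keySet()) {
            cumulative += dependsUpon(component, deps);
        }
        return (double) cumulative / deps.size();
    }
}
```

In a chain a → b → c the values are 3, 2 and 1, so the ACD is 2. Adding a single dependency from c back to a closes a cycle, every component then depends on all three, and the ACD rises to 3. This is why large cycle groups push the metric up so quickly.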

I repeated the analysis for other Maven artifacts with similar looking results.

My Attempt at an Explanation

I see some early warning signs of software rot in the CWA-Server project. For me, this comes as no surprise, since monitoring dependencies needs special tooling. The team uses SonarQube to monitor quality, but it offers no support for monitoring the architecture or detecting unwanted dependencies. Thus, the project is blind with respect to structural erosion.

For all other projects I have analyzed (e.g. hibernate-core, activemq-all, junit, spotbugs), similar problems are visible with varying degrees of severity.

I conclude that there is still a lot of room for improvement in the software industry. Despite the fact that the problems of bad structure have been known for a long time, only very few projects make the effort to fight them. Since tools like Sonargraph have existed for years, we cannot use the excuse “it’s impossible to know if an added dependency causes an architecture violation”. So what explains this situation?

Open Source versus Commercial Software

You might argue that the projects I have analyzed are open source and it looks different for commercial software. We have seen and analyzed a lot of commercially developed software over the years and I can assure you that there is no difference. Commercial software suffers from software rot equally badly.

Missing Incentives

A lot of software is implemented by contractors or consultants. If the software is so complex that only they can maintain it, well, then their job is pretty safe, isn’t it?

Even if they are professionals and try their best, structural erosion will happen if no proper tooling is used.

I know only very few people who have worked on the same project for years. If the project lead or the developers do not benefit from the good quality and are gone by the end of the year, what’s their motivation to put in extra work for a clean structure?

Priorities

Fixing a bug or implementing a new feature that a customer has been waiting for is always more urgent than cleaning up a messy structure. Software is not something mechanical like a watch or a motorcycle, where customers pay attention to implementation details. You cannot open a lid and see the internals. As long as it works and does not offer too many bad surprises, software customers are happy. But if urgent stuff constantly beats important refactorings, software erosion cannot be stopped.

Hard to Sell

Software itself is not tangible and metrics about cohesion and coupling are pretty abstract. How do you convince your project lead that breaking up a cyclic structure is a brilliant idea? When cycles are small, the problem seems overstated, but once they are huge, it’s no longer easy to fix and you don’t want to touch it with a ten-foot pole.

Plus, popular quality tools like SonarQube don’t offer dependency checks, so people might think “If they don’t offer it, it cannot be that important”. Developers focus on file-local issues and test coverage (which are important, too), but the overall structure is completely ignored. Some might split their code base into hundreds of modules in order to limit the extent of coupling and prevent system-wide cycles. But we have seen projects where files or folders are linked from different modules to circumvent the restrictions of the module/project system.

Others chop their code-base into dozens of microservices, only to discover that it is now even harder to track dependencies and that they ended up in a distributed big ball of mud. As Simon Brown says: “If you can’t build a modular monolith, what makes you think microservices is the answer?”

Boiling Frog Syndrome

Code and dependencies between classes grow incrementally. Usually, you don’t see where and how code gets added, and without detailed inspection (or proper tooling) you cannot tell from the outside whether it’s good or bad code. What looks “okay” today might slowly derail into an unintended structure over the course of weeks and months. Plus, if it is your own code, it might not look like complex or bad code at all. You get used to the heat and don’t notice that you are in boiling water…

Last but not least, we get used to the way we work. If you have always worked in a project with a bad internal structure, you don’t expect it to be different in the next project.

Final Remarks

This article started off with the analysis of a series of versions to check for signs of software rot. For this type of analysis, the new project “Sonargraph-Build-Batch” can be used as a starting point.

I have shown that there are warning signs of software erosion in the rather young CWA-Server project, and that I see severe structural problems in hibernate-core.

I tried to explain the reasons why the topic is unpopular in our industry. If I could not convince you of the importance of keeping the overall structure of your software in good shape, that’s fine. There are a lot of resources available for free on the web or contained in books, where successful and well-known experts explain the topic much better. Here is a list of my personal favorites:

‘Structured Design’ by Edward Yourdon and Larry L. Constantine, Prentice-Hall, 1979
‘Is High Quality Worth the Cost’ by Martin Fowler, https://martinfowler.com/articles/is-quality-worth-cost.html, 2019
‘Sustainable Software Architecture: Analyze and Reduce Technical Debt’ by Dr Carola Lilienthal, dpunkt.verlag, 2020
‘Is Design Dead’ by Martin Fowler, https://martinfowler.com/articles/designDead.html, 2004
‘Simple Made Easy’ by Rich Hickey, http://www.infoq.com/presentations/Simple-Made-Easy, 2011
‘The Value of Values’ by Rich Hickey, http://www.infoq.com/presentations/Value-Values, 2012
‘Domain Driven Design’ by Eric Evans, Addison-Wesley, 2004
‘Large-Scale C++ Software Design’ by John Lakos, Addison-Wesley, 1996
‘The Pragmatic Programmer: From Journeyman to Master’ by Andrew Hunt and David Thomas, Addison-Wesley, 1999
‘Applying UML And Patterns’ by Craig Larman, Prentice Hall, 2000
‘Agile Software Development’ by Robert C. Martin, Prentice Hall, 2003

Rest assured that software rot can be prevented! The “Environment Agency Austria”, one of our long-time customers, has successfully implemented a development process that reduces maintenance costs by 50%, as documented in a case study available on our website.