You have probably never heard of “Code Cancer”, but the term is an apt description of some key issues that most non-trivial software systems suffer from. Chances are that your own organization is affected by it right now. In this article I will describe what I mean by “Code Cancer” and offer ideas on how to mitigate the problem.
The screenshot above shows a dependency diagram from the well-known open-source project “Gradle”, which is written in Java. Gradle is an advanced tool for building software systems. What you see is a cyclic dependency between 69 different Java files, i.e. by following the links you can reach each of the 69 files from any other and come back a different way. We call this a “cycle group” of 69 elements. The different colors mark the different parent packages of the files. This form of coupling creates some real issues:
- It becomes impossible to re-use any of the 69 classes in the cycle group separately from the rest.
- You can test none of those 69 classes in isolation, which makes testing a lot harder.
- Code comprehension becomes much more difficult, because no single class can be fully understood without the other 68. This is especially bad, since developers already spend most of their time reading code.
- Security vulnerabilities are harder to detect in this code jungle.
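To make the notion of a cycle group concrete: in graph terms it is a strongly connected component with more than one element. The following is a minimal sketch of how such groups could be detected, using Tarjan's algorithm on a dependency map. It is not how any particular tool implements the analysis; the function name `find_cycle_groups` and the sample class names are made up for illustration.

```python
def find_cycle_groups(dependencies):
    """Return every cycle group: a set of two or more elements that can
    all reach each other by following dependencies (Tarjan's algorithm)."""
    # Normalize the input so every element, including pure targets, is a key.
    graph = {}
    for source, targets in dependencies.items():
        graph.setdefault(source, set()).update(targets)
        for target in targets:
            graph.setdefault(target, set())

    index_of = {}      # discovery order of each element
    lowlink = {}       # smallest discovery index reachable from the element
    stack = []         # elements of the components still being explored
    on_stack = set()
    groups = []
    counter = 0

    # Recursive for readability; very deep graphs would need an iterative version.
    def visit(node):
        nonlocal counter
        index_of[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for target in graph[node]:
            if target not in index_of:
                visit(target)
                lowlink[node] = min(lowlink[node], lowlink[target])
            elif target in on_stack:
                lowlink[node] = min(lowlink[node], index_of[target])
        if lowlink[node] == index_of[node]:
            # node is the root of a component: pop its members off the stack.
            group = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                group.add(member)
                if member == node:
                    break
            if len(group) > 1:  # a single element is not a cycle group
                groups.append(group)

    for node in graph:
        if node not in index_of:
            visit(node)
    return groups


# Hypothetical example: Order -> Customer -> Invoice -> Order is a cycle.
sample_dependencies = {
    "Order": ["Customer"],
    "Customer": ["Invoice"],
    "Invoice": ["Order", "Money"],
    "Money": [],
    "Report": ["Order"],
}
cycle_groups = find_cycle_groups(sample_dependencies)
print([sorted(group) for group in cycle_groups])
```

In this sample, `Order`, `Customer` and `Invoice` form one cycle group of three elements, while `Money` and `Report` stay outside of it because nothing in the cycle can be reached back from them.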
Having done a lot of research and assessed many complex systems, I can confirm that once those cycle groups reach a certain size, they only get worse over time, hence the term “code cancer”. The cycle groups can be seen as tumors that grow over time. For example, Apache Cassandra version 1.0 contained a cycle group with 296 elements. By version 4.1 this tumor had grown to almost 1,600 elements. This is what I would call late-stage code cancer. Decoupling a cycle group this large would probably take more time than rewriting the software from scratch.
A CISQ report estimated the cost of poor software quality in the U.S. alone at 2.41 trillion USD for 2022, a whopping 10% of GDP. I suspect that code cancer is a major contributor to this figure.
What we observe here is the structural erosion of the code base, which could also be described as deteriorating architectural cohesion. There are many reasons for this. Most importantly, most organizations do not work with enforceable architectural models. If there is an architecture, it is communicated either verbally or through outdated documents. A mechanism to verify that the code reflects the architecture is usually missing. That means developers are for the most part unaware of the issues caused by undesirable dependencies and only feel the pain once the code tumors have reached critical size. By that time, it is already too late to fix the problem in a cost-effective way.
The best way to address this problem is the use of tools that can detect those cycle groups and allow the definition of enforceable architectural boundaries. Two simple rules that can be enforced automatically will guarantee that your systems never suffer from a severe case of code-cancer:
- Never allow cycle groups with more than 5 elements. This rule applies to source files, but also to larger elements like packages or namespaces. For dependencies between packages or namespaces it is best to avoid cyclic dependencies altogether. At the source file level certain design patterns are prone to cycles, but as long as the cycles stay small this is not a big issue.
- If you want to go the extra mile towards excellence, you also need to define an architectural model that can be enforced with tool support.
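As a rough illustration of what automatically enforcing these two rules could look like, here is a minimal sketch. It assumes the cycle groups have already been computed elsewhere, and it models the architecture as a simple allow-list of layer dependencies. The function names, layer names and class names are hypothetical, and this is not the Sonargraph DSL or API.

```python
def oversized_cycle_groups(cycle_groups, max_size=5):
    """Rule 1: report every cycle group with more than max_size elements."""
    return [group for group in cycle_groups if len(group) > max_size]


def architecture_violations(dependencies, layer_of, allowed):
    """Rule 2: report every dependency that crosses layers in a direction
    the architectural model does not allow."""
    violations = []
    for source, targets in dependencies.items():
        for target in targets:
            source_layer = layer_of[source]
            target_layer = layer_of[target]
            if source_layer == target_layer:
                continue  # dependencies inside one layer are not checked here
            if target_layer not in allowed.get(source_layer, set()):
                violations.append((source, target))
    return sorted(violations)


# Hypothetical three-layer model: ui -> business -> persistence.
layer_of = {
    "OrderController": "ui",
    "OrderService": "business",
    "OrderRepository": "persistence",
}
allowed = {"ui": {"business"}, "business": {"persistence"}}
dependencies = {
    "OrderController": ["OrderService"],
    "OrderService": ["OrderRepository"],
    "OrderRepository": ["OrderController"],  # persistence must not use ui
}

violations = architecture_violations(dependencies, layer_of, allowed)
too_large = oversized_cycle_groups([{"A", "B", "C"}, set("UVWXYZ")])
print(violations)
print(len(too_large))
```

A CI job built on this idea would fail the build whenever either list is non-empty, so that an undesirable dependency is rejected when it is introduced rather than discovered years later.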
Both of those rules can easily be enforced in the CI build using our Sonargraph tool family. For the architectural model we designed a domain-specific language that could be described as UML component diagrams in text form. Sonargraph is used by hundreds of medium to large businesses, mainly in Europe, but also in the U.S. and Asia. Many of our customers have been using it for more than 10 years and have achieved significant improvements in developer productivity and overall code quality. We created a YouTube video that explains the philosophy behind the tool.
If you are interested in having your own software checked for code-cancer, we offer that as a free service. Please contact us at info at hello2morrow dot com or book a virtual intro meeting.