2025-12-11

Weekly Rant: Architectural Erosion

This post is part of the “Weekly Rant” series

This week’s rant is slightly less ranty than expected, for which I sincerely apologize in advance. Also I’m publishing on a Thursday to please a specifically impatient reader. Subsequent rants will likely still be on Friday.

I want to share a pattern I’ve noticed over the years. I’d join a team, and they’d already have some “legacy” system (some things are instantly legacy) that wasn’t “modularized” correctly. The boundaries around whatever the unit of re-usability is (modules, packages, whatever) would not be correct (anymore?). This is a form of software rot. I’ve seen people use different names for different minor variations of the phenomenon: architectur(e|al) (drift|erosion). I prefer the latter of those options.

After encountering this, I would recommend to “collapse everything” and refactor back into “modules”. Intuitively, that’s what I felt needed to be done, and indeed it significantly improved the system each time it was done. All the behavior is there, I realized; it’s just split in a way that makes things that should be easy to change hard to change, and vice versa. The aim was to fix coupling and cohesion by breaking down all the walls, remodelling, and putting them back up. Or in more precise terms, I proposed large-scale restructuring (macro-refactoring) of the system’s modular structure, iteratively coalescing (merging or consolidating modules) and re-modularizing (splitting or extracting new modules), with the aim of repairing or improving the orthogonal properties of the system’s modularization without changing its external behavior.

It’s worth stressing that I’m using the term “module” in the most generic sense possible, as “some unit of software at a medium high level of granularity: smaller than a subsystem, generally larger than a file containing multiple data types, or a namespace”. Sometimes it’s called a “module”, often a “package”. I think “module” makes much more sense but many language creators historically have not agreed, judging by the state of things.

Anyway, in formalizing my thoughts on the process, I (much later) discovered it to be known in academic circlejerks – euh circles – as architectural remodularization. Actual research has been done in this area, with a double-digit number of papers published on the topic. Most of it seems to revolve around establishing mechanisms for methodical or even automated approaches to the process (recent example), using code metrics, heuristics and analysis to suggest specific changes to the modularization. While theoretically interesting, practical applicability is currently low. I’m proposing nothing quite as fanciful.

If your starting point is well-structured enough (i.e., the boundaries are incorrect but the code somewhat consistently adheres to a well-defined architecture), you can perhaps do some simple scripting that automates the process for at least this specific starting architecture. Whether that is worth it over just doing the work manually, considering it’s a one-off (the same script will not work next time), is a different question. This only works if the source and target architectures are somewhat similar, of course. Or rather, the more you deviate from “just shuffling things around”, the more difficult it will become. If this is what you need long term, I suggest you go live in between some of the iterations.
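
To make the “simple scripting” idea a bit more concrete, here is a minimal sketch of what such a one-off script could look like. It assumes a Python codebase (nothing above implies that, it is just the language I picked for illustration), and the module paths in MOVES, as well as the function name rewrite_imports, are entirely hypothetical. It only rewrites plain `import x.y` and `from x.y import z` lines; anything fancier (comma-separated imports, `from x import y`, dynamic imports) is left alone, and the files themselves would still have to be moved separately.

```python
import re

# Hypothetical move plan: old dotted module path -> new dotted module path.
MOVES = {
    "billing.invoices": "finance.accounting.invoices",
    "billing.cards": "finance.payments.cards",
}


def rewrite_imports(source: str, moves: dict[str, str]) -> str:
    """Rewrite `import x.y` / `from x.y import z` lines according to `moves`."""
    if not moves:
        return source
    # Longest paths first, so a more specific entry wins over its prefix.
    alternatives = "|".join(
        re.escape(old) for old in sorted(moves, key=len, reverse=True)
    )
    pattern = re.compile(
        r"^([ \t]*(?:from|import)[ \t]+)(" + alternatives + r")(?=[\s.,]|$)",
        flags=re.MULTILINE,
    )
    # A single pass over the text, so a freshly written new path is never
    # rewritten again by another entry in the plan.
    return pattern.sub(lambda m: m.group(1) + moves[m.group(2)], source)


if __name__ == "__main__":
    example = "from billing.invoices import Invoice\nimport billing.cards\n"
    print(rewrite_imports(example, MOVES))
```

The move plan is plain data, which is the part that matters: it can be reviewed, tweaked and re-run against a fresh checkout of master instead of resolving merge conflicts by hand.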

Primarily, this involves discipline, persistence and a good deal of time across many iterations. The mental equivalent of “elbow grease”.

Practical Real World Strategy

This involves breaking everything (from a functional perspective) and fixing it again. In that sense it can be considered a high-risk operation. Having a decent number of functional and unit tests is a must before even considering this. Even then, it’s risky and definitely time-consuming. Not something for a Friday afternoon. Merging master quickly becomes a conflict nightmare, which is a good argument for scripting the process.

But if I ignore the overtime I had to put in for some of these cases, this has been a very successful strategy:

Consolidate/coalesce

Move code into the same “module”. Do whatever you need to do to completely break module-level isolation so all code can talk to all other code.

Refactor

Restructure the code, fixing (and/or writing) integration tests as needed. The goal is to remodel the dependencies in such a way that it will be easy to split into modules. Resist micro-refactorings. Really, do not do it. If the change you just made is not immediately required for the higher-level restructuring, revert it. This is harder than it sounds.
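
One way to tell whether the dependencies are “easy to split” yet is to check that the proposed modules would not depend on each other in a circle. Below is a rough sketch of such a check, again in Python and purely as an illustration. The function names, the unit names (Invoice, Ledger, and so on) and the hand-written dependency data are all made up; in practice you would have to extract the unit-level dependencies from your own codebase with whatever tooling you have.

```python
from graphlib import CycleError, TopologicalSorter


def module_dependencies(unit_deps, assignment):
    """Lift unit-level dependencies to the level of the proposed modules."""
    graph = {module: set() for module in assignment.values()}
    for unit, deps in unit_deps.items():
        for dep in deps:
            src, dst = assignment[unit], assignment[dep]
            if src != dst:  # dependencies inside one module are fine
                graph[src].add(dst)
    return graph


def find_cycle(graph):
    """Return one dependency cycle as a list of modules, or None if there is none."""
    try:
        TopologicalSorter(graph).prepare()
    except CycleError as err:
        # graphlib reports the cycle as the second element of the exception args.
        return err.args[1]
    return None


# Hypothetical units and what each of them depends on.
UNIT_DEPS = {
    "Invoice": {"Ledger"},
    "Ledger": set(),
    "CardCharge": {"Invoice"},
    "Refund": {"CardCharge"},
}

# A proposed split: which target module every unit should end up in.
PROPOSED = {
    "Invoice": "accounting",
    "Ledger": "accounting",
    "CardCharge": "payments",
    "Refund": "payments",
}

if __name__ == "__main__":
    cycle = find_cycle(module_dependencies(UNIT_DEPS, PROPOSED))
    print("ready to split" if cycle is None else f"cycle: {cycle}")
```

If Refund were assigned to accounting instead, accounting and payments would depend on each other and the check would report that cycle. That is a signal to keep restructuring (or to rethink the split) before extracting anything.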

Extract/split

Do the actual split and fix all the tests. It’s going to require a good deal of integration tests. The functional tests obviously should not change unless you find a (verifiable) error in them.

Iterate

Depending on the size of the system, you may want to zoom in and out on specific parts of the system and apply the same pattern at different levels of granularity. E.g., hypothetically I might zoom into the billing-service specifically, remodularize it, then zoom out and remodularize the billing-service and invoice-service in the finance subsystem, resulting in an accounting-service and a payments-service because that makes more sense from a domain perspective.

It’s tempting to take it in small bits at a time. Generally that’s the advice for any type of iterative approach. In this case you want to use bits as large as you can get away with. It’s hard to quantify what exactly that means. You want to be able to shuffle things around, but it shouldn’t take too long before the tests pass again. If you feel it is taking too long, just bail and take a step back. Of course, make sure you can always hit that proverbial undo button without having to start from scratch.

Happy remodelling.