Decoupling a legacy code base is hard.
Decoupling legacy code bases is not only hard; often we don’t even have a clear idea of the couplings that currently exist in our system. Without a clear overview of the current state we can’t make sound decisions about what to improve. The result is subjective refactorings that might not yield any value beyond some personal measure of “less ugly”.
I will deliberately leave out a generic discussion of the different types of coupling and cohesion in code and restrict myself to large assemblies with low cohesion, which typically come from splitting up monolithic applications such as ours. They tend to have large dependencies, many of which you do not need in your application, and each dependency brings its own set of transitive dependencies. As legacy code is already hard enough to reason about, the last thing we want is to carry around more code than we need. Large shared assemblies with low cohesion also cause small changes to require disproportionately large parts of the system to be rebuilt, redeployed and re-tested.
By using static analysis we can make a map of the current couplings in our system, plan refactorings based on that map, and verify that we have achieved the goals we set by re-running the analysis. This can make refactoring easier and cheaper, and yield code bases that can be proven to be more modular.
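To make the map, plan, verify loop concrete, here is a minimal sketch in Python. It is not how NDepend works internally; it only assumes that an analysis run can be reduced to a mapping from each assembly to the assemblies it references. All assembly names are hypothetical.

```python
# Hypothetical output of two analysis runs: assembly -> referenced assemblies.
map_before = {
    "App.Frontend": ["App.Common", "App.Backend.Models"],
    "App.Backend.Models": ["App.Backend.Data"],
    "App.Backend.Data": ["App.Common"],
    "App.Common": [],
}
map_after = {
    "App.Frontend": ["App.Common"],
    "App.Backend.Models": ["App.Backend.Data"],
    "App.Backend.Data": ["App.Common"],
    "App.Common": [],
}


def edges(graph):
    """Flatten the map into a set of (source, target) references."""
    return {(src, dst) for src, targets in graph.items() for dst in targets}


def unfinished(planned_removals, graph):
    """Return the planned removals that are still present in the given map."""
    return planned_removals & edges(graph)


# The plan: the references we decided to cut after studying the map.
plan = {("App.Frontend", "App.Backend.Models")}

todo_before = unfinished(plan, map_before)  # still present before the refactoring
todo_after = unfinished(plan, map_after)    # empty once the goal is reached
```

The point of the design is that the goal is stated as data (a set of references), so "are we done?" becomes a check on the re-run analysis rather than a matter of opinion.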
Refactoring strategies and maps
There are many strategies for obtaining a more maintainable code base, all described in the literature: use a domain-driven approach, reduce unwanted coupling, refactor to meet some code metric, or refactor according to the SOLID principles.
I am a big fan of the strategies that lend themselves to drawing maps because they put me in the role of a general. Maps visualize the current situation, and they provide an opportunity to make high-level design decisions and prioritize where to focus efforts.
The life of a general is a nice break from daily life in the trenches. In the trenches we battle legacy code line by line, class by class, in hand-to-hand combat, armed with only a keyboard and our cunning friend ReSharper, always there to suggest that any problem can be solved by hitting ALT+ENTER and adding a reference. In the trenches it’s developer against code in a messy, brutal fight that leaves both sides bleeding with infected wounds in a muddy field. A general, on the other hand, can enjoy a hot beverage, miles away from any messy action on the ground.
“Static program analysis is the analysis of computer software that is performed without actually executing programs”.
We use NDepend to do static analysis. It plugs easily into Visual Studio and TeamCity and provides many types of analysis with interesting code metrics. In this post I’ll stick to the analysis of dependencies. I’ll also restrict it to the dependencies between assemblies, and leave namespaces as modules out of it for now.
NDepend will automatically provide an analysis and make diagrams of your code simply by scanning all assemblies in your build folder. The only thing I’ve done to make these diagrams is to filter the assemblies on “Nrk” and select “Include application assemblies only”, leaving frameworks and external libraries out of the diagrams.
Splitting frontend and backend
We have a solution where the former monolithic tv.nrk.no has been split into a front-end (tv.nrk.no) and a back-end (psapi.nrk.no). The front-end does not access databases directly anymore, but accesses all data through psapi.nrk.no over HTTP, just as if it was a smart-TV or mobile client. This is good!
In the team we talk about the front-end and the backend as separated, but the dependency graph showed that five assemblies in the front-end project either contained “backend” in the name or were referenced transitively by backend assemblies. We wanted the backend to be separated, but in fact the operation to separate the Siamese twins was incomplete. Couplings were still in place.
In contrast to our view of the system, which is driven by illusions and feelings, static analysis doesn’t care about our hopes and plans for the refactoring. It mercilessly maps out all dependencies between assemblies, as well as the internal dependencies between namespaces in those assemblies.
The strategy is pretty clear from the map: the red cross (added by the Strategic Command) marks the dependency we want to get rid of. Getting rid of Backend.WebAPI.Models will also get rid of four transitive dependencies.
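The effect of cutting a single reference can be worked out from the map itself. The following is a rough sketch, assuming the map is available as a plain dictionary; apart from Backend.WebAPI.Models, every assembly name in it is made up for illustration and the graph is not our real one.

```python
def reachable(graph, root):
    """Return every assembly that root depends on, directly or transitively."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for dep in graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def dropped_by_removing(graph, root, reference):
    """Return the assemblies root no longer depends on once reference is cut."""
    source, target = reference
    pruned = {
        name: [d for d in deps if not (name == source and d == target)]
        for name, deps in graph.items()
    }
    return reachable(graph, root) - reachable(pruned, root)


# Hypothetical map: Frontend, Common and Dep1..Dep4 are placeholder names.
graph = {
    "Frontend": ["Common", "Backend.WebAPI.Models"],
    "Backend.WebAPI.Models": ["Dep1", "Dep2"],
    "Dep1": ["Dep3"],
    "Dep2": ["Dep4", "Common"],
    "Dep3": [],
    "Dep4": [],
    "Common": [],
}

dropped = dropped_by_removing(
    graph, "Frontend", ("Frontend", "Backend.WebAPI.Models")
)
```

Note that Common is not in the result: it is still referenced directly by the front-end, so cutting the red-cross reference does not remove it. Only assemblies that were reachable solely through the removed reference disappear.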
This map serves as a basis for implementing our strategy of reducing coupling. This is our high-level plan for the first refactoring campaign. NDepend will help us further by listing the exact couplings that must be removed in order to remove the reference.
For a more concrete view, we can list the connections directly. This is well suited for components that aren’t too tightly coupled. This is also the point at which you realize being a general is easy, but being in the trenches and cutting through enemy lines is painful and will leave you scarred.
Still, this todo-list of things to remove is much more comfortable to work with than being all alone in enemy territory with no plan and no map. We don’t have to manually read and analyze the entire code-base, we can forget the bigger picture and simply focus on each objective one by one, knowing that when we have completed our list we have made a consistent set of refactorings.
Before we start, this is the time for team discussions on how best to remove the coupling. It could be done by moving code between assemblies, duplicating code, not using helper methods from other assemblies, deprecating functionality, splitting assemblies and so on, but it requires careful analysis to figure out what to do on a case-by-case basis. Principles such as SOLID can help facilitate these tactical design discussions and help us see the design challenges from multiple perspectives.
After the refactoring the size of the deployment package was reduced from 288 MB to 113 MB. The backend references are gone. Our work still isn’t over, though: the thick arrow to the Common module means that there is still a lot of shared code that should be looked at. However, having cut through the enemy lines, annihilated their supply lines to the backend and isolated them completely at the front, we should relax and celebrate a little before we push forward and refactor the remains.
The next step from here is to add rules that warn us if these dependencies come back. NDepend makes it fairly straightforward to write rules, such as one that breaks the build if a dependency is made on an assembly containing the name “Backend”. I like to think of these rules as tripwires, efficiently stopping unwelcome intruders from attempts at recovering the occupied territory.
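The idea behind such a tripwire is simple enough to sketch outside NDepend. The Python below is not NDepend's rule syntax, just an illustration of the check a build step could run against the analysed references. The function names, the "Frontend" prefix and the assembly names are all assumptions made for the example.

```python
def forbidden_references(graph, fragment="Backend", guarded_prefix="Frontend"):
    """List (source, target) references from guarded assemblies to forbidden ones."""
    return sorted(
        (src, dst)
        for src, deps in graph.items()
        if src.startswith(guarded_prefix)
        for dst in deps
        if fragment in dst
    )


def exit_code(violations):
    """Non-zero exit code breaks the build when the tripwire is triggered."""
    return 1 if violations else 0


# Hypothetical state after the refactoring: no backend references left.
clean = {
    "Frontend.Web": ["Common"],
    "Backend.WebAPI": ["Backend.WebAPI.Models", "Common"],
}

# Hypothetical regression: someone hit ALT+ENTER once too often.
regressed = {
    "Frontend.Web": ["Common", "Backend.WebAPI.Models"],
    "Backend.WebAPI": ["Backend.WebAPI.Models", "Common"],
}

clean_violations = forbidden_references(clean)
regressed_violations = forbidden_references(regressed)

for src, dst in regressed_violations:
    print(f"Tripwire: {src} must not reference {dst}")
```

Backend assemblies referencing each other are deliberately left alone; the rule only guards the side of the border we have already cleared.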
Our experience has been that this strategy works quite well, and we have also applied it to other parts of the system. When we split functionality into smaller services, we make sure the entire monolith is not pulled in by accidental coupling.