Saturday, March 26, 2011

On "Working Effectively with Legacy Code"

In the last few weeks I have spent most of my time working on a legacy system which is going to be partly replaced by a new one which greatly broadens its scope. But what exactly is legacy code? Michael Feathers says that
To me, legacy code is simply code without tests.

It is as simple as that:
Code without tests is bad code. (...) With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don't know if our code is getting better or worse.

Alas, too often the answer is not the one we'd like to hear.

We started out so well

Almost every system, if you have at least decent programmers working on it, starts out as a wonderful green field: defined architecture, code conventions, solid principles, patterns and practices.

After a while the gardeners get busy with other gardens, and a couple of weeds start to poke out. Then some more, then more. And more. Meanwhile seasons pass and rain falls, so our green field gets muddier and muddier. Actually it looks more like it has rained cats and dogs, day after day after day. Mud becomes quicksand, and nobody dares to even get close to it.

The (in)famous spaghetti incident

Too soon the project reaches the point in which whenever you pull a string you have an undesired and often undetected change in a completely different part of the system. When I want to sound important I refer to this as the butterfly effect, a metaphore used in chaos theory. Yes, I know that your projects are different and you have never seen it happen, so try to use your imagination and stay with me a little more, will you?

That was exactly where I was, all tangled up in obscure dependencies: the perfect spaghetti code.


As I tweeted, I love spaghetti when they are in my dish, not in my code, so I had to do something (even because deadlines are always waiting in ambush). To sharpen my tools I retrieved my copy of "Working Effectively with Legacy Code". The book covers many aspects of what we developers have to face daily: instead of the "write once, run anywhere" mantra, we have to deal with "write once, read it at least ten times", but too often our code is obscure. Yes, our code. And we don't have tests, or at least we don't have enough. Do you have a complete coverage? If you do, you have all my respect, otherwise welcome in the family.

How do we get out of this?

First of all, we have to get our code into a test harness, but it is not as easy as it sounds. The book contains a series of 24 different dependency breaking techniques, each of which presented in several almost-real situations, to minimize the impacts of changes and ensure that we are not breaking anything. It sounds like the classic catch-22 thing:

When we change code, we should have tests in place. To put tests in place, we often have to change code.


This is where the techniques come in handy. I was pleasantly surprised when I discovered that, after some years of TDD, almost every problem and solution described in the book sounded familiar, so I skimmed it more than actually reading it. I still remember when I first read it: man it was hard, not the writer's fault, but because of the reader.
One of the most important things is that you have to preserve the behavior of your system when you refactor it (and preserve the rest of the system when you introduce new features or change existing ones), so you should at least know what it does. What is better than some characterisation tests?

Cleaning it up

I love NetBeans. I always have, since it was Forte4Java. Now I love it even more because it has a heap of automated refactorings that make "Refactoring" needless (ehm... that's not true, NetBeans deals with the mechanics, but you should know what you're doing, so stop reading and go buy your copy if you still don't have it. Done? OK, let's get on).

Do you need to create a new test class? nothing simpler, just a simple SHIFT + CTRL + U and a template with all the skeleton methods is ready for you. Just punch in the starting conditions and the expected results and you're done. Feathers describes all refactorings steps by steps, but a few keystrokes are all NetBeans needs to extract superclasses, methods, interfaces, pull up or push down members and methods, delegate, and so on and so forth. Nevertheless, you should know what you're doing.

While working your code could get uglier in some places. It could be temporary, or temporary in the Italian way (which means definitive). Even in the latter case, at least you would have tests in place so that you know you're not breaking anything.

In several places Feathers suggests to reduce incapsulation to put code into a test harness. Somebody might find it strange, extremists might find it insane. Yet...

Encapsulation isn't an end in itself: it is a tool for understanding.


Remember when I wrote that you write code once and read the very same code over and over? What 's the point of an encapsulated design if I spend hours for the most trivial task? Mnd you, I'm not suggesting every field must be public and collections should be directly exposed, but nobody is going die if a private method becomes protected for the sake of testing.

Where have my spaghetti gone?

After a while, one after the other, your pull spaghetti out of your dish, slowly replacing them with neat and tidy sushi.

Before you realize it, you will look at your codebase with different eyes, and chances are that you will enjoy dealing with legacy code. Don't enjoy too much though, even if you have a lot of technical debt to pay: you don't have to pay it for the sake of doing it, but because it allows you to move faster when you need it (and you know you will need it).

Let me stress this again: if I didn't have some hundreds of tests already in place, these weeks would have been months. Baby steps, nip and tuck, red and green bar. But I only introduced tests and refactored legacy code when I needed it, otherwise I would spend too much time on activities that were not a priority. See something rotten? pen it down, fix what you're doing and if you still have time you can deal with it. Look for decisions that could change, but change your code just in time (the YAGNI rule rules).

1 comment:

Wojciech Roszkowski said...

The transparency of the code is very important if we write an application for many users. That why I am very happy to use the solutions provided from https://grapeup.com/services because I like to use applications that are stable in the cloud. After all, it is a very convenient solution, especially for business.