How I Learned to Write Legacy Code

The following is the text of a talk I gave at Imperial College in February. The images are those used in the talk, and are used here without explicit permission.

Hello. As Antonio says I’m Duncan McGregor. I graduated with a Physics degree from Imperial in 1989, when Windows looked like this.

Windows 2.1

From Imperial I joined the Ministry of Defence Management Training Fast Stream, where I worked in various mainly R&D roles. I left the MoD in 1994, when Windows looked like this.

Windows 3.11

and had a few permanent jobs in software companies before taking the plunge and becoming a contractor in 2000, when Windows looked like this.

Windows 2000.

These days (when I don’t even know what Windows looks like any more) I work mainly in London, largely on the JVM, and in various different application areas, most recently in academic publishing.

When Robert asked me to talk about legacy code and participate in your discussion, I thought I’d dredge up a few war stories and see if we, or at least I, can learn something from my career.

My first real exposure to commercial code was at MTA, a small consultancy I joined after I was ejected from the MoD. The consultancy had just been commissioned to extend a Lisp expert system that planned Royal Marines amphibious landings – essentially which troops and equipment should leave which ships on which boats and helicopters at what time to hit the beach at H-hour, with the most scary marines at the front.

Royal Marines Landing Craft

I bring this project up because (with a bit of a bash) it fits the original definition of legacy – essentially “something of value left by old people to young people.” This wasn’t an albatross - it was an asset. Just because I hadn’t written it didn’t mean that it was bad - it did the job, was capable of being extended, and made the company more money.

This contrasts with my next job, at ESL, a company that wrote X.400 email systems. No one remembers X.400 any more – it was email designed by international standards bodies rather than Unix hackers, and so died the death that most over-engineered solutions do when disintermediated by free software. The codebase, and the people who knew how to tend it, were the company’s only asset – and when I joined they had just been bought out by an American firm who had won a contract with the US Department of Defence and needed code to fulfil it.

Unfortunately this codebase was much more like our usual software definition of legacy – old and hard to work with. It was, I think, originally quite well engineered – there were sensible conventions, and reasonable encapsulation for a C codebase. But this was lost under the accretion of years of new features and misunderstandings. I remember that, years after leaving the job, I finally realised that the memory management scheme – where heaps were allocated by upstream code to allow downstream functions to return pointers – was a clever way of taking advantage of the pipeline nature of message decoding to make up for lack of garbage collection. But if anyone at the company understood that by the time I arrived, they certainly didn’t explain it. Instead we had heuristics for when heaps had to be allocated and destroyed that seemed arbitrary because the original model had been forgotten.

The result of this and other code issues was painfully slow progress and eventual bankruptcy, so I took a job at Roke Manor Research, then a part of Siemens. Work there was on nice new R&D projects with rubbish logos,


so I learned little about legacy code - until I discovered Extreme Programming.

This was 1999, and it’s easy to forget just how revolutionary Ward Cunningham and Kent Beck‘s ideas were. Up to this point we considered that finding all the requirements for a software system before starting work was a worthy goal. That way the system could be designed to meet those requirements – and the team could implement the design with least waste.

At least, that’s true for the first version, but after that, you have the system as it is and have to work out how to hack Version 2’s features into the existing system. I can’t find the quote, but Kent said something like – “I read that 80% of all effort on software is maintenance – and wondered how we could get that up to 95%.”

I won’t claim that I really understood this at the time, but it is crucial. Your first release is not only building the software – it’s building the team that builds the software. If when it delivers Version 1, that team has only learnt how to implement features given a clean slate, things are going to go very wrong when faced with a list of features and an existing codebase.

This was a lesson that was evidently not learned by Sony - my first client when I started contracting. It gave the prototype Set Top Box application my team had developed to another team to ‘productionise.’ But it was a validation of XP when it comes to code quality. The team took the code and, having studied it, came back and asked, “How did you guys develop this – because we want to be doing that!” That’s an existence proof for a way of writing legacy code that people want to work in – Test Driven Development and You Ain’t Gonna Need It.

Around this time I took a commission to write an interface between the Tradacoms EDI system and a soft-furnishing manufacturer’s accounting package. I didn’t know it at the time, but over 10 years this system would process hundreds of millions of pounds worth of orders and invoices. (If I had have known, I would have charged considerably more.) I wrote the software in Python, because I wanted to learn the language. This is not the best reason for choosing a technology for a system that you’re going to end up supporting for over a decade.

The software was upgraded perhaps 8 times over its lifetime. Every time I returned to the code it was after enough of a break to have to work out how it functioned just from the code and documentation – it was legacy code every time. Python isn’t ideal for this – the dynamic typing makes reasoning with the code harder than in a statically typed language. Nevertheless, in the dozen years that it was running data was never data lost because of a coding error, in fact I don’t think I ever had a bug report that wasn’t actually a feature request. How? Well I knew that this was critical code and wrote tests accordingly, both at the unit and acceptance level, and more regression tests to make sure that upgraded code gave the same output as its predecessor on old messages.

Returning to a piece of code after a break is also an ideal time to add the comments and tests that would have helped you understand the code quicker this time had they been there. I really do think that this codebase got better with age – it came to express the problem domain so that new features were often just a matter of configuration or wiring up existing code in a new way. This is of course legacy code nirvana – and I’ve not been lucky enough to work on another system that aged so well.

That may be because it’s so much easier to work on your own code – even when the old you wasn’t half as good as the current you is. In late 2002 I started another XP project in Sony – a little video editing product that a pair of us released in 9 months and went on to extend. I then left the project – at that point the code wasn’t perfect, but it was really fit for purpose, with a good architecture and test coverage.

I would return to work on the same codebase, now being extended by a team of 6 plus QA, a little over 2 years later. Actually it was in remarkably good shape – with much of the original architecture intact, proving that an evolutionary design can support change. Crucial was that my original partner on the project was still in charge, so that our discovery of solutions and ways of thinking about the domain had not been lost. He also kept the culture of tests, if not always TDD, so that new code was forced by this constraint into being at least more than usually habitable.

This theme of tests driving code quality resurfaced in 2012 when I went to work on a major satellite broadcaster’s set top box software. When I arrived this had recently had its first release, albeit 4 years late. Some of that time had been spent retrofitting tests to the codebase.

Now I suppose if you agree with Michael Feathers - that legacy code is code without tests - it should logically follow that you can turn your decaying stately home into a plush modern mansion just by adding the tests that weren’t there.

It won’t surprise you to know that this is not true. Test Driven Development works because it makes writing software with sensible seams, logical interfaces and defined dependencies the path of least resistance. But if you’re tasked with adding tests to existing code - it’s probably easier not to refactor - just throw mocks at the problem, and if they aren’t powerful enough, then surely powermocks will be. The resulting tests will then prevent you from changing the behaviour of your code accidentally, or, unfortunately, deliberately.

Moving onto another contract at a major fashion retailer reinforced the lesson that tests done badly can prevent any change. Here things were better factored but hobbled by a test suite that tied the tests to the Spring dependency injection framework. The result was individual tests that could take seconds to start, tests that required chasing through the filesystem and class hierarchy to find the data that they relied on, and tests that depended on other tests having run first. And don’t get me started on the tests that started failing on 1 Jan 2015 – my second week, because that was as far into the future as far as anyone could imagine this codebase lasting.

What went wrong here? I suppose that in the end there were just too many people working on the code for too long, in a culture of ‘good enough.’ But the truth is that most developers are not good enough to prevent poorly architected systems from eventually collapsing under their own weight, most architects are not good enough to design systems that can cope with average developers, and most clients won’t pay for the overhead in any case.

But I don’t want to end on a negative note! I’ve just come from a meeting planning work on a multi-man-decade academic peer-review system for a client that does understand the importance of these things (and will be hiring in the summer!) In the end, what we do is engineering – it requires knowledge, experience, creativity and taste – and these are the very things that can keep our legacy code a habitable place to work.

[ If you liked this, you could share it on Twitter. ]