Why Red-to-Green projects just add more Red

A Red-to-Green project is one that aims to take an old, shoddy system (“red”) and, bit by bit, turn it “green” – in practice, by replacing each old component in turn by a newer piece of code.

The logic behind this is that the system stays operational whilst developers add newer modules and services, with appropriate shims and interfaces so that the remaining legacy code thinks it’s still talking to its old buddies.

An example: you have an ancient web application which consists of a SQL database, a set of COM+ objects in VB6, and a Classic ASP front-end. Using RTG methodology, you could write a new version of each COM+ object using, say, C#, which exposed the same COM-visible methods as the old object. You could install the new object in COM+, and the website should, in theory, continue to work as normal – at the same time, you could code newer parts of the website in ASP.NET MVC and use the DLL directly (or maybe within a service). Alternatively, you could go ahead and rewrite some particularly grotty ASP pages in MVC, and create an interop library which talks to the COM+ objects but exposes a more pleasant API for ASP.NET to use.

Neat. What’s the problem?

It’s too good to be true.

Red-to-Green, like many bad ideas, sounds fantastic to management:

What’s that Jenkins? You say you can rebuild our software whilst still keeping the application running and delivering new features? You say that this new ‘green’ stuff will make development faster? By Jove, when can you start?!

This is bullshit. This is what actually happens:

  • Developers begin coding new modules in <modern language of choice>. Usually, the choice of modules will revolve around business needs – i.e. Customer X wants feature Y, so can we use this Red-to-Green stuff to deliver Y faster?
  • An amount of time is taken up with the basic plumbing – creating base classes, defining an interop layer (.NET to COM+, Ruby to PHP, views over an old SQL schema etc.) – but things will really start to move once this is done!
  • The developers start implementing feature Y. Instantly, their new architecture comes into conflict with the old – no-one told them (or remembered) that this batch VBA script needs to run in the background to copy files, or that the format of filenames is actually really important for this module
  • The new, green components now consist mostly of code which replicates, or compensates for, the eccentricities of the legacy code base. If not, the new code is now wholly reliant on calling methods in the old codebase, and is now tightly coupled with the legacy code.
  • Feature Y is delivered – probably late, but that’s OK, because the RTG pattern is established now, right?
  • Feature Z, which involves modifying a small part of the legacy system, needs to get information from Feature Y. The developers write an interface so the legacy codebase can talk to the new one. The green components are now coupled to the red, and vice versa.
  • Management begins to lose interest – there are paying customers out there, and Red-to-Green is taking too long and causing too much confusion. Who cares that the email formatting engine is a tangled mess of VB6? It works, so leave it alone! We’ll allocate some more time in the new year.

At this point, the green components are now just a different flavour of legacy code: they go to join the swamp of rotting code, more meatballs in the Great Spaghetti Monster.

When to Inject a Dependency

Let’s take a simple scenario: say that you’re developing a holiday request system (because the previous MS Access-cum-Excel-cum-Word jalopy that the boss wrote ten years ago has finally corrupted beyond the point of no return). You’re given the basic parameters, and then, if you’re like me, you start pseudo-coding:

class HolidayManager
{
  void RequestHoliday(Employee emp, DateTime start, DateTime end)
  {
    // TODO: sensible error messages and basic validation stuff
    if (start < DateTime.Now) throw new ArgumentOutOfRangeException();
    if (end < start) throw new ArgumentOutOfRangeException();

    // does the user have any holiday available?
    if (database.GetHolidaysRemaining(emp) == 0)
      throw new OutOfHolidayException("TODO: should we use a return code?");

    // assume that this call returns us a unique ID for this holiday
    var holidayId = database.AddHoliday(emp, start, end, HolidayStatus.Requested);

    // tell the user
    emailer.SendConfirmation(emp, start, end);

    // tell their line manager
    var manager = database.GetLineManager(emp);
    if (manager != null) // they might be the Big Boss
      emailer.SendRequestForApprovalEmail(emp, start, finish, holidayId);
    else
      this.ApproveHoliday(holidayId, HolidayStatus.Approved);
  }

  void ApproveHoliday(int id, HolidayStatus status)
  {
    // assume this struct has info about the holiday
    var holiday = database.GetHoliday(id);
    if (holiday.Status == HolidayStatus.Requested)
    {
        database.SetApproval(id, status);
        int daysLeft = database.CalculateRemainingDays(holiday.Employee);

        // send email to the requester (assume that holiday includes an email address)
        emailer.SendApprovalEmail(holiday, status, daysLeft);
    }
    else
      throw new InvalidOperationException("already approved");
  }
}

What we have here is a basic sketch of our business logic – we don’t care about how the database works, and we don’t care how emails go out, because that’s not important to us at this stage.

If, whilst pseudo-coding, you find yourself writing GetStuffFromDatabase or somesortofqueue.Push(msg), then congratulations – you’ve identified your dependencies!

Whenever you find yourself writing a quick note to cover up what you know will be a complicated process (retrieving data from a store, passing messages to a queue, interacting with a third party service, etc.), then this is an excellent candidate for a dependency.

Looking at our example, we’ve got two dependencies already:

  • database
  • emailer

Now, my point here is not to design the best holiday application imaginable – the code above is just a sketch. My point is that we’ve now identified some application boundaries, namely the boundary between business logic and services.

Let’s take this a little further: in our emailer implementation, we might have further dependencies. Say that our emailer implementation needs to format an email (given data and a template) and then send it. We might switch the formatter at some point in the future (using RazorEngine, NVelocity or just string.Format) and we might change the delivery mechanism (because the business now has a central email queue, and we need to submit messages to that instead of using SMTP directly). In this case, the business logic of the email engine is:

class HolidayEmailer {
  IEmailFormatter formatter;
  IEmailSender sender;
  IEmailLogger logger;

  void SendEmail(someObject email)
  {
    var message = new System.Net.MailMessage();
    message.To.Add(email.To);
    message.Subject = email.Subject;
    message.Body = formatter.Format(email);

    sender.Send(message);
    logger.Log(message);
  }
}

What this means is that we have many levels of DI – from the point of view of our holiday app, it has an abstraction that it can call to send emails, and we can unit test our object against a mock. We canĀ also test our emailer implementation, because we’ve abstracted the formatting, sending and logging, which means that we can test what happens if any of these fail, or produce weird results.