Thinking about microservices? Start with your database

Hang on, the database? Ahem, I think you’ll find that service contexts should be vertical, not horizontal, that is, based on function, not technology. So, nyah-nyah.

Let me explain.

This post isn’t for you if you happen to be Netflix. Nor is it for you if you’re writing a greenfield application with microservices in mind. No, this post is for the team maintaining a chunk of EnterpriseWare, something relatively dull yet useful and (presumably) profitable. This team has been toying with the idea of microservices, and may well be drawing up elaborate Visio diagrams to define the new architecture.

So, what does this have to do with databases?

A database is a service. In Windows land, it literally is a Windows service (sqlservr.exe, in point of fact), but in more general terms, it provides a service for managing an application’s data.

A database with a relatively simple schema, where the application layer owns most of the SQL and most operations are simple CRUD, is more like a RESTful API. A database with a more complicated schema (parent/child hierarchies, atomic transactions spanning multiple tables) that provides complex stored procedures for the application(s) to call is more like a traditional RPC service. Either way, these are services with a well-known API that the application consumes.

Behold, the monolith!

Your basic monolithic codebase will hopefully include the source code for both the database and the application.

If every release requires your DBA to manually run ad-hoc SQL scripts, then you have far bigger problems to address than microservices. Version your migrations and set up continuous deployment before you start looking to change architecture!
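
What does “version your migrations” actually look like? Most teams lean on a tool such as Flyway, DbUp, Liquibase or EF Core migrations, and at their core these all do roughly the same thing: keep a journal table and run each script exactly once. Purely as an illustrative sketch (the table and script names below are invented), the idea boils down to something like this:

```sql
-- Minimal sketch of migration versioning; a real tool maintains this journal for you.
IF OBJECT_ID(N'dbo.SchemaVersions', N'U') IS NULL
    CREATE TABLE dbo.SchemaVersions
    (
        ScriptName nvarchar(255) NOT NULL PRIMARY KEY,
        AppliedUtc datetime2     NOT NULL DEFAULT SYSUTCDATETIME()
    );
GO

-- Each migration script checks the journal, applies its change once, then records itself.
IF NOT EXISTS (SELECT 1 FROM dbo.SchemaVersions WHERE ScriptName = N'0042_AddAuditLog.sql')
BEGIN
    CREATE TABLE dbo.AuditLog
    (
        AuditLogId  bigint        IDENTITY(1,1) PRIMARY KEY,
        OccurredUtc datetime2     NOT NULL DEFAULT SYSUTCDATETIME(),
        Message     nvarchar(max) NOT NULL
    );

    INSERT INTO dbo.SchemaVersions (ScriptName) VALUES (N'0042_AddAuditLog.sql');
END
```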

Typically, a new version of the product will involve both database changes and application changes: adding a new field to a business entity, for example, may touch the table schema, stored procedure(s), the application tier and the UI layer. Without the schema changes, the app will not function. Without the application changes, the schema changes are at best useless, and at worst broken.

Therefore, deployments involve taking the application down, migrating the database (usually after taking a backup), and then deploying the application code.

&nonbreakingchanges;

Microservices imply an application that composes multiple, independent services into a cohesive whole, with the emphasis being on “independent”. For a team with no real experience of this, a useful place to start is the database. Not only will it teach you valuable lessons about managing multiple parts of your application separately, but even if you decide not to go down the microservice rabbit-hole, the exercise will still have paid for itself.

So, what exactly are we talking about? In practical terms:

  • Create a new repository for your database source code.
  • Create a separate continuous integration/deployment pipeline for your database.
  • Deploy your database on its own schedule, independent of the application.
  • Ensure that your database is always backwards-compatible with all live versions of your application.

Now, this last part is the hardest to do, and there’s no silver bullet or magic technique that will do it for you, but it is crucial that you get it right.

Whenever you make a schema change, whether to a table, a stored procedure or anything else, that change must not break any existing code. This may mean:

  • Using default values for new columns
  • Using triggers or stored procedures to fake legacy columns that are still in use (see the sketch after this list)
  • Maintaining shadow, legacy copies of data in extreme circumstances
  • Creating _V2, _V3 etc. stored procedures where the parameters or behaviour changes
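
To make the second of those concrete, here’s a hedged sketch (table and column names invented). Imagine an earlier migration split a FullName column into FirstName and LastName, but older application versions still SELECT FullName. Re-adding FullName as a computed column keeps those reads working; a trigger-maintained copy or a view achieves the same thing in messier cases.

```sql
-- Hypothetical: FullName was split into FirstName/LastName, but live code still reads it.
-- A computed column fakes the legacy column until the last of the old readers is gone.
ALTER TABLE dbo.Customer
    ADD FullName AS (CONCAT(FirstName, N' ', LastName));
```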

The exact techniques used will depend on the change, and must be considered each time the schema changes. After a while, this becomes a habit, and ideally more and more business logic moves into the application layer (whilst storage and consistency logic remains the purview of the database).

Let’s take the example of adding a new column. In this new world, we simply add a database migration to add the new column, and either ensure that it’s nullable, or that it has a sensible default value. We then deploy this change. The application can then take advantage of the column.
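
In SQL Server terms, that migration is a one-liner either way (names invented for illustration):

```sql
-- Nullable: old INSERTs that don't know about the column keep working.
ALTER TABLE dbo.[Order] ADD Notes nvarchar(max) NULL;

-- Or NOT NULL with a sensible default: existing rows and old INSERTs get the default value.
ALTER TABLE dbo.[Order]
    ADD IsPriority bit NOT NULL
        CONSTRAINT DF_Order_IsPriority DEFAULT (0);
```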

Let’s take stored procedures. If we’re adding new optional parameters, then we can just change the procedure, since this is a safe change. If, however, we are adding new required parameters, or removing existing ones, we would create MyProcedure_V2, and deploy it.
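
As a sketch of what that looks like (the parameters and body here are invented), the original dbo.MyProcedure is left completely alone and a sibling is created next to it:

```sql
-- dbo.MyProcedure is untouched, so every existing caller keeps working.
-- Hypothetical V2 adds a required parameter that would have broken those callers.
CREATE PROCEDURE dbo.MyProcedure_V2
    @CustomerId int,
    @Region     nvarchar(10)   -- new, required
AS
BEGIN
    SET NOCOUNT ON;

    SELECT CustomerId, Name, Region
    FROM   dbo.Customer
    WHERE  CustomerId = @CustomerId
      AND  Region     = @Region;
END
```

A common refinement is to rewrite the original procedure as a thin wrapper that calls _V2 with a default value for the new parameter, so there’s only one real implementation to maintain.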

Let’s say we want to remove a column – how do we do this? The simple answer is that we don’t, until we’re sure that no code is relying on it. We instead mark it as obsolete wherever we can, and gradually trim off all references until we can safely delete it.
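
SQL gives us nothing as convenient as a compiler-enforced [Obsolete] attribute, but you can at least leave a marker in the schema itself. One low-tech option in SQL Server is an extended property (the names below are invented):

```sql
-- Flag the column as deprecated so the intent is visible to anyone browsing the schema.
EXEC sys.sp_addextendedproperty
    @name       = N'Deprecated',
    @value      = N'Do not use; replaced by StatusId. Drop once no live application version reads it.',
    @level0type = N'SCHEMA', @level0name = N'dbo',
    @level1type = N'TABLE',  @level1name = N'Customer',
    @level2type = N'COLUMN', @level2name = N'LegacyStatusCode';
```

Finding the remaining references is still down to you – searching the application code, checking which stored procedures touch the column, and watching usage in production before the eventual DROP COLUMN.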

And this benefits us… how?

The biggest benefit to this approach, besides training for microservices, is that you should now be able to run multiple versions of your application concurrently, which in turn means you can start deploying more often to preview customers and get live feedback rather than waiting for a big-bang, make-or-break production deployment.

It also means that you’re doing less during your deployments, which in turn makes them safer. In any case, application-tier deployments tend to be simpler than database ones, and are definitely easier to roll back.

By deploying database changes separately, the riskier side of change management is at least isolated, and, if all goes well, invisible to end users. By their very nature, safe, non-breaking changes are also usually faster to run, and may even require no downtime.

Apart from all that, though, your team are now performing the sort of change management that will be required for microservices, without breaking up the monolith, and without a huge amount of up-front investment. You’re learning how to manage versions, how to mark APIs as obsolete, how to keep services running to maintain uptime, and how to run a system that isn’t just a single “thing”.

Conclusion

If your team has up until now been writing and maintaining a monolith, you aren’t going to get to serverless/microservice/containerised nirvana overnight. It may not even be worth it.

Rather than investing in a large-scale new architecture, consider testing the water first. If your team can manage to split off your database layer from your main application, and manage its lifecycle according to a separate deployment strategy, handling all versioning and schema issues as they arise, then you may be in a good position to do that for more of your components.

On the other hand, maybe this proves too much of a challenge. Maybe you don’t have enough people to make it work, or your requirements and schema change too often (very, very common for internal line-of-business applications). There’s nothing wrong with that, and it doesn’t imply any sort of failure. It does mean that you will probably struggle to realise any benefit from services, micro or otherwise, and my advice would be to keep refactoring the monolith to be the best it can be.