Why I Hate Microservices Part 1: The Russian Dolls Problem 🪆🪆🪆

People always want the newest things. The trendier something is, the more it appeals to people, and technology is no exception. Adding the word "microservices" to a CV can help it shine a little. These were some of the thoughts I had while two hours deep into debugging an issue in a project that involved .NET, Kafka, SignalR, React.js, Elasticsearch, ECS, MediatR and more.

Beware of Microservices ⚠️

What I'm about to share is a cautionary tale for people who are on the verge of shifting their solution into a microservices architecture.

Before you implement such an architecture, please read carefully about the decisions others have made and their consequences, so that you can see the outcome of each design decision for yourself instead of making the same mistakes.

Before anything: The case study that I’m about to discuss might have had its problems. However, the people who made the design decisions are very good software engineers, and I have learned a lot from them over the years. Some of these problems occurred due to outside conditions and client restrictions/preferences. Just because I’m pointing out the problems we faced in this project does not mean that I wouldn’t have made similar mistakes. On the contrary, facing these problems made me a better software engineer.

I will never claim that I’m better than those who worked on this project or took part in designing its architecture, and I have great respect for all the engineers I have worked alongside.

And always remember this golden rule:

Never start with microservices. Always start with a modular monolith, then shift to microservices (if necessary).

My stance and opinions are based on 3 things:

Personal experience and real-life consequences.
The opinions and points of view of my seasoned colleagues and coworkers.
Major design and architecture books such as Building Microservices, Microservices Design Patterns in .NET, and Embracing Microservices Design.

That being said, let's talk about this project by dissecting the core problems that made the system unstable, unpredictable and downright exhausting.

Let's Decouple By Coupling 🔗

One of the main problems of this project was that the microservices were heavily dependent on each other, which should be your first sign that the system should be a monolith, not microservices.

The dependency between domains was so tight that sometimes one domain required synchronous communication with another, or needed a certain model from another domain. This, of course, coupled the domains together. It is a topic I have covered in a previous article.

To fix this coupling, a decision was made to create a package for each domain, publish these packages to Nexus Package Repository after each successful build (as a CI/CD pipeline step), and finally reference the needed package in any domain that needs its models or services.

Great, right? No. If you look closely, you have basically embedded one domain inside another, which means you're going back to a monolith. Not only does this contradict the whole idea of microservices, it also creates a very painful deployment cycle.

It's important to note that this technique is actually very useful in other cases, like sharing utility code among multiple services. The technique itself isn't the issue; the case it was chosen for is.

Now let's get to the horrors of deployment when you have domains that are this reliant on each other.

Deployment PTSD 😰

You may know Russian dolls: the ones where you open one and find another inside it. Well, I'm calling this ordeal The Russian Dolls Problem, and I will tell you why.

Having domain 2 reference domain 1 through a package published to a repository means that each time you update the code base in domain 1, you need to update the package version in domain 2 (manually) so that it gets the latest changes.

And if domain 2 is already used in domain 3, for example, then you have to update domain 3 as well. And if, God forbid, domain 3 is used in domain 4, then you probably get why I call it the Russian Dolls Problem.
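To make the cascade concrete, here is a minimal sketch in Python. The domain names and the REFERENCES map are hypothetical stand-ins for the chain described above, not the project's actual layout. It walks the package references backwards to list every domain that needs a version bump and a redeploy after one domain changes.

```python
from collections import deque

# Hypothetical map: each domain lists the domain packages it references.
REFERENCES = {
    "domain2": ["domain1"],
    "domain3": ["domain2"],
    "domain4": ["domain3"],
}


def impacted_domains(changed, references):
    """List every domain that directly or indirectly references `changed`."""
    # Invert the map: package -> domains that consume it.
    consumers = {}
    for domain, packages in references.items():
        for package in packages:
            consumers.setdefault(package, []).append(domain)

    impacted = []
    seen = {changed}
    queue = deque([changed])
    # Breadth-first walk outwards from the changed domain.
    while queue:
        current = queue.popleft()
        for consumer in consumers.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                impacted.append(consumer)
                queue.append(consumer)
    return impacted


print(impacted_domains("domain1", REFERENCES))  # ['domain2', 'domain3', 'domain4']
print(impacted_domains("domain4", REFERENCES))  # []
```

A change to the innermost doll touches every doll around it, while a change to the outermost one touches nothing else.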

This is actually how the project handled inter-service communication, and the layout was so exhausting and time-consuming that, in the best case, a full deployment took 1 to 2 hours. Imagine discovering that you made a mistake in domain 1: you need to do a full deployment again, because the other 3 domains are now impacted.

This panic-attack-inducing deployment cycle can and will cause nightmares, especially if you add Kafka and Confluent Schema Registry to the mix.

The slightest schema mismatch (due to an outdated domain package) can prevent a consumer from deserializing an Avro-serialized message against the schema held in Schema Registry. The result is a schema conflict that ruins your application flow, a flow which, by the way, was neither stopped nor reverted when this happened.
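As a rough illustration of how an outdated package produces this kind of conflict, the sketch below checks one Avro schema resolution rule in plain Python: a field the reader expects, which the writer does not provide and which has no default, cannot be resolved. The schemas and field names are hypothetical, and this is a simplified check (it ignores aliases and type promotion), not the Confluent client or the Schema Registry API.

```python
def unresolvable_fields(writer_schema, reader_schema):
    """Return reader fields that the writer lacks and that have no default."""
    writer_names = {field["name"] for field in writer_schema["fields"]}
    return [
        field["name"]
        for field in reader_schema["fields"]
        if field["name"] not in writer_names and "default" not in field
    ]


# Hypothetical schema shipped in the latest domain package (the producer side).
NEW_ORDER = {
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "orderId", "type": "string"},
        {"name": "amount", "type": "double"},
    ],
}

# Hypothetical schema baked into an outdated package (the consumer side).
OLD_ORDER = {
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "orderId", "type": "string"},
        {"name": "total", "type": "double"},
    ],
}

# The outdated consumer still expects "total", which the producer no longer writes.
print(unresolvable_fields(NEW_ORDER, OLD_ORDER))  # ['total']
```

One domain forgetting to bump a package version is enough to end up in this state.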

This brings us to the problem I will discuss in the next article: data inconsistency.
