Spring Boot 2: Best Practices for Reactive Applications

TL;DR

In this article we will highlight some of the best practices when writing a Reactive Application, and more specifically a Spring Webflux application in the context of Spring Boot 2. Each best practice suggested will have a rationale accompanying it that shows when it’s appropriate to use and when it’s not, describing the most common use cases for it. We will close with a look at the most common concepts to keep in mind when writing a Reactive Application.

 

Preliminary Considerations

Let’s start by saying that no two applications are equal, and thus there is no one set of practices that will work 100% of the time for every possible application.

What we can do is provide a reasonable list of guidelines and practices derived from the collective development experience of the industry and invite readers to apply their best judgement when evaluating each one of them, asking: “Does it make sense to incorporate one or more of these into the current application workflow? Would it disrupt the codebase too much?”

Sometimes the answers to those questions make it very difficult to adopt any best practice and, for a variety of reasons, it’s not possible to change the way the codebase is currently structured and working.

Sometimes it’s possible to work on a greenfield project and thus there’s more freedom when it comes to incorporating the suggestions mentioned below, which is the ideal scenario, because there is no pre-existing “scaffolding” that will render this adaptation difficult.

The most important thing to always keep in mind is the consistency of the codebase and the fact that any change comes with a set of trade-offs that need to be thoroughly understood before implementing them.

While it’s true that some of the best practices will be very simple (bordering on trivial), others will have a more profound impact on the application, both from an architectural point of view and from a debuggability point of view, which makes them the most rewarding but also the most disruptive to adopt.

A codebase that is full of best practices but is inconsistent in style, architecture and code/module organization is not going to be pleasant to work with, and while it might be performant enough to scale, it is surely not the best way to achieve that result.

Consistency in this context simply means that there are some agreed-upon rules that the developers follow when writing the code, so that it is always organized the same way, and that any exceptions are clearly documented and justified.

As an example, it can be established that every reactive component is written as an event emitter and/or an event subscriber, so the entire architecture of the application will rely on event broadcasting and subscription.

This means that there will be a need for some sort of event persistence, and thus maybe an external system will be involved (such as Kafka, RabbitMQ, ZeroMQ etc.) that brings additional concerns and constraints into the design of the system.

With this rule that’s been decided and agreed-upon, developers can now start creating modules and components following this model, which means that they will “shape” the codebase in a certain way according to this event-driven direction decided in the design phase.
Introducing some best practices that will disrupt this architecture is not a wise choice, even if those best practices might be needed or well-suited for the problem at hand.

Creating inconsistencies at a level as deep as the architecture of an application is bound to disrupt the entire codebase, and may expose code paths that were not meant to be exposed, along with bugs that would not have surfaced without the change that’s just been introduced.

What this implies is that it’s necessary to understand when it’s worth introducing best practices that have a large impact on the codebase and when it’s better to leave things as they are, even if they are not the best but still good enough.

This judgement can only come from experience, from intimate knowledge of the application and from a substantial amount of time spent evaluating what it means to introduce a potentially disruptive change into the code.

Studying and understanding what we are working on is always time well spent, and it can only lead to a more complete development experience and, more importantly, to a better ability to future-proof our code against the changes that come when requirements inevitably change.

Some situations, by their very nature, are not a good environment for introducing new changes like adopting some best practices or rewriting some components to fit a better architectural model.

These situations might be an approaching deadline, a project that’s being built as a prototype just to understand the problem domain, or the refactoring of a legacy system. All of these are very delicate and are not the right place and time to introduce new code that we might not be familiar with.

In summary, try to understand each point discussed below, asking yourself if your codebase is ready to change in the way that’s suggested in that best practice, and if yes, what might be the impact on the overall architecture of the application.

 

Best Practices for a Reactive Application

With the introduction on the topic out of the way, we can now proceed to discuss the best practices that might be useful when writing a Reactive Application, with a focus on Spring Boot 2 based codebases.

We will divide these best practices into different categories based on what kind of “layer” they have the most impact on: architecture, module organization and code components.

 

Architecture

  • When working with Reactive Streams, try to push concerns about subscriptions down the stack because it saves you from cluttering your code with non-business logic, making it more readable and easier to test and maintain. Source

What this means is that we should try to avoid subscribing directly to a Reactive Stream in our code. Subscribing is a low-level operation, exposed for the developer’s convenience for cases where there is no alternative; our reactive code should instead focus on high-level concerns and not care about this kind of detail.

When we use the raw “subscribe()” method or any of its overloaded variants, we are concerning ourselves with handling data buffering, backpressure and outbound flow control, which is something we should not care about in the vast majority of cases.

If possible, and with Spring Webflux it always is, try to use a library that handles the subscription process for you, freeing you from writing boilerplate code as much as possible.
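To give a feel for what handling a subscription by hand involves, here is a minimal sketch that uses the JDK’s “java.util.concurrent.Flow” interfaces (Java 9+) instead of Reactor, so that it runs without any dependency. The class name “OneAtATimeSubscriber” and the “runDemo” helper are made up for illustration. Most of the code deals with demand signalling and lifecycle rather than business logic, which is exactly the part a framework can take over.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

// Hypothetical hand-written subscriber: it has to manage demand itself.
class OneAtATimeSubscriber implements Flow.Subscriber<Integer> {
    final List<Integer> received = new CopyOnWriteArrayList<>();
    final CountDownLatch done = new CountDownLatch(1);
    private Flow.Subscription subscription;

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        this.subscription = subscription;
        subscription.request(1); // backpressure: ask for a single item
    }

    @Override
    public void onNext(Integer item) {
        received.add(item);      // the only "business" line
        subscription.request(1); // forgetting this stalls the stream
    }

    @Override
    public void onError(Throwable error) {
        done.countDown();
    }

    @Override
    public void onComplete() {
        done.countDown();
    }

    // Publishes 1..3 and waits (up to 5 seconds) for the completion signal.
    static List<Integer> runDemo() {
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= 3; i++) {
                publisher.submit(i);
            }
        } // close() signals completion to the subscriber
        try {
            subscriber.done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

In a Webflux controller none of this is needed: returning the Mono or Flux is enough, and the framework performs the subscription.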

  • Try from the start to design components which are pure, free from side effects. This is especially true for the business logic components; push the side effects to specialized classes that can be “abstracted” away behind an interface.

This is a very impactful change, and thus might not be applicable everywhere, but the gist of it is to abstract away (behind well-defined interfaces) the “ugliness” of the external world, and write code that is free from such side effects, that will always produce the same result given the same combination of input and external state.

What this means is that our business logic core classes should not directly interact with the external world, but instead rely on specialized classes that feed data into these components and eventually handle the writeback/sending/output of the processed data.

While this might sound trivial, it’s not an easy feat and requires a lot of discipline when designing the system, because the modules need to be strictly separated via interfaces and the implementation details need to be abstracted away, so it’s always possible to swap one implementation for another and nothing will change, which is very useful for testing purposes too.

Modelling and concentrating all the state from the outside world into a number of POJOs allows us to write business logic that takes in these external state objects and works on them, always producing the same result if given the same input and the same external state objects. This is a great property to have in a Reactive System, because immutable and pure systems naturally compose very well with the functional and reactive approach that we want to adopt.

Another added benefit is that we will simplify testing a lot because we can craft a particular external state “view of the world” as objects and pass those to our business components, checking if they behave like we expect them to or if there are edge cases we might have not considered.
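As a rough illustration of crafting a “view of the world” for a test, the sketch below passes external state into the business logic as a plain object. Everything in it is hypothetical: the “StockSnapshot” class, the “priceFor” method and the 10% low-stock surcharge rule are invented for the example.

```java
import java.math.BigDecimal;

class PricingRules {
    // Hypothetical snapshot of external state, captured before the logic runs.
    static final class StockSnapshot {
        final int unitsAvailable;

        StockSnapshot(int unitsAvailable) {
            this.unitsAvailable = unitsAvailable;
        }
    }

    // Pure: depends only on its arguments, touches no database or clock.
    static BigDecimal priceFor(BigDecimal basePrice, StockSnapshot stock) {
        if (stock.unitsAvailable == 0) {
            throw new IllegalArgumentException("item is out of stock");
        }
        // Assumed rule: 10% surcharge when fewer than 5 units remain.
        return stock.unitsAvailable < 5
                ? basePrice.multiply(new BigDecimal("1.10"))
                : basePrice;
    }

    public static void main(String[] args) {
        BigDecimal base = new BigDecimal("10.00");
        System.out.println(priceFor(base, new StockSnapshot(3)));
        System.out.println(priceFor(base, new StockSnapshot(50)));
    }
}
```

Because the stock level arrives as an argument, an edge case such as “zero units left” can be exercised by constructing a snapshot, without preparing a database.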

How to practically do this is quite complex and requires first-hand knowledge of the system, but the general advice is to separate the types of data that are needed to perform a given business function and model each of those types into POJOs/interfaces according to their nature: whether they are a required input for the system or whether they represent external state.

As an example, we might have a business need to calculate the tax due on an item and write that back into a database, that we solved by creating an interface and implementing its method “void calculateTax(Long itemId, BigDecimal vatRate)”.

We might have solved this by getting the item from the database, calculating the tax based on the input VAT rate and writing it back to the database, all in the same method.

There is nothing wrong with that per se: it will work 99% of the time, as long as there is no issue with the DB, and it will suffice for most applications, even if it might be a bit too much for a single method.

In the Reactive Model, we must first not perform any blocking calls, which already rules out the possibility of including the DB query/methods into this method that we will use in a Reactive Pipeline, and furthermore, this is not pure, because it has side effects and relies on implicit state represented by the Database (even if we pass the same vatRate, we might end up with different results because the product at that ID changed, for example, and this is not a pure function).

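If we want to adopt the best practice suggested above, we should decompose our method into a set of three methods chained in a Reactive Pipeline, and make sure the blocking DB calls run on a scheduler meant for blocking work, for example by applying “subscribeOn” to the blocking source or “publishOn” ahead of the operator that makes the call. This does not make the calls themselves non-blocking, but it keeps them off the reactive threads (or, even better, start using a reactive database driver such as R2DBC where applicable).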

The methods might look like this:

“Mono<Item> fetchItem(Long itemId)”
“Mono<BigDecimal> calculateTax(Mono<Item> item, BigDecimal vatRate)”
“Mono<Void> writeTaxToDB(Mono<BigDecimal> taxAmount)”

The Reactive Pipeline might look something like (link):
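What follows is not the original linked snippet but a minimal sketch of the same decomposition, written with the JDK’s “CompletableFuture” as a dependency-free stand-in for “Mono” so that it runs as-is. The “Item” class, the in-memory maps standing in for the database, and the choice to pass a plain “Item” and the item id into the later steps are all assumptions made for illustration. In Reactor, “thenApply” corresponds roughly to “map” and “thenCompose” to “flatMap”.

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class TaxPipeline {
    // Hypothetical item: immutable id and net price.
    static final class Item {
        final long id;
        final BigDecimal netPrice;

        Item(long id, BigDecimal netPrice) {
            this.id = id;
            this.netPrice = netPrice;
        }
    }

    // In-memory stand-ins for the database tables (assumption).
    static final Map<Long, Item> ITEMS = new ConcurrentHashMap<>();
    static final Map<Long, BigDecimal> TAXES = new ConcurrentHashMap<>();

    // Side effect: read from the "database".
    static CompletableFuture<Item> fetchItem(long itemId) {
        return CompletableFuture.supplyAsync(() -> ITEMS.get(itemId));
    }

    // Pure: the same item and vatRate always give the same tax.
    static BigDecimal calculateTax(Item item, BigDecimal vatRate) {
        return item.netPrice.multiply(vatRate);
    }

    // Side effect: write to the "database".
    static CompletableFuture<Void> writeTaxToDB(long itemId, BigDecimal taxAmount) {
        return CompletableFuture.runAsync(() -> TAXES.put(itemId, taxAmount));
    }

    // The pipeline: fetch, then compute, then write back.
    static CompletableFuture<Void> taxPipeline(long itemId, BigDecimal vatRate) {
        return fetchItem(itemId)
                .thenApply(item -> calculateTax(item, vatRate))
                .thenCompose(tax -> writeTaxToDB(itemId, tax));
    }

    public static void main(String[] args) {
        ITEMS.put(1L, new Item(1L, new BigDecimal("100.00")));
        taxPipeline(1L, new BigDecimal("0.20")).join();
        System.out.println(TAXES.get(1L));
    }
}
```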

 

 

We have separated the actual logic of calculating the tax amount for an item, which is a pure operation (we will always get the same value out if we use the same pair item/vatRate) from the side effect ones, like reading from the Database and writing back into it.

This allows us to have a functionally pure business logic component/class/method that receives inputs (the vatRate) and external state (the item) and will always interact with the external world through well-defined interfaces, in this case the method signature.

We are confining the act of accessing the real world and thus interacting with possible side effects to well defined interfaces, the two methods to retrieve an item and to write a tax amount back to the database.

This requires knowing very well what are the implicit assumptions that our application relies on, because we don’t want to disrupt or break anything when we introduce this into an existing codebase.

 

Module Organization

  • Try to organize all code related to a given feature into the same package.

This is general advice that is as valid for Reactive as it is for “classic” Java, because it leads to high module cohesion and low coupling between unrelated packages/features/units.

The reason it’s presented as a best practice for Reactive as well is that in this programming model, more than in others, it’s important not to introduce logic dependencies between unrelated features/modules/classes, as this would break the promise of “purity” when writing code.

Writing pure code is very important in the Reactive World, not only for performance, but also for debugging purposes, observability of the overall system state and reproducibility of edge cases, and if we organize our code in a way that makes it easy to write pure “business logic” code, we are setting ourselves up for success.

What this means in practice is that we should keep the code related to “item processing” in one package, place the “login feature” in another, the “schedule a batch job” functionality in yet another, and so on.

This is something that, if the codebase and the classes are written following the SOLID principles, can be implemented in an already existing application without causing too much pain.

  • Organize your modules for debugging from the early stages of the application, and if you’re writing microservices, make use of trace ids/span ids/correlation ids.

What this means is that you should plan and organize your code in a way that at any given moment you are logging the necessary information for your developers/SREs to understand what is going on.

Make good use of appropriate logging levels, try not to be too verbose unless strictly necessary to prevent “logging fatigue” and be mindful when logging at the “error” level (or equivalent) to prevent the scenario where people ignore error logs just because there are too many of them.

If you’re writing microservices, make sure you can trace a request while it transits through different applications, so you can always know the correspondence between microservice->log line->request affected.

This is a quite important point and does not require a lot of changes in your codebase, at least not as much as the previous suggestions, so try and implement it if you can.

There are a number of frameworks for Java that can do a lot for your application in this regard; consider checking out Spring Cloud Sleuth and Zipkin.

Being able to check the history of a request and examine all the logs from all the microservices that touched that request is an invaluable aid when debugging errors or weird/unexpected behaviors.
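Tracing libraries take care of this automatically, but the underlying idea is small enough to sketch by hand. The class “CorrelationIds”, its two methods and the log line format below are hypothetical and do not reflect how any particular library does it: an id is reused if the upstream service sent one, created otherwise, and attached to every log line.

```java
import java.util.UUID;

class CorrelationIds {
    // Reuse the id sent by the upstream service, or start a new trace.
    static String resolve(String incomingId) {
        return (incomingId == null || incomingId.isBlank())
                ? UUID.randomUUID().toString()
                : incomingId;
    }

    // Every log line carries the service name and the correlation id.
    static String logLine(String service, String correlationId, String message) {
        return String.format("[%s] [cid=%s] %s", service, correlationId, message);
    }

    public static void main(String[] args) {
        String cid = resolve(null); // no upstream id: a new one is generated
        System.out.println(logLine("orders", cid, "order received"));
        // The downstream service receives the same id and reuses it.
        System.out.println(logLine("billing", resolve(cid), "invoice created"));
    }
}
```

Searching the aggregated logs for a single id then returns the lines from every service that handled that request.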

 

Code Components

  • Never block on a Reactive thread. Never.

More than a best practice, this is the golden rule, the mother of all the other rules. Any best practice would be useless if a developer did not respect this rule.

Blocking on a reactive thread means breaking the reactive promise (link) which in turn means nullifying any gain from going reactive, and possibly deteriorating our performance because of the much lower number of threads we have in the Reactive Model by default.

So how do we avoid blocking on a reactive thread?

It’s easy: we block on a non-reactive thread. We ask the Project Reactor framework (integrated into Spring Boot 2) to perform our blocking call on another thread and then notify us of the result in the form of a Mono or Flux, like a normal reactive operation, using the “subscribeOn” operator on the blocking source (or “publishOn” ahead of the operator that makes the blocking call).

This way, we will not block the main reactive thread, and at the same time we will not disrupt the “reactive workflow” by introducing special syntax or conventions for the blocking operations. We simply apply an operator and use the result as a normal reactive collection, Mono or Flux.
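The mechanics of moving a blocking call onto its own thread can be sketched with the JDK alone. The example below uses “CompletableFuture” and a dedicated executor as a stand-in for Reactor’s schedulers; the class name, the pool size of 4 and the “slowLookup” method are assumptions for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingOffload {
    // Dedicated pool for blocking work; the size of 4 is an arbitrary assumption.
    static final ExecutorService BLOCKING_POOL = Executors.newFixedThreadPool(4, runnable -> {
        Thread thread = new Thread(runnable, "blocking-pool");
        thread.setDaemon(true); // do not keep the JVM alive in this demo
        return thread;
    });

    // Hypothetical legacy call that blocks; returns the name of the thread it ran on.
    static String slowLookup() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return Thread.currentThread().getName();
    }

    // The caller gets an asynchronous result immediately; the block happens elsewhere.
    static CompletableFuture<String> slowLookupAsync() {
        return CompletableFuture.supplyAsync(BlockingOffload::slowLookup, BLOCKING_POOL);
    }

    public static void main(String[] args) {
        CompletableFuture<String> result = slowLookupAsync();
        System.out.println("caller thread is free: " + Thread.currentThread().getName());
        System.out.println("blocking call ran on: " + result.join());
    }
}
```

In Reactor the same idea is commonly written as “Mono.fromCallable(...)” followed by “subscribeOn(...)” with a scheduler intended for blocking work, such as “Schedulers.boundedElastic()” in recent versions.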

  • Compose operations on the reactive streams with the higher order functions (operators) provided by the Mono/Flux Reactive Collections.

This is a change that’s almost transparent and can be safely applied to an already existing codebase, because it’s non-destructive: it just moves code/function calls out of nested scopes into a chain of operators.

By doing so, we reduce the number of scopes where a variable is visible, we decrease the cognitive complexity of a given block of code (which naturally leads to fewer bugs due to having simpler code) and we also make it more maintainable, which is always a worthwhile objective.
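The difference between the two styles can be seen in a small sketch, again using “CompletableFuture” from the JDK as a stand-in for Mono so that it runs without dependencies. The class and method names are invented for the example. Both methods produce the same greeting; only the shape of the code changes.

```java
import java.util.concurrent.CompletableFuture;

class OperatorChaining {
    // Hypothetical asynchronous source.
    static CompletableFuture<String> loadName() {
        return CompletableFuture.supplyAsync(() -> "  alice ");
    }

    // Nested style: every step opens a new scope and wires its result by hand.
    static CompletableFuture<String> greetingNested() {
        CompletableFuture<String> out = new CompletableFuture<>();
        loadName().thenAccept(raw -> {
            String trimmed = raw.trim();
            CompletableFuture.supplyAsync(trimmed::toUpperCase).thenAccept(upper -> {
                out.complete("Hello, " + upper);
            });
        });
        return out;
    }

    // Chained style: the same steps as a flat sequence of operators.
    static CompletableFuture<String> greetingChained() {
        return loadName()
                .thenApply(String::trim)
                .thenApply(String::toUpperCase)
                .thenApply(upper -> "Hello, " + upper);
    }

    public static void main(String[] args) {
        System.out.println(greetingNested().join());
        System.out.println(greetingChained().join());
    }
}
```

The nested version also silently drops errors unless each level forwards them to “out”, while the chain propagates them on its own. With Mono and Flux the corresponding operators include “map”, “flatMap” and “filter”.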

  • Don’t ever use ThreadLocals in a Reactive Application.

This is very simple and straightforward: a request accepted on one thread may be processed and returned to the client from another thread, so a value stored in a ThreadLocal by one step may not be visible to the next one, with resulting data synchronization issues.
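The effect is easy to reproduce with plain JDK classes. In the sketch below (class and field names are made up), a value is stored in a ThreadLocal on the calling thread and then looked up from a worker thread, which is what happens when a reactive pipeline hops between threads.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalPitfall {
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Sets the value on the calling thread, then checks it from a worker thread.
    static boolean visibleOnWorkerThread() {
        CURRENT_USER.set("alice");
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            return CompletableFuture
                    .supplyAsync(() -> CURRENT_USER.get() != null, worker)
                    .join();
        } finally {
            worker.shutdown();
            CURRENT_USER.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println("value visible on worker thread: " + visibleOnWorkerThread());
    }
}
```

The worker thread sees no value at all, even though the calling thread set one just before.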

  • Try to avoid as much as possible mutable state in your business logic components.

This is somewhat related to the suggestion of writing pure functions, but it’s a bit more subtle in the sense that not all immutable data belongs to pure functions, but all pure functions use immutable data.

Immutability is a great property to have in a component because it simplifies reasoning about the actual state of the component, if there’s any.

The best scenario is to have stateless components with no state at all inside, but this is not always possible, so the next best thing is designing for immutable state.

When we create an immutable instance, it cannot be changed anymore, so we can verify some of the properties of a system that uses immutable data/state.

In practice, this means that we don’t have to worry that a value might be unknowingly modified by some other module/class/function, which in turn narrows our search for issues to the only place where an immutable value is set: its creation.

If we obtain an incorrect value out of an immutable piece of data, we only have to check what it receives when it’s created. There’s no need to worry about external calls to its setters, because either there will be no setters to call or calling them will throw an exception.

It goes without saying that any help during the search for a bug is much appreciated, and being able to reason about how values (don’t) change is invaluable, because we are adding constraints to a system that’s misbehaving, an operation that reduces its state space and allows us to focus only on certain parts of the system and exclude certain kinds of errors right from the start.
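As a minimal sketch of what designing for immutable state can look like in plain Java, the hypothetical “ImmutableOrder” class below has no setters, copies the list it receives, and exposes it as an unmodifiable view, so attempts to modify it throw an exception.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ImmutableOrder {
    private final String id;
    private final List<String> lines;

    ImmutableOrder(String id, List<String> lines) {
        this.id = id;
        // Defensive copy: later changes to the caller's list cannot leak in.
        this.lines = Collections.unmodifiableList(new ArrayList<>(lines));
    }

    String id() {
        return id;
    }

    List<String> lines() {
        return lines; // unmodifiable: add/remove throw UnsupportedOperationException
    }

    // "Changing" an order means creating a new one; the original stays as it was.
    ImmutableOrder withLine(String line) {
        List<String> copy = new ArrayList<>(lines);
        copy.add(line);
        return new ImmutableOrder(id, copy);
    }

    public static void main(String[] args) {
        ImmutableOrder order = new ImmutableOrder("o-1", List.of("book"));
        ImmutableOrder extended = order.withLine("pen");
        System.out.println(order.lines().size() + " line(s) in the original");
        System.out.println(extended.lines().size() + " line(s) in the copy");
    }
}
```

If an order ever holds a wrong value, the constructor call that produced it is the only place to inspect.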

 

Conclusions

The best practices we outlined earlier are only a small subset of all the suggestions one could apply to a reactive codebase, and it’s not possible to list them all, not only because of the size of the resulting list, but also because some of them would be too specific and some would be too generic to be useful for a particular application.

Always remember to reason and evaluate whether the proposed solution fits your use case, and whether it can be smoothly added to your codebase or you would need to bolt it on and force some disruptive change with it.

In general, adding constraints to a system, limiting its state space and making sure data sync issues are avoided at design time is the best way to design a robust and resilient reactive system, one that has a high degree of debuggability and maintainability and is easier to adapt when requirements inevitably change.

Check back next week for the next instalment of this Tech Journey Series at Itembase, where we will dive a bit deeper into the topic of debugging Reactive Applications.

 
