We knew that our application would break if the database was down. More precisely, we knew that our service end-point would time out when we were logging to the database some not-essential-but-useful information for each item in the request’s large batch.

We knew this would happen because it did when Auto Trader ran a Disaster Recovery test in our test environment and we needed a story to fix it soon.

As the data being logged was not essential, we needed to implement a circuit breaker to surround the database call and allow us to continue with essential work even when the circuit tripped on the first logging failure.

Martin Fowler has a post that explains circuit breakers nicely.

A foray into Google turned up a few possibilities, the most interesting being

  • Netflix’s Hystrix is a library intended to help control the interactions between distributed services by adding latency tolerance and fault tolerance logic. It is a big library that pulls in quite a lot of dependencies and does much more than we needed. Interestingly, it is integrated with Spring Boot using the Spring Cloud Netflix package, although that wasn’t a consideration for our Dropwizard application.
  • Resilience4j is a lightweight fault tolerant library inspired by  Hystrix but designed for Java 8 and functional programming. At first glance, Resilience4j looked new but it is actually a new name for the more mature Javaslang-Circuitbreaker. It is built on top of Vavr (formally Javaslang), a functional language extension to Java 8.

Enough looking. Time was short. Resilience4j it was.

The Circuit Breaker

The Resilience4j circuit breaker works in a beautifully simple and flexible way by decorating a Function, Supplier, Consumer, Runnable with a CircuitBreaker. You can then go on to decorate that with a whole load of other things, which I will expand on below.

We used a custom circuit breaker because we wanted it to trip straight away:

CircuitBreakerConfig circuitBreakerConfig = CircuitBreakerConfig.custom()
    .failureRateThreshold(1)
    .waitDurationInOpenState(Duration.ofMillis(120000))
    .ringBufferSizeInHalfOpenState(1)
    .ringBufferSizeInClosedState(1)
    .build();
CircuitBreaker circuitBreaker = CircuitBreaker.of("a-custom-circuit-breaker", circuitBreakerConfig);

but the default circuit breaker provided is even simpler to create:

CircuitBreaker circuitBreaker = CircuitBreaker.ofDefaults("a-default-circuit-breaker");

Then to use it, we just needed to decorate our DAO’s create() call with the circuit breaker. The original call

dataLoggingDao.create(dataToLog);

turned into

CheckedConsumer<AlgorithmResultSummary> recordCreator = dataLoggingDao::create;
CircuitBreaker.decorateCheckedConsumer(circuitBreaker, recordCreator)
              .accept(dataToLog);

… and that was it. A simple, clean circuit breaker around our problem piece of code.

Looking more deeply into the circuit breaker

The default circuit breaker is equivalent to this custom circuit breaker:

CircuitBreakerConfig circuitBreakerConfig = CircuitBreakerConfig.custom()
    .failureRateThreshold(50)
    .waitDurationInOpenState(Duration.ofSeconds(60))
    .ringBufferSizeInHalfOpenState(10)
    .ringBufferSizeInClosedState(100)
    .build();

The ring buffers store the success / failure status of the most recent calls. There are two ring buffers. The closed-state ring buffer stores the status of calls made while the circuit is closed, i.e. the circuit breaker is not tripped. The half-open-state ring buffer stores the status of calls made while the circuit is in the half open state, i.e. after the circuit breaker has been tripped. The half-open-state ring buffer must be filled before the circuit can be considered to be closed again.

failureRateThreshold is the percentage of failures in the closed-state ring buffer above which the circuit breaker should trip open and start short-circuiting calls. The circuit breaker will not trip before the closed-state ring buffer has been filled.

waitDurationInOpenState is the amount of time the circuit breaker should stay open and short-circuit calls after it has been tripped. After this time, the circuit breaker moves to the half open state.

ringBufferSizeInHalfOpenState is the size of the half-open-state ring buffer.

ringBufferSizeInClosedState is the size of the closed-state ring buffer.

The Circuit Breaker Registry

There is a registry supplied for you to manage your circuit breaker instances if you so wish. You create a CircuitBreakerRegistry with the circuit breaker configuration you wish to use as a default:

CircuitBreakerRegistry circuitBreakerRegistry = CircuitBreakerRegistry.of(circuitBreakerConfig);

then request circuit breakers from the registry as required. If you wish to have a circuit breaker with a different configuration you can specify one, otherwise you will get a circuit breaker that has the configuration that you used when creating the registry:

// Get a CircuitBreaker from the CircuitBreakerRegistry with configuration that you when creating the registry
CircuitBreaker customCircuitBreaker1 = circuitBreakerRegistry.circuitBreaker("custom-circuit-breaker-1");

// Get another CircuitBreaker from the CircuitBreakerRegistry with configuration that you when creating the registry
CircuitBreaker customCircuitBreaker2 = circuitBreakerRegistry.circuitBreaker("custom-circuit-breaker-2");

// Get a CircuitBreaker from the CircuitBreakerRegistry using a different custom configuration
CircuitBreaker customCircuitBreaker3 = circuitBreakerRegistry.circuitBreaker("custom-circuit-breaker-3", otherCircuitBreakerConfig);

More Than a Circuit Breaker

Because Resilience4j works by applying decorators to your consumers, functions, runnables and suppliers, you can combine the decorators in a very powerful way.

The core modules give you a circuit breaker, a rate limiter, a bulkhead for limiting the amount of parallel executions, an automatic retry (sync and async), response caching and timeout handling. There are add-on modules to give you metrics and more.

The resilience4j GitHub page at https://github.com/resilience4j/resilience4j gives some good examples of how the modules can be used:

  • A circuit breaker with retry that handles any exceptions, and you can also configure a custom back-off algorithm
  • A rate limiter
  • A bulkhead
  • A cache
  • Adding metrics
  • Consuming CircuitBreaker, RateLimiter, Cache and Retry events

If You’re Interested

Find out more at:

GitHub page: https://github.com/resilience4j/resilience4j

Very good documentation page: http://resilience4j.github.io/resilience4j