
Microservices Pattern Catalog

A working catalog of the resilience, consistency, and operational patterns that make microservices survive real production traffic. Each pattern includes when to use it, the trade-offs, and an implementation pointer.

Quick Reference

  • Circuit Breaker — fail fast when a downstream is unhealthy
  • Retry — recover from transient failures (with backoff + jitter)
  • Saga — multi-service workflow with compensations
  • Outbox — atomic 'DB write + event publish'
  • Bulkhead — isolate thread pools / connection pools
  • API Gateway — single entrypoint for cross-cutting concerns
  • Service Discovery — dynamic instance lookup
  • Distributed Tracing — request flow across services

Learning Path

Recommended order

  1. Beginner
  2. Intermediate
  3. Advanced

Prerequisites

  • At least one microservice deployed
  • Understanding of HTTP timeouts and retries

Skills you will learn

  • Designing resilient calls between services
  • Implementing distributed workflows safely
  • Diagnosing distributed failures

Estimated time

1–2 weeks to internalize.

Architecture Overview

[Figure: Microservices Architecture. Tiers: Client (Web App, Mobile App), API Gateway (routing, auth), Services (Users Service, Orders Service, Billing Service), Data (Users DB and Orders DB on PostgreSQL, Event Bus on Kafka), External (Stripe for payments, Email API via SES / SendGrid). Edge labels: REST, publish, subscribe.]
An API gateway routes traffic to independent services. Each service owns its data and communicates via REST or async events.

Circuit Breaker

Stop calling a sick downstream.

Recommended

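Tracks the failure rate of calls to a downstream and opens once it crosses a threshold. While open, calls are rejected immediately; after a cool-down the breaker goes half-open and lets trial calls through to test recovery. Resilience4j is a common choice in Spring Boot applications.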

Pros

  • Prevents cascading failures
  • Recovers automatically

Cons

  • Tuning thresholds is empirical

Best for: Any HTTP/gRPC call to a downstream service.
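
The closed / open / half-open cycle described above can be sketched in a few lines. This is a minimal, single-threaded illustration, not how Resilience4j implements it: the class name, the window size, the threshold, and the cool-down value are all assumptions chosen for the example, and a real breaker would also need locking.

```python
import time
from collections import deque


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the downstream."""


class CircuitBreaker:
    # Hypothetical defaults; real thresholds are tuned empirically.
    def __init__(self, failure_rate=0.5, window=10, cooldown_s=30.0,
                 clock=time.monotonic):
        self.failure_rate = failure_rate      # open when this fraction of the window failed
        self.results = deque(maxlen=window)   # True = failure, False = success
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.cooldown_s:
                raise CircuitOpenError("downstream marked unhealthy")
            self.state = "half_open"  # cool-down elapsed: let a trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        if self.state == "half_open":
            self.state = "closed"     # trial call succeeded: resume normal traffic
            self.results.clear()
        self.results.append(False)

    def _on_failure(self):
        if self.state == "half_open":
            self._open()              # trial call failed: back to open
            return
        self.results.append(True)
        window_full = len(self.results) == self.results.maxlen
        if window_full and sum(self.results) / len(self.results) >= self.failure_rate:
            self._open()

    def _open(self):
        self.state = "open"
        self.opened_at = self.clock()
        self.results.clear()
```

The breaker only evaluates the failure rate once the window is full, so a single early failure does not trip it.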

Retry

Try again — with backoff and a budget.

Use exponential backoff with jitter and a capped number of attempts. The callee must be idempotent. Combine with a circuit breaker to avoid retry storms.

Pros

  • Handles transient failures

Cons

  • Amplifies load during outages without a budget

Best for: Idempotent calls (GET, PUT, DELETE).
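
A rough sketch of capped exponential backoff with jitter follows. The function name, the default delays, and the choice of "full jitter" (a random sleep between zero and the backoff cap) are assumptions for illustration. The attempt cap here limits a single call; a retry budget shared across callers is not shown.

```python
import random
import time


def retry(fn, max_attempts=4, base_delay_s=0.1, max_delay_s=2.0,
          retry_on=(ConnectionError, TimeoutError),
          sleep=time.sleep, rng=random.random):
    """Call fn() until it succeeds or max_attempts is reached.

    Only exceptions listed in retry_on are treated as transient.
    fn must be safe to repeat, i.e. idempotent on the callee side.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # attempts exhausted: surface the last error
            # Exponential backoff capped at max_delay_s, then full jitter:
            # sleep a random fraction of the cap so that clients do not
            # retry in lockstep.
            cap = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            sleep(rng() * cap)
```

Passing `sleep` and `rng` as parameters is only a convenience for testing; the defaults use the real clock and random source.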

Saga

Multi-service workflow with compensations.

Orchestrated (a central coordinator) or choreographed (events). Each step has a compensating action to roll back on failure.

Pros

  • Distributed consistency without 2PC
  • Auditable workflow

Cons

  • Compensations are business logic, not magic
  • Hard to debug

Best for: Cross-service business transactions (checkout, onboarding).
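
The orchestrated variant can be illustrated with a small in-process coordinator. This is a sketch under simplifying assumptions: the function name and the step tuple layout are hypothetical, steps run in one process, and a production orchestrator would persist saga state and retry failed compensations.

```python
def run_saga(steps, ctx):
    """Run steps in order; on failure, compensate completed steps in reverse.

    steps: list of (name, action, compensation) tuples (hypothetical layout).
    ctx:   dict shared between actions and compensations.
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action(ctx)
        except Exception as exc:
            # Undo what already happened, most recent step first.
            # Note: a compensation that raises is NOT handled in this sketch.
            for _done_name, undo in reversed(completed):
                undo(ctx)
            return {"status": "compensated", "failed_step": name, "error": str(exc)}
        completed.append((name, compensate))
    return {"status": "completed"}
```

For a checkout, the steps might be reserve stock, charge card, and create shipment, with release stock, refund, and cancel shipment as the matching compensations. The failing step itself is not compensated, since it never completed.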

Outbox

Atomically write to DB and publish an event.

Write the event to an outbox table in the same DB transaction as the state change; a relay polls the outbox and publishes to Kafka/RabbitMQ. This avoids the dual-write problem. Delivery is at-least-once, so consumers should tolerate duplicates.

Pros

  • No lost events
  • No 2PC needed

Cons

  • Adds a relay component
  • Slight publish latency

Best for: Any service emitting domain events.
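
The write path and the relay can be sketched with SQLite standing in for the service database. The table layout, the `orders.placed` topic name, and the `publish` callback are assumptions for illustration; in a real system `publish` would wrap a Kafka or RabbitMQ producer.

```python
import json
import sqlite3


def init_db(conn):
    # Hypothetical schema: one business table plus the outbox table.
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            topic TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def place_order(conn, order_id, total_cents):
    # "with conn" commits both inserts together, or rolls both back on error.
    with conn:
        conn.execute("INSERT INTO orders (id, total_cents) VALUES (?, ?)",
                     (order_id, total_cents))
        payload = json.dumps({"order_id": order_id, "total_cents": total_cents})
        conn.execute("INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                     ("orders.placed", payload))


def relay_once(conn, publish, batch_size=100):
    """Publish pending outbox rows in insertion order; return how many were found."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)  # if this raises, the row stays pending and is retried
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

If the relay crashes after publishing but before marking the row, the event is sent again on the next poll, which is why consumers need to handle duplicates.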

Bulkhead

Isolate failures by resource pool.

Separate thread pools / connection pools per downstream so one slow dependency can't drain your whole worker pool.

Pros

  • Prevents resource exhaustion across calls

Cons

  • More pools to size and monitor

Best for: Services calling multiple external dependencies.
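
One way to picture per-downstream isolation is a dedicated thread pool for each dependency. The class name and pool sizes below are hypothetical, and the sketch leaves each pool's queue unbounded, which a production setup would likely cap.

```python
from concurrent.futures import ThreadPoolExecutor


class Bulkheads:
    def __init__(self, sizes):
        # One dedicated pool per downstream, e.g. {"billing": 4, "email": 2}.
        self._pools = {
            name: ThreadPoolExecutor(max_workers=n, thread_name_prefix=f"bh-{name}")
            for name, n in sizes.items()
        }

    def submit(self, downstream, fn, *args, **kwargs):
        # Work for one downstream can only occupy that downstream's threads.
        return self._pools[downstream].submit(fn, *args, **kwargs)

    def shutdown(self):
        for pool in self._pools.values():
            pool.shutdown(wait=True)
```

With this layout, calls stuck on a slow dependency pile up in that dependency's pool while calls to other dependencies keep their own workers.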

API Gateway

Single front door for clients.

Centralizes auth, routing, rate limiting, request shaping, and observability. Spring Cloud Gateway, Kong, AWS API Gateway.

Pros

  • Centralized cross-cutting concerns

Cons

  • Can become a bottleneck

Best for: Public-facing microservice systems.

Service Discovery

How services find each other.

Eureka, Consul, Kubernetes Services. Required once instances scale dynamically.

Pros

  • Dynamic scaling
  • Health-aware routing

Cons

  • Control-plane dependency

Best for: Any non-trivial microservice deployment.
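
To make "dynamic instance lookup" and "health-aware routing" concrete, here is a toy in-memory registry. It does not reflect how Eureka, Consul, or Kubernetes work internally; the class name, the heartbeat TTL, and the round-robin selection are assumptions for illustration only.

```python
import itertools
import time


class Registry:
    def __init__(self, ttl_s=30.0, clock=time.monotonic):
        self._ttl_s = ttl_s          # hypothetical heartbeat time-to-live
        self._clock = clock
        self._last_seen = {}         # (service, address) -> last heartbeat time
        self._counter = itertools.count()

    def heartbeat(self, service, address):
        """Instances call this periodically to register and stay listed."""
        self._last_seen[(service, address)] = self._clock()

    def healthy(self, service):
        """Addresses whose last heartbeat is within the TTL, sorted for stability."""
        now = self._clock()
        return sorted(
            addr for (svc, addr), seen in self._last_seen.items()
            if svc == service and now - seen <= self._ttl_s
        )

    def pick(self, service):
        """Round-robin over the currently healthy instances."""
        instances = self.healthy(service)
        if not instances:
            raise LookupError(f"no healthy instances of {service}")
        return instances[next(self._counter) % len(instances)]
```

An instance that stops sending heartbeats drops out of `healthy()` once its TTL lapses, so callers stop being routed to it without any manual deregistration.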

Distributed Tracing

See a request flow across services.

OpenTelemetry + Tempo/Jaeger/Zipkin. Correlate logs and metrics by trace id; find p99 latency contributors.

Pros

  • End-to-end visibility
  • Latency attribution

Cons

  • Sampling trade-offs
  • Storage costs at scale

Best for: Microservices with 3+ hops per request.

Common Mistakes

  • Retrying non-idempotent POSTs: double charges, double orders.
  • Implementing a saga without compensations and calling it 'eventually consistent'.
  • Using a single shared thread pool: one slow downstream stalls everything.
  • Dual-writing to DB + Kafka without an outbox.

Production Tips

  • Default every HTTP client to: connect 2s, read 5s, total budget 10s.
  • Every retry must have a budget AND jitter; every retried operation must be idempotent.
  • Emit one trace per request and propagate `traceparent` headers everywhere.
  • Wrap every external call in a circuit breaker and a bulkhead.
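
The total budget in the first tip can be enforced by deriving each call's timeout from the time left on the request. This is a minimal sketch; the `Deadline` class and its method names are hypothetical.

```python
import time


class Deadline:
    """Tracks the total time budget for one incoming request."""

    def __init__(self, budget_s, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + budget_s

    def remaining(self):
        return max(0.0, self._expires_at - self._clock())

    def timeout_for(self, per_call_s):
        """Per-call timeout, shrunk so it never exceeds the remaining budget."""
        remaining = self.remaining()
        if remaining <= 0:
            raise TimeoutError("request budget exhausted")
        return min(per_call_s, remaining)
```

A handler would create `Deadline(10.0)` once per request and ask for `timeout_for(5.0)` before each downstream read. With the `requests` library, for example, that value could be passed as the read part of `timeout=(2, read_timeout)`.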

Frequently Asked Questions

Do I need all these patterns in every microservice?

No. Start with timeouts + retries + circuit breaker. Add saga/outbox when you have multi-service workflows.

Outbox vs Change Data Capture (CDC)?

Both solve dual-write. CDC (Debezium) reads the database's transaction log (the WAL in PostgreSQL) and needs no application changes; outbox makes the events explicit in your domain. Pick CDC for legacy DBs, outbox for greenfield.