FastAPI + Redis Caching — Make Your API Faster
Add Redis caching to FastAPI the right way — async Redis client, decorator-based caching, TTL strategies and cache invalidation patterns.
Introduction
Most slow FastAPI endpoints aren't slow because of Python — they're slow because they hit the database for data that changes once an hour. Redis is the answer. This tutorial wires up async Redis caching, a decorator pattern that's easy to read, and a sane invalidation strategy.
Key takeaways
- Use `redis.asyncio` (the async client built into redis-py). The standalone `aioredis` package is deprecated; its code was merged into redis-py.
- Cache reads, not writes. Invalidate on write.
- Always set a TTL. Never trust an eviction policy alone.
- Use a versioned key prefix so a deploy can invalidate everything at once.
- Serialize with `orjson`: 5-10× faster than `json`.
Setup
pip install "redis[hiredis]" orjson
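```python
# app/core/cache.py
from functools import wraps

import orjson
import redis.asyncio as redis

PREFIX = "v1"  # bump on deploy to orphan every key written by the old version
r = redis.from_url("redis://redis:6379", decode_responses=False)


def cached(ttl: int = 60, key: str | None = None):
    """Cache the JSON-serializable result of an async function for `ttl` seconds."""
    def deco(fn):
        @wraps(fn)
        async def wrapper(*args, **kwargs):
            # Arguments must be JSON-serializable; OPT_SORT_KEYS keeps the key
            # stable regardless of the order keyword arguments were passed in.
            params = orjson.dumps([args, kwargs], option=orjson.OPT_SORT_KEYS).decode()
            k = f"{PREFIX}:{key or fn.__name__}:{params}"
            hit = await r.get(k)
            if hit is not None:
                return orjson.loads(hit)
            value = await fn(*args, **kwargs)
            await r.set(k, orjson.dumps(value), ex=ttl)
            return value
        return wrapper
    return deco
```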
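To see the decorator's behaviour without a running Redis, the sketch below rebuilds it around an injected client. `FakeRedis`, `make_cached` and `slow_lookup` are hypothetical names invented for this illustration, stdlib `json` stands in for `orjson`, and TTL expiry is deliberately not simulated.

```python
import asyncio
import json
from functools import wraps


class FakeRedis:
    """Hypothetical in-memory stand-in exposing only get/set; ignores TTL."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        return self.store.get(key)

    async def set(self, key, value, ex=None):
        self.store[key] = value


def make_cached(client, prefix="v1"):
    """Build a `cached` decorator bound to a given client (assumed helper)."""

    def cached(ttl=60, key=None):
        def deco(fn):
            @wraps(fn)
            async def wrapper(*args, **kwargs):
                # Compact separators mimic orjson's whitespace-free output.
                params = json.dumps([args, kwargs], separators=(",", ":"), sort_keys=True)
                k = f"{prefix}:{key or fn.__name__}:{params}"
                hit = await client.get(k)
                if hit is not None:
                    return json.loads(hit)
                value = await fn(*args, **kwargs)
                await client.set(k, json.dumps(value).encode(), ex=ttl)
                return value
            return wrapper
        return deco
    return cached


fake = FakeRedis()
cached = make_cached(fake)
calls = 0


@cached(ttl=300, key="product")
async def slow_lookup(product_id: int) -> dict:
    global calls
    calls += 1  # counts how often the "database" is actually hit
    return {"id": product_id, "name": "widget"}


async def demo():
    first = await slow_lookup(42)
    second = await slow_lookup(42)  # served from the fake cache
    return first, second


first, second = asyncio.run(demo())
```

Passing the client in rather than importing a module-level `r` is a design choice made here for testability; the tutorial's module-level client works the same way at runtime.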
Using it in an endpoint
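```python
from app.core.cache import cached, r, PREFIX

# `db` is assumed to be an async database handle defined elsewhere in the app.


@cached(ttl=300, key="product")
async def get_product(product_id: int) -> dict:
    row = await db.fetch_one("SELECT * FROM products WHERE id=$1", product_id)
    return dict(row)


async def update_product(product_id: int, payload: dict):
    await db.execute("UPDATE products SET ... WHERE id=$1", product_id)
    # The key must match the decorator's output byte for byte. orjson emits
    # compact JSON with no spaces, and this form only matches positional
    # calls such as get_product(42).
    await r.delete(f"{PREFIX}:product:[[{product_id}],{{}}]")
```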
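The invalidation call only works if the hand-written key is identical to the one the decorator generated, down to whitespace and argument style. One way to remove that coupling is a shared key builder used by both the read and the write path. The sketch below is an assumption, not part of the original module: `make_key` is a hypothetical helper, and stdlib `json` with compact separators stands in for `orjson`.

```python
import json

PREFIX = "v1"  # bump on deploy to orphan every old key at once


def make_key(namespace: str, *args, **kwargs) -> str:
    """Build the cache key the same way for reads and for invalidation."""
    params = json.dumps([args, kwargs], separators=(",", ":"), sort_keys=True)
    return f"{PREFIX}:{namespace}:{params}"


read_key = make_key("product", 42)
delete_key = make_key("product", 42)
kw_key = make_key("product", product_id=42)  # keyword call yields a different key
```

Note that `get_product(42)` and `get_product(product_id=42)` map to different keys, so callers and invalidation code need to agree on one calling convention.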
TTL strategy
| Data | TTL | Why |
|------|-----|-----|
| User profile | 5 min | Rarely changes |
| Product catalog | 1 hour | Bulk updates infrequent |
| Pricing | 30 sec | Must reflect changes fast |
| Session | sliding | Refresh on every read |
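The fixed TTLs from the table can live in one place in code. The sketch below also adds a small random jitter so that keys written together do not all expire in the same instant; the jitter is an addition not discussed in the table, and `TTL_SECONDS` and `ttl_with_jitter` are hypothetical names.

```python
import random

# TTLs in seconds, mirroring the table; the dictionary keys are illustrative.
TTL_SECONDS = {
    "user_profile": 5 * 60,
    "product_catalog": 60 * 60,
    "pricing": 30,
}


def ttl_with_jitter(kind: str, spread: float = 0.1) -> int:
    """Return the base TTL for `kind`, randomized by +/- `spread` (10% by default)."""
    base = TTL_SECONDS[kind]
    factor = 1 + random.uniform(-spread, spread)
    return max(1, round(base * factor))


profile_ttl = ttl_with_jitter("user_profile")  # somewhere between 270 and 330
```

The sliding session row is not a fixed TTL. With redis-py it can be approximated by reading with `getex(key, ex=...)`, which resets the expiry on every read and requires Redis 6.2 or newer.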
Cache stampede protection
When a hot key expires, 1,000 requests can hit the DB at once. Use single-flight:
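import asyncio

async def get_with_lock(key: str, ttl: int, loader):
    val = await r.get(key)
    if val is not None:  # an empty or falsy cached payload is still a hit
        return orjson.loads(val)
    # SET NX is an atomic "acquire if absent"; ex=10 frees the lock if the holder crashes.
    lock = await r.set(f"{key}:lock", "1", nx=True, ex=10)
    if lock:
        try:
            v = await loader()
            await r.set(key, orjson.dumps(v), ex=ttl)
            return v
        finally:
            # Caveat: if loader() runs longer than 10 s this may delete another caller's lock.
            await r.delete(f"{key}:lock")
    # someone else is computing: short wait + retry
    await asyncio.sleep(0.05)
    val = await r.get(key)
    # Still cold after the wait: load directly instead of returning null to the caller.
    return orjson.loads(val) if val is not None else await loader()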
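The single-flight idea can be checked without a Redis server. The sketch below is a variant under stated assumptions: `FakeRedis` is a hypothetical in-memory stand-in that ignores TTLs, stdlib `json` replaces `orjson`, the client is passed in as an argument, and waiters poll in a bounded loop rather than retrying once.

```python
import asyncio
import json


class FakeRedis:
    """Hypothetical in-memory stand-in: get, set (with nx), delete. TTLs are ignored."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        return self.store.get(key)

    async def set(self, key, value, nx=False, ex=None):
        if nx and key in self.store:
            return None  # mirrors SET NX returning nil when the key exists
        self.store[key] = value
        return True

    async def delete(self, key):
        self.store.pop(key, None)


async def get_with_lock(r, key, ttl, loader, retries=20, delay=0.05):
    for _ in range(retries):
        val = await r.get(key)
        if val is not None:
            return json.loads(val)
        if await r.set(f"{key}:lock", "1", nx=True, ex=10):
            try:
                value = await loader()
                await r.set(key, json.dumps(value), ex=ttl)
                return value
            finally:
                await r.delete(f"{key}:lock")
        await asyncio.sleep(delay)  # another caller holds the lock; poll again
    return await loader()  # gave up waiting: load directly


loader_calls = 0


async def load_from_db():
    global loader_calls
    loader_calls += 1
    await asyncio.sleep(0.01)  # simulated slow query
    return {"id": 1}


async def main():
    r = FakeRedis()
    return await asyncio.gather(
        *(get_with_lock(r, "v1:product:1", 60, load_from_db) for _ in range(50))
    )


results = asyncio.run(main())
```

Fifty concurrent callers share one load in this run because only the first caller acquires the lock; the rest sleep and then read the cached value.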
Production best practices
- Always cap memory with `maxmemory` + `allkeys-lru`.
- Don't cache user-specific data under a shared key; include `user_id`.
- Monitor hit ratio. Below ~80% the cache may be hurting you.
- Avoid `KEYS *` in production; use `SCAN`.
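Two of these practices translate directly into small helpers. The sketch below assumes a `redis.asyncio` client is passed in as `client`; `hit_ratio` and `delete_by_prefix` are hypothetical names. The first works on the dictionary returned by `await client.info("stats")`, the second walks the keyspace with `scan_iter` instead of `KEYS`.

```python
import asyncio


def hit_ratio(stats: dict) -> float:
    """Compute the hit ratio from the keyspace_hits / keyspace_misses counters."""
    hits = int(stats.get("keyspace_hits", 0))
    misses = int(stats.get("keyspace_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


async def delete_by_prefix(client, prefix: str, batch: int = 500) -> int:
    """Delete keys matching `prefix*` incrementally with SCAN instead of KEYS.

    `prefix` is used as a glob pattern, so it should not contain glob
    metacharacters such as `*`, `?` or `[`.
    """
    deleted = 0
    chunk = []
    async for key in client.scan_iter(match=f"{prefix}*", count=batch):
        chunk.append(key)
        if len(chunk) >= batch:
            deleted += await client.delete(*chunk)
            chunk = []
    if chunk:
        deleted += await client.delete(*chunk)
    return deleted
```

A call such as `await delete_by_prefix(r, "v1:product:")` would clear one namespace without blocking the server the way a single `KEYS` call can on a large keyspace.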
Common mistakes
- Caching writes — silently serves stale data.
- Forgetting to invalidate on every write path. Centralize writes in a service layer.
- Using the same Redis instance for cache and queue without separating DBs.
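Centralizing writes in a service layer can be sketched as follows. Everything here is hypothetical: `ProductService` uses plain dictionaries as stand-ins for the database and for Redis, so that the shape of the pattern is visible without any infrastructure.

```python
import asyncio


class ProductService:
    """Single entry point for product reads and writes (hypothetical sketch)."""

    def __init__(self, db: dict, cache: dict, prefix: str = "v1"):
        self.db = db          # stand-in for a real database
        self.cache = cache    # stand-in for Redis
        self.prefix = prefix

    def _key(self, product_id: int) -> str:
        return f"{self.prefix}:product:{product_id}"

    async def get(self, product_id: int) -> dict:
        key = self._key(product_id)
        if key in self.cache:
            return self.cache[key]
        self.cache[key] = dict(self.db[product_id])
        return self.cache[key]

    async def _write(self, product_id: int, changes: dict) -> None:
        """Every mutation funnels through here, so invalidation cannot be forgotten."""
        self.db[product_id] = {**self.db.get(product_id, {}), **changes}
        self.cache.pop(self._key(product_id), None)

    async def rename(self, product_id: int, name: str) -> None:
        await self._write(product_id, {"name": name})

    async def reprice(self, product_id: int, price: float) -> None:
        await self._write(product_id, {"price": price})
```

Because `rename` and `reprice` never touch the store directly, adding a third write path means adding one more thin method rather than remembering another `delete` call.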
Tools used in this tutorial
The setup in this tutorial uses the following tools. Versions matter less than the role each one plays, so swap freely as long as you keep the responsibilities the same.
- Python 3.12+
- FastAPI
- Uvicorn / Gunicorn
- Pydantic v2
- SQLAlchemy 2.x
- Docker
- Kubernetes
Real-world production context
In production, a Python backend like this rarely runs as a single uvicorn process on a laptop. Teams build the app into a container image, push it to a registry, and run it behind a managed load balancer with autoscaling, health checks and centralized logging. Async FastAPI services typically use multiple Uvicorn workers managed by Gunicorn, a managed Postgres for state, Redis for cache and rate-limit counters, and an object store for uploads. The reason cloud hosting becomes a hard requirement is operational: zero-downtime rollouts, log aggregation, metrics, and TLS at the edge are not optional once real users depend on the API.
Recommended tools & deployment options
Once the tutorial works on your machine, the next question is *where do I run this for real?* These are the platforms most Python backend teams reach for:
- DigitalOcean — the simplest path from a working FastAPI container to a public URL. App Platform deploys directly from a Dockerfile, managed Postgres and Redis are one click away, and pricing is predictable. A common way to deploy the setup in this tutorial is using a cloud provider like DigitalOcean when you want to ship quickly without learning a full cloud SDK.
- AWS — the default for enterprise workloads. ECS Fargate or EKS run containers without you managing servers, RDS handles Postgres, and CloudWatch covers logs and metrics.
- Docker — the packaging format every modern deploy target understands. Build once, run the same image locally, in CI and in production.
- Kubernetes (managed: EKS, DOKS, GKE) — the right choice once you have more than a handful of services, need rolling updates, autoscaling and policy-driven networking.
A VPS or managed cloud service is required to run this architecture end-to-end — uvicorn --reload is for development, not for serving traffic.
FAQ
fastapi-cache vs custom? The fastapi-cache2 package is fine for simple cases. Roll your own when you need stampede protection or non-trivial keys.
Common mistakes
1. Copy-pasting code without understanding the trade-offs
It's tempting to ship a snippet from a blog post into production, but Python & FastAPI patterns only work when the failure modes are understood. Always reason about timeouts, retries, and consistency.
2. Skipping observability from day one
Structured logs, metrics, and traces are not optional. Wire them in before you ship — debugging Python & FastAPI systems without them is painful and expensive.
3. Optimizing too early
Premature caching, sharding, or microservice extraction adds operational cost. Validate the bottleneck with real measurements first.
4. Ignoring security defaults
Secrets in env files, open management ports, missing RBAC — these are the most common production incidents. Treat security as part of the definition of done.
Production best practices
Apply these before promoting the caching setup from this tutorial to a real production environment.
Scalability
Design Python & FastAPI services to scale horizontally. Keep request handlers stateless, push session and cache state to external stores (Redis, the database), and benchmark p95/p99 latency under realistic load before tuning.
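For the benchmarking step, tail latencies can be computed from raw samples with the standard library. This is a minimal sketch: `latency_percentiles` is a hypothetical helper, and it assumes latencies were collected in milliseconds by whatever load-testing tool is in use.

```python
import statistics


def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Return p50, p95 and p99 from a list of at least two latency samples."""
    # n=100 yields 99 cut points; index i-1 holds the i-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


report = latency_percentiles(list(range(1, 102)))  # samples 1..101 ms
```

Comparing this report with the cache enabled and disabled, under the same load, shows whether caching actually moves the tail rather than only the average.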
Monitoring & Observability
Emit metrics (RED/USE), structured JSON logs, and distributed traces from day one. Wire dashboards and alerts to SLOs you actually care about — error rate, latency, saturation — not vanity metrics.
Logging
Log with correlation IDs, never log secrets or PII, and centralize logs (ELK, Loki, CloudWatch). Use levels deliberately: INFO for state changes, WARN for recoverable issues, ERROR for incidents.
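One possible shape for correlation-ID logging with only the standard library is sketched below. `correlation_id`, `JsonFormatter` and `build_logger` are hypothetical names; in a FastAPI app the context variable would typically be set in a middleware, which is not shown here.

```python
import contextvars
import json
import logging

# Holds the current request's ID; "-" is used when no request is active.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object that includes the correlation ID."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        })


def build_logger(name: str = "app", stream=None) -> logging.Logger:
    handler = logging.StreamHandler(stream)  # defaults to stderr
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

With this in place, a cache miss logged during a request carries the same ID as the database query it triggers, which makes the two easy to join in a log backend.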
Security
Apply least-privilege IAM, rotate secrets through a vault, validate every input, and patch dependencies on a schedule. For HTTP services, enable TLS everywhere and set sensible security headers.
Testing
Layer unit, integration, and contract tests. Run them in CI on every PR, and add smoke tests post-deploy. For Python & FastAPI systems, also run chaos and load tests before a major release.
Reliability & Rollouts
Ship with health checks, readiness probes, graceful shutdown, and a rollback strategy. Prefer canary or blue/green deploys over big-bang releases.
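A readiness probe for the cache can be as small as a bounded `PING`. The sketch below assumes `client` is a `redis.asyncio` client (or anything with an async `ping()`); `redis_ready` is a hypothetical helper, and wiring it into an HTTP route is left to the application.

```python
import asyncio


async def redis_ready(client, timeout: float = 0.5) -> bool:
    """Return True if `client.ping()` answers within `timeout` seconds."""
    try:
        return bool(await asyncio.wait_for(client.ping(), timeout))
    except Exception:
        # Timeouts, connection errors and anything else all mean "not ready".
        return False
```

Whether a failing cache should mark the whole service as not ready is a design choice: if endpoints can fall back to the database, reporting degraded but ready may be the better answer.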
Frequently asked questions
Is this tutorial up to date?
Yes. This tutorial was last reviewed and updated on May 26, 2026. We revisit popular Python & FastAPI tutorials regularly to keep them aligned with current best practices.
What level is this tutorial aimed at?
It is written for working developers with some backend experience. Beginners can still follow along, and senior engineers will find production-grade patterns and trade-off discussions.
Do I need to follow every step in order?
The walkthrough is sequential because each step depends on the previous one. If you only need a specific concept, the table of contents at the top of the article lets you jump straight to that section.
Where can I find the source code?
Code samples are inlined in the tutorial. When a companion repository is published it will be linked at the top of this page.
