02. API Gateway

API Gateway & Rate Limiter LLD

A gateway-first backend design: request identity, JWT/API key validation, Redis token buckets, atomic quota checks, service routing, 429 rejection, and abuse visibility.

Gateway Entry

Every client request lands at the gateway before it reaches product services, so normalization, tracing, and policy checks start at one boundary.

Identity Guard

JWTs, API keys, tenant ids, and client IPs are validated and normalized before the limiter decides which quota applies.

Rate Limit State

Redis stores token-bucket counters and TTLs per user, API key, tenant, route, or IP depending on the request policy.
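As an illustration of how a quota dimension could map to a Redis key, the sketch below builds a key from a normalized principal. The `Principal` shape, the `rl:` prefix, and the fallback to the client IP are assumptions made for this example; the source does not specify a key format.

```typescript
// Hypothetical types: the source does not define these shapes.
type QuotaDimension = "user" | "apiKey" | "tenant" | "ip";

interface Principal {
  userId?: string;
  apiKeyId?: string;
  tenantId?: string;
  clientIp: string;
}

// Builds a limiter key such as "rl:user:u1:orders.list".
// Falls back to the client IP when the requested dimension is missing,
// so anonymous traffic still lands in a bucket.
function buildRateLimitKey(
  principal: Principal,
  dimension: QuotaDimension,
  routeId: string,
): string {
  const byDimension: Record<QuotaDimension, string | undefined> = {
    user: principal.userId,
    apiKey: principal.apiKeyId,
    tenant: principal.tenantId,
    ip: principal.clientIp,
  };
  const id = byDimension[dimension];
  if (id === undefined || id === "") {
    return `rl:ip:${principal.clientIp}:${routeId}`;
  }
  return `rl:${dimension}:${id}:${routeId}`;
}
```

Including the route id in the key keeps per-route limits independent; a policy that wants one global bucket per user could pass a constant route id instead.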

Routing Decision

Allowed requests are routed to backend services; exhausted buckets receive deterministic 429 responses with retry metadata.

Flow Canvas

API Gateway Execution Map

Normalize identity, validate auth, check Redis token bucket, route allowed traffic, reject exhausted requests, and emit signals

Receive request (entry): Request enters gateway.
Normalize client identity (service): Build principal context.
Validate JWT or API key (decision): Reject bad credentials.
Build rate limit key (service): Pick quota dimension.
Load token bucket (state): Read limiter state.
Refill tokens (service): Restore allowed quota.
Consume token atomically (lock): Atomic quota decrement.
Bucket empty? (decision): Decide allow or 429.
Reject with 429 (recovery): Stop before backend.
Route allowed request (service): Proxy to upstream.
Backend service response (external): Upstream returns result.
Emit logs and metrics (state): Log every outcome.
Update abuse signals (recovery): Feed protection rules.

13 Nodes

12 Connections

Static map based on gateway, limiter, routing, and observability modules

LLD Note

Gateway Responsibilities

The API gateway is the shared request boundary, so it should own cross-cutting work that every backend service would otherwise duplicate: authentication checks, request ids, route matching, rate policy selection, timeout rules, and response normalization.

The gateway should not own product business logic. Its job is to decide whether a request is trusted, within quota, and routable, then forward a clean request context to the selected service.

Normalize request identity before any quota key is built.
Reject malformed or unauthorized traffic before backend resources are touched.
Keep route policy declarative so service teams can change limits without changing gateway code.
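One way to keep route policy declarative is to hold it as data and let the gateway do a generic lookup. The sketch below is illustrative only: the `RoutePolicy` fields, the upstream names, and the longest-prefix matching rule are assumptions, not the contents of the project's route registry.

```typescript
// Hypothetical policy shape; field names are assumptions for this sketch.
interface RoutePolicy {
  method: string;
  pathPrefix: string;
  upstream: string;
  capacity: number;        // token bucket size
  refillPerSecond: number; // long-term request rate
  timeoutMs: number;
  authRequired: boolean;
}

// Policy lives in data, so limits change without touching gateway code.
const routePolicies: RoutePolicy[] = [
  { method: "GET", pathPrefix: "/orders", upstream: "orders-service",
    capacity: 60, refillPerSecond: 10, timeoutMs: 2000, authRequired: true },
  { method: "POST", pathPrefix: "/orders", upstream: "orders-service",
    capacity: 20, refillPerSecond: 2, timeoutMs: 3000, authRequired: true },
  { method: "POST", pathPrefix: "/orders/export", upstream: "export-service",
    capacity: 5, refillPerSecond: 0.1, timeoutMs: 10000, authRequired: true },
];

// Returns the policy with the longest matching path prefix, or undefined.
// Plain prefix matching is a simplification: "/orders" also matches "/ordersX".
function matchRoute(
  method: string,
  path: string,
  policies: RoutePolicy[],
): RoutePolicy | undefined {
  let best: RoutePolicy | undefined;
  for (const policy of policies) {
    if (policy.method !== method.toUpperCase()) continue;
    if (path.indexOf(policy.pathPrefix) !== 0) continue;
    if (best === undefined || policy.pathPrefix.length > best.pathPrefix.length) {
      best = policy;
    }
  }
  return best;
}
```

A request with no matching policy returns `undefined`, which the gateway can treat as a 404 or route to a strict default policy.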

LLD Note

Token Bucket Rate Limiting

A token bucket allows controlled bursts while preserving a long-term refill rate. Each request consumes one or more tokens; elapsed time refills tokens up to a configured capacity.

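This is a better fit for user-facing APIs than a fixed window counter. A fixed window lets a client spend a full quota at the end of one window and another at the start of the next, while a token bucket caps any burst at the bucket capacity. Because refill is continuous, the wait until the next token follows directly from the refill rate, which gives clients a precise Retry-After value.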

Use route-specific capacity and refill rate for expensive endpoints.
Return Retry-After and remaining quota headers when a request is blocked.
Apply stricter fallback limits for anonymous or untrusted identities.
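The refill and consume arithmetic can be sketched as a pure function, which makes the behavior easy to test without Redis. The type and function names below are assumptions for this example, and the in-memory state stands in for whatever the real limiter stores.

```typescript
interface BucketState {
  tokens: number;
  lastRefillMs: number;
}

interface BucketPolicy {
  capacity: number;
  refillPerSecond: number;
}

interface ConsumeResult {
  allowed: boolean;
  state: BucketState;
  remaining: number;         // whole tokens left after this request
  retryAfterSeconds: number; // 0 when allowed
}

// Refills by elapsed time (capped at capacity), then tries to take `cost` tokens.
function consumeTokens(
  state: BucketState,
  policy: BucketPolicy,
  nowMs: number,
  cost: number = 1,
): ConsumeResult {
  const elapsedSeconds = Math.max(0, nowMs - state.lastRefillMs) / 1000;
  const tokens = Math.min(
    policy.capacity,
    state.tokens + elapsedSeconds * policy.refillPerSecond,
  );
  if (tokens >= cost) {
    const left = tokens - cost;
    return {
      allowed: true,
      state: { tokens: left, lastRefillMs: nowMs },
      remaining: Math.floor(left),
      retryAfterSeconds: 0,
    };
  }
  // Time until enough tokens exist, rounded up to whole seconds.
  const retryAfterSeconds = Math.ceil((cost - tokens) / policy.refillPerSecond);
  return {
    allowed: false,
    state: { tokens, lastRefillMs: nowMs },
    remaining: Math.floor(tokens),
    retryAfterSeconds,
  };
}
```

With a capacity of 5 and a refill rate of 1 token per second, five immediate requests pass, the sixth is blocked with a one-second retry hint, and two seconds later two tokens are available again.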

LLD Note

Redis Atomicity and TTL

The load, refill, and consume steps must be atomic. A Redis Lua script or equivalent atomic command sequence prevents concurrent requests from reading the same token count and overspending quota.

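Limiter keys should carry a TTL so Redis automatically releases cold identities. The TTL should be at least the time an empty bucket needs to refill to capacity (capacity divided by refill rate); after that an expired key is indistinguishable from a full bucket, so no quota history is lost and inactive keys are not kept forever.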

Store token count and last refill timestamp together.
Expire idle buckets automatically to control Redis memory growth.
Use explicit fail-open or fail-closed policy per route when Redis is unavailable.
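A minimal sketch of the atomic step, assuming the bucket is stored as a Redis hash with `tokens` and `ts` fields (an assumption; the source does not describe the storage layout). The Lua text is held in a TypeScript constant so it can be loaded with EVAL or EVALSHA by whichever Redis client the gateway uses; no client API is shown here. The multi-field HSET form requires Redis 4.0 or later.

```typescript
// KEYS[1] = bucket key
// ARGV = capacity, refill per second, now (ms), cost, ttl (ms)
// Returns { allowed (0 or 1), whole tokens remaining }.
const TOKEN_BUCKET_LUA = `
local capacity = tonumber(ARGV[1])
local refill_per_sec = tonumber(ARGV[2])
local now_ms = tonumber(ARGV[3])
local cost = tonumber(ARGV[4])

local data = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local tokens = tonumber(data[1])
local ts = tonumber(data[2])
if tokens == nil or ts == nil then
  -- Missing or expired key: start from a full bucket.
  tokens = capacity
  ts = now_ms
end

local elapsed = math.max(0, now_ms - ts) / 1000
tokens = math.min(capacity, tokens + elapsed * refill_per_sec)

local allowed = 0
if tokens >= cost then
  tokens = tokens - cost
  allowed = 1
end

redis.call('HSET', KEYS[1], 'tokens', tokens, 'ts', now_ms)
redis.call('PEXPIRE', KEYS[1], ARGV[5])
return { allowed, math.floor(tokens) }
`;

// TTL long enough for an empty bucket to refill completely, plus slack.
// After that an expired key and a full bucket behave identically.
function bucketTtlMs(
  capacity: number,
  refillPerSecond: number,
  slackMs: number = 1000,
): number {
  return Math.ceil((capacity / refillPerSecond) * 1000) + slackMs;
}
```

Passing the timestamp from the gateway is a simplification that assumes gateway clocks are reasonably synchronized; reading the time inside Redis is an alternative when they are not.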

LLD Note

Routing, Observability, and Abuse Handling

Allowed requests are routed only after identity and quota checks succeed. The gateway attaches trusted identity headers and enforces timeout limits so backend services receive a predictable contract.

Every decision should emit structured logs and metrics. The same signals that power dashboards also feed abuse rules for repeated 401, 403, 429, timeout, and high-cardinality source patterns.

Emit one trace id across gateway and backend service boundaries.
Track allowed, blocked, auth-failed, timeout, and upstream-error outcomes separately.
Use abuse signals to block keys or reduce quotas without redeploying product services.
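The identity contract and the outcome labels could look like the sketch below. The header names (`x-gateway-user-id`, `x-gateway-tenant-id`, `x-trace-id`) and the status-to-outcome mapping are assumptions chosen for illustration, not names taken from the project.

```typescript
type GatewayOutcome =
  | "allowed"
  | "blocked"
  | "auth_failed"
  | "timeout"
  | "upstream_error";

interface TrustedIdentity {
  userId: string;
  tenantId: string;
}

// Copies client headers, drops any client-supplied x-gateway-* values so
// identity cannot be spoofed, then attaches the gateway's own values.
function buildUpstreamHeaders(
  incoming: Record<string, string>,
  identity: TrustedIdentity,
  traceId: string,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const headerName of Object.keys(incoming)) {
    const lower = headerName.toLowerCase();
    if (lower.indexOf("x-gateway-") === 0) continue;
    out[lower] = incoming[headerName];
  }
  out["x-gateway-user-id"] = identity.userId;
  out["x-gateway-tenant-id"] = identity.tenantId;
  out["x-trace-id"] = traceId;
  return out;
}

// Maps the final status to one metric label so each outcome is counted separately.
function classifyOutcome(statusCode: number, upstreamTimedOut: boolean): GatewayOutcome {
  if (upstreamTimedOut) return "timeout";
  if (statusCode === 401 || statusCode === 403) return "auth_failed";
  if (statusCode === 429) return "blocked";
  if (statusCode >= 500) return "upstream_error";
  return "allowed";
}
```

Stripping the reserved prefix before attaching identity means a backend can trust those headers as long as it is only reachable through the gateway.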

Source References

Gateway entrypoint

src/infrastructure/http/gateway/api-gateway.ts

Request normalization, correlation id creation, policy lookup, and response shaping.

Auth middleware

src/infrastructure/http/middlewares/auth.middleware.ts

JWT/API key validation, scopes, tenant context, and trusted principal extraction.

Rate limiter service

src/infrastructure/rate-limit/redis-token-bucket.ts

Redis token bucket refill, atomic consume script, TTL handling, and Retry-After calculation.

Route registry

src/infrastructure/http/gateway/route-registry.ts

Route matching, upstream target selection, route-specific timeout, and quota policy metadata.

Observability pipeline

src/infrastructure/observability/request-telemetry.ts

Structured logs, traces, metrics, limiter decisions, and upstream health signals.

Abuse policy

src/domain/security/services/AbusePolicyService.ts

Repeated auth failures, quota exhaustion, IP patterns, and dynamic block or throttle decisions.

Design Rule

The gateway should reject untrusted or over-quota traffic before any product service is called, while emitting enough telemetry to explain every allow, block, timeout, and abuse decision.

Failure Modes

Edge cases handled

Redis unavailable

Trigger

The gateway cannot read or update the token bucket state.

System response

Public and expensive routes fail closed with 503 or 429, while trusted internal routes can fail open with strict logging based on policy.
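The per-route choice can be sketched as a wrapper around the limiter call. The names below are assumptions, and the check is synchronous only to keep the example short; a real Redis call would be asynchronous. This sketch uses 503 for the fail-closed case.

```typescript
type FailureMode = "fail_open" | "fail_closed";

interface LimiterDecision {
  allowed: boolean;
  status?: number;   // set only when the request is rejected
  degraded: boolean; // true when the limiter itself was unavailable
}

// `check` stands in for the Redis-backed quota check and may throw
// when Redis is unreachable.
function checkWithFallback(check: () => boolean, mode: FailureMode): LimiterDecision {
  try {
    return check()
      ? { allowed: true, degraded: false }
      : { allowed: false, status: 429, degraded: false };
  } catch {
    if (mode === "fail_open") {
      // Trusted internal route: let it through, but flag it for strict logging.
      return { allowed: true, degraded: true };
    }
    // Public or expensive route: reject rather than run without a limiter.
    return { allowed: false, status: 503, degraded: true };
  }
}
```

The `degraded` flag lets the telemetry layer distinguish a normal allow from one that bypassed the limiter.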

Burst traffic from one IP

Trigger

Many anonymous requests arrive from the same source before identity is established.

System response

The gateway uses an IP-based pre-auth bucket and returns 429 before auth or backend services are overloaded.

Valid user exceeds quota

Trigger

The principal is authenticated but the selected user/API-key bucket has no tokens left.

System response

The request stops at the gateway with 429, Retry-After, remaining quota headers, and no upstream call.
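A sketch of the rejection payload is shown below. `Retry-After` is a standard HTTP header; the `X-RateLimit-Limit` and `X-RateLimit-Remaining` names are a common convention assumed here, since the source does not name its quota headers, and the JSON body shape is likewise hypothetical.

```typescript
interface GatewayReply {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Builds the 429 reply; the caller returns it without contacting any upstream.
function buildRateLimitedReply(
  retryAfterSeconds: number,
  limit: number,
  remaining: number,
): GatewayReply {
  // Retry-After carries whole seconds; never advertise less than one second.
  const retryAfter = Math.max(1, Math.ceil(retryAfterSeconds));
  return {
    status: 429,
    headers: {
      "Retry-After": String(retryAfter),
      "X-RateLimit-Limit": String(limit),
      "X-RateLimit-Remaining": String(Math.max(0, remaining)),
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ error: "rate_limited", retryAfterSeconds: retryAfter }),
  };
}
```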

Invalid or missing token

Trigger

JWT signature, expiry, issuer, audience, API key, or scope validation fails.

System response

The gateway returns 401 or 403 and does not create route-level quota state for an untrusted principal.
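The claim checks can be sketched as a pure function over already-decoded claims. This example assumes signature verification has been done beforehand by a JWT library and is not shown; `iss`, `aud`, and `exp` are the registered JWT claim names, while the space-separated `scope` claim and the `AuthExpectation` shape are assumptions for illustration.

```typescript
interface JwtClaims {
  iss: string;
  aud: string | string[];
  exp: number;    // seconds since epoch
  scope?: string; // space-separated scopes (assumed format)
}

interface AuthExpectation {
  issuer: string;
  audience: string;
  requiredScope?: string;
}

type AuthResult =
  | { ok: true }
  | { ok: false; status: 401 | 403; reason: string };

// 401: the credential itself is not acceptable.
// 403: the credential is valid but lacks the scope for this route.
function checkClaims(
  claims: JwtClaims,
  expected: AuthExpectation,
  nowSeconds: number,
): AuthResult {
  if (claims.exp <= nowSeconds) {
    return { ok: false, status: 401, reason: "expired" };
  }
  if (claims.iss !== expected.issuer) {
    return { ok: false, status: 401, reason: "issuer" };
  }
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (audiences.indexOf(expected.audience) === -1) {
    return { ok: false, status: 401, reason: "audience" };
  }
  if (expected.requiredScope !== undefined) {
    const scopes = (claims.scope ?? "").split(" ");
    if (scopes.indexOf(expected.requiredScope) === -1) {
      return { ok: false, status: 403, reason: "scope" };
    }
  }
  return { ok: true };
}
```

Returning a reason alongside the status gives the telemetry and abuse layers a stable label for repeated failures without exposing details to the client.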

Backend service timeout

Trigger

The request passed quota and routing but the upstream service exceeds its deadline.

System response

The gateway returns a timeout response, emits upstream health metrics, and does not refund the consumed token unless the route explicitly allows it.

Gateway partial outage

Trigger

One gateway instance loses Redis/auth connectivity or reports elevated local errors.

System response

Load balancer health checks drain the bad instance while other gateways continue serving with shared Redis quota state.

Request State Machine

Request, limiter, and routing states

Request State	Identity	Limiter	Routing	Gateway Note
Incoming	Unknown	Not checked	Not routed	The gateway has accepted bytes but has not yet trusted the caller or selected a quota policy.
Authenticated	Trusted principal	Key selected	Not routed	JWT or API key is valid and the request has a stable user, tenant, API key, IP, or route limiter key.
RateLimited	Trusted or anonymous	Bucket empty	Rejected	No backend request is made; the client receives 429 and retry metadata.
Routed	Trusted principal	Token consumed	Upstream selected	Gateway attaches identity context and proxies the request to the matched backend service.
Rejected	Invalid or blocked	Not consumed	Not routed	401, 403, or a policy block stops the request before a quota token or any backend resource is spent.
TimedOut	Trusted principal	Token consumed	Upstream timeout	Gateway records upstream latency and returns timeout response while preserving the request trace.
Completed	Trusted principal	Token consumed	Response returned	Gateway emits final status, latency, route, quota decision, and upstream health metrics.
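The states in the table can be encoded as a small transition map so that illegal jumps fail loudly. The allowed transitions below are inferred from the table and the failure modes, not stated by the source, so treat them as an assumption.

```typescript
type GatewayRequestState =
  | "Incoming"
  | "Authenticated"
  | "RateLimited"
  | "Routed"
  | "Rejected"
  | "TimedOut"
  | "Completed";

// Inferred transitions. RateLimited, Rejected, TimedOut, and Completed are terminal.
const stateTransitions: Record<GatewayRequestState, GatewayRequestState[]> = {
  Incoming: ["Authenticated", "RateLimited", "Rejected"], // RateLimited: pre-auth IP bucket
  Authenticated: ["Routed", "RateLimited", "Rejected"],   // Rejected: policy block
  Routed: ["Completed", "TimedOut"],
  RateLimited: [],
  Rejected: [],
  TimedOut: [],
  Completed: [],
};

// Returns the next state, or throws when the transition is not allowed.
function advanceState(
  from: GatewayRequestState,
  to: GatewayRequestState,
): GatewayRequestState {
  if (stateTransitions[from].indexOf(to) === -1) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```

For example, `advanceState("Incoming", "Authenticated")` succeeds, while `advanceState("Completed", "Routed")` throws because a completed request cannot be routed again.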