API Gateway & Rate Limiter LLD

A gateway-first backend design: request identity, JWT/API key validation, Redis token buckets, atomic quota checks, service routing, 429 rejection, and abuse visibility.

API Gateway/Rate limiting and routing

Gateway Entry

Every client request lands at the gateway before it reaches product services, so normalization, tracing, and policy checks start at one boundary.

Identity Guard

JWTs, API keys, tenant ids, and client IPs are validated and normalized before the limiter decides which quota applies.

Rate Limit State

Redis stores token-bucket counters and TTLs per user, API key, tenant, route, or IP depending on the request policy.
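
As a rough illustration of how a quota dimension could map to a Redis key, the sketch below builds one key per principal and route. The `Principal` type, the `rl:` prefix, and the fallback to the client IP for anonymous traffic are assumptions made for this example, not part of the design above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principal:
    """Normalized identity context (hypothetical shape)."""
    user_id: Optional[str] = None
    api_key_id: Optional[str] = None
    tenant_id: Optional[str] = None
    client_ip: Optional[str] = None


def build_limit_key(principal: Principal, route: str, dimension: str) -> str:
    """Return the limiter key for the bucket selected by `dimension`."""
    values = {
        "user": principal.user_id,
        "api_key": principal.api_key_id,
        "tenant": principal.tenant_id,
        "ip": principal.client_ip,
    }
    if dimension not in values:
        raise ValueError(f"unknown dimension: {dimension}")
    value = values[dimension]
    if value is None:
        # Assumption: anonymous traffic falls back to an IP-based bucket.
        dimension, value = "ip", principal.client_ip
    if value is None:
        raise ValueError("request has no usable identity for rate limiting")
    return f"rl:{dimension}:{value}:{route}"
```

Keeping the dimension name inside the key means a user bucket and an IP bucket can never collide, even if the raw values happen to match.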

Routing Decision

Allowed requests are routed to backend services; exhausted buckets receive deterministic 429 responses with retry metadata.

Flow Canvas

API Gateway Execution Map

Normalize identity, validate auth, check Redis token bucket, route allowed traffic, reject exhausted requests, and emit signals.

  • Receive request (entry): Request enters gateway.
  • Normalize client identity (service): Build principal context.
  • Validate JWT or API key (decision): Reject bad credentials.
  • Build rate limit key (service): Pick quota dimension.
  • Load token bucket (state): Read limiter state.
  • Refill tokens (service): Restore allowed quota.
  • Consume token atomically (lock): Atomic quota decrement.
  • Bucket empty? (decision): Decide allow or 429.
  • Reject with 429 (recovery): Stop before backend.
  • Route allowed request (service): Proxy to upstream.
  • Backend service response (external): Upstream returns result.
  • Emit logs and metrics (state): Log every outcome.
  • Update abuse signals (recovery): Feed protection rules.

13 nodes, 12 connections. Static map based on gateway, limiter, routing, and observability modules.

LLD Note

Gateway Responsibilities

The API gateway is the shared request boundary, so it should own cross-cutting work that every backend service would otherwise duplicate: authentication checks, request ids, route matching, rate policy selection, timeout rules, and response normalization.

The gateway should not own product business logic. Its job is to decide whether a request is trusted, within quota, and routable, then forward a clean request context to the selected service.

  • Normalize request identity before any quota key is built.
  • Reject malformed or unauthorized traffic before backend resources are touched.
  • Keep route policy declarative so service teams can change limits without changing gateway code.
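
One way to keep route policy declarative is to hold it as plain data that the gateway only looks up. The sketch below assumes a hypothetical `RoutePolicy` record and made-up routes, upstream names, and limits; in practice the table would more likely be loaded from configuration than written in code.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class RoutePolicy:
    """Per-route settings (all field names are illustrative)."""
    upstream: str
    dimension: str          # which identity the quota is keyed on
    capacity: int           # maximum burst size in tokens
    refill_per_sec: float   # long-term allowed rate
    timeout_sec: float
    fail_open: bool = False


# Hypothetical policy table; service teams would edit data like this,
# not gateway code.
POLICIES: Dict[Tuple[str, str], RoutePolicy] = {
    ("GET", "/orders"): RoutePolicy("orders-svc", "user", 60, 1.0, 2.0),
    ("POST", "/reports"): RoutePolicy("reports-svc", "tenant", 5, 0.1, 10.0),
}


def select_policy(method: str, path: str) -> Optional[RoutePolicy]:
    """Return the policy for a route, or None if the route is unknown.

    A None result would let the gateway answer 404 without touching
    the limiter or any backend.
    """
    return POLICIES.get((method.upper(), path))
```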

LLD Note

Token Bucket Rate Limiting

A token bucket allows controlled bursts while preserving a long-term refill rate. Each request consumes one or more tokens; elapsed time refills tokens up to a configured capacity.

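This is a better fit for user-facing APIs than a fixed window counter. A fixed window can admit up to twice its limit in a short span that straddles a window boundary, while a token bucket never admits more than its capacity in one burst. Because tokens refill at a known rate, the gateway can also tell a blocked client exactly how long to wait.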

  • Use route-specific capacity and refill rate for expensive endpoints.
  • Return Retry-After and remaining quota headers when a request is blocked.
  • Apply stricter fallback limits for anonymous or untrusted identities.
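
The refill-then-consume behaviour described above can be sketched in a few lines. This is a single-process, in-memory version meant only to show the arithmetic; the class and field names are assumptions, and the caller passes the current time explicitly so the logic stays deterministic.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TokenBucket:
    capacity: float        # maximum tokens the bucket can hold
    refill_per_sec: float  # tokens restored per second
    tokens: float          # current token count
    last_refill: float     # timestamp of the last refill, in seconds

    def try_consume(self, now: float, cost: float = 1.0) -> Tuple[bool, float]:
        """Refill for elapsed time, then try to spend `cost` tokens.

        Returns (allowed, retry_after_seconds); retry_after is 0.0
        when the request is allowed.
        """
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True, 0.0
        # Time until enough tokens have accumulated for this request.
        return False, (cost - self.tokens) / self.refill_per_sec
```

With a capacity of 2 and a refill rate of one token per second, two back-to-back requests pass, a third is blocked with a one-second wait, and the bucket is usable again once that time has elapsed.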

LLD Note

Redis Atomicity and TTL

The load, refill, and consume steps must be atomic. A Redis Lua script or equivalent atomic command sequence prevents concurrent requests from reading the same token count and overspending quota.

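Limiter keys should carry a TTL so Redis automatically releases cold identities. The TTL should be at least the time a bucket needs to refill from empty to full (capacity divided by refill rate): once that much idle time has passed the bucket would be full anyway, so expiring it loses no quota history, and inactive keys are not kept forever.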

  • Store token count and last refill timestamp together.
  • Expire idle buckets automatically to control Redis memory growth.
  • Use explicit fail-open or fail-closed policy per route when Redis is unavailable.
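
A minimal sketch of the atomic load, refill, and consume step as a Redis Lua script, called through a client object that exposes `eval(script, numkeys, *keys_and_args)` as redis-py does. The hash field names (`tokens`, `ts`), the helper names, the TTL slack factor, and the choice to pass the gateway's clock as an argument are all assumptions for illustration; gateway clocks are assumed to be roughly synchronized.

```python
import math

# Runs entirely inside Redis, so no other command can interleave
# between reading the bucket and writing it back.
TOKEN_BUCKET_LUA = """
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local cost = tonumber(ARGV[4])

local state = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local tokens = tonumber(state[1])
local ts = tonumber(state[2])
if tokens == nil or ts == nil then
  tokens = capacity   -- a missing key behaves like a full bucket
  ts = now
end

local elapsed = math.max(0, now - ts)
tokens = math.min(capacity, tokens + elapsed * refill_rate)

local allowed = 0
if tokens >= cost then
  tokens = tokens - cost
  allowed = 1
end

redis.call('HSET', KEYS[1], 'tokens', tokens, 'ts', now)
redis.call('EXPIRE', KEYS[1], ARGV[5])
return {allowed, tostring(tokens)}
"""


def bucket_ttl_seconds(capacity: float, refill_per_sec: float, slack: float = 2.0) -> int:
    """TTL covering a full refill from empty, times an assumed slack factor."""
    return math.ceil(capacity / refill_per_sec * slack)


def consume(client, key, capacity, refill_per_sec, now, cost=1):
    """Run the script once and return (allowed, tokens_left)."""
    ttl = bucket_ttl_seconds(capacity, refill_per_sec)
    allowed, tokens = client.eval(
        TOKEN_BUCKET_LUA, 1, key, capacity, refill_per_sec, now, cost, ttl
    )
    if isinstance(tokens, bytes):  # clients without response decoding return bytes
        tokens = tokens.decode()
    return allowed == 1, float(tokens)
```

The remaining token count is returned as a string because Redis truncates Lua numbers to integers in script replies.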

LLD Note

Routing, Observability, and Abuse Handling

Allowed requests are routed only after identity and quota checks succeed. The gateway attaches trusted identity headers and enforces timeout limits so backend services receive a predictable contract.

Every decision should emit structured logs and metrics. The same signals that power dashboards also feed abuse rules for repeated 401, 403, 429, timeout, and high-cardinality source patterns.

  • Emit one trace id across gateway and backend service boundaries.
  • Track allowed, blocked, auth-failed, timeout, and upstream-error outcomes separately.
  • Use abuse signals to block keys or reduce quotas without redeploying product services.
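
To make the "trusted identity headers" idea concrete, the sketch below strips anything a client sent in the gateway's own header namespace before attaching verified values. The `X-Gw-` prefix and the three header names are hypothetical; the point is only that backends can trust these headers because clients cannot set them.

```python
import uuid
from typing import Dict, Optional

TRUSTED_PREFIX = "x-gw-"  # hypothetical namespace reserved for the gateway


def build_upstream_headers(
    client_headers: Dict[str, str],
    user_id: str,
    tenant_id: str,
    trace_id: Optional[str] = None,
) -> Dict[str, str]:
    """Return headers for the upstream call with spoofed identity removed."""
    # Drop client-supplied headers that imitate gateway-owned ones.
    headers = {
        name: value
        for name, value in client_headers.items()
        if not name.lower().startswith(TRUSTED_PREFIX)
    }
    headers["X-Gw-User-Id"] = user_id
    headers["X-Gw-Tenant-Id"] = tenant_id
    # One trace id follows the request across gateway and backend.
    headers["X-Gw-Trace-Id"] = trace_id or uuid.uuid4().hex
    return headers
```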

Failure Modes

Edge cases handled

Redis unavailable

Trigger

The gateway cannot read or update the token bucket state.

System response

Public and expensive routes fail closed with 503 or 429, while trusted internal routes can fail open with strict logging based on policy.
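
The per-route choice between failing open and failing closed can be expressed as a thin wrapper around the limiter call. This is a sketch under assumptions: `LimiterUnavailable` stands in for whatever error the Redis client raises, the reason strings are invented, and 503 is picked here for the fail-closed case.

```python
from typing import Callable, Optional, Tuple


class LimiterUnavailable(Exception):
    """Hypothetical error raised when limiter state cannot be reached."""


def decide(check: Callable[[], bool], fail_open: bool) -> Tuple[bool, Optional[int], str]:
    """Return (proceed, reject_status, reason) for one request.

    `check` performs the quota check and returns True when a token
    was consumed.
    """
    try:
        allowed = check()
    except LimiterUnavailable:
        if fail_open:
            # Trusted routes continue, but the reason is kept for logging.
            return True, None, "limiter_unavailable_fail_open"
        return False, 503, "limiter_unavailable_fail_closed"
    if allowed:
        return True, None, "allowed"
    return False, 429, "quota_exhausted"
```

Returning the reason alongside the decision lets the same value feed the structured log, so fail-open traffic remains visible.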

Burst traffic from one IP

Trigger

Many anonymous requests arrive from the same source before identity is established.

System response

The gateway uses an IP-based pre-auth bucket and returns 429 before auth or backend services are overloaded.

Valid user exceeds quota

Trigger

The principal is authenticated but the selected user/API-key bucket has no tokens left.

System response

The request stops at the gateway with 429, Retry-After, remaining quota headers, and no upstream call.
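
The retry metadata on a 429 can be derived from the bucket state at the moment of rejection. In the sketch below, `Retry-After` is the standard HTTP header expressed in whole seconds; the `X-RateLimit-*` names are a common convention assumed here rather than something the design specifies.

```python
import math
from typing import Dict, Tuple


def rate_limited_response(
    capacity: int, tokens: float, refill_per_sec: float, cost: float = 1.0
) -> Tuple[int, Dict[str, str]]:
    """Build the status and headers for a request blocked by an empty bucket."""
    # Seconds until the bucket holds enough tokens for this request.
    wait = max(0.0, (cost - tokens) / refill_per_sec)
    headers = {
        # Round up and never advertise a zero-second wait.
        "Retry-After": str(max(1, math.ceil(wait))),
        "X-RateLimit-Limit": str(capacity),
        # Whole tokens only; fractional tokens are not spendable yet.
        "X-RateLimit-Remaining": str(int(tokens)),
    }
    return 429, headers
```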

Invalid or missing token

Trigger

JWT signature, expiry, issuer, audience, API key, or scope validation fails.

System response

The gateway returns 401 or 403 and does not create route-level quota state for an untrusted principal.
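
For illustration, the signature, expiry, issuer, and audience checks might look like the following for an HS256-signed JWT, using only the standard library. This is a teaching sketch with assumed names (`verify_hs256`, `AuthError`); a production gateway would normally rely on a vetted JWT library and would also handle key rotation, other algorithms, and scope checks.

```python
import base64
import hashlib
import hmac
import json
from typing import Any, Dict


class AuthError(Exception):
    """Raised when a token fails any validation step."""


def _b64url_decode(part: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def verify_hs256(token: str, secret: bytes, issuer: str, audience: str, now: float) -> Dict[str, Any]:
    """Return the claims if the token is valid, otherwise raise AuthError."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except (ValueError, TypeError) as exc:
        raise AuthError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise AuthError("malformed token")
    # Pin the algorithm instead of trusting the token's own header.
    if header.get("alg") != "HS256":
        raise AuthError("unexpected algorithm")
    signed = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signed, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        raise AuthError("bad signature")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise AuthError("expired or missing exp")
    if claims.get("iss") != issuer:
        raise AuthError("wrong issuer")
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        raise AuthError("wrong audience")
    return claims
```

Any `AuthError` would map to a 401 at the gateway; a token that verifies but lacks the scope for the route is the 403 case.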

Backend service timeout

Trigger

The request passed quota and routing but the upstream service exceeds its deadline.

System response

The gateway returns a timeout response, emits upstream health metrics, and does not refund the consumed token unless the route explicitly allows it.

Gateway partial outage

Trigger

One gateway instance loses Redis/auth connectivity or reports elevated local errors.

System response

Load balancer health checks drain the bad instance while other gateways continue serving with shared Redis quota state.

Request State Machine

Request, limiter, and routing states

| Request State | Identity | Limiter | Routing | Gateway Note |
| --- | --- | --- | --- | --- |
| Incoming | Unknown | Not checked | Not routed | The gateway has accepted bytes but has not yet trusted the caller or selected a quota policy. |
| Authenticated | Trusted principal | Key selected | Not routed | JWT or API key is valid and the request has a stable user, tenant, API key, IP, or route limiter key. |
| RateLimited | Trusted or anonymous | Bucket empty | Rejected | No backend request is made; the client receives 429 and retry metadata. |
| Routed | Trusted principal | Token consumed | Upstream selected | Gateway attaches identity context and proxies the request to the matched backend service. |
| Rejected | Invalid or blocked | Not consumed | Not routed | 401, 403, or policy block stops the request before limiter or backend resources are spent. |
| TimedOut | Trusted principal | Token consumed | Upstream timeout | Gateway records upstream latency and returns timeout response while preserving the request trace. |
| Completed | Trusted principal | Token consumed | Response returned | Gateway emits final status, latency, route, quota decision, and upstream health metrics. |
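
The request states above could be enforced with a small transition table. The state names come from the table, but the allowed transitions below are inferred from the flow rather than stated in the source, so treat them as one plausible reading.

```python
from enum import Enum


class State(Enum):
    INCOMING = "Incoming"
    AUTHENTICATED = "Authenticated"
    RATE_LIMITED = "RateLimited"
    ROUTED = "Routed"
    REJECTED = "Rejected"
    TIMED_OUT = "TimedOut"
    COMPLETED = "Completed"


# Assumed transitions; terminal states map to an empty set.
TRANSITIONS = {
    # Anonymous traffic can hit the pre-auth IP bucket and be limited directly.
    State.INCOMING: {State.AUTHENTICATED, State.REJECTED, State.RATE_LIMITED},
    State.AUTHENTICATED: {State.ROUTED, State.RATE_LIMITED, State.REJECTED},
    State.ROUTED: {State.COMPLETED, State.TIMED_OUT},
    State.RATE_LIMITED: set(),
    State.REJECTED: set(),
    State.TIMED_OUT: set(),
    State.COMPLETED: set(),
}


def advance(current: State, nxt: State) -> State:
    """Move to `nxt` if the transition is allowed, otherwise raise ValueError."""
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {nxt.value}")
    return nxt
```

Encoding the table as data makes it easy to assert, for example, that a rate-limited request can never reach an upstream.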