LLD Note
Gateway Responsibilities
The API gateway is the shared request boundary, so it should own cross-cutting work that every backend service would otherwise duplicate: authentication checks, request IDs, route matching, rate-limit policy selection, timeout rules, and response normalization.
The gateway should not own product business logic. Its job is to decide whether a request is trusted, within quota, and routable, then forward a clean request context to the selected service.
- Normalize request identity before any quota key is built.
- Reject malformed or unauthorized traffic before backend resources are touched.
- Keep route policy declarative so service teams can change limits without changing gateway code.
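A minimal sketch of the first and third points, assuming a policy table held as plain data and a quota key built only from a normalized identity. The route names, limit values, the `rl:` key prefix, and the lowercase/trim normalization rule are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoutePolicy:
    """Per-route limits kept as data, so they can change without gateway code changes."""
    capacity: int          # maximum burst size, in tokens
    refill_per_sec: float  # long-term sustained rate
    timeout_ms: int        # upstream timeout enforced by the gateway


# Hypothetical policy table; a real gateway would load this from configuration.
POLICIES = {
    "/search": RoutePolicy(capacity=20, refill_per_sec=5.0, timeout_ms=800),
    "/export": RoutePolicy(capacity=2, refill_per_sec=0.1, timeout_ms=5000),
}
DEFAULT_POLICY = RoutePolicy(capacity=10, refill_per_sec=1.0, timeout_ms=1000)


def normalize_identity(user_id: Optional[str], client_ip: str) -> str:
    """Return one canonical identity string; anonymous callers fall back to their IP."""
    if user_id and user_id.strip():
        # Assumed rule: user IDs are case-insensitive and may carry stray whitespace.
        return "user:" + user_id.strip().lower()
    return "anon:" + client_ip.strip()


def quota_key(route: str, user_id: Optional[str], client_ip: str) -> str:
    """Build the limiter key only after the identity has been normalized."""
    return f"rl:{route}:{normalize_identity(user_id, client_ip)}"
```

With this shape, `quota_key("/search", " Alice ", "10.0.0.1")` and `quota_key("/search", "alice", "10.0.0.1")` map to the same bucket, and a route missing from the table can fall back to `DEFAULT_POLICY` through `POLICIES.get(route, DEFAULT_POLICY)`.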
Token Bucket Rate Limiting
A token bucket allows controlled bursts while preserving a long-term refill rate. Each request consumes one or more tokens; elapsed time refills tokens up to a configured capacity.
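This is a better fit for user-facing APIs than a fixed window counter. A fixed window lets a client spend a full quota just before the reset and another full quota just after it, while a token bucket refills continuously, so a blocked client can be told how long to wait for the next token.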
- Use route-specific capacity and refill rate for expensive endpoints.
- Return Retry-After and remaining quota headers when a request is blocked.
- Apply stricter fallback limits for anonymous or untrusted identities.
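The refill-then-consume step described above can be sketched as follows. This is an illustrative single-process version, not a production limiter: the `Bucket` type and function names are hypothetical, and `X-RateLimit-Remaining` is a conventional but non-standard header name assumed here (only `Retry-After` is a standard HTTP header).

```python
import math
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float       # tokens currently available
    last_refill: float  # timestamp of the last refill, in seconds


def try_consume(bucket, capacity, refill_per_sec, now, cost=1.0):
    """Refill from elapsed time, then consume `cost` tokens if enough are available.

    Returns (allowed, remaining_tokens, retry_after_seconds).
    """
    elapsed = max(0.0, now - bucket.last_refill)
    # Refill is capped at capacity, which bounds the largest possible burst.
    bucket.tokens = min(capacity, bucket.tokens + elapsed * refill_per_sec)
    bucket.last_refill = now
    if bucket.tokens >= cost:
        bucket.tokens -= cost
        return True, bucket.tokens, 0
    # Blocked: whole seconds until enough tokens have accumulated for this request.
    retry_after = math.ceil((cost - bucket.tokens) / refill_per_sec)
    return False, bucket.tokens, retry_after


def limit_headers(remaining, retry_after):
    """Build response headers; Retry-After is included only for blocked requests."""
    headers = {"X-RateLimit-Remaining": str(int(remaining))}
    if retry_after > 0:
        headers["Retry-After"] = str(retry_after)
    return headers
```

Passing `cost` greater than 1 is one way to charge expensive endpoints more per request. A `cost` larger than `capacity` can never succeed, so a real policy loader would need to reject that combination.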
Redis Atomicity and TTL
The load, refill, and consume steps must be atomic. A Redis Lua script or equivalent atomic command sequence prevents concurrent requests from reading the same token count and overspending quota.
Limiter keys should carry a TTL so Redis automatically releases cold identities. The TTL should be long enough to preserve quota history across a normal policy window without keeping inactive keys forever.
- Store token count and last refill timestamp together.
- Expire idle buckets automatically to control Redis memory growth.
- Define an explicit fail-open or fail-closed policy per route for when Redis is unavailable.
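To show why the three steps must be one atomic unit, the sketch below uses an in-memory stand-in for Redis, where a single lock plays the role that a Lua script would play on the server. Everything here is an assumption for illustration: the class and function names, the choice of TTL (the time to refill an empty bucket, after which an expired key and a full bucket are indistinguishable), and the use of `ConnectionError` to signal an unavailable store.

```python
import math
import threading


class InMemoryBucketStore:
    """Stand-in for Redis: one lock makes load, refill, and consume a single atomic step."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (tokens, last_refill, expires_at)

    def take(self, key, capacity, refill_per_sec, now, cost=1.0):
        with self._lock:  # no caller can read the count between another caller's load and store
            entry = self._data.get(key)
            if entry is None or entry[2] <= now:
                tokens, last = float(capacity), now  # cold or expired key: start full
            else:
                tokens, last = entry[0], entry[1]
            tokens = min(capacity, tokens + max(0.0, now - last) * refill_per_sec)
            allowed = tokens >= cost
            if allowed:
                tokens -= cost
            # Assumed TTL: seconds needed to refill from empty to full.
            ttl = math.ceil(capacity / refill_per_sec)
            # Token count and refill timestamp are stored together, with the expiry.
            self._data[key] = (tokens, now, now + ttl)
            return allowed, tokens


def check_quota(store, key, capacity, refill_per_sec, now, fail_open):
    """Apply the route's explicit policy when the store cannot be reached."""
    try:
        allowed, _ = store.take(key, capacity, refill_per_sec, now)
        return allowed
    except ConnectionError:
        return fail_open  # True admits the request, False rejects it
```

Without the lock, two concurrent callers could both load the same token count, both see one token left, and both be admitted. A fail-open route keeps serving during a limiter outage at the cost of unenforced quota, and a fail-closed route does the opposite, which is why the choice is made per route.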
Routing, Observability, and Abuse Handling
Allowed requests are routed only after identity and quota checks succeed. The gateway attaches trusted identity headers and enforces timeout limits so backend services receive a predictable contract.
Every decision should emit structured logs and metrics. The same signals that power dashboards also feed abuse rules, which watch for repeated 401, 403, and 429 responses, repeated timeouts, and high-cardinality source patterns.
- Emit one trace ID across gateway and backend service boundaries.
- Track allowed, blocked, auth-failed, timeout, and upstream-error outcomes separately.
- Use abuse signals to block keys or reduce quotas without redeploying product services.
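One way to picture a single decision feeding the log, the metrics, and the abuse rule at once is sketched below. The field names, the in-process counters standing in for a metrics client, the set of abuse-relevant outcomes, and the threshold of 5 are all hypothetical. The counts are cumulative here, whereas a real rule would likely use a time window.

```python
import json
from collections import Counter

# The five outcomes tracked separately, as listed above.
OUTCOMES = {"allowed", "blocked", "auth_failed", "timeout", "upstream_error"}
# Assumed subset of outcomes that count toward abuse detection.
ABUSE_OUTCOMES = {"blocked", "auth_failed", "timeout"}
ABUSE_THRESHOLD = 5  # hypothetical value

outcome_counts = Counter()  # stand-in for a metrics client
abuse_counts = Counter()    # per-key count of abuse-relevant outcomes


def record_decision(trace_id, key, route, outcome, status):
    """Emit one structured log line and update counters for a gateway decision."""
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome}")
    outcome_counts[outcome] += 1
    if outcome in ABUSE_OUTCOMES:
        abuse_counts[key] += 1
    line = json.dumps(
        {"trace_id": trace_id, "key": key, "route": route,
         "outcome": outcome, "status": status},
        sort_keys=True,
    )
    print(line)  # a real gateway would hand this to its log pipeline
    return line


def is_abusive(key):
    """Return True once a key has accumulated enough abuse-relevant outcomes."""
    return abuse_counts[key] >= ABUSE_THRESHOLD
```

Because `is_abusive` reads only gateway-side counters, a key can be blocked or given a smaller quota by changing gateway state, with no change to product services.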