Rate limiting is the technique of controlling how frequently a client, IP, or user may send requests, in order to protect the system from abuse and DDoS attacks and to ensure fair usage.
Common algorithms:
- Token Bucket: a bucket holds tokens up to a fixed capacity; each request consumes one token, and tokens are refilled at a fixed rate. Allows short bursts, up to the bucket's capacity.
- Leaky Bucket: requests are processed at a constant rate, like water dripping through a hole; smooths out bursts instead of passing them through.
- Fixed Window Counter: counts requests within a fixed window (e.g., per minute); simple, but has a boundary problem: a burst at the end of one window plus a burst at the start of the next can together reach up to twice the limit.
- Sliding Window Log: stores the timestamp of each request; the most accurate, but memory-intensive because every request in the window is kept.
- Sliding Window Counter: combines the fixed-window counter with a sliding view by weighting the previous window's count by how much of that window still overlaps the sliding one; a good balance of accuracy and memory.
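The two counter-style algorithms above can be sketched in a few lines. This is a minimal, single-process illustration rather than a production implementation; the names `TokenBucket`, `allow`, and `sliding_window_estimate` are hypothetical, and the weighted formula is one common way to approximate a sliding window from two fixed-window counts.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, now=None):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # start full, so an initial burst is allowed
        self.last_refill = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True and consume one token if the request may proceed."""
        now = time.monotonic() if now is None else now
        elapsed = now - self.last_refill
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def sliding_window_estimate(prev_count, curr_count, elapsed_in_window, window_seconds):
    """Approximate the request count over the last `window_seconds`.

    The previous fixed window's count is weighted by the fraction of that
    window still covered by the sliding window (an assumed, common formula).
    """
    overlap = 1 - elapsed_in_window / window_seconds
    return prev_count * overlap + curr_count
```

With `TokenBucket(capacity=3, refill_rate=1.0)`, three back-to-back requests pass, a fourth is rejected, and one more is admitted after a second of refill. The optional `now` argument exists only so that time can be controlled in tests.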
Implementation: Redis with INCR + EXPIRE for distributed rate limiting; Nginx modules; built-in API gateway features (AWS API GW, Kong).
Decide how to key rate limits (by IP, user_id, or API key) and what to do when a limit is exceeded: return 429 Too Many Requests with a Retry-After header.
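As a rough sketch of the INCR + EXPIRE approach, the function below implements a fixed-window counter keyed by a client identifier and returns a status code plus headers. It assumes a store object exposing `incr`, `expire`, and `ttl` with the same meaning as the Redis commands of those names; `InMemoryCounterStore`, `check_rate_limit`, and the `ratelimit:` key prefix are hypothetical names, and the in-memory store is only a stand-in so the example runs without a Redis server.

```python
import math
import time


class InMemoryCounterStore:
    """Hypothetical stand-in mimicking Redis INCR / EXPIRE / TTL semantics."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> [count, expiry time or None]

    def _live(self, key):
        # Drop the entry if its expiry has passed, as Redis would.
        entry = self._data.get(key)
        if entry and entry[1] is not None and self._clock() >= entry[1]:
            del self._data[key]
            entry = None
        return entry

    def incr(self, key):
        entry = self._live(key)
        if entry is None:
            entry = self._data[key] = [0, None]
        entry[0] += 1
        return entry[0]

    def expire(self, key, seconds):
        entry = self._live(key)
        if entry is None:
            return False
        entry[1] = self._clock() + seconds
        return True

    def ttl(self, key):
        entry = self._live(key)
        if entry is None:
            return -2  # key does not exist
        if entry[1] is None:
            return -1  # key has no expiry
        return max(0, math.ceil(entry[1] - self._clock()))


def check_rate_limit(store, client_id, limit, window_seconds):
    """Fixed-window check: return (status_code, headers) for one request."""
    key = f"ratelimit:{client_id}"  # client_id may be an IP, user_id, or API key
    count = store.incr(key)
    if count == 1:
        # First request of the window starts the countdown.
        # Note: INCR then EXPIRE is not atomic; a real deployment may want
        # a Lua script or transaction so the key cannot be left without a TTL.
        store.expire(key, window_seconds)
    if count <= limit:
        return 200, {}
    retry_after = store.ttl(key)
    if retry_after < 0:
        retry_after = window_seconds  # fall back if the TTL is missing
    return 429, {"Retry-After": str(retry_after)}
```

Calling `check_rate_limit(store, "user:42", limit=2, window_seconds=60)` three times within a window yields two `200` results and then a `429` whose `Retry-After` value is the number of seconds left in the window. Each client identifier gets its own counter.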