Rate Limiter
Rate limiting is a crucial design pattern used in distributed systems to control the rate of incoming requests or traffic to a service. It helps protect services from being overwhelmed, ensures fair resource usage, and maintains system stability.
A rate limiter acts as a gatekeeper that monitors and regulates the number of requests a client can make to an API or service within a specified time window. When the limit is exceeded, subsequent requests are either delayed or rejected until the next time window begins.
Key benefits of rate limiting include:
- Service Protection: Prevents services from being overwhelmed by too many requests
- Resource Management: Ensures fair distribution of system resources among users
- Cost Control: Helps manage infrastructure costs by limiting excessive usage
- Security: Protects against DDoS attacks and brute force attempts
- Quality of Service: Maintains consistent performance for all users
Common rate limiting algorithms include:
- Token Bucket
- Leaky Bucket
- Fixed Window Counter
- Sliding Window Log
- Sliding Window Counter
1. Token Bucket
The Token Bucket algorithm uses the concept of a bucket that is continuously filled with tokens at a fixed rate. Each request consumes one token. If tokens are available, the request is allowed; if not, it’s throttled.
Key characteristics:
- Bucket has a maximum capacity
- Tokens are added at a constant rate
- Allows for bursts of traffic up to bucket capacity
- Simple to implement and memory efficient
Use when:
- You need to allow brief bursts of requests
- You want to enforce a steady average rate with some flexibility
2. Leaky Bucket
The Leaky Bucket algorithm processes requests at a constant rate, like water leaking from a bucket. If the bucket is full, new requests are dropped.
Key characteristics:
- Fixed processing rate
- Queue-based implementation
- Smooths out bursts of traffic
- Memory usage depends on queue size
3. Fixed Window Counter
This algorithm divides time into fixed windows and counts requests in each window. When the counter hits the limit, new requests are rejected until the next window starts.
Key characteristics:
- Simple to understand and implement
- Resets counter at fixed intervals
- Can allow twice the rate limit at window boundaries
- Memory efficient
Although it has the potential risks that the attackers can take advantage of the time in the bulkhead.
For example, if you partition time by 24 hours, attackers can attack at 23:59h - 24h with the double traffic of 2 days.
4. Sliding Window Log
Keeps track of timestamps for each request in a time window that “slides” forward. Removes old timestamps and counts current ones to make decisions.
Key characteristics:
- Very accurate
- No boundary conditions
- Higher memory usage
- More complex implementation
It limits the traffic by the gap between the final log and current timestamp, it is accurate to detect anomolies. But the drawback is we need to store all the logs when querying.
5. Sliding Window Counter
Combines the fixed window counter with a weighted sliding window to smooth out boundary conditions.
Key characteristics:
- More accurate than fixed window
- Less memory intensive than sliding log
- Moderate implementation complexity
- Good balance of accuracy and performance
You can use the last access time and probability to predict the average frequency access of users.
Access per week 1 * (Number of days in week 1 / 7) + Access per week 2 * (Number of days in week 2 / 7)