Rate Limiting
Rate Limiting is a technique that controls the number of API requests a client can make within a specified time window.
Why It Matters
Without rate limits, a compromised key can be used to send thousands of requests per second, running up bills or exhausting quotas. The Verizon 2024 Data Breach Investigations Report (DBIR) notes that automated attacks are a top vector, and rate limiting is a primary defense against them.
How It Works
The server tracks request counts per client (identified by API key, IP address, or token) within a fixed or sliding time window. Once a client exceeds the limit, further requests in that window are rejected with a 429 Too Many Requests response, usually accompanied by a Retry-After header that tells the client how long to wait.
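The tracking described above can be sketched in a few lines. This is a minimal, in-memory sliding-window limiter, not a production implementation: the class name, the `check` method, and the injectable `clock` parameter are illustrative assumptions, and a real deployment would typically keep the counters in a shared store rather than in process memory.

```python
import math
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any `window`-second span.

    Hypothetical helper for illustration; state lives in process memory only.
    """

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so behaviour can be tested without sleeping
        self.hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def check(self, client_id):
        """Return (status_code, headers) for one incoming request."""
        now = self.clock()
        hits = self.hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) < self.limit:
            hits.append(now)
            return 200, {}
        # The oldest accepted request leaves the window first; wait until then.
        retry_after = math.ceil(hits[0] + self.window - now)
        return 429, {"Retry-After": str(max(retry_after, 1))}
```

Calling `SlidingWindowLimiter(60, 60).check(api_key)` once per incoming request returns either a 200 with no extra headers or a 429 carrying a Retry-After value in whole seconds.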
Best Practices
- Set limits appropriate to your service's capacity
- Use sliding window algorithms to prevent bursts at window boundaries, where a fixed window can briefly admit up to twice the limit
- Return clear Retry-After headers
- Set different limits for different API endpoints
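One way to express different limits for different endpoints is a small lookup table keyed by path prefix. The sketch below is only an illustration: the paths, the numbers, and the `limit_for` helper are hypothetical placeholders, to be replaced with values that match your own service's capacity.

```python
# Hypothetical per-endpoint limits: (max requests, window in seconds).
ENDPOINT_LIMITS = {
    "/auth": (5, 60),
    "/payments": (10, 60),
}
# Fallback for every endpoint without a specific entry (assumed value).
DEFAULT_LIMIT = (60, 60)


def limit_for(path):
    """Return the limit of the longest configured prefix that matches `path`."""
    matches = [
        prefix
        for prefix in ENDPOINT_LIMITS
        # Match the prefix itself or anything below it, but not "/authors" for "/auth".
        if path == prefix or path.startswith(prefix + "/")
    ]
    if not matches:
        return DEFAULT_LIMIT
    return ENDPOINT_LIMITS[max(matches, key=len)]
```

The returned pair can then be fed to whatever limiter the service uses, so that sensitive routes are throttled more tightly than the default.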
Common Mistakes
- Setting limits too high to be effective
- Rate limiting by IP only (clients behind a shared NAT appear as one IP and get throttled together)
- Not rate limiting internal APIs
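To avoid the IP-only mistake, a limiter can key its counters on the client's credential when one is present and fall back to the IP address only for anonymous traffic. The following is a sketch under assumptions: the `X-API-Key` header name and the `rate_limit_key` helper are hypothetical, and your API may carry credentials differently.

```python
import hashlib


def rate_limit_key(headers, remote_ip):
    """Return the identity that request counts should be tracked against.

    `headers` is a plain dict of request headers; "X-API-Key" is an assumed name.
    """
    api_key = headers.get("X-API-Key")
    if api_key:
        # Hash the key so raw credentials are not stored in the counter table.
        digest = hashlib.sha256(api_key.encode("utf-8")).hexdigest()
        return "key:" + digest[:16]
    # Anonymous request: the IP address is the only identity available.
    return "ip:" + remote_ip
```

With this scheme, two clients behind the same NAT that present different keys are counted separately, while a single key is counted as one client regardless of which IP it connects from.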
How ShieldKey Helps
ShieldKey applies per-token rate limits at the proxy layer. Even if the upstream provider has generous limits, you can restrict individual Shield Tokens to prevent abuse or runaway costs.
FAQ
What is a good rate limit for APIs?
It depends on your use case. Start with 60 requests/minute for standard endpoints and adjust based on legitimate usage patterns. Critical endpoints (auth, payments) should have stricter limits.