Rate Limiting
Control request rates per IP, user, or team.
Enable rate limiting
    ratelimit:
      enabled: true
      default:
        requests_per_minute: 60
        requests_per_hour: 1000
        burst: 10
      per_ip: true
      per_user: true
How it works
Rate limiting uses a token bucket algorithm. Each key (IP, user, or team) gets a bucket that refills at the configured rate. The burst setting controls how many requests can fire in quick succession before rate limiting kicks in.
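To make the refill and burst behavior concrete, here is a minimal token bucket sketch in Python. It is an illustration only, not Stockyard's actual implementation: the class name, the `rate_per_sec` parameter, and the choice to treat `burst` as the bucket capacity are all assumptions made for this example.

```python
import time
from typing import Optional


class TokenBucket:
    """Illustrative token bucket (hypothetical; not Stockyard's code)."""

    def __init__(self, rate_per_sec: float, burst: int, now: Optional[float] = None):
        self.rate = rate_per_sec          # tokens added per second
        self.capacity = float(burst)      # assumption: burst == bucket capacity
        self.tokens = float(burst)        # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True if a request may proceed, consuming one token."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the elapsed time, never exceeding the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per key (IP, user, or team), created on first use.
buckets: dict = {}


def allow_request(key: str, rate_per_sec: float = 1.0, burst: int = 10) -> bool:
    bucket = buckets.setdefault(key, TokenBucket(rate_per_sec, burst))
    return bucket.allow()
```

Under these assumptions, requests_per_minute: 60 would correspond to `rate_per_sec=1.0`. A client could then send 10 requests back to back, after which further requests are allowed at roughly one per second as tokens refill.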
When a request exceeds its limit, Stockyard returns HTTP 429 (Too Many Requests) with a Retry-After header telling the client how many seconds to wait before retrying.
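On the client side, the 429 response can be turned into a wait time. The helper below is a sketch, not part of any Stockyard SDK: the function name `retry_delay` and the one-second fallback are assumptions. It accepts both forms that HTTP allows for Retry-After (a number of seconds or an HTTP date).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str],
                default: float = 1.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if not rate-limited.

    Hypothetical helper: `headers` is a plain mapping with the exact
    header name "Retry-After"; real HTTP clients may normalize case.
    """
    if status != 429:
        return None
    value = headers.get("Retry-After", "").strip()
    if value.isdigit():
        # Delay given as whole seconds, e.g. "15".
        return float(value)
    try:
        # Delay given as an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Header missing or unparseable: fall back to the assumed default.
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())
```

A caller would typically sleep for the returned number of seconds before resending the request, and cap the number of retries so that a persistent limit does not cause an endless loop.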
Per-IP limiting
When per_ip: true, each unique IP address gets its own rate limit. This is the simplest mode and works well for public-facing APIs.
Per-user limiting
When per_user: true, rate limits apply per authenticated user (identified by their API key). This requires auth to be enabled. Different users can have different limits set via the team API.
Per-team limiting
Team isolation includes per-team rate limits. Set custom RPM and TPM (tokens per minute) limits per team:
    curl -X PUT http://localhost:4200/api/teams/backend \
      -d '{"rpm_limit": 100, "tpm_limit": 100000}'
Response headers
Rate-limited responses include these headers:
    X-RateLimit-Limit: 60
    X-RateLimit-Remaining: 42
    X-RateLimit-Reset: 1711900800
    Retry-After: 15
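A client can read these headers to track its remaining quota. The sketch below assumes that X-RateLimit-Reset is a Unix timestamp in seconds, which the example value suggests but this page does not state. The `RateLimitInfo` type and the `parse_rate_limit` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass
class RateLimitInfo:
    limit: int          # X-RateLimit-Limit
    remaining: int      # X-RateLimit-Remaining
    reset_at: datetime  # X-RateLimit-Reset, assumed to be Unix seconds (UTC)


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitInfo:
    """Parse the rate limit headers from a plain header mapping.

    Raises KeyError if a header is missing and ValueError if a value
    is not an integer.
    """
    return RateLimitInfo(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=datetime.fromtimestamp(
            int(headers["X-RateLimit-Reset"]), tz=timezone.utc
        ),
    )


info = parse_rate_limit({
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "42",
    "X-RateLimit-Reset": "1711900800",
})
print(info.remaining, info.reset_at.isoformat())
```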
Exemptions
Admin requests authenticated with STOCKYARD_ADMIN_KEY are exempt from rate limiting. This ensures management API calls always work even when rate limits are active.