Rate Limits
Understanding and managing API rate limits.
AssistantRouter implements rate limiting to ensure fair usage and protect the API from abuse.
Default Limits
| Metric | Hobby | Pro | Team | Enterprise |
|---|---|---|---|---|
| Requests/minute | 10 | 60 | 300 | Unlimited |
| Tokens/day | 100,000 | 1,000,000 | Unlimited | Unlimited |
| Web search/minute | - | 30 | 100 | Unlimited |
| File search/minute | - | 60 | 200 | Unlimited |
| API keys | 1 | 5 | 20 | Unlimited |
| Assistants | 1 | 10 | Unlimited | Unlimited |
Use GET /v1/limits to retrieve your current workspace limits and usage.
Resource Limits
| Resource | Hobby | Pro | Team | Enterprise |
|---|---|---|---|---|
| Documents per assistant | 5 | 50 | Unlimited | Unlimited |
| Storage | 100 MB | 5 GB | 50 GB | Unlimited |
| Nerfing rules | 2 | 10 | Unlimited | Unlimited |
| Widgets | 1 | 5 | 20 | Unlimited |
Feature Availability
| Feature | Hobby | Pro | Team | Enterprise |
|---|---|---|---|---|
| Web search | - | Yes | Yes | Yes |
| File search (RAG) | - | Yes | Yes | Yes |
| Custom models | - | - | Yes | Yes |
| Custom API keys | - | - | Yes | Yes |
| Remove branding | - | Yes | Yes | Yes |
| Priority support | - | - | Yes | Yes |
| EU data residency | - | - | Yes | Yes |
| SSO | - | - | - | Yes |
Model Access
All tiers have access to all available models. Use GET /v1/models to see the full list.
Rate Limit Headers
Every API response includes headers indicating your current rate limit status:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 55
X-RateLimit-Reset: 1704067260
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window |
| X-RateLimit-Remaining | Remaining requests in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
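As an illustration, the three headers can be parsed into a small status object and used to decide how long to pause. This is a minimal sketch, not part of the AssistantRouter API: the names `RateLimitStatus`, `parseRateLimitHeaders`, and `msUntilReset` are hypothetical, and it assumes `X-RateLimit-Reset` is expressed in Unix seconds, as the example value suggests.

```typescript
// Parsed view of the three rate limit headers documented above.
// (Hypothetical helper type, not provided by the API.)
interface RateLimitStatus {
  limit: number;     // X-RateLimit-Limit
  remaining: number; // X-RateLimit-Remaining
  resetAt: Date;     // X-RateLimit-Reset, assumed to be Unix seconds
}

// Reads the headers from any object with a Headers-like get() method.
// Returns null if a header is missing or not numeric.
function parseRateLimitHeaders(
  headers: { get(name: string): string | null }
): RateLimitStatus | null {
  const read = (name: string): number => {
    const raw = headers.get(name);
    return raw === null || raw.trim() === '' ? NaN : Number(raw);
  };
  const limit = read('X-RateLimit-Limit');
  const remaining = read('X-RateLimit-Remaining');
  const reset = read('X-RateLimit-Reset');
  if ([limit, remaining, reset].some((n) => Number.isNaN(n))) return null;
  return { limit, remaining, resetAt: new Date(reset * 1000) };
}

// Milliseconds to wait before sending the next request:
// 0 while requests remain, otherwise the time left until the window resets.
function msUntilReset(status: RateLimitStatus, now: Date = new Date()): number {
  if (status.remaining > 0) return 0;
  return Math.max(0, status.resetAt.getTime() - now.getTime());
}
```

With the standard fetch API, `parseRateLimitHeaders(response.headers)` should work directly, since `Headers.get` returns `string | null`.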
Handling Rate Limits
When you exceed a rate limit, the API returns a 429 Too Many Requests error:
{
  "error": {
    "type": "rate_limit_exceeded",
    "message": "Too many requests. Please try again later.",
    "retry_after_seconds": 30
  }
}
Recommended Approach
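A retry loop first needs to recognize the 429 body shown above and decide how long to wait. The sketch below is one possible way to do that: it assumes `retry_after_seconds` is the number of seconds the server suggests waiting (the field name implies this, but it is not spelled out here), and the names `RateLimitErrorBody`, `isRateLimitError`, and `retryDelayMs` are hypothetical.

```typescript
// Shape of the 429 error body shown above (only "type" is checked at runtime).
interface RateLimitErrorBody {
  error: {
    type: string;
    message?: string;
    retry_after_seconds?: number;
  };
}

// Type guard: true when a parsed JSON body is a rate limit error.
function isRateLimitError(body: unknown): body is RateLimitErrorBody {
  if (typeof body !== 'object' || body === null) return false;
  const err = (body as { error?: unknown }).error;
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { type?: unknown }).type === 'rate_limit_exceeded'
  );
}

// Wait time for a 0-based retry attempt: exponential backoff, but never
// shorter than the server's retry_after_seconds hint (assumed semantics).
function retryDelayMs(attempt: number, body: RateLimitErrorBody, baseMs = 1000): number {
  const backoff = baseMs * Math.pow(2, attempt);
  const hint = (body.error.retry_after_seconds ?? 0) * 1000;
  return Math.max(backoff, hint);
}
```

Using the hint as a floor avoids retrying before the server expects the window to have reset, while the exponential term still spreads out repeated failures.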
Implement exponential backoff when you receive a rate limit error:
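// Calls fn up to maxRetries times, backing off exponentially between attempts.
// Assumes fn throws the "error" object from the 429 response body.
async function makeRequestWithRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const isRateLimit = (error as { type?: string })?.type === 'rate_limit_exceeded';
      // Rethrow other errors, and give up once the last attempt has failed.
      if (!isRateLimit || attempt >= maxRetries - 1) throw error;
      const delay = Math.pow(2, attempt) * 1000; // 1s, then 2s with the default maxRetries
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}
Check Your Limits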
Use the /v1/limits endpoint to check your current limits:
curl https://api.assistantrouter.com/v1/limits \
  -H "Authorization: Bearer $API_KEY"
Response:
{
  "data": {
    "tier": "pro",
    "limits": {
      "requests_per_minute": 60,
      "tokens_per_day": 1000000,
      "assistants": 10,
      "api_keys": 5
    },
    "usage": {
      "requests_this_minute": 12,
      "tokens_today": 45000,
      "assistants": 3,
      "api_keys": 2
    }
  }
}
Best Practices
- Monitor headers - Check X-RateLimit-Remaining before making requests
- Implement backoff - Use exponential backoff when rate limited
- Batch requests - Combine multiple operations when possible
- Cache responses - Reduce duplicate requests with caching
- Check limits endpoint - Use /v1/limits to monitor your usage
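The last practice can be sketched as a small helper that turns a parsed /v1/limits response into remaining headroom. This is an illustrative sketch only: `LimitsResponse`, `remainingQuota`, and `shouldThrottle` are hypothetical names, the 10% threshold is an arbitrary choice, and the code assumes numeric limit values (how "Unlimited" tiers are represented in the response is not documented here).

```typescript
// Subset of the /v1/limits response shown earlier.
// Assumes numeric limits; the encoding of "Unlimited" is unknown.
interface LimitsResponse {
  data: {
    tier: string;
    limits: { requests_per_minute: number; tokens_per_day: number };
    usage: { requests_this_minute: number; tokens_today: number };
  };
}

// Remaining headroom for the current minute and day, never below zero.
function remainingQuota(res: LimitsResponse): { requests: number; tokens: number } {
  const { limits, usage } = res.data;
  return {
    requests: Math.max(0, limits.requests_per_minute - usage.requests_this_minute),
    tokens: Math.max(0, limits.tokens_per_day - usage.tokens_today),
  };
}

// True when less than the given fraction of either quota is left.
// The default threshold of 10% is an arbitrary illustration.
function shouldThrottle(res: LimitsResponse, threshold = 0.1): boolean {
  const left = remainingQuota(res);
  const { limits } = res.data;
  return (
    left.requests < limits.requests_per_minute * threshold ||
    left.tokens < limits.tokens_per_day * threshold
  );
}
```

The input is the JSON body returned by GET /v1/limits, so a client could call `remainingQuota` after each periodic check and slow down once `shouldThrottle` returns true.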
Increasing Limits
Need higher limits?
- Pro upgrade: Upgrade to Pro in the dashboard
- Team upgrade: Upgrade to Team for unlimited assistants and higher rate limits
- Enterprise: Contact sales for custom limits and SLAs