Rate Limiting

The Nexus Platform applies rate limiting to protect service availability and ensure fair usage across all consumers. Rate limits are applied per workspace (or per IP address for unauthenticated requests) and are tracked independently for different categories of API endpoints.

Rate Limit Groups

API endpoints are organized into groups, each with its own independent rate limit counter. This means activity on one group does not consume the budget of another. For example, polling for session status will not reduce the number of queries you can send.

| Group | Endpoints | Limit | Window |
| --- | --- | --- | --- |
| Authentication | POST /v2/auth/access-tokens, POST /v2/auth/refresh | 20 requests | 60 seconds |
| Query | POST /v2/query | Workspace-specific | 60 seconds |
| Sessions | All /v2/sessions/* endpoints, GET /v2/sessions/:id/status | 300 requests | 60 seconds |
| Feedback | POST /v2/query/feedback, POST /v2/analytics/link-clicks | 500 requests | 60 seconds |
| Data Ingestion | All /v2/data-ingestion/* endpoints | 60 requests | 60 seconds |
| Health | GET /v2/health | 120 requests | 60 seconds |

The query endpoint has a workspace-specific rate limit that may differ from the defaults above. Contact your workspace administrator for details on your configured limits.

Endpoints not listed above fall under a general rate limit of 200 requests per 60 seconds.

Response Headers

When rate limiting is active, responses include standard headers to help you track your usage:

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum number of requests allowed in the current window |
| X-RateLimit-Remaining | Number of requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp (in seconds) at which the current window resets |
| Retry-After | Seconds to wait before retrying (only present on 429 responses) |
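These headers make it possible to slow down before a limit is ever hit. As a minimal sketch (the `nearRateLimit` helper, the plain-object header representation, and the 10% threshold are illustrative assumptions, not part of the API):

```javascript
// Sketch: decide whether to proactively throttle based on the rate-limit
// headers above. `headers` is a plain object keyed by header name; the
// 10% threshold is an arbitrary illustrative choice.
function nearRateLimit(headers, threshold = 0.1) {
  const limit = parseInt(headers['X-RateLimit-Limit'] || '0', 10);
  const remaining = parseInt(headers['X-RateLimit-Remaining'] || '0', 10);
  if (!limit) return false; // headers absent: nothing to act on
  return remaining / limit < threshold;
}
```

A client could check this after each response and insert a short pause once less than 10% of the window remains.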

Handling Rate Limits

When you exceed a rate limit, the API responds with HTTP status 429 Too Many Requests:

```json
{
  "statusCode": 429,
  "message": "Rate limit exceeded. Try again in 42 seconds."
}
```

Use the Retry-After header to wait the appropriate amount of time before retrying:

```javascript
async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status === 429) {
      // Honor the server's Retry-After value; fall back to 60s if absent.
      const retryAfter = parseInt(response.headers.get('Retry-After') || '60', 10);
      console.log(`Rate limited. Retrying in ${retryAfter}s...`);
      await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
      continue;
    }
    return response;
  }
  throw new Error('Max retries exceeded');
}
```

Best Practices

  • Respect Retry-After — always use the value from the response header rather than a fixed delay
  • Add jitter — when multiple clients retry simultaneously, add a small random delay to avoid thundering herd effects
  • Monitor remaining requests — use the X-RateLimit-Remaining header to proactively slow down before hitting the limit
  • Batch where possible — for data ingestion, use batching to reduce the number of API calls (see Data Ingestion API)
  • Cache responses — avoid redundant API calls by caching results that don’t change frequently (e.g., session status)
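The first two bullets can be combined into a small delay helper. This is a sketch under stated assumptions: `retryDelayMs` is not part of the API, and the jitter range (up to one second by default) is an arbitrary illustrative choice.

```javascript
// Sketch: compute a retry delay that honors Retry-After and adds random
// jitter so that many clients do not all retry in the same instant.
// maxJitterMs is an illustrative default, not an API-mandated value.
function retryDelayMs(retryAfterSeconds, maxJitterMs = 1000) {
  const jitter = Math.random() * maxJitterMs;
  return retryAfterSeconds * 1000 + jitter;
}
```

In the `fetchWithRetry` example above, the wait line would then become `await new Promise(resolve => setTimeout(resolve, retryDelayMs(retryAfter)))`.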

Repeatedly sending requests after receiving a 429 response without waiting for the Retry-After period will not resolve faster and may extend the cooldown.