API Reference

360+ REST endpoints. All served on a single port. OpenAPI 3.1 spec included.

Overview

All endpoints are served on the same port as the proxy. Full OpenAPI 3.1 spec available at /api/openapi.json.

Authentication

All API endpoints (except /health) require an Authorization header:

curl http://localhost:4200/api/status \
  -H "Authorization: Bearer sy_admin_your_key"

Admin endpoints require an admin key (sy_admin_ prefix). User endpoints accept user keys (sy_ prefix). The proxy endpoint (/v1/chat/completions) accepts both.
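
The prefix rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper names key_kind and auth_headers are hypothetical, and only the sy_admin_ and sy_ prefixes and the Bearer header come from the text above.

```python
from typing import Dict, Optional

ADMIN_PREFIX = "sy_admin_"  # admin key prefix, per the Authentication section
USER_PREFIX = "sy_"         # user key prefix, per the Authentication section


def key_kind(api_key: str) -> Optional[str]:
    """Classify a key by its prefix.

    The admin prefix is tested first because it also starts with "sy_".
    """
    if api_key.startswith(ADMIN_PREFIX):
        return "admin"
    if api_key.startswith(USER_PREFIX):
        return "user"
    return None


def auth_headers(api_key: str) -> Dict[str, str]:
    """Build the Authorization header used in the curl example above."""
    if key_kind(api_key) is None:
        raise ValueError("expected a key starting with sy_ or sy_admin_")
    return {"Authorization": f"Bearer {api_key}"}
```

Checking the prefix locally only catches obvious mistakes such as pasting a provider key; the server remains the authority on whether a key is valid.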

Error Responses

All errors follow a consistent format:

{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded. Retry after 2s.",
    "status": 429
  }
}

Common error codes:

| Status | Code | Meaning |
|---|---|---|
| 400 | bad_request | Invalid request body or parameters |
| 401 | unauthorized | Missing or invalid API key |
| 403 | forbidden | Key lacks required permissions |
| 404 | not_found | Resource does not exist |
| 429 | rate_limited | Rate limit or cost cap exceeded |
| 500 | internal_error | Server error (check logs) |
| 502 | provider_error | Upstream LLM provider returned an error |
| 503 | circuit_open | Circuit breaker tripped for provider |
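
A client can decode the error envelope into a small typed value and decide whether a retry makes sense. This is a sketch under assumptions: the ApiError class and parse_error helper are hypothetical names, and treating 429, 502, and 503 as retryable is a client-side choice rather than documented server behavior.

```python
import json
from dataclasses import dataclass

# Assumption: these statuses are transient and worth retrying.
RETRYABLE_STATUSES = {429, 502, 503}


@dataclass(frozen=True)
class ApiError:
    code: str
    message: str
    status: int

    @property
    def retryable(self) -> bool:
        """True when the status is in the (assumed) transient set."""
        return self.status in RETRYABLE_STATUSES


def parse_error(body: str) -> ApiError:
    """Decode the {"error": {code, message, status}} envelope shown above."""
    err = json.loads(body)["error"]
    return ApiError(code=err["code"], message=err["message"], status=int(err["status"]))
```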

Pagination

List endpoints support limit and offset query parameters:

curl "http://localhost:4200/api/observe/traces?limit=20&offset=40" \
  -H "Authorization: Bearer sy_admin_..."

Responses include total and has_more fields, so a client can page through results by increasing offset until has_more is false.
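
Walking a list endpoint with limit and offset can be sketched as a generator. The fetch_page callable stands in for whatever HTTP client you use, and the "items" key for the result array is an assumption; check the OpenAPI spec for the real field name on each endpoint.

```python
from typing import Any, Callable, Dict, Iterator


def iter_items(
    fetch_page: Callable[[int, int], Dict[str, Any]],
    limit: int = 20,
    items_key: str = "items",  # hypothetical name of the result array
) -> Iterator[Any]:
    """Yield every item from a limit/offset list endpoint.

    fetch_page(limit, offset) must return the decoded JSON response.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page.get(items_key, [])
        yield from items
        # Stop when the server reports no more pages, or on an empty page
        # (a guard against looping forever on an inconsistent response).
        if not page.get("has_more") or not items:
            break
        offset += limit
```

Passing the fetch function in keeps the paging logic independent of the HTTP library and easy to exercise with a fake.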

OpenAPI Spec

The full OpenAPI 3.1 specification is available at /api/openapi.json. Use it with Swagger UI, Postman, or any OpenAPI-compatible tool:

curl http://localhost:4200/api/openapi.json \
  -H "Authorization: Bearer sy_admin_..." | python3 -m json.tool
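
Once the spec is saved locally, its paths object can be summarized with a few lines of code. The sketch below relies only on the standard OpenAPI layout (paths, then path, then HTTP method); the function name count_operations is hypothetical.

```python
from collections import Counter

# HTTP methods that OpenAPI allows as keys of a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def count_operations(spec: dict) -> Counter:
    """Count operations per HTTP method in a decoded OpenAPI document."""
    counts: Counter = Counter()
    for path_item in spec.get("paths", {}).values():
        for key in path_item:
            # Path items may also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                counts[key.upper()] += 1
    return counts
```

For example, after saving the curl output to openapi.json, count_operations(json.load(open("openapi.json"))) gives a per-method tally that can be compared with the endpoint counts in this reference.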

System

| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /api/status | System status, uptime, metrics, component health |
| GET | /api/license | License status and usage |
| GET | /api/plans | Available pricing plans |
| POST | /api/checkout | Start checkout session |
| GET | /api/config/export | Export full config snapshot |
| POST | /api/config/import | Import config snapshot |
| POST | /api/config/diff | Diff current config vs uploaded snapshot |

Chute — Proxy (12 endpoints)

The proxy is the core request pipeline. All /v1/* routes are OpenAI-compatible and pass through the full middleware chain.

| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Proxy chat request (OpenAI-compatible, streaming supported) |
| POST | /v1/completions | Proxy legacy completion request |
| POST | /v1/embeddings | Proxy embedding request |
| GET | /api/proxy/modules | List all middleware modules with enable/disable status |
| PUT | /api/proxy/modules/{name} | Toggle or configure a single module |
| POST | /api/proxy/modules/bulk | Bulk toggle multiple modules at once |
| GET | /api/proxy/modules/{name} | Get single module status and config |
| GET | /api/proxy/providers | List configured providers with health status |
| POST | /api/proxy/providers/{name}/check | Health-check a specific provider |
| GET | /api/proxy/routes | List routing rules |
| GET | /api/proxy/chain | Show current middleware execution chain |
| GET | /api/proxy/status | Proxy subsystem status |

Auth (20 endpoints)

User management, API key lifecycle, and BYOK provider key storage. The /me routes use API key auth; /users routes require admin.

| Method | Path | Description |
|---|---|---|
| POST | /api/auth/signup | Create a new user account |
| GET | /api/auth/me | Get current user profile (API key auth) |
| GET | /api/auth/me/keys | List user's API keys |
| POST | /api/auth/me/keys | Generate a new API key |
| DELETE | /api/auth/me/keys/{keyId} | Revoke an API key |
| POST | /api/auth/me/keys/{keyId}/rotate | Rotate key (new key issued, old expires) |
| GET | /api/auth/me/providers | List user's configured LLM providers |
| PUT | /api/auth/me/providers/{provider} | Add or update a provider API key |
| DELETE | /api/auth/me/providers/{provider} | Remove a provider API key |
| GET | /api/auth/me/usage | Get user's usage statistics |
| GET | /api/auth/users | List all users (admin) |
| POST | /api/auth/users | Create a user (admin) |
| GET | /api/auth/users/{id} | Get user by ID (admin) |
| GET | /api/auth/users/{id}/keys | List user's keys (admin) |
| POST | /api/auth/users/{id}/keys | Create key for user (admin) |
| DELETE | /api/auth/users/{id}/keys/{keyId} | Revoke user key (admin) |
| POST | /api/auth/users/{id}/keys/{keyId}/rotate | Rotate user key (admin) |
| GET | /api/auth/users/{id}/providers | List user providers (admin) |
| PUT | /api/auth/users/{id}/providers/{provider} | Set user provider key (admin) |
| DELETE | /api/auth/users/{id}/providers/{provider} | Remove user provider key (admin) |

Lookout — Observe (17 endpoints)

Traces, cost tracking, alerting, anomaly detection, and live streaming. Engine hooks automatically populate traces for every proxy request.

| Method | Path | Description |
|---|---|---|
| GET | /api/observe/overview | Dashboard overview (request counts, latency, costs) |
| GET | /api/observe/traces | List traces with pagination and filtering |
| POST | /api/observe/traces | Create a trace manually |
| GET | /api/observe/traces/{id} | Get trace detail with full request/response |
| POST | /api/replay/{id} | Replay a recorded trace through the current chain |
| GET | /api/observe/costs | Cost breakdown by provider and model |
| GET | /api/observe/costs/daily | Daily cost time series |
| GET | /api/observe/timeseries | Request volume time series |
| GET | /api/observe/alerts | List configured alerts |
| POST | /api/observe/alerts | Create an alert rule |
| DELETE | /api/observe/alerts/{id} | Delete an alert |
| GET | /api/observe/alerts/history | Alert firing history |
| GET | /api/observe/anomalies | Detected anomalies |
| GET | /api/observe/safety | Safety event log |
| GET | /api/observe/safety/summary | Safety stats summary |
| GET | /api/observe/live | Live request stream (SSE) |
| GET | /api/observe/status | Lookout subsystem status |

Brand — Trust (12 endpoints)

Immutable audit ledger with hash-chain verification, trust policies, evidence collection, and feedback loops for compliance and SOC 2 readiness.

| Method | Path | Description |
|---|---|---|
| GET | /api/trust/ledger | Query the immutable audit ledger |
| POST | /api/trust/ledger | Append an audit entry |
| GET | /api/trust/ledger/verify | Verify ledger integrity (hash chain) |
| GET | /api/trust/policies | List trust policies |
| POST | /api/trust/policies | Create a trust policy |
| GET | /api/trust/evidence | List evidence records |
| POST | /api/trust/evidence | Submit evidence for audit |
| GET | /api/trust/feedback | List feedback entries |
| POST | /api/trust/feedback | Submit feedback on a response |
| GET | /api/trust/replays | List replay records |
| POST | /api/trust/replays | Create a replay record |
| GET | /api/trust/status | Brand subsystem status |
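
To illustrate what hash-chain verification means in general, here is a minimal sketch of an append-only ledger in which each entry's hash covers the previous entry's hash. The entry layout (prev_hash, payload, hash), the all-zero genesis value, and the use of SHA-256 over canonical JSON are assumptions chosen for illustration; the actual ledger format may differ.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical prev_hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(ledger: list, payload: dict) -> None:
    """Append an entry linked to the current tail of the ledger."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"prev_hash": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify(ledger: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash depends on everything before it, changing one historical entry invalidates that entry and every later link, which is what an integrity check such as /api/trust/ledger/verify is designed to detect.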

Tack Room — Studio (17 endpoints)

Prompt versioning, A/B experiments, benchmark suites, and snapshots. Templates support Mustache variables and version history.

| Method | Path | Description |
|---|---|---|
| GET | /api/studio/templates | List prompt templates |
| POST | /api/studio/templates | Create a prompt template |
| GET | /api/studio/templates/{slug} | Get template detail |
| GET | /api/studio/templates/{slug}/versions | List template versions |
| POST | /api/studio/templates/{slug}/versions | Publish a new template version |
| GET | /api/studio/experiments | List A/B experiments |
| POST | /api/studio/experiments | Create an experiment |
| GET | /api/studio/experiments/{id} | Get experiment detail and results |
| PUT | /api/studio/experiments/{id} | Update experiment |
| POST | /api/studio/experiments/run | Run experiment (send requests to variants) |
| GET | /api/studio/snapshots | List prompt snapshots |
| POST | /api/studio/snapshots | Save a prompt snapshot |
| GET | /api/studio/benchmarks | List LLM benchmarks |
| POST | /api/studio/benchmarks | Create a benchmark suite |
| POST | /api/studio/benchmarks/run | Run benchmark suite |
| POST | /api/studio/playground | Tack Room playground (send test request) |
| GET | /api/studio/status | Tack Room subsystem status |

Forge (18 endpoints)

DAG workflow engine, tool registry, multi-turn sessions, and batch processing. Workflows define steps as nodes with dependencies.

| Method | Path | Description |
|---|---|---|
| GET | /api/forge/workflows | List DAG workflows |
| POST | /api/forge/workflows | Create a workflow |
| GET | /api/forge/workflows/{slug} | Get workflow detail |
| PUT | /api/forge/workflows/{slug} | Update a workflow |
| POST | /api/forge/workflows/{slug}/run | Execute a workflow run |
| GET | /api/forge/runs | List workflow runs |
| GET | /api/forge/runs/{id} | Get run detail |
| DELETE | /api/forge/runs/{id} | Delete a run |
| GET | /api/forge/runs/{id}/steps | Get step-by-step execution log |
| GET | /api/forge/tools | List registered tools |
| POST | /api/forge/tools | Register a tool |
| GET | /api/forge/sessions | List chat sessions |
| POST | /api/forge/sessions | Create a session |
| GET | /api/forge/sessions/{id}/messages | Get session messages |
| POST | /api/forge/sessions/{id}/messages | Send message in session |
| GET | /api/forge/batch | List batch jobs |
| POST | /api/forge/batch | Create a batch job |
| GET | /api/forge/status | Forge subsystem status |
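
Since workflows define steps as nodes with dependencies, a valid execution order is a topological sort of the step graph. The sketch below shows the idea with the standard-library graphlib module (Python 3.9+). The depends_on field name is an assumption; this reference only shows steps with an id, type, model, and prompt.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def execution_order(steps: List[dict]) -> List[str]:
    """Return step ids in an order that respects every dependency.

    Each step is a dict with an "id" and an optional, hypothetical
    "depends_on" list of step ids. Raises ValueError for unknown
    dependencies and graphlib.CycleError for cycles.
    """
    known = {step["id"] for step in steps}
    graph: Dict[str, Set[str]] = {}
    for step in steps:
        deps = set(step.get("depends_on", []))
        missing = deps - known
        if missing:
            raise ValueError(f"step {step['id']} depends on unknown steps: {sorted(missing)}")
        graph[step["id"]] = deps
    # static_order() yields each node only after all of its predecessors.
    return list(TopologicalSorter(graph).static_order())
```

Rejecting cycles and dangling dependencies before submitting a workflow gives faster feedback than waiting for a run to fail.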

Trading Post — Exchange (13 endpoints)

Config pack marketplace with versioning, environment sync, and one-click install. Packs bundle module configs, provider settings, and routing rules.

| Method | Path | Description |
|---|---|---|
| GET | /api/exchange/packs | List available config packs |
| POST | /api/exchange/packs | Publish a new pack |
| GET | /api/exchange/packs/{slug} | Get pack detail with content |
| GET | /api/exchange/packs/{slug}/preview | Preview what install would change |
| POST | /api/exchange/packs/{slug}/install | Install a pack |
| POST | /api/exchange/packs/{slug}/versions | Publish a new pack version |
| GET | /api/exchange/installed | List installed packs |
| DELETE | /api/exchange/installed/{id} | Uninstall a pack |
| GET | /api/exchange/environments | List sync environments |
| POST | /api/exchange/environments | Create an environment |
| POST | /api/exchange/environments/{name}/sync | Sync config to an environment |
| GET | /api/exchange/sync-log | View sync history |
| GET | /api/exchange/status | Trading Post subsystem status |

Configuration (5 endpoints)

Runtime configuration management with export/import and diff support.

| Method | Path | Description |
|---|---|---|
| GET | /api/config | Get current runtime configuration |
| POST | /api/config | Update runtime configuration |
| GET | /api/config/export | Export full config as JSON |
| POST | /api/config/import | Import config from JSON |
| POST | /api/config/diff | Diff two configurations |

Playground (2 endpoints)

| Method | Path | Description |
|---|---|---|
| POST | /api/playground/share | Create shareable session (30-day TTL) |
| GET | /api/playground/share/{id} | Retrieve shared session |

Webhooks (6 endpoints)

Push notifications to external URLs on events like cost thresholds, safety blocks, provider health changes, and guardrail violations. Deliveries are logged and retried automatically.

| Method | Path | Description |
|---|---|---|
| GET | /api/webhooks | List configured webhooks |
| POST | /api/webhooks | Create a webhook |
| DELETE | /api/webhooks/{id} | Delete a webhook |
| POST | /api/webhooks/test | Send a test webhook |
| GET | /api/webhooks/{id}/deliveries | Recent delivery log (status, latency, retries) |
| GET | /api/webhooks/{id}/stats | Delivery statistics (success rate, avg latency) |

Supported event types: webhook.test, alert.fired, cost.threshold, spend.cap_exceeded, provider.health_changed, guardrail.violation, trust.violation, upgrade.drover_cap_reached, upgrade.drover_savings_*, upgrade.feral_bypass_detected, upgrade.pii_detected, upgrade.trace_milestone_*, upgrade.error_rate_high. Use * to receive all events.
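
A receiver that handles several webhooks may want to filter incoming events against the patterns it cares about. The sketch below uses shell-style wildcards from the standard library. It is a client-side illustration only: treating a trailing * as a wildcard mirrors the notation in the list above but is an assumption about how the server matches subscriptions, and the event name upgrade.drover_savings_100 in the usage note is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def wants(subscribed: Iterable[str], event_type: str) -> bool:
    """Return True if event_type matches any subscribed pattern.

    Patterns use shell-style wildcards, so "*" matches every event and
    "upgrade.drover_savings_*" matches any event with that prefix.
    """
    return any(fnmatchcase(event_type, pattern) for pattern in subscribed)
```

For example, wants(["alert.fired", "upgrade.drover_savings_*"], "upgrade.drover_savings_100") is true, while wants(["cost.threshold"], "alert.fired") is false.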

Guardrails (8 endpoints)

Runtime-configurable content guardrail rules. Rules are regex-based, scoped per-project, and support block/redact/warn/log actions. Cached in memory with 60-second TTL.

| Method | Path | Description |
|---|---|---|
| GET | /api/guardrails | List all guardrail rules (filter by ?project=) |
| POST | /api/guardrails | Create a guardrail rule |
| GET | /api/guardrails/{id} | Get a single rule |
| PUT | /api/guardrails/{id} | Update a rule |
| DELETE | /api/guardrails/{id} | Delete a rule |
| POST | /api/guardrails/test | Dry-run text against rules (no provider call) |
| GET | /api/guardrails/stats | Analytics counters (blocked, flagged, per-rule hits) |
| GET | /api/guardrails/events | Recent violation events |
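
The four actions (block, redact, warn, log) can be pictured as a single pass over regex rules. This is a local sketch of that idea, not the server's implementation: the Rule fields, the "[REDACTED]" placeholder, and the evaluation order are assumptions. The real behavior is best checked with the dry-run endpoint /api/guardrails/test.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: str  # regular expression
    action: str   # one of: block, redact, warn, log


def apply_rules(text: str, rules: List[Rule]) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Apply rules in order and return (resulting text, hits).

    The text is None when a block rule matched. Hits list each matching
    rule as (name, action); warn and log only record a hit.
    """
    hits: List[Tuple[str, str]] = []
    for rule in rules:
        if not re.search(rule.pattern, text):
            continue
        hits.append((rule.name, rule.action))
        if rule.action == "block":
            return None, hits
        if rule.action == "redact":
            text = re.sub(rule.pattern, "[REDACTED]", text)
    return text, hits
```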

User Spend (4 endpoints)

Per-user spend tracking and caps. The /api/users/{id}/spend endpoint reads from the fast in-memory counter; history reads from daily rollups in the database.

| Method | Path | Description |
|---|---|---|
| GET | /api/users/{id}/spend | Current user spend (today, month, tokens) |
| GET | /api/users/{id}/spend/history | Daily spend rollups (?days=30) |
| GET | /api/users/{id}/spend/cap | Get user spend cap configuration |
| PUT | /api/users/{id}/spend/cap | Set user daily/monthly spend caps |

Lasso — Request Replay (10 endpoints)

Re-run any traced request against a different model. Compare cost, latency, and output. Batch replay against up to 5 models in a single call. Share comparisons as public URLs. Requires Individual tier ($29.99/mo).

| Method | Path | Description |
|---|---|---|
| POST | /api/replay | Replay a single trace against a target model |
| POST | /api/replay/batch | Replay a trace against up to 5 models at once |
| GET | /api/replay/runs | List replay history with optional model filter |
| GET | /api/replay/runs/{id} | Full replay detail including request body |
| GET | /api/replay/compare/{id} | Side-by-side comparison with winner analysis (cost, speed, efficiency) |
| GET | /api/replay/stats | Aggregate stats, model leaderboard, available providers |
| POST | /api/lasso/share | Create shareable URL from a replay run ID |
| POST | /api/lasso/share/custom | Create shareable URL from raw comparison data |
| GET | /api/lasso/share/{id} | Retrieve shared comparison data (JSON) |
| GET | /compare/{id} | Public comparison page (HTML, shareable) |

# Replay a trace against Claude
curl -X POST http://localhost:4200/api/replay \
  -H "X-Admin-Key: $KEY" \
  -H "Content-Type: application/json" \
  -d '{"trace_id":"tr_5213e8a4...","target_model":"claude-sonnet-4-6"}'

# Batch: same prompt → 3 models
curl -X POST http://localhost:4200/api/replay/batch \
  -H "Content-Type: application/json" \
  -d '{"trace_id":"tr_5213...","target_models":["gpt-5.4","claude-sonnet-4-6","gpt-5.4-mini"]}'

# Compare results
curl http://localhost:4200/api/replay/compare/rpl_0ae37547...

# Share the comparison (returns stockyard.dev/compare/{id})
curl -X POST http://localhost:4200/api/lasso/share \
  -H "Content-Type: application/json" \
  -d '{"replay_id":"rpl_0ae37547..."}'
# → {"id":"a1b2c3d4e5f6","url":"/compare/a1b2c3d4e5f6","full_url":"https://stockyard.dev/compare/a1b2c3d4e5f6"}

Drover — Autopilot Routing (10 endpoints, including 2 aliases)

Calibrate models on real traffic, set a quality threshold, and Drover routes every request to the cheapest qualifying model. Three modes: cost, speed, balanced. Three trigger keywords: model=autopilot, model=auto, or the reference model name for transparent downgrade. Requires Pro tier ($99.99/mo).

| Method | Path | Description |
|---|---|---|
| GET | /api/autopilot | Current config, state, scores, and active route |
| PUT | /api/autopilot | Update threshold, candidate models, mode, enable/disable |
| POST | /api/autopilot/calibrate | Run calibration: replays traces against all candidates and scores quality |
| GET | /api/autopilot/scores | Model quality leaderboard with cost, latency, success rate |
| GET | /api/autopilot/decisions | Routing decision history with per-request savings |
| GET | /api/autopilot/stats | Aggregate savings, route distribution, calibration count |
| GET | /api/autopilot/calibration-log | Per-trace calibration detail with quality scores |
| GET | /api/autopilot/cap | Free-tier daily cap: used, remaining, savings, upgrade CTA |
| GET | /api/drover/cap | Alias for /api/autopilot/cap |
| GET | /api/drover/stats | Alias for /api/autopilot/stats |

# Calibrate 3 models against real traffic
curl -X POST http://localhost:4200/api/autopilot/calibrate -H "X-Admin-Key: $KEY"

# Send a request with autopilot routing
curl http://localhost:4200/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"auto","messages":[{"role":"user","content":"hello"}]}'

# Check free-tier cap status and savings
curl http://localhost:4200/api/drover/cap
# → {"used_today":47,"limit":100,"remaining":53,"savings_today":3.20,"capped":false}

# Check what Drover chose and how much you saved (URL quoted so the shell does not expand "?")
curl "http://localhost:4200/api/autopilot/decisions?limit=5"
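
The routing rule described above (pick the cheapest model whose quality clears the threshold) can be sketched in a few lines. The ModelScore fields, the 0 to 1 quality scale, and the numbers in the usage note are illustrative assumptions. Only the cost and speed modes are shown, because the weighting behind balanced mode is not documented here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ModelScore:
    model: str
    quality: float           # calibration score, assumed to lie in [0, 1]
    cost_per_request: float  # assumed unit: USD
    latency_ms: float


def pick_route(scores: Iterable[ModelScore], threshold: float, mode: str = "cost") -> Optional[str]:
    """Return the best qualifying model name, or None if none qualifies."""
    qualifying = [s for s in scores if s.quality >= threshold]
    if not qualifying:
        return None
    if mode == "cost":
        best = min(qualifying, key=lambda s: s.cost_per_request)
    elif mode == "speed":
        best = min(qualifying, key=lambda s: s.latency_ms)
    else:
        # The weighting used by "balanced" is not documented, so it is not guessed here.
        raise ValueError(f"unsupported mode in this sketch: {mode}")
    return best.model
```

With made-up scores of 0.95, 0.93, and 0.88 for three candidates, a threshold of 0.9 excludes the lowest-scoring model even if it is the cheapest, and the cheaper of the remaining two is chosen in cost mode.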
    

Feral — Red-Team Testing (1 free + 12 paid endpoints)

Adversarial security testing for your LLM stack. The quickscan is free on all tiers. Full campaign management requires Team.

| Method | Path | Description |
|---|---|---|
| POST | /api/feral/quickscan | Free 5-probe security audit: instant report card, no tier gating |
| POST | /api/feral/campaigns | Create a red-team campaign (Team+) |
| GET | /api/feral/campaigns | List all campaigns |
| GET | /api/feral/campaigns/{id} | Campaign details + results |
| POST | /api/feral/campaigns/{id}/hunt | Run a multi-generation adversarial hunt |
| GET | /api/feral/attacks | List all attacks across campaigns |
| GET | /api/feral/vulnerabilities | List discovered vulnerabilities |
| GET | /api/feral/stats | Aggregate attack stats |

# Free quickscan: 5 probes, instant report card
# (double quotes so the shell expands $OPENAI_API_KEY)
curl -X POST http://localhost:4200/api/feral/quickscan \
  -H "Content-Type: application/json" \
  -d "{\"api_key\":\"$OPENAI_API_KEY\"}"
# → {"probes_run":5,"blocked":3,"bypassed":2,"grade":"C","score":"3/5",...}

# Full hunt: 29 probes with evolutionary mutation (Team tier)
curl -X POST http://localhost:4200/api/feral/campaigns \
  -H "Content-Type: application/json" \
  -d '{"name":"audit-q1","target_url":"http://localhost:4200/v1/chat/completions"}'
# Replace {id} with the campaign ID returned by the previous call
curl -X POST http://localhost:4200/api/feral/campaigns/{id}/hunt \
  -H "Content-Type: application/json" \
  -d "{\"api_key\":\"$OPENAI_API_KEY\",\"generations\":3}"

Optimize — The Full Loop (2 endpoints)

One endpoint that chains the entire Stockyard optimization loop: observe your traces → score model quality → analyze for cost waste, PII, and errors → recommend actions → optionally apply them → explain everything. This is the "make my stack better" button.

Free on all tiers.

| Method | Path | Description |
|---|---|---|
| GET | /api/optimize | Quick status: traces available, spend, calibration state |
| POST | /api/optimize | Run the full optimization loop. Returns staged report with actions. |

# Check optimization readiness
curl http://localhost:4200/api/optimize

# Run the full loop (analyze only)
curl -X POST http://localhost:4200/api/optimize
# → stages: observe → score → analyze → recommend
# → actions: [{enable_drover, priority:1}, {fix_pii, priority:2}, ...]
# → summary: {grade:"B", projected_savings:"$12.40/mo", actions:4}

# Run and auto-apply safe recommendations
curl -X POST http://localhost:4200/api/optimize \
  -H "Content-Type: application/json" \
  -d '{"apply": true}'
# → applied: ["Drover autopilot enabled", "SecretScan module enabled"]

System (13 endpoints)

Health checks, license info, spending, upgrade triggers, insights, and cache management.

| Method | Path | Description |
|---|---|---|
| GET | /api/apps | List registered apps |
| GET | /api/status | System health status |
| GET | /api/health | Health check (for load balancers) |
| GET | /api/openapi.json | OpenAPI 3.1 specification |
| GET | /api/license | Current license info |
| GET | /api/plans | Available pricing plans |
| GET | /api/spend | Current spending summary |
| GET | /api/spend/history | Spending history |
| GET | /api/upgrade-prompts | Active upgrade triggers |
| GET | /api/insights | Auto-insights: analyzes traces for cost waste, PII, prompt duplication, errors |
| GET | /api/cache/stats | Cache hit/miss statistics |
| DELETE | /api/cache | Clear the response cache |
| GET | /api/products | Product catalog |

Common Examples

Quick reference for the most frequently used endpoints:

# Send a chat completion through the proxy
curl -X POST http://localhost:4200/v1/chat/completions \
  -H "Authorization: Bearer sy_your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'

# Toggle a module on/off
curl -X PUT http://localhost:4200/api/proxy/modules/cachelayer \
  -H "Authorization: Bearer sy_admin_..." \
  -H "Content-Type: application/json" \
  -d '{"enabled": true}'

# Get cost breakdown for the last 7 days
curl "http://localhost:4200/api/observe/costs?days=7" \
  -H "Authorization: Bearer sy_admin_..."

# Create a trust policy
curl -X POST http://localhost:4200/api/trust/policies \
  -H "Authorization: Bearer sy_admin_..." \
  -H "Content-Type: application/json" \
  -d '{"name":"no-pii","action":"block","rules":[{"field":"response.content","pattern":"\\d{3}-\\d{2}-\\d{4}"}]}'

# Create and run a workflow
curl -X POST http://localhost:4200/api/forge/workflows \
  -H "Authorization: Bearer sy_admin_..." \
  -H "Content-Type: application/json" \
  -d '{"name":"test","steps":[{"id":"s1","type":"llm","model":"gpt-4o","prompt":"Say hi"}]}'

# Export your entire config
curl http://localhost:4200/api/config/export \
  -H "Authorization: Bearer sy_admin_..." -o config.json

For the full OpenAPI specification, visit /api/openapi.json on your running instance.

Every endpoint returns JSON with consistent error formatting. See the error responses section above for the standard error schema.

Rate Limits

Management endpoints are rate-limited to 120 requests per minute per admin key. The proxy endpoint (/v1/chat/completions) is limited separately by the rateshield module, which applies configurable per-key limits.

Rate-limited responses include Retry-After and X-RateLimit-Remaining headers.
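
A client can honor the Retry-After header before retrying a 429 response. The helper below is a minimal sketch: the function name and the fallback default are hypothetical, and it assumes the header carries a number of seconds. Retry-After may also be an HTTP date, in which case this sketch falls back to the default.

```python
from typing import Mapping


def retry_delay(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Return how many seconds to wait before retrying.

    Header names are matched case-insensitively. Missing or non-numeric
    values (for example an HTTP date) fall back to the default.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("retry-after")
    if value is None:
        return default
    try:
        return max(0.0, float(value))
    except ValueError:
        return default
```

A typical loop sleeps for retry_delay(response_headers) seconds after a 429 and then retries, ideally with an upper bound on the number of attempts.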
