Most LLM stacks are four services, three databases, and a container orchestrator. Stockyard is a single binary with embedded SQLite. Deploy your entire AI control layer by copying one file.
Install in 5 minutesIf you want an LLM proxy, observability, guardrails, and cost tracking today, you are looking at assembling multiple services. A proxy layer. A tracing backend. A database for logs. A cache layer. Maybe a separate auth service. Each one needs its own deploy, its own config, its own monitoring, its own failure modes.
That is a lot of moving parts for what is, at its core, a middleware problem: take a request, run it through some controls, forward it to a provider, record what happened.
Stockyard puts all of that in one process. Not as a monorepo of microservices crammed into a Docker Compose file. As a single compiled binary that stores everything in one SQLite file.
The web console, API server, proxy, tracing, audit ledger, caching, guardrails, and every product listed on the platform page are compiled into one executable. Nothing is downloaded at runtime. Nothing phones home.
No docker-compose.yml with six services. No Helm chart. No Terraform module. No environment matrix. One binary, one environment variable for your provider key, and you have a running LLM proxy with tracing, caching, guardrails, and a web console.
If you prefer Docker, that works too:
One container. One volume. Same binary inside.
Everything Stockyard knows — traces, costs, audit logs, module configs, API keys (encrypted), product state — lives in one SQLite file. Backing up is literally copying a file.
No pg_dump. No managed snapshots. No point-in-time recovery configuration. The file is the database. Copy the file, you have a backup. Copy it back, you have a restore.
LLM proxy service (LiteLLM, custom gateway)
Postgres or MySQL for configuration
Redis for caching and rate limiting
Observability backend (Langfuse, Helicone, or self-hosted)
Separate auth layer or API gateway
Container orchestrator to run it all
docker-compose.yml or Helm chart
5-7 services. 3+ databases. Multiple failure modes. One team to keep it running.
One binary: stockyard
One data file: stockyard.db
Caching built in (semantic + exact match)
Tracing, cost tracking, audit ledger built in
Auth, rate limiting, guardrails built in
Runs anywhere a Linux binary runs
curl install.sh | sh && stockyard
1 process. 1 file. The same middleware chain, the same controls, the same data — in ~25MB.
A single binary is not a limitation. It is a deliberate architectural choice. Here is why it holds up:
SQLite handles the load. Stockyard uses WAL mode for concurrent reads and writes. For an LLM proxy, the database is never the bottleneck — the upstream provider API is. A typical LLM request takes 1-30 seconds waiting for the provider to respond. SQLite comfortably handles the read/write profile of that workload on a single node.
One process means one failure mode. It is either running or it is not. There is no partial failure where the proxy is up but the tracing database is down, or the cache is stale because Redis restarted. Everything starts together, fails together, recovers together.
Upgrades are atomic. Download the new binary, stop the old one, start the new one. The SQLite database migrates automatically on startup. There is no migration coordination between services, no version matrix, no "make sure the proxy version matches the tracing backend version."
~200ms proxy overhead. The full middleware chain — rate limiting, cost tracking, prompt logging, failover detection, content filtering — adds roughly 200ms of latency. The rest of the time is spent waiting for the LLM provider to respond, which is typically 1-30 seconds.
Stockyard is designed for solo developers, small teams, and startups running on one to a few nodes. If you need horizontal scaling across dozens of machines, multi-region active-active replication, or sub-millisecond proxy overhead, a single-binary architecture is not the right fit.
For everyone else — which is most teams using LLM APIs — the simplicity of one binary, one file, and one process is a feature, not a constraint. You can always add complexity later. You cannot always take it away.
One binary, one SQLite file, free tier.
Install in 5 minutes