What is an LLM proxy?

A technical explanation of what LLM proxies do, how they work, and when you need one.

The short version

An LLM proxy is a server that sits between your application and LLM providers like OpenAI, Anthropic, and Google. Your app sends requests to the proxy instead of directly to the provider. The proxy forwards the request, logs it, and returns the response.

This indirection layer lets you add routing, caching, cost tracking, rate limiting, safety filters, and observability without changing your application code.

How it works

Your app already calls an API endpoint to reach the LLM provider, typically /v1/chat/completions for OpenAI-compatible APIs. An LLM proxy implements the same API. You change one URL in your app config, and all requests now flow through the proxy.
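Concretely, the switch is one setting. The official OpenAI SDKs read the base URL from the `OPENAI_BASE_URL` environment variable; the proxy address below is a placeholder for wherever your proxy listens:

```shell
# Point OpenAI-compatible clients at the proxy instead of api.openai.com.
# (http://localhost:8080/v1 is a placeholder proxy address.)
export OPENAI_BASE_URL="http://localhost:8080/v1"

# Everything else, including the API key header, stays the same:
curl "$OPENAI_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hi"}]}'
```

Because only the base URL changes, rolling the proxy back out is the same one-line change in reverse.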

The proxy can then do any combination of: route to different providers based on model name, cache identical requests, enforce spend limits, redact PII, log every request for debugging, and fail over to backup providers when one goes down.
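The routing step can be as simple as a prefix match on the model name. This is an illustrative sketch, not any product's actual logic; the table shape and default are assumptions, though the provider base URLs are the real public endpoints:

```python
# Sketch of model-name routing inside a proxy.
# The routing-table shape and fallback are illustrative assumptions.
UPSTREAMS = {
    "gpt-": "https://api.openai.com/v1",
    "claude-": "https://api.anthropic.com/v1",
    "gemini-": "https://generativelanguage.googleapis.com/v1beta",
}
DEFAULT_UPSTREAM = "https://api.openai.com/v1"

def route(model: str) -> str:
    """Pick the upstream base URL for a request by model-name prefix."""
    for prefix, base_url in UPSTREAMS.items():
        if model.startswith(prefix):
            return base_url
    return DEFAULT_UPSTREAM

print(route("claude-sonnet-4"))  # -> https://api.anthropic.com/v1
```

Failover follows the same pattern: if the chosen upstream returns an error or times out, the proxy retries the request against the next entry in a fallback list.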

Because the proxy speaks the same API as the provider, your application code does not need to know the proxy exists. Any SDK that speaks the OpenAI-compatible API works with the proxy unchanged.

When you need one

You probably need an LLM proxy if any of these are true: you use more than one LLM provider, you need to track costs per request, you want to cache responses to reduce latency and spend, you need audit logs for compliance, or you want to add safety guardrails without modifying application code.
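Response caching, for example, usually keys on a hash of the canonicalized request body, so byte-identical prompts hit the cache regardless of key ordering in the JSON. A minimal sketch; real proxies typically also fold sampling parameters and selected headers into the key, which is omitted here:

```python
import hashlib
import json

def cache_key(body: dict) -> str:
    """Hash a canonicalized request body; identical requests share a key."""
    # sort_keys makes the key independent of JSON field order.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = cache_key({"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hi"}]})
b = cache_key({"messages": [{"role": "user", "content": "hi"}], "model": "gpt-4o-mini"})
assert a == b  # same request, different field order, same cache key
```

The same per-request hook is where cost tracking fits: the proxy reads the token counts from the provider's response and attributes spend to the caller before returning it.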

You probably do not need one if you are calling a single provider with a few hundred requests per day and have no compliance requirements.

Self-hosted vs managed

Some LLM proxies run as managed SaaS (Portkey, Helicone). Others are self-hosted (Stockyard, LiteLLM). The tradeoff is operational overhead vs data control. With a managed proxy, your prompts and completions flow through a third party. With a self-hosted proxy, everything stays on your infrastructure.

Stockyard is a self-hosted LLM proxy that ships as a single binary with embedded SQLite. No external database, no Docker, no SaaS dependency. Install it in under 60 seconds.

LLM proxy vs LLM gateway

These terms are often used interchangeably. Some products call themselves gateways to emphasize API management features (authentication, rate limiting, request transformation). Others call themselves proxies to emphasize transparent request forwarding. For a detailed breakdown, see our LLM gateway vs proxy comparison.

Try Stockyard. One binary, 16 providers, under 60 seconds.

Get Started

Compare: vs LiteLLM · vs Kong · vs AWS Bedrock · Best self-hosted proxy

Explore: OpenAI proxy · Anthropic proxy · Install guide
Stockyard also makes 150 focused self-hosted tools — browse the catalog or get everything for $29/mo.