Add cost tracking, caching, failover, and 76 middleware modules to your Groq requests. One URL change, no SDK swap.
Groq runs open-source models on custom LPU hardware with extremely low latency. Proxying through Stockyard adds cost tracking (Groq is cheap but not free), response caching (save even more), and failover to other providers when Groq hits rate limits.
Groq is already OpenAI-compatible, so the translation overhead is minimal. Stockyard adds the operational layer that Groq does not provide: per-request logging, audit trails, and middleware modules.
```sh
# Install Stockyard
curl -fsSL stockyard.dev/install.sh | sh

# Set your Groq API key
export GROQ_API_KEY=your-key-here

# Start the proxy
stockyard
# Provider: groq (from GROQ_API_KEY)
# Proxy listening on :4200

# Send a request through the proxy
curl http://localhost:4200/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.3-70b-versatile","messages":[{"role":"user","content":"hello"}]}'
```
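Because Groq already speaks the OpenAI wire format, "one URL change" really is the whole migration. A minimal sketch of that idea, assuming the proxy address from the quickstart above and Groq's documented OpenAI-compatible base URL:

```python
# Sketch: the same OpenAI-style payload works against Groq directly or
# through the proxy -- only the base URL differs.
# PROXY_BASE comes from the quickstart above (:4200); GROQ_BASE is Groq's
# documented OpenAI-compatible endpoint.
GROQ_BASE = "https://api.groq.com/openai/v1"
PROXY_BASE = "http://localhost:4200/v1"

def chat_url(base: str) -> str:
    """Build the chat-completions endpoint for a given base URL."""
    return f"{base.rstrip('/')}/chat/completions"

payload = {
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "hello"}],
}

# Same payload, same path; the only difference is the host.
print(chat_url(GROQ_BASE))   # https://api.groq.com/openai/v1/chat/completions
print(chat_url(PROXY_BASE))  # http://localhost:4200/v1/chat/completions
```

In practice this means an OpenAI-compatible SDK pointed at Groq only needs its base URL swapped to the proxy; the model name, messages, and response shape are untouched.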
Groq enforces aggressive rate limits on free tiers. Stockyard's rate-limiting module smooths out request bursts, and the response cache reduces redundant API calls.
Groq's LPU hardware delivers sub-second responses, but free-tier rate limits can throttle you at 30 requests per minute. Stockyard helps in two ways:
Identical prompts return cached responses instantly. For iterative development, this can cut your effective request count by 50-80%.
When Groq returns a 429 rate limit error, Stockyard automatically retries on your fallback provider (OpenAI, DeepSeek, etc.) so your app never sees the error.
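The two mechanisms above compose naturally: check the cache first, and only on a miss walk the provider list, falling through to the next provider on a 429. A minimal sketch of that flow (the provider functions here are simulated stand-ins, not real API calls, and the cache-key scheme is an illustrative assumption, not Stockyard's actual implementation):

```python
import hashlib
import json

def cache_key(payload: dict) -> str:
    # Identical prompts serialize identically, so repeats hit the cache.
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

class RateLimited(Exception):
    """Stands in for an HTTP 429 from a provider."""

def call_with_failover(payload: dict, providers: list, cache: dict) -> dict:
    key = cache_key(payload)
    if key in cache:
        return cache[key]            # cache hit: no API call at all
    last_err = None
    for call in providers:           # primary first, then fallbacks
        try:
            result = call(payload)
            cache[key] = result
            return result
        except RateLimited as err:   # e.g. Groq throttling at 30 req/min
            last_err = err
    raise last_err

# Simulated providers for illustration only:
def groq(payload):
    raise RateLimited("429 Too Many Requests")   # pretend Groq is throttled

def openai_fallback(payload):
    return {"provider": "openai", "content": "hello back"}

cache = {}
req = {"messages": [{"role": "user", "content": "hi"}]}
first = call_with_failover(req, [groq, openai_fallback], cache)
print(first["provider"])   # openai -- the 429 never reached the caller
second = call_with_failover(req, [groq, openai_fallback], cache)
print(second is first)     # True -- identical prompt served from cache
```

The caller never sees the 429; it only sees a successful response, whichever provider (or cache entry) produced it.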
Route Groq through Stockyard in under 60 seconds.
Install Guide · All 16 providers · Proxy-only mode · What is an LLM proxy? · vs LiteLLM · vs Helicone