Developer Documentation

One endpoint.
Every LLM provider.

Integrate Llummo with a single POST request. No SDKs to wrap, no libraries to install — just point your existing LLM calls at our proxy endpoint and get real-time cost tracking for every provider you use.

POST https://llummo.com/api/proxy
< 5ms overhead · Keys never exposed · Streaming supported · Live cost tracking · AES-256-GCM encryption

Endpoint

https://llummo.com/api/proxy

Method

POST

Auth header

Authorization: Bearer <proxy-key>

Streaming

Set payload.stream: true

Providers

openai · anthropic · cohere · mistral

Cost tags

_meta.routeName + functionName

Get started

Up and running in 4 steps

From zero to tracked API calls in under 5 minutes.

01

Add your provider API keys

In the dashboard, go to API Keys and add keys for the providers you use. These are stored encrypted — you will only see the last 4 characters after saving.

Dashboard → API Keys → Add Provider Key
02

Generate a proxy key

Create a named proxy key for each app or service. This is the bearer token your code will use — your real provider keys stay on the server and are never shared.

Dashboard → API Keys → Proxy Keys → New Key
03

Send requests to the proxy

Replace direct provider calls with a single POST to the Llummo endpoint. Pass your proxy key as the Bearer token. All requests are forwarded transparently.

POST https://llummo.com/api/proxy
Authorization: Bearer YOUR_PROXY_KEY
04

Track costs in real-time

Every request is logged with token counts, cost, model, and your _meta labels. Open the dashboard to see live spend charts, per-model breakdowns, and anomaly alerts.

Dashboard → Usage → Daily cost & breakdowns

Provider integrations

Every major LLM. One endpoint.

Switch providers without touching your observability setup. Pass the provider name in the request body — everything else stays the same.

OpenAI · Anthropic · Mistral · Cohere — streaming, usage tracking, and tool calls are supported across all four, with links to each provider's docs.
OpenAI

Forwarded to OpenAI — your real API key is never exposed to your app.
const res = await fetch("https://llummo.com/api/proxy", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_PROXY_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    provider: "openai",
    payload: {
      model: "gpt-4o",
      messages: [{ role: "user", content: "Summarize this article." }],
    },
    _meta: {
      routeName: "/api/summarize",
      functionName: "generateSummary",
    },
  }),
});

const data = await res.json();
console.log(data.choices[0].message.content);
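To switch this call to another provider, change only `provider` and the native payload shape — the endpoint, headers, and `_meta` labels stay identical. A minimal sketch (`buildProxyBody` is a hypothetical helper, and the Anthropic model name is illustrative):

```typescript
// Hypothetical helper for illustration; the field names (provider,
// payload, _meta) match the proxy's request shape documented above.
function buildProxyBody(
  provider: "openai" | "anthropic" | "mistral" | "cohere",
  payload: object,
  meta?: { routeName?: string; functionName?: string },
): string {
  // Same endpoint and headers for every provider; only this body changes.
  return JSON.stringify({ provider, payload, _meta: meta });
}

const body = buildProxyBody(
  "anthropic",
  {
    // Native Anthropic Messages payload; model name is illustrative.
    model: "claude-3-5-sonnet-latest",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Summarize this article." }],
  },
  { routeName: "/api/summarize", functionName: "generateSummary" },
);
```

The same `body` is then POSTed to https://llummo.com/api/proxy with your proxy key as the bearer token, exactly as in the OpenAI example.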

Framework integration

Vercel AI SDK

The passthrough proxy is fully compatible with @ai-sdk/* providers. Set the baseURL to your Llummo proxy URL — every generateText and streamText call is tracked automatically.

Zero logic changes

Keep using generateText, streamText, and streamObject as-is.

Full cost visibility

Every token, model, and call is logged in your dashboard in real time.

Keys stay server-side

Your real provider keys are stored encrypted and never sent to your app.

OpenAI · @ai-sdk/openai · npm install @ai-sdk/openai ai
import { createOpenAI } from '@ai-sdk/openai';

// Configure once — reuse throughout your app
const openai = createOpenAI({
  baseURL: 'https://llummo.com/api/proxy/openai/v1',
  apiKey: process.env.LLUMMO_PROXY_KEY,
});

export { openai };

Google auth note

@ai-sdk/google sends its key in x-goog-api-key, not Authorization: Bearer. Pass your proxy key via the headers option and set apiKey to any non-empty string — it gets stripped server-side and your real Google key is injected automatically. OpenAI, Anthropic (authToken), and Mistral all use Bearer natively.
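A sketch of that wiring, assuming the Google base path mirrors the other providers (the exact proxy path for Google may differ — check your dashboard):

```typescript
import { createGoogleGenerativeAI } from '@ai-sdk/google';

const google = createGoogleGenerativeAI({
  // Assumed path, following the passthrough pattern for other providers
  baseURL: 'https://llummo.com/api/proxy/google',
  apiKey: 'unused', // any non-empty string; stripped server-side
  headers: {
    // The proxy key travels in the header Google's SDK actually sends
    'x-goog-api-key': process.env.LLUMMO_PROXY_KEY ?? '',
  },
});

export { google };
```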

Cost attribution

Tag requests with _meta

Add optional labels to any request so you can break down costs by route and function in the dashboard. The _meta field is stripped before forwarding — providers never see it.

Your request

{
  "provider": "openai",
  "payload": { ... },
  "_meta": {
    "routeName": "/api/summarize",
    "functionName": "generateSummary"
  }
}
_meta is stripped

What the provider receives

{
  "model": "gpt-4o",
  "messages": [...],
  // _meta removed — clean native request
}

Both fields are optional. When present they appear on every usage log entry, letting you filter spend by endpoint or function in the dashboard.

routeName · string?

The API route that triggered this call, e.g. /api/summarize

functionName · string?

The function within that route, e.g. generateSummary
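Conceptually, the strip step is a destructuring: `_meta` feeds the usage log, `payload` goes to the provider untouched. A sketch with illustrative names (not Llummo's server code):

```typescript
type ProxyRequest = {
  provider: string;
  payload: Record<string, unknown>;
  _meta?: { routeName?: string; functionName?: string };
};

// _meta never reaches the provider; it only labels the usage log entry.
function splitMeta(req: ProxyRequest) {
  return { forwarded: req.payload, labels: req._meta ?? {} };
}

const { forwarded, labels } = splitMeta({
  provider: "openai",
  payload: { model: "gpt-4o", messages: [] },
  _meta: { routeName: "/api/summarize", functionName: "generateSummary" },
});
```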

Dashboard breakdown

Once tagged, you can filter the Usage page by route name or function name to see exactly which part of your app is driving costs — down to the cent.

Authentication

Proxy keys keep your secrets safe

You authenticate with a proxy key — a revocable bearer token tied to your account. Your real provider API keys are never sent to your app or exposed in client code.

How to use a proxy key

http
POST https://llummo.com/api/proxy
Authorization: Bearer YOUR_PROXY_KEY
Content-Type: application/json

One-time reveal

Your proxy key is shown in full only once, at creation. Copy it immediately and store it in your environment variables. If you lose it, rotate it from the dashboard — the old key is invalidated instantly.

One key per integration

Create a separate proxy key for each app, service, or environment. Usage from each key is tracked independently — useful for isolating cost per product or team.

Rotate without downtime

Generate a new key at any time. The old key is revoked immediately. Update your environment variable and redeploy — no other changes needed.

Keys never reach your app

Your real provider API keys live only on the server side. Client apps and backend services only ever hold a proxy key — rotating it has zero impact on the provider account.

Rate limits

Know your limits, handle them gracefully

Llummo enforces per-account rate limits to ensure fair usage. Limits vary by plan — always check the response headers and handle 429s in your code.

Plan | Req / min | Req / day
Starter | 60 | 5,000
Growth | 300 | 50,000
Scale | 1,000 | Unlimited

429 Too Many Requests

When rate limited you will receive a 429 with a Retry-After header in seconds. Back off and retry after that window.

Handle rate limit errors in your code with a simple retry loop:

Retry on 429
typescript
async function callProxy(body: object, retries = 3) {
  const res = await fetch("https://llummo.com/api/proxy", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.PROXY_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });

  if (res.status === 429 && retries > 0) {
    const wait = Number(res.headers.get("Retry-After") ?? 1);
    await new Promise((r) => setTimeout(r, wait * 1000));
    return callProxy(body, retries - 1);
  }

  return res.json();
}

Security & encryption

Your keys are safe with us

Provider API keys are treated as secrets from the moment they are saved. Here is exactly what we do — and do not do — with them.

AES-256-GCM at rest

Every provider API key is encrypted with AES-256-GCM before being written to the database. The encryption key is never stored alongside the data.
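For intuition, an AES-256-GCM round trip with Node's built-in crypto module looks like this (an illustration of the scheme, not Llummo's code):

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// In production the key would live in a KMS or environment secret,
// never alongside the encrypted data.
const key = randomBytes(32);
const iv = randomBytes(12); // a fresh random IV per encryption is required for GCM

const cipher = createCipheriv("aes-256-gcm", key, iv);
const ciphertext = Buffer.concat([
  cipher.update("sk-your-real-provider-key", "utf8"),
  cipher.final(),
]);
const authTag = cipher.getAuthTag(); // tamper-detection tag, stored with the ciphertext

const decipher = createDecipheriv("aes-256-gcm", key, iv);
decipher.setAuthTag(authTag);
const plaintext = Buffer.concat([
  decipher.update(ciphertext),
  decipher.final(),
]).toString("utf8");
```

GCM's auth tag means any bit flip in the stored ciphertext makes decryption fail loudly instead of returning garbage.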

Never exposed in the UI

Once saved, you only ever see the last 4 characters of a key in the dashboard. The full value is never returned by any API endpoint.

Proxy keys are hashed

Proxy keys are stored as SHA-256 hashes. Even if the database were compromised, raw proxy tokens cannot be recovered from it.
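The hash-only storage pattern can be sketched with Node's built-in crypto module (an illustration of the idea, not Llummo's code):

```typescript
import { createHash } from "node:crypto";

const sha256 = (s: string): string =>
  createHash("sha256").update(s).digest("hex");

// At creation time, only the digest of the raw proxy key is persisted.
const storedHash = sha256("pk_your_proxy_key");

// On each request, hash the presented token and compare digests;
// the raw key is never recoverable from storedHash.
function verify(presented: string): boolean {
  return sha256(presented) === storedHash;
}
```

A production check would use a constant-time comparison (e.g. crypto.timingSafeEqual) rather than `===` to avoid timing side channels.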

Keys never leave the server

Provider API keys are decrypted only at the moment a request is forwarded — in memory, never logged. Your app code and client never see them.

TLS in transit

All traffic between your app and the Llummo proxy, and between the proxy and providers, is encrypted with TLS 1.2+.

Rotate at any time

Both provider keys and proxy keys can be rotated instantly from the dashboard. Old credentials are invalidated immediately on rotation.

API Reference

Request & response shape

POST https://llummo.com/api/proxy (requires Authorization: Bearer <proxy-key>)
Field | Type | Description
provider | string | Target provider: openai · anthropic · mistral · cohere
payload | object | The native request body, sent verbatim to the provider API
_meta.routeName | string? | Label for the calling API route, e.g. /api/summarize
_meta.functionName | string? | Label for the calling function, e.g. generateSummary

Response

The unmodified provider response, returned with the original HTTP status code.

choices / content / message: Native provider response body, passed through exactly as received
usage: Token counts from the provider, logged for your dashboard
(errors): Provider error shapes forwarded with the original status code

Error responses

401 · Missing or invalid proxy key
{"error":"Unauthorized."}
400 · Missing required fields
{"error":"provider and payload are required."}
429 · Too many requests — check Retry-After header
{"error":"Rate limit exceeded."}
403 · Feature requires a higher plan
{"error":"...","upgrade":true}
404 · Provider key not saved in dashboard yet
{"error":"No openai API key found."}
502 · Upstream provider returned an error
{"error":"Provider error."}
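A caller can branch on these statuses before touching the response body. A minimal sketch — the action names are suggestions, not required behavior:

```typescript
type ProxyErrorAction =
  | "fix-auth" | "fix-request" | "retry" | "upgrade" | "add-key" | "upstream" | "ok";

// Maps proxy error statuses from the table above to a suggested action.
function classifyProxyError(status: number): ProxyErrorAction {
  switch (status) {
    case 401: return "fix-auth";    // missing or invalid proxy key
    case 400: return "fix-request"; // provider / payload missing from the body
    case 429: return "retry";       // back off for Retry-After seconds first
    case 403: return "upgrade";     // feature gated behind a higher plan
    case 404: return "add-key";     // provider key not saved in the dashboard
    case 502: return "upstream";    // provider-side error, often transient
    default:  return "ok";
  }
}
```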

CLI

Manage everything from your terminal

The llummo CLI lets you configure projects, manage proxy keys, and inspect usage without opening a browser.

Installation (npm)

terminal
bash
# Run once with npx (no global install needed)
npx llummo <command>

# Or install globally
npm install -g llummo

Published on npm as llummo

1. Authenticate

Generate a CLI token in Dashboard → Settings → CLI Tokens, then run:

terminal
bash
llummo login
# Paste your token when prompted

2. Init a project

Detects installed AI SDKs, lets you pick or create a proxy key, writes .env.local, and prints the exact code change needed.

terminal
bash
cd my-app
llummo init

# Detected SDK: openai
# ? Which proxy key should this project use? › my-app-key
# ✓ Wrote to .env.local
#
# Update your AI client:
# const client = new OpenAI({
#   apiKey: process.env.LLUMMO_PROXY_KEY,
#   baseURL: process.env.LLUMMO_PROXY_URL,
# });

3. Native passthrough — zero code changes

For terminal tools (Claude Code, Aider, Cursor, Cline, Continue.dev) or any SDK that reads provider env vars, Llummo exposes a native passthrough route. Point the tool's base URL at Llummo and use your proxy key as the API key — no wrapper format needed. Usage is still logged and spend limits still apply.

.env or shell
bash
# OpenAI-compatible tools (GPT, Mistral, Cohere …)
OPENAI_API_KEY=pk_your_proxy_key
OPENAI_BASE_URL=https://your-app.com/api/proxy/openai

# Anthropic / Claude Code
ANTHROPIC_API_KEY=pk_your_proxy_key
ANTHROPIC_BASE_URL=https://your-app.com/api/proxy/anthropic

# Mistral
MISTRAL_API_KEY=pk_your_proxy_key
MISTRAL_BASE_URL=https://your-app.com/api/proxy/mistral

# Google Gemini
GOOGLE_API_KEY=pk_your_proxy_key
GOOGLE_GENERATIVE_AI_BASE_URL=https://your-app.com/api/proxy/google

llummo init can write these for you automatically — it detects installed SDKs and terminal tools and offers to set up passthrough env vars in the same step.

Claude Code example

Export these in your shell (or add to ~/.zshrc) and every claude session is automatically tracked through Llummo.

~/.zshrc
bash
export ANTHROPIC_API_KEY=pk_your_proxy_key
export ANTHROPIC_BASE_URL=https://your-app.com/api/proxy/anthropic

Command reference

llummo login

Authenticate with a personal access token. Saves the token to ~/.config/llummo/config.json.

llummo whoami

Print the currently authenticated account email and plan.

llummo init

Configure the proxy in the current project. Detects SDKs, picks/creates a key, writes .env.local, prints snippet.

llummo keys list

List all proxy keys for the current account.

llummo keys create [name]

Create a new proxy key. The raw key is shown once — store it immediately.

llummo keys delete [id]

Revoke a proxy key. Prompts for selection if no ID is provided.

llummo status

Print spend and token usage for the current period.

Flags: --from <YYYY-MM-DD> · --to <YYYY-MM-DD>

Personal Access Tokens

CLI tokens are scoped to your account and use the same plan limits as your dashboard session. They never expire but can be revoked at any time from Settings → CLI Tokens. Each token is shown once at creation — store it in a password manager.

Ready to integrate

Start tracking costs today

Create an account, add your provider keys, and your first tracked call is minutes away.