Cost Tracking
How OpenGateway records LLM request cost from provider token usage, model pricing, cached token counts, and platform fees.
OpenGateway records request cost from model pricing and token usage. The dashboard and log detail views use those records to show spend over time and per-request cost.
How a request is priced#
```
request cost = prompt_tokens × prompt_rate
             + completion_tokens × completion_rate
             + cached_tokens × cached_rate   (when applicable)
             + platform_fee                  (a few percent)
```

Every term comes from a known source:
- Token counts are reported by the provider in the response body
- Rates come from the model price table maintained by the control plane
- Platform fee is added on top of provider cost
You can inspect request cost from the log detail view.
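The pricing formula can be sketched in code. This is an illustrative calculation only, not OpenGateway's actual implementation; the rates and fee percentage below are example values (per 1K tokens), not real pricing.

```python
def request_cost(prompt_tokens, completion_tokens, cached_tokens=0,
                 prompt_rate=0.005, completion_rate=0.015,
                 cached_rate=0.00125, platform_fee_pct=0.03):
    """Illustrative request-cost calculation; rates are per 1K tokens."""
    # Uncached prompt tokens, completion tokens, and cached tokens
    # each get their own rate.
    provider_cost = (prompt_tokens * prompt_rate
                     + completion_tokens * completion_rate
                     + cached_tokens * cached_rate) / 1000
    # The platform fee is a percentage added on top of provider cost.
    return provider_cost * (1 + platform_fee_pct)

print(round(request_cost(1200, 300), 6))  # 0.010815
```

The token counts come from the provider response, the rates from the model price table, and the fee from your plan, so every input is visible in the log detail view.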
Where to see it#
Use these views:
| View | What it shows | Best for |
|---|---|---|
| Dashboard | Cost over the selected time range | At-a-glance check |
| Logs | Per-request cost detail | Investigating a spike |
Per-team, per-key, per-user#
Every log row carries:
- `team_id` — which team made the call
- `api_key_id` — which key made the call
- `user_id` — value of `x-opengateway-user-id` (when you sent it)
- `session_id` — value of `x-opengateway-session-id` (when you sent it)
Use the logs filters to narrow cost by API key, user ID, session ID, model, provider, and time range.
See HTTP Headers for how to attach those values.
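A minimal sketch of attaching those attribution headers from a client. The header names are the ones the gateway records; the function name and key value here are hypothetical.

```python
def tracking_headers(api_key, user_id=None, session_id=None):
    """Build request headers that attribute spend in OpenGateway logs.

    The x-opengateway-* headers are copied by the gateway into the
    user_id and session_id fields of each log row.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    if user_id:
        headers["x-opengateway-user-id"] = user_id
    if session_id:
        headers["x-opengateway-session-id"] = session_id
    return headers

# Pass the result to whatever HTTP client you call the gateway with, e.g.:
# requests.post(GATEWAY_URL, headers=tracking_headers("og-key", "user-42", "sess-1"), json=payload)
```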
Token math, in detail#
Models charge differently for input and output tokens. A typical chat completion:
```
prompt:       1,200 tokens × $0.005 / 1K = $0.0060
completion:     300 tokens × $0.015 / 1K = $0.0045
──────────────────────────────────────────
request total                      $0.0105
+ platform fee (3%)                $0.000315
──────────────────────────────────────────
recorded cost                      $0.010815
```

Cached tokens#
When the provider supports prompt caching (OpenAI, Anthropic), cache hits get their own token counter and a discounted rate:
```
prompt:        1,200 tokens × $0.005 / 1K   = $0.0060
cached prompt:   800 tokens × $0.00125 / 1K = $0.0010
completion:      300 tokens × $0.015 / 1K   = $0.0045
```

You do not configure caching at the gateway — the provider handles cache hits transparently. The gateway just reports them.
Failover changes the math#
When a fallback fires, each attempt is recorded as its own log entry with its own cost, so a single logical request can produce more than one cost record. Use the request log and cost detail to inspect the cost attached to each recorded attempt.
What to know#
Are token counts from the provider trustworthy?#
OpenGateway reports the provider's token counts as-is. We do not re-tokenize on our side; doing so would invite drift between our number and theirs. If OpenAI says 1,200 tokens, the dashboard says 1,200 tokens.
Why is the cost slightly higher than the provider's published rate?#
The gateway adds a platform fee on top of provider cost. See Billing.
Can I see spend per Git branch / per environment?#
Yes — issue separate API keys for prod, staging, and dev, and filter logs by API key.