Cost Tracking
How OpenGateway records LLM request cost from provider token usage, model pricing, cached token counts, and platform fees.
OpenGateway records request cost from model pricing and token usage. The dashboard and log detail views use those records to show spend over time and per-request cost.
How a request is priced#
```
request cost = prompt_tokens × prompt_rate
             + completion_tokens × completion_rate
             + cached_tokens × cached_rate   (when applicable)
             + platform_fee                  (a few percent)
```

Every term comes from a known source:
- Token counts are reported by the provider in the response body
- Rates come from the model price table maintained by the control plane
- Platform fee is added on top of provider cost
You can inspect request cost from the log detail view.
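The pricing formula can be sketched in code. This is an illustrative calculation only, not OpenGateway's actual implementation; the rates and fee percentage below are example values (per 1K tokens), not real pricing.

```python
def request_cost(prompt_tokens, completion_tokens, cached_tokens=0,
                 prompt_rate=0.005, completion_rate=0.015,
                 cached_rate=0.00125, platform_fee_pct=0.03):
    """Illustrative request-cost calculation; rates are per 1K tokens."""
    # Uncached prompt tokens, completion tokens, and cached tokens
    # each get their own rate.
    provider_cost = (prompt_tokens * prompt_rate
                     + completion_tokens * completion_rate
                     + cached_tokens * cached_rate) / 1000
    # The platform fee is a percentage added on top of provider cost.
    return provider_cost * (1 + platform_fee_pct)

print(round(request_cost(1200, 300), 6))  # 0.010815
```

The token counts come from the provider response, the rates from the model price table, and the fee from your plan, so every input is visible in the log detail view.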
Where to see it#
Use these views:
| View | What it shows | Best for |
|---|---|---|
| Dashboard | Cost over the selected time range | At-a-glance check |
| Logs | Per-request cost detail | Investigating a spike |
Per-team, per-key, per-user#
Every log row carries:
- `team_id` — which team made the call
- `api_key_id` — which key made the call
- `user_id` — value of `x-opengateway-user-id` (when you sent it)
- `session_id` — value of `x-opengateway-session-id` (when you sent it)
Use the logs filters to narrow cost by API key, user ID, session ID, model, provider, and time range.
See HTTP Headers for how to attach those values.
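A minimal sketch of attaching those attribution headers from a client. The header names are the ones the gateway records; the function name and key value here are hypothetical.

```python
def tracking_headers(api_key, user_id=None, session_id=None):
    """Build request headers that attribute spend in OpenGateway logs.

    The x-opengateway-* headers are copied by the gateway into the
    user_id and session_id fields of each log row.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    if user_id:
        headers["x-opengateway-user-id"] = user_id
    if session_id:
        headers["x-opengateway-session-id"] = session_id
    return headers

# Pass the result to whatever HTTP client you call the gateway with, e.g.:
# requests.post(GATEWAY_URL, headers=tracking_headers("og-key", "user-42", "sess-1"), json=payload)
```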
Token math, in detail#
Models charge differently for input and output tokens. A typical chat completion:
```
prompt:       1,200 tokens × $0.005 / 1K = $0.0060
completion:     300 tokens × $0.015 / 1K = $0.0045
──────────────────────────────────────────
request total                      $0.0105
+ platform fee (3%)                $0.000315
──────────────────────────────────────────
recorded cost                      $0.010815
```

Cached tokens#
When the provider supports prompt caching (OpenAI, Anthropic), cache hits get their own token counter and a discounted rate:
```
prompt:        1,200 tokens × $0.005 / 1K   = $0.0060
cached prompt:   800 tokens × $0.00125 / 1K = $0.0010
completion:      300 tokens × $0.015 / 1K   = $0.0045
```

You do not configure caching at the gateway — the provider handles cache hits transparently. The gateway just reports them.
Failover changes the math#
When a fallback fires, each attempt is recorded as its own log entry with its own cost, so a single logical request can produce more than one cost record. Use the request log and cost detail to inspect the cost attached to each recorded attempt.
What to know#
Are token counts from the provider trustworthy?#
OpenGateway reports the provider's token counts as-is. We do not re-tokenize on our side; doing so would invite drift between our number and theirs. If OpenAI says 1,200 tokens, the dashboard says 1,200 tokens.
Why is the cost slightly higher than the provider's published rate?#
The gateway adds a platform fee on top of provider cost. See Billing.
Can I see spend per Git branch / per environment?#
Yes — issue separate API keys for prod, staging, and dev, and filter logs by API key.