How model usage is billed

Model usage is billed from your account’s single, tenant-wide USD credit pool, the same balance that funds everything else an agent does. There’s no separate per-provider or per-token invoice to reconcile.

As your agents work, several kinds of usage draw that shared balance down:

  • Model usage — requests to the LLM providers Alfe routes to.
  • Compute — running managed agents.
  • Voice, channels, storage, and other platform usage.

Because it’s a single pool, you have one place to watch your balance and one place to top it up.

Each model request is metered by how much it consumes; for language models, that is driven by the amount of text processed. Alfe applies the rate for the model you used and deducts the cost from your credit pool. Rates vary by model, so a faster, lighter model typically draws down your balance more slowly than a top-tier one, which is why picking the right default model per agent matters.
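To make the arithmetic concrete, here is a minimal Python sketch of metering and deduction. It assumes usage is counted in tokens and priced per million tokens. The model names, the rates, and the helper functions are all hypothetical, chosen only for illustration, and are not Alfe's actual pricing or API.

```python
from decimal import Decimal

# Hypothetical rates in USD per 1,000,000 tokens processed.
# These numbers are illustrative only, not Alfe's real pricing.
HYPOTHETICAL_RATES = {
    "light-model": Decimal("0.50"),
    "top-tier-model": Decimal("10.00"),
}


def request_cost(model: str, tokens_processed: int) -> Decimal:
    """Return the illustrative USD cost of a single request."""
    rate = HYPOTHETICAL_RATES[model]
    return rate * tokens_processed / Decimal(1_000_000)


def deduct(balance: Decimal, model: str, tokens_processed: int) -> Decimal:
    """Return the pool balance after deducting one request's cost."""
    return balance - request_cost(model, tokens_processed)


# The same amount of text costs far less on the lighter model.
balance = Decimal("25.00")
balance = deduct(balance, "light-model", 200_000)
balance = deduct(balance, "top-tier-model", 200_000)
print(balance)
```

With these made-up rates, the same 200,000 tokens cost twenty times more on the top-tier model, which is the effect the default-model choice has on how fast the pool drains.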

If you’ve set up bring-your-own-key (BYOK) for a provider, requests to that provider run on your own provider account and are billed to you by that provider directly; they don’t draw from your Alfe credit pool. Providers you haven’t supplied a key for continue to use pooled access and bill from the pool as usual.
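The per-provider rule above can be sketched as a simple lookup. This is only an illustration of the billing split. The function name, the provider names, and the return labels are hypothetical, not part of any Alfe interface.

```python
def billed_to(provider: str, byok_providers: set[str]) -> str:
    """Return who bills a request sent to `provider` (illustrative labels)."""
    if provider in byok_providers:
        # You supplied your own key: the provider bills your account directly.
        return "provider"
    # No key supplied: pooled access, deducted from the Alfe credit pool.
    return "alfe_pool"


# Hypothetical setup: a key was supplied for provider-a only.
byok = {"provider-a"}
print(billed_to("provider-a", byok))
print(billed_to("provider-b", byok))
```

The decision is made per provider, so one account can mix both billing paths.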

You can review usage and manage your balance from the billing area of the dashboard, and you can set optional spend limits to cap how much model usage an agent runs up. For current plans, included capacity, and rates, see the pricing page.
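As a rough mental model of an optional per-agent spend limit, the check below treats the limit as a cap on accumulated model spend. How Alfe actually enforces limits is not described here, so the function, its parameters, and the rule that a request is blocked when it would exceed the cap are all assumptions made for illustration.

```python
from decimal import Decimal
from typing import Optional


def within_limit(spent: Decimal, next_cost: Decimal,
                 limit: Optional[Decimal]) -> bool:
    """Return True if a request costing `next_cost` fits under the cap."""
    if limit is None:
        # Spend limits are optional: no cap configured for this agent.
        return True
    return spent + next_cost <= limit


# Hypothetical agent that has spent 9.50 USD against a 10.00 USD cap.
print(within_limit(Decimal("9.50"), Decimal("0.40"), Decimal("10.00")))
print(within_limit(Decimal("9.50"), Decimal("0.60"), Decimal("10.00")))
```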