Quickstart
Use the same chat completions shape most OpenAI-compatible tools already support.
curl https://api.flatfee.one/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{
      "role": "user",
      "content": "Run this coding task through flatfee.one."
    }]
  }'
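For callers that prefer Python, the same request can be sketched with the standard library alone. This is a minimal illustration rather than an official client: the helper names build_request and send_request and the FLATFEE_API_KEY environment variable are assumptions, and only the URL, headers, and body shape come from the curl example.

```python
import json
import urllib.request

API_URL = "https://api.flatfee.one/v1/chat/completions"


def build_request(api_key: str, prompt: str, model: str = "auto") -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_request(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A server-side caller might run send_request(build_request(os.environ["FLATFEE_API_KEY"], "Run this coding task through flatfee.one.")), which keeps the key out of source code.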
Authentication
Send the API key as a bearer token. Do not expose keys in browser clients.
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
Model selection
auto lets the gateway choose the lane. Explicit model aliases will be documented before public launch.
- auto: Use routing policy for premium, spend-control, and continuation lanes.
- premium: Prefer stronger reasoning capacity when available.
- economy: Prefer lower-cost continuation behavior.
Exact model/provider availability can change during beta. See Models for the public policy.
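Because the alias list may change before launch, a client might validate the model value locally before sending a request. The sketch below hard-codes the three aliases named in this section; the names BETA_ALIASES and chat_payload are hypothetical, and the set would need updating once explicit aliases are documented.

```python
# Aliases named in this section; assumed to be the full beta set.
BETA_ALIASES = ("auto", "premium", "economy")


def chat_payload(prompt: str, model: str = "auto") -> dict:
    """Return a chat completions body, rejecting aliases not listed during beta."""
    if model not in BETA_ALIASES:
        raise ValueError(f"unknown model alias {model!r}; expected one of {BETA_ALIASES}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
```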
Headers
Response headers are planned as the visible receipt for routing and limits.
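Once these headers ship, a client could gather the three documented in this section into one record for logging. The following is a sketch under assumptions: RoutingReceipt and read_receipt are hypothetical names, and the fair-use value is kept as a raw string because its format is not specified here.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RoutingReceipt:
    lane: Optional[str]                # x-flatfee-lane
    fair_use_remaining: Optional[str]  # x-flatfee-fair-use-remaining, left unparsed
    request_id: Optional[str]          # x-flatfee-request-id


def read_receipt(headers: Mapping[str, str]) -> RoutingReceipt:
    """Collect the planned flatfee.one headers; missing ones become None."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RoutingReceipt(
        lane=lowered.get("x-flatfee-lane"),
        fair_use_remaining=lowered.get("x-flatfee-fair-use-remaining"),
        request_id=lowered.get("x-flatfee-request-id"),
    )
```

Treating every field as optional keeps the client working while the headers are still planned rather than guaranteed.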
- x-flatfee-lane: premium, budget, continuation, or fallback.
- x-flatfee-fair-use-remaining: Premium fair-use estimate for the current billing window.
- x-flatfee-request-id: Support and trace correlation.
Errors
Named errors make agent behavior easier to debug than a generic provider failure.
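An agent can turn an error name into a readable log line, as in the sketch below. How the name is delivered in a response is not specified here, so the hypothetical explain_error helper simply takes it as a string, with an optional request ID for support correlation.

```python
from typing import Optional

# Only the error name documented in this section; others are reported as unrecognized.
KNOWN_ERRORS = {
    "fallback_exhausted": "All eligible fallback routes failed or were unavailable.",
}


def explain_error(name: str, request_id: Optional[str] = None) -> str:
    """Format a named error, plus the request ID if one is available, for logs."""
    meaning = KNOWN_ERRORS.get(name, "Unrecognized error name.")
    suffix = f" (request id: {request_id})" if request_id else ""
    return f"{name}: {meaning}{suffix}"
```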
- fallback_exhausted: All eligible fallback routes failed or were unavailable.
Billing behavior
Plans include premium fair-use and documented continuation. The point is predictable spend, not pretending compute has no limits.
- Premium fair-use is consumed before continuation behavior.
- Concurrent workflow limits protect shared capacity and abuse controls.
- Continuation keeps eligible runs moving on economy routing after premium fair-use.
- Private beta limits may change before general availability.
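The ordering in the list above can be pictured as a small decision function. This is only a reading of the stated policy, not the gateway's routing logic: expected_lane and both of its parameters are hypothetical, and the case where a run is not eligible for continuation is left undefined because it is not described here.

```python
from typing import Optional


def expected_lane(premium_remaining: float, continuation_eligible: bool) -> Optional[str]:
    """Mirror the stated order: premium fair-use first, then continuation."""
    if premium_remaining > 0:
        return "premium"
    if continuation_eligible:
        # Continuation keeps eligible runs moving on economy routing.
        return "continuation"
    return None  # behavior for this case is not documented
```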