Flat tiers. No per-token markup.
Pay for the control plane, not for the tokens. You keep your provider relationships — we handle routing, analytics, budgets, and replay.
Team
For teams with meaningful LLM traffic.
Get started
- ▸ 1M requests / month
- ▸ Unlimited projects, keys, providers
- ▸ Unlimited team members
- ▸ Everything in Pro
- ▸ Unlimited log retention
- ▸ Replay experiments + budgets
- ▸ Priority support
Pro
For small teams building LLM-powered products.
Get started
- ▸ 100k requests / month
- ▸ Unlimited projects, keys, providers
- ▸ Unlimited team members
- ▸ All routing policies
- ▸ All routing strategies
- ▸ Unlimited log retention
Free
For individual developers and side projects.
Sign up
- ▸ 10k requests / month
- ▸ Unlimited projects, keys, providers
- ▸ Unlimited team members
- ▸ All routing strategies
- ▸ Unlimited log retention
Enterprise
For scale, compliance, and dedicated support.
Talk to sales
- ▸ Unlimited or negotiated volume
- ▸ Unlimited team members
- ▸ SSO / SAML / SCIM
- ▸ Audit logging
- ▸ Unlimited log retention
- ▸ Dedicated support & SLA
- ▸ Custom deployment options
Feature comparison
| Feature | Team | Pro | Free | Enterprise |
|---|---|---|---|---|
| Core | | | | |
| Monthly requests | 1M | 100k | 10k | Custom |
| Projects | Unlimited | Unlimited | Unlimited | Unlimited |
| Provider credentials | Unlimited | Unlimited | Unlimited | Unlimited |
| API keys | Unlimited | Unlimited | Unlimited | Unlimited |
| Team members | Unlimited | Unlimited | Unlimited | Unlimited |
| Routing | | | | |
| Single model | ✓ | ✓ | ✓ | ✓ |
| Fallback chains | ✓ | ✓ | ✓ | ✓ |
| Cost-optimized | ✓ | ✓ | — | ✓ |
| Latency-optimized | ✓ | ✓ | — | ✓ |
| Ensembles | ✓ | ✓ | — | ✓ |
| A/B tests | ✓ | ✓ | — | ✓ |
| Cascade | ✓ | ✓ | — | ✓ |
| Custom rule DSL | ✓ | — | — | ✓ |
| Control plane | | | | |
| Budgets & caps | ✓ | ✓ | — | ✓ |
| Replay experiments | ✓ | — | — | ✓ |
| Decision traces | ✓ | ✓ | ✓ | ✓ |
| Webhooks | ✓ | ✓ | — | ✓ |
| Audit logs | — | — | — | ✓ |
| Observability | | | | |
| Log retention | Unlimited | Unlimited | Unlimited | Unlimited |
| Request analytics | ✓ | ✓ | ✓ | ✓ |
| Latency percentiles | ✓ | ✓ | ✓ | ✓ |
| Cost forecasting | ✓ | ✓ | — | ✓ |
| Warehouse export | — | — | — | ✓ |
| Reliability & performance | | | | |
| Multi-provider failover | ✓ | ✓ | ✓ | ✓ |
| Health-aware routing | ✓ | ✓ | ✓ | ✓ |
| Per-attempt timeouts & retries | ✓ | ✓ | ✓ | ✓ |
| Uptime target | 99.9% | 99.9% | Best-effort | 99.95% |
| Dedicated capacity | — | — | — | ✓ |
| Security & support | | | | |
| Team management | ✓ | — | — | ✓ |
| SSO / SAML | — | — | — | ✓ |
| Support | Priority | Community | | Dedicated |
| Contractual SLA | — | — | — | ✓ |
Questions you might have
Why flat tiers instead of per-token pricing?
Predictable cost. You already pay providers per token, so a per-token fee on top feels like double taxation. We charge a flat subscription so you know what you'll pay. We also want to encourage more traffic through modelux, because more routing data means better routing decisions.
Do I pay modelux for the LLM calls?
No. modelux proxies your requests using your own provider credentials (BYO keys). You pay OpenAI, Anthropic, etc. directly. modelux charges only for the control plane.
What happens if I exceed my tier's request limit?
The limit is soft by default: service continues, and you get an email and a dashboard banner suggesting an upgrade. A 10% grace buffer applies before the nudge is sent. There are no overage charges. Enterprise customers can configure hard limits if needed.
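To make the soft-limit behavior concrete, here is a minimal sketch in Python. It is an illustration only, not modelux's implementation: the names `check_usage` and `UsageDecision` are hypothetical, and two details are assumptions, namely that the nudge fires once usage exceeds the limit plus the 10% buffer, and that an opt-in hard limit blocks requests as soon as the tier limit is exceeded.

```python
from dataclasses import dataclass

GRACE_PERCENT = 10  # the 10% grace buffer mentioned in the answer above


@dataclass
class UsageDecision:
    allow_request: bool    # is the request still served?
    send_nudge: bool       # email + dashboard banner suggesting an upgrade
    overage_charge: float  # always 0.0, since there are no overage charges


def check_usage(requests_used: int, tier_limit: int, hard_limit: bool = False) -> UsageDecision:
    """Hypothetical helper: decide what happens at a given monthly usage level."""
    # Assumption: the nudge is sent only once usage passes limit + 10%.
    nudge_threshold = tier_limit + tier_limit * GRACE_PERCENT // 100
    send_nudge = requests_used > nudge_threshold
    # Assumption: a hard limit (opt-in) blocks anything above the tier limit.
    blocked = hard_limit and requests_used > tier_limit
    return UsageDecision(allow_request=not blocked, send_nudge=send_nudge, overage_charge=0.0)
```

Under these assumptions, a tier with 100k requests per month keeps serving traffic at 105k requests without a nudge, and starts nudging above 110k while still serving requests.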
Can I self-host?
Not today. modelux is managed SaaS. If you need on-prem or VPC deployment, talk to us about Enterprise — we're evaluating dedicated deployments on a case-by-case basis.
How much can I save with smart routing?
Example: a team spending $10k/month on GPT-4o typically saves $4-5k by routing 60% of traffic to GPT-4o-mini for simpler queries. Ensembles of smaller models can match frontier-model quality at 20% of the cost. modelux pays for itself many times over.
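The arithmetic behind an estimate like this can be sketched as follows. This is a back-of-the-envelope model, not a pricing calculator: the function name is hypothetical, the 0.20 price ratio is an illustrative assumption rather than a published provider price, and it treats "60% of traffic" as 60% of spend, which only holds if rerouted requests cost about as much as the rest.

```python
def estimated_monthly_savings(monthly_spend: float, rerouted_share: float, price_ratio: float) -> float:
    """Savings from moving a share of spend to a cheaper model.

    monthly_spend:  current monthly provider bill in dollars
    rerouted_share: fraction of that spend moved to the cheaper model (0 to 1)
    price_ratio:    cheaper model's cost relative to the original (0 to 1)
    """
    if not (0.0 <= rerouted_share <= 1.0 and 0.0 <= price_ratio <= 1.0):
        raise ValueError("rerouted_share and price_ratio must be between 0 and 1")
    # Only the rerouted portion changes price; the rest of the bill is unchanged.
    return monthly_spend * rerouted_share * (1.0 - price_ratio)


# Assumed inputs: $10k/month, 60% rerouted, cheaper model at 20% of the cost.
savings = estimated_monthly_savings(10_000, 0.60, 0.20)
print(f"${savings:,.0f}")  # prints "$4,800"
```

Your own numbers will depend on actual provider prices and on how much of your traffic is simple enough to reroute.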
Still have questions?
Talk to us