Flat tiers. No per-token markup.
Pay for the control plane, not for the tokens. You keep your provider relationships — we handle routing, analytics, budgets, and replay.
Free
For individual developers and side projects.
Start free

- 10k requests / month
- 1 project, 2 API keys
- 1 provider credential
- Single-model + fallback routing
- 7-day log retention
- Community support
Pro
For small teams building LLM-powered products.
Start Pro trial

- 100k requests / month
- 5 projects, unlimited API keys
- Unlimited provider credentials
- All routing policies
- Ensembles, A/B tests, cascade
- 30-day log retention
- Email support
Team
For teams with meaningful LLM traffic.
Start Team trial

- 1M requests / month
- Unlimited projects
- Team roles & permissions
- Everything in Pro
- 60-day log retention
- Replay simulator + budgets
- Priority support
Enterprise
For scale, compliance, and dedicated support.
Talk to sales

- Unlimited or negotiated volume
- SSO / SAML / SCIM
- Audit logging
- IP allowlisting
- 90-day+ configurable retention
- Dedicated support & SLA
- Custom deployment options
Feature comparison
| Feature | Free | Pro | Team | Enterprise |
|---|---|---|---|---|
| Core | | | | |
| Monthly requests | 10k | 100k | 1M | Custom |
| Projects | 1 | 5 | Unlimited | Unlimited |
| Provider credentials | 1 | Unlimited | Unlimited | Unlimited |
| API keys | 2 | Unlimited | Unlimited | Unlimited |
| Routing | | | | |
| Single model | ✓ | ✓ | ✓ | ✓ |
| Fallback chains | ✓ | ✓ | ✓ | ✓ |
| Cost-optimized | — | ✓ | ✓ | ✓ |
| Latency-optimized | — | ✓ | ✓ | ✓ |
| Ensembles | — | ✓ | ✓ | ✓ |
| A/B tests | — | ✓ | ✓ | ✓ |
| Cascade | — | ✓ | ✓ | ✓ |
| Custom rule DSL | — | — | ✓ | ✓ |
| Control plane | | | | |
| Budgets & caps | — | ✓ | ✓ | ✓ |
| Replay simulator | — | — | ✓ | ✓ |
| Decision traces | ✓ | ✓ | ✓ | ✓ |
| Webhooks | — | ✓ | ✓ | ✓ |
| Audit logs | — | — | — | ✓ |
| Observability | | | | |
| Log retention | 7 days | 30 days | 60 days | 90+ days |
| Request analytics | ✓ | ✓ | ✓ | ✓ |
| Latency percentiles | ✓ | ✓ | ✓ | ✓ |
| Cost forecasting | — | ✓ | ✓ | ✓ |
| Warehouse export | — | — | — | ✓ |
| Reliability & performance | | | | |
| Multi-provider failover | ✓ | ✓ | ✓ | ✓ |
| Health-aware routing | ✓ | ✓ | ✓ | ✓ |
| Per-attempt timeouts & retries | ✓ | ✓ | ✓ | ✓ |
| Uptime target | Best-effort | 99.9% | 99.9% | 99.95% |
| Dedicated capacity | — | — | — | ✓ |
| Security & support | | | | |
| Team management | — | — | ✓ | ✓ |
| SSO / SAML | — | — | — | ✓ |
| IP allowlists | — | — | — | ✓ |
| Support | Community | Email | Priority | Dedicated |
| Contractual SLA | — | — | — | ✓ |
Questions you might have
Why flat tiers instead of per-token pricing?
Predictable cost. You already pay providers per token; stacking a per-token fee on top is double taxation. We charge a flat subscription so you always know what you'll pay. It also means we want you to send more traffic through Modelux, not less (more routing data = better decisions).
Do I pay Modelux for the LLM calls?
No. Modelux proxies your requests using your own provider credentials (BYO keys). You pay OpenAI, Anthropic, etc. directly. Modelux charges only for the control plane.
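A hedged sketch of what BYO-key proxying implies for your app's requests. The endpoint URL, header, and field names below are illustrative assumptions, not Modelux's documented API: the idea is that only your Modelux key travels with the request, while the provider key stays in a stored credential that the request references.

```python
# Illustrative sketch of a BYO-key proxied request. The URL, header, and
# field names are assumptions for illustration, not Modelux's actual API.
def build_proxy_request(prompt: str, modelux_key: str, credential_id: str) -> dict:
    return {
        "url": "https://api.modelux.example/v1/chat",  # hypothetical endpoint
        "headers": {
            # Only the Modelux control-plane key is sent with the request;
            # your OpenAI/Anthropic key never leaves the stored credential.
            "Authorization": f"Bearer {modelux_key}",
        },
        "json": {
            "credential": credential_id,  # reference to your stored provider key
            "messages": [{"role": "user", "content": prompt}],
        },
    }

req = build_proxy_request("Summarize this doc.", "mlx_live_abc", "cred_openai_prod")
```

Because the provider is billed against your stored credential, the LLM invoice stays between you and the provider.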
What happens if I exceed my tier's request limit?
Soft limit by default: service continues uninterrupted. Once you exceed your tier's limit by more than a 10% grace buffer, you get an email and a dashboard banner suggesting an upgrade. No overage charges. Enterprise customers can configure hard limits if needed.
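The soft-limit policy can be sketched as follows (the function and the status labels are ours, for illustration; only the limit-plus-10%-grace behavior comes from the answer above):

```python
def usage_status(used: int, tier_limit: int, grace: float = 0.10) -> str:
    """Soft-limit check: service is never cut off, only nudged."""
    if used <= tier_limit:
        return "ok"
    if used <= tier_limit * (1 + grace):
        return "grace"   # over the limit but inside the 10% buffer: no nudge yet
    return "nudge"       # email + dashboard banner suggesting an upgrade
```

On the Free tier (10k requests), 10,500 requests is still inside the grace buffer; 12,000 triggers the nudge, but requests keep flowing either way.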
Can I self-host?
Not today. Modelux is managed SaaS. If you need on-prem or VPC deployment, talk to us about Enterprise — we're evaluating dedicated deployments on a case-by-case basis.
How much can I save with smart routing?
Example: a team spending $10k/month on GPT-4o typically saves $4-5k/month by routing 60% of traffic to GPT-4o-mini for simpler queries. Ensembles of smaller models can match frontier-model quality at 20% of the cost. Modelux pays for itself many times over.
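Rough arithmetic behind the headline number. The 0.25 blended mini-to-4o cost ratio is our assumption for illustration, not a quoted price; plug in your own ratio:

```python
monthly_spend = 10_000    # $/month currently spent on GPT-4o
routed_share = 0.60       # fraction of traffic simple enough for GPT-4o-mini
mini_cost_ratio = 0.25    # assumed blended mini/4o cost ratio (our estimate)

# Savings = spend moved to mini, minus what mini still costs.
savings = monthly_spend * routed_share * (1 - mini_cost_ratio)
print(f"${savings:,.0f}/month saved")  # lands inside the $4-5k range above
```

Even with a conservative cost ratio, a majority-mini routing split recovers most of the original spend on those queries.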
Still have questions?
Talk to us