modelux
$ modelux for AI startups

Move fast. Don't burn your runway on tokens.

Early-stage AI teams need optionality and cost discipline in equal measure. You want to try every new model the day it drops, cut costs when the fundraise is far, and not rewrite integration code when the answer changes. Modelux is the control plane that lets you do all three.

# a typical path

What it looks like from prototype to scale.

  1. Week 1 · Prototype

    Call gpt-4o-mini directly. Ship the demo. Free tier, no credit card.

  2. Month 1 · Beta

    Add a fallback chain. Add tags for per-customer cost attribution. Watch the analytics.

  3. Month 3 · Cost optimization

    Replay real traffic against a cost-optimized config. Cut spend 40-60% before raising round.

  4. Month 6 · Scale

    Enable ensembles for quality-critical paths. Set per-customer budgets. Add a second provider for reliability.

# why startups pick modelux

Optionality and cost discipline. No trade-off.

Ship first, optimize later

Start with the cheapest OpenAI-compatible endpoint. When you're ready to optimize, switch to a fallback or cost-optimized config in the dashboard. Your app code doesn't change.

A/B test cheaper models in production

New models drop every week. Route 10% of traffic to the candidate, compare cost/latency/quality, promote or roll back. No feature flags, no deploys.

Replay before you commit

Before flipping 100% of traffic to the cheaper model, replay the last 24 hours of real traffic against it and see the diff. You'll know before production does.

One API, every provider

Claude ships a new model on Monday, Gemini on Tuesday, OpenAI on Thursday. Modelux exposes all of them through one OpenAI-compatible surface so you can try them all the day they launch.

Cost forecasts that track runway

Dashboard shows projected end-of-month spend based on current rate. Set a hard cap with auto-downgrade so a spike on launch day doesn't burn a month of runway.

Free tier that covers the prototype

10k requests/month, free forever. Enough to build and demo. Upgrade only when traffic justifies it.

# model swap

New model drops. You ship it by lunch.

Your app calls @production. A new Haiku ships. You edit the routing config to try it on 10% of traffic, watch the analytics for an hour, and either promote or roll back. No deploy. No feature flag.

  • Percentage-based rollouts via A/B policy
  • One-click promote or rollback
  • Version history keeps every past config
@production v7 → v8 diff
{
  "strategy": "ab_test",
  "variants": [
-   { "weight": 100, "config": "@production-v7" }
+   { "weight":  90, "config": "@production-v7" },
+   { "weight":  10, "config": "@production-v8-candidate" }
  ]
}
startup-friendly pricing

Free tier is actually free. Upgrade when it matters.

10k requests/month on the free tier. $49/month for 100k (Pro). $199/month for 1M with team roles (Team). No per-token markup — you pay providers directly with your own keys. When a seed round closes, Team tier gets you everything you need to scale.

Y Combinator company? Backed accelerator? Ask about our startup program .

Build fast. Keep your runway.

Two-line migration from the OpenAI SDK. Free tier forever.