Routing
A routing config is a named, versioned resource that tells Modelux how to
handle a request. Your application calls Modelux with a routing config slug
(like @production) and Modelux decides which model(s) and provider(s) to
actually invoke.
Routing configs live in Modelux, not in your code. Change the routing behavior without redeploying your app.
Calling a routing config
Use the slug prefixed with @ as the model name:
client.chat.completions.create(
model="@production",
messages=[...]
)
You can also call raw model names (gpt-4o, claude-sonnet-4-5) directly —
Modelux auto-routes them to the matching provider using your credentials.
Strategies
| Strategy | Description |
|---|---|
single | Lock traffic to one model + provider. |
fallback | Ordered list of attempts with per-attempt timeouts. Retries on 429, 5xx, timeout. |
cost_optimized | Pick the cheapest model meeting a quality tier, from an allowlist. |
latency_optimized | Route to the lowest-p50-latency healthy provider. |
ensemble | Parallel fan-out + aggregation (voting, first-valid, weighted). |
ab_test | Percentage-based split across sub-configs. |
cascade | Sequential attempts with early stop on success. |
custom_rules | Programmable DSL over cost, latency, budget, tags. |
Versioning
Every save creates a new version. You can:
- Diff two versions side-by-side
- Rollback to any previous version with one click
- Promote a candidate from a simulation result
Model aliases
Instead of hardcoding gpt-4o-mini in every request, create a routing config
at @fast or @cheap and reference those slugs. Change the underlying model
later without touching your app code.
Tags
Tag requests with arbitrary key-value pairs to scope routing, analytics, and budgets:
client.chat.completions.create(
model="@production",
messages=[...],
extra_body={
"mlx:tags": {
"tenant": "acme",
"feature": "summarize",
},
},
)
Custom rules can branch on tags: if tenant == "enterprise" then use @premium else use @production.