[view as .md]

Routing

A routing config is a named, versioned resource that tells Modelux how to handle a request. Your application calls Modelux with a routing config slug (like @production) and Modelux decides which model(s) and provider(s) to actually invoke.

Routing configs live in Modelux, not in your code. Change the routing behavior without redeploying your app.

Calling a routing config

Use the slug prefixed with @ as the model name:

client.chat.completions.create(
    model="@production",
    messages=[...]
)

You can also call raw model names (gpt-4o, claude-sonnet-4-5) directly — Modelux auto-routes them to the matching provider using your credentials.

Strategies

StrategyDescription
singleLock traffic to one model + provider.
fallbackOrdered list of attempts with per-attempt timeouts. Retries on 429, 5xx, timeout.
cost_optimizedPick the cheapest model meeting a quality tier, from an allowlist.
latency_optimizedRoute to the lowest-p50-latency healthy provider.
ensembleParallel fan-out + aggregation (voting, first-valid, weighted).
ab_testPercentage-based split across sub-configs.
cascadeSequential attempts with early stop on success.
custom_rulesProgrammable DSL over cost, latency, budget, tags.

Versioning

Every save creates a new version. You can:

  • Diff two versions side-by-side
  • Rollback to any previous version with one click
  • Promote a candidate from a simulation result

Model aliases

Instead of hardcoding gpt-4o-mini in every request, create a routing config at @fast or @cheap and reference those slugs. Change the underlying model later without touching your app code.

Tags

Tag requests with arbitrary key-value pairs to scope routing, analytics, and budgets:

client.chat.completions.create(
    model="@production",
    messages=[...],
    extra_body={
        "mlx:tags": {
            "tenant": "acme",
            "feature": "summarize",
        },
    },
)

Custom rules can branch on tags: if tenant == "enterprise" then use @premium else use @production.