$ modelux for coding agents

Your team is burning tokens faster than coffee.

Coding agents are the heaviest per-seat LLM workload most companies will ever run. modelux gives engineering leadership the budget controls, policy surface, and audit trail that workload deserves, without getting in the way of the developers using the tools.

Get started free

Connect Claude Code via MCP →

# setup

Point Claude Code at modelux once.

Claude Code and other coding agents read ANTHROPIC_BASE_URL. Set it once per engineer; from then on every agent request flows through modelux — tagged, logged, budgeted, and auditable.

▸ Works with Claude Code, Cursor, Aider, Cline, Continue
▸ Per-engineer mlx_sk_ keys
▸ Tag each request with repo, branch, and session
▸ MCP tools expose analytics and config to the agent itself

~/.zshrc bash

# Point Claude Code at modelux — uses your team's keys.
export ANTHROPIC_BASE_URL="https://api.modelux.ai/v1"
export ANTHROPIC_API_KEY="mlx_sk_..."

# Optional — tag every request so analytics can break down by repo.
export MODELUX_TAG_repo="acme-web"
export MODELUX_TAG_author="alice"
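
The same tags can be set per process rather than per shell, which helps when one engineer works across several repos. A minimal Python sketch, assuming modelux reads any variable named `MODELUX_TAG_<key>` the way the two exports above suggest; `tag_env` is a hypothetical helper, not part of modelux:

```python
import os


def tag_env(tags, base_env=None):
    """Return a copy of the environment with one MODELUX_TAG_<key> entry per tag.

    Assumption: modelux picks up every variable named MODELUX_TAG_<key>,
    following the pattern of MODELUX_TAG_repo and MODELUX_TAG_author.
    """
    env = dict(os.environ if base_env is None else base_env)
    for key, value in tags.items():
        env[f"MODELUX_TAG_{key}"] = str(value)
    return env


# Build a tagged environment from an empty base so the result is easy to inspect.
env = tag_env({"repo": "acme-web", "branch": "main"}, base_env={})
print(env)
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching an agent, so the tags apply to that one session only.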

# the reality

Coding agents are different. They need different controls.

> problem

One dev burns through the team's Anthropic quota

A single refactoring session runs up 5M tokens. Everyone else gets rate-limited for the rest of the day.

> problem

You can't tell what agents spent on what

The provider bill shows one number. You have no way to attribute it to a task, a PR, or an engineer.

> problem

Switching models means rewriting configs

Anthropic ships a better model, but updating config across ten engineers' setups is a chore nobody wants.

> problem

No audit trail for agent-driven changes

An agent rewrote your auth code. Nobody can answer "what prompts led to that commit" without digging.

# what modelux adds

Governance that respects the developers using the tools.

> solution

Per-engineer API keys, one central budget

Each developer gets a modelux key. The team shares a monthly budget cap. One engineer spiking doesn't block the others — auto-downgrade kicks in only for that key.
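
One way to picture per-key downgrade is the sketch below. It is an illustration under stated assumptions, not modelux's actual policy: the rule (downgrade a key once its own spend reaches a fixed share of the team cap), the `max_share` parameter, and the model names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KeyUsage:
    key_id: str
    spent_usd: float  # month-to-date spend attributed to this key


def pick_model(usage, team_cap_usd, max_share, default_model, fallback_model):
    """Return the model a key should be routed to.

    Assumed rule (not modelux's documented behaviour): a key is downgraded
    once its own spend reaches max_share of the team cap. Other keys are
    unaffected because the check only looks at the key making the request.
    """
    threshold = team_cap_usd * max_share
    if usage.spent_usd >= threshold:
        return fallback_model
    return default_model


alice = KeyUsage("alice", spent_usd=210.0)
bob = KeyUsage("bob", spent_usd=30.0)
for u in (alice, bob):
    print(u.key_id, pick_model(u, 500.0, 0.4, "primary-model", "cheaper-model"))
```

Because the decision depends only on the requesting key's spend, a spike on one key never changes the model served to another.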

> solution

Tag requests with git metadata

Pass the repo, branch, and task ID in request tags. Analytics breaks spend down by repo, feature, or engineer — whatever tag key you care about.
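
The breakdown itself is a group-by over tagged request records. A small sketch of the idea, assuming each logged request carries a `tags` mapping and a `cost_usd` field; both field names and the sample records are made up for illustration:

```python
from collections import defaultdict


def spend_by_tag(requests, tag_key):
    """Sum request cost per value of one tag key; untagged requests get their own bucket."""
    totals = defaultdict(float)
    for req in requests:
        group = req["tags"].get(tag_key, "(untagged)")
        totals[group] += req["cost_usd"]
    return dict(totals)


# Hypothetical request log entries.
requests = [
    {"tags": {"repo": "acme-web", "author": "alice"}, "cost_usd": 0.5},
    {"tags": {"repo": "acme-web", "author": "bob"}, "cost_usd": 0.25},
    {"tags": {"repo": "acme-api", "author": "alice"}, "cost_usd": 1.0},
    {"tags": {"author": "carol"}, "cost_usd": 2.0},
]

print(spend_by_tag(requests, "repo"))
print(spend_by_tag(requests, "author"))
```

Swapping `tag_key` is all it takes to pivot the same log from repo to engineer, which is why consistent tag keys matter more than any particular tag scheme.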

> solution

Change default model in the dashboard

Update @default-coding to point at Claude Sonnet 4.6 once. Everyone's agents pick it up on the next request — no config push, no restart.

> solution

Full request logs with decision traces

Every agent prompt and response is captured, searchable by tags. Paired with audit logs of config changes, you can reconstruct any session.

> solution

Try a new default model without emailing the team

Before swapping @default-coding from Sonnet to Haiku, replay last week's agent traffic against the candidate. Routing-only mode is free and shows the projected spend; with-responses mode calls the candidate and scores it against Sonnet's outputs with embedding similarity. If the numbers hold, promote in one click — and every agent picks it up on the next request.
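
Embedding similarity is commonly measured as cosine similarity between the two response vectors. The sketch below assumes that metric and a simple mean over replayed requests; the text above does not say which metric or aggregation modelux uses, and the embedding step itself is left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero embedding as having no similarity
    return dot / (norm_a * norm_b)


def mean_similarity(pairs):
    """Average similarity over (baseline_embedding, candidate_embedding) pairs."""
    scores = [cosine_similarity(a, b) for a, b in pairs]
    return sum(scores) / len(scores)


# Toy two-dimensional vectors standing in for real embeddings.
print(cosine_similarity([3.0, 4.0], [4.0, 3.0]))
```

A high mean score suggests the candidate's answers stay close to the baseline's in meaning, though it says nothing about whether either answer is correct.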

analytics by tag --group-by=author report

engineer       requests    input_tok    output_tok    cost
─────────────  ─────────  ───────────  ───────────  ───────
alice             4,821    12,480,312    1,104,227   $42.18
bob               3,204     8,711,558      891,402   $29.64
carol             2,977     7,409,921      823,110   $26.81
dan               1,248     3,102,847      294,501   $10.44
─────────────  ─────────  ───────────  ───────────  ───────
total            12,250    31,704,638    3,113,240  $109.07

budget:  $500/month   used: 21.8%   projected eom: $167
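
The used figure in the report follows directly from the totals: $109.07 of a $500 budget. The sketch below reproduces that percentage and shows one plausible way to project month-end spend, a linear run rate. The report does not say how its $167 projection is computed, so the projection function is an assumption and does not attempt to reproduce that number.

```python
def budget_used_pct(spent_usd, budget_usd):
    """Share of the monthly budget consumed so far, as a percentage."""
    return 100.0 * spent_usd / budget_usd


def project_month_end(spent_usd, days_elapsed, days_in_month):
    """Linear run-rate projection: assumes the daily average so far continues."""
    return spent_usd / days_elapsed * days_in_month


print(round(budget_used_pct(109.07, 500), 1))  # 21.8, matching the report
print(project_month_end(100.0, 10, 30))        # 300.0 for a made-up example
```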

# visibility

Cost by engineer, by repo, by agent session.

Tag requests with whatever dimension you care about — author, repo, branch, session, task — and analytics breaks down spend along that dimension in real time. Projected end-of-month spend updates live so you know when to tighten the belt.

▸ Top engineers by spend, volume, latency
▸ Budget alerts scoped to one engineer or tag
▸ Auto-downgrade near cap to avoid surprise bills

# related

Cost

Get visibility into agent spend by Monday.

Free tier works for one developer. Team tier covers unlimited engineers with shared budgets and role-based access.

Start free

MCP setup guide →

13× cheaper than GPT-5.

240ms TTFT.

Replay last week's agent traffic.
