modelux
$ modelux for coding agents

Your team is burning tokens faster than coffee.

Coding agents are the heaviest per-seat LLM workload most companies will ever run. modelux gives engineering leadership the budget controls, policy surface, and audit trail that workload deserves, without getting in the way of the developers using the tools.

# setup

Point Claude Code at modelux once.

Claude Code and other coding agents read ANTHROPIC_BASE_URL. Set it once per engineer; from then on every agent request flows through modelux — tagged, logged, budgeted, and auditable.

  • Works with Claude Code, Cursor, Aider, Cline, Continue
  • Per-engineer mlx_sk_ keys
  • Tag each request with repo, branch, and session
  • MCP tools expose analytics and config to the agent itself

~/.zshrc:
# Point Claude Code at modelux, using your own per-engineer mlx_sk_ key.
export ANTHROPIC_BASE_URL="https://api.modelux.ai/v1"
export ANTHROPIC_API_KEY="mlx_sk_..."

# Optional — tag every request so analytics can break down by repo.
export MODELUX_TAG_repo="acme-web"
export MODELUX_TAG_author="alice"
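Before launching an agent, it can help to confirm the shell is configured as shown above. The function below is a minimal sketch and not part of modelux: `modelux_env_check` is a hypothetical helper name, and it only inspects the two variables from the snippet (the `https://api.modelux.ai/` host and the `mlx_sk_` key prefix). It makes no network request.

```shell
# Hypothetical helper (not shipped by modelux): checks that the two
# variables from the setup snippet are set and look plausible.
modelux_env_check() {
  _mlx_rc=0
  case "${ANTHROPIC_BASE_URL:-}" in
    https://api.modelux.ai/*) ;;
    *) echo "ANTHROPIC_BASE_URL does not point at modelux" >&2; _mlx_rc=1 ;;
  esac
  case "${ANTHROPIC_API_KEY:-}" in
    mlx_sk_?*) ;;
    *) echo "ANTHROPIC_API_KEY is not an mlx_sk_ key" >&2; _mlx_rc=1 ;;
  esac
  return "$_mlx_rc"
}
```

Used as `modelux_env_check && claude`, this would start Claude Code only when both variables look right.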

# the reality

Coding agents are different. They need different controls.

> problem

One dev burns through the team's Anthropic quota

A single refactoring session runs up 5M tokens. Everyone else gets rate-limited the rest of the day.

> problem

You can't tell what agents spent on what

The provider bill shows one number. You have no way to attribute it to a task, a PR, or an engineer.

> problem

Switching models means rewriting configs

Anthropic ships a better Claude model, but updating config across ten engineers' setups is a chore nobody wants.

> problem

No audit trail for agent-driven changes

An agent rewrote your auth code. Nobody can answer "what prompts led to that commit" without digging.

# what modelux adds

Governance that respects the developers using the tools.

> solution

Per-engineer API keys, one central budget

Each developer gets a modelux key. The team shares a monthly budget cap. One engineer spiking doesn't block the others — auto-downgrade kicks in only for that key.

> solution

Tag requests with git metadata

Pass the repo, branch, and task ID in request tags. Analytics breaks spend down by repo, feature, or engineer — whatever tag key you care about.
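One way to fill those tags without hardcoding them is to read them from the current git checkout. This is a sketch under assumptions: `modelux_git_tags` is a hypothetical helper, `MODELUX_TAG_repo` and `MODELUX_TAG_author` come from the setup snippet, and `MODELUX_TAG_branch` is an assumed name that follows the same pattern.

```shell
# Hypothetical helper: derive tag values from the current git checkout.
# MODELUX_TAG_branch is an assumed variable name, not confirmed by modelux.
modelux_git_tags() {
  # Outside a git work tree, leave the environment untouched.
  _mlx_top="$(git rev-parse --show-toplevel 2>/dev/null)" || return 1
  # Repo tag: the name of the top-level directory.
  export MODELUX_TAG_repo="$(basename "$_mlx_top")"
  # Branch tag: empty when HEAD is detached (requires git 2.22+).
  export MODELUX_TAG_branch="$(git -C "$_mlx_top" branch --show-current)"
  # Author tag: git user.name, falling back to the login name.
  export MODELUX_TAG_author="$(git -C "$_mlx_top" config user.name || id -un)"
}
```

Calling `modelux_git_tags` from the repo root (or any subdirectory) before starting the agent would refresh the tags for that session.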

> solution

Change default model in the dashboard

Update @default-coding to point at Claude Sonnet 4.6 once. Everyone's agents pick it up on the next request — no config push, no restart.

> solution

Full request logs with decision traces

Every agent prompt and response is captured and searchable by tag. Paired with audit logs of config changes, the request logs let you reconstruct any session.

> solution

Try a new default model without emailing the team

Before swapping @default-coding from Sonnet to Haiku, replay last week's agent traffic against the candidate. Routing-only mode is free and shows the projected spend; with-responses mode calls the candidate and scores it against Sonnet's outputs with embedding similarity. If the numbers hold, promote in one click — and every agent picks it up on the next request.

analytics by tag --group-by=author report
engineer       requests    input_tok    output_tok    cost
─────────────  ─────────  ───────────  ───────────  ───────
alice             4,821    12,480,312    1,104,227   $42.18
bob               3,204     8,711,558      891,402   $29.64
carol             2,977     7,409,921      823,110   $26.81
dan               1,248     3,102,847      294,501   $10.44
─────────────  ─────────  ───────────  ───────────  ───────
total            12,250    31,704,638    3,113,240  $109.07

budget:  $500/month   used: 21.8%   projected eom: $167
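The page does not say how the end-of-month projection is computed. As a rough illustration only, a linear run-rate estimate (spend so far, divided by days elapsed, times days in the month) can be sketched as follows. `project_eom` is a hypothetical function, and the day-of-month input is an assumption, so the result is not expected to match the $167 shown in the report.

```shell
# Hypothetical sketch: linear run-rate projection of end-of-month spend.
# usage: project_eom SPENT_USD DAY_OF_MONTH DAYS_IN_MONTH BUDGET_USD
project_eom() {
  LC_ALL=C awk -v spent="$1" -v day="$2" -v days="$3" -v budget="$4" 'BEGIN {
    # Guard against division by zero on bad input.
    if (day <= 0 || budget <= 0) exit 1
    projected = spent / day * days
    printf "used: %.1f%%  projected eom: $%.0f\n", spent / budget * 100, projected
  }'
}
```

With the report's total of $109.07 against a $500 budget, and assuming day 20 of a 30-day month, `project_eom 109.07 20 30 500` prints `used: 21.8%  projected eom: $164`.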

# visibility

Cost by engineer, by repo, by agent session.

Tag requests with whatever dimension you care about — author, repo, branch, session, task — and analytics breaks down spend along that dimension in real time. Projected end-of-month spend updates live so you know when to tighten the belt.

  • Top engineers by spend, volume, latency
  • Budget alerts scoped to one engineer or tag
  • Auto-downgrade near cap to avoid surprise bills

Get visibility into agent spend by Monday.

Free tier works for one developer. Team tier covers unlimited engineers with shared budgets and role-based access.