$ modelux changelog --all [rss]

Changelog

What's shipped, newest first. Major features roll into the features page ; bug fixes and small improvements don't always appear here. Subscribe via RSS or the updates list .

2026-04-16 [batches]

Async batches + files (Anthropic + OpenAI)
- ▸ POST /anthropic/v1/messages/batches with the full retrieve / list / results / cancel / delete surface — drop-in for the official Anthropic SDK with no body changes. See the docs.
- ▸ POST /openai/v1/batches + POST /openai/v1/files for the OpenAI side (batches reference uploaded JSONL files); multipart upload preserves the boundary, large result downloads stream through without buffering. See the docs.
- ▸ Each provider's 50% async discount applies untouched on the upstream side.
- ▸ Authenticated thin passthrough — auth, BYOK (X-Modelux-Provider-Key), rate limits, and observability all on top of byte-identical request forwarding.
- ▸ Batch traffic shows up in the dashboard's Logs + Analytics with per-operation request_type breakdowns alongside synchronous traffic.
2026-04-16 [responses]

OpenAI Responses API
- ▸ POST /openai/v1/responses proxied as a thin authenticated passthrough — sync + SSE streaming + background mode. See the docs.
- ▸ Stored / chained responses (previous_response_id) work via the retrieve / cancel / delete / input_items endpoints.
- ▸ Usage capture pulls input_tokens / output_tokens / cached_tokens out of the terminal response.completed event so streaming traffic shows up with full token + cost breakdowns in analytics.
- ▸ OpenAI-specific request fields (response_format, seed, logprobs, parallel_tool_calls) now pass through byte-identical on /openai/v1/chat/completions too — strict json_schema structured outputs and reproducible sampling work end-to-end.
2026-04-16 [anthropic]

Anthropic prompt caching + Files API
- ▸ cache_control markers pass through verbatim everywhere Anthropic accepts them: per-content-block on messages, per-system-block when system is sent as an array of blocks, and per-tool. Anthropic's native cache discount applies on the next matching request.
- ▸ Cache hit / write counts land on the request log row (cache_read_tokens, cache_creation_tokens) so you can answer "did my marker actually hit?".
- ▸ Per-provider cost calculation now applies the right cache discount automatically (Anthropic 0.10× reads / 1.25× writes, OpenAI 0.50× reads, Google 0.25× reads).
- ▸ POST /anthropic/v1/files proxied for the Anthropic Files API beta — upload documents/images once, reference by id from messages content blocks. The proxy forwards your anthropic-beta header so the SDK's beta-tag declaration reaches the upstream untouched. See the docs.
2026-04-16 [billing]

Self-serve plans and billing
- ▸ Upgrade, downgrade, or switch between Free, Pro, and Team from Settings → Billing.
- ▸ Stripe-powered checkout and customer portal for payment methods, invoices, and tax IDs.
- ▸ Monthly or annual billing — annual is two months free (~17% off).
- ▸ Plan-based feature gating and usage meters surface what's included and how much you've used.
2026-04-16 [providers]

Nine new providers
- ▸ Added Groq, Fireworks, DeepSeek, xAI, Mistral, Cerebras, Together, Perplexity, and Cohere.
- ▸ All available through the same OpenAI-compatible surface — drop them into any routing config.
- ▸ Fourteen providers in total now, with normalized tool-calling and structured-output behavior.
2026-04-16 [people]

People and Customers
- ▸ New People entity represents the humans inside your company who use your API keys — attach a key to a Person, see per-person spend and activity.
- ▸ New Customers page shows end-user spend and volume in external-persona projects.
- ▸ Offboarding a Person revokes their keys in one step, with an explicit confirmation.
- ▸ Projects now declare a persona (internal vs. customer-facing) to keep the two surfaces distinct.
2026-04-16 [scim]

SCIM provisions People
- ▸ SCIM now creates Person records (not dashboard users), with one token scoped per project.
- ▸ Deactivating someone in your IdP automatically revokes their attached API keys.
- ▸ Matches how enterprises actually model employee access — your joiner/mover/leaver flow maps cleanly onto projects and keys.
- ▸ See the updated SCIM guide.
2026-04-16 [security]

Security settings
- ▸ SSO, SCIM, and related controls consolidated under Settings → Security with a status-first layout.
- ▸ At-a-glance status for SAML, domain verification, and SCIM tokens.
2026-04-16 [keys]

API key improvements
- ▸ Reveal a key after creation — the plaintext is encrypted at rest so you can copy it again later instead of re-minting.
- ▸ Optionally attach a key to a Person at creation time for clear ownership and spend attribution.
- ▸ New Person column on the keys list; clickable counts jump you straight to that person's keys.
2026-04-16 [onboarding]

Agent-first onboarding
- ▸ Rich, agent-friendly 401 responses now tell assistants exactly what's missing and how to unblock you.
- ▸ New setup_status MCP tool lets Claude Code, Cursor, and other agents inspect and complete onboarding end-to-end.
- ▸ New onboarding checklist in the dashboard with an explicit "connect an assistant" step.
2026-04-14 [site]

Marketing site + developer docs
- ▸ Launched modelux.ai with terminal-themed marketing pages and developer docs.
- ▸ Every docs page available as raw markdown (/docs/<slug>.md) and through /llms.txt + /llms-full.txt for LLM ingestion.
- ▸ Pagefind full-text search in the top nav on docs pages.
- ▸ JSON-LD structured data, OG images, sitemap, AI-crawler-friendly robots.txt.
2026-04-12 [analytics]

Users page, cost forecasting, period-over-period
- ▸ New Users page surfaces top end-users by spend, volume, and latency.
- ▸ Cost forecasting card projects end-of-month spend with trend confidence.
- ▸ Period-over-period comparison overlays a previous window on every chart.
2026-04-10 [analytics]

Tag filtering across logs and analytics
- ▸ Filter logs and analytics by arbitrary tag key-value pairs you attach at request time.
- ▸ New analytics dashboard with stacked series, per-provider health rollups, and per-tag breakdowns.
2026-04-09 [exports]

Warehouse export via S3 Parquet
- ▸ Configure scheduled exports of request logs, audit events, and aggregates to your own S3 bucket.
- ▸ Parquet format with predictable per-hour partitioning.
- ▸ BullMQ-backed worker with retries, backfills, and resumable cursors.
- ▸ Tests cover transforms, PII handling, cursors, and multi-tenant isolation.
2026-04-08 [integrations]

Integrations surface + developer API keys
- ▸ Consolidated integration settings under a single Integrations page: webhooks, MCP, exports, management tokens.
- ▸ Rotate management API keys and view MCP tool usage from one place.
2026-04-07 [mcp]

MCP server with 80+ management tools
- ▸ New MCP server at api.modelux.ai/mcp exposes every management API action as an MCP tool.
- ▸ Works with Claude Code, Cursor, and any MCP-compatible client.
- ▸ Natural-language workflows for creating configs, setting budgets, rotating credentials, inspecting logs.
2026-04-05 [routing]

Custom rule DSL
- ▸ New custom_rules routing strategy with a small expression DSL over cost, latency, budget, and tags.
- ▸ Test-harness endpoint lets you evaluate rules against sample requests before promoting.
- ▸ Tenant-aware routing: branch on tags.tenant to dispatch enterprise traffic differently.
2026-04-04 [audit]

Audit log + config versioning
- ▸ Every management-API mutation now writes an audit event with actor, target resource, diff, and source (UI, API, MCP).
- ▸ Routing configs and provider credentials keep a full version history.
- ▸ One-click rollback to any previous version.
2026-04-02 [replay]

Replay experiments
- ▸ Pick a window of historical traffic (up to 24h) and replay it against a candidate routing config.
- ▸ Side-by-side cost, latency, and success-rate diff vs. the current config.
- ▸ Promote the winner with a single click; promotion creates an audited new version.
2026-04-01 [budgets]

Finance-grade budgets with auto-downgrade
- ▸ Scoped budgets (org, project, tag, end-user) with soft-alert and hard-cap thresholds.
- ▸ At-cap actions: alert, block with 402, or auto-downgrade to a cheaper routing config.
- ▸ Budget-aware routing lets custom rules read budget.used_pct.
- ▸ Email + Slack-compatible webhook alerts on threshold crossings.
2026-03-31 [webhooks]

Webhook endpoints for events
- ▸ Subscribe to budget alerts, config changes, provider health transitions, and request anomalies.
- ▸ HMAC-SHA256 signatures, durable delivery queue with exponential backoff, replay from the dashboard.
- ▸ Slack-format auto-detection for webhook URLs pointing at Slack.
2026-03-29 [sdks]

Official Python + TypeScript SDKs
- ▸ Released modelux on PyPI and npm.
- ▸ Thin wrappers over the OpenAI SDK with extra helpers for tags, end-user IDs, routing slugs, and decision traces.
- ▸ MIT licensed; source in the monorepo.
2026-03-27 [cache]

Semantic caching
- ▸ New semantic-match cache mode: request embeddings against a cache of recent responses, return on high similarity.
- ▸ Per-routing-config mode (exact / semantic / off), tunable similarity threshold.
- ▸ Cache-hit metrics broken out in analytics.
2026-03-26 [routing]

Ensembles + cascades
- ▸ New ensemble routing strategy: parallel fan-out to N models, aggregation via weighted vote, first-valid, or LLM judge.
- ▸ New cascade strategy: sequential attempts with early stop on success — quality-tier fallback made easy.
- ▸ Live cost estimator in the routing config builder for both strategies.
2026-03-24 [routing]

Cost- and latency-optimized routing
- ▸ cost_optimized strategy picks the cheapest allowed model meeting a quality tier.
- ▸ latency_optimized strategy uses rolling p50 measurements to prefer the fastest healthy provider.
- ▸ A/B testing strategy lands for controlled rollouts between configs.
2026-03-23 [providers]

AWS Bedrock + Azure OpenAI adapters
- ▸ Added Bedrock with IAM credential format (access key::secret::region[::session]).
- ▸ Added Azure OpenAI with configurable base URLs per resource.
- ▸ Both adapters normalize tool-calling and structured-output behavior to the OpenAI shape.
2026-03-21 [dashboard]

Visual routing builder
- ▸ Drag-and-drop builder for fallback chains and ensembles.
- ▸ Live dry-run panel shows the decision trace for a sample prompt without calling the provider.
- ▸ Version diff view for every change.
2026-03-20 [observability]

Decision traces + full request logs
- ▸ Every request now records the full routing decision: attempts tried, reasons, per-attempt timings and costs.
- ▸ Log detail view in the dashboard with a routing trace card.
- ▸ Structured tags on every log entry for filtering and analytics group-by.
2026-03-19 [core]

Fallback routing, health tracking, retries
- ▸ fallback routing strategy with per-attempt timeouts and retry-on conditions (429, 5xx, timeout).
- ▸ Per-provider rolling health (success rate, p50 latency) powers health-based routing.
- ▸ OpenAI SDK streaming (SSE) passes through unchanged.
2026-03-18 [core]

modelux 1.0 — public beta
- ▸ OpenAI-compatible /v1/chat/completions and /v1/embeddings across OpenAI, Anthropic, and Google.
- ▸ Projects, API keys, and BYO provider credentials.
- ▸ Per-request cost computation with per-model pricing tables.
- ▸ Free + Pro plans launched.

Changelog

Async batches + files (Anthropic + OpenAI)

OpenAI Responses API

Anthropic prompt caching + Files API

Self-serve plans and billing

Nine new providers

People and Customers

SCIM provisions People

Security settings

API key improvements

Agent-first onboarding

Marketing site + developer docs

Users page, cost forecasting, period-over-period

Tag filtering across logs and analytics

Warehouse export via S3 Parquet

Integrations surface + developer API keys

MCP server with 80+ management tools

Custom rule DSL

Audit log + config versioning

Replay experiments

Finance-grade budgets with auto-downgrade

Webhook endpoints for events

Official Python + TypeScript SDKs

Semantic caching

Ensembles + cascades

Cost- and latency-optimized routing

AWS Bedrock + Azure OpenAI adapters

Visual routing builder

Decision traces + full request logs

Fallback routing, health tracking, retries

modelux 1.0 — public beta