$ modelux changelog --all
[rss]
Changelog
What's shipped, newest first. Major features roll into the features page; bug fixes and small improvements don't always appear here. Subscribe via RSS or the updates list.
- [site]
Marketing site + developer docs
- ▸ Launched modelux.ai with terminal-themed marketing pages and developer docs.
- ▸ Every docs page available as raw markdown (/docs/<slug>.md) and through /llms.txt + /llms-full.txt for LLM ingestion.
- ▸ Pagefind full-text search in the top nav on docs pages.
- ▸ JSON-LD structured data, OG images, sitemap, AI-crawler-friendly robots.txt.
- [analytics]
Users page, cost forecasting, period-over-period
- ▸ New Users page surfaces top end-users by spend, volume, and latency.
- ▸ Cost forecasting card projects end-of-month spend with trend confidence.
- ▸ Period-over-period comparison overlays a previous window on every chart.
- [analytics]
Tag filtering across logs and analytics
- ▸ Filter logs and analytics by arbitrary tag key-value pairs you attach at request time.
- ▸ New analytics dashboard with stacked series, per-provider health rollups, and per-tag breakdowns.
- [exports]
Warehouse export via S3 Parquet
- ▸ Configure scheduled exports of request logs, audit events, and aggregates to your own S3 bucket.
- ▸ Parquet format with predictable per-hour partitioning.
- ▸ BullMQ-backed worker with retries, backfills, and resumable cursors.
- ▸ Tests cover transforms, PII handling, cursors, and multi-tenant isolation.
- [integrations]
Integrations surface + developer API keys
- ▸ Consolidated integration settings under a single Integrations page: webhooks, MCP, exports, management tokens.
- ▸ Rotate management API keys and view MCP tool usage from one place.
- [mcp]
MCP server with 80+ management tools
- ▸ New MCP server at api.modelux.ai/mcp exposes every management API action as an MCP tool.
- ▸ Works with Claude Code, Cursor, and any MCP-compatible client.
- ▸ Natural-language workflows for creating configs, setting budgets, rotating credentials, inspecting logs.
- [routing]
Custom rule DSL
- ▸ New custom_rules routing strategy with a small expression DSL over cost, latency, budget, and tags.
- ▸ Test-harness endpoint lets you evaluate rules against sample requests before promoting.
- ▸ Tenant-aware routing: branch on tags.tenant to dispatch enterprise traffic differently.
- [audit]
Audit log + config versioning
- ▸ Every management-API mutation now writes an audit event with actor, target resource, diff, and source (UI, API, MCP).
- ▸ Routing configs and provider credentials keep a full version history.
- ▸ One-click rollback to any previous version.
- [replay]
Replay simulator
- ▸ Pick a window of historical traffic (up to 24h) and replay it against a candidate routing config.
- ▸ Side-by-side cost, latency, and success-rate diff vs. the current config.
- ▸ Promote the winner with a single click; promotion creates an audited new version.
- [budgets]
Finance-grade budgets with auto-downgrade
- ▸ Scoped budgets (org, project, tag, end-user) with soft-alert and hard-cap thresholds.
- ▸ At-cap actions: alert, block with 402, or auto-downgrade to a cheaper routing config.
- ▸ Budget-aware routing lets custom rules read budget.used_pct.
- ▸ Email + Slack-compatible webhook alerts on threshold crossings.
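A minimal sketch of how the thresholds and at-cap actions listed above could combine into a single decision. The function name, parameter names, and the "allow" result are hypothetical and chosen for illustration; the changelog does not describe the actual evaluation logic.

```python
from typing import Literal

# Hypothetical labels for the three at-cap actions named in this entry.
AtCapAction = Literal["alert", "block", "downgrade"]


def evaluate_budget(
    used_pct: float,
    soft_alert_pct: float,
    hard_cap_pct: float,
    at_cap_action: AtCapAction,
) -> str:
    """Return what to do for a request given current budget usage.

    Assumption: the hard cap is checked before the soft alert, so a budget
    that is over both thresholds gets the at-cap action.
    """
    if used_pct >= hard_cap_pct:
        # "block" would map to an HTTP 402 response; "downgrade" would
        # switch to a cheaper routing config.
        return at_cap_action
    if used_pct >= soft_alert_pct:
        return "alert"
    return "allow"
```

With an 80% soft alert and a 100% hard cap, usage of 85% yields an alert only, while usage of 100% yields whichever at-cap action the budget is configured with.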
- [webhooks]
Webhook endpoints for events
- ▸ Subscribe to budget alerts, config changes, provider health transitions, and request anomalies.
- ▸ HMAC-SHA256 signatures, durable delivery queue with exponential backoff, replay from the dashboard.
- ▸ Slack-format auto-detection for webhook URLs pointing at Slack.
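For receivers, HMAC-SHA256 verification generally looks like the sketch below. It assumes the signature is a hex digest computed over the raw request body with a shared secret; the encoding, the header that carries the signature, and the function names are assumptions, not details confirmed by this changelog.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute a hex-encoded HMAC-SHA256 digest of the raw body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the digest and compare it to the received one in constant time."""
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), received_signature.encode("utf-8")
    )
```

Verify against the exact bytes received, before any JSON parsing or re-serialization, since re-encoding can change the payload and invalidate the digest.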
- [sdks]
Official Python + TypeScript SDKs
- ▸ Released modelux on PyPI and npm.
- ▸ Thin wrappers over the OpenAI SDK with extra helpers for tags, end-user IDs, routing slugs, and decision traces.
- ▸ MIT licensed; source in the monorepo.
- [cache]
Semantic caching
- ▸ New semantic-match cache mode: compares request embeddings against a cache of recent responses and returns the cached response on high similarity.
- ▸ Per-routing-config mode (exact / semantic / off), tunable similarity threshold.
- ▸ Cache-hit metrics broken out in analytics.
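To illustrate the semantic-match idea, the sketch below scores a request embedding against cached entries with cosine similarity and returns a hit only at or above a threshold. The function names, the 0.92 default, and the use of cosine similarity are assumptions for illustration; the changelog does not state which similarity measure or defaults are used.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_lookup(
    query: Sequence[float],
    cache: List[Tuple[Sequence[float], str]],
    threshold: float = 0.92,  # hypothetical default
) -> Optional[str]:
    """Return the cached response with the highest similarity at or above threshold."""
    best_response: Optional[str] = None
    best_score = threshold
    for embedding, response in cache:
        score = cosine_similarity(query, embedding)
        if score >= best_score:
            best_response, best_score = response, score
    return best_response
```

A lower threshold raises the hit rate at the cost of returning responses to prompts that are less alike, which is the tradeoff the tunable threshold exposes.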
- [routing]
Ensembles + cascades
- ▸ New ensemble routing strategy: parallel fan-out to N models, aggregation via weighted vote, first-valid, or LLM judge.
- ▸ New cascade strategy: sequential attempts with early stop on success; quality-tier fallback made easy.
- ▸ Live cost estimator in the routing config builder for both strategies.
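Two of the aggregation modes named above, weighted vote and first-valid, can be sketched in a few lines. The function names and input shapes are hypothetical, and the sketch takes responses as an already-collected list rather than modeling the parallel fan-out itself.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple


def weighted_vote(answers: List[Tuple[str, float]]) -> str:
    """Sum the weights per distinct answer and return the heaviest one.

    Ties go to the answer that appeared first. Raises ValueError on empty input.
    """
    totals: Dict[str, float] = defaultdict(float)
    for answer, weight in answers:
        totals[answer] += weight
    return max(totals, key=lambda answer: totals[answer])


def first_valid(
    responses: List[Optional[str]],
    is_valid: Callable[[Optional[str]], bool],
) -> Optional[str]:
    """Return the first response that passes the validity check, else None."""
    for response in responses:
        if is_valid(response):
            return response
    return None
```

In a real fan-out, "first" would normally mean arrival order, so the list passed to first_valid stands in for responses sorted by completion time.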
- [routing]
Cost- and latency-optimized routing
- ▸ cost_optimized strategy picks the cheapest allowed model meeting a quality tier.
- ▸ latency_optimized strategy uses rolling p50 measurements to prefer the fastest healthy provider.
- ▸ A/B testing strategy lands for controlled rollouts between configs.
- [providers]
AWS Bedrock + Azure OpenAI adapters
- ▸ Added Bedrock with IAM credential format (access key::secret::region[::session]).
- ▸ Added Azure OpenAI with configurable base URLs per resource.
- ▸ Both adapters normalize tool-calling and structured-output behavior to the OpenAI shape.
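The Bedrock credential format above is a "::"-delimited string with an optional fourth field. A minimal parsing sketch follows; the class and function names are hypothetical, and it assumes none of the fields themselves contain "::".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BedrockCredential:
    access_key: str
    secret: str
    region: str
    session_token: Optional[str] = None  # only present for temporary credentials


def parse_bedrock_credential(raw: str) -> BedrockCredential:
    """Split 'access key::secret::region[::session]' into its fields."""
    parts = raw.split("::")
    # Accept exactly three or four non-empty fields.
    if len(parts) not in (3, 4) or any(not part for part in parts):
        raise ValueError("expected access key::secret::region[::session]")
    return BedrockCredential(*parts)
```

For example, parse_bedrock_credential("AKIAEXAMPLE::example-secret::us-east-1") yields a credential with session_token left as None.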
- [dashboard]
Visual routing builder
- ▸ Drag-and-drop builder for fallback chains and ensembles.
- ▸ Live dry-run panel shows the decision trace for a sample prompt without calling the provider.
- ▸ Version diff view for every change.
- [observability]
Decision traces + full request logs
- ▸ Every request now records the full routing decision: attempts tried, reasons, per-attempt timings and costs.
- ▸ Log detail view in the dashboard with a routing trace card.
- ▸ Structured tags on every log entry for filtering and analytics group-by.
- [core]
Fallback routing, health tracking, retries
- ▸ fallback routing strategy with per-attempt timeouts and retry-on conditions (429, 5xx, timeout).
- ▸ Per-provider rolling health (success rate, p50 latency) powers health-based routing.
- ▸ OpenAI SDK streaming (SSE) passes through unchanged.
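The retry-on conditions above (429, 5xx, timeout) suggest a loop like the following sketch, which walks a provider chain and records a simple trace. The function names, the trace shape, and the choice to stop on any non-retryable status are assumptions for illustration, not the gateway's actual behavior.

```python
from typing import Callable, List, Optional, Tuple


def is_retryable(status: int) -> bool:
    """True for the statuses this entry lists as retry-on conditions."""
    return status == 429 or 500 <= status <= 599


def run_fallback(
    chain: List[Tuple[str, Callable[[], int]]],
) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Try each (provider_name, call) in order.

    Each call returns an HTTP status code or raises TimeoutError when its
    per-attempt timeout elapses. Returns the provider whose response was
    final (or None if every attempt was retryable) plus the attempt trace.
    """
    trace: List[Tuple[str, str]] = []
    for name, call in chain:
        try:
            status = call()
        except TimeoutError:
            trace.append((name, "timeout"))
            continue  # a timeout moves on to the next provider
        trace.append((name, str(status)))
        if not is_retryable(status):
            # Success, or a non-retryable error such as 400, ends the chain.
            return name, trace
    return None, trace
```

Stopping on non-retryable errors keeps a malformed request from being replayed against every provider in the chain.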
- [core]
Modelux 1.0 — public beta
- ▸ OpenAI-compatible /v1/chat/completions and /v1/embeddings across OpenAI, Anthropic, and Google.
- ▸ Projects, API keys, and BYO provider credentials.
- ▸ Per-request cost computation with per-model pricing tables.
- ▸ Free + Pro plans launched.