modelux
$ modelux changelog --all [rss]

Changelog

What's shipped, newest first. Major features roll into the features page ; bug fixes and small improvements don't always appear here. Subscribe via RSS or the updates list .

  • [site]

    Marketing site + developer docs

    • Launched modelux.ai with terminal-themed marketing pages and developer docs.
    • Every docs page available as raw markdown (/docs/<slug>.md) and through /llms.txt + /llms-full.txt for LLM ingestion.
    • Pagefind full-text search in the top nav on docs pages.
    • JSON-LD structured data, OG images, sitemap, AI-crawler-friendly robots.txt.
  • [analytics]

    Users page, cost forecasting, period-over-period

    • New Users page surfaces top end-users by spend, volume, and latency.
    • Cost forecasting card projects end-of-month spend with trend confidence.
    • Period-over-period comparison overlays a previous window on every chart.
  • [analytics]

    Tag filtering across logs and analytics

    • Filter logs and analytics by arbitrary tag key-value pairs you attach at request time.
    • New analytics dashboard with stacked series, per-provider health rollups, and per-tag breakdowns.
  • [exports]

    Warehouse export via S3 Parquet

    • Configure scheduled exports of request logs, audit events, and aggregates to your own S3 bucket.
    • Parquet format with predictable per-hour partitioning.
    • BullMQ-backed worker with retries, backfills, and resumable cursors.
    • Tests cover transforms, PII handling, cursors, and multi-tenant isolation.
  • [integrations]

    Integrations surface + developer API keys

    • Consolidated integration settings under a single Integrations page: webhooks, MCP, exports, management tokens.
    • Rotate management API keys and view MCP tool usage from one place.
  • [mcp]

    MCP server with 80+ management tools

    • New MCP server at api.modelux.ai/mcp exposes every management API action as an MCP tool.
    • Works with Claude Code, Cursor, and any MCP-compatible client.
    • Natural-language workflows for creating configs, setting budgets, rotating credentials, inspecting logs.
  • [routing]

    Custom rule DSL

    • New custom_rules routing strategy with a small expression DSL over cost, latency, budget, and tags.
    • Test-harness endpoint lets you evaluate rules against sample requests before promoting.
    • Tenant-aware routing: branch on tags.tenant to dispatch enterprise traffic differently.
  • [audit]

    Audit log + config versioning

    • Every management-API mutation now writes an audit event with actor, target resource, diff, and source (UI, API, MCP).
    • Routing configs and provider credentials keep a full version history.
    • One-click rollback to any previous version.
  • [replay]

    Replay simulator

    • Pick a window of historical traffic (up to 24h) and replay it against a candidate routing config.
    • Side-by-side cost, latency, and success-rate diff vs. the current config.
    • Promote the winner with a single click; promotion creates an audited new version.
  • [budgets]

    Finance-grade budgets with auto-downgrade

    • Scoped budgets (org, project, tag, end-user) with soft-alert and hard-cap thresholds.
    • At-cap actions: alert, block with 402, or auto-downgrade to a cheaper routing config.
    • Budget-aware routing lets custom rules read budget.used_pct.
    • Email + Slack-compatible webhook alerts on threshold crossings.
  • [webhooks]

    Webhook endpoints for events

    • Subscribe to budget alerts, config changes, provider health transitions, and request anomalies.
    • HMAC-SHA256 signatures, durable delivery queue with exponential backoff, replay from the dashboard.
    • Slack-format auto-detection for webhook URLs pointing at Slack.
  • [sdks]

    Official Python + TypeScript SDKs

    • Released modelux on PyPI and npm.
    • Thin wrappers over the OpenAI SDK with extra helpers for tags, end-user IDs, routing slugs, and decision traces.
    • MIT licensed; source in the monorepo.
  • [cache]

    Semantic caching

    • New semantic-match cache mode: request embeddings against a cache of recent responses, return on high similarity.
    • Per-routing-config mode (exact / semantic / off), tunable similarity threshold.
    • Cache-hit metrics broken out in analytics.
  • [routing]

    Ensembles + cascades

    • New ensemble routing strategy: parallel fan-out to N models, aggregation via weighted vote, first-valid, or LLM judge.
    • New cascade strategy: sequential attempts with early stop on success — quality-tier fallback made easy.
    • Live cost estimator in the routing config builder for both strategies.
  • [routing]

    Cost- and latency-optimized routing

    • cost_optimized strategy picks the cheapest allowed model meeting a quality tier.
    • latency_optimized strategy uses rolling p50 measurements to prefer the fastest healthy provider.
    • A/B testing strategy lands for controlled rollouts between configs.
  • [providers]

    AWS Bedrock + Azure OpenAI adapters

    • Added Bedrock with IAM credential format (access key::secret::region[::session]).
    • Added Azure OpenAI with configurable base URLs per resource.
    • Both adapters normalize tool-calling and structured-output behavior to the OpenAI shape.
  • [dashboard]

    Visual routing builder

    • Drag-and-drop builder for fallback chains and ensembles.
    • Live dry-run panel shows the decision trace for a sample prompt without calling the provider.
    • Version diff view for every change.
  • [observability]

    Decision traces + full request logs

    • Every request now records the full routing decision: attempts tried, reasons, per-attempt timings and costs.
    • Log detail view in the dashboard with a routing trace card.
    • Structured tags on every log entry for filtering and analytics group-by.
  • [core]

    Fallback routing, health tracking, retries

    • fallback routing strategy with per-attempt timeouts and retry-on conditions (429, 5xx, timeout).
    • Per-provider rolling health (success rate, p50 latency) powers health-based routing.
    • OpenAI SDK streaming (SSE) passes through unchanged.
  • [core]

    Modelux 1.0 — public beta

    • OpenAI-compatible /v1/chat/completions and /v1/embeddings across OpenAI, Anthropic, and Google.
    • Projects, API keys, and BYO provider credentials.
    • Per-request cost computation with per-model pricing tables.
    • Free + Pro plans launched.