
Message batches

modelux proxies the full Anthropic Message Batches API (/v1/messages/batches/*) as a thin, authenticated passthrough. Batch traffic shows up in the dashboard alongside synchronous traffic; auth, BYOK, and rate limits work identically. Anthropic’s 50% async batch discount passes through the proxy untouched.

POST   /anthropic/v1/messages/batches              create
GET    /anthropic/v1/messages/batches              list (limit / before_id / after_id)
GET    /anthropic/v1/messages/batches/{id}         retrieve
GET    /anthropic/v1/messages/batches/{id}/results JSONL results stream
POST   /anthropic/v1/messages/batches/{id}/cancel
DELETE /anthropic/v1/messages/batches/{id}

Create

A batch is an array of independent message requests, each with a custom_id you supply (used to correlate results back to inputs):

curl https://api.modelux.ai/anthropic/v1/messages/batches \
  -H "Authorization: Bearer mlx_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "requests": [
      {
        "custom_id": "row-1",
        "params": {
          "model": "claude-haiku-4-5",
          "max_tokens": 256,
          "messages": [{"role": "user", "content": "Summarize: ..."}]
        }
      },
      {
        "custom_id": "row-2",
        "params": {
          "model": "claude-haiku-4-5",
          "max_tokens": 256,
          "messages": [{"role": "user", "content": "Classify: ..."}]
        }
      }
    ]
  }'

Anything inside params is forwarded verbatim, so all the features /anthropic/v1/messages supports work inside a batch too — tools, tool_choice, system blocks with cache_control, thinking, vision, etc.
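For instance, a sub-request whose params combine tool use and prompt caching could be shaped like this (a sketch; the `record_label` tool, its schema, and the custom_id are illustrative, not part of the API):

```python
# One batch sub-request whose params use tools, forced tool_choice, and a
# cached system block. Only the outer {custom_id, params} envelope is
# batch-specific; everything inside params is a normal Messages request.
sub_request = {
    "custom_id": "row-3",
    "params": {
        "model": "claude-haiku-4-5",
        "max_tokens": 256,
        "system": [
            {
                "type": "text",
                "text": "You are a strict JSON classifier.",
                "cache_control": {"type": "ephemeral"},  # prompt caching
            }
        ],
        "tools": [
            {
                "name": "record_label",  # illustrative tool
                "description": "Record the chosen label.",
                "input_schema": {
                    "type": "object",
                    "properties": {"label": {"type": "string"}},
                    "required": ["label"],
                },
            }
        ],
        "tool_choice": {"type": "tool", "name": "record_label"},
        "messages": [{"role": "user", "content": "Classify: ..."}],
    },
}
```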

Response:

{
  "id": "msgbatch_01XYZ...",
  "type": "message_batch",
  "processing_status": "in_progress",
  "request_counts": {"processing": 2, "succeeded": 0, "errored": 0, "canceled": 0, "expired": 0},
  "created_at": "...",
  "expires_at": "...",
  "results_url": null
}

Poll for completion

Retrieve by id; check processing_status until it’s ended:

curl https://api.modelux.ai/anthropic/v1/messages/batches/msgbatch_01XYZ \
  -H "Authorization: Bearer mlx_sk_..."

Anthropic processes batches within a 24-hour window (sub-requests still pending after that expire); in practice small batches finish in minutes. List recent batches:

GET /anthropic/v1/messages/batches?limit=20&after_id=msgbatch_01ABC
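The polling loop can be sketched as follows. `retrieve` here is a placeholder for whatever makes the GET request above and returns the batch object as a dict; the function name and defaults are illustrative:

```python
import time


def wait_for_batch(retrieve, batch_id, interval=5.0, timeout=24 * 3600):
    """Poll retrieve(batch_id) until processing_status is 'ended'.

    `retrieve` is any callable returning the batch object as a dict,
    e.g. a thin wrapper around GET .../messages/batches/{id}.
    """
    deadline = time.monotonic() + timeout
    while True:
        batch = retrieve(batch_id)
        if batch["processing_status"] == "ended":
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError(f"batch {batch_id} did not end within timeout")
        time.sleep(interval)
```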

Download results

Once processing_status: "ended", fetch the JSONL results stream (one line per sub-request, keyed by your custom_id):

curl https://api.modelux.ai/anthropic/v1/messages/batches/msgbatch_01XYZ/results \
  -H "Authorization: Bearer mlx_sk_..."

Each line is one of:

{"custom_id":"row-1","result":{"type":"succeeded","message":{"id":"msg_...","content":[...],"usage":{...}}}}
{"custom_id":"row-2","result":{"type":"errored","error":{"type":"invalid_request_error","message":"..."}}}
{"custom_id":"row-3","result":{"type":"canceled"}}
{"custom_id":"row-4","result":{"type":"expired"}}

The proxy streams the JSONL through with io.Copy, no buffering — the file can be arbitrarily large.
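Correlating results back to inputs is just a per-line JSON parse keyed by custom_id; a minimal sketch:

```python
import json


def parse_results(jsonl_text):
    """Map each custom_id to its result object ('succeeded', 'errored', ...)."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        results[row["custom_id"]] = row["result"]
    return results


# Simplified sample lines (same shape as the results stream above):
sample = "\n".join([
    '{"custom_id":"row-1","result":{"type":"succeeded","message":{"id":"msg_1"}}}',
    '{"custom_id":"row-3","result":{"type":"canceled"}}',
])
by_id = parse_results(sample)
assert by_id["row-3"]["type"] == "canceled"
```

Note the stream is not ordered to match the request array, so always key off custom_id rather than line position.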

Cancel

POST /anthropic/v1/messages/batches/{id}/cancel

Returns the batch with processing_status: "canceling". Already-processed sub-requests are kept; in-flight ones are abandoned.

Delete

DELETE /anthropic/v1/messages/batches/{id}

Allowed only after the batch has reached a terminal state.

BYOK

X-Modelux-Provider-Key: sk-ant-... overrides the org’s stored Anthropic credential for this call. Same precedence as /anthropic/v1/messages — useful for trial users billing against their own Anthropic account while still getting modelux’s auth and analytics layer.

curl https://api.modelux.ai/anthropic/v1/messages/batches \
  -H "Authorization: Bearer mlx_sk_..." \
  -H "X-Modelux-Provider-Key: sk-ant-..." \
  -H "Content-Type: application/json" \
  -d '{"requests":[...]}'

Observability

Each endpoint logs to ClickHouse with a distinct request_type so the dashboard’s breakdowns separate batch operations from synchronous calls:

  • batch_create
  • batch_retrieve
  • batch_list
  • batch_results
  • batch_cancel
  • batch_delete

The proxy captures the model from the first request in the batch for analytics; actual per-sub-request token counts and costs live in the results JSONL. Batch sub-requests are not expanded into individual log rows, so query the results JSONL directly for that breakdown.
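That per-sub-request breakdown is easy to compute from the results file itself; a sketch that sums the usage objects of succeeded sub-requests (field names follow the Messages API usage object):

```python
import json
from collections import Counter


def token_totals(jsonl_text):
    """Sum input/output tokens across succeeded sub-requests in a results file."""
    totals = Counter()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        result = json.loads(line)["result"]
        if result["type"] != "succeeded":
            continue  # errored/canceled/expired rows carry no usage
        usage = result["message"]["usage"]
        totals["input_tokens"] += usage.get("input_tokens", 0)
        totals["output_tokens"] += usage.get("output_tokens", 0)
    return dict(totals)
```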

SDK drop-in

The official Anthropic SDKs work as drop-in clients — both Authorization: Bearer mlx_sk_... and x-api-key: mlx_sk_... are accepted, so no auth-header swap is needed:

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.modelux.ai/anthropic",
    api_key="mlx_sk_...",
)

batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "row-1",
            "params": {
                "model": "claude-haiku-4-5",
                "max_tokens": 256,
                "messages": [{"role": "user", "content": "..."}],
            },
        },
    ],
)
print(batch.id, batch.processing_status)

See also