Message batches
modelux proxies the full Anthropic Message Batches API
(/v1/messages/batches/*) as an authenticated thin passthrough.
Batch traffic shows up in the dashboard alongside synchronous traffic;
auth, BYOK, and rate limits work identically. Anthropic's 50% batch
discount is applied upstream and passed through untouched.
POST /anthropic/v1/messages/batches create
GET /anthropic/v1/messages/batches list (limit / before_id / after_id)
GET /anthropic/v1/messages/batches/{id} retrieve
GET /anthropic/v1/messages/batches/{id}/results JSONL results stream
POST /anthropic/v1/messages/batches/{id}/cancel
DELETE /anthropic/v1/messages/batches/{id}
Create
A batch is an array of independent message requests, each with a
custom_id you supply (used to correlate results back to inputs):
curl https://api.modelux.ai/anthropic/v1/messages/batches \
-H "Authorization: Bearer mlx_sk_..." \
-H "Content-Type: application/json" \
-d '{
"requests": [
{
"custom_id": "row-1",
"params": {
"model": "claude-haiku-4-5",
"max_tokens": 256,
"messages": [{"role": "user", "content": "Summarize: ..."}]
}
},
{
"custom_id": "row-2",
"params": {
"model": "claude-haiku-4-5",
"max_tokens": 256,
"messages": [{"role": "user", "content": "Classify: ..."}]
}
}
]
}'
Everything inside params is forwarded verbatim, so every feature
supported by /anthropic/v1/messages works inside a batch too: tools,
tool_choice, system blocks with cache_control, thinking, vision, etc.
Response:
{
"id": "msgbatch_01XYZ...",
"type": "message_batch",
"processing_status": "in_progress",
"request_counts": {"processing": 2, "succeeded": 0, "errored": 0, "canceled": 0, "expired": 0},
"created_at": "...",
"expires_at": "...",
"results_url": null
}
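As an illustration of those pass-through features, a sub-request that forces a tool call and marks its system prompt for caching might be assembled like this. A sketch only: build_subrequest and the record_answer tool are hypothetical names, not part of the API.

```python
# Illustrative batch sub-request using tools, tool_choice, and prompt
# caching; build_subrequest and record_answer are hypothetical.
def build_subrequest(custom_id, question):
    return {
        "custom_id": custom_id,
        "params": {
            "model": "claude-haiku-4-5",
            "max_tokens": 256,
            "system": [
                {
                    "type": "text",
                    "text": "You are a terse data-extraction assistant.",
                    "cache_control": {"type": "ephemeral"},
                }
            ],
            "tools": [
                {
                    "name": "record_answer",
                    "description": "Record the extracted answer.",
                    "input_schema": {
                        "type": "object",
                        "properties": {"answer": {"type": "string"}},
                        "required": ["answer"],
                    },
                }
            ],
            "tool_choice": {"type": "tool", "name": "record_answer"},
            "messages": [{"role": "user", "content": question}],
        },
    }


batch_requests = [build_subrequest(f"row-{i}", q)
                  for i, q in enumerate(["Q1", "Q2"], 1)]
```

The whole list goes into the top-level "requests" array of the create call, exactly as in the curl example above.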
Poll for completion
Retrieve by id; check processing_status until it’s ended:
curl https://api.modelux.ai/anthropic/v1/messages/batches/msgbatch_01XYZ \
-H "Authorization: Bearer mlx_sk_..."
Anthropic processes batches within 24 hours (sub-requests still pending at that deadline are marked expired); in practice small batches finish in minutes. List recent batches:
GET /anthropic/v1/messages/batches?limit=20&after_id=msgbatch_01ABC
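A minimal polling loop over the retrieve endpoint might look like the sketch below. wait_for_batch and retrieve are hypothetical helpers, not part of any SDK; the fetch parameter is injectable so the loop can be exercised without a live batch.

```python
import json
import time
import urllib.request

API = "https://api.modelux.ai/anthropic/v1/messages/batches"


def retrieve(batch_id, api_key):
    # One GET against the retrieve endpoint.
    req = urllib.request.Request(
        f"{API}/{batch_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_batch(batch_id, api_key, interval=30.0, fetch=retrieve):
    # Poll until processing_status reaches "ended".
    while True:
        batch = fetch(batch_id, api_key)
        if batch["processing_status"] == "ended":
            return batch
        time.sleep(interval)
```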
Download results
Once processing_status: "ended", fetch the JSONL results stream
(one line per sub-request, keyed by your custom_id):
curl https://api.modelux.ai/anthropic/v1/messages/batches/msgbatch_01XYZ/results \
-H "Authorization: Bearer mlx_sk_..."
Each line is one of:
{"custom_id":"row-1","result":{"type":"succeeded","message":{"id":"msg_...","content":[...],"usage":{...}}}}
{"custom_id":"row-2","result":{"type":"errored","error":{"type":"invalid_request_error","message":"..."}}}
{"custom_id":"row-3","result":{"type":"canceled"}}
{"custom_id":"row-4","result":{"type":"expired"}}
The proxy streams the JSONL through with io.Copy and no buffering, so
the file can be arbitrarily large.
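Consuming that stream line by line and indexing results back to inputs can be sketched as follows; parse_results is a hypothetical helper, and the field layout matches the example lines above.

```python
import json


def parse_results(jsonl_lines):
    # Index a results stream by custom_id, splitting succeeded rows
    # from errored/canceled/expired ones.
    ok, failed = {}, {}
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        result = row["result"]
        if result["type"] == "succeeded":
            ok[row["custom_id"]] = result["message"]
        else:
            failed[row["custom_id"]] = result
    return ok, failed
```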
Cancel
POST /anthropic/v1/messages/batches/{id}/cancel
Returns the batch with processing_status: "canceling". Already-
processed sub-requests are kept; in-flight ones are abandoned.
Delete
DELETE /anthropic/v1/messages/batches/{id}
Allowed only after the batch has reached a terminal state.
BYOK
X-Modelux-Provider-Key: sk-ant-... overrides the org’s stored
Anthropic credential for this call. Same precedence as
/anthropic/v1/messages — useful for trial users billing against
their own Anthropic account while still getting modelux’s auth and
analytics layer.
curl https://api.modelux.ai/anthropic/v1/messages/batches \
-H "Authorization: Bearer mlx_sk_..." \
-H "X-Modelux-Provider-Key: sk-ant-..." \
-H "Content-Type: application/json" \
-d '{"requests":[...]}'
Observability
Each endpoint logs to ClickHouse with a distinct request_type so the
dashboard’s breakdowns separate batch operations from synchronous
calls:
batch_create, batch_retrieve, batch_list, batch_results, batch_cancel, batch_delete
The proxy records the model from the first request in the batch for
analytics. Sub-requests are not expanded into individual log rows;
per-sub-request token counts and costs live only in the results JSONL,
so query that stream directly for a per-row breakdown.
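Recovering that breakdown from the results stream is a few lines of scripting. A sketch, assuming the usage object carries integer token fields as in Anthropic's Messages responses; usage_totals is a hypothetical helper.

```python
import json
from collections import Counter


def usage_totals(jsonl_lines):
    # Sum integer usage fields (input_tokens, output_tokens, ...)
    # across succeeded sub-requests in a results stream.
    totals = Counter()
    for line in jsonl_lines:
        row = json.loads(line)
        if row["result"]["type"] != "succeeded":
            continue
        for key, value in row["result"]["message"].get("usage", {}).items():
            if isinstance(value, int):
                totals[key] += value
    return totals
```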
SDK drop-in
The official Anthropic SDKs work as drop-in clients — both
Authorization: Bearer mlx_sk_... and x-api-key: mlx_sk_... are
accepted, so no auth-header swap is needed:
from anthropic import Anthropic
client = Anthropic(
base_url="https://api.modelux.ai/anthropic",
api_key="mlx_sk_...",
)
batch = client.messages.batches.create(
requests=[
{
"custom_id": "row-1",
"params": {
"model": "claude-haiku-4-5",
"max_tokens": 256,
"messages": [{"role": "user", "content": "..."}],
},
},
],
)
print(batch.id, batch.processing_status)
See also
- Messages (Anthropic shape) — the synchronous endpoint
- Capability matrix — what’s supported where