<!-- source: https://modelux.ai/docs/api/anthropic-batches -->

> /anthropic/v1/messages/batches — async batch processing at 50% upstream discount.

# Message batches

modelux proxies the full Anthropic Message Batches API
(`/v1/messages/batches/*`) as an authenticated thin passthrough.
Batch traffic shows up in the dashboard alongside synchronous traffic;
auth, BYOK, and rate limits work identically. Anthropic's 50% async
batch discount applies upstream and passes through unchanged.

```
POST   /anthropic/v1/messages/batches              create
GET    /anthropic/v1/messages/batches              list (limit / before_id / after_id)
GET    /anthropic/v1/messages/batches/{id}         retrieve
GET    /anthropic/v1/messages/batches/{id}/results JSONL results stream
POST   /anthropic/v1/messages/batches/{id}/cancel
DELETE /anthropic/v1/messages/batches/{id}
```

## Create

A batch is an array of independent message requests, each with a
`custom_id` you supply (used to correlate results back to inputs):

```bash
curl https://api.modelux.ai/anthropic/v1/messages/batches \
  -H "Authorization: Bearer mlx_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "requests": [
      {
        "custom_id": "row-1",
        "params": {
          "model": "claude-haiku-4-5",
          "max_tokens": 256,
          "messages": [{"role": "user", "content": "Summarize: ..."}]
        }
      },
      {
        "custom_id": "row-2",
        "params": {
          "model": "claude-haiku-4-5",
          "max_tokens": 256,
          "messages": [{"role": "user", "content": "Classify: ..."}]
        }
      }
    ]
  }'
```

Anything inside `params` is forwarded verbatim, so all the features
[`/anthropic/v1/messages`](/docs/api/anthropic-messages) supports
work inside a batch too — `tools`, `tool_choice`, `system` blocks
with `cache_control`, `thinking`, vision, etc.
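For instance, a sub-request's `params` can carry a cached system prompt and a forced tool call, exactly as it would on the synchronous endpoint. The tool name and schema below are made up for illustration:

```python
# One sub-request whose params use synchronous-Messages features:
# a system block with cache_control, plus tools / tool_choice.
# "record_label" and its schema are hypothetical examples.
sub_request = {
    "custom_id": "row-1",
    "params": {
        "model": "claude-haiku-4-5",
        "max_tokens": 256,
        "system": [
            {
                "type": "text",
                "text": "You are a terse classifier.",
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "tools": [
            {
                "name": "record_label",
                "description": "Record the chosen label.",
                "input_schema": {
                    "type": "object",
                    "properties": {"label": {"type": "string"}},
                    "required": ["label"],
                },
            }
        ],
        "tool_choice": {"type": "tool", "name": "record_label"},
        "messages": [{"role": "user", "content": "Classify: ..."}],
    },
}
```

The whole `params` object is forwarded to Anthropic as-is; the proxy never inspects or rewrites it.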

Response:

```json
{
  "id": "msgbatch_01XYZ...",
  "type": "message_batch",
  "processing_status": "in_progress",
  "request_counts": {"processing": 2, "succeeded": 0, "errored": 0, "canceled": 0, "expired": 0},
  "created_at": "...",
  "expires_at": "...",
  "results_url": null
}
```

## Poll for completion

Retrieve by id; check `processing_status` until it's `ended`:

```bash
curl https://api.modelux.ai/anthropic/v1/messages/batches/msgbatch_01XYZ \
  -H "Authorization: Bearer mlx_sk_..."
```

Anthropic processes batches within a 24-hour window (sub-requests still
pending when it closes come back as `expired`); in practice small
batches finish in minutes. List recent batches:

```
GET /anthropic/v1/messages/batches?limit=20&after_id=msgbatch_01ABC
```
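A minimal polling sketch. `fetch_batch` here is a stand-in for however you issue the GET above (the `requests` library, the Anthropic SDK, etc.); it should return the parsed batch JSON:

```python
import time

def wait_for_batch(fetch_batch, batch_id, interval_s=30, timeout_s=24 * 3600):
    """Poll until processing_status reaches 'ended', then return the batch.

    fetch_batch(batch_id) -> dict is any callable that performs
    GET /anthropic/v1/messages/batches/{id} and parses the JSON body.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        batch = fetch_batch(batch_id)
        if batch["processing_status"] == "ended":
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"batch {batch_id} still {batch['processing_status']}"
            )
        time.sleep(interval_s)
```

A fixed 30-second interval is plenty; the status only moves from `in_progress` to `canceling`/`ended`, so there is nothing to gain from tight polling.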

## Download results

Once `processing_status: "ended"`, fetch the JSONL results stream
(one line per sub-request, keyed by your `custom_id`):

```bash
curl https://api.modelux.ai/anthropic/v1/messages/batches/msgbatch_01XYZ/results \
  -H "Authorization: Bearer mlx_sk_..."
```

Each line is one of:

```json
{"custom_id":"row-1","result":{"type":"succeeded","message":{"id":"msg_...","content":[...],"usage":{...}}}}
{"custom_id":"row-2","result":{"type":"errored","error":{"type":"invalid_request_error","message":"..."}}}
{"custom_id":"row-3","result":{"type":"canceled"}}
{"custom_id":"row-4","result":{"type":"expired"}}
```

The proxy streams the JSONL through with `io.Copy` and no buffering, so
the file can be arbitrarily large.
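Because each line is keyed by `custom_id`, folding results back onto your inputs is a one-pass dictionary build. A sketch, where `lines` is any iterable of JSONL lines (an HTTP response iterator, an open file, ...):

```python
import json

def index_results(lines):
    """Map custom_id -> result dict, one entry per sub-request."""
    by_id = {}
    for line in lines:
        if not line.strip():
            continue  # tolerate trailing blank lines
        row = json.loads(line)
        by_id[row["custom_id"]] = row["result"]
    return by_id
```

From there, `[cid for cid, r in by_id.items() if r["type"] != "succeeded"]` gives you the sub-requests to retry or investigate.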

## Cancel

```
POST /anthropic/v1/messages/batches/{id}/cancel
```

Returns the batch with `processing_status: "canceling"`.
Already-processed sub-requests are kept; in-flight ones are abandoned.

## Delete

```
DELETE /anthropic/v1/messages/batches/{id}
```

Allowed only after the batch has reached a terminal state.

## BYOK

`X-Modelux-Provider-Key: sk-ant-...` overrides the org's stored
Anthropic credential for this call. Same precedence as
`/anthropic/v1/messages` — useful for trial users billing against
their own Anthropic account while still getting modelux's auth and
analytics layer.

```bash
curl https://api.modelux.ai/anthropic/v1/messages/batches \
  -H "Authorization: Bearer mlx_sk_..." \
  -H "X-Modelux-Provider-Key: sk-ant-..." \
  -H "Content-Type: application/json" \
  -d '{"requests":[...]}'
```

## Observability

Each endpoint logs to ClickHouse with a distinct `request_type` so the
dashboard's breakdowns separate batch operations from synchronous
calls:

- `batch_create`
- `batch_retrieve`
- `batch_list`
- `batch_results`
- `batch_cancel`
- `batch_delete`

For analytics, the proxy records the model from the first request in
the batch. Sub-requests are not expanded into individual log rows;
per-sub-request token counts and costs live in the results JSONL, so
query that file directly for a fine-grained breakdown.
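One way to get that breakdown is to total usage straight from the results stream. A sketch (field names match the result shapes shown above; only succeeded lines carry a `usage` object):

```python
import json

def sum_batch_usage(lines):
    """Aggregate per-sub-request usage from a results JSONL iterable."""
    totals = {"input_tokens": 0, "output_tokens": 0, "succeeded": 0, "failed": 0}
    for line in lines:
        if not line.strip():
            continue
        result = json.loads(line)["result"]
        if result["type"] == "succeeded":
            usage = result["message"]["usage"]
            totals["input_tokens"] += usage.get("input_tokens", 0)
            totals["output_tokens"] += usage.get("output_tokens", 0)
            totals["succeeded"] += 1
        else:
            # errored / canceled / expired lines have no usage to count
            totals["failed"] += 1
    return totals
```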

## SDK drop-in

The official Anthropic SDKs work as drop-in clients — both
`Authorization: Bearer mlx_sk_...` and `x-api-key: mlx_sk_...` are
accepted, so no auth-header swap is needed:

```python
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.modelux.ai/anthropic",
    api_key="mlx_sk_...",
)

batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "row-1",
            "params": {
                "model": "claude-haiku-4-5",
                "max_tokens": 256,
                "messages": [{"role": "user", "content": "..."}],
            },
        },
    ],
)
print(batch.id, batch.processing_status)
```

## See also

- [Messages (Anthropic shape)](/docs/api/anthropic-messages) — the synchronous endpoint
- [Capability matrix](/docs/concepts/capability-matrix) — what's supported where
