Session grouping
Every request through modelux lives in a three-level hierarchy:
conversation            ← a long-lived thread
└── trace               ← one agent run / user turn
    └── request         ← one HTTP call to the proxy
Request is the lowest level: a single HTTP call in, a single response out. modelux
gives every request a UUID and echoes it back in the X-Modelux-Request-Id
response header. It is always present; you don’t set it, modelux does.
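If you want to capture that ID programmatically (say, to attach it to your own
logs or support tickets), read it off the response headers. A minimal sketch with
the openai Python SDK; the base URL mirrors the curl example under Setting them
below, and the key is a placeholder:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.modelux.ai/openai/v1",  # modelux's OpenAI-compatible surface
    api_key="mlx_sk_...",                         # placeholder key
)

# with_raw_response exposes the raw HTTP response alongside the parsed body
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hi"}],
)

request_id = raw.headers.get("X-Modelux-Request-Id")  # echoed on every response
completion = raw.parse()                              # the usual ChatCompletion object
```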
Trace groups the LLM calls that serve a single user-facing prompt. In a
simple chatbot that’s one LLM call per trace. In an agent workflow (tool
loops, query-rewrite → retrieve → synthesize chains) one user turn fans
out into several LLM calls, all sharing one trace_id. The vocabulary matches
OpenTelemetry, LangSmith, and Langfuse.
Conversation groups turns in a long-lived thread. A chatbot session
between a user and your app spans many user↔assistant exchanges; each turn
is one trace; the whole thread shares one conversation_id.
Both trace_id and conversation_id are customer-supplied — modelux
doesn’t invent them because only you know where your turn boundaries are.
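Concretely, your app mints the IDs at its own boundaries: one conversation_id
when a thread starts, one trace_id per user turn, shared by every call that
serves that turn. A minimal sketch of the lifecycle with plain httpx (headers
are covered under Setting them below; the two-call fan-out is illustrative and
the key is a placeholder):

```python
import uuid
import httpx

URL = "https://api.modelux.ai/openai/v1/chat/completions"

def call_llm(prompt: str, conversation_id: str, trace_id: str) -> str:
    # One HTTP call = one request; the IDs ride along as headers.
    resp = httpx.post(
        URL,
        headers={
            "Authorization": "Bearer mlx_sk_...",  # placeholder key
            "X-Modelux-Conversation-Id": conversation_id,
            "X-Modelux-Trace-Id": trace_id,
        },
        json={
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    return resp.json()["choices"][0]["message"]["content"]

conversation_id = f"chat-{uuid.uuid4()}"  # minted once, when the thread starts

def handle_user_turn(question: str) -> str:
    trace_id = f"turn-{uuid.uuid4()}"  # minted once per user-facing turn
    # Both calls serve one turn, so they share one trace_id; this is
    # the fan-out that shows up per trace in the dashboard.
    query = call_llm(f"Rewrite as a search query: {question}",
                     conversation_id, trace_id)
    return call_llm(f"Answer the question: {question}\nSearch query used: {query}",
                    conversation_id, trace_id)
```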
Setting them
HTTP headers — these work with any SDK that lets you attach custom headers:
```bash
curl https://api.modelux.ai/openai/v1/chat/completions \
  -H "Authorization: Bearer mlx_sk_..." \
  -H "X-Modelux-Conversation-Id: chat-42" \
  -H "X-Modelux-Trace-Id: turn-17" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hi"}]}'
```
Body metadata — for SDKs that don’t expose custom headers (some chat
libraries), modelux reads metadata.trace_id and metadata.conversation_id
from the OpenAI-compatible request body as a fallback:
```json
{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "Hi"}],
  "metadata": {
    "conversation_id": "chat-42",
    "trace_id": "turn-17"
  }
}
```
The same fallback works on the Anthropic surface — set metadata.trace_id /
metadata.conversation_id alongside the existing metadata.user_id.
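For example, an Anthropic-surface request body carrying all three looks like
this (the model name and ID values are placeholders):

```json
{
  "model": "claude-3-5-haiku-latest",
  "max_tokens": 256,
  "messages": [{"role": "user", "content": "Hi"}],
  "metadata": {
    "user_id": "u-123",
    "conversation_id": "chat-42",
    "trace_id": "turn-17"
  }
}
```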
modelux parses the metadata and strips it before forwarding to the upstream
provider, so OpenAI/Anthropic won’t see unknown fields.
Headers win when both are set.
Where you see it in the dashboard
- Overview — the “Top conversations by cost” card ranks long-lived threads
  by spend in the selected window. Hidden when no requests in the window
  carry a conversation_id.
- Analytics — the “Trace fan-out” histogram shows the distribution of
  requests per trace (1, 2-3, 4-6, 7-10, 11-20, 21+). The right tail is
  where runaway tool loops and buggy agents show up. A “Worst trace →” link
  jumps to the single trace with the highest fan-out.
- Logs — filter by traceId or conversationId in the filter bar. When a
  conversationId is set, a summary card appears above the logs table with
  the full per-trace timeline of that conversation. The stats bar above the
  table shows N conversations · M traces · K requests · $X for the current
  filter.
- Log detail — Trace ID and Conversation ID render as clickable links that
  drill into the filtered view.
Management API + MCP
```
GET /manage/v1/logs?conversationId=<id>&traceId=<id>
GET /manage/v1/conversations/<id>    # timeline + aggregates
```
MCP tools: list_logs takes the same filter params, and get_conversation
returns the timeline programmatically.
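For example, pulling a conversation’s timeline from a script (a sketch: the
host and Bearer-token auth below are assumed to match the proxy; adjust to
your deployment):

```bash
curl "https://api.modelux.ai/manage/v1/conversations/chat-42" \
  -H "Authorization: Bearer mlx_sk_..."   # assumes proxy-style Bearer auth
```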
What to use them for
| Question | Filter or view |
|---|---|
| “Which chat threads are costing the most?” | Overview → Top conversations |
| “Are any of my agent runs fanning out too much?” | Analytics → Trace fan-out |
| “Show me every call in conversation X” | /logs?conversationId=X |
| “Show me every call in this one agent run” | /logs?traceId=X |
| “What’s the cost of this conversation?” | Conversation summary card on /logs?conversationId=X |
What about end_user_id?
End user is an orthogonal dimension — one user can span many conversations,
and one conversation can involve multiple users (e.g. support handoff).
Set it via X-Modelux-User-Id or the OpenAI-compatible user field on the
request body. It answers “who” rather than “which thread.” All three
(end_user_id, conversation_id, trace_id) can be set independently.
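A request that sets all three independently, then, looks like this (values are
placeholders):

```bash
curl https://api.modelux.ai/openai/v1/chat/completions \
  -H "Authorization: Bearer mlx_sk_..." \
  -H "X-Modelux-User-Id: user-7" \
  -H "X-Modelux-Conversation-Id: chat-42" \
  -H "X-Modelux-Trace-Id: turn-17" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hi"}]}'
```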