Session grouping
Every request through modelux lives in a three-level hierarchy:
conversation            ← a long-lived thread
└── trace               ← one agent run / user turn
    └── request         ← one HTTP call to the proxy
Request is the lowest level: a single HTTP call in, a single response out. modelux
gives every request a UUID and echoes it back in the X-Modelux-Request-Id
response header. It is always present; you don’t set it, modelux does.
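If you want to capture that ID programmatically (say, to attach it to your own
logs or support tickets), read it off the response headers. A minimal sketch with
the openai Python SDK; the base URL mirrors the curl example under Setting them
below, and the key is a placeholder:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.modelux.ai/openai/v1",  # modelux's OpenAI-compatible surface
    api_key="mlx_sk_...",                         # placeholder key
)

# with_raw_response exposes the raw HTTP response alongside the parsed body
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hi"}],
)

request_id = raw.headers.get("X-Modelux-Request-Id")  # echoed on every response
completion = raw.parse()                              # the usual ChatCompletion object
```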
Trace groups the LLM calls that serve a single user-facing prompt. In a
simple chatbot that’s one LLM call per trace. In an agent workflow (tool
loops, query-rewrite → retrieve → synthesize chains) one user turn fans
out into several LLM calls, all sharing one trace_id. The vocabulary matches
OpenTelemetry, LangSmith, and Langfuse.
Conversation groups turns in a long-lived thread. A chatbot session
between a user and your app spans many user↔assistant exchanges; each turn
is one trace; the whole thread shares one conversation_id.
Both trace_id and conversation_id are customer-supplied — modelux
doesn’t invent them because only you know where your turn boundaries are.
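Concretely, your app mints the IDs at its own boundaries: one conversation_id
when a thread starts, one trace_id per user turn, shared by every call that
serves that turn. A minimal sketch of the lifecycle with plain httpx (headers
are covered under Setting them below; the two-call fan-out is illustrative and
the key is a placeholder):

```python
import uuid
import httpx

URL = "https://api.modelux.ai/openai/v1/chat/completions"

def call_llm(prompt: str, conversation_id: str, trace_id: str) -> str:
    # One HTTP call = one request; the IDs ride along as headers.
    resp = httpx.post(
        URL,
        headers={
            "Authorization": "Bearer mlx_sk_...",  # placeholder key
            "X-Modelux-Conversation-Id": conversation_id,
            "X-Modelux-Trace-Id": trace_id,
        },
        json={
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    return resp.json()["choices"][0]["message"]["content"]

conversation_id = f"chat-{uuid.uuid4()}"  # minted once, when the thread starts

def handle_user_turn(question: str) -> str:
    trace_id = f"turn-{uuid.uuid4()}"  # minted once per user-facing turn
    # Both calls serve one turn, so they share one trace_id; this is
    # the fan-out that shows up per trace in the dashboard.
    query = call_llm(f"Rewrite as a search query: {question}",
                     conversation_id, trace_id)
    return call_llm(f"Answer the question: {question}\nSearch query used: {query}",
                    conversation_id, trace_id)
```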
Setting them
HTTP headers — these work with any SDK that lets you attach custom headers:
```bash
curl https://api.modelux.ai/openai/v1/chat/completions \
  -H "Authorization: Bearer mlx_sk_..." \
  -H "X-Modelux-Conversation-Id: chat-42" \
  -H "X-Modelux-Trace-Id: turn-17" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hi"}]}'
```
Body metadata — for SDKs that don’t expose custom headers (some chat
libraries), modelux reads metadata.trace_id and metadata.conversation_id
from the OpenAI-compatible request body as a fallback:
```json
{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "Hi"}],
  "metadata": {
    "conversation_id": "chat-42",
    "trace_id": "turn-17"
  }
}
```
The same fallback works on the Anthropic surface — set metadata.trace_id /
metadata.conversation_id alongside the existing metadata.user_id.
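For example, an Anthropic-surface request body carrying all three looks like
this (the model name and ID values are placeholders):

```json
{
  "model": "claude-3-5-haiku-latest",
  "max_tokens": 256,
  "messages": [{"role": "user", "content": "Hi"}],
  "metadata": {
    "user_id": "u-123",
    "conversation_id": "chat-42",
    "trace_id": "turn-17"
  }
}
```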
modelux parses the metadata and strips it before forwarding to the upstream
provider, so OpenAI/Anthropic won’t see unknown fields.
Headers win when both are set.
Where you see it in the dashboard
- Overview — the “Top conversations by cost” card ranks long-lived threads
  by spend in the selected window. Hidden when no requests in the window
  carry a conversation_id.
- Analytics — the “Trace fan-out” histogram shows the distribution of
  requests per trace (1, 2-3, 4-6, 7-10, 11-20, 21+). The right tail is
  where runaway tool loops and buggy agents show up. A “Worst trace →” link
  jumps to the single trace with the highest fan-out.
- Logs — filter by traceId or conversationId in the filter bar. When a
  conversationId is set, a summary card appears above the logs table with
  the full per-trace timeline of that conversation. The stats bar above the
  table shows N conversations · M traces · K requests · $X for the current
  filter.
- Log detail — Trace ID and Conversation ID render as clickable links that
  drill into the filtered view.
Management API + MCP
```
GET /manage/v1/logs?conversationId=<id>&traceId=<id>
GET /manage/v1/conversations/<id>    # timeline + aggregates
```
MCP tools: list_logs takes the same filter params, and get_conversation
returns the timeline programmatically.
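For example, pulling a conversation’s timeline from a script (a sketch: the
host and Bearer-token auth below are assumed to match the proxy; adjust to
your deployment):

```bash
curl "https://api.modelux.ai/manage/v1/conversations/chat-42" \
  -H "Authorization: Bearer mlx_sk_..."   # assumes proxy-style Bearer auth
```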
What to use them for
| Question | Filter or view |
|---|---|
| “Which chat threads are costing the most?” | Overview → Top conversations |
| “Are any of my agent runs fanning out too much?” | Analytics → Trace fan-out |
| “Show me every call in conversation X” | /logs?conversationId=X |
| “Show me every call in this one agent run” | /logs?traceId=X |
| “What’s the cost of this conversation?” | Conversation summary card on /logs?conversationId=X |
What about end_user_id?
End user is an orthogonal dimension — one user can span many conversations,
and one conversation can involve multiple users (e.g. support handoff).
Set it via X-Modelux-User-Id or the OpenAI-compatible user field on the
request body. It answers “who” rather than “which thread.” All three
(end_user_id, conversation_id, trace_id) can be set independently.
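A request that sets all three independently, then, looks like this (values are
placeholders):

```bash
curl https://api.modelux.ai/openai/v1/chat/completions \
  -H "Authorization: Bearer mlx_sk_..." \
  -H "X-Modelux-User-Id: user-7" \
  -H "X-Modelux-Conversation-Id: chat-42" \
  -H "X-Modelux-Trace-Id: turn-17" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hi"}]}'
```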