Embeddings

Create vector embeddings. OpenAI-compatible request and response shape.

POST /openai/v1/embeddings

Request

{
  "model": "text-embedding-3-small",
  "input": ["Hello world", "Another string"]
}

Supports:

  • Single string or array of strings
  • Routing config slugs (@embeddings) just like chat completions
  • OpenAI, Google, Cohere, and Voyage embedding models through their respective providers

Response

{
  "object": "list",
  "data": [
    { "object": "embedding", "index": 0, "embedding": [0.012, -0.034, ...] },
    { "object": "embedding", "index": 1, "embedding": [0.056, 0.089, ...] }
  ],
  "model": "text-embedding-3-small",
  "usage": { "prompt_tokens": 10, "total_tokens": 10 }
}
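
As a rough client-side illustration of the shapes above, the sketch below builds a request body and reads vectors back out of a response. The helper names (build_embedding_request, vectors_in_input_order) and the sample values are hypothetical and not part of modelux; sorting by "index" is a defensive assumption so that vectors line up with the inputs. No network call is made.

```python
from typing import Any, Sequence, Union


def build_embedding_request(model: str, text: Union[str, Sequence[str]]) -> dict[str, Any]:
    """Build the JSON body for POST /openai/v1/embeddings."""
    # The endpoint accepts either a single string or an array of strings.
    payload_input = text if isinstance(text, str) else list(text)
    return {"model": model, "input": payload_input}


def vectors_in_input_order(response: dict[str, Any]) -> list[list[float]]:
    """Return the embeddings sorted by their "index" field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Sample response in the documented shape (values are made up and shortened).
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.056, 0.089]},
        {"object": "embedding", "index": 0, "embedding": [0.012, -0.034]},
    ],
    "model": "text-embedding-3-small",
    "usage": {"prompt_tokens": 10, "total_tokens": 10},
}

body = build_embedding_request("text-embedding-3-small", ["Hello world", "Another string"])
vectors = vectors_in_input_order(sample_response)
```

The body can then be sent with any HTTP client; the host and authentication scheme are not covered in this section, so they are left out here.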

Dimensions

Two ways to control the output vector size:

Per-request (body field): pass dimensions: N for direct provider/model calls. This works on models that support the parameter (text-embedding-3-small, text-embedding-3-large). For models that don’t, modelux strips the field automatically. For example, text-embedding-ada-002 has a fixed 1536-dimension output, and the upstream API returns a 400 if the field is present.
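
The stripping behavior can be pictured roughly as follows. This is only a sketch of the idea, not modelux's actual code: the function name prepare_direct_body is hypothetical, and the set of supporting models is limited to the two named above, so other models that accept the parameter are not represented.

```python
from typing import Any

# Models this section names as accepting the `dimensions` body field.
# Illustrative only: the real list may be longer.
SUPPORTS_DIMENSIONS = {"text-embedding-3-small", "text-embedding-3-large"}


def prepare_direct_body(body: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the request body that is safe to forward upstream."""
    forwarded = dict(body)
    if forwarded.get("model") not in SUPPORTS_DIMENSIONS:
        # e.g. text-embedding-ada-002: fixed-size output, and the upstream
        # API rejects the field, so drop it before forwarding.
        forwarded.pop("dimensions", None)
    return forwarded
```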

Policy-pinned contract: on a routing config, set embedding_dimensions to declare the output shape. Every call to model: "@<slug>" returns a vector of exactly that length, regardless of the underlying model’s default:

POST /manage/v1/routing-configs
{
  "projectId": "proj_…",
  "name": "embeddings",
  "policy": "fallback_chain",
  "config": {
    "models": [
      { "model": "text-embedding-3-small", "provider_credential_id": "cred_…" },
      { "model": "text-embedding-3-large", "provider_credential_id": "cred_…" }
    ]
  },
  "embedding_dimensions": 1536
}

Then callers just pin the slug:

POST /openai/v1/embeddings
{ "model": "@embeddings", "input": "hello" }

No dimensions field is needed: modelux injects the contract into the upstream call, shortens the vector for models that support it (those trained with Matryoshka representation learning), and refuses to route to candidates that can’t produce the pinned length. If the upstream silently returns a different size, modelux responds with 502 dimension_mismatch rather than leaking a wrong-shape vector to the caller.
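
For intuition, shortening a Matryoshka-style embedding is commonly done by keeping the leading components and rescaling to unit length. The sketch below shows that technique together with a length check in the spirit of dimension_mismatch. It assumes truncate-then-renormalize; how modelux performs the shortening internally is not specified here, and both function names are hypothetical.

```python
import math


def shorten_embedding(vector: list[float], dimensions: int) -> list[float]:
    """Keep the first `dimensions` components and rescale to unit length."""
    if dimensions <= 0 or dimensions > len(vector):
        raise ValueError("cannot produce the requested number of dimensions")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


def check_contract(vector: list[float], pinned: int) -> list[float]:
    """Raise instead of handing back a vector of the wrong length."""
    if len(vector) != pinned:
        raise ValueError(f"dimension_mismatch: expected {pinned}, got {len(vector)}")
    return vector
```

A caller that stores vectors in a fixed-width index could run a check like check_contract on its own side as well, as a second line of defense.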

This is the recommended pattern for LangChain and similar clients that strip dimensions from the request body when the model name isn’t on their allow-list (e.g., any custom @slug alias).

Conflict with body.dimensions

When a config pins embedding_dimensions and the caller also sends dimensions in the body:

  • match → allowed (redundant but harmless)
  • mismatch → 400 dimension_conflict

Direct provider/model calls (no @slug) still honor body dimensions as before.
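
The rules in this section reduce to a small decision function. The sketch below is illustrative only: effective_dimensions and DimensionConflict are hypothetical names, with the exception standing in for the 400 dimension_conflict response.

```python
from typing import Optional


class DimensionConflict(Exception):
    """Stands in for the 400 dimension_conflict response."""


def effective_dimensions(pinned: Optional[int], requested: Optional[int]) -> Optional[int]:
    """Resolve the output size from the config pin and the body field."""
    if pinned is None:
        # Direct provider/model call (no @slug): the body value is honored.
        return requested
    if requested is not None and requested != pinned:
        # The caller asked for a size the config does not allow.
        raise DimensionConflict(f"config pins {pinned}, body requested {requested}")
    # Either the body omitted the field or it matched the pin.
    return pinned
```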