
OpenAI-Compatible API

MindRoom exposes an OpenAI-compatible chat completions API so any chat frontend can use MindRoom agents as selectable "models". LibreChat, Open WebUI, LobeChat, ChatBox, BoltAI, and anything else that speaks the OpenAI protocol works out of the box.

How It Works

The frontend calls GET /v1/models and sees your agents in the model picker. The user picks an agent and chats. The frontend sends standard OpenAI requests; MindRoom routes them to the selected agent with all its tools, instructions, and memory. The frontend doesn't know it's talking to an agent — it's transparent.

Chat Frontend (LibreChat, Open WebUI, etc.)
│  GET  /v1/models           → returns your agents as "models"
│  POST /v1/chat/completions → routes to the selected agent
└──→ MindRoom API ──→ ai_response() / stream_agent_response()
                         └──→ agents, tools, memory, knowledge bases

No Matrix auth dependency. You can run the OpenAI-compatible API standalone or alongside the Matrix bot.

Setup

1. Set API keys

Add to your .env:

# Option A: Set API keys (recommended for production)
OPENAI_COMPAT_API_KEYS=sk-my-secret-key-1,sk-my-secret-key-2

# Option B: Allow unauthenticated access (local dev only)
OPENAI_COMPAT_ALLOW_UNAUTHENTICATED=true

Without either of these, the API returns 401 on all requests.

2. Start MindRoom

# Full MindRoom runtime (Matrix bot + API server + dashboard)
uv run mindroom run

# Or via just
just start-mindroom-dev

The API is available at http://localhost:8765/v1/.

Important

If the dashboard and /v1/* share a domain behind a reverse proxy, route /v1/* to the MindRoom runtime (in addition to /api/*). Otherwise OpenAI-compatible requests may be intercepted by the dashboard and fail.

3. Verify

# List available agents
curl -H "Authorization: Bearer sk-my-secret-key-1" \
  http://localhost:8765/v1/models

# Chat (non-streaming)
curl -H "Authorization: Bearer sk-my-secret-key-1" \
  -H "Content-Type: application/json" \
  -d '{"model":"general","messages":[{"role":"user","content":"Hello"}]}' \
  http://localhost:8765/v1/chat/completions

# Chat (streaming)
curl -N -H "Authorization: Bearer sk-my-secret-key-1" \
  -H "Content-Type: application/json" \
  -d '{"model":"general","messages":[{"role":"user","content":"Hello"}],"stream":true}' \
  http://localhost:8765/v1/chat/completions

Client Configuration

LibreChat

Add to your librechat.yaml:

endpoints:
  custom:
    - name: "MindRoom"
      apiKey: "${MINDROOM_API_KEY}"
      baseURL: "http://localhost:8765/v1"
      models:
        default: ["general"]
        fetch: true
      modelDisplayLabel: "MindRoom"
      titleConvo: true
      titleModel: "general"
      dropParams: ["stop", "frequency_penalty", "presence_penalty", "top_p"]
      headers:
        # Highest-priority session key used by MindRoom
        X-Session-Id: "{{LIBRECHAT_BODY_CONVERSATIONID}}"
        # Backward-compatible fallback used by MindRoom
        X-LibreChat-Conversation-Id: "{{LIBRECHAT_BODY_CONVERSATIONID}}"

X-Session-Id is recommended when you want deterministic MindRoom session continuity, which matters especially for tools that keep long-lived sessions inside the MindRoom runtime. X-LibreChat-Conversation-Id alone still preserves continuity if you already rely on it.

Open WebUI

  1. Go to Admin Settings > Connections > OpenAI > Manage
  2. Set API URL to http://localhost:8765/v1
  3. Set API Key to one of your OPENAI_COMPAT_API_KEYS
  4. Agents appear automatically in the model picker

Any OpenAI-compatible client

Point the base URL at http://localhost:8765/v1 and set the API key. MindRoom implements the OpenAI-compatible GET /v1/models and POST /v1/chat/completions endpoints.
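For clients without a dedicated integration, the same two endpoints can be exercised from Python using only the standard library. This is an illustrative sketch, not an official client: the base URL, port, and placeholder key come from the setup steps above, and the helper names (build_request, chat) are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765/v1"   # MindRoom API root (default port)
API_KEY = "sk-my-secret-key-1"          # one of OPENAI_COMPAT_API_KEYS

def build_request(model, content, stream=False):
    """Assemble a standard OpenAI chat-completions request for MindRoom."""
    payload = {
        "model": model,  # agent name, "auto", or "team/<team_name>"
        "messages": [{"role": "user", "content": content}],
        "stream": stream,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

def chat(model, content):
    """Send a non-streaming request and return the assistant's reply text."""
    with urllib.request.urlopen(build_request(model, content)) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# chat("general", "Hello")  # requires a running MindRoom instance
```

Because MindRoom speaks the plain OpenAI protocol, any official OpenAI SDK pointed at the same base URL and key works identically.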

Features

Model selection

Each agent in config.yaml appears as a selectable model. The model ID is the agent's internal name (e.g., code, research), and the display name comes from display_name. Only shared agents appear in /v1/models: those that are unscoped or explicitly configured with worker_scope=shared. Agents configured with agents.<name>.private are not listed, because private.per creates requester-private instances and therefore an isolating execution scope.

Auto-routing

Select the auto model to let MindRoom's router pick the best agent for each message, the same routing logic used in Matrix rooms. Once routing resolves a specific agent, session continuity and streamed identity bind to that resolved agent name, not the literal auto label.

Teams

Teams are exposed as team/<team_name> models. Selecting team/super_team runs the full team collaboration or coordination workflow.

Streaming

stream: true returns Server-Sent Events in the standard OpenAI format: a role chunk, then content chunks, a finish chunk, and [DONE].

Tool calls appear inline as text in the stream rather than as native OpenAI tool_calls deltas: MindRoom currently emits tool events in stream chunks as <tool id="N" state="start|done">...</tool> content.
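A consumer of this stream therefore only needs standard SSE delta handling; inline tool markup arrives as ordinary content text. A minimal sketch (the function name is illustrative):

```python
import json

def collect_stream_text(sse_lines):
    """Accumulate assistant text from OpenAI-style SSE chunk lines.

    Expects an iterable of 'data: {...}' lines; stops at the [DONE]
    sentinel. Inline <tool ...> markup is kept as-is, since MindRoom
    emits tool events as plain content rather than tool_calls deltas.
    """
    text = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comments
        data = line[len("data: "):].strip()
        if data == "[DONE]":
            break
        delta = json.loads(data)["choices"][0]["delta"]
        text.append(delta.get("content") or "")  # role/finish chunks carry no content
    return "".join(text)
```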

Multimodal messages

When a message's content is an array of content parts (the OpenAI multimodal format), MindRoom extracts only the text parts and concatenates them as the prompt. Non-text parts such as image_url are silently ignored by the current implementation. Agents still process the text normally with all their configured tools and instructions.
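The extraction rule can be pictured as follows. This is a sketch of the described behavior, not MindRoom's actual code, and the join separator is an assumption (the docs only say the text parts are concatenated):

```python
def extract_prompt_text(content):
    """String content passes through; a list of OpenAI content parts
    is reduced to its text parts only (illustrative sketch)."""
    if isinstance(content, str):
        return content
    texts = [part["text"] for part in content if part.get("type") == "text"]
    return "\n".join(texts)  # non-text parts such as image_url are dropped
```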

Session continuity

Session IDs are derived from request headers:

  1. X-Session-Id header (explicit control)
  2. X-LibreChat-Conversation-Id header (automatic with LibreChat)
  3. Random UUID fallback

Agent memory and conversation history persist across requests with the same session ID. For persistent MindRoom tool sessions (for example a long-running coding session), prefer X-Session-Id.

Session IDs are namespaced internally with a hash of the API key to prevent cross-key session collision. Two different API keys using the same X-Session-Id value will not share a session.
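The header precedence and key namespacing can be sketched like this. The header names are real; the hash function and ID format are assumptions for illustration:

```python
import hashlib
import uuid

def derive_session_id(headers, api_key):
    """Pick the session ID by the documented precedence, then namespace
    it per API key (sketch; exact hashing/format is an assumption)."""
    raw = (
        headers.get("X-Session-Id")
        or headers.get("X-LibreChat-Conversation-Id")
        or str(uuid.uuid4())          # fallback: no continuity across requests
    )
    key_ns = hashlib.sha256(api_key.encode()).hexdigest()[:12]
    return f"{key_ns}:{raw}"          # same raw ID, different key -> different session
```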

Claude Agent tool sessions

If an agent enables the claude_agent tool, the same X-Session-Id keeps the Claude session alive across turns. This lets a user continue one long coding flow instead of starting a fresh Claude process on every request. See Claude Agent Sessions for configuration details.

Parallel Claude sub-sessions are supported by using different session_label values in tool calls:

  • Same session_label: one shared Claude session (serialized by a per-session lock)
  • Different session_label: independent Claude sessions that can run concurrently
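The label-to-session mapping behaves roughly like the pool below. The class and its internals are illustrative, not MindRoom's implementation:

```python
import threading
from collections import defaultdict

class ClaudeSessionPool:
    """Sketch of session_label semantics: one session and one lock per label."""

    def __init__(self):
        self._sessions = {}                        # label -> session handle
        self._locks = defaultdict(threading.Lock)  # label -> per-session lock

    def run(self, session_label, work):
        # Same label: calls serialize on one shared session.
        # Different labels: independent locks, so calls run concurrently.
        with self._locks[session_label]:
            session = self._sessions.setdefault(session_label, object())
            return work(session)
```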

Knowledge bases

Agents with configured knowledge_bases in config.yaml get RAG support automatically. No additional API configuration needed. For Git-backed knowledge bases, API-only deployments auto-clone/sync/index on manager initialization.

What's ignored

The API accepts but ignores these OpenAI parameters (the agent's own config controls them):

  • temperature, top_p, max_tokens, max_completion_tokens
  • tools, tool_choice (agents use their configured tools)
  • n, stop, frequency_penalty, presence_penalty, seed
  • response_format, logprobs, logit_bias
  • stream_options (usage stats are always zeros)

Client system / developer messages are prepended to the prompt. They augment the agent's built-in instructions rather than replacing them.
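The prepending rule amounts to the following. This is an illustrative sketch of the documented ordering; the join format is an assumption:

```python
def assemble_prompt(messages):
    """Client system/developer messages come first, then user content,
    all layered on top of (not replacing) the agent's own instructions."""
    system = [m["content"] for m in messages if m["role"] in ("system", "developer")]
    user = [m["content"] for m in messages if m["role"] == "user"]
    return "\n\n".join(system + user)
```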

Authentication

OPENAI_COMPAT_API_KEYS   OPENAI_COMPAT_ALLOW_UNAUTHENTICATED   Behavior
Set                      (any)                                 Bearer token required; must match one of the comma-separated keys
Unset                    true                                  No authentication required
Unset                    Unset/false                           All requests return 401 (locked)
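As a decision procedure, the table reads as follows. This is a sketch of the documented rules, not the server's actual code:

```python
def authorize(api_keys_env, allow_unauth, bearer):
    """Return the HTTP status implied by the auth table above."""
    if api_keys_env:  # keys set: bearer must match one of them
        keys = {k.strip() for k in api_keys_env.split(",") if k.strip()}
        return 200 if bearer in keys else 401
    if allow_unauth:  # no keys, explicitly opened for local dev
        return 200
    return 401        # neither set: locked
```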

The OpenAI-compatible API uses its own auth (OPENAI_COMPAT_API_KEYS), separate from the dashboard API auth. In standalone mode, the dashboard /api/* endpoints can be protected with MINDROOM_API_KEY; the browser dashboard uses a same-origin auth cookie, while CLI and curl clients can still send Authorization: Bearer .... The two mechanisms are independent: MINDROOM_API_KEY secures the dashboard, OPENAI_COMPAT_API_KEYS secures the /v1/* chat completions endpoints.

Limitations

  • Token usage is always zeros — Agno doesn't expose token counts
  • No native tool_calls format — tool results appear inline in content text
  • show_tool_calls config is Matrix-only today — OpenAI-compatible /v1/chat/completions currently includes tool-call text/events regardless of show_tool_calls: false
  • No room memory — only agent-scoped memory (no room_id in API requests)
  • No requester-private instances — /v1 supports only shared agents (unscoped or worker_scope=shared), so agents.<name>.private and other isolating execution scopes are not available
  • Scheduler tool unavailable — scheduling requires Matrix context; the tool returns an error when no Matrix scheduling context is available