Agent Orchestration

The MultiAgentOrchestrator (in src/mindroom/orchestrator.py) manages the lifecycle of all agents, teams, and the router.

Boot Sequence

main() entry
       │
       ▼
┌──────────────────┐
│ Sync Provider    │
│ Credentials      │
│ (.env/bootstrap  │
│ env → shared     │
│ credentials)     │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Initialize()    │
│ ─────────────────│
│ 1. Parse config  │
│    (Pydantic)    │
│ 2. Load plugins  │
│ 3. Create "user" │
│    Matrix account│
│    (mindroom_user)│
│ 4. Prepare       │
│    entity Matrix │
│    accounts      │
│ 5. Create bots   │
│    for entities  │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│    Start()       │
│ ─────────────────│
│ 1. try_start()   │
│    each bot      │
│ 2. Setup rooms   │
│    & memberships │
│ 3. Create sync   │
│    tasks         │
└────────┬─────────┘
         │
         ▼
┌──────────────────────────────────────┐
│  Auxiliary Tasks (auto-restart)      │
│ ─────────────────────────────────────│
│ • config watcher (file polling)      │
│ • skills watcher (skill cache)       │
│ • API server (if enabled)            │
│  (each wrapped in                    │
│   _run_auxiliary_task_forever)        │
└───────────────┬──────────────────────┘
                │
                ▼
┌──────────────────────────────────────┐
│  Bot Sync Tasks (asyncio.gather)     │
│ ─────────────────────────────────────│
│ • One sync loop per bot              │
│ • sync_forever_with_restart()        │
│ • Awaited until shutdown             │
└──────────────────────────────────────┘

Key details:

Entity order: Router first, then agents, then teams
Room setup (_setup_rooms_and_memberships): Router creates rooms, invites agents, teams, and users, then bots join
Sync loops: Each bot runs sync_forever_with_restart() with automatic retry
Internal user identity: mindroom_user.username is the account-creation request; runtime authorization uses the persisted actual Matrix ID

Hot Reload

Config changes are detected via polling (watch_file() checks st_mtime every second):

On change, update_config() is called
_identify_entities_to_restart() computes diff using model_dump(exclude_none=True)
Affected entities are stopped, recreated, and restarted
Removed entities run cleanup() (leave rooms, stop bot)
New/restarted bots go through room setup

Skills are watched separately via _watch_skills_task() with cache invalidation.

Orchestration Subpackage

The src/mindroom/orchestration/ subpackage contains helpers extracted from the monolithic orchestrator:

runtime.py — Sync loop helpers: sync_forever_with_restart() with linear backoff (capped at 60s), cancel_task(), and create_logged_task() for safe asyncio task creation.
config_updates.py — Config diffing and reload planning: build_config_update_plan() computes a ConfigUpdatePlan by calling _identify_entities_to_restart(), which diffs old and new configs using model_dump(exclude_none=True).
rooms.py — Room invitation helpers: get_authorized_user_ids_to_invite() and get_root_space_user_ids_to_invite() compute which users should be invited to managed rooms and the root Matrix space.

Runtime Resolution

Agent and team materialization is handled by dedicated top-level modules (not inside the orchestration/ subpackage):

src/mindroom/runtime_resolution.py — Resolves ResolvedAgentRuntime (the full set of runtime parameters for one agent instance) including ResolvedKnowledgeBinding for knowledge base attachment.
src/mindroom/team_exact_members.py — Resolves ResolvedExactTeamMembers for team materialization via materialize_exact_requested_team_members().
src/mindroom/agent_policy.py — Resolves canonical execution policies and private-team eligibility derived from authored agent config.
src/mindroom/model_loading.py — Owns get_model_instance() and provider-specific model loader selection.
src/mindroom/ai_runtime.py — Owns agent-run input copying, queued-notice hooks, and inline-media fallback helpers used during execution.
src/mindroom/agent_storage.py — Owns agent session and learning SQLite storage construction helpers.
src/mindroom/agent_descriptions.py — Owns shared agent description rendering used by routing and delegation.
src/mindroom/runtime_state.py — Shared runtime readiness state with set_runtime_starting(), set_runtime_ready(), and set_runtime_failed() used by health endpoints.

Message Handling

Event callbacks are wrapped in _create_task_wrapper() to run as background tasks, ensuring the sync loop is never blocked.

_on_message flow:

Skip own messages (except voice transcriptions from router)
Check sender authorization and handle edits
Check if already responded (ResponseTracker)
Router handles commands exclusively
Extract message context (mentions, thread history, non-agent mention detection)
Skip messages from other agents (unless mentioned)
Router routes when no agent or team is mentioned and thread doesn't have multiple human participants
Check for team formation or individual response
Generate response and store memory

Message edits: When a user edits a message that already received an agent response, the agent regenerates its response for the updated content. The agent edits its own previous reply in place rather than sending a new message. Edits from other agents are ignored, and the feature requires that the original response event ID is tracked by the ResponseTracker.

_on_media_message: Handles media events (images, videos, files, and audio). Downloads and decrypts media data, then processes it through the selected responder. When no agent or team is mentioned, routing selects the appropriate agent or team, similar to text messages.

_on_reaction: Handles ReactionEvent for the interactive Q&A system (e.g., confirming or rejecting agent suggestions) and config confirmation workflows.

Routing (when no agent or team is mentioned): Router narrows candidates from room configuration or joined MindRoom entities, filters them by sender permissions, lets one remaining candidate answer directly, and uses suggest_responder_for_message() only when multiple candidates remain. In threads where multiple non-agent users have posted, routing is skipped entirely — an explicit @mention is required. Non-MindRoom bots listed in bot_accounts are excluded from this detection.

Concurrency

Each bot runs its own sync loop via sync_forever_with_restart()
Sync loop failures trigger automatic restart with linear backoff (5s, 10s, 15s, ... up to 60s max)
Event callbacks run as background tasks (never block the sync loop)
ResponseTracker prevents duplicate replies
StopManager handles cancellation of in-progress responses

Graceful Shutdown

On orchestrator.stop():

Set self.running = False
Cancel config reload task
Stop memory auto-flush worker
Shut down the per-binding knowledge refresh scheduler
Cancel pending bot start tasks
Stop the MCP manager
Cancel all sync tasks
Signal all bots to stop (bot.running = False)
Call bot.stop() for each bot concurrently (waits 5s for background tasks, cancels scheduled tasks, closes Matrix client)