Attachments
MindRoom can process files, images, audio, and videos sent to Matrix rooms, passing them to agents for analysis or action.
Supported attachment kinds: audio, file, image, video.
Overview
When a user sends a file, image, audio message, or video in a Matrix room:
- The agent determines whether it should respond (via mention, thread participation, or DM)
- The media is downloaded and decrypted (if E2E encrypted)
- The file is saved locally and registered as a context-scoped attachment
- The agent receives the media as an Agno File, Video, Audio, or Image object plus an attachment ID it can reference in tool calls
- The agent responds with its analysis or takes action on the file
Attachment support works automatically for all agents -- no configuration is needed.
How It Works
┌──────────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ File/Image/Audio │────>│ Download &  │────>│  Register   │────>│ Pass to AI  │
│ /Video (Matrix)  │     │   Decrypt   │     │ Attachment  │     │    Model    │
└──────────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                                                        │
                                                                        v
                                                                 ┌─────────────┐
                                                                 │    Agent    │
                                                                 │  Responds   │
                                                                 └─────────────┘
Usage
Send a file, image, audio message, or video in a Matrix room and mention the agent in the caption:
- With caption: @assistant Summarize this document -- the caption is used as the prompt
- Without caption: The agent receives [Attached file], [Attached image], [Attached audio], or [Attached video] as the prompt
- Bare filename: If the body is just the filename (e.g., report.pdf), it is treated the same as no caption
Attachments work in both direct messages and threads, and with both individual agents and teams.
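The caption rules above can be sketched in a few lines of Python. This is only an illustration: the function name, its parameters, and the exact comparison against the filename are assumptions, not MindRoom's actual implementation.

```python
# Hypothetical helper; names and signature are illustrative, not MindRoom's API.
PLACEHOLDERS = {
    "file": "[Attached file]",
    "image": "[Attached image]",
    "audio": "[Attached audio]",
    "video": "[Attached video]",
}


def prompt_for_attachment(body: str, filename: str, kind: str) -> str:
    """Return the prompt text derived from an attachment event's body."""
    caption = body.strip()
    # An empty body, or a body that is just the filename, counts as "no caption".
    if not caption or caption == filename:
        return PLACEHOLDERS[kind]
    return caption
```

Under these assumptions, a body of report.pdf for a file named report.pdf yields the [Attached file] placeholder, while any other non-empty body is passed through as the prompt.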
Attachment IDs
Each uploaded file or video is assigned a stable attachment ID (e.g., att_abc123).
The agent's prompt is augmented with the available IDs:
Attachment IDs are context-scoped -- an attachment registered in one room or thread is not accessible from another. This prevents cross-room data leakage for ID-based access. Voice raw-audio fallback uses the same attachment ID mechanism; see Voice Fallback.
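To make the scoping rule concrete, here is a minimal sketch of a registry that only resolves an ID from the context it was registered in. The Context and AttachmentRegistry classes, the exact-match rule on room and thread, and the ID format are all assumptions made for illustration; MindRoom's real storage may differ.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Context:
    """Hypothetical context key: a room plus an optional thread."""
    room_id: str
    thread_id: Optional[str] = None


@dataclass
class AttachmentRegistry:
    """Hypothetical in-memory registry mapping IDs to (context, path)."""
    _records: Dict[str, Tuple[Context, str]] = field(default_factory=dict)

    def register(self, context: Context, local_path: str) -> str:
        # The att_ prefix matches the docs; the random suffix is an assumption.
        attachment_id = f"att_{uuid.uuid4().hex[:12]}"
        self._records[attachment_id] = (context, local_path)
        return attachment_id

    def get(self, context: Context, attachment_id: str) -> Optional[str]:
        record = self._records.get(attachment_id)
        # Unknown IDs and IDs from another context look the same to the caller.
        if record is None or record[0] != context:
            return None
        return record[1]
```

Returning None for both unknown and out-of-context IDs means a caller in another room cannot even learn that the ID exists.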
The attachments Tool
Agents can use the optional attachments tool to interact with context-scoped attachments programmatically.
Enabling
Add attachments to the agent's tool list:
Operations
| Operation | Description |
|---|---|
| list_attachments(target?) | List metadata for attachments in the current context (ID, kind, local_path, filename, MIME type, size, room_id, thread_id, sender, created_at) |
| get_attachment(attachment_id) | Return one context attachment record, including its local file path |
| register_attachment(file_path) | Register a local file path as a context attachment ID (att_*) |
Use matrix_message(action="send"|"reply"|"thread-reply", attachment_ids=..., attachment_file_paths=...) to send attachments.
attachment_ids accepts only context attachment IDs (att_*).
attachment_file_paths accepts local file paths and auto-registers them in the current context before sending.
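The difference between the two parameters can be sketched as follows. This is a hypothetical helper, not the tool's code: the function name, the prefix check, and the order in which IDs are combined are assumptions. The register callback stands in for whatever registers a path in the current context.

```python
from typing import Callable, Iterable, List


def resolve_outgoing(
    attachment_ids: Iterable[str],
    attachment_file_paths: Iterable[str],
    register: Callable[[str], str],
) -> List[str]:
    """Combine explicit IDs and file paths into one list of attachment IDs."""
    resolved: List[str] = []
    for attachment_id in attachment_ids:
        # Assumption: anything without the att_ prefix is rejected outright.
        if not attachment_id.startswith("att_"):
            raise ValueError(f"not an attachment ID: {attachment_id!r}")
        resolved.append(attachment_id)
    for path in attachment_file_paths:
        # Paths are registered first, so the send step only ever sees IDs.
        resolved.append(register(path))
    return resolved
```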
Why use this tool?
Not all AI models support direct file inputs.
The attachments tool lets any model work with files by calling tools that operate on attachment IDs, even if the model itself cannot ingest the raw bytes.
Encryption
Both unencrypted and E2E encrypted files and videos are supported. Encrypted media is decrypted transparently using the key material from the Matrix event.
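For reference, the key material lives in the event content itself. The sketch below pulls it out of a media event's content dictionary using the field names from the Matrix client-server specification (url for plain media, file with url, key, iv, and hashes for encrypted media). The MediaSource type and the media_source function are hypothetical names for illustration; the actual decryption step is left out.

```python
from typing import Any, Dict, NamedTuple, Optional


class MediaSource(NamedTuple):
    """Hypothetical container for what a downloader needs to know."""
    mxc_url: str
    encrypted: bool
    key: Optional[str]     # base64url AES key from the JWK ("k")
    iv: Optional[str]      # base64 initialization vector
    sha256: Optional[str]  # base64 hash of the ciphertext


def media_source(content: Dict[str, Any]) -> MediaSource:
    """Extract the download URL and any key material from event content."""
    encrypted_file = content.get("file")
    if encrypted_file is not None:
        # Encrypted media: the URL moves inside the "file" object.
        return MediaSource(
            mxc_url=encrypted_file["url"],
            encrypted=True,
            key=encrypted_file["key"]["k"],
            iv=encrypted_file["iv"],
            sha256=encrypted_file["hashes"]["sha256"],
        )
    # Unencrypted media: a plain top-level "url".
    return MediaSource(content["url"], False, None, None, None)
```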
Caching
AI response caching is automatically skipped when files, images, audio, or videos are present, since media payloads are large and unlikely to repeat.
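As a rough illustration of that rule, a cache lookup could simply decline to produce a key whenever media is attached. The function name and the choice of a SHA-256 prompt hash as the key are assumptions for this sketch, not a description of MindRoom's cache.

```python
import hashlib
from typing import Optional, Sequence


def response_cache_key(prompt: str, media: Sequence[object]) -> Optional[str]:
    """Return a cache key for text-only requests, or None to skip caching."""
    if media:
        # Any attached file, image, audio, or video bypasses the cache.
        return None
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()
```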
Retention
MindRoom automatically prunes attachment metadata and managed incoming_media/ files older than 30 days.
Pruning runs opportunistically during new attachment registration.
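Opportunistic pruning of this kind can be sketched as below: every registration first drops records past the retention window, so no background job is needed. The record layout (ID mapped to creation time) and the function names are assumptions, and deleting the corresponding files under incoming_media/ is omitted.

```python
from datetime import datetime, timedelta
from typing import Dict, List

RETENTION = timedelta(days=30)


def prune_expired(records: Dict[str, datetime], now: datetime) -> List[str]:
    """Remove records older than the retention window; return their IDs."""
    cutoff = now - RETENTION
    expired = [aid for aid, created_at in records.items() if created_at < cutoff]
    for aid in expired:
        del records[aid]
    return expired


def register(records: Dict[str, datetime], attachment_id: str, now: datetime) -> None:
    """Register a new attachment, pruning stale ones along the way."""
    prune_expired(records, now)
    records[attachment_id] = now
```

One consequence of this design is that stale entries linger until the next attachment arrives, which is usually acceptable for a retention policy measured in days.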
Limitations
- Routing in multi-agent rooms -- in multi-agent rooms without an @mention, the router selects the best agent based on the file caption.
- Model support -- the configured model must support file or video inputs for direct analysis. Models that do not can still use the attachments tool to inspect and process files via tool calls.