Web Search

Use these tools to search the public web, query paid search APIs, access Google- or Baidu-oriented results, or point agents at a self-hosted SearXNG instance.

What This Page Covers

This page documents the built-in tools in the web-search group. Use these tools when you need general web discovery, current-events search, answer-style search APIs, Google- or Baidu-specific results, or a self-hosted metasearch backend.

Tools On This Page

[duckduckgo] - No-key DuckDuckGo-backed web and news search through the shared DDGS backend.
[googlesearch] - No-key Google-backed web and news search through the shared DDGS backend.
[baidusearch] - No-key Baidu search tuned for Chinese-language discovery.
[tavily] - API-backed current-information search with optional answer, context, and URL extraction modes.
[exa] - API-backed research search with content fetching, similar-page lookup, answers, and deep research tasks.
[serpapi] - API-backed Google and YouTube SERP access.
[serper] - API-backed Google web, news, and scholar search plus lightweight webpage scraping.
[searxng] - Self-hosted SearXNG search across web, images, IT, maps, music, science, news, and video.
[linkup] - API-backed web search that can return either raw search results or sourced answers.

Common Setup Notes

duckduckgo, googlesearch, and baidusearch are setup_type: none, so they work out of the box once their optional Python dependencies are available.
tavily, exa, serpapi, serper, and linkup are status=requires_config and are intended to be configured with a stored api_key.
searxng is also status=requires_config, but it needs a reachable host URL instead of an API key.
None of the tools on this page declare an auth_provider, and src/mindroom/api/integrations.py currently only exposes Spotify OAuth routes, so these tools use ordinary tool credentials or SDK environment variables rather than a dedicated dashboard OAuth flow.
Password fields such as api_key should be stored through the dashboard or credential store instead of inline YAML.
Current upstream SDKs also support environment variables such as TAVILY_API_KEY, TAVILY_API_BASE_URL, EXA_API_KEY, SERP_API_KEY, SERPER_API_KEY, and LINKUP_API_KEY.
Missing optional dependencies can auto-install at first use unless MINDROOM_NO_AUTO_INSTALL_TOOLS=1 is set.

duckduckgo and googlesearch are the simplest no-key defaults for general search and basic news lookups.
baidusearch is the better fit when you want Baidu indexing or Chinese-language defaults.
tavily and linkup are useful when you want answer-oriented search output instead of only result lists.
exa is the deepest research option on this page when you need domain filters, date filters, content fetches, find-similar, or a long-running research task.
serpapi and serper are Google-focused paid APIs, with serpapi covering Google and YouTube verticals and serper covering Google web, news, scholar, and a scrape endpoint.
searxng is the best fit when you control your own search stack or want SearXNG categories such as images, maps, music, science, and video.
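
As a rough preflight sketch, the SDK environment variables named above can be checked before an agent starts. The `ENV_VARS` mapping and the `missing_keys` helper are hypothetical names for illustration and are not part of MindRoom. A stored api_key credential may still satisfy a tool whose variable is unset, so treat the result as a hint rather than a verdict.

```python
import os

# Environment variables the upstream SDKs are documented to read (taken from the notes above).
ENV_VARS = {
    "tavily": "TAVILY_API_KEY",
    "exa": "EXA_API_KEY",
    "serpapi": "SERP_API_KEY",
    "serper": "SERPER_API_KEY",
    "linkup": "LINKUP_API_KEY",
}


def missing_keys(tools, environ=None):
    """Return the tools in `tools` whose SDK environment variable is unset or empty.

    Tools without an entry in ENV_VARS (for example the no-key tools) are ignored.
    """
    env = os.environ if environ is None else environ
    return [tool for tool in tools if tool in ENV_VARS and not env.get(ENV_VARS[tool])]
```

Calling `missing_keys(["tavily", "exa", "duckduckgo"])` lists only the key-based tools that still need a credential from some source.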

[`duckduckgo`]

duckduckgo is the simplest built-in web search option for general search and news without any API key setup.

What It Does

duckduckgo wraps Agno's DuckDuckGoTools, which is a convenience layer over the shared WebSearchTools backend with backend="duckduckgo". It exposes web_search(query, max_results=5) and search_news(query, max_results=5). modifier prepends extra query text, fixed_max_results caps all calls, and proxy, timeout, and verify_ssl control the underlying DDGS client. The tool returns JSON strings from DDGS rather than a MindRoom-specific normalized response format.
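
Because the tool hands back a JSON string rather than a normalized object, downstream code usually has to decode it first. The sketch below is one way to do that. The `summarize_results` helper is hypothetical, and the `title`, `href`, and `url` keys are assumptions about the DDGS payload shape that you should verify against real output.

```python
import json


def summarize_results(raw):
    """Decode a JSON string of search results into (title, link) pairs."""
    pairs = []
    for item in json.loads(raw):
        # Assumption: web results carry "href" and news results carry "url".
        link = item.get("href") or item.get("url") or ""
        pairs.append((item.get("title", ""), link))
    return pairs


# Illustrative payload only, not real search output.
sample = json.dumps([{"title": "Example", "href": "https://example.com", "body": "..."}])
```

Using `.get()` keeps the helper tolerant of results that omit a field.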

Configuration

Option	Type	Required	Default	Notes
`enable_search`	`boolean`	`no`	`true`	Enable `web_search()`.
`enable_news`	`boolean`	`no`	`true`	Enable `search_news()`.
`modifier`	`text`	`no`	`null`	Prepends fixed query text to every web search.
`fixed_max_results`	`number`	`no`	`null`	Caps result count for both web and news searches.
`proxy`	`url`	`no`	`null`	Optional proxy for DDGS requests.
`timeout`	`number`	`no`	`10`	Request timeout in seconds.
`verify_ssl`	`boolean`	`no`	`true`	Verify TLS certificates for DDGS requests.

Example

agents:
  researcher:
    tools:
      - duckduckgo:
          enable_news: true
          fixed_max_results: 8

web_search("latest Matrix client features", max_results=5)
search_news("Matrix ecosystem", max_results=5)

Notes

Pick duckduckgo when you want the lowest-friction no-key option for general web and news search.
Pick googlesearch instead when you want Google-style ranking but still do not want a paid API.
Pick tavily, exa, serper, or serpapi when you need provider-backed APIs, answer generation, or more vertical-specific search behavior.

[`googlesearch`]

googlesearch uses the same DDGS-powered search surface as duckduckgo, but it hardwires the backend to Google.

What It Does

MindRoom registers googlesearch as a custom wrapper around Agno's WebSearchTools with backend="google". It exposes web_search(query, max_results=5) and search_news(query, max_results=5). Runtime behavior matches the duckduckgo tool surface, including modifier, fixed_max_results, proxy, timeout, and verify_ssl. This is still a DDGS-backed scraper-style search path rather than an official Google paid search API.

Configuration

Option	Type	Required	Default	Notes
`enable_search`	`boolean`	`no`	`true`	Enable `web_search()`. The current registry metadata marks this field as text, but the wrapper expects a boolean.
`enable_news`	`boolean`	`no`	`true`	Enable `search_news()`. The current registry metadata marks this field as text, but the wrapper expects a boolean.
`modifier`	`text`	`no`	`null`	Prepends fixed query text to every web search.
`fixed_max_results`	`number`	`no`	`null`	Caps result count for both web and news searches. The current registry metadata marks this field as text.
`proxy`	`url`	`no`	`null`	Optional proxy for DDGS requests. The current registry metadata marks this field as text.
`timeout`	`number`	`no`	`10`	Request timeout in seconds. The current registry metadata marks this field as text.
`verify_ssl`	`boolean`	`no`	`true`	Verify TLS certificates for DDGS requests. The current registry metadata marks this field as text.

Example

agents:
  researcher:
    tools:
      - googlesearch:
          modifier: site:docs.mindroom.chat
          fixed_max_results: 6

web_search("MindRoom Matrix threads", max_results=5)
search_news("open source Matrix news", max_results=5)

Notes

Pick googlesearch when you want Google-backed ranking without introducing an API key dependency.
If you need a paid Google SERP API with more predictable structure, use serper or serpapi instead.
The current MindRoom wrapper makes this tool available without dedicated dashboard integration or OAuth.

[`baidusearch`]

baidusearch is the Baidu-specific search tool for Chinese-language search and Baidu-indexed results.

What It Does

baidusearch exposes one method, baidu_search(query, max_results=5, language="zh"). fixed_language overrides the per-call language, and non-two-letter language values are normalized through pycountry when possible. If language normalization fails, the upstream tool falls back to zh. The returned payload is a JSON array with title, url, abstract, and rank.
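
The language handling described above can be sketched roughly as follows. This is not the upstream implementation: a tiny hand-written table stands in for the pycountry lookup so the example stays self-contained, and `normalize_language` is a hypothetical name.

```python
# Tiny stand-in for a pycountry lookup; the real library covers far more languages.
_NAME_TO_CODE = {"chinese": "zh", "english": "en", "japanese": "ja"}


def normalize_language(language="zh", fixed_language=None):
    """Return a two-letter language code, falling back to "zh" when lookup fails."""
    # fixed_language, when configured, wins over the per-call value.
    value = (fixed_language or language or "").strip().lower()
    if len(value) == 2 and value.isalpha():
        return value  # already a two-letter code, pass it through
    return _NAME_TO_CODE.get(value, "zh")
```

With this sketch, `normalize_language("English")` yields `"en"`, while an unrecognized value falls back to `"zh"`.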

Configuration

Option	Type	Required	Default	Notes
`fixed_max_results`	`number`	`no`	`null`	Caps result count for every call.
`fixed_language`	`text`	`no`	`null`	Overrides the per-call language on every call. Upstream falls back to `zh` when language normalization fails.
`headers`	`text`	`no`	`null`	Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to `search()`.
`proxy`	`url`	`no`	`null`	Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to `search()`.
`timeout`	`number`	`no`	`10`	Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to `search()`.
`debug`	`boolean`	`no`	`false`	Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to `search()`.
`enable_baidu_search`	`boolean`	`no`	`true`	Enable `baidu_search()`.
`all`	`boolean`	`no`	`false`	Enable the full upstream toolkit surface.

Example

agents:
  cn_research:
    tools:
      - baidusearch:
          fixed_language: zh
          fixed_max_results: 8

baidu_search("Matrix 协议 新闻", max_results=5, language="zh")

Notes

Pick baidusearch when Chinese-language search quality matters more than Google-style ranking.
Use duckduckgo or googlesearch for simpler English-centric general search defaults.
The current installed upstream baidu_search() path only forwards keyword and result count, so headers, proxy, timeout, and debug are best treated as placeholders until the wrapper or upstream call path is tightened.

[`tavily`]

tavily is the built-in current-information search API with optional context mode and URL extraction.

What It Does

tavily can expose web_search_using_tavily(query, max_results=5), web_search_with_tavily(query), and extract_url_content(urls), depending on the enable flags. enable_search_context switches the search surface from the normal result-list call to the context-oriented call, so you get one search method or the other instead of both. web_search_using_tavily() can include an AI-generated answer and returns either JSON or Markdown depending on format. extract_url_content() accepts one URL or a comma-separated URL list and formats extracted page content as Markdown or plain text depending on extract_format.
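
To make the flag interaction concrete, here is a minimal sketch of which methods end up exposed for a given flag combination, plus one way to split a comma-separated URL argument. Both helpers are hypothetical illustrations rather than MindRoom or Agno code. In particular, the assumption that enable_search_context has no effect when enable_search is off is unverified.

```python
def exposed_tavily_functions(enable_search=True, enable_search_context=False, enable_extract=False):
    """List the method names an agent would see for these flags."""
    names = []
    if enable_search:
        # Exactly one of the two search methods is exposed, never both.
        if enable_search_context:
            names.append("web_search_with_tavily")
        else:
            names.append("web_search_using_tavily")
    if enable_extract:
        names.append("extract_url_content")
    return names


def split_urls(urls):
    """Split a single URL or a comma-separated URL list into clean entries."""
    return [part.strip() for part in urls.split(",") if part.strip()]
```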

Configuration

Option	Type	Required	Default	Notes
`api_key`	`password`	`yes`	`null`	Tavily API key. The upstream SDK also checks `TAVILY_API_KEY`.
`api_base_url`	`url`	`no`	`null`	Optional base URL override. The upstream SDK also checks `TAVILY_API_BASE_URL`.
`enable_search`	`boolean`	`no`	`true`	Enable Tavily search.
`enable_search_context`	`boolean`	`no`	`false`	Use `web_search_with_tavily()` instead of `web_search_using_tavily()`.
`enable_extract`	`boolean`	`no`	`false`	Enable `extract_url_content()`.
`all`	`boolean`	`no`	`false`	Enable the full upstream toolkit surface.
`max_tokens`	`number`	`no`	`6000`	Token budget for context output and filtered result formatting.
`include_answer`	`boolean`	`no`	`true`	Include the answer field in search output when available.
`search_depth`	`text`	`no`	`advanced`	Tavily search depth, currently `basic` or `advanced`.
`extract_depth`	`text`	`no`	`basic`	Tavily extract depth, currently `basic` or `advanced`.
`include_images`	`boolean`	`no`	`false`	Include images in extract responses when supported.
`include_favicon`	`boolean`	`no`	`false`	Include favicons in extract responses when supported.
`extract_timeout`	`number`	`no`	`null`	Optional extraction timeout in seconds.
`extract_format`	`text`	`no`	`markdown`	Extraction output format, currently `markdown` or `text`.
`format`	`text`	`no`	`markdown`	Search output format, currently `json` or `markdown`.

Example

agents:
  newsdesk:
    tools:
      - tavily:
          enable_extract: true
          include_answer: true
          search_depth: advanced
          format: markdown

web_search_using_tavily("latest Matrix bridge updates", max_results=5)
extract_url_content("https://matrix.org/blog/")

Notes

Pick tavily when you want current-information search plus an optional synthesized answer or URL extraction in the same toolkit.
Use enable_search_context when you want a compact context blob rather than a normal result list.
If you want deeper research features such as similar-page search, date filters, and long-running structured research tasks, use exa instead.

[`exa`]

exa is the research-heavy search toolkit for web search, content retrieval, similar-page discovery, answer generation, and deep research tasks.

What It Does

exa can expose search_exa(query, num_results=5, category=None), get_contents(urls), find_similar(url, num_results=5), exa_answer(query, text=False), and research(instructions, output_schema=None). Search results can include title, author, published date, URL, and truncated page text. The toolkit supports domain allowlists and denylists, crawl-date and publish-date filters, category and type filters, answer-model selection, and a separate research_model for long-running research tasks. enable_research is off by default, so deep research is opt-in even when the rest of the toolkit is enabled.

Configuration

Option	Type	Required	Default	Notes
`enable_search`	`boolean`	`no`	`true`	Enable `search_exa()`.
`enable_get_contents`	`boolean`	`no`	`true`	Enable `get_contents()`.
`enable_find_similar`	`boolean`	`no`	`true`	Enable `find_similar()`.
`enable_answer`	`boolean`	`no`	`true`	Enable `exa_answer()`.
`enable_research`	`boolean`	`no`	`false`	Enable `research()`.
`all`	`boolean`	`no`	`false`	Enable the full upstream toolkit surface.
`text`	`boolean`	`no`	`true`	Include page text in results.
`text_length_limit`	`number`	`no`	`1000`	Maximum text length per result.
`summary`	`boolean`	`no`	`false`	Request result summaries where supported.
`api_key`	`password`	`yes`	`null`	Exa API key. The upstream SDK also checks `EXA_API_KEY`.
`num_results`	`number`	`no`	`null`	Default result count override.
`livecrawl`	`text`	`no`	`always`	Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to search requests.
`start_crawl_date`	`text`	`no`	`null`	Include results crawled on or after this date.
`end_crawl_date`	`text`	`no`	`null`	Include results crawled on or before this date.
`start_published_date`	`text`	`no`	`null`	Include results published on or after this date.
`end_published_date`	`text`	`no`	`null`	Include results published on or before this date.
`type`	`text`	`no`	`null`	Optional content type filter such as article, blog, or video.
`category`	`text`	`no`	`null`	Optional category filter such as `news`, `github`, or `research paper`.
`include_domains`	`string[]`	`no`	`null`	Domain allowlist. The current registry metadata exposes this as a text field, but runtime expects a list of domains.
`exclude_domains`	`string[]`	`no`	`null`	Domain denylist. The current registry metadata exposes this as a text field, but runtime expects a list of domains.
`show_results`	`boolean`	`no`	`false`	Emit debug logs with raw parsed results.
`model`	`text`	`no`	`null`	Answer model for `exa_answer()`, currently `exa` or `exa-pro`.
`timeout`	`number`	`no`	`30`	Timeout in seconds for API operations.
`research_model`	`text`	`no`	`exa-research`	Model for `research()`, currently `exa-research` or `exa-research-pro`.
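
Since include_domains and exclude_domains are surfaced as text fields but runtime expects lists, a value entered as text may need to be converted before it reaches the toolkit. The helper below is a hypothetical sketch of that conversion, not something MindRoom is known to ship. The comma and newline separators are an assumption about how such a value might be typed.

```python
def coerce_domains(value):
    """Turn a comma- or newline-separated string (or an existing list) into a list of domains.

    Returns None when nothing usable remains, matching the `null` default.
    """
    if value is None:
        return None
    if isinstance(value, str):
        parts = value.replace("\n", ",").split(",")
    else:
        parts = list(value)
    cleaned = [part.strip().lower() for part in parts if part and part.strip()]
    return cleaned or None
```

For example, `coerce_domains("matrix.org, element.io")` produces the same two-item list that the YAML form in the example expresses directly.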

Example

agents:
  analyst:
    tools:
      - exa:
          enable_research: true
          category: news
          include_domains:
            - matrix.org
            - element.io
          research_model: exa-research

search_exa("Matrix sliding sync adoption", num_results=5)
find_similar("https://matrix.org/blog/")
exa_answer("What changed in the Matrix ecosystem this week?")
research("Compare hosted Matrix bridges for small teams.")

Notes

Pick exa when you need the richest research surface on this page rather than a simple search box.
model only affects exa_answer(), and research_model only affects research().
The current wrapper exposes livecrawl, but the installed upstream call path on this branch does not apply that setting to search requests, so do not rely on it yet for behavior changes.

[`serpapi`]

serpapi is the Google and YouTube search toolkit for agents that need a paid SERP provider instead of DDGS-backed scraping.

What It Does

serpapi exposes search_google(query, num_results=10) and search_youtube(query). search_google() returns a JSON payload with search_results, recipes_results, shopping_results, knowledge_graph, and related_questions. search_youtube() returns video_results, movie_results, and channel_results. MindRoom does not add extra behavior here beyond registering the tool metadata and dependency set.

Configuration

Option	Type	Required	Default	Notes
`api_key`	`password`	`yes`	`null`	SerpApi key. The upstream SDK also checks `SERP_API_KEY`.
`enable_search_google`	`boolean`	`no`	`true`	Enable `search_google()`.
`enable_search_youtube`	`boolean`	`no`	`false`	Enable `search_youtube()`.
`all`	`boolean`	`no`	`false`	Enable the full upstream toolkit surface.

Example

agents:
  researcher:
    tools:
      - serpapi:
          enable_search_youtube: true

search_google("Matrix bridges", num_results=10)
search_youtube("Matrix conference talks")

Notes

Pick serpapi when you specifically want Google plus YouTube search from one paid provider.
serpapi is a better fit than googlesearch when you want a provider-backed API instead of DDGS-backed scraping.
serper is the better fit when you need Google news, Google Scholar, or a scrape endpoint instead of YouTube search.

[`serper`]

serper is the Google API toolkit for web, news, scholar, and lightweight scrape calls.

What It Does

serper exposes search_web(query, num_results=None), search_news(query, num_results=None), search_scholar(query, num_results=None), and scrape_webpage(url, markdown=False). location, language, and date_range become shared request parameters across the search endpoints. The search methods return raw JSON responses from Serper. scrape_webpage() hits Serper's scrape endpoint and can optionally request Markdown output.

Configuration

Option	Type	Required	Default	Notes
`api_key`	`password`	`yes`	`null`	Serper API key. The upstream SDK also checks `SERPER_API_KEY`.
`location`	`text`	`no`	`us`	Google location code sent as `gl`.
`language`	`text`	`no`	`en`	Search language code sent as `hl`.
`num_results`	`number`	`no`	`10`	Default result count for search calls.
`date_range`	`text`	`no`	`null`	Shared date-range filter sent as `tbs`.
`enable_search`	`boolean`	`no`	`true`	Enable `search_web()`.
`enable_search_news`	`boolean`	`no`	`true`	Enable `search_news()`.
`enable_search_scholar`	`boolean`	`no`	`true`	Enable `search_scholar()`.
`enable_scrape_webpage`	`boolean`	`no`	`true`	Enable `scrape_webpage()`.
`all`	`boolean`	`no`	`false`	Enable the full upstream toolkit surface.
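
The table above maps three options onto request parameters (`gl`, `hl`, and `tbs`). A minimal sketch of how such a shared payload could be assembled is shown below. The `build_serper_payload` helper is hypothetical, and the `q` and `num` field names are assumptions about the Serper request body that are not stated on this page.

```python
def build_serper_payload(query, location="us", language="en", num_results=10, date_range=None):
    """Assemble the request parameters shared by the search endpoints."""
    payload = {
        "q": query,          # assumed field name for the query text
        "gl": location,      # location code
        "hl": language,      # language code
        "num": num_results,  # assumed field name for the result count
    }
    if date_range:
        # Only send the date filter when one is configured.
        payload["tbs"] = date_range
    return payload
```

Leaving `tbs` out entirely when no date range is set mirrors the `null` default in the table.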

Example

agents:
  analyst:
    tools:
      - serper:
          location: us
          language: en
          enable_search_scholar: true

search_web("latest Matrix rooms UX", num_results=5)
search_news("Matrix foundation news", num_results=5)
search_scholar("Matrix protocol paper", num_results=5)
scrape_webpage("https://matrix.org/blog/", markdown=True)

Notes

Pick serper when you want Google news and scholar in the same paid toolkit.
serper also covers quick scrape calls, which makes it a good bridge between search and light extraction workflows.
If you want YouTube search instead of scholar or scraping, use serpapi instead.

[`searxng`]

searxng points an agent at your own SearXNG instance instead of a hosted paid API.

What It Does

searxng exposes search_web(query, max_results=5), image_search(query, max_results=5), it_search(query, max_results=5), map_search(query, max_results=5), music_search(query, max_results=5), news_search(query, max_results=5), science_search(query, max_results=5), and video_search(query, max_results=5). All of those calls route through the same /search?format=json endpoint on the configured host. If engines is set, the tool appends those engine names to the SearXNG request. fixed_max_results truncates every category response to a consistent maximum.
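
The shared endpoint can be illustrated with a short URL-building sketch. It uses the standard SearXNG query parameters `q`, `format`, `categories`, and `engines`. The helper names are hypothetical, and the exact request the toolkit sends may differ in parameter order or encoding.

```python
from urllib.parse import urlencode


def build_searxng_url(host, query, category=None, engines=None):
    """Build a /search?format=json URL for a SearXNG instance root."""
    params = {"q": query, "format": "json"}
    if category:
        params["categories"] = category
    if engines:
        params["engines"] = ",".join(engines)
    # Strip a trailing slash so the instance root joins cleanly with /search.
    return f"{host.rstrip('/')}/search?{urlencode(params)}"


def truncate_results(results, fixed_max_results=None):
    """Trim a result list to fixed_max_results when a cap is configured."""
    return results if fixed_max_results is None else results[:fixed_max_results]
```

This also shows why `host` should be the instance root: the `/search` path and query string are appended by the caller.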

Configuration

Option	Type	Required	Default	Notes
`host`	`url`	`yes`	`null`	Base URL for the SearXNG instance. Use the instance root, not a prebuilt `/search` URL.
`engines`	`string[]`	`no`	`[]`	Optional engine allowlist. The current registry metadata exposes this as a text field, but runtime expects a list of engine names.
`fixed_max_results`	`number`	`no`	`null`	Caps result count for all categories.

Example

agents:
  privacy_research:
    tools:
      - searxng:
          host: https://search.example.com
          engines:
            - duckduckgo
            - wikipedia
          fixed_max_results: 6

search_web("Matrix federation guide", max_results=5)
news_search("Matrix news", max_results=5)
science_search("decentralized messaging protocol", max_results=5)
image_search("Matrix logo", max_results=5)

Notes

Pick searxng when you want a self-hosted or privacy-preserving search backend under your own control.
searxng is the only tool on this page that exposes image, map, music, science, and video categories through the same configuration.
If your SearXNG deployment needs auth or reverse-proxy policy, handle that at the instance or network layer because the current MindRoom tool metadata only exposes host, engines, and fixed_max_results.

[`linkup`]

linkup is a web-search API that can return either search-result lists or sourced answers.

What It Does

linkup exposes web_search_with_linkup(query, depth=None, output_type=None). depth controls how aggressively Linkup searches, and output_type controls whether the response is a searchResults list or a sourcedAnswer. The configured defaults are applied when the call does not override them. The tool returns the raw response from the Linkup SDK rather than a MindRoom-specific normalized envelope.
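
The default-versus-override behavior can be sketched as below. `resolve_linkup_options` is a hypothetical helper, not the wrapper's actual code. The value check is an added assumption based on the currently documented options and may not reflect how the SDK validates input.

```python
ALLOWED_DEPTHS = ("standard", "deep")
ALLOWED_OUTPUT_TYPES = ("searchResults", "sourcedAnswer")


def resolve_linkup_options(depth=None, output_type=None,
                           default_depth="standard",
                           default_output_type="searchResults"):
    """Apply configured defaults to any option the call leaves as None."""
    resolved_depth = default_depth if depth is None else depth
    resolved_output = default_output_type if output_type is None else output_type
    # Assumed validation against the values listed in the configuration table.
    if resolved_depth not in ALLOWED_DEPTHS:
        raise ValueError(f"unsupported depth: {resolved_depth!r}")
    if resolved_output not in ALLOWED_OUTPUT_TYPES:
        raise ValueError(f"unsupported output_type: {resolved_output!r}")
    return {"depth": resolved_depth, "output_type": resolved_output}
```

A call that passes only `output_type="sourcedAnswer"` keeps the configured depth and changes just the response shape.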

Configuration

Option	Type	Required	Default	Notes
`api_key`	`password`	`yes`	`null`	Linkup API key. The upstream SDK documents `LINKUP_API_KEY`, but the current MindRoom wrapper is most reliable when the key is stored explicitly as a tool credential.
`depth`	`text`	`no`	`standard`	Default search depth, currently `standard` or `deep`.
`output_type`	`text`	`no`	`searchResults`	Default output type, currently `searchResults` or `sourcedAnswer`.
`enable_web_search_with_linkup`	`boolean`	`no`	`true`	Enable `web_search_with_linkup()`.
`all`	`boolean`	`no`	`false`	Enable the full upstream toolkit surface.

Example

agents:
  briefings:
    tools:
      - linkup:
          depth: deep
          output_type: sourcedAnswer

web_search_with_linkup(
    "Summarize the latest Matrix bridge announcements",
    depth="deep",
    output_type="sourcedAnswer",
)

Notes

Pick linkup when you want a sourced answer directly from the search provider instead of stitching one together downstream.
Pick tavily when you also want built-in extract calls, and pick exa when you need broader research primitives such as find_similar() or research().
The current MindRoom wrapper initializes the Linkup client from the explicit api_key argument, so a stored tool credential is more reliable than relying on environment-only fallback on this branch.
