Web Search
Use these tools to search the public web, query paid search APIs, access Google- or Baidu-oriented results, or point agents at a self-hosted SearXNG instance.
What This Page Covers
This page documents the built-in tools in the web-search group.
Use these tools when you need general web discovery, current-events search, answer-style search APIs, Google or Baidu specific results, or a self-hosted metasearch backend.
Tools On This Page
- [
duckduckgo] - No-key DuckDuckGo-backed web and news search through the shared DDGS backend. - [
googlesearch] - No-key Google-backed web and news search through the shared DDGS backend. - [
baidusearch] - No-key Baidu search tuned for Chinese-language discovery. - [
tavily] - API-backed current-information search with optional answer, context, and URL extraction modes. - [
exa] - API-backed research search with content fetching, similar-page lookup, answers, and deep research tasks. - [
serpapi] - API-backed Google and YouTube SERP access. - [
serper] - API-backed Google web, news, and scholar search plus lightweight webpage scraping. - [
searxng] - Self-hosted SearXNG search across web, images, maps, music, science, news, and video. - [
linkup] - API-backed web search that can return either raw search results or sourced answers.
Common Setup Notes
duckduckgo, googlesearch, and baidusearch are setup_type: none, so they work out of the box once their optional Python dependencies are available.
tavily, exa, serpapi, serper, and linkup are status=requires_config and are intended to be configured with a stored api_key.
searxng is also status=requires_config, but it needs a reachable host URL instead of an API key.
None of the tools on this page declare an auth_provider, and src/mindroom/api/integrations.py currently only exposes Spotify OAuth routes, so these tools use ordinary tool credentials or SDK environment variables rather than a dedicated dashboard OAuth flow.
Password fields such as api_key should be stored through the dashboard or credential store instead of inline YAML.
Current upstream SDKs also support environment variables such as TAVILY_API_KEY, TAVILY_API_BASE_URL, EXA_API_KEY, SERP_API_KEY, SERPER_API_KEY, and LINKUP_API_KEY.
Missing optional dependencies can auto-install at first use unless MINDROOM_NO_AUTO_INSTALL_TOOLS=1 is set.
duckduckgo and googlesearch are the simplest no-key defaults for general search and basic news lookups.
baidusearch is the better fit when you want Baidu indexing or Chinese-language defaults.
tavily and linkup are useful when you want answer-oriented search output instead of only result lists.
exa is the deepest research option on this page when you need domain filters, date filters, content fetches, find-similar, or a long-running research task.
serpapi and serper are Google-focused paid APIs, with serpapi covering Google and YouTube verticals and serper covering Google web, news, scholar, and a scrape endpoint.
searxng is the best fit when you control your own search stack or want SearXNG categories such as images, maps, music, science, and video.
[duckduckgo]
duckduckgo is the simplest built-in web search option for general search and news without any API key setup.
What It Does
duckduckgo wraps Agno's DuckDuckGoTools, which is a convenience layer over the shared WebSearchTools backend with backend="duckduckgo".
It exposes web_search(query, max_results=5) and search_news(query, max_results=5).
modifier prepends extra query text, fixed_max_results caps all calls, and proxy, timeout, and verify_ssl control the underlying DDGS client.
The tool returns JSON strings from DDGS rather than a MindRoom-specific normalized response format.
Configuration
| Option | Type | Required | Default | Notes |
|---|---|---|---|---|
enable_search |
boolean |
no |
true |
Enable web_search(). |
enable_news |
boolean |
no |
true |
Enable search_news(). |
modifier |
text |
no |
null |
Prepends fixed query text to every web search. |
fixed_max_results |
number |
no |
null |
Caps result count for both web and news searches. |
proxy |
url |
no |
null |
Optional proxy for DDGS requests. |
timeout |
number |
no |
10 |
Request timeout in seconds. |
verify_ssl |
boolean |
no |
true |
Verify TLS certificates for DDGS requests. |
Example
web_search("latest Matrix client features", max_results=5)
search_news("Matrix ecosystem", max_results=5)
Notes
- Pick
duckduckgowhen you want the lowest-friction no-key option for general web and news search. - Pick
googlesearchinstead when you want Google-style ranking but still do not want a paid API. - Pick
tavily,exa,serper, orserpapiwhen you need provider-backed APIs, answer generation, or more vertical-specific search behavior.
[googlesearch]
googlesearch uses the same DDGS-powered search surface as duckduckgo, but it hardwires the backend to Google.
What It Does
MindRoom registers googlesearch as a custom wrapper around Agno's WebSearchTools with backend="google".
It exposes web_search(query, max_results=5) and search_news(query, max_results=5).
Runtime behavior matches the duckduckgo tool surface, including modifier, fixed_max_results, proxy, timeout, and verify_ssl.
This is still a DDGS-backed scraper-style search path rather than an official Google paid search API.
Configuration
| Option | Type | Required | Default | Notes |
|---|---|---|---|---|
enable_search |
boolean |
no |
true |
Enable web_search(). The current registry metadata marks this field as text, but the wrapper expects a boolean. |
enable_news |
boolean |
no |
true |
Enable search_news(). The current registry metadata marks this field as text, but the wrapper expects a boolean. |
modifier |
text |
no |
null |
Prepends fixed query text to every web search. |
fixed_max_results |
number |
no |
null |
Caps result count for both web and news searches. The current registry metadata marks this field as text. |
proxy |
url |
no |
null |
Optional proxy for DDGS requests. The current registry metadata marks this field as text. |
timeout |
number |
no |
10 |
Request timeout in seconds. The current registry metadata marks this field as text. |
verify_ssl |
boolean |
no |
true |
Verify TLS certificates for DDGS requests. The current registry metadata marks this field as text. |
Example
web_search("MindRoom Matrix threads", max_results=5)
search_news("open source Matrix news", max_results=5)
Notes
- Pick
googlesearchwhen you want Google-backed ranking without introducing an API key dependency. - If you need a first-party paid Google SERP API with more predictable structure, use
serperorserpapiinstead. - The current MindRoom wrapper makes this tool available without dedicated dashboard integration or OAuth.
[baidusearch]
baidusearch is the Baidu-specific search tool for Chinese-language search and Baidu-indexed results.
What It Does
baidusearch exposes one method, baidu_search(query, max_results=5, language="zh").
fixed_language overrides the per-call language, and non-two-letter language values are normalized through pycountry when possible.
If language normalization fails, the upstream tool falls back to zh.
The returned payload is a JSON array with title, url, abstract, and rank.
Configuration
| Option | Type | Required | Default | Notes |
|---|---|---|---|---|
fixed_max_results |
number |
no |
null |
Caps result count for every call. |
fixed_language |
text |
no |
null |
Forces a default search language, with zh as the upstream fallback. |
headers |
text |
no |
null |
Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to search(). |
proxy |
url |
no |
null |
Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to search(). |
timeout |
number |
no |
10 |
Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to search(). |
debug |
boolean |
no |
false |
Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to search(). |
enable_baidu_search |
boolean |
no |
true |
Enable baidu_search(). |
all |
boolean |
no |
false |
Enable the full upstream toolkit surface. |
Example
Notes
- Pick
baidusearchwhen Chinese-language search quality matters more than Google-style ranking. - Use
duckduckgoorgooglesearchfor simpler English-centric general search defaults. - The current installed upstream
baidu_search()path only forwards keyword and result count, soheaders,proxy,timeout, anddebugare best treated as placeholders until the wrapper or upstream call path is tightened.
[tavily]
tavily is the built-in current-information search API with optional context mode and URL extraction.
What It Does
tavily can expose web_search_using_tavily(query, max_results=5), web_search_with_tavily(query), and extract_url_content(urls), depending on the enable flags.
enable_search_context switches the search surface from the normal result-list call to the context-oriented call, so you get one search method or the other instead of both.
web_search_using_tavily() can include an AI-generated answer and returns either JSON or Markdown depending on format.
extract_url_content() accepts one URL or a comma-separated URL list and formats extracted page content as Markdown or plain text depending on extract_format.
Configuration
| Option | Type | Required | Default | Notes |
|---|---|---|---|---|
api_key |
password |
yes |
null |
Tavily API key. The upstream SDK also checks TAVILY_API_KEY. |
api_base_url |
url |
no |
null |
Optional base URL override. The upstream SDK also checks TAVILY_API_BASE_URL. |
enable_search |
boolean |
no |
true |
Enable Tavily search. |
enable_search_context |
boolean |
no |
false |
Use web_search_with_tavily() instead of web_search_using_tavily(). |
enable_extract |
boolean |
no |
false |
Enable extract_url_content(). |
all |
boolean |
no |
false |
Enable the full upstream toolkit surface. |
max_tokens |
number |
no |
6000 |
Token budget for context output and filtered result formatting. |
include_answer |
boolean |
no |
true |
Include the answer field in search output when available. |
search_depth |
text |
no |
advanced |
Tavily search depth, currently basic or advanced. |
extract_depth |
text |
no |
basic |
Tavily extract depth, currently basic or advanced. |
include_images |
boolean |
no |
false |
Include images in extract responses when supported. |
include_favicon |
boolean |
no |
false |
Include favicons in extract responses when supported. |
extract_timeout |
number |
no |
null |
Optional extraction timeout in seconds. |
extract_format |
text |
no |
markdown |
Extraction output format, currently markdown or text. |
format |
text |
no |
markdown |
Search output format, currently json or markdown. |
Example
agents:
newsdesk:
tools:
- tavily:
enable_extract: true
include_answer: true
search_depth: advanced
format: markdown
web_search_using_tavily("latest Matrix bridge updates", max_results=5)
extract_url_content("https://matrix.org/blog/")
Notes
- Pick
tavilywhen you want current-information search plus an optional synthesized answer or URL extraction in the same toolkit. - Use
enable_search_contextwhen you want a compact context blob rather than a normal result list. - If you want deeper research features such as similar-page search, date filters, and long-running structured research tasks, use
exainstead.
[exa]
exa is the research-heavy search toolkit for web search, content retrieval, similar-page discovery, answer generation, and deep research tasks.
What It Does
exa can expose search_exa(query, num_results=5, category=None), get_contents(urls), find_similar(url, num_results=5), exa_answer(query, text=False), and research(instructions, output_schema=None).
Search results can include title, author, published date, URL, and truncated page text.
The toolkit supports domain allowlists and denylists, crawl-date and publish-date filters, category and type filters, answer-model selection, and a separate research_model for long-running research tasks.
enable_research is off by default, so deep research is opt-in even when the rest of the toolkit is enabled.
Configuration
| Option | Type | Required | Default | Notes |
|---|---|---|---|---|
enable_search |
boolean |
no |
true |
Enable search_exa(). |
enable_get_contents |
boolean |
no |
true |
Enable get_contents(). |
enable_find_similar |
boolean |
no |
true |
Enable find_similar(). |
enable_answer |
boolean |
no |
true |
Enable exa_answer(). |
enable_research |
boolean |
no |
false |
Enable research(). |
all |
boolean |
no |
false |
Enable the full upstream toolkit surface. |
text |
boolean |
no |
true |
Include page text in results. |
text_length_limit |
number |
no |
1000 |
Maximum text length per result. |
summary |
boolean |
no |
false |
Request result summaries where supported. |
api_key |
password |
yes |
null |
Exa API key. The upstream SDK also checks EXA_API_KEY. |
num_results |
number |
no |
null |
Default result count override. |
livecrawl |
text |
no |
always |
Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to search requests. |
start_crawl_date |
text |
no |
null |
Include results crawled on or after this date. |
end_crawl_date |
text |
no |
null |
Include results crawled on or before this date. |
start_published_date |
text |
no |
null |
Include results published on or after this date. |
end_published_date |
text |
no |
null |
Include results published on or before this date. |
type |
text |
no |
null |
Optional content type filter such as article, blog, or video. |
category |
text |
no |
null |
Optional category filter such as news, github, or research paper. |
include_domains |
string[] |
no |
null |
Domain allowlist. The current registry metadata exposes this as a text field, but runtime expects a list of domains. |
exclude_domains |
string[] |
no |
null |
Domain denylist. The current registry metadata exposes this as a text field, but runtime expects a list of domains. |
show_results |
boolean |
no |
false |
Emit debug logs with raw parsed results. |
model |
text |
no |
null |
Answer model for exa_answer(), currently exa or exa-pro. |
timeout |
number |
no |
30 |
Timeout in seconds for API operations. |
research_model |
text |
no |
exa-research |
Model for research(), currently exa-research or exa-research-pro. |
Example
agents:
analyst:
tools:
- exa:
enable_research: true
category: news
include_domains:
- matrix.org
- element.io
research_model: exa-research
search_exa("Matrix sliding sync adoption", num_results=5)
find_similar("https://matrix.org/blog/")
exa_answer("What changed in the Matrix ecosystem this week?")
research("Compare hosted Matrix bridges for small teams.")
Notes
- Pick
exawhen you need the richest research surface on this page rather than a simple search box. modelonly affectsexa_answer(), andresearch_modelonly affectsresearch().- The current wrapper exposes
livecrawl, but the installed upstream call path in this worktree does not apply that setting to the search requests, so do not rely on it yet for behavior changes.
[serpapi]
serpapi is the Google and YouTube search toolkit for agents that need a paid SERP provider instead of DDGS-backed scraping.
What It Does
serpapi exposes search_google(query, num_results=10) and search_youtube(query).
search_google() returns a JSON payload with search_results, recipes_results, shopping_results, knowledge_graph, and related_questions.
search_youtube() returns video_results, movie_results, and channel_results.
MindRoom does not add extra behavior here beyond registering the tool metadata and dependency set.
Configuration
| Option | Type | Required | Default | Notes |
|---|---|---|---|---|
api_key |
password |
yes |
null |
SerpApi key. The upstream SDK also checks SERP_API_KEY. |
enable_search_google |
boolean |
no |
true |
Enable search_google(). |
enable_search_youtube |
boolean |
no |
false |
Enable search_youtube(). |
all |
boolean |
no |
false |
Enable the full upstream toolkit surface. |
Example
Notes
- Pick
serpapiwhen you specifically want Google plus YouTube search from one paid provider. serpapiis a better fit thangooglesearchwhen you want a provider-backed API instead of DDGS-backed scraping.serperis the better fit when you need Google news, Google Scholar, or a scrape endpoint instead of YouTube search.
[serper]
serper is the Google API toolkit for web, news, scholar, and lightweight scrape calls.
What It Does
serper exposes search_web(query, num_results=None), search_news(query, num_results=None), search_scholar(query, num_results=None), and scrape_webpage(url, markdown=False).
location, language, and date_range become shared request parameters across the search endpoints.
The search methods return raw JSON responses from Serper.
scrape_webpage() hits Serper's scrape endpoint and can optionally request Markdown output.
Configuration
| Option | Type | Required | Default | Notes |
|---|---|---|---|---|
api_key |
password |
yes |
null |
Serper API key. The upstream SDK also checks SERPER_API_KEY. |
location |
text |
no |
us |
Google location code sent as gl. |
language |
text |
no |
en |
Search language code sent as hl. |
num_results |
number |
no |
10 |
Default result count for search calls. |
date_range |
text |
no |
null |
Shared date-range filter sent as tbs. |
enable_search |
boolean |
no |
true |
Enable search_web(). |
enable_search_news |
boolean |
no |
true |
Enable search_news(). |
enable_search_scholar |
boolean |
no |
true |
Enable search_scholar(). |
enable_scrape_webpage |
boolean |
no |
true |
Enable scrape_webpage(). |
all |
boolean |
no |
false |
Enable the full upstream toolkit surface. |
Example
search_web("latest Matrix rooms UX", num_results=5)
search_news("Matrix foundation news", num_results=5)
search_scholar("Matrix protocol paper", num_results=5)
scrape_webpage("https://matrix.org/blog/", markdown=True)
Notes
- Pick
serperwhen you want Google news and scholar in the same paid toolkit. serperalso covers quick scrape calls, which makes it a good bridge between search and light extraction workflows.- If you want YouTube search instead of scholar or scraping, use
serpapiinstead.
[searxng]
searxng points an agent at your own SearXNG instance instead of a hosted paid API.
What It Does
searxng exposes search_web(query, max_results=5), image_search(query, max_results=5), it_search(query, max_results=5), map_search(query, max_results=5), music_search(query, max_results=5), news_search(query, max_results=5), science_search(query, max_results=5), and video_search(query, max_results=5).
All of those calls route through the same /search?format=json endpoint on the configured host.
If engines is set, the tool appends those engine names to the SearXNG request.
fixed_max_results truncates every category response to a consistent maximum.
Configuration
| Option | Type | Required | Default | Notes |
|---|---|---|---|---|
host |
url |
yes |
null |
Base URL for the SearXNG instance. Use the instance root, not a prebuilt /search URL. |
engines |
string[] |
no |
[] |
Optional engine allowlist. The current registry metadata exposes this as a text field, but runtime expects a list of engine names. |
fixed_max_results |
number |
no |
null |
Caps result count for all categories. |
Example
agents:
privacy_research:
tools:
- searxng:
host: https://search.example.com
engines:
- duckduckgo
- wikipedia
fixed_max_results: 6
search_web("Matrix federation guide", max_results=5)
news_search("Matrix news", max_results=5)
science_search("decentralized messaging protocol", max_results=5)
image_search("Matrix logo", max_results=5)
Notes
- Pick
searxngwhen you want a self-hosted or privacy-preserving search backend under your own control. searxngis the only tool on this page that exposes image, map, music, science, and video categories through the same configuration.- If your SearXNG deployment needs auth or reverse-proxy policy, handle that at the instance or network layer because the current MindRoom tool metadata only exposes
host,engines, andfixed_max_results.
[linkup]
linkup is a web-search API that can return either search-result lists or sourced answers.
What It Does
linkup exposes web_search_with_linkup(query, depth=None, output_type=None).
depth controls how aggressively Linkup searches, and output_type controls whether the response is a searchResults list or a sourcedAnswer.
The configured defaults are applied when the call does not override them.
The tool returns the raw response from the Linkup SDK rather than a MindRoom-specific normalized envelope.
Configuration
| Option | Type | Required | Default | Notes |
|---|---|---|---|---|
api_key |
password |
yes |
null |
Linkup API key. The upstream SDK documents LINKUP_API_KEY, but the current MindRoom wrapper is safest when you store the key explicitly in tool config. |
depth |
text |
no |
standard |
Default search depth, currently standard or deep. |
output_type |
text |
no |
searchResults |
Default output type, currently searchResults or sourcedAnswer. |
enable_web_search_with_linkup |
boolean |
no |
true |
Enable web_search_with_linkup(). |
all |
boolean |
no |
false |
Enable the full upstream toolkit surface. |
Example
web_search_with_linkup(
"Summarize the latest Matrix bridge announcements",
depth="deep",
output_type="sourcedAnswer",
)
Notes
- Pick
linkupwhen you want a sourced answer directly from the search provider instead of stitching one together downstream. - Pick
tavilywhen you also want built-in extract calls, and pickexawhen you need broader research primitives such asfind_similar()orresearch(). - The current MindRoom wrapper initializes the Linkup client from the explicit
api_keyargument, so a stored tool credential is more reliable than relying on environment-only fallback on this branch.