neuralnoise.com


Homepage of Dr Pasquale Minervini
Researcher/Faculty at the University of Edinburgh, School of Informatics
Co-Founder and CTO at Miniml.AI
ELLIS Scholar, Edinburgh Unit


Deep Research MCP

To make my life a bit easier, I built deep-research-mcp, a small Python agent that exposes several “deep research” backends through a single Model Context Protocol server (Anthropic, 2024). It allows Claude Code, Codex, Gemini CLI, or any MCP client to fire off long-running research tasks against whichever backend the user prefers.

What the server exposes

The MCP server exposes three tools: deep_research, research_with_context, and research_status. The first kicks off a task; the second resumes one after an optional clarification round; the third polls a running task for completion. Everything else – provider selection, timeouts, clarification models, system prompts – lives in ~/.deep_research.
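The README documents the actual configuration schema; as a rough sketch of what lives in that file (key names here are illustrative, not the real schema):

```toml
# ~/.deep_research -- hypothetical sketch; key names are illustrative
provider = "openai"                   # which backend to delegate to
model = "o4-mini-deep-research"
timeout_sec = 3600
enable_clarification = false
clarification_model = "gpt-4.1-mini"  # small chat model for the triage/clarify pass
```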

Architecture

flowchart LR
    Client["Claude Code / Codex / Gemini CLI"] -->|MCP| Server["FastMCP server"]
    Server --> Agent["DeepResearchAgent"]
    Agent --> Clar["Clarification"]
    Agent --> Instr["Instruction builder"]
    Agent --> Backend{{"Backend"}}
    Backend --> OAI["OpenAI Responses / Chat Completions"]
    Backend --> Gem["Gemini Deep Research"]
    Backend --> DrT["DR-Tulu /chat"]
    Backend --> ODR["Open Deep Research (smolagents)"]

DeepResearchAgent optionally runs a clarification pass (triage the query, ask follow-up questions, enrich it), optionally rewrites the query into a longer research brief, and then delegates to one of four backends behind a common interface. All backends return the same normalised record – report, citations, reasoning steps, task id, execution time – so the MCP tools and the CLI do not care which one ran.

Backends

  • OpenAI Responses API (default). Uses a Deep Research model such as o4-mini-deep-research with built-in web_search_preview and code_interpreter tools, background mode, and polling. Follows the reference pattern in OpenAI’s cookbook (OpenAI, 2025a; OpenAI, 2025b).
  • OpenAI Chat Completions. Same backend file, different code path (api_style = "chat_completions"). No built-in tools, no polling – just a blocking chat call. This is the escape hatch for any OpenAI-compatible endpoint: Perplexity’s Sonar Deep Research (Perplexity, 2025), Groq, Together, Ollama, vLLM, or llama.cpp’s llama-server.
  • Gemini Deep Research. Implemented against the Interactions API, with Google Search and URL context as built-in tools (Google, 2024).
  • DR-Tulu. Allen AI’s open research agent, called over its /chat endpoint; the client-side integration is intentionally thin (AI2, 2025).
  • Open Deep Research. A self-contained smolagents stack with a text browser and search tools, using LiteLLM as the model layer (Roucher et al., 2025).
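The Chat Completions escape hatch is essentially one blocking POST. A minimal stdlib-only sketch (endpoint, model name, and payload details are placeholders for whatever OpenAI-compatible server you point it at):

```python
import json
import urllib.request


def build_chat_request(query: str, model: str) -> dict:
    # Minimal Chat Completions payload -- no built-in tools, no polling
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a deep-research assistant."},
            {"role": "user", "content": query},
        ],
    }


def run_blocking(base_url: str, query: str, model: str, api_key: str = "none") -> str:
    # One blocking call against any OpenAI-compatible /chat/completions endpoint
    payload = build_chat_request(query, model)
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# e.g. run_blocking("http://localhost:8080/v1", "history of MCP", "llama-3.1-8b")
```

The same call shape works for Perplexity, Groq, Together, Ollama, vLLM, or llama-server, which is why a single `api_style = "chat_completions"` switch is enough.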

Why MCP?

Plugging a deep-research tool into Claude Code takes one line:

claude mcp add deep-research -- uv run --directory /path/to/deep-research-mcp deep-research-mcp

The server speaks stdio when spawned locally and Streamable HTTP on /mcp when run as a shared service. The repository also ships a CLI (cli/deep-research-cli.py) and a Textual TUI (cli/deep-research-tui.py) that can either run the agent directly or act as MCP clients against the HTTP endpoint – useful for debugging providers without a model in the loop.
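For Codex, the equivalent registration goes in its TOML config; a sketch assuming the standard `mcp_servers` table (adjust the directory to your checkout):

```toml
# ~/.codex/config.toml
[mcp_servers.deep-research]
command = "uv"
args = ["run", "--directory", "/path/to/deep-research-mcp", "deep-research-mcp"]
```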

Clarification

Optional, disabled by default in my setup and probably going away at some point. Three small chat models – a triage model, a clarifier, and an instruction builder – decide whether a query is underspecified, ask follow-up questions, and merge the answers into a longer brief before the expensive provider call. The pattern is adapted from OpenAI’s cookbook. On local endpoints (Ollama, llama-server), clarification needs a model with reasonable structured-output behaviour; very small models trip the triage step.
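The triage → clarify → merge flow reduces to a short pipeline once the three model calls are abstracted away. A stubbed sketch (the callables stand in for the small chat models and the user interaction; this is the shape of the pattern, not the repo's code):

```python
def clarify_pipeline(query: str, ask_user, triage, clarifier, builder) -> str:
    # triage(query) -> bool: is the query underspecified?
    # clarifier(query) -> list[str]: follow-up questions to ask
    # builder(query, qa_pairs) -> str: merge answers into a longer brief
    if not triage(query):
        return query  # well-specified: skip the extra model calls
    questions = clarifier(query)
    answers = [ask_user(q) for q in questions]
    return builder(query, list(zip(questions, answers)))
```

The expensive provider call then receives the enriched brief instead of the raw query; with very small local models, it is the `triage` step that tends to fail first.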

Notes

  • Long-running tasks can take hours. Client timeouts matter more than the agent’s own – raise MCP_TOOL_TIMEOUT (Claude Code) or tool_timeout_sec (Codex) and poll research_status rather than blocking a single tool call.
  • Clarification and instruction building remain OpenAI-compatible even when the main provider is Gemini or DR-Tulu; set clarification_base_url separately.
  • The repo ships a Claude Code skill and a Codex skill so the assistant knows how to use the tools without re-reading the README.
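The poll-instead-of-block advice from the first note amounts to a loop like the following; `check_status` stands in for a research_status call, and the status-dict shape is an assumption, not the server's actual schema:

```python
import time


def wait_for_report(check_status, task_id: str,
                    poll_sec: float = 30.0, max_sec: float = 4 * 3600):
    # Poll a status function instead of blocking one long tool call.
    # check_status(task_id) is assumed to return a dict like
    # {"status": "running" | "completed" | "failed", "report": ...}.
    deadline = time.monotonic() + max_sec
    while time.monotonic() < deadline:
        status = check_status(task_id)
        if status["status"] == "completed":
            return status["report"]
        if status["status"] == "failed":
            raise RuntimeError(f"task {task_id} failed")
        time.sleep(poll_sec)
    raise TimeoutError(f"task {task_id} still running after {max_sec}s")
```

Many short research_status calls survive client-side tool timeouts that a single hours-long deep_research call would trip.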

Code, tests, and setup: github.com/pminervini/deep-research-mcp.
