- gemini-fullstack-langgraph-quickstart: Gemini fullstack and LangGraph integration.
- multi-agent research system: Anthropic's multi-agent research system (blog post).
- gpt-researcher: Autonomous agent for comprehensive research tasks.
- DeerFlow: ByteDance's open-source deep research framework.
- r1-reasoning-rag: Reasoning-augmented retrieval-augmented generation framework.
- nanoDeepResearch: Lightweight deep research toolkit.
- deep-research (Aomni): Deep research assistant by Aomni.
- deep-research (u14app): Deep research platform by u14app.
- open-deep-research: Open-source deep research framework.
- deep-searcher: Deep search and research toolkit.
- node-DeepResearch: Deep research toolkit to find the right answers.
- Auto-Deep-Research: Automated deep research agent.
- langgraph-deep-research: Deep research workflows with LangGraph.
- DeepResearchAgent: Deep research agent by SkyworkAI.
- OpenManus: An open-source framework for building general AI agents.

| Title | Date & Code | Base model | Optimization | Search Engine | Agent Architecture | Training Dataset | Evaluation Dataset |
|---|---|---|---|---|---|---|---|
| Search-o1: Agentic Search-Enhanced Large Reasoning Models | 2025/01/09 | QwQ-32B-Preview | Prompting | Web Search | Single-Agent | – | GPQA, MATH500, AMC2023, AIME2024, LiveCodeBench, NQ, TriviaQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, Bamboogle |
| Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research | 2025/02/07 | N/A | Prompting | Web Search | Multi-Agent | – | GPQA |
| AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents | 2025/02/18 | Claude3.5-Sonnet | Prompting | Web Search | Multi-Agent | – | GAIA |
| Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models | 2025/03/11 | GPT-4o, Claude3.5-Sonnet | Prompting | Web Search | Multi-Agent | – | TELL ME A STORY, WildSeek |
| Open Deep Search: Democratizing Search with Open-source Reasoning Agents | 2025/03/26 | Llama3.1-70B, DeepSeek-R1 | Prompting | Web Search | Single-Agent | – | SimpleQA, FRAMES |
| Demystifying and Enhancing the Efficiency of Large Language Model Based Search Agents | 2025/05/17 | Qwen2.5-14B, Qwen2.5-7B | Prompting | Local Retrieval | Single-Agent | – | MuSiQue, NQ, 2WikiMultiHopQA, HotpotQA, Bamboogle, StrategyQA |
| Multimodal DeepResearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework | 2025/06/03 | Claude3.7-Sonnet, GPT-4o-mini, Qwen3-235B-A22B, Qwen2.5-VL-72B-Instruct | Prompting | Web Search | Multi-Agent | – | Pew Research, Our World in Data, Open Knowledge Foundation |
| RAG-Gym: Systematic Optimization of Language Agents for Retrieval-Augmented Generation | 2025/05/31 | Llama3.1-8B-Instruct, Qwen2.5-7B-Instruct, GPT-4o-mini | SFT, RL(PPO, DPO) | Local Retrieval | Single-Agent | HotpotQA, MedQA | HotpotQA, 2WikiMultiHopQA, Bamboogle, MedQA |
| R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning | 2025/03/07 | Qwen2.5-7B-Base, Llama3.1-8B-Instruct | SFT, RL(GRPO, Reinforce++) | Web Search, Local Retrieval | Single-Agent | HotpotQA, 2WikiMultiHopQA | HotpotQA, 2WikiMultiHopQA, MuSiQue, Bamboogle |
| Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning | 2025/03/12 | Qwen2.5-7B-Instruct, Qwen2.5-7B-Base, Qwen2.5-3B-Instruct, Qwen2.5-3B-Base | RL(PPO, GRPO) | Web Search | Single-Agent | NQ, HotpotQA | NQ, TriviaQA, PopQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, Bamboogle |
| ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning | 2025/03/25 | Qwen2.5-7B-Instruct, Qwen2.5-32B-Instruct | RL(GRPO) | Web Search | Single-Agent | MuSiQue | HotpotQA, 2WikiMultiHopQA, MuSiQue, Bamboogle |
| DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments | 2025/03/26 | Qwen2.5-7B-Instruct | RL(GRPO) | Web Search | Multi-Agent | NQ, TriviaQA, HotpotQA, 2WikiMultiHopQA | MuSiQue, Bamboogle, PopQA, NQ, TriviaQA, HotpotQA, 2WikiMultiHopQA |
| Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs | 2025/04/11 | Pangu Ultra-135B | SFT, RL | Local Retrieval | Single-Agent | – | – |
| WebThinker: Empowering Large Reasoning Models with Deep Research Capability | 2025/04/30 | GPT-o1, GPT-o3, DeepSeek-R1, QwQ-32B, Qwen2.5-32B-Instruct | RL(DPO) | Web Search | Single-Agent | SuperGPQA, WebWalkerQA, OpenThoughts, NaturalReasoning, NuminaMath | GPQA, GAIA, WebWalkerQA, Humanity’s Last Exam |
| ZeroSearch: Incentivize the Search Capability of LLMs without Searching | 2025/05/07 | Qwen2.5-3B-Base, Qwen2.5-7B-Base, Qwen2.5-7B-Instruct, Qwen2.5-3B-Instruct, Llama3.2-3B-Instruct, Llama3.2-3B-Base | RL(Reinforce, GRPO, PPO) | Web Search | Single-Agent | NQ, HotpotQA | NQ, TriviaQA, PopQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, Bamboogle |
| Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent | 2025/05/12 | Qwen2.5-3B-Instruct, Qwen2.5-7B-Instruct | RL(GRPO) | Local Retrieval | Single-Agent | NQ, HotpotQA | PopQA, 2WikiMultiHopQA |
| s3 - Efficient Yet Effective Search Agent Training via RL | 2025/05/20 | Qwen2.5-7B-Instruct | RL(PPO) | Local Retrieval | Single-Agent | NQ, HotpotQA | NQ, TriviaQA, PopQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, MedQA-US, MedMCQA, PubMedQA, BioASQ-Y/N, MMLU-Med |
| Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning | 2025/05/22 | Qwen2.5-7B-Instruct | RL(DPO) | Local Retrieval | Single-Agent | PopQA, HotpotQA, 2WikiMultiHopQA | PopQA, HotpotQA, 2WikiMultiHopQA, Bamboogle, MuSiQue |
| R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | 2025/05/22 | Qwen2.5-7B-Instruct | SFT, RL | Local Retrieval | Single-Agent | HotpotQA, 2WikiMultiHopQA | HotpotQA, 2WikiMultiHopQA, MuSiQue, Bamboogle |
| WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning | 2025/05/22 | Qwen2.5-3B, Llama3.1-8B | SFT, RL(M-GRPO) | Web Search | Single-Agent | WebArena-Lite, WebArena | WebArena-Lite, WebArena |
| SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis | 2025/05/25 | Qwen2.5-7B-Instruct, Qwen2.5-32B-Instruct, DeepseekDistilled-Qwen2.5-32B, QwQ-32B | SFT | Web Search | Single-Agent | NQ, SimpleQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, MultiHopRAG | Bamboogle, FRAMES, GAIA, NQ, SimpleQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, MultiHopRAG |
| MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability | 2025/05/27 | Llama3.1-8B, Llama3.2-3B, Llama3.2-1B, Llama3, Qwen2.5-7B, Qwen2.5-3B, Qwen2.5-1.5B, Qwen2.5 | SFT, RL(DAPO) | Local Retrieval | Multi-Agent | HotpotQA | HotpotQA, FanoutQA, MuSiQue, 2WikiMultiHopQA, Bamboogle, FreshQA |
| MMSearch-R1: Incentivizing LMMs to Search | 2025/06/25 | Qwen2.5-VL-7B | RL(GRPO) | Web Search | Single-Agent | VQA, MetaClip, FVQA, InfoSeek | FVQA-test, InfoSeek, MMSearch, SimpleVQA, LiveVQA |

Ref: https://github.com/DavidZWZ/Awesome-Deep-Research
- single Agent
- Anthropic