As AI agents connect to more services—Slack, GitHub, Jira, MCP servers—tool libraries grow rapidly. A typical multi-server setup can easily have 50+ tools consuming 55,000+ tokens before any conversation starts. Worse, tool selection accuracy degrades when models face 30+ similarly-named tools.
The Tool Search Tool pattern, pioneered by Anthropic, addresses this: instead of loading all tool definitions upfront, the model discovers tools on-demand. It receives only a search tool initially, queries for capabilities when needed, and gets relevant tool definitions expanded into context. This achieves significant token savings while maintaining access to hundreds of tools.
The key insight: While Anthropic introduced this pattern for Claude, we can implement the same approach for any LLM using Spring AI's Recursive Advisors. Spring AI provides a portable abstraction that makes dynamic tool discovery work across OpenAI, Anthropic, Gemini, Ollama, Azure OpenAI, and any other LLM provider supported by Spring AI.
Our preliminary benchmarks show Spring AI's Tool Search Tool implementation achieves 34-64% token reduction across OpenAI, Anthropic, and Gemini models while maintaining full access to hundreds of tools.
The Spring AI Tool Search Tool project is available on GitHub: spring-ai-tool-search-tool.
First, let's understand how Spring AI's tool calling works when using the ToolCallAdvisor, a special recursive advisor that detects tool-call requests in the model's response and executes them via the ToolCallingManager. The tool execution happens in a recursive loop: the advisor keeps calling the LLM until no more tool calls are requested.
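To make the loop concrete, here is a simplified sketch built on Spring AI's public ChatModel and ToolCallingManager APIs. The class and method below are illustrative only; the actual ToolCallAdvisor operates on ChatClient requests inside the advisor chain rather than calling the model directly.

```java
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.model.tool.ToolCallingManager;
import org.springframework.ai.model.tool.ToolExecutionResult;

// Illustrative sketch of the recursive tool-execution loop.
// Assumes internal tool execution is disabled on the chat options
// (ToolCallingChatOptions with internalToolExecutionEnabled(false)),
// so tool calls surface in the ChatResponse instead of being run by the model integration.
class RecursiveToolLoopSketch {

    ChatResponse callWithTools(ChatModel chatModel, ToolCallingManager toolCallingManager, Prompt prompt) {
        ChatResponse response = chatModel.call(prompt);
        // Keep executing requested tools and re-calling the model
        // until the response contains no further tool calls.
        while (response.hasToolCalls()) {
            ToolExecutionResult toolResult = toolCallingManager.executeToolCalls(prompt, response);
            // Rebuild the prompt from the accumulated conversation history (tool results included).
            prompt = new Prompt(toolResult.conversationHistory(), prompt.getOptions());
            response = chatModel.call(prompt);
        }
        return response;
    }
}
```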
The standard tool calling flow (such as ToolCallAdvisor) sends all tool definitions to the LLM upfront. With large tool collections this creates major issues: every request pays the token cost of every tool definition, the context budget left for the actual conversation shrinks, and tool selection accuracy degrades as the model must choose among many similarly named tools.
By extending Spring AI's ToolCallAdvisor, we've created a ToolSearchToolCallAdvisor that implements dynamic tool discovery. It intercepts the tool calling loop to selectively inject tools based on what the model discovers it needs:
The flow works as follows:

- All tools are registered with the ToolSearcher but NOT sent to the LLM; the model initially sees only the search tool.
- When the model searches for a capability, the ToolSearcher finds matching tools (e.g., "Tool XYZ") and their definitions are added to the next request.

In code, this looks like this:
var toolSearchToolCallAdvisor = ToolSearchToolCallAdvisor.builder()
    .toolSearcher(toolSearcher)
    .maxResults(5)
    .build();

ChatClient chatClient = chatClientBuilder
    .defaultTools(new MyTools()) // 100s of tools registered but NOT sent to LLM initially
    .defaultAdvisors(toolSearchToolCallAdvisor) // Activate Tool Search Tool
    .build();
The ToolSearcher interface abstracts the search implementation, supporting multiple strategies (see tool-searchers for implementations):
| Strategy | Implementation | Best For |
|---|---|---|
| Semantic | VectorToolSearcher | Natural language queries, fuzzy matching |
| Keyword | LuceneToolSearcher | Exact term matching, known tool names |
| Regex | RegexToolSearcher | Tool name patterns (get_*_data) |
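For orientation, a minimal sketch of what such a search contract could look like is shown below. The actual ToolSearcher interface in the project may differ in naming and signatures, so treat this as an assumption rather than the real API.

```java
import java.util.List;

import org.springframework.ai.tool.definition.ToolDefinition;

// Hypothetical sketch of a ToolSearcher-style contract (not the project's actual interface):
// each strategy (vector, Lucene, regex) resolves a free-text query to matching tool definitions.
interface ToolSearcherSketch {

    // Return the tool definitions that best match the query, capped at maxResults.
    List<ToolDefinition> search(String query, int maxResults);
}
```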
The project's GitHub repository is: spring-ai-tool-search-tool.
For detailed setup instructions and code examples, see the Quick Start guide (v1.x) and the related example application (v1.x).
Maven Central (1.0.1):
<dependency>
    <groupId>org.springaicommunity</groupId>
    <artifactId>tool-search-tool</artifactId>
    <version>1.0.1</version>
</dependency>

<!-- Choose a search strategy -->
<dependency>
    <groupId>org.springaicommunity</groupId>
    <artifactId>tool-searcher-lucene</artifactId>
    <version>1.0.1</version>
</dependency>
Version 1.0.x is compatible with Spring AI 1.1.x / Spring Boot 3, and version 2.0.x with Spring AI 2.x / Spring Boot 4.
Example Usage
@SpringBootApplication
public class Application {

    @Bean
    CommandLineRunner demo(ChatClient.Builder builder, ToolSearcher toolSearcher) {
        return args -> {
            var advisor = ToolSearchToolCallAdvisor.builder()
                .toolSearcher(toolSearcher)
                .build();

            ChatClient chatClient = builder
                .defaultTools(new MyTools())
                .defaultAdvisors(advisor)
                .build();

            var answer = chatClient.prompt("""
                Help me plan what to wear today in Amsterdam.
                Please suggest clothing shops that are open right now.
                """).call().content();

            System.out.println(answer);
        };
    }

    static class MyTools {

        @Tool(description = "Get the weather for a given location at a given time")
        public String weather(String location,
                @ToolParam(description = "YYYY-MM-DDTHH:mm") String atTime) {...}

        @Tool(description = "Get clothing shop names for a given location at a given time")
        public List<String> clothing(String location,
                @ToolParam(description = "YYYY-MM-DDTHH:mm") String openAtTime) {...}

        @Tool(description = "Current date and time for a given location")
        public String currentTime(String location) {...}

        // ... potentially hundreds more tools
    }
}
For the example above, the flow would be:
1. Tools weather, clothing, currentTime (+ potentially 100s more) are registered, but the LLM initially sees only toolSearchTool.
2. toolSearchTool(query="current time date") → ["currentTime"]; the next request exposes toolSearchTool + currentTime.
3. currentTime("Amsterdam") → "2025-12-08T11:30", then toolSearchTool(query="weather location") → ["weather"]; the next request exposes toolSearchTool + currentTime + weather.
4. weather("Amsterdam") → "Sunny, 15°C", then toolSearchTool(query="clothing shops") → ["clothing"]; the next request exposes toolSearchTool + currentTime + weather + clothing.
5. clothing("Amsterdam", "2025-12-08T11:30") → ["H&M", "Zara", "Uniqlo"].

⚠️ Disclaimer: These are preliminary, manual measurements taken after a few runs. They are not averaged across multiple iterations and should be considered illustrative rather than representative.
To quantify the token savings, we ran preliminary benchmarks using the demo application with the following setup:

- Task: "Help me plan what to wear today in Amsterdam. Please suggest clothing shops that are open right now."
- Tools: 28 in total, consisting of the 3 relevant tools (weather, clothing, currentTime) plus 25 deliberately unrelated "dummy" tools, demonstrating that the tool search discovers only the needed tools among many unrelated options.
- Search strategies: Lucene (keyword-based) and VectorStore (semantic)
- Models tested: Gemini (gemini-3-pro-preview), OpenAI (gpt-5-mini-2025-08-07), Anthropic (claude-sonnet-4-5-20250929)
The measurements are collected using a custom TokenCounterAdvisor that tracks and aggregates the token usage.
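As an illustration of the idea behind such an advisor, a minimal version that aggregates the provider-reported usage metadata could look like the following sketch. It assumes Spring AI's CallAdvisor contract; the demo's TokenCounterAdvisor may be implemented differently.

```java
import java.util.concurrent.atomic.AtomicLong;

import org.springframework.ai.chat.client.ChatClientRequest;
import org.springframework.ai.chat.client.ChatClientResponse;
import org.springframework.ai.chat.client.advisor.api.CallAdvisor;
import org.springframework.ai.chat.client.advisor.api.CallAdvisorChain;
import org.springframework.ai.chat.metadata.Usage;

// Illustrative token-counting advisor: forwards the call and aggregates usage metadata.
public class TokenCountingAdvisorSketch implements CallAdvisor {

    private final AtomicLong totalTokens = new AtomicLong();
    private final AtomicLong requests = new AtomicLong();

    @Override
    public ChatClientResponse adviseCall(ChatClientRequest request, CallAdvisorChain chain) {
        ChatClientResponse response = chain.nextCall(request);
        if (response.chatResponse() != null) {
            Usage usage = response.chatResponse().getMetadata().getUsage();
            if (usage != null && usage.getTotalTokens() != null) {
                totalTokens.addAndGet(usage.getTotalTokens());
            }
            requests.incrementAndGet();
        }
        return response;
    }

    @Override
    public String getName() {
        return "tokenCountingAdvisorSketch";
    }

    @Override
    public int getOrder() {
        return 0;
    }

    public long totalTokens() {
        return totalTokens.get();
    }

    public long requestCount() {
        return requests.get();
    }
}
```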
| Model | Approach | Total Tokens | Prompt Tokens | Completion Tokens | Requests | Savings |
|---|---|---|---|---|---|---|
| Gemini | With TST | 2,165 | 1,412 | 231 | 4 | 60% |
| Gemini | Without TST | 5,375 | 4,800 | 176 | 3 | — |
| OpenAI | With TST | 4,706 | 2,770 | 1,936 | 5 | 34% |
| OpenAI | Without TST | 7,175 | 5,765 | 1,410 | 3 | — |
| Anthropic | With TST | 6,273 | 5,638 | 635 | 5 | 64% |
| Anthropic | Without TST | 17,342 | 16,752 | 590 | 4 | — |

| Model | Approach | Total Tokens | Prompt Tokens | Completion Tokens | Requests | Savings |
|---|---|---|---|---|---|---|
| Gemini | With TST | 2,214 | 1,502 | 234 | 4 | 57% |
| Gemini | Without TST | 5,122 | 4,767 | 73 | 3 | — |
| OpenAI | With TST | 3,697 | 2,109 | 1,588 | 4 | 47% |
| OpenAI | Without TST | 6,959 | 5,771 | 1,188 | 3 | — |
| Anthropic | With TST | 6,319 | 5,642 | 677 | 5 | 63% |
| Anthropic | Without TST | 17,291 | 16,744 | 547 | 4 | — |
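As a reading aid: the Savings column in both tables corresponds to the reduction in total tokens relative to the run without the Tool Search Tool. For Gemini in the first table, for example, 1 - 2,165/5,375 ≈ 60%.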
Notably, the model called currentTime before invoking the other tools, demonstrating correct reasoning about tool dependencies.

| Choose the Tool Search Tool when | Stick with the traditional approach when |
|---|---|
| You have 20+ tools in your system | You have a small tool library (<20 tools) |
| Tool definitions consume >5K tokens | All tools are frequently used in every session |
| You are building MCP-powered systems with multiple servers | Tool definitions are very compact |
| You are experiencing tool selection accuracy issues | |
As the Tool Search Tool project matures and proves its value within the Spring AI Community, we may consider adding it to the core Spring AI project.
For deterministic tool selection without LLM involvement, explore the Pre-Select Tool Demo and the experimental PreSelectToolCallAdvisor.
Unlike the agentic ToolSearchToolCallAdvisor, this advisor pre-selects tools based on message content before the LLM call—ideal for Chain of Thought patterns where a preliminary reasoning step explicitly names the required tools.
You can also consider combining the Tool Search Tool with LLM-as-a-Judge patterns to ensure the discovered tools actually fulfill the user's task. A judge model could evaluate whether the dynamically selected tools produced satisfactory results and refine the tool discovery if needed.
Try the current implementation and provide feedback to help shape its evolution into a first-class Spring AI feature.
The Tool Search Tool pattern is a step toward scalable AI agents. By combining Anthropic's innovative approach with Spring AI's portable abstraction, we can build systems that efficiently manage thousands of tools while maintaining high accuracy across any LLM provider.
The power of Spring AI's recursive advisor architecture is that it allows us to implement sophisticated tool discovery workflows that work universally - whether you're using OpenAI's GPT models, Anthropic's Claude, local Ollama models, or any other LLM supported by Spring AI. You get the same dynamic tool discovery benefits without being locked into a specific provider's native implementation.