Lower the MCP Tool Search Threshold to Save Tokens
Every MCP server you configure adds tool definitions to your context window, even when you're not using them. When those definitions exceed 10% of your context window, Claude Code automatically defers them and loads tools on demand via tool search. You can make this kick in earlier by lowering the threshold.
export ENABLE_TOOL_SEARCH=auto:5
Setting the threshold to auto:5 means tool search kicks in when MCP tool descriptions exceed just 5% of your context window. Deferred tools only enter context when actually used, so a lower threshold means fewer idle definitions eating your tokens.
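To get a feel for what the percentage means in tokens, here is a rough sketch of the arithmetic. The 200,000-token window is an assumed figure for illustration, and `threshold_tokens` is a hypothetical helper, not part of Claude Code; the real accounting may differ.

```shell
# Illustrative arithmetic only: the window size below is an assumption,
# not a value read from Claude Code.
threshold_tokens() {
  # $1 = context window size in tokens, $2 = the N in auto:N (a percentage)
  echo $(( $1 * $2 / 100 ))
}

context_window=200000  # hypothetical context window size

echo "auto:10 -> $(threshold_tokens "$context_window" 10) tokens"
echo "auto:5  -> $(threshold_tokens "$context_window" 5) tokens"
```

Under that assumption, halving the threshold means tool definitions get deferred once they pass roughly 10,000 tokens instead of 20,000.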
This is especially useful if you have several MCP servers configured but only use one or two in a given session. Instead of paying for all those tool definitions on every message, only the tools Claude actually calls get loaded.
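If you would rather not export the variable shell-wide, one option is to set it only for the process you launch. This is a minimal sketch, assuming the CLI is started with the `claude` command; `with_tool_search` is a hypothetical helper name, and the demo line uses `sh -c` as a stand-in so it runs anywhere.

```shell
# Run a command with ENABLE_TOOL_SEARCH set for that process only.
# with_tool_search is a hypothetical helper, not part of Claude Code.
with_tool_search() {
  pct="$1"   # the N in auto:N
  shift
  env ENABLE_TOOL_SEARCH="auto:${pct}" "$@"
}

# Stand-in demo: the child process sees the lowered threshold.
with_tool_search 5 sh -c 'echo "$ENABLE_TOOL_SEARCH"'
```

In practice you would call it as `with_tool_search 5 claude`, which leaves the variable untouched in the parent shell.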
You can check what's consuming your context space at any time:
> /context
This shows you exactly how much space tool definitions, system prompts, and conversation history are taking up.
For maximum savings, combine this with disabling unused MCP servers via /mcp and preferring CLI tools like gh or aws that don't add persistent tool definitions at all.
Set a lower tool search threshold and stop paying for tools you're not using.