Prefer CLI Tools Over MCP Servers to Reduce Context Overhead
MCP servers are powerful, but each one adds tool definitions to your context window on every single message, even when idle. If you're using an MCP server purely for something that has a CLI equivalent, you're burning tokens unnecessarily.
Tools like gh, aws, gcloud, and sentry-cli are more context-efficient because Claude can run them as ordinary bash commands. There are no persistent tool definitions, and nothing is spent until a command actually runs.
# Instead of configuring the GitHub MCP server just for PR operations:
gh pr list --state open
gh issue create --title "Bug fix" --body "Details here"
# Instead of an AWS MCP server for S3:
aws s3 ls s3://my-bucket/
aws s3 cp ./file.txt s3://my-bucket/
The difference adds up. If you have five MCP servers configured, that could be hundreds of tokens of tool definitions loaded into every request, and that cost repeats for every message in the session. With CLI tools, tokens are spent only when Claude actually runs a command.
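To make the arithmetic concrete, here is a minimal sketch of how idle overhead scales across a session. The function name estimate_idle_overhead and every number passed to it are hypothetical and chosen only for illustration. Real figures depend on your servers, so measure them rather than relying on this estimate.

```shell
#!/bin/sh
# Rough estimate of tokens spent on MCP tool definitions.
# All inputs are hypothetical placeholders, not measured values.
estimate_idle_overhead() {
  servers=$1           # number of configured MCP servers
  tools_per_server=$2  # average tools exposed per server (assumed)
  tokens_per_tool=$3   # average tokens per tool definition (assumed)
  messages=$4          # messages sent during the session

  per_request=$(( servers * tools_per_server * tokens_per_tool ))
  session_total=$(( per_request * messages ))

  # Print: tokens per request, then tokens across the whole session.
  echo "$per_request $session_total"
}

# Five servers, 4 tools each, 40 tokens per definition, 50 messages.
estimate_idle_overhead 5 4 40 50
```

With these assumed inputs, a few hundred tokens per request grows into tens of thousands over a long session, which is the effect the paragraph above describes.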
Run /context to see how much space your MCP tool definitions are consuming. If a server's tools are taking significant space but you only use them occasionally, consider switching to the CLI equivalent and disabling the server via /mcp.
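Before disabling a server, it may help to confirm that the CLI equivalent is actually installed. The sketch below is one way to check. The helper name check_clis is hypothetical, and the tool list simply mirrors the CLIs mentioned earlier; adjust it to the servers you plan to replace.

```shell
#!/bin/sh
# Report whether each named CLI is available on PATH.
check_clis() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: installed"
    else
      echo "$tool: missing"
    fi
  done
}

# Check the CLIs mentioned in this tip.
check_clis gh aws gcloud sentry-cli
```

Anything reported as missing would need to be installed and authenticated before Claude can use it in place of the MCP server.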
Reserve MCP servers for tools that genuinely need persistent connections or stateful interactions, like database servers or language servers.
CLI tools cost zero tokens when idle. MCP servers don't.