Hierarchical Router: 99.5% Context Window Reduction
DeployStack Satellite now exposes just 2 meta-tools instead of 100+ individual tools, cutting the tokens spent on tool definitions by 99.5%. Teams can add MCP servers without growing that footprint, freeing up to 81,650 tokens (in a setup that previously spent 82,000 on tool definitions) for actual work.

We've implemented a hierarchical router pattern that solves the MCP context window consumption problem. Instead of exposing 100+ tools that consume up to 80% of your context window before any work begins, DeployStack Satellite now exposes just 2 meta-tools that provide access to all your MCP servers while spending 99.5% fewer tokens on tool definitions.
The Problem We Solved
Context Window Consumption Crisis
When MCP clients connect to multiple servers, each tool's definition (name, description, parameters, schemas) gets loaded into the context window. This creates a severe problem:
Real-world example:
- 15 MCP servers × 10 tools each = 150 total tools
- Each tool definition ≈ 500 tokens
- Total consumption: 75,000 tokens (37.5% of a 200k context window)
Before any actual work begins, more than a third of your available context is gone just describing what tools exist.
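The arithmetic behind this example is easy to check. The sketch below is illustrative only: the helper name `toolDefinitionCost` is hypothetical, and the 500-token average is the estimate quoted above rather than a measured value.

```typescript
// Hypothetical helper: estimates how much context tool definitions consume.
function toolDefinitionCost(
  servers: number,
  toolsPerServer: number,
  tokensPerTool: number,
  contextWindow: number
): { tools: number; tokens: number; shareOfContext: number } {
  const tools = servers * toolsPerServer;
  const tokens = tools * tokensPerTool;
  return { tools, tokens, shareOfContext: tokens / contextWindow };
}

// Figures from the example above: 15 servers, 10 tools each, ~500 tokens per tool.
const cost = toolDefinitionCost(15, 10, 500, 200000);
console.log(cost.tools);          // 150
console.log(cost.tokens);         // 75000
console.log(cost.shareOfContext); // 0.375
```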
Industry impact:
- Claude Code: 82,000 tokens consumed by MCP tools (41% of context)
- Cursor: Hard limit of 40 tools maximum
- General consensus: Performance degrades significantly after 20-40 tools
- Critical failures reported at 80+ tools
Why This Matters
More MCP servers mean better AI capabilities, but also less room for actual work. Users had to choose between:
1. Limited tooling (stay under 40 tools, miss valuable integrations)
2. Degraded performance (add more tools, sacrifice context space)
This isn't sustainable as the MCP ecosystem grows.
How We Fixed It: 2-Tool Hierarchical Router
How It Works
Instead of exposing all tools directly to MCP clients, DeployStack Satellite now exposes only 2 meta-tools:
1. discover_mcp_tools - Search for available tools
// Find tools using natural language
discover_mcp_tools({
  query: "github create issue",
  limit: 10
})
// Returns: [{ tool_path: "github:create_issue", description: "..." }]
2. execute_mcp_tool - Execute a discovered tool
// Execute using the tool_path from discovery
execute_mcp_tool({
  tool_path: "github:create_issue",
  arguments: { repo: "deploystackio/deploystack", title: "Bug report" }
})
Behind the Scenes
While clients only see 2 tools, the satellite still:
- Manages 20+ actual MCP servers (HTTP and stdio)
- Caches 100+ real tools internally
- Routes execution requests to the correct server
- Handles both HTTP/SSE remote servers and stdio subprocess servers
The magic: Clients discover tools dynamically only when needed, not upfront.
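To make the pattern concrete, here is a minimal sketch of a two-meta-tool router. It is not the Satellite implementation: `ToolEntry`, `HierarchicalRouter`, and `register` are hypothetical names, the handlers stand in for calls to real MCP servers, and plain substring matching stands in for the fuzzy search described later.

```typescript
// Hypothetical types and names; a sketch of the pattern, not the real code.
type ToolArgs = Record<string, unknown>;

interface ToolEntry {
  toolPath: string; // "server:tool", as returned by discovery
  description: string;
  handler: (args: ToolArgs) => Promise<unknown>; // stands in for an MCP server call
}

class HierarchicalRouter {
  private readonly tools = new Map<string, ToolEntry>();

  register(entry: ToolEntry): void {
    this.tools.set(entry.toolPath, entry);
  }

  // Meta-tool 1: return only the tools that match the query.
  // Every query term must appear in the tool path or description.
  discover(query: string, limit = 10): { tool_path: string; description: string }[] {
    const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
    const matches: { tool_path: string; description: string }[] = [];
    this.tools.forEach((entry) => {
      const haystack = `${entry.toolPath} ${entry.description}`.toLowerCase();
      if (terms.every((t) => haystack.includes(t))) {
        matches.push({ tool_path: entry.toolPath, description: entry.description });
      }
    });
    return matches.slice(0, limit);
  }

  // Meta-tool 2: look up the tool and forward the call to its handler.
  async execute(toolPath: string, args: ToolArgs): Promise<unknown> {
    const entry = this.tools.get(toolPath);
    if (!entry) throw new Error(`Unknown tool: ${toolPath}`);
    return entry.handler(args);
  }
}

const router = new HierarchicalRouter();
router.register({
  toolPath: "github:create_issue",
  description: "Create an issue in a GitHub repository",
  handler: async (args) => ({ created: true, title: args["title"] }),
});
router.register({
  toolPath: "slack:post_message",
  description: "Post a message to a Slack channel",
  handler: async () => ({ posted: true }),
});

console.log(router.discover("github issue").map((t) => t.tool_path));
```

However many tools are registered, the client-facing surface stays at `discover` and `execute`. Definitions enter the conversation only as discovery results.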
Token Reduction Results
| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Tools Exposed | 150 | 2 | 98.7% reduction |
| Tokens Consumed | 75,000 | 350 | **99.5% reduction** |
| Context Available | 62.5% | 99.8% | +37.3 percentage points |
Example: If you previously had 82,000 tokens consumed by MCP tools, you now have 81,650 tokens freed for actual work.
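The table values follow from simple arithmetic. The sketch below reproduces them, assuming the 200k context window used earlier; `tokenReduction` is a hypothetical helper name.

```typescript
// Hypothetical helper for checking the figures in the table above.
function tokenReduction(before: number, after: number, contextWindow: number) {
  return {
    freedTokens: before - after,
    reductionPercent: ((before - after) / before) * 100,
    // Share of the context window left for actual work, in percent.
    availableBefore: (1 - before / contextWindow) * 100,
    availableAfter: (1 - after / contextWindow) * 100,
  };
}

const table = tokenReduction(75000, 350, 200000);
console.log(table.reductionPercent.toFixed(1)); // 99.5
console.log(table.availableBefore.toFixed(1));  // 62.5
console.log(table.availableAfter.toFixed(1));   // 99.8

// The 82,000-token case from the example.
const claudeCode = tokenReduction(82000, 350, 200000);
console.log(claudeCode.freedTokens);            // 81650
```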
Enhanced Tool Discovery
Full-Text Search Powered by Fuse.js
The discover_mcp_tools meta-tool uses Fuse.js for intelligent fuzzy search across all your MCP servers:
Features:
- Natural language queries - Search with phrases like "scrape website markdown"
- Fuzzy matching - Handles typos and close spelling variants (e.g., "website" can match "webpage" because the strings are similar; Fuse.js compares characters and has no notion of synonyms)
- Fast performance - 2-5ms search time across 100+ tools
- Relevance scoring - Results ranked by match quality
- Weighted search - Prioritizes tool names (40%), descriptions (35%), server names (25%)
Example workflow:
// User asks: "Do you have tools for GitHub?"
discover_mcp_tools({ query: "github" })
// Returns:
// - github:create_issue
// - github:update_issue
// - github:list_repos
// - github:search_code
// Execute the one you need:
execute_mcp_tool({
  tool_path: "github:create_issue",
  arguments: {...}
})
Search Quality Improvements
We've tuned the search engine for optimal user experience:
Configuration:
- Threshold: 0.5 - Balanced fuzzy matching (tolerates typos and near-spelling variations)
- Min match length: 2 - Filters noise while catching abbreviations
- Extended search: enabled - Supports advanced query operators if needed
Result: Users are more likely to find tools on the first try, even when they phrase queries differently than the tool descriptions.
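As a rough illustration, the settings above map onto Fuse.js options along these lines. The option names (`keys`, `threshold`, `minMatchCharLength`, `useExtendedSearch`, `includeScore`) are Fuse.js options. The indexed field names (`name`, `description`, `serverName`) and the use of `includeScore` are assumptions made for this sketch, not the confirmed Satellite configuration.

```typescript
// Assumed field names for an indexed tool record: name, description, serverName.
const searchOptions = {
  keys: [
    { name: "name", weight: 0.4 },         // tool names: 40%
    { name: "description", weight: 0.35 }, // descriptions: 35%
    { name: "serverName", weight: 0.25 },  // server names: 25%
  ],
  threshold: 0.5,        // 0 requires a perfect match, 1 matches anything
  minMatchCharLength: 2, // ignore single-character matches
  useExtendedSearch: true,
  includeScore: true,    // assumption: expose the score used for ranking
};

// With the fuse.js package installed, usage would look roughly like:
//   import Fuse from "fuse.js";
//   const fuse = new Fuse(tools, searchOptions);
//   const results = fuse.search("github create issue", { limit: 10 });
console.log(searchOptions.keys.map((k) => `${k.name}: ${k.weight}`));
```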
What This Means For You
Unlimited MCP Server Growth
You can now add as many MCP servers as you need without worrying about context window consumption:
- ✅ 10 servers? No problem
- ✅ 50 servers? Still only 350 tokens
- ✅ 100 servers? Context usage unchanged
The hierarchical router keeps the context cost flat as you add servers, because clients always see just 2 meta-tools.
Better AI Performance
With tool definitions taking up 99.5% fewer tokens:
- AI assistants can hold longer conversations
- More complex tasks fit in a single session
- Better code generation with full project context
- Reduced need to restart conversations due to context limits
No Breaking Changes
Everything still works:
- All existing MCP servers continue to work
- No configuration changes required
- Internal routing handles stdio and HTTP servers automatically
- Tool discovery happens transparently
From the user's perspective: Tools just work, but now there's room to actually use them.
Technical Implementation
Technical Design
┌─────────────────────────────────────────────────────────────┐
│ MCP Client (Claude Desktop / VS Code)                       │
│                                                             │
│ Sees: 2 meta-tools (350 tokens)                             │
│   - discover_mcp_tools                                      │
│   - execute_mcp_tool                                        │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│ DeployStack Satellite (Hierarchical Router)                 │
│                                                             │
│ Behind the scenes:                                          │
│   - Manages 20+ actual MCP servers                          │
│   - Caches 100+ real tools                                  │
│   - Full-text search with Fuse.js                           │
│   - Routes to stdio subprocesses or HTTP endpoints          │
└─────────────────────────────────────────────────────────────┘
Key Features
Single Source of Truth:
- UnifiedToolDiscoveryManager maintains the only tool cache
- Search service queries this directly (no duplication)
- Always fresh data - automatic server add/remove reflected immediately
Format Conversion:
- External format: "serverName:toolName" (user-facing, clean API)
- Internal format: "serverName-toolName" (routing, backward compatible)
- Automatic conversion in execution layer
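A minimal sketch of the external-to-internal conversion might look like the following. The function name `toInternalName` is hypothetical, and splitting on the first colon is an assumption about how tool paths are parsed.

```typescript
// Hypothetical helper; the real conversion code may differ.
function toInternalName(toolPath: string): string {
  const sep = toolPath.indexOf(":");
  // Reject paths with no colon, an empty server name, or an empty tool name.
  if (sep <= 0 || sep === toolPath.length - 1) {
    throw new Error(`Expected "serverName:toolName", got "${toolPath}"`);
  }
  const serverName = toolPath.slice(0, sep);
  const toolName = toolPath.slice(sep + 1);
  return `${serverName}-${toolName}`;
}

console.log(toInternalName("github:create_issue")); // github-create_issue
```

Going the other way is less straightforward if a server name can itself contain a hyphen. In that case the internal name would need to be resolved against the list of known servers rather than split on the first hyphen.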
Transport Agnostic:
- HTTP/SSE servers: Routes via MCP SDK Client
- stdio servers: Routes via ProcessManager
- Same interface for both - client never knows the difference
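The "same interface for both" idea can be sketched with one shared interface and two implementations. Everything here is hypothetical (`ToolTransport`, `FakeHttpTransport`, `FakeStdioTransport`, `routeCall`, the placeholder URL and command). The stubs only return a description of what they would do; they stand in for the MCP SDK Client and ProcessManager rather than reproducing them.

```typescript
// Hypothetical names throughout; this is not the actual DeployStack code.
interface ToolTransport {
  readonly kind: "http" | "stdio";
  callTool(toolName: string, args: Record<string, unknown>): Promise<string>;
}

// Stand-in for a remote server reached over HTTP/SSE.
class FakeHttpTransport implements ToolTransport {
  readonly kind = "http" as const;
  private readonly url: string;
  constructor(url: string) {
    this.url = url;
  }
  async callTool(toolName: string): Promise<string> {
    return `POST ${this.url} -> ${toolName}`;
  }
}

// Stand-in for a local server running as a stdio subprocess.
class FakeStdioTransport implements ToolTransport {
  readonly kind = "stdio" as const;
  private readonly command: string;
  constructor(command: string) {
    this.command = command;
  }
  async callTool(toolName: string): Promise<string> {
    return `stdin of "${this.command}" <- ${toolName}`;
  }
}

const transports = new Map<string, ToolTransport>([
  ["github", new FakeHttpTransport("https://example.invalid/mcp")],
  ["filesystem", new FakeStdioTransport("fs-server")],
]);

// The caller supplies only a server name; the transport type stays hidden.
function routeCall(
  serverName: string,
  toolName: string,
  args: Record<string, unknown>
): Promise<string> {
  const transport = transports.get(serverName);
  if (!transport) return Promise.reject(new Error(`Unknown server: ${serverName}`));
  return transport.callTool(toolName, args);
}

routeCall("github", "create_issue", { title: "Bug report" }).then((r) => console.log(r));
```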
Credits
This implementation is powered by Fuse.js, the excellent lightweight fuzzy-search library that has zero dependencies of its own and makes intelligent tool discovery possible.
The hierarchical router pattern is based on research and best practices from the MCP community, validated by multiple open-source implementations showing 95-99% token reduction across different architectures.