Teaching MCP Servers New Tricks: Challenges in Tool Discovery
2 Jul 2025 • Macey Baker
You’ve probably noticed a pattern when you try new LLM-backed or adjacent tools, especially in the developer space. Once you get oriented, the first few interactions are delightful. You start to gain an instinct for what the tool can do. Then you begin to explore the boundaries, either out of curiosity or necessity, and you find them faster than you’d like. Suddenly, the delight wears off, and the boundaries feel a bit constraining.
I felt this way writing an MCP server for a CLI tool I recently built.
Having used a few MCP servers and gone through Anthropic’s quickstart tutorial, I still wasn’t 100% sure what to expect. I found the process equal parts magical and frustrating. I want to share a few things I learned along the way that might help anyone approaching MCPs for the first time.
Setting up the server
Actually writing the MCP server was quite straightforward. In my case, the server wrapped a few tools that were already exposed elsewhere for easy invocation. All I had to do was enumerate those tools with simple names and descriptions, then wire them up to the right functions. I’ve seen more complex examples, but I imagine for most products, this part will be reasonably quick.
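For a sense of scale, here’s a minimal sketch of that wiring, using the FastMCP helper from the official MCP Python SDK. The tool and the function it wraps are placeholders, not my actual product:

```python
# Minimal MCP server sketch (official Python SDK's FastMCP helper).
# `generate_readme` is a stand-in for a function the product already
# exposes elsewhere; all names here are illustrative.
from mcp.server.fastmcp import FastMCP

def generate_readme(project_path: str) -> str:
    # Pretend this calls into the existing CLI tool's logic.
    return f"# README for {project_path}\n(generated contents here)"

mcp = FastMCP("my_tool")

@mcp.tool()
def create_readme(project_path: str) -> str:
    """Generate a README for the project at the given path."""
    return generate_readme(project_path)

if __name__ == "__main__":
    mcp.run(transport="stdio")
```

The name and docstring are essentially all the model sees when deciding whether to call a tool, which is why the naming choices discussed below end up mattering so much.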
What surprised me wasn’t the code; it was watching how LLM-backed tools interacted with the MCP server I had built. Or rather, how often they didn’t.
The challenge of tool discovery
Anthropic’s MCP quickstart tutorial uses a weather API, where the tool is called `get_weather`. This is a good example, because LLMs already know what weather is, and if a user asks about it, a tool named `get_weather` makes sense to use. There’s no need for explanation or discovery. It just works.
But my MCP server wasn’t exposing a public API. It was describing a new product with a novel process and new vocabulary. There were no existing priors for the LLM (Claude Code, in my case) to draw from. The tools were defined, but they weren’t recognised. Unlike traditional chat setups, MCP servers don’t include a system prompt, so there was no way to globally explain what the product was or how the tools were meant to be used. Even with good descriptions, Claude Code would often do anything but call the tools I had exposed: guessing what I meant, attempting the operations itself, or researching online (a good effort, to be fair, but not so amusing after <redacted> tokens). This led to a frustrating loop.
What kind of user is this for?
This led to a broader realisation (did everyone but me already know this?): MCP servers work best when the human user already knows what they’re doing. Most existing examples assume that the user is proficient in the tools provided, and just wants a new interaction pattern. In that case, the user can use precise vocabulary, steer the conversation, and help the LLM pick the right tools.
But I was designing for the opposite: users who were new to the product and needed the LLM to help them learn what was possible, or suggest next steps. That turned out to be a tricky fit. With no context and no training on the toolset, the LLM often flailed, or ignored the tools completely.
Remedies
Here are a few ideas for how to remedy this issue, many of which I’ve applied myself (with medium success; your mileage may vary!):
Adding an “intro” tool: A “fake” tool that just returns a blob of context about what the product is, what the tools do, and some ideas for how to use them. Here, you can give LLM-specific instructions about what to do next (see the sketch after this list).
Naming tools more descriptively: Give the LLM the best chance of correctly choosing a tool by naming it very obviously. A pattern like `<product_name>:<operation>_<purpose>` might be a mouthful, but it leaves very little to guess: for example, `my_tool:create_readme_for_python_executable` as opposed to something less descriptive, like `create_readme`. You could also consider mapping many tools to the same functions if there’s a chance of users struggling to articulate their intentions clearly. In this example, I might consider adding `write_readme` or `new_readme`, which have the same behaviour and increase the chances of the tool being discovered.
Using IDE-level priming: If you expect users to integrate your MCP server into their dev environments, adding `.cursorrules`, `CLAUDE.md`, and other environment-specific files to preload guidance wherever possible will be helpful. Of course, you need to decide how and when to initialise this guidance, and how to make it clear to users that they’ll get the most out of your MCP server if they complete this step.
Still early days
My colleague Rob pointed out to me that most MCP developers will probably try to troubleshoot using LLMs, reaching less and less for Stack Overflow and other publicly accessible forums. I wonder how many of these speedbumps we’ll have to encounter only by doing, or whether it’ll be primarily LLMs that teach us how to use connector technologies like this.
Overall, this technology is still in its early days. Since I built this first version, the ecosystem has already evolved a bit: new interfaces, better patterns, more robust examples. I expect that a lot of the pain points I hit will smooth out quickly.