While operating an AI Agent in production, we encountered an unexpected problem. The Agent would make arbitrary assumptions based on incomplete information and proceed with tasks. We'd only realize "Oh, this isn't right..." after seeing the results. This post shares how we systematically prevented such assumption behavior in LLMs and significantly improved Agent accuracy.
The Problem: The Agent Made Arbitrary Assumptions About Inaccessible Information
Our team develops and operates an AI virtual employee Agent that communicates via Slack. Built on the Claude Agent SDK, it handles various tasks from code writing to documentation and issue management. We receive about 8 tasks daily, with the Agent autonomously processing over 80% of minor tasks.
However, during early operations, we kept encountering a recurring problem. When team members discussed requirements verbally or documented context only in specific documents (Notion, internal wikis, etc.), the Agent couldn't access that information but would proceed anyway, saying "Well, typically this is how it's done..."
Let's look at a real example:
- A team member requests the Agent to implement a specific feature
- While working, the Agent needs technical specs and project requirements from an internal Notion document
- The problem: The Agent cannot access the Notion document
Before creating the assumption prevention tool, in such situations the Agent would make groundless assumptions like "The typical implementation approach is this, and usually these requirements exist..." and create a work plan. Naturally, it didn't match the actual requirements.
Why was this problematic?
- Incorrect Work Plans: The Agent's plan didn't match actual requirements and had to be rejected
- Wasted Time: Reviewing a deliverable only to find it was built on the wrong premise meant starting over from scratch
- Concentration Drain: Having to verify every assumption the Agent made reduced the value of automation
By our estimate, over half of the rejected work plans were due to such incorrect assumptions.
The Solution: Catching Assumptions as Tools
Initially, we took a simple approach: build question tools and tell the model to "ask the user when uncertain." We implemented the ask-text-question and ask-select-question tools and wrote "ask questions when unsure" in the prompt.
But this didn't work as well as expected. The LLM would judge situations where it was about to make assumptions as "this is common enough, it should be fine." So it thought questions weren't necessary and just proceeded. No matter how strongly we wrote it in the prompt, if the LLM didn't recognize "oh, I'm making an assumption right now," it wouldn't use the question tool.
Here we gained a key insight. Just telling it "don't assume, ask questions" isn't enough. We need to explicitly toolify the act of assuming itself to catch it.
So we designed a two-stage approach:
- Stage 1 - Catching Assumptions: When the LLM tries to assume something, make it report "I'm about to assume this" through the report-assumption tool
- Stage 2 - Forced Questioning: The report-assumption tool throws back a message saying "don't assume, use the question tool," forcing the LLM to ask
The key is systematically blocking assumptions rather than leaving it to the LLM's judgment.
Implementation: Three Tools That Make Assumptions Lead to Questions
We implemented it as a Model Context Protocol (MCP) server, providing three tools total.
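For orientation, here is a minimal sketch of what such a server scaffold could look like, assuming the TypeScript MCP SDK. The server name, the abbreviated tool metadata, and the dispatch helper are illustrative placeholders, not our production code; the real tool schemas follow in the sections below.

// Minimal MCP server scaffold (sketch). Tool descriptions and schemas are
// abbreviated here; the full definitions appear in the sections below.
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";

const server = new Server(
  { name: "assumption-guard", version: "1.0.0" }, // illustrative name
  { capabilities: { tools: {} } }
);

// Advertise the three tools to the client (the Claude Agent SDK in our case).
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    { name: "report-assumption", description: "...", inputSchema: { type: "object", properties: {} } },
    { name: "ask-text-question", description: "...", inputSchema: { type: "object", properties: {} } },
    { name: "ask-select-question", description: "...", inputSchema: { type: "object", properties: {} } },
  ],
}));

// Dispatch each tool call to its handler and serialize the result as text.
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;
  const result = await dispatch(name, args); // per-tool handlers are sketched below
  return { content: [{ type: "text", text: JSON.stringify(result) }] };
});

async function dispatch(name: string, args: unknown): Promise<unknown> {
  // Placeholder routing for this sketch only.
  throw new Error(`Handler for ${name} not wired up in this sketch`);
}

await server.connect(new StdioServerTransport());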
1. report-assumption: The Assumption Reporting Tool
A tool the LLM uses when about to make assumptions with uncertain information.
{
name: "report-assumption",
description: "Use when intentionally proceeding under an unresolved assumption and wanting to document the leap explicitly.",
inputSchema: {
type: "object",
properties: {
assumption_summary: {
type: "string",
description: "Short summary of the assumption about to be made without evidence"
}
},
required: ["assumption_summary"]
}
}
The key is in the return value. When called, it returns this response:
return {
success: true,
message: "Do not proceed with the assumption. Verify using the question tool. Call the 'ask-text-question' or 'ask-select-question' tool right now.",
assumption_token: token
};
Why use imperative sentences rather than gentle suggestions? Because the purpose of this tool itself is to force questioning. In practice, LLMs that receive this message call the question tool nearly 100% of the time.
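The snippet above references a token without showing where it comes from. A minimal sketch of the surrounding handler is below; the token generation, logging, and function name are illustrative guesses, only the returned message mirrors our actual response.

// Hypothetical handler sketch for report-assumption. Only the returned message
// matches the snippet above; token format and naming are illustrative.
import { randomUUID } from "node:crypto";

function handleReportAssumption(args: { assumption_summary: string }) {
  const token = randomUUID();

  // Record the attempted assumption so it can be audited later (illustrative).
  console.error(`[assumption] ${token}: ${args.assumption_summary}`);

  // The imperative message is the whole point: it pushes the LLM toward the
  // question tools instead of letting it proceed on the assumption.
  return {
    success: true,
    message:
      "Do not proceed with the assumption. Verify using the question tool. " +
      "Call the 'ask-text-question' or 'ask-select-question' tool right now.",
    assumption_token: token,
  };
}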
2. ask-text-question: The Text Question Tool
A tool that receives free-form answers from users. It sends questions via Slack and waits for responses.
{
name: "ask-text-question",
description: "Default path whenever the agent lacks context, data, or intent and needs the user's narrative to proceed.",
inputSchema: {
type: "object",
properties: {
question: {
type: "string",
description: "The question to ask the user"
},
multiline: {
type: "boolean",
description: "Whether to allow multiline input (default: true)"
},
max_length: {
type: "number",
description: "Maximum length of the answer (default: 1000)"
}
},
required: ["question"]
}
}
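To show how "sends questions via Slack and waits for responses" can work, here is a hypothetical handler sketch using the Slack Web API. The channel id, the polling helper, and all names are illustrative; our production server wires this differently, and an event-driven approach would usually be preferable to polling.

// Hypothetical ask-text-question handler: post the question to Slack, then
// poll the thread until the user replies. Channel id and helpers are illustrative.
import { WebClient } from "@slack/web-api";

const slack = new WebClient(process.env.SLACK_BOT_TOKEN);
const CHANNEL = process.env.AGENT_CHANNEL_ID!; // wherever the Agent talks to the user

async function handleAskTextQuestion(args: {
  question: string;
  multiline?: boolean;
  max_length?: number;
}) {
  // Post the question where the user is already working with the Agent.
  const posted = await slack.chat.postMessage({
    channel: CHANNEL,
    text: `:question: ${args.question}`,
  });

  // Wait for the first human reply in the question's thread.
  const answer = await waitForThreadReply(CHANNEL, posted.ts!, args.max_length ?? 1000);
  return { success: true, answer };
}

// Simple polling loop for illustration; a production server would more likely
// subscribe to Slack events (Events API or Socket Mode) instead of polling.
async function waitForThreadReply(channel: string, threadTs: string, maxLength: number) {
  for (;;) {
    const thread = await slack.conversations.replies({ channel, ts: threadTs });
    const reply = thread.messages?.find((m) => m.ts !== threadTs && !m.bot_id && m.text);
    if (reply?.text) return reply.text.slice(0, maxLength);
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}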
3. ask-select-question: The Selection Question Tool
A tool that makes users choose from predefined options. Used when clear choices are needed, like setting priorities or approval/rejection.
{
name: "ask-select-question",
description: "Default when uncertainty can be resolved by choosing from known options.",
inputSchema: {
type: "object",
properties: {
question: {
type: "string",
description: "The question to ask the user"
},
options: {
type: "array",
items: {
type: "object",
properties: {
text: { type: "string" },
value: { type: "string" }
}
}
}
},
required: ["question", "options"]
}
}
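For illustration only, here is the kind of argument object the LLM might pass when asking for a priority decision; the question and options are invented for this example.

// Hypothetical ask-select-question arguments for a priority decision.
const exampleSelectArgs = {
  question: "Which priority should this issue get?",
  options: [
    { text: "High - block the next release on it", value: "high" },
    { text: "Medium - schedule it normally", value: "medium" },
    { text: "Low - put it in the backlog", value: "low" },
  ],
};

On the Slack side, options like these can be rendered as interactive buttons, with the chosen value returned to the LLM.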
Here's How the Complete Flow Works
- LLM encounters uncertain information while working (e.g., can't access Notion document)
- LLM calls the report-assumption tool → "I'm about to assume this"
- Tool returns the forced instruction "use the question tool"
- LLM receives the message and calls ask-text-question or ask-select-question
- Question is sent to user via Slack
- User responds in Slack
- Response goes to LLM, which proceeds with accurate information
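To make the round trip concrete, here is a hypothetical trace of the tool calls for the Notion example; the assumption, question, and answer texts are invented for illustration.

// Hypothetical trace of one assumption-to-question round trip.
//
// 1. The LLM reports the assumption it was about to make:
//    report-assumption({ assumption_summary: "Assuming the spec requires OAuth login" })
//    -> { success: true, message: "Do not proceed with the assumption. ...", assumption_token: "..." }
//
// 2. Following the returned instruction, it asks instead of guessing:
//    ask-text-question({ question: "I can't access the Notion spec. Which login methods are required?" })
//    -> question posted to Slack; the handler blocks until the user replies
//
// 3. The user answers in the thread ("OAuth plus email/password"), the handler
//    returns { success: true, answer: "OAuth plus email/password" }, and the LLM
//    continues planning with the confirmed requirement instead of its guess.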
Results: Work Plan Rejections Reduced by Over 50%
After applying this system, we saw clear changes.
Quantitatively: Cases where we had to reject work plans due to incorrect assumptions decreased by over 50% by our estimate. The accuracy of the Agent's plans improved dramatically, and the time spent reviewing deliverables decreased.
Qualitatively: The Agent's behavior pattern itself changed. Previously, it would speculate "it's probably like this" for inaccessible information, but now it explicitly asks.
In the Notion document example mentioned earlier, after applying the assumption prevention tool, the Agent now explicitly asks in Slack for the missing specs and requirements instead of proceeding on guesses.
This change meant more than just improved accuracy. Trust developed between the Agent and users. Knowing "it verifies rather than making arbitrary decisions when uncertain," users came to trust the deliverables.
Prompts alone cannot control all LLM behavior, especially in situations requiring subtle judgment calls like whether to assume. The two-stage mechanism of explicitly toolifying the act of assuming to catch it, then forcing a question, proved genuinely effective in practice.