Guide//6 MIN READ

Agent Called a Tool You Didn't Send? Fix Recall Misses

Per-turn tool routing can drop a tool the model still needs. How recall@K catches misses, how session recovery responds, and what Orqen can — and cannot — fix.

Orqen Team

orqen.app

Your agent asked for update_invoice. The model tried to call it. But your router only forwarded search_invoices and list_customers. The tool call failed — or the model hallucinated a workaround. The user sends the same message again. Turn 48 begins.

This is a recall miss: the optimization layer removed a tool the model still needed for this turn. Per-turn routing saves tokens, but when recall drops below 1.0, agents loop, users retry, and the savings evaporate into wasted upstream calls.

What a recall miss looks like

Tool routing narrows a large catalog to a small subset each turn. Orqen might receive 51 MCP tools and forward 4 that match the current intent. That is usually correct — most turns need a handful of tools, not the full catalog.
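As a rough illustration of per-turn narrowing, the sketch below scores each tool by word overlap with the user message and keeps the top K. The `route_tools` helper, the overlap scorer, and the sample catalog are all hypothetical stand-ins; nothing here describes how Orqen actually scores tools.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route_tools(message: str, catalog: dict[str, str], k: int) -> list[str]:
    """Return the k tool names whose name and description share the most words with the message.

    Toy scorer for illustration only; a real router would likely use embeddings or a reranker.
    """
    query = tokenize(message)
    scored = []
    for name, description in catalog.items():
        tool_words = tokenize(name.replace("_", " ") + " " + description)
        scored.append((len(query & tool_words), name))
    # Highest overlap first; ties are broken alphabetically so the result is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Hypothetical four-tool catalog (name -> description).
catalog = {
    "search_invoices": "Search invoices by customer or date",
    "update_invoice": "Update fields on an existing invoice",
    "list_customers": "List every customer in the account",
    "get_weather": "Get the weather forecast for a city",
}
forwarded = route_tools("search invoices by customer", catalog, k=2)
```

With this toy scorer, `update_invoice` is pruned for the search request, which is fine for that turn and becomes a problem only if the next turn needs it.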

A recall miss happens when the model's actual tool call is not in the forwarded set:

  • HTTP 200 with a bad tool call. The model picks a tool name that was pruned. Some providers return a validation error; others let the model invent arguments for a tool it cannot execute.
  • Wrong-tool workaround. The model uses a nearby tool (search_invoices instead of update_invoice) and produces a plausible but wrong answer.
  • User retry. The user re-sends "no, update it" — a signal that the previous turn failed even if no explicit error surfaced.

Recall misses are different from upstream errors. A 503 from your provider is not Orqen's routing fault. Session recovery classifies error types separately so widening only fires when pruning likely caused the failure.
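One way to picture that separation is to classify each turn before deciding whether to widen. The `TurnOutcome` names, the `classify_turn` helper, and the status-code rule below are illustrative assumptions, not Orqen's internal logic.

```python
from enum import Enum


class TurnOutcome(Enum):
    OK = "ok"
    RECALL_MISS = "recall_miss"        # pruning likely caused the failure
    UPSTREAM_ERROR = "upstream_error"  # provider failure, not a routing fault


def classify_turn(status: int, called: list[str], forwarded: list[str]) -> TurnOutcome:
    """Classify one turn from the HTTP status and the tool names involved."""
    if status >= 500:
        # For example a 503 from the provider: widening the window would not help.
        return TurnOutcome.UPSTREAM_ERROR
    if any(name not in forwarded for name in called):
        # The model asked for a tool that was pruned this turn.
        return TurnOutcome.RECALL_MISS
    return TurnOutcome.OK
```

Only the `RECALL_MISS` outcome would feed the widening logic; an `UPSTREAM_ERROR` would be retried or surfaced without touching the routing window.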

Why agents loop after a miss

Without recovery, a recall miss poisons the next several turns:

  1. The model still "remembers" it wanted update_invoice from context, but the tool is absent from the schema list.
  2. The user clarifies ("now update it") — but if routing only reads the last message, the router may still miss the dependency on the prior get_invoice call.
  3. Each retry resends a large history plus the full tool catalog upstream, burning tokens without progress.

The fix is not "never prune." It is to measure recall, recover fast, and calibrate K so that misses are rare and self-healing when they do happen. See why multi-turn routing context matters for the follow-up misroute pattern.

Measuring routing with recall@K

Orqen computes recall@K on every tool-using response: the fraction of tools the model actually called that were present in the pruned set forwarded upstream.

# recall@K = |called_tools ∩ pruned_tools| / |called_tools|
#
# Example: model called ["get_weather", "format_report"]
#          Orqen forwarded ["get_weather", "search_files", "format_report"]
# recall@K = 2/2 = 1.0  ✓
#
# Example: model called ["update_invoice"]
#          Orqen forwarded ["search_invoices", "list_customers"]
# recall@K = 0/1 = 0.0  → recall miss

| recall@K | Meaning | Typical action |
|----------|---------|----------------|
| 1.0 | All called tools were forwarded | Healthy, keep K |
| 0.5 | Half of called tools were missing | Recovery widens next turn |
| 0.0 | Every called tool was pruned | Strong recovery + dashboard alert |
| NULL | No tool calls this turn | Not scored |

Recall@K is stored per request in the dashboard alongside tools_in, tools_out, and the list of tools that were called. Aggregated over a session, it tells you whether your routing window (K) is too aggressive for that workflow.
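The formula translates directly into a few lines of Python. This is a minimal sketch: the function names are ours, and averaging the per-turn scores is only one assumed way to aggregate over a session.

```python
from typing import Optional


def recall_at_k(called: list[str], forwarded: list[str]) -> Optional[float]:
    """Fraction of called tools that were in the forwarded set; None when no tools were called."""
    called_set = set(called)
    if not called_set:
        return None  # not scored, mirroring the NULL case
    return len(called_set & set(forwarded)) / len(called_set)


def session_recall(turns: list[tuple[list[str], list[str]]]) -> Optional[float]:
    """Mean recall@K over the scored turns of a session; unscored turns are skipped."""
    scores = [s for s in (recall_at_k(c, f) for c, f in turns) if s is not None]
    return sum(scores) / len(scores) if scores else None


# Each turn is a (called_tools, forwarded_tools) pair.
turns = [
    (["get_weather", "format_report"], ["get_weather", "search_files", "format_report"]),
    (["update_invoice"], ["search_invoices", "list_customers"]),
    ([], ["search_invoices"]),  # no tool calls, so this turn is not scored
]
```

A session average well below 1.0 for a given workflow is the signal that K is too tight for it.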

Session recovery: widen, boost, retry

When Orqen detects a recall miss, it writes short-lived session signals. The next turn's optimization plan consumes them:

# After a recall miss, Orqen stores short-lived session signals:
#   which tool was missed
#   which tools were pruned at error time
#
# Next turn's plan may:
#   widen the routing window
#   boost previously removed tools
#   run extra intent analysis after repeated misses
#   keep more raw history while recovering

  • Wider routing window. Borderline tools re-enter the candidate set. Repeated misses escalate the widening.
  • Tool boost. The missed tool and tools pruned at error time get score bonuses on the next turn.
  • Extra intent analysis. After repeated pruning-related errors, Orqen may run enrichment to disambiguate vague follow-ups.
  • Conservative compression. Aggressive context assembly pauses while the session recovers — you keep more raw history until routing stabilizes.
  • Decay on success. After consecutive successful turns, recovery signals fade so token savings resume.

Recovery is automatic. No SDK changes required. Route through Orqen, run a real multi-tool session, and watch recall@K in Usage — misses should trigger visible widening on the next turn.
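The widen, boost, and decay behaviors can be sketched as a small piece of per-session state. Everything in this block is hypothetical: the `RecoverySignals` class, the step size, the cap, and the two-turn decay threshold are placeholder choices, not Orqen's real values.

```python
from dataclasses import dataclass, field


@dataclass
class RecoverySignals:
    """Short-lived per-session state written after a recall miss (hypothetical shape)."""
    miss_count: int = 0
    success_streak: int = 0
    boosted: set[str] = field(default_factory=set)

    def record_miss(self, missed: list[str], pruned: list[str]) -> None:
        """Remember the missed tool and the tools pruned at error time."""
        self.miss_count += 1
        self.success_streak = 0
        self.boosted.update(missed, pruned)

    def record_success(self, decay_after: int = 2) -> None:
        """After enough consecutive successes, clear the signals so normal pruning resumes."""
        self.success_streak += 1
        if self.success_streak >= decay_after:
            self.miss_count = 0
            self.boosted.clear()

    def window(self, base_k: int, step: int = 2, max_k: int = 20) -> int:
        """Routing window for the next turn: wider after each repeated miss, capped at max_k."""
        return min(base_k + step * self.miss_count, max_k)
```

For example, with a base K of 4, one miss widens the next turn to 6 and a second miss to 8; two clean turns in a row bring it back to 4.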

Fail-open when routing is uncertain

Orqen's default posture is fail-open on infrastructure:

  • If session storage is unavailable, recovery returns empty signals — requests proceed normally, not blocked.
  • If the embedder or reranker times out, Stage 1 scores still forward a pruned set; the request never hard-fails because routing hiccuped.
  • Small tool sets pass through untouched — no risk of over-pruning a 3-tool agent.
  • Free-tier passthrough (monthly savings cap hit) disables optimization but still forwards requests — agents keep running.

Fail-open means optimization is best-effort acceleration, not a single point of failure. It does not mean Orqen sends the full catalog on every turn — pruning still runs. Recovery widens the window after a measured miss, not preemptively.
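A minimal sketch of the first fail-open case, assuming a hypothetical session store with a `get` method: if the store is unreachable, the lookup returns empty signals and the request continues with normal routing. The store classes and the signal dictionary shape are invented for this example.

```python
def load_signals_fail_open(session_id: str, store) -> dict:
    """Return recovery signals for a session, or empty signals if the store is unavailable."""
    try:
        return store.get(session_id) or {}
    except (ConnectionError, TimeoutError):
        # Fail open: no signals means the request proceeds with ordinary pruning.
        return {}


class MemoryStore:
    """In-memory stand-in for session storage (hypothetical)."""

    def __init__(self, data: dict):
        self._data = data

    def get(self, session_id: str):
        return self._data.get(session_id)


class DownStore:
    """Simulates session storage that is unreachable."""

    def get(self, session_id: str):
        raise ConnectionError("session storage unavailable")


healthy = MemoryStore({"s1": {"missed": ["update_invoice"]}})
```

The caller never sees the storage failure; it only loses the recovery hints for that turn.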

Measuring routing quality over time

Recall@K measures what happened after the fact. Orqen also stores privacy-preserving routing metadata — candidate tool names, policy version, alternate variants — so you can compare policies offline without logging user prompts.

Pro customers can opt into shadow comparison calls that run after the response has been delivered, so they add no latency for the user. Those calls help calibrate routing; they are an internal quality signal, not something your agent depends on at runtime.

Together, live recall@K + session recovery + offline eval form a closed loop: measure misses, heal the session, tune descriptions and routing policy over time. For tool sprawl context, see MCP Gave Your Agent 50 Tools.

Check recall on your agent

If you route tools per turn and see user retries or mystery tool errors:

  1. Create a free Orqen account and point your SDK at https://api.orqen.app.
  2. Run a multi-tool session with your full MCP or function catalog.
  3. Open Usage and filter for requests with recall_at_k below 1.0 — note which tools were called vs forwarded.
  4. Retry the failing turn and confirm K widens (more tools_out on the recovery turn).
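If you copy the affected requests out of Usage, step 3 can be scripted. The sketch below assumes each request is a dictionary with `request_id`, `recall_at_k`, `called`, and `forwarded` keys; only `recall_at_k` is a name taken from this post, and the rest of the record shape is a guess for illustration.

```python
def find_misses(records: list[dict]) -> list[dict]:
    """List requests with recall_at_k below 1.0 and the tools that were called but not forwarded."""
    misses = []
    for rec in records:
        score = rec.get("recall_at_k")
        if score is None or score >= 1.0:
            continue  # unscored or healthy turn
        missing = sorted(set(rec["called"]) - set(rec["forwarded"]))
        misses.append(
            {"request_id": rec["request_id"], "recall_at_k": score, "missing": missing}
        )
    return misses


# Hypothetical records, hand-written to match the invoice example.
records = [
    {"request_id": "r1", "recall_at_k": 1.0,
     "called": ["search_invoices"], "forwarded": ["search_invoices", "list_customers"]},
    {"request_id": "r2", "recall_at_k": 0.0,
     "called": ["update_invoice"], "forwarded": ["search_invoices", "list_customers"]},
    {"request_id": "r3", "recall_at_k": None, "called": [], "forwarded": ["search_invoices"]},
]
misses = find_misses(records)
```

Tools that show up repeatedly in `missing` are the first candidates for better descriptions or a wider K.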

Tagged: tool-calling, agent-optimization, routing, recall, mcp
Orqen Team

We build the optimization layer for tool-heavy LLM agents. Our goal is to make agent costs predictable as your tool set grows.

Try Orqen free

250K saved tokens per month. Free forever. Two-line integration.

See your savings in the dashboard within seconds of your first request.