Core Concepts
Payload optimization
Orqen optimizes the full agent payload before it reaches the model. That includes tools, schemas, tool results, history, images, model choice, reconstruction, validation, and recovery.
One intent-aware plan coordinates every stage
Before forwarding a request, Orqen builds one request plan. The same plan guides tool routing, compression, summarization, reconstruction, validation, and recovery so those stages stay aligned instead of making isolated decisions.
What happens on a request
Snapshot and understand the request
Orqen snapshots the original payload, reads the current goal, and optionally enriches high-value requests before changing anything.
Decide the right cleanup level
Orqen chooses how much history to keep, when to compress, how many tools to forward, how to trim schemas, and how strict validation should be.
Optimize the whole payload
Orqen deduplicates prompts, compresses images and tool results, manages hot/warm/cold history, routes relevant tools, and trims schemas after routing.
Rebuild and validate
Orqen assembles the final model-facing request, checks critical terms and tool schemas, and restores context if validation fails.
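The four steps above can be read as a small pipeline driven by one plan object. The following is a minimal sketch under stated assumptions, not Orqen's implementation: `RequestPlan`, `build_plan`, `validate`, `optimize`, the field names, and the numeric limits are all hypothetical and chosen only to show how a single plan can steer every stage.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestPlan:
    """Hypothetical plan shared by every stage (field names are assumptions)."""
    keep_recent_turns: int
    max_tools: int


def build_plan(payload: dict) -> RequestPlan:
    # Assumed heuristic: long conversations keep fewer turns verbatim.
    turns = len(payload.get("messages", []))
    return RequestPlan(keep_recent_turns=4 if turns > 20 else turns, max_tools=3)


def validate(candidate: dict, snapshot: dict) -> bool:
    # Assumed check: the most recent message must survive optimization.
    original = snapshot.get("messages", [])
    kept = candidate.get("messages", [])
    return bool(original) and bool(kept) and kept[-1] == original[-1]


def optimize(payload: dict) -> dict:
    snapshot = copy.deepcopy(payload)   # 1. snapshot before changing anything
    plan = build_plan(payload)          # 2. decide the cleanup level once
    optimized = dict(payload)           # 3. optimize using the shared plan
    optimized["messages"] = payload.get("messages", [])[-plan.keep_recent_turns:]
    optimized["tools"] = payload.get("tools", [])[: plan.max_tools]
    if not validate(optimized, snapshot):
        return snapshot                 # 4. restore the original if validation fails
    return optimized
```

Because every stage reads the same `plan`, trimming history and trimming tools cannot drift apart, and the snapshot gives the validation step something to fall back to.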
Intent-aware tool routing
Tool routing is one part of the payload optimization layer. Rather than relying on text similarity alone, Orqen interprets the current request and builds a frame such as:
{
  "domain": "weather",
  "action": "forecast",
  "slots": { "location": "Sittingbourne Kent" },
  "side_effect_allowed": false,
  "previous_tool_error": false,
  "confidence": 0.73
}
That frame is matched against tool capability cards derived from function names, descriptions, schemas, required inputs, and optional routing examples.
{
  "type": "function",
  "function": {
    "name": "open_meteo_weather",
    "description": "Get real weather forecast for a city.",
    "x-orqen-examples": [
      "weather in London",
      "forecast for Sittingbourne Kent"
    ],
    "parameters": {
      "type": "object",
      "properties": { "city": { "type": "string" } },
      "required": ["city"]
    }
  }
}
Adaptive K
Orqen chooses how many tools to forward based on confidence, risk, and recovery signals:
| Situation | Tools forwarded | Notes |
| --- | --- | --- |
| Crisp single intent | 1 tool | Example: list files, weather in London. |
| Moderate confidence | 2-3 tools | Enough room for close alternatives. |
| Multi-step or failed retry | up to 4 tools | Protects recall when the agent is recovering. |
| Side effects | minimum 3 tools | Write/send/execute operations are widened unless confidence is very high. |
| Timeout or error | all tools | Fail-open behavior keeps customer requests reliable. |
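The table above can be expressed as a small decision function. This is a sketch only: the function name `choose_k`, its parameters, and every numeric threshold (0.85, 0.6, 0.95) are assumptions made for illustration, since the source describes the bands qualitatively and does not publish cut-off values.

```python
def choose_k(
    total_tools: int,
    confidence: float,
    *,
    multi_step: bool = False,
    previous_tool_error: bool = False,
    side_effect: bool = False,
    router_failed: bool = False,
) -> int:
    """Pick how many tools to forward (all thresholds are assumed values)."""
    if router_failed:
        # Timeout or error in routing: fail open and forward everything.
        return total_tools
    if multi_step or previous_tool_error:
        k = 4                      # recovering agents get the widest set
    elif confidence >= 0.85:       # assumed "crisp single intent" cut-off
        k = 1
    elif confidence >= 0.6:        # assumed "moderate confidence" band
        k = 2
    else:
        k = 3
    if side_effect and confidence < 0.95:
        # Side effects are widened unless confidence is very high.
        k = max(k, 3)
    return min(k, total_tools)     # never ask for more tools than exist
```

With these assumed thresholds, the example frame shown earlier (`"confidence": 0.73`, no side effects, no previous error) would forward two tools.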
Conversation history management
For multi-turn agents, prompt size grows with each exchange. Orqen manages history in three tiers:
| Tier | Handling |
| --- | --- |
| Hot (recent turns) | Kept verbatim in the forwarded payload. |
| Warm (older turns) | Compressed and deduplicated; semantically equivalent content is collapsed. |
| Cold (early turns) | Summarized and replaced by a compact LLM-generated summary of the chunk. |
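As a rough illustration of the three tiers, the sketch below partitions a list of turns by recency. The function name `split_tiers` and the tier sizes (6 hot turns, 20 warm turns) are assumptions for this example; the source does not state where the tier boundaries fall.

```python
def split_tiers(turns: list, hot: int = 6, warm: int = 20) -> tuple[list, list, list]:
    """Return (cold, warm, hot) slices of the history; sizes are assumed."""
    hot_start = max(len(turns) - hot, 0)     # last `hot` turns stay verbatim
    warm_start = max(hot_start - warm, 0)    # the `warm` turns before them get compressed
    return turns[:warm_start], turns[warm_start:hot_start], turns[hot_start:]
```

A short conversation falls entirely into the hot tier, so nothing is compressed or summarized until the history outgrows it.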
For very long sessions (100+ turns), summaries are merged in a hierarchical pass in which each merge call handles exactly two summaries. As a result, no single LLM call sees unbounded input, and early context is never silently truncated.
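The pairwise merge can be sketched as a level-by-level reduction. This is an illustrative outline, not Orqen's code: `merge_hierarchically` is a hypothetical name, and `merge_pair` stands in for whatever LLM call combines two summaries into one.

```python
from typing import Callable


def merge_hierarchically(
    summaries: list[str],
    merge_pair: Callable[[str, str], str],
) -> str:
    """Reduce summaries to one, merging exactly two per call."""
    if not summaries:
        return ""
    level = list(summaries)
    while len(level) > 1:
        next_level = []
        # Merge neighbours two at a time so each call sees only two inputs.
        for i in range(0, len(level) - 1, 2):
            next_level.append(merge_pair(level[i], level[i + 1]))
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # odd summary carries to the next level
        level = next_level
    return level[0]
```

For n summaries this makes n - 1 merge calls across roughly log2(n) levels. Input size per call stays bounded only if `merge_pair` itself returns a summary of bounded length, which this sketch assumes.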
Learning loop
Orqen stores privacy-safe optimization traces: detected intent, selected tools, top candidates, recall, compression strategy, reconstruction strategy, shadow proactive rebuild, and recovery signals. This lets Orqen calibrate the system from real data without storing raw prompts.
x-orqen-tools-input: 51
x-orqen-tools-output: 1
x-orqen-prune-ratio: 1/51
x-orqen-routing: semantic
Best practice
Write tool descriptions with explicit scope, keep required schema fields accurate, and preserve meaningful IDs/URLs in user-visible turns. Add x-orqen-examples when two tools have similar names or overlapping domains.
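One way to apply this advice is to lint tool definitions before sending them. The check below is a hypothetical helper (`lint_tool` is not part of Orqen); it assumes the tool format shown in the earlier `open_meteo_weather` example and that `x-orqen-examples` is a list of strings, as it appears there.

```python
def lint_tool(tool: dict) -> list[str]:
    """Return a list of problems found in one tool definition."""
    problems = []
    fn = tool.get("function", {})
    if not fn.get("description", "").strip():
        problems.append("missing description")
    params = fn.get("parameters", {})
    props = params.get("properties", {})
    # Every required field should actually be declared in properties.
    for name in params.get("required", []):
        if name not in props:
            problems.append(f"required field '{name}' is not defined in properties")
    examples = fn.get("x-orqen-examples", [])
    if not isinstance(examples, list) or not all(isinstance(e, str) for e in examples):
        problems.append("x-orqen-examples should be a list of strings")
    return problems
```

An empty result means the definition passes these basic checks; it does not guarantee that the description is well scoped, which still needs human review.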