Orqen Docs

Introduction

Orqen is an OpenAI-compatible proxy that automatically removes irrelevant tools from your agent's context before each LLM call. Less noise, fewer tokens, better accuracy.

What is Orqen?

When your LLM agent has 30 tools and sends all of them on every API call, two things happen: you pay for tokens that don't help the response, and the LLM gets confused by irrelevant options and occasionally picks the wrong tool.

Orqen sits between your agent and your LLM provider. On each request, it reads the user's message, scores each tool by relevance, and forwards only the relevant subset to the LLM. The rest of the request — messages, model name, streaming flag — passes through unchanged.
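To make the idea concrete, here is a toy relevance filter. This is illustrative only, not Orqen's actual scoring algorithm: it ranks tools by keyword overlap between the user's message and each tool's name and description, then keeps the top matches.

```python
# Illustrative only: NOT Orqen's actual scoring algorithm.
# Ranks OpenAI-format tool definitions by keyword overlap with the
# user's message and keeps the best-scoring subset.

def prune_tools(user_message, tools, keep=5):
    """Return up to `keep` tools whose name/description best match the message."""
    words = set(user_message.lower().split())

    def score(tool):
        fn = tool["function"]
        text = (fn["name"] + " " + fn.get("description", "")).lower()
        # Count how many message words appear in the tool's text.
        return sum(1 for w in words if w in text)

    ranked = sorted(tools, key=score, reverse=True)
    # Drop tools with no overlap at all, cap at `keep`.
    return [t for t in ranked[:keep] if score(t) > 0]

tools = [
    {"function": {"name": "get_weather", "description": "Get current weather for a city"}},
    {"function": {"name": "send_email", "description": "Send an email message"}},
    {"function": {"name": "query_db", "description": "Run a SQL query"}},
]
pruned = prune_tools("What is the weather in Paris?", tools)
```

A real implementation would use something more robust than substring matching (embeddings, for example), but the contract is the same: the `tools` array shrinks, everything else in the request is untouched.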

Live Bedrock test (examples/bedrock_multi_tool_agent.py): with 51 tools registered, the same weather question consumed 9,235 prompt tokens without pruning versus 1,605 with Orqen pruning across two model calls (~83% fewer).

How it fits into your stack

Orqen is OpenAI-compatible. If your agent uses the OpenAI Python or JavaScript SDK, or any tool that speaks the OpenAI chat completions format, it works through Orqen with no code changes beyond setting the base_url.
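A minimal sketch with the OpenAI Python SDK, assuming Orqen is running locally and exposes an OpenAI-compatible /v1 endpoint at port 8080 (the address is hypothetical; check your deployment for the real URL). The only change from a direct OpenAI setup is the base_url.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # hypothetical Orqen address
    api_key="sk-...",  # your own provider API key; Orqen forwards it upstream
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
    # ... your other tools; Orqen prunes this list before forwarding
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
```

The response comes back in the standard chat completions shape, so downstream handling of tool calls is unchanged.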

If your agent currently uses native Anthropic Messages or Bedrock Converse tool calls, the model can still run through Orqen, but you should map those provider-specific tool payloads into the OpenAI-compatible request shape first. See provider migration examples.
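As a sketch of that mapping: a native Anthropic Messages tool definition carries its JSON Schema under `input_schema`, while the OpenAI chat completions format nests the same schema under `function.parameters`.

```python
# Map a native Anthropic Messages tool definition into the
# OpenAI chat-completions tool shape.

def anthropic_tool_to_openai(tool):
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool["input_schema"],  # same JSON Schema, new location
        },
    }

anthropic_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

openai_tool = anthropic_tool_to_openai(anthropic_tool)
```

Bedrock Converse tool specs (`toolSpec` with `inputSchema.json`) translate the same way; only the field names differ.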

Compatible with:

  • OpenAI Python SDK
  • OpenAI Node.js SDK
  • LangChain
  • LlamaIndex
  • Haystack
  • Any OpenAI-compatible client

Supported LLM providers

Orqen forwards requests to your LLM provider using your own API keys. Supported providers include:

  • OpenAI (GPT-4o, GPT-4o mini, o1, o3-mini, …)
  • Anthropic (Claude 3.5 Sonnet, Claude Haiku, …)
  • AWS Bedrock (Claude, Llama, Titan, …)
  • Google Gemini (Gemini 1.5 Pro, Gemini 2.0 Flash, …)
  • Groq (Llama 3.3, Mixtral, …)
  • Mistral, Together AI, Fireworks, OpenRouter, Cohere