Model routing

Model routing lets your agent request `orqen/auto` instead of hardcoding a specific provider model. Orqen then chooses among your connected providers based on task complexity, model capability, price, latency, and the performance data observed for your account.

Routing strings

`orqen/auto`	Balanced routing. Simple tasks go to fast cheap models; complex tasks go to capable models.
`orqen/cheap`	Lowest-cost capable model from your enabled providers.
`orqen/fast`	Lowest-latency capable model based on observed performance.
`orqen/capable`	Prefer high-capability models for difficult reasoning or code tasks.

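from openai import OpenAI

# Point the OpenAI SDK at the Orqen endpoint.
client = OpenAI(
    api_key="sk-orq-YOUR_KEY",
    base_url="https://api.orqen.app/v1",
)

response = client.chat.completions.create(
    model="orqen/auto",  # routing string instead of a fixed provider model
    messages=[{"role": "user", "content": "Summarise this support thread."}],
    # tools=[...],  # optional: uncomment and supply your tool definitions
)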
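
If different parts of an agent have different priorities, the routing string can be chosen per call. The following is a minimal sketch that assumes only the four routing strings listed above; the priority labels and the `routing_string` helper are hypothetical names for illustration, not part of Orqen's API.

```python
# Hypothetical mapping from an application-level priority to a routing string.
# The keys are illustrative; only the values come from the routing strings table.
ROUTING_STRINGS = {
    "balanced": "orqen/auto",
    "cost": "orqen/cheap",
    "latency": "orqen/fast",
    "quality": "orqen/capable",
}


def routing_string(priority: str = "balanced") -> str:
    """Return the routing string for a priority, raising on unknown labels."""
    try:
        return ROUTING_STRINGS[priority]
    except KeyError:
        raise ValueError(f"unknown priority: {priority!r}") from None


# Usage: pass the result as the `model` argument of a chat completion call.
print(routing_string("cost"))  # orqen/cheap
```

Failing loudly on an unknown label keeps a typo from silently falling back to a different routing mode.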

How Orqen chooses

Task complexity

Short factual requests route differently from long reasoning, coding, or tool-heavy requests.

Provider availability

Only models from providers you connected and enabled in the dashboard are considered.

Capability tier

Models are grouped by capability, tool support, vision support, context window, and cost.

Performance feedback

Orqen tracks success rate, recall, latency, and usage for each customer and model.

Customer controls

Disable individual models or set cost constraints in the dashboard.
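
To make the interplay of these factors concrete, here is a toy sketch of a filter-then-rank selection. It is not Orqen's algorithm, which this page does not specify; the `Candidate` fields, the integer tier scale, the cost unit, and the `choose` function are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    model_id: str        # hypothetical model identifier
    enabled: bool        # whether the model is enabled in the dashboard
    tier: int            # assumed scale: higher means more capable
    cost_per_1k: float   # assumed unit: price per 1,000 tokens
    latency_ms: float    # observed latency


def choose(candidates, required_tier, max_cost=None, prefer="cost") -> Optional[Candidate]:
    """Filter by availability, capability, and cost cap, then rank by preference."""
    eligible = [
        c for c in candidates
        if c.enabled
        and c.tier >= required_tier
        and (max_cost is None or c.cost_per_1k <= max_cost)
    ]
    if not eligible:
        return None
    if prefer == "latency":
        return min(eligible, key=lambda c: c.latency_ms)
    return min(eligible, key=lambda c: c.cost_per_1k)


models = [
    Candidate("small-model", True, 1, 0.2, 300.0),
    Candidate("large-model", True, 3, 3.0, 900.0),
    Candidate("disabled-model", False, 3, 1.0, 500.0),
]
print(choose(models, required_tier=1).model_id)  # small-model
print(choose(models, required_tier=2).model_id)  # large-model
```

In this toy version a simple task (tier 1) lands on the cheapest enabled model, while a harder task (tier 2) skips both the small model and the disabled one.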

Dashboard controls

The Routing page shows enabled models, per-model recall, latency, call count, and performance insights. Models with enough data are marked calibrated. Disabled models are excluded from `orqen/*` routing but can still be called directly by model ID.

curl https://api.orqen.app/v1/chat/completions \
  -H "Authorization: Bearer sk-orq-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "orqen/cheap",
    "messages": [{"role": "user", "content": "Classify this ticket"}]
  }'

Operational rule

Use direct model names when compliance or benchmarking requires a fixed provider model. Use orqen/auto when you want Orqen to optimize cost, speed, and capability over time.
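
One way to apply this rule in code is to make the pinned model an explicit, optional setting and fall back to `orqen/auto` otherwise. This is a minimal sketch: `build_request` and its `pinned_model` parameter are hypothetical, and `provider-model-id` is a placeholder rather than a real model id.

```python
from typing import Optional


def build_request(prompt: str, pinned_model: Optional[str] = None) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        # Use the pinned model id when one is given; otherwise let Orqen route.
        "model": pinned_model or "orqen/auto",
        "messages": [{"role": "user", "content": prompt}],
    }


routed = build_request("Classify this ticket")
pinned = build_request("Classify this ticket", pinned_model="provider-model-id")
print(routed["model"])  # orqen/auto
print(pinned["model"])  # provider-model-id
```

The resulting dictionary can be passed as `client.chat.completions.create(**build_request(...))`, so a benchmark or compliance run only needs to set `pinned_model`.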

See model routing in the Chat API →