Oriora — AI model router: one API, best model per request

Pick your lane

Which lane is you?

The one thing Oriora always manages is the model choice. From there it's a dial: take just the pick and run the call yourself, or hand us the call too. You pay only for the layers you switch on.

You run the call

1 layer · just the pick

Your key stays on your own infrastructure. Ask us which model fits; we return the best-fit model plus ranked alternatives. You make the call yourself — we never see your key or your output, and your prompt only if you choose to have us classify it.

Pick this if you want maximum privacy or control, or already have your own call setup.

One flat fee per recommendation.
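
Because this lane returns ranked alternatives and leaves the call to you, fallback is yours to wire. A minimal sketch of one way to do it, assuming the alternatives can be treated as plain model identifiers and that run is your own function wrapping your vendor call; call_with_fallback, flaky_run, and the model names below are illustrative and not part of Oriora:

```python
from typing import Callable, Optional, Sequence, Tuple


def call_with_fallback(
    models: Sequence[str],
    run: Callable[[str], str],
    retries_per_model: int = 2,
) -> Tuple[str, str]:
    """Try each model in ranked order; return (model_used, output)."""
    last_error: Optional[Exception] = None
    for model in models:
        for _ in range(retries_per_model):
            try:
                return model, run(model)
            except Exception as exc:  # in real code, catch your vendor SDK's errors
                last_error = exc
    raise RuntimeError("every candidate model failed") from last_error


def flaky_run(model: str) -> str:
    # Hypothetical stand-in for your own vendor call; the first model always fails.
    if model == "vendor-a/model-1":
        raise TimeoutError("simulated outage")
    return f"answer from {model}"


ranked = ["vendor-a/model-1", "vendor-b/model-2"]  # best fit first, then alternatives
print(call_with_fallback(ranked, flaky_run))
```

The best-fit model gets its retries first, and the next alternative is tried only once those are used up.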

We run the call

2 layers · pick + run

Hand us your vendor key. One request — we pick the best model and run it, with caching, fallback, and retries, then return the output. Point your tool at one endpoint.

Pick this if you want the least work — one endpoint and we do the rest.

Two flat fees per call.

Both run on the same scoring brain — quality, cost, and latency across every vendor, within your preferences. You set two things: how many layers you hand us, and whether you declare the task or let us classify it.

Two optional layers can also fire on top — each a flat $0.001, only when your task triggers them, and both switch off from your account: a verify layer that has a second model double-check the answer on a genuinely hard task, and rent the brain, which draws on Oriora's knowledge base when your task matches it. See Rent the Brain.
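
The layer pricing works out as simple arithmetic. A rough sketch, assuming every layer is billed at the same flat $0.001 (the figure quoted for the optional layers here and for a recommendation further down the page); call_fee and its parameters are illustrative names, and vendor token costs are not included:

```python
from decimal import Decimal

# Assumption: each layer is billed at the same flat rate quoted on this page.
LAYER_FEE_USD = Decimal("0.001")


def call_fee(run_call: bool, verify_fired: bool = False, brain_fired: bool = False) -> Decimal:
    """Estimate the Oriora fee for one request (vendor token costs excluded)."""
    layers = 1                          # the pick is always on
    layers += 1 if run_call else 0      # "We run the call" adds the run layer
    layers += 1 if verify_fired else 0  # optional verify layer, only when triggered
    layers += 1 if brain_fired else 0   # optional rent-the-brain layer, only when triggered
    return layers * LAYER_FEE_USD


print(call_fee(run_call=False))                    # pick only
print(call_fee(run_call=True))                     # pick + run
print(call_fee(run_call=True, verify_fired=True))  # pick + run + verify
```

Decimal keeps the sums exact, which matters once thousands of sub-cent fees are added up.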

Don't want to wire anything?

Quick start: Oriora routing in your terminal

The easiest way to watch Oriora pick the model for you — a capable AI terminal on your own key. New to this? Two quick pastes: install Claude Code, then add the Oriora proxy. Already in a terminal? Skip straight to the proxy.

First time? Open a terminal on your Mac — 3 steps

1. Press ⌘ Command + Space
2. Type Terminal and press Enter
3. Hit Copy on a command below, paste it in, press Enter

Step 1 · Haven't used a terminal before? Install Claude Code

Claude Code is Anthropic's terminal AI. Install it straight from Anthropic — this is their official installer, so you always get the latest version. Open Terminal (Cmd+Space, type Terminal, Enter), paste this, press Enter:

curl -fsSL https://claude.ai/install.sh | bash

⏳

Give it a minute. This downloads about 60 MB — for up to a minute the screen may sit quiet and look frozen, but it isn't. Leave the window open and it finishes on its own.

Terminal

$ curl -fsSL https://claude.ai/install.sh | bash
Setting up Claude Code...

Checking installation status...
Installing Claude Code native build latest...
Setting up launcher and shell integration...

✔ Claude Code successfully installed!

  Version: 2.1.204
  Location: ~/.local/bin/claude

✅ Installation complete!

Already have Claude Code — or another CLI, or a Claude/ChatGPT subscription? Skip this and go straight to Step 2.

Step 2 · Add the Oriora proxy — any terminal, any subscription

Already have a terminal AI, or a Claude/ChatGPT subscription? Don't reinstall anything. This is a tiny local router: drop your API keys in once, then point any tool at it — Claude Code, Codex, any OpenAI-compatible CLI — and reach every vendor from one place. Calls run on your own keys, straight to your vendors.

What it does: add your Oriora key and set the model to oriora-auto — Oriora picks the best model for each task, double-checks the hard answers with a second model, and adds proven advice. You just work; it routes. (Oriora reads the task to route and stores nothing — the call itself runs on your key.)

Who it's for: anyone using AI in a terminal — not just developers. With these tools a terminal is about as simple as a chat: you bring the keys, Oriora handles the rest.

1 · Set it up (once) — open Terminal (Cmd+Space, type Terminal, Enter), paste this, hit Enter. It asks for your keys (or one OpenRouter key) — paste them in:

curl -fsSL https://orioralabs.com/terminal/get-proxy.sh | bash

Terminal

$ curl -fsSL https://orioralabs.com/terminal/get-proxy.sh | bash

Oriora local router — add your AI key(s).
  Paste an AI key (then press Enter): ••••••••••••••••
  Detected: deepseek
  ✓ deepseek key added

  Connect Oriora’s brain (required).
  Paste your Oriora key (sk_oriora_...): ••••••••••••••••

✓ Oriora is set up.  Keys loaded: deepseek
  ✓ Brain connected — best model per task + verify + advice.

Runs on macOS.

2 · Two lanes, side by side — pick per session (open a new tab so the command loads):

Your subscription CLI — works as normal. Whatever you already run on a plan (claude on Max, codex on Plus) stays completely untouched. We never touch it.

The Oriora proxy — any CLI, on your API keys. Routes through your own keys with oriora-auto (best model per task + verify + advice). Claude Code user? Just type oriora — it's ready. Any other tool (Codex, etc.)? Point its base URL at 127.0.0.1:8787.

Step 3 · Optional — give it more power

One free pack, three add-ons for oriora: Memory — remembers how you work. Screenshot reading (OCR) — pulls text from any image, offline. Computer use — sees the screen, clicks and types. Paste this, or just ask oriora to grab them:

curl -fsSL https://orioralabs.com/terminal/get-extras.sh | bash

Terminal

$ curl -fsSL https://orioralabs.com/terminal/get-extras.sh | bash

  Oriora extras — memory · screenshot OCR · computer use
  Free helpers for your terminal. Installing into ~/.oriora-proxy …

  ✓ Screenshot OCR   → ~/.oriora-proxy/bin/ocr
  ✓ Memory + skills  → ~/.oriora-proxy/claude-config/memory/
  ✓ Assistant rules  → loaded when you run oriora

  Computer use (click & type on screen) needs a tiny tool called cliclick.
  ✓ cliclick already installed — computer use is ready.

  Done. Next time you run oriora it can read screenshots, use its
  memory + skills, and (with cliclick) click & type on screen.

Installs into ~/.oriora-proxy/ — your normal claude is never touched. Free.

How much it asks before acting

Smart auto (default): gets on with the work, but checks before anything risky (installs, downloads, deleting files). Nothing to set up.
Full auto: runs a whole task without stopping to ask. Open with oriora --dangerously-skip-permissions. It is faster, but it can take actions you would not have chosen yourself.

Switch anytime — Shift+Tab.

What it does, plainly

1. Claude Code is Anthropic's terminal AI — you install it straight from Anthropic, not from us.
2. The Oriora proxy runs on your Mac and routes each job to the best model — on your own key. Your prompts go straight to your provider; we just pick the model.
3. Optional extras — a memory system, screenshot reading (OCR) and more — are free add-ons you can grab later; just ask your AI in the terminal.

Claude Code is a product of Anthropic — we only add the routing proxy. Not affiliated with or endorsed by Anthropic.

Prefer your own agent? (Hermes, OpenClaw, or any OpenAI tool)

A different setup from the terminal above: your vendor keys live on your Oriora account (not your Mac), and Oriora runs each call. The agent only needs your Oriora key, which is good if you'd rather not keep keys on your machine.

1. Install it and point it at Oriora. Copy a line, paste it into your terminal, and press Enter:

Hermes

curl -fsSL https://orioralabs.com/terminal/add-oriora-hermes.sh | bash

OpenClaw

curl -fsSL https://orioralabs.com/terminal/add-oriora-openclaw.sh | bash

2. It asks for your sk_oriora_ key; create one in Settings.
3. Add your vendor keys (DeepSeek, OpenAI, …) at Settings → Provider keys. That's where they live: on your Oriora account, never in the agent.
4. Use the agent. Oriora picks the best model per task and runs it on your key. Add more vendor keys anytime; it routes across all of them.

Any OpenAI-compatible agent or SDK works the same way — point its base URL at https://api.orioralabs.com/v1, use your Oriora key, model oriora-auto.

That's it — you're set

Open a new terminal window — a fresh Terminal on Mac, or a new PowerShell on Windows — so the command loads. Then type oriora and press Enter. Every job now runs on your own key, routed to the best model.

Terminal — new window

$ oriora
● Oriora is ready — routed on your key (oriora-auto: best model per task).

> type what you want to do…

Want more? Give it a memory, screenshot reading and computer use in Step 3 above — free.

One OpenRouter key

Oriora picks the best model for each job from our whole catalogue — every vendor, one key.

One vendor's key

Oriora picks the best of that vendor's models for each job. Either way, the picking is the value.

OpenAI-compatible

Works with any tool you already use

Oriora uses the same API shape as OpenAI. Any tool that accepts a custom base URL works today — set it to Oriora's endpoint, drop in your Oriora key, and the tool gets intelligent model routing on your own vendor keys. No code changes, no new SDK. Your vendor keys live on your Oriora account (Settings → Provider keys); the tool only needs your Oriora key.

Claude Code

Install it, add the proxy (both above), type oriora. An OpenRouter key routes across our full catalogue per request; a single vendor key uses that vendor. Docs →

Cursor

AI code editor. Add Oriora as a custom model (Settings → Models); Cursor chat routes through us. (Composer/agent + tab keep their own models.) Docs → · Watch ↗

Continue.dev

VS Code / JetBrains AI extension. Add Oriora in config.json (apiBase) → chat + inline edits, routed to the best model. Docs → · Watch ↗

LiteLLM

Building a stack? Add Oriora as a model in your LiteLLM proxy — everything behind it gets our routing. Docs → · Watch ↗

LangChain

ChatOpenAI(base_url="https://api.orioralabs.com/v1") — one line, every call routed. Docs → · Watch ↗

LlamaIndex

OpenAI(api_base="https://api.orioralabs.com/v1") — one line for any app or RAG, every call routed. Docs → · Watch ↗

Vercel AI SDK

createOpenAI({ baseURL: "https://api.orioralabs.com/v1" }) — drop-in for a web app, every call routed. Docs →

Open WebUI

Self-hosted ChatGPT-style app. Add Oriora as a connection → every message routes to the best model. Docs → · Watch ↗

Dify

Visual AI-app builder. Add Oriora as an OpenAI-compatible provider → use it in any app or workflow. Docs → · Watch ↗

Flowise

Visual LLM-flow builder. Point a ChatOpenAI node at Oriora → your flows route to the best model. Docs → · Watch ↗

Cloudflare AI Gateway

Add Oriora as a custom provider in your CF AI Gateway → your traffic routes through us to the best model. Docs →

Any OpenAI SDK

Python / Node / Go / Rust — set the base URL to Oriora, keep your code. Every call routed. Docs →

The pattern

base_url = "https://api.orioralabs.com/v1"
api_key  = "<your-oriora-key>"
# Everything else stays the same — model name, messages, stream, all of it.

Or — Oriora as an MCP server

Plug Oriora into Claude Desktop, Cursor, or any MCP client and your agent gains three tools on one hosted server, all on your single Oriora key: recommend_model (the best model per task), recommend_approach (the proven way to do it), and verified_answer (hand over the whole question — answered, cross-checked, saved to your own storage). Your agent runs every call on your own keys; Oriora never sees your prompt or vendor key. Flat per-layer fee, billed only when it delivers. Nothing to install or self-host — hosted at api.orioralabs.com/mcp.

{
  "mcpServers": {
    "oriora": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://api.orioralabs.com/mcp",
               "--header", "Authorization: Bearer ${ORIORA_API_KEY}"]
    }
  }
}

Get your Oriora key at Settings → Oriora API keys. recommend_model returns the same fields as /api/select (model · provider · alternatives); recommend_approach + verified_answer are the brain (see Rent the Brain). What is MCP? →

And — run your account from your own AI

The same one key lets your AI manage the account for you, over REST or as MCP tools (account_status, get_preferences, set_preferences, list_vendor_keys): check your spend, change model preferences, toggle the brain, add or remove your BYOK vendor keys, set a budget. It can only spend the funds you've already deposited. For safety, three things stay with you, signed in on the website: minting or revoking keys, deleting the account, and adding money. An AI can't do those.

Oriora is an independent product — not affiliated with, partnered with, or endorsed by any of the tools listed above. We work with them because Oriora is OpenAI-compatible; anyone can point a compatible client at our endpoint.

Privacy

No prompt at rest. Anywhere.

Not a policy. Not a contractual add-on. The architecture itself has nowhere for your prompt to land.

Managed selection

Logs the task type, the model that ran, and the flat fee per request. That's it. Prompt content is never written to disk or any database.

Model gateway

Forwards your request to the model provider. Configured for zero prompt retention — nothing stored in transit.

Model provider

Runs inference and returns a response. Same as calling them directly — the routing layer adds no extra data surface.

Relevant for any privacy-conscious product where prompt content shouldn't pass through a third-party logging layer. Some routing tools store prompts by default and charge extra for zero-data-retention. We don't store them at all.

Powered by Oriora — every call is labeled

Each AI result can carry an honest, un-fakeable credit line — proof that Oriora routed the call and your key was never retained. It rides on the apps you build and spreads wherever Oriora ran — proof, not a logo.

Routed by Oriora

Bring Your Own Memory

Your key is yours. So is your memory.

Bring your own key — and bring your own memory. Your rules, your preferences, your way of working, kept in a file you own. Any Oriora app or agent reads it the moment you act, uses it for that one reply, then forgets it. One memory, everywhere you work — and we never store it.

It stays yours

Your memory lives in a file you own — your repo, your device. You edit the source; the next reply already has it. We keep no copy.

Read per call, stored nowhere

When you act, the memory is added to the prompt for that single call — then it is gone. Nothing of it is written to disk or any database.

One memory, every surface

The same memory follows you across every app and agent you use. Not per-app silos — one you, everywhere.
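
How Oriora apps inject the memory is not spelled out here, so this is only a sketch of the read-per-call idea from the client's side. It assumes the memory is a plain text file and that it is passed as a system message; with_memory and the file path are hypothetical:

```python
from pathlib import Path
from typing import Dict, List


def with_memory(messages: List[Dict[str, str]], memory_path: Path) -> List[Dict[str, str]]:
    """Return a new message list with the memory file prepended for one call."""
    if not memory_path.exists():
        return list(messages)  # no memory file: the call goes out unchanged
    memory = memory_path.read_text(encoding="utf-8").strip()
    if not memory:
        return list(messages)
    # Assumption: the memory travels as a system message ahead of the conversation.
    return [{"role": "system", "content": memory}, *messages]


# Hypothetical location; the file is re-read on every call, so edits apply immediately.
memory_file = Path.home() / ".my-memory.md"
outgoing = with_memory([{"role": "user", "content": "Refactor this module."}], memory_file)
print(len(outgoing))
```

The function builds a fresh list each time and keeps nothing between calls, so the only lasting copy of the memory is the file itself.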

Up in three steps.

01

Create an account

Sign up at orioralabs.com. Connect your vendor API keys — Oriora charges only its flat per-call fees.

02

Generate a key

Inside your account, generate an sk_oriora_... key. Yours in seconds.

03

Pass your task type

One POST request with taskType declared. Oriora handles everything from there.

Generate your API key →

Opens Settings → Oriora API keys (sign in required).

Client-side — get a model recommendation

Oriora returns the best-fit model; you make the call with your own vendor key. Your key never touches us. Flat $0.001 per recommendation.

curl https://api.orioralabs.com/api/select \
  -H "Authorization: Bearer sk_oriora_..." \
  -H "Content-Type: application/json" \
  -d '{"task_type": "coding_extra_hard"}'

# → { "model": "anthropic/claude-fable-5", "provider": "anthropic",
#     "native_model": "claude-fable-5", "alternatives": [...],
#     "task_type": "coding_extra_hard",
#     "orchestration": { "pattern": "quality-verify",
#                        "verifier_model": "deepseek/deepseek-v4-pro",
#                        "verify_prompt": "Review the answer above for correctness
#                          and completeness, then produce an improved, final version.
#                          Respond with only the final answer." } }
#
# Task tiers: coding → coding_hard → coding_extra_hard (same for reasoning/agentic).
# coding_extra_hard = premium model + a quality-verify recommendation. To use it:
# run the call on "model", then run a second pass on "verifier_model" (a different
# vendor) feeding it your prompt + the first answer + "verify_prompt" — that second
# model reviews and returns the corrected final answer. To disable the recommendation
# account-wide, toggle "Orchestration hints" off in Settings. Oriora runs no extra call.
#
# *_extra_hard is reached by declaration (as above) or by auto-classification when the
# classifier detects high-consequence signals (security vulnerabilities, whole-system
# scope, completeness requirements). orchestration is null on all other task types.
#
# "models" is optional — omit it and Oriora ranks across the full catalogue.
# Prefer not to name the task? Send "messages" instead of "task_type" and
# Oriora classifies the prompt in memory — stored nowhere (see above).

Discover task types (no auth): GET /api/select/task-types
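
The two-pass flow described in the comments above can be sketched in Python. The endpoint, field names, and verify prompt come from the example response; the helper names and the user/assistant/user message layout for the second pass are assumptions. Nothing here sends a request:

```python
import json
from typing import Any, Dict, List, Optional, Tuple

SELECT_URL = "https://api.orioralabs.com/api/select"


def build_select_request(api_key: str, task_type: str) -> Dict[str, Any]:
    """Describe the POST from the curl example (method POST, JSON body)."""
    return {
        "url": SELECT_URL,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_type": task_type}),
    }


def verifier_call(
    selection: Dict[str, Any], prompt: str, first_answer: str
) -> Optional[Tuple[str, List[Dict[str, str]]]]:
    """Return (verifier_model, messages) for the second pass, or None if no hint."""
    orchestration = selection.get("orchestration")
    if not orchestration:
        return None  # orchestration is null on all other task types
    # Assumed layout: your prompt, the first answer, then the verify prompt.
    messages = [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": first_answer},
        {"role": "user", "content": orchestration["verify_prompt"]},
    ]
    return orchestration["verifier_model"], messages


# Abbreviated copy of the example response above.
selection = {
    "model": "anthropic/claude-fable-5",
    "orchestration": {
        "pattern": "quality-verify",
        "verifier_model": "deepseek/deepseek-v4-pro",
        "verify_prompt": "Review the answer above for correctness and completeness, "
                         "then produce an improved, final version. "
                         "Respond with only the final answer.",
    },
}
second_pass = verifier_call(selection, "Audit this auth flow.", "First-pass answer.")
print(second_pass[0] if second_pass else "no verify pass recommended")
```

You run the first call on selection["model"] with your own vendor key, then send the returned messages to the verifier model the same way. Oriora makes neither call.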

Choosing the routing tier

Every request is classified into a routing tier. You can set the tier explicitly with task_type, or let Oriora infer it from your prompt. An explicit task_type is always honored exactly; prompt-based inference is a convenience for when you'd rather not classify yourself.

To request	Set task_type, or signal in your prompt
Standard routing — the best model for the task	The default; no action needed
A higher-capability model for demanding work	Phrasing such as “this is complex / difficult” → the _hard tier
A higher-capability model plus a second, independent model that reviews the answer	Declare task_type: "coding_extra_hard" (or reasoning_/agentic_), or phrasing such as “double-check this”, “must be correct”, “high-stakes”
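
If you declare the tier yourself, the names follow a simple suffix pattern. A small sketch, assuming only the three tiered families named on this page (the authoritative list is GET /api/select/task-types); the numeric level argument is an illustrative convenience, not an Oriora parameter:

```python
FAMILIES = ("coding", "reasoning", "agentic")
SUFFIXES = ("", "_hard", "_extra_hard")  # standard, hard, extra hard


def task_type(family: str, level: int = 0) -> str:
    """Build a task_type string such as 'coding_extra_hard'."""
    if family not in FAMILIES:
        raise ValueError(f"unknown family: {family!r}")
    if not 0 <= level < len(SUFFIXES):
        raise ValueError("level must be 0, 1, or 2")
    return family + SUFFIXES[level]


for level in range(3):
    print(task_type("coding", level))
```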

Server-side — routed completion (OpenAI-compatible)

Point your existing OpenAI SDK at Oriora. We route to the best model and run it on your own vendor key (BYOK) — your vendor bills the tokens; Oriora charges only the platform & routing fees.

curl https://api.orioralabs.com/v1/chat/completions \
  -H "Authorization: Bearer sk_oriora_..." \
  -H "x-oriora-app: my-app" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "oriora-auto",
    "messages": [{"role": "user", "content": "Review this PR diff..."}]
  }'

# Returns an OpenAI-shaped chat.completion.
# model:"oriora-auto" lets Oriora route. x-oriora-app is an optional label —
# it keeps each app's usage breakdown and cache/route warmth separate.
# Vendor keys connect account-wide under Settings -> Provider keys.
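
The same request in Python, using only the standard library so it runs without an SDK. The URL, headers, and oriora-auto model come from the curl example above; build_chat_request and send are illustrative helper names:

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

BASE_URL = "https://api.orioralabs.com/v1"


def build_chat_request(
    api_key: str,
    messages: List[Dict[str, str]],
    app_label: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) the routed chat completion request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if app_label:
        headers["x-oriora-app"] = app_label  # optional per-app label
    payload = {"model": "oriora-auto", "messages": messages}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request and decode the JSON body (needs network and a real key)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request(
    "sk_oriora_...",
    [{"role": "user", "content": "Review this PR diff..."}],
    app_label="my-app",
)
print(request.full_url)
```

Since the response is OpenAI-shaped, send(request)["choices"][0]["message"]["content"] should hold the answer text.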

Server-side — declare the task (plain JSON)

The simplest integration: one POST, no SDK shape. Same routing and BYOK execution as the OpenAI-compatible endpoint — you name the task type yourself. Flat $0.002 per call.

curl https://api.orioralabs.com/api/route \
  -H "Authorization: Bearer sk_oriora_..." \
  -H "Content-Type: application/json" \
  -d '{
    "taskType": "coding",
    "messages": [{"role": "user", "content": "Write a regex for..."}]
  }'

# → { "response": "...", "amount_usd": 0.002, "provider": "...", "latency_ms": ... }
# You declare the task (full list: GET /api/select/task-types). Add
# "classify_task": true to let Oriora read the prompt and escalate the task.
# Three tiers: coding → coding_hard (premium model) → coding_extra_hard
# (premium model + quality-verify orchestration; also works for agentic/reasoning).

Limits & errors

What you can rely on, and what each status code means. Failed BYOK calls are not charged.

Rate limit   60 calls/min per account on the AI endpoints.
             Every response carries standard RateLimit-* headers.
Input cap    800,000 characters (~200k tokens) per request.
Output cap   16,384 tokens (max_tokens above this is clamped).

401  missing or invalid key — these endpoints take sk_oriora_... API
     keys, not website session logins
402  insufficient_funds — top up your wallet at orioralabs.com
403  requires BYOK — connect a vendor key in Settings -> Provider keys
400  unknown task_type — valid values: GET /api/select/task-types
413  input too large (see caps above)
429  rate limited — wait for the RateLimit-Reset header, then retry
502  routing or vendor failure — the call was not charged (BYOK)
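
A client can act on these limits and codes before and after each call. A minimal sketch under stated assumptions: that the input cap counts the characters of all message contents, that RateLimit-Reset carries the seconds left in the window, and that a 2-second pause before retrying a 502 is reasonable. All function names are illustrative:

```python
from typing import Dict, List, Mapping, Tuple

MAX_INPUT_CHARS = 800_000   # input cap from the table above
MAX_OUTPUT_TOKENS = 16_384  # output cap from the table above


def input_fits(messages: List[Dict[str, str]]) -> bool:
    """Assumption: the cap counts the characters of all message contents."""
    return sum(len(m.get("content", "")) for m in messages) <= MAX_INPUT_CHARS


def clamp_max_tokens(requested: int) -> int:
    """Mirror the server-side clamp so the client knows what it will get."""
    return min(requested, MAX_OUTPUT_TOKENS)


def next_step(status: int, headers: Mapping[str, str]) -> Tuple[str, float]:
    """Map a status code to (action, seconds_to_wait)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if 200 <= status < 300:
        return ("ok", 0.0)
    if status == 429:
        # Assumption: RateLimit-Reset holds the seconds left in the window.
        return ("retry", float(lowered.get("ratelimit-reset", "1")))
    if status == 502:
        # Not charged on BYOK; the 2-second pause is an arbitrary choice.
        return ("retry", 2.0)
    if status in (400, 401, 402, 403, 413):
        # Key, funds, BYOK setup, task_type, or size: retrying unchanged won't help.
        return ("fix_request", 0.0)
    return ("fail", 0.0)


print(next_step(429, {"RateLimit-Reset": "7"}))
print(clamp_max_tokens(50_000))
```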

You know what
you're building.
We route it.

