Shadow Evaluator · Diagnostic Service

Offline routing audits for agent systems.

Qalbun analyzes historical agent logs to identify candidate inference savings, under-evidenced routing decisions, and validation-ready hypotheses — without changing production systems.

Request a Diagnostic · See How It Works

The problem

If your team is spending high-cost model inference on tasks that may not require it, is routing to expensive components by default because there is no evidence layer, or cannot measure whether your agent routing is cost-effective, Shadow Evaluator tells you what your logs can actually prove.

How it works

Three steps, fully offline.

Step 01
Profile your logs
We classify your data into one of four readiness tiers — audit-only, cost-opportunity, outcome-informed, or live-shadow-ready — so every later claim has a known evidence footing.
Step 02
Run an offline routing audit
We reconstruct the routing decisions your system actually made and separate observed facts from estimates, with each claim tied to its source data.
Step 03
Choose the next step
You receive a written report, an explicit list of evidence gaps, and a recommendation on whether a live shadow pilot is warranted.
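
As a rough illustration of Step 01, the sketch below assigns one of the four readiness tiers from the signals present in a batch of log records. The tier names come from this page; the field names (`cost_usd`, `tokens`, `outcome`), the rules that map signals to tiers, and the `classify_readiness` helper are assumptions made for illustration, not Qalbun's actual classification criteria.

```python
from typing import Iterable, Mapping


def classify_readiness(records: Iterable[Mapping[str, object]]) -> str:
    """Assign a readiness tier from the signals found in the records.

    Hypothetical rules: the real criteria are not published on this page.
    """
    rows = list(records)

    # Assumed field names: cost_usd / tokens for cost signals, outcome for labels.
    has_cost = any(
        r.get("cost_usd") is not None or r.get("tokens") is not None for r in rows
    )
    has_outcome = any(r.get("outcome") is not None for r in rows)

    if has_cost and has_outcome:
        return "live-shadow-ready"   # assumed: both signal types present
    if has_outcome:
        return "outcome-informed"    # assumed: outcome labels only
    if has_cost:
        return "cost-opportunity"    # assumed: cost or token signals only
    return "audit-only"              # routing records without either signal
```

Under these assumed rules, a log with token counts but no outcome labels would land in the cost-opportunity tier, so any later savings claim would be labeled an estimate rather than an observed result.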

What you receive

A written, reproducible diagnostic.

Data-readiness and claims-budget classification
Observed routing and component usage patterns
Estimated cost opportunities when your logs support them
Under-evidenced decisions and instrumentation gaps
Validation-ready hypotheses for a live shadow pilot
Reproducibility manifest for auditability

Who it is for

Teams running real routing decisions.

Built for teams running agent systems, model routers, tool-using LLM workflows, or multi-component AI systems with recurring inference decisions.

Best fit: at least 30 days of structured logs covering routing decisions, plus at least one of the following: cost or token signals, or downstream outcome labels. Partial coverage is workable: the audit will tell you exactly which claims your data can support and which it cannot.
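
The best-fit criteria above can be expressed as a quick self-check before requesting a diagnostic. This is a minimal sketch: the `meets_best_fit` helper and its parameters are hypothetical, and the inclusive day count is an assumption about how the 30-day window is measured.

```python
from datetime import date


def meets_best_fit(
    first_log_day: date,
    last_log_day: date,
    has_routing_decisions: bool,
    has_cost_or_token_signals: bool,
    has_outcome_labels: bool,
) -> bool:
    """Return True when the logs match the best-fit description on this page."""
    # Assumption: the window is counted inclusively, so Jan 1 to Jan 30 is 30 days.
    span_days = (last_log_day - first_log_day).days + 1
    has_supporting_signal = has_cost_or_token_signals or has_outcome_labels
    return span_days >= 30 and has_routing_decisions and has_supporting_signal
```

A False result does not rule out an audit, since partial coverage is workable; it only suggests that fewer claims can be supported.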

Request a Diagnostic Review.