Process
How we work, in order.
Four phases. Each one ends with something you can look at. No three-month discovery; no PowerPoint deliverables.
- 01 Discover (Week 1)
We meet the operator (the person whose work the agent changes) and pick one workflow with measurable upside. Output: a one-page brief with the metric we'll move, the rough scope, and a fixed price.
- 02 Design (Week 2)
We write the eval set before we write a single prompt. 30–100 examples drawn from real cases, scored by your team. Output: an eval harness and a target score.
- 03 Deploy (Weeks 3–5)
We build the agent against the eval set, integrate it into your stack, and put it behind a feature flag. Output: production deploy with observability, cost dashboards, and a kill switch.
- 04 Iterate (Ongoing)
Weekly review of the metric, weekly improvements. We stay until the numbers stabilise, or until you don't need us anymore. Output: a system, not a dependency.
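For a sense of what the Design-phase output can look like, here is a minimal sketch of an eval harness with a target score. Everything named here (EvalCase, exact_match, run_eval, meets_target, the 0.90 target) is a hypothetical illustration, not our actual tooling; the scorer in particular is an assumption, since most real workflows need a rubric or a human grade rather than exact matching.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    """One real case plus the answer your team marked as correct."""
    input_text: str
    expected: str


def exact_match(output: str, expected: str) -> bool:
    # Hypothetical scorer: case-insensitive exact match.
    # Real workflows often need a rubric or a human grade instead.
    return output.strip().lower() == expected.strip().lower()


def run_eval(
    agent: Callable[[str], str],
    cases: Sequence[EvalCase],
    scorer: Callable[[str, str], bool] = exact_match,
) -> float:
    """Return the fraction of cases the agent gets right (0.0 to 1.0)."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(1 for c in cases if scorer(agent(c.input_text), c.expected))
    return passed / len(cases)


def meets_target(score: float, target: float) -> bool:
    """Ship gate: the measured score must reach the agreed target."""
    return score >= target


if __name__ == "__main__":
    # Toy data for illustration only; a real set has 30-100 cases.
    cases = [
        EvalCase("Order arrived damaged, I want a refund", "refund"),
        EvalCase("Where is my parcel?", "tracking"),
    ]

    def stub_agent(text: str) -> str:
        # Stand-in for the real agent call.
        return "refund" if "refund" in text.lower() else "escalate"

    score = run_eval(stub_agent, cases)
    print(f"score={score:.2f} ship={meets_target(score, target=0.90)}")
```

The point of the shape: the agent is just a callable, so the same cases can score every prompt or model change, and the target becomes a yes/no gate rather than a discussion.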
Principles we won't bend
- Eval-driven. If we can't measure it, we won't ship it.
- Reversible. Every change can be rolled back in under five minutes.
- Observable. We instrument before we deploy, not after.
- Owned. There's a human on your side who owns the workflow. We're additive, not a magic box.
- Honest. If an LLM is the wrong tool, we'll tell you.
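The feature flag and kill switch behind "Reversible" can be sketched in a few lines. This is an illustration under stated assumptions: FeatureFlag is a hypothetical in-memory stand-in for whatever flag service your stack already uses, and handle, agent, and fallback are made-up names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FeatureFlag:
    """Hypothetical in-memory flag; a real one lives in a flag service or config store."""
    enabled: bool = False

    def kill(self) -> None:
        # Kill switch: turn the agent off without a redeploy.
        self.enabled = False


def handle(
    request: str,
    flag: FeatureFlag,
    agent: Callable[[str], str],
    fallback: Callable[[str], str],
) -> str:
    """Route to the agent only while the flag is on; otherwise use the existing workflow."""
    if flag.enabled:
        return agent(request)
    return fallback(request)


if __name__ == "__main__":
    flag = FeatureFlag(enabled=True)

    def agent(request: str) -> str:
        return f"agent handled: {request}"

    def fallback(request: str) -> str:
        return f"manual queue: {request}"

    print(handle("ticket 1", flag, agent, fallback))
    flag.kill()  # roll back to the old path
    print(handle("ticket 2", flag, agent, fallback))
```

Because the old path stays in place behind the flag, rolling back is a flag flip rather than a release.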