Field notes & essays.
Design rationale, release updates, and occasional deep-dives on what we learn building an AI agent for the open web.
How to tell if a browser agent is any good. A field guide.
A vendor-neutral way to judge any browser agent: why the public benchmark isn't enough, the five lenses that actually predict real-world reliability, and ten probes you can run on any agent in an afternoon.
The perception rewrite, and the eval that finally tells us if we're heading the right way.
What shipped in v1.1: iframe-aware perception, a four-provider abstraction with a behavioural contract test, an action-reliability bundle worth +14 pp accuracy and −54% latency on our internal eval, and the harness that proves it.
The four phases shipped. Here's what actually landed.
Following up on the audit post: what shipped, where we got caught in review, and why Phase 4 looks nothing like the version we described.
We compared Auto Browser to the best browser agents of 2026. Here's what we're shipping.
A five-lens audit of Auto Browser against Surfer 2, Skyvern 2.0, Alumnium, and Manus, plus the four architectural upgrades we're shipping in response.
Hello, Auto Browser
Introducing a Chrome extension that puts an AI agent in your side panel — and why we built it the way we did.