Agentic engineering with Agent Dashboard

The first thing that surprised me about running agentic engineering with Agent Dashboard is that it genuinely goes faster. I expected slow. Vibe coding in a single session felt faster — type a prompt, watch something come back, iterate in place. That's true in the moment; it stops being true at project scale, where agentic engineering with a written spec pulls ahead.

The second surprise is the token cost. I went in thinking a tight spec would keep context small. It doesn't, not the way I hoped. Twelve agents running review and execution rounds per PRD, each reading the relevant source files to understand what they're working against, produce a higher spend than I'd accounted for. I've optimised what I can — shorter PRDs, narrower scope per execution round, minimal file reads per agent. The bill is still higher than the vibe-coding baseline. That's the trade.

Neither of those is the main finding. The main finding is spec-alignment.

Before I ran spec-driven development with an AI agent fleet, I finished execution rounds and then spent time adjusting what came out. Not because it was broken — usually it wasn't — but because it wasn't quite what I'd had in mind. Two clarification passes per PRD was normal. The agent interpolated what I wanted; I corrected the interpolation. That cycle has nearly disappeared. Now the output is aligned with what the spec said, because the agents are working against the spec the whole time. When I find a bug, it's an integration bug — two correct parts that misunderstood each other's contract — not a "this isn't what I asked for" bug. The shift in failure mode matters more than the raw speed gain.

The open problem is parallel execution. Today the implementation step is sequential: one file at a time, one agent at a time, through the task list. That's the slowest part of the loop. I've run two kinds of experiments.
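
For concreteness, here's a minimal sketch of that sequential loop. Everything in it is hypothetical (the `Task` shape, the `run_agent` call); it shows the shape of the step, not Agent Dashboard's actual code.

```python
from dataclasses import dataclass

@dataclass
class Task:
    file: str         # the single file this task touches
    instruction: str  # what the execution agent should do to it

def run_agent(task: Task, spec: str) -> None:
    """Hypothetical stand-in for one execution-agent round."""
    ...  # call out to the agent, wait for the diff, apply it

def execute_sequential(tasks: list[Task], spec: str) -> None:
    # One file, one agent, one task at a time: correct, and the
    # slowest part of the loop, because nothing overlaps.
    for task in tasks:
        run_agent(task, spec)
```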

The first is agent teams — a lead agent plus two or three execution agents, each owning a module. Works when the module boundary is clean. When the contract between modules is underspecified — frontend and backend built toward different assumptions about what the API looked like — the integration phase undoes part of what the execution phase built. Not a concurrency bug. A contract bug.
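
For illustration, this is what pinning the contract could look like: a frozen interface definition handed to both execution agents before they start. The API and every type in it are made up for the sketch; nothing here comes from a real PRD.

```python
from dataclasses import dataclass

# Hypothetical frozen contract given to BOTH the frontend and
# backend agents before execution starts. If it isn't written
# down, each agent interpolates its own version of the API:
# the contract bug described above.
@dataclass(frozen=True)
class CreateItemRequest:
    name: str
    quantity: int

@dataclass(frozen=True)
class CreateItemResponse:
    id: str
    created_at: str  # ISO 8601 timestamp

# endpoint -> (request shape, response shape)
API_CONTRACT = {
    "POST /items": (CreateItemRequest, CreateItemResponse),
}
```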

The second is agent swarms — a planner that reads the task list, identifies what can run in parallel, and hands off to a pool. Same failure mode as teams, with the additional problem of merge conflicts when two agents touch overlapping files without knowing it. The conflicts were the symptom; underspecified contracts were the cause.
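
One guard that addresses the symptom, sketched hypothetically (the `SwarmTask` shape and the planner are mine, not the swarm's real code): have each task declare the files it will touch, and let the planner put two tasks in the same parallel batch only when their file sets are disjoint. This catches the merge conflicts; it does nothing for the underspecified contracts underneath them.

```python
from dataclasses import dataclass

@dataclass
class SwarmTask:
    instruction: str
    files: frozenset[str]  # files this task declares it will touch

def plan_batches(tasks: list[SwarmTask]) -> list[list[SwarmTask]]:
    """Greedy batching: a task joins a parallel batch only if its
    declared file set is disjoint from every file already claimed
    by that batch."""
    batches: list[tuple[set[str], list[SwarmTask]]] = []
    for task in tasks:
        for claimed, batch in batches:
            if claimed.isdisjoint(task.files):  # no shared files
                claimed.update(task.files)
                batch.append(task)
                break
        else:
            # conflicts with every open batch: start a new one
            batches.append((set(task.files), [task]))
    return [batch for _, batch in batches]
```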

The pattern I keep landing on: parallel execution is a contract problem. When the spec is tight enough to be unambiguous, parallelism is safe. When it isn't, running two agents against the same surface makes the ambiguity expensive fast. The PRD-driven development discipline I run for planning needs to extend all the way down to the implementation task list for parallelism to be reliable.
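
What "tight enough" might mean at the task-list level, sketched as a hypothetical entry (every field and file name is invented): the write set is exclusive, the read set is shared, and the contract is named instead of left for the agent to interpolate.

```python
# Hypothetical task-list entry tight enough to hand to a parallel
# agent. Ambiguity surfaces at planning time, when the sets are
# declared, rather than at merge time.
task_entry = {
    "id": "TASK-07",
    "owns": ["api/items.py"],         # exclusive write set
    "reads": ["contracts/items.md"],  # shared, read-only context
    "contract": "POST /items as defined in contracts/items.md",
    "done_when": "handler round-trips the request/response shapes",
}
```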

I don't have a clean answer yet. That's what makes it the most interesting open problem in the workflow — everything else has converged on a repeatable shape; this one is still live.

If you want the practice overview — the full loop, the agent roster, the PRD lifecycle — how we build is the right starting point.