A person has a supervision budget. They can supervise one agent in a hands-on wa...

not_kurt_godel · 2026-03-14T02:44:23 1773456263

Just curious, what kind of work are you doing where agentic workflows are consistently able to make notable progress semi-autonomously in parallel? Hearing people are doing this, supposedly productively/successfully, kind of blows my mind given my near-daily in-depth LLM usage on complex codebases spanning the full stack from backend to frontend. It's rare for me to have a conversation where the LLM (usually Opus 4.6 these days) lasts 30 minutes without losing the plot. And when it does last that long, I usually become the bottleneck in terms of having to think about design/product/engineering decisions; having more agents wouldn't be helpful even if they all functioned perfectly.

avereveard · 2026-03-14T02:59:46 1773457186

I've passed that bottleneck with a review task that produces engineering recommendations along six axes (encapsulation, decoupling, simplification, deduplication, security, reducing documentation drift) and an ideation task that gives, per component, a new feature idea, an idea to improve an existing feature, and an idea to expand a feature to be more useful. These two generate constant bulk work that I move into a new chat, where it's grouped by changeset and sent to sub-agents to protect the context window.

What I'm doing mostly these days is maintaining a goal.md (project direction) and a spec.md (coding and process standards, global across projects), plus developing new macro tasks. I have one in progress that is meant to automatically build PNG mockups and self-review them.

not_kurt_godel · 2026-03-14T03:17:20 1773458240

What are you using to orchestrate/apply changes? Claude CLI?

avereveard · 2026-03-14T04:52:57 1773463977

I prefer in-IDE tools because I can review changes and pull in context faster.

At home I use Roo Code, at work Kiro. Tbh as long as it has task delegation I'm happy with it.

grafmax · 2026-03-14T23:17:26 1773530246

I work on a 1M LOC, 15-year-old repo. Like yours, it spans the full stack. Bugs in certain pieces of complex business logic would have catastrophic consequences for my employer. Basically I peel poorly specified work items off my queue, put each into its own worktree and session at high reasoning/effort, and provide a well-specified prompt.
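For anyone wanting to try the one-worktree-per-work-item setup, here is a minimal sketch of how the worktree step could be scripted. It only builds the `git worktree add` command and does not run it. The `agent/` branch prefix, the sibling-directory layout, and the function name are assumptions for illustration, not something described in this thread.

```python
import re
from pathlib import Path


def worktree_command(repo: Path, item_title: str) -> list[str]:
    """Build (but do not run) a git command that gives one work item its own worktree."""
    # Turn the work item title into a branch-safe slug.
    slug = re.sub(r"[^a-z0-9]+", "-", item_title.lower()).strip("-")
    branch = f"agent/{slug}"  # hypothetical branch naming scheme
    path = repo.parent / f"{repo.name}-{slug}"  # hypothetical layout: sibling directory
    # `git worktree add -b <branch> <path>` creates a new branch checked out at <path>.
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, after which a separate agent session can be started inside the new directory.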

These things eat into my supervision budget:

* LLM loses the plot and I have to nudge (like you)
* Thinking hard to better specify prompts (like you)
* Reviewing all changes (I do not vibe code except for spikes or other low-risk areas)
* Manual things I have to do (for things I have not yet automated with agent-authored scripts)
* Meetings
* etc

So, yes, my supervision budget is a bottleneck. I can only run 5-8 agents at a time because I have only so much time in the day.

Compare that vs a single agent at high reasoning/effort: I am sitting waiting for it to think. Waiting for it to find the code area I'm talking about takes time. Compiling, running tests, fixing compile errors. A million other things.

Any time I find myself sitting and waiting, this is a signal to me to switch to a different session.