Hacker News | 10keane's comments

been thinking about this exact problem for a while. my own setup uses OS keyring with a <secret:name> token substitution pattern — the agent requests a credential by name, the substitution happens at execution time, the LLM never sees the raw value in context or logs. works reasonably well.
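
a minimal sketch of that substitution pattern, assuming a plain dict stands in for the OS keyring. `SECRET_STORE`, `resolve_secrets`, and `redact` are hypothetical names for illustration, not taken from any real setup. the agent only ever emits the `<secret:name>` token, the raw value is swapped in right before execution, and it is masked again before anything goes back to the model or a log:

```python
import re

# Hypothetical in-memory store standing in for the OS keyring (assumption).
SECRET_STORE = {"github_token": "ghp_example123"}

# Matches tokens such as <secret:github_token>.
TOKEN_PATTERN = re.compile(r"<secret:([A-Za-z0-9_\-]+)>")


def resolve_secrets(command: str, store: dict) -> str:
    """Replace each <secret:name> token with its stored value at execution time."""
    def _lookup(match):
        name = match.group(1)
        if name not in store:
            raise KeyError(f"unknown secret: {name}")
        return store[name]

    return TOKEN_PATTERN.sub(_lookup, command)


def redact(text: str, store: dict) -> str:
    """Mask raw secret values before text is returned to the model or written to a log."""
    for name, value in store.items():
        text = text.replace(value, f"<secret:{name}>")
    return text


# What the agent writes: only the token, never the value.
agent_command = "curl -H 'Authorization: Bearer <secret:github_token>' https://api.example.com"

# What actually gets executed.
resolved = resolve_secrets(agent_command, SECRET_STORE)

# What is safe to log or feed back into context.
logged = redact(resolved, SECRET_STORE)
```

a real version would read from the platform keyring instead of a dict, but the shape of the flow stays the same.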

but the problem with that model is it's static protection. if the agent process itself becomes hostile or gets prompt-injected, keyring doesn't really help — it can still request the secret and get it, it just doesn't see it in the context window.

the shift i've been landing on, and building into Orbital (my own project), is that it's less about blocking credential access and more about supervising it. you want to know exactly when and why the agent is requesting something, and be able to approve or deny in the moment. pre-set policies are hard because you genuinely can't anticipate which tools an agent will call before it runs: claude code might use curl, bash, or a completely random command depending on the problem. the approval needs to happen at runtime, not be preset.
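
a rough sketch of what a runtime approval gate could look like. this is not Orbital's actual code: `CredentialRequest`, `ApprovalGate`, and `demo_approver` are hypothetical names, and the approver here is a plain function standing in for a human prompt. the point is that the decision happens when the request arrives, and every decision is recorded:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class CredentialRequest:
    secret_name: str  # which credential the agent wants
    tool: str         # the tool about to use it, e.g. "curl"
    reason: str       # the agent's stated justification


@dataclass
class ApprovalGate:
    # In a real app this callable would prompt a human; here it is any function.
    approver: Callable[[CredentialRequest], bool]
    history: List[Tuple[CredentialRequest, bool]] = field(default_factory=list)

    def release(self, request: CredentialRequest, store: Dict[str, str]) -> str:
        """Ask for approval at call time, record the decision, then release or refuse."""
        approved = self.approver(request)
        self.history.append((request, approved))
        if not approved:
            raise PermissionError(f"denied: {request.secret_name} for {request.tool}")
        return store[request.secret_name]


# Stand-in for a human decision (assumption): approve only deploy-related requests.
def demo_approver(request: CredentialRequest) -> bool:
    return "deploy" in request.reason


gate = ApprovalGate(approver=demo_approver)
store = {"api_key": "example-value"}
value = gate.release(CredentialRequest("api_key", "curl", "deploy to staging"), store)
```

the `history` list is what gives you the "when and why" part: each entry pairs the request, including the agent's stated reason, with the decision that was made.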

the proxy model here is interesting because it creates a natural supervision boundary. curious whether you're planning runtime approval flows or if the design stays policy-based.


@10keane The proxy approach also solves the audit trail problem implicit in what you're describing. With OS keyring substitution, the agent receives the credential and you can't observe what happens next — the log shows intent (substitution happened) but not effect (what API calls were actually made). Routing through a proxy gives you an immutable record of every call made with each credential, which is the more useful thing for incident response: not "did the agent have access?" but "what did it actually do with it?"
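
A minimal sketch of that idea, under stated assumptions: `AuditingProxy`, `verify_log`, and `fake_transport` are hypothetical names, and a hash chain is used here only to make the log tamper-evident, which is weaker than truly immutable storage. The agent names a credential, the proxy attaches it, and the log records what was actually called rather than the credential itself:

```python
import hashlib
import json
from typing import Callable, Dict, List


def _chain_hash(previous_hash: str, entry: dict) -> str:
    """Hash an entry (minus its own hash field) together with the previous hash."""
    body = {k: v for k, v in entry.items() if k != "hash"}
    payload = previous_hash + json.dumps(body, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AuditingProxy:
    """Holds credentials, forwards calls, and keeps a hash-chained log of each one."""

    def __init__(self, credentials: Dict[str, str],
                 transport: Callable[[str, str, Dict[str, str]], int]):
        self._credentials = credentials
        self._transport = transport  # hypothetical: (method, url, headers) -> status code
        self.log: List[dict] = []

    def call(self, credential_name: str, method: str, url: str) -> int:
        # The secret is attached here, so the agent never holds it.
        headers = {"Authorization": f"Bearer {self._credentials[credential_name]}"}
        status = self._transport(method, url, headers)
        # Record the effect (what was called), never the secret value.
        entry = {"credential": credential_name, "method": method, "url": url, "status": status}
        previous_hash = self.log[-1]["hash"] if self.log else ""
        entry["hash"] = _chain_hash(previous_hash, entry)
        self.log.append(entry)
        return status


def verify_log(log: List[dict]) -> bool:
    """Recompute the chain; any edited or removed entry breaks it."""
    previous_hash = ""
    for entry in log:
        if entry["hash"] != _chain_hash(previous_hash, entry):
            return False
        previous_hash = entry["hash"]
    return True


# Stand-in for a real HTTP client (assumption): always reports success.
def fake_transport(method: str, url: str, headers: Dict[str, str]) -> int:
    return 200


proxy = AuditingProxy({"billing": "example-value"}, fake_transport)
proxy.call("billing", "GET", "https://api.example.com/invoices")
proxy.call("billing", "POST", "https://api.example.com/refunds")
```

During incident response, `proxy.log` answers the "what did it actually do?" question directly, and `verify_log` indicates whether the record was altered after the fact.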

The capability-scoping gap you're pointing at (static vs. dynamic trust) is the next layer up — effectively per-session IAM roles minted at task time, scoped to the specific endpoints the task actually needs. That's harder but it's the right direction.
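
One way to picture per-session scoping, as a sketch only: `SessionGrant`, `mint_grant`, and `is_allowed` are hypothetical names, and plain URL-prefix matching is a simplifying assumption that a real system would replace with proper endpoint policies. A grant is minted when the task starts, lists only the endpoints that task needs, and expires on its own:

```python
import secrets
import time
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class SessionGrant:
    token: str                         # short-lived token handed to the session
    allowed_prefixes: Tuple[str, ...]  # endpoints this task may reach
    expires_at: float                  # unix timestamp after which the grant is dead


def mint_grant(allowed_prefixes: Iterable[str], ttl_seconds: float,
               now: Optional[float] = None) -> SessionGrant:
    """Create a grant scoped to specific endpoints for a limited time."""
    issued = time.time() if now is None else now
    return SessionGrant(
        token=secrets.token_urlsafe(16),
        allowed_prefixes=tuple(allowed_prefixes),
        expires_at=issued + ttl_seconds,
    )


def is_allowed(grant: SessionGrant, url: str, now: Optional[float] = None) -> bool:
    """A call is allowed only while the grant is live and the URL is in scope."""
    current = time.time() if now is None else now
    if current >= grant.expires_at:
        return False
    return any(url.startswith(prefix) for prefix in grant.allowed_prefixes)


# A task that only needs to read invoices gets a ten-minute grant for that endpoint.
grant = mint_grant(["https://api.example.com/invoices"], ttl_seconds=600, now=1000.0)
```

The `now` parameter exists only so the expiry logic can be exercised deterministically; in normal use it would be left unset.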


i actually like the concept of a workspace agent, because i feel real pain running a long-term project while retaining context for each agent instance. but based on the demo it seems built more for collaboration than for preserving long-term project state: decisions made, actions taken, approvals given, the history of what each agent did and why. as it stands it is just a more convenient chatgpt entry point in a group chat.

another thing: this is all on OpenAI's servers. Which is fine if that's what you want. But there's a real class of user — technical, working on actual production code, security-conscious — for whom "my workspace lives on my machine, in my git repo, under my version control, works for my other non-openai tools" is a hard requirement, not a preference.


Hi HN, I'm keane. Orbital is an open-source desktop app for running AI agents in a managed environment. Been building it for two months while holding a day job. Solo dev, mac and windows installers on the release page.

Why this exists:

- I loved Claude Projects, but I couldn't let an agent update the project, and it didn't live on my machine. Cowork Projects now can — but only Claude, closed source.

- I loved OpenClaw, but I had no control over what it was doing on my behalf.

Neither was the thing I actually wanted.

The thing I actually wanted is rooted in a belief about where agent-human interaction is going: from micromanagement to delegation.

Micromanagement — where most of us are today. You give a specific task. You hand-hold the agent, watch it work, provide context when it asks, correct it when it drifts. You're guiding an intern.

Delegation — what I want. It's handing work to a colleague the way you would when you leave a company: you give them all the context, describe the objective, set the boundaries, and then let go. Maybe they check in periodically, but you don't watch every keystroke.

For delegation to work, the agent needs a place to live. Not a session. A "project".

What a project is in Orbital:

- A workspace folder

- Persistent human-editable memory that survives restarts

- A budget cap

- A sandbox

- An autonomy preset (supervised / check-in / hands-off)

- Approval gates on write-risk tool calls

- A shared space for sub-agent coordination. The management agent is an autonomous agent I wrote; sub-agents (Claude Code, Codex, Goose, Cline) are discovered via SDK/ACP/PTY on the system path and called with separate context windows so their output doesn't bleed back into the main one.

The project is the unit you delegate. Everything else (approvals, budgets, memory, boundaries) is the set of affordances that makes delegation actually safe.
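
To make the list above concrete, here is an illustrative sketch of a project as a data structure. It is not Orbital's actual schema: `Project`, `Autonomy`, `authorize`, and `record_spend` are hypothetical names, and the gating rule (deny when over budget, require approval for writes) is an assumption for the example. How each autonomy preset changes gating is not specified above, so the preset is only stored here.

```python
from dataclasses import dataclass
from enum import Enum


class Autonomy(Enum):
    SUPERVISED = "supervised"
    CHECK_IN = "check-in"
    HANDS_OFF = "hands-off"


@dataclass
class Project:
    workspace: str         # path to the workspace folder
    budget_cap_usd: float  # hard spending limit for the project
    autonomy: Autonomy     # stored only; its effect on gating is not modelled here
    spent_usd: float = 0.0

    def authorize(self, tool: str, estimated_cost_usd: float, writes: bool) -> str:
        """Return "deny", "needs_approval", or "allow" for a proposed tool call."""
        if self.spent_usd + estimated_cost_usd > self.budget_cap_usd:
            return "deny"            # the budget cap always wins
        if writes:
            return "needs_approval"  # write-risk calls go through an approval gate
        return "allow"

    def record_spend(self, cost_usd: float) -> None:
        """Track spend so later calls are checked against the cap."""
        self.spent_usd += cost_usd


project = Project(workspace="./demo-project", budget_cap_usd=5.0, autonomy=Autonomy.CHECK_IN)
read_decision = project.authorize("read_file", estimated_cost_usd=0.1, writes=False)
write_decision = project.authorize("write_file", estimated_cost_usd=0.1, writes=True)
```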

Everything is transparent. Everything, from input to output, is on your machine. The only thing that leaves is the LLM API call.

Where this fits relative to other things:

- Claude/Cowork Projects: closest mental model, but you can't dispatch other agents like Codex to work in parallel. Exclusive to Claude.

- OpenClaw / Hermes: session-centric or agent-centric. Orbital is project-centric. Your project can delegate to them as sub-agents (planned).

What's real today. 335 commits over two months. Desktop installers for mac and windows. Used daily for a month — including to run my own launch prep. There's a distilled marketing-agent skill inside the repo that reads my calendar and drafts the next day's tasks, which is how I'm shipping this at all while holding a day job.

What's not there yet. Linux sandbox. Native mobile app (today it's LAN QR pairing plus an optional relay for remote supervision). Agent marketplace. Cross-project coordination with approval cascades. Adaptive autonomy.

Happy to answer questions about the architecture, the sub-agent handoff, the sandbox trade-offs, or anything else.


management and critical thinking.

management - it occurred to me that giving instructions to an agent is very similar to giving instructions to human employees - even the best of them make mistakes.

i learnt that asking claude code to "investigate for 3 potential root causes" is more effective than "investigate the root cause" when fixing a bug. this blew my mind: i realized that an agent can be lazy and careless, and that we can give better instructions to prevent that.

another reason i say this is that giving enough context and defining the blast boundary is more efficient than hand-holding/micromanaging and checking every tool call. the management skills that work for human employees also work here.

critical thinking - you just need to apply your own judgement to the seemingly solid but actually hallucinated agent bs.


great project. think my agent will need it. but then one thing i notice is that this only catches single tool calls. most of the time the malicious behavior is a sequence where each call looks fine on its own: read a file, read another, then a curl to somewhere benign-sounding. individually each one scores low. the arc is the dangerous part and per-call scoring kinda misses that.
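
a toy sketch of what i mean, not how this project works: `per_call_score` and `ArcMonitor` are hypothetical names, and the scores and the single "sensitive read followed by outbound request" rule are made-up assumptions. each call scores low on its own, but the monitor keeps a short window of recent calls and raises the score when the sequence looks like exfiltration:

```python
from collections import deque

# Made-up per-call risk scores (assumption): each tool looks harmless in isolation.
BASE_SCORES = {"read_file": 0.1, "http_request": 0.2}


def per_call_score(call: dict) -> float:
    """Score a single tool call with no knowledge of what came before."""
    return BASE_SCORES.get(call["tool"], 0.0)


class ArcMonitor:
    """Scores calls in the context of a sliding window of recent calls."""

    def __init__(self, window: int = 10):
        self.recent = deque(maxlen=window)

    def observe(self, call: dict) -> float:
        score = per_call_score(call)
        read_sensitive = any(
            c["tool"] == "read_file" and c.get("sensitive", False) for c in self.recent
        )
        # An outbound request shortly after a sensitive read is the risky arc.
        if call["tool"] == "http_request" and read_sensitive:
            score = max(score, 0.9)
        self.recent.append(call)
        return score


monitor = ArcMonitor()
sequence = [
    {"tool": "read_file", "path": ".env", "sensitive": True},
    {"tool": "read_file", "path": "README.md"},
    {"tool": "http_request", "url": "https://metrics.example.com/upload"},
]
scores = [monitor.observe(call) for call in sequence]
```

the same `http_request` scores low on a fresh monitor, which is exactly the gap: the risk lives in the sequence, not in any one call.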


what is the point of teaching anyway when foundational knowledge is becoming obsolete?

i think what should be taught is metacognitive ability - like how to retrieve knowledge and how to ask the right questions towards a certain goal. knowledge itself is easily accessible with ai. the difficult part now is the ability to discern actual knowledge from llm hallucination bs, and to retrieve the knowledge a given scenario requires.

this still requires some foundational grounding — you can't detect bullshit with zero context. but the balance shifts from memorization to retrieval, iteration, verification. honestly i think it is more about critical thinking and philosophy.


> what is the point of teaching anyway when foundational knowledge is becoming obsolete?

1. It isn't

2. As you acknowledge, you need some 'foundational grounding', but the amount needed is quite a lot

3. The best way to teach metacognitive (and all other) skills is within a context

> the balance shifts from memorization to retrieval, iteration, verification

This has been trumpeted with every poorly-thought-out educational change, and it's a marker of unfamiliarity with the space. Memorisation hasn't been the focus ever; it's always about the other skills, and (some) memorisation is useful as part of that.


exactly. vibe coding only works when you fully understand the problem and know precisely how to solve it. ai just does the dirty implementation work for you.

that is another reason why i separate product/architecture design and implementation into two agents with isolated context in my workflow. i can always iterate with the product agent to refine my understanding and THEN ask the coding agent to implement it. by that time i am already able to make proper judgements and evaluate the coding agent's output.


if not for some strategic mistakes made by the US, you wouldn't even need to ask this question in the first place


i think there are two key things that helped me ship more successfully using ai

1. must isolate context. discuss with your architecture agent, implement with another. you can pass the implementation results back to the architecture agent to check for implementation drift. ai's self-checking and self-correction suck - i guess it is because of the attention mechanism?

2. iterate with your architecture agent to produce a tightly scoped task spec. you really need to iterate, and ask it to align with you on the key assumptions. don't be too ambitious. i have a guideline for writing task specs: a spec cannot cross a subsystem boundary or touch 2 subsystems in one go
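
a small sketch of how that one-subsystem rule could be checked mechanically. this is not my actual guideline tooling: `subsystems_touched` and `validate_spec` are hypothetical names, and treating the top-level directory as the subsystem is an assumption that will not fit every repo layout:

```python
from pathlib import PurePosixPath
from typing import Iterable, Set


def subsystems_touched(paths: Iterable[str]) -> Set[str]:
    """Map each file path to its top-level directory (assumed to be the subsystem)."""
    return {PurePosixPath(path).parts[0] for path in paths}


def validate_spec(paths: Iterable[str]) -> str:
    """Return the single subsystem a spec touches, or raise if the scope is wrong."""
    touched = subsystems_touched(paths)
    if not touched:
        raise ValueError("spec does not list any files")
    if len(touched) > 1:
        raise ValueError(f"spec crosses subsystems: {sorted(touched)}")
    return next(iter(touched))


# A spec that stays inside one subsystem passes.
scope = validate_spec(["auth/login.py", "auth/session.py"])
```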

but honestly, ai is only great at diagnosis and implementation. most of my successful runs happen when i already know exactly how to code it or how to solve the problem. ai just does the dirty work for me.


Would you bet on not being faster with autocomplete? No context loss for you, and you can review and write at the same time.


i am using --dangerously-skip-permissions with a task spec. i think this is faster, and it actually gives me more control over architecture and product decisions. i think i just don't like reacting to suggestions mid-flow


well written. finally someone mentioned that a human operator who has the full architecture context is needed. that, i think, is the role of humans in coding in the future.

but i will argue one thing though. the spec approach is good enough with current model capability. it is a matter of scoping. if you scope the spec correctly and make it granular enough, the agent will produce a reproducible implementation. and looking into the future, as model capabilities advance, the spec approach will get better and better, allowing a larger spec scope to be implemented at once

