I watched a recent video arguing that Hermes Agent's slash-go feature is one of the most important AI agent updates of 2026. The energy was very YouTube. But underneath the hype there is a useful point: long-running agents only become useful when the objective is durable and the stopping condition is clear.
That is what /goal is really about. It turns "keep going" into a contract. Instead of prompting the agent after every partial step, you give it an outcome, a definition of done, and a way to verify progress. Then the agent keeps working until it reaches that outcome, runs out of budget, pauses, or hits a real blocker.
The video calls it "slash-go". The official Hermes documentation calls it Persistent Goals (/goal). I am going to use the official name here, because the distinction matters. This is not just a bigger prompt. It is a loop with state, a judge, a turn budget, and persistence across sessions.
Why /goal matters
The old way of using an AI coding agent is a stop-start rhythm. You ask for a plan. It makes a plan. You say proceed. It edits something. You say keep going. It runs a test. You say fix the next thing. It gets tired, loses track, or stops after one turn because normal chat is designed around single responses.
/goal changes the shape of that interaction. The unit is no longer "answer this message". The unit becomes "move this objective forward until the objective is satisfied".
- Plain prompt: "Refactor this app." The agent may plan, edit one slice, and stop. The human becomes the continuation engine.
- Goal: "Refactor this app to TypeScript, keep routes unchanged, run the test suite, and stop when all tests pass." The agent has a target and a validation loop.
That is why this pattern is more interesting than the usual "agent can do anything" claim. It does not remove the need for judgment. It moves the judgment into the goal contract: what outcome, what constraints, what proof, and what should cause the agent to pause.
OpenAI makes the same point in its Codex docs, on the page titled "Follow a goal": use goals for long-running work with a clear success condition and a validation loop. Hermes takes a similar idea and implements it inside its own agent harness.
What Hermes /goal actually does
According to the Hermes docs, /goal gives Hermes a standing objective that survives across turns. After every turn, Hermes calls a lightweight judge model. That judge sees the goal text, the latest assistant response, and a system instruction that asks for a strict done-or-continue verdict.
If the judge decides the goal is not complete, Hermes feeds a continuation prompt back into the same session and keeps working. If the judge decides the goal is complete, Hermes stops. If the turn budget is exhausted, Hermes pauses and tells you how to resume or clear the goal.
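To make the control flow concrete, here is a minimal sketch of that loop in Python. It is not Hermes code. The names `GoalRun`, `run_goal`, `make_fake_agent`, and `fake_judge` are mine, the default budget of 20 is a placeholder rather than the real Hermes default, and the agent and judge are stand-in functions where a real harness would call models.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoalRun:
    goal: str
    turn_budget: int = 20      # hypothetical default, not the real Hermes value
    turns_used: int = 0
    status: str = "active"     # "active", "done", or "paused"


def run_goal(
    run: GoalRun,
    agent_turn: Callable[[str], str],
    judge: Callable[[str, str], bool],
) -> GoalRun:
    """Keep taking turns until the judge says done or the budget runs out."""
    prompt = run.goal
    while run.status == "active":
        if run.turns_used >= run.turn_budget:
            run.status = "paused"  # backstop: pause so a human can resume or clear
            break
        response = agent_turn(prompt)
        run.turns_used += 1
        if judge(run.goal, response):
            run.status = "done"
        else:
            # Feed a continuation prompt back into the same session.
            prompt = f"Continue working toward the goal: {run.goal}"
    return run


# Stand-ins for a real model call and a real judge model.
def make_fake_agent(finish_on_turn: int) -> Callable[[str], str]:
    calls = {"n": 0}

    def agent_turn(prompt: str) -> str:
        calls["n"] += 1
        return "all tests pass" if calls["n"] >= finish_on_turn else "still working"

    return agent_turn


def fake_judge(goal: str, response: str) -> bool:
    return "all tests pass" in response


finished = run_goal(GoalRun("Migrate to TypeScript"), make_fake_agent(3), fake_judge)
print(finished.status, finished.turns_used)
```

The detail worth noticing is that an exhausted budget produces "paused", not "failed". The run stays resumable, which is the behavior the docs describe.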
The practical commands are simple:
/goal Complete [objective] without stopping until [verifiable end state].
/goal status
/goal pause
/goal resume
/goal clear
The important part is not the syntax. It is the control behavior around the syntax.
- Judge loop: after each turn, a judge decides whether the goal is done or should continue.
- Turn budget: Hermes has a default continuation budget, so the loop has a backstop.
- Persistence: goal state is stored, so resuming a session can preserve the active or paused goal.
- Mid-run steering: user messages preempt continuation, so you can add information or redirect without starting over.
- Control commands: status, pause, resume, and clear let you manage the run without relying on vibes.
This is why the feature is more serious than the simple "Ralph loop" pattern people were experimenting with earlier: plan, act, save state, repeat. A loop is not enough. A useful loop needs a judge, state, budget, validation, and a clear target.
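The state and control-command parts of that list can be sketched in a few lines. This is an illustration under my own assumptions, not the Hermes storage format: the JSON layout, the file name `goal.json`, and the helpers `save_state`, `load_state`, and `apply_command` are all hypothetical.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional

COMMANDS = {"status", "pause", "resume", "clear"}


@dataclass
class GoalState:
    goal: str
    status: str = "active"   # "active" or "paused"
    turns_used: int = 0


def save_state(state: Optional[GoalState], path: Path) -> None:
    """Persist the goal; a cleared goal (None) removes the file."""
    if state is None:
        path.unlink(missing_ok=True)
    else:
        path.write_text(json.dumps(asdict(state)), encoding="utf-8")


def load_state(path: Path) -> Optional[GoalState]:
    """Return the stored goal, or None when no goal is set."""
    if not path.exists():
        return None
    return GoalState(**json.loads(path.read_text(encoding="utf-8")))


def apply_command(state: Optional[GoalState], command: str) -> Optional[GoalState]:
    """Apply pause, resume, or clear; status is read-only and changes nothing."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    if state is None or command == "status":
        return state
    if command == "clear":
        return None
    state.status = "paused" if command == "pause" else "active"
    return state


# Pause a goal, persist it, then load it the way a resumed session would.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "goal.json"   # hypothetical location
    paused = apply_command(GoalState("Ship the billing refactor", turns_used=4), "pause")
    save_state(paused, state_file)
    restored = load_state(state_file)
```

Because the turn count is stored with the goal, a resumed session picks up the same budget instead of starting a fresh one.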
The practical setup
The video walks through a VPS setup. That is a valid path, but it is not the only path. The right setup depends on what you want the agent to do.
If you are testing Hermes personally, local is fine. Install Hermes, choose a model provider, configure your tools, and run it from your terminal. The Hermes GitHub repo lists quick install commands for Linux, macOS, WSL2, Termux, and native Windows (beta).
A VPS starts making sense when you want the agent to keep running while your laptop is closed, receive messages through a gateway, run scheduled work, or act as a small persistent automation server. In that case, pick a normal VPS provider, use SSH keys, keep the box boring, and treat it like infrastructure rather than a toy.
| Setup | Best for | Watch out for |
|---|---|---|
| Local machine | Testing, coding, research, personal workflows. | The agent stops when your machine sleeps or disconnects. |
| VPS | Persistent agents, scheduled runs, messaging gateways, long-running tasks. | Server security, billing controls, logs, and account isolation matter more. |
| Containerized VPS | Agent work that needs stronger filesystem and process isolation. | Docker helps, but it does not remove network, credential, or prompt-injection risk. |
The minimum setup path looks like this:
- Install Hermes from the official repo or docs.
- Run the setup flow and choose your model provider.
- Configure only the tools the agent actually needs.
- Set API keys through the Hermes configuration mechanism instead of pasting secrets into chat.
- Start with a harmless local goal that creates or edits test files.
- Only then add higher-risk tools, messaging gateways, scheduled tasks, or remote server access.
The video uses OpenRouter for image generation and mentions using a ChatGPT subscription for OpenAI-backed work. Those can be useful options, but the architecture point is provider-agnostic: the agent should have the model access and tool access it needs, and no more.
Do not start by giving an agent every credential you own. Start with a narrow workspace, a cheap model route, a low budget, and a goal that can prove the loop behaves.
How Codex compares
OpenAI's Codex documentation describes /goal as a way to give Codex a durable objective for long-running work. The official advice is very close to the Hermes lesson: goals should have a clear target, a validation loop, and a stopping condition.
The difference is mostly where the feature lives. Codex is primarily a coding agent and development environment. Hermes is a broader agent harness with skills, gateways, toolsets, memory, and orchestration patterns. That makes Hermes especially interesting for "CEO and CTO" style setups where one agent defines the business objective and delegates implementation slices to coding agents.
I would frame the comparison this way:
- Codex: strong fit for code migrations, refactors, prototype builds, test loops, and prompt optimization against evals.
- Hermes: strong fit for persistent agent workflows that combine skills, tools, messaging, subagents, and business process orchestration.
The video's CEO/CTO idea is a useful mental model if you keep it grounded. Hermes can hold the business objective, coordinate specialist work, and ask coding agents to build or verify pieces. Codex can act as the engineering worker when the task is code-heavy.
But the same rule applies to both: do not ask for "a great app". Ask for a specific application, with specific routes, features, tests, acceptance criteria, and a command that proves the build works.
Bad goals vs good goals
The single most useful takeaway from the video is this: the goal text is the product spec. If the goal is vague, the judge has nothing solid to judge. If the goal is measurable, the agent can move.
| Weak goal | Better goal |
|---|---|
| Build a great reporting app. | Build a Next.js weekly reporting app where team members submit wins, blockers, next-week plans, and mood. Include a manager dashboard, local persistence, responsive styling, and a README. Stop when npm run build passes. |
| Refactor this codebase. | Refactor the billing module to remove duplicated validation logic. Keep public API behavior unchanged. Add or update unit tests for success, missing-field, invalid-currency, and retry cases. Stop when the billing test suite passes. |
| Make me more customers. | Create a reviewed outreach list of 20 qualified leads from the target segment, with company name, source URL, reason for fit, contact path, and a draft email. Do not send messages. Stop when the CSV and draft email file are complete. |
| Generate a presentation. | Create a five-slide editable presentation about agent harnesses vs model weights, using the supplied reference style, with one clear headline and short supporting copy per slide. Stop when the PPTX exists and can be opened. |
A good /goal prompt usually includes six parts:
- Objective: one outcome, not a pile of unrelated tasks.
- Inputs: files, links, folders, docs, tickets, datasets, or references to inspect first.
- Constraints: what should not change, what tools are allowed, what risky actions are forbidden.
- Deliverables: exact artifacts, files, commits, reports, slides, app screens, or data outputs.
- Validation: tests, build commands, counts, review criteria, or acceptance checks.
- Pause conditions: what should make the agent stop and ask before continuing.
/goal Complete [one objective].
Read [inputs] first.
Respect [constraints].
Produce [deliverables].
Validate by running [commands/checks].
Pause if [risk/blocker/ambiguous decision].
Stop only when [verifiable end state].
That prompt shape is less dramatic than "go make money while I sleep". It is also much more likely to work.
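If you write goals often, the template is easy to turn into a small builder that refuses to render a goal with a missing part. This is a sketch of my own, not a Hermes or Codex API: `GoalSpec` and `render_goal` are hypothetical names, the folder paths are invented, and I added a separate `end_state` field for the last line of the template.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class GoalSpec:
    objective: str
    inputs: List[str]
    constraints: List[str]
    deliverables: List[str]
    validation: List[str]
    pause_conditions: List[str]
    end_state: str


def render_goal(spec: GoalSpec) -> str:
    """Render the spec as a /goal prompt, refusing to emit one with empty parts."""
    missing = [f.name for f in fields(spec) if not getattr(spec, f.name)]
    if missing:
        raise ValueError(f"goal spec is missing: {', '.join(missing)}")
    return "\n".join([
        f"/goal Complete {spec.objective}.",
        f"Read {', '.join(spec.inputs)} first.",
        f"Respect these constraints: {'; '.join(spec.constraints)}.",
        f"Produce {'; '.join(spec.deliverables)}.",
        f"Validate by running {'; '.join(spec.validation)}.",
        f"Pause if {' or '.join(spec.pause_conditions)}.",
        f"Stop only when {spec.end_state}.",
    ])


# Based on the billing example in the table; the paths are hypothetical.
billing = GoalSpec(
    objective="the billing module refactor that removes duplicated validation logic",
    inputs=["billing/", "tests/billing/"],
    constraints=["keep public API behavior unchanged"],
    deliverables=["unit tests for success, missing-field, invalid-currency, and retry cases"],
    validation=["the billing test suite"],
    pause_conditions=["a public API change looks unavoidable"],
    end_state="the billing test suite passes",
)
prompt = render_goal(billing)
print(prompt)
```

The validation step in `render_goal` is the useful part. A goal with no pause conditions or no end state fails before it ever reaches the agent.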
Useful workflows
The video shows or mentions several categories of work. I would group them by how safely they can be automated.
1. Presentation and content generation
A deck-generation goal is a good first experiment because the failure mode is low risk. The agent can gather reference style, generate slide content, create assets, export a PPTX, and keep iterating until the file exists.
The measurable end state is not "beautiful". It is "a five-slide editable deck exists, follows the source topic, has text on each slide, and opens correctly." A human can still improve taste, hierarchy, and message quality after that.
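That end state can be checked mechanically. A PPTX file is a ZIP package, so the standard library is enough for a rough structural check. The sketch below assumes the conventional part names (`ppt/presentation.xml`, `ppt/slides/slideN.xml`) that PowerPoint and common generators write. It does not prove the deck renders correctly, and `count_pptx_slides` is my own helper name.

```python
import io
import zipfile
from pathlib import Path
from typing import BinaryIO, Union


def count_pptx_slides(source: Union[str, Path, BinaryIO]) -> int:
    """Count slide parts in a PPTX; raises if the file is not a readable package."""
    with zipfile.ZipFile(source) as package:
        if package.testzip() is not None:
            raise ValueError("PPTX package has a corrupt member")
        names = package.namelist()
    if "ppt/presentation.xml" not in names:
        raise ValueError("not a PPTX package: ppt/presentation.xml is missing")
    return sum(
        1 for name in names
        if name.startswith("ppt/slides/slide") and name.endswith(".xml")
    )


# Stand-in package built in memory, only to exercise the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("ppt/presentation.xml", "<presentation/>")
    for index in range(1, 6):
        package.writestr(f"ppt/slides/slide{index}.xml", "<slide/>")
        package.writestr(f"ppt/slides/_rels/slide{index}.xml.rels", "<rels/>")

slide_count = count_pptx_slides(buffer)
```

In a real run you would pass the path of the exported file and compare the count against the five slides the goal asked for.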
2. Code refactors and migrations
This is the cleanest use case for Codex-style goal following. Code gives the agent a validation loop: tests, type checks, builds, linting, visual checks, and diffs. If you can define parity and run checks, a goal can keep moving through the boring middle of a migration.
The right goal names the target stack, what behavior must stay the same, what tests prove parity, and what must not be touched.
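The "what tests prove parity" part usually reduces to a list of commands that must all exit cleanly. Here is a minimal sketch of such a gate. The helper name `failed_checks` is mine, and the demo commands use the current Python interpreter as a stand-in for whatever your project really runs.

```python
import subprocess
import sys
from typing import List, Sequence


def failed_checks(checks: Sequence[Sequence[str]], cwd: str = ".") -> List[str]:
    """Run each command and return the ones that failed; empty means all passed."""
    failed = []
    for command in checks:
        try:
            result = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
            ok = result.returncode == 0
        except FileNotFoundError:
            ok = False  # the tool is not installed, which also counts as a failure
        if not ok:
            failed.append(" ".join(command))
    return failed


# Stand-in checks; a real migration would list its own commands here,
# for example ["npm", "run", "build"] or a test runner invocation.
demo_checks = [
    [sys.executable, "-c", "assert 1 + 1 == 2"],
    [sys.executable, "-c", "raise SystemExit(1)"],
]
failures = failed_checks(demo_checks)
goal_is_done = not failures
```

Treating a missing tool as a failure is deliberate. A goal should not count as done because the check could not run.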
3. CEO/CTO agent orchestration
The "Hermes as CEO, Codex as CTO" example is useful because it points to the future shape of agent work: one durable objective, multiple specialist workers, and a human who steers the objective instead of micromanaging each file edit.
In practice, I would still keep the first version modest. One app, one marketing artifact, one integration point, one verification command. Multi-agent orchestration can multiply productivity, but it can also multiply confusion if the handoff criteria are not explicit.
4. Business workflows
This is where the idea becomes commercially interesting, but also where human review matters most. Outreach, lead research, reporting, data cleanup, backfills, and content systems can all benefit from a persistent goal loop.
The safe version is not "let the agent sell without oversight". It is "let the agent prepare the sales work": identify leads, collect evidence, draft messages, flag risks, and stop at a review queue. Sending, billing, publishing, deleting, and updating source-of-record data should have approval gates.
Let goal-based agents prepare, research, draft, compare, and validate. Add human approval before they send, spend, publish, delete, or update a system of record.
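An approval gate like this can be a very small piece of code. The sketch below is one way to do it, not a feature of Hermes or Codex. The action names, `ReviewQueue`, and `dispatch` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Verbs taken from the rule above; the action names themselves are hypothetical.
NEEDS_APPROVAL = {"send", "spend", "publish", "delete", "update_record"}


@dataclass
class ReviewQueue:
    pending: List[Dict[str, object]] = field(default_factory=list)


def dispatch(
    action: str,
    payload: Dict[str, object],
    handler: Callable[[Dict[str, object]], None],
    queue: ReviewQueue,
    approved: bool = False,
) -> str:
    """Run low-risk actions directly; park risky ones until a human approves."""
    if action in NEEDS_APPROVAL and not approved:
        queue.pending.append({"action": action, "payload": payload})
        return "queued_for_review"
    handler(payload)
    return "executed"


drafts: List[Dict[str, object]] = []
sent: List[Dict[str, object]] = []
queue = ReviewQueue()

draft_result = dispatch("draft", {"lead": "Example Co"}, drafts.append, queue)
send_result = dispatch("send", {"lead": "Example Co"}, sent.append, queue)
```

The agent can draft as much as it likes. Nothing reaches `sent` until someone calls `dispatch` again with `approved=True`.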
Safety and control
The stronger the goal loop becomes, the more boring the control layer needs to be. A persistent agent with tools is not just a clever prompt. It is a worker with access.
This is where the earlier AI agent control plane argument becomes practical. Before giving an agent a long-running goal, define:
- Identity: which account does the agent use, and is it separate from your personal account?
- Permissions: what folders, tools, APIs, and services can it access?
- Budgets: what model budget, API spend, turn budget, and runtime limit apply?
- Logs: where can you inspect what the agent read, wrote, changed, sent, or attempted?
- Approval gates: what actions require human review before they go external?
- Rollback: how do you undo changes, revoke keys, stop schedules, or quarantine output?
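The budgets item is the easiest of these to encode. A minimal sketch, with limits and field names of my own choosing rather than anything Hermes defines:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunLimits:
    max_turns: int
    max_spend_usd: float
    max_runtime_s: float


@dataclass
class RunUsage:
    turns: int = 0
    spend_usd: float = 0.0
    runtime_s: float = 0.0


def breached_limits(limits: RunLimits, usage: RunUsage) -> List[str]:
    """Return the name of every limit the run has reached; empty means keep going."""
    breaches = []
    if usage.turns >= limits.max_turns:
        breaches.append("turn budget")
    if usage.spend_usd >= limits.max_spend_usd:
        breaches.append("API spend")
    if usage.runtime_s >= limits.max_runtime_s:
        breaches.append("runtime")
    return breaches


# Illustrative numbers only.
limits = RunLimits(max_turns=20, max_spend_usd=5.0, max_runtime_s=3600)
usage = RunUsage(turns=7, spend_usd=5.25, runtime_s=900)
breaches = breached_limits(limits, usage)
```

Checking this before every turn gives the loop a second backstop besides the judge: any non-empty result should pause the run and surface the reason.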
If you run Hermes on a VPS, add server hygiene too: SSH keys, firewall rules, separate agent accounts, low-limit cards for paid APIs, provider-side billing caps, backups, and a clear kill switch. I covered that setup in more detail in How to Run Hermes Agent Safely on a Rented VM.
The useful version of autonomy is not "the agent can do anything". It is "the agent can keep working inside a bounded workflow until the defined result is reached". That smaller sentence is where real systems get built.
This is also why I like /goal as a concept. It forces the builder to answer the question that vague AI projects avoid: what does done mean?
Sources
This post was inspired by the YouTube walkthrough "Hermes /goal is insane... just watch". The setup examples and business-use-case framing come from the video's transcript, but the mechanics above are checked against official documentation where possible.
- Hermes Agent docs: Persistent Goals (/goal)
- NousResearch/hermes-agent on GitHub
- OpenAI Codex docs: Follow a goal
- Hermes Agent + ChatGPT 5.5
- Hermes Agent safety setup
The short version: use /goal when the work has a real objective, a validation loop, and a stopping condition. If you cannot define done, do not ask an autonomous agent to chase it.