In Part 1, I covered what Hermes Agent is, why I chose ChatGPT 5.5 as the brain, and how the self-improving skill loop works. That post was about the "what" and the "why." This one is about the practical side: how to actually run an autonomous AI agent on a rented server without exposing yourself to unnecessary risk.
Running a self-improving agent on a remote VM raises real questions. What happens if the agent starts doing something unexpected? How do you stop it from accessing files it should not touch? What does Docker actually protect you from, and what does it not? And how much does this cost?
This post answers all of those questions with specifics. No hand-waving about "just use Docker." I will walk through exactly what Docker does to contain Hermes Agent, what risks remain even with full containerisation, which hosting providers make sense, and the exact Hetzner CPX22 setup I am building.
What Hermes Agent actually does on the box
Before talking about safety and hosting, it helps to understand what Hermes Agent actually does at the system level. It is not a model. It does not run inference. It is an orchestrator.
Hermes Agent sits on your server as a Node.js process. When you give it a task (via Telegram, a cron schedule, or the web UI), it breaks the task into steps, sends API calls to your chosen LLM provider (OpenAI, Anthropic, Google, etc.), processes the responses, and executes actions based on them. The heavy computation happens on the LLM provider's infrastructure, not on your server.
This means the hardware requirements are modest. No GPU. No high-memory instances. The server needs enough CPU to run the orchestration logic, enough RAM to hold the agent's context and skill files, and a stable network connection for the API calls. A 2-core VPS with 4 GB of RAM handles this comfortably.
The agent also writes to disk: skill files, session logs, checkpoint data, and the memory system that powers the self-improving loop. These files are where the agent stores what it learns. They persist across sessions and accumulate over time. Understanding this is important for the safety discussion that follows.
The four safety questions (and honest answers)
When you give an autonomous AI agent access to a server, four questions matter. I will go through each one honestly, including where the answers are not reassuring.
1. Data privacy and leakage
Every task you give Hermes Agent gets sent to your chosen LLM provider as an API call. If you use GPT 5.5, your prompts and task data go to OpenAI's servers. If you use Claude, they go to Anthropic. The agent itself runs locally, but the intelligence layer is remote.
This means anything the agent processes is subject to your LLM provider's data handling policies. OpenAI's API data is not used for training by default (as of their current policy), but the data still transits their infrastructure. If you are processing client data, sensitive business information, or anything regulated, this matters.
Mitigated
- Agent runs locally, data stays on your server between API calls
- Skill files and memory are stored on your machine, not in the cloud
- You choose which LLM provider handles which data
- Docker isolation prevents the agent from accessing unrelated server files
Remaining risk
- All task content is sent to a third-party LLM provider
- Traffic to the LLM API is TLS-encrypted in transit, but the provider still processes your prompts in plaintext (no end-to-end encryption)
- Skill files could contain sensitive data extracted from previous tasks
- If you use the Hermes Portal ($20/mo cloud plan), additional data leaves your server
2. Agent doing harmful actions
An autonomous agent that can execute code, make API calls, and interact with external services can, in theory, do harmful things. Delete files, send messages you did not authorise, make purchases, or interact with services in ways you did not intend.
Hermes Agent mitigates this through the Docker sandbox. When the Docker backend is active, the agent runs inside a container with restricted permissions. It cannot access the host filesystem, cannot spawn arbitrary processes on the host, and operates under Linux capability restrictions. But it can still make outbound network requests, which is a meaningful gap I will cover in the limitations section.
3. Model alignment and jailbreaks
The agent's behaviour is ultimately governed by the LLM it calls. If someone crafts an input that jailbreaks the model, the agent will execute the jailbroken instructions. This is not a Hermes-specific risk. It applies to every agent framework that delegates reasoning to an external model.
The practical concern is prompt injection through the agent's own memory. If the agent processes a document that contains hidden instructions, those instructions could get stored in a skill file and influence future behaviour. The agent's self-improving loop (its best feature) is also its most subtle attack surface.
4. Operational and infrastructure security
Standard VPS security applies. SSH key access only, firewall rules, automatic security updates, fail2ban. The agent adds one new surface: the API credentials stored on the server. If someone compromises the VPS, they get your OpenAI/Anthropic API keys and can rack up charges or access your accounts.
Docker helps here by keeping credentials as read-only mounted volumes that the agent can read but not modify or exfiltrate through normal operations. But a full server compromise bypasses Docker entirely. I also isolate accounts at the identity level: one email, one card, one kill switch. I cover the full setup below.
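For reference, here is a minimal sketch of that baseline on Ubuntu 24.04 (the image I use later in this post). Commands assume root and stock Ubuntu package and service names; adjust for your distribution.

```bash
# Baseline VPS hardening sketch for Ubuntu 24.04. Run as root (or prefix with sudo).
apt-get update && apt-get install -y ufw fail2ban unattended-upgrades

# Firewall: deny all inbound except SSH, allow all outbound
ufw default deny incoming
ufw default allow outgoing
ufw allow OpenSSH
ufw --force enable

# Automatic security updates
dpkg-reconfigure -f noninteractive unattended-upgrades

# SSH: key authentication only, no password logins
sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin prohibit-password/' /etc/ssh/sshd_config
systemctl restart ssh

# fail2ban ships with a default sshd jail; enabling the service is enough
systemctl enable --now fail2ban
```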
How Docker contains Hermes Agent
Hermes Agent supports seven sandbox backends: local (no sandbox), Docker, SSH, Daytona, Modal, Singularity, and Vercel Sandbox. Docker is the recommended option for self-hosting because it provides strong process isolation with minimal overhead.
Here is what Docker actually does when you run Hermes Agent with the Docker backend:
Read-only root filesystem. The container's root filesystem is mounted read-only. The agent cannot modify system binaries, install packages, or alter the container's base configuration. It can only write to explicitly mounted volumes (the workspace and skill directories).
Dropped Linux capabilities and namespace isolation. Docker drops most Linux capabilities by default. The agent cannot mount filesystems, load kernel modules, change network configuration, or perform other privileged operations. It runs in its own PID, network, and mount namespace, isolated from the host and from other containers.
PID limits. Docker enforces process limits per container. If the agent tries to fork-bomb or spawn excessive child processes, the container hits the PID ceiling and the kernel blocks further process creation. This prevents a runaway agent from consuming all system resources.
Filesystem checkpoints and rollback. Hermes Agent's Docker integration supports filesystem checkpoints. Before executing a potentially destructive action, the system can snapshot the current state. If something goes wrong, you roll back to the checkpoint. This is not automatic by default, but the infrastructure supports it.
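The built-in mechanism lives inside the sandbox, but a crude manual equivalent on the host is also worth having. This is my own sketch, not a Hermes feature: a timestamped archive of the mounted workspace and skills directories (the host-side paths from the docker run command below), taken before letting the agent loose on anything risky.

```bash
# Manual snapshot of the agent's writable state before a risky run.
# Assumes ./workspace and ./skills are the directories mounted into the container.
STAMP=$(date +%Y%m%d-%H%M%S)
mkdir -p ~/hermes-checkpoints
tar czf ~/hermes-checkpoints/state-"$STAMP".tar.gz workspace/ skills/

# To roll back: stop the container, restore the archive, start it again.
#   docker stop hermes-agent
#   tar xzf ~/hermes-checkpoints/state-<stamp>.tar.gz
#   docker start hermes-agent
```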
Credential files as read-only volumes. Your API keys (OpenAI, Anthropic, etc.) are mounted into the container as read-only volumes. The agent can read them to make API calls, but cannot modify, copy, or move them. This reduces the risk of credential exfiltration through agent actions (though not through network requests).
Skills Guard. Hermes Agent includes a Skills Guard feature that scans newly created skills for suspicious patterns, such as attempts to access environment variables, read credential files, or make unexpected network calls. It is not foolproof, but it adds a layer of review before a skill enters the persistent library.
A typical Docker run command for Hermes Agent looks something like this:
```bash
# Bind mounts use absolute paths ($(pwd)/...) because older Docker versions
# reject relative source paths in -v.
docker run -d \
  --name hermes-agent \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid \
  --pids-limit 256 \
  --memory 2g \
  --cpus 1.5 \
  -v "$(pwd)/workspace:/app/workspace" \
  -v "$(pwd)/skills:/app/skills" \
  -v "$(pwd)/credentials:/app/credentials:ro" \
  --restart unless-stopped \
  hermes-agent:latest
```
The flags enforce read-only root, a writable but non-executable /tmp, PID limits, memory caps, CPU limits, and read-only credential mounts. This is a solid baseline for self-hosting.
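Before trusting the sandbox, it is worth verifying that the flags actually took effect. docker inspect reports the relevant fields directly (container name as in the command above):

```bash
docker inspect --format '{{.HostConfig.ReadonlyRootfs}}' hermes-agent   # expect: true
docker inspect --format '{{.HostConfig.PidsLimit}}' hermes-agent        # expect: 256
docker inspect --format '{{json .Mounts}}' hermes-agent | python3 -m json.tool
```

The last command lists every mount; the credentials volume should show "RW": false.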
What Docker does NOT protect you from
Docker is good isolation, but it is not a security boundary in the same way a VM is. Three gaps matter for agent workloads:
Docker does not filter outbound network traffic by default. The agent can make arbitrary HTTP requests to any destination. It could exfiltrate data, interact with external APIs you did not authorise, or send messages to services. You need iptables rules or a network proxy to restrict outbound access to only your LLM provider endpoints.
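Here is a hedged sketch of the iptables route, using Docker's DOCKER-USER chain on the default bridge network. The catch is that API endpoint IPs rotate, so the allow rules need periodic refreshing (a domain-aware egress proxy is more robust long-term); treat this as a starting point, not a finished solution.

```bash
# Restrict outbound traffic from containers on the default docker0 bridge.
# Requires dig (package dnsutils). Run as root after Docker has created the chain.
SUBNET="172.17.0.0/16"   # default bridge subnet; adjust if you use a custom network

# Default-deny everything leaving the container subnet...
iptables -I DOCKER-USER -s "$SUBNET" -j DROP

# ...then insert allows above the drop (-I prepends, so these match first).
iptables -I DOCKER-USER -s "$SUBNET" -p udp --dport 53 -j ACCEPT   # DNS
iptables -I DOCKER-USER -s "$SUBNET" -p tcp --dport 53 -j ACCEPT

# Allow HTTPS only to the currently resolved API endpoints (re-run when IPs change).
for host in api.openai.com api.anthropic.com api.telegram.org; do
  for ip in $(dig +short "$host" | grep -E '^[0-9.]+$'); do
    iptables -I DOCKER-USER -s "$SUBNET" -d "$ip" -p tcp --dport 443 -j ACCEPT
  done
done
```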
The agent's skill files and memory persist across sessions. If a malicious input gets processed and stored as a skill, it can influence the agent's behaviour in all future sessions. Docker isolates processes, but it does not inspect what the agent writes to its own workspace. This is the most subtle attack surface.
If you run Hermes Agent with the "local" sandbox backend instead of Docker, there is no isolation at all. The agent runs as your user, with full access to your filesystem, network, and environment variables. Only use the local backend for testing on a machine with no sensitive data.
Hosting comparison
You do not need much hardware to run Hermes Agent. The table below compares five realistic options for a solo operator or small team. All prices are as of May 2026.
| Provider | Plan | vCPU | RAM | Storage | Price | Notes |
|---|---|---|---|---|---|---|
| Hetzner CPX22 | Cloud VPS (AMD) | 2 | 4 GB | 80 GB NVMe | €8.49/mo max | Best value. Hourly billing capped. EU data centres. My pick. |
| Hetzner CPX31 | Cloud VPS (AMD) | 4 | 8 GB | 160 GB NVMe | €15.49/mo max | Growth path. Same value tier, double the resources. |
| Contabo | Cloud VPS S | 4 | 8 GB | 200 GB NVMe | €7.49/mo | Cheapest on paper. Variable network quality. No hourly billing. |
| DigitalOcean | Basic Droplet | 2 | 4 GB | 80 GB SSD | $24/mo | Good UX and docs. 2.8x the price of Hetzner for same specs. |
| Vultr / Linode | Cloud Compute | 2 | 4 GB | 80 GB NVMe | $24/mo | Comparable to DigitalOcean. Good if you need US/APAC regions. |
The Hetzner CPX series uses AMD EPYC processors with dedicated vCPU allocation (not shared burstable). It includes 20 TB of traffic per month, which is far more than an agent workload generates. The hourly billing model means you pay only for time the server is running, capped at the monthly maximum.
My specific setup
Why Hetzner CPX22 to start
I am starting with the CPX22 for a simple reason: it is the minimum viable server for the workload, and the cost is trivial. At a maximum of €8.49 per month, the financial risk of experimenting is effectively zero. If the agent needs more resources, I can resize to the CPX31 in a few clicks without rebuilding anything.
The specs are sufficient. 2 AMD EPYC vCPUs handle the Node.js orchestration with headroom. 4 GB of RAM is comfortable for Docker, the agent process, and a few hundred skill files. 80 GB of NVMe storage is generous for an agent that primarily stores text files. The bottleneck for agent workloads is API latency, not local compute.
Growth path: CPX22 to CPX31
The trigger for upgrading is straightforward: if the agent consistently uses more than 3 GB of RAM (leaving less than 1 GB for the OS and Docker), or if CPU utilisation stays above 80% during sustained task execution, I move to the CPX31. That gives 4 vCPUs, 8 GB RAM, and 160 GB storage for €15.49 per month. Same architecture, same data centre, just more resources.
Beyond the CPX31, the next step would be running multiple agents on a single larger instance or splitting agents across dedicated servers. But that is a scaling problem for later. Start small, measure, then grow.
Full stack: Docker backend + API + Telegram
Here is the full stack I am deploying:
- Server: Hetzner CPX22, Ubuntu 24.04 LTS, Docker CE
- Agent: Hermes Agent (latest stable), Docker sandbox backend
- Primary model: ChatGPT 5.5 via OpenAI API (research, scheduling, content tasks)
- Secondary model: Claude via Anthropic API (complex reasoning tasks, fallback)
- Interface: Telegram bot for mobile access and task submission
- Monitoring: Docker logs + simple uptime check via cron (see the health-check sketch below)
- Security: SSH key only, UFW firewall, unattended-upgrades, fail2ban
No web dashboard, no public-facing ports except SSH. The only inbound access is my SSH key. The only outbound access is to OpenAI, Anthropic, and Telegram APIs. This is deliberately minimal. The fewer exposed surfaces, the fewer things that can go wrong.
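The monitoring piece is just as minimal. A sketch of the cron health check, assuming the container name from the Docker command earlier; it logs the container state and memory usage so resource trends are visible without running a monitoring stack:

```bash
#!/usr/bin/env bash
# check-hermes.sh - run from cron, e.g.: */5 * * * * /usr/local/bin/check-hermes.sh
set -eu

LOG="$HOME/hermes-check.log"
STATUS=$(docker inspect --format '{{.State.Status}}' hermes-agent 2>/dev/null || echo "missing")

if [ "$STATUS" != "running" ]; then
  # Swap this echo for a Telegram sendMessage call or mail if you want real alerts.
  echo "$(date -Is) ALERT hermes-agent not running (status: $STATUS)" >> "$LOG"
  exit 0
fi

# Memory usage reads like "512MiB / 2GiB"; the first field is the container's usage.
MEM=$(docker stats --no-stream --format '{{.MemUsage}}' hermes-agent | awk '{print $1}')
echo "$(date -Is) ok mem=$MEM" >> "$LOG"
```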
Account isolation: one email, one card, one kill switch
If the VPS is compromised, every API key on that server is exposed. If those keys belong to your personal OpenAI account, your personal GitHub, your personal Anthropic billing, one breach cascades everywhere. The fix is a dedicated identity for the agent. One email, one card, one set of accounts. If something goes wrong, you disable that one identity and nothing personal is touched.
Start with a dedicated email used solely for the agent. If you have your own domain, use hermes@yourdomain.com or agent@yourdomain.com; if not, a fresh ProtonMail or Gmail address works. This email is used only to register every service the agent touches. It should not be the email you use for client work, personal logins, or anything else.
Accounts to create under the agent email:
- OpenAI for GPT 5.5 API access (primary model)
- Anthropic for Claude API (secondary/fallback)
- GitHub dedicated bot account for any repos the agent pushes to or pulls from. Machine/bot accounts for CI/CD are standard practice. GitHub's terms allow them as long as a human is responsible.
- Vercel if the agent deploys frontends or serverless functions (not needed initially, but noted for later)
- Any future SaaS the agent integrates with (Supabase, Cloudflare, etc.)
Why does Hetzner stay on your personal account? Because you SSH into the VPS yourself. You manage billing, snapshots, firewall rules. The agent never logs into Hetzner's console. What the agent needs are the API keys for services it calls, not the hosting dashboard. For a solo operator, personal Hetzner access with agent-only API keys on the server is the right balance.
Telegram is already isolated. The BotFather flow creates a standalone bot token. It does not share your personal Telegram chat history or contacts. No change needed there.
Attach a dedicated payment method to the agent accounts: a virtual card from your bank, a prepaid card, or a secondary card with a low limit. Set hard monthly billing caps on every API provider (OpenAI and Anthropic both support this in their dashboards). If the agent enters a loop, the cap stops the bleeding automatically. And if something goes seriously wrong, the kill switch:
- Freeze the virtual card (stops all billing instantly)
- Revoke the API keys from the agent email's accounts (OpenAI, Anthropic, GitHub tokens)
- Power off the Hetzner VPS from your personal Hetzner account
Three steps. Everything the agent could access is dead. Your personal accounts, your personal email, your personal GitHub, your personal billing are all untouched.
What should NOT be on the server:
- Your personal email credentials
- Your personal GitHub tokens (with client repos, private projects)
- Your personal OpenAI/Anthropic accounts (with conversation history, billing for other work)
- SSH keys for other servers
- Any client data not actively needed by the agent
This takes 30 minutes to set up and removes an entire category of risk.
Other options worth considering
A Hetzner VPS is not the only way to run Hermes Agent. Depending on your situation, one of these alternatives might be better.
Modal serverless. Hermes Agent supports Modal as a sandbox backend. Modal runs your code in isolated containers on their infrastructure, billed per second of compute. No server to maintain, no Docker to configure. The trade-off is higher per-minute cost and less control over the environment. Good for intermittent workloads where you do not need 24/7 availability.
Your own machine. If you have a desktop or spare laptop that stays on, you can run Hermes Agent locally with the Docker backend. No hosting cost at all. The downside is reliability: power outages, sleep mode, and network interruptions all stop the agent. Fine for experimentation, less ideal for scheduled production tasks.
Termux on Android. The community has documented running Hermes Agent on Android via Termux. This is a creative solution for mobile-first users, but the performance and reliability constraints make it impractical for anything beyond demos. Interesting proof of concept, not a production setup.
Skip self-hosting entirely. The Hermes Portal plan ($20/month) runs the agent on Nous Research's cloud infrastructure. You trade control and privacy for zero operational overhead. If you do not want to manage a server and your data sensitivity is low, this is the simplest option.
Frequently asked questions
Can Hermes Agent access my server files outside its Docker container?
Not by default. When you run Hermes Agent with the Docker sandbox backend, the agent operates inside a container with a read-only root filesystem. It can only access files explicitly mounted as volumes. The agent cannot see or modify anything on the host machine unless you grant that access.
Does Hermes Agent need a GPU to run?
No. Hermes Agent is an orchestrator, not a model. It sends API calls to external LLM providers (OpenAI, Anthropic, etc.) and processes the responses. It runs on CPU-only servers. A 2-core VPS with 4 GB of RAM is enough for most workloads.
How much does it cost to run Hermes Agent on Hetzner?
A Hetzner CPX22 costs a maximum of €8.49 per month (hourly billing, capped). That gets you 2 AMD EPYC vCPUs, 4 GB RAM, 80 GB NVMe, and 20 TB of traffic. You also pay for the LLM API calls the agent makes (OpenAI, Anthropic, etc.), which vary based on usage.
Can Hermes Agent sessions access each other's data?
No, if you use the Docker sandbox backend. Each session runs in its own isolated container with its own filesystem and process namespace. One session cannot see or interact with another session's container.
What are the biggest remaining risks of self-hosting Hermes Agent?
The three main risks are: outbound network access (the agent can make arbitrary HTTP requests from inside Docker), prompt injection via memory files (an attacker could craft inputs that persist in the agent's skill files and influence future behaviour), and the cost of runaway API calls if the agent enters a loop. Docker handles process isolation well, but it does not filter what the agent says to the internet.
Should I use my personal email and credit card for agent API accounts?
No. Create a dedicated email (e.g. agent@yourdomain.com) and register all agent-facing accounts under it: OpenAI, Anthropic, GitHub, and any other services the agent uses. Attach a low-limit or virtual card with hard billing caps. If the server is compromised, you freeze one card and revoke one set of keys. Your personal accounts stay untouched.
Sources and credits
- Part 1: Hermes Agent + ChatGPT 5.5: Why I'm Building My Next AI Employee on This Stack (JQ AI SYSTEMS)
- Hermes Agent GitHub repository (104k+ stars, 14.9k forks as of May 2026)
- Hermes Agent official site (Nous Research)
- Docker Engine security documentation
- Hetzner Cloud pricing and specs
- Setup tutorial by @imranye (step-by-step walkthrough)
- What Is an AI Agent? (JQ AI SYSTEMS)
- Custom AI Systems vs Zapier and Make.com (JQ AI SYSTEMS)