How to set up your own always-on AI agent with Hermes (install and hosting)

A practical guide to installing Hermes, an AI agent with tools, memory, and autonomy, and keeping it running 24/7 in the cloud with the official image. It covers what the harness is, which models you can use (free ones included), and how to connect it to WhatsApp or Telegram.

June 15, 2026 · 9 min read

ai · agents · hermes · self-hosting · railway

Most people use AI by opening a chat, asking something, and copying the answer. That's a search engine with better conversation. The real leap happens when the agent has tools, memory, and runs on its own, without you being present.

Hermes is an open-source agent by Nous Research that does exactly that: it runs in your terminal, on messaging platforms, and on your own infrastructure, with real access to your system. This guide takes you from zero to an agent running 24/7 in the cloud, connected to your WhatsApp or Telegram, using the official image. You don't need to be a DevOps engineer — if you can hold your own in a terminal, you're set.

💡 TL;DR: Install locally with one command to test. To make it always-on, deploy on Railway with the official nousresearch/hermes-agent image (no build needed) and a persistent volume at /opt/data. WhatsApp and Telegram connect through bridges that already ship inside the image.

The mental model: harness, not chatbot
Local installation
Which model to use (free ones included)
Why move it to the cloud
Deploying to Railway with the official image
The volume: where your agent lives
Connecting WhatsApp or Telegram
Summary — the minimum viable setup

The mental model: harness, not chatbot

Before installing anything, it's worth understanding what makes Hermes different. The language model is just the engine. What turns it into a useful agent is the harness around it: the layer that gives it hands, memory, and the ability to act without you.

- Tools: terminal, filesystem, web search, browser, code execution. The agent doesn't talk about code: it opens the repo and ships the Pull Request. It doesn't describe a file: it reads it, edits it, and saves it.
- Persistent memory: it remembers who you are, your projects, and your decisions across sessions. You don't start from scratch every time.
- Skills: reusable procedures the agent learns once and reapplies, such as reviewing a PR, publishing a post, or deploying to production.
- Autonomy (cron): scheduled tasks that run without you being present, such as a summary every morning, a price watcher, or an email classifier.
- Multi-platform gateway: the same agent, with the same memory, on WhatsApp, Telegram, Discord, or your terminal.

That's what the harness is for: it turns a model that answers into a system that does. And that's why hosting it in the cloud makes sense — so that system is available 24/7, not just when your laptop is open.

Local installation

Always start local to test. The install is one command:

bash

curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash

Then you configure the model and API keys with the interactive wizard:

bash

hermes setup          # guided wizard: model, terminal, gateway, tools
hermes model          # pick or change model/provider
hermes doctor         # check dependencies and config

For a first interactive test:

bash

hermes                                            # interactive chat
hermes chat -q "summarize this repo in 5 bullets" # single query

The configuration lives in simple files inside ~/.hermes/:

| File | Contents |
| --- | --- |
| `config.yaml` | Settings: model, toolsets, approvals, compression |
| `.env` | Secrets: API keys |
| `state.db` | Sessions + memory (SQLite) |
| `skills/` | Installed skills |
| `auth.json` | OAuth tokens and credential pools |

Keep this location in mind: when you move to the cloud, this — not the source code — is what persists on the volume.
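To make the table concrete, here is a small sketch that reports which of those entries exist in a given directory. The function name `hermes_state_report` is made up for this example and is not part of the Hermes CLI; it only checks for the paths listed in the table above.

```shell
#!/usr/bin/env bash
# Illustrative helper (not a Hermes command): report which state entries
# from the table exist under a Hermes home directory.
hermes_state_report() {
  local home_dir="${1:-$HOME/.hermes}"   # defaults to the local install
  local entry
  for entry in config.yaml .env state.db skills auth.json; do
    if [ -e "$home_dir/$entry" ]; then
      printf 'found    %s\n' "$entry"
    else
      printf 'missing  %s\n' "$entry"
    fi
  done
}
```

Run `hermes_state_report` after the wizard to see what it created, or `hermes_state_report /opt/data` inside the container later on to confirm the volume holds the same files.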

Which model to use (free ones included)

Hermes is provider-agnostic: you pick the engine, and you can change it anytime with hermes model. The best part is you're not locked to one — you can stack several into a pool and let Hermes rotate between them.

A few options worth knowing for getting started:

OpenAI with your ChatGPT plan ($20/mo). If you already pay for ChatGPT Plus, connect it via the OpenAI Codex provider (OAuth login with your account, uses the Codex models). It gives you very generous limits without paying per token separately — one of the most cost-effective paths for a personal agent.
NVIDIA — free models. NVIDIA's Nemotron models are available via build.nvidia.com with a free NVIDIA_API_KEY. Open frontier models, no cost, ideal for not burning budget on background tasks.
OpenRouter. A single API key that routes to hundreds of models (Claude, Gemini, DeepSeek, Qwen…). Perfect when you want to try several without opening an account with each provider. It's Hermes' default.
Kimi / Moonshot. Very capable chat and coding models, with a KIMI_API_KEY.
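Ollama. If you want to run local models on your own machine (full privacy, zero API cost), Hermes connects to Ollama like any other provider. Just give it at least 64K of context, for example by setting OLLAMA_CONTEXT_LENGTH=65536 on the Ollama server (-c 65536 is the equivalent flag if you serve the model with llama.cpp).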

✅ All of these can be used as a pool. Hermes supports a fallback chain: you define a primary model and a list of backups, and when one runs out of quota or fails, the agent switches to the next mid-session, without losing the conversation. So you can combine, for example, a free Nemotron as primary, your ChatGPT plan as backup, and OpenRouter as the final safety net.

yaml

# ~/.hermes/config.yaml — a model pool with backups
fallback_providers:
  - provider: nvidia
    model: nvidia/nemotron-3-ultra
  - provider: openai-codex
    model: gpt-5-codex
  - provider: openrouter
    model: anthropic/claude-sonnet-4

The only requirement: the model needs at least 64,000 tokens of context. Almost all hosted ones (Claude, GPT, Gemini, Qwen, DeepSeek) clear that easily.
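If the fallback chain feels abstract, the idea can be sketched in a few lines of shell: try each candidate in order and stop at the first one that succeeds. This is only a conceptual illustration, not how Hermes implements it; `run_with_fallback` and the three stand-in "providers" are hypothetical names invented for the example.

```shell
#!/usr/bin/env bash
# Conceptual sketch of a fallback chain (not Hermes' actual code):
# run each candidate command in order and stop at the first success.
run_with_fallback() {
  local candidate
  for candidate in "$@"; do
    if "$candidate"; then
      printf 'served by: %s\n' "$candidate"
      return 0
    fi
    printf 'failed: %s, trying next\n' "$candidate" >&2
  done
  printf 'all candidates failed\n' >&2
  return 1
}

# Hypothetical stand-ins for providers: the first one is out of quota.
primary_free_tier() { return 1; }
backup_plan()       { return 0; }
last_resort()       { return 0; }

run_with_fallback primary_free_tier backup_plan last_resort
```

With these stand-ins, the failure of `primary_free_tier` is logged to stderr and the request is served by `backup_plan`; `last_resort` is never tried. The order of the list is the whole policy, which is why it makes sense to put the cheapest option first.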

Why move it to the cloud

Local works, but it has one limit: the agent only exists while your machine is on. If you want scheduled tasks to run overnight, the bot to always answer, and the agent to survive reboots and laptop lids closing, you need an always-on host.

⚠️ Always-on hosting costs money. There's no permanent free tier: on Railway you need the Hobby plan (~$5/mo + usage) for a persistent service with a volume. Better to know upfront.

I chose Railway because it's about as simple as it gets: it deploys a Docker image with a persistent volume, no VPS to administer by hand. And since Hermes already publishes an official image, you don't have to build anything.

Deploying to Railway with the official image

The key to this setup: the official nousresearch/hermes-agent image already ships everything — Hermes, the WhatsApp and Telegram bridges, and s6-overlay supervision (auto-restart if something crashes). You don't write a Dockerfile; you deploy the image as-is.

The simplest path is through the Railway dashboard:

1. New Project → Deploy from Docker Image.
2. Image: nousresearch/hermes-agent:latest.
3. Add Volume, mount path: /opt/data (this is where the agent's state lives).
4. Under Variables, add your provider API key and whichever channels you want, for example:
   - `ANTHROPIC_API_KEY=***` (or `OPENROUTER_API_KEY`, `NVIDIA_API_KEY`, etc.)
   - `TELEGRAM_BOT_TOKEN=***` to connect Telegram
5. Deploy and open the logs.

If you prefer the terminal, the Railway CLI does the same:

bash

railway init                                  # name, e.g. hermes-gateway
railway add --service hermes \
  --image nousresearch/hermes-agent:latest    # ← official image, no build
railway volume add --mount-path /opt/data     # persistent volume
railway variables --set "ANTHROPIC_API_KEY=***" \
                  --set "TELEGRAM_BOT_TOKEN=***"
railway up                                    # deploy
railway logs                                  # wait for "connected"

Once it's up, configure the model by SSH-ing into the container and running the wizard once (it's saved to the volume):

bash

railway ssh
hermes setup     # pick provider/model; it's written to /opt/data

From there, s6-overlay keeps the process alive: if the gateway crashes, it restarts on its own.

The volume: where your agent lives

The volume mounted at /opt/data is the container's ~/.hermes: that's where your config.yaml, your state.db (memory + sessions), your skills, and the messaging session live. It survives redeploys and restarts, so your agent doesn't lose its memory every time you update.

If you're starting from scratch, you don't have to migrate anything: configure directly in the container and you're done.

💡 Already had a local Hermes with memory and skills you want to keep? Pack only the state (not the source) and copy it into the volume after first boot:

bash

# on your local machine
cd ~/.hermes
tar -czf /tmp/hermes-state.tgz \
  --exclude='./logs' --exclude='./cache' \
  config.yaml .env state.db auth.json skills memories cron
# upload and extract it into the volume via: railway ssh

Hermes' source code (2+ GB) is already in the image — that never gets migrated.
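One caveat with the tar command above: tar reports an error if any listed entry does not exist, for example on an install that has no memories or cron directory yet. A minimal sketch of a more forgiving variant follows; it packs only the entries that are actually present. The function name `pack_hermes_state` is hypothetical, and the entry list is simply the one from the command above.

```shell
#!/usr/bin/env bash
# Illustrative helper (not a Hermes command): archive only the state
# entries that exist, so a missing memories/ or cron/ is not an error.
pack_hermes_state() {
  local src="${1:-$HOME/.hermes}"
  local out="${2:-/tmp/hermes-state.tgz}"   # use an absolute path here
  local entry
  (
    cd "$src" || exit 1
    set --                                   # start with an empty file list
    for entry in config.yaml .env state.db auth.json skills memories cron; do
      if [ -e "$entry" ]; then
        set -- "$@" "$entry"
      fi
    done
    if [ "$#" -eq 0 ]; then
      echo "nothing to pack in $src" >&2
      exit 1
    fi
    tar -czf "$out" "$@"
  )
}
```

After running `pack_hermes_state`, `tar -tzf /tmp/hermes-state.tgz` lists what went into the archive, which is a cheap way to confirm that your secrets and state.db are in there before you upload anything.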

Connecting WhatsApp or Telegram

This is where the agent gets into your pocket. Both bridges already live inside the official image, so you don't install anything extra — you just enable the channel.

Telegram is the lowest-friction path: create a bot with @BotFather, copy the token, and set it as a variable. No QR, no re-pairing — the token is the whole authentication.

bash

railway variables --set "TELEGRAM_BOT_TOKEN=123456:ABC..."
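Since a mistyped token only shows up later as a bot that never answers, a quick local check before setting the variable can save a redeploy. The sketch below only tests the general shape of a token (digits, a colon, then letters, digits, underscores, or hyphens). That pattern is an assumption based on how BotFather tokens usually look, and `looks_like_bot_token` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Loose shape check for a Telegram bot token (assumed format:
# numeric bot id, a colon, then a URL-safe secret). It does not
# prove the token is valid, only that it was pasted in one piece.
looks_like_bot_token() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+:[A-Za-z0-9_-]+$'
}
```

For a real validation, the Telegram Bot API's getMe method answers with the bot's identity when the token is good: `curl -fsS "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getMe"`.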

WhatsApp lives closer to your day-to-day, but takes one extra step. You enable it with two variables:

bash

railway variables --set "WHATSAPP_ENABLED=true" --set "WHATSAPP_MODE=self-chat"

The self-chat mode lets you talk to the agent from your own chat. The first time, WhatsApp Web may ask to link a device: the QR shows up in railway logs, you scan it with your phone (Linked devices), and it's connected.

✅ If you want to start with the minimum, begin with Telegram (token and done) and add WhatsApp later. The agent is the same on both, and they share memory.

Summary — the minimum viable setup

If you're starting from scratch, this is the order:

1. Install locally: curl … | bash, hermes setup, test interactively.
2. Pick your model: a plan you already pay for (ChatGPT $20), a free one (NVIDIA's Nemotron), or a pool with fallback across several.
3. Deploy on Railway: official nousresearch/hermes-agent image, volume at /opt/data, variables with your API key.
4. Configure in the container: railway ssh → hermes setup, once.
5. Connect a channel: Telegram (token) or WhatsApp (QR from the logs).

The result: an agent with tools, memory, and autonomy, running 24/7, that you talk to over WhatsApp or Telegram like it's another person. It's not a chatbot. It's a harness with hands, and it lives in the cloud.
