My results after testing secret injection with the new gRPC transport layer.

(@homelab_hardener_pete)

Hey folks,

I just spent the better part of a week testing every secret injection method I could think of against the new gRPC transport layer in OpenClaw v0.4.0-rc2. The goal was to see which patterns hold up in a homelab environment and, more importantly, which ones introduce subtle risks now that the control plane talks gRPC over TLS instead of the old REST API. I documented everything, and I've got some scripts to share.

**TL;DR:** File-based secrets mounted from a tmpfs volume are winning for me, but the gRPC layer opens a new leak path for environment variables.

Here's my setup: I'm running three Nano Claw agents in Docker Swarm mode (simulating a proper cluster). The manager has mTLS configured with client certs, and I'm using a local HashiCorp Vault dev server for the more complex patterns.

### The Big Surprise: Environment Variables Are More Leaky Now

With the old REST API, the worst case was a compromised container exposing env vars via a proc dump. Now, with gRPC's verbose error reporting and reflection enabled (I left reflection on for testing), I found that a misconfigured health check can expose environment variable names in certain error contexts. Not the *values*, only the *keys*, but that's still a reconnaissance goldmine.
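If you want to see what an attacker would learn from leaked keys alone, here's a rough audit sketch. The function name and the keyword list are my own assumptions, not anything OpenClaw ships. It prints only the names of environment variables that look secret-bearing and skips `*_FILE` variables, since those hold a path rather than the secret itself.

```bash
#!/bin/bash
# Hypothetical audit helper (not part of OpenClaw): print the NAMES, never the
# values, of environment variables whose names suggest they hold a secret.
list_secret_like_env_names() {
  awk 'BEGIN {
    for (name in ENVIRON) {
      upper = toupper(name)
      # A *_FILE variable points at a mounted secret; it is not the secret.
      if (upper ~ /_FILE$/) continue
      # Assumed keyword list; extend it to match your own naming habits.
      if (upper ~ /TOKEN|SECRET|PASSWORD|PASSWD|API_KEY|PRIVATE_KEY/) print name
    }
  }' | sort
}
```

Running `docker exec <container> bash -c 'source audit.sh; list_secret_like_env_names'` (with the function saved in a file such as `audit.sh`) gives a quick list of variables worth moving to file-based secrets.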

I've switched to using a Docker secret mounted as a file for the primary agent token. Here's the relevant snippet from my stack file:

```yaml
agent:
  image: openclaw/nano:latest
  secrets:
    - source: agent_master_token
      target: /run/secrets/agent_token
  environment:
    - TOKEN_FILE=/run/secrets/agent_token
    - CA_CERT_FILE=/run/secrets/ca_cert
  volumes:
    # tmpfs for volatile certs
    - type: tmpfs
      target: /run/secrets
      tmpfs:
        size: 1000000 # ~1MB
  command: >
    --grpc-tls-cert=/run/secrets/agent_cert
    --grpc-tls-key=/run/secrets/agent_key
```
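For anything in the container that needs the token itself rather than the path, a small entrypoint helper can resolve it from the `TOKEN_FILE` variable. This is a minimal sketch; `read_secret_file` is a name I made up, and I'm assuming a wrapper script runs before the agent (the agent may well read `TOKEN_FILE` on its own).

```bash
#!/bin/bash
# Hypothetical entrypoint helper: print the contents of the secret file whose
# path is passed as $1, e.g. the value of TOKEN_FILE.
read_secret_file() {
  if [ -z "${1:-}" ] || [ ! -r "$1" ]; then
    # Report the path only; never echo secret contents to logs.
    echo "secret file missing or unreadable: ${1:-<unset>}" >&2
    return 1
  fi
  cat "$1"
}
```

Typical use is `AGENT_TOKEN=$(read_secret_file "${TOKEN_FILE:-}")` inside the wrapper, which keeps the value in a shell variable instead of exporting it into the container environment.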

### The Winning Pattern: tmpfs + Short-Lived Certs

My current, most hardened pattern involves:
* **Docker Secrets** for the initial, long-lived bootstrap token (like for joining the cluster).
* A **tmpfs volume** (`/run/secrets`) for TLS certs and any tokens fetched from Vault post-bootstrap. This keeps them in memory and off persistent storage, even on the host (as long as the host isn't swapping tmpfs pages out to disk).
* An **init container** (a sidecar, really) that fetches short-lived certs from Vault using the bootstrap token and writes them to the shared tmpfs. The main agent container then starts.

Here's the bash snippet for the init sidecar (runs as a `docker run --rm` before the main service):

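```bash
#!/bin/bash
# fetch_certs.sh
set -euo pipefail
umask 077  # cert and key files readable only by the owner

VAULT_ADDR="http://vault:8200"
# Read the initial bootstrap token from the Docker secret
BOOTSTRAP_TOKEN=$(cat /run/secrets/bootstrap_token)

# Fetch a short-lived TLS cert and key for the gRPC layer.
# The response stays in a variable so the private key never lands in /tmp,
# and --fail makes curl exit non-zero on an HTTP error from Vault.
# Double quotes let $(hostname) expand inside the JSON payload.
CERT_JSON=$(curl --silent --show-error --fail \
  --header "X-Vault-Token: ${BOOTSTRAP_TOKEN}" \
  --request POST \
  --data "{\"common_name\": \"agent-$(hostname)\"}" \
  "${VAULT_ADDR}/v1/pki/issue/agent-role")

# Parse and write to the shared tmpfs volume
jq -r '.data.certificate' <<<"${CERT_JSON}" > /run/sharedsecrets/agent_cert
jq -r '.data.private_key' <<<"${CERT_JSON}" > /run/sharedsecrets/agent_key
```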
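One gap in the script above: redirecting with `>` truncates the target first, so an agent that starts or reloads at the wrong moment could read a half-written key. A possible fix, sketched below, is to write to a temp file in the same directory and rename it into place. `write_secret_atomically` is a hypothetical helper name, not an existing tool.

```bash
#!/bin/bash
# Hypothetical helper: write stdin to the path in $1 via a temp file in the
# same directory, then rename it into place so readers never see a partial file.
write_secret_atomically() {
  dest="$1"
  # mktemp creates the file with owner-only permissions; same directory means
  # the final mv is a rename on the same filesystem.
  tmp=$(mktemp "${dest}.XXXXXX") || return 1
  if cat > "$tmp" && [ -s "$tmp" ]; then
    mv -f "$tmp" "$dest"
  else
    rm -f "$tmp"
    echo "refusing to install empty secret at ${dest}" >&2
    return 1
  fi
}
```

To use it, pipe each `jq -r` output into `write_secret_atomically /run/sharedsecrets/agent_key` (and likewise for `agent_cert`) instead of redirecting with `>`.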

### Unsafe Patterns I'd Avoid Now

1. **Plain `environment:` in Docker Compose** for any secret retrieved from Vault after bootstrap. It's too easy for them to end up in logs via the new gRPC status messages.
2. **Long-lived gRPC TLS certificates stored in a baked image or a persistent host volume.** The gRPC layer is more sensitive to cert rotation, so you need a process for that.
3. **Using the same secret injection for the gRPC certs as for the application config.** They should be separated – a breach of one shouldn't compromise the other.
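On the rotation point in item 2, here's a minimal sketch of the check a renewal loop could run. It relies on `openssl x509 -checkend`, which exits 0 when the cert is still valid the given number of seconds from now. The function name and the 300-second default window are my assumptions, and how the agent picks up a renewed cert (restart or reload) is a separate question.

```bash
#!/bin/bash
# Hypothetical helper: return 0 ("renew now") when the cert at $1 is missing,
# unreadable, unparsable, or expires within $2 seconds (default 300).
cert_needs_renewal() {
  cert="$1"
  window="${2:-300}"
  [ -r "$cert" ] || return 0
  # -checkend exits 0 only if the cert will still be valid in $window seconds.
  if openssl x509 -checkend "$window" -noout -in "$cert" >/dev/null 2>&1; then
    return 1
  fi
  return 0
}
```

A sidecar loop could then call `cert_needs_renewal /run/sharedsecrets/agent_cert 300 && ./fetch_certs.sh` on a timer. Treating a missing or unparsable cert as "needs renewal" is deliberate, so a broken file gets replaced rather than ignored.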

I'm working on an Ansible role to automate this entire bootstrap-and-rotate flow. The gRPC layer is faster and more efficient, but it demands a tighter secret rotation strategy.

Has anyone else tested the new transport with Vault's dynamic secrets? I'm curious about your agent restart strategies when a short cert expires.

Pete


Automate the boring parts.


   