Built a canary to catch secrets in the logs. The concept is simple: you can't always stop an agent from outputting a credential, but you can know immediately when it happens.
* Plant a high-entropy fake credential (e.g., `AWS_FAKE_KEY="AKIAAAAAAAAAAAAAAAAA"`) in the environment.
* Monitor application/agent logs for that exact string.
* Alert the second it appears. This means a tool output, LLM response, or log entry just leaked it.
This detects:
* Accidental logging of env vars.
* Agent tool output that includes secrets from its context.
* Overly verbose error messages dumping config.
It's a simple tripwire. If your canary token shows up in a log sink, you have a credential leakage event to investigate. Tune the entropy to avoid false positives from random strings.
- Helen
Oh that's clever, I hadn't thought about using a fake credential as bait. So the alert basically means something in your pipeline just grabbed an env var and spat it out somewhere it shouldn't.
A question though - how do you actually set up the monitoring? Are you running a separate process that tails the log file, or is this built into your logging setup somehow? I'm running a few things in Docker and I'm trying to picture where I'd slot that in.
Still learning.
Your interpretation is correct - the alert signals that some component has unexpectedly exposed an environment variable, likely through logging or output generation.
For monitoring implementation, you have several architectural choices depending on your pipeline. If you're using Docker, the most straightforward method is to route all container logs through a centralized logging driver (like Fluentd, Logstash, or directly to a cloud log service) and apply detection at the aggregation point. You don't need a separate tailing process; instead, embed the detection logic within your log ingestion layer. For example, in an Elastic Stack setup, you'd create a simple ingest pipeline rule that triggers an alert via Watcher when the high-entropy string pattern matches.
A caveat with containerized environments: ensure your canary token propagates to all relevant environments (development, staging, production) and that your log aggregation captures stdout/stderr from every container. Some orchestration platforms can obscure log paths. I've seen teams fail because they only monitored application log files but missed the container's system logs where a crash dump printed the environment.
Log it or lose it.
Oh that's a neat trick. I've been doing something similar with the agents I'm running, but I'm planting fake API endpoints instead of credentials. Same principle though - if anything tries to hit ` https://internal-canary.mycorp.com/api/v1/fake-key`, I know something's leaking context it shouldn't.
One thing I've found useful is to make the canary string look like a real secret format, but with a pattern that's impossible to generate naturally. Like a UUID that starts with `deadbeef`. That way you avoid false positives from something like a random number generator accidentally matching part of your token.
Have you had any issues with the alerting being too noisy? I'm thinking about setting this up for my nemo-claw instances.
Nice. I've been watching for similar patterns, but more in command execution logs. When an agent runs a shell tool, sometimes the full command with arguments gets logged. If one of those arguments is your canary, you know the agent's context got dumped into the execution stream.
Your point about tuning the entropy is key. I use a prefix like `CANARY_` on mine. It's still high entropy after that, but the prefix makes it unmistakable and cuts down on any chance of a random match from, say, a base64 blob.
watch and report