Just built a linter for agent prompt files that flags dangerous patterns.

Evan Porter · 2026-06-23T12:39:08Z

Hey everyone! I just saw the announcement about the new agent automation section and I’m so hyped! I’ve been tinkering with Home Assistant automations and some basic scripting for a while, and this agent stuff feels like the next level.

Anyway, I got a bit carried away after reading through the new forum rules and the pinned best practices for writing agent prompts. I kept worrying I’d accidentally write something that could be exploited or make my agent do something stupid (or dangerous). So, I spent the weekend building a little tool!

It’s basically a linter for prompt files (`.txt` or `.md`). It scans for patterns the community guidelines flagged as risky, like:

- Unbounded loops or recursion instructions
- Direct system command execution without clear constraints
- Prompts that try to hide or obfuscate their own instructions
- Mentions of specific sensitive file paths without safeguards

It’s super basic right now, just a Python script that uses regex and some simple parsing. It’s already caught a couple of my own prompts that were... not great 😅. One of them had a line like “just keep trying until it works” which the linter flagged for potential infinite loop risk.

I know I’m still a newbie here, and this is probably full of bugs, but I was wondering: would something like this be useful to others? Maybe we could build a community version? I’m not sure where to put it or how to share it properly.
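For anyone wondering what a regex-based check along these lines might look like, here is a minimal sketch. It is not the actual script from this post: the rule names, the regexes, and the `lint_prompt` and `Finding` names are all assumptions made up for illustration, one rule per risk category listed above.

```python
import re
from dataclasses import dataclass

# Hypothetical rule set: the names and regexes below are illustrative
# assumptions, not the rules used by the linter described in the post.
RULES = [
    ("unbounded-loop",
     re.compile(r"\b(keep trying until|forever|never stop|indefinitely)\b", re.I)),
    ("unconstrained-exec",
     re.compile(r"\b(run|execute)\s+any\s+(shell\s+)?commands?\b", re.I)),
    ("hidden-instructions",
     re.compile(r"\b(do not|don't|never)\s+(reveal|mention|disclose)\s+(these|your)\s+instructions\b", re.I)),
    ("sensitive-path",
     re.compile(r"(/etc/shadow|/etc/passwd|~/\.ssh\b|\.env\b)", re.I)),
]


@dataclass
class Finding:
    line_no: int  # 1-based line number in the prompt file
    rule: str     # name of the rule that matched
    text: str     # the offending line, stripped of surrounding whitespace


def lint_prompt(text: str) -> list[Finding]:
    """Return one Finding per (line, rule) pair that matches."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES:
            if pattern.search(line):
                findings.append(Finding(line_no, rule, line.strip()))
    return findings


if __name__ == "__main__":
    sample = "Fetch the page.\nJust keep trying until it works."
    for f in lint_prompt(sample):
        print(f"{f.line_no}: [{f.rule}] {f.text}")
```

Matching line by line keeps the report easy to read, but it will miss a risky instruction that is split across two lines; that is a known limit of this sketch.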


Bella Torres

(@bella_selfhost)

Active Member

Joined: 1 week ago

Posts: 8

June 24, 2026 5:57 pm

Oh, totally. A runtime kill switch is essential, but catching that "monitor forever" at the prompt stage means you're thinking about safety from the start. I love that.

Your tiered idea for network binding is smart. I'd probably add a rule that puts any mention of "0.0.0.0" or a ":80"-style port on the critical list. My home lab's segmented, but I've still had that moment of panic when I realize a test service is listening on all interfaces, even briefly.

Making the linter catch "bind to any port" feels like a natural next step after the unbounded loops. Both are about preventing the agent from escaping its box, one in time, the other in network space.
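A rough sketch of what such a binding rule could look like, assuming a line-based regex linter. The rule names and the `find_bind_risks` helper are hypothetical, and the patterns are only a starting point that would need tuning against real prompts.

```python
import re

# Hypothetical "listen on everything" rules; names and patterns are assumptions.
BIND_RULES = [
    # IPv4 wildcard address, not embedded in a longer dotted number
    ("wildcard-address", re.compile(r"(?<![\d.])0\.0\.0\.0(?![\d.])")),
    # Phrases such as "bind to any port"
    ("any-bind", re.compile(r"\bbind\s+to\s+any\s+(port|interface|address)\b", re.I)),
    # Phrases such as "listen on all interfaces"
    ("all-interfaces", re.compile(r"\ball\s+interfaces\b", re.I)),
    # Bare ":80" style port with no host in front; the lookbehind skips
    # times like "5:57" and host:port pairs like "localhost:80"
    ("bare-port", re.compile(r"(?<![\w:]):\d{1,5}\b")),
]


def find_bind_risks(line: str) -> list[str]:
    """Return the names of all binding rules that match the given line."""
    return [name for name, pattern in BIND_RULES if pattern.search(line)]


if __name__ == "__main__":
    for text in ("Start the server on 0.0.0.0:8080", "listen on :80", "bind to any port"):
        print(text, "->", find_bind_risks(text))
```

The bare-port rule deliberately ignores ports that follow a hostname, so "localhost:80" passes; whether that is the right call depends on how strict the critical list should be.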

selfhost or die

kernel_sec_max

(@agent_hardener_pro_max)

Eminent Member

Joined: 1 week ago

Posts: 16

June 24, 2026 11:33 pm

I agree that unbounded temporal instructions are a primary risk, but the underlying failure mode is more subtle than a simple resource leak. A `tail -f` that never exits can be managed by a runtime monitor, as user479 noted. The real danger with "forever" or "continuously" is when the instruction modifies or creates state on each iteration, leading to accumulation without bound. Think "append to this log file continuously" or "check for new input and process it forever." That's a data integrity problem, not just a stuck process.

This is why, for a linter, I'd pair the temporal check with a rudimentary dataflow flag. If you see "forever" in proximity to verbs like "write," "save," "append," or "send," that's a critical finding. The same unbounded loop used only to "read" or "monitor" is a lesser, though still valid, concern.
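To make the proximity idea concrete, here is a minimal sketch. It assumes "proximity" means within a fixed window of words; the window size of 8, the severity labels, and the `classify_temporal` name are arbitrary choices for illustration, and the verb list is just the four verbs above with their common inflections.

```python
import re

# Temporal keywords and state-changing verbs; both lists are illustrative.
TEMPORAL = re.compile(r"\b(forever|continuously|indefinitely)\b", re.I)
MUTATING = re.compile(
    r"\b(write|writes|writing|save|saves|saving|"
    r"append|appends|appending|send|sends|sending)\b",
    re.I,
)
WINDOW = 8  # words inspected on each side of a temporal keyword (assumption)


def classify_temporal(text: str):
    """Return "critical", "warning", or None for a piece of prompt text.

    "critical": a temporal keyword appears near a state-changing verb.
    "warning":  a temporal keyword appears, but no such verb is nearby.
    None:       no temporal keyword found.
    """
    words = text.split()
    severity = None
    for i, word in enumerate(words):
        if TEMPORAL.search(word):
            nearby = words[max(0, i - WINDOW): i + WINDOW + 1]
            if any(MUTATING.search(w) for w in nearby):
                return "critical"
            severity = "warning"
    return severity


if __name__ == "__main__":
    print(classify_temporal("Append to this log file continuously."))
    print(classify_temporal("Monitor the queue forever."))
```

This is keyword proximity, not real dataflow analysis, so an instruction like "process it forever" only rates a warning unless the verb list is extended.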

Regarding a repo, I've found these tools are most effective when deeply integrated into a team's CI or editor, not as a standalone script. A contribution model for new patterns is excellent, but the enforcement mechanism needs to be automatic and silent for the linter to be used, not resented.

Least privilege, always.

Grace W.

(@supply_chain_grace)

Eminent Member

Joined: 1 week ago

Posts: 21

June 25, 2026 2:33 am

Good instinct to build a tool that shifts security left. But I'm curious about its supply chain. Is the linter script itself a signed artifact with an SBOM? The risk profile changes if the tool you're using to validate prompts is fetched from some random `pip install` or a curl-to-bash script.

Consider generating a simple SBOM for your linter as a next step. It'll force you to audit its own dependencies, which is a great habit.
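As a first pass before reaching for a proper SBOM format such as CycloneDX or SPDX, a plain dependency inventory can be produced with the standard library alone. The sketch below is not a standards-compliant SBOM, and the `dependency_inventory` name is made up for illustration; it simply lists every distribution installed in the environment the linter runs in.

```python
import json
from importlib import metadata


def dependency_inventory() -> list[dict]:
    """List name and version of every distribution in the current environment."""
    items = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            items.append({"name": name, "version": dist.version})
    return sorted(items, key=lambda item: item["name"].lower())


if __name__ == "__main__":
    # Print the inventory as JSON so it can be diffed between builds.
    print(json.dumps(dependency_inventory(), indent=2))
```

Running this inside a clean virtual environment that holds only the linter and its dependencies gives a much shorter and more meaningful list than running it against a system-wide Python.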

trust but verify the hash

Lei Zhang

(@api_guardian_lei)

Eminent Member

Joined: 1 week ago

Posts: 14

June 25, 2026 2:48 am

A very sharp point. If a linter's own delivery pipeline is untrusted, it becomes a vector to bypass the very controls it's meant to enforce. It's the classic trust anchor problem.

You mention an SBOM, which is a good first step for visibility. But for something like a linter that will parse potentially sensitive prompt data, I'd argue the next step is distribution as a signed, verifiable container image. That bundles the SBOM, pins dependencies, and provides an immutable artifact you can hash. A simple `pip install` from PyPI, even with pinned versions, leaves you at the mercy of repository integrity for every fetch.

This shifts the supply chain risk from "do I trust the Python ecosystem right now?" to "do I trust this one image digest and the signing key that attested to its build?". That's a much tighter boundary.
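The "trust one digest" idea can be illustrated on a single file. The sketch below is a simplification: it checks a local artifact against a pinned SHA-256 digest, and does not cover image signing or signature verification, which need dedicated tooling. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path: str, pinned_hex: str) -> bool:
    """Return True only if the file's digest equals the pinned digest."""
    # compare_digest performs a constant-time comparison of the two strings
    return hmac.compare_digest(sha256_of(path), pinned_hex.lower())
```

The pinned digest has to arrive through a channel you already trust (a signed release note, a reviewed commit); fetching it from the same place as the artifact adds nothing.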

Defense in depth for APIs.
