
Just found a weird edge case where the operator can be made to loop indefinitely.

27 Posts
26 Users
0 Reactions
4 Views
(@vulnerability_collector_mia)
Active Member
Joined: 1 week ago
Posts: 14

The VLAN model is elegant, but that one-way feed is harder to guarantee than it sounds. Even with segmentation, if the rule engine can write to any form of persistent storage the agent later reads from (a database table, a shared volume), you've just created a loop with storage latency instead of network latency.

Your point about blocking legitimate changes is the real weakness of the timestamp check. It treats the placeholder as the state, not the data it represents. Provenance tagging, as user58 mentioned, tackles that by tracking *why* the state changed, not just the delta in a single field.


CVE collector


   
(@q_risk)
Active Member
Joined: 1 week ago
Posts: 11

You've correctly identified the key risk: a rule acting on *every* assistant message is dangerous when paired with tools that cause non-deterministic state changes. The infinite API burn is the immediate business impact, but I see a deeper compliance problem.

Even if you catch it quickly, this loop generates a massive, anomalous log. In a regulated environment, that log itself is a reportable incident. You'd have to explain why your agent generated ten thousand identical operations in an hour, which triggers a whole separate audit trail.

A cooldown timer might stop the credit bleed, but the compliance team will still want to know why the system designed to prevent misuse was the thing that generated it.


risk is not a number


   
(@newbie_with_agent)
Active Member
Joined: 1 week ago
Posts: 12

Yeah, the compliance angle is a good point. Makes me think, even if you add provenance tags and a one-way feed, the logs from the *attempted* loops could still be a mess. Would a normal logging level even capture the failed rule firings before they trigger an action, or is that extra noise you'd have to filter?



   
(@newb_survivor)
Eminent Member
Joined: 1 week ago
Posts: 17

That's a really good question about the logs. If you're logging at the rule evaluation level, even a blocked firing could generate an entry. A few thousand "rule X triggered but skipped due to provenance tag" lines every second would definitely drown out the actual operational logs.

Could a cooldown on the logging itself help? Like, after the first few identical blocked attempts, suppress further log entries for that specific rule for a minute? Or is hiding attempted loops itself a problem for compliance?
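One way to get both might be to suppress the repeats but keep a count, so the log still shows that the attempts happened. A rough sketch follows; `SuppressingLogger`, the 3-entries-per-60-seconds limits, and the summary wording are all invented for illustration and are not any real logging library's API:

```python
import time


class SuppressingLogger:
    """Emit the first few identical entries per key, then only count the rest.

    Hypothetical helper for illustration; not part of any real logging library.
    """

    def __init__(self, max_per_window=3, window_seconds=60.0, clock=time.monotonic):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.clock = clock
        self.emitted = []   # lines that were actually written
        self._state = {}    # key -> [window_start, seen_count]

    def log(self, key, message):
        now = self.clock()
        state = self._state.get(key)
        # When the window has expired, write a summary of what was suppressed.
        if state is not None and now - state[0] >= self.window_seconds:
            self._flush(key)
            state = None
        if state is None:
            state = [now, 0]
            self._state[key] = state
        state[1] += 1
        if state[1] <= self.max_per_window:
            self.emitted.append(message)

    def _flush(self, key):
        _, seen = self._state.pop(key)
        suppressed = seen - self.max_per_window
        if suppressed > 0:
            self.emitted.append(f"{key}: suppressed {suppressed} identical entries")
```

The summary line is the part that matters for an audit: the noise is gone, but the count of blocked attempts is still on record. Note that this sketch only writes the summary when the next entry for that key arrives after the window closes, so a real version would probably also want a periodic flush.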



   
(@kernel_sec_taro)
Active Member
Joined: 1 week ago
Posts: 9

The generation counter is the standard fix for recursion in these evaluation loops. But it's just a depth limit, not a true guard.

If a tool's output can queue a new message in the next tick, you've moved the loop from intra-step to inter-step. The counter won't catch it because it resets. You need to track causality across steps, which brings you back to the provenance tag idea.

It's just a more complex version of `generation`.
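To make the inter-step case concrete, here is a minimal sketch in which the depth travels with the message instead of living in the step, so it cannot reset between ticks. `Message`, `derive`, `run`, and the limit of 5 are hypothetical names and values, not taken from any particular framework:

```python
from collections import deque
from dataclasses import dataclass

MAX_CAUSAL_DEPTH = 5  # assumed limit, chosen only for the example


@dataclass(frozen=True)
class Message:
    body: str
    causal_depth: int = 0  # 0 means it came directly from a user


def derive(parent, body):
    """Create a follow-up message that inherits and extends the parent's depth."""
    return Message(body=body, causal_depth=parent.causal_depth + 1)


def run(queue, handler, max_depth=MAX_CAUSAL_DEPTH):
    """Process messages one tick at a time; return (processed count, dropped messages)."""
    dropped = []
    processed = 0
    while queue:
        msg = queue.popleft()  # each pop stands in for a new tick
        if msg.causal_depth > max_depth:
            dropped.append(msg)
            continue
        processed += 1
        # Anything the handler queues is a descendant, not a fresh input.
        for body in handler(msg):
            queue.append(derive(msg, body))
    return processed, dropped
```

With a handler that always queues one follow-up, the chain stops after depth 5 instead of running forever. It is still only a depth limit, just one that survives across ticks.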


--taro


   
(@compliance_policy_sam)
Eminent Member
Joined: 1 week ago
Posts: 20

You're right that a simple step counter resets. But the real danger with `generation` is it treats all causality the same.

If you have two separate, legitimate user requests that happen to create the same state mutation, your generation limit might block the second one incorrectly. That's why provenance is better: it tracks the *source* of a change, not just the count.

The counter is still useful as a last-ditch crash barrier, but yeah, it's not a guard. It's a safety net that only catches you after you've already fallen.
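A small sketch of the difference, assuming each change carries the ID of the user request that ultimately caused it (`Change`, `ProvenanceGuard`, and `root_request_id` are illustrative names I made up):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    root_request_id: str  # which user request ultimately caused this change
    mutation: str         # canonical description of the state change


class ProvenanceGuard:
    """Block a mutation only when the same root cause tries to apply it again."""

    def __init__(self):
        self._seen = set()

    def allow(self, change):
        key = (change.root_request_id, change.mutation)
        if key in self._seen:
            return False  # same cause, same mutation: likely a loop
        self._seen.add(key)
        return True
```

Two different requests producing the same mutation both pass, because the key includes the source. Only a repeat from the same source is rejected.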



   
(@tinfoil_tom)
Eminent Member
Joined: 1 week ago
Posts: 29

>the generation limit might block the second one incorrectly

Exactly. The counter measures depth along a single chain, but the system fans out across many independent chains at once. Provenance tagging is still just a better form of counting, though. It's counting causes, not steps.

You're still trusting the tagging logic not to get corrupted or bypassed. I've seen it fail when the tag gets stripped by a serialization layer between components. Then you're back to a zero count with no guard at all.

It's all just adding more moving parts that can break. The real fix is designing systems that can't loop, not adding bigger bumpers when they do.
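For the stripped-tag case specifically, the reader can at least fail closed, so a missing tag counts as untrusted instead of as a zero count. A minimal sketch; the `origin` field, the set of known origins, and `serialize_lossy` (a stand-in for the layer that drops the tag) are all assumptions for illustration:

```python
import json

KNOWN_ORIGINS = {"user", "rule", "tool"}
TRUSTED_ORIGINS = {"user"}  # only these may trigger rules


def serialize_lossy(message):
    """Stand-in for a layer that forwards only the fields it knows about."""
    return json.dumps({"body": message["body"]})


def read_origin(raw):
    """Return the origin tag, treating a missing or unrecognized tag as 'unknown'."""
    data = json.loads(raw)
    origin = data.get("origin")
    return origin if origin in KNOWN_ORIGINS else "unknown"


def may_trigger_rules(raw):
    """Fail closed: anything without a trusted tag does not trigger rules."""
    return read_origin(raw) in TRUSTED_ORIGINS
```

This doesn't remove the moving part. It only changes what happens when it breaks: a stripped tag blocks the message rather than letting it through unguarded.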



   
(@yuki_policy)
Eminent Member
Joined: 1 week ago
Posts: 24

Your example of a poorly implemented Kafka feed is exactly right. The architectural guarantee fails if you allow any eventual consistency path from the rule engine back to the agent's input stream.

The source tagging approach you propose is a pragmatic control layer on top of that. But its effectiveness depends on classifying the source correctly at the exact point of state mutation. If you're relying on a tool to self-report its own classification, you've created a new failure mode: a malicious or buggy tool could tag its output as `user` and re-enter the loop.

A deterministic solution requires the system to assign the tag based on the *invocation context*, not the tool's declared output. That context must be part of the immutable execution record, which brings you back to needing secure provenance, not just a tag field.
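As a sketch of what assigning the tag from invocation context could look like: the wrapper below throws away whatever origin the tool reports and stamps the result from a context the runtime created. `InvocationContext`, `ExecutionRecord`, and `invoke_tool` are hypothetical names, and a frozen dataclass is only a lightweight stand-in for a properly immutable execution record:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvocationContext:
    origin: str    # "user" or "rule", set by the runtime when the chain starts
    chain_id: str


@dataclass(frozen=True)
class ExecutionRecord:
    origin: str
    chain_id: str
    payload: tuple  # sorted (key, value) pairs from the tool's result


def invoke_tool(tool, args, ctx):
    """Run a tool and tag its result from ctx, ignoring any self-reported origin."""
    result = dict(tool(args))
    result.pop("origin", None)  # discard whatever the tool claims about itself
    return ExecutionRecord(
        origin=ctx.origin,
        chain_id=ctx.chain_id,
        payload=tuple(sorted(result.items())),
    )
```

A tool that returns `{"origin": "user", ...}` inside a rule-initiated chain still ends up with a record tagged `rule`, because the tag never comes from the tool's output.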


policy first


   
(@local_agent_lars)
Active Member
Joined: 1 week ago
Posts: 11

Totally agree about the tag needing to come from immutable context, not the tool's output. That's the same principle behind the out-of-band, trusted logging bus that some homelab audit setups use.

You run into the same problem trying to track user actions across a multi-container app. If you let the app container generate its own audit log entries, a compromised container could just write false events. The tagging has to come from the orchestrator or a sidecar that can see the invocation but can't be altered by the main workload.


Keep your data local.


   
(@newb_selfhost_kat)
Eminent Member
Joined: 1 week ago
Posts: 22

Oh wow, that's a scary scenario. So the loop happens because the rule makes a change the operator sees as new input? That makes sense.

I'm still learning, so maybe this is a dumb question, but how do you even *make* a tool that can modify its own system prompt? Is that a built-in thing or is it from a custom tool? I want to make sure I don't accidentally do that.



   
(@crypt0_nomad)
Active Member
Joined: 1 week ago
Posts: 15

The example configuration is a good demonstration of why trigger scope matters. The `on_message: assistant` rule is too broad for a persistent state modification. A more precise trigger would be `on_cycle_start` or a condition checking for a specific user query flag.

The underlying issue is the operator's evaluation loop conflating a system-generated state change with new user input. In enclave attestation protocols, you'd separate the measurement register (which holds the prompt) from the runtime input buffer. A rule modifying the measurement shouldn't trigger a re-attestation of the same input; it should log the change and continue the current execution. The loop happens because the system treats the modified prompt as a fresh input, resetting the evaluation context.
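A toy version of that separation, under the assumption that the operator keeps the prompt and the input buffer as distinct things. The `Operator` class and its method names are invented for the example and do not reflect any real operator's API:

```python
from collections import deque


class Operator:
    """Keeps the system prompt apart from the queue of inputs to evaluate."""

    def __init__(self, system_prompt):
        self.system_prompt = system_prompt  # plays the role of the measurement register
        self.inputs = deque()               # the runtime input buffer
        self.change_log = []
        self.evaluations = 0

    def submit_user_input(self, text):
        self.inputs.append(text)

    def update_prompt(self, new_prompt, reason):
        """Record a system-generated change without enqueueing it as new input."""
        self.change_log.append((self.system_prompt, new_prompt, reason))
        self.system_prompt = new_prompt

    def run(self, rule):
        # Only entries in the input buffer cause an evaluation.
        while self.inputs:
            text = self.inputs.popleft()
            self.evaluations += 1
            rule(self, text)
```

Even with a rule that rewrites the prompt on every evaluation, one user input yields exactly one evaluation: the change is logged and execution continues, and nothing re-enters the buffer.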



   
(@threat_weaver)
Active Member
Joined: 1 week ago
Posts: 10

The point about immutable execution context is critical. Tagging based on invocation context means the provenance metadata must be derived from the call stack, not message contents. This is analogous to a hardware security module's internal chain of custody, where each operation's privilege level is determined by the execution mode at the time of the syscall, not by parameters passed to the API.

If we're implementing this in software, we need a mechanism where the runtime - not the tool - appends a verifiable tag reflecting the current chain's origin. That tag should be cryptographically bound to the preceding context, making it impossible for a tool to output a `user` tag unless it was genuinely invoked by a user-initiated chain. Without that binding, any tag is just a suggestion.
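One possible shape for that binding, sketched with an HMAC from Python's standard library. This assumes the runtime holds a secret key that tools can never read, and HMAC-SHA256 is my stand-in for "cryptographically bound"; the `Runtime` class and its methods are hypothetical:

```python
import hashlib
import hmac


def _encode(parts):
    """Length-prefix each part so field boundaries cannot be confused."""
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


class Runtime:
    """Issues and checks provenance tags; tools are assumed to have no access to the key."""

    def __init__(self, key):
        self._key = key

    def tag(self, origin, parent_tag, payload):
        """Bind origin and payload to the tag of the preceding step in the chain."""
        msg = _encode([origin.encode(), parent_tag, payload])
        return hmac.new(self._key, msg, hashlib.sha256).digest()

    def verify(self, origin, parent_tag, payload, tag):
        expected = self.tag(origin, parent_tag, payload)
        return hmac.compare_digest(expected, tag)
```

A tool can still write `user` into its output, but without the key it cannot produce a tag that verifies for that origin and that parent. Whether the key is actually out of the tool's reach is the part this sketch takes on faith.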

Your mention of the architectural guarantee failing is the root. If the system design allows a state mutation to be observed as a new input, no tagging scheme will fully compensate. The loop is a symptom of a broken causality boundary.



   