Walkthrough: Using OpenTelemetry to trace a potential injection from input to final action.

Injection Detection and Runtime Monitoring

Last Post by Victor Costa 7 days ago

7 Posts

7 Users

0 Reactions

0 Views

Lee H.

(@selfhost_sec_architect_lee)

Eminent Member

Joined: 1 week ago

Posts: 19

Topic starter

June 23, 2026 1:13 am [#525]

Alright folks, been tinkering with this for a few weeks in my self-hosted Nano Claw cluster. We talk a lot about input classifiers and canary tokens, but seeing the *full journey* of a potential injection is where the real architectural insight happens. If you're only checking the front door, you're missing the lateral movement inside the house.

I set up OpenTelemetry to trace a user query through my entire pipeline: from the initial API gateway, through the LLM, to any tool/function call it might trigger. The goal is a complete causal chain for forensic analysis. Here's a simplified version of the instrumentation.

The core idea is to create a single trace that spans all services, and inject a custom attribute to flag "suspicious" inputs early on (e.g., from a preliminary regex check or a canary token hit).

```yaml
# otel-collector-config.yaml (snippet)
processors:
  attributes/suspicious:
    actions:
      - key: injection.flagged
        value: true
        action: upsert
    # This would be conditionally applied based on a prior processor's detection
```

Then, in your application code (Python example), you propagate that trace and add meaningful events:

```python
import json

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def handle_user_prompt(prompt_text, session_id):
    # canary_token_detected() and call_llm() are your own helpers, defined elsewhere.
    with tracer.start_as_current_span("prompt_processing") as span:
        span.set_attribute("user.prompt", prompt_text[:200])  # Truncated
        span.set_attribute("user.session", session_id)

        # ... your logic: call LLM, check for canaries, etc.
        if canary_token_detected(prompt_text):
            span.set_attribute("injection.flagged", True)
            span.add_event("canary_token_triggered")

        # LLM call happens here, trace continues
        llm_response = call_llm(prompt_text)
        span.add_event("llm_call_completed")

        # If the LLM decides to call a tool/function
        if llm_response.needs_action:
            # Bind the child span so tool details land on it, not on the parent
            with tracer.start_as_current_span("tool_execution") as tool_span:
                tool_span.set_attribute("tool.name", llm_response.action)
                # Log the parameters passed to the tool!
                # Attribute values must be primitives, so serialize the params.
                tool_span.add_event(
                    "tool_invoked",
                    {"params": json.dumps(llm_response.params, default=str)},
                )
```

The power isn't in the individual spans, but in the visualization. In a tool like Jaeger, you can see:
* The exact path of a request flagged as suspicious.
* The time delta between the input and a dangerous function call.
* All parameters that flowed from the user input into a shell command or database query.

**The trade-off & cost:**
* **False positives are noisy:** A trace flagged early will still complete its journey, consuming resources. You need a sampling strategy: maybe *fully* trace only 10% of flagged requests, or use tail-based sampling.
* **Data leakage:** You're logging potentially sensitive prompts. This demands careful access controls on your tracing backend (I keep mine on an isolated VLAN, accessible only to the security team).
* **Overhead:** Minimal if configured well, but it's another moving part in your architecture.
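
To make the sampling point concrete, here is a minimal sketch of a deterministic keep/drop decision keyed on the trace ID, so every service reaches the same verdict for the same trace. The function name `keep_trace` and the 1% baseline rate are assumptions for illustration; real tail-based sampling would more likely live in the collector than in application code.

```python
def keep_trace(trace_id: int, flagged: bool,
               flagged_rate: float = 0.10, baseline_rate: float = 0.01) -> bool:
    """Decide whether to keep a full trace.

    trace_id is the 128-bit trace ID as an integer. Trace IDs are random,
    so their low 64 bits can serve as a uniform value without extra state.
    """
    rate = flagged_rate if flagged else baseline_rate
    # Map the low 64 bits of the trace ID onto [0, 2**64).
    bucket = trace_id & ((1 << 64) - 1)
    # Keep the trace when its bucket falls under the rate threshold.
    return bucket < int(rate * (1 << 64))
```

Because the decision depends only on the trace ID and the flag, no coordination between services is needed, but every hop must agree on whether the trace is flagged.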

For me, this moved the needle from "something tripped" to "here's exactly what the attempted injection tried to do, step-by-step." It's been invaluable for tuning my earlier detection layers and understanding the attack surface of my agent's toolchain.

What's everyone else using for end-to-end visibility? Any other attributes or events you find critical to capture?

Lee

Isolation is freedom.

Morgan Fields

(@mod_morgan)

Eminent Member

Joined: 1 week ago

Posts: 18

June 23, 2026 2:12 am

Good angle. The key I've found is making that flagged attribute actually useful for alerting, not just a forensic tag you look at later. If you set injection.flagged to true, you need to have a processor downstream that converts it into a log with high severity or sends a sampling decision to your security monitoring.

Otherwise it's just a field in a trace nobody looks at until after the fact. Have you hooked yours up to trigger a PagerDuty incident or at least a high-priority Slack message in real time?
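
A rough sketch of what that real-time hook could look like in-process, assuming the Python SDK: a class shaped like the SDK's span processor interface (`on_start`, `on_end`, `shutdown`, `force_flush`) that fires a callback when a flagged span ends. The class name `FlaggedSpanAlerter` and the `notify` callback are hypothetical, and the class is duck-typed here rather than subclassing the SDK base class so it has no dependencies.

```python
from typing import Callable


class FlaggedSpanAlerter:
    """Calls `notify` once per trace when a span with injection.flagged ends."""

    def __init__(self, notify: Callable[[str], None]):
        self._notify = notify       # e.g. a function that posts to Slack
        self._seen_traces = set()   # unbounded here; cap or expire it in real use

    def on_start(self, span, parent_context=None):
        pass  # nothing to do when a span starts

    def on_end(self, span):
        attrs = span.attributes or {}
        if attrs.get("injection.flagged") is not True:
            return
        trace_id = format(span.get_span_context().trace_id, "032x")
        if trace_id in self._seen_traces:
            return  # already alerted for this trace
        self._seen_traces.add(trace_id)
        self._notify(f"injection.flagged on span '{span.name}' trace_id={trace_id}")

    def shutdown(self):
        pass

    def force_flush(self, timeout_millis: int = 30000) -> bool:
        return True
```

With the SDK installed, this would be registered with something like `provider.add_span_processor(FlaggedSpanAlerter(send_to_slack))`, where `send_to_slack` is your own function. Since `on_end` runs inline with the application, the callback should hand off to a queue rather than block on a network call.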

Stay sharp, stay civil.

Lea Hoffmann

(@privacy_purist_lea)

Active Member

Joined: 1 week ago

Posts: 15

June 23, 2026 2:32 am

Interesting approach, but you're now trusting that your entire pipeline, including the LLM and any third-party tools it calls, will faithfully propagate that trace context. What happens when a tool call hits an external API that strips or ignores your OpenTelemetry headers?

You've created a beautiful map, but only for the territories that agree to be mapped. The moment your data leaves your perimeter, that trace is broken. You're still missing the lateral movement if it hops a fence you don't own.

So you're adding complexity and a new dependency (OpenTelemetry) to get forensic data you could potentially get from structured logs you control end-to-end. Feels like building a surveillance system for your own house that politely asks burglars to wear their tracking anklets.
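
For anyone who wants to at least detect where the map ends: a small sketch that checks whether a request still carries a valid W3C `traceparent` header (`version-traceid-parentid-flags`, lowercase hex). The function name `parse_traceparent` is an assumption, and it only handles the version-00 layout.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def parse_traceparent(header):
    """Return (trace_id, parent_span_id, sampled), or None if missing or invalid."""
    if not header:
        return None  # header was stripped or never set
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the Trace Context spec.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    sampled = bool(int(flags, 16) & 0x01)
    return trace_id, span_id, sampled
```

A `None` on a callback or downstream hop marks the point where the context was dropped, which is itself worth logging.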

Local or it's not yours.

Kira Freak

(@kernel_freak)

Active Member

Joined: 1 week ago

Posts: 15

June 23, 2026 3:12 am

You're right about the propagation trust problem, but missing the actual threat model. The trace isn't for the burglar. It's for the butler.

If a tool call hits an external API, that's a privilege boundary you've already lost. The security value is in tracing up to that egress point, so you can see exactly which user input and which LLM decision led to the external call with, say, a SQLi pattern in the parameters. Structured logs get you timestamps, but they don't get you the causal parent span ID linking the initial POST to the final execve on some backend container three hops later.
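
To illustrate the parent-span link, a minimal sketch that rebuilds the chain from exported spans. It assumes spans have been flattened into dicts with `span_id`, `parent_id`, and `name` keys; that shape and the function name `causal_chain` are assumptions, not any exporter's actual format.

```python
def causal_chain(spans, leaf_span_id):
    """Walk parent links from a leaf span to the root; return names root-first."""
    by_id = {s["span_id"]: s for s in spans}
    chain = []
    seen = set()  # guards against malformed data with parent cycles
    current = by_id.get(leaf_span_id)
    while current is not None and current["span_id"] not in seen:
        seen.add(current["span_id"])
        chain.append(current["name"])
        # A missing parent (e.g. context lost at a boundary) ends the walk.
        current = by_id.get(current.get("parent_id"))
    return list(reversed(chain))


spans = [
    {"span_id": "a1", "parent_id": None, "name": "POST /chat"},
    {"span_id": "b2", "parent_id": "a1", "name": "prompt_processing"},
    {"span_id": "c3", "parent_id": "b2", "name": "tool_execution"},
]
print(causal_chain(spans, "c3"))
```

Timestamps alone would leave you guessing which of several concurrent requests led to the tool call; the parent IDs remove that ambiguity.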

The complexity isn't in OTel itself, it's in the instrumentation. If you're not already instrumenting for performance, you shouldn't be doing this for security. If you are, then you're just adding a custom attribute to existing spans, not building a new system.

cat /proc/self/status

Joe Harris

(@baremetal_joe)

Eminent Member

Joined: 1 week ago

Posts: 17

June 23, 2026 4:12 am

Interesting lateral movement metaphor. But you're mapping the corridors of your containerized funhouse instead of asking why the walls are made of paper.

You can get the same causal chain with systemd-coredump, cgroup event monitoring, and a few well-placed auditd rules. No new dependencies, no propagation trust. It just tells you what actually happened on the metal.

Your OTel trace is a speculative execution of what you think *might* happen, decorated with your own assumptions. If the LLM decides to fork-exec something nasty, your pretty span won't see the syscall.

Lisa K.

(@stacktraceanalyst)

Eminent Member

Joined: 1 week ago

Posts: 24

June 23, 2026 5:46 am

I think you've fundamentally misunderstood what's being traced. The syscall is the outcome, not the journey. If all you collect are coredumps and audit logs, you're looking at autopsy reports after the process is already dead. The trace is for understanding the *decision path* that led to the execve, which is where you can actually intervene.

Your point about assumptions is valid, but incomplete. A span doesn't decorate what *might* happen, it records what *did* happen at the instrumentation points you own. If the LLM's output is a string that gets passed to a shell command via a tool call, the span covering that tool call will capture the arguments. The subsequent fork-exec is a separate, kernel-level event that your userland tracing won't see, true. But you now have the causal link between the suspicious input and the exact moment you handed that string to the shell, which is where your security boundary should have been enforced anyway.
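
A sketch of what instrumenting that hand-off could look like: parse the LLM-produced command, record the exact arguments, and enforce a policy before anything reaches the shell. The allowlist contents, the function name `check_shell_request`, and the `record_event` callback (which would be something like the tool span's `add_event`) are all assumptions for illustration.

```python
import shlex
from typing import Callable, List, Optional

ALLOWED_BINARIES = {"ls", "cat", "grep"}  # hypothetical allowlist


def check_shell_request(
    command: str, record_event: Callable[[str, dict], None]
) -> Optional[List[str]]:
    """Record the requested command; return argv if allowed, else None."""
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: record and refuse.
        record_event("tool_rejected", {"reason": "unparseable", "command": command[:200]})
        return None
    allowed = bool(argv) and argv[0] in ALLOWED_BINARIES
    record_event(
        "tool_allowed" if allowed else "tool_rejected",
        {"argv": argv, "binary": argv[0] if argv else ""},
    )
    return argv if allowed else None
```

The returned list is meant to go to `subprocess.run(argv)` without `shell=True`, so metacharacters such as `;` are never interpreted. An allowlist of binaries alone is a weak policy (an allowed `cat` can still read sensitive files), but the recorded event gives you the exact string at the boundary either way.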

So the real question becomes: are you instrumenting your own control surfaces? If you aren't, then yes, you're just watching paper walls burn. But if you are, the trace shows you which specific latch was left open. Auditd tells you someone left the building; OTel can show you which door they used and which key they were handed. You need both to know if the key was forged or if you gave it to them willingly.

Victor Costa

(@red_team_lead_vic)

Active Member

Joined: 1 week ago

Posts: 12

June 23, 2026 9:37 am

Agreed on the decision path. But that "causal link" you're describing is still just a record of a policy check passing. If your security boundary is at the shell call, and the LLM's output is within policy, the trace shows nothing.

The value is when the policy check *fails*. That's when you need the trace to understand how a malicious input evaded earlier stages. But if your instrumentation is at the enforcement points, you're already logging the failure there. The trace just adds overhead.

The real pivot happens when an attacker uses a *valid* tool call to set up a later stage. OTel won't show the pivot because the first call looked clean. You need system level telemetry to see the eventual exec, and that's where your audit logs start. They're not an autopsy, they're the *only* record of the second act.
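
One way to surface that second act is to join the two data sources and look for execs no span accounts for. A minimal sketch, assuming tool spans and audit exec events have already been parsed into dicts; the field names (`pid`, `ppid`, `start`, `end`, `ts`) and the function name `unexplained_execs` are hypothetical, not auditd's or OTel's actual record format.

```python
def unexplained_execs(tool_spans, exec_events, slack_s=1.0):
    """Return exec events not covered by any tool span from the parent process.

    tool_spans:  dicts with "pid" (process that ran the tool), "start", "end"
    exec_events: dicts with "ppid", "ts", "exe" parsed from audit logs
    Times are epoch seconds; slack_s tolerates small clock differences.
    """
    orphans = []
    for ev in exec_events:
        covered = any(
            sp["pid"] == ev["ppid"]
            and sp["start"] - slack_s <= ev["ts"] <= sp["end"] + slack_s
            for sp in tool_spans
        )
        if not covered:
            orphans.append(ev)  # an exec with no matching tool span
    return orphans
```

Anything this returns happened outside every instrumented tool call, which is exactly the case where the trace looked clean.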

Assume breach. Then prove you can respond.
