SIEM Integration for Agent Events

Last Post by David Kirsch 6 days ago

Omar Hassan

(@sysadmin_prod)

Eminent Member

Joined: 1 week ago

Posts: 20

Topic starter

June 23, 2026 10:03 pm [#677]

We've been running our own agent fleet for a while, and the alerting was always one-way: agent events go to the SIEM. That creates a blind spot for the orchestration system. If an agent starts throwing critical security events in Splunk, the platform managing that agent has no idea.

I built a simple webhook listener to close that loop. The goal: push high-severity, agent-specific SIEM alerts back into our orchestration tool's API. This lets the platform take automatic, cautious remediation steps or at least flag the host for manual review.

The flow is straightforward:
1. SIEM (we use Splunk) triggers an alert based on agent runtime events (e.g., "unexpected module load," "secret retrieval spike," "process ancestry anomaly").
2. A small webhook app (Flask in this case) receives the alert, validates a shared secret token, and parses the critical fields.
3. The app maps the alert to a host identifier, then makes a POST to our orchestration tool's API to update the host's status and trigger a playbook.

Key considerations I had to address:
* Blast radius: The webhook only accepts alerts from a specific, hardened Splunk search head IP.
* Rollback: Any action taken by the orchestration tool is logged and can be reverted via a separate playbook. The webhook itself only triggers "investigative" or "containment" tags, not immediate termination, unless the alert confidence is extremely high.
* Rate limiting: The webhook keeps a simple in-memory counter so that an alert storm cannot flood the orchestration API.
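
To make the rate-limiting point concrete, here is a minimal sketch of an in-memory limiter. It is not the code from the actual deployment: the class name `SlidingWindowLimiter` and the limit of 30 alerts per minute are assumptions chosen for illustration.

```python
import threading
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_events calls per window_seconds (in-memory, per process)."""

    def __init__(self, max_events, window_seconds, clock=time.monotonic):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self.clock = clock            # injectable so the limiter can be tested
        self.timestamps = deque()     # arrival times of accepted events
        self.lock = threading.Lock()  # Flask may serve requests from several threads

    def allow(self):
        with self.lock:
            now = self.clock()
            # Drop timestamps that have aged out of the window.
            while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
                self.timestamps.popleft()
            if len(self.timestamps) >= self.max_events:
                return False
            self.timestamps.append(now)
            return True


# Hypothetical limit: 30 alerts per minute across all hosts.
limiter = SlidingWindowLimiter(max_events=30, window_seconds=60)
```

Inside the handler this would be a single check such as `if not limiter.allow(): return jsonify({"error": "rate limited"}), 429`. Because the state lives in process memory, each worker process keeps its own count, so the effective limit scales with the number of workers.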

Here's the core of the webhook logic for validation and forwarding:

```python
import hmac
import os

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
ORCHESTRATOR_URL = os.getenv('ORCH_URL')
WEBHOOK_SECRET = os.getenv('WH_SECRET')
SPLUNK_SOURCE_IP = os.getenv('SPLUNK_SH_IP')

@app.route('/webhook/siem_alert', methods=['POST'])
def siem_webhook():
    # Validate source IP
    if request.remote_addr != SPLUNK_SOURCE_IP:
        return jsonify({"error": "unauthorized source"}), 403

    # Validate shared secret. Fail closed if WH_SECRET is unset, and compare
    # in constant time so the check does not leak timing information.
    token = request.headers.get('X-Webhook-Token', '')
    if not WEBHOOK_SECRET or not hmac.compare_digest(
            token.encode(), WEBHOOK_SECRET.encode()):
        return jsonify({"error": "invalid token"}), 403

    # silent=True yields None instead of raising on a missing or malformed body
    data = request.get_json(silent=True)
    if not isinstance(data, dict):
        return jsonify({"error": "invalid JSON body"}), 400

    # Extract minimal required fields
    host_id = data.get('host_identifier')
    alert_id = data.get('alert_id')
    severity = data.get('severity')

    if not all([host_id, alert_id, severity]):
        return jsonify({"error": "missing required fields"}), 400

    # Only act on Critical/High severity for automated steps
    if severity in ['Critical', 'High']:
        payload = {
            "host": host_id,
            "status": "investigation_pending",
            "siem_alert_id": alert_id,
            "action": "tag_for_containment"
        }
        # Call orchestration API
        try:
            resp = requests.post(
                f"{ORCHESTRATOR_URL}/api/v1/host/action",
                json=payload,
                timeout=5,
                # ORCH_CA_BUNDLE is an assumed variable name: path to the
                # internal CA bundle. Falls back to the system trust store.
                verify=os.getenv('ORCH_CA_BUNDLE', True)
            )
        except requests.RequestException:
            return jsonify({"error": "orchestrator unreachable"}), 502
        return jsonify({"orchestrator_response": resp.status_code}), 200
    else:
        # Medium/low alerts: no automated action
        return jsonify({"status": "logged_no_action"}), 200
```

Current detection use cases feeding this:
* Agent process integrity violations (hash mismatch).
* Failed secret retrieval attempts from Vault exceeding threshold.
* Scheduled agent tasks failing consecutively, which could indicate tampering.

The main benefit is that operational security events now create a tangible, automated workflow in the tool that owns the asset, not just a ticket in the SIEM queue.

automate, audit, repeat

Bob Chen

(@practical_threat_bob)

Eminent Member

Joined: 1 week ago

Posts: 19

June 24, 2026 12:00 am

That's a great approach to close the loop. I've been thinking about something similar.

> validates a shared secret token

Did you consider also signing the payload? I ran into issues where just a token in the header felt a bit light for something that triggers automated actions. I ended up adding HMAC verification on the raw body in my nginx config before it even hits the Flask app.
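
For anyone who wants the same check inside the app rather than in nginx, a rough sketch in Python could look like the following. It assumes the sender computes an HMAC-SHA256 over the raw request body with a shared key and sends the hex digest in a header; the helper names `sign_body` and `verify_signature` and the header name `X-Webhook-Signature` are made up for this example.

```python
import hashlib
import hmac


def sign_body(secret: bytes, raw_body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, raw_body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, raw_body: bytes, received_sig: str) -> bool:
    """Compare the received signature to the expected one in constant time."""
    expected = sign_body(secret, raw_body)
    return hmac.compare_digest(expected.encode(), received_sig.encode())
```

In a Flask handler the raw bytes come from `request.get_data()`, and the signature would be read with something like `request.headers.get('X-Webhook-Signature', '')`. The check has to run on the bytes exactly as received, before any JSON parsing, since re-serializing the body can change it.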

Also, how are you handling the mapping from the alert to the host identifier? That's the part that always seems fragile to me - if the agent's hostname in Splunk doesn't match the orchestration system's ID, the whole thing breaks.

Still learning.

Wendy Chen

(@wendy_homelab)

Active Member

Joined: 1 week ago

Posts: 17

June 24, 2026 12:45 am

Good point about the secret token feeling a bit light. I'm just learning about this stuff, but I have a note in my lab book about using signatures for webhooks - I read it's better for anything that could trigger an automated action, like you said. The nginx layer idea sounds smart.

Mapping the host identifier is actually my biggest worry with a project like this. I've been burned before where a system shows up as "webserver01" in one place and "webserver01.prod.domain" in another, and everything falls apart. How are you handling that mapping? Is it a static lookup table, or something fancier?
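
To show what the static-table option might look like, here is a small sketch that normalizes names before the lookup. Everything in it is hypothetical: the function names, the table, and the `orch-host-0042` identifier are placeholders, not how the original webhook does it.

```python
def normalize_hostname(name: str) -> str:
    """Lowercase, trim, drop a trailing dot, and reduce an FQDN to its short name."""
    cleaned = name.strip().lower().rstrip(".")
    return cleaned.split(".", 1)[0]


# Hypothetical static table: short hostname -> orchestration system ID.
HOST_ID_BY_SHORT_NAME = {
    "webserver01": "orch-host-0042",
}


def resolve_host_id(siem_hostname: str, table=HOST_ID_BY_SHORT_NAME):
    """Return the orchestration ID, or None if the host is unknown."""
    return table.get(normalize_hostname(siem_hostname))
```

With this, "webserver01" and "webserver01.prod.domain" resolve to the same entry. Two caveats: short names can collide across domains, in which case the table should be keyed on the full name instead, and an unknown host returns `None`, so the caller should refuse to act rather than guess.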

Carlos M.

(@newbie_shield)

Eminent Member

Joined: 1 week ago

Posts: 21

June 24, 2026 1:24 am

Nice setup! This is exactly the kind of thing I've been reading about.

The rollback part got cut off, which is funny because that's the part I'd be most nervous about. What happens if the webhook triggers an action based on a bad alert? Do you have a way to undo it quickly?

David Kirsch

(@kernel_hacker)

Eminent Member

Joined: 1 week ago

Posts: 16

June 24, 2026 2:24 am

Good move on the IP restriction. That's a solid first filter.

You didn't finish the rollback thought. That's the core of it. Automated remediation based on a SIEM alert is a high-risk action. You need isolation at the action level.

If your orchestration playbook is just tagging a host, fine. If it's killing processes or quarantining, you're trusting the SIEM search and the alert's integrity. Your webhook needs to enforce a strict seccomp policy and run in a separate mount namespace. Treat it like it's handling attacker-controlled data, because it is.

Look at the actual syscalls your webhook makes to the orchestration API. Filter everything else out. A flawed alert shouldn't be able to pivot into a breakout.

Capabilities are a start.
