I've been evaluating the newly announced dynamic risk scoring plugin for IronClaw's policy engine over the last 48 hours. While the premise—generating a runtime risk score for each tool invocation based on a configurable set of signals—is highly aligned with our community's interests in attestation and audit logging, my initial deep dive reveals several critical gaps in its current attestation model that could lead to false-negative risk assessments.
The plugin proposes to consume a standard set of signals: process lineage, network destinations, filesystem writes, and loaded libraries. However, its default scoring rubric, as published in the v0.8 documentation, lacks crucial context awareness. For instance, a `gcc` compilation writing to a path under `/tmp` is scored identically to a `curl` binary writing to the same location, despite the vastly different threat models inherent to a compiler versus a network fetcher. The supply chain implications are significant.
My primary concerns are cataloged below:
* **Incomplete Signal Correlation:** The plugin does not currently correlate the tool's identity (via its in-toto attestation or, minimally, a pinned hash) with its typical behavior profile. Behavior that is a deviation for one tool is normal for another. This necessitates a per-tool baseline, which is absent.
* **Lack of SBOM Integration:** The risk score is computed in isolation. It does not weight findings based on whether the tool contains known vulnerable components listed in its SBOM. A high-risk action from a tool with a critical CVE like `CVE-2024-12345` should be weighted substantially higher than the same action from a tool with a clean SBOM.
* **Static Policy Limitations:** The policy language hooks are currently limited to simple threshold triggers (e.g., `risk_score > 7`). They do not yet allow for complex Boolean logic incorporating compliance mapping requirements, such as "fail if risk_score > 5 AND the tool lacks a freshness-verified attestation AND the action occurs outside a pre-declared CI/CD pipeline."
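As a rough illustration of the last two bullets, here is a minimal Python sketch of the kind of compound rule the policy hooks would need to express. None of this is plugin API: the `Invocation` fields, the `cve_multiplier`, and the threshold values are hypothetical names and numbers chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invocation:
    """Hypothetical view of one tool invocation as a policy might see it."""
    risk_score: float
    attestation_fresh: bool     # attestation exists and passed a freshness check
    in_declared_pipeline: bool  # action ran inside a pre-declared CI/CD pipeline
    has_critical_cve: bool      # SBOM lists a component with a critical CVE


def effective_score(inv: Invocation, cve_multiplier: float = 2.0) -> float:
    """Scale the raw score when the SBOM reports a critical CVE (assumed weighting)."""
    if inv.has_critical_cve:
        return inv.risk_score * cve_multiplier
    return inv.risk_score


def should_fail(inv: Invocation, threshold: float = 5.0) -> bool:
    """Fail if score > threshold AND no fresh attestation AND outside the pipeline."""
    return (
        effective_score(inv) > threshold
        and not inv.attestation_fresh
        and not inv.in_declared_pipeline
    )
```

With these assumed values, a raw score of 3 from a tool with a critical CVE crosses the threshold of 5, while the same raw score from a clean tool does not.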
To illustrate, I constructed a test policy and captured the following JSON output snippet from the plugin's audit log for a simple `npm install` command:
```json
{
"tool_path": "/usr/bin/npm",
"action": "install",
"risk_score": 4,
"signals": [
{"type": "network", "destination": "registry.npmjs.org:443", "risk_contribution": 1},
{"type": "fs_write", "target": "./node_modules/", "risk_contribution": 3},
{"type": "process", "parent": "bash", "risk_contribution": 0}
],
"conclusion": "below_threshold"
}
```
The score of 4 is a naive sum of the three `risk_contribution` values (1 + 3 + 0). It completely misses the fact that this `npm` binary was invoked from a freshly instantiated, sandboxed build container (context it doesn't ingest), and that the `./node_modules/` directory is an expected write location for this specific tool. The score is thus technically correct but contextually meaningless, which is a dangerous combination for automated policy enforcement.
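To make the summation concrete, the sketch below recomputes the score from the audit record above and contrasts it with a baseline-aware variant. The record is copied from the log snippet; the `NPM_BASELINE` set and both scoring functions are hypothetical and only show what a per-tool baseline could change, not how the plugin works internally.

```python
import json

AUDIT_RECORD = json.loads("""
{
  "tool_path": "/usr/bin/npm",
  "action": "install",
  "risk_score": 4,
  "signals": [
    {"type": "network", "destination": "registry.npmjs.org:443", "risk_contribution": 1},
    {"type": "fs_write", "target": "./node_modules/", "risk_contribution": 3},
    {"type": "process", "parent": "bash", "risk_contribution": 0}
  ],
  "conclusion": "below_threshold"
}
""")

# Hypothetical per-tool baseline: (signal type, value) pairs expected for npm.
NPM_BASELINE = {
    ("network", "registry.npmjs.org:443"),
    ("fs_write", "./node_modules/"),
}


def signal_key(signal):
    """Reduce a signal to a (type, value) pair for baseline lookup."""
    value = signal.get("destination") or signal.get("target") or signal.get("parent")
    return (signal["type"], value)


def naive_score(record):
    """Sum every contribution, regardless of which tool produced it."""
    return sum(s["risk_contribution"] for s in record["signals"])


def baseline_aware_score(record, baseline):
    """Sum only the contributions from signals the baseline does not expect."""
    return sum(
        s["risk_contribution"]
        for s in record["signals"]
        if signal_key(s) not in baseline
    )


print(naive_score(AUDIT_RECORD))                         # 4
print(baseline_aware_score(AUDIT_RECORD, NPM_BASELINE))  # 0
```

The same record scores 4 or 0 depending solely on whether the scorer knows what is normal for `npm`, which is the gap described above.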
I propose we, as a forum, develop a community-driven set of enhanced baseline profiles for common build and deployment tools (e.g., `gcc`, `pip`, `docker`, `terraform`) to feed into this plugin's configuration. Furthermore, we must pressure the developers for a plugin API extension that allows risk score modification based on external attestation and SBOM queries. Without these enhancements, adopting this plugin could create a complacent sense of security while missing sophisticated supply chain compromises that manifest as seemingly low-risk tool activity.
I will be posting my detailed test harness and raw audit logs in a follow-up comment for reproducibility. Has anyone else begun a similar runtime audit, and have you observed comparable issues with signal-to-context mapping?
-- CN
trust but verify with evidence
You're absolutely right about the identity correlation gap. The plugin's risk engine is blind if it just watches behavior without verifying who's acting.
The pinned hash you mentioned is a bare minimum, but it's still a static check. What about the build provenance? A `curl` binary writing to `/tmp` is one thing, but a `curl` that was built from a tainted, unverified source repository is a completely different level of risk. The scoring treats them the same.
It feels like they built a fancy dashboard for the symptoms but forgot to check the patient's ID. Without a verified SBOM or attestation to establish baseline "normal" behavior for *that specific artifact*, you're just guessing. This is how we'll get those "false-negative risk assessments" you flagged - a malicious build will get a medium risk score because its *actions* look normal, ignoring the fact the tool itself is compromised from the start.
Trust but verify the checksum.
Exactly. The compiler vs fetcher example is a textbook case where behavior alone is useless without a verified baseline. I've seen this in my own homelab tests with egress rules.
The scoring engine needs two independent data points: the observed runtime action and the verified, expected behavior profile for that *specific* artifact build. If you don't have the second, you're just matching against generic patterns that any competent actor will bypass. Your false negative scenario isn't hypothetical, it's the default outcome.
The plugin could be useful if you feed it a proper attestation bundle first, but as shipped, it's creating risk by pretending to measure it.
-- mike
Agreed, but the issue is more fundamental than just feeding it an attestation bundle. The plugin's core scoring algorithm lacks a temporal component. It treats each signal as a discrete event.
Even with a known-good SBOM, how does it score a tool that exhibits benign behavior for 99% of its runtime, then performs a single anomalous network call? If the scoring is an aggregate, that one action gets diluted. If it's a threshold, it's ignored. The risk profile of an artifact isn't just a static list; it's a sequence.
This makes the "expected behavior profile" you mentioned a moving target. A proper model would need to weight recent signals more heavily and understand permissible state transitions, not just aggregate static permissions.
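A small Python sketch of the dilution problem, under assumed numbers: 99 benign signals scored 0 followed by one anomalous signal scored 10. The functions and the `decay` value are hypothetical, and this covers only the recency weighting, not permissible state transitions.

```python
def mean_score(scores):
    """Plain aggregate: every signal counts equally, so one outlier is diluted."""
    return sum(scores) / len(scores)


def decayed_score(scores, decay=0.5):
    """Recency-weighted average: the newest signal has weight 1 and each
    older signal's weight is multiplied by `decay` once per step back."""
    weighted_sum = 0.0
    total_weight = 0.0
    weight = 1.0
    for score in reversed(scores):  # walk from newest to oldest
        weighted_sum += weight * score
        total_weight += weight
        weight *= decay
    return weighted_sum / total_weight


history = [0.0] * 99 + [10.0]  # benign for 99 signals, then one anomaly

print(round(mean_score(history), 3))     # 0.1
print(round(decayed_score(history), 3))  # 5.0
```

The plain mean buries the anomaly at 0.1, while the recency-weighted variant reports about half of the anomaly's own score immediately after it happens.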
--Ray
Your point about the "verified, expected behavior profile" is correct, but it requires a level of precision in the attestation that I rarely see implemented. A typical SLSA provenance attestation tells me the build command and repository hash, not the resulting syscall profile.
The baseline you need for this to work isn't just a binary hash, it's a behavioral manifest. For instance, a known-good `curl` artifact's profile might be "`connect`, `sendto`, and `recvfrom` on ports 80 and 443 to any network; open for write only on file descriptors 3 and 4". Generating that manifest requires either static analysis of the binary to model its permissible syscalls, or a prior, trusted execution under a tight seccomp-bpf filter to record its normal behavior. The plugin doesn't have an interface for ingesting that kind of manifest, so any "expected behavior" is just a human-authored guess, which is no better than the generic patterns you criticized.
It's creating a false sense of precision.
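For concreteness, here is roughly what such a manifest and a check against it could look like in Python. This is a sketch under assumptions: the `BehaviorManifest` and `SyscallEvent` types are hypothetical, the plugin has no such interface today, and I've modeled "open for write only on file descriptors 3 and 4" as `write` events restricted to those descriptors.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class BehaviorManifest:
    """Hypothetical per-artifact profile of permitted runtime behavior."""
    allowed_syscalls: FrozenSet[str]
    allowed_ports: FrozenSet[int]
    allowed_write_fds: FrozenSet[int]


@dataclass(frozen=True)
class SyscallEvent:
    """One observed syscall; port and fd are set only where they apply."""
    name: str
    port: Optional[int] = None
    fd: Optional[int] = None


# Mirrors the example curl profile described above.
CURL_MANIFEST = BehaviorManifest(
    allowed_syscalls=frozenset({"connect", "sendto", "recvfrom", "write"}),
    allowed_ports=frozenset({80, 443}),
    allowed_write_fds=frozenset({3, 4}),
)


def violations(manifest: BehaviorManifest, events: List[SyscallEvent]) -> List[SyscallEvent]:
    """Return the observed events that the manifest does not permit."""
    bad = []
    for ev in events:
        if ev.name not in manifest.allowed_syscalls:
            bad.append(ev)  # syscall is not in the profile at all
        elif ev.port is not None and ev.port not in manifest.allowed_ports:
            bad.append(ev)  # network activity on an unexpected port
        elif ev.name == "write" and ev.fd not in manifest.allowed_write_fds:
            bad.append(ev)  # write to an unexpected file descriptor
    return bad
```

The check itself is trivial. The hard part remains producing a trustworthy manifest in the first place.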
Syscalls don't lie.
Correlation is definitely the core weakness. I think the `gcc` vs `curl` example exposes a deeper issue with their signal taxonomy itself. Grouping "filesystem writes" as a single signal discards the critical distinction between an `open()` that sets `O_CREAT` and one that passes `O_WRONLY` alone - the first can create a new file, the second only modifies an existing one. A compiler creating a new object file in `/tmp` is expected; a network fetcher doing the same is a major red flag.
The plugin's current model can't encode that distinction because it's operating at too high a level of abstraction. It needs lower-level syscall granularity, at least as an optional signal source, to make those behavioral baselines you mentioned even possible.
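A minimal sketch of the distinction, using the real flag constants from Python's `os` module. The three labels returned by `classify_open` are my own hypothetical taxonomy, not anything the plugin emits, and note that `O_CREAT` only means the call *may* create the file (it also succeeds on an existing one unless `O_EXCL` is set).

```python
import os

# Access-mode bits of an open() flags value (equivalent to O_ACCMODE on Linux).
ACCESS_MASK = os.O_RDONLY | os.O_WRONLY | os.O_RDWR


def classify_open(flags: int) -> str:
    """Label an open() call by what it can do to the filesystem."""
    if flags & os.O_CREAT:
        return "may_create"  # can bring a new file into existence
    if (flags & ACCESS_MASK) in (os.O_WRONLY, os.O_RDWR):
        return "modify"      # can only write to a file that already exists
    return "read"


print(classify_open(os.O_WRONLY | os.O_CREAT))  # may_create
print(classify_open(os.O_WRONLY))               # modify
print(classify_open(os.O_RDONLY))               # read
```

A coarse "filesystem write" signal collapses the first two cases into one, which is exactly the information a per-tool baseline would need.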
r
Yeah, that's a great point about granularity. If the plugin can't see the difference between creating and writing to an existing file, its risk score is basically noise for a lot of workloads.
I've been tinkering with a Rust agent that uses `landlock` for this exact reason. You can define rules with that level of detail - which operations on which file hierarchies. The problem is turning that into a simple "score." How do you weight `O_CREAT` under `/tmp` vs `O_WRONLY` to `/etc`? You'd need that syscall-level view first, like you said, before any scoring algorithm could even start to make sense.
Maybe the plugin should just output the raw signals and let a separate policy engine handle the scoring logic. Trying to bundle both seems to be where it falls apart.
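Here's a rough Python sketch of what I mean by keeping scoring separate: the agent emits raw `(operation, path)` signals and a standalone table assigns the weights. The weights, the default, and the function names are all made up for illustration; picking defensible numbers is the actual open question.

```python
from pathlib import PurePosixPath

# Hypothetical weight table: (operation, path prefix, weight).
WEIGHTS = [
    ("create", "/tmp", 1),
    ("modify", "/etc", 8),
]
DEFAULT_WEIGHT = 3  # assumed weight for anything the table does not cover


def weight_for(operation: str, path: str) -> int:
    """Return the weight of the longest matching prefix rule, else the default."""
    target = PurePosixPath(path)
    ancestors = (target, *target.parents)
    best_len, best_weight = -1, DEFAULT_WEIGHT
    for op, prefix, weight in WEIGHTS:
        prefix_path = PurePosixPath(prefix)
        # Match on whole path components so "/tmpfoo" does not match "/tmp".
        if op == operation and prefix_path in ancestors:
            if len(prefix_path.parts) > best_len:
                best_len, best_weight = len(prefix_path.parts), weight
    return best_weight


def score(signals) -> int:
    """Score a list of raw (operation, path) signals emitted by the agent."""
    return sum(weight_for(op, path) for op, path in signals)


print(score([("create", "/tmp/a.o"), ("modify", "/etc/passwd")]))  # 9
```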
unsafe { /* not here */ }
Separating the signals from the scoring just moves the problem. Now you have two vendors to blame when it fails audit.
Your landlock example proves the point. The policy *is* the score. A binary either has the rights or it doesn't. Any "risk" scoring layered on top is just decorative.
So your last sentence is backwards. Bundling them is the only way it could ever be coherent. The failure is thinking a numeric score has any meaning here. It's a compliance checkbox, not a security control.
Compliance is security.