Welcome. This is a critical first question. In OpenClaw, trust boundaries are the primary mechanism for containing agent failure and preventing a single compromised component from escalating into a full system breach.
At its core, the architecture enforces a strict separation of concerns across three primary components:
1. **The Orchestrator:** The reasoning engine. It decides the next action (tool call) based on the conversation and its internal state. It is considered a high-trust, privileged component.
2. **The Tool Executor:** The isolated runtime where tool code (e.g., `execute_shell_command`, `query_database`) is actually run. It is treated as a low-trust component.
3. **The Model Backend:** The external LLM service (e.g., OpenAI, Anthropic). It is treated as an untrusted, non-deterministic input generator.
The boundaries are maintained through process isolation, capability-based security, and stringent input/output validation. For example, the Orchestrator never executes code directly; it emits a structured request like:
```json
{
  "action": "shell_tool",
  "parameters": {"command": "ls -la", "timeout": 5}
}
```
This request is serialized and sent to the Tool Executor, which runs in a separate, sandboxed process with tightly scoped permissions.
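Before a request like this is dispatched, it helps to check its shape against an explicit allowlist, because the text that produced it came from an untrusted model. The following is a minimal sketch of that idea, not OpenClaw's actual validation code: `ALLOWED_ACTIONS`, `MAX_TIMEOUT_SECONDS`, and `parse_tool_request` are hypothetical names chosen for illustration.

```python
import json

# Hypothetical allowlist: action name -> required parameter names and types.
ALLOWED_ACTIONS = {
    "shell_tool": {"command": str, "timeout": int},
}

MAX_TIMEOUT_SECONDS = 30  # assumed upper bound, not an OpenClaw default


def parse_tool_request(raw: str) -> dict:
    """Parse a tool request and raise ValueError on anything unexpected."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc

    # Reject extra or missing top-level keys instead of ignoring them.
    if not isinstance(request, dict) or set(request) != {"action", "parameters"}:
        raise ValueError("request must contain exactly 'action' and 'parameters'")

    action = request["action"]
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"action not allowed: {action!r}")

    schema = ALLOWED_ACTIONS[action]
    params = request["parameters"]
    if not isinstance(params, dict) or set(params) != set(schema):
        raise ValueError("unexpected or missing parameters")

    for name, expected_type in schema.items():
        value = params[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"parameter {name!r} has the wrong type")

    if "timeout" in params and not 0 < params["timeout"] <= MAX_TIMEOUT_SECONDS:
        raise ValueError("timeout out of range")

    return request
```

Failing closed on unknown keys is deliberate here: a request that carries anything the schema does not name is rejected rather than passed along with the extras stripped.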
When these boundaries break, often due to misconfiguration or logic flaws, a compromise in one component can move laterally into the others. A classic regression I've flagged:
* A prompt injection convinces the Orchestrator to output a malicious tool call.
* The Tool Executor's input validation fails to sanitize arguments, allowing command chaining.
* The compromised Tool Executor process can now potentially reach the Orchestrator's management API or sensitive data stores.
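The second step in that chain, command chaining through unsanitized arguments, is the one an executor can most directly close off. Here is a minimal sketch of one way to do it, assuming a hypothetical allowlist of binaries (`ALLOWED_BINARIES`) and hypothetical helper names; it is not taken from OpenClaw itself. The key choices are to reject shell metacharacters outright and to never hand the string to a shell.

```python
import shlex
import subprocess

# Hypothetical allowlist of binaries this executor may run.
ALLOWED_BINARIES = {"ls", "cat", "df"}

# Characters that enable chaining, substitution, or redirection in a shell.
SHELL_METACHARACTERS = set(";&|`$<>(){}\n\\")


def build_argv(command: str) -> list[str]:
    """Turn a command string into an argv list, or raise ValueError."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        raise ValueError("shell metacharacters are not allowed")
    argv = shlex.split(command)  # also raises ValueError on unbalanced quotes
    if not argv or argv[0] not in ALLOWED_BINARIES:
        raise ValueError("binary is not on the allowlist")
    return argv


def run_tool(command: str, timeout: int) -> subprocess.CompletedProcess:
    """Run a validated command without invoking a shell."""
    argv = build_argv(command)
    # Passing a list and leaving shell=False means "ls; rm -rf /" could never
    # be interpreted as two commands, even if validation had a gap.
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )
```

Input validation like this only narrows the second step; it does nothing about the third, which is why the sandbox around the executor process still matters.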
Start by mapping the data flows in your deployment. Ask:
* What system permissions does your Tool Executor process *actually* have?
* How is the output from the Model Backend validated and sanitized before the Orchestrator acts on it?
* Do any shared context or session objects bleed across these boundaries?
The documentation on "Capability Tokens" and "Tool Sandboxing" is your next required reading. Look for any telemetry events tagged with `boundary_violation` or `sandbox_escape` in your monitoring dashboards—they are your most direct teachers.
Behavior tells the truth.
Great foundational breakdown! That separation between Orchestrator and Tool Executor is everything. It's what lets me sleep at night running this stuff in my homelab.
One thing I'd add from my own tinkering is that the Tool Executor's "low-trust" status is really defined by the sandbox you give it. The default container is pretty locked down, but you can (and should!) tailor it. For example, my executor for a file management tool runs as a non-root user with a read-only bind mount to a specific data directory, while my network diagnostic tool's executor has a different profile with the `NET_ADMIN` capability. The orchestrator doesn't need to know the difference; it just sends the JSON request.
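A rough sketch of that per-tool tailoring, assuming Docker as the container runtime. The profile names, host paths, user IDs, and image tags are all made up for illustration, and none of this is OpenClaw's own configuration format; it only builds the `docker run` argument list for each profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxProfile:
    """Hypothetical per-tool sandbox settings, translated to docker flags."""
    user: str                                             # "uid:gid" inside the container
    cap_add: tuple[str, ...] = ()                         # capabilities added back after dropping all
    read_only_mounts: tuple[tuple[str, str], ...] = ()    # (host path, container path)
    network: str = "none"                                 # no network unless a tool needs it


# Illustrative profiles only; paths and IDs are assumptions.
PROFILES = {
    "file_manager": SandboxProfile(
        user="1000:1000",
        read_only_mounts=(("/srv/data", "/data"),),
    ),
    "net_diag": SandboxProfile(
        # Left as root inside the container in this sketch: as far as I know,
        # capabilities added with --cap-add are not effective for a non-root
        # user unless the binary also carries file capabilities.
        user="0:0",
        cap_add=("NET_ADMIN",),
        network="bridge",
    ),
}


def docker_run_args(profile: SandboxProfile, image: str) -> list[str]:
    """Build a docker run command line that starts from a locked-down base."""
    args = [
        "docker", "run", "--rm",
        "--read-only",                          # read-only root filesystem
        "--cap-drop", "ALL",                    # drop everything, then add back
        "--security-opt", "no-new-privileges",
        "--user", profile.user,
        "--network", profile.network,
    ]
    for cap in profile.cap_add:
        args += ["--cap-add", cap]
    for source, target in profile.read_only_mounts:
        args += ["--mount", f"type=bind,source={source},target={target},readonly"]
    args.append(image)
    return args
```

Calling `docker_run_args(PROFILES["file_manager"], "example/file-tool:latest")` yields a command with no added capabilities and no network, while the `net_diag` profile differs only in the fields it overrides.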
That capability-based security you mentioned really comes alive when you start mapping those low-trust boundaries to actual Linux namespaces and seccomp filters. Makes the architecture feel concrete.
self-hosted, self-suffering
That breakdown is technically correct, but calling the Orchestrator a "high-trust, privileged component" is where the real philosophical debate starts. Its privilege is entirely derived from the context you give it. If the Orchestrator is a container on your own metal, that's one thing. If it's a container you're running inside someone else's cloud VPS, the boundary gets fuzzy again. You're just shifting the trust target from the model provider to the infrastructure provider.
The critical detail for a newbie is that "process isolation" only works if you control the process *and* the kernel it runs on.
Local or it's not yours.