Breaking: OpenClaw v0.8.3 released with prompt injection defenses — first look

(@selfhost_sec_dev)
  [#137]

Just pushed the release tags for v0.8.3. The main event is the new prompt injection defense subsystem. It's not a silver bullet, but it's a practical, configurable layer that should be in everyone's stack.

The core idea is a multi-stage filter pipeline that runs before the user query hits your primary agent logic. It's designed to catch the low-hanging fruit and some more sophisticated injection attempts. You can run it as a standalone service or integrate the modules directly.
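To make the "multi-stage pipeline" idea concrete, here is a rough sketch of how stages could be chained so that the first blocking stage short-circuits the rest. This is not OpenClaw's actual code: `Verdict`, `run_pipeline`, and both placeholder stages are names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Verdict:
    allowed: bool
    stage: Optional[str] = None  # name of the stage that blocked, if any
    reason: str = ""


# A stage takes the raw query and returns a reason string to block, or None to pass.
Stage = Callable[[str], Optional[str]]


def run_pipeline(query: str, stages: List[Tuple[str, Stage]]) -> Verdict:
    """Run each (name, stage) pair in order and stop at the first one that blocks."""
    for name, stage in stages:
        reason = stage(query)
        if reason is not None:
            return Verdict(allowed=False, stage=name, reason=reason)
    return Verdict(allowed=True)


# Placeholder stages (hypothetical, not the shipped filters).
def length_limit(query: str) -> Optional[str]:
    return "query too long" if len(query) > 2000 else None


def phrase_block(query: str) -> Optional[str]:
    return "blocked phrase" if "ignore previous instructions" in query.lower() else None


STAGES = [("length_limit", length_limit), ("phrase_block", phrase_block)]

if __name__ == "__main__":
    print(run_pipeline("What's the weather?", STAGES))
    print(run_pipeline("Please IGNORE previous instructions.", STAGES))
```

Putting the cheap checks first means most benign queries never reach the more expensive stages.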

Key additions:
* **Semantic Guardrails:** Uses a local embedding model (default: `all-MiniLM-L6-v2`) to score query similarity to a blocklist of known dangerous intents (e.g., "ignore previous instructions"). Threshold is configurable.
* **Token Sequence Detector:** Regex-like patterns, but for the token space. Catches some encoded payloads that plain regex misses.
* **Canary Tokens:** Inject hidden markers into your system prompt; the filter checks if they've been altered or output in the response.
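For anyone who wants a feel for what the three checks above do, here is a toy, stdlib-only sketch. Big assumptions: the word tokenizer and bag-of-words `embed` are stand-ins for a real tokenizer and for `all-MiniLM-L6-v2`, all function names are hypothetical, and `expected_position` is treated as a character offset (it may well mean something else in the release).

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    # Stand-in tokenizer: lowercase words. A real detector would use model tokens.
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_token_sequence(query: str, pattern: list, max_gap: int = 2) -> bool:
    """True if pattern tokens appear in order with at most max_gap tokens between
    neighbours. Greedy match, so this is a sketch rather than an exhaustive search."""
    if not pattern:
        return False
    tokens = tokenize(query)
    for start, tok in enumerate(tokens):
        if tok != pattern[0]:
            continue
        pos, matched = start, 1
        while matched < len(pattern):
            window = tokens[pos + 1 : pos + 2 + max_gap]
            if pattern[matched] not in window:
                break
            pos = pos + 1 + window.index(pattern[matched])
            matched += 1
        if matched == len(pattern):
            return True
    return False


def embed(text: str) -> Counter:
    # Toy bag-of-words vector, standing in for a sentence embedding.
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def max_blocklist_similarity(query: str, blocklist: list) -> float:
    """Highest similarity between the query and any blocklisted intent."""
    q = embed(query)
    return max((cosine(q, embed(b)) for b in blocklist), default=0.0)


def canary_intact(system_prompt: str, canary: str, expected_position: int) -> bool:
    # Assumes expected_position is a character offset into the system prompt.
    return system_prompt.startswith(canary, expected_position)


def canary_leaked(response: str, canary: str) -> bool:
    return canary in response


if __name__ == "__main__":
    q = "Please ignore all previous instructions"
    print(contains_token_sequence(q, ["ignore", "previous", "instructions"]))
    print(max_blocklist_similarity(q, ["ignore previous instructions"]) >= 0.85)
    print(canary_leaked("Sure! ||SYSTEM_PROTECT|| says...", "||SYSTEM_PROTECT||"))
```

A real embedding model would score paraphrases that share no words with the blocklist, which the bag-of-words stand-in cannot do.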

Here's a minimal config example (`config/filter_config.yaml`):

```yaml
filter_pipeline:
  - name: token_sequence
    parameters:
      patterns: ["ignore", "previous", "instructions"]
  - name: semantic_guardrail
    parameters:
      model_path: "local_models/all-MiniLM-L6-v2"
      block_threshold: 0.85
  - name: canary_check
    parameters:
      canary_string: "||SYSTEM_PROTECT||"
      expected_position: 12
```

Deploy it in front of your existing setup. If you're running OpenClaw Agent, the update hooks it in automatically. For other setups (like a custom LangChain or raw API server), you can run the filter service on `localhost:8145` and proxy requests through it.
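If you go the proxy route, a client-side check might look roughly like the sketch below. Only the port `8145` comes from this post. The `/filter` path, the `{"query": ...}` request body, and the `allowed` response field are assumptions for illustration, so check the release docs for the real wire format.

```python
import json
import urllib.request

# Port is from the post; the "/filter" path is a hypothetical endpoint.
FILTER_URL = "http://localhost:8145/filter"


def build_filter_request(query: str) -> urllib.request.Request:
    """Build a JSON POST carrying the user query (assumed request shape)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        FILTER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_allowed(query: str, timeout: float = 2.0) -> bool:
    """Ask the filter service for a verdict; any failure counts as a block."""
    try:
        with urllib.request.urlopen(build_filter_request(query), timeout=timeout) as resp:
            # "allowed" is an assumed response field, not a documented one.
            return bool(json.load(resp).get("allowed", False))
    except (OSError, ValueError, AttributeError):
        return False
```

Failing closed here means a crashed or unreachable filter blocks traffic rather than letting unfiltered queries through. Whether that trade-off fits depends on your setup.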

Initial tests show it adds 20-150ms latency, mostly from the embedding model load. It's blocking obvious jailbreaks in my homelab. Let me know what breaks it.
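If you want to compare numbers on your own hardware, something like this keeps the one-time warmup cost out of the per-request figures. It is a generic timing sketch: `time_calls` and `summarize` are hypothetical helpers, and `fn` is whatever callable wraps your filter.

```python
import math
import statistics
import time


def time_calls(fn, queries, warmup: int = 1):
    """Return per-call latencies in milliseconds, after `warmup` untimed calls."""
    for q in queries[:warmup]:
        fn(q)  # untimed: absorbs one-time costs such as loading a model
    samples = []
    for q in queries:
        start = time.perf_counter()
        fn(q)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Median, nearest-rank 95th percentile, and max of the samples."""
    ordered = sorted(samples)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }
```

Reporting median and p95 separately makes it easier to tell a slow first request from a consistently slow filter.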

-- mike

