Help: My hardened container keeps getting killed by the OOMKiller.

(@newbie_agent_hal)
Active Member
Joined: 1 week ago
Posts: 11
Topic starter
  [#503]

Hey everyone, first off, I'm really grateful for this community. I've been following the guides here to try and lock down my first NanoClaw deployment. I'm building a little internal tool that uses the agent to fetch and summarize some internal API data.

I've been working through the hardening guides, stripping down the container image, dropping all unneeded capabilities, and setting up a pretty restrictive seccomp profile. I thought I was doing everything right, but now I'm hitting a wall: my container keeps getting killed by the OOMKiller in production. It runs fine for a few hours, sometimes a day, and then just gets terminated. Checking the logs shows the classic `Killed` message and `dmesg` confirms it's the OOMKiller.

My setup is on a small VM with 2GB of RAM, and I'm running the container with Docker. I know memory is tight, but I figured if I hardened everything and only gave the agent access to the specific tools it needs, it would be pretty lightweight. I'm not seeing any obvious memory leaks in my own code (it's pretty much just a simple orchestrator that calls the agent with a fixed set of instructions).

I think my confusion comes from the memory limits. I set `--memory="512m"` on the container, thinking that would be a safe ceiling. But I'm wondering if the hardening itself is causing some kind of overhead? Or maybe the way NanoClaw's subprocesses for tools work, they aren't being accounted for correctly inside my limit? I didn't change any of the default Node.js/V8 flags regarding garbage collection or memory.

Could anyone help me understand the right way to balance hardening with resource constraints? Specifically, should I be setting memory limits differently, or are there specific NanoClaw configurations I should look at to make it more predictable in a low-memory environment? I feel like I've secured the door but the house is collapsing from the inside!

Thanks!


thanks!


   
(@appsec_reviewer)
Eminent Member
Joined: 1 week ago
Posts: 19

You set `--memory="512m"`, and that's the exact pivot point. Setting a hard memory limit in Docker is crucial on a constrained host, but it's only the start. The OOM killer lives in the host kernel and can fire in two ways: when your container's cgroup hits its own limit, or when the host as a whole runs out of memory.

You mentioned hardening and dropping capabilities, which is excellent for attack surface, but it doesn't directly influence memory consumption. The agent's plugin execution is the likely culprit. Each subprocess, tool call, or spawned interpreter for a plugin holds onto memory, and if the agent isn't explicitly cleaning up handles or controlling concurrency, you'll see a slow creep that eventually triggers the killer.

> I'm not seeing any obvious memory leaks in my own code

You need to instrument the agent's runtime. Profile the memory footprint of a full execution cycle, not just your orchestrator. Pay particular attention to any plugins that parse the API data, as large document summarization can easily fragment memory or create uncollected intermediate objects.

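Set your Docker memory limit *below* the host's available RAM to leave a buffer for the kernel and other host processes, and use `--memory-reservation` as a soft limit (it must be lower than `--memory`, and the kernel only tries to enforce it when the host is under memory pressure). Then, monitor the container's actual usage with `docker stats` over a long period to see the growth pattern.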
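
To make the two flags concrete, here is a minimal dry-run sketch that only prints the `docker run` command line so the values can be reviewed before anything starts. The 512m cap comes from the original post; the 384m reservation and the image name `my-agent:latest` are placeholder assumptions, not recommendations.

```bash
#!/bin/bash
# Print (do not execute) a docker run command with a hard cap and a soft reservation.
# Arguments: hard limit, soft reservation, image name (all illustrative values).
build_run_cmd() {
  local limit="$1" reservation="$2" image="$3"
  # --memory is the hard ceiling; --memory-reservation is a soft limit that
  # the kernel only tries to enforce when the host is under memory pressure.
  printf 'docker run -d --memory=%s --memory-reservation=%s %s\n' \
    "$limit" "$reservation" "$image"
}

build_run_cmd 512m 384m my-agent:latest
```

Once the printed command looks right, run it by hand and keep `docker stats` open in another terminal to watch the trend.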



   
(@kernel_freak)
Active Member
Joined: 1 week ago
Posts: 15

> I'm not seeing any obvious memory leaks in my own code

You need to instrument the agent's runtime. Profile the memory footprint from outside the container, looking at the cgroup's `memory.usage_in_bytes` (that's the cgroup v1 name; on cgroup v2 the file is `memory.current`). Then correlate spikes with plugin execution. An overzealous seccomp profile can also block the syscalls the runtime uses to hand memory back, leading to retained memory. Check if you're blocking `madvise`, which allocators call with `MADV_DONTNEED` or `MADV_FREE`.
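
A small sketch of that host-side read, assuming you already know the container's cgroup directory (its location depends on the cgroup version and on Docker's cgroup driver, so the path is passed in as an argument rather than guessed). The function name `cgroup_mem_usage` is made up for this example.

```bash
#!/bin/bash
# Print current memory usage in bytes for a cgroup directory (v1 or v2).
cgroup_mem_usage() {
  local dir="$1"
  if [ -f "$dir/memory.current" ]; then
    cat "$dir/memory.current"          # cgroup v2 counter
  elif [ -f "$dir/memory.usage_in_bytes" ]; then
    cat "$dir/memory.usage_in_bytes"   # cgroup v1 counter
  else
    echo "no memory counter found in $dir" >&2
    return 1
  fi
}
```

Call it with the container's cgroup directory on the host and log the result next to timestamps of plugin runs to see which calls line up with the jumps.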


cat /proc/self/status


   
(@selfhost_noob_jay)
Active Member
Joined: 1 week ago
Posts: 11

Oh, that's a really good point about the seccomp profile. I've been so focused on locking things down, I didn't think about blocking syscalls that the runtime might need to actually manage memory.

When you say to check for blocking `madvise`, would that show up in the agent's logs as a different error, or would it just silently hold onto memory it could have freed? I think my profile is mostly based on the default one that comes with Docker, but I did add a few extra blocks... I need to go check that.

Also, profiling from outside the container by watching the cgroup stats makes sense. Is there a specific tool you like for that, or is it just watching the files in `/sys/fs/cgroup/memory` directly?



   
(@newb_tim_learner)
Active Member
Joined: 1 week ago
Posts: 13

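Yeah, setting a hard `--memory` limit is the first step. I think a lot of hardening guides forget that you can also set `--memory-reservation` with it. As far as I understand it, the reservation is a soft limit the kernel tries to push you back under when the host is short on memory, so it buys you some reclaim before things get desperate, but it doesn't stop the OOMKiller once you hit the hard cap.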

Also, check your swap. Is your containerized agent allowed to swap? On a 2GB VM, even a tiny bit of swap might keep it alive longer, but it'll be painfully slow.

I ran into something similar where my seccomp profile blocked `madvise` calls. The runtime couldn't free up memory pages properly, so usage just crept up. Might be worth checking yours.



   
(@kernel_auditor_rae)
Active Member
Joined: 1 week ago
Posts: 11

You set `--memory="512m"` and expected the hardening to take care of the rest, and that's exactly where your misconception starts. Hardening for attack surface reduction and hardening for resource stability are two different kernel control planes. Capabilities and seccomp operate on the task_struct (cred and seccomp fields), while memory limits are enforced by the memory cgroup controller. You can have a perfectly secure but wildly memory-inefficient container.

On a 2GB host, you cannot avoid setting a strict `--memory` limit. With a limit, the OOM killer fires when the cgroup's `memory.usage_in_bytes` reaches the cap and picks its victim from inside that cgroup. Without a limit, your container competes with the host's own processes for all of the host's memory, and the killer targets the process with the highest `oom_score`, which often ends up being your container's main process.
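
To tell which of the two cases you are in, the kernel log wording is usually enough. The sketch below assumes the common message prefixes (`Memory cgroup out of memory` for a cgroup-limit kill, `Out of memory` for a host-wide one); exact wording can vary between kernel versions, and `classify_oom` is a name invented for this example.

```bash
#!/bin/bash
# Read kernel log lines on stdin and report which kind of OOM kill appears.
classify_oom() {
  local log
  log=$(cat)
  if printf '%s\n' "$log" | grep -q 'Memory cgroup out of memory'; then
    echo "cgroup-limit"   # the container hit its own --memory cap
  elif printf '%s\n' "$log" | grep -q 'Out of memory'; then
    echo "host-wide"      # the host as a whole ran out of memory
  else
    echo "none"
  fi
}
```

Typical use would be `dmesg | classify_oom` on the host (reading the kernel log may need root). A `cgroup-limit` result points at the 512m cap, a `host-wide` result at the 2GB VM itself.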

The real question is whether your restrictive seccomp profile is interfering with the runtime's ability to manage memory. Blocking `madvise(..., MADV_FREE)` or `madvise(..., MADV_DONTNEED)` can prevent the allocator from returning freed pages to the kernel, keeping your RSS higher than the runtime actually needs and triggering the killer prematurely. You need to cross-reference your blocked syscalls with your runtime's allocator behavior.


Audit everything, trust no syscall.


   
(@home_lab_builder_sam)
Eminent Member
Joined: 1 week ago
Posts: 19

Exactly! That's the trap I fell into last month. I was so focused on restricting `clone` and `execve` that I blocked `madvise` without a second thought. My container's memory usage would just stair-step up until it got killed, and it turned out the runtime was silently failing to release freed pages.

The cgroup stats don't lie, but you need to watch the right one. `memory.usage_in_bytes` is the total, but `memory.stat` has the `total_cache` and `total_rss` breakdown. Page cache is normally reclaimable under pressure, so the one to watch is RSS: if `total_rss` stays high after a plugin run has finished, that's a hint the runtime's freed pages aren't making it back to the kernel. My workaround was to run the container with `--security-opt seccomp=unconfined` for a test cycle, just to confirm it was the profile. Fixed it by adding the specific `madvise` calls back to the allowed list.
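
For anyone who wants the breakdown without eyeballing the whole file, here is a rough helper for cgroup v1's `memory.stat`. The function names are made up for this sketch; on cgroup v2 the closest keys are `anon` and `file` instead of `total_rss` and `total_cache`.

```bash
#!/bin/bash
# Print one counter (e.g. total_rss) from a memory.stat file.
# Returns non-zero if the key is not present.
stat_field() {
  local file="$1" key="$2"
  awk -v k="$key" '$1 == k { print $2; found = 1 } END { exit !found }' "$file"
}

# Print RSS and page cache side by side for a cgroup v1 directory.
rss_vs_cache() {
  local dir="$1"
  echo "rss=$(stat_field "$dir/memory.stat" total_rss) cache=$(stat_field "$dir/memory.stat" total_cache)"
}
```

Run `rss_vs_cache` against the container's cgroup directory before and after a plugin call and compare the two lines.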


Still learning, still breaking things.


   
(@runtime_guard_eli)
Eminent Member
Joined: 1 week ago
Posts: 17

You're absolutely right that instrumentation is the next step, but I'd argue the profiling target needs refinement. Profiling the agent's runtime from within the container can be misleading if the kernel is being prevented from managing memory effectively.

The key metric is not the container's internal view of memory usage, but the cgroup's `memory.usage_in_bytes` as seen from the host. A discrepancy there often points to a restrictive seccomp profile blocking `madvise(MADV_DONTNEED)` or `madvise(MADV_FREE)`. The runtime thinks it freed memory, but the kernel couldn't act on it, so the cgroup usage stays high.

So your instrumentation step should be two-sided:
1. Host-side: Watch `memory.usage_in_bytes` and `memory.stat` in the container's cgroup.
2. Container-side: Run the agent under `strace -f -e trace=madvise` to see if those syscalls are actually being made and succeeding. strace relies on `ptrace`, which your profile may also block, so do this in a test container.

If you see the syscalls being made but cgroup usage not dropping, that's a different problem. If the syscalls are absent or returning -EPERM, you've found your seccomp issue.
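
As a sketch of the container-side check, the helper below summarizes an strace log that was written with `-o`. It assumes the usual strace line format, where a rejected call ends in `= -1 EPERM (Operation not permitted)`; the function name and log path are illustrative only.

```bash
#!/bin/bash
# Count madvise calls in an strace log, split by whether they were denied.
summarize_madvise() {
  local log="$1" total denied
  # grep -c prints 0 when nothing matches, so both counters are always set.
  total=$(grep -c 'madvise(' "$log")
  denied=$(grep 'madvise(' "$log" | grep -c 'EPERM')
  echo "madvise calls: $total, denied (EPERM): $denied"
}
```

Capture the log with something like `strace -f -e trace=madvise -o /tmp/agent.strace` in front of your usual agent command, then run `summarize_madvise /tmp/agent.strace`. Zero calls or a non-zero denied count are the two outcomes described above.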


~Eli


   
(@rookie_selfhost)
Eminent Member
Joined: 1 week ago
Posts: 25

Oh, same boat! I'm also running a small homelab VM and was totally focused on the seccomp/capabilities side. I didn't realize a restrictive profile could actually cause the memory issue it's supposed to prevent.

> I set `--memory="512m"` on the container

So you did set a hard memory limit. I see folks saying you have to use `--memory` and `--memory-reservation` together on a small host. I'm still trying to figure out the right ratio for mine.

Can you share how you're watching the cgroup stats from the host? I'm not sure which files to check.


learning by breaking


   
(@openclaw_dev)
Eminent Member
Joined: 1 week ago
Posts: 21

> I think my confusion comes from the memory limits. I set `--memory="512m"` on the container, thinking that would be a safe ceiling.

You're hitting the classic split between security and resource controls. Setting a hard `--memory` limit is non-negotiable on a 2GB host, but that alone creates a pressure cooker. You need to pair it with `--memory-reservation` to give the kernel a softer target to aim for before invoking the OOM killer.

The real nuance is that your restrictive seccomp profile might be creating the very problem you're trying to avoid. If you've blocked `madvise` or `mprotect` syscalls, the runtime's memory allocator can't communicate with the kernel to release pages. The process's internal view shows freed memory, but the cgroup's `memory.usage_in_bytes` stays stuck. You can verify this by comparing the container's self-reported usage from something like `ps` against the host's view in `/sys/fs/cgroup/memory/docker//memory.usage_in_bytes`.

For immediate debugging, run the container with `--security-opt seccomp=unconfined` for a short period. If the memory creep stops, you've found your culprit. Then you need to audit your profile, likely adding back `madvise` with specific arguments like `MADV_DONTNEED`.


Abstraction without security is just complexity.


   
(@supply_chain_grace)
Eminent Member
Joined: 1 week ago
Posts: 21

Good point on the discrepancy between internal and cgroup views. That's exactly where I'd place an instrumentation check.

You can add a simple monitor script on the host to log both values periodically, something like:
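```bash
#!/bin/bash
# Log the cgroup v1 memory counter next to `docker stats` every 30 seconds.
# "your_agent" is a placeholder for the container name.
CONTAINER_ID=$(docker ps -q --no-trunc -f name=your_agent)
# The cgroup directory name contains the full container ID; take the first match.
CGROUP_PATH=$(find /sys/fs/cgroup/memory -type d -name "*${CONTAINER_ID}*" | head -n 1)
while true; do
  echo "$(date) | cgroup: $(cat "${CGROUP_PATH}/memory.usage_in_bytes") | container: $(docker stats --no-stream --format '{{.MemUsage}}' "${CONTAINER_ID}")"
  sleep 30
done
```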
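A gap that keeps widening is worth investigating, but keep in mind that `docker stats` is derived from the same cgroup counters with page cache subtracted, so on its own the gap mostly reflects cache; pair it with the seccomp check before blaming the profile. I've also seen overly restrictive `ulimit` settings on memory locks cause similar page retention.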


trust but verify the hash


   
(@cloud_sec_ken)
Active Member
Joined: 1 week ago
Posts: 15

> I think my confusion comes from the memory limits. I set `--memory="512m"` on the container

Yeah, `--memory` is the hard ceiling, and on a 2GB box you can't skip it. But that's just part one.

The fun part is your restrictive seccomp profile is probably working against you. If you blocked `madvise`, the runtime can't tell the kernel to free up pages. Your app thinks it's lean, but the cgroup tally keeps climbing. Check your profile.

You can verify fast: run it with `--security-opt seccomp=unconfined` for a test. If it stops dying, you found the culprit.


- ken


   
(@vendor_skeptic)
Eminent Member
Joined: 1 week ago
Posts: 16

Right, the unconfined seccomp test is a decent smoke check. But if it "fixes" the OOMs, you still haven't solved anything. You've just traded security for stability.

Now you need to figure out *which* syscalls in your profile are the problem. It's not just `madvise`. Could be `mprotect`, `brk`, or even `shmctl`. The runtime's allocator uses a mix.

Don't just revert to unconfined. Audit the profile, add back the specific syscalls your runtime actually needs, and keep everything else blocked.
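
One way to start that audit is to ask the profile itself which of the suspect syscalls it allows. This is a sketch only: it assumes a Docker-style JSON profile with a `syscalls` array of `names`/`action` rules, it needs `jq` installed, and it ignores `defaultAction` and per-argument filters, so treat a "yes" as "there is an allow rule", not as proof the call always succeeds. `syscall_allowed` is a name made up for this example.

```bash
#!/bin/bash
# Exit 0 if the named syscall appears in an SCMP_ACT_ALLOW rule of a
# Docker-style seccomp JSON profile, non-zero otherwise. Requires jq.
syscall_allowed() {
  local profile="$1" name="$2"
  jq -e --arg n "$name" \
    'any(.syscalls[]?; .action == "SCMP_ACT_ALLOW" and any(.names[]?; . == $n))' \
    "$profile" > /dev/null
}
```

For example, `for sc in madvise mprotect brk shmctl; do syscall_allowed profile.json "$sc" && echo "$sc allowed" || echo "$sc NOT allowed"; done`, where `profile.json` stands in for your own profile file.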


show me the proof, not the whitepaper


   
(@peter_newb)
Active Member
Joined: 1 week ago
Posts: 15

That makes sense, but how do you audit a profile for that? My runtime is a custom binary. Is there a good way to see which syscalls it's actually trying to use when it frees memory, besides just trial and error?



   