
Anyone else having issues with persistent memory files not being encrypted at rest?

(@enthusiast_nina_g)
Eminent Member
Joined: 1 week ago
Posts: 13
Topic starter
  [#1038]

I've been conducting a post-mortem analysis on a recent container escape incident in our lab environment. During the forensic review, I noticed something concerning in the audit logs: a process was able to read sensitive application data from a memory-backed file system (`tmpfs` at `/dev/shm`) even after the parent container was terminated and re-instantiated.

This led me down a rabbit hole investigating persistent memory (PMEM) and `memfd`-backed file systems. The core issue appears to be that common encryption-at-rest solutions do not cover volatile or persistent memory regions by default: LUKS encrypts block devices and eCryptfs encrypts files on an underlying filesystem, and neither sits in the path of memory-backed files. Data written to `/dev/shm`, `/run/shm`, or via `memfd_create()` stays in plaintext.

Consider this simple demonstration. A process creates an in-memory file and writes sensitive data:

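```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Anonymous in-memory file; it is never backed by a block device. */
    int fd = memfd_create("secrets", 0);
    if (fd < 0) {
        perror("memfd_create");
        return 1;
    }
    const char *secret = "AUTH_KEY=supersecret123";
    if (write(fd, secret, strlen(secret)) < 0) {
        perror("write");
        return 1;
    }
    lseek(fd, 0, SEEK_SET);
    /* The plaintext now sits in shared-memory pages. */
    pause(); /* Block until a signal kills the process; nothing is zeroed on the way out */
    return 0;
}
```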

Post-termination, the contents of these pages can linger in RAM (by default, most kernels do not scrub pages when they are freed unless something like `init_on_free=1` is set) or, worse, in actual persistent memory (NVDIMMs). My testing with `pmem` namespaces on a test system confirms that filesystems on `ndctl`-created namespaces mounted with DAX bypass the block layer entirely, rendering block-level encryption ineffective.

**Key findings from my lab:**

* **Page Cache Retention:** Dirty pages from `tmpfs` can remain in the page cache long after file deletion (for example, while an open descriptor or mapping still references the file), where they are accessible via direct physical memory inspection or certain kernel debug interfaces.
* **PMEM/DAX Bypass:** Filesystems mounted with Direct Access (DAX) on persistent memory avoid the block layer. Full-disk encryption does not apply.
* **Container Shared Memory:** Kubernetes `emptyDir` with `medium: Memory` creates a `tmpfs` mount. Multi-container pods can leak data via this shared memory if not explicitly cleared.
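
To make the findings above easier to audit, here is a minimal sketch, assuming Linux with glibc, of how a process might check whether a path it is about to write secrets to is memory-backed. The function name `path_is_tmpfs` is hypothetical; `statfs()` and `TMPFS_MAGIC` are the standard interfaces. It only recognizes `tmpfs` and does not detect DAX mounts.

```c
#include <linux/magic.h>   /* TMPFS_MAGIC */
#include <sys/vfs.h>       /* statfs() */

/* Hypothetical helper: returns 1 if `path` lives on tmpfs,
   0 if it lives on some other filesystem, -1 if statfs() fails. */
int path_is_tmpfs(const char *path) {
    struct statfs sfs;
    if (statfs(path, &sfs) != 0)
        return -1;
    return sfs.f_type == TMPFS_MAGIC ? 1 : 0;
}
```

Calling `path_is_tmpfs("/dev/shm")` or pointing it at an `emptyDir` mount path would tell an application whether block-level encryption can apply to that location at all.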

**Potential mitigation paths I'm evaluating:**

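* Implementing a kernel module to hook `memfd_create()` and file opens under `/dev/shm` (`shm_open()` is a libc wrapper around those, not a system call of its own) to enforce encryption via a lightweight cipher (e.g., ChaCha20) for selected processes.
* Using `mlock()` and explicit zeroing of memory before termination in sensitive applications. A plain `memset()` can be optimized away by the compiler, so `explicit_bzero()` or an equivalent is the safer choice.
* For PMEM, configuring the namespace to use the `sector` mode (which adds a Block Translation Table, BTT) instead of `fsdax` or `devdax`, then applying LUKS on the resulting block device. This sacrifices some performance.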
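
The `mlock()` plus zeroing option can be sketched roughly as follows. This is an illustration under assumptions, not a hardened implementation: it assumes Linux with glibc 2.25 or later (for `explicit_bzero()`), and the `secret_buf_*` names are hypothetical. Locking is best effort, so the caller learns through `locked` whether the pages are actually pinned.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>     /* explicit_bzero() */
#include <sys/mman.h>   /* mmap(), mlock(), madvise(), munmap() */

/* Hypothetical helper: allocate page-aligned memory for a secret.
   *locked is set to 1 if mlock() succeeded (pages will not be swapped),
   0 otherwise. Returns NULL if the mapping itself fails. */
void *secret_buf_alloc(size_t len, int *locked) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *locked = (mlock(p, len) == 0);
    /* Best effort: keep the region out of core dumps. */
    (void)madvise(p, len, MADV_DONTDUMP);
    return p;
}

/* Zero the secret with a call the compiler will not elide. */
void secret_buf_wipe(void *p, size_t len) {
    explicit_bzero(p, len);
}

/* Wipe, then unmap. Unmapping also removes the memory lock.
   Returns 0 on success, -1 on failure. */
int secret_buf_free(void *p, size_t len) {
    secret_buf_wipe(p, len);
    return munmap(p, len);
}
```

Note that this only helps on orderly exit paths; a process killed by `SIGKILL` never reaches `secret_buf_free()`.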

My primary questions for the community:

* Are there existing, production-tested frameworks for transparent memory encryption in user-space for Linux?
* Has anyone successfully implemented a policy (e.g., via eBPF) to detect uncleared sensitive data in persistent memory regions?
* Is this considered a realistic threat model in your organization's hardening guides, or is it typically dismissed as requiring physical access?


Logs don't lie.


   
(@appsec_eval)
Eminent Member
Joined: 1 week ago
Posts: 17

Good catch. This is a known, often overlooked side effect of how the kernel handles memory pressure: it can page out `tmpfs` and `memfd` pages to swap. If you have encrypted swap, that's one layer, but the secrets themselves still sit in plaintext in RAM until the pages are evicted.

The real risk isn't just the persisting pages, it's a cold boot attack or a DMA attack if the physical hardware is accessible, or a cross-VM side channel on shared hosts like CVE-2015-2877 (which targets KSM memory deduplication rather than DMA). For containers, if the host kernel crashes or the memory isn't zeroed before being reallocated to a new container, that's your data leak.

You need to combine `mlock()`, which pins sensitive pages and prevents them from being swapped, with explicit zeroing before process exit. Even then, it's a defense-in-depth game.


trust, but verify — with sigtrap


   
(@cl0ud_watch)
Eminent Member
Joined: 1 week ago
Posts: 13

The swap encryption point is key. Many distros don't enable it by default, so that layer is often absent.

You're right about `mlock()` and zeroing, but in a container context, you're at the mercy of the orchestrator's security context. Locking more than the `RLIMIT_MEMLOCK` allowance requires `CAP_IPC_LOCK`, and granting that capability weakens your containment. It's a trade-off between locking pages and keeping the attack surface small.
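
Whether `CAP_IPC_LOCK` is needed at all depends on how much memory has to be locked: an unprivileged process can lock up to its `RLIMIT_MEMLOCK` soft limit. A minimal sketch of that check, assuming Linux and using a hypothetical function name `fits_memlock_limit`:

```c
#include <stddef.h>
#include <sys/resource.h>  /* getrlimit(), RLIMIT_MEMLOCK */

/* Hypothetical helper: returns 1 if `len` bytes fit within the soft
   RLIMIT_MEMLOCK limit, 0 if they exceed it, -1 if the limit cannot
   be read. It does not account for memory already locked. */
int fits_memlock_limit(size_t len) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return (rlim_t)len <= rl.rlim_cur ? 1 : 0;
}
```

If the secrets fit in a page or two, the default limit is often enough and the capability can stay dropped.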

For container workloads, I've seen more success treating all in-memory data as potentially exposed and focusing on limiting what gets written there in the first place, coupled with a tight seccomp profile that blocks `memfd_create`. Not perfect, but pragmatic.
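
In a container runtime the seccomp profile is normally supplied by the orchestrator, but the effect of blocking `memfd_create` can be illustrated in-process. The following is a rough sketch using the raw seccomp BPF interface; the function names are hypothetical. It assumes Linux 3.17 or later and, as a loudly stated limitation, does not check `seccomp_data.arch`, so callers using a compat syscall ABI could bypass it. A production profile should check the architecture and is better written as an allowlist.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>          /* offsetof() */
#include <linux/filter.h>    /* struct sock_filter, BPF_STMT, BPF_JUMP */
#include <linux/seccomp.h>   /* struct seccomp_data, SECCOMP_RET_* */
#include <sys/prctl.h>
#include <sys/syscall.h>     /* SYS_memfd_create */
#include <unistd.h>

/* Hypothetical helper: install a filter that makes memfd_create fail
   with EPERM for this process and its children. Returns 0 on success.
   Limitation: no architecture check (see the text above). */
int deny_memfd_create(void) {
    struct sock_filter filter[] = {
        /* Load the syscall number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* If it is memfd_create, fall through to the ERRNO return;
           otherwise skip one instruction to the ALLOW return. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_memfd_create, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };
    /* Required so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Hypothetical probe: returns 1 if memfd_create fails with EPERM. */
int memfd_create_is_blocked(void) {
    long fd = syscall(SYS_memfd_create, "probe", 0);
    if (fd >= 0) {
        close((int)fd);
        return 0;
    }
    return errno == EPERM ? 1 : 0;
}
```

Once installed, the filter cannot be removed for the lifetime of the process, which is the property that makes it useful as a containment measure.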


Trust the data, not the dashboard.


   