Looking at the NemoClaw v0.8.3 release. The orchestration logic for workload termination is clear, but the post-execution GPU memory handling isn't.
Key points from the code review:
* Sends termination signal to the container.
* Calls the standard runtime API to remove the container.
* No evident calls to `cudaMemset` or equivalent zeroing functions on the allocated device buffers before releasing the GPU memory back to the pool.
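This is a known gap in multi-tenant GPU frameworks. The assumption is that the next tenant's kernel initialization will overwrite prior data. That's not guaranteed: a workload that reads a buffer before writing it, or initializes only part of it, sees whatever the previous owner left behind.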
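To make the failure mode concrete, here is a toy model in plain Python. It is a sketch, not the CUDA allocator and not NemoClaw's code: `ToyPool`, `scrub_on_free`, and `tenant_b_sees` are hypothetical names, and the first-fit recycling policy is an assumption chosen only to make reuse easy to see.

```python
class ToyPool:
    """Toy stand-in for a device memory pool (hypothetical, NOT the real CUDA allocator)."""

    def __init__(self, size, scrub_on_free):
        self.mem = bytearray(size)        # backing store, starts zeroed
        self.scrub_on_free = scrub_on_free
        self.free_blocks = [(0, size)]    # list of (offset, length)

    def alloc(self, length):
        # First-fit: hand out the first free block that is large enough.
        for i, (off, blk_len) in enumerate(self.free_blocks):
            if blk_len >= length:
                del self.free_blocks[i]
                if blk_len > length:
                    # Return the unused tail of the block to the free list.
                    self.free_blocks.insert(i, (off + length, blk_len - length))
                return (off, length)
        raise MemoryError("pool exhausted")

    def free(self, block):
        off, length = block
        if self.scrub_on_free:
            # The explicit wipe: zero the block before it can be reused.
            self.mem[off:off + length] = bytes(length)
        # Freed blocks go to the front, so they are recycled first.
        self.free_blocks.insert(0, block)

    def write(self, block, data):
        off, length = block
        if len(data) > length:
            raise ValueError("write exceeds block")
        self.mem[off:off + len(data)] = data

    def read(self, block):
        off, length = block
        return bytes(self.mem[off:off + length])


def tenant_b_sees(scrub_on_free):
    """Tenant A writes and frees; tenant B reuses the block but initializes only 4 bytes."""
    pool = ToyPool(64, scrub_on_free)
    a = pool.alloc(16)
    pool.write(a, b"TENANT-A-WEIGHTS")
    pool.free(a)
    b = pool.alloc(16)        # recycles tenant A's block
    pool.write(b, b"B-in")    # partial initialization
    return pool.read(b)


print(tenant_b_sees(scrub_on_free=False))  # b'B-inNT-A-WEIGHTS'
print(tenant_b_sees(scrub_on_free=True))   # b'B-in' followed by 12 zero bytes
```

In the unscrubbed case, twelve of tenant A's sixteen bytes survive into tenant B's read. Whether a real driver or runtime behaves this way is exactly what needs to be measured rather than assumed.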
Questions:
1. Has anyone traced the memory lifecycle to confirm VRAM isn't being silently recycled with residual data?
2. Are they relying solely on NVIDIA MIG (Multi-Instance GPU) isolation? MIG partitions guard against concurrent access, not post-release leakage.
3. What's the test protocol? `nvidia-smi` reports how much memory is in use, not what it contains, so it can't show residual data. You'd need a probe workload that dumps memory blocks after release.
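For question 3, the host-side half of such a test can be sketched in plain Python. This assumes a probe (not shown here) has already allocated device memory without initializing it and copied the raw bytes back to the host. `residual_blocks` and `canary_offsets` are hypothetical helper names, and the 4096-byte block size is an arbitrary choice.

```python
def residual_blocks(dump, block_size=4096):
    """Return the starting offset of every block that contains a nonzero byte."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    offsets = []
    for off in range(0, len(dump), block_size):
        # any() over bytes is True as soon as one byte is nonzero.
        if any(dump[off:off + block_size]):
            offsets.append(off)
    return offsets


def canary_offsets(dump, canary):
    """Return every offset at which the known canary pattern appears in the dump."""
    if not canary:
        raise ValueError("canary must be non-empty")
    offsets = []
    pos = dump.find(canary)
    while pos != -1:
        offsets.append(pos)
        pos = dump.find(canary, pos + 1)
    return offsets


# Synthetic dump standing in for probe output: two clean blocks, then a canary.
canary = b"\xde\xad\xbe\xef" * 4
dump = bytes(8192) + canary + bytes(4096 - len(canary))
print(residual_blocks(dump))         # [8192]
print(canary_offsets(dump, canary))  # [8192]
```

A nonzero block only shows that the allocation was not zeroed. Finding a canary that an earlier tenant workload deliberately wrote before termination is the stronger evidence, because it ties the residue to a specific prior owner.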
Without explicit wiping, a leak is only a matter of time. One tenant's model weights or inference data could bleed into another tenant's process if the memory allocator recycles a block.
Claims are cheap. Evidence is expensive.