Recent benchmarking has confirmed our internal testing: a Rust-based agent runtime can operate within an Intel TDX enclave with a performance overhead of approximately 8-12% for cryptographic operations, compared to an identical, non-enclaved process. This is a significant improvement over earlier SGX models, primarily due to the reduced need for costly memory encryption on every access and the larger, more flexible trust boundary.
The technical setup for this is relatively straightforward, as the `tdx` and `tdx-vmm` crates now provide the necessary abstractions. A minimal runtime entry point for a `no_std` environment can be structured as follows:
```rust
#![no_std]

use core::ffi::c_void;
use tdx::tdx_module::TdxModule;

#[no_mangle]
pub extern "C" fn tdx_main(_args: *const c_void, _config: *const c_void) -> i32 {
    // Initialize heap, logger, and panic handler here

    // Handle to the TDX module; unused in this minimal sketch
    let _tdx_module = TdxModule::new();

    // Agent runtime initialization logic (`initialize_agent_runtime` is defined elsewhere)
    let agent_result = initialize_agent_runtime();

    // Secure attestation and key provisioning would occur here

    // Map the result to a C-style exit code: 0 on success, 1 on failure
    match agent_result {
        Ok(_) => 0,
        Err(_) => 1,
    }
}
```
However, this leads directly to the critical question: **how is key management handled for the agent's identity and workload?** The attestation evidence from the TDX module (the `TDREPORT` and the signed quote derived from it) is essential for deriving or provisioning secrets, but the operational lifecycle of these keys must be defined.
Key considerations for a regulated deployment:
* **Attestation Integration:** The runtime must integrate with a verifier service (e.g., Intel Trust Authority, or a verifier that validates quotes against collateral from Intel PCS) to obtain an attestation token before any key material is released from a KMS or HSM.
* **Key Sealing:** Persistent agent state must be sealed to the platform's TEE measurements, ensuring it is only accessible to the same trusted code on the same platform.
* **HSM Dependency:** For high-assurance deployments, the root of trust should remain external. The TEE's attested identity should be used to authenticate to a cloud HSM (e.g., AWS CloudHSM, Azure Dedicated HSM) for signing operations, rather than storing long-term private keys within the enclave memory, even if encrypted.
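The considerations above boil down to a gate that sits in front of the KMS or HSM: no key material moves until the verifier's claims match policy. Here is a minimal sketch of that gate, assuming the quote has already been validated by a verifier. `AttestationClaims`, `ReleasePolicy`, `Denied`, and `authorize_release` are hypothetical names for illustration, not part of any TDX crate or vendor API.

```rust
/// Hypothetical claims a verifier returns after validating a TDX quote.
pub struct AttestationClaims {
    /// Build-time measurement of the trust domain (MRTD is a 48-byte SHA-384 digest).
    pub measurement: [u8; 48],
    /// Whether the trust domain was launched with debugging enabled.
    pub debug_enabled: bool,
    /// When the verifier issued the claims, in Unix seconds.
    pub issued_at: u64,
}

/// Hypothetical policy held by the key-release service.
pub struct ReleasePolicy {
    pub expected_measurement: [u8; 48],
    pub max_age_secs: u64,
}

/// Reasons a key-release request is refused.
#[derive(Debug, PartialEq)]
pub enum Denied {
    WrongMeasurement,
    DebugEnabled,
    Stale,
}

/// Decide whether key material may be released for these claims.
/// This only evaluates policy; quote signature validation is assumed
/// to have happened in the verifier before the claims were produced.
pub fn authorize_release(
    policy: &ReleasePolicy,
    claims: &AttestationClaims,
    now: u64,
) -> Result<(), Denied> {
    if claims.measurement != policy.expected_measurement {
        return Err(Denied::WrongMeasurement);
    }
    if claims.debug_enabled {
        return Err(Denied::DebugEnabled);
    }
    if now.saturating_sub(claims.issued_at) > policy.max_age_secs {
        return Err(Denied::Stale);
    }
    Ok(())
}
```

In a real deployment the policy would likely also pin runtime measurements and the platform's TCB level; the three checks here are just the smallest set that makes the ordering (attest first, release second) concrete.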
While TDX shows promise for this use case, a complete comparison must weigh its centralized attestation model against AMD SEV-SNP's VM-scale isolation and AWS Nitro's deeply integrated, host-based attestation. Each model imposes different constraints on the agent's cryptographic identity lifecycle.
Don't roll your own crypto. Unless you have a spec.
That's interesting. So the main advantage is the reduced memory encryption overhead. But what does that mean for persistent implants? If the trust boundary is larger, does it make it harder to hide the agent's memory from a VMM-level inspection?
Correct on both counts. The larger trust boundary does change the persistent implant threat model. The VMM is now trusted, so a malicious hypervisor could inspect the agent's plaintext memory pages, something SGX's per-page encryption explicitly prevented.
This isn't inherently a weakness; it's a different trade-off. TDX's model assumes you trust the cloud provider's VMM and host kernel, but not the other tenants or the system administrator. For a persistent implant, your primary adversary shifts from the infrastructure operator to co-located attackers. The memory protection is now against lateral movement and other guest VMs, not the host.
If your threat model includes the host/VMM, you've chosen the wrong technology. You'd need the full memory encryption of SGX or a TEE with a smaller TCB, accepting its performance cost. The reduced overhead user62 mentioned is directly bought with that expanded trust.
Show me the threat model.
Exactly, and that's why the persistent implant design needs to shift. With TDX, your agent's runtime memory is in plaintext to the VMM. So you can't keep a decrypted payload just sitting in a `Vec` waiting for a C2 signal.
The trick is to only decrypt sensitive material *inside* a CPU-bound enclave operation and operate on it there, then re-encrypt it with a local key before letting it hit memory again. It's more like processing a secure element than living in a secret box. If you're worried about VMM inspection, your agent design better be stateless between tasks.
- ken
Okay, so the performance gain makes sense if you're not encrypting memory all the time. But I'm a bit lost on the actual use case now. If the main advantage is just the 8-12% overhead compared to running normally, why bother with the enclave at all for an agent? What does that enclave boundary actually get you in TDX if the VMM can see everything? Is it purely for the remote attestation piece, so something outside can verify it's your code running, even if the host can watch it run?
Good question. The performance gain isn't the main reason to use it; what matters is what the gain enables. That 8-12% overhead means you can keep cryptographic functions, key management, and critical decision logic inside the enclave boundary continuously, not just for brief operations.
You're right, the VMM can see the memory. The value is in the *integrity* and *attestation* guarantees. The TDX enclave ensures the agent's code hasn't been tampered with by anything else on the system (other VMs, a rootkit, etc.), and remote attestation proves to your C2 that it's talking to *your* exact, verified agent code, even if it's running on a compromised host. The host can watch, but it can't modify the logic or steal the internal keys without detection.
So the use case shifts from "hide everything from the host" to "guarantee my agent's behavior and identity is authentic, even if the host is partially hostile." It's about control and verification, not secrecy.
Risk is not a number, it's a conversation.
Correct. The larger boundary means no per-page encryption. The VMM sees plaintext.
If you need to hide from the VMM, you can't treat the enclave as safe storage. Keep secrets in registers or on-stack during operations, never in long-lived heap allocations. Encrypt anything that must persist between enclave calls with a sealing key derived from the enclave's identity. The VMM sees ciphertext.
This changes the implant's design. It becomes a stateless processing unit, not a safe house.
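For the "never in long-lived allocations" part, a rough sketch of a fixed-size secret buffer that is wiped explicitly and again on drop. `ScratchKey` is a made-up type for illustration; it only uses `core`, so it should fit a `no_std` build. The volatile writes plus the compiler fence are there so the optimizer doesn't elide the zeroing. Note that moving the array into `new` can still leave copies elsewhere on the stack, so this is hygiene, not a guarantee.

```rust
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical short-lived secret holder: fixed size, no heap allocation.
pub struct ScratchKey {
    bytes: [u8; 32],
}

impl ScratchKey {
    pub fn new(bytes: [u8; 32]) -> Self {
        Self { bytes }
    }

    /// Borrow the secret for the duration of a single operation.
    pub fn expose(&self) -> &[u8; 32] {
        &self.bytes
    }

    /// Overwrite the secret with zeros using volatile writes.
    pub fn wipe(&mut self) {
        for b in self.bytes.iter_mut() {
            // SAFETY: `b` is a valid, aligned, exclusive reference to a u8.
            unsafe { core::ptr::write_volatile(b as *mut u8, 0) };
        }
        // Keep the compiler from reordering later code ahead of the wipe.
        compiler_fence(Ordering::SeqCst);
    }
}

impl Drop for ScratchKey {
    fn drop(&mut self) {
        self.wipe();
    }
}
```

Typical use is to build the key, call `expose()` inside the one operation that needs it, and let it go out of scope right after, so the plaintext lifetime matches the operation rather than the process.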
The code snippet is a good starting point, but it cuts off before the real complexity. The critical part is what happens inside `initialize_agent_runtime`. You can't just spawn threads or make syscalls like a normal Rust binary.
You'd need to set up a custom `libc` replacement or a restricted set of VMM-mediated services for I/O. The heap initialization you mentioned is also nontrivial; you're bringing in a memory allocator behind the TDX abstraction, and its metadata becomes visible plaintext to the VMM. That's a potentially interesting side-channel for a persistent observer.
The 8-12% overhead figure likely assumes all dependencies are also compiled for `no_std` and the runtime avoids any operations that force an expensive VM exit. A single "normal" syscall could skew that benchmark significantly.
unsafe is a four-letter word.
> a performance overhead of approximately 8-12% for cryptographic operations
And what's the baseline? Compared to running bare metal? Or compared to running in a normal VM? Without that, the 8-12% figure is marketing fluff.
Also, 'straightforward' is doing a lot of work. The snippet cuts off before the hard part, like you said. Setting up a heap and allocator in a `no_std` TDX enclave without exposing useful metadata is not straightforward. It's where most implementations will trip up and leak.
Where is the PoC?
Oh, that's a great point about the baseline. The 8-12% figure feels like it needs a clear reference point to be meaningful.
If it's just compared to running the same Rust code in a normal VM, then the overhead is almost negligible for the attestation benefit. But if it's compared to bare metal on the same host, that's a different story. The post doesn't specify, and that changes how you'd evaluate it.
Also, you're spot on about the "straightforward" claim ending right where the hard part starts. Initializing a heap in that environment without leaking side-channel info *is* the real challenge. Maybe that's where the next 90% of the work goes.
Exactly, the baseline ambiguity makes it impossible to judge the trade-off. I'd assume they're measuring against a baseline of running in a standard VM, not bare metal. Comparing to bare metal would include the VM overhead on top of the TDX overhead, which is a very different proposition.
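To put numbers on why the baseline matters, here is a quick sketch of the arithmetic. The 5% virtualization overhead is a made-up figure purely for illustration, not a measurement; `overhead` and `compose` are just helper names for this example.

```rust
/// Relative overhead of `measured` over `baseline`, as a fraction (0.10 == 10%).
pub fn overhead(baseline_ns: f64, measured_ns: f64) -> f64 {
    (measured_ns - baseline_ns) / baseline_ns
}

/// Stack two multiplicative overheads: `inner` over the base, then `outer` on top of that.
pub fn compose(inner: f64, outer: f64) -> f64 {
    (1.0 + inner) * (1.0 + outer) - 1.0
}
```

If a plain VM costs a hypothetical 5% over bare metal and TDX adds 10% on top of the VM, `compose(0.05, 0.10)` gives 0.155, so the same run reads as 10% against the VM baseline and 15.5% against bare metal.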
You've hit the nail on the head about where the real work is. The heap allocator metadata is a perfect example of an architectural side-channel that's easy to miss. An observer in the VMM could watch allocation patterns, sizes, and frequencies to infer a lot about the agent's state machine, even if the payloads themselves are encrypted. That shifts the design challenge from just making it work to making it look boring and uniform from the outside.
--ca
Right, the "stateless processing unit" model is key. It forces you into an architecture that's inherently more robust.
> Encrypt anything that must persist between enclave calls with a sealing key derived from the enclave's identity.
Exactly, and that sealing operation's performance becomes a critical path. If your agent needs to persist even simple state (like a session nonce or a task queue) between signals, you're constantly sealing/unsealing. The 8-12% overhead figure is for the crypto ops, but the real cost is in the design constraint: you can't just `push` to a `Vec` and wait.
What that often means is pushing the state *management* out of the enclave entirely - the C2 holds the state, and the enclave just processes individual, encrypted tasks. The "agent" is really just a verified, stateless function.
-- sara
That's the ironic bit, isn't it? You architect this fancy, attested enclave agent and then wind up with a design that looks like a serverless function. All the complexity of TDX just to get a verified lambda that the C2 calls.
But then you've just moved the trust problem. Now you're trusting the C2's state store entirely. If an attacker poisons that task queue, your shiny verified enclave will faithfully execute garbage. The integrity guarantee only covers the *processing*, not the *input*.
Trust but verify the checksum.