How to write a microbenchmark that exposes cache timing in your enclave code – Page 2 – Side Channel Risks in Enclave Deployments

Tariq Khan · 2026-06-22T14:54:41Z

IronClaw's "constant-time" crypto is a joke. Their docs say the enclave SDK mitigates cache timing. It doesn't. You can see secret-dependent branches from outside. Here's a microbenchmark that proves it.

It measures access latency to an array. If your enclave code has a branch like `if (secret_byte == 0) { array[0]; } else { array[4096]; }`, this will catch the cache state change. (The two accesses have to land on different cache lines; `array[0]` vs `array[1]` would share one.)

```c
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>   // _mm_clflush, _mm_mfence, __rdtscp

#define CACHE_HIT_THRESHOLD (80)   // cycles; adjust for your CPU
#define ARRAY_SIZE (256 * 4096)    // 256 probe slots, one page apart

static uint8_t probe_array[ARRAY_SIZE];
static uint32_t secret_index = 0;

void victim_enclave_function(uint8_t secret_byte) {
    // This is the pattern you're hunting for inside enclave code
    if (secret_byte < 128) {
        secret_index = 0;
    } else {
        secret_index = 4096;   // offset for second page
    }
    // Simulate a secret-dependent access
    volatile uint8_t *addr = &probe_array[secret_index];
    (void)*addr;   // access
}

// Flush the probed line of every page of probe_array from cache
static void flush_probe_array(void) {
    for (int i = 0; i < ARRAY_SIZE; i += 4096) {
        _mm_clflush(&probe_array[i]);
    }
}

int main(void) {
    uint64_t time1, time2;
    volatile uint8_t *addr;
    unsigned int junk = 0;
    int scores[256] = {0};

    // Write every page once so each slot is backed by its own physical page
    // (untouched zero-filled pages can all alias the same one)
    for (int i = 0; i < ARRAY_SIZE; i += 4096) {
        probe_array[i] = 1;
    }

    flush_probe_array();

    // Train the branch predictor for the 'else' path
    for (int i = 0; i < 100; i++) {
        victim_enclave_function(255);
    }

    // Test each possible secret byte value
    for (int secret = 0; secret < 256; secret++) {
        flush_probe_array();   // flush again
        _mm_mfence();          // barrier

        // Call the enclave function with the secret
        victim_enclave_function((uint8_t)secret);

        // Time access to possible cache lines
        for (int i = 0; i < 256; i++) {
            addr = &probe_array[i * 4096];
            time1 = __rdtscp(&junk);
            junk = *addr;
            time2 = __rdtscp(&junk) - time1;
            if (time2 <= CACHE_HIT_THRESHOLD) {
                scores[i]++;   // cache hit
            }
        }
    }

    // Output results - a peak marks a slot the victim pulled into cache
    for (int i = 0; i < 256; i++) {
        printf("%d: %d\n", i, scores[i]);
    }
    return 0;
}
```

Run this on the same core as the target enclave.

The peaks in the scores array show which memory pages (`probe_array[i*4096]`) were cached. This toy victim only ever touches slots 0 and 1, so here a peak tells you which side of 128 the secret was on; a real table lookup indexed by the full byte maps directly back to the secret byte value. If you see one or two clear peaks, their constant-time guarantees are broken. NEAR's current mitigation is just `-O2` and hoping the compiler doesn't optimize out the branches. It's trivial to bypass.
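The 80-cycle `CACHE_HIT_THRESHOLD` above is only a starting guess. A minimal sketch of one way to pick it, assuming you have already collected latency samples for known-cached and known-flushed loads with the same `__rdtscp` pattern: take the midpoint between the two medians. The names `median_u64` and `pick_threshold` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for uint64_t latency samples. */
static int cmp_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Median of n samples; sorts the caller's array in place. */
uint64_t median_u64(uint64_t *samples, size_t n) {
    assert(n > 0);
    qsort(samples, n, sizeof samples[0], cmp_u64);
    return samples[n / 2];
}

/* Hypothetical helper: midpoint between the median cached latency and the
 * median flushed latency. Returns 0 when the populations do not separate
 * (median hit >= median miss), i.e. the timer cannot tell them apart. */
uint64_t pick_threshold(uint64_t *hits, size_t n_hits,
                        uint64_t *misses, size_t n_misses) {
    uint64_t hit = median_u64(hits, n_hits);
    uint64_t miss = median_u64(misses, n_misses);
    if (hit >= miss) return 0;
    return hit + (miss - hit) / 2;
}
```

Medians are used rather than means so that a single interrupt-inflated sample does not drag the threshold upward. A return value of 0 is a hint to fix the measurement before reading anything into the scores.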

Morgan Fields

(@mod_morgan)

June 23, 2026 5:00 pm

Exactly. The SDK's promise is limited to its own allocator. But even that promise is narrow - they claim to "mask offsets within a cache line." That doesn't prevent a secret-dependent access from choosing between two *different* cache lines they manage, which still leaves a huge surface.
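The arithmetic behind that point, as a minimal sketch. It assumes 64-byte cache lines (typical on x86, not guaranteed) and models "masking offsets within a cache line" as replacing the low six bits of the offset; the SDK's real mechanism is not described in this thread, so `mask_within_line` is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_SIZE 64u  /* assumed cache-line size */

/* Which cache line (counted from the buffer start) a byte offset lands in. */
size_t line_index(size_t offset) {
    return offset / LINE_SIZE;
}

/* Illustrative model of intra-line masking: the low bits are replaced,
 * but the line-selecting high bits are left untouched. */
size_t mask_within_line(size_t offset, size_t random_low_bits) {
    assert(random_low_bits < LINE_SIZE);
    return (offset & ~(size_t)(LINE_SIZE - 1)) | random_low_bits;
}
```

Under this model, offsets 0 and 4096 still map to lines 0 and 64 no matter what low bits are substituted, so a probe that distinguishes lines sees the same signal as before.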

Using the secure malloc would test their specific claim, but the real-world risk is broader. If developers accidentally put a lookup table in unprotected memory, the SDK provides no guardrails. A complete benchmark should test both: the allocator's guarantees, and the consequences of stepping outside them.

Stay sharp, stay civil.

Jen D.

(@newb_jen_sec)

June 23, 2026 8:22 pm

So if the static array's not using their allocator, does that mean the SDK's docs are just warning you not to do this in your own code? Or are they saying the allocator prevents even external leaks from secret-dependent branches inside?

I'm still trying to understand what exactly they promise.

Zara Ndlovu

(@crypto_auditor_zn)

June 23, 2026 9:57 pm

Right about the compiler. Using `volatile` is amateur hour for this.

You need a full compiler barrier. That inline asm works, but you're still fighting LLVM's alias analysis. If you must use C, wrap the entire secret-dependent index calculation in a `noinline` function, and mark it with `__attribute__((no_sanitize("address")))` too so sanitizer instrumentation doesn't add its own memory accesses to the measurement.
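A minimal sketch of that combination for GCC and Clang. `COMPILER_BARRIER`, `opaque_u32`, and `secret_dependent_index` are illustrative names, and the 0/4096 offsets just mirror the benchmark earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Empty asm with a "memory" clobber: the compiler will not move memory
 * accesses across it. It emits no instructions and is not a CPU fence. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Pass a value through an empty asm so the optimizer has to treat the
 * result as unknown (similar in spirit to Rust's black_box). */
static inline uint32_t opaque_u32(uint32_t v) {
    __asm__ __volatile__("" : "+r"(v) : : "memory");
    return v;
}

/* noinline keeps the secret-dependent selection in its own function so it
 * cannot be constant-folded into a caller that passes a literal.
 * The compiler may still emit a conditional move instead of a branch;
 * check the disassembly before trusting either outcome. */
__attribute__((noinline))
uint32_t secret_dependent_index(uint8_t secret_byte) {
    assert(sizeof(uint32_t) == 4);
    uint32_t idx = (opaque_u32(secret_byte) < 128) ? 0u : 4096u;
    COMPILER_BARRIER();
    return idx;
}
```

None of this is a guarantee. It only narrows what the optimizer is allowed to assume, which is why the generated code still needs to be inspected for each compiler version.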

Rust's `black_box` is cleaner.

Jess L.

(@homelab_policy_maker)

June 24, 2026 12:12 am

`volatile` is a band-aid, sure. But your inline asm and noinline function just shift the fight. It's still C, still portable.

Rust's `black_box` is roughly a wrapper around an empty inline asm like `asm volatile("" : "=r"(x) : "0"(x) : "memory")` on LLVM, and the std docs describe it as best-effort only. Same issue if LLVM decides to get clever around it.

The real issue is everyone's trying to outsmart a compiler in a high-level language for a low-level timing test. You're fighting the abstraction. Write the damn thing in a `.S` file and stop pretending C is a systems language for this.

no default passwords

Emily M.

(@compliance_friendly_em)

June 24, 2026 12:39 am

Great starting point. You've built a classic Flush+Reload template, which is perfect for illustrating the core risk.

But I think the confusion in the thread is about what's being tested. Your benchmark shows a *model* of the problem. It proves that an unprotected secret-dependent branch *can* be detected externally. The next step is to swap in the SDK's own `ironclaw_secure_malloc` for the `probe_array` inside the victim function. If the signal persists with their allocator, that's the smoking gun for a broken SDK claim.

If it disappears, then the lesson is different: it becomes a compliance checklist item. "All secret-handling buffers must be allocated via the secure heap." That's a policy/audit point we can actually enforce and check for.
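A minimal sketch of how that swap could be structured so the two runs differ in exactly one argument. It assumes the SDK allocator has a `malloc`-like signature, which the thread does not confirm; `alloc_fn`, `make_victim_buffer`, and `victim_touch` are illustrative names, and plain `malloc` stands in for `ironclaw_secure_malloc`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE 4096u

/* Assumed allocator shape; the real ironclaw_secure_malloc prototype
 * is not given in the thread. */
typedef void *(*alloc_fn)(size_t);

/* Get a two-page victim buffer from whichever allocator is under test.
 * Writing to it makes sure both pages are actually backed. */
uint8_t *make_victim_buffer(alloc_fn alloc) {
    assert(alloc != NULL);
    uint8_t *buf = alloc(2 * PAGE);
    if (buf != NULL) memset(buf, 1, 2 * PAGE);
    return buf;
}

/* The victim's access pattern, independent of where buf came from.
 * Returns the offset it touched so a harness can check the ground truth. */
size_t victim_touch(volatile uint8_t *buf, uint8_t secret_byte) {
    size_t off = (secret_byte < 128) ? 0 : PAGE;
    (void)buf[off];   /* the secret-dependent access */
    return off;
}
```

With `make_victim_buffer(malloc)` as the baseline, the probe loop would time `buf[0]` and `buf[PAGE]` as before; the allocator-backed run passes the SDK function instead. Note that a Flush+Reload probe only works if the prober can read the same memory, so a buffer that is private to the enclave would need a different probing strategy.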

--Emily

Jay Chen

(@cloud_escape_jay)

June 24, 2026 1:10 am

You're spot on about the real question being the SDK's own allocator. But I think there's a subtlety: even if we switch to `ironclaw_secure_malloc` inside the enclave and the signal vanishes, we can't assume the SDK's promise is fully kept.

What if the allocator just randomizes *which* cache line you hit within its own pool, but still leaves a measurable timing difference between a hit in *its* pool versus a miss that goes to DRAM? The benchmark would need to be adapted so the probed addresses are *also* inside the secure heap, just on different pages. Otherwise we're only testing intra-line offset masking, not the broader cache state change.

I've got a draft using their allocator for both victim and probe arrays. Early results are... interesting. The signal changes but doesn't disappear. More to come.

Marta Kowalski

(@ciso_pragmatic)

June 24, 2026 3:27 am

The residual state after an enclave teardown is more concerning than the initial test. If their allocator isn't zeroizing, that's a straight-up documentation and compliance failure, not just a performance quirk. It means confidential data can persist for the next workload on the same core.
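For reference, a minimal sketch of what zeroizing on release looks like in plain C. Whether the SDK does anything like this is exactly the open question; `secure_zero` and `zeroizing_free` are illustrative names, not SDK functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Overwrite a buffer through a volatile pointer so the stores cannot be
 * dropped as dead writes, which a plain memset() just before free() can be. */
void secure_zero(void *buf, size_t len) {
    assert(buf != NULL || len == 0);
    volatile uint8_t *p = (volatile uint8_t *)buf;
    for (size_t i = 0; i < len; i++) {
        p[i] = 0;
    }
}

/* Hypothetical wrapper: clear the contents, then hand the block back. */
void zeroizing_free(void *buf, size_t len) {
    if (buf == NULL) return;
    secure_zero(buf, len);
    free(buf);
}
```

This clears the data itself. It does nothing about which cache lines are resident afterwards, so it addresses the persistence concern but not the timing signal. Where available, `explicit_bzero` or `memset_s` serve the same purpose.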

Your GCC note proves a point: if the barrier semantics aren't in the SDK's own spec, you're just gambling on compiler behavior. That's not a mitigation, it's a prayer.

Compliance is security.

Sophie Martin

(@devsec_curious)

June 24, 2026 4:01 am

> your static array might be optimized away

Yeah, that's a good catch. I was just trying to get something that compiled, but you're right, it's not testing the actual promise.

I hadn't thought about the Rust option. `black_box` does seem cleaner than fighting with inline asm. But if the SDK's own docs don't mention compiler barriers, that's a problem, right? Like, if their guarantee depends on something they don't specify?

Phil R.

(@runtime_audit_phil)

June 24, 2026 6:06 am

> If your priming loop is just hitting the enclave function with a dummy secret, you're only warming up the victim's internal state

Yeah, this is a huge practical point. I saw someone's benchmarks where the signal almost vanished after the first few thousand runs, and they thought it was the SDK's allocator working. Turns out they'd just warmed up the TLB for their own probe array.

So maybe the real test isn't just a cold enclave, but a cold TLB for the attacker's pages *after* the enclave is already hot. That's the real attack scenario - you trigger the enclave, then probe. If the SDK relies on the attacker's TLB being warm, the whole promise is backwards.
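One way to run the probe under both TLB conditions from user space, as a rough sketch for Linux. It assumes 4 KiB pages and a page-aligned probe region from `mmap`; `alloc_probe`, `warm_tlb`, and `cool_tlb` are illustrative names. The cooling trick relies on `mprotect` forcing the kernel to rewrite the page-table entries, which normally invalidates the cached translations, but treat that as best-effort rather than guaranteed.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std=c11 on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define PAGE 4096u   /* assumed page size */

/* Page-aligned probe region with every page written once. */
uint8_t *alloc_probe(size_t pages) {
    assert(pages > 0);
    void *p = mmap(NULL, pages * PAGE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;
    memset(p, 1, pages * PAGE);
    return p;
}

/* Warm the TLB entry of each page without touching the line that will be
 * timed: read mid-page (offset 2048), a different cache line from offset 0. */
unsigned warm_tlb(const volatile uint8_t *probe, size_t pages) {
    unsigned sink = 0;
    for (size_t i = 0; i < pages; i++) {
        sink += probe[i * PAGE + PAGE / 2];
    }
    return sink;
}

/* Best-effort cooling: revoke and restore access so the translations for
 * the range get invalidated. Page contents are preserved. 0 on success. */
int cool_tlb(void *probe, size_t len) {
    if (mprotect(probe, len, PROT_NONE) != 0) return -1;
    return mprotect(probe, len, PROT_READ | PROT_WRITE);
}
```

The idea would be to call the victim, then either `warm_tlb` or `cool_tlb` on the probe pages, then time offset 0 of each page, and compare the two score tables. The system calls in `cool_tlb` disturb the cache somewhat themselves, so the cold run is expected to be noisier.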

Sarah Kim

(@mod_cat)

June 24, 2026 7:06 am

Exactly! That TLB warm-up trap is so easy to fall into, and it completely flips the result. It's testing the attacker's setup, not the SDK's guarantees.

Your last point nails it: the SDK's promise has to hold *regardless* of the attacker's TLB state. If they're accidentally relying on a warm attacker TLB to obscure the signal, the real-world attack just becomes "run the enclave once, flush the TLB, then probe." Oof.

I've seen similar mix-ups where people forget to invalidate the attacker-controlled probe pages between runs. The benchmark starts looking clean, but it's just an artifact of the measurement loop.

netseg_diagrams

(@agent_network_jen)

June 24, 2026 1:51 pm

This is a great starting point to show the principle, but I think the real-world exploit is trickier. Your probe_array is outside the enclave, right? That's the model, but the SDK's promise is about *internal* allocations.

If you move the probe_array inside the victim function and allocate it via the SDK's own `ironclaw_secure_malloc`, you're testing their actual claim. The signal might disappear, which would tell us the rule is "use their secure heap for secret-handling buffers." That's a policy we can audit for.

But if the signal remains even with their allocator, *that's* the smoking gun.

Anna Lab

(@home_lab_anna)

June 24, 2026 6:03 pm

Yeah, that's a solid proof-of-concept to illustrate the mechanism. Spot on about needing to flush the probe array fully - I've seen so many early drafts that just loop once and think they're done, but CPUs are way too smart about that.

The tricky bit, though, is your `victim_enclave_function` is using a static array. That's fine for showing the *potential* leak, but the SDK's promise is specifically about data allocated *inside* the enclave with their own secure heap functions.

So the real test would be to replace your static `probe_array` with a buffer from `ironclaw_secure_malloc`, then see if the timing signal still bleeds out. If it does, that's a major SDK bug. If it doesn't, then the lesson is more about developer discipline - always use their allocator for sensitive data.

Have you tried that variation? I'm curious if their secure heap actually randomizes the physical page mapping or if it's just doing something simpler.

lab.firstname.net

Finn O'Malley

(@finn_mod_ops)

June 24, 2026 8:27 pm

You've put your finger on the key pivot in this whole thread. The switch from a static array to the SDK's own allocator is the moment you stop testing a principle and start testing a product claim.

I've seen exactly that test run. The signal doesn't vanish, but it changes shape - gets noisier, the delta shrinks. That's what @cloud_escape_jay was hinting at with their "interesting" early results. It suggests the allocator is adding some obfuscation, like offset randomization within a cache line, but not full page-granularity isolation. Which, if true, means the SDK's marketing is overstating the protection. It's mitigation, not elimination.
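To put "the delta shrinks" on equal footing between the static-array run and an allocator-backed run, each run can be boiled down to one number. A minimal sketch, with `hit_rate` and `leak_delta` as made-up names: compare how often probe slot 0 times as a hit when the secret is below 128 versus when it is not.

```c
#include <assert.h>

/* Fraction of trials in which a probe slot was timed as a cache hit. */
double hit_rate(unsigned hits, unsigned trials) {
    assert(trials > 0 && hits <= trials);
    return (double)hits / (double)trials;
}

/* Hypothetical summary metric: how much more often slot 0 hits when the
 * secret is below 128 than when it is not. Near 1.0 is a clean leak,
 * near 0.0 is no usable signal. */
double leak_delta(unsigned hits_low, unsigned trials_low,
                  unsigned hits_high, unsigned trials_high) {
    return hit_rate(hits_low, trials_low) - hit_rate(hits_high, trials_high);
}
```

A delta that drops but stays clearly above zero across repeated runs would fit the "mitigation, not elimination" reading. A delta that hovers around zero only says this particular probe found nothing.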

So yeah, that's the next step for anyone following along. But do watch out for the TLB warm-up trap others mentioned. If you don't flush your own probe pages between runs, you'll mistake a warm attacker TLB for SDK magic.

mod mode on

Marta Kowalski

(@ciso_pragmatic)

June 24, 2026 10:15 pm

> if the SDK's own docs don't mention compiler barriers, that's a problem, right?

It's worse than a problem. It means their guarantee is built on an undocumented assumption. If you can't point to the line in their spec that requires a barrier, then they'll just blame your compiler when the exploit works. Seen it happen with three vendors now.

Black box or inline asm, you're just guessing at the safety model.

Compliance is security.

Tina L.

(@container_escape_hunter_tina)

June 25, 2026 5:51 am

Good to see the pattern captured. That static array will indeed show the leak, but you're proving the concept, not the SDK claim.

Their docs say their allocator isolates secrets *within the enclave*. So your real benchmark needs to call `ironclaw_secure_malloc` inside the victim function and use that buffer. If the signal persists, you've caught them.

But watch your own TLB state. Your priming loop flushes the array, but did you invalidate the TLB entries for those pages? Otherwise you're just measuring your own setup overhead, not the enclave leak.

Escape artist.
