I’ve been auditing a constant-time comparison function for a cryptographic operation inside one of our Rust-based enclave prototypes. Under controlled, single-threaded conditions, the timing is flat. However, when deployed in a realistic multi-threaded enclave under load (simulating a production workload), I'm observing measurable timing variance in the comparison operation.
The code follows standard constant-time practices:
```rust
pub fn constant_time_compare(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut result = 0u8;
    for (&x, &y) in a.iter().zip(b.iter()) {
        result |= x ^ y;
    }
    // Constant-time check for zero
    result == 0
}
```
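One caveat with the source-level pattern: the optimizer is free to restructure the loop, so the emitted machine code is what actually needs auditing. As a minimal sketch (the name `constant_time_compare_hardened` is hypothetical, not part of our codebase), the variant below routes the accumulator through `std::hint::black_box` and replaces the final `== 0` with an arithmetic zero test. Note that the standard library documents `black_box` as best-effort only, with no guarantees for cryptographic purposes, so this discourages but does not prevent data-dependent codegen.

```rust
use std::hint::black_box;

/// Hypothetical hardened variant of the comparison (illustrative name).
/// Still returns early on length mismatch, so lengths are assumed non-secret.
pub fn constant_time_compare_hardened(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (&x, &y) in a.iter().zip(b.iter()) {
        // Best-effort optimization barrier: makes it harder for the compiler
        // to turn the accumulation into an early-exit search.
        diff = black_box(diff | (x ^ y));
    }
    // Arithmetic zero test: widening to u16 and subtracting 1 sets the high
    // byte only when diff == 0 (0 - 1 wraps to 0xFFFF).
    let is_zero = ((diff as u16).wrapping_sub(1) >> 8) & 1;
    black_box(is_zero) == 1
}
```

Whichever variant is used, disassembling the release build is the only way to confirm that no data-dependent branch survived compilation.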
I've ruled out obvious issues:
* The function is compiled with optimization (`--release`).
* The only early return is the length check before the loop; the loop itself has no data-dependent branches or exits.
* The buffers being compared have the same alignment relative to page boundaries on every run.
My hypothesis is that microarchitectural state under load is causing the variance. Specifically:
* Cache bank conflicts or DRAM bus contention from other threads.
* Interference from the OS scheduler or hypervisor on the core's frequency/p-state.
* Potential Spectre-V1 mitigation fences (LFENCE) having variable cost under thermal throttling.
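To probe these hypotheses outside the enclave, one option is a small harness that times the comparison for two input classes while background threads generate memory traffic. The sketch below is an assumption-laden illustration: `sample_under_load` and `time_once` are hypothetical helpers, the 8 MiB noise buffer and 64-byte stride are arbitrary choices, and it relies on `std::time::Instant`, which may be unavailable or too coarse inside a real enclave (a cycle counter or batched measurements would likely be needed there).

```rust
use std::hint::black_box;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Instant;

// Same logic as the function under audit, repeated so the sketch is self-contained.
fn constant_time_compare(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut result = 0u8;
    for (&x, &y) in a.iter().zip(b.iter()) {
        result |= x ^ y;
    }
    result == 0
}

/// Times a single comparison in nanoseconds (hypothetical helper).
fn time_once(a: &[u8], b: &[u8]) -> u64 {
    let start = Instant::now();
    black_box(constant_time_compare(black_box(a), black_box(b)));
    start.elapsed().as_nanos() as u64
}

/// Collects `iters` interleaved samples per input class while `noise_threads`
/// background threads stride over a large buffer to create cache/DRAM traffic.
fn sample_under_load(iters: usize, noise_threads: usize) -> (Vec<u64>, Vec<u64>) {
    let stop = Arc::new(AtomicBool::new(false));
    let workers: Vec<_> = (0..noise_threads)
        .map(|_| {
            let stop = Arc::clone(&stop);
            thread::spawn(move || {
                let mut buf = vec![0u8; 8 << 20]; // 8 MiB, arbitrary size
                let mut i = 0usize;
                while !stop.load(Ordering::Relaxed) {
                    buf[i] = buf[i].wrapping_add(1);
                    i = (i + 64) % buf.len(); // step roughly one cache line
                }
                black_box(&buf);
            })
        })
        .collect();

    let secret = [0xA5u8; 32];
    let equal = secret; // class A: identical input
    let mut differ = secret; // class B: differs in the first byte
    differ[0] ^= 0xFF;

    let mut t_equal = Vec::with_capacity(iters);
    let mut t_differ = Vec::with_capacity(iters);
    for _ in 0..iters {
        // Interleave the classes so drift (frequency scaling, throttling)
        // affects both roughly equally.
        t_equal.push(time_once(&secret, &equal));
        t_differ.push(time_once(&secret, &differ));
    }

    stop.store(true, Ordering::Relaxed);
    for w in workers {
        w.join().unwrap();
    }
    (t_equal, t_differ)
}
```

Running it with `noise_threads` set to 0 and then to the machine's core count gives a rough view of how much of the variance comes from contention rather than from the comparison itself.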
I’m looking for practical exposure assessments from others working on IronClad or similar TEEs. Have you validated constant-time properties under full system load, not just in isolation? What instrumentation or hardware performance counters proved most useful?
Our current attestation pipeline doesn’t capture microarchitectural state. If the timing variance is statistically significant under load, does this constitute a side-channel risk we must mitigate at the system level, perhaps via core-pinning and cache partitioning?
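On the "statistically significant" part: variance by itself is not a leak. What matters is whether timing differs between classes of secret-dependent inputs. A common way to check this is Welch's t-test over two sets of timing samples (for example, equal inputs versus inputs that differ in the first byte). The sketch below is a minimal illustration with hypothetical helper names; the decision threshold (values around |t| > 4.5 are often used in leakage assessment) is an assumption to tune, not a guarantee.

```rust
/// Mean and unbiased sample variance of a set of timing samples.
/// Assumes at least two samples.
fn mean_var(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var)
}

/// Welch's t statistic for two sample sets with possibly unequal variances.
/// A large |t| suggests the two input classes have different mean timings.
fn welch_t(a: &[f64], b: &[f64]) -> f64 {
    let (mean_a, var_a) = mean_var(a);
    let (mean_b, var_b) = mean_var(b);
    let std_err = (var_a / a.len() as f64 + var_b / b.len() as f64).sqrt();
    (mean_a - mean_b) / std_err
}
```

If |t| stays small between input classes even under load, the extra variance is noise that does not depend on the secret. If it grows under load, that points to an input-dependent effect worth mitigating at the system level.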