[PATCH] D150033: fix stack probe lowering for x86_intrcc

Tom Dohrmann via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat May 6 14:38:38 PDT 2023


Freax13 added a comment.

In D150033#4324280 <https://reviews.llvm.org/D150033#4324280>, @pengfei wrote:

>> To be honest I don't quite understand why this didn't lead to more noticeable miscompilations previously.
>
> I don't know much about `STACKALLOC_W_PROBING`, just curious what's special to `x86_intrcc`? My real concern is if it's still well defined behavior if the function takes an error code.

`x86_intrcc` is used for exception/interrupt handlers on x86. Before pushing anything onto the stack the cpu will align the stack to 16-bytes. For exceptions without an error code it will push 5 8-byte values onto the stack and for exceptions with an error code it will push 6 8-byte values. This means that for exceptions without an error code the stack will be misaligned by 8 bytes and for exceptions with an error code it will be correctly 16-byte aligned.

Usually the x86-64 ABI requires that the stack will be 16-byte aligned before a function call meaning that it will by misaligned by 8 bytes after the call. The function will then push an additional value to the stack to align the stack to 16 bytes. IIUC for exceptions without an error code no special handling is needed because the stack is 8-byte misaligned just like it would be for a normal call, whereas it's off by 8 for an exception with an error code which is why the stack pointer is updated to bring it back to that misaligned state.

I also don't know much about `STACKALLOC_W_PROBING` but from I can gather it's used to allocate space on the stack while also probing it to ensure that that space isn't part of a guard page. The probing is done by writing to the memory to see if it will trigger a fault. IIUC in the context of `x86_intrcc` the 8 byte stack pointer change should never cause any stack probing to be emitted because it's smaller than the stack probe size (4096 bytes). Even if probing were to occur it should probe the memory at the opposite end of the allocation nowhere near the interrupt stack frame pushed by the cpu, so corruption of the interrupt stack frame shouldn't happen. I'm not sure if updating the stack pointer by 8 breaks any assumptions made for stack probing (previously the called function knew that `(%rsp)` contained the return address i.e. is used and writeable memory, this is no longer the case if we update the stack pointer by 8 when the function is first entered).



================
Comment at: llvm/lib/Target/X86/X86FrameLowering.cpp:655
+  while (1) {
+    auto Where = llvm::find_if(PrologMBB, [](MachineInstr &MI) {
+      return MI.getOpcode() == X86::STACKALLOC_W_PROBING;
----------------
pengfei wrote:
> Use `for (MachineInstr &MI : PrologMBB) {` to iterate rather than loop `find_if`.
Hmm, this doesn't work and I think my whole approach is flawed: Initially I was afraid to use a for loop because I was afraid that the iterator would get invalidated by the call to `emitStackProbeInline` and `eraseFromParent`. The invalidation caused `eraseFromParent` isn't a huge problem and could probably be bypassed by using `make_early_inc_range`. The real problem is that `emitStackProbeInline` will split the basic block when it actually inserts stack probe if it chooses to insert a loop. Crucially this means that if the first `STACKALLOC_W_PROBING` machine instruction causes one of these stack probe loops to be created all other `STACKALLOC_W_PROBING`s will be moved into a new basic block after the loop. This means that `inlineStackProbe` will no longer find them and we're back to where we started with `STACKALLOC_W_PROBING` not being inlined.
I'm not yet sure how to fix this, I'd welcome suggestions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150033/new/

https://reviews.llvm.org/D150033



More information about the llvm-commits mailing list