[llvm] [X86] Fix overflow with large stack probes on x86-64 (PR #113219)
Eli Friedman via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 21 14:30:20 PDT 2024
================
@@ -798,18 +798,43 @@ void X86FrameLowering::emitStackProbeInlineGenericLoop(
: Is64Bit ? X86::R11D
: X86::EAX;
- BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::COPY), FinalStackProbed)
- .addReg(StackPtr)
- .setMIFlag(MachineInstr::FrameSetup);
-
// save loop bound
{
- const unsigned BoundOffset = alignDown(Offset, StackProbeSize);
- const unsigned SUBOpc = getSUBriOpcode(Uses64BitFramePtr);
- BuildMI(MBB, MBBI, DL, TII.get(SUBOpc), FinalStackProbed)
- .addReg(FinalStackProbed)
- .addImm(BoundOffset)
- .setMIFlag(MachineInstr::FrameSetup);
+ const uint64_t BoundOffset = alignDown(Offset, StackProbeSize);
+
+ // Can we calculate the loop bound using SUB with a 32-bit immediate?
+ // Note that the immediate gets sign-extended when used with a 64-bit
+ // register, so in that case we only have 31 bits to work with.
+ bool canUseSub =
+ Uses64BitFramePtr ? isUInt<31>(BoundOffset) : isUInt<32>(BoundOffset);
+
+ if (canUseSub) {
+ const unsigned SUBOpc = getSUBriOpcode(Uses64BitFramePtr);
+
+ BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::COPY), FinalStackProbed)
+ .addReg(StackPtr)
+ .setMIFlag(MachineInstr::FrameSetup);
+ BuildMI(MBB, MBBI, DL, TII.get(SUBOpc), FinalStackProbed)
+ .addReg(FinalStackProbed)
+ .addImm(BoundOffset)
+ .setMIFlag(MachineInstr::FrameSetup);
+ } else if (Uses64BitFramePtr) {
+ BuildMI(MBB, MBBI, DL, TII.get(X86::MOV64ri), FinalStackProbed)
+ .addImm(-BoundOffset)
+ .setMIFlag(MachineInstr::FrameSetup);
+ BuildMI(MBB, MBBI, DL, TII.get(X86::ADD64rr), FinalStackProbed)
+ .addReg(FinalStackProbed)
+ .addReg(StackPtr)
+ .setMIFlag(MachineInstr::FrameSetup);
+ } else {
+ // We're being asked to probe a stack frame that's 4 GiB or larger,
+ // but our stack pointer is only 32 bits.
----------------
efriedma-quic wrote:
I'm a little concerned about printing an error for valid code. It's maybe less of an issue here than in other contexts, but generally, a stack overflow should be a runtime error.
Maybe we can just clamp the size of the allocation to 2^32-1, and let the OS generate a stack overflow trap?
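
For reference, here is a minimal standalone sketch of the arithmetic behind the `canUseSub` check and the clamp suggested above. It is not LLVM code: `fitsUnsigned`, `signExtendImm32`, and `clampFor32BitStackPtr` are hypothetical helpers made up for illustration, and only `canUseSub`, `Uses64BitFramePtr`, and `BoundOffset` mirror names from the patch. The clamp is an assumption about how the suggestion might look, not what the PR ends up doing.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for llvm::isUInt<N>: true if X fits in N unsigned bits.
template <unsigned N> constexpr bool fitsUnsigned(uint64_t X) {
  static_assert(N > 0 && N < 64, "illustration only covers 1..63 bits");
  return X < (uint64_t{1} << N);
}

// Value a 64-bit SUB with a 32-bit immediate really subtracts: the immediate
// is sign-extended to 64 bits, so bit 31 set turns it into a huge value.
constexpr uint64_t signExtendImm32(uint32_t Imm) {
  return (Imm & 0x80000000u) ? (0xFFFFFFFF00000000ull | Imm) : uint64_t{Imm};
}

// Mirrors the condition in the patch: 31 usable bits with a 64-bit frame
// pointer (because of sign extension), 32 bits with a 32-bit one.
constexpr bool canUseSub(bool Uses64BitFramePtr, uint64_t BoundOffset) {
  return Uses64BitFramePtr ? fitsUnsigned<31>(BoundOffset)
                           : fitsUnsigned<32>(BoundOffset);
}

// Hypothetical clamp for the 32-bit stack pointer case: instead of reporting
// an error, cap the size at 2^32-1 and let the probe fault at runtime.
constexpr uint64_t clampFor32BitStackPtr(uint64_t Size) {
  constexpr uint64_t Max = std::numeric_limits<uint32_t>::max();
  return Size > Max ? Max : Size;
}

// 0x80000000 does not survive sign extension, hence the 31-bit limit.
static_assert(signExtendImm32(0x80000000u) != 0x80000000ull,
              "bit 31 is sign-extended");
```

With a clamp like this, a 32-bit target asked for a frame of 4 GiB or more would still compile, and the failure would show up as a stack overflow trap when the probe loop runs.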
https://github.com/llvm/llvm-project/pull/113219