[llvm] r244578 - [X86] When optimizing for minsize, use POP for small post-call stack clean-up

Alexander Potapenko via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 1 05:40:05 PDT 2015


I believe this commit broke codegen for X86, see
https://llvm.org/bugs/show_bug.cgi?id=24649

On Thu, Aug 20, 2015 at 10:26 AM, Kuperstein, Michael M via
llvm-commits <llvm-commits at lists.llvm.org> wrote:
> I haven't actually measured it, I was just being conservative, as it is not obviously a performance gain, and seemed like it could be a performance loss.
>
> As a side note, we ought to be careful when reasoning about what the frontend of modern Intels can and can't do, as the answer may be very different on Atoms.
>
> Michael
>
> -----Original Message-----
> From: Philip Reames [mailto:listmail at philipreames.com]
> Sent: Wednesday, August 19, 2015 21:14
> To: Kuperstein, Michael M; llvm-commits at lists.llvm.org
> Subject: Re: [llvm] r244578 - [X86] When optimizing for minsize, use POP for small post-call stack clean-up
>
> Out of curiosity, what is the performance impact of this change?  I could see an argument for doing this even in an -O2 build since a) the frontend should entirely handle the pop on modern intels, and b) it's a space savings.  I can't predict the outcome, but it seems like it might be worth measuring.
>
> Philip
>
> On 08/11/2015 01:48 AM, Michael Kuperstein via llvm-commits wrote:
>> Author: mkuper
>> Date: Tue Aug 11 03:48:48 2015
>> New Revision: 244578
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=244578&view=rev
>> Log:
>> [X86] When optimizing for minsize, use POP for small post-call stack
>> clean-up
>>
>> When optimizing for size, replace "addl $4, %esp" and "addl $8, %esp"
>> following a call by one or two pops, respectively. We don't try to do
>> it in general, but only when the stack adjustment immediately follows
>> a call - which is the most common case.
>>
>> That allows taking a short-cut when trying to find a free register to
>> pop into, instead of a full-blown liveness check. If the adjustment
>> immediately follows a call, then every register the call clobbers but
>> doesn't define should be dead at that point, and can be used.
>>
>> Differential Revision: http://reviews.llvm.org/D11749
>>
>> Added:
>>      llvm/trunk/test/CodeGen/X86/pop-stack-cleanup.ll
>> Modified:
>>      llvm/trunk/lib/Target/X86/X86FrameLowering.cpp
>>      llvm/trunk/lib/Target/X86/X86FrameLowering.h
>>      llvm/trunk/test/CodeGen/X86/fold-push.ll
>>
>> Modified: llvm/trunk/lib/Target/X86/X86FrameLowering.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Frame
>> Lowering.cpp?rev=244578&r1=244577&r2=244578&view=diff
>> ======================================================================
>> ========
>> --- llvm/trunk/lib/Target/X86/X86FrameLowering.cpp (original)
>> +++ llvm/trunk/lib/Target/X86/X86FrameLowering.cpp Tue Aug 11 03:48:48
>> +++ 2015
>> @@ -1851,6 +1851,69 @@ void X86FrameLowering::adjustForHiPEProl
>>   #endif
>>   }
>>
>> +bool X86FrameLowering::adjustStackWithPops(MachineBasicBlock &MBB,
>> +    MachineBasicBlock::iterator MBBI, DebugLoc DL, int Offset) const
>> +{
>> +
>> +  if (Offset % SlotSize)
>> +    return false;
>> +
>> +  int NumPops = Offset / SlotSize;
>> +  // This is only worth it if we have at most 2 pops.
>> +  if (NumPops != 1 && NumPops != 2)
>> +    return false;
>> +
>> +  // Handle only the trivial case where the adjustment directly
>> + follows  // a call. This is the most common one, anyway.
>> +  if (MBBI == MBB.begin())
>> +    return false;
>> +  MachineBasicBlock::iterator Prev = std::prev(MBBI);  if
>> + (!Prev->isCall() || !Prev->getOperand(1).isRegMask())
>> +    return false;
>> +
>> +  unsigned Regs[2];
>> +  unsigned FoundRegs = 0;
>> +
>> +  auto RegMask = Prev->getOperand(1);
>> +
>> +  // Try to find up to NumPops free registers.
>> +  for (auto Candidate : X86::GR32_NOREX_NOSPRegClass) {
>> +
>> +    // Poor man's liveness:
>> +    // Since we're immediately after a call, any register that is clobbered
>> +    // by the call and not defined by it can be considered dead.
>> +    if (!RegMask.clobbersPhysReg(Candidate))
>> +      continue;
>> +
>> +    bool IsDef = false;
>> +    for (const MachineOperand &MO : Prev->implicit_operands()) {
>> +      if (MO.isReg() && MO.isDef() && MO.getReg() == Candidate) {
>> +        IsDef = true;
>> +        break;
>> +      }
>> +    }
>> +
>> +    if (IsDef)
>> +      continue;
>> +
>> +    Regs[FoundRegs++] = Candidate;
>> +    if (FoundRegs == (unsigned)NumPops)
>> +      break;
>> +  }
>> +
>> +  if (FoundRegs == 0)
>> +    return false;
>> +
>> +  // If we found only one free register, but need two, reuse the same one twice.
>> +  while (FoundRegs < (unsigned)NumPops)
>> +    Regs[FoundRegs++] = Regs[0];
>> +
>> +  for (int i = 0; i < NumPops; ++i)
>> +    BuildMI(MBB, MBBI, DL,
>> +            TII.get(STI.is64Bit() ? X86::POP64r : X86::POP32r),
>> + Regs[i]);
>> +
>> +  return true;
>> +}
>> +
>>   void X86FrameLowering::
>>   eliminateCallFramePseudoInstr(MachineFunction &MF, MachineBasicBlock &MBB,
>>                                 MachineBasicBlock::iterator I) const {
>> @@ -1882,8 +1945,12 @@ eliminateCallFramePseudoInstr(MachineFun
>>       if (Amount) {
>>         // Add Amount to SP to destroy a frame, and subtract to setup.
>>         int Offset = isDestroy ? Amount : -Amount;
>> -      BuildStackAdjustment(MBB, I, DL, Offset, /*InEpilogue=*/false);
>> +
>> +      if (!(MF.getFunction()->optForMinSize() &&
>> +            adjustStackWithPops(MBB, I, DL, Offset)))
>> +        BuildStackAdjustment(MBB, I, DL, Offset,
>> + /*InEpilogue=*/false);
>>       }
>> +
>>       return;
>>     }
>>
>>
>> Modified: llvm/trunk/lib/Target/X86/X86FrameLowering.h
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Frame
>> Lowering.h?rev=244578&r1=244577&r2=244578&view=diff
>> ======================================================================
>> ========
>> --- llvm/trunk/lib/Target/X86/X86FrameLowering.h (original)
>> +++ llvm/trunk/lib/Target/X86/X86FrameLowering.h Tue Aug 11 03:48:48
>> +++ 2015
>> @@ -146,6 +146,11 @@ private:
>>                             MachineBasicBlock::iterator MBBI, DebugLoc DL,
>>                             uint64_t MaxAlign) const;
>>
>> +  /// Make small positive stack adjustments using POPs.
>> +  bool adjustStackWithPops(MachineBasicBlock &MBB,
>> +                           MachineBasicBlock::iterator MBBI, DebugLoc DL,
>> +                           int Offset) const;
>> +
>>     /// Adjusts the stack pointer using LEA, SUB, or ADD.
>>     MachineInstrBuilder BuildStackAdjustment(MachineBasicBlock &MBB,
>>
>> MachineBasicBlock::iterator MBBI,
>>
>> Modified: llvm/trunk/test/CodeGen/X86/fold-push.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fold-p
>> ush.ll?rev=244578&r1=244577&r2=244578&view=diff
>> ======================================================================
>> ========
>> --- llvm/trunk/test/CodeGen/X86/fold-push.ll (original)
>> +++ llvm/trunk/test/CodeGen/X86/fold-push.ll Tue Aug 11 03:48:48 2015
>> @@ -27,11 +27,11 @@ define void @test_min(i32 %a, i32 %b) mi
>>   ; CHECK: movl [[EAX:%e..]], (%esp)
>>   ; CHECK-NEXT: pushl [[EAX]]
>>   ; CHECK-NEXT: calll
>> -; CHECK-NEXT: addl $4, %esp
>> +; CHECK-NEXT: popl
>>   ; CHECK: nop
>>   ; CHECK: pushl (%esp)
>>   ; CHECK: calll
>> -; CHECK-NEXT: addl $4, %esp
>> +; CHECK-NEXT: popl
>>     %c = add i32 %a, %b
>>     call void @foo(i32 %c)
>>     call void asm sideeffect "nop",
>> "~{ax},~{bx},~{cx},~{dx},~{bp},~{si},~{di}"()
>>
>> Added: llvm/trunk/test/CodeGen/X86/pop-stack-cleanup.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/pop-st
>> ack-cleanup.ll?rev=244578&view=auto
>> ======================================================================
>> ========
>> --- llvm/trunk/test/CodeGen/X86/pop-stack-cleanup.ll (added)
>> +++ llvm/trunk/test/CodeGen/X86/pop-stack-cleanup.ll Tue Aug 11
>> +++ 03:48:48 2015
>> @@ -0,0 +1,61 @@
>> +; RUN: llc < %s -mtriple=i686-windows | FileCheck %s
>> +-check-prefix=CHECK -check-prefix=NORMAL
>> +
>> +declare void @param1(i32 %a)
>> +declare i32 @param2_ret(i32 %a, i32 %b) declare i64 @param2_ret64(i32
>> +%a, i32 %b) declare void @param2(i32 %a, i32 %b) declare void
>> + at param3(i32 %a, i32 %b, i32 %c)
>> +
>> +define void @test() minsize {
>> +; CHECK-LABEL: test:
>> +; CHECK: calll _param1
>> +; CHECK-NEXT: popl %eax
>> +; CHECK: calll _param2
>> +; CHECK-NEXT: popl %eax
>> +; CHECK-NEXT: popl %ecx
>> +; CHECK: calll _param2_ret
>> +; CHECK-NEXT: popl %ecx
>> +; CHECK-NEXT: popl %edx
>> +; CHECK-NEXT: pushl %eax
>> +; CHECK: calll _param3
>> +; CHECK-NEXT: addl $12, %esp
>> +; CHECK: calll _param2_ret64
>> +; CHECK-NEXT: popl %ecx
>> +; CHECK-NEXT: popl %ecx
>> +  call void @param1(i32 1)
>> +  call void @param2(i32 1, i32 2)
>> +  %ret = call i32 @param2_ret(i32 1, i32 2)
>> +  call void @param3(i32 1, i32 2, i32 %ret)
>> +  %ret64 = call i64 @param2_ret64(i32 1, i32 2)
>> +  ret void
>> +}
>> +
>> +define void @negative(i32 %k) {
>> +; CHECK-LABEL: negative:
>> +; CHECK: calll _param1
>> +; CHECK-NEXT: addl $4, %esp
>> +; CHECK: calll _param2
>> +; CHECK-NEXT: addl $8, %esp
>> +; CHECK: calll _param3
>> +; CHECK-NEXT: movl %ebp, %esp
>> +  %v = alloca i32, i32 %k
>> +  call void @param1(i32 1)
>> +  call void @param2(i32 1, i32 2)
>> +  call void @param3(i32 1, i32 2, i32 3)
>> +  ret void
>> +}
>> +
>> +define void @spill(i32 inreg %a, i32 inreg %b, i32 inreg %c) minsize
>> +{ ; CHECK-LABEL: spill:
>> +; CHECK-DAG: movl %ecx,
>> +; CHECK-DAG: movl %edx,
>> +; CHECK: calll _param2_ret
>> +; CHECK-NEXT: popl %ecx
>> +; CHECK-NEXT: popl %edx
>> +; CHECK-DAG: movl {{.*}}, %ecx
>> +; CHECK-DAG: movl {{.*}}, %edx
>> +; CHECK: calll _spill
>> +  %i = call i32 @param2_ret(i32 1, i32 2)
>> +  call void @spill(i32 %a, i32 %b, i32 %c)
>> +  ret void
>> +}
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Dienerstraße 12
80331 München


More information about the llvm-commits mailing list