[llvm] r257589 - LEA code size optimization pass (Part 2): Remove redundant LEA instructions.

Philip Reames via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 25 14:24:48 PDT 2016


I notice that this is structured as a per-basic-block action, but is 
only enabled if the entire function is marked Oz.  Are there any plans 
to use block profiling to enable this in cold blocks of a non-Oz 
function?  I'd be very interested in seeing that happen.

p.s. Sorry to revive a zombie thread; I came across this change due to 
the presentation at EuroLLVM which mentioned it.

Philip

On 01/13/2016 03:30 AM, Andrey Turetskiy via llvm-commits wrote:
> Author: aturetsk
> Date: Wed Jan 13 05:30:44 2016
> New Revision: 257589
>
> URL: http://llvm.org/viewvc/llvm-project?rev=257589&view=rev
> Log:
> LEA code size optimization pass (Part 2): Remove redundant LEA instructions.
>
> Make the x86 OptimizeLEAs pass remove an LEA instruction if there is another
> LEA (in the same basic block) which calculates an address differing only by a
> displacement. Works only for -Oz.
>
> Differential Revision: http://reviews.llvm.org/D13295
>
>
> Modified:
>      llvm/trunk/lib/Target/X86/X86.h
>      llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp
>      llvm/trunk/test/CodeGen/X86/lea-opt.ll
>
> Modified: llvm/trunk/lib/Target/X86/X86.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.h?rev=257589&r1=257588&r2=257589&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86.h (original)
> +++ llvm/trunk/lib/Target/X86/X86.h Wed Jan 13 05:30:44 2016
> @@ -54,7 +54,8 @@ FunctionPass *createX86PadShortFunctions
>   /// instructions, in order to eliminate execution delays in some processors.
>   FunctionPass *createX86FixupLEAs();
>   
> -/// Return a pass that removes redundant address recalculations.
> +/// Return a pass that removes redundant LEA instructions and redundant address
> +/// recalculations.
>   FunctionPass *createX86OptimizeLEAs();
>   
>   /// Return a pass that optimizes the code-size of x86 call sequences. This is
>
> Modified: llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp?rev=257589&r1=257588&r2=257589&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp Wed Jan 13 05:30:44 2016
> @@ -9,8 +9,10 @@
>   //
>   // This file defines the pass that performs some optimizations with LEA
>   // instructions in order to improve code size.
> -// Currently, it does one thing:
> -// 1) Address calculations in load and store instructions are replaced by
> +// Currently, it does two things:
> +// 1) If there are two LEA instructions calculating addresses which only differ
> +//    by displacement inside a basic block, one of them is removed.
> +// 2) Address calculations in load and store instructions are replaced by
>   //    existing LEA def registers where possible.
>   //
>   //===----------------------------------------------------------------------===//
> @@ -38,6 +40,7 @@ static cl::opt<bool> EnableX86LEAOpt("en
>                                        cl::init(false));
>   
>   STATISTIC(NumSubstLEAs, "Number of LEA instruction substitutions");
> +STATISTIC(NumRedundantLEAs, "Number of redundant LEA instructions removed");
>   
>   namespace {
>   class OptimizeLEAPass : public MachineFunctionPass {
> @@ -71,6 +74,13 @@ private:
>     /// \brief Returns true if the instruction is LEA.
>     bool isLEA(const MachineInstr &MI);
>   
> +  /// \brief Returns true if the \p Last LEA instruction can be replaced by the
> +  /// \p First. The difference between displacements of the addresses calculated
> +  /// by these LEAs is returned in \p AddrDispShift. It'll be used for proper
> +  /// replacement of the \p Last LEA's uses with the \p First's def register.
> +  bool isReplaceable(const MachineInstr &First, const MachineInstr &Last,
> +                     int64_t &AddrDispShift);
> +
>     /// \brief Returns true if two instructions have memory operands that only
>     /// differ by displacement. The numbers of the first memory operands for both
>     /// instructions are specified through \p N1 and \p N2. The address
> @@ -88,6 +98,9 @@ private:
>     /// \brief Removes redundant address calculations.
>     bool removeRedundantAddrCalc(const SmallVectorImpl<MachineInstr *> &List);
>   
> +  /// \brief Removes LEAs which calculate similar addresses.
> +  bool removeRedundantLEAs(SmallVectorImpl<MachineInstr *> &List);
> +
>     DenseMap<const MachineInstr *, unsigned> InstrPos;
>   
>     MachineRegisterInfo *MRI;
> @@ -194,6 +207,69 @@ bool OptimizeLEAPass::isLEA(const Machin
>            Opcode == X86::LEA64r || Opcode == X86::LEA64_32r;
>   }
>   
> +// Check that the Last LEA can be replaced by the First LEA. To be so,
> +// these requirements must be met:
> +// 1) Addresses calculated by LEAs differ only by displacement.
> +// 2) Def registers of LEAs belong to the same class.
> +// 3) All uses of the Last LEA def register are replaceable, thus the
> +//    register is used only as address base.
> +bool OptimizeLEAPass::isReplaceable(const MachineInstr &First,
> +                                    const MachineInstr &Last,
> +                                    int64_t &AddrDispShift) {
> +  assert(isLEA(First) && isLEA(Last) &&
> +         "The function works only with LEA instructions");
> +
> +  // Compare instructions' memory operands.
> +  if (!isSimilarMemOp(Last, 1, First, 1, AddrDispShift))
> +    return false;
> +
> +  // Make sure that LEA def registers belong to the same class. There may be
> +  // instructions (like MOV8mr_NOREX) which allow a limited set of registers to
> +  // be used as their operands, so we must be sure that replacing one LEA
> +  // with another won't lead to putting a wrong register in the instruction.
> +  if (MRI->getRegClass(First.getOperand(0).getReg()) !=
> +      MRI->getRegClass(Last.getOperand(0).getReg()))
> +    return false;
> +
> +  // Loop over all uses of the Last LEA to check that its def register is
> +  // used only as address base for memory accesses. If so, it can be
> +  // replaced, otherwise - no.
> +  for (auto &MO : MRI->use_operands(Last.getOperand(0).getReg())) {
> +    MachineInstr &MI = *MO.getParent();
> +
> +    // Get the number of the first memory operand.
> +    const MCInstrDesc &Desc = MI.getDesc();
> +    int MemOpNo = X86II::getMemoryOperandNo(Desc.TSFlags, MI.getOpcode());
> +
> +    // If the use instruction has no memory operand - the LEA is not
> +    // replaceable.
> +    if (MemOpNo < 0)
> +      return false;
> +
> +    MemOpNo += X86II::getOperandBias(Desc);
> +
> +    // If the address base of the use instruction is not the LEA def register -
> +    // the LEA is not replaceable.
> +    if (!isIdenticalOp(MI.getOperand(MemOpNo + X86::AddrBaseReg), MO))
> +      return false;
> +
> +    // If the LEA def register is used as any other operand of the use
> +    // instruction - the LEA is not replaceable.
> +    for (unsigned i = 0; i < MI.getNumOperands(); i++)
> +      if (i != (unsigned)(MemOpNo + X86::AddrBaseReg) &&
> +          isIdenticalOp(MI.getOperand(i), MO))
> +        return false;
> +
> +    // Check that the new address displacement will fit 4 bytes.
> +    if (MI.getOperand(MemOpNo + X86::AddrDisp).isImm() &&
> +        !isInt<32>(MI.getOperand(MemOpNo + X86::AddrDisp).getImm() +
> +                   AddrDispShift))
> +      return false;
> +  }
> +
> +  return true;
> +}
> +
>   // Check if MI1 and MI2 have memory operands which represent addresses that
>   // differ only by displacement.
>   bool OptimizeLEAPass::isSimilarMemOp(const MachineInstr &MI1, unsigned N1,
> @@ -316,6 +392,81 @@ bool OptimizeLEAPass::removeRedundantAdd
>     return Changed;
>   }
>   
> +// Try to find similar LEAs in the list and replace one with another.
> +bool
> +OptimizeLEAPass::removeRedundantLEAs(SmallVectorImpl<MachineInstr *> &List) {
> +  bool Changed = false;
> +
> +  // Loop over all LEA pairs.
> +  auto I1 = List.begin();
> +  while (I1 != List.end()) {
> +    MachineInstr &First = **I1;
> +    auto I2 = std::next(I1);
> +    while (I2 != List.end()) {
> +      MachineInstr &Last = **I2;
> +      int64_t AddrDispShift;
> +
> +      // LEAs should be in occurrence order in the list, so we can freely
> +      // replace later LEAs with earlier ones.
> +      assert(calcInstrDist(First, Last) > 0 &&
> +             "LEAs must be in occurrence order in the list");
> +
> +      // Check that the Last LEA instruction can be replaced by the First.
> +      if (!isReplaceable(First, Last, AddrDispShift)) {
> +        ++I2;
> +        continue;
> +      }
> +
> +      // Loop over all uses of the Last LEA and update their operands. Note that
> +      // the correctness of this has already been checked in the isReplaceable
> +      // function.
> +      for (auto UI = MRI->use_begin(Last.getOperand(0).getReg()),
> +                UE = MRI->use_end();
> +           UI != UE;) {
> +        MachineOperand &MO = *UI++;
> +        MachineInstr &MI = *MO.getParent();
> +
> +        // Get the number of the first memory operand.
> +        const MCInstrDesc &Desc = MI.getDesc();
> +        int MemOpNo = X86II::getMemoryOperandNo(Desc.TSFlags, MI.getOpcode()) +
> +                      X86II::getOperandBias(Desc);
> +
> +        // Update address base.
> +        MO.setReg(First.getOperand(0).getReg());
> +
> +        // Update address disp.
> +        MachineOperand *Op = &MI.getOperand(MemOpNo + X86::AddrDisp);
> +        if (Op->isImm())
> +          Op->setImm(Op->getImm() + AddrDispShift);
> +        else if (Op->isGlobal())
> +          Op->setOffset(Op->getOffset() + AddrDispShift);
> +        else
> +          llvm_unreachable("Invalid address displacement operand");
> +      }
> +
> +      // Since we can possibly extend register lifetime, clear kill flags.
> +      MRI->clearKillFlags(First.getOperand(0).getReg());
> +
> +      ++NumRedundantLEAs;
> +      DEBUG(dbgs() << "OptimizeLEAs: Remove redundant LEA: "; Last.dump(););
> +
> +      // By this moment, all of the Last LEA's uses must be replaced. So we can
> +      // freely remove it.
> +      assert(MRI->use_empty(Last.getOperand(0).getReg()) &&
> +             "The LEA's def register must have no uses");
> +      Last.eraseFromParent();
> +
> +      // Erase removed LEA from the list.
> +      I2 = List.erase(I2);
> +
> +      Changed = true;
> +    }
> +    ++I1;
> +  }
> +
> +  return Changed;
> +}
> +
>   bool OptimizeLEAPass::runOnMachineFunction(MachineFunction &MF) {
>     bool Changed = false;
>   
> @@ -339,6 +490,11 @@ bool OptimizeLEAPass::runOnMachineFuncti
>       if (LEAs.empty())
>         continue;
>   
> +    // Remove redundant LEA instructions. The optimization may have a negative
> +    // effect on performance, so do it only for -Oz.
> +    if (MF.getFunction()->optForMinSize())
> +      Changed |= removeRedundantLEAs(LEAs);
> +
>       // Remove redundant address calculations.
>       Changed |= removeRedundantAddrCalc(LEAs);
>     }
>
> Modified: llvm/trunk/test/CodeGen/X86/lea-opt.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/lea-opt.ll?rev=257589&r1=257588&r2=257589&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/lea-opt.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/lea-opt.ll Wed Jan 13 05:30:44 2016
> @@ -129,3 +129,41 @@ sw.epilog:
>   ; CHECK:	movl ${{[1-4]+}}, ([[REG2]])
>   ; CHECK:	movl ${{[1-4]+}}, ([[REG3]])
>   }
> +
> +define void @test4(i64 %x) nounwind minsize {
> +entry:
> +  %a = getelementptr inbounds [65 x %struct.anon1], [65 x %struct.anon1]* @arr1, i64 0, i64 %x, i32 0
> +  %tmp = load i32, i32* %a, align 4
> +  %b = getelementptr inbounds [65 x %struct.anon1], [65 x %struct.anon1]* @arr1, i64 0, i64 %x, i32 1
> +  %tmp1 = load i32, i32* %b, align 4
> +  %sub = sub i32 %tmp, %tmp1
> +  %c = getelementptr inbounds [65 x %struct.anon1], [65 x %struct.anon1]* @arr1, i64 0, i64 %x, i32 2
> +  %tmp2 = load i32, i32* %c, align 4
> +  %add = add nsw i32 %sub, %tmp2
> +  switch i32 %add, label %sw.epilog [
> +    i32 1, label %sw.bb.1
> +    i32 2, label %sw.bb.2
> +  ]
> +
> +sw.bb.1:                                          ; preds = %entry
> +  store i32 111, i32* %b, align 4
> +  store i32 222, i32* %c, align 4
> +  br label %sw.epilog
> +
> +sw.bb.2:                                          ; preds = %entry
> +  store i32 333, i32* %b, align 4
> +  store i32 444, i32* %c, align 4
> +  br label %sw.epilog
> +
> +sw.epilog:                                        ; preds = %sw.bb.2, %sw.bb.1, %entry
> +  ret void
> +; CHECK-LABEL: test4:
> +; CHECK:	leaq arr1+4({{.*}}), [[REG2:%[a-z]+]]
> +; CHECK:	movl -4([[REG2]]), {{.*}}
> +; CHECK:	subl ([[REG2]]), {{.*}}
> +; CHECK:	addl 4([[REG2]]), {{.*}}
> +; CHECK:	movl ${{[1-4]+}}, ([[REG2]])
> +; CHECK:	movl ${{[1-4]+}}, 4([[REG2]])
> +; CHECK:	movl ${{[1-4]+}}, ([[REG2]])
> +; CHECK:	movl ${{[1-4]+}}, 4([[REG2]])
> +}
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits


