<div dir="ltr"><div>Hi Philip,</div><div> </div><div>I have no plans to do this in the near future. Your idea sounds interesting, however please consider the following:</div><div><br></div><div>The reason why this particular part of the pass was enabled only for –Oz is that I got small performance drop on Spec 2000 at –Os on Atom because of it while the code size improve was even smaller. It’s not that I have some proof that the transformation is mostly bad for performance (actually I believe otherwise - execution of complex LEAs like ‘lea 0x12345678(%rax, %rbx, 4), %rcx’ is usually costly so reducing their number is a potential gain), I just didn’t want to risk in favour of insignificant code size gain and since my primary concern was code size I didn’t do any performance analysis.</div><div>So just note that restricting removing redundant LEAs to –Oz is not a solid solution and there may be some changes in that in the future. Or may not :)</div><div>And if you have cases where removing redundant LEAs really helps code size without hurting performance I think it would make sense to enable it at –Os as well.</div><div><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Mar 26, 2016 at 12:24 AM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I notice that this is structured as a per-basic block action, but is only enabled if the entire function is marked Oz. Are there any plans to use block profiling to enable this in cold blocks on a non-Oz function? I'd be very interested in seeing that happen.<br>
<br>
p.s. Sorry to revive a zombie thread; I came across this change due to the presentation at EuroLLVM which mentioned it.<br>
<br>
Philip<div class="HOEnZb"><div class="h5"><br>
<br>
On 01/13/2016 03:30 AM, Andrey Turetskiy via llvm-commits wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Author: aturetsk<br>
Date: Wed Jan 13 05:30:44 2016<br>
New Revision: 257589<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=257589&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=257589&view=rev</a><br>
Log:<br>
LEA code size optimization pass (Part 2): Remove redundant LEA instructions.<br>
<br>
Make the x86 OptimizeLEAs pass remove an LEA instruction if there is another LEA<br>
(in the same basic block) which calculates an address differing only by a<br>
displacement. Works only for -Oz.<br>
<br>
Differential Revision: <a href="http://reviews.llvm.org/D13295" rel="noreferrer" target="_blank">http://reviews.llvm.org/D13295</a><br>
<br>
<br>
Modified:<br>
llvm/trunk/lib/Target/X86/X86.h<br>
llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp<br>
llvm/trunk/test/CodeGen/X86/lea-opt.ll<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.h?rev=257589&r1=257588&r2=257589&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.h?rev=257589&r1=257588&r2=257589&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86.h (original)<br>
+++ llvm/trunk/lib/Target/X86/X86.h Wed Jan 13 05:30:44 2016<br>
@@ -54,7 +54,8 @@ FunctionPass *createX86PadShortFunctions<br>
/// instructions, in order to eliminate execution delays in some processors.<br>
FunctionPass *createX86FixupLEAs();<br>
-/// Return a pass that removes redundant address recalculations.<br>
+/// Return a pass that removes redundant LEA instructions and redundant address<br>
+/// recalculations.<br>
FunctionPass *createX86OptimizeLEAs();<br>
/// Return a pass that optimizes the code-size of x86 call sequences. This is<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp?rev=257589&r1=257588&r2=257589&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp?rev=257589&r1=257588&r2=257589&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp (original)<br>
+++ llvm/trunk/lib/Target/X86/X86OptimizeLEAs.cpp Wed Jan 13 05:30:44 2016<br>
@@ -9,8 +9,10 @@<br>
//<br>
// This file defines the pass that performs some optimizations with LEA<br>
// instructions in order to improve code size.<br>
-// Currently, it does one thing:<br>
-// 1) Address calculations in load and store instructions are replaced by<br>
+// Currently, it does two things:<br>
+// 1) If there are two LEA instructions calculating addresses which only differ<br>
+// by displacement inside a basic block, one of them is removed.<br>
+// 2) Address calculations in load and store instructions are replaced by<br>
// existing LEA def registers where possible.<br>
//<br>
//===----------------------------------------------------------------------===//<br>
@@ -38,6 +40,7 @@ static cl::opt<bool> EnableX86LEAOpt("en<br>
cl::init(false));<br>
STATISTIC(NumSubstLEAs, "Number of LEA instruction substitutions");<br>
+STATISTIC(NumRedundantLEAs, "Number of redundant LEA instructions removed");<br>
namespace {<br>
class OptimizeLEAPass : public MachineFunctionPass {<br>
@@ -71,6 +74,13 @@ private:<br>
/// \brief Returns true if the instruction is LEA.<br>
bool isLEA(const MachineInstr &MI);<br>
+ /// \brief Returns true if the \p Last LEA instruction can be replaced by the<br>
+ /// \p First. The difference between displacements of the addresses calculated<br>
+ /// by these LEAs is returned in \p AddrDispShift. It'll be used for proper<br>
+ /// replacement of the \p Last LEA's uses with the \p First's def register.<br>
+ bool isReplaceable(const MachineInstr &First, const MachineInstr &Last,<br>
+ int64_t &AddrDispShift);<br>
+<br>
/// \brief Returns true if two instructions have memory operands that only<br>
/// differ by displacement. The numbers of the first memory operands for both<br>
/// instructions are specified through \p N1 and \p N2. The address<br>
@@ -88,6 +98,9 @@ private:<br>
/// \brief Removes redundant address calculations.<br>
bool removeRedundantAddrCalc(const SmallVectorImpl<MachineInstr *> &List);<br>
+ /// \brief Removes LEAs which calculate similar addresses.<br>
+ bool removeRedundantLEAs(SmallVectorImpl<MachineInstr *> &List);<br>
+<br>
DenseMap<const MachineInstr *, unsigned> InstrPos;<br>
MachineRegisterInfo *MRI;<br>
@@ -194,6 +207,69 @@ bool OptimizeLEAPass::isLEA(const Machin<br>
Opcode == X86::LEA64r || Opcode == X86::LEA64_32r;<br>
}<br>
+// Check that the Last LEA can be replaced by the First LEA. To be so,<br>
+// these requirements must be met:<br>
+// 1) Addresses calculated by LEAs differ only by displacement.<br>
+// 2) Def registers of LEAs belong to the same class.<br>
+// 3) All uses of the Last LEA def register are replaceable, thus the<br>
+// register is used only as address base.<br>
+bool OptimizeLEAPass::isReplaceable(const MachineInstr &First,<br>
+ const MachineInstr &Last,<br>
+ int64_t &AddrDispShift) {<br>
+ assert(isLEA(First) && isLEA(Last) &&<br>
+ "The function works only with LEA instructions");<br>
+<br>
+ // Compare instructions' memory operands.<br>
+ if (!isSimilarMemOp(Last, 1, First, 1, AddrDispShift))<br>
+ return false;<br>
+<br>
+ // Make sure that LEA def registers belong to the same class. There may be<br>
+ // instructions (like MOV8mr_NOREX) which allow a limited set of registers to<br>
+ // be used as their operands, so we must be sure that replacing one LEA<br>
+ // with another won't lead to putting a wrong register in the instruction.<br>
+ if (MRI->getRegClass(First.getOperand(0).getReg()) !=<br>
+ MRI->getRegClass(Last.getOperand(0).getReg()))<br>
+ return false;<br>
+<br>
+ // Loop over all uses of the Last LEA to check that its def register is<br>
+ // used only as address base for memory accesses. If so, it can be<br>
+ // replaced, otherwise - no.<br>
+ for (auto &MO : MRI->use_operands(Last.getOperand(0).getReg())) {<br>
+ MachineInstr &MI = *MO.getParent();<br>
+<br>
+ // Get the number of the first memory operand.<br>
+ const MCInstrDesc &Desc = MI.getDesc();<br>
+ int MemOpNo = X86II::getMemoryOperandNo(Desc.TSFlags, MI.getOpcode());<br>
+<br>
+ // If the use instruction has no memory operand - the LEA is not<br>
+ // replaceable.<br>
+ if (MemOpNo < 0)<br>
+ return false;<br>
+<br>
+ MemOpNo += X86II::getOperandBias(Desc);<br>
+<br>
+ // If the address base of the use instruction is not the LEA def register -<br>
+ // the LEA is not replaceable.<br>
+ if (!isIdenticalOp(MI.getOperand(MemOpNo + X86::AddrBaseReg), MO))<br>
+ return false;<br>
+<br>
+ // If the LEA def register is used as any other operand of the use<br>
+ // instruction - the LEA is not replaceable.<br>
+ for (unsigned i = 0; i < MI.getNumOperands(); i++)<br>
+ if (i != (unsigned)(MemOpNo + X86::AddrBaseReg) &&<br>
+ isIdenticalOp(MI.getOperand(i), MO))<br>
+ return false;<br>
+<br>
+ // Check that the new address displacement will fit 4 bytes.<br>
+ if (MI.getOperand(MemOpNo + X86::AddrDisp).isImm() &&<br>
+ !isInt<32>(MI.getOperand(MemOpNo + X86::AddrDisp).getImm() +<br>
+ AddrDispShift))<br>
+ return false;<br>
+ }<br>
+<br>
+ return true;<br>
+}<br>
+<br>
// Check if MI1 and MI2 have memory operands which represent addresses that<br>
// differ only by displacement.<br>
bool OptimizeLEAPass::isSimilarMemOp(const MachineInstr &MI1, unsigned N1,<br>
@@ -316,6 +392,81 @@ bool OptimizeLEAPass::removeRedundantAdd<br>
return Changed;<br>
}<br>
+// Try to find similar LEAs in the list and replace one with another.<br>
+bool<br>
+OptimizeLEAPass::removeRedundantLEAs(SmallVectorImpl<MachineInstr *> &List) {<br>
+ bool Changed = false;<br>
+<br>
+ // Loop over all LEA pairs.<br>
+ auto I1 = List.begin();<br>
+ while (I1 != List.end()) {<br>
+ MachineInstr &First = **I1;<br>
+ auto I2 = std::next(I1);<br>
+ while (I2 != List.end()) {<br>
+ MachineInstr &Last = **I2;<br>
+ int64_t AddrDispShift;<br>
+<br>
+ // LEAs should be in occurence order in the list, so we can freely<br>
+ // replace later LEAs with earlier ones.<br>
+ assert(calcInstrDist(First, Last) > 0 &&<br>
+ "LEAs must be in occurence order in the list");<br>
+<br>
+ // Check that the Last LEA instruction can be replaced by the First.<br>
+ if (!isReplaceable(First, Last, AddrDispShift)) {<br>
+ ++I2;<br>
+ continue;<br>
+ }<br>
+<br>
+ // Loop over all uses of the Last LEA and update their operands. Note that<br>
+ // the correctness of this has already been checked in the isReplaceable<br>
+ // function.<br>
+ for (auto UI = MRI->use_begin(Last.getOperand(0).getReg()),<br>
+ UE = MRI->use_end();<br>
+ UI != UE;) {<br>
+ MachineOperand &MO = *UI++;<br>
+ MachineInstr &MI = *MO.getParent();<br>
+<br>
+ // Get the number of the first memory operand.<br>
+ const MCInstrDesc &Desc = MI.getDesc();<br>
+ int MemOpNo = X86II::getMemoryOperandNo(Desc.TSFlags, MI.getOpcode()) +<br>
+ X86II::getOperandBias(Desc);<br>
+<br>
+ // Update address base.<br>
+ MO.setReg(First.getOperand(0).getReg());<br>
+<br>
+ // Update address disp.<br>
+ MachineOperand *Op = &MI.getOperand(MemOpNo + X86::AddrDisp);<br>
+ if (Op->isImm())<br>
+ Op->setImm(Op->getImm() + AddrDispShift);<br>
+ else if (Op->isGlobal())<br>
+ Op->setOffset(Op->getOffset() + AddrDispShift);<br>
+ else<br>
+ llvm_unreachable("Invalid address displacement operand");<br>
+ }<br>
+<br>
+ // Since we can possibly extend register lifetime, clear kill flags.<br>
+ MRI->clearKillFlags(First.getOperand(0).getReg());<br>
+<br>
+ ++NumRedundantLEAs;<br>
+ DEBUG(dbgs() << "OptimizeLEAs: Remove redundant LEA: "; Last.dump(););<br>
+<br>
+ // By this moment, all of the Last LEA's uses must be replaced. So we can<br>
+ // freely remove it.<br>
+ assert(MRI->use_empty(Last.getOperand(0).getReg()) &&<br>
+ "The LEA's def register must have no uses");<br>
+ Last.eraseFromParent();<br>
+<br>
+ // Erase removed LEA from the list.<br>
+ I2 = List.erase(I2);<br>
+<br>
+ Changed = true;<br>
+ }<br>
+ ++I1;<br>
+ }<br>
+<br>
+ return Changed;<br>
+}<br>
+<br>
bool OptimizeLEAPass::runOnMachineFunction(MachineFunction &MF) {<br>
bool Changed = false;<br>
@@ -339,6 +490,11 @@ bool OptimizeLEAPass::runOnMachineFuncti<br>
if (LEAs.empty())<br>
continue;<br>
+ // Remove redundant LEA instructions. The optimization may have a negative<br>
+ // effect on performance, so do it only for -Oz.<br>
+ if (MF.getFunction()->optForMinSize())<br>
+ Changed |= removeRedundantLEAs(LEAs);<br>
+<br>
// Remove redundant address calculations.<br>
Changed |= removeRedundantAddrCalc(LEAs);<br>
}<br>
<br>
Modified: llvm/trunk/test/CodeGen/X86/lea-opt.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/lea-opt.ll?rev=257589&r1=257588&r2=257589&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/lea-opt.ll?rev=257589&r1=257588&r2=257589&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/X86/lea-opt.ll (original)<br>
+++ llvm/trunk/test/CodeGen/X86/lea-opt.ll Wed Jan 13 05:30:44 2016<br>
@@ -129,3 +129,41 @@ sw.epilog:<br>
; CHECK: movl ${{[1-4]+}}, ([[REG2]])<br>
; CHECK: movl ${{[1-4]+}}, ([[REG3]])<br>
}<br>
+<br>
+define void @test4(i64 %x) nounwind minsize {<br>
+entry:<br>
+ %a = getelementptr inbounds [65 x %struct.anon1], [65 x %struct.anon1]* @arr1, i64 0, i64 %x, i32 0<br>
+ %tmp = load i32, i32* %a, align 4<br>
+ %b = getelementptr inbounds [65 x %struct.anon1], [65 x %struct.anon1]* @arr1, i64 0, i64 %x, i32 1<br>
+ %tmp1 = load i32, i32* %b, align 4<br>
+ %sub = sub i32 %tmp, %tmp1<br>
+ %c = getelementptr inbounds [65 x %struct.anon1], [65 x %struct.anon1]* @arr1, i64 0, i64 %x, i32 2<br>
+ %tmp2 = load i32, i32* %c, align 4<br>
+ %add = add nsw i32 %sub, %tmp2<br>
+ switch i32 %add, label %sw.epilog [<br>
+ i32 1, label %sw.bb.1<br>
+ i32 2, label %sw.bb.2<br>
+ ]<br>
+<br>
+sw.bb.1: ; preds = %entry<br>
+ store i32 111, i32* %b, align 4<br>
+ store i32 222, i32* %c, align 4<br>
+ br label %sw.epilog<br>
+<br>
+sw.bb.2: ; preds = %entry<br>
+ store i32 333, i32* %b, align 4<br>
+ store i32 444, i32* %c, align 4<br>
+ br label %sw.epilog<br>
+<br>
+sw.epilog: ; preds = %sw.bb.2, %sw.bb.1, %entry<br>
+ ret void<br>
+; CHECK-LABEL: test4:<br>
+; CHECK: leaq arr1+4({{.*}}), [[REG2:%[a-z]+]]<br>
+; CHECK: movl -4([[REG2]]), {{.*}}<br>
+; CHECK: subl ([[REG2]]), {{.*}}<br>
+; CHECK: addl 4([[REG2]]), {{.*}}<br>
+; CHECK: movl ${{[1-4]+}}, ([[REG2]])<br>
+; CHECK: movl ${{[1-4]+}}, 4([[REG2]])<br>
+; CHECK: movl ${{[1-4]+}}, ([[REG2]])<br>
+; CHECK: movl ${{[1-4]+}}, 4([[REG2]])<br>
+}<br>
<br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>
</blockquote>
<br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature">Best regards,<br>Andrey Turetskiy</div>
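<div><br></div><div>P.S. To make the discussion above a bit more concrete, here is a hand-written sketch (not actual output of the pass) of the kind of LEA pair the new code removes: two LEAs whose addresses differ only by a displacement, where the second one is deleted and the +4 is folded into the memory operands of its users. The registers, the scale and the symbol name are made up for illustration:</div><div><br></div><div># before: two LEAs computing addresses that differ only by the displacement (+4)<br>leaq arr1(,%rdi,8), %rcx<br>leaq arr1+4(,%rdi,8), %rdx<br>movl (%rcx), %eax<br>addl (%rdx), %eax<br><br># after: the second LEA is removed; its use is rebased on %rcx with the displacement folded in<br>leaq arr1(,%rdi,8), %rcx<br>movl (%rcx), %eax<br>addl 4(%rcx), %eax</div>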
</div></div>