[PATCH] D13294: LEA code size optimization pass (Part 1): Remove redundant address recalculations

Fri Nov 6 17:22:41 PST 2015

qcolombet added a subscriber: qcolombet.
qcolombet added a comment.

Hi Andrey,

Thanks for working on this.

What is the compile time impact of that pass?

Right now, the scope of the pass is pretty narrow. For instance, it is basic block scope only whereas it should be MachineFunction scope.
Anyhow, I am fine with it if we have a plan to improve it, therefore the question:
What is the long term plan for this pass?

Finally, the added test case seems too small to cover all the code added. I may of course be wrong, it seems we are missing some case. In particular, please make sure to cover the cases where:

- The size of the displacement exceeds 4-bytes.
- The matching LEA is after the definition.
- The size of the displacement exceeds 1-bytes for a closer candidate.
- The function doesn’t have the midsize/optsize attribute.

Cheers,
-Quentin

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:42
@@ +41,3 @@
+
+  const char *getPassName() const override { return "X86 LEA Optimize"; }
+
----------------
That’s a bit strange to have that private.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:55
@@ +54,3 @@
+  /// Negative result means, that instructions occur in reverse order.
+  int calcInstrDist(MachineInstr *First, MachineInstr *Last);
+
----------------
Since you expect both instructions to be valid, put references.
This is also true for the rest of the class.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:59
@@ +58,3 @@
+  /// calculation in MI instruction.
+  bool chooseBestLEA(SmallVector<MachineInstr *, 16> &List, MachineInstr *MI,
+                     MachineInstr *&LEA, int64_t &AddrDispShift, int &Dist);
----------------
Use SmallVectorImpl to not have to repeat the number of elements everywhere.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:60
@@ +59,3 @@
+  bool chooseBestLEA(SmallVector<MachineInstr *, 16> &List, MachineInstr *MI,
+                     MachineInstr *&LEA, int64_t &AddrDispShift, int &Dist);
+
----------------
Please explain all the fields in the comment.
Also, shouldn’t List be const here?

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:68
@@ +67,3 @@
+  bool isSimilarMemOp(MachineInstr *MI1, unsigned N1, MachineInstr *MI2,
+                      unsigned N2, int64_t &AddrDispShift);
+
----------------
What are the unsigned argument used for?
In other words, please document how they are used.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:97
@@ +96,3 @@
+// Find the best LEA instruction in the List to replace address recalculation in
+// MI. Such LEA must meet these requirements:
+// 1) Address calculated by the LEA differs only by displacement from the
----------------
use \p MI.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:98
@@ +97,3 @@
+// MI. Such LEA must meet these requirements:
+// 1) Address calculated by the LEA differs only by displacement from the
+//    address used in MI.
----------------
*The* address… by *the* displacement…
I am not a native English speaker, but without the articles, I found the sentence strange.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:100
@@ +99,3 @@
+//    address used in MI.
+// 2) Class of the LEA def register doesn't conflict with class of MI
+//    address base register.
----------------
The register class of the definition of the LEA… is compatible?

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:104
@@ +103,3 @@
+// 4) The LEA should be as close to MI as possible, and prior to it if
+//    possible.
+bool OptimizeLEAPass::chooseBestLEA(SmallVector<MachineInstr *, 16> &List,
----------------
Why?
You didn’t define “best” for the LEA instruction, so I guess it may make sense, but I don’t see why now.

Also, after reading the code, looks like you bail on the first LEA that is past MI, so what is the strategy here?
I.e., what if the next LEA would have matched but not this one.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:132
@@ +131,3 @@
+        MRI->getRegClass(DefMI->getOperand(0).getReg()))
+      continue;
+
----------------
You can also constrain the register class to the intersection of both classes.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:172
@@ +171,3 @@
+  std::vector<int> IdenticalOpNums = {X86::AddrBaseReg, X86::AddrScaleAmt,
+                                      X86::AddrIndexReg, X86::AddrSegmentReg};
+  for (auto &N : IdenticalOpNums)
----------------
What are the advantages of the vector against a static const array?
Look into X86TransformInfo for instance.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:175
@@ +174,3 @@
+    if (!MI1->getOperand(N1 + N).isIdenticalTo(MI2->getOperand(N2 + N)))
+      return false;
+
----------------
Even if both operand are identical, that does not mean they carry the same value.
This is true for SSA variable, but physical register are not SSA.
You must check that the operand is not a physical register.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:220
@@ +219,3 @@
+
+    // Get a number of the first memory operand.
+    const MCInstrDesc &Desc = MI->getDesc();
----------------
the number

http://reviews.llvm.org/D13294