[PATCH] D18890: [AArch64] add SSA Load Store optimization pass

Fri Apr 8 06:53:29 PDT 2016

rengolin added a comment.

Hi Jongwon,

The idea is interesting, but I'm not an expert on this, so I'll let James have a more thorough look, but I do have some questions.

Don't we already have a load/store optimisation pass? Why have you decided to add a new pass, rather than a new step on the current pass?

Also, you seem to have written code for a large number of cases, with corner cases (including offset < |256|), and there aren't enough tests to cover all the cases. Can you please add more to make sure both basic and corner cases are covered?

Thanks,
--renato


================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:71
@@ +70,3 @@
+// Scaling factor for unscaled load or store.
+static int getMemScale(MachineInstr *MI) {
+  switch (MI->getOpcode()) {
----------------
This functionality seems to match with TII's isUnscaledLdSt(), maybe it could be merged there?

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:168
@@ +167,3 @@
+  int FirstMIOffset = getLdStOffsetOp(FirstMI).getImm();
+  int FirstMIOffsetStride = FirstMIIsUnscaled ? getMemScale(FirstMI) : 1;
+
----------------
You have added the ops with scale 1, and an assert to make sure you're passing the right opcode. Shouldn't you just call getMemScale here and use it as an opcode check, and set the flag with (Stride == 1)?

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:286
@@ +285,3 @@
+    MachineBasicBlock::iterator &I, std::set<MachineInstr *> &stores,
+    std::set<MachineInstr *> &loads) {
+  MachineBasicBlock::iterator E = I->getParent()->end();
----------------
The names of the variables 'loads' and 'stores' is not very explanatory. Maybe call them deletedLoads/Stores? old?

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:407
@@ +406,3 @@
+  bool Modified = false;
+  // 1) Find consecutive two 32-bit loads and consecutive two 32-bit stores that
+  //    write the values of the consecutive 32-bit loads. Transform the
----------------
This comment is in the wrong place. It should be at a higher level, where people can actually read it before going through the code.

================
Comment at: lib/Target/AArch64/AArch64TargetMachine.cpp:114
@@ +113,3 @@
+                                   " optimization pass in SSA form"),
+                          cl::init(true), cl::Hidden);
+
----------------
Maybe not leave it on by default as a first approach?

================
Comment at: test/CodeGen/AArch64/ldst-opt.ll:1125
@@ -1124,3 +1124,3 @@
   %i = phi i64 [ %dec.i, %for.body], [ %count, %0 ]
-  %gep1 = getelementptr i32, i32* %phi1, i64 -1
+  %gep1 = getelementptr i32, i32* %phi1, i64 -3
   %load1 = load i32, i32* %gep1
----------------
Can you explain this change? Looks suspicious...

================
Comment at: test/CodeGen/AArch64/ssa-ldst-opt.ll:1
@@ +1,2 @@
+; RUN: llc -march=aarch64 -verify-machineinstrs -asm-verbose=false -o - %s | FileCheck %s
+
----------------
It's good to add the 'aarch64-ssa-load-store-opt' flag, even if it ends up enabled by default for two reasons:

1. It'll document that this is what the test is about, so people can easily find in the code what the change was.

2. If it ever ends up disabled by default, the tests will not break.


http://reviews.llvm.org/D18890