[PATCH] D18890: [AArch64] add SSA Load Store optimization pass

Jun Bum Lim via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 12 12:00:57 PDT 2016


junbuml added a comment.

Hi Jongwon,
Thank you for the update with the additional alias check. However, since you merge the second load/store up into the first one, I believe you need to check whether any instruction between the first and second load/store may alias with the second load/store, not just check the first store against the second load.

I'm also curious how this pass was motivated. Did you see any performance gain with this change?

Please see my inline comments for minor issues.


================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:21-29
@@ +20,11 @@
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachineInstr.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+#include "llvm/CodeGen/MachineRegisterInfo.h"
+#include "llvm/CodeGen/MachineTraceMetrics.h"
+#include "llvm/CodeGen/TargetSchedule.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Target/TargetInstrInfo.h"
+#include "llvm/IR/Constants.h"
+
----------------
Please remove header files that are not used in this file. For example:
  llvm/Support/raw_ostream.h 
  llvm/IR/Constants.h
  llvm/CodeGen/MachineTraceMetrics.h
  llvm/CodeGen/TargetSchedule.h

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:80
@@ +79,3 @@
+// Scaling factor for unscaled load or store.
+static int getMemScale(MachineInstr *MI) {
+  switch (MI->getOpcode()) {
----------------
It appears that you only handle STURWi, STRWui, LDRWui, and LDURWi in this patch, so I don't think you need to keep all the other cases.
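
Something like this sketch (assuming those four opcodes are the only ones that can reach this function) should be enough:

  static int getMemScale(MachineInstr *MI) {
    switch (MI->getOpcode()) {
    case AArch64::STRWui:
    case AArch64::STURWi:
    case AArch64::LDRWui:
    case AArch64::LDURWi:
      return 4; // All four are 32-bit accesses.
    default:
      llvm_unreachable("Unexpected opcode in getMemScale");
    }
  }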

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:182-184
@@ +181,5 @@
+    if (Offset % 2)
+      return UnscaledOffset < 256 && UnscaledOffset >= -256;
+    else
+      return UnscaledOffset <= 16380 && UnscaledOffset >= 0;
+  }
----------------
Is there a programmatic way to express the meaning of the constants used here instead of hard-coding them?
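
For example, named constants would at least make the encoding limits explicit. A sketch (the names below are illustrative, not existing helpers; the limits follow the AArch64 encodings: a signed 9-bit immediate for unscaled LDUR/STUR, and an unsigned 12-bit scaled immediate for LDR/STR, so 4095 * 4 == 16380 for the 4-byte memory scale):

  static const int MinUnscaledOffset = -256; // LDUR/STUR: signed 9-bit imm.
  static const int MaxUnscaledOffset = 255;
  static const int MaxScaledImm = 4095;      // LDR/STR: unsigned 12-bit imm.

  if (Offset % 2)
    return UnscaledOffset >= MinUnscaledOffset &&
           UnscaledOffset <= MaxUnscaledOffset;
  return UnscaledOffset >= 0 && UnscaledOffset <= MaxScaledImm * MemScale;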

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:190
@@ +189,3 @@
+
+// Check if two loads/stores have consecutive memory accesses.
+bool AArch64SSALoadStoreOpt::isConsecutive(
----------------
It would be good to add a comment noting that this function also inserts the elements into the SmallVector in access order.
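
For example, the function comment could read something like:

  // Check if two loads/stores have consecutive memory accesses. On success,
  // the two instructions are also pushed into \p MIs in ascending address
  // order (the lower-address access first).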

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:195
@@ +194,3 @@
+  bool FirstMIIsUnscaled = TII->isUnscaledLdSt(FirstMI);
+  unsigned FirstMIBaseReg = getLdStBaseOp(FirstMI).getReg();
+  int FirstMIOffset = getLdStOffsetOp(FirstMI).getImm();
----------------
It would be good to make sure that only the instructions this function expects are passed in, by adding an assert() or some checks.
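
A sketch, assuming the four opcodes handled by this patch are the only legal inputs (the same check would apply to SecondMI):

  assert((FirstMI->getOpcode() == AArch64::STRWui ||
          FirstMI->getOpcode() == AArch64::STURWi ||
          FirstMI->getOpcode() == AArch64::LDRWui ||
          FirstMI->getOpcode() == AArch64::LDURWi) &&
         "Unexpected opcode in isConsecutive");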

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:219-228
@@ +218,12 @@
+
+  if (FirstMIBaseReg == SecondMIBaseReg &&
+      ((FirstMIOffset == SecondMIOffset + FirstMIOffsetStride) ||
+       (FirstMIOffset + FirstMIOffsetStride == SecondMIOffset))) {
+    if (FirstMIOffset < SecondMIOffset) {
+      MIs.push_back(FirstMI);
+      MIs.push_back(SecondMI);
+    } else {
+      MIs.push_back(SecondMI);
+      MIs.push_back(FirstMI);
+    }
+
----------------
Looks like you can simplify these if statements. For example, "FirstMIBaseReg == SecondMIBaseReg" could be checked earlier with an early return, and checking both "FirstMIOffset + FirstMIOffsetStride == SecondMIOffset" and "FirstMIOffset < SecondMIOffset" is redundant.
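
For example (a sketch keeping the patch's stride logic):

  if (FirstMIBaseReg != SecondMIBaseReg)
    return false;
  if (FirstMIOffset + FirstMIOffsetStride == SecondMIOffset) {
    // FirstMI accesses the lower address.
    MIs.push_back(FirstMI);
    MIs.push_back(SecondMI);
  } else if (FirstMIOffset == SecondMIOffset + FirstMIOffsetStride) {
    // SecondMI accesses the lower address.
    MIs.push_back(SecondMI);
    MIs.push_back(FirstMI);
  } else {
    return false;
  }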

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:239
@@ +238,3 @@
+void AArch64SSALoadStoreOpt::buildWideLdStInst(MachineInstr *MI,
+                                               MachineInstr *MergeMI) {
+  MachineBasicBlock *MBB = MI->getParent();
----------------
It would be good to make sure that MI and MergeMI are the instructions this function expects, as above.

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:242
@@ +241,3 @@
+  MachineOperand &RegOp = MI->getOperand(0);
+  unsigned Reg = MI->getOperand(0).getReg();
+  const MachineOperand &BaseRegOp = MI->getOperand(1);
----------------
unsigned Reg = RegOp.getReg();

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:261
@@ +260,3 @@
+  MachineBasicBlock::iterator InsertionPoint;
+  getPosition(InsertionPoint, *MBB, *MI);
+
----------------
Can't you just do:
 InsertionPoint = MI;

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:269
@@ +268,3 @@
+  default:
+    assert(false && "Unexpected MI's opcode.");
+    break;
----------------
llvm_unreachable("Unexpected MI's opcode.");

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:272-275
@@ +271,6 @@
+  case AArch64::LDRWui:
+    if (ShouldBeUnscaled)
+      NewOpc = AArch64::LDURXi;
+    else
+      NewOpc = AArch64::LDRXui;
+    break;
----------------
NewOpc = ShouldBeUnscaled ? AArch64::LDURXi : AArch64::LDRXui;

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:330
@@ +329,3 @@
+  // operand.
+  if (!getLdStBaseOp(FirstMI).isReg() || !getLdStOffsetOp(FirstMI).isImm())
+    return false;
----------------
Should this be an assert() since you specifically handle AArch64::STURWi and AArch64::STRWui?

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:349
@@ +348,3 @@
+  // operand.
+  if (!getLdStBaseOp(&DefInst).isReg() || !getLdStOffsetOp(&DefInst).isImm())
+    return false;
----------------
Same here. I think this should be an assert() since you specifically handle AArch64::LDRWui and AArch64::LDURWi.

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:356-357
@@ +355,4 @@
+
+  SmallVector<MachineInstr *, 2> consecutive_stores;
+  SmallVector<MachineInstr *, 2> consecutive_loads;
+
----------------
ConsecutiveStores and ConsecutiveLoads would be better names (LLVM variable names use CamelCase).

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:359
@@ +358,3 @@
+
+  // Find consecutive store for FirstMI.
+  for (; MBBI != E; ++MBBI) {
----------------
You may want to do MBBI++ here instead of the MBBI++ in line 326, or you could use a do/while loop.

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:369
@@ +368,3 @@
+    // operand.
+    if (!getLdStBaseOp(MI).isReg() || !getLdStOffsetOp(MI).isImm())
+      return false;
----------------
Maybe an assert() here as well, since you specifically detect STURWi and STRWui.

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:413-420
@@ +412,10 @@
+
+      // If consecutive loads/stores are found, the followings are satisfied.
+      //  dst reg of first load = src reg of first store
+      //  dst reg of second load = src reg of second store
+      if (getLdStRegOp(consecutive_loads[0]).getReg() !=
+              getLdStRegOp(consecutive_stores[0]).getReg() ||
+          getLdStRegOp(consecutive_loads[1]).getReg() !=
+              getLdStRegOp(consecutive_stores[1]).getReg())
+        return false;
+
----------------
Do we need to check this again? Or should this maybe be an assert()?

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:424
@@ +423,3 @@
+      // instruction.
+      if (instrAliased(consecutive_stores[0], consecutive_loads[1])) {
+        DEBUG(dbgs() << "The first store instruction:\n    ");
----------------
Don't you also need to check whether any instruction between the first and second load aliases with the second load? The same check should be performed for the stores.
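
A sketch of such a scan between the two loads, assuming both are in the same basic block (instrAliased is the helper already used in this patch):

  MachineBasicBlock::iterator I = consecutive_loads[0],
                              E = consecutive_loads[1];
  for (++I; I != E; ++I)
    if (instrAliased(&*I, consecutive_loads[1]))
      return false; // Widening would move the load across an aliasing access.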

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:454-455
@@ +453,4 @@
+  //    write the values of the consecutive 32-bit loads. Transform the
+  //    loads/stores
+  //    to 64-bit load/store.
+  //
----------------
Please change:
   //    loads/stores
   //    to 64-bit load/store.
into:
  //    loads/stores to 64-bit load/store.

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:480-481
@@ +479,4 @@
+  //    When the original load/store is scaled(ldr/str) and it has an odd offset
+  //    value,
+  //    it can be widened if (-256 <= unscaled offset value < 256) is satisfied.
+  //    cf.) unscaled offset value = scaled offset value * memory scale size
----------------
Please use the full 80-column line width.

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:493-494
@@ +492,4 @@
+  std::set<MachineInstr *> StoresWillBeDeleted;
+  for (MachineBasicBlock::iterator MBBI = MBB.begin(), E = MBB.end(); MBBI != E;
+       ++MBBI) {
+    MachineInstr *MI = MBBI;
----------------
Can you change this to a range-based loop?
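
For example (note that MI then becomes a reference rather than an iterator):

  for (MachineInstr &MI : MBB) {
    ...
  }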

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:523
@@ +522,3 @@
+  TII = static_cast<const AArch64InstrInfo *>(Subtarget->getInstrInfo());
+  TRI = MF.getSubtarget().getRegisterInfo();
+  MRI = &MF.getRegInfo();
----------------
Please remove TRI if it is not used here.


http://reviews.llvm.org/D18890




