[PATCH] Add NVPTXPeephole pass to reduce unnecessary address cast

Jingyue Wu jingyue at google.com
Thu Jun 18 16:07:51 PDT 2015


================
Comment at: lib/Target/NVPTX/NVPTXFrameLowering.cpp:58
@@ -61,3 +57,3 @@
                   MF.getSubtarget().getInstrInfo()->get(NVPTX::cvta_local_yes),
-                  NVPTX::VRFrame).addReg(LocalReg);
+                  NVPTX::VRFrame).addReg(NVPTX::VRFrameLocal);
       BuildMI(MBB, MI, dl,
----------------
Is `NVPTX::VRFrameLocal` 32-bit or 64-bit? You use it for both 32-bit and 64-bit. Does that matter? 

================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:64
@@ +63,3 @@
+static bool isCVTAToLocalCombinationCandidate(MachineInstr& Root)
+{
+  auto &MBB = *Root.getParent();
----------------
style nit: we prefer putting { at the same line as the function header. 

================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:81
@@ +80,3 @@
+  if (!GenericAddrDef
+   || GenericAddrDef->getParent() != &MBB
+   || (GenericAddrDef->getOpcode() != NVPTX::LEA_ADDRi64 &&
----------------
The indentation looks strange. Did you run clang-format? 

================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:89
@@ +88,3 @@
+  auto &BaseAddrOp = GenericAddrDef->getOperand(1);
+  if (BaseAddrOp.getType() == MachineOperand::MO_FrameIndex) {
+    return true;
----------------
Could there be multiple frame indices? `fi#0`, `fi#1`, ...

================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:97
@@ +96,3 @@
+static void CombineCVTAToLocal(MachineInstr& Root)
+{
+  auto &MBB = *Root.getParent();
----------------
again, move { up one line. 

================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:109
@@ +108,3 @@
+
+  MBB.insert((MachineBasicBlock::iterator) &Root, (MachineInstr *) MIB);
+
----------------
The type casts seem unnecessary, aren't they? 

================
Comment at: test/CodeGen/NVPTX/call-with-alloca-buffer.ll:24
@@ -25,1 +23,3 @@
+; CHECK: mov.u64 %SPL
+; CHECK: cvta.local.u64 %SP, %SPL
 
----------------
I don't see `cvta.local.u64` removed in any tests, probably because %SP is still used. Is it worthwhile adding a test where the `cvta.local.u64` can indeed be removed, because that's the whole point of this peephole optimization?

http://reviews.llvm.org/D10549

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list