[PATCH] Add NVPTXPeephole pass to reduce unnecessary address cast

Xuetian Weng xweng at google.com
Thu Jun 18 17:45:18 PDT 2015


================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:8
@@ +7,3 @@
+//
+// In NVPTX, we always use a special frame register which holds local address
+// of frame, NVPTXLowerAlloca may introduce a lot of cvta.to.local instructions,
----------------
eliben wrote:
> Slight rewording/grammar:
> 
> // In NVPTX, we always use a special frame register which holds local address
> // of frame. NVPTXLowerAlloca may introduce a lot of cvta.to.local instructions.
> // This peephole pass optimizes these cases, for example
> 
> also, I'd clarify a bit more what "holds local address" means
Expanded with more details.

================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:64
@@ +63,3 @@
+static bool isCVTAToLocalCombinationCandidate(MachineInstr& Root)
+{
+  auto &MBB = *Root.getParent();
----------------
jingyue wrote:
> style nit: we prefer putting { at the same line as the function header. 
Fixed

================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:89
@@ +88,3 @@
+  auto &BaseAddrOp = GenericAddrDef->getOperand(1);
+  if (BaseAddrOp.getType() == MachineOperand::MO_FrameIndex) {
+    return true;
----------------
jingyue wrote:
> Could there be multiple frame indices? `fi#0`, `fi#1`, ...
Fixed

================
Comment at: test/CodeGen/NVPTX/call-with-alloca-buffer.ll:24
@@ -25,1 +23,3 @@
+; CHECK: mov.u64 %SPL
+; CHECK: cvta.local.u64 %SP, %SPL
 
----------------
jingyue wrote:
> I don't see `cvta.local.u64` removed in any tests, probably because %SP is still used. Is it worthwhile adding a test where the `cvta.local.u64` can indeed be removed, because that's the whole point of this peephole optimization? 
A test case for this is added in local-stack-frame.ll

http://reviews.llvm.org/D10549

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list