[PATCH] Add NVPTXPeephole pass to reduce unnecessary address cast
Xuetian Weng
xweng at google.com
Thu Jun 18 17:45:18 PDT 2015
================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:8
@@ +7,3 @@
+//
+// In NVPTX, we always use a special frame register which holds local address
+// of frame, NVPTXLowerAlloca may introduce a lot of cvta.to.local instructions,
----------------
eliben wrote:
> Slight rewording/grammar:
>
> // In NVPTX, we always use a special frame register which holds local address
> // of frame. NVPTXLowerAlloca may introduce a lot of cvta.to.local instructions.
> // This peephole pass optimizes these cases, for example
>
> also, I'd clarify a bit more what "holds local address" means
Expanded with more details.
================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:64
@@ +63,3 @@
+static bool isCVTAToLocalCombinationCandidate(MachineInstr& Root)
+{
+ auto &MBB = *Root.getParent();
----------------
jingyue wrote:
> style nit: we prefer putting { at the same line as the function header.
Fixed
================
Comment at: lib/Target/NVPTX/NVPTXPeephole.cpp:89
@@ +88,3 @@
+ auto &BaseAddrOp = GenericAddrDef->getOperand(1);
+ if (BaseAddrOp.getType() == MachineOperand::MO_FrameIndex) {
+ return true;
----------------
jingyue wrote:
> Could there be multiple frame indices? `fi#0`, `fi#1`, ...
Fixed
================
Comment at: test/CodeGen/NVPTX/call-with-alloca-buffer.ll:24
@@ -25,1 +23,3 @@
+; CHECK: mov.u64 %SPL
+; CHECK: cvta.local.u64 %SP, %SPL
----------------
jingyue wrote:
> I don't see `cvta.local.u64` removed in any tests, probably because %SP is still used. Is it worthwhile adding a test where the `cvta.local.u64` can indeed be removed, because that's the whole point of this peephole optimization?
A test case for this is added in local-stack-frame.ll
http://reviews.llvm.org/D10549
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
More information about the llvm-commits
mailing list