[PATCH] D28719: [NVPTX] Improve lowering of llvm.ctlz.

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 17 13:28:13 PST 2017

tra accepted this revision.
tra added inline comments.
This revision is now accepted and ready to land.

Comment at: llvm/lib/Target/NVPTX/NVPTXInstrInfo.td:2633
+def : Pat<(i32 (zext (ctlz Int16Regs:$a))),
+          (SUBi32ri (CLZr32 (CVT_u32_u16 Int16Regs:$a, CvtNONE)), 16)>;
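PTX has `mov.b32 %dest, {%src1, %src2}`, which packs two 16-bit values into one 32-bit register (%src1 in the low half, %src2 in the high half).
Instead of an explicit conversion followed by subtracting 16, perhaps we could do something like this: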
mov.b32 %t, {0xffff, %src}
clz.b32 %result, %t
I'm not sure whether it makes any difference in SASS, though.
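
For what it's worth, here is a small host-side C++ sketch that models both lowerings and compares them over every 16-bit input. It is only a model, not PTX or SASS: it assumes that `mov.b32 d, {a, b}` computes `d = a | (b << 16)` and that `clz.b32` returns 32 for a zero input. Under that packing rule the source has to sit in the upper half for the trick to work, so the model puts 0xffff in the lower half. The function names are made up for illustration and do not exist in the patch.

```cpp
#include <cstdint>

// Portable count-leading-zeros for a 32-bit value. Returns 32 for x == 0,
// which is the behaviour assumed here for PTX clz.b32.
uint32_t clz32(uint32_t x) {
    uint32_t n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

// Model of the pattern in the patch: zero-extend the 16-bit value to 32 bits
// (CVT_u32_u16), count leading zeros (CLZr32), then subtract 16.
uint32_t ctlz16_via_sub(uint16_t a) {
    return clz32(static_cast<uint32_t>(a)) - 16;
}

// Model of the pack-based alternative: the source goes into the upper 16 bits
// and 0xffff fills the lower 16 bits, so a zero source still yields 16.
uint32_t ctlz16_via_pack(uint16_t a) {
    uint32_t t = (static_cast<uint32_t>(a) << 16) | 0xffffu;
    return clz32(t);
}

// Exhaustively compare the two models over all 65536 possible inputs.
bool lowerings_agree() {
    for (uint32_t v = 0; v <= 0xffffu; ++v) {
        uint16_t a = static_cast<uint16_t>(v);
        if (ctlz16_via_sub(a) != ctlz16_via_pack(a))
            return false;
    }
    return true;
}
```

Calling `lowerings_agree()` checks that the two models match for every input, including zero. This says nothing about which form produces better SASS; that would still need to be checked on the generated code.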

