[PATCH] D28794: [NVPTX] Upgrade NVVM intrinsics in InstCombineCalls.

Justin Lebar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 16 23:18:43 PST 2017

jlebar created this revision.

There are many NVVM intrinsics that we can't entirely get rid of, but
that nonetheless often correspond to target-generic LLVM intrinsics.

For example, if flush-denormals-to-zero (ftz) is enabled, we can convert
@llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32.  On the other hand, if ftz is
disabled, we can't do this, because @llvm.ceil.f32 will be lowered to a
non-ftz PTX instruction.  In that case we can instead simplify the
non-ftz nvvm ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32.
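
To make the rule for the ceil example concrete, here is a minimal
standalone C++ sketch of the decision described above.  It is only a
model, not the code in the patch: the helper name upgradeNvvmCeil and
the idea of passing the ftz mode as a plain bool are assumptions made
for illustration.  The case of the non-ftz intrinsic under ftz is not
spelled out above; the sketch leaves it alone on the assumption that
@llvm.ceil.f32 would then be lowered to an ftz instruction.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of LLVM): given the name of an NVVM ceil
// intrinsic and whether ftz is enabled for the calling function, return the
// target-generic intrinsic it may be replaced with, or nullopt if the call
// must be left alone.
std::optional<std::string> upgradeNvvmCeil(std::string_view intrinsic,
                                           bool ftzEnabled) {
  // The ftz variant matches @llvm.ceil.f32 only when ftz is enabled,
  // because only then is the generic intrinsic lowered to an ftz
  // PTX instruction.
  if (intrinsic == "llvm.nvvm.ceil.ftz.f" && ftzEnabled)
    return std::string("llvm.ceil.f32");

  // The non-ftz variant matches @llvm.ceil.f32 only when ftz is disabled,
  // because only then is the generic intrinsic lowered to a non-ftz
  // PTX instruction.
  if (intrinsic == "llvm.nvvm.ceil.f" && !ftzEnabled)
    return std::string("llvm.ceil.f32");

  // Mismatched ftz mode (or an intrinsic this sketch does not know about):
  // keep the NVVM intrinsic as is.
  return std::nullopt;
}
```

In this model, upgradeNvvmCeil("llvm.nvvm.ceil.ftz.f", false) returns
no replacement, matching the "we can't do this" case above.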

These transformations are particularly useful because they let us
constant fold instructions that appear in libdevice, the bitcode library
that ships with CUDA and essentially functions as its libm.



-------------- next part --------------
A non-text attachment was scrubbed...
Name: D28794.84634.patch
Type: text/x-patch
Size: 27717 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170117/1c2594a7/attachment.bin>
