[PATCH] D28794: [NVPTX] Upgrade NVVM intrinsics in InstCombineCalls.
Justin Lebar via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 16 23:18:43 PST 2017
jlebar created this revision.
There are many NVVM intrinsics that we can't entirely get rid of, but
that nonetheless often correspond to target-generic LLVM intrinsics.
For example, if flush-denormals-to-zero (ftz) is enabled, we can convert
@llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32. If ftz is disabled, we can't do
this, because @llvm.ceil.f32 will be lowered to a non-ftz PTX instruction.
In that case we can still simplify the non-ftz nvvm ceil intrinsic,
@llvm.nvvm.ceil.f, to @llvm.ceil.f32.
These transformations are particularly useful because they let us
constant fold instructions that appear in libdevice, the bitcode library
that ships with CUDA and essentially functions as its libm.
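
The ftz rule for the ceil example above can be sketched as a small lookup
table. This is a standalone illustration, not the code in the patch: the
names FtzRequirement, UpgradeRule, kRules and upgradeNvvmIntrinsic are
hypothetical, and intrinsics are modeled as plain strings rather than LLVM
IR. It covers only the two ceil intrinsics named in this message.

```cpp
#include <string>

// Hypothetical model (not LLVM's API): the ftz state the enclosing
// function must have for a rewrite to preserve semantics.
enum class FtzRequirement { MustBeOn, MustBeOff };

// One rewrite rule: an NVVM intrinsic, its target-generic counterpart,
// and the ftz state under which the two behave the same.
struct UpgradeRule {
  const char *nvvmName;
  const char *genericName;
  FtzRequirement ftz;
};

// Only the two ceil intrinsics from the description are listed here.
static const UpgradeRule kRules[] = {
    // The ftz variant matches llvm.ceil.f32 only when ftz is enabled.
    {"llvm.nvvm.ceil.ftz.f", "llvm.ceil.f32", FtzRequirement::MustBeOn},
    // The non-ftz variant matches llvm.ceil.f32 only when ftz is disabled.
    {"llvm.nvvm.ceil.f", "llvm.ceil.f32", FtzRequirement::MustBeOff},
};

// Returns the generic intrinsic name, or an empty string when the name is
// unknown or the ftz state makes the rewrite unsafe.
std::string upgradeNvvmIntrinsic(const std::string &name, bool ftzEnabled) {
  for (const UpgradeRule &rule : kRules) {
    if (name != rule.nvvmName)
      continue;
    const bool wantFtz = (rule.ftz == FtzRequirement::MustBeOn);
    return wantFtz == ftzEnabled ? std::string(rule.genericName)
                                 : std::string();
  }
  return std::string();
}
```

In this sketch, upgradeNvvmIntrinsic("llvm.nvvm.ceil.ftz.f", true) yields
"llvm.ceil.f32", while the same call with ftz disabled yields an empty
string and the NVVM intrinsic would be left in place.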