[PATCH] D28794: [NVPTX] Upgrade NVVM intrinsics in InstCombineCalls.
Justin Lebar via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 16 23:18:43 PST 2017
jlebar created this revision.
There are many NVVM intrinsics that we can't entirely get rid of, but
that nonetheless often correspond to target-generic LLVM intrinsics.
For example, if flush-denormals-to-zero (ftz) mode is enabled, we can convert
@llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32. On the other hand, if ftz is
disabled, we can't do this, because @llvm.ceil.f32 will be lowered to a
non-ftz PTX instruction. With ftz disabled we can, however, simplify the
non-ftz NVVM ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32.
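
The rule above can be modelled as a small decision function. This is only a
standalone sketch (C++17), not the code from the patch: the helper name
upgradeCeilIntrinsic is hypothetical, it works on intrinsic names as plain
strings rather than on LLVM's IntrinsicInst, and the fourth case (non-ftz
intrinsic while ftz is enabled) is not spelled out in the description and is
assumed here by symmetry.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not an LLVM API. Given the name of an NVVM ceil
// intrinsic and whether ftz is enabled for the enclosing function, return
// the target-generic intrinsic it may be replaced with, or std::nullopt
// if the replacement would change denormal handling.
std::optional<std::string> upgradeCeilIntrinsic(const std::string &Name,
                                                bool FtzEnabled) {
  const std::string Generic = "llvm.ceil.f32";

  if (Name == "llvm.nvvm.ceil.ftz.f") {
    // The ftz variant matches llvm.ceil.f32 only when the generic
    // intrinsic will itself be lowered to an ftz PTX instruction.
    if (FtzEnabled)
      return Generic;
    return std::nullopt;
  }

  if (Name == "llvm.nvvm.ceil.f") {
    // The non-ftz variant matches llvm.ceil.f32 only when ftz is off.
    // (The ftz-on case is an assumption made by symmetry.)
    if (!FtzEnabled)
      return Generic;
    return std::nullopt;
  }

  // Any other intrinsic is outside the scope of this sketch.
  return std::nullopt;
}
```

In other words, the replacement is allowed exactly when the ftz-ness of the
NVVM intrinsic agrees with the ftz setting that governs how @llvm.ceil.f32
will be lowered.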
These transformations are particularly useful because they let us
constant fold instructions that appear in libdevice, the bitcode library
that ships with CUDA and essentially functions as its libm.
https://reviews.llvm.org/D28794
Files:
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
llvm/test/Transforms/InstCombine/nvvm-intrins.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D28794.84634.patch
Type: text/x-patch
Size: 27717 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170117/1c2594a7/attachment.bin>