[PATCH] D36856: [AMDGPU] Use v_max_f* for fcanonicalize
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 17 18:01:08 PDT 2017
rampitec created this revision.
Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, kzhuravl.
If denormals are not flushed, we can use max instead of a multiplication
by 1.0. For double that is simply faster, while for float and half
it is shorter, because the mul uses the constant bus and a VOP3 encoding.
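
As a rough host-side illustration of why the two forms are interchangeable
when denormals are preserved, the C++ sketch below compares x * 1.0 with
max(x, x) on a denormal input. The helper names are hypothetical and are not
part of the patch. The sketch assumes default CPU floating-point settings
(no flush-to-zero, no -ffast-math) and says nothing about AMDGPU encodings
or instruction speed.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: the "multiply by 1.0" form of returning x unchanged.
inline float canonicalize_via_mul(float x) { return x * 1.0f; }

// Hypothetical helper: the "max(x, x)" form of returning x unchanged.
inline float canonicalize_via_max(float x) { return std::fmax(x, x); }

// Reinterpret a float as its 32-bit pattern so that results can be
// compared exactly (this distinguishes +0.0 from -0.0).
inline std::uint32_t bits_of(float x) {
  std::uint32_t u;
  std::memcpy(&u, &x, sizeof u);
  return u;
}

// True when both forms produce the same bit pattern as the input.
// Assumption: denormals are not flushed on the host.
inline bool forms_agree(float x) {
  return bits_of(canonicalize_via_mul(x)) == bits_of(x) &&
         bits_of(canonicalize_via_max(x)) == bits_of(x);
}
```

Calling forms_agree(std::numeric_limits<float>::denorm_min()) is expected to
hold under the stated assumptions. With flush-to-zero enabled the denormal
may instead be replaced by zero, and that is the case this change leaves alone.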
https://reviews.llvm.org/D36856
Files:
lib/Target/AMDGPU/AMDGPUInstructions.td
lib/Target/AMDGPU/SIInstructions.td
test/CodeGen/AMDGPU/fcanonicalize-denorms.ll
test/CodeGen/AMDGPU/fcanonicalize-elimination.ll
test/CodeGen/AMDGPU/fcanonicalize.f16.ll
test/CodeGen/AMDGPU/fcanonicalize.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D36856.111594.patch
Type: text/x-patch
Size: 16033 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170818/09573713/attachment-0001.bin>