[PATCH] D36856: [AMDGPU] Use v_max_f* for fcanonicalize

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 17 18:01:08 PDT 2017


rampitec created this revision.
Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, kzhuravl.

If denormals are not flushed, we can use max instead of multiplication
by 1.0. For double the max is simply faster, while for float and half
it gives a shorter encoding, because the mul uses the constant bus and
needs VOP3.
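
As a rough host-side illustration of why the two forms are interchangeable
when denormals are preserved, the sketch below checks that both leave a
subnormal value unchanged. It is not taken from the patch: the function names
are hypothetical, Python doubles stand in for the GPU types, and taking the
max of a value with itself is an assumption about how the max form is applied.

```python
import math
import sys


def canonicalize_via_mul(x: float) -> float:
    # Multiplication by 1.0: the form the patch replaces.
    return x * 1.0


def canonicalize_via_max(x: float) -> float:
    # Assumption: the max form takes max(x, x). NaN is passed through
    # explicitly because Python's max() does not implement IEEE maxNum.
    return x if math.isnan(x) else max(x, x)


# sys.float_info.min is the smallest normal double, so a quarter of it is
# subnormal (denormal) and survives only if denormals are not flushed.
subnormal = sys.float_info.min / 4

print(canonicalize_via_mul(subnormal) == subnormal)
print(canonicalize_via_max(subnormal) == subnormal)
```

This only models the non-flushing case on the host CPU; it says nothing about
instruction cost or encoding size, which are the actual motivation above.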


https://reviews.llvm.org/D36856

Files:
  lib/Target/AMDGPU/AMDGPUInstructions.td
  lib/Target/AMDGPU/SIInstructions.td
  test/CodeGen/AMDGPU/fcanonicalize-denorms.ll
  test/CodeGen/AMDGPU/fcanonicalize-elimination.ll
  test/CodeGen/AMDGPU/fcanonicalize.f16.ll
  test/CodeGen/AMDGPU/fcanonicalize.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D36856.111594.patch
Type: text/x-patch
Size: 16033 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170818/09573713/attachment-0001.bin>

