[PATCH] D36856: [AMDGPU] Use v_max_f* for fcanonicalize

Thu Aug 17 18:01:08 PDT 2017

rampitec created this revision.
Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, kzhuravl.

If denorms are not flushed, we can use max instead of multiplication
by 1. For double, that is simply faster, while for float and half the
code is shorter, because the mul uses the constant bus and VOP3.
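
As a rough host-side illustration of why the two forms are interchangeable
when denorms are kept, the C sketch below compares x * 1.0 with max(x, x) on
a subnormal float. This is not the patch itself (which changes TableGen
patterns) and it does not model GPU behavior; the helper names
canon_via_mul, canon_via_max and float_bits are hypothetical, and the sketch
assumes the host does not flush denormals.

```c
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: canonicalize by multiplying by 1.0.
 * The volatile operand keeps the compiler from folding the multiply away. */
static float canon_via_mul(float x) {
    volatile float one = 1.0f;
    return x * one;
}

/* Hypothetical helper: canonicalize by taking max(x, x). */
static float canon_via_max(float x) {
    return fmaxf(x, x);
}

/* Return the raw bit pattern of a float so results can be compared exactly. */
static uint32_t float_bits(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* A subnormal input: FLT_MIN is the smallest normal float, so a quarter of
 * it is below the normal range. Assumes the host keeps denormals. */
static float sample_denorm(void) {
    return FLT_MIN / 4.0f;
}
```

With denormals preserved, both helpers return the input bit pattern unchanged
for ordinary (non-NaN) values, which is the property that lets one be
substituted for the other.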

https://reviews.llvm.org/D36856

Files:
  lib/Target/AMDGPU/AMDGPUInstructions.td
  lib/Target/AMDGPU/SIInstructions.td
  test/CodeGen/AMDGPU/fcanonicalize-denorms.ll
  test/CodeGen/AMDGPU/fcanonicalize-elimination.ll
  test/CodeGen/AMDGPU/fcanonicalize.f16.ll
  test/CodeGen/AMDGPU/fcanonicalize.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D36856.111594.patch
Type: text/x-patch
Size: 16033 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170818/09573713/attachment-0001.bin>