[PATCH] D74410: AMDGPU: Directly use rcp intrinsic in idiv expansions

Tue Feb 11 07:41:20 PST 2020

arsenm created this revision.
arsenm added reviewers: rampitec, cfang, kerbowa.
Herald added subscribers: hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl.
Herald added a project: LLVM.

Since the natural fdiv lowering is now more conservative even with
denormals disabled, a plain 1.0 / x fdiv gets a slower expansion.
Directly emit the rcp intrinsic when using it to implement integer
division, to avoid a pointlessly complex sequence.
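
For readers unfamiliar with the technique, here is a minimal host-side
sketch of integer division built on a float reciprocal. It is not the
code from this patch: the function name udiv24_via_rcp is hypothetical,
the host's 1.0f / fb stands in for the hardware rcp (which is only
approximate), and the integer fix-up step is an assumption chosen to
keep the sketch obviously correct for 24-bit unsigned operands.

```cpp
#include <cstdint>

// Hypothetical helper, not taken from the patch.
// Unsigned division for operands that fit in 24 bits, using a float
// reciprocal to estimate the quotient.
// Precondition: a < 2^24 and 0 < b < 2^24.
std::uint32_t udiv24_via_rcp(std::uint32_t a, std::uint32_t b) {
  // 24-bit integers are exactly representable in a float.
  float fa = static_cast<float>(a);
  float fb = static_cast<float>(b);

  // Stand-in for the hardware reciprocal instruction (assumption).
  float rcp = 1.0f / fb;

  // Quotient estimate; the float-to-integer conversion truncates.
  std::uint32_t q = static_cast<std::uint32_t>(fa * rcp);

  // Rounding in the reciprocal and the multiply can leave the estimate
  // off by one in either direction, so correct it with the exact
  // integer remainder.
  std::int64_t r = static_cast<std::int64_t>(a) -
                   static_cast<std::int64_t>(q) * b;
  if (r < 0)
    --q;
  else if (r >= static_cast<std::int64_t>(b))
    ++q;
  return q;
}
```

Calling udiv24_via_rcp(100, 7) yields 14. The point of the sketch is
that the reciprocal is only a building block for the quotient estimate,
so a fully conservative fdiv expansion for it is wasted work.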

https://reviews.llvm.org/D74410

Files:
  llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
  llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fold-binop-select.ll
  llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
  llvm/test/CodeGen/AMDGPU/divrem24-assume.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D74410.243867.patch
Type: text/x-patch
Size: 45340 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200211/da74f3c3/attachment.bin>