[PATCH] D123835: AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32

Thu Apr 14 22:39:35 PDT 2022

nhaehnle created this revision.
nhaehnle added reviewers: arsenm, rampitec, t-tye, b-sumner.
Herald added subscribers: hsmhsm, foad, kerbowa, hiraditya, tpr, dstuttard, yaxunl, jvesely, kzhuravl.
Herald added a project: All.
nhaehnle requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.

For uniform values, only fold on pre-GFX9 chips. GFX9+ allows us
to keep the calculation entirely on the SALU.

We never fold if the mul has multiple uses, since that effectively
duplicates the (expensive) multiply.

Finally, we expand 64x32 and 64x64 multiplies here as well, if they
feed into an addition. This results in better code generation than
the generic expansion for such multiplies because we end up using
the accumulator of the MAD instructions.
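
To illustrate why the accumulator helps, here is a minimal sketch of the
arithmetic in plain C++. It is not the code from the patch. The helper names
mad_u64_u32 and mul64_add are hypothetical, and mad_u64_u32 only models the
assumed semantics of v_mad_u64_u32 (a full 32x32 product added to a 64-bit
accumulator), ignoring the carry-out.

```cpp
#include <cstdint>

// Hypothetical model of v_mad_u64_u32: full 32x32 -> 64-bit product
// plus a 64-bit accumulator (the carry-out is ignored in this sketch).
static uint64_t mad_u64_u32(uint32_t a, uint32_t b, uint64_t acc) {
  return static_cast<uint64_t>(a) * b + acc;
}

// Hypothetical helper: computes x * y + z modulo 2^64 with a single MAD
// and two 32-bit multiplies for the cross terms.
static uint64_t mul64_add(uint64_t x, uint64_t y, uint64_t z) {
  const uint32_t xlo = static_cast<uint32_t>(x);
  const uint32_t xhi = static_cast<uint32_t>(x >> 32);
  const uint32_t ylo = static_cast<uint32_t>(y);
  const uint32_t yhi = static_cast<uint32_t>(y >> 32);

  // The low x low product is added to the addend directly in the
  // MAD accumulator, so no separate 64-bit add is needed.
  const uint64_t acc = mad_u64_u32(xlo, ylo, z);

  // The cross terms only affect the high 32 bits of the result, and
  // only their low 32 bits matter. The xhi * yhi term vanishes
  // entirely modulo 2^64.
  uint32_t hi = static_cast<uint32_t>(acc >> 32);
  hi += xlo * yhi;
  hi += xhi * ylo;

  return (static_cast<uint64_t>(hi) << 32) | static_cast<uint32_t>(acc);
}
```

In the 64x32 case, where y is a zero-extended 32-bit value, yhi is zero and
the xlo * yhi term drops out, leaving one MAD and one 32-bit multiply.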

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D123835

Files:
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
  llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
  llvm/test/CodeGen/AMDGPU/ipra-return-address-save-restore.ll
  llvm/test/CodeGen/AMDGPU/mad_64_32.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D123835.423022.patch
Type: text/x-patch
Size: 22138 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220415/772fd80a/attachment.bin>