[all-commits] [llvm/llvm-project] 6c2a01: AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32

Nicolai Hähnle via All-commits all-commits at lists.llvm.org
Tue May 10 07:16:12 PDT 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 6c2a01ce3a824622e4491e913023c304841363b1
      https://github.com/llvm/llvm-project/commit/6c2a01ce3a824622e4491e913023c304841363b1
  Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
  Date:   2022-05-10 (Tue, 10 May 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
    M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
    M llvm/test/CodeGen/AMDGPU/mad_64_32.ll

  Log Message:
  -----------
  AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32

Fold uniform values only on pre-GFX9 chips. On GFX9+ the calculation
can stay entirely on the SALU, so the fold is skipped there.

For subtargets where integer multiplication isn't full-rate, avoid
folding if the multiply has too many uses.

Finally, we expand 64x32 and 64x64 multiplies here as well, if they
feed into an addition. This results in better code generation than
the generic expansion for such multiplies because we end up using
the accumulator of the MAD instructions.
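
As a rough illustration of why the accumulator helps, the arithmetic
behind a 64x64 multiply feeding an addition can be sketched as below.
This is only a scalar C++ model of the math, not the DAG combine in
SIISelLowering.cpp; the helper names mad_u64_u32 and mul64_add are
hypothetical, and the exact instruction sequence the backend emits may
differ.

```cpp
#include <cstdint>

// Hypothetical scalar model of a v_mad_u64_u32-style operation:
// 32x32 -> 64-bit unsigned multiply plus a 64-bit accumulator.
static uint64_t mad_u64_u32(uint32_t a, uint32_t b, uint64_t acc) {
  return static_cast<uint64_t>(a) * b + acc;
}

// Computes a * b + c modulo 2^64 from 32-bit halves.
static uint64_t mul64_add(uint64_t a, uint64_t b, uint64_t c) {
  const uint32_t a_lo = static_cast<uint32_t>(a);
  const uint32_t a_hi = static_cast<uint32_t>(a >> 32);
  const uint32_t b_lo = static_cast<uint32_t>(b);
  const uint32_t b_hi = static_cast<uint32_t>(b >> 32);

  // The low partial product absorbs the addend through the accumulator,
  // so no separate 64-bit add is needed for c.
  const uint64_t acc = mad_u64_u32(a_lo, b_lo, c);

  // The cross terms only affect the upper 32 bits of the result, so
  // 32-bit wrapping multiplies are sufficient. The a_hi * b_hi term
  // lies entirely above bit 63 and is dropped.
  const uint32_t cross = a_lo * b_hi + a_hi * b_lo;

  return acc + (static_cast<uint64_t>(cross) << 32);
}
```

For a 64x32 multiply, b_hi is zero in this model, so one of the two
cross terms disappears.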

Differential Revision: https://reviews.llvm.org/D123835



