[all-commits] [llvm/llvm-project] 6c2a01: AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32
Nicolai Hähnle via All-commits
all-commits at lists.llvm.org
Tue May 10 07:16:12 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 6c2a01ce3a824622e4491e913023c304841363b1
https://github.com/llvm/llvm-project/commit/6c2a01ce3a824622e4491e913023c304841363b1
Author: Nicolai Hähnle <nicolai.haehnle at amd.com>
Date: 2022-05-10 (Tue, 10 May 2022)
Changed paths:
M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
M llvm/test/CodeGen/AMDGPU/mad_64_32.ll
Log Message:
-----------
AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32

Only fold for uniform values on pre-GFX9 chips. GFX9+ allow us
to keep the calculation entirely on the SALU.

For subtargets where integer multiplication isn't full-rate, avoid
folding if the multiply has too many uses.

Finally, we expand 64x32 and 64x64 multiplies here as well, if they
feed into an addition. This results in better code generation than
the generic expansion for such multiplies because we end up using
the accumulator of the MAD instructions.

Differential Revision: https://reviews.llvm.org/D123835
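
The arithmetic behind the last paragraph of the log message can be sketched in plain C++. This is only an illustration of the identity, not the code in SIISelLowering.cpp: the helper names (mad_u64_u32, mul64x32_add, mul64x64_add) are hypothetical, and mad_u64_u32 is assumed to model the instruction as a 32x32-bit unsigned multiply whose 64-bit product is added to a 64-bit accumulator, with the carry-out ignored. The point is that the addend rides in the accumulator of the low-half MAD, and only the high 32 bits need the remaining cross products.

```cpp
#include <cstdint>

// Assumed model of v_mad_u64_u32: 32x32 -> 64-bit product plus a 64-bit
// accumulator, wrapping modulo 2^64 (carry-out ignored in this sketch).
constexpr uint64_t mad_u64_u32(uint32_t a, uint32_t b, uint64_t acc) {
    return static_cast<uint64_t>(a) * b + acc;
}

// Low 32 bits of a 32x32 multiply, computed in 64 bits to avoid any
// promotion surprises.
constexpr uint32_t mul_lo32(uint32_t a, uint32_t b) {
    return static_cast<uint32_t>(static_cast<uint64_t>(a) * b);
}

// (64-bit a) * (32-bit b) + c, modulo 2^64.
constexpr uint64_t mul64x32_add(uint64_t a, uint32_t b, uint64_t c) {
    const uint32_t a_lo = static_cast<uint32_t>(a);
    const uint32_t a_hi = static_cast<uint32_t>(a >> 32);
    // The addend c is folded into the accumulator of the low-half MAD.
    const uint64_t acc = mad_u64_u32(a_lo, b, c);
    // a_hi * b only contributes to the upper 32 bits of the result.
    const uint32_t hi = static_cast<uint32_t>(acc >> 32) + mul_lo32(a_hi, b);
    return (static_cast<uint64_t>(hi) << 32) | static_cast<uint32_t>(acc);
}

// (64-bit a) * (64-bit b) + c, modulo 2^64.
constexpr uint64_t mul64x64_add(uint64_t a, uint64_t b, uint64_t c) {
    const uint32_t a_lo = static_cast<uint32_t>(a);
    const uint32_t a_hi = static_cast<uint32_t>(a >> 32);
    const uint32_t b_lo = static_cast<uint32_t>(b);
    const uint32_t b_hi = static_cast<uint32_t>(b >> 32);
    const uint64_t acc = mad_u64_u32(a_lo, b_lo, c);
    // Both cross products land in the upper 32 bits; a_hi * b_hi is
    // shifted out entirely and never computed.
    const uint32_t hi = static_cast<uint32_t>(acc >> 32)
                      + mul_lo32(a_lo, b_hi) + mul_lo32(a_hi, b_lo);
    return (static_cast<uint64_t>(hi) << 32) | static_cast<uint32_t>(acc);
}

// Compile-time checks against the direct 64-bit computation.
static_assert(mul64x32_add(0x123456789ABCDEF0ULL, 0xDEADBEEFu, 42ULL) ==
              0x123456789ABCDEF0ULL * 0xDEADBEEFULL + 42ULL, "64x32 + c");
static_assert(mul64x64_add(0xFFFFFFFFFFFFFFFFULL, 0xFFFFFFFFFFFFFFFFULL, 1ULL) ==
              2ULL, "(-1) * (-1) + 1 wraps to 2");
```

In this sketch the separate 64-bit add disappears because the first MAD absorbs it, which matches the motivation stated in the log message. The selection patterns actually emitted by the backend may differ in detail.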