[llvm] [AMDGPU] tensor_{load_to/store_from}_lds => ..._d2 simplification (PR #171540)
Krzysztof Drewniak via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 10 10:28:23 PST 2025
================
@@ -1737,6 +1737,26 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
     NewII->takeName(&II);
     return IC.replaceInstUsesWith(II, NewII);
   }
+  case Intrinsic::amdgcn_tensor_load_to_lds:
+  case Intrinsic::amdgcn_tensor_store_from_lds: {
+    Value *D2 = II.getArgOperand(2);
+    Value *D3 = II.getArgOperand(3);
+    // We know that not passing the second and third tensor DMA groups is
+    // equivalent to passing zeroes for those registers, so we rewrite to the
+    // shorter form here.
+    if (!match(D2, m_Zero()) || !match(D3, m_Zero()))
----------------
krzysz00 wrote:
Yep, we're now matching undef/poison too
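
For readers skimming the thread, the widened guard can be sketched as a small standalone model. This is not the code from the PR and does not use LLVM's API: `OperandKind`, `isDroppableGroup`, and `canUseD2Form` are hypothetical names invented for illustration. The assumption is that an undef or poison operand may be treated like zero, since either can be refined to any value, so the shorter form stays a valid rewrite.

```cpp
// Hypothetical stand-in for the kinds of operand the fold distinguishes.
// In the real pass these would be LLVM Values inspected via PatternMatch.
enum class OperandKind { Zero, Undef, Poison, Other };

// A DMA group operand can be dropped when it is zero, or when it is
// undef/poison (assumed here to be refinable to zero).
bool isDroppableGroup(OperandKind K) {
  return K == OperandKind::Zero || K == OperandKind::Undef ||
         K == OperandKind::Poison;
}

// Mirrors the shape of the guard in the hunk: the rewrite to the shorter
// form is only attempted when both D2 and D3 qualify.
bool canUseD2Form(OperandKind D2, OperandKind D3) {
  return isDroppableGroup(D2) && isDroppableGroup(D3);
}
```

With this model, `canUseD2Form(OperandKind::Zero, OperandKind::Poison)` holds, while any `Other` operand in either position blocks the rewrite.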
https://github.com/llvm/llvm-project/pull/171540