[Mlir-commits] [clang] [llvm] [mlir] [AMDGPU] Use a general form of intrinsic for tensor load/store (PR #182334)
Changpeng Fang
llvmlistbot at llvm.org
Fri Feb 20 17:27:59 PST 2026
================
@@ -4194,41 +4194,24 @@ def int_amdgcn_swmmac_f16_16x16x128_bf8_bf8 : AMDGPUSWmmacIntrinsicIdxReuse<llvm
def int_amdgcn_swmmac_i32_16x16x128_iu8 : AMDGPUSWmmacIntrinsicABIdxClamp<llvm_anyint_ty, llvm_anyint_ty, llvm_anyint_ty, llvm_anyint_ty>;
}
-
class AMDGPUTensorLoadStore:
Intrinsic<
[],
[llvm_v4i32_ty, // D# group 0
llvm_v8i32_ty, // D# group 1
- llvm_v4i32_ty, // D# group 2
- llvm_v4i32_ty, // D# group 3
+ llvm_v4i32_ty, // D# group 2: group 2 and 3 should be zero-initialized for D# up to 2D.
----------------
changpeng wrote:
> Should this accept type mangling to change the vector width?
I don't think so, because every group has a "fixed width" on existing and near-future hardware.
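
To make the fixed-width point concrete, here is a minimal host-side sketch in C++. It is not code from this PR. `TensorDescriptor` and `makeDescriptorUpTo2D` are hypothetical names used only for illustration. The group widths mirror the `v4i32` / `v8i32` / `v4i32` operands shown in the hunk above. The width of group 3 is assumed to stay at four 32-bit words, as on the removed line, since the hunk is cut off before its replacement.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of the four D# operand groups. Each group has one
// fixed width, so no type mangling on the vector width is needed.
struct TensorDescriptor {
  std::array<std::uint32_t, 4> group0{}; // D# group 0 (v4i32)
  std::array<std::uint32_t, 8> group1{}; // D# group 1 (v8i32)
  std::array<std::uint32_t, 4> group2{}; // D# group 2 (v4i32), zero for D# up to 2D
  std::array<std::uint32_t, 4> group3{}; // D# group 3 (assumed v4i32), zero for D# up to 2D
};

// Build a descriptor for a tensor of up to two dimensions: only groups 0
// and 1 carry data, and groups 2 and 3 keep their zero initializers.
inline TensorDescriptor
makeDescriptorUpTo2D(const std::array<std::uint32_t, 4> &g0,
                     const std::array<std::uint32_t, 8> &g1) {
  TensorDescriptor d; // default member initializers zero every group
  d.group0 = g0;
  d.group1 = g1;
  return d;
}
```

With this shape, a caller describing a tensor of up to 2D passes the same four operands as one describing a higher-dimensional tensor, and the two trailing groups are simply all zeros.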
https://github.com/llvm/llvm-project/pull/182334