[clang] [llvm] AMDGPU: Implement tensor load and store instructions for gfx1250 (PR #146636)
Juan Manuel Martinez Caamaño via cfe-commits
cfe-commits at lists.llvm.org
Wed Jul 2 08:12:23 PDT 2025
================
@@ -3580,6 +3580,37 @@ def int_amdgcn_fdiv_fast : DefaultAttrsIntrinsic<
[IntrNoMem, IntrSpeculatable]
>;
+class AMDGPUTensorLoadStore:
+ Intrinsic<
+ [],
+ [llvm_v4i32_ty, // D# group 0
+ llvm_v8i32_ty, // D# group 1
+ llvm_v4i32_ty, // D# group 2
+ llvm_v4i32_ty, // D# group 3
+ llvm_i32_ty], // cachepolicy:
+ // bits [0-2] = th
+ // bits [3-4] = scope
+ [IntrInaccessibleMemOrArgMemOnly, ImmArg<ArgIndex<4>>, IntrWillReturn, IntrConvergent, IntrNoCallback, IntrNoFree],
+ "", [SDNPMemOperand]
+ >;
+
+class AMDGPUTensorLoadStoreD2:
+ Intrinsic<
+ [],
+ [llvm_v4i32_ty, // D# group 0
+ llvm_v8i32_ty, // D# group 1
+ llvm_i32_ty], // cachepolicy:
+ // bits [0-2] = th
+ // bits [3-4] = scope
+ [IntrInaccessibleMemOrArgMemOnly, ImmArg<ArgIndex<2>>, IntrWillReturn, IntrConvergent, IntrNoCallback, IntrNoFree],
+ "", [SDNPMemOperand]
+ >;
+
+def int_amdgcn_tensor_load_to_lds : AMDGPUTensorLoadStore;
----------------
jmmartinez wrote:
Can we use `ClangBuiltin` here to avoid the boilerplate in `TargetBuiltins/AMDGPU.cpp`?
```
def int_amdgcn_tensor_load_to_lds : ClangBuiltin<"__builtin_amdgcn_tensor_load_to_lds">, AMDGPUTensorLoadStore;
```
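
For reference, the `cachepolicy` operand in the quoted hunk packs `th` into bits [0-2] and `scope` into bits [3-4]. Below is a minimal C++ sketch of that packing. The helper names (`makeCachePolicy`, `cachePolicyTh`, `cachePolicyScope`) are hypothetical and not part of the patch, and the meaning of individual `th` and `scope` values is not specified in this hunk, so the numbers used are arbitrary.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of the patch. They only mirror the bit
// layout documented in the intrinsic's comments:
//   bits [0-2] = th
//   bits [3-4] = scope
constexpr uint32_t kThMask = 0x7;      // 3 bits for th
constexpr uint32_t kScopeShift = 3;    // scope starts at bit 3
constexpr uint32_t kScopeMask = 0x3;   // 2 bits for scope

// Pack th and scope into one cachepolicy value. Out-of-range bits are
// masked off rather than diagnosed (an assumption made for brevity).
constexpr uint32_t makeCachePolicy(uint32_t th, uint32_t scope) {
  return (th & kThMask) | ((scope & kScopeMask) << kScopeShift);
}

// Extract the th field from a packed cachepolicy value.
constexpr uint32_t cachePolicyTh(uint32_t cp) { return cp & kThMask; }

// Extract the scope field from a packed cachepolicy value.
constexpr uint32_t cachePolicyScope(uint32_t cp) {
  return (cp >> kScopeShift) & kScopeMask;
}

// Arbitrary example values: th = 5, scope = 2 gives 5 | (2 << 3) = 21.
static_assert(makeCachePolicy(5, 2) == 21, "th in low 3 bits, scope above");
```

Because the operand is marked `ImmArg`, it has to be a compile-time constant at the call site, which is why the sketch uses `constexpr` functions.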
https://github.com/llvm/llvm-project/pull/146636