[clang] [llvm] [AMDGPU] Track tensor load/store DMAs with asyncmark (PR #200775)
Sameer Sahasrabuddhe via cfe-commits
cfe-commits at lists.llvm.org
Thu Jun 4 03:25:43 PDT 2026
================
@@ -50,6 +50,19 @@ memory and LDS memory.
void @llvm.amdgcn.global.store.async.from.lds.type(ptr %dst, ptr %src)
void @llvm.amdgcn.cluster.load.async.to.lds.type(ptr %dst, ptr %src)
+**GFX1250 Tensor DMA Instructions**
+
+.. code-block:: llvm
+
+ void @llvm.amdgcn.tensor.load.to.lds(...)
----------------
ssahasra wrote:
No, let's just leave it as is, without `.async`. This list is likely to grow longer, in which case it will be easier to document the dependence on `asyncmark` in the builtin doc itself. I am also seeing the wisdom in keeping intrinsic names close to the instruction names: there is much less friction about "what did they mean when they named it?"
https://github.com/llvm/llvm-project/pull/200775