[clang] [llvm] [AMDGPU] Track tensor load/store DMAs with asyncmark (PR #200775)

Thu Jun 4 03:25:43 PDT 2026

================
@@ -50,6 +50,19 @@ memory and LDS memory.
   void @llvm.amdgcn.global.store.async.from.lds.type(ptr %dst, ptr %src)
   void @llvm.amdgcn.cluster.load.async.to.lds.type(ptr %dst, ptr %src)
 
+**GFX1250 Tensor DMA Instructions**
+
+.. code-block:: llvm
+
+  void @llvm.amdgcn.tensor.load.to.lds(...)
----------------
ssahasra wrote:

No, let's just leave it as is, without `.async`. This list is likely to grow longer, in which case it will be easier to document the dependence on `asyncmark` in the builtin doc itself. I am also seeing the wisdom in keeping intrinsic names close to the instruction names: there is much less friction about "what did they mean when they named it?"

https://github.com/llvm/llvm-project/pull/200775