[Mlir-commits] [mlir] [MLIR][AMDGPU] Add amdgpu.global_transpose_load op for gfx1200+ global memory transpose loads (PR #195287)

Krzysztof Drewniak llvmlistbot at llvm.org
Mon May 4 09:45:24 PDT 2026


================
@@ -1449,6 +1449,53 @@ def AMDGPU_TransposeLoadOp :
   let hasVerifier = 1;
 }
 
+def AMDGPU_GlobalTransposeLoadOp :
+    AMDGPU_Op<"global_transpose_load", [SameVariadicOperandSize]>,
+    Arguments<(ins Arg<AnyMemRef, "buffer to transpose load from", [MemRead]>:$src,
+                      Variadic<Index>:$srcIndices)>,
+    Results<(outs AnyTypeOf<[
+      FixedVectorOfLengthAndType<[8], [I8, F16, BF16, I16]>,
----------------
krzysz00 wrote:

I think you've missed the relevant fp8 types.

https://github.com/llvm/llvm-project/pull/195287

