[Mlir-commits] [mlir] [mlir][vector-to-gpu]: Extend MMA Lowerings (PR #176785)
Jack Frankland
llvmlistbot at llvm.org
Mon Feb 2 02:16:14 PST 2026
FranklandJack wrote:
> > I have a feeling this might be fixing the wrong problem. I'm not super familiar with the maths but I'm guessing that usually some kind of loop reordering would be possible. Jack mentioned that for some hardware non-contiguous loads are not supported so that would be another reason not to land this.
>
> It makes sense to me. If loop reordering is possible in these use cases, it would be better to do that. It would also benefit performance on the hardware.
I'm not sure what this has to do with loop reordering. We have a `vector.transfer_read` operation here with a strided minor identity map, and this patch adds support for lowering it correctly to a `gpu.subgroup_mma_load_matrix` operation. It seems we are solving different problems, so I'd argue it's still useful to have this functionality upstream.
https://github.com/llvm/llvm-project/pull/176785