[PATCH] D124378: [X86][AMX] combine tile cast and load/store instruction.
Bing Yu via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 15 02:13:58 PDT 2023
yubing added inline comments.
================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:930
+ // stride.
+ Value *Stride = Builder.getInt64(64);
+ Value *I8Ptr =
----------------
LuoYuanke wrote:
> yubing wrote:
> > Why is the stride 64 here instead of Col?
> Both 64 and Col should work as long as the load and store keep the same stride value, but 64 is a constant, so it is preferred.
How about the following IR:
%tile = call x86_amx @llvm.x86.tileloadd64.internal(i16 8, i16 32, i8* %src_ptr, i64 64)
%vec = call <256 x i8> @llvm.x86.cast.tile.to.vector.v256i8(x86_amx %tile)
store <256 x i8> %vec, <256 x i8>* %dst_ptr, align 256
If you combine it into:
%tile = call x86_amx @llvm.x86.tileloadd64.internal(i16 8, i16 32, i8* %src_ptr, i64 64)
call void @llvm.x86.tilestored64.internal(i16 8, i16 32, i8* %dst_ptr, i64 64, x86_amx %tile)
it will definitely go out of bounds: the destination <256 x i8> covers only 8 x 32 = 256 bytes, but a tile store of 8 rows with stride 64 writes its last row at offset 7*64 = 448, i.e. up to byte 480 of %dst_ptr.
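For comparison, a combined form that stays in bounds (just a sketch, assuming the combine takes the store stride from Col rather than the constant 64) would look like:
%tile = call x86_amx @llvm.x86.tileloadd64.internal(i16 8, i16 32, i8* %src_ptr, i64 64)
call void @llvm.x86.tilestored64.internal(i16 8, i16 32, i8* %dst_ptr, i64 32, x86_amx %tile)
Here the load keeps the original stride 64 of %src_ptr, while the store writes 8 rows of 32 bytes back to back, covering exactly the 256 bytes of the <256 x i8> value.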
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124378/new/
https://reviews.llvm.org/D124378