[PATCH] D107544: [X86] [AMX] Replace bitcast with specific AMX intrinsics with X86 specific cast.

Bing Yu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 17 00:07:57 PDT 2021


yubing marked 6 inline comments as done.
yubing added inline comments.


================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:844
+    for (User *V : make_early_inc_range(OldPN->users())) {
+      Instruction *ACI = dyn_cast<Instruction>(V);
+      if (ACI && isAMXCast(ACI)) {
----------------
LuoYuanke wrote:
> What does ACI stand for? AMX cast intrinsic?
Yes, AMX cast intrinsic.


================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:920
+    for (Instruction &I : BB) {
+      if (isAMXCast(&I)) {
+        if (PHINode *PN = dyn_cast<PHINode>(I.getOperand(0)))
----------------
LuoYuanke wrote:
> We can erase dead cast code from Vec2TileInsts and Tile2VecInsts and get the AMX cast instructions from there, so that we can avoid iterating over the basic block again.
We will refactor it in the next patch.


================
Comment at: llvm/test/CodeGen/X86/AMX/lat-combine-amx-bitcast.ll:106
+for.body.i.lr.ph.i:                               ; preds = %wrapper_entry
+  %1 = call x86_amx @llvm.x86.cast.vector.to.tile.v110i32(<110 x i32> undef)
+  %2 = call x86_amx @llvm.x86.cast.vector.to.tile.v616i8(<616 x i8> undef)
----------------
LuoYuanke wrote:
> We can optimize an undef or zero vector with tilezero. We may do it in another patch.
Sure.
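
For reference, the optimization the reviewer suggests would look roughly like the following IR rewrite (a sketch; the `%row`/`%col` shape operands are hypothetical stand-ins for the tile's actual shape values):

```llvm
; Before: casting an undef (or zeroinitializer) vector into a tile
%t = call x86_amx @llvm.x86.cast.vector.to.tile.v110i32(<110 x i32> undef)
; After: materialize the tile directly with tilezero instead
%t = call x86_amx @llvm.x86.tilezero.internal(i16 %row, i16 %col)
```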


================
Comment at: llvm/test/CodeGen/X86/AMX/lat-transform-amx-bitcast.ll:291
+;
+  %t0 = load <256 x i32>, <256 x i32>* %pa, align 64
+  %t1 = call x86_amx @llvm.x86.cast.vector.to.tile.v256i32(<256 x i32> %t0)
----------------
LuoYuanke wrote:
> We can combine load and cast to @llvm.x86.tileloadd64.internal.
Sure, we will do it in the next patch.
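
The load-plus-cast combine being deferred here would, sketched in IR, turn the pattern above into a single tile load (a sketch; the `%row`/`%col`/`%stride` operands are hypothetical stand-ins for the tile's shape and the load stride):

```llvm
; Before: plain vector load followed by a cast to x86_amx
%t0 = load <256 x i32>, <256 x i32>* %pa, align 64
%t1 = call x86_amx @llvm.x86.cast.vector.to.tile.v256i32(<256 x i32> %t0)
; After: fold both into one AMX tile load
%p  = bitcast <256 x i32>* %pa to i8*
%t1 = call x86_amx @llvm.x86.tileloadd64.internal(i16 %row, i16 %col, i8* %p, i64 %stride)
```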


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107544/new/

https://reviews.llvm.org/D107544


