[PATCH] D125075: [X86][AMX] Multiple configure for AMX register.

Xiang Zhang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 23 17:46:41 PDT 2022


xiangzhangllvm added inline comments.


================
Comment at: llvm/lib/Target/X86/X86FastPreTileConfig.cpp:576
+    // If the src and dst of the COPY can NOT be in the same config in below
+    // case. Reload would be generated befor the copy instruction.
+    // def row0
----------------
A COPY will cause the following shape lookup to behave incorrectly:
MI.getOperand(x)


================
Comment at: llvm/lib/Target/X86/X86FastPreTileConfig.cpp:550
+    if (HasTileOperand(MRI, MI))
+      HasUnconfigTile = true;
+    // According to AMX ABI, all the tile registers including config register
----------------
LuoYuanke wrote:
> xiangzhangllvm wrote:
> > I am afraid that, even without a call, one ldtilecfg is not enough for one MBB.
> > For example:
> > Suppose one MBB contains 4 shapes, but the first 3 shapes already use up all the virtual tile regs that one ldtilecfg can configure (the maximum is 8) in the following code. Then it needs another ldtilecfg.
> That would run out of registers. Currently we have volatile tiles in the "lower amx type" pass. We can improve that later and disable volatile tiles in that pass.
If we consider the greedy-->fast case, we should address this problem.
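
To make the concern concrete, here is a rough standalone sketch, not code from the patch. It assumes a deliberately simplified model in which every virtual tile register defined in the block stays live, so each distinct register needs its own slot in the current config. `splitIntoConfigs` and `MaxTilesPerConfig` are hypothetical names; the only fact taken from the discussion is the limit of 8 tile registers per ldtilecfg.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// One ldtilecfg can describe at most 8 tile registers.
constexpr std::size_t MaxTilesPerConfig = 8;

// Hypothetical helper: TileDefs lists the virtual tile registers defined in
// one basic block, in program order. Returns the indices in TileDefs before
// which a new ldtilecfg would be needed under the simplified model above.
std::vector<std::size_t> splitIntoConfigs(const std::vector<unsigned> &TileDefs) {
  std::vector<std::size_t> ConfigPoints;
  std::set<unsigned> InCurrentConfig;
  for (std::size_t I = 0; I < TileDefs.size(); ++I) {
    bool Known = InCurrentConfig.count(TileDefs[I]) != 0;
    // Start a config at the first def, or when a new register would be the
    // ninth distinct one under the current config.
    if (ConfigPoints.empty() ||
        (!Known && InCurrentConfig.size() == MaxTilesPerConfig)) {
      ConfigPoints.push_back(I);
      InCurrentConfig.clear();
    }
    InCurrentConfig.insert(TileDefs[I]);
  }
  return ConfigPoints;
}
```

With ten distinct virtual tile registers in one block, this model asks for a second config before the ninth def, which is the situation described above.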


================
Comment at: llvm/lib/Target/X86/X86FastPreTileConfig.cpp:604
+        // reload befor UseMI
+        reload(UseMI.getIterator(), TileReg, RowMO, ColMO);
+      } else {
----------------
LuoYuanke wrote:
> xiangzhangllvm wrote:
> > We should avoid duplicated reloads. For example:
> > 
> > Do not regenerate the reload for line 2.
> > ```
> > 1  T0 = TileLoad
> > 2   TileUse T0
> > ```
> > 
> How about optimizing the reload in another patch?
No problem.
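
For the follow-up patch, one possible shape of the check is sketched below. This is only an illustration under assumptions, not the pass's actual logic: `Inst` and `usesNeedingReload` are hypothetical, and the model assumes a single config for the whole block, so a tile that was defined or already reloaded earlier in the block is still valid at a later use.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a machine instruction touching one tile register.
struct Inst {
  bool IsDef;    // true: defines Tile (e.g. a tile load); false: uses Tile
  unsigned Tile; // virtual tile register number
};

// Returns the indices of the uses that would need a reload inserted before
// them. Uses of a tile that is already available are skipped.
std::vector<std::size_t> usesNeedingReload(const std::vector<Inst> &Block) {
  std::set<unsigned> Available;
  std::vector<std::size_t> Reloads;
  for (std::size_t I = 0; I < Block.size(); ++I) {
    const Inst &MI = Block[I];
    if (!MI.IsDef && Available.count(MI.Tile) == 0)
      Reloads.push_back(I); // first use without a prior def or reload
    // After a def or a reload, the tile is available for later uses.
    Available.insert(MI.Tile);
  }
  return Reloads;
}
```

For the two-line example above (a TileLoad of T0 followed by a use of T0), this returns an empty list, so no reload is generated for line 2.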


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125075/new/

https://reviews.llvm.org/D125075


