[PATCH] D100026: [X86] Support AMX fast register allocation

Pengfei Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 13 23:08:56 PDT 2021


pengfei added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ExpandPseudo.cpp:483
+  case X86::PLDTILECFGV: {
+    MI.setDesc(TII->get(X86::LDTILECFG));
+    return true;
----------------
Maybe don't need pseudo instruction?


================
Comment at: llvm/lib/Target/X86/X86FastTileConfig.cpp:26
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/PostOrderIterator.h"
+#include "llvm/CodeGen/LiveIntervals.h"
----------------
You can remove it if you don't use it.


================
Comment at: llvm/lib/Target/X86/X86FastTileConfig.cpp:28
+#include "llvm/CodeGen/LiveIntervals.h"
+#include "llvm/CodeGen/MachineDominators.h"
+#include "llvm/CodeGen/MachineFrameInfo.h"
----------------
Same above.


================
Comment at: llvm/lib/Target/X86/X86FastTileConfig.cpp:37
+#include "llvm/CodeGen/TileShapeInfo.h"
+#include "llvm/CodeGen/VirtRegMap.h"
+#include "llvm/InitializePasses.h"
----------------
Same above.


================
Comment at: llvm/lib/Target/X86/X86FastTileConfig.cpp:52-55
+  MachineDominatorTree *DomTree = nullptr;
+  MachineRegisterInfo *MRI = nullptr;
+  VirtRegMap *VRM = nullptr;
+  LiveIntervals *LIS = nullptr;
----------------
You can remove them if you don't use.


================
Comment at: llvm/lib/Target/X86/X86FastTileConfig.cpp:63
+
+  void FastTileConfig();
+  bool IsTileLoad(MachineInstr &MI);
----------------
Naming conversions: function should start with lower case letter.


================
Comment at: llvm/lib/Target/X86/X86FastTileConfig.cpp:111-114
+void X86FastTileConfig::getAnalysisUsage(AnalysisUsage &AU) const {
+  AU.setPreservesAll();
+  MachineFunctionPass::getAnalysisUsage(AU);
+}
----------------
You can remove it if you don't require any analysis.


================
Comment at: llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:4705
     }
+    case Intrinsic::x86_ldtilecfg_internal: {
+      unsigned Opc = X86::PLDTILECFGV;
----------------
You can move it to .td file.


================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:521
+    Value *Op = PHI->getIncomingValue(i);
+    Instruction *I = dyn_cast<Instruction>(Op);
+    assert(I && "We shouldn't fold AMX instrution!");
----------------
You may need to exclude debug intrinsics.


================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:529
+    if (DomBB == BB)
+      Pos = I;
+    else if (DomBB != LastDomBB)
----------------
I maybe not the last phi.


================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:596
+      else
+        AMXDefInsts.push_back(&I);
+    }
----------------
Do you also insert a store for load intrinsic?


================
Comment at: llvm/lib/Target/X86/X86LowerAMXType.cpp:632
+    //   return false;
+    DominatorTree *DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
+
----------------
We may need to postpone it after we find a AMX intrinsic.


================
Comment at: llvm/lib/Target/X86/X86PreAMXConfig.cpp:60
+static bool IsAMXIntrinsic(IntrinsicInst *II) {
+  for (Value *Operand : II->operands())
+    if (Operand->getType()->isX86_AMXTy())
----------------
You may need to exclude debug intrinsics.


================
Comment at: llvm/lib/Target/X86/X86PreAMXConfig.cpp:173
+  Type *V512Ty = VectorType::get(Builder.getInt32Ty(), 16, false);
+  Align Alignment = DL.getPrefTypeAlign(Type::getInt32Ty(Ctx));
+
----------------
Should be V512Ty?


================
Comment at: llvm/lib/Target/X86/X86PreAMXConfig.cpp:193
+
+// Todo: We may need to handle "more than one store" case in the future.
+bool X86PreAMXConfig::checkVolatileModel(SmallSet<Value *, 4> &Loads,
----------------
Better add assert for this case.


================
Comment at: llvm/lib/Target/X86/X86PreAMXConfig.cpp:291
+
+      I = getConfigPosEnd(I, PosAndShapes[&*I]);
+      Find = true;
----------------
Better add assert to check I is load.


================
Comment at: llvm/lib/Target/X86/X86PreAMXConfig.cpp:356
+    // Prepare for fast register allocation at O0.
+    if (TM->getOptLevel() == CodeGenOpt::None) {
+
----------------
Do we need to check it since the pass is only created under O0?


================
Comment at: llvm/test/CodeGen/X86/AMX/amx-configO2toO0.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -O0 -mtriple=x86_64-unknown-unknown -mattr=+amx-int8 -mattr=+avx512f | FileCheck %s --check-prefix=AMX_O0
+
----------------
Better change one case to use avx or sse to check if stack cleared correctly.


================
Comment at: llvm/test/CodeGen/X86/AMX/amx-fast-tile-config.mir:416
+     ; CHECK: renamable $tmm2 = PTILELOADDV renamable $di, renamable $cx, killed renamable $r9, 1, renamable $rsi, 0, $noreg
+     ; CHECK: renamable $tmm0 = PTILELOADDV renamable $ax, renamable $cx, killed renamable $r8, 1, renamable $rsi, 0, $noreg
+     ; CHECK: renamable $tmm0 = PTDPBSSDV renamable $ax, renamable $cx, killed renamable $di, renamable $tmm0, killed renamable $tmm1, killed renamable $tmm2
----------------
The shapes for tmm0 is ax and cx, but the stored shape in stack is $sil and 8?


================
Comment at: llvm/test/CodeGen/X86/AMX/amx-fast-tile-config.mir:441
+    renamable $di = MOV16ri 8
+    renamable $tmm1 = PTILELOADDV renamable $ax, renamable $di, killed renamable $r10, 1, renamable $rsi, 0, $noreg
+    renamable $tmm2 = PTILELOADDV renamable $di, renamable $cx, killed renamable $r9, 1, renamable $rsi, 0, $noreg
----------------
I don't find where we shore this shape.


================
Comment at: llvm/utils/gn/secondary/llvm/lib/Target/X86/BUILD.gn:113
     "X86LowerAMXType.cpp",
+    "X86PreAMXConfig.cpp",
     "X86LowerTileCopy.cpp",
----------------
Why don't add X86FastTileConfig?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100026/new/

https://reviews.llvm.org/D100026



More information about the llvm-commits mailing list