[PATCH] D97112: [X86][AMX] Lower tile copy instruction.

Pengfei Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Feb 20 01:15:55 PST 2021


pengfei added inline comments.


================
Comment at: llvm/lib/Target/X86/X86LowerTileCopy.cpp:1
+//-  X86Insertwait.cpp - Strict-Fp:Insert wait instruction X87 instructions --//
+//
----------------
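This header comment is wrong: it still names X86Insertwait.cpp and describes inserting wait instructions, not X86LowerTileCopy.cpp.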


================
Comment at: llvm/lib/Target/X86/X86LowerTileCopy.cpp:9
+//
+// This file defines the pass which lower AMX tile copy instruction. Since there
+// is no tile copy instruction, we need store tile register to stack and load
----------------
Typo: "instruction" should be "instructions".


================
Comment at: llvm/lib/Target/X86/X86RegisterInfo.cpp:878
     break;
+  case X86::COPY: {
+    Register SrcReg = MI->getOperand(1).getReg();
----------------
Is it possible to define a special COPY for AMX that implicitly defines a register for the stride?


================
Comment at: llvm/lib/Target/X86/X86TargetMachine.cpp:584
 
+void X86PassConfig::addPostCopy() { addPass(createX86LowerTileCopyPass()); }
+
----------------
This is much like how the "X86 FP Stackifier" pass handles X87 register copies, so I think we can add this pass in addPostRegAlloc, the same way that pass is added.


================
Comment at: llvm/test/CodeGen/X86/AMX/amx-lower-tile-copy.ll:37
+; CHECK-NEXT:    movabsq $64, %rax
+; CHECK-NEXT:    tilestored %tmm3, 2048(%rsp,%rax) # 1024-byte Folded Spill
+; CHECK-NEXT:    vzeroupper
----------------
As we discussed, tilezero should be rematerialized instead of spilled. For non-tilezero cases, we still need to treat the spill as loop-invariant and hoist it out of the loop. Anyway, these are optimization thoughts that don't affect the functionality here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97112/new/

https://reviews.llvm.org/D97112


