[PATCH] D128584: [X86][AMX] Split greedy RA for tile register

Mon Jun 27 19:16:37 PDT 2022

xiangzhangllvm added a comment.

I think the spill/split should still cover the shape regs:

Let me first simply remember our previous action about greedy allocation for AMX:

1 We collected shapes (MOs) in allocating tile regs (by hint tile for same shape) in greedy
2 After greedy we insert the "fill" instructions to set the shape to ldtilecfg's mem. (They are still virtual)
3 Then the rewriter assign to physic regs to them.

The order of related passes:

  Greedy Register Allocator
  Verify generated machine code
  Tile Register Configure
  Verify generated machine code
  Virtual Register Rewriter
  Verify generated machine code
  Register Allocation Pass Scoring

Example:  After Tile Register Configure

  96B       VMOVUPSZmr %stack.0, 1, $noreg, 0, $noreg, %13:vr512 :: (store (s512) into %stack.0, align 4)
  104B      MOV8mi %stack.0, 1, $noreg, 0, $noreg, 1 :: (store (s512) into %stack.0, align 4)
  112B      MOV16mi %stack.0, 1, $noreg, 18, $noreg, 8 :: (store (s512) into %stack.0 + 18, align 2, basealign 4)
  116B      MOV8mi %stack.0, 1, $noreg, 50, $noreg, 8 :: (store (s512) into %stack.0 + 50, align 2, basealign 4)
  124B      MOV16mr %stack.0, 1, $noreg, 20, $noreg, %1.sub_16bit:gr32 :: (store (s512) into %stack.0 + 20, align 4)
  132B      MOV8mr %stack.0, 1, $noreg, 49, $noreg, %0.sub_8bit:gr32 :: (store (s512) into %stack.0 + 49, align 1, basealign 4)
  140B      MOV16mr %stack.0, 1, $noreg, 16, $noreg, %1.sub_16bit:gr32 :: (store (s512) into %stack.0 + 16, align 4)
  148B      MOV8mr %stack.0, 1, $noreg, 48, $noreg, %0.sub_8bit:gr32 :: (store (s512) into %stack.0 + 48, align 4)
  172B      PLDTILECFGV %stack.0, 1, $noreg, 0, $noreg, implicit-def dead $tmm0,  xxx

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128584/new/

https://reviews.llvm.org/D128584