[PATCH] D89187: [AMDGPU] Minimize number of s_mov generated by copyPhysReg

Carl Ritson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 12 19:47:45 PDT 2020


critson added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:645-653
+  // Peek ahead to check for registers immediately being overwritten.
+  // This is intended to eliminate duplicate register writes from
+  // patterns such as:
+  //   $sgpr4_sgpr5_sgpr6_sgpr7 = COPY $sgpr28_sgpr29_sgpr30_sgpr31
+  //   $sgpr4 = COPY killed $sgpr1
+  //   $sgpr5 = COPY killed $sgpr2
+  SmallSet<Register, 4> OverwrittenSGPRs;
----------------
arsenm wrote:
> Can we do something earlier? I don't think copyPhysReg should be considering context beyond the given instruction
I think the earliest we can do this is immediately after "Simple Register Coalescing", although I am unsure of the RA implications, and I do not think we have an existing pass that it fits into.

To avoid adding a new pass, I think this might be a reasonable addition to "SI Fix VGPR copies", as that already passes over the whole IR and already does work on copies.
Would it be acceptable to rename "SI Fix VGPR copies" to something like "SI Late Fixup Copies" and put the peephole in there?
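
For reference, here is a minimal sketch of the peek-ahead idea from the quoted patch comment, modelled outside LLVM on plain register indices. The names `Copy` and `overwrittenDsts` are hypothetical and are not part of SIInstrInfo or of the patch; the stopping rules are deliberately conservative assumptions, not necessarily what the patch implements.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a physical-register copy; not an LLVM type.
struct Copy {
  std::vector<int> Dst; // destination SGPR indices, e.g. {4, 5, 6, 7}
  std::vector<int> Src; // source SGPR indices
};

// Returns the destination registers of Copies[Idx] that are rewritten by the
// copies immediately following it before anything reads them. Writes to those
// registers could be skipped when expanding Copies[Idx].
std::set<int> overwrittenDsts(const std::vector<Copy> &Copies,
                              std::size_t Idx) {
  assert(Idx < Copies.size() && "copy index out of range");
  // Registers written by Copies[Idx] that have not been overwritten yet.
  std::set<int> Pending(Copies[Idx].Dst.begin(), Copies[Idx].Dst.end());
  std::set<int> Overwritten;

  for (std::size_t I = Idx + 1; I < Copies.size(); ++I) {
    const Copy &Next = Copies[I];

    // If a later copy reads a register we have not yet seen overwritten,
    // that value is live; stop scanning (conservative assumption).
    bool ReadsPending = false;
    for (int S : Next.Src)
      if (Pending.count(S))
        ReadsPending = true;
    if (ReadsPending)
      break;

    // Record every pending register this copy overwrites.
    bool WroteAny = false;
    for (int D : Next.Dst) {
      if (Pending.erase(D)) {
        Overwritten.insert(D);
        WroteAny = true;
      }
    }
    // Only consider copies that *immediately* follow and overwrite something.
    if (!WroteAny)
      break;
  }
  return Overwritten;
}
```

With the pattern from the comment (a copy into `$sgpr4_sgpr5_sgpr6_sgpr7` followed by copies into `$sgpr4` and `$sgpr5`), this reports registers 4 and 5, so only the writes to `$sgpr6` and `$sgpr7` would still be needed from the wide copy.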


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89187/new/

https://reviews.llvm.org/D89187


