[PATCH] D89187: [AMDGPU] Minimize number of s_mov generated by copyPhysReg
Carl Ritson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 12 19:47:45 PDT 2020
critson added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:645-653
+ // Peek ahead to check for registers immediately being overwritten.
+ // This is intended to eliminate duplicate register writes from
+ // patterns such as:
+ // $sgpr4_sgpr5_sgpr6_sgpr7 = COPY $sgpr28_sgpr29_sgpr30_sgpr31
+ // $sgpr4 = COPY killed $sgpr1
+ // $sgpr5 = COPY killed $sgpr2
+ SmallSet<Register, 4> OverwrittenSGPRs;
----------------
arsenm wrote:
> Can we do something earlier? I don't think copyPhysReg should be considering context beyond the given instruction
I think the earliest we can do this is immediately after "Simple Register Coalescing", although I am unsure of the register allocation (RA) implications, and I do not think we have an existing pass at that point that it fits into.
To avoid adding a new pass, I think this might be a reasonable addition to "SI Fix VGPR copies", as that pass already iterates over the whole IR and is doing work on copies.
I could rename "SI Fix VGPR copies" to something like "SI Late Fixup Copies" and put the peephole in there?
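For illustration, here is a minimal standalone sketch of the peek-ahead idea under discussion. It is a toy model, not the code in D89187 and not LLVM API: the `Copy` struct, the `expandCopy` function, and the choice to emit only 32-bit `s_mov_b32` strings are all assumptions made for the example. It expands a wide copy lane by lane and skips lanes that the immediately following single-register copies overwrite.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a physical-register COPY (not an LLVM type).
struct Copy {
  std::vector<int> Dst; // destination SGPR indices, one per 32-bit lane
  std::vector<int> Src; // source SGPR indices, same length as Dst
};

// Expand Copies[Idx] into per-lane moves, skipping lanes that are
// overwritten by the single-register copies immediately after it.
std::vector<std::string> expandCopy(const std::vector<Copy> &Copies,
                                    std::size_t Idx) {
  const Copy &C = Copies[Idx];
  assert(C.Dst.size() == C.Src.size() && "lane count mismatch");

  std::set<int> WideDst(C.Dst.begin(), C.Dst.end());
  std::set<int> Overwritten;

  // Peek ahead over the run of copies that directly follows.
  for (std::size_t I = Idx + 1; I < Copies.size(); ++I) {
    const Copy &Next = Copies[I];
    // Only single-register copies into the wide destination qualify.
    if (Next.Dst.size() != 1 || !WideDst.count(Next.Dst[0]))
      break;
    // Be conservative: stop if the next copy reads a register that the
    // wide copy writes, since it then needs the value being copied.
    if (WideDst.count(Next.Src[0]))
      break;
    Overwritten.insert(Next.Dst[0]);
  }

  std::vector<std::string> Moves;
  for (std::size_t L = 0; L < C.Dst.size(); ++L) {
    if (Overwritten.count(C.Dst[L]))
      continue; // this lane is immediately redefined, so skip the write
    Moves.push_back("s_mov_b32 s" + std::to_string(C.Dst[L]) + ", s" +
                    std::to_string(C.Src[L]));
  }
  return Moves;
}
```

With the pattern quoted above (a copy of $sgpr28..31 into $sgpr4..7 followed by copies into $sgpr4 and $sgpr5), this sketch emits moves only for $sgpr6 and $sgpr7. The same scan would work whether it lives in copyPhysReg or in a separate late pass; the sketch takes no position on where it belongs.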
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D89187/new/
https://reviews.llvm.org/D89187