[PATCH] D142192: [AMDGPU] Run unmerge combines post regbankselect

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 9 06:03:45 PST 2023


arsenm added inline comments.


================
Comment at: llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:1767-1775
+    // This combine may run after RegBankSelect, so we need to be aware of
+    // register banks.
+    if (MRI.getRegBankOrNull(DstReg) != MRI.getRegBankOrNull(SrcReg)) {
+      SrcReg = Builder.buildCopy(MRI.getType(SrcReg), SrcReg).getReg(0);
+      if (const RegisterBank *DstRB = MRI.getRegBankOrNull(DstReg))
+        MRI.setRegBank(SrcReg, *DstRB);
+      if (const TargetRegisterClass *RC = MRI.getRegClassOrNull(SrcReg))
----------------
Pierre-vh wrote:
> arsenm wrote:
> > arsenm wrote:
> > > I'm pretty sure we have a helper for this already (at least the artifact combiner handles this already)
> > What I was thinking of was canReplaceReg. We should have another flavor that only accepts virtual registers and inserts the copy if needed
> It's 3 lines now, do we need to create a helper?
> If yes, what should we name it? Do we just create an overload of canReplaceReg that takes a Builder?
I guess not. However, I’m not sure how scalable this strategy will be as we add more regbank combines. I guess we can go with this for now and see how it works out 


================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:2050-2052
+  (V_BFI_B32_e64 (COPY_TO_REGCLASS VSrc_b32:$x, VGPR_32),
+                 (COPY_TO_REGCLASS VSrc_b32:$y, VGPR_32),
+                 (COPY_TO_REGCLASS VSrc_b32:$z, VGPR_32))
----------------
Pierre-vh wrote:
> arsenm wrote:
> > I think this needs to go off the a predicate. If we have to generate so many copies it's potentially worse than matching the pattern
> Do you mean they should be on the matching part (`DivergentBinFrag<...`) and not on the output pattern?
> For now this doesn't look like it really has cases where too many copies are emitted, I think SIFoldOperand will fold most of them.
> 
> I don't remember how we want to handle this in GISel. Do we want to be aware of constant-bus limitations at matching time, or ignore it/always respect it by greedily inserting copies and cleaning it up later (with SIFoldOperands and an eventual improved version)
> To me the second strategy looks simpler and just as powerful but I don't quite remember if we made a decision there
> 
> FWIW, I was thinking about adding a "Finalizer" method to GISel Pattern Matching to allow targets to call some C++ code before the pattern is applied/instructions are built so it can "veto" the pattern if needed and fail if it's non-profitable for instance. I didn't make an RFC yet but if it sounds like something that would be useful I can make one in the coming days/week.
It's partially an open question for how to handle the constant bus problem. The current strategy is supposed to be let regbankselect aggressively emit copies to VGPR up front so it's impossible to violate, which SIFoldOperands can clean up.

In the case of patterns, I think it would be worse if we had to manually write all of them out in C++ to handle them in SIFoldOperands. Selection patterns should be applying logic to avoid violating it. The finalizer sounds like the same as the current arbitrary code predicates?



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142192/new/

https://reviews.llvm.org/D142192



More information about the llvm-commits mailing list