[PATCH] D142192: [AMDGPU] Run unmerge combines post regbankselect

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 23 00:15:52 PST 2023


Pierre-vh added inline comments.


================
Comment at: llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:1767-1775
+    // This combine may run after RegBankSelect, so we need to be aware of
+    // register banks.
+    if (MRI.getRegBankOrNull(DstReg) != MRI.getRegBankOrNull(SrcReg)) {
+      SrcReg = Builder.buildCopy(MRI.getType(SrcReg), SrcReg).getReg(0);
+      if (const RegisterBank *DstRB = MRI.getRegBankOrNull(DstReg))
+        MRI.setRegBank(SrcReg, *DstRB);
+      if (const TargetRegisterClass *RC = MRI.getRegClassOrNull(SrcReg))
----------------
arsenm wrote:
> arsenm wrote:
> > I'm pretty sure we have a helper for this already (at least the artifact combiner handles this already)
> What I was thinking of was canReplaceReg. We should have another flavor that only accepts virtual registers and inserts the copy if needed
It's 3 lines now, do we need to create a helper?
If yes, what should we name it? Do we just create an overload of canReplaceReg that takes a Builder?


================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:2050-2052
+  (V_BFI_B32_e64 (COPY_TO_REGCLASS VSrc_b32:$x, VGPR_32),
+                 (COPY_TO_REGCLASS VSrc_b32:$y, VGPR_32),
+                 (COPY_TO_REGCLASS VSrc_b32:$z, VGPR_32))
----------------
arsenm wrote:
> I think this needs to go off the a predicate. If we have to generate so many copies it's potentially worse than matching the pattern
Do you mean they should be on the matching part (`DivergentBinFrag<...`) and not on the output pattern?
For now this doesn't look like it really has cases where too many copies are emitted, I think SIFoldOperand will fold most of them.

I don't remember how we want to handle this in GISel. Do we want to be aware of constant-bus limitations at matching time, or ignore it/always respect it by greedily inserting copies and cleaning it up later (with SIFoldOperands and an eventual improved version)
To me the second strategy looks simpler and just as powerful but I don't quite remember if we made a decision there

FWIW, I was thinking about adding a "Finalizer" method to GISel Pattern Matching to allow targets to call some C++ code before the pattern is applied/instructions are built so it can "veto" the pattern if needed and fail if it's non-profitable for instance. I didn't make an RFC yet but if it sounds like something that would be useful I can make one in the coming days/week.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142192/new/

https://reviews.llvm.org/D142192



More information about the llvm-commits mailing list