[PATCH] D144099: [AMDGPU] Fold more AGPR copies/PHIs in SIFoldOperands

Wed Feb 22 07:55:13 PST 2023

arsenm added a comment.

I just noticed that GCNPreRAOptimizations does try to do some fixups for the intermediate AGPR case; it may be worth looking into why that didn't work here.

================
Comment at: llvm/lib/Target/AMDGPU/SIFoldOperands.cpp:1697
+          getRegOpRC(*MRI, *TRI, Copy->getOperand(1));
+      if (ARC && ARC != CopyInRC)
+        return false;
----------------
Pierre-vh wrote:
> arsenm wrote:
> > Direct class equality checks are usually the wrong thing to do. Prefer something like isSubclassEq, or constrain to a compatible subclass. I don't think there's any practical difference in this case.
> Here I want to make sure the RC is identical because I can create new registers/copies of that RC. Should I still replace equality checks with isSubclassEq?
It doesn't really matter; I think this is only a theoretical problem.
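
For readers following the thread, here is a minimal sketch of the difference between the two kinds of check being discussed. It is a toy model, not LLVM code: ToyRegClass, ToyWide, ToyNarrow, acceptsStrict and acceptsSubclass are hypothetical names invented for illustration, and a register class is modeled simply as a set of register ids. The method name hasSubClassEq is borrowed from LLVM's TargetRegisterClass, but the body here is an assumption about the semantics (subset-or-equal), not the real implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for a register class: a name plus the set of register ids it
// contains. This is NOT llvm::TargetRegisterClass; it only models the
// "every register of Other is also in this class" relation.
struct ToyRegClass {
  std::string Name;
  std::set<unsigned> Regs;

  // True if Other is equal to this class or a subclass of it.
  // std::set iterates in sorted order, which std::includes requires.
  bool hasSubClassEq(const ToyRegClass &Other) const {
    return std::includes(Regs.begin(), Regs.end(),
                         Other.Regs.begin(), Other.Regs.end());
  }
};

// Strict check, in the spirit of `ARC != CopyInRC`: only the identical class
// object is accepted.
bool acceptsStrict(const ToyRegClass &Wanted, const ToyRegClass &Seen) {
  return &Wanted == &Seen;
}

// Relaxed check: any class whose registers all belong to Wanted is accepted.
bool acceptsSubclass(const ToyRegClass &Wanted, const ToyRegClass &Seen) {
  return Wanted.hasSubClassEq(Seen);
}

// Hypothetical classes: ToyNarrow is a strict subset of ToyWide.
static const ToyRegClass ToyWide{"ToyWide", {0, 1, 2, 3}};
static const ToyRegClass ToyNarrow{"ToyNarrow", {0, 1}};
```

With these definitions, acceptsStrict(ToyWide, ToyNarrow) is false while acceptsSubclass(ToyWide, ToyNarrow) is true, which is the gap the first comment points at. The strict form remains the simpler choice when new registers are going to be created with exactly that class, as the reply argues.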

================
Comment at: llvm/lib/Target/AMDGPU/SIFoldOperands.cpp:1839
+// these PHIs come from a broken-up large PHI (e.g. 32 AGPR phis, one per vector
+// element).
+bool SIFoldOperands::tryOptimizeAGPRPhis(MachineBasicBlock &MBB) {
----------------
Could this get an instruction example comment like the one above?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144099/new/

https://reviews.llvm.org/D144099