[PATCH] D144099: [AMDGPU] Fold more AGPR copies/PHIs in SIFoldOperands
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 20 12:53:40 PST 2023
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIFoldOperands.cpp:1697
+ getRegOpRC(*MRI, *TRI, Copy->getOperand(1));
+ if (ARC && ARC != CopyInRC)
+ return false;
----------------
Direct class equality checks are usually the wrong thing to do. Use something like isSubclassEq, or constrain to a compatible subclass. I don't think there's any practical difference in this case.
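A minimal sketch of why an equality check can be too strict. It does not use LLVM's types: ToyRegClass and the two helper functions are hypothetical stand-ins, and a register class is modelled as a plain set of register numbers. LLVM's TargetRegisterClass offers hasSubClassEq for this kind of query; the toy method below only mimics its meaning.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a register class: a named set of register numbers.
struct ToyRegClass {
  std::string Name;
  std::set<unsigned> Regs;

  // True if Other is this class or a subclass of it, i.e. every register
  // in Other is also a member of this class.
  bool hasSubClassEq(const ToyRegClass &Other) const {
    return std::includes(Regs.begin(), Regs.end(),
                         Other.Regs.begin(), Other.Regs.end());
  }
};

// Strict check: only the very same class object is accepted.
inline bool compatibleByEquality(const ToyRegClass *Wanted,
                                 const ToyRegClass *Actual) {
  return Wanted == Actual;
}

// Relaxed check: the same class or any more constrained subclass is accepted.
inline bool compatibleBySubClass(const ToyRegClass *Wanted,
                                 const ToyRegClass *Actual) {
  return Wanted->hasSubClassEq(*Actual);
}
```

With a wide class {0, 1, 2, 3} and a narrow class {0, 1}, the equality check rejects the narrow class where the wide one is wanted, while the subclass check accepts it. When only one candidate class can occur, the two checks agree, which matches the remark that there is probably no practical difference here.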
================
Comment at: llvm/lib/Target/AMDGPU/SIFoldOperands.cpp:1847
+ // Look at all AGPR Phis and collect the register + subregister used.
+ DenseMap<std::pair<Register, unsigned>, std::vector<MachineOperand *>>
+ RegToMO;
----------------
I don't see why you need to build this map/vector. You can just insert the instructions after all the PHIs as you process each one.
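A rough sketch of that shape, with no intermediate map. It is a toy model, not the patch: ToyInstr, ToyBlock and insertCopiesAfterPhis are hypothetical names, and a block is just a std::list with its PHIs at the front. In LLVM the insertion point would presumably come from MachineBasicBlock::getFirstNonPHI.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Hypothetical toy instruction: a PHI flag plus a printable description.
struct ToyInstr {
  bool IsPhi;
  std::string Text;
};

using ToyBlock = std::list<ToyInstr>;

// Walks the leading PHIs once and inserts one copy per PHI right after the
// PHI group. Returns the number of copies inserted.
inline unsigned insertCopiesAfterPhis(ToyBlock &BB) {
  // Find the first non-PHI instruction; all new copies go in front of it.
  ToyBlock::iterator InsertPt = BB.begin();
  while (InsertPt != BB.end() && InsertPt->IsPhi)
    ++InsertPt;

  unsigned NumInserted = 0;
  // Stop at the first non-PHI. After the first insertion that is one of the
  // new copies, so the loop never visits inserted instructions.
  for (ToyBlock::iterator I = BB.begin(); I != BB.end() && I->IsPhi; ++I) {
    // std::list::insert keeps existing iterators valid, so I and InsertPt
    // can still be used after the insertion.
    BB.insert(InsertPt, ToyInstr{false, "copy for " + I->Text});
    ++NumInserted;
  }
  return NumInserted;
}
```

For a block holding "phi a", "phi b", "add", this yields "phi a", "phi b", "copy for phi a", "copy for phi b", "add". The copies come out in PHI order because each one is inserted directly in front of the same insertion point.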
================
Comment at: llvm/lib/Target/AMDGPU/SIFoldOperands.cpp:1631
-// Try to hoist an AGPR to VGPR copy out of the loop across a LCSSA PHI.
+static Register tryFindExistingCopy(MachineRegisterInfo &MRI,
+ MachineInstr &Begin, Register FromReg,
----------------
Pierre-vh wrote:
> arsenm wrote:
> > If SIFoldOperands worked like PeepholeOpt, you would have a map of these already. I'd rather avoid a linear scan backwards for every copy
> Do you mean that it's fine to store these in a map, or that we can't do that here?
> I'd also like to avoid the backwards scan if possible
I mean I don't like the use iteration SIFoldOperands currently uses, and it would be better if it were rewritten to collect a map of known foldable defs as it walked. I don't think you can simply introduce such a map without redoing the whole pass structure.
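To make the "collect a map as it walks" idea concrete, here is a deliberately simplified sketch. It is not SIFoldOperands and not a proposal for how to restructure it: ToyCopy and reuseExistingCopies are hypothetical, and the model assumes a single straight-line block in SSA form, so a source register is never redefined between two copies of it.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical toy copy instruction: Dst = COPY Src.
struct ToyCopy {
  unsigned Dst;
  unsigned Src;
};

// One forward walk over the copies. For each copy, returns the register to
// use in its place: the destination of the first earlier copy of the same
// source if there is one, otherwise the copy's own destination.
inline std::vector<unsigned>
reuseExistingCopies(const std::vector<ToyCopy> &Copies) {
  // Src register -> Dst of the first copy of that source seen so far.
  std::unordered_map<unsigned, unsigned> FirstCopyOf;
  std::vector<unsigned> Result;
  Result.reserve(Copies.size());
  for (const ToyCopy &C : Copies) {
    // emplace does not overwrite an existing entry, so the first copy of
    // each source wins and later copies are mapped onto it.
    Result.push_back(FirstCopyOf.emplace(C.Src, C.Dst).first->second);
  }
  return Result;
}
```

Each lookup is a hash-map query instead of a backwards scan, so the work per copy no longer grows with the distance to the previous copy. The real pass also has to deal with subregisters, multiple blocks and dominance, which this sketch ignores.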
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144099/new/
https://reviews.llvm.org/D144099