[PATCH] D119475: [AMDGPU] Add scheduler pass to rematerialize trivial defs
Austin Kerbow via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 21 17:51:30 PST 2022
kerbowa added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:640
+ if (Stage == PreRARematerialize) {
+ if (RegionsWithMinOcc.count() != 1 || Regions.size() == 1)
+ break;
----------------
If we only try this with a single region can RegionsWithMinOcc be an Optional<int> or something? Seems like it would not be necessary to have a vector tracking all the regions that fit the criteria.
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:646
+ // Check maximum occupancy
+ if (ST.computeOccupancy(MF.getFunction(), MFI.getLDSSize()) ==
+ MinOccupancy)
----------------
Is this needed? If we are limited by the function max occupancy including LDS and wave-per-eu constraints then multiple regions will be at the MinOccupancy.
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:652
+ // Pressure for regions inbetween the defs and region we sinked the def
+ // to. Will need to be fixed if there is another pass after this pass.
+ static_assert(LastStage == PreRARematerialize,
----------------
Why is the PreRARematerialize pass different and doesn't need this info? Do we only reschedule the single region?
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:736
+
+ if (HighRPLiveIns.find(Reg) == HighRPLiveIns.end())
+ continue;
----------------
Can you add a comment as Stas suggests?
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:783
+ SmallVector<MachineInstr *, 4> InsertedMI;
+ DenseMap<MachineInstr *, MachineInstr *> InsertedMIToOldDef;
+
----------------
Can these be combined to avoid the extra lookup?
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:795
+ // Rematerialize MI to its use block. Since we are only rematerializing
+ // instructions that does not have any virtual reg uses, we do not need
+ // to call LiveRangeEdit::allUsesAvailableAt() and
----------------
TYPO: does->do
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:817
+
+ if (ImproveOccupancy <= MinOccupancy) {
+ // Occupancy is not improved. Undo sinking for the region
----------------
Is there a test for this case?
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:878
+ MachineBasicBlock::iterator InsertPos =
+ std::next(MachineBasicBlock::iterator(MI));
+ bool UseBlockFound = false;
----------------
I guess you could have a function to do this updating of RegionBegin/RegionEnd since it is repeated?
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.h:81
+ PreRARematerialize,
+ LastStage = PreRARematerialize
};
----------------
I feel like we need to refactor the way we implement scheduler passes now that there are so many of them and the logic is getting a bit complex. It's a separate thing though.
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.h:123
+ // List of trivially rematerializable instructions we can remat to reduce RP
+ SmallVector<std::pair<MachineInstr *, MachineInstr *>> RematerializableInsts;
----------------
Can you update the comment so that it is clear these are pairs of defs/uses.
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.h:138
+
+ // Attempt to reduce RP of VGPR by sinking trivially rematerializable
+ // instructions. Returns true if we were able to sink instruction(s).
----------------
Why only VGPRs? Could this be extended to handle SGPRs and AGPRs.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119475/new/
https://reviews.llvm.org/D119475
More information about the llvm-commits
mailing list