[llvm] [AMDGPU] NFC: Add BBLiveOutMap & LiveOut Cache (PR #93089)

Fri Jun 14 12:19:39 PDT 2024

================
@@ -163,13 +163,43 @@ inline raw_ostream &operator<<(raw_ostream &OS, const ScheduleMetrics &Sm) {
   return OS;
 }
 
+class GCNScheduleDAGMILive;
+class RegionPressureMap {
+  GCNScheduleDAGMILive *DAG;
+  // The live in/out pressure as indexed by the first or last MI in the region
+  // before scheduling.
+  DenseMap<MachineInstr *, GCNRPTracker::LiveRegSet> BBLiveRegMap;
----------------
jrbyrnes wrote:

Yes, this adds an instance of DenseMap<MachineInstr *, GCNRPTracker::LiveRegSet>. However, this PR does not introduce that extra level above GCNRPTracker::LiveRegSet; it already exists in our code at https://github.com/llvm/llvm-project/blob/1af1c9fb98e5c99ce2aa3a9af8ede489ea85c745/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h#L214 , so I'm not sure this particular class unduly complicates our tracking. Also, the cache is used to avoid recalculating the liveouts across stages. Building the map obviously has some cost, but I assume it makes things less expensive overall.
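
To illustrate the caching idea only (this is not the PR's code), here is a minimal standalone sketch. `Instr`, `LiveSet`, and `LiveOutCache` are hypothetical stand-ins for MachineInstr, GCNRPTracker::LiveRegSet, and the map in this patch; the sketch assumes the live set at a region boundary is expensive to compute and can be keyed by a pointer to the boundary instruction.

```cpp
#include <functional>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins; none of these are LLVM types.
struct Instr {
  std::string Name;
};
using LiveSet = std::map<unsigned, unsigned>; // register id -> lane mask

// Memoizes the live set at a region boundary, keyed by the boundary
// instruction's address, so later stages can reuse the result.
class LiveOutCache {
  std::unordered_map<const Instr *, LiveSet> Cache;
  std::function<LiveSet(const Instr &)> Compute;
  unsigned Misses = 0;

public:
  explicit LiveOutCache(std::function<LiveSet(const Instr &)> Fn)
      : Compute(std::move(Fn)) {}

  // Returns the cached set, computing it only on the first request.
  const LiveSet &get(const Instr &Boundary) {
    auto It = Cache.find(&Boundary);
    if (It == Cache.end()) {
      ++Misses;
      It = Cache.emplace(&Boundary, Compute(Boundary)).first;
    }
    return It->second;
  }

  // Drops a stale entry, e.g. after the boundary's liveness has changed.
  void invalidate(const Instr &Boundary) { Cache.erase(&Boundary); }

  unsigned misses() const { return Misses; }
};
```

The cost trade-off is visible here: each entry costs one computation plus storage, and every later lookup for the same boundary instruction is a hash lookup instead of a recomputation.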

Perhaps more importantly, I don't think there is an alternative to this other than deleting the cache. Implementing a LIS->getLiveRegsAt(SlotIndex) with cached results would probably be counterproductive, because the cached results would need maintenance on each call to SlotIndex insertMachineInstrInMaps during scheduling. We could instead introduce and cache GCNPressureDiffs (rather than the new trackers), which would use our more accurate RP costs to encode the PDiff. But that is ultimately just a different way to build the cache, and I'm not sure it is worth a complete redesign, especially since there may be issues with inconsistent state between the tracker and the PDiffs.

https://github.com/llvm/llvm-project/pull/93089