[PATCH] D136676: [AMDGPU] Speedup SIFormMemoryClauses live-in register set calculation

Valery Pykhtin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 25 14:37:43 PDT 2022


vpykhtin added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp:280
 
-  for (MachineBasicBlock &MBB : MF) {
-    GCNDownwardRPTracker RPT(*LIS);
+  SmallVector<MachineInstr *, 16> FirstBBClauseMI;
+  for (auto &MBB : MF) {
----------------
arsenm wrote:
> vpykhtin wrote:
> > arsenm wrote:
> > > You seem to be assuming a single clause per block. I'd expect to handle this a full clause in a time, within a single block.
> > Not quite, I just compute the live-in set for the first clause per BB to reset the RPTracker, then it is advanced to the next clause.
> Can you do this per block, instead of calculating this for every block?
That is the whole point of doing it all at once:

1. The slot indexes of the first instruction of every clause are collected and sorted.
2. For every virtual register's LiveRange we then have two sorted sequences: Segments and SlotIndexes. We need to determine
which of the SlotIndexes fall into the Segments of the virtual register; the register is live at exactly those SlotIndexes.
3. Since both sequences are sorted, we advance through them with a two-way binary search: we look up either the next SlotIndex contained in the current Segment, or the next Segment containing the current SlotIndex.
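
To illustrate the three steps above, here is a minimal standalone sketch of the two-way search. It is not the code from the patch: the Segment struct, the plain int slot indexes, and the function name indexesInsideSegments are hypothetical stand-ins, and segments are assumed to be sorted, non-overlapping, half-open intervals [start, end).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a live-range segment: half-open [start, end).
struct Segment {
  int start;
  int end;
};

// Returns the elements of the sorted `indexes` that lie inside any of the
// sorted, non-overlapping `segments`.
std::vector<int> indexesInsideSegments(const std::vector<Segment> &segments,
                                       const std::vector<int> &indexes) {
  assert(std::is_sorted(indexes.begin(), indexes.end()));
  std::vector<int> live;
  auto idx = indexes.begin();
  auto seg = segments.begin();
  while (idx != indexes.end() && seg != segments.end()) {
    // Binary search over segments: first segment that ends above *idx.
    seg = std::upper_bound(seg, segments.end(), *idx,
                           [](int v, const Segment &s) { return v < s.end; });
    if (seg == segments.end())
      break;
    // Binary search over indexes: the run of indexes inside [start, end).
    auto first = std::lower_bound(idx, indexes.end(), seg->start);
    auto last = std::lower_bound(first, indexes.end(), seg->end);
    live.insert(live.end(), first, last);
    // Everything before `last` is handled; move on to the next segment.
    idx = last;
    ++seg;
  }
  return live;
}
```

Each loop iteration consumes at least one segment and performs a constant number of binary searches, so whichever sequence is shorter bounds the number of iterations in this sketch.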

I now realize the complexity is not what I thought before. Per register, it should be:

O( min( NumSegments * lg(NumSlotIndexes), NumSlotIndexes * lg(NumSegments) ) )

See getLiveRegMap and LiveRange::findIndexesLiveAt.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136676/new/

https://reviews.llvm.org/D136676
