[PATCH] D136676: [AMDGPU] Speedup SIFormMemoryClauses live-in register set calculation
Valery Pykhtin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 25 14:37:43 PDT 2022
vpykhtin added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp:280
- for (MachineBasicBlock &MBB : MF) {
- GCNDownwardRPTracker RPT(*LIS);
+ SmallVector<MachineInstr *, 16> FirstBBClauseMI;
+ for (auto &MBB : MF) {
----------------
arsenm wrote:
> vpykhtin wrote:
> > arsenm wrote:
> > > You seem to be assuming a single clause per block. I'd expect this to handle a full clause at a time, within a single block.
> > Not quite, I just compute the live-in set for the first clause per BB to reset the RPTracker, then it is advanced to the next clause.
> Can you do this per block, instead of calculating this for every block?
Doing it for all blocks at once is the whole point:
1. The slot indexes of the first instructions of all clauses are collected and sorted.
2. For every virtual register's LiveRange we then have two sorted sequences: its Segments and the collected SlotIndexes. We need to determine which of the SlotIndexes fall into the Segments of the virtual register; the register is live at exactly those SlotIndexes.
3. Since both sequences are sorted, we advance through them with a two-way binary search: either we search for the next SlotIndex contained in the current Segment, or for the next Segment containing the current SlotIndex.
I now realize the complexity is not what I thought before; per register it should be:
O( min( NumSegments * lg(NumSlotIndexes), NumSlotIndexes * lg(NumSegments) ) )
See getLiveRegMap, LiveRange::findIndexesLiveAt.
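
To illustrate the two-way search described in the steps above, here is a minimal standalone sketch. It is not the code from the patch or from LiveRange::findIndexesLiveAt: the Segment struct, the use of plain unsigned values as slot indexes, the half-open [Start, End) convention, and the function name indexesInsideSegments are all assumptions made only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a live range segment: half-open [Start, End).
struct Segment {
  unsigned Start;
  unsigned End;
};

// Returns the entries of Idxs that fall inside some segment of Segs.
// Both inputs must be sorted, and the segments must not overlap.
std::vector<unsigned> indexesInsideSegments(const std::vector<Segment> &Segs,
                                            const std::vector<unsigned> &Idxs) {
  assert(std::is_sorted(Idxs.begin(), Idxs.end()));
  assert(std::is_sorted(Segs.begin(), Segs.end(),
                        [](const Segment &A, const Segment &B) {
                          return A.Start < B.Start;
                        }));

  std::vector<unsigned> Live;
  auto SegI = Segs.begin(), SegE = Segs.end();
  auto IdxI = Idxs.begin(), IdxE = Idxs.end();
  while (SegI != SegE && IdxI != IdxE) {
    // Direction 1: binary search for the first segment ending after *IdxI.
    SegI = std::upper_bound(
        SegI, SegE, *IdxI,
        [](unsigned Idx, const Segment &S) { return Idx < S.End; });
    if (SegI == SegE)
      break;
    // Direction 2: binary search for the first index at or after the
    // start of that segment.
    IdxI = std::lower_bound(IdxI, IdxE, SegI->Start);
    // Every index before the segment's end is covered by it.
    for (; IdxI != IdxE && *IdxI < SegI->End; ++IdxI)
      Live.push_back(*IdxI);
    ++SegI;
  }
  return Live;
}
```

For example, with segments [0,4), [10,12), [20,30) and indexes 2, 4, 5, 11, 12, 25, 40, the function returns 2, 11, 25. Each loop iteration skips a whole run of segments or indexes with one binary search, which is where the min(...) in the bound above comes from.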
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136676/new/
https://reviews.llvm.org/D136676