[PATCH] D144907: [RegAllocFast] insert additional spills along indirect edges of INLINEASM_BR

Wed Mar 1 14:11:52 PST 2023

nickdesaulniers added a subscriber: dblaikie.
nickdesaulniers added inline comments.

================
Comment at: llvm/lib/CodeGen/RegAllocFast.cpp:952-963
+      if (MI.getOpcode() == TargetOpcode::INLINEASM_BR) {
+        int FI = StackSlotForVirtReg[VirtReg];
+        const TargetRegisterClass &RC = *MRI->getRegClass(VirtReg);
+        for (MachineBasicBlock *Succ : MI.getParent()->successors()) {
+          if (Succ->isInlineAsmBrIndirectTarget()) {
+            TII->storeRegToStackSlot(*Succ, Succ->begin(), PhysReg, Kill,
+                FI, &RC, TRI, VirtReg);
----------------
nickdesaulniers wrote:
> MatzeB wrote:
> > MatzeB wrote:
> > > nickdesaulniers wrote:
> > > > nickdesaulniers wrote:
> > > > > nickdesaulniers wrote:
> > > > > > nickdesaulniers wrote:
> > > > > > > MatzeB wrote:
> > > > > > > > nickdesaulniers wrote:
> > > > > > > > > MatzeB wrote:
> > > > > > > > > > MatzeB wrote:
> > > > > > > > > > > MatzeB wrote:
> > > > > > > > > > > > May be good to add a comment here to remind readers that we have the really unusual behavior of possibly jumping in the middle of a basic block.
> > > > > > > > > > > Would it be possible to move this code block into the `spill()` function. To me it feels like it's just part of the spilling process. You could add a new parameter to pass along `MI` or a `bool MayJump`...
> > > > > > > > > > or given there is only this use here, rename to `spillAfter()`, add the `MI` parameter and drop the `SpillBefore` parameter instead. And then just always compute `SpillBefore` within the function based on MI...
> > > > > > > > > This works:
> > > > > > > > > ```
> > > > > > > > > diff --git a/llvm/lib/CodeGen/RegAllocFast.cpp b/llvm/lib/CodeGen/RegAllocFast.cpp
> > > > > > > > > index bf99b4b4a0ea..b2ef32a2328c 100644
> > > > > > > > > --- a/llvm/lib/CodeGen/RegAllocFast.cpp
> > > > > > > > > +++ b/llvm/lib/CodeGen/RegAllocFast.cpp
> > > > > > > > > @@ -435,6 +435,7 @@ void RegAllocFast::spill(MachineBasicBlock::iterator Before, Register VirtReg,
> > > > > > > > >    LLVM_DEBUG(dbgs() << " to stack slot #" << FI << '\n');
> > > > > > > > >  
> > > > > > > > >    const TargetRegisterClass &RC = *MRI->getRegClass(VirtReg);
> > > > > > > > > +  MachineBasicBlock *MBB = Before->getParent();// shadow MBB member.
> > > > > > > > >    TII->storeRegToStackSlot(*MBB, Before, AssignedReg, Kill, FI, &RC, TRI,
> > > > > > > > >                             VirtReg);
> > > > > > > > >    ++NumStores;
> > > > > > > > > @@ -954,9 +955,7 @@ void RegAllocFast::defineVirtReg(MachineInstr &MI, unsigned OpNum,
> > > > > > > > >          const TargetRegisterClass &RC = *MRI->getRegClass(VirtReg);
> > > > > > > > >          for (MachineBasicBlock *Succ : MI.getParent()->successors()) {
> > > > > > > > >            if (Succ->isInlineAsmBrIndirectTarget()) {
> > > > > > > > > -            TII->storeRegToStackSlot(*Succ, Succ->begin(), PhysReg, Kill,
> > > > > > > > > -                FI, &RC, TRI, VirtReg);
> > > > > > > > > -            ++NumStores;
> > > > > > > > > +            spill(Succ->begin(), VirtReg, PhysReg, Kill, LRI->LiveOut);
> > > > > > > > >              Succ->addLiveIn(PhysReg);
> > > > > > > > >            }
> > > > > > > > >          }
> > > > > > > > > ```
> > > > > > > > Calling into `spill()` instead of directly using the `storeRegToStackSlot` callback seems sensible. 
> > > > > > > >  Still logically it seems somewhat unfortunate to have part of the spilling logic here and not in the `spill()` function...
> > > > > > > We want to repeat the entirety of `spill` but for multiple destinations when encountering an `INLINEASM_BR`.
> > > > > > > 
> > > > > > > I could outline the body of `RegAllocFast::spill` into a new method (`RegAllocFast::spillTo` or something), and have `RegAllocFast::spill` check for the special case of `INLINEASM_BR`, calling `RegAllocFast::spillTo` 1-to-many times.
> > > > > > > 
> > > > > > > Perhaps it's clearer that `RegAllocFast::defineVirtReg` calls `RegAllocFast::spill` multiple times, rather than one call to `RegAllocFast::spill` resulting in 1-to-many spills?
> > > > > > Actually, I might be able to make `RegAllocFast::spill` recursive...let me give that a shot.
> > > > > > Calling into `spill()` instead of directly using the `storeRegToStackSlot` callback seems sensible. 
> > > > > 
> > > > > Looking at how ` LiveDbgValueMap` is used then cleared in `RegAllocFast::spill`, I'm not sure that calling `spill` directly is actually better than the current diff (500885)
> > > > > 
> > > > > Actually, I might be able to make `RegAllocFast::spill` recursive...let me give that a shot.
> > > > 
> > > > NVM
> > > - It's not that big of a deal to keep it as two spill() calls here if that works better.
> > > - The thing that did look useful to me was the placement of the extra `DBG_VALUE` when the value is live out. But I just realized that is not even that easy as the `LiveOut` is only compute for the current block, not the block where the new spill is placed in.
> > > 
> > > So maybe it would be best to go back to a direct `storeRegToStackSlot` call and then manually construct a `DBG_VALUE` immediate behind it to mark the value moving from register to stack?
> > Sorry for the back and forth. After thinking some more about this, I haven't looked at fastregalloc in a while. Actually I think we likely can skip the `DBG_VALUE` as that is only necessary to mark transitions between register and stack. However as soon as we spill a value once we point all debug info in other blocks to the stack slot as far as I can tell. Meaning we shouldn't need to specially mark the transition for the extra spill.
> If I regen the mir test from [the initial report](https://github.com/llvm/llvm-project/issues/60855#issue-1590531864) with debug info (`clang x.c -g -S -o - -fno-asynchronous-unwind-tables -mllvm -stop-before=regallocfast > regallocfast.g.mir`), this is the MIR I get from running regallocfast (`llc -run-pass=regallocfast regallocfast.g.mir -o - -verify-machineinstrs`): https://gist.github.com/nickdesaulniers/a6c234b56218bbcb21aabbd5521d9615
> 
> It looks like regardless of my change, the spills in the same block as the INLINEASM_BR are already missing debug-locations? (This is my first time looking at debug info in MIR).
@MatzeB it looks like even with optimizations enabled we don't have DBG_VALUEs produced in the MIR stream.  I spoke with @dblaikie about this briefly, but we couldn't spot what you may have been referring to.

Do you still have concerns about this code? If not, it's probably ready for review.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144907/new/

https://reviews.llvm.org/D144907