[PATCH] D106042: [AMDGPU] Ignore KILLs when forming clauses

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 19 19:15:13 PDT 2021


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp:300
+      if (isVerbose())
+        OutStreamer->emitRawComment(" meta instruction");
+      return;
----------------
sebastian-ne wrote:
> sebastian-ne wrote:
> > arsenm wrote:
> > > sebastian-ne wrote:
> > > > arsenm wrote:
> > > > > sebastian-ne wrote:
> > > > > > arsenm wrote:
> > > > > > > I don't see a reason to emit a comment for these. Kills for example already have a comment for them
> > > > > > The problem is that the AsmPrinter only handles top-level instructions, but not instructions inside bundles.
> > > > > > So, without the code here, printing a bundle containing a KILL will crash because KILL and others are unhandled in `AMDGPUInstPrinter::printInstruction`.
> > > > > We shouldn't be bundling these?
> > > > How do we create the clause then?
> > > > In the tests, we have
> > > > ```
> > > >     $sgpr2 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 0, 0
> > > >     KILL undef renamable $sgpr4
> > > >     $sgpr3 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 4, 0
> > > > ```
> > > > and we should be able to emit an `s_clause 0x1` in front of the two memory instructions.
> > > Where did this kill come from? A kill of an undef source is pointless and can be deleted. The point of kill is to artificially extend live ranges anyway, so it can be moved outside of the bundle. Kills are ordinarily deleted after RA anyway.
> > The case where I encountered it looks like this:
> > ```
> > %470:sgpr_128 = S_LOAD_DWORDX4_IMM %469:sreg_64, 48, 0 :: (load (s128) from %ir.73, addrspace 4)
> > %25:sgpr_128 = S_LOAD_DWORDX4_IMM %3:sreg_64, 320, 0 :: (load (s128) from %ir.67, addrspace 4)
> > KILL %469.sub0:sreg_64, %469.sub1:sreg_64
> > %26:sgpr_128 = S_LOAD_DWORDX4_IMM %3:sreg_64, 480, 0 :: (load (s128) from %ir.70, addrspace 4)
> > ```
> > 
> > The kill is inserted by the SIFormMemoryClauses pass.
> > In the above example, if flat scratch is enabled, the kill gets inserted after the third s_load, so a clause is inserted successfully later. Without flat scratch, no clause is formed.
> Maybe SIFormMemoryClauses does not extend the potential clause to the last s_load because the register pressure estimate would get too high?
> Therefore it inserts the KILL after the second s_load, which in turn prevents inserting a hard clause containing the latter two loads.
> 
> This would explain why the position of the KILL changes when flat scratch is enabled or disabled (enabling it frees the 4 SGPRs of the scratch SRD, so it has lower register pressure).
SIFormMemoryClauses's pressure estimate is extremely crude, so I wouldn't put much faith in it. (I think the pass needs a replacement overall, and the allocator really needs to be aware of the restriction.) The kill is there as a hint to the allocator, but it doesn't need to stick around afterwards.
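
For reference, the behaviour under discussion can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in this patch: `Kind`, `Inst`, `Clause`, and `planClauses` are hypothetical names, the instructions are plain structs rather than LLVM `MachineInstr`s, and real constraints (maximum clause length, matching instruction types, register pressure) are not modelled. It only shows how a KILL between two loads either breaks the run or is skipped, using the `s_clause 0x1` encoding for two loads quoted above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction kinds; these are not LLVM types.
enum class Kind { SMemLoad, Kill, Other };

struct Inst {
  Kind kind;
  std::string text;
};

// One planned clause: index of its first load and how many loads it covers.
struct Clause {
  std::size_t first;
  unsigned numLoads;
  // Operand as quoted in the thread: two loads give s_clause 0x1.
  unsigned operand() const { return numLoads - 1; }
};

// Walk the instruction list and group consecutive loads into clauses.
// With ignoreKills set, a KILL does not end the current run of loads.
// Runs of fewer than two loads produce no clause.
std::vector<Clause> planClauses(const std::vector<Inst> &insts,
                                bool ignoreKills) {
  std::vector<Clause> clauses;
  bool open = false;
  Clause cur{0, 0};

  auto flush = [&] {
    if (open && cur.numLoads >= 2)
      clauses.push_back(cur);
    open = false;
    cur = Clause{0, 0};
  };

  for (std::size_t i = 0; i < insts.size(); ++i) {
    const Inst &inst = insts[i];
    if (inst.kind == Kind::SMemLoad) {
      if (!open) {
        open = true;
        cur = Clause{i, 0};
      }
      ++cur.numLoads;
    } else if (inst.kind == Kind::Kill && ignoreKills) {
      continue; // meta instruction: skipped, the run stays open
    } else {
      flush(); // anything else ends the current run
    }
  }
  flush();
  return clauses;
}
```

With the three-instruction test sequence quoted above (load, KILL, load), this model plans one clause with operand 1 when KILLs are ignored and no clause when they are not.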


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106042/new/

https://reviews.llvm.org/D106042


