[PATCH] D106042: [AMDGPU] Ignore KILLs when forming clauses
Sebastian Neubauer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 27 02:01:33 PDT 2021
sebastian-ne added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp:300
+ if (isVerbose())
+ OutStreamer->emitRawComment(" meta instruction");
+ return;
----------------
sebastian-ne wrote:
> arsenm wrote:
> > sebastian-ne wrote:
> > > sebastian-ne wrote:
> > > > arsenm wrote:
> > > > > sebastian-ne wrote:
> > > > > > arsenm wrote:
> > > > > > > sebastian-ne wrote:
> > > > > > > > arsenm wrote:
> > > > > > > > > I don't see a reason to emit a comment for these. Kills for example already have a comment for them
> > > > > > > > The problem is that the AsmPrinter only handles top-level instructions, but not instructions inside bundles.
> > > > > > > > So, without the code here, printing a bundle containing a KILL will crash because KILL and others are unhandled in `AMDGPUInstPrinter::printInstruction`.
> > > > > > > We shouldn't be bundling these?
> > > > > > How do we create the clause then?
> > > > > > In the tests, we have
> > > > > > ```
> > > > > > $sgpr2 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 0, 0
> > > > > > KILL undef renamable $sgpr4
> > > > > > $sgpr3 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 4, 0
> > > > > > ```
> > > > > > and we should be able to emit an `s_clause 0x1` in front of the two memory instructions.
> > > > > Where did this kill come from? A kill of an undef source is pointless and can be deleted. The point of kill is to artificially extend live ranges anyway, so it can be moved outside of the bundle. Kills are ordinarily deleted after RA anyway.
> > > > The case where I encountered it looks like this:
> > > > ```
> > > > %470:sgpr_128 = S_LOAD_DWORDX4_IMM %469:sreg_64, 48, 0 :: (load (s128) from %ir.73, addrspace 4)
> > > > %25:sgpr_128 = S_LOAD_DWORDX4_IMM %3:sreg_64, 320, 0 :: (load (s128) from %ir.67, addrspace 4)
> > > > KILL %469.sub0:sreg_64, %469.sub1:sreg_64
> > > > %26:sgpr_128 = S_LOAD_DWORDX4_IMM %3:sreg_64, 480, 0 :: (load (s128) from %ir.70, addrspace 4)
> > > > ```
> > > >
> > > > The kill is inserted by the SIFormMemoryClauses pass.
> > > > In the above example, if flat scratch is enabled, the kill gets inserted after the third s_load, so a clause is inserted successfully later. Without flat scratch, no clause is formed.
> > > Maybe SIFormMemoryClauses does not extend the potential clause to the last s_load because the register pressure estimate would get too high?
> > > Therefore it inserts the KILL after the second s_load, which in turn prevents inserting a hard clause containing the latter two loads.
> > >
> > > That would explain why the position of the KILL changes when flat scratch is enabled or disabled (enabling it frees the 4 SGPRs of the scratch SRD, so it has lower register pressure).
> > SIFormMemoryClauses's pressure estimate is extremely crude, so I wouldn't put so much faith in it (I think the pass needs a replacement overall, and the allocator really needs to be aware of the restriction). The kill is there as a hint to the allocator, but doesn't need to stick around afterwards.
> Ok, should we remove the kill instructions if we encounter them in SIInsertHardClauses?
> But that doesn’t help for other meta instructions like DBG instructions.
I guess this code is fine then, as we shouldn’t handle bundles in llvm’s generic code, so we need to handle everything in the targets.
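To make the clause-forming idea discussed above concrete, here is a minimal, self-contained sketch. It is a toy model, not the code from this patch or from SIInsertHardClauses: `Kind`, `Inst`, and `insertClauses` are hypothetical names, the instruction strings are placeholders, and any hardware limit on clause length is ignored. It only illustrates skipping meta instructions such as KILL when counting the memory instructions of a clause, with the `s_clause` immediate taken as the number of clause instructions minus one (matching the `s_clause 0x1` for two loads mentioned earlier).

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction; NOT an LLVM type (assumption).
enum class Kind { Load, Meta, Other };

struct Inst {
  Kind kind;
  std::string text; // placeholder assembly text
};

// Hypothetical helper: copies the instruction texts and puts an
// "s_clause 0x<n-1>" line in front of every run of n >= 2 loads.
// Meta instructions (e.g. KILL) inside a run neither count toward
// the clause length nor end the run; any other instruction ends it.
std::vector<std::string> insertClauses(const std::vector<Inst> &insts) {
  std::vector<std::string> out;
  std::size_t i = 0;
  while (i < insts.size()) {
    if (insts[i].kind != Kind::Load) {
      out.push_back(insts[i].text);
      ++i;
      continue;
    }
    // Scan forward from the first load of a potential clause.
    std::size_t loads = 0;
    std::size_t end = i; // one past the last load of the run
    for (std::size_t j = i; j < insts.size(); ++j) {
      if (insts[j].kind == Kind::Load) {
        ++loads;
        end = j + 1;
      } else if (insts[j].kind != Kind::Meta) {
        break; // a real non-memory instruction ends the run
      }
    }
    assert(loads >= 1 && "the scan starts at a load");
    if (loads >= 2) {
      std::ostringstream os;
      os << "s_clause 0x" << std::hex << (loads - 1);
      out.push_back(os.str());
    }
    // Emit the run, including any meta instructions between its loads.
    for (; i < end; ++i)
      out.push_back(insts[i].text);
  }
  return out;
}
```

In this model, a load, a KILL, and another load yield one `s_clause 0x1` in front of the first load, while a non-meta instruction between the two loads yields no clause at all.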
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D106042/new/
https://reviews.llvm.org/D106042