[PATCH] D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask.

Michael Liao via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 29 10:31:00 PDT 2021


hliao added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIAvoidZeroExecMask.cpp:220
+  RS->enterBasicBlockEnd(MBB);
+  unsigned Scav = RS->scavengeRegisterBackwards(
+      AMDGPU::SReg_64RegClass, MachineBasicBlock::iterator(Cloned), false, 0);
----------------
rampitec wrote:
> hliao wrote:
> > rampitec wrote:
> > > hliao wrote:
> > > > foad wrote:
> > > > > What happens if there are no free sgprs so the scavenger has to spill something? That sounds like yet another case that won't work correctly when exec is zero.
> > > > Yeah, that's true. If we had a null-like reg pair, we could also skip the search for that tmp SGPR. We have a null reg definition, but we need an SGPR pair for the WAVE64 case. As we only duplicate that evaluation to update SCC, a null reg-pair would be sufficient.
> > > NULL works for 64 bit pair too. Although it is not available on every target.
> > Could you elaborate more? SGPR_NULL is currently defined as a 32-bit SGPR @ offset 125. For that 64-bit SGPR pair, we need a pair at an *even* offset based on the ISA document.
> It is not a real register, it is just a way to encode 0. It is even free in terms of the constant bus usage. It can be used as 64 bit too:
> 
> ```
> llvm-mc -arch=amdgcn -mcpu=gfx1010 -show-encoding <<< 's_mov_b64 s[0:1], null'
>         .text
>         s_mov_b64 s[0:1], null                  ; encoding: [0x7d,0x04,0x80,0xbe]
> ```
> Not sure if you would need to fix something in the verifier.
> 
> But again, this is not a universal solution, it is gfx10 only.

That explains why I could not find that usage in the Vega ISA document. From the scalar operand encoding map, it seems we have several reserved slots, such as 209-234.
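
To make the operand encoding concrete, here is a minimal sketch that splits the four bytes from the llvm-mc output quoted above into fields. It assumes the usual SOP1 layout (bits [31:23] fixed encoding, [22:16] SDST, [15:8] OP, [7:0] SSRC0), which comes from the ISA documentation and not from this thread. The names `SOP1Fields`, `wordFromBytes`, `decodeSOP1`, `readsNull` and `kNullOperand` are hypothetical helpers for illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Fields of a 32-bit SOP1 instruction word.
// Assumed layout: [31:23] fixed encoding, [22:16] SDST, [15:8] OP, [7:0] SSRC0.
struct SOP1Fields {
  uint32_t encoding;
  uint32_t sdst;
  uint32_t op;
  uint32_t ssrc0;
};

// Assemble the little-endian bytes printed by llvm-mc into one word.
constexpr uint32_t wordFromBytes(uint8_t b0, uint8_t b1, uint8_t b2,
                                 uint8_t b3) {
  return uint32_t(b0) | (uint32_t(b1) << 8) | (uint32_t(b2) << 16) |
         (uint32_t(b3) << 24);
}

// Split a SOP1 word into its fields under the assumed layout.
constexpr SOP1Fields decodeSOP1(uint32_t word) {
  return SOP1Fields{word >> 23, (word >> 16) & 0x7f, (word >> 8) & 0xff,
                    word & 0xff};
}

// Scalar operand slot 125, the offset mentioned for SGPR_NULL in this thread.
constexpr uint32_t kNullOperand = 125;

// True when the source operand field selects the null slot.
constexpr bool readsNull(uint32_t word) {
  return decodeSOP1(word).ssrc0 == kNullOperand;
}

// Runtime check against the bytes from the quoted assembler output.
inline void checkQuotedExample() {
  const uint32_t word = wordFromBytes(0x7d, 0x04, 0x80, 0xbe);
  assert(readsNull(word));             // SSRC0 is 0x7d, i.e. slot 125
  assert(decodeSOP1(word).sdst == 0);  // destination starts at s0
}
```

Under that layout, the source field of the quoted `s_mov_b64` is plain operand slot 125, which fits the point that null is an operand encoding and so is not bound by the even-offset rule for real SGPR pairs.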




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99507/new/

https://reviews.llvm.org/D99507


