[PATCH] D63225: AMDGPU: Fold readlane from copy of SGPR or imm

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 12 14:30:04 PDT 2019


rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIFoldOperands.cpp:525
+      // %vgpr = V_MOV_B32 imm
+      // %sgpr = V_READFIRSTLANE_B32 %vgpr
+      // =>
----------------
arsenm wrote:
> rampitec wrote:
> > arsenm wrote:
> > > rampitec wrote:
> > > > arsenm wrote:
> > > > > rampitec wrote:
> > > > > > I do not think it is correct. If first lane is off you will fold wrong value. In both cases of readfirstlane and readlane you need to check a specific lane, which I am not sure even possible.
> > > > > readlane ignores exec. readfirstlane only reads exec to find the first lane, and still returns a result when  exec is 0. The source value is a constant, so it doesn't matter what the lane is
> > > > Readlane ignores exec, but v_mov_b32 does not. So you may have an old value of vgpr instead of immediate.
> > > > For readfirstlane you will get wrong value on the sgpr if exec=0 I believe.
> > > It would matter if there was an exec def between the copy and the use, which I want to ban within a block
> > But it is not banned yet, right? Are we sure that only happens within a block? Why block cannot be executed with a zero exec? And then at the very least we need to be sure register is virtual.
> We force execz skip branches around blocks that have readlane/readfirstlane in them. exec = 0 should also be illegal on function entry
I thought about it. From the ISA perspective it is illegal if you are reading an inactive lane. This can be legal here on the grounds that register is virtual and we are in SSA. Are both conditions true?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63225/new/

https://reviews.llvm.org/D63225





More information about the llvm-commits mailing list