[LLVMdev] Predicated Vector Operations

Wed May 8 19:38:00 PDT 2013

On Wed, May 8, 2013 at 11:07 AM,  <dag at cray.com> wrote:
>> The issue becomes how to match the instruction form above in a
>> TableGen pattern. In order for this to emit a masked instruction,
>> %newvalue and %oldvalue must be assigned to same physical register (I'm
>> assuming an instruction like 'add %r0{%m0} %r1 %r2') However, I don't
>> think there is even a notion of physical registers at the point that
>> instruction selection is performed and the virtual registers will be
>> different because everything is still in SSA form.
>
> Potentially you could use the "$src = $dst" constraint as in the
> two-address x86 forms.  I don't know that TableGen has been generalized
> enough to do this, though.  I think it's pretty highly specialized to
> specific x86 two-address forms at the moment.

I'm not familiar with the two-address constraint, but it seems like
this would be challenging to express generally.  Consider the previous
example:

%tx = select %mask, %x, <0.0, 0.0, 0.0 ...>
%ty = select %mask, %y, <0.0, 0.0, 0.0 ...>
%sum = fadd %tx, %ty
%newvalue = select %mask, %sum, %oldvalue
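
To make the intended per-lane behavior concrete, here is a rough Python
model of that sequence. The helper names (select, fadd, masked_add) are
made up for illustration and are not LLVM APIs; vectors are modeled as
plain lists.

```python
def select(mask, a, b):
    """Lane-wise select: take a[i] where mask[i] is set, else b[i]."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def fadd(a, b):
    """Lane-wise floating-point add."""
    return [x + y for x, y in zip(a, b)]


def masked_add(mask, x, y, oldvalue):
    """Model of the four-instruction select/fadd/select sequence."""
    zeros = [0.0] * len(mask)
    tx = select(mask, x, zeros)            # %tx
    ty = select(mask, y, zeros)            # %ty
    total = fadd(tx, ty)                   # %sum
    return select(mask, total, oldvalue)   # %newvalue
```

Enabled lanes get x + y; disabled lanes keep whatever was in %oldvalue.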

I believe the generated instructions depend on whether %oldvalue is
still live after the last instruction. If it is, you need to generate
two instructions: a copy into a new physical register, then a
predicated write to that register.  If it is not, then it is just a
predicated write to the same register.

  move r1, r0
  fadd r1{m0}, r2, r3

(r0 is now %oldvalue and r1 is %newvalue)

vs.

  fadd r0{m0}, r2, r3

(r0 was %oldvalue and is now %newvalue)
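
The decision between those two forms could be sketched like this in
Python. This is only an illustration of the liveness dependence, not
how LLVM's instruction selector or register allocator works; the
function name, its parameters, and the assembly syntax are all
assumptions carried over from the example above.

```python
def lower_masked_fadd(old_reg, src_a, src_b, mask, old_live_after, fresh_reg):
    """Return the instruction strings for a predicated fadd.

    old_reg        -- register currently holding %oldvalue
    old_live_after -- True if %oldvalue is still used afterwards
    fresh_reg      -- hypothetical free register, used only for the copy
    """
    if old_live_after:
        # %oldvalue must survive, so copy it and predicate the write
        # into the copy.
        return [
            f"move {fresh_reg}, {old_reg}",
            f"fadd {fresh_reg}{{{mask}}}, {src_a}, {src_b}",
        ]
    # %oldvalue is dead, so overwrite its register in place.
    return [f"fadd {old_reg}{{{mask}}}, {src_a}, {src_b}"]
```

The point is that the choice needs liveness information, which is not
available when the pattern is matched on SSA values.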

> The bottom line is that it is probably easier to set this up before LLVM
> IR goes into SSA form.

That makes sense, but it's unclear to me how you would preserve that
information after going into SSA form.

> There is a lot of interest in predication and a lot of recent
> discussions about handling it in LLVM.  Personally I think that
> long-term we will need some IR changes.  It might be as simple as adding
> an IR-level predicated load and predicated store, I'm not sure.

It seems to me that this is less an LLVM issue than a consequence of
the fact that SSA doesn't map cleanly onto predicated instructions. For
example, if predicates were hypothetically added universally to the IR
(which I don't think anyone wants to do), it's not clear to me how
that would even work.  How would you specify what value the result
takes in non-enabled lanes?  Perhaps another parameter:

  %newvalue = fadd %x, %y, %mask, %previousvalue

Yuck.
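
For what it's worth, the semantics I have in mind for that hypothetical
four-operand form would be something like the following Python model.
The function name is invented for illustration; nothing like this
exists in the IR.

```python
def predicated_fadd(x, y, mask, previousvalue):
    """Hypothetical predicated fadd with an explicit pass-through operand.

    Enabled lanes produce x[i] + y[i]; non-enabled lanes take their
    value from previousvalue[i].
    """
    return [
        a + b if m else p
        for a, b, m, p in zip(x, y, mask, previousvalue)
    ]
```

Every predicated operation would have to carry that extra operand just
to define the result in the lanes it does not write.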