# [llvm-dev] [RFC] Matrix support (take 2)

Simon Moll via llvm-dev llvm-dev at lists.llvm.org
Thu Dec 20 08:40:10 PST 2018

```
On 12/20/18 4:43 PM, David Greene wrote:
> Simon Moll <moll at cs.uni-saarland.de> writes:
>
>>> How will existing passes be taught about the new intrinsics?  For
>>> example, what would have to be done to instcombine to teach it about
>>> these intrinsics?  Let's suppose every existing operation had an
>>> equivalent masked intrinsic.  Would it be easier to teach all of the
>>> operand on the existing Instructions?  Would it be easier to teach isel
>>> about all the intrinsics or would it be easier to teach isel about a
>> Consider that overnight we introduce optional mask parameters to
>> vector instructions. Then, since you cannot safely ignore the mask,
>> every transformation and analysis that is somehow concerned with
>> vector instructions is potentially broken and needs to be fixed.
> True, but is there a way we could do this incrementally?  Even if we
> start with intrinsics and then migrate to first-class support, at some
> point passes are going to be broken with respect to masks on
> Instructions.

Here is an idea for an incremental transition path:

a) Create a new, distinct type. Let's say it's called the "predicated
vector type", written "{W x double}".

b) Make the predicated vector type a legal operand type for all binary
operators and add an optional predicate parameter to them. Now, here is
the catch: the predicate parameter is only legal if the data type of the
operation is the "predicated vector type". That is, "fadd <8 x double>" will
forever be unpredicated. However, "fadd {8 x double} %a, %b" may have
an optional predicate argument. Semantically, these two operations would
be identical:

fadd <8 x double> %a, %b

fadd {8 x double} %a, %b, predicate(<8 x i1> <1, 1, 1, 1, 1, 1, 1, 1>)

In terms of the LLVM type hierarchy, PredicatedVectorType would be
distinct from VectorType and so no existing transformation can break it.
While you are in the transition (from unpredicated to predicated IR), you
may see code like this:

%aP = bitcast <8 x double> %a to {8 x double}
%bP = bitcast <8 x double> %b to {8 x double}
%cP = fdiv {8 x double} %aP, %bP, mask(11101110) ; predicated fdiv
%c = bitcast {8 x double} %cP to <8 x double>
%d = fadd <8 x double> %a, %c   ; no predicated fadd yet

Eventually, when all optimizations/instructions/analyses have been
migrated to run well with the new type: 1. deprecate the old vector
type, 2. promote it to PredicatedVectorType when parsing BC, and 3. after a
grace period, rename {8 x double} to <8 x double>.
>> If you go with masking intrinsics, and set the attributes right, it is
>> clear that transformations won't break your code and you will need to
>> fadd` with a mask. However, this gives you the opportunity to
>> "re-enable" one optimization at a time, each time making sure that the
>> mask is handled correctly. In case of InstCombine, the vector
>> intrinsics in the pattern have the same mask parameter you can apply
>> the transformation, the resulting mask intrinsics will again take the
> Right.
>
>> Also, this need not be a hard transition from vector instructions to
>> batches along with the required transformations. Masking intrinsics
>> and vector instruction can live side by side (as they do today,
>> anyway).
> Of course.
>
>
>>> I honestly don't know the answers to these questions.  But I think they
>>> are important to consider, especially if intrinsics are seen as a bridge
>>> to first-class IR support for masking.
>> I think it's sensible to use masking intrinsics (or EVL
>> https://reviews.llvm.org/D53613) on IR level and masked SD nodes in
>> the backend. However, I agree that intrinsics should just be a bridge
>> to native support mid term.
> The biggest question I have is how such a transition would happen.
> Let's say we have a full set of masking intrinsics.  Now we want to take
> do that?  Is it any easier because we have all of the intrinsics, or
> does all of the work on masking intrinsics get thrown away at some
> point?
The masking intrinsics are just a transitional thing. E.g., we could add
them now and let them mature. Once the intrinsics are stable and proven,
we start migrating to core IR support (e.g., as sketched above).
>
> from Dan Gohman, which I also mentioned in an SVE thread earlier this
> year:
>
> https://lists.llvm.org/pipermail/llvm-dev/2008-August/016284.html
>
> The applymask idea got worked through a bit and IIRC at some later point
> someone found issues with it that need to be addressed, but it's an
> interesting idea to consider.  I wasn't too hot on it at the time but it
> may be a way forward.
>
> In that thread, Tim Foley posted a summary of options for mask support,
> one of which was adding intrinsics:
>
> https://lists.llvm.org/pipermail/llvm-dev/2008-August/016371.html
>
>                                  -David

Thank you for the pointer! Is this documented somewhere (say, in a
wiki or some proposal doc)? Otherwise, we are bound to go through these
discussions again and again until a consensus is reached. Btw, unlike
back then, we are also talking about an active vector length now (hence EVL).

AFAIU apply_mask was proposed in order to have fewer (redundant) predicate
arguments. Unless the apply_mask breaks a chain in a matcher pattern,
that approach should be prone to the issue of transformations breaking
code as well.

Has something like the PredicatedVectorType approach above been proposed
before?

- Simon

--

Simon Moll
Researcher / PhD Student

Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31

Tel. +49 (0)681 302-57521 : moll at cs.uni-saarland.de
Fax. +49 (0)681 302-3065  : http://compilers.cs.uni-saarland.de/people/moll

```
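The equivalence claimed in step b) of the message above (a predicated operation with an all-true predicate behaves like the unpredicated one) can be modelled with a small Python sketch. This is only an illustration of the intended semantics, not LLVM code. The helper names are hypothetical, and two things the message does not specify are assumed here: disabled lanes are represented as `None`, and the leftmost character of the mask string `11101110` is lane 0.

```python
from typing import Callable, List, Optional

Lane = Optional[float]  # None stands for a disabled lane (assumption)


def fadd(a: List[float], b: List[float]) -> List[float]:
    """Unpredicated lane-wise add, standing in for `fadd <8 x double>`."""
    return [x + y for x, y in zip(a, b)]


def predicated(op: Callable[[float, float], float],
               a: List[float], b: List[float],
               mask: List[bool]) -> List[Lane]:
    """Evaluate `op` only on lanes whose mask bit is set."""
    return [op(x, y) if m else None for x, y, m in zip(a, b, mask)]


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [2.0, 4.0, 8.0, 0.0, 1.0, 2.0, 4.0, 0.0]

# An all-true predicate gives the same result as the unpredicated operation.
all_true = [True] * 8
same_as_unpredicated = predicated(lambda x, y: x + y, a, b, all_true) == fadd(a, b)

# The mask from the fdiv example; leftmost character = lane 0 (assumption).
mask = [c == "1" for c in "11101110"]
quotient = predicated(lambda x, y: x / y, a, b, mask)

print(same_as_unpredicated)
print(quotient)
```

Lanes 3 and 7 hold a zero divisor but are disabled, so the division is never evaluated there. In Python an unmasked division on those lanes would raise `ZeroDivisionError`; that is a property of this model, not of `fdiv` in LLVM IR, but it shows why a transformation must not drop or ignore the mask.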