# [llvm-dev] [RFC] Matrix support (take 2)

Simon Moll via llvm-dev llvm-dev at lists.llvm.org
Thu Dec 20 14:00:21 PST 2018

```On 12/20/18 6:43 PM, Roman Lebedev wrote:
> On Thu, Dec 20, 2018 at 7:40 PM Simon Moll via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> On 12/20/18 4:43 PM, David Greene wrote:
>>> Simon Moll <moll at cs.uni-saarland.de> writes:
>>>
>>>>> How will existing passes be taught about the new intrinsics?  For
>>>>> example, what would have to be done to instcombine to teach it about
>>>>> these intrinsics?  Let's suppose every existing operation had an
>>>>> equivalent masked intrinsic.  Would it be easier to teach all of the
>>>>> passes about them or would it be easier to teach the passes about a mask
>>>>> operand on the existing Instructions?  Would it be easier to teach isel
>>>>> about all the intrinsics or would it be easier to teach isel about a
>>>> Consider that over night we introduce optional mask parameters to
>>>> vector instructions. Then, since you can not safely ignore the mask,
>>>> every transformation and analysis that is somehow concerned with
>>>> vector instructions is potentially broken and needs to be fixed.
>>> True, but is there a way we could do this incrementally?  Even if we
>>> start with intrinsics and then migrate to first-class support, at some
>>> point passes are going to be broken with respect to masks on
>>> Instructions.
>> Here is path an idea for an incremental transition:
>>
>> a) Create a new, distinct type. Let's say its called the "predicated
>> vector type", written "{W x double}".
>>
>> b) Make the predicate vector type a legal operand type for all binary
>> operators and add an optional predicate parameter to them. Now, here is
>> the catch: the predicate parameter is only legal if the data type of the
>> operation is "predicated vector type". That is "fadd <8 x double>" will
>> for ever be unpredicated. However, "fadd {8 x double} %a, %b" may have
>> an optional predicate argument. Semantically, these two operations would
>> be identical:
>>
>> fadd <8 x double>, %a, %b
>>
>> fadd {8 x double}, %a, %b, predicate(<8 x i1><1, 1, 1, 1, 1, 1, 1, 1>)
>>
>> In terms of the LLVM type hierachy, PredicatedVectorType would be
>> distinct from VectorType and so no transformation can break it. While
>> you are in the transition (from unpredicated to predicated IR), you may
>> see codes like this:
>>
>> %aP = bitcast <8 x  double> %a to {8 x double}
>> %bP = bitcast <8 x  double> %b to {8 x double}
>> %cP = fdiv %aP, %bP, mask(11101110) ; predicated fdiv
>> %c = bitcast <8 x double> %c to %cP
>> %d = fadd <8 x double> %a, %c   ; no predicated fadd yet
>>
>> Eventually, when all optimizations/instructions/analyses have been
>> migrated to run well with the new type, 1. deprecate the old vector
>> type, 2. promote it to PredicatedVectorType when parsing BCand, after a
>> grace period, rename {8 x double} to <8 x double>
> I'm likely missing things,
> but i strongly suspect that the amount of effort needed is underestimated.
>
> Vector support works because, with some exceptions,
> vector is simply interpreted as several scalars concatenated.
>
> Much like as with native fixed-point type support,
> a whole new incompatible type is suggested to be added here.
> I *suspect*, *every* single transform in instcombine/instsimplify
> will need to be *duplicated*.

Actually, there is no need for an entirely distinct type.
PredicatedVectorType could simply be the super class of VectorType.

/// operations of this type may have a mask or an explicit vector length.

class PredicatedVectorType : public SequentialType {

public:
virtual bool allowsMasking() const { return true; }
virtual bool allowsExplicitVectorLength() const { return true; }

// [all contents of the current VectorType]
};

// operations of this type can not have a mask nor an explicit vector length

class VectorType : public PredicatedVectorType {
public:
bool allowsMasking() const { return false; }
bool allowsExplicitVectorLength() const { return false; }
// no other members
};

When we go down that path, there would be *no duplication* at all,
instead transformations would be lifted from VectorType to
PredicatedVectorType one at a time. Eventually, the PredicatedVectorType
class could be deprecated as masking and an explicit vector length would
only depend on the operation.

>
> That is a lot. Intrinsics sound like less intrusive solution, in both cases.
>>>> If you go with masking intrinsics, and set the attributes right, it is
>>>> clear that transformations won't break your code and you will need to
>>>> teach InstCombine, DAGCombiner, etc that a `masked.fadd` is just an
>>>> fadd` with a mask. However, this gives you the opportunity to
>>>> "re-enable" one optimization add a time each time making sure that the
>>>> mask is handled correctly. In case of InstCombine, the vector
>>>> instruction patterns transfer to mask intrinsics: if all mask
>>>> intrinsics in the pattern have the same mask parameter you can apply
>>>> the transformation, the resulting mask intrinsics will again take the
>>>> same mask parameter.
>>> Right.
>>>
>>>> Also, this need not be a hard transition from vector instructions to
>>>> masking intrinsics.. you can add new types of masking intrinsics in
>>>> batches along with the required transformations. Masking intrinsics
>>>> and vector instruction can live side by side (as they do today,
>>>> anyway).
>>> Of course.
>>>
>>>
>>>>> I honestly don't know the answers to these questions.  But I think they
>>>>> are important to consider, especially if intrinsics are seen as a bridge
>>>>> to first-class IR support for masking.
>>>> I think its sensible to use masking intrinsics (or EVL
>>>> https://reviews.llvm.org/D53613) on IR level and masked SD nodes in
>>>> the backend. However, i agree that intrinsics should just be a bridge
>>>> to native support mid term.
>>> The biggest question I have is how such a transition would happen.
>>> Let's say we have a full set of masking intrinsics.  Now we want to take
>>> one IR-level operation, say fadd, and add mask support to it.  How do we
>>> do that?  Is it any easier because we have all of the intrinsics, or
>>> does all of the work on masking intrinsics get thrown away at some
>>> point?
>> The masking intrinsics are just a transitional thing. Eg, we could add
>> them now and let them mature. Once the intrinsics are stable and proven
>> start migrating for core IR support (eg as sketched above).
>>> I'm reminded of this now decade-old thread on gather/scatter and masking
>>> from Don Gohman, which I also mentioned in an SVE thread earlier this
>>> year:
>>>
>>> https://lists.llvm.org/pipermail/llvm-dev/2008-August/016284.html
>>>
>>> The applymask idea got worked through a bit and IIRC at some later point
>>> someone found issues with it that need to be addressed, but it's an
>>> interesting idea to consider.  I wasn't too hot on it at the time but it
>>> may be a way forward.
>>>
>>> In that thread, Tim Foley posted a summary of options for mask support,
>>> one of which was adding intrinsics:
>>>
>>> https://lists.llvm.org/pipermail/llvm-dev/2008-August/016371.html
>>>
>>>                                   -David
>> Thank for you for the pointer! Is this documented somewhere? (say in a
>> wiki or some proposal doc). Otherwise, we are bound to go through these
>> discussions again and again until a consensus is reached. Btw, different
>> to then, we are also talking about an active vector length now (hence EVL).
>>
>> AFAIU apply_mask was proposed to have less (redundant) predicate
>> arguments. Unless the apply_mask breaks a chain in a matcher pattern,
>> the approach should be prone to the issue of transformations breaking
>> code as well.
>>
>> Has something like the PredicatedVectorType approach above been proposed
>> before?
>>
>> - Simon
>>
>> --
>>
>> Simon Moll
>> Researcher / PhD Student
>>
>> Compiler Design Lab (Prof. Hack)
>> Saarland University, Computer Science
>> Building E1.3, Room 4.31
>>
>> Tel. +49 (0)681 302-57521 : moll at cs.uni-saarland.de
>> Fax. +49 (0)681 302-3065  : http://compilers.cs.uni-saarland.de/people/moll
> Roman.
>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

--

Simon Moll
Researcher / PhD Student

Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31

Tel. +49 (0)681 302-57521 : moll at cs.uni-saarland.de
Fax. +49 (0)681 302-3065  : http://compilers.cs.uni-saarland.de/people/moll

```