[llvm-dev] [RFC] Matrix support (take 2)
Simon Moll via llvm-dev
llvm-dev at lists.llvm.org
Wed Dec 19 14:37:29 PST 2018
On 12/19/18 11:07 PM, Adam Nemet via llvm-dev wrote:
>> On Dec 19, 2018, at 1:31 PM, Stephen Canon <scanon at apple.com
>> <mailto:scanon at apple.com>> wrote:
>>> On Dec 19, 2018, at 11:09 AM, Stephen Canon via llvm-dev
>>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>>> On Dec 18, 2018, at 10:18 PM, Adam Nemet <anemet at apple.com
>>>> <mailto:anemet at apple.com>> wrote:
>>>>> I don’t understand this. What is the benefit of providing layout
>>>>> info to element wise operations? This defeats the goal of having
>>>>> simple lowering and representation: you are encoding an ND vector
>>>>> form into the IR in a really ugly way, and this will cause a
>>>>> proliferation of intrinsics that are redundant with the core ops.
>>>> The reason we need that information is so that, for example, we can
>>>> lower an operation on a 3-element column into a 2-wide vector op plus
>>>> a scalar op. This should be beneficial for power consumption: in the
>>>> case of a 3x3 matrix with a single element of padding per column, you
>>>> would operate on only 9 elements rather than 12 (vector ops consume
>>>> more power than their scalar counterparts).
>>>> That said, we should be able to remove these intrinsics in the long
>>>> term. Once we have masking on the core ops in the IR, we should be
>>>> able to express the same semantics without dedicated intrinsics.
>>> There may be some cases where this holds (maybe with 5x5 or
>>> something), but most of the time I would expect to get better power
>>> from doing a four-element vector op with one wasted lane than from
>>> doing two arithmetic ops (plus possibly extracts and inserts,
>>> depending on physical layout details).
>>> Explicit masking or arranging for zero in padding lanes seems like a
>>> better way forward to me.
>>> – Steve
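To make the trade-off concrete, the two lowerings being compared for a
single 3-element column add can be sketched in scalar pseudocode (an
illustration only; the function names and the Python setting are mine,
not LLVM's):

```python
def add_column_split(a, b):
    """Lower a 3-element column add as one 2-wide vector op plus one
    scalar op, touching only the 3 live lanes (Adam's variant)."""
    lo = [a[0] + b[0], a[1] + b[1]]  # models a <2 x float> fadd
    return lo + [a[2] + b[2]]        # models a scalar fadd

def add_column_padded(a, b):
    """Lower the same add as one 4-wide vector op with one wasted
    padding lane: fewer instructions, one dead lane of work
    (Steve's variant)."""
    return [a[i] + b[i] for i in range(4)]  # models a <4 x float> fadd
```

Both compute the same three live results; they differ only in instruction
count versus lanes of work, which is exactly the power question above.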
>> I spent some time chatting with Adam about this and have a better
>> understanding of his concerns here. It seems to me that if having
>> masking intrinsics is the long-term solution we want, we should do
>> that now (for add and sub) rather than building arbitrary matrix
>> layout info into intrinsics, since a mask has all the information
>> that we actually need.
> I think that sounds like a reasonable compromise. We already have
> masked load/store intrinsics so adding add and sub just follows that
> precedent. If the decision is made to move masking to the core
> operations, the new intrinsics would just move as well.
> So an add->multiply sequence for option B with masking intrinsics would look like:
> %a = load <12 x float>, <12 x float>* %A, align 16
> %b = load <12 x float>, <12 x float>* %B, align 16
> %c = load <8 x float>, <8 x float>* %C, align 16
> %add = call <12 x float> @llvm.masked.fadd(
>     <12 x float> %a, <12 x float> %b,
>     ; mask: where false, the element is taken from the passthrough
>     <12 x i1> <i1 true, i1 true, i1 true, i1 false,
>                i1 true, i1 true, i1 true, i1 false,
>                i1 true, i1 true, i1 true, i1 false>,
>     ; passthrough:
>     <12 x float> <float undef, float undef, float undef, float undef,
>                   float undef, float undef, float undef, float undef,
>                   float undef, float undef, float undef, float undef>)
> %mul = call <8 x float> @llvm.matrix.multiply(
>     <12 x float> %add, <8 x float> %c,
>     ; 3 x 3 times 3 x 2, column-major:
>     i32 3, i32 3, i32 3, i32 2, i1 true)
> store <8 x float> %mul, <8 x float>* %MUL, align 16
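For readers following along, the semantics of the two calls above can be
modeled in Python. This is a sketch only: `llvm.masked.fadd` is a
proposed intrinsic, and I am assuming each column of the 3x3 and 3x2
operands is padded to a stride of 4 lanes, matching the mask in the
example:

```python
def masked_fadd(a, b, mask, passthrough):
    """Lane-wise: a[i] + b[i] where mask[i] is true, else passthrough[i]."""
    return [x + y if m else p for x, y, m, p in zip(a, b, mask, passthrough)]

def matrix_multiply(a, b, m, k, n, stride=4):
    """Column-major (m x k) * (k x n); each column padded to `stride` lanes."""
    out = [0.0] * (stride * n)
    for j in range(n):       # result column
        for i in range(m):   # result row
            out[j * stride + i] = sum(
                a[c * stride + i] * b[j * stride + c] for c in range(k))
    return out
```

A quick sanity check on the layout: multiplying a padded 3x3 identity by
a padded 3x2 operand reproduces the 3x2 operand unchanged.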
We've started an RFC that proposes exactly this.
The RFC proposes intrinsics that take a mask and an explicit vector
length argument. The explicit vector length is aimed at RISC-V V and NEC
SX-Aurora, and it can be legalized away for targets that do not support
it (e.g. AVX512). We also propose a couple of new attributes that should
help with function-call vectorization.
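As a rough model of the proposed semantics (the intrinsic names and
exact rules are up to the RFC; this illustration is mine): each
operation takes a mask and an explicit vector length, and a lane is
computed only when it is both below the vector length and enabled by
the mask:

```python
def vp_fadd(a, b, mask, evl):
    """Model of a predicated fadd with an explicit vector length (evl):
    lane i is computed only when i < evl and mask[i]; all other lanes
    are undefined (modeled as None here)."""
    return [a[i] + b[i] if i < evl and mask[i] else None
            for i in range(len(a))]
```

A target without explicit vector length support would legalize `evl`
away by folding it into the mask, i.e. using `mask[i] and i < evl` as
the effective predicate.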
I'll present this at the upcoming LLVM Social in Zurich on January 10th
for people who are interested. I also talked a bit about this at the
last DevMtg (from ~15:00 in https://youtu.be/BAZClv6nMxY).
Researcher / PhD Student
Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31
Tel. +49 (0)681 302-57521 : moll at cs.uni-saarland.de
Fax. +49 (0)681 302-3065 : http://compilers.cs.uni-saarland.de/people/moll