[llvm-dev] [RFC] Matrix support (take 2)
Stephen Canon via llvm-dev
llvm-dev at lists.llvm.org
Wed Dec 19 11:09:37 PST 2018
> On Dec 18, 2018, at 10:18 PM, Adam Nemet <anemet at apple.com> wrote:
>> I don’t understand this. What is the benefit of providing layout info to element-wise operations? This defeats the goal of having simple lowering and representation: you are encoding an ND vector form into the IR in a really ugly way, and this will cause a proliferation of intrinsics that are redundant with the core ops.
> The reason we need that information is so that, for example, we can lower an operation on a 3-element column into a 2-element vector op and a scalar op. This should be beneficial for power consumption: for a 3x3 matrix with one element of padding per column, you would operate on only 9 elements rather than 12 (vector ops consume more power than their scalar counterparts).
> That said we should be able to remove these intrinsics in the long term. Once we have masking on the core ops in the IR, we should be able to express the same semantics without dedicated intrinsics.
There may be some cases where this holds (maybe with 5x5 or something), but most of the time I would expect better power efficiency from doing one four-element vector op with one wasted lane than from doing two arithmetic ops (plus possibly extracts and inserts, depending on physical layout details).
Explicit masking or arranging for zero in padding lanes seems like a better way forward to me.
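To make the two lowerings concrete, here is a rough C++ sketch of an element-wise add on a 3x3 matrix stored column-major with each column padded to four lanes. The layout and all names (PaddedMat3, make_padded, add_padded, add_split, at) are assumptions for illustration only; they are not part of the RFC or of any LLVM API, and the sketch says nothing about actual power cost. It only shows that, with zeros arranged in the padding lanes, the full-width add and the 2+1 split give the same result for addition.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical layout: 3x3 float matrix, column-major, each 3-element
// column padded to 4 lanes (12 floats). Lane 3 of each column is padding.
constexpr std::size_t kCols = 3, kRows = 3, kStride = 4;
using PaddedMat3 = std::array<float, kCols * kStride>;

// Build a padded matrix from 9 column-major values; padding lanes are zero.
PaddedMat3 make_padded(const std::array<float, kCols * kRows>& v) {
    PaddedMat3 m{};  // value-initialized, so every lane starts at 0
    for (std::size_t c = 0; c < kCols; ++c)
        for (std::size_t r = 0; r < kRows; ++r)
            m[c * kStride + r] = v[c * kRows + r];
    return m;
}

// Read element (column c, row r), skipping over the padding lane.
float at(const PaddedMat3& m, std::size_t c, std::size_t r) {
    assert(c < kCols && r < kRows);
    return m[c * kStride + r];
}

// Option A: one 4-lane add per column, padding lane included (12 lanes total).
PaddedMat3 add_padded(const PaddedMat3& a, const PaddedMat3& b) {
    PaddedMat3 out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = a[i] + b[i];  // 0 + 0 keeps the padding lanes at zero
    return out;
}

// Option B: per column, a 2-lane op plus a scalar op (9 lanes total).
PaddedMat3 add_split(const PaddedMat3& a, const PaddedMat3& b) {
    PaddedMat3 out{};
    for (std::size_t c = 0; c < kCols; ++c) {
        const std::size_t base = c * kStride;
        for (std::size_t r = 0; r < 2; ++r)               // the 2-element part
            out[base + r] = a[base + r] + b[base + r];
        out[base + 2] = a[base + 2] + b[base + 2];        // the scalar part
    }
    return out;
}
```

Zero padding is only harmless for some operations: it survives add, subtract and multiply, but an element-wise divide would compute 0/0 in the padding lanes, which is where explicit masking would be needed instead.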