[cfe-dev] matrix type conversion

Mon Aug 17 04:22:00 PDT 2020

> On Aug 13, 2020, at 13:57, Sjoerd Meijer <Sjoerd.Meijer at arm.com> wrote:
> 
> Hi,
> 
> > This should work according to the spec, but I think the conversion has not been implemented yet. I’ve created https://bugs.llvm.org/show_bug.cgi?id=47141 and linked it to https://bugs.llvm.org/show_bug.cgi?id=46163, which should act as an umbrella issue to track the missing pieces.
> 
> Ah, I was unaware of that umbrella ticket. Thanks for that, and for raising the ticket.
> 
> > I think currently we match the behavior for vector types and only convert scalar operands for binary operators implicitly to matrices. If there’s a strong need for implicit conversions, this is certainly something that can be revisited.
> 
> Not sure if there's a strong need, but from writing my first examples yesterday, I can see that it would be convenient and possibly cleaner too (i.e. less text/clutter). I am not sure about this one, but it's also what people would expect perhaps?

I think as part of the discussion for the RFC we decided to be more explicit. But as I said, if there’s consensus that it would be better to provide implicit conversion, it is easy to change. It would be good to implement explicit conversion to start with, though ;)

> 
> > Yes, we can certainly extend this to allow use cases to map to hardware instructions that implement an extension step, like AArch64’s udot. IIRC it extends the sums, which I think would make the most sense to use, as otherwise it should be sufficient to extend the operands/result.
> 
> Yes, or the v8.6 matrix multiply accumulate instructions which multiply 8 bit values and store them to 32-bits.

Oh right, I just had a look at those. It seems like the matrix multiply accumulate instructions widen the result of the matrix multiplication. I don’t think we need any changes to the intrinsic to model that. We should be able to model this by just extending the result vector of the matrix multiplication. The extension instructions would then be generated naturally from an implicit/explicit conversion to a matrix with a wider element type.

What I was referring to in the statement below was related to instructions where the results of the intermediate multiplications get widened and are then accumulated using the wider type. To model that, I think we would need a ‘widening’ version of the matrix multiply intrinsic. Mapping this extension ‘in the middle’ to implicit/explicit conversion of the final result would be confusing/surprising IMO. But I might be missing something.
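
To make the distinction concrete, here is a minimal scalar sketch in plain C. It does not use the matrix extension or any intrinsic, and the function names dot_extend_result and dot_widen_products are made up for illustration. It assumes unsigned 8-bit elements with wrap-around arithmetic. The first function computes everything in the narrow type and only extends the final value; the second widens each intermediate product before accumulating.

```c
#include <stdint.h>

/* Hypothetical helper: products and the running sum both wrap at 8 bits;
 * only the final value is extended to 32 bits. This models "extend the
 * result" of a narrow computation. */
static uint32_t dot_extend_result(const uint8_t *a, const uint8_t *b, int n) {
    uint8_t acc = 0;
    for (int i = 0; i < n; ++i) {
        /* C promotes uint8_t operands to int, so truncate explicitly to
         * keep the product and the sum in the 8-bit type. */
        uint8_t prod = (uint8_t)(a[i] * b[i]);
        acc = (uint8_t)(acc + prod);
    }
    return (uint32_t)acc;
}

/* Hypothetical helper: each product is widened to 32 bits first and then
 * accumulated in the wider type. This models extension "in the middle". */
static uint32_t dot_widen_products(const uint8_t *a, const uint8_t *b, int n) {
    uint32_t acc = 0;
    for (int i = 0; i < n; ++i)
        acc += (uint32_t)a[i] * (uint32_t)b[i];
    return acc;
}
```

For a = {10, 20, 30, 40} and b = {1, 2, 3, 4}, the first function returns 44 (300 wrapped to 8 bits) and the second returns 300. A conversion applied to the final narrow result cannot recover the second value, which is why I think the extension in the middle needs to be expressed separately.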

> 
> > I think the more interesting question here would be how this fits into the C/C++ spec. I guess it would be possible to specify it so a multiply that gets extended lowers to the widening intrinsic, but this would seem quite surprising/awkward. I
> 
> As I said, I haven't given this too much thought yet, so just for my understanding, what exactly is the surprising/awkward bit of the C/C++ spec here? I was guessing that the assignment of a result from a matrix operation, using an implicit/explicit conversion, would take care of this?

Cheers,
Florian