[cfe-dev] RFC: First-class Matrix type

Mon Oct 15 08:59:46 PDT 2018

> On Oct 12, 2018, at 2:02 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> 
> On 11 Oct 2018, at 23:42, Richard Smith via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>> 
>> On the LLVM IR side, I'm personally unconvinced that we should model matrices in the IR directly as a new first-class type, unless there's some target out there that has matrix operations in hardware / matrix registers, but IR is not really my area of expertise so give that opinion as much or little weight as you see fit. However, I do wonder: how is this different from, say, complex numbers, for which we don't have a native IR representation? (Maybe the answer is that we should have native IR support for complex numbers too.) How would you expect the frontend to lower (eg) matrix multiplication for matrices of complex numbers?
> 
> I think there are a few potential problems with adding these into LLVM IR as a first-class type:
> 
> - With vectors, we have a simple extension from scalar operations to vector ones.  A scalar operation on a vector is exactly the same as the same scalar operation applied pairwise to the elements. This is not the case for matrix types.

Vectors also have things like reductions, masked load/store and scatter/gather.  Matrices also have element-wise operations and then some custom ones.  I don’t think the situation is all that different.

> 
> - A lot of analysis and transform passes need to have special cases for vectors.  This is already a bit of a maintenance problem, but one that we put up with because of the big wins from vector types.  Do matrix types have the same tradeoff?

I don’t think that’s the right way to look at this proposal.  As I said I think it’s better to look at matrices as an extension to vectors adding a different view on the same underlying data.  Thus most of the special cases for vectors would also apply to matrices as considered SequentialTypes or a new common base class.  So effectively isVector becomes isVectorOrMatrix in many cases.  This has definitely been the experience so far in the prototype.

With this extension we can cover a new class of applications that otherwise would require heroic front-end or user-level efforts.  And since all these approaches perform premature lowering, enabler passes like the inliner can’t reason about the representation.  Leaving them as first-class constructs makes this reasoning trivial.

> 
> - What does a matrix of pointers look like?  What is a GEP on a matrix of pointers?  If I do GEP on a matrix of pointers and a matrix of integers and I replace it with ptrtoint, add, inttoptr, is that equivalent?

Unlike vectors, I don’t think that’s a case that we want to support unless one of the HW implementations requires this.  Do you think this would be useful?

> 
> - Does the IR representation enforce an in-memory representation?  If different targets represent matrixes in different orders (e.g. column-major vs row-major) then how much do optimisers need to be aware of this?

Most of this should be limited to the lowering pass.  That is where we would reason about the layout, fusion, register pressure and vectorization.

How much InstCombine and things like that would be affected we can decide based on the trade-off of the specific cases we want to optimize.  I don’t think that’s a priority or a requirement for now.

I feel that having this extension experimental would allow us to evaluate pros and cons in this regard.

> 
> - Can all of the more complex operations that you might want to implement on matrixes be implemented in terms of a smaller number of primitives, without resorting to long extract and insert element sequences?

I’d like to keep the canonical representation for these operations short so that high-level optimizations like the inliner can easily reason about them.

> 
> - Can the same optimisations be used on sparse matrix representations that are then lowered by something custom nearer the back end?

I feel this is related to your scatter/gather question but I am not sure I completely get it.  Can you please elaborate?

> 
> - As others have said, how does this interact with variable-sized matrixes?

I’d like to experiment with lowering variable-sized matrix operations to this representation.  I have some hand-wavy ideas about this.  Something where we would have a lambda expressing operations on fixed-sized tiles and then a mapping function that applies/extends that to variable-sized matrices.  An expression-template library may be able to gather all the required pieces for this.

Adam

> 
> David
>