[PATCH] D81472: [Matrix] Update load/store intrinsics.

Wed Jun 10 11:41:24 PDT 2020

rjmccall added a comment.

In D81472#2085129 <https://reviews.llvm.org/D81472#2085129>, @fhahn wrote:

> In D81472#2083075 <https://reviews.llvm.org/D81472#2083075>, @rjmccall wrote:
>
> > I like the name change, although I wonder if you could just have a single intrinsic that takes both a row stride and a column stride and recognizes the common patterns.  Presumably even with column-major ordering you already want to optimize the case where the stride is a constant equal to the row count, so this would just be a generalization of that.
>
>
> I am not sure about having a single intrinsic. The `column.major` part in the name signifies that the resulting matrix is in column-major layout (whic, but  is then used internally during he lowering). I suppose it would be possible to have both row & column strides and use a special value to indicate what the leading dimension is, but it seems to me that having dedicated intrinsics would be more explicit.

This is just Fortran array slices.  You don't need a special value, the two strides are sufficient.  `M[i][j]` is at `p[i *  rowStride + j * columnStride]`.  To not have overlapping storage, you need either  `rowStride * rowCount <= columnStride` or vice-versa.  Row-major means `rowStride >= columnCount && columnStride == 1`; column-major means `rowStride == 1 && columnStride >= rowCount`.  You get better locality from doing the smaller stride in the inner loop (which may not actually be a loop, of course), but it's not wrong to do either way.

Anyway, it's up to you, but I think the two-stride representation is more flexible and avoids ultimately needing three separate intrinsics with optimizations that turn the general one into the more specific ones.  And it may have benefits for frontends like Flang that have to support these strided multi-dimensional slices.

>> Are you planning to actually support alignment in a follow-up patch?  I don't see anything here that propagates alignment to the lowering routines.
> 
> Yes, this patch just adjusts the intrinsics definitions/langref. Respecting both `IsVolatile` & the alignment attribute will be done in follow-up patches.\

Okay.

> There's already one for `IsVolatile` D81498 <https://reviews.llvm.org/D81498>. I think setting the alignment correctly for the split stores is not completely trivial, because the original alignment will hold for the first split access, but may not hold for the subsequent accesses and some extra work is needed to figure out which alignments to use for subsequent stores, depending on the stride.

The conservative alignment for addresses of the form `&p[i]` is the min of the alignment of `p` with the size of the element type.  If the index is constant you can do better, of course.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81472/new/

https://reviews.llvm.org/D81472