[cfe-dev] [llvm-dev] RFC: First-class Matrix type

Fri Oct 12 11:39:25 PDT 2018

> Recent NVIDIA GPUs do support some matrix operations natively in the
> hardware

The inputs to the operations in question are actually defined as opaque.
You have to load the inputs using a special instruction, the semantics of
which are not precisely specified.  This instruction loads the matrix's
values distributed across the threads in a warp, and you are not allowed to
assume anything about which values are where.  In fact they claim that the
arrangement of values is architecture-dependent.

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-fragment
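
For concreteness, here is a minimal sketch of what this looks like from CUDA
C++ through the warp-level WMMA API in <mma.h>.  The kernel name wmma_tile and
the host helper run_wmma_check are made up for illustration, and the sketch
assumes a single warp, one 16x16x16 tile, and an sm_70 or newer device.  The
point is that the fragments are opaque: after load_matrix_sync, no thread can
assume which elements it holds.

```cuda
#include <cuda_fp16.h>
#include <mma.h>

using namespace nvcuda;

// One warp computes a single 16x16 tile: C = A * B.
// Assumes sm_70 or newer (e.g. nvcc -arch=sm_70).
__global__ void wmma_tile(const half *a, const half *b, float *c) {
    // Fragments are opaque. Each thread in the warp holds an unspecified,
    // architecture-dependent subset of the tile's elements.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);

    // The special loads: the whole warp cooperates, leading dimension is 16.
    wmma::load_matrix_sync(a_frag, a, 16);
    wmma::load_matrix_sync(b_frag, b, 16);

    // The matrix multiply-accumulate operates only on fragments.
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);

    // Only after the store is there a matrix with a known layout again.
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}

// Hypothetical host-side check: fills A and B with ones, so every element
// of C is a sum of 16 ones. Returns the number of mismatching elements,
// or -1 if the kernel failed to run.
int run_wmma_check() {
    const int n = 16 * 16;
    half *a = nullptr;
    half *b = nullptr;
    float *c = nullptr;
    cudaMallocManaged(&a, n * sizeof(half));
    cudaMallocManaged(&b, n * sizeof(half));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) {
        a[i] = __float2half(1.0f);
        b[i] = __float2half(1.0f);
        c[i] = 0.0f;
    }

    wmma_tile<<<1, 32>>>(a, b, c);  // exactly one warp
    if (cudaDeviceSynchronize() != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (c[i] != 16.0f) ++mismatches;
    }
    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return mismatches;
}
```

Note that the kernel never indexes into a fragment by row and column.  The
only operations with a defined matrix meaning are the load, the mma, and the
store, which is why a matrix type would have nothing to attach to between
them.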

I suppose one could spec the corresponding LLVM intrinsics for loading data
in preparation for a matmul as taking a matrix pointer rather than a plain
float*.  But as soon as you load it, you no longer have a matrix in hand
(each thread has some undefined subset of the matrix's elements), so I'm not
seeing any real use there other than a sort of handwavy "type-safety"
argument.

On Fri, Oct 12, 2018 at 11:02 AM David Greene via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> JF Bastien via llvm-dev <llvm-dev at lists.llvm.org> writes:
>
> > Agreed these patches would be neat to revive.
>
> We've been trying but haven't got responses to Phab comments we've
> posted.
>
> > I think we’d also want someone to pursue wg21.link/n4150.
>
> Definitely!
>
> > However, none of that seems like it should gate matrix support in the
> > IR.
>
> Agreed.  But we should at least be aware of alternatives so we have a
> good sense of the pros/cons of the proposal.
>
>                           -David
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>