[PATCH] D70456: [Matrix] Add first set of matrix intrinsics and initial lowering pass.

Tue Apr 7 07:34:48 PDT 2020

LuoYuanke added a comment.

> I've created D77549 <https://reviews.llvm.org/D77549> which uses AArch64's udot instruction to compute the result of multiplies on 4x4 tiles. To do so, first a tiled loop nest is created that iterates over the columns, rows and the inner dimension. In the inner loop, 4x4 tiles are loaded, multiplied (using the dot product) and accumulated. After the inner loop, the final result of the 4x4 tile is stored. The main reason I went for AArch64's udot is that I can easily run it, but IIUC the VNNI instructions are very similar; they just allow processing of larger tiles.
> 
> Please note that the patch is a bit rough around the edges, and it is currently not clear how to specify the 'multiply 8 bit operands, accumulate in 32 bit result' nature of those instructions; I think we will have to extend the llvm.matrix.multiply definition for that. But it should be enough for you to get started on getting something working for VNNI. Please let me know if you have any questions or encounter any problems, either in the discussion for D77549 <https://reviews.llvm.org/D77549> or by email.
> 
> Cheers,
> Florian

Thanks Florian.
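
For reference, the loop structure described in the quoted comment can be sketched in plain C++ roughly as follows. This is only a scalar illustration, not the code from D77549: the names tiledMultiplyU8, dot4 and Tile are hypothetical, the row-major layout is an assumption, and the sketch assumes all dimensions are multiples of 4. The dot4 helper stands in for one 32-bit lane of a udot-style instruction, which multiplies four 8-bit operand pairs and accumulates into a 32-bit result.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tile size, matching the 4x4 tiles mentioned in the comment.
constexpr std::size_t Tile = 4;

// Scalar stand-in for one lane of a udot-style instruction: four products of
// unsigned 8-bit operands are summed into a 32-bit accumulator.
// `b` is walked with stride `bStride` because it points into a column of a
// row-major matrix (an assumption of this sketch).
inline std::uint32_t dot4(const std::uint8_t *a, const std::uint8_t *b,
                          std::size_t bStride, std::uint32_t acc) {
  for (std::size_t i = 0; i < Tile; ++i)
    acc += std::uint32_t(a[i]) * std::uint32_t(b[i * bStride]);
  return acc;
}

// Computes C = A * B with A of size M x K and B of size K x N (8-bit inputs,
// 32-bit results). Assumes M, K and N are multiples of Tile.
std::vector<std::uint32_t> tiledMultiplyU8(const std::vector<std::uint8_t> &A,
                                           const std::vector<std::uint8_t> &B,
                                           std::size_t M, std::size_t K,
                                           std::size_t N) {
  std::vector<std::uint32_t> C(M * N, 0);
  for (std::size_t j = 0; j < N; j += Tile) {   // tiles over the columns
    for (std::size_t i = 0; i < M; i += Tile) { // tiles over the rows
      std::uint32_t acc[Tile][Tile] = {};       // 4x4 result tile
      // Inner dimension: load 4x4 tiles of A and B, multiply, accumulate.
      for (std::size_t k = 0; k < K; k += Tile)
        for (std::size_t r = 0; r < Tile; ++r)
          for (std::size_t c = 0; c < Tile; ++c)
            acc[r][c] = dot4(&A[(i + r) * K + k], &B[k * N + j + c], N,
                             acc[r][c]);
      // After the inner loop, store the final result tile.
      for (std::size_t r = 0; r < Tile; ++r)
        for (std::size_t c = 0; c < Tile; ++c)
          C[(i + r) * N + j + c] = acc[r][c];
    }
  }
  return C;
}
```

Calling tiledMultiplyU8(A, B, 4, 8, 4) on inputs filled with 255 shows why the wider accumulator matters: each result element is 8 * 255 * 255, which does not fit in 8 or 16 bits.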

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70456/new/

https://reviews.llvm.org/D70456