[PATCH] D99433: [Matrix] Including __builtin_matrix_multiply_add for the matrix type extension.

Florian Hahn via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 26 12:27:38 PDT 2021


fhahn added a comment.

Thanks for putting up the patch!

Do you think it would be possible to get the desired behavior without a new builtin? We should be able to combine the add with the initial multiply for each vector, as long as we have the right fast-math flags. IIUC reassociate should be enough, so perhaps this optimization could be performed in `LowerMatrixIntrinsics` directly. The user should then be able to enable the right fast-math flags locally using `#pragma clang fp`, like below. Clang first needs to be updated to handle those pragmas properly for the matrix types.

  #pragma clang fp reassociate(on)
  C = A*B + C;
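
To make the reassociation point concrete, here is a minimal scalar sketch in plain C. It is not the actual `LowerMatrixIntrinsics` lowering; the size `N`, the column-major layout, and the function names `multiply_then_add` and `multiply_add_fused` are assumptions made up for illustration. The unfused form accumulates the products starting from zero and adds C at the end, while the fused form seeds the accumulator with C. The two only differ in the order of the floating-point additions, which is why reassociation would need to be allowed before a compiler could turn one into the other.

```c
#include <assert.h>

/* Hypothetical fixed size, chosen only for illustration. */
enum { N = 2 };

/* Matrices are N x N and column-major: element (row, col) is m[col * N + row]. */

/* Unfused form: compute A*B starting from zero, then add C afterwards.
   Per element this evaluates ((0 + p0) + p1) + c. */
void multiply_then_add(const float *a, const float *b, float *c) {
    assert(a && b && c);
    for (int col = 0; col < N; ++col) {
        for (int row = 0; row < N; ++row) {
            float t = 0.0f;
            for (int k = 0; k < N; ++k)
                t += a[k * N + row] * b[col * N + k];
            c[col * N + row] = t + c[col * N + row];
        }
    }
}

/* Fused form: seed the accumulator with C, so the separate add disappears.
   Per element this evaluates (c + p0) + p1, a different association of the
   same sum, hence the need for reassociation to be permitted. */
void multiply_add_fused(const float *a, const float *b, float *c) {
    assert(a && b && c);
    for (int col = 0; col < N; ++col) {
        for (int row = 0; row < N; ++row) {
            float acc = c[col * N + row];
            for (int k = 0; k < N; ++k)
                acc += a[k * N + row] * b[col * N + k];
            c[col * N + row] = acc;
        }
    }
}
```

For inputs whose sums are exactly representable, such as small integers, both functions produce identical results; in general they can differ in the last bits, which is the rounding difference the reassociate flag permits.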


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99433/new/

https://reviews.llvm.org/D99433


