[PATCH] D99433: [Matrix] Including __builtin_matrix_multiply_add for the matrix type extension.
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 26 12:27:38 PDT 2021

fhahn added a comment.
Thanks for putting up the patch!
Do you think it would be possible to get the desired behavior without a new builtin? We should be able to combine the add with the initial multiply for each vector, as long as we have the right fast-math flags. IIUC reassociate should be enough, so perhaps this optimization could be performed directly in `LowerMatrixIntrinsics`. The user should then be able to enable the right fast-math flags locally using `#pragma clang fp`, like below. Clang first needs to be updated to handle those pragmas properly for the matrix types.
  #pragma clang fp reassociate(on)
  C = A*B + C;
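
To illustrate why reassociation is the flag that matters here, below is a rough scalar sketch in plain C++. It does not use the matrix type extension and is not how `LowerMatrixIntrinsics` works internally; the names `Mat`, `multiply_then_add` and `multiply_add_fused` are hypothetical and exist only for this example. The first function computes the product and then adds `C`; the second seeds the accumulator with `C`, which is the fused form. The two sum the same terms in a different order, so a compiler may only rewrite one into the other when floating-point reassociation is allowed.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size matrix type, used only for this sketch.
constexpr std::size_t N = 2;
using Mat = std::array<std::array<float, N>, N>;

// Unfused form: finish the dot product first, then add C.
// Per element this evaluates (a0*b0 + a1*b1) + c.
Mat multiply_then_add(const Mat &A, const Mat &B, const Mat &C) {
  Mat R{};
  for (std::size_t i = 0; i < N; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < N; ++k)
        acc += A[i][k] * B[k][j];
      R[i][j] = acc + C[i][j];
    }
  }
  return R;
}

// Fused form: start the accumulator at C and fold the products in.
// Per element this evaluates (c + a0*b0) + a1*b1, a different
// association of the same terms.
Mat multiply_add_fused(const Mat &A, const Mat &B, const Mat &C) {
  Mat R{};
  for (std::size_t i = 0; i < N; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      float acc = C[i][j];
      for (std::size_t k = 0; k < N; ++k)
        acc += A[i][k] * B[k][j];
      R[i][j] = acc;
    }
  }
  return R;
}
```

For inputs whose intermediate sums are exactly representable (small integers, for instance) both functions return identical results; in general the rounding can differ, which is why the rewrite would need the reassociate flag rather than being valid unconditionally.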
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99433/new/
https://reviews.llvm.org/D99433
    
    