[PATCH] D148068: [AArch64] Lower fused complex multiply-add intrinsic to AArch64::FCMA

Nicholas Guy via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 14 07:57:18 PDT 2023


NickGuy added a comment.

> For a bit of context, we are generating this complex code from MLIR where we handle vectors of complex.
> In our BLAS library use case, we manage to reach better performance than hand-optimised assembly on caxpy, cgemv and cgemm.

I'd be interested to see how the performance of this differs from what the ComplexDeinterleavingPass emits or, if the pass doesn't recognise these patterns, why that might be.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148068/new/

https://reviews.llvm.org/D148068


