[PATCH] D148068: [AArch64] Lower fused complex multiply-add intrinsic to AArch64::FCMA
Nicholas Guy via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 14 07:57:18 PDT 2023
NickGuy added a comment.
> For a bit of context, we are generating this complex code from MLIR, where we handle vectors of complex values.
> As for performance: in our use case of BLAS libraries, we manage to reach better performance than hand-optimised assembly on caxpy, cgemv and cgemm.
I'd be interested to see how the performance of this differs from what the ComplexDeinterleavingPass emits or, if the patterns aren't recognised by the pass, why that might be.
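For reference, here is a minimal sketch of the kind of kernel being discussed: a scalar caxpy (y += a * x) over interleaved (re, im) float pairs. The function name `caxpy_interleaved` and the interleaved layout are assumptions made for illustration; they are not taken from the patch or from the MLIR-generated code. Whether the ComplexDeinterleavingPass recognises a loop of this shape after vectorisation is exactly the open question, so treat this only as a starting point for a comparison.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the patch): y[i] += a * x[i] for n complex
// values stored as interleaved (re, im) float pairs.
void caxpy_interleaved(std::size_t n, float a_re, float a_im,
                       const float *x, float *y) {
  for (std::size_t i = 0; i < n; ++i) {
    const float x_re = x[2 * i];
    const float x_im = x[2 * i + 1];
    // Complex multiply-accumulate:
    //   (a_re + i*a_im) * (x_re + i*x_im)
    //     = (a_re*x_re - a_im*x_im) + i*(a_re*x_im + a_im*x_re)
    y[2 * i] += a_re * x_re - a_im * x_im;
    y[2 * i + 1] += a_re * x_im + a_im * x_re;
  }
}
```

Compiling something like this for AArch64 with the complex-number extension enabled and comparing the output against the intrinsic-based lowering would show where the two paths diverge.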
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D148068/new/
https://reviews.llvm.org/D148068