[PATCH] D114174: [ARM][CodeGen] Add support for complex addition and multiplication

Wed Feb 9 13:04:53 PST 2022

dmgreen added a comment.

> I think @jcranmer-intel is working on getting Clang to emit target-independent intrinsics for complex operations (see D119284 <https://reviews.llvm.org/D119284> & linked patches). It might be good to sync up.

> As Florian mentioned, I just re-uploaded a full stack of patches for complex intrinsics support, ranging from defining multiply and divide intrinsics, including an expansion for x86 architecture in both expansion to __mulsc3 and friends and full lowering to instructions, as well as building on top of them to finally get CX_LIMITED_RANGE support into clang. The most interesting patch is probably D119287 <https://reviews.llvm.org/D119287>, since that's the one that does all of the codegen work that this is largely doing, and I personally don't have sufficient expertise with ARM or AArch64 to design that code very well.

Yeah thanks, we saw the updates. I had already left a message on https://reviews.llvm.org/D114398 a long time ago, but I think it was missed because the patch was a draft, and apparently that makes it fairly hidden for a Phabricator review.

I still have misgivings about generic complex intrinsics being the correct representation for the mid-end of LLVM. As far as I understand, at the moment they would just block fmul/fadd/etc. folds that might already be happening, and block any vectorization (which, from what I understand of the problem domain of complex numbers, is really important!).

As I said above, it feels difficult for me to see how they will produce the most optimal solution, given the instructions that are really available. And I don't yet have any examples of anything they really make better/easier to optimize.

Happy to hear your thoughts on that. My experience with complex numbers mainly comes from people using arrays of interleaved real/imaginary numbers, not _Complex or std::complex. From my understanding of what Nick had tried, they would all vectorize so that this pass (perhaps opportunistically) could match to something better for the target. From looking at the patches for the complex intrinsics, they look much more focused on the clang representation and whether they will overflow.
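
To make the interleaved case concrete, here is a rough sketch of the kind of source loop I have in mind. The function name and signature are made up for illustration and are not taken from any of the patches; it only uses plain float multiplies, adds and subtracts, with no _Complex type, so there are no overflow/NaN recovery calls such as __mulsc3 involved.

```c
#include <stddef.h>

/* Hypothetical example: multiply two arrays of complex numbers stored as
   interleaved (re, im) float pairs. Element i lives at [2*i] and [2*i + 1].
   n is the number of complex elements, not the number of floats. */
void cmul_interleaved(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        float ar = a[2 * i], ai = a[2 * i + 1];
        float br = b[2 * i], bi = b[2 * i + 1];
        out[2 * i]     = ar * br - ai * bi;  /* real part */
        out[2 * i + 1] = ar * bi + ai * br;  /* imaginary part */
    }
}
```

A loop like this is just ordinary floating-point arithmetic as far as the mid-end is concerned, so the existing folds and the vectorizers can work on it as usual, and a later target pass can then try to match the resulting pattern to something better.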

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114174/new/

https://reviews.llvm.org/D114174