[PATCH] D114174: [ARM][CodeGen] Add support for complex addition and multiplication

Nicholas Guy via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 10 02:18:05 PST 2022


NickGuy added a comment.

Thanks all for the comments so far (and thanks Dave for taking on the evening shift, as it were).

In D114174#3308681 <https://reviews.llvm.org/D114174#3308681>, @jcranmer-intel wrote:

> I haven't delved into the ARM-specific code in detail, but the ComplexArithmeticGraph feels like it's reinventing a lot of Instruction-like infrastructure just to avoid having to do anything with complex intrinsics.

In hindsight, calling it a graph may be a misnomer; it acts more as a registry for attaching metadata to instructions, sitting alongside the Instruction use-def graph.

> What operations would you need from standardized complex intrinsics to completely eliminate all of the logic in ComplexArithmeticGraph?

Having operations that can map to the Arm/AArch64 instructions would be a requirement for us to depend solely on them. Beyond those defined in D119284 <https://reviews.llvm.org/D119284>, we would need addition with rotations, and multiplication of both complex components by a real/imaginary number.
That said, having the frontend emit numerous intrinsics just for Arm/AArch64 might set a bad precedent. I can imagine a case where different architectures have different ideas of what a complex multiply looks like, and each wants its own idea represented by the standard intrinsics. It might be a better idea to have the frontend emit intrinsics for the general concept of a complex multiply, while targets then match whatever more specific patterns they're interested in.
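
For anyone unfamiliar with "addition with rotations", here is a rough scalar sketch of what I mean. It is not taken from the patch: the function names are made up for illustration, and it assumes the usual semantics of the Arm complex add, where the second operand is rotated by 90 or 270 degrees (i.e. multiplied by i or -i) before being added.

```cpp
#include <complex>

using cf = std::complex<float>;

// Hypothetical helper: a + i*b, i.e. b rotated by 90 degrees, then added.
cf cadd_rot90(cf a, cf b) {
  return {a.real() - b.imag(), a.imag() + b.real()};
}

// Hypothetical helper: a - i*b, i.e. b rotated by 270 degrees, then added.
cf cadd_rot270(cf a, cf b) {
  return {a.real() + b.imag(), a.imag() - b.real()};
}
```

A target-neutral "complex add" intrinsic has no way to express the rotation, which is why these would need to be covered separately.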

> Have you tried an interface that lets targets nominate target-specific intrinsics for complex operations in lieu of creating a new graph ISA?

I might be missing something, but I'm not sure I follow how that would be different from delegating down to the TTI to generate the intrinsic. As I alluded to in a previous comment, the major problem with building intrinsics in common code is that the parameters differ across architectures.

> This says ComplexArithmetic, but it's mostly just limited to complex multiplication, right? There's no support for complex division or absolute value that I see (not that complex division is implemented in any hardware I'm aware of).

Multiplication and addition, in its current state. It was designed through an Arm-tinted lens, so it only supports the operations we have instructions for.

> By starting the search for a complex multiply at a shufflevector, you're really leaving a lot of opportunities to match complex multiplies off the table. The pattern-matching I did in D119288 <https://reviews.llvm.org/D119288> looks for insertvalue, insertelement, and matching stores for things that might be the result of complex multiplies or divisions.

I've seen insertelement emitted in the scalar versions, and handling that is on my list. But we've focused on matching the vector forms first, as those are where we saw the best potential gains in our preliminary investigations.

In D114174#3308914 <https://reviews.llvm.org/D114174#3308914>, @dmgreen wrote:

> ...my experience with complex numbers mainly comes from people using arrays of interleaved real/imaginary numbers, not _Complex or std::complex. From my understanding of what Nick had tried, they would all vectorize so that this pass (perhaps opportunistically) could match to something better for the target...

That's correct: both cases (arrays of interleaved values, and std::complex) are vectorised the same way and produce the same IR, though using std::complex does need `-ffast-math` to prevent generation of calls to `__mulsc3` and co.
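
As a minimal sketch of the two source forms being compared (the function names and signatures are hypothetical, not from the patch or its tests):

```cpp
#include <complex>
#include <cstddef>

// Interleaved layout: element 2*i holds the real part, 2*i + 1 the imaginary
// part. n is the number of complex values, so each array has 2*n floats.
void mul_interleaved(const float *a, const float *b, float *out,
                     std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    float ar = a[2 * i], ai = a[2 * i + 1];
    float br = b[2 * i], bi = b[2 * i + 1];
    out[2 * i] = ar * br - ai * bi;     // real part
    out[2 * i + 1] = ar * bi + ai * br; // imaginary part
  }
}

// The same loop written with std::complex. Per the comment above, this form
// needs -ffast-math to avoid the __mulsc3 library call.
void mul_std_complex(const std::complex<float> *a,
                     const std::complex<float> *b, std::complex<float> *out,
                     std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = a[i] * b[i];
}
```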


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114174/new/

https://reviews.llvm.org/D114174


