[PATCH] D114174: [ARM][CodeGen] Add support for complex addition and multiplication

Tue Jun 28 07:42:56 PDT 2022

chill added inline comments.

================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:325
+  prepareCompositeNode(ComplexDeinterleavingOperation Operation) {
+    return std::shared_ptr<ComplexDeinterleavingCompositeNode>(
+        new ComplexDeinterleavingCompositeNode(Operation));
----------------
`return std::make_shared<ComplexDeinterleavingCompositeNode>(Operation);`
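
A minimal sketch of the difference, using hypothetical stand-in types (`Operation`, `CompositeNode`) rather than the ones in the patch. It assumes the node's constructor is accessible to `std::make_shared`; if it is private, this form would not compile as-is.

```cpp
#include <memory>

// Hypothetical stand-ins for the patch's types; only the shape matters here.
enum class Operation { Add, Mul };

struct CompositeNode {
  explicit CompositeNode(Operation Op) : Op(Op) {}
  Operation Op;
};

// std::make_shared avoids the naked `new` and typically allocates the
// object and its control block together.
std::shared_ptr<CompositeNode> prepareCompositeNode(Operation Op) {
  return std::make_shared<CompositeNode>(Op);
}
```

The caller sees the same `std::shared_ptr` either way, so only the function body changes.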

================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:561
+
+    auto RealMask = createDeinterleavingMask(ShuffleMask.size());
+    auto ImagMask = createDeinterleavingMask(ShuffleMask.size(), 1);
----------------
These could be, e.g.:
```
static const int RealMask[] = {0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30};
auto RealMaskRef = ArrayRef<int>(RealMask, ShuffleMask.size());
```

with an assertion or bounds check on the requested size. Sixteen entries are good enough for 512-bit vectors with 16-bit elements, and the table can be extended if needed.
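
Spelled out a bit more, the idea might look like the sketch below. It is standalone C++ so it compiles without LLVM headers: `MaskRef` is a stand-in for `llvm::ArrayRef<int>`, and `deinterleavingMask` and `MaxMaskElts` are hypothetical names. Whether the length passed in should be `ShuffleMask.size()` or something derived from it depends on what `createDeinterleavingMask` does, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for llvm::ArrayRef<int>: a non-owning pointer-plus-length view.
struct MaskRef {
  const int *Data;
  std::size_t Size;
};

// Even indices pick the real parts, odd indices pick the imaginary parts.
static const int RealMask[] = {0,  2,  4,  6,  8,  10, 12, 14,
                               16, 18, 20, 22, 24, 26, 28, 30};
static const int ImagMask[] = {1,  3,  5,  7,  9,  11, 13, 15,
                               17, 19, 21, 23, 25, 27, 29, 31};

constexpr std::size_t MaxMaskElts = sizeof(RealMask) / sizeof(RealMask[0]);

// Returns a view of the first NumElts entries of the chosen table.
// NumElts is the number of mask entries requested (hypothetical parameter).
MaskRef deinterleavingMask(std::size_t NumElts, bool Imag) {
  assert(NumElts <= MaxMaskElts && "extend the static mask tables");
  return {Imag ? ImagMask : RealMask, NumElts};
}
```

This trades a per-call allocation for two small constant tables, at the cost of a hard upper bound that the assertion makes explicit.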

================
Comment at: llvm/test/CodeGen/ARM/ComplexArithmetic/complex-arithmetic-f32-add.ll:99
+  %b.imag = shufflevector <8 x float> %b, <8 x float> zeroinitializer, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
+  %0 = fsub fast <4 x float> %b.real, %a.imag
+  %1 = fadd fast <4 x float> %b.imag, %a.real
----------------
Shouldn't these be translated to a couple of `vcadd.f32` instructions, like in the previous test?
Also, the number of move instructions seems excessive.
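
For reference, a scalar sketch of what the IR above computes per lane, assuming `%a.real`, `%a.imag` and `%b.real` are the de-interleaved even and odd lanes their names suggest (their definitions are not quoted here). The function name `rotatedAdd90` is made up for illustration. As far as I understand, this `b + i*a` shape is the 90-degree rotation form of a complex add.

```cpp
#include <complex>

// Per-lane reference semantics of the quoted IR:
//   real = b.real - a.imag   (the fsub)
//   imag = b.imag + a.real   (the fadd)
// which is b + i*a.
std::complex<float> rotatedAdd90(std::complex<float> A, std::complex<float> B) {
  return {B.real() - A.imag(), B.imag() + A.real()};
}
```

With a = 1 + 2i and b = 10 + 20i this gives 8 + 21i, the same as computing `b + i*a` directly.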

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114174/new/

https://reviews.llvm.org/D114174