[PATCH] D114174: [ARM][CodeGen] Add support for complex deinterleaving

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 13 02:22:18 PDT 2022


dmgreen added inline comments.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:168
+  bool tryIdentifyNodeFromPair(Value *RealV, Value *ImagV) {
+    auto Pair = getOperandPair(RealV, ImagV);
+    if (Pair.Valid) {
----------------
Why is this needed? When are the Real and Imag values not already correct? It seems strange to be able to reverse them.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:736-737
+bool ComplexDeinterleavingGraph::replaceNodes() {
+  if (CompositeNodes.empty())
+    return false;
+
----------------
Can we move this into identifyNodes (or maybe at this point there will always be nodes?)

That would make replaceNodes always return true, which simplifies the getDeadRoots a little too.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:437
+
+  CommonOperandI = dyn_cast<Instruction>(CommonOperand);
+
----------------
NickGuy wrote:
> dmgreen wrote:
> > Should this function be calling identifyNode on the inputs, similar to identifyPartialMul?
> Not here, no. The inputs need to be resolved as part of the whole pair, so the relevant one for the created node is passed back via CommonOperandI.
What about the other pair? They are not guaranteed to be the same as the operands from identifyPartialMul as far as I can see. I would probably make sure that the two operands are a valid match too. The second call to identifyNode with the same operands should just find the same node again. And that way this can have proper operands.

Currently this and identifyPartialMul are very much interlinked. I hope we can improve that in the future, but it sounds like an extension that can be thought about later.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:589
+
+  Node->Operands.push_back(RealI->getOperand(0));
+  Node->Operands.push_back(RealI->getOperand(1));
----------------
NickGuy wrote:
> dmgreen wrote:
> > I think I was expecting Operands to be the ComplexDeinterleavingCompositeNode* from identifyNode. That avoids the need to find the again later.
> Attempting to plumb something together that acceps both `ComplexDeinterleavingCompositeNode*` and `Value*` caused more bloat and made things more difficult to follow. And we can't simply use `Node->ReplacementNode` as the `Value*` because it hasn't been assigned yet.
> Because of that constraint, I'm opting to find the nodes later, though I'm open to ideas.
I was expecting it to just accept ComplexDeinterleavingCompositeNode. If all the operands are ComplexDeinterleavingCompositeNode, then they just need to be visited in the correct order to assign ReplacementNode based on how they need to be transformed. (Which I believe is fine providing we visit them in reverse order, like is already done).

i.e. we construct a graph made up of ComplexDeinterleavingCompositeNode, then transform that graph. That seems like a simpler, more extensible design going forward. It gets more complicated if there is a mixing of values between the original IR and the newly constructed intrinsics. I could imagine that might make graphs (as opposed to DAGs) more difficult, but we don't yet support any cycles.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:723-731
+  if (getOperatingComponentOfValue(RootI->getOperand(0)) !=
+      OperatingComponent::Real) {
+    return false;
+  }
+
+  if (getOperatingComponentOfValue(RootI->getOperand(1)) !=
+      OperatingComponent::Imaginary) {
----------------
NickGuy wrote:
> dmgreen wrote:
> > I feel that this should be correct by construction. What cases are not correct?
> The one that is easiest to see is when attempting to multiply by a value rotated by 270 (`a[i] * (b[i] * I*I*I)`), it produces the following IR which has the real and imaginary components reversed at %interleaved.vec
> 
> ```
>   %a.real = shufflevector <8 x float> %a, <8 x float> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
>   %a.imag = shufflevector <8 x float> %a, <8 x float> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
>   %b.real = shufflevector <8 x float> %b, <8 x float> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
>   %b.imag = shufflevector <8 x float> %b, <8 x float> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
>   %0 = fmul fast <4 x float> %b.imag, %a.imag
>   %1 = fmul fast <4 x float> %b.real, %a.real
>   %2 = fsub fast <4 x float> %0, %1
>   %3 = fmul fast <4 x float> %b.imag, %a.real
>   %4 = fmul fast <4 x float> %b.real, %a.imag
>   %5 = fadd fast <4 x float> %4, %3
>   %interleaved.vec = shufflevector <4 x float> %5, <4 x float> %2, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
> ```
Sorry - I'm not sure what this was referring to now! Do you think that example should be transforming into complex intrinsics? It doesn't look valid to me.

Have you added it as a test case?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114174/new/

https://reviews.llvm.org/D114174



More information about the llvm-commits mailing list