[PATCH] D114174: [ARM][CodeGen] Add support for complex deinterleaving

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 14 10:54:43 PDT 2022


dmgreen added inline comments.


================
Comment at: llvm/include/llvm/CodeGen/ComplexDeinterleavingPass.h:36
+
+enum class ComplexDeinterleavingOperation { None, CAdd, CMulPartial,
+                                            // The following 'operations' are used
----------------
I don't think None is ever used, and from what I can see _Placeholder is a Shuffle? That might be a better name, either Shuffle or Leaf. I would also probably remove _Incomplete. They might make sense, but it is difficult to follow and very easy to get wrong.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:105
+public:
+  SmallVector<Instruction *> ContainedInstructions;
+  // Instructions that are considered part of this node for validity checks,
----------------
I think it would be best to store the root Real and Imag values as part of the ComplexDeinterleavingCompositeNode. That way we can be sure we are matching a correct node from anything like getContainingComposite.
There can then be the TransientInstructions (maybe "ExtraInstructions"), which are also part of the pattern, but not the root nodes.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:114
+  unsigned Rotation = 0;
+  NodePtr AccumulatorNode = nullptr;
+  SmallVector<NodePtr> Operands;
----------------
Could AccumulatorNode just be another Operand?


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:115
+  NodePtr AccumulatorNode = nullptr;
+  SmallVector<NodePtr> Operands;
+
----------------
I think this would be best to avoid NodePtr, if that might keep a reference to a node alive. Avoiding shared_ptr entirely would be good, but I imagine any alternative might be difficult too. They needs to be stable values to avoid memory invalidation issues.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:170
+  void dump(raw_ostream &OS) {
+    auto PrintValue = [&](Value *V){
+      if(V) {
----------------
Formatting needs updating.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:219
+  SmallVector<NodePtr> CompositeNodes;
+  SmallVector<Value*> NodeInstructions;
+
----------------
If NodeInstructions is used to calculate all the Instructions we have seen, to check that all the uses are inside the graph, it could be a SmallPtrSet which will have quicker contains().


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:287
+
+  NodePtr wrapValueAsNode(Value *V) {
+    auto *I = dyn_cast<Instruction>(V);
----------------
It is hard to see how wrapping a Value as an _Incomplete node is always correct. It will have lost the information about whether V was real or imaginary.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:575
+  if (auto *I = dyn_cast_or_null<Instruction>(CommonOperand)) {
+    if (I->getOpcode() == Instruction::FNeg)
+      Rotation += 180;
----------------
Do you have any tests for fneg? This should probably be more like the code in identifyPartialMul where the rotation is chosen based on whether the Real/Imag parts are negated.


================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:672
+
+ if (CNode) {
+   // Check rotation pairs for full complex multiplication, certain rotation
----------------
It's not the combination of rotates that is invalid really, although some might make less sense than others, and some might simplify to code that is difficult to match. The even rotations will define the real part of `a`, the odd rotations define the imag part. Between the two parts of the multiply we match we need to find both the real and imag halves of the value, to successfully match it further. (We may be able to do something with only half a match in some cases, like matching leafs, but that should probably be left for a later patch).


================
Comment at: llvm/test/CodeGen/Thumb2/mve-complex-deinterleaving-mixed-cases.ll:79
 
 define arm_aapcs_vfpcc <4 x float> @mul_mul270_mul(<4 x float> %a, <4 x float> %b, <4 x float> %c) {
 ; CHECK-LABEL: mul_mul270_mul:
----------------
Are you sure this is valid?
When we were doing the gather lowering, we found it useful to annotate the tests with OK/Bad, so if they change we could see the ones we thought shouldn't.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114174/new/

https://reviews.llvm.org/D114174



More information about the llvm-commits mailing list