[PATCH] D114174: [ARM][CodeGen] Add support for complex deinterleaving
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 6 03:24:28 PDT 2022
dmgreen added a comment.
Some of this seems to have returned to a previous version of the patch?
================
Comment at: llvm/lib/CodeGen/CMakeLists.txt:49
CommandFlags.cpp
+ ComplexDeinterleavingPass.cpp
CriticalAntiDepBreaker.cpp
----------------
dmgreen wrote:
> Formatting is off here
This formatting again?
================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:528
+
+ Value *CommonOperand =
+ getCommonValues({RealMulI->getOperand(0), RealMulI->getOperand(1),
----------------
dmgreen wrote:
> What if the operands are FMul(A, A) and FMul (B, C)?
> I think it needs to be more precise about which nodes are which.
I still think this needs to be a more precise with regard to what is considered the CommonOperand and CommonOperand.
================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:431
+
+ Value *CommonOperand = getCommonValues(Values);
+ if (CommonOperand == nullptr)
----------------
Similar to identifyPartialMul, this probably needs to be more careful about what it is selecting as which operands.
================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:568-570
+ if (!NodeCreated)
+ identifyNode(cast<Instruction>(Pair.Imaginary),
+ cast<Instruction>(Pair.Real), TL, _);
----------------
Is it valid to test the operands in the opposite order?
================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:623
+
+ identifyNode(AR, AI, TL, CommonOperandI);
+ identifyNode(BR, BI, TL, CommonOperandI);
----------------
Should these be checking the subnodes?
```
if (!identifyNode(..))
return nullptr;
```
================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:646
+ const TargetLowering *TL,
+ Instruction *&CommonOperandI) {
+ auto IsZeroInitializer = [&](Value *V) -> bool { return match(V, m_Zero()); };
----------------
CommonOperandI only seems to be important between identifyPartialMul and identifyNodeWithImplicitAdd
================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:679-680
+
+ if (!isDeinterleavingMask(RealShuffle->getShuffleMask()) ||
+ !isDeinterleavingMask(ImagShuffle->getShuffleMask()))
+ IdentifySuccess = false;
----------------
These need to check that the first is the real deinterleave with offset=0, and the imag has offset=1. And maybe that they only take elements from the first operand and don't change size.
================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:723-731
+ if (getOperatingComponentOfValue(RootI->getOperand(0)) !=
+ OperatingComponent::Real) {
+ return false;
+ }
+
+ if (getOperatingComponentOfValue(RootI->getOperand(1)) !=
+ OperatingComponent::Imaginary) {
----------------
I feel that this should be correct by construction. What cases are not correct?
================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:790-792
+ // Discover the relevant inputs for a given node.
+ // This can't be moved any sooner, as the inputs could be the replacement
+ // value of a previous node.
----------------
I don't think this should be needed. The inputs should just be present from the ReplacementNode of the operands.
================
Comment at: llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:810
+ // what it might be based on the vector width
+ // TODO Defer to the target for this info. Do we care enough?
+ auto *VTy = cast<FixedVectorType>(CN->Input0->getType());
----------------
If the statistic is being awkward then it is probably not worth keeping (or keeping simple - just counting number of transforms, not the number of individual intrinsics that might become in the backend).
================
Comment at: llvm/lib/Target/ARM/ARMISelLowering.cpp:21817
+ int Stride = Ty->getNumElements() / 2;
+ const int SplitMask[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
+ 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
----------------
It is better not to use fixed sized arrays like this, just construct the array to be any needed size. If the values are continuous you can use iota.
================
Comment at: llvm/test/CodeGen/ARM/ComplexArithmetic/complex-arithmetic-f16-add.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -o - | FileCheck %s
+; RUN: llc < %s -o /dev/null -stats -stats-json 2>&1 | FileCheck %s --check-prefix=STATS
----------------
dmgreen wrote:
> I would use run lines that that are similar to the llvm/test/CodeGen/Thumb2/mve-xyz.ll tests. Something like:
> ```
> ; RUN: llc -mtriple=thumbv8.1m.main-none-none-eabi -mattr=+mve.fp -verify-machineinstrs %s -o - | FileCheck %s
> ```
> It's best not to use a specific cpu, using the arch instead.
>
> The tests can probably go in the same place too, under llvm/test/CodeGen/Thumb2/mve-xyz.ll for mve tests.
These tests seem to have moved back to the wrong place.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D114174/new/
https://reviews.llvm.org/D114174
More information about the llvm-commits
mailing list