[llvm] [AArch64] Generate zeroing forms of certain SVE2.2 instructions (2/11) (PR #116828)
Momchil Velikov via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 29 03:27:10 PST 2024
================
@@ -381,6 +381,9 @@ def NoUseScalarIncVL : Predicate<"!Subtarget->useScalarIncVL()">;
def UseSVEFPLD1R : Predicate<"!Subtarget->noSVEFPLD1R()">;
+def UseUnaryUndefPseudos
----------------
momchil-velikov wrote:
This predicate is used to disable certain patterns for the cases where we have more than one pattern that could possibly match, but we want only a specific one.
For example, taking `AArch64fabs_mt`, we have these three patterns:
A: `(nxv8f16 (AArch64fabs_mt nxv8i1:$Op1, nxv8f16:$Op2, (nxv8f16 undef)))`
B: `(nxv8f16 (AArch64fabs_mt (nxv8i1 (SVEAllActive:$Op1)), nxv8f16:$Op2, nxv8f16:$Op3))`
C: `(nxv8f16 (AArch64fabs_mt nxv8i1:$Op1, nxv8f16:$Op2, (nxv8f16 (SVEDup0Undef))))`
All three are enabled with SVE2.2, and only A and B are enabled with SVE < 2.2. Patterns A and B generate the merging form of the FABS instruction (`FABS_ZPmZ_H`); pattern C generates the zeroing form (`FABS_ZPzZ_H`).
A portion of a DAG could match all of A, B, and C. The conflict between A and B, or between A and C, is resolved because A is "less complex" than either B or C (both B and C contain a `ComplexPattern`), so instruction selection will first try B or C and fall back to A only if neither matches.
However, neither B nor C is (obviously) more complex than the other, and I really want to avoid relying on the ordering established by position in the source code, because it's a rather fragile invariant.
That's why patterns A and B are disabled (by the `UseUnaryUndefPseudos` predicate) when SVE2.2 is present.
We don't have such cases for the instructions affected by this patch (and other patches), because the pattern that we want for SVE2.2 is more complex and will be matched first.
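
To make the ordering argument above concrete, here is a toy model in Python. It is only a sketch, not how LLVM's instruction selector is implemented: the `Pattern` class, the `select` and `make_patterns` helpers, the numeric complexity values, and the use of source position as the tie-break are all assumptions made for illustration. Only the pattern names A, B, C and the two instruction names come from the comment.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Pattern:
    name: str
    complexity: int                   # made-up score; higher is tried first
    source_pos: int                   # assumed tie-break: earlier in the file wins
    result: str                       # instruction the pattern selects
    enabled: Callable[[bool], bool]   # predicate on "SVE2.2 is present"


def select(patterns: Iterable[Pattern], has_sve2p2: bool,
           matching: Set[str]) -> Optional[Pattern]:
    """Return the enabled, matching pattern that would be tried first."""
    candidates = [p for p in patterns
                  if p.enabled(has_sve2p2) and p.name in matching]
    if not candidates:
        return None
    # Highest complexity first; ties fall back to source position.
    return min(candidates, key=lambda p: (-p.complexity, p.source_pos))


def make_patterns(gate_a_and_b: bool) -> List[Pattern]:
    """Build A, B, C; optionally disable A and B when SVE2.2 is present."""
    always = lambda has_sve2p2: True
    only_sve2p2 = lambda has_sve2p2: has_sve2p2
    not_sve2p2 = lambda has_sve2p2: not has_sve2p2
    ab_pred = not_sve2p2 if gate_a_and_b else always
    return [
        Pattern("A", 1, 0, "FABS_ZPmZ_H", ab_pred),
        Pattern("B", 2, 1, "FABS_ZPmZ_H", ab_pred),      # ties with C
        Pattern("C", 2, 2, "FABS_ZPzZ_H", only_sve2p2),  # ties with B
    ]


ALL = {"A", "B", "C"}  # a DAG fragment that all three patterns match

ungated = select(make_patterns(False), True, ALL)      # SVE2.2, A/B not gated
gated = select(make_patterns(True), True, ALL)         # SVE2.2, A/B gated
pre_sve2p2 = select(make_patterns(True), False, ALL)   # SVE < 2.2

print(ungated.name, gated.name, pre_sve2p2.name)
```

In this model, without the gate the B/C tie is decided purely by `source_pos`, so swapping the two definitions would silently change the selected instruction. With the gate, C is the only candidate under SVE2.2, and B is still chosen for SVE < 2.2.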
https://github.com/llvm/llvm-project/pull/116828