[llvm] [AArch64] Generate zeroing forms of certain SVE2.2 instructions (1/n) (PR #115535)
Momchil Velikov via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 12 03:23:29 PST 2024
================
@@ -4442,3 +4442,45 @@ let Predicates = [HasSVE, HasCPA] in {
// Multiply-add vectors, writing addend
def MLA_CPA : sve_int_mla_cpa<"mlapt">;
}
+
+multiclass sve_int_un_pred_arit_bitwise_fp_pat<SDPatternOperator op> {
+ let Predicates = [HasSVEorSME, NotHasSVE2p2orSME2p2] in {
+ defm : SVE_1_Op_PassthruUndef_Pat<nxv8f16, op, nxv8i1, nxv8f16, !cast<Pseudo>(NAME # _ZPmZ_H_UNDEF)>;
+ defm : SVE_1_Op_PassthruUndef_Pat<nxv4f16, op, nxv4i1, nxv4f16, !cast<Pseudo>(NAME # _ZPmZ_H_UNDEF)>;
+ defm : SVE_1_Op_PassthruUndef_Pat<nxv2f16, op, nxv2i1, nxv2f16, !cast<Pseudo>(NAME # _ZPmZ_H_UNDEF)>;
+ defm : SVE_1_Op_PassthruUndef_Pat<nxv4f32, op, nxv4i1, nxv4f32, !cast<Pseudo>(NAME # _ZPmZ_S_UNDEF)>;
+ defm : SVE_1_Op_PassthruUndef_Pat<nxv2f32, op, nxv2i1, nxv2f32, !cast<Pseudo>(NAME # _ZPmZ_S_UNDEF)>;
+ defm : SVE_1_Op_PassthruUndef_Pat<nxv2f64, op, nxv2i1, nxv2f64, !cast<Pseudo>(NAME # _ZPmZ_D_UNDEF)>;
+ }
+
+ let Predicates = [HasSVE2p2orSME2p2] in {
+ def : SVE_1_Op_PassthruUndefZero_Pat<nxv8f16, op, nxv8i1, nxv8f16, !cast<Instruction>(NAME # _ZPzZ_H)>;
+ def : SVE_1_Op_PassthruUndefZero_Pat<nxv4f16, op, nxv4i1, nxv4f16, !cast<Instruction>(NAME # _ZPzZ_H)>;
+ def : SVE_1_Op_PassthruUndefZero_Pat<nxv2f16, op, nxv2i1, nxv2f16, !cast<Instruction>(NAME # _ZPzZ_H)>;
+ def : SVE_1_Op_PassthruUndefZero_Pat<nxv4f32, op, nxv4i1, nxv4f32, !cast<Instruction>(NAME # _ZPzZ_S)>;
+ def : SVE_1_Op_PassthruUndefZero_Pat<nxv2f32, op, nxv2i1, nxv2f32, !cast<Instruction>(NAME # _ZPzZ_S)>;
+ def : SVE_1_Op_PassthruUndefZero_Pat<nxv2f64, op, nxv2i1, nxv2f64, !cast<Instruction>(NAME # _ZPzZ_D)>;
+ }
+}
+
+defm FABS : sve_int_un_pred_arit_bitwise_fp_pat<AArch64fabs_mt>;
+defm FNEG : sve_int_un_pred_arit_bitwise_fp_pat<AArch64fneg_mt>;
----------------
momchil-velikov wrote:
[TBH, I don't quite understand your proposal for an alternative implementation, but here are some things I've considered which led to the current solution.]
> I'd rather not push everything up into InstrInfo, at least not universally like this.
> From what I can see the _ZPzZ_H patterns can be part of the instruction's multi class.
They can be. I have chosen not to do so, and instead put the two sets of mutually exclusive ISel patterns together, so they are
easier to read and understand than patterns scattered across multiple files with no
obvious relation between them.
> Then what remains is the need to define the pseudo instructions for the non-SVE2p2 cases.
I don't understand this comment; most (or at least many) of the non-SVE2.2 pseudos are already defined.
> Here there's likely no choice but to add explicit definitions to InstrInfo but I'd rather follow the same style
> as used for the binary operations (e.g see the two definitions of FADD_ZPZZ) albeit with the new set of UNDEF
> definitions being protected by your newly introduced feature flag. This will mean the pseudo instructions are only
> defined when they're needed.
The difference from the FADD case is that the zeroing and non-zeroing forms of the instructions and pseudos are guarded by different predicates. I've considered lowering from a *single set* of ISel patterns to
pseudos and then extending the current map from pseudos to merging forms to also include the zeroing forms.
Then, in the pseudo-instruction expansion pass, one or the other real instruction could be chosen, except that it can't,
because I'm not aware of a way to test predicates given an arbitrary machine instruction opcode.
One can test specifically for SVE2.2, but that's not forward compatible with future extensions/instructions.
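
To make the expansion-time approach concrete, here is a minimal toy model in plain C++. It does not use LLVM's API: `Forms`, `PseudoTable` and `expandPseudo` are hypothetical names, and the opcode strings are only illustrative. The `TargetHasZeroingForm` flag stands in for the per-opcode predicate query that, as far as I know, is not available in the expansion pass.

```cpp
#include <map>
#include <string>

// Hypothetical model: each pseudo maps to both real forms.
struct Forms {
  std::string Merging; // e.g. the _ZPmZ_ instruction
  std::string Zeroing; // e.g. the _ZPzZ_ instruction
};

// Illustrative table, extended to carry the zeroing form as well.
static const std::map<std::string, Forms> PseudoTable = {
    {"FABS_ZPmZ_H_UNDEF", {"FABS_ZPmZ_H", "FABS_ZPzZ_H"}},
    {"FNEG_ZPmZ_H_UNDEF", {"FNEG_ZPmZ_H", "FNEG_ZPzZ_H"}},
};

// Chooses the real instruction at expansion time. The caller has to
// supply TargetHasZeroingForm; deriving it from an arbitrary opcode is
// the missing piece. Returns an empty string for an unknown pseudo.
std::string expandPseudo(const std::string &Pseudo,
                         bool TargetHasZeroingForm) {
  auto It = PseudoTable.find(Pseudo);
  if (It == PseudoTable.end())
    return "";
  return TargetHasZeroingForm ? It->second.Zeroing : It->second.Merging;
}
```

In this model the table itself is easy to extend; the open question is only where the boolean would come from without hard-coding an SVE2.2 check.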
Alternatively, similarly to the non-SVE2.2 pseudos, one can infer the existence of the real instruction on the target from the choice of an SVE2p2 pseudo-instruction. That would mean introducing two sets of ISel patterns, each generating the respective pseudo. But then, in the SVE2.2 case, there is no choice to be made when expanding the pseudo, so we might just as well skip that step.
Which yields the current patch.
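
As a rough sketch of what the patch does at selection time, the toy C++ model below mirrors the two `let Predicates` blocks in the diff above. It is not LLVM code: `Features` and `selectUnaryOp` are hypothetical names, and the two booleans merely stand in for the `HasSVEorSME` and `HasSVE2p2orSME2p2` predicates.

```cpp
#include <string>

// Hypothetical stand-in for the subtarget predicates used in the patch.
struct Features {
  bool HasSVEorSME = false;
  bool HasSVE2p2orSME2p2 = false;
};

// Models the two mutually exclusive pattern sets for one unary op
// ("FABS", "FNEG") and element suffix ("H", "S" or "D"). Returns the
// selected opcode name, or an empty string when neither set applies.
std::string selectUnaryOp(const Features &F, const std::string &Name,
                          const std::string &Suffix) {
  // [HasSVEorSME, NotHasSVE2p2orSME2p2]: select the merging pseudo with
  // an undef passthru; it still has to be expanded later.
  if (F.HasSVEorSME && !F.HasSVE2p2orSME2p2)
    return Name + "_ZPmZ_" + Suffix + "_UNDEF";
  // [HasSVE2p2orSME2p2]: select the zeroing instruction directly, so
  // there is no pseudo and no expansion step.
  if (F.HasSVE2p2orSME2p2)
    return Name + "_ZPzZ_" + Suffix;
  return "";
}
```

Because the two conditions cannot both hold, at most one pattern set is live for a given subtarget, and the choice of real instruction is settled before the expansion pass runs.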
https://github.com/llvm/llvm-project/pull/115535