[PATCH] D138888: [AArch64][SVE] Replace destructive operand of vector zeros with a bundled MOVPRFX instruction
Paul Walker via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 3 10:28:18 PST 2023
paulwalker-arm added a comment.
Hi @Allen, again sorry for the delay. The patch looks mostly good but I think for the unary instructions the zeroing can be handled completely during isel with perhaps no changes to AArch64ExpandPseudoInsts.cpp necessary. Given the unary instructions have a dedicated operand for the inactive lanes I believe we can add a constraint to PredOneOpPassthruPseudo to ensure safe register allocation for movprfx usage. Something like:
let Constraints = !if(!eq(flags, FalseLanesZero), "$Zd = $Passthru, at earlyclobber $Zd", "");
This will ensure @passthru is allocated the same register as the destination whilst also being unique to the real input operand (i.e. the one containing the active lanes). The existing instruction expansion should emit:
movprfx z0.h, p0/z, z1.h
flogb z0.h, p0/m, z1.h
This is preferred over the output shown in `sve2-intrinsics-fp-int-binary-logarithm-zeroing.ll` because it doesn't introduce any fake register dependencies, which is something we've already fixed for the UNDEF variants.
What do you think?
================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:2798-2799
-multiclass sve2_fp_flogb<string asm, SDPatternOperator op> {
- def _H : sve_fp_2op_p_zd<0b0011010, asm, ZPR16, ZPR16, ElementSizeH>;
- def _S : sve_fp_2op_p_zd<0b0011100, asm, ZPR32, ZPR32, ElementSizeS>;
- def _D : sve_fp_2op_p_zd<0b0011110, asm, ZPR64, ZPR64, ElementSizeD>;
+multiclass sve2_fp_flogb<string asm, string Ps, SDPatternOperator op, DestructiveInstTypeEnum flags> {
+ let DestructiveInstType = flags in {
+ def _H : sve_fp_2op_p_zd<0b0011010, asm, ZPR16, ZPR16, ElementSizeH>,
----------------
Are these changes required? `sve_fp_2op_p_zd` looks to already set `DestructiveInstType` accordingly.
================
Comment at: llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm-zeroing.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 -mattr=+use-experimental-zeroing-pseudos -asm-verbose=1 < %s | FileCheck %s
+
----------------
I believe this is the default and thus not required?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138888/new/
https://reviews.llvm.org/D138888
More information about the llvm-commits
mailing list