[PATCH] D138888: [AArch64][SVE] Replace destructive operand of vector zeros with a bundled MOVPRFX instruction

Fri Feb 3 10:28:18 PST 2023

paulwalker-arm added a comment.

Hi @Allen, sorry again for the delay.  The patch looks mostly good, but I think for the unary instructions the zeroing can be handled completely during isel, perhaps with no changes to AArch64ExpandPseudoInsts.cpp at all.  Given that the unary instructions have a dedicated operand for the inactive lanes, I believe we can add a constraint to PredOneOpPassthruPseudo that makes register allocation safe for movprfx usage. Something like:

  let Constraints = !if(!eq(flags, FalseLanesZero), "$Zd = $Passthru,@earlyclobber $Zd", "");

This will ensure $Passthru is allocated the same register as the destination, whilst the destination is also kept distinct from the real input operand (i.e. the one containing the active lanes).  The existing instruction expansion should then emit:

  movprfx	z0.h, p0/z, z1.h
  flogb	z0.h, p0/m, z1.h

This is preferred over the output shown in `sve2-intrinsics-fp-int-binary-logarithm-zeroing.ll` because it doesn't introduce any false register dependencies, which is something we've already fixed for the UNDEF variants.
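
For illustration, here is a minimal sketch of where such a constraint might sit in a pseudo class. The class name `ExamplePassthruPseudo` is hypothetical, and the operand list is an assumption inferred from the constraint string rather than copied from SVEInstrFormats.td. The fragment is not standalone: it relies on the AArch64 backend's existing `Pseudo`, `ZPRRegOp`, `PPR3bAny` and `FalseLanesEnum` definitions.

  // Hypothetical sketch, not the in-tree definition: the operand names and
  // their order are assumed for illustration only.
  class ExamplePassthruPseudo<ZPRRegOp zprty, FalseLanesEnum flags = FalseLanesNone>
    : Pseudo<(outs zprty:$Zd),
             (ins zprty:$Passthru, PPR3bAny:$Pg, zprty:$Zs), []> {
    let FalseLanes = flags;
    // For the zeroing form only: tie $Passthru to $Zd, and mark $Zd as
    // early-clobber so the allocator cannot assign it the same register
    // as the untied input $Zs.
    let Constraints = !if(!eq(flags, FalseLanesZero),
                          "$Zd = $Passthru,@earlyclobber $Zd", "");
  }

With the tie in place the passthru value already lives in the destination register, and the early-clobber marker is what keeps the source of the movprfx/flogb pair in a different register.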

What do you think?

================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:2798-2799

-multiclass sve2_fp_flogb<string asm, SDPatternOperator op> {
-  def _H : sve_fp_2op_p_zd<0b0011010, asm, ZPR16, ZPR16, ElementSizeH>;
-  def _S : sve_fp_2op_p_zd<0b0011100, asm, ZPR32, ZPR32, ElementSizeS>;
-  def _D : sve_fp_2op_p_zd<0b0011110, asm, ZPR64, ZPR64, ElementSizeD>;
+multiclass sve2_fp_flogb<string asm, string Ps, SDPatternOperator op, DestructiveInstTypeEnum flags> {
+  let DestructiveInstType = flags in {
+    def _H : sve_fp_2op_p_zd<0b0011010, asm, ZPR16, ZPR16, ElementSizeH>,
----------------
Are these changes required? `sve_fp_2op_p_zd` already appears to set `DestructiveInstType` appropriately.

================
Comment at: llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm-zeroing.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 -mattr=+use-experimental-zeroing-pseudos -asm-verbose=1 < %s | FileCheck %s
+
----------------
I believe this is the default and thus not required?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138888/new/

https://reviews.llvm.org/D138888