[PATCH] D99074: [llvm][AArch64][SVE] Fold literals into math instructions

Paul Walker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 19 03:37:10 PDT 2021


paulwalker-arm added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:415-422
+  defm FADD_ZPZI    : sve_fp_2op_i_p_zds_zx<sve_fpimm_half_one>;
+  defm FSUB_ZPZI    : sve_fp_2op_i_p_zds_zx<sve_fpimm_half_one>;
+  defm FMUL_ZPZI    : sve_fp_2op_i_p_zds_zx<sve_fpimm_half_two>;
+  defm FSUBR_ZPZI   : sve_fp_2op_i_p_zds_zx<sve_fpimm_half_one>;
+  defm FMAXNM_ZPZI  : sve_fp_2op_i_p_zds_zx<sve_fpimm_zero_one>;
+  defm FMINNM_ZPZI  : sve_fp_2op_i_p_zds_zx<sve_fpimm_zero_one>;
+  defm FMAX_ZPZI    : sve_fp_2op_i_p_zds_zx<sve_fpimm_zero_one>;
----------------
These will need to be split in two because the the `UNDEF` forms are safe to use by default but the `ZERO` forms must be protected by UseExperimentalZeroingPseudos.  You can see how we've done this for `FADD_ZPZZ` where there exists two multiclasses, `sve_fp_bin_pred_hfd` and `sve_fp_2op_p_zds_zeroing_hsd`.


================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:2247
 
+  /// Compact single bit fp immediates
+  multiclass intrinsic_compact_fp_immediates<string I, string IZ, string IX,
----------------
DavidTruby wrote:
> david-arm wrote:
> > Hi @DavidTruby, this is quite a lot of very similar looking patterns to maintain here. Is there a possible way to condense them somehow, i.e. the A and B patterns are duplicates, so you could invoke a multiclass for A, then a multiclass for B?
> @david-arm hopefully this helps a bit, it at least means that each pattern is only mentioned once so only need changing once if they have to be changed in future
My immediate thought is whether these (or at least some of these) be rolled into the instruction/pseudo multiclasses. By this I mean can sve_fp_2op_i_p_zds be extended with extra parameters so the necessary intrinsic patterns can be added.

Likewise for the multiclasses used to create the UNDEF and ZERO psuedos, can the multiclasses themselves contain the IR patterns plus the vselect variants.


================
Comment at: llvm/test/CodeGen/AArch64/sve-fp-immediates-merging.ll:5-6
+
+attributes #0 = { "target-features"="+sve" }
+attributes #1 = { "target-features"="+sve,+use-experimental-zeroing-pseudos" }
+
----------------
Minor point but attributes are normally placed after the function definitions as that where the IR printer would put them. 


================
Comment at: llvm/test/CodeGen/AArch64/sve-fp-immediates-merging.ll:14-15
+  ; CHECK-LABEL: fadd_h_immhalf:
+  ; CHECK: fadd z0.h, p0/m, z0.h, #0.5
+  ; CHECK-NEXT: ret
+  %elt   = insertelement <vscale x 8 x half> undef, half 0.500000e+00, i32 0
----------------
Please use update_llc_test_checks.py to generate the CHECK lines.


================
Comment at: llvm/test/CodeGen/AArch64/sve-fp-immediates-merging.ll:2
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s
+
+;
----------------
joechrisellis wrote:
> Please can we add:
> 
> ```
> ; If this check fails please read test/CodeGen/AArch64/README for instructions on how to resolve it.
> ; WARN-NOT: warning
> ```
Just a note to say this and the above `RUN: ... 2>%t` request are no longer necessary because llc will now error in these circumstances.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99074/new/

https://reviews.llvm.org/D99074



More information about the llvm-commits mailing list