[PATCH] D74632: [AArch64][SVE] Add initial backend support for FP splat_vector

Fri Feb 14 13:52:46 PST 2020

efriedma added a comment.

> There seems to be a missing lowering for f16 constants. Instead of becoming ConstantFPs, the f16 constants are being explicitly loaded from the constant pool

See AArch64TargetLowering::isFPImmLegal.  We should generate an appropriate fmov when the right target features are enabled.  I don't think "-mattr=+sve" implies those features, though.
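
For reference, a rough sketch of which constants an fmov immediate can represent.  This is not the LLVM code; isFmovImm8 is a hypothetical helper that only models the 8-bit FP immediate form, i.e. values of the shape +/-(n/16) * 2^r with n in 16..31 and r in -3..4.  It ignores type and subtarget checks entirely.

```cpp
#include <cmath>

// Hypothetical helper (not an LLVM API): returns true if x has the form
// +/-(n/16) * 2^r with 16 <= n <= 31 and -3 <= r <= 4, which is the set of
// values the 8-bit fmov immediate encoding can express.
bool isFmovImm8(double x) {
  if (x == 0.0 || !std::isfinite(x))
    return false; // zero, infinities and NaNs are not in the encodable set
  const double mag = std::fabs(x); // the sign is a separate bit
  for (int r = -3; r <= 4; ++r)
    for (int n = 16; n <= 31; ++n)
      if (mag == std::ldexp(n / 16.0, r)) // exact: n/16 and 2^r are dyadic
        return true;
  return false;
}
```

Under this model 1.0, -0.125 and 31.0 are encodable, while 0.1, 0.0 and 32.0 are not.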

================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:297
+  // Create variants of DUP_ZZI (I=0) that can be used without INSERT_SUBREG.
+  def DUP_ZV_H : Pseudo<(outs ZPR16:$Zd), (ins FPR16:$Vn), []>, Sched<[]>;
+  def DUP_ZV_S : Pseudo<(outs ZPR32:$Zd), (ins FPR32:$Vn), []>, Sched<[]>;
----------------
I'm not sure adding a pseudo-instruction is worthwhile if the only benefit is avoiding an INSERT_SUBREG.

================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:310
+  // Duplicate +0.0 into all vector elements
+  def : Pat<(nxv8f16 (AArch64dup (f16 fpimm0))), (DUP_ZI_H 0, 0)>;
+  def : Pat<(nxv4f16 (AArch64dup (f16 fpimm0))), (DUP_ZI_H 0, 0)>;
----------------
What do we end up generating for a non-zero float immediate?  We might need a pattern to avoid an extra mov in the general case.

In theory, we can generate other float immediates using the integer dup/dupm, but I guess most of them won't be useful for 32-bit or 64-bit floats.  Some probably are, though; for example, you can generate 1.0 with dupm.
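
To illustrate the dupm point: dupm takes a logical (bitmask) immediate, so a float constant qualifies when its bit pattern is a replicated, rotated run of contiguous ones.  The sketch below is a simplified stand-in for that check, not LLVM's implementation; bitsOf, isRotatedRunOfOnes and isLogicalImm are hypothetical names, and the second argument is the element width in bits.

```cpp
#include <bitset>
#include <cstdint>
#include <cstring>

// Hypothetical helpers: raw IEEE-754 bit patterns of a float / double.
uint32_t bitsOf(float f) { uint32_t b; std::memcpy(&b, &f, sizeof b); return b; }
uint64_t bitsOf(double d) { uint64_t b; std::memcpy(&b, &d, sizeof b); return b; }

// True if the low `width` bits of v form one contiguous run of ones,
// allowing wrap-around (rotation), and are neither all zeros nor all ones.
bool isRotatedRunOfOnes(uint64_t v, unsigned width) {
  const uint64_t mask = width == 64 ? ~0ULL : ((1ULL << width) - 1);
  v &= mask;
  if (v == 0 || v == mask)
    return false;
  // Rotate left by one within `width` bits; XOR marks cyclic bit transitions.
  const uint64_t rot = ((v << 1) | (v >> (width - 1))) & mask;
  // A single run has exactly two transitions (0->1 and 1->0).
  return std::bitset<64>(v ^ rot).count() == 2;
}

// Simplified logical-immediate test: v (elemWidth bits) must be some
// 2/4/8/.../elemWidth-bit element replicated, where that element is a
// rotated run of ones.
bool isLogicalImm(uint64_t v, unsigned elemWidth) {
  const uint64_t regMask = elemWidth == 64 ? ~0ULL : ((1ULL << elemWidth) - 1);
  v &= regMask;
  for (unsigned e = 2; e <= elemWidth; e *= 2) {
    const uint64_t eMask = e == 64 ? ~0ULL : ((1ULL << e) - 1);
    const uint64_t elt = v & eMask;
    bool replicated = true;
    for (unsigned shift = e; shift < elemWidth; shift += e)
      if (((v >> shift) & eMask) != elt) { replicated = false; break; }
    if (replicated && isRotatedRunOfOnes(elt, e))
      return true;
  }
  return false;
}
```

With this check, 1.0f (0x3F800000) and 1.0 as a double (0x3FF0000000000000) both pass, since each is a single run of exponent bits, while 3.0f (0x40400000) does not because its two set bits are not adjacent.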

https://reviews.llvm.org/D74632