[PATCH] D153207: [AArch64] Add patterns for scalar FMUL, FMULX

Sun Jun 25 08:26:36 PDT 2023

overmighty added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64InstrFormats.td:8541
+
+  def : Pat<(f32 (OpNode
+                   (f32 (vector_extract (v4f32 V128:$Rn), (i64 0))),
----------------
dmgreen wrote:
> I think these should be HasNEON too.
Right, I had it on the previous patterns for indexed FMUL, I somehow lost/forgot it while porting them to `SIMDFPIndexed`.

================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.td:4432

+multiclass FPScalarFromIndexedLane0Patterns<string inst, string inst_f16_suffix,
+                                            string inst_f32_suffix,
----------------
I wasn't sure where to put this.

================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.td:5231
+// Match scalar FMULX instead of indexed FMULX when extracting lane 0.
+let Predicates = [HasNEON, HasFullFP16] in {
+def : Pat<(f16 (int_aarch64_neon_fmulx
----------------
dmgreen wrote:
> I don't think this needs NEON according to the reference I was looking at. Just FP.
> The patterns for FMUL are very similar to the patterns for FMULX16. Could they share the same multiclass for the patterns?
Can you confirm that it doesn't need Neon?

I couldn't find some sort of AArch64 equivalent to the Intel 64 manual which lists CPUID feature flags along instructions, but a few things suggest that even scalar FMULX needs Neon:

- The Arm Cortex/Neoverse Software Optimization Guides that I read listed FMULX under the "ASIMD FP multiply" instruction group (and under SVE "Floating point multiply" too) but not under regular "FP multiply" where FMUL appears.
- I'm not sure what it's worth but FMULX is defined here with an intrinsic that contains `neon` in its name as `OpNode`, the predicate is `HasNEONorSME`, and `SIMDFPThreeScalar` defaults to `HasNEON`.

However, this page lists FMULX as an instruction that has been added as part of floating-point changes in AArch64 and not Neon changes: https://developer.arm.com/documentation/den0024/a/AArch64-Floating-point-and-NEON/New-features-for-NEON-and-Floating-point-in-AArch64?lang=en.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153207/new/

https://reviews.llvm.org/D153207