[PATCH] D153207: [AArch64] Add patterns for scalar FMUL, FMULX, FMADD, FMSUB

Sat Jun 17 15:05:56 PDT 2023

overmighty added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.td:6928

+// Match indexed FMUL instead of scalar FMUL if it might save a DUP.
+let Predicates = [HasNEON, HasFullFP16] in {
----------------
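For example, this prevents the following regression. The first block is the test with the expected output; the second is the output without these patterns, where an extra `mov` extracts lane 1 before a scalar `fmul`: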

```
define float @test_v3f32(<3 x float> %a) nounwind {
; CHECK-LABEL: test_v3f32:
; CHECK:       // %bb.0:
; CHECK-NEXT:    fmul s1, s0, v0.s[1]
; CHECK-NEXT:    fmul s0, s1, v0.s[2]
; CHECK-NEXT:    ret
  %b = call float @llvm.vector.reduce.fmul.f32.v3f32(float 1.0, <3 x float> %a)
  ret float %b
}
```

```
test_v3f32:                             // @test_v3f32
// %bb.0:
	mov	s1, v0.s[1]
	fmul	s1, s1, s0
	fmul	s0, s1, v0.s[2]
	ret
```
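For reference, here is a rough scalar model in C of what the test computes. It is only a sketch: the function name `reduce_fmul_v3f32` is hypothetical, and it assumes the reduction is performed strictly in lane order starting from 1.0, with the multiply by 1.0 folded away (which is what the two-`fmul` expected output suggests).

```c
/* Hypothetical scalar model of the ordered fmul reduction in the test above.
   Assumes strict in-order evaluation with a start value of 1.0. */
static float reduce_fmul_v3f32(const float a[3]) {
    float acc = 1.0f * a[0]; /* multiply by the start value; no fmul appears for this in the expected output */
    acc = acc * a[1];        /* corresponds to: fmul s1, s0, v0.s[1] */
    acc = acc * a[2];        /* corresponds to: fmul s0, s1, v0.s[2] */
    return acc;
}
```

Both multiplies read their second operand straight from a vector lane, so the indexed form needs no separate lane extract.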

================
Comment at: llvm/test/CodeGen/AArch64/complex-deinterleaving-f16-mul.ll:14
+; CHECK-NEXT:    mov h3, v1.h[1]
+; CHECK-NEXT:    fmul h4, h0, v1.h[1]
+; CHECK-NEXT:    fnmul h3, h2, h3
----------------
Ideally this would be `fmul h4, h0, h3`, since `h3` already holds `v1.h[1]` from the preceding `mov`. However, the patterns that avoid scalar FMUL when it might introduce an extra DUP/`mov` prevent that.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153207/new/

https://reviews.llvm.org/D153207