[PATCH] D130564: [AArch64][SVE] Add patterns to select masked FP arith
Cullen Rhodes via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 2 03:28:00 PDT 2022
c-rhodes added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:373-374
+ return true; // it's the intrinsic
+ SDValue Sel = N->getOperand(2);
+ SDValue FMul = Sel->getOperand(1);
+ return N->getFlags().hasAllowContract() &&
----------------
c-rhodes wrote:
> paulwalker-arm wrote:
> > Do flags on the fmul matter? It's the result of the fadd/fsub that's affected by the contraction and so I think only those nodes require the contract flag.
> >
> > I'm not totally sure but I do wonder if we need to also check for no-signed-zeros because for the equivalent reduction code -0.0 is the nop value.
> > Do flags on the fmul matter? It's the result of the fadd/fsub that's affected by the contraction and so I think only those nodes require the contract flag.
>
> Not entirely sure to be honest, the existing SVE patterns we have to combine fmul+fadd into fma don't kick in unless contract is also on the fmul: https://godbolt.org/z/xWsn7vs5f
>
> I checked some other targets (X86 and Power9) and they also don't combine unless contract is on the fmul, but there is a combine in AArch64 for `fmadd` that kicks in without contract on fmul: https://godbolt.org/z/rzzTb8s9W
>
> > I'm not totally sure but I do wonder if we need to also check for no-signed-zeros because for the equivalent reduction code -0.0 is the nop value.
>
> Not sure either, I'll look into it.
>
I think I understand the issue now.
```
printf("%g %g\n", 0.0f + 0.0f, -0.0f + 0.0f);
printf("%g %g\n", 0.0f - 0.0f, -0.0f - 0.0f);
printf("%g %g\n", 0.0f * 0.0f, -0.0f * 0.0f);
```
gives:
```
0 0
0 -0
0 -0
```
so fadd gives a different result (`-0.0f + 0.0f` is `0`, not `-0`) and is unsafe unless no-signed-zeros (nsz) is set. Alive2 agrees:
| op | signed zeroes | no-signed zeroes |
| fadd | https://alive2.llvm.org/ce/z/qfBana | https://alive2.llvm.org/ce/z/wbhJh_ |
| fsub | https://alive2.llvm.org/ce/z/wqkSwC | N/A |
| fmul | https://alive2.llvm.org/ce/z/88Z_AG | https://alive2.llvm.org/ce/z/qig4sU |
nsz is required for the fadd/sel and fadd/sel/fmul (FMLA) patterns. However, the fmul/sel patterns aren't valid according to Alive2.
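
To make the signed-zero problem concrete, here is a rough scalar model of a single lane. It is only a sketch: the helper names (`add_of_select`, `masked_add`, `same_value`) are made up for illustration and are not from the patch, and it assumes the select supplies `+0.0` for inactive lanes. It compares `fadd(a, select(p, b, inactive))` with a predicated add that leaves an inactive lane untouched.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical model of the unpredicated form for one lane:
   fadd(a, select(p, b, inactive)). */
static float add_of_select(float a, float b, bool p, float inactive) {
    float sel = p ? b : inactive;
    return a + sel;
}

/* Hypothetical model of the predicated form for one lane:
   an inactive lane passes `a` through unchanged. */
static float masked_add(float a, float b, bool p) {
    return p ? a + b : a;
}

/* Equality that also distinguishes +0.0 from -0.0
   (plain == treats the two zeroes as equal). */
static bool same_value(float x, float y) {
    return x == y && (signbit(x) != 0) == (signbit(y) != 0);
}
```

With `a = -0.0f` in an inactive lane, `add_of_select(a, b, false, 0.0f)` yields `+0.0` while `masked_add(a, b, false)` yields `-0.0`, so the two forms only agree if signed zeroes can be ignored. Passing `-0.0f` as the inactive value makes them agree again, which matches the earlier remark that `-0.0` is the nop value for the reduction code.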
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D130564/new/
https://reviews.llvm.org/D130564