[PATCH] D130564: [AArch64][SVE] Add patterns to select masked FP arith

Cullen Rhodes via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 2 03:28:00 PDT 2022


c-rhodes added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:373-374
+    return true;  // it's the intrinsic
+  SDValue Sel = N->getOperand(2);
+  SDValue FMul = Sel->getOperand(1);
+  return N->getFlags().hasAllowContract() &&
----------------
c-rhodes wrote:
> paulwalker-arm wrote:
> > Do flags on the fmul matter?  It's the result of the fadd/fsub that's affected by the contraction and so I think only those nodes require the contract flag.
> > 
> > I'm not totally sure but I do wonder if we need to also check for no-signed-zeros because for the equivalent reduction code -0.0 is the nop value.
> > Do flags on the fmul matter?  It's the result of the fadd/fsub that's affected by the contraction and so I think only those nodes require the contract flag.
> 
> Not entirely sure to be honest, the existing SVE patterns we have to combine fmul+fadd into fma don't kick in unless contract is also on the fmul: https://godbolt.org/z/xWsn7vs5f
> 
> I checked some other targets (X86 and Power9) and they also don't combine unless contract is on the fmul, but there is a combine in AArch64 for `fmadd` that kicks in without contract on fmul: https://godbolt.org/z/rzzTb8s9W
> 
> > I'm not totally sure but I do wonder if we need to also check for no-signed-zeros because for the equivalent reduction code -0.0 is the nop value.
> 
> Not sure either, I'll look into it.
> 
> > I'm not totally sure but I do wonder if we need to also check for no-signed-zeros because for the equivalent reduction code -0.0 is the nop value.
> 
> Not sure either, I'll look into it.
> 

I think I understand the issue now.
```    printf("%g %g\n",  0.0f + 0.0f, -0.0f + 0.0f);
    printf("%g %g\n",  0.0f - 0.0f, -0.0f - 0.0f);
    printf("%g %g\n",  0.0f * 0.0f, -0.0f * 0.0f);```

gives:
```0 0
0 -0
0 -0```

so fadd produces different result and is unsafe with no-sign zeroes. Alive2 agrees:

| op | signed zeroes | no-signed zeroes |
| fadd | https://alive2.llvm.org/ce/z/qfBana | https://alive2.llvm.org/ce/z/wbhJh_ |
| fsub | https://alive2.llvm.org/ce/z/wqkSwC | N/A |
| fmul | https://alive2.llvm.org/ce/z/88Z_AG | https://alive2.llvm.org/ce/z/qig4sU |

nsz is required for the fadd/sel and fadd/sel/fmul (FMLA) patterns. Although the fmul/sel patterns aren't valid according to Alive2. 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130564/new/

https://reviews.llvm.org/D130564



More information about the llvm-commits mailing list