[llvm] a253a2a - [SDAG] fold fsub -0.0, undef to undef rather than NaN

Mon Feb 24 07:09:33 PST 2020

On Mon, Feb 24, 2020 at 6:00 PM JuneYoung Lee <aqjune at gmail.com> wrote:
>
> Hi,
> I think the optimization is correct when fsub has nnan:
>
>  %r = fsub nnan half %x, undef
>  ret %r
> =>
>  ret undef
>
> The undef in src can be chosen to be NaN, causing %r to be poison.
I believe this is true for all the fp ops, not just fsub.

> Juneyoung Lee
>
> 2020년 2월 24일 (월) 오후 10:55, Roman Lebedev <lebedev.ri at gmail.com>님이 작성:
>>
>> On Mon, Feb 24, 2020 at 4:39 PM Sanjay Patel <spatel at rotateright.com> wrote:
>> >
>> >
>> >
>> > On Sun, Feb 23, 2020 at 11:41 AM Roman Lebedev <lebedev.ri at gmail.com> wrote:
>> >>
>> >> On Sun, Feb 23, 2020 at 7:37 PM Sanjay Patel via llvm-commits
>> >> <llvm-commits at lists.llvm.org> wrote:
>> >> >
>> >> >
>> >> > Author: Sanjay Patel
>> >> > Date: 2020-02-23T11:36:53-05:00
>> >> > New Revision: a253a2a793cda34d1f6421ee9b7ca76a03fdfc59
>> >> >
>> >> > URL: https://github.com/llvm/llvm-project/commit/a253a2a793cda34d1f6421ee9b7ca76a03fdfc59
>> >> > DIFF: https://github.com/llvm/llvm-project/commit/a253a2a793cda34d1f6421ee9b7ca76a03fdfc59.diff
>> >> >
>> >> > LOG: [SDAG] fold fsub -0.0, undef to undef rather than NaN
>> >> >
>> >> > A question about this behavior came up on llvm-dev:
>> >> > http://lists.llvm.org/pipermail/llvm-dev/2020-February/139003.html
>> >> > ...and as part of backend improvements in D73978.
>> >>
>> >> > We decided not to implement a more general change that would have
>> >> > folded any FP binop with nearly arbitrary constant + undef operand
>> >> > to undef because that is not theoretically correct (even if it is
>> >> > practically correct).
>> >> This is a bit too pessimistic. Alive experiments show that it would be fine
>> >> in general at least for the fsub, as long as we don't have NaN's
>> >
>> >
>> > Maybe I'm not seeing it correctly. We said this:
>> >     fsub float 4.000000, undef --> undef
>> >
>> > ...is invalid in D74713, and that's independent of "nnan". Ie, we can't produce some tiny number like a denormal given a known constant like "4.0". So it would be wrong to say the output is fully undef - there is some range of values that can never be produced given (just about?) any fixed FP constant and any of the FP opcodes.
>> Hm. As per current alive, it's only invalid if we are required to handle NaN's:
>>
>> ----------------------------------------
>> Name: t0
>>  %r = fsub nnan half %x, undef
>>  ret %r
>> =>
>>  ret undef
>>  %r = fsub nnan half %x, undef
>>
>> Done: 1
>> Optimization is correct!
>>
>> ----------------------------------------
>> Name: t1
>>  %r = fsub nnan half undef, %x
>>  ret %r
>> =>
>>  ret undef
>>  %r = fsub nnan half undef, %x
>>
>> Done: 1
>> Optimization is correct!
>>
>> ----------------------------------------
>> Name: t1
>>  %r = fsub nnan half 4.000000, undef
>>  ret %r
>> =>
>>  ret undef
>>  %r = fsub nnan half 4.000000, undef
>>
>> Done: 1
>> Optimization is correct!
>>
>> ----------------------------------------
>> Name: t1
>>  %r = fsub half 4.000000, undef
>>  ret %r
>> =>
>>  ret undef
>>  %r = fsub half 4.000000, undef
>>
>> ERROR: Value mismatch
>>
>> Example:
>> half %r = NaN   [based on undef value]
>> Source value: NaN
>> Target value: #x8500 (-0.000076293945?)
>>
>>
>> @Nuno: is this a bug in alive?
>>
>> >>
>> >> > This is the SDAG-equivalent to the IR change in D74713.
>> >> >
>> >> > Added:
>> >> >
>> >> >
>> >> > Modified:
>> >> >     llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>> >> >     llvm/test/CodeGen/X86/vec_fneg.ll
>> >> >
>> >> > Removed:
>> >> >
>> >> >
>> >> >
>> >> > ################################################################################
>> >> > diff  --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>> >> > index e809816d68be..d2db4bccc3ac 100644
>> >> > --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>> >> > +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
>> >> > @@ -5112,8 +5112,13 @@ SDValue SelectionDAG::foldConstantFPMath(unsigned Opcode, const SDLoc &DL,
>> >> >    }
>> >> >
>> >> >    switch (Opcode) {
>> >> > -  case ISD::FADD:
>> >> >    case ISD::FSUB:
>> >> > +    // -0.0 - undef --> undef (consistent with "fneg undef")
>> >> > +    if (N1CFP && N1CFP->getValueAPF().isNegZero() && N2.isUndef())
>> >> > +      return getUNDEF(VT);
>> >> > +    LLVM_FALLTHROUGH;
>> >> > +
>> >> > +  case ISD::FADD:
>> >> >    case ISD::FMUL:
>> >> >    case ISD::FDIV:
>> >> >    case ISD::FREM:
>> >> >
>> >> > diff  --git a/llvm/test/CodeGen/X86/vec_fneg.ll b/llvm/test/CodeGen/X86/vec_fneg.ll
>> >> > index 4d5539feef3c..c3c1932c2311 100644
>> >> > --- a/llvm/test/CodeGen/X86/vec_fneg.ll
>> >> > +++ b/llvm/test/CodeGen/X86/vec_fneg.ll
>> >> > @@ -76,12 +76,10 @@ define <4 x float> @fneg_undef(<4 x float> %Q) nounwind {
>> >> >  define <4 x float> @fsub_neg0_undef_elts_undef(<4 x float> %x) {
>> >> >  ; X32-SSE-LABEL: fsub_neg0_undef_elts_undef:
>> >> >  ; X32-SSE:       # %bb.0:
>> >> > -; X32-SSE-NEXT:    movaps {{.*#+}} xmm0 = <NaN,u,u,NaN>
>> >> >  ; X32-SSE-NEXT:    retl
>> >> >  ;
>> >> >  ; X64-SSE-LABEL: fsub_neg0_undef_elts_undef:
>> >> >  ; X64-SSE:       # %bb.0:
>> >> > -; X64-SSE-NEXT:    movaps {{.*#+}} xmm0 = <NaN,u,u,NaN>
>> >> >  ; X64-SSE-NEXT:    retq
>> >> >    %r = fsub <4 x float> <float -0.0, float undef, float undef, float -0.0>, undef
>> >> >    ret <4 x float> %r
>> >> >
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > llvm-commits mailing list
>> >> > llvm-commits at lists.llvm.org
>> >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits