[PATCH] D49561: AMDGPU: Try to make isKnownNeverSNan more accurate

Fri Jul 20 13:06:27 PDT 2018

arsenm added a comment.

In https://reviews.llvm.org/D49561#1170203, @rampitec wrote:

> In https://reviews.llvm.org/D49561#1169653, @arsenm wrote:
>
> > In https://reviews.llvm.org/D49561#1169015, @rampitec wrote:
> >
> > > In https://reviews.llvm.org/D49561#1168791, @arsenm wrote:
> > >
> > > > In https://reviews.llvm.org/D49561#1168643, @rampitec wrote:
> > > >
> > > > > I would tend to say isKnownNeverSNan here basically tells not that it cannot be sNaN, but rather that we do not care even if it is. At least we should not care if FP exceptions are off.
> > > >
> > > >
> > > > Except that's not true since there can be sNaNs can be loaded, so even if exceptions aren't handled they need to be dealt with
> > >
> > >
> > > Do you mean we shall quiet snans with an fmin/fmax even if exception handling is off? My understanding was different. In fact I thought we can completely ignore them with say fast math.
> >
> >
> > Yes. My understanding is the exception handling aspect is different from the existence of sNaNs, and also orthogonal to enabling fast math. If exception handling is enabled, operations that can produce sNaNs will, but there can still be sNaNs loaded from memory, constants etc. I think if the function is marked no-nans, we can maybe assume this won't happen? The fast math flags aren't sufficient there because arbitrary operations like a load or function argument don't have an associated flag.
> >
> > With ieee_mode enabled (which is currently the default) v_min_f32 et. al.  a qNaN is returned if either input is an sNaN. If ieee_mode is off, it returns the non-NaN operand as-if it were a qNaN. I think for what we actually want, ieee_mode is harmful since it requires the library implementation of OpenCL's fmin to insert canonicalizes to quiet the inputs. Since LLVM doesn't properly handle sNaNs anywhere, I think enabling this is a bit pointless. However, whether it's on or not, I think we need to to get sNaN behavior correct at least for this one operation in order to be able to optimize out redundant canonicalizes.
>
>
> More general, we have several questions:
>
> 1. Should we care about sNaNs with FP exceptions disabled?

Yes. This has to work. The OpenCL conformance tests check for this

> 2. Should we care about sNaNs if a node is known never NaN?

Knowing it's never nan is how you know you don't need to care about it

> 3. Should we care about sNaNs if no-nans is enabled?

I think this is a grey area, but probably not.

> 4. Should we care about sNaNs if fast-math is enabled?
> 5. Should we turn ieee_mode off? Let’s see:
> 
> 
> 
> 1. Description of setHasFloatingPointExceptions():
> 
>   ``` /// Tells the code generator that this target supports floating point /// exceptions and cares about preserving floating point exception behavior. ``` I read it this way: if we say no here, we do not care about preserving floating point exception behavior. I.e. we may assume there are no sNaNs even if they are present.

This is not true. Assuming this will break the fmin/fmax conformance tests. This is now just something we always turn on. We don't enable trap on FP exception, but that doesn't mean that there aren't sNaNs somewhere that need to be handled correctly. This property also should probably be removed, since things are moving to relying on the STRICT_* versions of operations to get this behavior

> 
> 
> 2. sNaN is a signaling NaN, so it is a NaN. If a node is known to be not a NaN, it cannot be its signaling form as well.

Yes

> 3. -enable-no-nans-fp-math and derivatives: ``` /// NoNaNsFPMath - This flag is enabled when the /// -enable-no-nans-fp-math flag is specified on the command line. When /// this flag is off (the default), the code generator is not allowed to /// assume the FP arithmetic arguments and results are never NaNs. ``` So then if that flag is on we are allowed to assume not only FP arithmetic results are never NaN, but also arguments. Even if they are and even if they are loaded or constant sNaN/NaN.

Yes

> 4. UnsafeAlgebra includes NoNaNs and thus all of the above applies.

I do not agree with this. The per-instruction flags have decoupled the unsafe algebra properties from no-nans, and the per-function/global flags should follow suit.

https://reviews.llvm.org/D49561