[PATCH] D29338: AMDGPU: Basic folds for fmed3 intrinsic

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 27 11:55:21 PST 2017


arsenm added a comment.

In https://reviews.llvm.org/D29338#687325, @artem.tamazov wrote:

> Looks good, but IEEE-754 correctness needs to be verified. **Is IEEE compliance required for llvm.amdgcn.fmed3.f32?** If it is, we should look at the formal definition of fmed3 and check carefully.
>
> For example, transformations like fmed3(0.0, 1.0, x) -> fmed3(x, 0.0, 1.0) may be non-IEEE-compliant w.r.t. sNaNs when the shader is in IEEE mode. That depends on the expected semantics of fmed3, of course. For example, this is how the V_MED3_F semantics are defined for Gfx8:
>
>   If (isNan(Src0) || isNan(Src1) || isNan(Src2))
>     Result = MIN3(Src0, Src1, Src2)
>   Else if (MAX3(Src0, Src1, Src2) == Src0)
>     Result = MAX(Src1, Src2)
>   Else if (MAX3(Src0, Src1, Src2) == Src1)
>     Result = MAX(Src0, Src2)
>   Else
>     Result = MAX(Src0, Src1)
>


It should match the instruction behavior, but we don't necessarily care whether it treats signaling NaNs correctly. LLVM in general isn't aware of them and breaks their behavior everywhere. The new constrained FP intrinsics should be aware of proper sNaN behavior, though. Once we have a complete set of constrained FP intrinsics and people start using them, we could add a constrained version that would need to handle sNaNs properly. As far as this intrinsic is concerned, as long as it preserves general NaN behavior (ignoring quieting, etc.), that should be OK.
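
To make the point about operand order concrete, here is a rough Python model of the quoted Gfx8 pseudocode. It is only a sketch, not the hardware definition or the LLVM implementation: it assumes MIN3 behaves like a chained IEEE minNum (a quiet NaN operand is ignored when another operand is a number), it cannot distinguish signaling from quiet NaNs, and it does not model signed zeros. The names minnum, fmed3 and order_insensitive are made up for this illustration.

```python
import math
from itertools import permutations


def minnum(a, b):
    # Assumption: minNum-style behavior, where a NaN operand is ignored
    # if the other operand is a number. Python floats carry no
    # signaling/quiet distinction, so quieting is not modeled at all.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def fmed3(src0, src1, src2):
    # Mirrors the quoted V_MED3_F pseudocode branch by branch.
    if math.isnan(src0) or math.isnan(src1) or math.isnan(src2):
        return minnum(minnum(src0, src1), src2)  # MIN3
    max3 = max(src0, src1, src2)
    if max3 == src0:
        return max(src1, src2)
    if max3 == src1:
        return max(src0, src2)
    return max(src0, src1)


def order_insensitive(x):
    # True if every operand ordering of (0.0, 1.0, x) gives the same result.
    # repr() is used so that a NaN result compares equal to itself.
    results = {repr(fmed3(*p)) for p in permutations((0.0, 1.0, x))}
    return len(results) == 1
```

Under these assumptions, a reordering such as fmed3(0.0, 1.0, x) -> fmed3(x, 0.0, 1.0) gives the same value for ordinary inputs and for a quiet NaN x; only sNaN quieting, which this model ignores, could differ.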


https://reviews.llvm.org/D29338




