[PATCH] D29338: AMDGPU: Basic folds for fmed3 intrinsic

Artem Tamazov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 27 07:16:20 PST 2017


artem.tamazov added a comment.

Looks good, but IEEE-754 correctness needs to be verified. **Is IEEE compliance required for llvm.amdgcn.fmed3.f32?** If it is, we should look at the formal definition of fmed3 and check carefully.

For example, transformations like fmed3(0.0, 1.0, x) -> fmed3(x, 0.0, 1.0) may be non-IEEE-compliant w.r.t. sNaNs when the shader is in IEEE mode. That depends on the expected semantics of fmed3, of course. For example, this is how the V_MED3_F semantics are defined for Gfx8:

  If (isNan(Src0) || isNan(Src1) || isNan(Src2))
    Result = MIN3(Src0, Src1, Src2)
  Else if (MAX3(Src0, Src1, Src2) == Src0)
    Result = MAX(Src1, Src2)
  Else if (MAX3(Src0, Src1, Src2) == Src1)
    Result = MAX(Src0, Src2)
  Else
    Result = MAX(Src0, Src1)
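
To make the operand-order question concrete, here is a rough C++ transcription of the pseudocode above. It is only a sketch under assumptions: `min3`, `max3` and `med3_gfx8` are hypothetical helper names, and `std::fmin`/`std::fmax` stand in for the hardware MIN3/MAX/MAX3, so sNaN signaling and IEEE-mode quieting are not modeled.

```cpp
#include <cmath>

// Hypothetical stand-ins for the hardware MIN3/MAX3 helpers.
// Assumption: std::fmin/std::fmax semantics (a NaN operand is ignored when the
// other operand is a number); sNaN signaling and IEEE-mode behavior are NOT modeled.
inline float min3(float a, float b, float c) {
  return std::fmin(std::fmin(a, b), c);
}

inline float max3(float a, float b, float c) {
  return std::fmax(std::fmax(a, b), c);
}

// Line-by-line transcription of the quoted Gfx8 V_MED3_F pseudocode.
inline float med3_gfx8(float src0, float src1, float src2) {
  // Any NaN input: the result is MIN3 of the operands, not a median.
  if (std::isnan(src0) || std::isnan(src1) || std::isnan(src2))
    return min3(src0, src1, src2);

  // No NaNs: drop the largest operand and return the larger of the other two.
  const float largest = max3(src0, src1, src2);
  if (largest == src0)
    return std::fmax(src1, src2);
  if (largest == src1)
    return std::fmax(src0, src2);
  return std::fmax(src0, src1);
}
```

With no NaN inputs this returns the median regardless of operand order, so med3_gfx8(0.0f, 1.0f, x) and med3_gfx8(x, 0.0f, 1.0f) agree. Whether the same holds for the NaN branch on real hardware depends on how MIN3 treats sNaNs in IEEE mode, which this sketch cannot answer.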


https://reviews.llvm.org/D29338




