[PATCH] D29338: AMDGPU: Basic folds for fmed3 intrinsic
Artem Tamazov via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 27 07:16:20 PST 2017
artem.tamazov added a comment.
Looks good, but IEEE-754 correctness needs to be verified. Is IEEE compliance required for llvm.amdgcn.fmed3.f32? If it is, we should look at the formal definition of fmed3 and check carefully.
For example, transformations like fmed3(0.0, 1.0, x) -> fmed3(x, 0.0, 1.0) may be non-IEEE-compliant with respect to sNaNs when the shader is in IEEE mode. That depends on the expected semantics of fmed3, of course. For example, this is how the semantics of V_MED3_F are defined for Gfx8:
  If (isNan(Src0) || isNan(Src1) || isNan(Src2))
    Result = MIN3(Src0, Src1, Src2)
  Else if (MAX3(Src0, Src1, Src2) == Src0)
    Result = MAX(Src1, Src2)
  Else if (MAX3(Src0, Src1, Src2) == Src1)
    Result = MAX(Src0, Src2)
  Else
    Result = MAX(Src0, Src1)
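To make the definition above easier to experiment with, here is a rough C++ sketch of the same branch structure. It is only an illustration, not the hardware behaviour: the helper names (min3, max3, med3_sketch) are hypothetical, and MIN3/MAX3/MAX are modelled with std::fmin/std::fmax, which treat a quiet NaN as missing data. That is an assumption; the real MIN3 may quiet or propagate sNaNs differently in IEEE mode, which is exactly the part that needs checking.

```cpp
#include <cmath>

// Assumption: MIN3/MAX3 behave like nested std::fmin/std::fmax,
// i.e. a quiet NaN operand is ignored when the other operand is a number.
static float min3(float a, float b, float c) {
  return std::fmin(std::fmin(a, b), c);
}

static float max3(float a, float b, float c) {
  return std::fmax(std::fmax(a, b), c);
}

// Mirrors the branch structure of the quoted Gfx8 V_MED3_F definition.
float med3_sketch(float src0, float src1, float src2) {
  // Any NaN input: the result is MIN3 of the operands, not a median.
  if (std::isnan(src0) || std::isnan(src1) || std::isnan(src2))
    return min3(src0, src1, src2);

  // Otherwise drop the largest operand and take the max of the other two.
  const float m = max3(src0, src1, src2);
  if (m == src0)
    return std::fmax(src1, src2);
  if (m == src1)
    return std::fmax(src0, src2);
  return std::fmax(src0, src1);
}
```

Under this quiet-NaN model, med3_sketch(0.0f, 1.0f, x) and med3_sketch(x, 0.0f, 1.0f) agree for every x, so the model alone cannot expose an operand-order problem. Any difference would have to come from sNaN handling inside the real MIN3, which standard C++ does not model portably.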
https://reviews.llvm.org/D29338