[PATCH] D21284: Fold fmin(nnan x, inf) -> x, fmax(nnan x, -inf) -> x, fmax(nnan ninf x, -flt_max) -> x and fmin(nnan ninf x, flt_max) -> x

Sun Jun 19 16:22:44 PDT 2016

escha added a subscriber: escha.
escha added a comment.

Is this correct? I thought "nnan" on an instruction meant that we can optimize that instruction assuming its own inputs and outputs aren't NaN, not that we can assume *for other instructions, that aren't fast-math* that *their inputs* aren't NaN.

So for example:

float foo(float x, float y) {
  float z = x + y;     // pseudocode: treat this add as carrying the nnan flag
  return fmax(z, 1.0); // plain fmax, no fast-math flags
}

This optimization makes it legal for this function to return NaN even though fmax isn't fast-math. I don't think (from a quick grep) that we treat nnan or the other fast-math flags in this way anywhere else in LLVM. That is, if op A is not fast-math, and op B is fast-math, and op B is an argument to op A, it feels very odd to be able to optimize op A in a fast-math fashion using this knowledge.

More practically, I think this may actually break our existing use-case: we use fmax/fmin to implement NaN-flushing behavior in shaders, i.e. we allow users to use fmax/fmin/clamp to get rid of NaNs even though fast-math is on in other respects. If we allow this NaN-flushing behavior to be violated, fmax/fmin will no longer do what we need.

Repository:
  rL LLVM

http://reviews.llvm.org/D21284