[PATCH] D11866: transform fmin/fmax calls when possible (PR24314)

hfinkel@anl.gov via llvm-commits llvm-commits at lists.llvm.org
Sat Aug 15 12:36:17 PDT 2015


hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM (please add the comment about the signed zeros)

In http://reviews.llvm.org/D11866#225065, @spatel wrote:

> In http://reviews.llvm.org/D11866#219873, @jmolloy wrote:
>
> > This will be important when my latest patches finally land and x86 uses the new ISD::FMINNAN nodes (which represent the semantics of minss precisely).
>
>
> It doesn't affect this patch, but unfortunately, I don't think x86 can use the new DAG nodes. The x86 min/max insts don't conform to either DAG node definition. The hardware instructions may or may not return a NaN operand. From the Intel manual:
>
>   MAX(SRC1, SRC2)
>   {
>      IF ((SRC1 = 0.0) and (SRC2 = 0.0)) THEN DEST <- SRC2;
>      ELSE IF (SRC1 = SNaN) THEN DEST <- SRC2; FI; 
>      ELSE IF SRC2 = SNaN) THEN DEST <- SRC2; FI; 
>      ELSE IF (SRC1 > SRC2) THEN DEST <- SRC1; 
>      ELSE DEST <- SRC2;
>      FI;
>    }
>   
>
> The precise behavior of these instructions is tested quite thoroughly in:
>  test/CodeGen/X86/sse-minmax.ll


Alright, so if either operand is NaN, then it will return the second operand. Especially considering that the current SDAG nodes are all tagged as being commutative, we certainly can't match that, at least not with a single instruction (if we're not currently operating in nnans mode).


================
Comment at: lib/Transforms/Utils/SimplifyLibCalls.cpp:1221
@@ +1220,3 @@
+    if (Attr.getValueAsString() != "true")
+      return nullptr;
+    // No-signed-zeros is implied by the definitions of fmax/fmin themselves.
----------------
spatel wrote:
> hfinkel wrote:
> > Actually, why do we need no NaNs? We don't support FP exceptions, so we only need to do the correct thing with NaN arguments (by returning the non-NaN). This should be easy to guarantee by picking the right ordered vs. unordered fcmp predicate.
> > 
> An unordered compare would let us know that at least one operand is NaN, but not which one. So we'd have to check each operand for NaN. We'd be rewriting an fmax() implementation in IR?
Good point. Doing so may be a good idea, but we can deal with that later. We'd obviously need to do separate benchmarking.

================
Comment at: lib/Transforms/Utils/SimplifyLibCalls.cpp:1223
@@ +1222,3 @@
+    // No-signed-zeros is implied by the definitions of fmax/fmin themselves.
+    FMF.setNoSignedZeros();
+    FMF.setNoNaNs();
----------------
spatel wrote:
> hfinkel wrote:
> > What in the definition implies this?
> The C standard is silent about signed zeros for these, but says this in a footnote:
> "Ideally, fmax would be sensitive to the sign of zero, for example fmax(−0. 0, +0. 0) would return +0; however, implementation in software might be impractical."
> 
> Should we add that here in the comment?
Yes.


http://reviews.llvm.org/D11866





More information about the llvm-commits mailing list