[LLVMdev] [PATCH][RFC]: Add fmin/fmax intrinsics
Matt Arsenault
arsenm2 at gmail.com
Wed Aug 13 16:38:34 PDT 2014
Hi,
I’d like to re-propose adding intrinsics for fmin / fmax. These can be used to implement the equivalent libm functions as defined in C99 and OpenCL; at least R600 and AArch64 have instructions with the same semantics. The intrinsics are not equivalent to a simple fcmp + select because of how they handle NaNs.
This has been proposed before, but was never delivered (http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-December/057128.html).
To summarize:
1. If exactly one operand is a NaN, returns the other operand
2. If both operands are NaN, returns NaN
3. If the operands are equal, returns a value that will compare equal to both arguments
4. In the normal case, returns the smaller / larger operand
5. Signaling NaNs get no special treatment, since the rest of LLVM currently ignores them anyway
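As an illustration only, the rules summarized above can be sketched in plain C++. The names sketchFMin, sketchFMax, selectMin and checkSketch are hypothetical and are not taken from the attached patches; the sketch assumes quiet NaNs and IEEE doubles. selectMin models the naive fcmp + select form to show where it differs.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical sketch of the proposed fmin semantics (quiet NaNs only).
double sketchFMin(double a, double b) {
  if (std::isnan(a)) return b;  // rule 1; also rule 2, since b may itself be NaN
  if (std::isnan(b)) return a;  // rule 1
  return (b < a) ? b : a;       // rules 3 and 4; equal operands return a
}

// Hypothetical sketch of the proposed fmax semantics (quiet NaNs only).
double sketchFMax(double a, double b) {
  if (std::isnan(a)) return b;
  if (std::isnan(b)) return a;
  return (a < b) ? b : a;
}

// Naive "fcmp olt + select": any comparison with a NaN is false,
// so the second operand is selected whenever either operand is a NaN.
double selectMin(double a, double b) { return (a < b) ? a : b; }

// Small usage example contrasting the two forms.
void checkSketch() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(sketchFMin(nan, 1.0) == 1.0);
  assert(sketchFMin(1.0, nan) == 1.0);
  assert(std::isnan(sketchFMin(nan, nan)));
  // The select form is order-dependent: here the NaN leaks through.
  assert(std::isnan(selectMin(1.0, nan)));
  assert(selectMin(nan, 1.0) == 1.0);
}
```

The select form returns the non-NaN operand only when the NaN is the first operand, which is why a plain compare and select cannot stand in for the intrinsic.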
- Handling of fmin/fmax (+/- 0.0, +/- 0.0)
Point 3 is worded this way because the behavior doesn’t seem particularly well specified by any standard I’ve looked at. The most explicit mention of this I’ve found is a footnote in C99 that “Ideally, fmax would be sensitive to the sign of zero, for example fmax(-0.0, 0.0) would return +0; however, implementation in software might be impractical.” It doesn’t really state what the expected behavior is. glibc and OS X’s libc disagree on the (+0, -0) and (-0, +0) cases. To resolve this, the intrinsic’s semantics allow either zero, as long as the result compares equal to both operands.
For the purposes of constant folding, I’ve tried to follow the literal wording from OpenCL (http://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/fmin.html), which was the most explicit about the expected result, and assumed that the comparison +/-0.0 < +/-0.0 is false.
This means the constant folded results will be:
fmin(0.0, 0.0) = 0.0
fmin(0.0, -0.0) = 0.0
fmin(-0.0, 0.0) = -0.0
fmin(-0.0, -0.0) = -0.0
Other options would be to always use +0.0, or to be sensitive to the sign and claim -0.0 is less than 0.0.
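To make the zero handling concrete, here is a rough C++ sketch of the literal reading used for folding (the second operand is returned only if it compares less than the first) next to the sign-sensitive alternative. The function names are hypothetical and this is not the APFloat code from the patches; NaN handling is left out to keep the focus on zeros.

```cpp
#include <cassert>
#include <cmath>

// Literal reading: return y if y < x, otherwise x.
// -0.0 < 0.0 is false, so for two zeros the first operand is returned.
double foldFMinLiteral(double x, double y) { return (y < x) ? y : x; }

// Alternative: be sensitive to the sign and treat -0.0 as less than +0.0.
double foldFMinSignSensitive(double x, double y) {
  if (x == 0.0 && y == 0.0)
    return (std::signbit(x) || std::signbit(y)) ? -0.0 : 0.0;
  return (y < x) ? y : x;
}

// Usage example reproducing the folded results listed above.
void checkFoldTable() {
  assert(!std::signbit(foldFMinLiteral(0.0, 0.0)));    // fmin(0.0, 0.0)   = 0.0
  assert(!std::signbit(foldFMinLiteral(0.0, -0.0)));   // fmin(0.0, -0.0)  = 0.0
  assert(std::signbit(foldFMinLiteral(-0.0, 0.0)));    // fmin(-0.0, 0.0)  = -0.0
  assert(std::signbit(foldFMinLiteral(-0.0, -0.0)));   // fmin(-0.0, -0.0) = -0.0
}
```

Under either variant the result compares equal to both zero operands, so both satisfy point 3; they differ only in which sign is produced for mixed-sign zeros.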
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-fmin-fmax-intrinsics.patch
Type: application/octet-stream
Size: 84229 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140813/2b43526b/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Add-basic-fmin-fmax-instcombines.patch
Type: application/octet-stream
Size: 8386 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140813/2b43526b/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Fold-fmin-fmax-with-infinities.patch
Type: application/octet-stream
Size: 4096 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140813/2b43526b/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-Move-fmin-fmax-constant-folding-logic-into-APFloat.patch
Type: application/octet-stream
Size: 4061 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140813/2b43526b/attachment-0003.obj>